Method of targeted gene disruption, genome of hyperthermostable bacterium and genome chip using the same

Information

  • Patent Application
  • Publication Number
    20060248617
  • Date Filed
    August 29, 2003
  • Date Published
    November 02, 2006
Abstract
It is intended to provide an efficient and reliable gene targeting method that can be carried out at an arbitrary position in the genome of an organism, and a kit therefor. It is also intended to provide a method for targeted disruption of an arbitrary gene in the genome of an organism, which comprises: 1) the step of providing the whole sequence data of the genome of the organism; 2) the step of selecting at least one arbitrary region in the sequence; 3) the step of providing a vector containing a sequence homologous with the region selected above and a marker gene; 4) the step of transforming the organism with the vector; and 5) the step of placing the organism under conditions that allow homologous recombination. Moreover, the genome of a hyperthermostable bacterium and its array are provided.
Description
TECHNICAL FIELD

The present invention relates to genomics. More specifically, the present invention relates to a genome of a hyperthermostable bacterium and a genome chip thereof. The present invention relates to a novel method for targeted disruption.


BACKGROUND ART

Hyperthermostable bacteria survive in high-temperature environments, and proteins (such as enzymes) produced by the bacteria are generally thermostable, i.e., structurally stable. Further, archaebacteria, to which the hyperthermostable bacteria belong, are living organisms different from conventionally known prokaryotic or eukaryotic organisms. Therefore, it is clear that the hyperthermostable bacteria are evolutionarily different from these organisms. Accordingly, even if an enzyme derived from the hyperthermostable bacteria has functions similar to those of known enzymes derived from prokaryotic or eukaryotic cells, the enzymes derived from the hyperthermostable bacteria are often structurally and/or enzymatically different from conventional enzymes. For example, chaperonin isolated from the KOD-1 strain (Thermococcus kodakaraensis KOD1, hereinafter also called KOD1 or KOD1 strain; Morikawa, M. et al., Appl. Environ. Microbiol. 60(12), 4559-4566(1994)), a hyperthermostable bacterium, has functions similar to those of GroEL from Escherichia coli. However, GroEL forms a 14-mer and further complexes with GroES, which forms a 7-mer, in order to achieve its functions, whereas the chaperonin from the KOD-1 strain functions alone (Yan, Z. et al., Appl. Environ. Microbiol. 63: 785-789).


Gene disruption using a plasmid is conventionally known as a method for targeted disruption of a gene in thermostable bacteria (Bartolucci S., Third International Congress on Extremophiles Hamburg, Germany, Sep. 3-7, 2000). The method of Bartolucci utilizes a homogeneous or heterogeneous expression system with a recombinant protein using a thermostable bacterium. However, it is unclear whether targeted genes are definitely disrupted by this method, and therefore it cannot be said that efficient targeted disruption is achieved.


Accordingly, gene targeting that relies on information about only some of the genes is limited.


Therefore, it is an object of the invention to provide a method for gene targeting in an efficient and definite manner in an arbitrary site of a genome of a living organism, and a kit therefor.


Further, there is no method as of this date for analysing a genome as a whole in an efficient and/or global manner by placing the genome of a hyperthermostable bacterium onto a chip. Therefore, it is another object of the invention to develop a technology for analysing such a genome as a whole in an efficient and/or global manner.


SUMMARY OF INVENTION

The above identified problem has been solved by using an entire sequence of a genome of a living organism for targeting a portion of chromosomes thereof. In particular, the present invention demonstrates that the above-mentioned method has been carried out in an efficient and definite manner by sequencing the whole genome of Thermococcus kodakaraensis KOD1 strain, a strain of thermostable bacteria, as an example of genomic sequence.


The present invention also provides for the first time a technology for analyzing an entire genome in an efficient and/or global manner by sequencing the entire genomic sequence of Thermococcus kodakaraensis KOD1 strain, a strain of the thermostable bacteria as an example of the genomic sequence. Therefore, it is now possible to simulate gene expression of the organism per se on a chip.


Accordingly, the present invention provides the following:

  • (1) A method for targeted disruption of an arbitrary gene in the genome of a living organism, comprising the steps of:


A) providing information of the entire sequence of the genome of the living organism;


B) selecting at least one arbitrary region of the sequence;


C) providing a vector comprising a sequence complementary to the selected region and a marker gene;


D) transforming the living organism with the vector; and


E) placing the living organism in a condition allowing homologous recombination.

  • (2) The method according to Item 1, wherein in the step B), the region comprises at least two regions.
  • (3) The method according to Item 1, wherein the vector further comprises a promoter.
  • (4) The method according to Item 1 further comprising the step of detecting an expression product of the marker gene.
  • (5) The method according to Item 1 wherein the marker gene is located in the selected region.
  • (6) The method according to Item 1, wherein the marker is located outside of the selected region.
  • (7) The method according to Item 1, wherein the genome is the genome of Thermococcus kodakaraensis KOD1.
  • (8) The method according to Item 1, wherein the genome has a sequence set forth in SEQ ID NO: 1 or 1087.
  • (9) The method according to Item 1, wherein the region comprises a sequence encoding at least one sequence selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
  • (10) A nucleic acid molecule having a sequence set forth in SEQ ID NO: 1 or 1087.
  • (11) A nucleic acid molecule comprising at least eight contiguous nucleic acid sequence of a sequence set forth in SEQ ID NO: 1 or 1087.
  • (12) A nucleic acid molecule comprising a sequence encoding an amino acid sequence encoding at least one sequence selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157; or a sequence having 70% homology thereto.
  • (13) A nucleic acid molecule wherein when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto.
  • (14) A polypeptide comprising at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto.
  • (15) A polypeptide comprising at least three contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto.
  • (16) A polypeptide comprising at least eight contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto.
  • (17) A polypeptide comprising at least three contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, wherein the polypeptide has biological activity.
  • (18) The polypeptide according to Item 17, wherein the biological activity comprises a function set forth in Table 2.
  • (19) A method for screening for a heat resistant protein, comprising the steps of:


A) providing the entire sequence of the genome of a thermoresistant living organism;


B) selecting at least one arbitrary region of the sequence;


C) providing a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the heat resistant protein;


D) transforming the living organism with the vector;


E) placing the thermoresistant living organism in a condition allowing homologous recombination;


F) selecting the thermoresistant living organism in which homologous recombination has occurred; and


G) assaying to identify the thermoresistant protein.

  • (20) A kit for screening for a thermoresistant protein, comprising:


A) a thermoresistant living organism; and


B) a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the thermoresistant protein.

  • (21) The kit according to Item 20, further comprising an assay system for identifying the thermoresistant protein.
  • (22) The kit according to Item 20, wherein the thermoresistant living organism is a hyperthermophilic bacterium.
  • (23) The kit according to Item 20, wherein the thermoresistant living organism is Thermococcus kodakaraensis KOD1.
  • (24) A biomolecule chip having at least one nucleic acid molecule having at least eight contiguous or non-contiguous nucleotides of the sequences set forth in SEQ ID NOs: 1 or 1087, or a variant thereof located therein.
  • (25) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof is located to cover the sequences set forth in SEQ ID NO: 1 or 1087.
  • (26) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises any open reading frame of the sequences set forth in SEQ ID NO: 1 or 1087.
  • (27) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises substantially all open reading frames of the sequences set forth in SEQ ID NO: 1 or 1087.
  • (28) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises a sequence encoding at least one sequence selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
  • (29) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
  • (30) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises at least eight contiguous nucleotide lengths of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
  • (31) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises at least fifteen contiguous nucleotide lengths of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
  • (32) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof comprises at least thirty contiguous nucleotide lengths of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
  • (33) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof, comprises substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto.
  • (34) The biomolecule chip according to Item 24, wherein the nucleic acid molecule or the variant thereof, comprises at least eight contiguous nucleotide lengths of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto.
  • (35) The biomolecule chip according to Item 24, wherein when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule or the variant thereof, has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto.
  • (36) The biomolecule chip according to Item 24, wherein the substrate is addressable.
  • (37) A biomolecule chip with a polypeptide or a variant thereof, having at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
  • (38) The biochip according to Item 37, wherein the polypeptide or the variant thereof, has at least three contiguous amino acid lengths of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
  • (39) The biochip according to Item 37, wherein the polypeptide or the variant thereof, has at least eight contiguous amino acid lengths of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
  • (40) The biochip according to Item 37, wherein the polypeptide or the variant thereof, has at least three contiguous or non-contiguous amino acid lengths of at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, and having a biological function, located therein.
  • (41) The biomolecule chip according to Item 40, wherein the biological activity comprises a function set forth in Table 2.
  • (42) The biomolecule chip according to Item 40, wherein the biological activity comprises epitope activity.
  • (43) A recording medium having stored therein information of a nucleic acid sequence of a nucleic acid molecule having at least eight contiguous or non-contiguous nucleotide sequences of the sequences set forth in SEQ ID NOs: 1 or 1087, or a variant thereof.
  • (44) The storing medium according to Item 43 wherein the nucleic acid molecule or the variant thereof comprises at least eight contiguous nucleotide lengths of substantially all the sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto.
  • (45) The storage medium according to Item 43, wherein when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule or the variant thereof has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto.
  • (46) A storage medium comprising information of a polypeptide or a variant thereof having at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
  • (47) The storage medium according to Item 46, wherein the polypeptide or the variant thereof has at least three contiguous amino acid lengths of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
  • (48) The storage medium according to Item 46, wherein the polypeptide or the variant thereof has at least eight contiguous amino acid lengths of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
  • (49) The storage medium according to Item 46, wherein the polypeptide or the variant thereof has at least three contiguous or non-contiguous amino acid lengths of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, and having a biological function, located therein.
  • (50) The storage medium according to Item 49, wherein the biological activity comprises a function set forth in Table 2.
  • (51) A biomolecule chip having at least one antibody against a polypeptide or a variant thereof, located on a substrate, the polypeptide or the variant thereof comprises at least one amino acid sequence of sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto.
  • (52) An RNAi molecule having a sequence homologous to a reading frame sequence wherein, when the reading frame of Table 2 is f-1, f-2 or f-3, the reading frame sequence has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the reading frame sequence has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto.
  • (53) The RNAi molecule according to Item 52, which is an RNA or a variant thereof comprising a double-stranded portion of at least 10 nucleotides in length.
  • (54) The RNAi molecule according to Item 52, comprising a 3′ overhang terminus.
  • (55) The RNAi molecule according to Item 54, wherein the 3′ overhang terminus is a DNA of at least 2 nucleotides in length.
  • (56) The RNAi molecule according to Item 54, wherein the 3′ overhang terminus is a DNA of two to four nucleotides in length.
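
Steps A) to C) of Item 1 can be illustrated with a short computational sketch. The following is a minimal example only, assuming a linear genome string, 0-based half-open coordinates, and a double cross-over layout in which the marker gene is flanked by two homology arms; the names `TargetingConstruct`, `design_construct`, `recombined_genome` and the default `arm_len` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class TargetingConstruct:
    """Hypothetical container for the parts of a disruption construct."""
    upstream_arm: str
    marker: str
    downstream_arm: str

    @property
    def insert(self) -> str:
        # Linear layout assumed here: 5' arm, marker gene, 3' arm.
        return self.upstream_arm + self.marker + self.downstream_arm


def design_construct(genome: str, start: int, end: int,
                     marker: str, arm_len: int = 1000) -> TargetingConstruct:
    """Build a construct intended to replace genome[start:end] with the marker.

    Coordinates are 0-based and half-open; arm_len is an illustrative default,
    not a value taken from the specification.
    """
    if not (0 <= start < end <= len(genome)):
        raise ValueError("selected region lies outside the genome")
    if start < arm_len or len(genome) - end < arm_len:
        raise ValueError("not enough flanking sequence for the homology arms")
    return TargetingConstruct(
        upstream_arm=genome[start - arm_len:start],   # homologous to the 5' flank
        marker=marker,
        downstream_arm=genome[end:end + arm_len],     # homologous to the 3' flank
    )


def recombined_genome(genome: str, start: int, end: int,
                      construct: TargetingConstruct) -> str:
    """Locus expected after an idealized double cross-over event."""
    return genome[:start] + construct.marker + genome[end:]
```

For instance, `design_construct(genome, 10, 14, "TTT", arm_len=5)` returns the two five-nucleotide flanks around positions 10 to 14 together with the marker. Steps D) and E) are laboratory steps and are not modelled here.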


The present biomolecule chip may be a DNA chip, a protein chip, or the like.
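
As an illustration of how nucleic acid molecules of at least eight contiguous nucleotides might be chosen to cover a sequence and placed on an addressable substrate (Items 24, 25 and 36), the following sketch tiles a sequence with overlapping probes and assigns each probe a grid position. The function names, the step size and the row/column addressing scheme are assumptions made for illustration only.

```python
def tiling_probes(sequence: str, probe_len: int = 8, step: int = 4):
    """Return (offset, probe) pairs whose probes together cover the sequence."""
    if probe_len < 8:
        raise ValueError("probes are expected to span at least eight nucleotides")
    if not 1 <= step <= probe_len:
        raise ValueError("step must be between 1 and probe_len to avoid gaps")
    if len(sequence) < probe_len:
        raise ValueError("sequence is shorter than one probe")
    last = len(sequence) - probe_len
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)  # extra probe so the 3' end is also covered
    return [(s, sequence[s:s + probe_len]) for s in starts]


def layout_on_grid(probes, columns: int):
    """Map each probe to a (row, column) address on a hypothetical substrate."""
    return {(i // columns, i % columns): probe for i, probe in enumerate(probes)}
```

A call such as `layout_on_grid(tiling_probes(seq), columns=2)` yields a dictionary keyed by address, so that a signal observed at a given position can be traced back to its offset in the genome sequence.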


Hereinafter the preferable embodiments of the present invention are described. However, it should be appreciated that those skilled in the art can readily and appropriately carry out such embodiments of the invention from the description of the present invention and the well-known technology and common general knowledge of the art, and readily understand the effects and advantages of the present invention therefrom.




BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a schematic diagram of double-cross over disruption.



FIG. 2 is a schematic diagram of linear DNA using double cross-over disruption.



FIG. 3 is a schematic diagram of single cross-over disruption.



FIG. 4 is a diagram showing a genome structure of the present invention.



FIG. 5 is another diagram showing a genome structure of the present invention.



FIG. 6 is another diagram showing a genome structure of the present invention.



FIG. 7 is an exemplary schematic diagram showing a genomic biomolecule chip.




The description of the sequence listings is set forth in another Table (Table 2).
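
Table 2 itself is not reproduced in this section. Purely as a sketch of how a row of such a table could be resolved to a sequence, the example below assumes that each row gives a reading frame label (f-1 to f-3 or r-1 to r-3) and 1-based, inclusive start and stop positions, that f-frames index the sense sequence (SEQ ID NO: 1), and that r-frames index a separately supplied antisense sequence (SEQ ID NO: 1087). The function names and the coordinate convention are assumptions.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

FORWARD_FRAMES = {"f-1", "f-2", "f-3"}
REVERSE_FRAMES = {"r-1", "r-2", "r-3"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_reading_frame(sense: str, antisense: str,
                          frame: str, start: int, stop: int) -> str:
    """Return the subsequence named by a hypothetical Table 2-style row.

    start and stop are assumed to be 1-based and inclusive, counted on the
    strand that the frame label selects.
    """
    if frame in FORWARD_FRAMES:
        strand = sense
    elif frame in REVERSE_FRAMES:
        strand = antisense
    else:
        raise ValueError(f"unknown reading frame label: {frame!r}")
    if not (1 <= start <= stop <= len(strand)):
        raise ValueError("start/stop positions fall outside the strand")
    return strand[start - 1:stop]
```

If only the sense sequence is available, `reverse_complement(sense)` can stand in for the antisense argument, provided that the two listed sequences are in fact reverse complements of each other, which this sketch does not verify.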


DETAILED DESCRIPTION OF THE INVENTION

Hereinafter the best modes of the present invention are described. It should be understood throughout the present specification that expression of a singular form includes the concept of their plurality unless otherwise mentioned. Specifically, articles for a singular form (e.g., “a”, “an”, “the”, etc. in English; “ein”, “der”, “das”, “die”, etc. and their inflections in German; “un”, “une”, “le”, “la”, etc. in French; “un”, “una”, “el”, “la”, etc. in Spanish, and articles, adjectives, etc. in other languages) include the concept of their plurality unless otherwise mentioned. It should be also understood that the terms as used herein have definitions typically used in the art unless otherwise mentioned. Thus, unless otherwise defined, all scientific and technical terms have the same meanings as those generally used by those skilled in the art to which the present invention pertains. If there is a contradiction, the present specification (including the definitions) takes precedence.


The embodiments provided hereinafter are provided for better understanding of the present invention, and it should be understood that the scope of the present invention should not be limited to the following description. Accordingly, it is apparent that those skilled in the art can appropriately modify the present invention within the scope thereof upon reading the description of the present specification.


(Definition of Terms)


The definitions of terms used herein are described below.


As used herein, the term “organism” is used in the widest sense in the art and refers to a living entity having a genome. Organisms include prokaryotes (for example, E. coli, hyperthermophilic bacteria and the like), eukaryotes (for example, plants, animals and the like) and the like.


As used herein, the term “genome” refers to a group of genes of a set of chromosomes which is indispensable for supporting living activity of a living organism. In monoploidic organisms such as bacteria, phages, viruses and the like, one DNA or RNA molecule per se is responsible for the genetic information defining these species and is considered the genome. On the other hand, in diploidic organisms such as many eukaryotic organisms, a set of chromosomes (for example, a human has 23 pairs of chromosomes, a mouse has 20 pairs of chromosomes) in a germ cell, and two sets of chromosomes in a somatic cell comprise the genome.


As used herein, the term “gene” refers to an element defining a genetic trait. A gene is typically arranged in a given sequence on a chromosome. A gene which defines the primary structure of a protein is called a structural gene. A gene which regulates the expression of a structural gene is called a regulatory gene. As used herein, the term “gene” may refer to “polynucleotide”, “oligonucleotide”, “nucleic acid”, and “nucleic acid molecule” and/or “protein”, “polypeptide”, “oligopeptide” and “peptide”.


The terms “protein”, “polypeptide”, “oligopeptide” and “peptide” as used herein have the same meaning and refer to an amino acid polymer having any length. This polymer may be a straight, branched or cyclic chain. An amino acid may be a naturally-occurring or non-naturally-occurring amino acid, or a variant amino acid. The term may include those assembled into a composite or a plurality of polypeptide chains. The term also includes a naturally-occurring or artificially modified amino acid polymer. Such modification includes, for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification (e.g., conjugation with a labeling moiety). This definition encompasses a polypeptide containing at least one amino acid analog (e.g., non-naturally-occurring amino acid, etc.), a peptide-like compound (e.g., peptoid), and other variants known in the art, for example. Gene products comprising a sequence listed in the Sequence Listing usually take a polypeptide form. As used herein, the polypeptide of the present invention has a specific sequence (a sequence set forth in Sequence Listings or a variant thereof). A sequence having a variant may be used for a variety of purposes, such as diagnostic use, in the present invention.


The terms “polynucleotide”, “oligonucleotide”, and “nucleic acid” as used herein have the same meaning and refer to a nucleotide polymer having any length. This term also includes an “oligonucleotide derivative” or a “polynucleotide derivative”. An “oligonucleotide derivative” or a “polynucleotide derivative” includes a nucleotide derivative, or refers to an oligonucleotide or a polynucleotide having different linkages between nucleotides from typical linkages, which are interchangeably used. Examples of such an oligonucleotide specifically include 2′-O-methyl-ribonucleotide, an oligonucleotide derivative in which a phosphodiester bond in an oligonucleotide is converted to a phosphorothioate bond, an oligonucleotide derivative in which a phosphodiester bond in an oligonucleotide is converted to a N3′-P5′ phosphoroamidate bond, an oligonucleotide derivative in which a ribose and a phosphodiester bond in an oligonucleotide are converted to a peptide-nucleic acid bond, an oligonucleotide derivative in which uracil in an oligonucleotide is substituted with C-5 propynyl uracil, an oligonucleotide derivative in which uracil in an oligonucleotide is substituted with C-5 thiazole uracil, an oligonucleotide derivative in which cytosine in an oligonucleotide is substituted with C-5 propynyl cytosine, an oligonucleotide derivative in which cytosine in an oligonucleotide is substituted with phenoxazine-modified cytosine, an oligonucleotide derivative in which ribose in DNA is substituted with 2′-O-propyl ribose, and an oligonucleotide derivative in which ribose in an oligonucleotide is substituted with 2′-methoxyethoxy ribose. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively-modified variants thereof (e.g. degenerate codon substitutions) and complementary sequences as well as the sequence explicitly indicated. 
Specifically, degenerate codon substitutions may be produced by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081(1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98(1994)). The gene of the present invention usually takes this polynucleotide form.
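
The third-position substitution described above can be sketched as follows. This is a minimal illustration, assuming the standard genetic code and IUPAC ambiguity codes for the mixed-base positions; it only merges synonymous codons that share their first two bases, and it does not model deoxyinosine. The function name `degenerate_third_position` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {frozenset(k): v for k, v in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}.items()}


def degenerate_third_position(coding_seq: str) -> str:
    """Replace the third base of each codon with a mixed-base symbol.

    The symbol covers every third base that keeps the encoded amino acid
    unchanged while the first two bases stay fixed.
    """
    if len(coding_seq) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    out = []
    for i in range(0, len(coding_seq), 3):
        codon = coding_seq[i:i + 3].upper()
        aa = CODON_TABLE[codon]
        thirds = frozenset(b for b in BASES if CODON_TABLE[codon[:2] + b] == aa)
        out.append(codon[:2] + IUPAC[thirds])
    return "".join(out)
```

With this sketch, `degenerate_third_position("ATGGCTAAA")` gives `"ATGGCNAAR"`: the methionine codon is unchanged, the alanine codon becomes fully degenerate at its third position, and the lysine codon admits A or G.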


As used herein, the term “nucleic acid molecule” is used interchangeably with “nucleic acid”, “oligonucleotide”, and “polynucleotide”, including cDNA, mRNA, genomic DNA, and the like. As used herein, nucleic acid and nucleic acid molecule may be included by the concept of the term “gene”. A nucleic acid molecule encoding the sequence of a given gene includes “splice mutant (variant)”. Similarly, a particular protein encoded by a nucleic acid encompasses any protein encoded by a splice variant of that nucleic acid. “Splice mutants”, as the name suggests, are products of alternative splicing of a gene. After transcription, an initial nucleic acid transcript may be spliced such that different (alternative) nucleic acid splice products encode different polypeptides. Mechanisms for the production of splice variants vary, but include alternative splicing of exons. Alternative polypeptides derived from the same nucleic acid by read-through transcription are also encompassed by this definition. Any products of a splicing reaction, including recombinant forms of the splice products, are included in this definition. Such variants are useful for a variety of assays.


As used herein, the term “amino acid” may refer to a naturally-occurring or non-naturally-occurring amino acid as long as the object of the present invention is satisfied.


As used herein, the term “amino acid derivative” or “amino acid analog” refers to an amino acid which is different from a naturally-occurring amino acid and has a function similar to that of the original amino acid. Such amino acid derivatives and amino acid analogs are well known in the art.


The term “naturally-occurring amino acid” refers to an L-isomer of a naturally-occurring amino acid. The naturally-occurring amino acids are glycine, alanine, valine, leucine, isoleucine, serine, methionine, threonine, phenylalanine, tyrosine, tryptophan, cysteine, proline, histidine, aspartic acid, asparagine, glutamic acid, glutamine, γ-carboxyglutamic acid, arginine, ornithine, and lysine. Unless otherwise indicated, all amino acids as used herein are L-isomers. An embodiment using a D-isomer of an amino acid falls within the scope of the present invention.


The term “non-naturally-occurring amino acid” refers to an amino acid which is ordinarily not found in nature. Examples of non-naturally-occurring amino acids include D-forms of an amino acid as described above, norleucine, para-nitrophenylalanine, homophenylalanine, para-fluorophenylalanine, 3-amino-2-benzyl propionic acid, D- or L-homoarginine, and D-phenylalanine.


As used herein, the term “amino acid analog” refers to a molecule having a physical property and/or function similar to that of amino acids, but is not an amino acid. Examples of amino acid analogs include, for example, ethionine, canavanine, 2-methylglutamine, and the like. An amino acid mimic refers to a compound which has a structure different from that of the general chemical structure of amino acids but which functions in a manner similar to that of naturally-occurring amino acids.


As used herein, the term “nucleotide” may be either naturally-occurring or non-naturally-occurring. The term “nucleotide derivative” or “nucleotide analog” refers to a nucleotide which is different from naturally-occurring nucleotides and has a function similar to that of the original nucleotide. Such nucleotide derivatives and nucleotide analogs are well known in the art. Examples of such nucleotide derivatives and nucleotide analogs include, but are not limited to, phosphorothioate, phosphoramidate, methylphosphonate, chiral-methylphosphonate, 2′-O-methyl ribonucleotide, and peptide-nucleic acid (PNA).


Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.


As used herein, the term “corresponding” amino acid or nucleic acid refers to an amino acid or nucleotide in a given polypeptide or polynucleotide molecule, which has, or is anticipated to have, a function similar to that of a predetermined amino acid or nucleotide in a polypeptide or polynucleotide as a reference for comparison. Particularly, in the case of enzyme molecules, the term refers to an amino acid which is present at a similar position in an active site and similarly contributes to catalytic activity. For example, in the case of an antisense molecule, a corresponding antisense molecule may be a similar portion in an ortholog corresponding to a particular portion of the antisense molecule.


As used herein, the term “corresponding” gene (e.g., a polypeptide or polynucleotide molecule) refers to a gene in a given species, which has, or is expected to have, a function similar to that of a predetermined gene in a species as a reference for comparison. When there are a plurality of genes having such a function, the term refers to a gene having the same evolutionary origin. Therefore, a gene corresponding to a given gene may be an ortholog of the given gene. Thus, a gene corresponding to each gene can be found in other organisms. Such a corresponding gene can be identified by techniques well known in the art. For example, a corresponding gene in a given organism can be found by searching a sequence database of the organism (e.g., hyperthermophilic bacteria) using the sequence of a reference gene (e.g., gene comprising a sequence set forth in Sequence Listing etc.) as a query sequence.


As used herein, the term “fragment” with respect to a polypeptide or polynucleotide refers to a polypeptide or polynucleotide having a sequence length ranging from 1 to n-1 with respect to the full length of the reference polypeptide or polynucleotide (of length n). The length of the fragment can be appropriately changed depending on the purpose. For example, in the case of polypeptides, the lower limit of the length of the fragment includes 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more amino acids. Lengths represented by integers which are not herein specified (e.g., 11 and the like) may be appropriate as a lower limit. For example, in the case of polynucleotides, the lower limit of the length of the fragment includes 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100 or more nucleotides. Lengths represented by integers which are not herein specified (e.g., 11 and the like) may be appropriate as a lower limit. As used herein, the length of polypeptides or polynucleotides can be represented by the number of amino acids or nucleotides, respectively. However, the above-described numbers are not absolute. The above-described numbers, as the upper or lower limit, are intended to include some greater or smaller numbers (e.g., ±10%), as long as the same function is maintained. For this purpose, “about” may be herein put ahead of the numbers. However, it should be understood that the interpretation of numbers is not affected by the presence or absence of “about” in the present specification.
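
The length range of 1 to n-1 described above can be sketched as follows. This is a minimal illustration which assumes that a fragment is a contiguous stretch of the reference sequence; the definition above constrains only the length, so the contiguity requirement and the function names are assumptions of this example.

```python
def fragments(seq, min_len=1):
    """Yield every contiguous fragment of seq with length min_len to n-1."""
    n = len(seq)
    for length in range(max(min_len, 1), n):
        for start in range(0, n - length + 1):
            yield seq[start:start + length]


def is_fragment(candidate, reference, min_len=1):
    """True if candidate is a contiguous part of reference, shorter than it,
    and at least min_len residues long (min_len is the chosen lower limit)."""
    n = len(reference)
    return min_len <= len(candidate) <= n - 1 and candidate in reference
```

For example, with a lower limit of 3, the reference `"ACGT"` has exactly two fragments, `"ACG"` and `"CGT"`; the full-length sequence itself is not a fragment.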


As used herein, the terms “agent specifically interacting with” a biological agent and “specific agent” for a biological agent, such as a polynucleotide, a polypeptide or the like, are used interchangeably and refer to an agent which has an affinity for the biological agent, such as a polynucleotide, a polypeptide or the like, which is representatively higher than or equal to the affinity for other non-related biological agents, such as polynucleotides, polypeptides or the like (particularly, those with identity of less than 30%; in a specific embodiment, less than 99% identity), and preferably significantly (e.g., statistically significantly) higher. Such affinity may be measured by hybridization assay, binding assay and the like. When a biological agent is a polypeptide, a specific agent to the polypeptide includes a specific antibody, and it should be understood that in a particular embodiment, the specific agents of the present invention may include an agent specific to the specific antibodies. It should be understood that such specific agents to the specific antibodies include the polypeptide of interest per se.


As used herein, the “agent” may be any substance or other agent (e.g., energy) as long as the intended purpose can be achieved. Examples of such a substance include, but are not limited to, proteins, polypeptides, oligopeptides, peptides, polynucleotides, oligonucleotides, nucleotides, nucleic acids (e.g., DNA such as cDNA, genomic DNA, or the like, and RNA such as mRNA), polysaccharides, oligosaccharides, lipids, low molecular weight organic molecules (e.g., hormones, ligands, information transfer substances, molecules synthesized by combinatorial chemistry, low molecular weight molecules, and the like (e.g., pharmaceutically acceptable low molecular weight ligands and the like)), and combinations of these molecules. Examples of an agent specific to a polynucleotide include, but are not limited to, a polynucleotide having complementarity to the sequence of the polynucleotide with a predetermined sequence homology (e.g., 70% or more sequence identity), a polypeptide such as a transcriptional agent binding to a promoter region, and the like. Examples of an agent specific to a polypeptide include, but are not limited to, an antibody specifically directed to the polypeptide or derivatives or analogs thereof (e.g., single chain antibody), a specific ligand or receptor when the polypeptide is a receptor or ligand, a substrate when the polypeptide is an enzyme, and the like.


As used herein, the term “low molecular weight organic molecule” refers to an organic molecule having a relatively small molecular weight. Usually, a low molecular weight organic molecule has a molecular weight of about 1,000 or less, although the term may also cover molecules having a molecular weight of more than 1,000. Low molecular weight organic molecules can be ordinarily synthesized by methods known in the art or combinations thereof. These low molecular weight organic molecules may be produced by organisms. Examples of the low molecular weight organic molecule include, but are not limited to, hormones, ligands, information transfer substances, molecules synthesized by combinatorial chemistry, pharmaceutically acceptable low molecular weight molecules (e.g., low molecular weight ligands and the like), and the like.


As used herein, the term “antibody” encompasses polyclonal antibodies, monoclonal antibodies, human antibodies, humanized antibodies, polyfunctional antibodies, chimeric antibodies, and anti-idiotype antibodies, and fragments thereof (e.g., F(ab′)2 and Fab fragments), and other recombinant conjugates. These antibodies may be fused with an enzyme (e.g., alkaline phosphatase, horseradish peroxidase, α-galactosidase, and the like) via a covalent bond or by recombination.


As used herein, the term “monoclonal antibody” refers to an antibody composition having a group of homologous antibodies. This term is not limited by the production manner thereof. This term encompasses all immunoglobulin molecules and Fab molecules, F(ab′)2 fragments, Fv fragments, and other molecules having an immunological binding property of the original monoclonal antibody molecule. Methods for producing polyclonal antibodies and monoclonal antibodies are well known in the art, and will be more sufficiently described below.


Monoclonal antibodies are prepared by using a standard technique well known in the art (e.g., Kohler and Milstein, Nature, 1975, 256:495) or a modification thereof (e.g., Buck et al., In Vitro, 18, 1982:377). Representatively, a mouse or rat is immunized with a protein bound to a protein carrier, and boosted. Subsequently, the spleen (and optionally several large lymph nodes) is removed and dissociated into single cells. If desired, the spleen cells may be screened (after removal of nonspecifically adherent cells) by applying a cell suspension to a plate or well coated with a protein antigen. B-cells that express membrane-bound immunoglobulin specific for the antigen bind to the plate, and are not rinsed away with the rest of the suspension. Resulting B-cells, or all dissociated spleen cells, are then induced to fuse with myeloma cells to form hybridomas. The hybridomas are used to produce monoclonal antibodies.


As used herein, the term “antigen” refers to any substance to which an antibody molecule may specifically bind. As used herein, the term “immunogen” refers to an antigen initiating activation of the antigen-specific immune response of a lymphocyte.


As used herein, the term “single chain antibody” refers to a single chain polypeptide formed by linking a heavy chain fragment and a light chain fragment of the Fv region via a peptide crosslinker.


As used herein, the term “composite molecule” refers to a molecule in which a plurality of molecules, such as polypeptides, polynucleotides, lipids, sugars, small molecules, or the like, are linked together. Examples of a composite molecule include, but are not limited to, glycolipids, glycopeptides, and the like. Such composite molecules can be herein used as a DICS1 gene or a product thereof, or an agent of the present invention, as long as they have a similar function to that of the gene or the product thereof, or the agent of the present invention.


As used herein, the term “isolated” biological agent (e.g., nucleic acid, protein, or the like) refers to a biological agent that is substantially separated or purified from other biological agents in cells of a naturally-occurring organism (e.g., in the case of nucleic acids, agents other than nucleic acids and a nucleic acid having nucleic acid sequences other than an intended nucleic acid; and in the case of proteins, agents other than proteins and proteins having an amino acid sequence other than an intended protein). The “isolated” nucleic acids and proteins include nucleic acids and proteins purified by a standard purification method. The isolated nucleic acids and proteins also include chemically synthesized nucleic acids and proteins.


As used herein, the term “purified” biological agent (e.g., nucleic acids, proteins, and the like) refers to one from which at least a part of the naturally accompanying agents are removed. Therefore, ordinarily, the purity of a purified biological agent is higher than that of the biological agent in a normal state (i.e., concentrated).


As used herein, the terms “purified” and “isolated” mean that the same type of biological agent is present preferably at least 75% by weight, more preferably at least 85% by weight, even more preferably at least 95% by weight, and most preferably at least 98% by weight.


As used herein, the term “expression” of a gene, a polynucleotide, a polypeptide, or the like, indicates that the gene or the like is affected by a predetermined action in vivo to be changed into another form. Preferably, the term “expression” indicates that genes, polynucleotides, or the like are transcribed and translated into polypeptides. In one embodiment of the present invention, genes may be transcribed into mRNA. More preferably, these polypeptides may have post-translational processing modifications.


Therefore, as used herein, the term “reduction” of “expression” of a gene, a polynucleotide, a polypeptide, or the like indicates that the level of expression is significantly reduced in the presence of or under the action of the agent of the present invention as compared to when the action of the agent is absent. Preferably, the reduction of expression includes a reduction in the amount of expression of a polypeptide. As used herein, the term “increase” of “expression” of a gene, a polynucleotide, a polypeptide, or the like indicates that the level of expression is significantly increased in the presence of the action of the agent of the present invention as compared to when the action of the agent is absent. Preferably, the increase of expression includes an increase in the amount of expression of a polypeptide. As used herein, the term “induction” of “expression” of a gene indicates that the amount of expression of the gene is increased by applying a given agent to a given cell. Therefore, the induction of expression includes allowing a gene to be expressed when expression of the gene is not otherwise observed, and increasing the amount of expression of the gene when expression of the gene is observed.


As used herein, the term “specifically expressed” in relation to a gene indicates that the gene is expressed in a specific site or for a specific period of time, at a level different from (preferably higher than) that in other sites or for other periods of time. The term “specifically expressed” indicates that a gene may be expressed only in a given site (specific site) or may be expressed in other sites. Preferably, the term “specifically expressed” indicates that a gene is expressed only in a given site.


As used herein, the term “biological activity” refers to activity possessed by an agent (e.g., a polynucleotide, a protein, etc.) within an organism, including activities exhibiting various functions (e.g., transcription promoting activity, etc.). For example, when two agents interact with each other (the gene product of the present invention and the receptor therefor), the biological activity thereof includes the binding of the gene product of the present invention and the receptor therefor and a biological change (e.g., apoptosis) caused thereby. In another example, when a certain factor is an enzyme, the biological activity thereof includes its enzyme activity. In still another example, when a certain factor is a ligand, the biological activity thereof includes the binding of the ligand to a receptor corresponding thereto. The above-described biological activity can be measured by techniques well-known in the art. Alternatively, in the present invention, the cases of a modified molecule having similar activity in the living organism may be included in the definition of having biological activity.


As used herein, the term “antisense (activity)” refers to activity which permits specific suppression or reduction of expression of a target gene. The antisense activity is ordinarily achieved by a nucleic acid sequence having a length of at least 8 contiguous nucleotides, which is complementary to the nucleic acid sequence of a target gene (e.g., genes of the present invention, etc.). A molecule having such antisense activity is called an antisense molecule. Such a nucleic acid sequence preferably has a length of at least 9 contiguous nucleotides, more preferably a length of at least 10 contiguous nucleotides, and even more preferably a length of at least 11 contiguous nucleotides, a length of at least 12 contiguous nucleotides, a length of at least 13 contiguous nucleotides, a length of at least 14 contiguous nucleotides, a length of at least 15 contiguous nucleotides, a length of at least 20 contiguous nucleotides, a length of at least 30 contiguous nucleotides, a length of at least 40 contiguous nucleotides, or a length of at least 50 contiguous nucleotides. These nucleic acid sequences include nucleic acid sequences having at least 70% homology thereto, more preferably at least 80%, even more preferably at least 90%, and still even more preferably at least 95%. The antisense nucleic acid sequence is preferably complementary to a 5′ terminal sequence of the nucleic acid sequence of a target gene. Such an antisense nucleic acid sequence includes the above-described sequences having one or several, or at least one, nucleotide substitutions, additions, and/or deletions.
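
As a non-limiting sketch, the requirement of at least 8 contiguous nucleotides complementary to a target gene can be checked computationally. The example below assumes DNA-alphabet sequences written 5′ to 3′ and perfect Watson-Crick pairing; the function names are hypothetical and the check says nothing about actual antisense activity in a cell.

```python
# Watson-Crick complements for the DNA alphabet (assumption: no ambiguity codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(antisense, target):
    """Length of the longest contiguous stretch of `antisense` that is
    perfectly complementary to a contiguous stretch of `target`."""
    rc = reverse_complement(antisense)
    target = target.upper()
    best = 0
    prev = [0] * (len(target) + 1)
    # Longest common substring between rc and target by dynamic programming.
    for i in range(1, len(rc) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if rc[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_minimum_length(antisense, target, min_len=8):
    """True if the complementary run reaches the chosen lower limit."""
    return longest_complementary_run(antisense, target) >= min_len
```

The default `min_len=8` mirrors the ordinary lower limit stated above; a larger value (e.g., 15 or 20) may be passed to apply one of the more preferable limits.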


As used herein, the term “RNAi” is an abbreviation of RNA interference and refers to a phenomenon where an agent for causing RNAi, such as double-stranded RNA (also called dsRNA), is introduced into cells and mRNA homologous thereto is specifically degraded, so that synthesis of gene products is suppressed, and also refers to a technique using the phenomenon. As used herein, RNAi may have the same meaning as that of an agent which causes RNAi.


As used herein, the term “an agent causing RNAi” refers to any agent causing RNAi. As used herein, “an agent causing RNAi for a gene” indicates that the agent causes RNAi relating to the gene and the effect of RNAi is achieved (e.g., suppression of expression of the gene, and the like). Examples of such an agent causing RNAi include, but are not limited to, a sequence having at least about 70% homology to the nucleic acid sequence of a target gene or a sequence hybridizable under stringent conditions, RNA containing a double-stranded portion having a length of at least 10 nucleotides or variants thereof. Herein, this agent may be preferably DNA containing a 3′ protruding end, and more preferably the 3′ protruding end has a length of 2 or more nucleotides (e.g., 2-4 nucleotides in length).


Though not wishing to be bound by any theory, a mechanism which causes RNAi is considered as follows. When a molecule which causes RNAi, such as dsRNA, is introduced into a cell, an RNase III-like nuclease having a helicase domain (called dicer) cleaves the molecule on about a 20 base pair basis from the 3′ terminus in the presence of ATP in the case where the RNA is relatively long (e.g., 40 or more base pairs). As used herein, the term “siRNA” is an abbreviation of short interfering RNA and refers to short double-stranded RNA of 10 or more base pairs which are artificially chemically or biochemically synthesized, synthesized in the organism body, or produced by double-stranded RNA of about 40 or more base pairs being degraded within the body. siRNA typically has a structure having 5′-phosphate and 3′-OH, where the 3′ terminus projects by about 2 bases. A specific protein is bound to siRNA to form RISC (RNA-induced-silencing-complex). This complex recognizes and binds to mRNA having the same sequence as that of siRNA and cleaves mRNA at the middle of siRNA due to RNase III-like enzymatic activity. It is preferable that the relationship between the sequence of siRNA and the sequence of mRNA to be cleaved as a target is a 100% match. However, base mutation at a site away from the middle of siRNA does not completely remove the cleavage activity by RNAi, leaving partial activity, while base mutation in the middle of siRNA has a large influence and the mRNA cleavage activity by RNAi is considerably lowered. By utilizing this nature, mRNA having a mutation can be specifically degraded. Specifically, siRNA in which the mutation is provided in the middle thereof is synthesized and is introduced into a cell. Therefore, in the present invention, siRNA per se as well as an agent capable of producing siRNA (e.g., representatively dsRNA of about 40 or more base pairs) can be used as an agent capable of eliciting RNAi.


Also, though not wishing to be bound by any theory, apart from the above-described pathway, the antisense strand of siRNA binds to mRNA and siRNA functions as a primer for RNA-dependent RNA polymerase (RdRP), so that dsRNA is synthesized. This dsRNA is a substrate for a dicer again, leading to production of new siRNA. It is intended that such an action is amplified. Therefore, in the present invention, siRNA per se as well as an agent capable of producing siRNA, are useful. In fact, in insects and the like, for example, 35 dsRNA molecules can substantially completely degrade 1000 or more copies of intracellular mRNA, and therefore, it will be understood that siRNA per se, as well as an agent capable of producing siRNA, is useful.


In the present invention, double-stranded RNA having a length of about 20 bases (e.g., representatively about 21 to 23 bases) or less than about 20 bases, which is called siRNA, can be used. Expression of siRNA in cells can suppress expression of a pathogenic gene targeted by the siRNA. Therefore, siRNA can be used for treatment of diseases as a prophylaxis, prognosis, and the like.


The siRNA of the present invention may be in any form as long as it can elicit RNAi.


In another embodiment, an agent capable of causing RNAi may have a short hairpin structure having a sticky portion at the 3′ terminus (shRNA; short hairpin RNA). As used herein, the term “shRNA” refers to a molecule of about 20 or more base pairs in which a single-stranded RNA partially contains a palindromic base sequence and forms a double-strand structure therein (i.e., a hairpin structure). shRNA can be artificially synthesized chemically. Alternatively, shRNA can be produced by linking sense and antisense strands of a DNA sequence in reverse directions and synthesizing RNA in vitro with T7 RNA polymerase using the DNA as a template. Though not wishing to be bound by any theory, it should be understood that after shRNA is introduced into a cell, the shRNA is degraded in the cell into a length of about 20 bases (e.g., representatively 21, 22, 23 bases), and causes RNAi as with siRNA, leading to the treatment effect of the present invention. It should be understood that such an effect is exhibited in a wide range of organisms, such as insects, plants, animals (including mammals), and the like. Thus, shRNA elicits RNAi as with siRNA and therefore can be used as an effective component of the present invention. shRNA may preferably have a 3′ protruding end. The length of the double-stranded portion is not particularly limited, but is preferably about 10 or more nucleotides, and more preferably about 20 or more nucleotides. Here, the 3′ protruding end may be preferably DNA, more preferably DNA of at least 2 nucleotides in length, and even more preferably DNA of 2-4 nucleotides in length.
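
The linking of sense and antisense strands in reverse directions described above can be sketched as the construction of a template insert of the form sense, loop, reverse complement of sense. This is an illustrative sketch only: the loop sequence used as the default is an assumption of this example and is not taken from the present specification, the function names are hypothetical, and promoter sequences and the 3′ protruding end are deliberately omitted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shrna_template(sense, loop="TTCAAGAGA"):
    """Build a DNA insert whose transcript can fold back on itself.

    The default loop is an illustrative assumption, not a value from the
    specification.
    """
    sense = sense.upper()
    return sense + loop.upper() + reverse_complement(sense)


def stem_is_palindromic(template, stem_len):
    """True if the first and last stem_len bases can pair as a hairpin stem."""
    return reverse_complement(template[:stem_len]) == template[-stem_len:]
```

With a 21-nucleotide sense sequence and the assumed 9-nucleotide loop, the insert is 51 nucleotides long and `stem_is_palindromic(template, 21)` holds, which corresponds to the partially palindromic base sequence mentioned in the definition of shRNA.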


An agent capable of causing RNAi used in the present invention may be artificially synthesized (chemically or biochemically) or naturally occurring. There is substantially no difference therebetween in terms of the effect of the present invention. A chemically synthesized agent is preferably purified by liquid chromatography or the like.


An agent capable of causing RNAi used in the present invention can be produced in vitro. In this synthesis system, T7 RNA polymerase and T7 promoter are used to synthesize antisense and sense RNAs from template DNA. These RNAs are annealed and thereafter are introduced into a cell. In this case, RNAi is caused via the above-described mechanism, thereby achieving the effect of the present invention. Here, for example, the introduction of RNA into a cell can be carried out by a calcium phosphate method.


Another example of an agent capable of causing RNAi according to the present invention is a single-stranded nucleic acid hybridizable to mRNA or all nucleic acid analogs thereof. Such agents are useful for the method and composition of the present invention.


As used herein, “polynucleotides hybridizing under stringent conditions” refers to polynucleotides which hybridize under conditions commonly used and well known in the art. Such a polynucleotide can be obtained by conducting colony hybridization, plaque hybridization, Southern blot hybridization, or the like using a polynucleotide selected from the polynucleotides of the present invention. Specifically, a filter on which DNA derived from a colony or plaque is immobilized is used to conduct hybridization at 65° C. in the presence of 0.7 to 1.0 M NaCl. Thereafter, a 0.1 to 2-fold concentration SSC (saline-sodium citrate) solution (1-fold concentration SSC solution is composed of 150 mM sodium chloride and 15 mM sodium citrate) is used to wash the filter at 65° C. Polynucleotides identified by this method are referred to as “polynucleotides hybridizing under stringent conditions”. Hybridization can be conducted in accordance with a method described in, for example, Molecular Cloning 2nd ed., Current Protocols in Molecular Biology, Supplement 1-38, DNA Cloning 1: Core Techniques, A Practical Approach, Second Edition, Oxford University Press (1995), and the like. Here, sequences hybridizing under stringent conditions exclude, preferably, sequences containing only A or T. “Hybridizable polynucleotide” refers to a polynucleotide which can hybridize to other polynucleotides under the above-described hybridization conditions. Specifically, the hybridizable polynucleotide includes at least a polynucleotide having a homology of at least 60% to the base sequence of DNA encoding a polypeptide having an amino acid sequence specifically herein disclosed, preferably a polynucleotide having a homology of at least 80%, and more preferably a polynucleotide having a homology of at least 95%.
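
The composition of the 0.1 to 2-fold concentration SSC wash solutions mentioned above follows by simple scaling from the stated 1-fold composition (150 mM sodium chloride and 15 mM sodium citrate). A minimal sketch, in which the function and key names are hypothetical:

```python
# 1-fold SSC composition as stated in the text above.
SSC_1X_NACL_MM = 150.0
SSC_1X_CITRATE_MM = 15.0


def ssc_composition(fold):
    """Return the NaCl and sodium citrate concentrations (mM) of fold-x SSC."""
    return {
        # Rounded to suppress floating-point noise from the multiplication.
        "NaCl_mM": round(fold * SSC_1X_NACL_MM, 6),
        "sodium_citrate_mM": round(fold * SSC_1X_CITRATE_MM, 6),
    }
```

On this basis, a 0.1-fold wash contains 15 mM sodium chloride and 1.5 mM sodium citrate, and a 2-fold wash contains 300 mM sodium chloride and 30 mM sodium citrate.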


The term “highly stringent conditions” refers to those conditions that are designed to permit hybridization of DNA strands whose sequences are highly complementary, and to exclude hybridization of significantly mismatched DNAs. Hybridization stringency is principally determined by temperature, ionic strength, and the concentration of denaturing agents such as formamide. Examples of “highly stringent conditions” for hybridization and washing are 0.015 M sodium chloride, 0.0015 M sodium citrate at 65-68° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 50% formamide at 42° C. See Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory, N.Y., 1989); Anderson et al., Nucleic Acid Hybridization: A Practical Approach Ch. 4 (IRL Press Limited, Oxford UK). More stringent conditions (such as higher temperature, lower ionic strength, higher formamide, or other denaturing agents) may be optionally used. Other agents may be included in the hybridization and washing buffers for the purpose of reducing non-specific and/or background hybridization. Examples are 0.1% bovine serum albumin, 0.1% polyvinylpyrrolidone, 0.1% sodium pyrophosphate, 0.1% sodium dodecylsulfate (NaDodSO4 or SDS), Ficoll, Denhardt's solution, sonicated salmon sperm DNA (or another noncomplementary DNA), and dextran sulfate, although other suitable agents can also be used. The concentration and types of these additives can be changed without substantially affecting the stringency of the hybridization conditions. Hybridization experiments are ordinarily carried out at pH 6.8-7.4; however, at typical ionic strength conditions, the rate of hybridization is nearly independent of pH. See Anderson et al., Nucleic Acid Hybridization: A Practical Approach Ch. 4 (IRL Press Limited, Oxford UK).


Factors affecting the stability of DNA duplex include base composition, length, and degree of base pair mismatch. Hybridization conditions can be adjusted by those skilled in the art in order to accommodate these variables and allow DNAs of different sequence relatedness to form hybrids. The melting temperature of a perfectly matched DNA duplex can be estimated by the following equation:

Tm(° C.)=81.5+16.6(log [Na+])+0.41(% G+C)−600/N−0.72(% formamide)

where N is the length of the duplex formed, [Na+] is the molar concentration of the sodium ion in the hybridization or washing solution, % G+C is the percentage of (guanine+cytosine) bases in the hybrid. For imperfectly matched hybrids, the melting temperature is reduced by approximately 1° C. for each 1% mismatch.
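
The above equation can be evaluated as in the following sketch. The logarithm is read here as base 10, as is usual for this formula; the function and parameter names are hypothetical, and the 1° C. reduction per 1% mismatch is the approximation stated above rather than an exact rule.

```python
import math


def melting_temperature(na_molar, gc_percent, length, formamide_percent=0.0):
    """Estimate Tm (deg C) of a perfectly matched DNA duplex.

    na_molar: molar sodium ion concentration, [Na+]
    gc_percent: percentage of (guanine + cytosine) bases in the hybrid
    length: length N of the duplex formed
    formamide_percent: percentage of formamide in the solution
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 600.0 / length
            - 0.72 * formamide_percent)


def melting_temperature_with_mismatch(tm_perfect, mismatch_percent):
    """Lower Tm by approximately 1 deg C for each 1% mismatch."""
    return tm_perfect - 1.0 * mismatch_percent
```

For instance, a 600 base pair duplex with 50% G+C in 1 M sodium ion without formamide gives 81.5 + 0 + 20.5 - 1 = 101.0° C.; adding 50% formamide lowers the estimate by 36° C. to 65.0° C.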


The term “moderately stringent conditions” refers to conditions under which a DNA duplex with a greater degree of base pair mismatching than could occur under “highly stringent conditions” is able to form. Examples of typical “moderately stringent conditions” are 0.015 M sodium chloride, 0.0015 M sodium citrate at 50-65° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 20% formamide at 37-50° C. By way of example, “moderately stringent conditions” of 50° C. in 0.015 M sodium ion will allow about a 21% mismatch.


It will be appreciated by those skilled in the art that there may be no absolute distinction between “highly stringent conditions” and “moderately stringent conditions”. For example, at 0.015 M sodium ion (no formamide), the melting temperature of perfectly matched long DNA is about 71° C. With a wash at 65° C. (at the same ionic strength), this would allow for approximately a 6% mismatch. To capture more distantly related sequences, those skilled in the art can simply lower the temperature or raise the ionic strength.
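
The arithmetic behind these figures can be sketched as follows, using the approximation that the melting temperature falls by about 1° C. for each 1% mismatch. The function name and the parameter `degrees_per_percent` are hypothetical and are introduced only for this illustration.

```python
def tolerated_mismatch_percent(tm_perfect_c, wash_temp_c, degrees_per_percent=1.0):
    """Approximate percent mismatch still able to hybridize at wash_temp_c.

    tm_perfect_c: melting temperature of the perfectly matched duplex (deg C)
    wash_temp_c: hybridization or wash temperature (deg C)
    """
    return max(0.0, (tm_perfect_c - wash_temp_c) / degrees_per_percent)
```

With a melting temperature of about 71° C., a wash at 65° C. tolerates about 71 - 65 = 6% mismatch, while a wash at 50° C. tolerates about 71 - 50 = 21% mismatch, in agreement with the values given for “highly stringent” and “moderately stringent” conditions.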


A good estimate of the melting temperature in 1 M NaCl for oligonucleotide probes up to about 20 nucleotides is given by:

Tm=(2° C. per A-T base pair)+(4° C. per G-C base pair).

Note that the sodium ion concentration in 6× saline-sodium citrate (SSC) is 1 M. See Suggs et al., Developmental Biology Using Purified Genes 683 (Brown and Fox, eds., 1981).
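
The estimate for short oligonucleotide probes can be computed by counting bases, as in the sketch below. The function name is hypothetical, and the input is assumed to be a DNA sequence containing only A, C, G and T.

```python
def oligo_melting_temperature(oligo):
    """Estimate Tm (deg C) as 2 per A-T base pair plus 4 per G-C base pair.

    Intended for probes of up to about 20 nucleotides in 1 M NaCl.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc
```

A 20-nucleotide probe containing ten A or T bases and ten G or C bases thus gives an estimate of 2 × 10 + 4 × 10 = 60° C.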


A naturally-occurring nucleic acid encoding a protein (e.g., Pep5, p75, Rho GDI, MAG, p21, Rho, Rho kinase, or variants or fragments thereof, or the like) may be readily isolated from a cDNA library using PCR primers and hybridization probes containing part of a nucleic acid sequence indicated in the sequence listing. A preferable nucleic acid, or variants or fragments thereof, or the like is hybridizable to the whole or part of a sequence as set forth in SEQ ID NO: 1 or 1087 under low stringent conditions defined by hybridization buffer essentially containing 1% bovine serum albumin (BSA); 500 mM sodium phosphate (NaPO4); 1 mM EDTA; and 7% SDS at 42° C., and wash buffer essentially containing 2×SSC (600 mM NaCl; 60 mM sodium citrate); and 0.1% SDS at 50° C., more preferably under low stringent conditions defined by hybridization buffer essentially containing 1% bovine serum albumin (BSA); 500 mM sodium phosphate (NaPO4); 15% formamide; 1 mM EDTA; and 7% SDS at 50° C., and wash buffer essentially containing 1×SSC (300 mM NaCl; 30 mM sodium citrate); and 1% SDS at 50° C., and most preferably under low stringent conditions defined by hybridization buffer essentially containing 1% bovine serum albumin (BSA); 200 mM sodium phosphate (NaPO4); 15% formamide; 1 mM EDTA; and 7% SDS at 50° C., and wash buffer essentially containing 0.5×SSC (150 mM NaCl; 15 mM sodium citrate); and 0.1% SDS at 65° C.


As used herein, the term “probe” refers to a substance for use in searching, which is used in a biological experiment, such as in vitro and/or in vivo screening or the like, including, but not being limited to, for example, a nucleic acid molecule having a specific base sequence or a peptide containing a specific amino acid sequence.


Examples of a nucleic acid molecule as a usual probe include one having a nucleic acid sequence having a length of at least 8 contiguous nucleotides, which is homologous or complementary to the nucleic acid sequence of a gene of interest. Such a nucleic acid sequence may be preferably a nucleic acid sequence having a length of at least 9 contiguous nucleotides, more preferably a length of at least 10 contiguous nucleotides, and even more preferably a length of at least 11 contiguous nucleotides, a length of 12 contiguous nucleotides, a length of at least 13 contiguous nucleotides, a length of at least 14 contiguous nucleotides, a length of at least 15 contiguous nucleotides, a length of at least 20 contiguous nucleotides, a length of at least 25 contiguous nucleotides, a length of 30 contiguous nucleotides, a length of at least 40 contiguous nucleotides, or a length of at least 50 contiguous nucleotides. A nucleic acid sequence used as a probe includes a nucleic acid sequence having at least 70% homology to the above-described sequence, more preferably at least 80%, and even more preferably at least 90%, or at least 95%.


As used herein, the term “search” indicates that a given nucleic acid base sequence is utilized to find other nucleic acid base sequences having a specific function and/or property electronically or biologically, or other methods. Examples of electronic search include, but are not limited to, BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)), FASTA (Pearson & Lipman, Proc. Natl. Acad. Sci., USA 85:2444-2448 (1988)), Smith and Waterman method (Smith and Waterman, J. Mol. Biol. 147:195-197 (1981)), and Needleman and Wunsch method (Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)), and the like. Examples of biological search include, but are not limited to, a macroarray in which genomic DNA is attached to a nylon membrane or the like or a microarray (microassay) in which genomic DNA is attached to a glass plate under stringent hybridization, PCR and in situ hybridization, and the like. It is herein intended that the genes used in the present invention include corresponding genes identified by such an electronic or biological search.
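
As one illustration of an electronic search, the local alignment score of the Smith and Waterman method can be computed by dynamic programming. The sketch below is a simplified version under stated assumptions: the scoring values (match +2, mismatch -1, gap -2) and the linear gap penalty are chosen for this example only, and the function returns the best score without reconstructing the alignment.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    The scoring parameters are illustrative assumptions, not values taken
    from the specification or from any particular search tool.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

Two sequences sharing the stretch `ACGT` but otherwise unrelated, such as `"TTACGTTT"` and `"GGACGTGG"`, score 8 under these parameters, whereas sequences with no base in common score 0.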


As used herein, the term “primer” refers to a substance required for initiation of a reaction of a macromolecule compound to be synthesized in a macromolecule synthesis enzymatic reaction. In a reaction for synthesizing a nucleic acid molecule, a nucleic acid molecule (e.g., DNA, RNA, or the like) which is complementary to part of a macromolecule compound to be synthesized may be used.


A nucleic acid molecule which is ordinarily used as a primer includes one that has a nucleic acid sequence having a length of at least 8 contiguous nucleotides, which is complementary to the nucleic acid sequence of a gene of interest. Such a nucleic acid sequence preferably has a length of at least 9 contiguous nucleotides, more preferably a length of at least 10 contiguous nucleotides, even more preferably a length of at least 11 contiguous nucleotides, a length of at least 12 contiguous nucleotides, a length of at least 13 contiguous nucleotides, a length of at least 14 contiguous nucleotides, a length of at least 15 contiguous nucleotides, a length of at least 16 contiguous nucleotides, a length of at least 17 contiguous nucleotides, a length of at least 18 contiguous nucleotides, a length of at least 19 contiguous nucleotides, a length of at least 20 contiguous nucleotides, a length of at least 25 contiguous nucleotides, a length of at least 30 contiguous nucleotides, a length of at least 40 contiguous nucleotides, and a length of at least 50 contiguous nucleotides. A nucleic acid sequence used as a primer includes a nucleic acid sequence having at least 70% homology to the above-described sequence, more preferably at least 80%, even more preferably at least 90%, and at least 95%. An appropriate sequence as a primer may vary depending on the property of a sequence to be synthesized (amplified). Those skilled in the art can design an appropriate primer depending on a sequence of interest. Such a primer design is well known in the art and may be performed manually or using a computer program (e.g., LASERGENE, Primer Select, DNAStar).
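The relationship between a primer and the sequence to which it is complementary may be illustrated by the following minimal sketch. It is not the design procedure of any particular program; the function names `reverse_complement` and `best_binding_site`, the default minimum length of 8 nucleotides, and the simple ungapped window comparison are assumptions made for illustration only.

```python
# Illustrative sketch only: locate the site on a template strand to which a
# primer is complementary, and report the fraction of matching positions.
# All names here are hypothetical and are not taken from any primer-design tool.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def best_binding_site(primer: str, template: str, min_len: int = 8):
    """Return (position, identity) of the best ungapped annealing site.

    The primer anneals where the template reads as the reverse complement
    of the primer. Identity is the fraction of matching positions (0.0-1.0).
    """
    if len(primer) < min_len:
        raise ValueError(f"primer shorter than {min_len} nucleotides")
    target = reverse_complement(primer)
    template = template.upper()
    best_pos, best_identity = -1, 0.0
    for i in range(len(template) - len(target) + 1):
        window = template[i:i + len(target)]
        matches = sum(a == b for a, b in zip(target, window))
        identity = matches / len(target)
        if identity > best_identity:
            best_pos, best_identity = i, identity
    return best_pos, best_identity


if __name__ == "__main__":
    template = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    primer = reverse_complement(template[10:30])  # a 20-nucleotide primer
    print(best_binding_site(primer, template))
```

An identity of 0.7 or more in this sketch corresponds to the "at least 70% homology" threshold mentioned above, although a practical design would also consider melting temperature and secondary structure.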


As used herein, the term “epitope” refers to a basic structure constituting an antigenic determinant. Therefore, the term “epitope” includes a set of amino acid residues which is involved in recognition by a particular immunoglobulin, or in the context of T cells, those residues necessary for recognition by T cell receptor proteins and/or Major Histocompatibility Complex (MHC) receptors. This term is also used interchangeably with “antigenic determinant” or “antigenic determinant site”. In the field of immunology, in vivo or in vitro, an epitope is the features of a molecule (e.g., primary, secondary and tertiary peptide structure, and charge) that form a site recognized by an immunoglobulin, T cell receptor or HLA molecule. An epitope including a peptide comprises 3 or more amino acids in a spatial conformation which is unique to the epitope. Generally, an epitope consists of at least 5 such amino acids, and more ordinarily, consists of at least 6, 7, 8, 9 or 10 such amino acids. The greater the length of an epitope, the more the similarity of the epitope to the original peptide, i.e., longer epitopes are generally preferable. This is not necessarily the case when the conformation is taken into account. Methods of determining the spatial conformation of amino acids are known in the art, and include, for example, X-ray crystallography and 2-dimensional nuclear magnetic resonance spectroscopy. Furthermore, the identification of epitopes in a given protein is readily accomplished using techniques well known in the art. See, also, Geysen et al., Proc. Natl. Acad. Sci. USA (1984) 81: 3998 (general method of rapidly synthesizing peptides to determine the location of immunogenic epitopes in a given antigen); U.S. Pat. No. 4,708,871 (procedures for identifying and chemically synthesizing epitopes of antigens); and Geysen et al., Molecular Immunology (1986) 23: 709 (technique for identifying peptides with high affinity for a given antibody). Antibodies that recognize the same epitope can be identified in a simple immunoassay. Thus, methods for determining epitopes including a peptide are well known in the art. Such an epitope can be determined using a well-known, common technique by those skilled in the art if the primary nucleic acid or amino acid sequence of the epitope is provided.


Therefore, an epitope including a peptide requires a sequence having a length of at least 3 amino acids, preferably at least 4 amino acids, more preferably at least 5 amino acids, at least 6 amino acids, at least 7 amino acids, at least 8 amino acids, at least 9 amino acids, at least 10 amino acids, at least 15 amino acids, at least 20 amino acids, and 25 amino acids. Epitopes may be linear or conformational.


As used herein, “homology” of a gene (e.g., a nucleic acid sequence, an amino acid sequence, or the like) refers to the proportion of identity between two or more gene sequences. As used herein, the identity of a sequence (a nucleic acid sequence, an amino acid sequence, or the like) refers to the proportion of the identical sequence (an individual nucleic acid, amino acid, or the like) between two or more comparable sequences. Therefore, the greater the homology between two given genes, the greater the identity or similarity between their sequences. Whether or not two genes have homology is determined by comparing their sequences directly or by a hybridization method under stringent conditions. When two gene sequences are directly compared with each other, these genes have homology if the DNA sequences of the genes have representatively at least 50% identity, preferably at least 70% identity, more preferably at least 80%, 90%, 95%, 96%, 97%, 98%, or 99% identity with each other.


The similarity, identity and homology of base sequences are herein compared using BLAST (sequence analyzing tool) with the default parameters. The similarity, identity and homology of amino acid sequences are herein compared using BLASTX (sequence analyzing tool) with the default parameters.


Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.


As used herein, the “percentage of (amino acid, nucleotide, or the like) sequence identity, homology or similarity” is determined by comparing two optimally aligned sequences over a window of comparison, wherein the portion of a polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e. gaps), as compared to the reference sequences (which does not comprise additions or deletions (if the other sequence includes an addition, a gap may occur)) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid bases or amino acid residues occur in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the reference sequence (i.e. the window size) and multiplying the results by 100 to yield the percentage of sequence identity. When used in a search, homology is evaluated by an appropriate technique selected from various sequence comparison algorithms and programs well known in the art. Examples of such algorithms and programs include, but are not limited to, TBLASTN, BLASTP, FASTA, TFASTA and CLUSTALW (Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-2448, Altschul et al., 1990, J. Mol. Biol. 215(3) :403-410, Thompson et al., 1994, Nucleic Acids Res. 22(2):4673-4680, Higgins et al., 1996, Methods Enzymol. 266:383-402, Altschul et al., 1990, J. Mol. Biol. 215(3):403-410, Altschul et al., 1993, Nature Genetics 3:266-272). In a particularly preferable embodiment, the homology of a protein or nucleic acid sequence is evaluated using a Basic Local Alignment Search Tool (BLAST) well known in the art (e.g., see Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2267-2268, Altschul et al., 1990, J. Mol. Biol. 215:403-410, Altschul et al., 1993, Nature Genetics 3:266-272, Altschul et al., 1997, Nuc. Acids Res. 25:3389-3402). 
Particularly, 5 specialized-BLAST programs may be used to perform the following tasks to achieve comparison or search:

  • (1) comparison of an amino acid query sequence with a protein sequence database using BLASTP and BLAST3;
  • (2) comparison of a nucleotide query sequence with a nucleotide sequence database using BLASTN;
  • (3) comparison of a conceptually translated product in which a nucleotide query sequence (both strands) is converted over 6 reading frames with a protein sequence database using BLASTX;
  • (4) comparison of a protein query sequence with a nucleotide sequence database translated over all 6 reading frames (both strands) using TBLASTN; and
  • (5) comparison of a nucleotide query sequence translated over 6 reading frames with a nucleotide sequence database translated over 6 reading frames using TBLASTX.
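The percentage calculation defined above, before the list of BLAST programs, may be sketched as follows for two sequences that have already been optimally aligned. This is a minimal illustration and not the computation performed by BLAST itself; the function name `percent_identity`, the use of "-" as the gap symbol, and the choice to count the window size as the number of non-gap positions in the reference sequence are assumptions.

```python
# Illustrative sketch only: percentage of sequence identity for two
# pre-aligned sequences of equal length, with "-" marking gaps.

def percent_identity(reference_aln: str, query_aln: str, gap: str = "-") -> float:
    """Matched positions divided by reference positions, multiplied by 100."""
    if len(reference_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    # A position matches when both sequences carry the same residue (not a gap).
    matches = sum(
        1 for r, q in zip(reference_aln, query_aln) if r == q and r != gap
    )
    # Assumption: the comparison window is the number of reference residues.
    window = sum(1 for r in reference_aln if r != gap)
    if window == 0:
        raise ValueError("reference sequence contains no residues")
    return 100.0 * matches / window


if __name__ == "__main__":
    # Seven of the eight reference positions are matched by the query.
    print(percent_identity("ACGTACGT", "ACGTAC-T"))
```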


The BLAST program identifies homologous sequences by specifying analogous segments called “high score segment pairs” between amino acid query sequences or nucleic acid query sequences and test sequences obtained from preferably a protein sequence database or a nucleic acid sequence database. A large number of the high score segment pairs are preferably identified (aligned) using a scoring matrix well known in the art. Preferably, the scoring matrix is the BLOSUM62 matrix (Gonnet et al., 1992, Science 256:1443-1445, Henikoff and Henikoff, 1993, Proteins 17:49-61). The PAM or PAM250 matrix may be used, although they are not as preferable as the BLOSUM62 matrix (e.g., see Schwartz and Dayhoff, eds., 1978, Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure, Washington: National Biomedical Research Foundation). The BLAST program evaluates the statistical significance of all identified high score segment pairs and preferably selects segments which satisfy a threshold level of significance independently defined by a user, such as a user set homology. Preferably, the statistical significance of high score segment pairs is evaluated using Karlin's formula (see Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2267-2268).


As used herein, a sequence being “homologous” means that its homology is high enough for homologous recombination to occur. Accordingly, those skilled in the art can determine whether a sequence is “homologous” by introducing a DNA capable of complementing a mutation in a chromosome, and causing in vivo gene recombination. Such a homologous state can be confirmed by detecting incorporation of the complementing DNA through its phenotype (for example, if a green fluorescent protein is used, green fluorescence is detected). Accordingly, for a sequence to be homologous, the homology between the two sequences may be typically at least about 70%, preferably at least about 80%, more preferably at least about 90%, still more preferably at least about 95%, and most preferably at least about 99%.


As used herein, the term “region” of a sequence refers to a portion of the sequence having a certain length. Such a region usually has a function. When used for the targeted disruption of the present invention, the “region” of a sequence is at least about 10 nucleotides in length, preferably at least about 15 nucleotides in length, more preferably at least about 20 nucleotides in length, still more preferably at least about 30 nucleotides in length, and yet more preferably at least about 50 nucleotides in length. Preferably, such a region may include a portion responsible for genetic function. In a preferable embodiment, the “region” of a sequence may be one or more genes.


As used herein, the term “targeting”, when used in the targeted disruption of a gene, refers to targeting a certain gene.


As used herein the term “biological activity” refers to an activity which an agent (for example, a polypeptide or protein) may have in the living body, and includes those attaining a variety of functions. For example, when an agent is an enzyme, the biological activity thereof includes the enzymatic activity thereof. In another example, when an agent is a ligand, the binding thereof to the receptor therefor is included. In the present invention, each gene product has the biological activities described in Table 2. Alternatively, the polypeptide of the present invention has an epitope activity.


As used herein, the term “marker gene” refers to a gene used as a label (or marker) in genetic analysis. Typically, marker genes are those having a clear variant phenotype and are easily detectable, rather than those having a function known in detail. In addition to genes for drug resistance, genes for biochemical properties (such as auxotrophy) are often used in microorganisms. Genes for morphological properties may also be used. Drug resistance genes include, but are not limited to, for example, kanamycin resistance gene, hygromycin resistance gene, ampicillin resistance gene, chloramphenicol resistance gene, streptomycin resistance gene, and the like.


As used herein, the term “vector” refers to one which can transfer a polynucleotide of interest into a cell of interest. Such a vector includes, but is not limited to, for example, one which allows autonomous replication in a host cell such as a prokaryotic cell, yeast cell, animal cell, plant cell, insect cell, animal individual or plant individual or the like, or one which can be incorporated into the chromosome, and comprises a promoter at an appropriate position for transcription of the polynucleotide of the present invention. Preferably, such a vector includes one which can autonomously replicate in Thermococcus kodakarensis KOD1.


As used herein, the term “expression vector” refers to a nucleic acid sequence which comprises a structural gene and a promoter regulating the expression thereof, and a number of regulatory elements operably linked in the host cell. Preferably, regulatory elements may comprise a terminator, a selective marker such as a drug resistance gene (for example, kanamycin resistance gene, hygromycin resistance gene and the like), and an enhancer. It is well known in the art that the types of expression vectors used in an organism (for example, a plant) and the regulatory elements used may vary depending on the host cell used. In a plant, plant expression vectors used in the present invention may further have a T-DNA region. The T-DNA region enhances the efficiency of introduction of a gene when, in particular, Agrobacterium is used to transform the plant.


As used herein, the term “recombinant vector” refers to a vector which can transfer a polynucleotide of interest into a cell of interest. Such a vector includes, but is not limited to, for example, one which allows autonomous replication in a host cell such as a prokaryotic cell, yeast cell, animal cell, plant cell, insect cell, animal individual or plant individual or the like, or one which can be incorporated into the chromosome, and comprises a promoter at an appropriate position for transcription of the polynucleotide of the present invention.


“Recombinant vectors” for prokaryotic cells include pBTrp2, pBTac1, pBTac2 (both available from Roche Molecular Biochemicals), pKK233-2(Pharmacia), pSE280 (Invitrogen), pGEMEX-1 (Promega), pQE-8 (QIAGEN), pKYP10 (Japanese Laid-Open Publication No.: 58-110600), pKYP200 (Agric. Biol. Chem., 48,669(1984)), pLSA1 (Agric. Biol. Chem., 53, 277 (1989)), pGEL1 (Proc. Natl. Acad. Sci. USA, 82, 4306 (1985)), pBluescript II SK+(Stratagene), pBluescript II SK(−) (Stratagene), pTrs30 (FERM BP-5407), pTrs32 (FERM BP-5408), pGHA2 (FERM BP-400), pGKA2 (FERM B-6798), pTerm2(Japanese Laid-Open Publication No.: 3-22979, U.S. Pat. No. 4,686,191, U.S. Pat. No. 4,939,094, U.S. Pat. No. 5,160,735), pEG400 (J. Bacteriol., 172, 2392 (1990)), pGEX (Pharmacia), pET systems (Novagen), psupex, pUB110, pTP5, pC194, pTrxFus (Invitrogen), pMAL-c2 (New England Biolabs), pUC19 (Gene, 33, 103 (1985)), pSTV28 (TaKaRa), pUC118 (TaKaRa), pPA1 (Japanese Laid-Open Publication No.: 63-233798), and the like.


As used herein, the term “promoter” refers to a base sequence which determines the initiation site of transcription of a gene and is a DNA region which directly regulates the frequency of transcription. Transcription is started by RNA polymerase binding to a promoter. A promoter region is usually located within about 2 kbp upstream of the first exon of a putative protein coding region. Therefore, it is possible to estimate a promoter region by predicting a protein coding region in a genomic base sequence using DNA analysis software. A putative promoter region is usually located upstream of a structural gene, but depending on the structural gene, a putative promoter region may be located downstream of a structural gene. Preferably, a putative promoter region is located within about 2 kbp upstream of the translation initiation site of the first exon, but such a putative promoter region is not limited to this and may be located in an intron or downstream of 3′ terminus.


As used herein, the term “terminator” refers to a sequence which is located downstream of a protein-encoding region of a gene and which is involved in the termination of transcription when DNA is transcribed into mRNA, and the addition of a poly-A sequence.


When using the present invention, any method for introducing a nucleic acid into a cell may be used as a method for introducing a vector, including, for example, transfection, transduction, and transformation (the calcium chloride method, the electroporation method (Japanese Laid-Open Publication 60-251887), the particle gun (gene gun) method (Japanese Patent Nos. 2606856 and 2517813), and the like).


As used herein, the term “transformant” refers to the whole or a part of an organism, such as a cell, which is produced by transformation. Examples of a transformant include prokaryotic cells, yeast cells, animal cells, plant cells, insect cells and the like. Transformants may be referred to as transformed cells, transformed tissue, transformed hosts, or the like, depending on the subject. As used herein, all of the forms are encompassed; however, a particular form may be specified in a particular context.


As used herein the term “homologous recombination” refers to a recombination in the portion having a homologous base sequence in a pair of double stranded DNA. In a living organism, such homologous recombinations are observed in a form of chromosomal crossover and the like.


As used herein, the phrase “conditions under which homologous recombination occurs” refers to conditions under which homologous recombination occurs when an organism having a genome and a nucleic acid molecule having a sequence homologous to at least any one region of the genomic sequence thereof are present. Such conditions may differ depending on the organism, and are well known to those skilled in the art. Such conditions include, but are not limited to, for example:

  • Culture Tk-pyrF deleted strains No. 25 and No. 27 in 20 ml of ASW-YT liquid medium.
  • Collect the cells from 3 ml of the culture medium per sample (five samples each for No. 25 and No. 27).
  • Suspend the cells in 200 μl of 0.8×ASW + 80 mM CaCl2, and let stand on ice for 30 minutes.
  • Mix in 3 μg of pUC118/DS and 3 μg of pUC118/DD, and let stand on ice for 1 hour (two samples for each; a sample to which an equivalent volume of TE buffer was added was used as a control).
  • Heat shock at 85° C. for 45 seconds.
  • Let stand on ice for 10 minutes.
  • Preculture in Ura-ASW-AA liquid medium (proliferation occurs based on the incorporated uracil).
  • Culture in Ura-ASW-AA liquid medium (enrichment for the PyrF+ strain).
  • Culture on Ura-ASW-AA solid medium.


The present invention is not limited to the above conditions. As used herein, the composition of 1× artificial sea water (ASW) is as follows (per liter): NaCl 20 g; MgCl2.6H2O 3 g; MgSO4.7H2O 6 g; (NH4)2SO4 1 g; NaHCO3 0.2 g; CaCl2.2H2O 0.3 g; KCl 0.5 g; NaBr 0.05 g; SrCl2.6H2O 0.02 g; and Fe(NH4) citric acid 0.01 g.
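The amounts of each ASW component needed for a given strength and volume (for example, the 0.8×ASW used in the suspension step) follow by simple proportion from the 1× composition. The following minimal sketch shows that arithmetic; the names `ASW_1X_G_PER_L` and `asw_recipe` are hypothetical, and the additional 80 mM CaCl2 of the suspension buffer is deliberately not included.

```python
# Illustrative sketch only: scale the 1x ASW composition (grams per liter)
# to a chosen strength and volume. The additional 80 mM CaCl2 used in the
# transformation buffer is NOT part of this calculation.

ASW_1X_G_PER_L = {
    "NaCl": 20.0,
    "MgCl2.6H2O": 3.0,
    "MgSO4.7H2O": 6.0,
    "(NH4)2SO4": 1.0,
    "NaHCO3": 0.2,
    "CaCl2.2H2O": 0.3,
    "KCl": 0.5,
    "NaBr": 0.05,
    "SrCl2.6H2O": 0.02,
    "Fe(NH4) citric acid": 0.01,
}


def asw_recipe(strength: float, volume_l: float) -> dict:
    """Return grams of each component for `volume_l` liters of `strength`x ASW."""
    if strength <= 0 or volume_l <= 0:
        raise ValueError("strength and volume must be positive")
    return {
        name: grams * strength * volume_l
        for name, grams in ASW_1X_G_PER_L.items()
    }


if __name__ == "__main__":
    # Example: 0.5 liter of 0.8x ASW.
    for name, grams in asw_recipe(0.8, 0.5).items():
        print(f"{name}: {grams:.4f} g")
```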


Homologous recombination may occur when there is at least one homologous region between a genome and a vector, and preferably, when there are two homologous regions between the genome and the vector.


As used herein, the term “cross-over” or “crossover”, when used for a chromosome, refers to the crossing of a pair of homologous chromosomes, resulting in a new combination of nucleic acid sequences.


As used herein, the term “single cross-over”, when used for a chromosome, means that there is one homologous region between the nucleic acid molecules that causes the cross-over, and that cross-over occurs only in that particular region, resulting in one nucleic acid sequence being incorporated into the other sequence.


As used herein, the term “double cross-over”, when used for a chromosome, means that there are two homologous regions between two nucleic acid molecules for cross-over, and that the nucleic acid sequences between the homologous regions are exchanged with each other.
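The difference between the two outcomes can be illustrated with sequences modeled as plain strings. The following is a minimal sketch and not a model of the recombination mechanism: the function names `single_crossover` and `double_crossover` are hypothetical, homologous regions are assumed to match exactly, and the vector in the single cross-over case is assumed to be circular.

```python
# Illustrative sketch only: outcomes of single and double cross-over,
# with DNA molecules modeled as strings and homology as exact matching.

def double_crossover(genome: str, left_arm: str, right_arm: str, marker: str) -> str:
    """Replace the genomic segment between two homologous regions with the marker."""
    i = genome.find(left_arm)
    if i < 0:
        raise ValueError("left homologous region not found in genome")
    j = genome.find(right_arm, i + len(left_arm))
    if j < 0:
        raise ValueError("right homologous region not found downstream of left")
    return genome[: i + len(left_arm)] + marker + genome[j:]


def single_crossover(genome: str, vector: str, homology: str) -> str:
    """Integrate a whole circular vector at one homologous region.

    The result carries the homologous region twice, flanking the rest
    of the vector.
    """
    g = genome.find(homology)
    v = vector.find(homology)
    if g < 0 or v < 0:
        raise ValueError("homologous region must occur in genome and vector")
    # Open the circular vector so that it starts at the homologous region.
    opened = vector[v:] + vector[:v]
    return genome[:g] + opened + homology + genome[g + len(homology):]


if __name__ == "__main__":
    left, right, marker = "ACGTACGGA", "GGCATCGA", "ATGAAATAA"
    genome = "CCC" + left + "TTTTTT" + right + "AAA"
    print(double_crossover(genome, left, right, marker))
    print(single_crossover(genome, "GATTACA" + left + marker, left))
```

In the double cross-over case the segment between the two homologous regions is lost from the genome, which is why two homologous regions flanking a marker gene are convenient for disrupting a selected region.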


As used herein, the term “expression” of a gene, a polynucleotide, a polypeptide, or the like, indicates that the gene or the like is affected by a predetermined action in vivo to be changed into another form. Preferably, the term “expression” indicates that genes, polynucleotides, or the like are transcribed and translated into polypeptides. In one embodiment of the present invention, genes may be transcribed into mRNA. More preferably, these polypeptides may have post-translational processing modifications.


As used herein, the term “expression product” of a gene refers to a substance resulting from expression of the gene, and includes mRNA which is a transcription product, a polypeptide which is a translation product, a polypeptide which is a post-translational product, and the like. Detection of such expression products may be directly or indirectly performed, and may be performed using a technology well known in the art (for example, Southern blotting, Northern blotting and the like). These technologies are described elsewhere herein, as well as in the references cited elsewhere herein.


Polypeptides used in the present invention may be produced by, for example, cultivating primary culture cells producing the peptides or cell lines thereof, followed by separation or purification of the peptides from culture supernatant. Alternatively, genetic manipulation techniques can be used to incorporate a gene encoding a polypeptide of interest into an appropriate expression vector, transform an expression host with the vector, and collect recombinant polypeptides from the culture supernatant of the transformed cells. The above-described host cell may be any host cell conventionally used in genetic manipulation techniques as long as it can express a polypeptide of interest while keeping the physiological activity of the peptide (e.g., E. coli, yeast, an animal cell, etc.). Conditions for culturing recombinant host cells may be appropriately selected depending on the type of host cell used. Any host cells which may be used in a recombinant DNA technology may be used as a host cell in the present invention, including bacterial cells, yeast cells, animal cells, plant cells, insect cells, and the like. Preferably, the host cell is a bacterial cell. Polypeptides derived from the thus-obtained cells may have at least one amino acid substitution, addition, and/or deletion or at least one sugar chain substitution, addition, and/or deletion as long as they have substantially the same function as that of naturally-occurring polypeptides. When an expression product is secreted extracellularly, for example, the supernatant is obtained by centrifuging or filtering a culture, and is either purified directly or first concentrated by precipitation or ultrafiltration and then purified. When an expression product is accumulated intracellularly, cells may be disrupted by a cell wall lysis enzyme, change in osmolarity, use of glass beads, homogenizer, or sonication or the like, to obtain cellular extract for purification. Purification may be performed by combining known methods in the art, such as ion exchange chromatography, gel filtration, affinity chromatography, electrophoresis and the like.


A given amino acid may be substituted with another amino acid in a protein structure, such as a cationic region or a substrate molecule binding site, without a clear reduction or loss of interactive binding ability. A given biological function of a protein is defined by the interactive ability or other property of the protein. Therefore, a particular amino acid substitution may be performed in an amino acid sequence, or at the DNA code sequence level, to produce a protein which maintains the original property after the substitution. Therefore, various modifications of peptides as disclosed herein and DNA encoding such peptides may be performed without clear losses of biological usefulness.


When the above-described modifications are designed, the hydrophobicity indices of amino acids may be taken into consideration. Hydrophobic amino acid indices play an important role in providing a protein with an interactive biological function, which is generally recognized in the art (Kyte, J. and Doolittle, R. F., J. Mol. Biol. 157(1):105-132, 1982). The hydrophobic property of an amino acid contributes to the secondary structure of a protein and then regulates interactions between the protein and other molecules (e.g., enzymes, substrates, receptors, DNA, antibodies, antigens, etc.). Each amino acid is given a hydrophobicity index based on the hydrophobicity and charge properties thereof as follows: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamic acid (−3.5); glutamine (−3.5); aspartic acid (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).


It is well known that if a given amino acid is substituted with another amino acid having a similar hydrophobicity index, the resultant protein may still have a biological function similar to that of the original protein (e.g., a protein having an equivalent enzymatic activity). For such an amino acid substitution, the hydrophobicity index is preferably within ±2, more preferably within ±1, and even more preferably within ±0.5. It is understood in the art that such an amino acid substitution based on hydrophobicity is efficient.
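The comparison of hydrophobicity indices described above may be sketched as follows, using the index values listed in the preceding paragraph. The names `HYDROPHOBICITY` and `substitution_tier` are hypothetical, and the small numerical tolerance is an assumption added only to avoid floating-point rounding at the thresholds.

```python
# Illustrative sketch only: classify an amino acid substitution by the
# difference between the hydrophobicity indices of the two residues.

HYDROPHOBICITY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}

THRESHOLDS = (0.5, 1.0, 2.0)  # even more preferable, more preferable, preferable
EPSILON = 1e-9                # tolerance for floating-point rounding


def substitution_tier(original: str, substitute: str):
    """Return the tightest threshold (0.5, 1.0 or 2.0) that the pair meets.

    Returns None when the indices differ by more than 2.
    """
    diff = abs(HYDROPHOBICITY[original] - HYDROPHOBICITY[substitute])
    for threshold in THRESHOLDS:
        if diff <= threshold + EPSILON:
            return threshold
    return None


if __name__ == "__main__":
    for pair in (("I", "V"), ("K", "R"), ("I", "C"), ("I", "R")):
        print(pair, substitution_tier(*pair))
```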


A hydrophilicity index is also useful for modification of an amino acid sequence of the present invention. As described in U.S. Pat. No. 4,554,101, amino acid residues are given the following hydrophilicity indices: arginine (+3.0); lysine (+3.0); aspartic acid (+3.0±1); glutamic acid (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); and tryptophan (−3.4). It is understood that an amino acid may be substituted with another amino acid which has a similar hydrophilicity index and can still provide a biological equivalent. For such an amino acid substitution, the hydrophilicity index is preferably within ±2, more preferably ±1, and even more preferably ±0.5.


The term “conservative substitution” as used herein refers to amino acid substitution in which a substituted amino acid and a substituting amino acid have similar hydrophilicity indices or/and hydrophobicity indices. For example, the conservative substitution is carried out between amino acids having a hydrophilicity or hydrophobicity index of within ±2, preferably within ±1, and more preferably within ±0.5. Examples of the conservative substitution include, but are not limited to, substitutions within each of the following residue pairs: arginine and lysine; glutamic acid and aspartic acid; serine and threonine; glutamine and asparagine; and valine, leucine, and isoleucine, which are well known to those skilled in the art.
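As a check on the examples given above, the following minimal sketch tests the listed residue groups against the hydrophilicity indices of U.S. Pat. No. 4,554,101 as quoted in this specification. The names `HYDROPHILICITY`, `EXAMPLE_GROUPS` and `is_conservative` are hypothetical, and the "±1" ranges attached to aspartic acid, glutamic acid and proline are ignored (only the nominal values are used), which is a simplifying assumption.

```python
# Illustrative sketch only: test whether a substitution is conservative in
# the sense of similar hydrophilicity indices (nominal values only).
from itertools import combinations

HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "T": -0.4, "P": -0.5,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}

# Residue groups given as examples of conservative substitution.
EXAMPLE_GROUPS = ("RK", "ED", "ST", "QN", "VLI")


def is_conservative(a: str, b: str, tolerance: float = 2.0) -> bool:
    """True when the hydrophilicity indices differ by no more than `tolerance`."""
    diff = abs(HYDROPHILICITY[a] - HYDROPHILICITY[b])
    return diff <= tolerance + 1e-9  # small allowance for rounding


if __name__ == "__main__":
    for group in EXAMPLE_GROUPS:
        for a, b in combinations(group, 2):
            print(a, b, is_conservative(a, b, tolerance=1.0))
```

Under these nominal values every example pair falls within ±1, while serine and threonine differ by 0.7 and so do not meet the tighter ±0.5 criterion.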


As used herein the term “silent substitution” refers to a substitution in which there are nucleotide sequence substitutions but no amino acid change is encoded by the substituted nucleotides. Such silent substitutions may be performed using genetic code degeneracy. Such degeneracy is well known in the art, and is also described in the references cited herein.


As used herein, the term “variant” refers to a substance, such as a polypeptide, polynucleotide, or the like, which differs partially from the original substance. Examples of such a variant include a substitution variant, an addition variant, a deletion variant, a truncated variant, an allelic variant, and the like. Examples of such a variant include, but are not limited to, a nucleotide or polypeptide having one or several substitutions, additions and/or deletions or a nucleotide or polypeptide having at least one substitution, addition and/or deletion. The term “allele” as used herein refers to a genetic variant located at a locus identical to a corresponding gene, where the two genes are distinguished from each other. Therefore, the term “allelic variant” as used herein refers to a variant which has an allelic relationship with a given gene. Such an allelic variant ordinarily has a sequence the same as or highly similar to that of the corresponding allele, and ordinarily has almost the same biological activity, though it rarely has different biological activity. The term “species homolog” or “homolog” as used herein refers to one that has an amino acid or nucleotide homology with a given gene in a given species (preferably at least 60% homology, more preferably at least 80%, at least 85%, at least 90%, and at least 95% homology). A method for obtaining such a species homolog is clearly understood from the description of the present specification. The term “orthologs” (also called orthologous genes) refers to genes in different species derived from a common ancestry (due to speciation). For example, in the case of the hemoglobin gene family having multigene structure, human and mouse α-hemoglobin genes are orthologs, while the human α-hemoglobin gene and the human β-hemoglobin gene are paralogs (genes arising from gene duplication). Orthologs are useful for estimation of molecular phylogenetic trees. Usually, orthologs in different species may have a function similar to that of the original species. Therefore, orthologs of the present invention may be useful in the present invention.


As used herein, the term “conservative (or conservatively modified) variant” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refer to those nucleic acids which encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For example, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations” which represent one species of conservatively modified variation. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. Those skilled in the art will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence. Preferably, such modification may be performed while avoiding substitution of cysteine which is an amino acid capable of largely affecting the higher-order structure of a polypeptide. Such a conservative modification or silent modification is also within the scope of the present invention.
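The silent variations described above can be enumerated from the standard genetic code. The following minimal sketch assumes the standard code (which may not apply to every organism or organelle) and writes codons in DNA letters, so that GCU appears as GCT; the names `CODON_TABLE`, `synonymous_codons` and `silent_variants` are hypothetical.

```python
# Illustrative sketch only: enumerate silent variations of a coding sequence
# under the standard genetic code, using DNA letters (T in place of U).
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list:
    """Return all codons encoding the same amino acid as `codon`, sorted."""
    codon = codon.upper().replace("U", "T")
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def silent_variants(coding_sequence: str) -> list:
    """Return every sequence that encodes the same peptide as the input."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    choices = [synonymous_codons(c) for c in codons]
    return ["".join(variant) for variant in product(*choices)]


if __name__ == "__main__":
    print(synonymous_codons("GCU"))      # the four alanine codons
    print(synonymous_codons("ATG"))      # methionine has a single codon
    print(silent_variants("ATGGCATGG"))  # Met-Ala-Trp: only alanine can vary
```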


The above-described nucleic acid can be obtained by a well-known PCR method, or may be chemically synthesized. These methods may be combined with, for example, site-specific mutagenesis, hybridization, or the like.


As used herein, the term “substitution, addition or deletion” for a polypeptide or a polynucleotide refers to the substitution, addition or deletion of an amino acid or its substitute, or a nucleotide or its substitute, with respect to the original polypeptide or polynucleotide, respectively. This is achieved by techniques well known in the art, including a site-directed mutagenesis technique and the like. A polypeptide or a polynucleotide may have any number (>0) of substitutions, additions, or deletions. The number can be as large as desired, as long as a variant having such a number of substitutions, additions or deletions maintains an intended function (e.g., the cancer marker, nervous disorder marker, etc.). For example, such a number may be one or several, and preferably within 20% or 10% of the full length, or no more than 100, no more than 50, no more than 25, or the like.


As used herein, the term “specifically expressed” in the case of genes indicates that a gene is expressed in a specific site or in a specific period of time at a level different from (preferably higher than) that in other sites or periods of time. The term “specifically expressed” includes both the case where a gene is expressed only in a given site (specific site) and the case where it is also expressed in other sites. Preferably, the term “specifically expressed” indicates that a gene is expressed only in a given site.


General molecular biological technologies which may be used in the present invention may be readily performed by those skilled in the art by referring to, for example, Ausubel F. A. et al., ed. (1988), Current Protocols in Molecular Biology, Wiley, New York, N.Y.; Sambrook J et al. (1987) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.


As used herein, the term “thermostable” refers to the property of being resistant to a temperature higher than the environmental temperature in which a usual organism survives, and includes resistance to temperatures higher than 37° C. More usually, thermostable refers to resistance to temperatures higher than 50° C. Thermostable, when used for a living organism, may refer to a property whereby the organism can survive at lower and higher temperatures. On the other hand, thermostable, when used for a polypeptide, refers to resistance to higher temperatures, for example, a temperature higher than 37° C. or a temperature higher than 50° C. Among these, the property of being resistant to temperatures higher than 90° C. is referred to as “hyperthermostable”.


As used herein, an organism which can survive at higher temperatures is often called a “thermophilic bacterium”. Thermophilic bacteria usually have optimum growth temperatures of 50-105° C. and do not grow at 30° C. or lower. Among them, those having an optimum temperature of 90° C. or higher are called “hyperthermophilic bacteria”.


As used herein, the terms “hyperthermophilic archaebacteria” and “hyperthermostable bacteria” are used interchangeably to refer to a microorganism growing at 90° C. or higher. Preferably, the hyperthermophilic archaebacterium is the Thermococcus kodakaraensis KOD1 strain, a thermostable DNA ligase-producing, thermostable thiol protease-producing bacterium isolated by the present inventors (Morikawa, M. et al., Appl. Environ. Microbiol. 60(12), 4559-4566(1994)). The KOD-1 strain was deposited in the International Patent Organism Depositary (Chuo No. 6, Higashi 1-Chome, 1-1, Tsukuba-shi, Ibaraki, 305-8566) under accession number FERM P-15007. The KOD-1 strain was originally classified as a Pyrococcus bacterium, as described in the above-mentioned reference. However, when we compared the 16S rRNA sequence with the data registered in GenBank R91.0 October, 1995+Daily Update, using DNASIS (Hitachi Software Engineering), it was revealed that the KOD-1 strain belongs to the genus Thermococcus rather than the genus Pyrococcus, and it is thus presently classified as Thermococcus kodakaraensis KOD-1.


As used herein, culturing of hyperthermophilic archaebacteria producing hyperthermostable proteins may be performed under any culture conditions, for example, those described in Appl. Environ. Microbiol. 60(12), 4559-4566 (1994) (ibid.). Culture may be either static culture or jar fermentation culture under nitrogen gas, and may be performed in either a continuous or batch manner.


The chromosomal DNA of a hyperthermophilic archaebacterium may be obtained by solubilizing the cultured bacterial cells with a detergent (for example, N-lauryl sarcosine), and fractionating the resultant lysate by cesium chloride-ethidium bromide equilibrium density-gradient centrifugation (see, for example, Imanaka et al., J. Bacteriol. 147:776-786 (1981)). Libraries may be obtained by digesting the resultant chromosomal DNA with a variety of restriction enzymes, followed by ligating the same, with an enzyme such as T4 DNA ligase or the like, into a vector (such as a phage or plasmid) which has been digested with the same restriction enzyme or with a restriction enzyme that produces the same digestion terminus.


Libraries may be screened by selecting therefrom a clone comprising a DNA encoding a thermophilic DNA ligase of interest. Selection may be performed using, as a probe, an oligonucleotide designed based on a partial amino acid sequence of the predetermined hyperthermophilic DNA ligase and a cloned DNA deduced to have homology with the DNA of interest. Alternatively, selection may be performed by expressing the enzyme of interest. When the activity of the enzyme of interest can be readily detected, expression may be detected, for example, by detecting the activity of the expression product against a substrate added to the plate; alternatively, when an antibody against the enzyme of interest is available, expression may be detected using the reactivity between the expression product and the antibody.


Analysis of the resultant cloned DNA may be performed by, for example, isolating a selected DNA, producing a restriction map therefor, determining the nucleotide sequence, and the like. Technologies such as preparation of a cloned DNA, restriction enzyme processing, subcloning, nucleotide sequencing and the like are well known in the art, and may be performed by referring to “Molecular Cloning: A Laboratory Manual, Second Edition” (Sambrook, Fritsch and Maniatis ed., Cold Spring Harbor Laboratory Press, 1989). Next, the resultant cloned DNA may be expressed by operably inserting the same into an expression vector applicable to the host cell used, transforming a host cell with the expression vector, and culturing the transformed host cell.


(Biomolecule Chip)


The genomic information of the present invention may be used for providing a biomolecule chip (for example, DNA chip, protein chip, glycoprotein chip, antibody chip and the like).


The analysis of expression control of the genes of the present invention may be performed by a genetic analysis method using a DNA array. The present invention also provides a virtual genome DNA array (also called a “hyperthermophilic genomic array”) using the genomic sequence which was first identified in the present invention.


The nucleotides of the present invention may be used in a gene analysis method using a DNA array. DNA arrays have been widely reviewed (Shujunsha Ed., Saibo-kogaku (Cellular Engineering), Special issue, “DNA-maikuro-arei-to-saisin-PCR-ho” [DNA Microarray and Up-to-date PCR Method]). Further, plant analysis using a DNA array has recently been performed (Schenk P M et al. (2000) Proc. Natl. Acad. Sci. (USA) 97: 11655-11660). Hereinafter, a DNA array and a gene analysis method using the same will be briefly described.


“DNA array” refers to a device in which DNAs are arrayed and immobilized on a plate. DNA arrays are divided into DNA macroarrays, DNA microarrays, and the like according to the size of a plate or the density of DNA placed on the plate. However, these terms are not used strictly herein.


The border between macro and micro is not strictly determined. However, generally, “DNA macroarray” refers to a high density filter in which DNA is spotted on a membrane, while “DNA microarray” refers to a plate of glass, silicon, and the like which carries DNA on a surface thereof. There are a cDNA array, an oligo DNA array, and the like according to the type of DNA placed.


A certain high density oligo DNA array, in which a photolithography technique for production of semiconductor integrated circuits is applied and a plurality of oligo DNAs are simultaneously synthesized on a plate, is particularly called a “DNA chip”, an adaptation of the term “semiconductor chip”. Examples of the DNA chip prepared by this method include GeneChip® (Affymetrix, Calif.), and the like (Marshall A et al., (1998) Nat. Biotechnol. 16: 27-31 and Ramsay G et al., (1998) Nat. Biotechnol. 16: 40-44). Preferably, GeneChip® may be used in gene analysis using a microarray according to the present invention. The DNA chip is defined as described above in a narrow sense, but the term may refer to all types of DNA arrays or DNA microarrays.


Thus, a DNA microarray is a device in which several thousand to several tens of thousands or more of gene DNAs are arrayed on a glass plate in high density. Therefore, it is possible to analyze gene expression profiles or gene polymorphism at a genomic scale by hybridization of cDNA, cRNA or genomic DNA. With this technique, it has been made possible to analyze a signal transfer system and/or a transcription control pathway (Fambrough D et al. (1999), Cell 97, 727-741); the mechanism of tissue repair (Iyer V R et al., (1999), Science 283: 83-87); the action mechanism of medicaments (Marton M J, (1999), Nat. Med. 4: 1293-1301); and fluctuations in gene expression during development and differentiation processes on a wide scale, and the like; to identify a gene group whose expression fluctuates according to pathologic conditions; to find a novel gene involved in a signal transfer system or transcription control; and the like. Further, as to gene polymorphism, it has been made possible to analyze a number of SNPs with a single DNA microarray (Cargill M et al., (1999), Nat. Genet. 22:231-238).


The principle of an assay using a DNA microarray will be described. DNA microarrays are prepared by immobilizing a number of different DNA probes in high density on a solid-phase plate, such as a slide glass, whose surface is appropriately processed. Thereafter, labeled nucleic acids (targets) are subjected to hybridization under appropriate hybridization conditions, and a signal from each probe is detected by an automated detector. The resultant data is subjected to large-scale analysis by a computer. For example, in the case of gene monitoring, target cDNAs into which fluorescent labels have been incorporated by reverse transcription from mRNA are allowed to hybridize to oligo DNAs or cDNAs serving as probes on a microarray, and are detected with a fluorescence image analyzer. In this case, various signal amplification reactions may also be carried out, such as cRNA synthesis reactions using T7 polymerase or amplification via enzymatic reactions.


Fodor et al. have developed a technique for synthesizing polymers on a plate using a combination of combinatorial chemistry and photolithography for semiconductor production (Fodor S P et al., (1991) Science 251: 767-773). The product is called a synthesized DNA chip. Photolithography allows for extremely minute surface processing, thereby making it possible to produce a DNA microarray having a packing density of as high as 10 μm²/DNA sample. In this method, generally, oligo DNAs of about 25 to about 30 bases are synthesized on a glass plate.


Gene expression analysis using a synthesized DNA chip was reported by Lockhart et al. (Lockhart D J et al. (1996) Nat. Biotechnol. 14: 1675-1680). This method overcomes a drawback of chips of this type, namely that the specificity is low since the length of the synthesized DNA is short. This problem was solved by preparing, for the purpose of monitoring the expression of one gene, perfect match (PM) oligonucleotide probes corresponding to from about 10 to about 20 regions and mismatch (MM) oligonucleotide probes having a one-base mutation in the middle of the PM probes. Here, the MM probes are used as an indicator of the specificity of hybridization. Based on the signal ratio between the PM probe and the MM probe, the level of gene expression may be determined. When the signal ratio between the PM probe and the MM probe is substantially 1:1, the result is called cross hybridization, which is not interpreted as a significant signal.
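The PM/MM scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the mismatch is introduced by replacing the middle base with its complement (the text only requires a one-base mutation in the middle), and the tolerance used to decide that a ratio is “substantially 1:1” is an arbitrary value. The names `mismatch_probe` and `is_cross_hybridization` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_probe(pm_probe):
    """Return an MM probe: the PM probe with its middle base changed.

    Assumption: the middle base is replaced by its complement.
    """
    middle = len(pm_probe) // 2
    return pm_probe[:middle] + COMPLEMENT[pm_probe[middle]] + pm_probe[middle + 1:]


def is_cross_hybridization(pm_signal, mm_signal, tolerance=0.2):
    """Return True when the PM:MM signal ratio is substantially 1:1.

    Assumption: "substantially 1:1" means the ratio is within `tolerance` of 1.
    """
    if mm_signal <= 0:
        return False
    return abs(pm_signal / mm_signal - 1.0) <= tolerance


pm = "ACGTACGTA"
print(pm, mismatch_probe(pm))
print(is_cross_hybridization(100.0, 95.0))   # ratio close to 1:1, not significant
print(is_cross_hybridization(500.0, 100.0))  # PM clearly exceeds MM
```

Only probe pairs for which `is_cross_hybridization` returns False would be carried forward when estimating the expression level of the gene.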


A so-called attached DNA microarray is prepared by attaching DNAs onto a slide glass, and fluorescence is detected (see also http://cmgm.stanford.edu/pbrown). In this method, no gigantic semiconductor production machine is required, and only a DNA array machine and a detector are used to perform the assay in a laboratory. This method has the advantage that it is possible to select DNAs to be attached. A high density array can be obtained by spotting spots having a diameter of 100 μm at intervals of 100 μm, for example. It is mathematically possible to spot 2500 DNAs per cm². Therefore, a usual slide glass (the effective area is about 4 cm²) can carry about 10,000 DNAs.
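The arithmetic behind these figures can be checked with a short sketch. It assumes that “intervals of 100 μm” means the gap between the edges of neighboring spots, so that the center-to-center pitch is 200 μm; this reading reproduces the figure of 2500 spots per cm² given above. The function name `spots_per_cm2` is hypothetical.

```python
def spots_per_cm2(diameter_um, gap_um):
    """Number of spots that fit in 1 cm x 1 cm on a square grid.

    Assumption: `gap_um` is the edge-to-edge spacing, so the pitch
    (center-to-center distance) is diameter + gap.
    """
    pitch_um = diameter_um + gap_um
    spots_per_cm = 10_000 // pitch_um  # 1 cm = 10,000 micrometers
    return spots_per_cm ** 2


density = spots_per_cm2(100, 100)
effective_area_cm2 = 4
print(density)                       # spots per square centimeter
print(density * effective_area_cm2)  # spots on a usual slide glass
```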


As a labeling method for synthesized DNA arrays, for example, double fluorescence labeling is used. In this method, two different mRNA samples are labeled with different fluorescent dyes, respectively. The two samples are subjected to competitive hybridization on the same microarray, and both fluorescences are measured. A difference in gene expression is detected by comparing the fluorescences. Examples of the fluorescent dye include, but are not limited to, Cy5 and Cy3, which are most often used, and the like. The advantage of Cy3 and Cy5 is that the wavelengths of their fluorescences do not overlap substantially. Double fluorescence labeling may be used to detect mutations or polymorphisms in addition to differences in gene expression.
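The comparison of the two fluorescences can be sketched as follows. The use of a log2 ratio and of a twofold threshold are common conventions assumed here for illustration; neither is specified in the present description, and the function names are hypothetical.

```python
import math


def log2_ratio(cy5_signal, cy3_signal):
    """Log2 ratio of the Cy5 signal to the Cy3 signal for one spot."""
    return math.log2(cy5_signal / cy3_signal)


def call_expression(cy5_signal, cy3_signal, threshold=1.0):
    """Classify a spot by comparing the two fluorescences.

    Assumption: a change is called when the log2 ratio reaches
    `threshold` in either direction (1.0 corresponds to twofold).
    """
    ratio = log2_ratio(cy5_signal, cy3_signal)
    if ratio >= threshold:
        return "higher in Cy5-labeled sample"
    if ratio <= -threshold:
        return "higher in Cy3-labeled sample"
    return "unchanged"


for cy5, cy3 in [(400.0, 100.0), (100.0, 100.0), (100.0, 400.0)]:
    print(cy5, cy3, call_expression(cy5, cy3))
```

In practice the two channels would also need background correction and normalization before the ratio is taken; those steps are omitted from this sketch.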


An array machine may be used for assays using a DNA array. In the array machine, basically, a pin tip or a slide holder is moved in directions along the X, Y and Z axes in combination with a high-performance servo motor under the control of a computer so that DNA samples are transferred from a microtiter plate to the surface of a slide glass. The pin tip is processed into various shapes. For example, a DNA solution is retained in a cloven pin tip like a crow's bill and spotted onto a plurality of slide glasses. After washing and drying cycles, the next DNA sample is then placed on the slide glasses. The above-described steps are repeated. In this case, in order to prevent contamination of the pin tip by a different sample, the pin tip has to be perfectly washed and dried. Examples of such an array machine include SPBIO2000 (Hitachi Software Engineering Co., Ltd.; single strike type), GMS417 Arrayer (Takara Shuzo Co., Ltd.; pin ring type), Gene Tip Stamping (Nippon Laser & Electronics Lab.; fountain pen type), and the like.


There are various DNA immobilizing methods for use in assays using a DNA array. Glass as a material for a plate has a smaller effective area for immobilization and a smaller amount of electrical charge than membranes, and therefore is given various coatings such as poly L-lysine coating (Reference 55), silane finishing (Reference 56), or the like. Further, a commercially available precoated slide glass exclusively for DNA microarrays (e.g., polycarbodiimide glass (Nissin Spinning Co., Ltd.) and the like) may also be used. In the case of oligo DNA, a method of aminating a terminal of the DNA and crosslinking the DNA to silane-finished glass is available.


DNA microarrays mainly carry cDNA fragments amplified by PCR. When the concentration of cDNA is insufficient, signals cannot be sufficiently detected in some cases. When a sufficient amount of cDNA fragments is not obtained by one PCR operation, PCR is repeated. The resultant overall PCR products may be purified and concentrated at one time. The probe cDNAs may generally be a number of random cDNAs, but may also be a group of selected genes (e.g., the gene or promoter groups of the present invention) or candidate genes for gene expression changes obtained by RDA (representational difference analysis), according to the purpose of an experiment. It is preferable to avoid overlapping clones. Clones may be prepared from a stock cDNA library, or cDNA clones may be purchased.


In assays using a DNA array, a fluorescent signal indicating hybridization on the DNA microarray is detected by a fluorescence detector or the like. There are various conventionally available detectors for this purpose. For example, a research group at Stanford University has developed an original scanner which is a combination of a fluorescence microscope and a movable stage (see http://cmgm.stanford.edu/pbrown). A conventional fluorescence image analyzer for gels, such as FMBIO (Hitachi Software Engineering), Storm (Molecular Dynamics), and the like, can read a DNA microarray if the spots are not arrayed in very high density. Examples of other available detectors include ScanArray 4000 and 5000 (General Scanning; scan type (confocal type)), GMS418 Array Scanner (Takara Shuzo; scan type (confocal type)), Gene Tip Scanner (Nippon Laser & Electronics Lab.; scan type (non-confocal type)), Gene Tac 2000 (Genomic Solutions; CCD camera type), and the like.


The amount of data obtained from DNA microarrays is huge. Software for managing correspondences between clones and spots, analyzing data, and the like is important. Such software attached to each detection system is available (Ermolaeva O et al. (1998) Nat. Genet. 20:19-23). Further, an example of a database format is GATC (genetic analysis technology consortium) proposed by Affymetrix.


The present invention may also be used in gene analysis using a differential display technique.


The differential display technique is a method for detecting or identifying a gene whose expression fluctuates. In this method, cDNA is prepared from each of at least two samples, and amplified by PCR using a set of arbitrary primers. Thereafter, the plurality of generated PCR products are separated by gel electrophoresis. After the electrophoresis pattern is produced, expression-fluctuating genes are cloned based on a change in relative signal strength of each band between the samples.


The term “support” as used herein refers to a material for an array construction of the present invention. Examples of a material for the substrate include any solid material having a property of binding to a biomolecule used in the present invention either by a covalent bond or a noncovalent bond, or which can be derivatized in such a manner as to have such a property.


Such a material for the substrate may be any material capable of forming a solid surface, including, but not limited to, glass, silica, silicon, ceramics, silicon dioxide, plastics, metals (including alloys), and naturally-occurring and synthetic polymers (e.g., polystyrene, cellulose, chitosan, dextran, and nylon). The substrate may be formed of a plurality of layers made of different materials. For example, an inorganic insulating material, such as glass, silica glass, alumina, sapphire, forsterite, silicon carbide, silicon oxide, silicon nitride, or the like, can be used. Moreover, an organic material, such as polyethylene, ethylene, polypropylene, polyisobutylene, polyethylene terephthalate, unsaturated polyester, fluorine-containing resin, polyvinyl chloride, polyvinylidene chloride, polyvinyl acetate, polyvinyl alcohol, polyvinyl acetal, acrylic resin, polyacrylonitrile, polystyrene, acetal resin, polycarbonate, polyamide, phenol resin, urea resin, epoxy resin, melamine resin, styrene-acrylonitrile copolymer, acrylonitrile-butadiene-styrene copolymer, silicone resin, polyphenylene oxide, or polysulfone, can be used. In the present invention, a film used for nucleic acid blotting, such as a nitrocellulose film, a PVDF film, or the like, can also be used. When the material constituting the substrate is a solid phase, it is specifically referred to as a “solid (phase) substrate” as used herein. As used herein, such a substrate may be in the form of a plate, microwell plate, chip, glass slide, film, bead, metal (surface) and the like. Substrates may or may not be coated.


“Chip” as used herein refers to an ultramicro-integrated circuit having various functions, which constitutes a part of a system. “Biomolecule chip” as used herein refers to a chip comprising a substrate and a biomolecule, in which at least one biomolecule as set forth herein is disposed on the substrate.


The term “address” as used herein refers to a unique position on a substrate which can be distinguished from other unique positions. An address is suitably used to access a biomolecule associated with the address. Any entity present at each address can have an arbitrary shape which allows the entity to be distinguished from entities present at other addresses (e.g., in an optical manner). The shape of an address may be, for example, a circle, an ellipse, a square, or a rectangle, or alternatively an irregular shape.


The size of each address varies depending on, particularly, the size of a substrate, the number of addresses on the specific substrate, the amount of samples to be analyzed and/or an available reagent, the size of a biomolecule, and the magnitude of a resolution required for any method in which the array is used. The size of an address may range from 1-2 nm to several centimeters (e.g., 1-2 mm to several centimeters, etc., 125×80 mm, 10×10 mm, etc.). Any size of an address is possible as long as it matches the array to which it is applied. In such a case, a substrate material is formed into a size and a shape suitable for a specific production process and application of an array. For example, in the case of analysis where a large amount of samples to be measured are available, an array may be more economically constructed on a relatively large substrate (e.g., 1 cm×1 cm or more). Here, a detection system which does not require much sensitivity and is therefore economical may be further advantageously used. On the other hand, when the amount of an available sample to be analyzed and/or reagent is limited, an array may be designed so that consumption of the sample and reagent is minimized.


The spatial arrangement and forms of addresses are designed in such a manner as to match a specific application in which the microarray is used. Addresses may be densely loaded, widely distributed, or divided into subgroups in a pattern suitable for a specific type of sample to be analyzed. “Array” as used herein refers to a pattern of solid substances fixed on a solid phase surface or a film, or a group of molecules having such a pattern. Typically, an array comprises biomolecules (e.g., DNA, RNA, protein-RNA fusion molecules, proteins, low-weight organic molecules, etc.) conjugated to nucleic acid sequences fixed on a solid phase surface or a film as if the biomolecule captured the nucleic sequence. “Spots” of biomolecules may be arranged on an array. “Spot” as used herein refers to a predetermined set of biomolecules.


Any number of addresses may be arranged on a substrate, typically up to 10⁸ addresses, in other embodiments up to 10⁷ addresses, up to 10⁶ addresses, up to 10⁵ addresses, up to 10⁴ addresses, up to 10³ addresses, or up to 10² addresses. Therefore, when one biomolecule is placed on one address, up to 10⁸ biomolecules can be placed on a substrate, and in other embodiments up to 10⁷ biomolecules, up to 10⁶ biomolecules, up to 10⁵ biomolecules, up to 10⁴ biomolecules, up to 10³ biomolecules, or up to 10² biomolecules can be placed on a substrate. In these cases, a smaller size of substrate and a smaller size of address are suitable. In particular, the size of an address may be as small as the size of a single biomolecule (i.e., this size may be of the order of 1-2 nm). In some cases, the minimum area of a substrate is determined based on the number of addresses on the substrate.


The term “biomolecule” as used herein refers to a molecule related to an organism. An “organism (or “bio-”)” as used herein refers to a biological organic body, including, but not being limited to, an animal, a plant, a fungus, a virus, and the like. A biomolecule includes a molecule extracted from an organism, but is not so limited. A biomolecule is any molecule capable of having an influence on an organism. Therefore, a biomolecule also includes a molecule synthesized by combinatorial chemistry, and a low weight molecule capable of being used as a medicament (e.g., a low molecular weight ligand, etc.) as long as they are intended to have an influence on an organism. Examples of such a biomolecule include, but are not limited to, proteins, polypeptides, oligopeptides, peptides, polynucleotides, oligonucleotides, nucleotides, nucleic acids (e.g., including DNA (such as cDNA and genomic DNA) and RNA (such as mRNA)), polysaccharides, oligosaccharides, lipids, low weight molecules (e.g., hormones, ligands, signal transduction substances, low-weight organic molecules, etc.), and complex molecules thereof, and the like. A biomolecule also includes a cell itself, and a part or the whole of a tissue, and the like as long as they can be coupled to a substrate of the present invention. Preferably, a biomolecule includes a nucleic acid or a protein. In a preferable embodiment, a biomolecule is a nucleic acid (e.g., genomic DNA or cDNA, or DNA synthesized by PCR or the like). In another preferable embodiment, a biomolecule may be a protein. Preferably, one type of biomolecule may be provided for each address on a substrate of the present invention. In another embodiment, a sample containing two or more types of biomolecules may be provided for each address.


As used herein, the term “liquid phase” has the meaning usual in the art, and usually refers to a state in a solution.


As used herein, the term “solid phase” has the meaning usual in the art, and usually refers to a state in a solid. As used herein, liquid and solid are collectively referred to as “fluid”.


As used herein, the term “contact” refers to two matters (for example, a composition and a cell) being sufficiently close to each other to interact.


As used herein, the term “interaction”, when referring to two matters, means that the two matters exert a force on each other. Such interaction includes, but is not limited to, for example, covalent bonding, hydrogen bonding, van der Waals forces, ionic interaction, non-ionic interaction, hydrophobic interaction, electrostatic interaction and the like. Preferably, the interaction may be a normal interaction occurring in a living body, such as hydrogen bonding, hydrophobic interaction, and the like.


In one embodiment, the present invention may produce a microarray for screening for a molecule, by binding a library of biomolecules (for example, organic low-molecular weight molecules, combinatorial chemistry products) to a substrate, and using the same. A chemical library used in the present invention may be produced or obtained by any means including, but not limited to, for example, the use of combinatorial chemistry technology, fermentation technology, plant and cell extraction procedures and the like. Production of a combinatorial library is well known in the art. See, for example, E. R. Felder, Chimia 1994, 48, 512-541; Gallop et al., J. Med. Chem. 1994, 37, 1233-1251; R. A. Houghten, Trends Genet. 1993, 9, 235-239; Houghten et al., Nature 1991, 354, 84-86; Lam et al., Nature 1991, 354, 82-84; Carell et al., Chem. Biol. 1995, 3, 171-183; Madden et al., Perspectives in Drug Discovery and Design 2, 269-282; Cwirla et al., Biochemistry 1990, 87, 6378-6382; Brenner et al., Proc. Natl. Acad. Sci. USA 1992, 89, 5381-5383; Gordon et al., J. Med. Chem. 1994, 37, 1385-1401; Lebl et al., Biopolymers 1995, 37, 177-198; and references cited therein. These references are incorporated by reference in their entireties.


Methods, biomolecule chips and apparatuses of the present invention may be used for, for example, diagnosis, forensic medicine, drug discovery (screening for drugs) and development, molecular biological analysis (for example, nucleotide sequencing based array and gene sequence analysis based on array), analysis of protein properties and functions, pharmacogenomics, proteomics, environmental search, and additional biological and chemical analyses.


The present invention can also be applied to polymorphism analysis, such as RFLP analysis, SNP (snipp, single nucleotide polymorphism) analysis, or the like, analysis of base sequences, and the like. The present invention can also be used for screening of a medicament.


The present invention can be applied to any situation requiring a biomolecule test other than medical applications, such as food testing, quarantine, medicament testing, forensic medicine, agriculture, husbandry, fishery, forestry, and the like.


The present invention can also be used for detection of a gene amplified by PCR, SDA, NASBA, or the like, other than a sample directly collected from an organism. In the present invention, a target gene can be labeled in advance with an electrochemically active substance, a fluorescent substance (e.g., FITC, rhodamine, acridine, Texas Red, fluorescein, etc.), an enzyme (e.g., alkaline phosphatase, peroxidase, glucose oxidase, etc.), a colloid particle (e.g., a hapten, a light-emitting substance, an antibody, an antigen, gold colloid, etc.), a metal, a metal ion, a metal chelate (e.g., trisbipyridine, trisphenanthroline, hexamine, etc.), or the like.


In one embodiment, a nucleic acid component is extracted from these samples in order to test the nucleic acid. The extraction is not limited to a particular method. A liquid-liquid extraction method, such as a phenol-chloroform method and the like, or a liquid-solid extraction method using a carrier can be used. Alternatively, a commercially available nucleic acid extraction method such as QIAamp (QIAGEN, Germany) or the like can be used. Next, a sample containing an extracted nucleic acid component is subjected to a hybridization reaction on a biomolecule chip of the present invention. The reaction is conducted in a buffer solution having an ionic strength of 0.01 to 5 and a pH of 5 to 10. To this solution may be added dextran sulfate (hybridization accelerating agent), salmon sperm DNA, bovine thymus DNA, EDTA, a surfactant, or the like. The extracted nucleic acid component is added to the solution, followed by heat denaturation at 90° C. or more. Insertion of a biomolecule chip can be carried out immediately after denaturation or after rapid cooling to 0° C. Alternatively, a hybridization reaction can be conducted by dropping a solution on a substrate. The rate of a reaction can be increased by stirring or shaking during the reaction. The temperature of a reaction is in the range of 10° C. to 90° C. The time of a reaction is in the range of one minute to about one night. After a hybridization reaction, an electrode is removed and then washed. For washing, a buffer solution having an ionic strength of 0.01 to 5 and a pH of 5 to 10 can be used. “Label” as used herein refers to an entity which distinguishes an intended molecule or substance from other substances (e.g., a substance, energy, electromagnetic wave, etc.). Examples of such a labeling method include an RI (radioisotope) method, a fluorescence method, a biotin method, a chemiluminescence method, and the like. When both a nucleic acid fragment and its complementary oligonucleotide are labeled by a fluorescence method, they are labeled with fluorescence substances having different maximum wavelengths of fluorescence. The difference in the maximum wavelength of fluorescence is preferably at least 10 nm. Any fluorescence substance which can bind to a base portion of nucleic acid can be used. Preferable fluorescence substances include cyanine dye (e.g., Cy3, Cy5, etc. in Cy Dye™ series), a rhodamine 6G reagent, N-acetoxy-N2-acetylaminofluorene (AAF), AAIF (an iodine derivative of AAF), and the like. Examples of a combination of fluorescence substances having a difference in the maximum wavelength of fluorescence of at least 10 nm include a combination of Cy5 and a rhodamine 6G reagent, a combination of Cy3 and fluorescein, a combination of a rhodamine 6G reagent and fluorescein, and the like.
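The 10 nm criterion for choosing a pair of fluorescence substances can be expressed as a simple check. The emission maxima in the table below are approximate, rounded values assumed for illustration only; they are not taken from the present description, and actual values depend on the conjugate and the buffer. The names `EMISSION_MAX_NM` and `is_suitable_pair` are hypothetical.

```python
# Approximate emission maxima in nanometers (illustrative assumptions).
EMISSION_MAX_NM = {
    "Cy3": 570,
    "Cy5": 670,
    "rhodamine 6G": 555,
    "fluorescein": 520,
}


def is_suitable_pair(dye_a, dye_b, min_difference_nm=10):
    """Return True when the two emission maxima differ by at least the minimum."""
    difference = abs(EMISSION_MAX_NM[dye_a] - EMISSION_MAX_NM[dye_b])
    return difference >= min_difference_nm


# The three combinations named in the text.
for pair in [("Cy5", "rhodamine 6G"), ("Cy3", "fluorescein"), ("rhodamine 6G", "fluorescein")]:
    print(pair, is_suitable_pair(*pair))
```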


“Chip attribute data” as used herein refers to data associated with some information relating to a biomolecule chip of the present invention. Chip attribute data includes information associated with a biomolecule chip, such as a chip ID, substrate data, and biomolecule attribute data. “Chip ID” as used herein refers to a code for identification of each chip. “Substrate data” or “substrate attribute data” as used herein refers to data relating to a substrate used in a biomolecule chip of the present invention. Substrate data may contain information relating to an arrangement or pattern of a biomolecule. “Biomolecule attribute data” refers to information relating to a biomolecule, including, for example, the gene sequence of the biomolecule (a nucleotide sequence in the case of nucleic acid, and an amino acid sequence in the case of protein), information relating to a gene sequence (e.g., a relationship between the gene and a specific disease or condition), a function in the case of a low molecular weight molecule or a hormone, library information in the case of a combinatorial library, molecular information relating to affinity for a low molecular weight molecule, and the like.


“Personal information data” as used herein refers to data associated with information for identifying an organism or subject to be measured by a method, chip or apparatus of the present invention. When the organism or subject is a human, personal information data includes, but is not limited to, age, sex, health condition, medical history (e.g., drug history), educational background, insurance company, personal genome information, address, name, and the like. When the personal information data is for a domestic animal, the information may include data about the production company of the animal.


“Measurement data” as used herein refers to raw data as a result of measurement by a biomolecule substrate, apparatus and system of the present invention and specific processed data derived therefrom. 
Such raw data may be represented by the intensity of an electric signal. Such processed data may be specific biochemical data, such as a blood sugar level or a gene expression level.


“Recording region” as used herein refers to a region in which data may be recorded. In a recording region, measurement data as well as the above-described chip attribute data can be recorded.
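
The relationship between the data categories defined above can be sketched as a small data model. The following Python sketch is illustrative only; all class and field names are hypothetical and are not defined by this specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BiomoleculeAttribute:
    spot_id: str
    sequence: str          # nucleotide or amino acid sequence
    annotation: str = ""   # e.g. relationship to a disease or condition


@dataclass
class ChipAttributeData:
    chip_id: str                     # code identifying each chip
    substrate_data: Dict[str, str]   # e.g. arrangement or pattern of biomolecules
    biomolecules: List[BiomoleculeAttribute] = field(default_factory=list)


@dataclass
class MeasurementData:
    raw_signal: Dict[str, float]     # spot_id -> electric signal intensity
    processed: Dict[str, float] = field(default_factory=dict)  # e.g. expression level


@dataclass
class RecordingRegion:
    """Holds chip attribute data together with recorded measurement data."""
    chip: ChipAttributeData
    measurements: List[MeasurementData] = field(default_factory=list)

    def record(self, measurement: MeasurementData) -> None:
        self.measurements.append(measurement)


chip = ChipAttributeData(
    chip_id="CHIP-0001",
    substrate_data={"pattern": "grid"},
    biomolecules=[BiomoleculeAttribute("s1", "ACGT")],
)
region = RecordingRegion(chip)
region.record(MeasurementData(raw_signal={"s1": 0.82}))
print(region.chip.chip_id, len(region.measurements))
```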


Techniques as used herein are well known techniques commonly used in microfluidics, micromachining, organic chemistry, biochemistry, genetic engineering, molecular biology, genetics, and their related fields within the technical scope of the art, unless otherwise specified. These techniques are sufficiently described in, for example, literature listed below and described elsewhere herein.


Micromachining is described in, for example, Campbell, S. A. (1996). The Science and Engineering of Microelectronic Fabrication, Oxford University Press; Zaut, P. V. (1996). Microarray Fabrication: a Practical Guide to Semiconductor Processing, Semiconductor Services; Madou, M. J. (1997). Fundamentals of Microfabrication, CRC Press; Rai-Choudhury, P. (1997). Handbook of Microlithography, Micromachining, & Microfabrication: Microlithography; and the like, related portions of which are herein incorporated by reference.


Photolithography is a technique developed by Fodor et al., in which a photoreactive protecting group is utilized (see Science, 251, 767(1991)). A protecting group for a base inhibits a base monomer of the same or different type from binding to that base. Thus, a base terminus to which a protecting group is bound undergoes no further base-binding reaction. A protecting group can be easily removed by irradiation. Initially, amino groups having a protecting group are immobilized throughout a substrate. Thereafter, only spots to which a desired base is to be bound are selectively irradiated by a method similar to a photolithography technique usually used in a semiconductor process, so that another base can be introduced by subsequent binding only at the irradiated spots. Desired bases having the same protecting group at a terminus thereof are then bound to these spots. Thereafter, the pattern of a photomask is changed, and other spots are selectively irradiated. Thereafter, bases having a protecting group are similarly bound to the spots. This process is repeated until a desired base sequence is obtained in each spot, thereby preparing a DNA array. Photolithography techniques may be used herein.
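
The cycle of masking, irradiation and coupling described above can be sketched as a small simulation. The following Python sketch is a simplified model, not the procedure of Fodor et al. itself: it assumes one mask per base per sequence position, and the function name synthesize and the spot labels are hypothetical.

```python
def synthesize(targets, bases="ACGT"):
    """Simulate light-directed synthesis of one sequence per spot.

    targets maps a spot label to the desired base sequence.
    Returns the grown sequences and the list of (base, mask) steps used.
    """
    grown = {spot: "" for spot in targets}
    protected = {spot: True for spot in targets}  # all termini start protected
    steps = []
    longest = max((len(seq) for seq in targets.values()), default=0)
    for position in range(longest):
        for base in bases:
            # The mask exposes only spots whose next required base is `base`.
            mask = {spot for spot, seq in targets.items()
                    if len(seq) > position and seq[position] == base}
            if not mask:
                continue
            for spot in mask:
                protected[spot] = False      # irradiation removes the protecting group
            for spot in grown:
                if not protected[spot]:
                    grown[spot] += base      # coupling occurs only at deprotected spots
                    protected[spot] = True   # the incoming base carries a protecting group
            steps.append((base, mask))
    return grown, steps


targets = {"s1": "ACG", "s2": "AGT", "s3": "CC"}
grown, steps = synthesize(targets)
print(grown)
print(len(steps), "mask/coupling steps")
```

In this model the number of mask changes is at most four times the length of the longest sequence, regardless of how many spots the array contains.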


An ink jet method (technique) is a technique of projecting considerably small droplets onto a predetermined position on a two-dimensional plane using heat or a piezoelectric effect. This technique is widely used mainly in printers. In production of a DNA array, an ink jet apparatus is used, which has a configuration in which a piezoelectric device is combined with a glass capillary. A voltage is applied to the piezoelectric device which is connected to a liquid chamber, so that the volume of the piezoelectric device is changed and the liquid within the chamber is expelled as a droplet from the capillary connected to the chamber. The size of the expelled droplet is determined by the diameter of the capillary, the volume variation of the piezoelectric device, and the physical properties of the liquid. The diameter of the droplet is generally about 30 μm. An ink jet apparatus using such a piezoelectric device can expel droplets at a frequency of about 10 kHz. In a DNA array fabricating apparatus using such an ink jet apparatus, the ink jet apparatus and a DNA array substrate are moved relative to each other so that droplets can be dropped onto desired spots on the DNA array. DNA array fabricating apparatuses using an ink jet apparatus are roughly divided into two categories. One category includes a DNA array fabricating apparatus using a single ink jet apparatus, and the other includes a DNA array fabricating apparatus using a multi-head ink jet apparatus. The DNA array fabricating apparatus with a single ink jet apparatus has a configuration in which a reagent for removing a protecting group at a terminus of an oligomer is dropped onto desired spots. A protecting group is removed from a spot, to which a desired base is to be introduced, by using the ink jet apparatus so that the spot is activated. Thereafter, the desired base is subjected to a binding reaction throughout a DNA array. In this case, the desired base is bound only to spots having an oligomer whose terminus is activated by the reagent dropped from the ink jet apparatus. Thereafter, the terminus of a newly added base is protected. Thereafter, the spots from which a protecting group is removed are changed and the procedures are repeated until desired nucleotide sequences are obtained. On the other hand, in a DNA array fabricating apparatus using a multi-head ink jet apparatus, an ink jet apparatus is provided for each reagent containing a different base, so that a desired base can be bound directly to each spot. A DNA array fabricating apparatus using a multi-head ink jet apparatus can have a higher throughput than that of a DNA array fabricating apparatus using a single ink jet apparatus.


Among methods for fixing a presynthesized oligonucleotide to a substrate is a mechanical microspotting technique in which liquid containing an oligonucleotide, which is attached to the tip of a stainless steel pin, is mechanically pressed against a substrate so that the oligonucleotide is immobilized on the substrate. The size of a spot obtained by this method is 50 to 300 μm. After microspotting, subsequent processes, such as immobilization using UV light, are carried out.
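
To give a sense of scale for the ink jet figures quoted above (a droplet diameter of about 30 μm and an ejection frequency of about 10 kHz), the following Python sketch estimates the droplet volume and a lower bound on dispensing time. It assumes a spherical droplet and ignores stage movement; both are simplifying assumptions, and the function names are hypothetical.

```python
import math


def droplet_volume_pl(diameter_um):
    """Volume of a spherical droplet in picolitres (1 pL = 1000 cubic micrometres)."""
    return math.pi / 6.0 * diameter_um ** 3 / 1000.0


def min_dispense_time_s(n_spots, droplets_per_spot=1, frequency_hz=10_000):
    """Lower bound on dispensing time in seconds, ignoring stage movement."""
    return n_spots * droplets_per_spot / frequency_hz


print(f"30 um droplet: {droplet_volume_pl(30):.1f} pL")
print(f"10,000 spots at 10 kHz: {min_dispense_time_s(10_000):.1f} s")
```

Under these assumptions a 30 μm droplet holds roughly 14 pL, so the ejection rate itself is rarely the bottleneck; positioning of the head relative to the substrate is likely to dominate the actual fabrication time.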


DESCRIPTION OF PREFERRED EMBODIMENTS

Hereinafter, preferred embodiments of the present invention will be described. The following embodiments are provided for a better understanding of the present invention and the scope of the present invention should not be limited to the following description. It will be clearly appreciated by those skilled in the art that variations and modifications can be made without departing from the scope of the present invention with reference to the specification.


Next, a novel gene targeted-disruption technique, a feature of the present invention, is described.


In one aspect, the present invention provides a method for targeted disruption of an arbitrary gene in a genome of a living organism. The subject method comprises the steps of: A) providing information on the entire sequence of the genome of the living organism; B) selecting at least one arbitrary region of the sequence; C) providing a vector comprising a sequence complementary to the selected region and a marker gene; D) transforming the living organism with the vector; and E) placing the living organism under conditions that allow homologous recombination to occur. The method became possible only once the entire genomic sequence had been clarified. It differs from conventional technology, for example the model system using Sulfolobus solfataricus by Bartolucci S., which cannot disrupt a desired gene and can merely utilize the result of accidental disruption. Owing to this difference, the present invention allows a desired gene to be disrupted rapidly and efficiently, which in turn allows functional analysis.


Preferably, in step B) of the present invention, at least two regions are selected. Having two such regions makes targeted disruption of a gene by double cross-over possible. As demonstrated in the present invention, targeted disruption of a gene by double cross-over is generally more efficient than targeted disruption of a gene by single cross-over. Accordingly, it is preferable to have two such regions.
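
Steps B), C) and E), with two selected regions and double cross-over, can be illustrated in silico. The following Python sketch is a toy model under stated assumptions: the genome is treated as a linear string, the two selected regions are the sequences immediately flanking the target, homology is modeled as an exact match, and the arm length, sequences and function names are hypothetical and not taken from this specification.

```python
def design_disruption_vector(genome, start, end, marker, arm_len):
    """Select two regions flanking genome[start:end] and pair them with a marker.

    start and end are 0-based, half-open coordinates of the region to replace.
    """
    if start - arm_len < 0 or end + arm_len > len(genome):
        raise ValueError("homology arms fall outside the sequence")
    upstream = genome[start - arm_len:start]
    downstream = genome[end:end + arm_len]
    return upstream, marker, downstream


def double_crossover(genome, upstream, marker, downstream):
    """Replace the segment between the two homologous regions with the marker."""
    i = genome.find(upstream)
    if i < 0:
        return genome                      # no homology found, genome unchanged
    j = genome.find(downstream, i + len(upstream))
    if j < 0:
        return genome
    return genome[:i + len(upstream)] + marker + genome[j:]


genome = "GATTACAGGC" + "TTTTTTTT" + "CCGTAGCTAA"   # toy sequence; target is the T run
up, marker, down = design_disruption_vector(genome, 10, 18, "[marker]", arm_len=6)
print(up, down)
print(double_crossover(genome, up, marker, down))
```

In this toy example the two arms are ACAGGC and CCGTAG, and the recombinant sequence carries the marker in place of the target segment. A single cross-over would instead integrate the whole vector, which this sketch does not model.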


Vectors used in the present invention are also called disruption vectors, and may further comprise an additional gene regulatory element such as a promoter.


The gene targeting method of the present invention may further comprise the step of detecting an expression product of the marker gene. As used herein, the expression product may be for example an mRNA, a polypeptide, or a post-translationally modified polypeptide.


In one embodiment, the marker gene is located in or outside the selected region.


As used herein, the genome used in the present invention may be any genome as long as the entire genomic sequence is substantially sequenced. Examples of such a genome include, but are not limited to, the genomes of archaebacteria such as Aeropyrum pernix, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, Methanococcus jannaschii, Pyrococcus abyssi, Pyrococcus furiosus, Pyrococcus horikoshii, Sulfolobus solfataricus, Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium; and bacteria such as Aquifex aeolicus, Thermotoga maritima, and the like. In one embodiment, the genome used may be the genome of Thermococcus kodakaraensis KOD1, because the entire genome of Thermococcus kodakaraensis KOD1 has now been sequenced. As used herein, the statement that the entire sequence has been sequenced or substantially sequenced means that the sequence has been clarified to such an extent that, for any regional sequence selected, a region sufficiently homologous to cause homologous recombination may be provided. Accordingly, it is preferable that the entire sequence is sequenced without a single base missing; however, it is permissible for one, two, or three bases in a sequence to remain unclarified. A plurality of such unclarified sequences may be present as long as, for any regional sequence selected, a region sufficiently homologous to cause homologous recombination may be provided.
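
One possible reading of the "substantially sequenced" criterion can be sketched as a simple check. The following Python sketch assumes that unclarified bases are written as N and that the tolerance applies to each contiguous run of unclarified bases; both points are interpretive assumptions rather than definitions from this specification, and the function names are hypothetical.

```python
import re


def unclarified_runs(sequence):
    """Return (start, length) for each contiguous run of N in the sequence."""
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(r"N+", sequence.upper())]


def is_substantially_sequenced(sequence, max_run=3):
    """True if no run of unclarified bases is longer than max_run."""
    return all(length <= max_run for _, length in unclarified_runs(sequence))


print(unclarified_runs("ACGNNTANNNNG"))
print(is_substantially_sequenced("ACGNNT"))
print(is_substantially_sequenced("ACGNNTANNNNG"))
```

A stricter practical check would also confirm that the specific regions chosen for homologous recombination contain few or no unclarified bases.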


Preferably, the genome of the present invention has a sequence set forth in SEQ ID NO: 1.


Preferably, in the method of the present invention, the above-mentioned selected region is an open reading frame of SEQ ID NO: 1, which is selected from the group of sequences of gene Numbers (1) to (2151) in the following Table in the sequence of SEQ ID NO: 1, 342, 723, 1087, 1469 or 1838.

TABLE 1Nucleic acidNo. (senseNucleic acidNucleic acidNucleic acidchainNo. (senseNo. (antisenseNo. (antisense(correspondingchainchainchainto SEQ(corresponding(corresponding(correspondingID NO: 1,to SEQ IDto SEQ ID NO:to SEQ ID NO:342, 723),NO: 1, 342,1087, 1469,1087, 1469,Correspondinggenestarting723), ending1838), ending1838), startingSEQNo.nucleotidenucleotidenucleotidenucleotideID NO.115016208937720843622251345733208424420836453360796543208329920828351468465867014208279220823644571527391208222620819871837673997614208197920817641467776558755208172320806232157888431009320805352079285343910095103792079283207899972410103761080720790022078571344111080811416207857020779622156121140611726207797220776527251311723122862077655207709234514123381341120770402075967346151339213841207598620755371836161380814056207557020753222155171415314896207522520744823471815239159642074139207341434819161511669920732272072679349201669617697207268220716815211778018793207159820705852154221878619280207059220700981835231929020183207008820691951834242018321187206919520681912153252126621919206811220674592152262191322569206746520668091466272259724195206678120651831465282394724834206543120645446292481325451206456520639277263025413258112063965206356718333125813273962063565206198214643227565286202061813206075873328591293342060787206004414633429782306812059596205869783531102312662058276205811293631414322352057964205714310373236733251205701120561277273833291350332056087205434572839350483582420543302053554350403588236541205349620528373514136553373802052825205199811423739437870205198420515083524337874392982051504205008035344397604033220496182049046124540360410702049018204830813464107242694204830620466843544742696444442046682204493472948444414643520449372042943355494647046991204290820423877305047171474162042207204196235651473174779920420612041579145247937491392041441204023918325349153493292040225204004914625449393497312039985203964715554972850297203965020390817315650278505592039100203881914615750693514122038
68520379663575851483520612037895203731718315952063526052037315203677314606052602537922036776203558618306154169550202035209203435816625505855606203432020337723586355746560182033632203336073264561325626320332462033115359655624456708203313420326707336656674572672032704203211117675726457584203211420317941829685759958276203177920311022151695885559703203052320296751870597045986820296742029510145971598986179920294802027579182872628306372320265482025655197364226659922025152202338636074660456738220233332021996734756739968973202197920204052076691176937420202612020004735776958369795201979520195832178697927051120195862018867736797050471112201887420182662280711177124520182612018133361817167972593201769920167857378272764733392016614201603936283733367464320160422014735238474603757602014775201361836385757537602520136252013353738867602277458201335620119203648777735790452011643201033336588796227972620097562009652215089799688012920094102009249739908024680428200913220089503669180432831762008946200620236792834318362820059472005750249383908842672005470200511125948426484440200511420049387409584461850182004917200436036896849998534020043792004038741978542185948200395720034303699886333871392003045200223921499987211876632002167200171526100876638826520017152001113742101882668927920011122000099743102893079005920000711999319744103900799026719992991999111271049027690560199910219988187451059058391056199879519983221458106911789136619982001998012370107913639297919980151996399281089307294550199630619948287461099455295712199482619936662911096185976361993193199174237111197620981471991758199123174711298417995831990961198979537211399648100892198973019884867481141009151012051988463198817314571151012241017331988154198764514561161017961023471987582198703174911710239310256319869851986815750118102986103432198639219859462148119103476104318198590219850607511201043981061011984980198327730121106210106779198316819825993112210683410745419825441981924321231076371084551981741198092375212410848210909919808961980279214
71251090921110351980286197834318271261116431130191977735197635914551271132051145631976173197481575312811466811535119747101974027373129115397116401197398119729773741301164821166341972896197274414541311166761174941972702197188418261321174751182421971903197113614531331181781187111971200197066721461341190611199391970317196943918251351199731204851969405196889375413612047912095219688991968426214513712112112119219682571968186214413812140412185619679741967522755139122007122438196737119669407561401224311226671966947196671133141122668123594196671019657843414212357812386819658001965510214314312393212615719654461963221214214412630612856119630721960817757145128631130013196074719593651824146130150131154195922819582241452147131148133049195823019563291823148132745133890195663319554883514913388513454719554931954831145115013454413483419548341954544182215113497813575419544001953624214115213747713817219519011951206214015313852113867619508571950702213915413936514097219500131948406758155141078141311194830019480677591561413351418561948043194752237515714185314270719475251946671145015814273214379319466461945585144915914375614493119456221944447213816014492414523519444541944143182116114533414595119440441943427376162146007146603194337119427751820163147207149273194217119401051819164149293149697194008519396811448165149699150874193967919385042137166150876151928193850219374501818167152076152471193730219369077601681524171527431936961193663537716915280115349019365771935888213617015348715475219358911934626144717115484415588119345341933497213517215604415730919333341932069378173157368158228193201019311507611741581581590181931220193036014461751589821594641930396192991476217615951716008319298611929295144517716020616025619291721929122763178160526160744192885219286342134179160787161719192859119276592133180161795163255192758319261232132181163362164405192601619249737641821643981653931924980192398514441831653901675311923988192184718171841688811703771920497191900121311851704571711281918921191825018161861711301
71381191824819179971443187171383172534191799519168442130188172527173834191685119155441815189173896173985191548219153931442190174404174601191497419147773791911745851753491914793191402976519217574017703819136381912340181419317713817815119122401911227766194178184178348191119419110303801951783201790391911058191033918131961791951805531910183190882538119718054318103119088351908347181219818102818128819083501908090212919918134518332419080331906054144120018343618493519059421904443144020118536218595519040161903423143920218598818700419033901902374181120318711118795319022671901425143820418807418931519013041900063362051898651902781899513189910037206190253190621189912518987573822071906301917991898748189757914372081918741925091897504189686976720919253519298118968431896397382101929711934861896407189589238321119370119403318956771895345181021219415219435818952261895020143621319509719540518942811893973392141957421958461893636189353214352151959951961111893383189326738421619613819695918932401892419143421719703219762518923461891753143321819774719836718916311891011385219198495199754189088318896241809220199748200686188963018886922128221200742201098188863618882807682222010672017381888311188764040223201692202102188768618872763862242021032029241887275188645438722520292920337218864491886006769226203585204475188579318849033882272044722050831884906188429541228205070206200188430818831783892292062802068131883098188256577023020681020739718825681881981390231207399208100188197918812787712322080822088401881296188053839123320885020947918805281879899392234209476210486187990218788924223521047021119818789081878180393236211296211982187808218773967722372119792129561877399187642239423821293821423918764401875139432392142362148141875142187456477324021480721543318745711873945442412154262165951873952187278339524221658821734318727901872035774243217325218095187205318712832127244218020219114187135818702641432245219077219253187030118701252126246219407220474186997118689042125247220471221718186890718676601431248221676
22223618677021867142180824922247222285218669061866526143025022287922325918664991866119180725122328222392318660961865455142925222387722502218655011864356212425322489022580418644881863574142825422580122684418635771862534180625522671822737718626601862001212325622737022774118620081861637180525722793122824218614471861136775258228257228718186112118606603962592287102291471860668186023121222602293472297451860031185963318042612297322308201859646185855814272622308262315811858552185779718032632315912325831857787185679518022642325802334101856798185596821212652334282335891855950185578914262662336842347271855694185465121202672347152352061854663185417214252682352032363451854175185303318012692363422374271853036185195121192702376532382161851725185116221182712385092395281850869184985077627223948923968618498891849692397273239677240426184970118489521424274240560243028184881818463503982752439772445251845401184485339927624459124505518447871844323452772450522457471844326184363177727824573824622918436401843149211727924623924634018431391843038211628024722624813418421521841244211528124819724960618411811839772142328225116125126518382171838113462832513942514771837984183790177828425155725176018378211837618472852546532551621834725183421614222862552272569871834151183239121142872571242584521832254183092618002882585562592331830822183014514212892607032619231828675182745577929026217626248418272021826894179929126254426383018268341825548211329226406526516518253131824213211229326489526626218244831823116142029426669626697718226821822401211129526700226807518223761821303211029626810926919718212691820181210929726929727006418200811819314400298270052270306181932618190724829927030127127818190771818100141930027136127211918180171817259401301272121272429181725718169497803022725252740571816853181532121083032742442749631815134181441540230427534027556418140381813814781305276688277758181269018116204930627775927852618116191810852503072784542789811810924181039778230827896927973618104091809642403309279859280521180951918
08857141831028062928107218087491808306783311281104282072180827418073065131228206928246718073091806911784313282544283272180683418061061417314283421284416180595718049622107315284413285099180496518042791416316285104285292180427418040862106317285716286492180366218028862105318286543287079180283518022995231928704628764518023321801733179832028775828815318016201801225141532128815028843718012281800941179732228850528904718008731800331141432328917328949318002051799885179632428949028994817998881799430210432529013629102917992421798349179532629093929115717984391798221210332729135329269617980251796682404328292703293509179667517958694053292935102935931795868179578521023302936272944151795751179496340633129434629466317950321794715533322947502950011794628179437778533329511529662617942631792752407334296627297139179275117922392101335297204297731179217417916471794336297773298702179160517906764083372986993008251790679178855354338300795301748178858317876307863393018033032511787575178612717933403033053037661786073178561221003413037503046881785628178469017923423046983051261784680178425217913433053393061931784039178318540934430619030685817831881782520553453074733077001781905178167878734630831130888617810671780492141334730893030940617804481779972209934830949231063717798861778741179034931064231101617787361778362141235031101731162517783611777753141135131210831253617772701776842178935231263731290317767411776475563533129533133061776425177607241035431334431412017760341775258788355314205314447177517317749317893563144293155891774949177378941135731561831605817737601773320178835831624531697317731331772405178735931712431827217722541771106790360318265319239177111317701391410361319807319851176957117695271409362320239320928176913917684505736332137432151117680041767867412364321508321696176787017676825836532201232236517673661767013593663222653242561767113176512241336732426132639917651171762979791368326552326935176282617624434143693270133272821762365176209660370327284327514176209417618644153713275183283211761
86017610574163723283333288151761045176056361373328812329288176056617600907923743292903300901760088175928862375330224331687175915417576914173763316913324521757687175692641837733244933273617569291756642633783341753349451755203175443341937933506833566417543101753714643803370453372601752333175211865381337711338295175166717510831408382339363339788175001517495907933833406413407271748737174865179438434155834199517478201747383420385342397343461174698117459176638634345434389117459241745487421387343888344076174549017453026738834409034440117452881744977422389345281345472174409717439064233903455663456221743812174375620983913456153457401743763174363879539234617434635617432041743022683933465283468811742850174249769394346606346668174277217427101407395347138348463174224017409154243963485673504171740811173896117863973505373515981738841173778042539835159235215517377861737223703993524193529851736959173639379640035392335410217354551735276714013541743553341735204173404479740235539335587217339851733506724033558563564521733522173292620974043564493573811732929173199714064053573783580371732000173134117854063580343593291731344173004920964073594073601711729971172920773408360168361466172921017279127984093614973634071727881172597179941036669936715117226791722227178441136729036824017220881721138178341236823736928917211411720089209541337063437144917187441717929426414371481372920171789717164588004153744883745501714890171482874416374583374840171479517145388014173748333755341714545171384414054183755353763081713843171307014044193760003760921713378171328675420376298376771171308017126072094421379177380310171020117090681403422380366381109170901217082692093423381111382313170826717070651782424382310382675170706817067032092425382850383839170652817055392091426384244384471170513417049071402427384528385040170485017043381781428385030386139170434817032391401429389056390132170032216992461400430390129391328169924916980501780431391570392187169780816971911399432392614393321169676416960571398433393449394750169592916
94628427434394894398109169448416912697643539817839847116912001690907177943639850239901116908761690367802437399050404185169032816851934284384044844052901684894168408880343940541940563116839591683747209044040562840596316837501683415139744140596040670916834181682669177844240683540805516825431681323429443408052408807168132616805717744440880940946216805691679916430445409459409647167991916797317844640964741045916797311678919804447410460411080167891816782988054484111764116881678202167769043144941187841329316775001676085432450413415413915167596316754638064514139264142521675452167512679452414877415209167450116741698045341710941727016722691672108814544172914179291672087167144980745541863641917516707421670203824564192474205631670131166881580845742062742213216687511667246809458422333422719166704516666594334594228764240301666502166534820894604265474267111662831166266783461426747427742166263116616368104624277994290641661579166031443446342906543039016603131658988208846443039443063316589841658745208746543061843078516587601658593139646643088343225916584951657119208646743239743273816569811656640844684327514334491656627165592985469433446434621165593216547571777470434530435735165484816536438647143577943630016535991653078208547243630043681216530781652566139547343740943820916519691651169811474438222439658165115616497201776475439696440403164968216489751394476440578441444164880016479348747744151144188216478671647496884784418874422671647491164711143547944235844287316470201646505436480442922444142164645616452364374814442204446811645158164469789482444972445310164440616440688124834461974488991643181164047913934844489454502941640433163908413924854504814509961638897163838290486451077451238163830116381408134874512504515971638128163778143848845277045312316366081636255914894531834546011636195163477781449045483545534116345431634037439491455338455502163404016338769249245633045666216330481632716815493456623456835163275516325434404944568384575871632540163179193495457618458184163176016311949449645847645
91261630902163025295497459138459680163024016296981775498459718460674162966016287049649946066746193516287111627443208450046261846380816267601625570177450146426646442116251121624957139150246446046497216249181624406177350346533646656216240421622816816504466632466847162274616225311772505466975467631162240316217479750646762846880616217501620572177150747101847263716183601616741177050847269147414516166871615233208350947423947524016151391614138441510475250475708161412816136704425114757024770421613676161233698512477049477657161232916117219951347773847803116116401611347817514477971479050161140716103282082515478881479639161049716097398185164796294801621609749160921613905174801984807551609180160862317695184808434811271608535160825117685194813154826791608063160669910052048498148544516043971603933101521485442486008160393616033701767522486065486484160331316028944435234864814889791602897160039913895244895174906441599861159873413885254907444918441598634159753410252649192249337615974561596002819527493561495408159581715939701035284954104964801593968159289844452949709049918615922881590192445530499596499949158978215894291766531500938501252158844015881261387532501249501479158812915878991765533501658502464158772015869141386534502547502792158683115865862081535502785502967158659315864111764536503187503354158619115860248205375049715050991584407158427944653850624250666415831361582714138553950750650759215818721581786447540508803509420158057515799581763541510163510879157921515784991384542511923512477157745515769011762543513104513481157627415758974485445137105142611575668157511720805455148435152231574535157415513835465155435157911573835157358720795475170035178031572375157157513825485178055182811571573157109720785495182785187601571100157061813815505187725195751570606156980317615515195795198091569799156956917605525201585205411569220156883717595535206945226281568684156675020775545228375248281566541156455017585555247285250421564650156433613805565253975255851563981156379313795575258845264831563494156
28952076
558 527199 527468 1562179 1561910 821
559 527689 528324 1561689 1561054 104
560 528364 528969 1561014 1560409 105
561 528984 529217 1560394 1560161 822
562 529214 529528 1560164 1559850 449
563 529509 529739 1559869 1559639 823
564 529736 529981 1559642 1559397 450
565 529978 530385 1559400 1558993 106
566 530659 532146 1558719 1557232 107
567 532123 532530 1557255 1556848 1378
568 532615 533754 1556763 1555624 108
569 533789 534916 1555589 1554462 451
570 534917 535363 1554461 1554015 2075
571 535366 536694 1554012 1552684 1377
572 536818 536871 1552560 1552507 1376
573 536998 537846 1552380 1551532 109
574 537847 538209 1551531 1551169 110
575 538230 539297 1551148 1550081 824
576 539304 540950 1550074 1548428 825
577 540986 541681 1548392 1547697 452
578 541671 542294 1547707 1547084 826
579 542291 542914 1547087 1546464 453
580 542904 545159 1546474 1544219 827
581 545191 545688 1544187 1543690 111
582 545706 546455 1543672 1542923 828
583 546468 547502 1542910 1541876 829
584 547499 547759 1541879 1541619 454
585 547830 548183 1541548 1541195 830
586 548218 548553 1541160 1540825 112
587 548531 549514 1540847 1539864 455
588 549515 549850 1539863 1539528 456
589 550080 551150 1539298 1538228 831
590 551249 552460 1538129 1536918 457
591 552309 553043 1537069 1536335 832
592 553133 553699 1536245 1535679 458
593 553745 554734 1535633 1534644 2074
594 554855 555676 1534523 1533702 459
595 555783 556910 1533595 1532468 1757
596 556879 558105 1532499 1531273 1375
597 558125 558196 1531253 1531182 2073
598 558864 559322 1530514 1530056 1756
599 559506 560798 1529872 1528580 833
600 560838 562364 1528540 1527014 834
601 562361 563395 1527017 1525983 460
602 563371 564303 1526007 1525075 113
603 564310 565311 1525068 1524067 1374
604 565409 567541 1523969 1521837 461
605 567556 567786 1521822 1521592 1373
606 567865 568512 1521513 1520866 1372
607 568711 570129 1520667 1519249 114
608 570172 570729 1519206 1518649 1371
609 570898 570957 1518480 1518421 115
610 571031 571738 1518347 1517640 462
611 571735 572070 1517643 1517308 1370
612 572149 574656 1517229 1514722 1369
613 574653 575411 1514725 1513967 1755
614 575490 576503 1513888 1512875 1754
615 576540 577586 1512838 1511792 1753
616 577750 578565 1511628 1510813 116
617 578612 579025 1510766 1510353 463
618 579392 579454 1509986 1509924 464
619 580461 580553 1508917 1508825 1752
620 581070 581168 1508308 1508210 1751
621 582573 583445 1506805 1505933 1750
622 583582 585228 1505796 1504150 1368
623 585396 586382 1503982 1502996 835
624 587383 587667 1501995 1501711 1367
625 588220 589968 1501158 1499410 1366
626 590029 591039 1499349 1498339 1365
627 591078 592301 1498300 1497077 1749
628 592190 593191 1497188 1496187 465
629 593214 593957 1496164 1495421 836
630 593914 594495 1495464 1494883 117
631 594739 594795 1494639 1494583 1364
632 595329 595610 1494049 1493768 837
633 595427 597550 1493951 1491828 466
634 597520 597798 1491858 1491580 1363
635 598695 599399 1490683 1489979 1748
636 599396 600097 1489982 1489281 2072
637 600094 600945 1489284 1488433 1362
638 600958 600999 1488420 1488379 1361
639 601388 601828 1487990 1487550 467
640 601912 602571 1487466 1486807 1360
641 602643 603974 1486735 1485404 1747
642 603976 605406 1485402 1483972 1359
643 605506 605823 1483872 1483555 118
644 605856 606749 1483522 1482629 1746
645 606746 607678 1482632 1481700 2071
646 607678 608625 1481700 1480753 1358
647 608720 609349 1480658 1480029 468
648 609665 611200 1479713 1478178 469
649 611281 612924 1478097 1476454 119
650 612921 613868 1476457 1475510 838
651 613855 614616 1475523 1474762 120
652 614613 615374 1474765 1474004 839
653 615379 616116 1473999 1473262 121
654 616117 616626 1473261 1472752 1357
655 616713 617375 1472665 1472003 840
656 617430 618005 1471948 1471373 1745
657 617873 619891 1471505 1469487 2070
658 619888 620115 1469490 1469263 1356
659 620116 620346 1469262 1469032 1355
660 620526 621581 1468852 1467797 841
661 621554 622366 1467824 1467012 470
662 622338 623402 1467040 1465976 842
663 623814 624353 1465564 1465025 1744
664 624301 624510 1465077 1464868 1354
665 624735 625205 1464643 1464173 1743
666 625223 625891 1464155 1463487 471
667 625916 626170 1463462 1463208 472
668 626202 626936 1463176 1462442 1742
669 626909 627853 1462469 1461525 2069
670 627832 628989 1461546 1460389 1353
671 629061 629687 1460317 1459691 1741
672 629684 631024 1459694 1458354 2068
673 631021 631839 1458357 1457539 1352
674 631871 632350 1457507 1457028 473
675 632430 632630 1456948 1456748 843
676 632617 633099 1456761 1456279 122
677 633112 633933 1456266 1455445 123
678 633964 634764 1455414 1454614 124
679 634815 635330 1454563 1454048 1740
680 635934 636071 1453444 1453307 1739
681 637143 637451 1452235 1451927 844
682 637487 638062 1451891 1451316 474
683 638134 639000 1451244 1450378 1351
684 639553 639651 1449825 1449727 125
685 639626 640396 1449752 1448982 2067
686 640393 641181 1448985 1448197 1350
687 641204 641923 1448174 1447455 2066
688 641972 642490 1447406 1446888 475
689 642511 643098 1446867 1446280 1349
690 643209 643670 1446169 1445708 845
691 644598 646496 1444780 1442882 1738
692 647573 650017 1441805 1439361 476
693 650078 650584 1439300 1438794 477
694 650587 651087 1438791 1438291 126
695 651198 652340 1438180 1437038 846
696 652343 653548 1437035 1435830 2065
697 653784 655079 1435594 1434299 847
698 655937 657688 1433441 1431690 2064
699 657722 658642 1431656 1430736 2063
700 658773 659825 1430605 1429553 1737
701 659850 660155 1429528 1429223 1736
702 660246 664418 1429132 1424960 848
703 664498 665586 1424880 1423792 127
704 665627 665995 1423751 1423383 478
705 666332 666616 1423046 1422762 2062
706 666618 667169 1422760 1422209 1735
707 667123 667176 1422255 1422202 128
708 667218 667724 1422160 1421654 1734
709 667824 669488 1421554 1419890 849
710 669735 671918 1419643 1417460 850
711 673707 673985 1415671 1415393 851
712 674033 674911 1415345 1414467 479
713 674957 675970 1414421 1413408 480
714 676425 677294 1412953 1412084 852
715 677302 678150 1412076 1411228 1348
716 678143 679063 1411235 1410315 2061
717 679100 679813 1410278 1409565 2060
718 679850 679924 1409528 1409454 481
719 680156 680470 1409222 1408908 482
720 680606 681754 1408772 1407624 483
721 682401 682496 1406977 1406882 853
722 682446 682799 1406932 1406579 1733
723 682717 684711 1406661 1404667 129
724 684698 685174 1404680 1404204 2059
725 686253 686873 1403125 1402505 1732
726 686863 687633 1402515 1401745 1347
727 687638 688447 1401740 1400931 2058
728 688516 689571 1400862 1399807 130
729 689568 690029 1399810 1399349 854
730 690316 690513 1399062 1398865 1346
731 690550 691353 1398828 1398025 1345
732 691387 692820 1397991 1396558 1344
733 692817 694928 1396561 1394450 1731
734 694986 695405 1394392 1393973 1730
735 695410 696654 1393968 1392724 1343
736 696651 697808 1392727 1391570 1729
737 697801 699510 1391577 1389868 1342
738 699507 700274 1389871 1389104 1728
739 700228 701004 1389150 1388374 1341
740 701037 701399 1388341 1387979 1727
741 701550 702359 1387828 1387019 855
742 702356 703177 1387022 1386201 484
743 703152 703868 1386226 1385510 856
744 703837 705249 1385541 1384129 1340
745 705309 706460 1384069 1382918 857
746 706455 706655 1382923 1382723 1726
747 706739 708556 1382639 1380822 485
748 708558 711569 1380820 1377809 858
749 711859 712440 1377519 1376938 131
750 712445 713191 1376933 1376187 2057
751 713142 713633 1376236 1375745 859
752 713693 714955 1375685 1374423 2056
753 715024 715470 1374354 1373908 1339
754 715543 716427 1373835 1372951 1338
755 716424 718136 1372954 1371242 1725
756 718317 719339 1371061 1370039 860
757 719507 719788 1369871 1369590 486
758 719790 720593 1369588 1368785 1724
759 720689 721426 1368689 1367952 2055
760 721789 722304 1367589 1367074 132
761 722344 722481 1367034 1366897 1337
762 722592 723116 1366786 1366262 861
763 723142 724314 1366236 1365064 1336
764 724419 725573 1364959 1363805 1723
765 725704 726249 1363674 1363129 133
766 726458 726643 1362920 1362735 487
767 728745 728798 1360633 1360580 862
768 729082 729786 1360296 1359592 1335
769 729844 730989 1359534 1358389 134
770 730961 731485 1358417 1357893 488
771 731586 733985 1357792 1355393 863
772 734016 734336 1355362 1355042 864
773 734349 734939 1355029 1354439 1722
774 735215 735760 1354163 1353618 489
775 735762 735941 1353616 1353437 865
776 735965 737146 1353413 1352232 2054
777 737210 737683 1352168 1351695 490
778 737822 739696 1351556 1349682 2053
779 739687 740523 1349691 1348855 1334
780 740584 741294 1348794 1348084 135
781 741329 741541 1348049 1347837 491
782 741920 742084 1347458 1347294 492
783 742684 743376 1346694 1346002 136
784 743424 743609 1345954 1345769 866
785 743587 744603 1345791 1344775 1333
786 744560 745372 1344818 1344006 493
787 745369 746826 1344009 1342552 137
788 746823 747761 1342555 1341617 1721
789 747766 748353 1341612 1341025 1332
790 748338 749033 1341040 1340345 1720
791 749030 749443 1340348 1339935 2052
792 749440 749877 1339938 1339501 1331
793 750208 750714 1339170 1338664 1330
794 751954 752967 1337424 1336411 138
795 753046 754110 1336332 1335268 139
796 754166 755410 1335212 1333968 2051
797 755496 756431 1333882 1332947 867
798 756477 756968 1332901 1332410 868
799 756958 757629 1332420 1331749 1329
800 757712 758458 1331666 1330920 2050
801 758689 759645 1330689 1329733 140
802 759762 760691 1329616 1328687 869
803 760688 761674 1328690 1327704 2049
804 762327 763418 1327051 1325960 870
805 763396 764058 1325982 1325320 141
806 765200 765316 1324178 1324062 2048
807 765637 766047 1323741 1323331 142
808 766138 766683 1323240 1322695 143
809 766685 767974 1322693 1321404 494
810 767976 768434 1321402 1320944 871
811 768477 769343 1320901 1320035 872
812 769459 769962 1319919 1319416 144
813 769950 771269 1319428 1318109 873
814 771283 771807 1318095 1317571 1328
815 771820 773541 1317558 1315837 145
816 773543 774817 1315835 1314561 495
817 774838 775089 1314540 1314289 146
818 775493 776422 1313885 1312956 496
819 776480 777643 1312898 1311735 497
820 778176 778346 1311202 1311032 874
821 778362 779411 1311016 1309967 875
822 779336 780247 1310042 1309131 498
823 780438 782276 1308940 1307102 876
824 782329 783108 1307049 1306270 147
825 783098 784927 1306280 1304451 2047
826 785382 786104 1303996 1303274 1719
827 786218 786838 1303160 1302540 2046
828 786930 787286 1302448 1302092 1718
829 787283 787609 1302095 1301769 2045
830 787749 788930 1301629 1300448 1717
831 788975 789268 1300403 1300110 499
832 789317 789460 1300061 1299918 2044
833 789852 790022 1299526 1299356 1716
834 790438 791058 1298940 1298320 1327
835 790672 790737 1298706 1298641 148
836 791117 792469 1298261 1296909 500
837 792505 792675 1296873 1296703 149
838 792665 793114 1296713 1296264 501
839 793111 795000 1296267 1294378 150
840 795038 795544 1294340 1293834 502
841 796310 797536 1293068 1291842 2043
842 797552 798316 1291826 1291062 2042
843 798473 799534 1290905 1289844 503
844 799610 799858 1289768 1289520 504
845 799848 800327 1289530 1289051 877
846 800324 800425 1289054 1288953 2041
847 800450 800518 1288928 1288860 2040
848 800919 802424 1288459 1286954 878
849 802436 802672 1286942 1286706 505
850 802669 802890 1286709 1286488 151
851 802887 803297 1286491 1286081 879
852 803294 805027 1286084 1284351 506
853 805220 806068 1284158 1283310 507
854 806024 807415 1283354 1281963 2039
855 807366 808745 1282012 1280633 880
856 808746 809576 1280632 1279802 1715
857 810847 811266 1278531 1278112 1326
858 811367 811606 1278011 1277772 508
859 811608 812351 1277770 1277027 881
860 812635 813648 1276743 1275730 152
861 813652 814113 1275726 1275265 153
862 814077 816419 1275301 1272959 882
863 816501 816650 1272877 1272728 883
864 816754 817728 1272624 1271650 154
865 817725 818519 1271653 1270859 884
866 818623 819468 1270755 1269910 155
867 819475 820395 1269903 1268983 156
868 820410 821180 1268968 1268198 1714
869 821146 822570 1268232 1266808 1325
870 822810 823514 1266568 1265864 1713
871 823599 824021 1265779 1265357 885
872 824015 825196 1265363 1264182 2038
873 825266 826294 1264112 1263084 2037
874 826379 827413 1262999 1261965 2036
875 827435 828904 1261943 1260474 2035
876 828985 829728 1260393 1259650 1324
877 829725 830471 1259653 1258907 1712
878 830551 832368 1258827 1257010 157
879 832337 833035 1257041 1256343 509
880 836010 837260 1253368 1252118 1711
881 837335 837601 1252043 1251777 2034
882 837647 839638 1251731 1249740 2033
883 839649 839885 1249729 1249493 1710
884 840097 840471 1249281 1248907 158
885 840503 841321 1248875 1248057 510
886 841293 842288 1248085 1247090 886
887 842275 842628 1247103 1246750 159
888 842986 844059 1246392 1245319 1323
889 844320 844517 1245058 1244861 1709
890 844597 845652 1244781 1243726 1322
891 845725 846387 1243653 1242991 160
892 846422 846727 1242956 1242651 511
893 846773 847903 1242605 1241475 512
894 847896 848990 1241482 1240388 887
895 848774 848884 1240604 1240494 2032
896 848987 849100 1240391 1240278 2031
897 849375 849638 1240003 1239740 1708
898 849669 851036 1239709 1238342 1707
899 851134 851325 1238244 1238053 1321
900 851346 851582 1238032 1237796 1706
901 851738 854035 1237640 1235343 513
902 851818 851883 1237560 1237495 1320
903 854126 855841 1235252 1233537 514
904 855888 856652 1233490 1232726 888
905 856637 856798 1232741 1232580 2030
906 857151 858227 1232227 1231151 889
907 858728 858934 1230650 1230444 515
908 860080 860340 1229298 1229038 161
909 860404 861084 1228974 1228294 1319
910 861133 862545 1228245 1226833 1318
911 862729 864021 1226649 1225357 1317
912 864121 864819 1225257 1224559 1316
913 865002 865454 1224376 1223924 890
914 865387 866304 1223991 1223074 162
915 866496 868313 1222882 1221065 891
916 868296 868430 1221082 1220948 1705
917 868444 870222 1220934 1219156 163
918 870263 870547 1219115 1218831 516
919 870532 870840 1218846 1218538 164
920 870842 871846 1218536 1217532 517
921 871836 872120 1217542 1217258 892
922 871942 872775 1217436 1216603 165
923 872833 873117 1216545 1216261 166
924 873524 874306 1215854 1215072 518
925 874707 874940 1214671 1214438 893
926 875022 875840 1214356 1213538 894
927 875837 876856 1213541 1212522 2029
928 877020 877235 1212358 1212143 895
929 877271 878197 1212107 1211181 519
930 878209 878658 1211169 1210720 1315
931 878718 878765 1210660 1210613 896
932 878886 879182 1210492 1210196 897
933 879211 880500 1210167 1208878 167
934 880506 881387 1208872 1207991 898
935 881550 881654 1207828 1207724 899
936 882812 882925 1206566 1206453 2028
937 885694 886539 1203684 1202839 1314
938 886567 887178 1202811 1202200 1313
939 887275 887487 1202103 1201891 168
940 887717 887920 1201661 1201458 520
941 887924 890701 1201454 1198677 521
942 891114 891398 1198264 1197980 900
943 891434 895009 1197944 1194369 522
944 895013 895678 1194365 1193700 523
945 895675 896097 1193703 1193281 1312
946 896626 899040 1192752 1190338 169
947 899156 900004 1190222 1189374 2027
948 900134 900385 1189244 1188993 524
949 901696 902574 1187682 1186804 1311
950 902700 903458 1186678 1185920 1704
951 903912 904115 1185466 1185263 1703
952 904127 904555 1185251 1184823 2026
953 904610 905026 1184768 1184352 525
954 905105 906898 1184273 1182480 526
955 906982 907974 1182396 1181404 170
956 907975 908217 1181403 1181161 1310
957 908370 909260 1181008 1180118 1702
958 909301 910116 1180077 1179262 171
959 910097 910516 1179281 1178862 527
960 910513 912024 1178865 1177354 172
961 912021 912893 1177357 1176485 1701
962 912890 914188 1176488 1175190 2025
963 914305 914493 1175073 1174885 173
964 914711 915121 1174667 1174257 528
965 915118 916428 1174260 1172950 174
966 916589 917257 1172789 1172121 529
967 917348 918352 1172030 1171026 530
968 918655 918705 1170723 1170673 1309
969 918719 919171 1170659 1170207 2024
970 919305 923264 1170073 1166114 901
971 924116 924814 1165262 1164564 2023
972 925010 927244 1164368 1162134 531
973 927249 927578 1162129 1161800 1700
974 928257 929309 1161121 1160069 1699
975 929424 929705 1159954 1159673 1698
976 930480 931013 1158898 1158365 1697
977 931103 931576 1158275 1157802 532
978 931594 932070 1157784 1157308 175
979 932526 933086 1156852 1156292 902
980 933128 933430 1156250 1155948 533
981 933728 933904 1155650 1155474 534
982 933919 934392 1155459 1154986 1308
983 934564 935379 1154814 1153999 176
984 935513 936664 1153865 1152714 2022
985 936666 936944 1152712 1152434 1696
986 936987 938822 1152391 1150556 1695
987 938954 940192 1150424 1149186 535
988 940239 940469 1149139 1148909 903
989 940803 940937 1148575 1148441 904
990 940934 942055 1148444 1147323 536
991 942591 942917 1146787 1146461 905
992 942914 943306 1146464 1146072 2021
993 943357 943545 1146021 1145833 1307
994 943533 943778 1145845 1145600 1694
995 943889 944536 1145489 1144842 2020
996 944542 944994 1144836 1144384 1306
997 944996 945436 1144382 1143942 2019
998 945433 945741 1143945 1143637 1305
999 945755 946939 1143623 1142439 2018
1000 946932 948164 1142446 1141214 1693
1001 948079 949662 1141299 1139716 1304
1002 949659 953030 1139719 1136348 1692
1003 953048 953296 1136330 1136082 2017
1004 953495 954190 1135883 1135188 2016
1005 954301 955020 1135077 1134358 177
1006 955204 956391 1134174 1132987 178
1007 956375 956533 1133003 1132845 2015
1008 957270 957638 1132108 1131740 906
1009 957640 961329 1131738 1128049 1303
1010 961407 962324 1127971 1127054 907
1011 962372 962575 1127006 1126803 537
1012 962593 963804 1126785 1125574 1302
1013 964168 964827 1125210 1124551 179
1014 964831 965430 1124547 1123948 1301
1015 965603 965896 1123775 1123482 538
1016 965901 966098 1123477 1123280 908
1017 966166 967002 1123212 1122376 180
1018 967002 967181 1122376 1122197 909
1019 967184 967987 1122194 1121391 539
1020 968134 968757 1121244 1120621 181
1021 968754 969002 1120624 1120376 910
1022 968995 969663 1120383 1119715 182
1023 969660 970463 1119718 1118915 911
1024 970555 971892 1118823 1117486 183
1025 971952 973340 1117426 1116038 1691
1026 973366 974772 1116012 1114606 1300
1027 974823 976277 1114555 1113101 1690
1028 976234 976803 1113144 1112575 1299
1029 976871 977053 1112507 1112325 2014
1030 977082 977765 1112296 1111613 1689
1031 977762 978706 1111616 1110672 2013
1032 978776 979747 1110602 1109631 540
1033 979826 981100 1109552 1108278 541
1034 981159 981425 1108219 1107953 1688
1035 981762 981815 1107616 1107563 1687
1036 982136 982483 1107242 1106895 542
1037 982480 982953 1106898 1106425 1298
1038 983025 983486 1106353 1105892 912
1039 983483 983821 1105895 1105557 543
1040 983802 984371 1105576 1105007 1686
1041 984359 985399 1105019 1103979 2012
1042 985204 986352 1104174 1103026 1297
1043 986349 986912 1103029 1102466 1685
1044 986851 987246 1102527 1102132 1296
1045 987243 987566 1102135 1101812 1684
1046 987517 988383 1101861 1100995 1295
1047 988383 989573 1100995 1099805 1683
1048 989577 989894 1099801 1099484 1682
1049 990762 991511 1098616 1097867 913
1050 991803 991991 1097575 1097387 914
1051 992036 993010 1097342 1096368 2011
1052 994241 995020 1095137 1094358 544
1053 995047 995112 1094331 1094266 184
1054 995380 995844 1093998 1093534 185
1055 995878 996558 1093500 1092820 1294
1056 997037 998464 1092341 1090914 545
1057 998525 999265 1090853 1090113 2010
1058 999750 1000229 1089628 1089149 915
1059 1000226 1001212 1089152 1088166 546
1060 1001217 1001987 1088161 1087391 916
1061 1002002 1003240 1087376 1086138 2009
1062 1003253 1005466 1086125 1083912 547
1063 1005467 1006087 1083911 1083291 2008
1064 1006202 1007890 1083176 1081488 2007
1065 1007979 1010192 1081399 1079186 1681
1066 1010189 1010956 1079189 1078422 2006
1067 1011011 1011949 1078367 1077429 2005
1068 1012013 1012879 1077365 1076499 548
1069 1012961 1013278 1076417 1076100 549
1070 1013371 1013883 1076007 1075495 186
1071 1013995 1014411 1075383 1074967 1293
1072 1014829 1017228 1074549 1072150 187
1073 1017331 1020711 1072047 1068667 188
1074 1020821 1020970 1068557 1068408 2004
1075 1021424 1022338 1067954 1067040 550
1076 1022319 1023311 1067059 1066067 1680
1077 1023301 1023780 1066077 1065598 1292
1078 1023781 1024785 1065597 1064593 1291
1079 1024877 1025692 1064501 1063686 551
1080 1025682 1026086 1063696 1063292 1679
1081 1026083 1026376 1063295 1063002 2003
1082 1026357 1026986 1063021 1062392 1678
1083 1026983 1027579 1062395 1061799 2002
1084 1027657 1029558 1061721 1059820 189
1085 1029517 1030068 1059861 1059310 1290
1086 1030276 1030950 1059102 1058428 1289
1087 1031013 1031807 1058365 1057571 1677
1088 1031814 1032344 1057564 1057034 1676
1089 1032406 1032792 1056972 1056586 190
1090 1032841 1034373 1056537 1055005 191
1091 1034458 1035498 1054920 1053880 192
1092 1035541 1036101 1053837 1053277 193
1093 1036098 1036649 1053280 1052729 917
1094 1036636 1037469 1052742 1051909 194
1095 1037390 1038229 1051988 1051149 2001
1096 1038226 1039704 1051152 1049674 1288
1097 1039796 1040683 1049582 1048695 552
1098 1041012 1041071 1048366 1048307 918
1099 1041624 1041935 1047754 1047443 919
1100 1042133 1042384 1047245 1046994 553
1101 1042526 1043701 1046852 1045677 554
1102 1043676 1044812 1045702 1044566 1675
1103 1044809 1046068 1044569 1043310 2000
1104 1047016 1048092 1042362 1041286 195
1105 1048209 1048610 1041169 1040768 1674
1106 1048684 1048761 1040694 1040617 1287
1107 1048718 1049599 1040660 1039779 555
1108 1049596 1051275 1039782 1038103 1286
1109 1051307 1051711 1038071 1037667 1999
1110 1051708 1051995 1037670 1037383 1285
1111 1052192 1052701 1037186 1036677 556
1112 1052753 1053022 1036625 1036356 557
1113 1053032 1053793 1036346 1035585 558
1114 1053859 1055274 1035519 1034104 196
1115 1055358 1055663 1034020 1033715 920
1116 1056285 1056395 1033093 1032983 921
1117 1056392 1057381 1032986 1031997 1998
1118 1057362 1057835 1032016 1031543 1673
1119 1057832 1058302 1031546 1031076 1997
1120 1058495 1059043 1030883 1030335 559
1121 1059047 1059307 1030331 1030071 1996
1122 1059399 1059863 1029979 1029515 1672
1123 1059921 1060517 1029457 1028861 922
1124 1060582 1061310 1028796 1028068 197
1125 1061307 1061768 1028071 1027610 1671
1126 1061878 1063221 1027500 1026157 198
1127 1063298 1064599 1026080 1024779 560
1128 1064656 1065000 1024722 1024378 1284
1129 1065370 1066023 1024008 1023355 1283
1130 1066020 1067213 1023358 1022165 1670
1131 1067215 1067811 1022163 1021567 1282
1132 1067793 1068392 1021585 1020986 1669
1133 1068394 1069287 1020984 1020091 1281
1134 1069288 1071138 1020090 1018240 1280
1135 1070858 1070965 1018520 1018413 561
1136 1071135 1072622 1018243 1016756 1668
1137 1072619 1072963 1016759 1016415 1995
1138 1072960 1073688 1016418 1015690 1279
1139 1073670 1073954 1015708 1015424 1667
1140 1073951 1074343 1015427 1015035 1994
1141 1074340 1074594 1015038 1014784 1278
1142 1074591 1075124 1014787 1014254 1666
1143 1075360 1075860 1014018 1013518 1277
1144 1076013 1077278 1013365 1012100 923
1145 1077432 1077986 1011946 1011392 924
1146 1078071 1079189 1011307 1010189 1665
1147 1079201 1080472 1010177 1008906 1993
1148 1080723 1081862 1008655 1007516 925
1149 1082285 1084639 1007093 1004739 562
1150 1082363 1082779 1007015 1006599 1992
1151 1084640 1085716 1004738 1003662 1991
1152 1085820 1086698 1003558 1002680 926
1153 1086762 1086986 1002616 1002392 927
1154 1087256 1088512 1002122 1000866 1990
1155 1088568 1088813 1000810 1000565 1664
1156 1088815 1089384 1000563 999994 1276
1157 1089160 1089210 1000218 1000168 199
1158 1089484 1089639 999894 999739 1275
1159 1089909 1090604 999469 998774 1663
1160 1091118 1091525 998260 997853 1662
1161 1091646 1092197 997732 997181 928
1162 1092206 1093522 997172 995856 1989
1163 1093556 1093957 995822 995421 1988
1164 1093967 1095127 995411 994251 1987
1165 1096375 1096839 993003 992539 200
1166 1096870 1098303 992508 991075 201
1167 1098281 1098538 991097 990840 563
1168 1098554 1099156 990824 990222 564
1169 1099220 1099486 990158 989892 565
1170 1099468 1099908 989910 989470 202
1171 1099954 1100991 989424 988387 203
1172 1101073 1101510 988305 987868 1274
1173 1101868 1102326 987510 987052 1273
1174 1102786 1103181 986592 986197 1272
1175 1103673 1104461 985705 984917 1661
1176 1104585 1106492 984793 982886 929
1177 1106686 1107264 982692 982114 1271
1178 1107524 1108015 981854 981363 1986
1179 1108559 1110253 980819 979125 1985
1180 1110347 1111819 979031 977559 566
1181 1111862 1112080 977516 977298 1984
1182 1112624 1113001 976754 976377 1983
1183 1113459 1114217 975919 975161 930
1184 1114407 1117082 974971 972296 931
1185 1117577 1118029 971801 971349 567
1186 1118086 1119738 971292 969640 1270
1187 1119840 1120178 969538 969200 932
1188 1120172 1120504 969206 968874 568
1189 1120505 1121407 968873 967971 569
1190 1121408 1122520 967970 966858 1982
1191 1122517 1123746 966861 965632 1269
1192 1123810 1124472 965568 964906 204
1193 1124569 1125114 964809 964264 1268
1194 1125170 1125637 964208 963741 1981
1195 1125727 1126902 963651 962476 205
1196 1128262 1128495 961116 960883 1267
1197 1128535 1128972 960843 960406 1266
1198 1129034 1130476 960344 958902 1980
1199 1130532 1131944 958846 957434 1660
1200 1132006 1132422 957372 956956 1265
1201 1132432 1132659 956946 956719 1264
1202 1132744 1135125 956634 954253 1263
1203 1135154 1135213 954224 954165 570
1204 1135255 1137741 954123 951637 1262
1205 1138634 1138867 950744 950511 571
1206 1139159 1142494 950219 946884 572
1207 1142537 1142836 946841 946542 573
1208 1142873 1144054 946505 945324 574
1209 1144054 1145121 945324 944257 206
1210 1145177 1146514 944201 942864 575
1211 1146553 1148040 942825 941338 207
1212 1148086 1149231 941292 940147 208
1213 1150093 1151094 939285 938284 209
1214 1151091 1154534 938287 934844 1659
1215 1155108 1155464 934270 933914 933
1216 1155466 1155999 933912 933379 1261
1217 1157418 1157627 931960 931751 1658
1218 1157624 1157836 931754 931542 1979
1219 1157916 1158293 931462 931085 1657
1220 1158361 1159554 931017 929824 1260
1221 1159686 1160306 929692 929072 1656
1222 1161299 1161634 928079 927744 1978
1223 1161690 1163606 927688 925772 1655
1224 1163703 1164656 925675 924722 934
1225 1164663 1165082 924715 924296 935
1226 1165121 1165714 924257 923664 576
1227 1165724 1165948 923654 923430 577
1228 1165959 1166231 923419 923147 936
1229 1166259 1166948 923119 922430 937
1230 1167001 1167234 922377 922144 210
1231 1167503 1168657 921875 920721 1977
1232 1168678 1169472 920700 919906 1259
1233 1169576 1171024 919802 918354 1976
1234 1171021 1171905 918357 917473 1258
1235 1172047 1172277 917331 917101 211
1236 1172264 1173025 917114 916353 1975
1237 1173022 1173636 916356 915742 1257
1238 1173687 1174022 915691 915356 938
1239 1174023 1174274 915355 915104 1654
1240 1174284 1174388 915094 914990 1653
1241 1174493 1177870 914885 911508 578
1242 1178296 1178862 911082 910516 212
1243 1178840 1179322 910538 910056 579
1244 1179335 1180606 910043 908772 1974
1245 1180603 1181361 908775 908017 1256
1246 1181719 1181916 907659 907462 1255
1247 1182281 1182673 907097 906705 1973
1248 1182899 1183855 906479 905523 580
1249 1184435 1184731 904943 904647 1972
1250 1184832 1185752 904546 903626 1652
1251 1186264 1186524 903114 902854 1254
1252 1187372 1187653 902006 901725 1971
1253 1188250 1188906 901128 900472 1253
1254 1188962 1189906 900416 899472 1970
1255 1189940 1190062 899438 899316 1969
1256 1191309 1191941 898069 897437 1651
1257 1195773 1195841 893605 893537 939
1258 1196421 1196939 892957 892439 1650
1259 1197121 1197330 892257 892048 1252
1260 1197327 1197827 892051 891551 1649
1261 1197859 1198116 891519 891262 1251
1262 1198129 1198395 891249 890983 1250
1263 1198775 1198969 890603 890409 581
1264 1199210 1199536 890168 889842 1968
1265 1200465 1200542 888913 888836 940
1266 1202741 1204258 886637 885120 1967
1267 1204260 1205624 885118 883754 1648
1268 1205780 1207075 883598 882303 1966
1269 1207362 1207793 882016 881585 941
1270 1207790 1208482 881588 880896 582
1271 1209464 1210141 879914 879237 583
1272 1210174 1210893 879204 878485 213
1273 1210890 1211111 878488 878267 942
1274 1211128 1211787 878250 877591 214
1275 1211850 1212755 877528 876623 943
1276 1212760 1213104 876618 876274 1249
1277 1213101 1214369 876277 875009 1647
1278 1214366 1215214 875012 874164 1965
1279 1215250 1215861 874128 873517 1248
1280 1217374 1217490 872004 871888 215
1281 1219074 1219190 870304 870188 944
1282 1219197 1220690 870181 868688 1646
1283 1220740 1221513 868638 867865 1247
1284 1221503 1222201 867875 867177 1964
1285 1222282 1223655 867096 865723 216
1286 1223758 1225113 865620 864265 217
1287 1225113 1225991 864265 863387 945
1288 1226169 1226861 863209 862517 946
1289 1227076 1227702 862302 861676 1246
1290 1227756 1228466 861622 860912 1645
1291 1228622 1230493 860756 858885 584
1292 1230580 1233081 858798 856297 218
1293 1233236 1234546 856142 854832 585
1294 1234563 1236284 854815 853094 1644
1295 1236584 1237978 852794 851400 1963
1296 1237975 1238376 851403 851002 1245
1297 1238433 1239707 850945 849671 1643
1298 1239791 1239994 849587 849384 1962
1299 1240125 1240214 849253 849164 947
1300 1240801 1240896 848577 848482 1244
1301 1241592 1241921 847786 847457 1642
1302 1241983 1243014 847395 846364 1243
1303 1243011 1243661 846367 845717 1641
1304 1243692 1243778 845686 845600 1640
1305 1243775 1244272 845603 845106 1961
1306 1244307 1244765 845071 844613 1639
1307 1244788 1244973 844590 844405 1242
1308 1245004 1246125 844374 843253 1241
1309 1246241 1247059 843137 842319 1960
1310 1247369 1248709 842009 840669 1959
1311 1248621 1249226 840757 840152 948
1312 1250499 1251188 838879 838190 1638
1313 1251193 1251561 838185 837817 1240
1314 1251632 1253578 837746 835800 1958
1315 1253588 1253788 835790 835590 1957
1316 1254304 1255470 835074 833908 219
1317 1255582 1256436 833796 832942 1239
1318 1256379 1256846 832999 832532 1637
1319 1257402 1258961 831976 830417 949
1320 1258972 1259079 830406 830299 220
1321 1259124 1259858 830254 829520 950
1322 1259855 1260172 829523 829206 1956
1323 1260229 1262256 829149 827122 1238
1324 1262388 1262651 826990 826727 951
1325 1262709 1264661 826669 824717 952
1326 1264658 1265074 824720 824304 1955
1327 1265145 1265591 824233 823787 953
1328 1265593 1266390 823785 822988 221
1329 1266750 1267955 822628 821423 954
1330 1268130 1269137 821248 820241 1636
1331 1269155 1270042 820223 819336 1954
1332 1270062 1271162 819316 818216 1635
1333 1271162 1272181 818216 817197 1953
1334 1272174 1273103 817204 816275 1634
1335 1273100 1274158 816278 815220 1952
1336 1274151 1275281 815227 814097 1633
1337 1275461 1276135 813917 813243 1951
1338 1276120 1276689 813258 812689 1237
1339 1276727 1278301 812651 811077 1950
1340 1278636 1279535 810742 809843 1632
1341 1279958 1280587 809420 808791 1949
1342 1280661 1281740 808717 807638 955
1343 1281804 1282397 807574 806981 1631
1344 1282384 1283034 806994 806344 1236
1345 1283055 1284251 806323 805127 1630
1346 1284667 1285869 804711 803509 222
1347 1285975 1289823 803403 799555 223
1348 1290019 1292922 799359 796456 224
1349 1293396 1293860 795982 795518 1629
1350 1294892 1295722 794486 793656 586
1351 1295748 1297115 793630 792263 956
1352 1297116 1298444 792262 790934 1628
1353 1298625 1298846 790753 790532 957
1354 1299189 1300220 790189 789158 1627
1355 1300290 1301624 789088 787754 1626
1356 1301759 1302934 787619 786444 1948
1357 1302931 1303617 786447 785761 1235
1358 1303690 1304454 785688 784924 1234
1359 1304451 1305239 784927 784139 1625
1360 1305236 1306249 784142 783129 1947
1361 1306246 1306722 783132 782656 1233
1362 1306665 1307039 782713 782339 1624
1363 1307076 1307963 782302 781415 1623
1364 1307989 1309053 781389 780325 1232
1365 1309106 1309948 780272 779430 587
1366 1309950 1311020 779428 778358 958
1367 1311965 1313317 777413 776061 1946
1368 1313412 1314224 775966 775154 1622
1369 1315661 1315879 773717 773499 1945
1370 1316041 1316151 773337 773227 1231
1371 1316410 1317765 772968 771613 225
1372 1317762 1318001 771616 771377 959
1373 1317998 1318528 771380 770850 588
1374 1318585 1319298 770793 770080 226
1375 1319308 1319637 770070 769741 227
1376 1319620 1320078 769758 769300 1230
1377 1321326 1322096 768052 767282 960
1378 1322102 1322401 767276 766977 1944
1379 1322840 1323004 766538 766374 1943
1380 1323183 1323788 766195 765590 1621
1381 1323802 1324827 765576 764551 1229
1382 1325139 1325336 764239 764042 1620
1383 1325369 1325800 764009 763578 1942
1384 1325787 1326215 763591 763163 1619
1385 1326222 1326593 763156 762785 1618
1386 1326738 1327526 762640 761852 1617
1387 1327548 1327970 761830 761408 1616
1388 1327967 1328509 761411 760869 1941
1389 1328520 1329077 760858 760301 1615
1390 1329084 1329671 760294 759707 1614
1391 1330058 1330213 759320 759165 589
1392 1330540 1331565 758838 757813 1228
1393 1331777 1332007 757601 757371 1940
1394 1332043 1332753 757335 756625 1227
1395 1332861 1333112 756517 756266 1613
1396 1333113 1333694 756265 755684 1612
1397 1333706 1333999 755672 755379 1939
1398 1334020 1334550 755358 754828 1226
1399 1334537 1335136 754841 754242 1938
1400 1335210 1336667 754168 752711 1611
1401 1336699 1337145 752679 752233 1225
1402 1337157 1337624 752221 751754 1610
1403 1337636 1338343 751742 751035 1937
1404 1338340 1338954 751038 750424 1224
1405 1338956 1339411 750422 749967 1936
1406 1339413 1339793 749965 749585 1609
1407 1339810 1340373 749568 749005 1223
1408 1340375 1340767 749003 748611 1935
1409 1340779 1340949 748599 748429 1222
1410 1340951 1341502 748427 747876 1934
1411 1341516 1342247 747862 747131 1608
1412 1342247 1342612 747131 746766 1933
1413 1342624 1343049 746754 746329 1221
1414 1343053 1343406 746325 745972 1220
1415 1343394 1343660 745984 745718 1607
1416 1343657 1343953 745721 745425 1932
1417 1343960 1344160 745418 745218 1931
1418 1344147 1344785 745231 744593 1606
1419 1344782 1345252 744596 744126 1930
1420 1345263 1345673 744115 743705 1605
1421 1345670 1346398 743708 742980 1929
1422 1346403 1346663 742975 742715 1604
1423 1346670 1347437 742708 741941 1603
1424 1347448 1348488 741930 740890 1219
1425 1348490 1349344 740888 740034 1928
1426 1349882 1351258 739496 738120 1927
1427 1351322 1352506 738056 736872 1926
1428 1352613 1353269 736765 736109 1602
1429 1354574 1355740 734804 733638 590
1430 1355821 1356402 733557 732976 1218
1431 1356606 1357514 732772 731864 961
1432 1357517 1358350 731861 731028 1925
1433 1358441 1359433 730937 729945 1924
1434 1361181 1362461 728197 726917 962
1435 1362449 1362523 726929 726855 591
1436 1363010 1363930 726368 725448 1923
1437 1363972 1365465 725406 723913 1217
1438 1365589 1366155 723789 723223 228
1439 1366195 1367346 723183 722032 229
1440 1367357 1368481 722021 720897 592
1441 1368582 1369193 720796 720185 963
1442 1369248 1370567 720130 718811 964
1443 1370627 1370989 718751 718389 1922
1444 1371847 1372125 717531 717253 230
1445 1372322 1373752 717056 715626 593
1446 1373902 1376664 715476 712714 231
1447 1376921 1378402 712457 710976 594
1448 1378470 1379534 710908 709844 1601
1449 1379649 1380014 709729 709364 965
1450 1379981 1380445 709397 708933 1921
1451 1380532 1381284 708846 708094 1216
1452 1381281 1382687 708097 706691 1600
1453 1382767 1384572 706611 704806 232
1454 1384569 1385354 704809 704024 1599
1455 1385351 1385914 704027 703464 1920
1456 1386061 1387578 703317 701800 1215
1457 1387922 1388011 701456 701367 595
1458 1388004 1389050 701374 700328 1598
1459 1388485 1388589 700893 700789 233
1460 1389047 1389982 700331 699396 1919
1461 1390108 1390617 699270 698761 234
1462 1390656 1391165 698722 698213 966
1463 1391397 1391669 697981 697709 967
1464 1393980 1394540 695398 694838 968
1465 1396169 1396951 693209 692427 596
1466 1396965 1397522 692413 691856 969
1467 1397528 1397968 691850 691410 1918
1468 1398271 1399176 691107 690202 235
1469 1399173 1400693 690205 688685 970
1470 1400690 1401382 688688 687996 597
1471 1401502 1401813 687876 687565 236
1472 1401815 1403806 687563 685572 598
1473 1403824 1404309 685554 685069 237
1474 1404349 1404960 685029 684418 238
1475 1404957 1406060 684421 683318 971
1476 1406057 1406365 683321 683013 599
1477 1406372 1407382 683006 681996 600
1478 1407475 1408257 681903 681121 239
1479 1408254 1409654 681124 679724 972
1480 1409674 1410327 679704 679051 240
1481 1410413 1411189 678965 678189 601
1482 1411199 1411954 678179 677424 602
1483 1411938 1413167 677440 676211 973
1484 1413235 1413960 676143 675418 241
1485 1413935 1414642 675443 674736 603
1486 1414943 1415797 674435 673581 604
1487 1415800 1418658 673578 670720 1214
1488 1418655 1420457 670723 668921 1597
1489 1420450 1420923 668928 668455 1213
1490 1421049 1422080 668329 667298 1596
1491 1422217 1422759 667161 666619 242
1492 1422740 1423594 666638 665784 1917
1493 1423617 1424129 665761 665249 1595
1494 1424266 1424787 665112 664591 243
1495 1424787 1428260 664591 661118 974
1496 1428306 1428734 661072 660644 975
1497 1428842 1430410 660536 658968 605
1498 1430421 1430807 658957 658571 976
1499 1430801 1431283 658577 658095 606
1500 1431290 1432483 658088 656895 607
1501 1432547 1433398 656831 655980 608
1502 1433432 1434445 655946 654933 609
1503 1434874 1435398 654504 653980 244
1504 1435395 1436108 653983 653270 1594
1505 1436180 1436593 653198 652785 1916
1506 1436645 1436935 652733 652443 1915
1507 1436958 1437776 652420 651602 1593
1508 1437769 1438527 651609 650851 1212
1509 1438502 1439275 650876 650103 1914
1510 1439272 1439982 650106 649396 1211
1511 1439994 1440776 649384 648602 1592
1512 1441115 1441582 648263 647796 610
1513 1441557 1441976 647821 647402 1591
1514 1441888 1442184 647490 647194 1210
1515 1442268 1442525 647110 646853 977
1516 1442602 1444524 646776 644854 245
1517 1444521 1444967 644857 644411 1590
1518 1445288 1446001 644090 643377 1913
1519 1446421 1446744 642957 642634 1209
1520 1447018 1447827 642360 641551 246
1521 1447763 1448299 641615 641079 1912
1522 1448354 1448527 641024 640851 1911
1523 1448733 1449227 640645 640151 978
1524 1449764 1450072 639614 639306 611
1525 1450076 1451272 639302 638106 612
1526 1451362 1452348 638016 637030 247
1527 1452345 1452566 637033 636812 1589
1528 1452921 1453571 636457 635807 1588
1529 1453739 1453954 635639 635424 613
1530 1454658 1454753 634720 634625 1587
1531 1455780 1457495 633598 631883 1586
1532 1458373 1458516 631005 630862 1208
1533 1460859 1461371 628519 628007 1585
1534 1461343 1461726 628035 627652 1207
1535 1462494 1463108 626884 626270 1584
1536 1463105 1464283 626273 625095 1910
1537 1464255 1466492 625123 622886 1583
1538 1466599 1467609 622779 621769 1206
1539 1467655 1467744 621723 621634 248
1540 1467769 1467906 621609 621472 249
1541 1467891 1468676 621487 620702 1582
1542 1468498 1469019 620880 620359 1205
1543 1469265 1470533 620113 618845 979
1544 1470609 1471790 618769 617588 1581
1545 1471812 1471937 617566 617441 1580
1546 1471870 1472673 617508 616705 250
1547 1474731 1474928 614647 614450 1579
1548 1475072 1475983 614306 613395 1909
1549 1477107 1477574 612271 611804 980
1550 1477584 1479029 611794 610349 1578
1551 1479030 1479884 610348 609494 1577
1552 1480088 1480873 609290 608505 614
1553 1480960 1481781 608418 607597 1204
1554 1481753 1481869 607625 607509 1908
1555 1482049 1482780 607329 606598 1203
1556 1484422 1486413 604956 602965 251
1557 1486448 1488211 602930 601167 615
1558 1488253 1489308 601125 600070 1202
1559 1489417 1490157 599961 599221 252
1560 1490211 1490753 599167 598625 981
1561 1490896 1491087 598482 598291 253
1562 1491222 1491395 598156 597983 1576
1563 1491406 1491738 597972 597640 1201
1564 1491692 1492225 597686 597153 1907
1565 1492222 1492431 597156 596947 1200
1566 1492428 1493000 596950 596378 1575
1567 1493037 1493573 596341 595805 1574
1568 1493631 1494593 595747 594785 1573
1569 1494613 1495560 594765 593818 1199
1570 1495557 1496564 593821 592814 1572
1571 1496677 1497216 592701 592162 1198
1572 1497231 1497902 592147 591476 1571
1573 1498015 1498506 591363 590872 1197
1574 1499893 1500954 589485 588424 1196
1575 1500975 1501334 588403 588044 982
1576 1501234 1501755 588144 587623 254
1577 1501752 1502747 587626 586631 983
1578 1502782 1504029 586596 585349 255
1579 1503705 1503881 585673 585497 1570
1580 1506454 1507683 582924 581695 256
1581 1507680 1508369 581698 581009 984
1582 1508513 1509250 580865 580128 616
1583 1509284 1511584 580094 577794 1906
1584 1512986 1513759 576392 575619 617
1585 1513756 1514835 575622 574543 257
1586 1515877 1516842 573501 572536 258
1587 1518510 1518569 570868 570809 1569
1588 1519816 1521600 569562 567778 259
1589 1519824 1519925 569554 569453 1568
1590 1521735 1522592 567643 566786 985
1591 1523210 1524667 566168 564711 618
1592 1525075 1526076 564303 563302 260
1593 1526066 1526449 563312 562929 1905
1594 1529489 1530295 559889 559083 619
1595 1530296 1530733 559082 558645 620
1596 1530894 1536164 558484 553214 986
1597 1536298 1536771 553080 552607 261
1598 1536811 1537365 552567 552013 262
1599 1540326 1541702 549052 547676 987
1600 1541901 1543691 547477 545687 1567
1601 1543754 1544062 545624 545316 621
1602 1544093 1544920 545285 544458 622
1603 1544970 1545347 544408 544031 988
1604 1545432 1545968 543946 543410 1566
1605 1546165 1549362 543213 540016 263
1606 1549370 1549522 540008 539856 1904
1607 1550195 1551454 539183 537924 1903
1608 1551384 1551506 537994 537872 989
1609 1551637 1552008 537741 537370 1195
1610 1551975 1552217 537403 537161 1565
1611 1552330 1553088 537048 536290 264
1612 1553108 1555480 536270 533898 1902
1613 1555474 1556295 533904 533083 1194
1614 1556455 1557438 532923 531940 1193
1615 1557416 1558507 531962 530871 1901
1616 1558390 1559334 530988 530044 1192
1617 1559337 1560350 530041 529028 1564
1618 1560382 1561011 528996 528367 1191
1619 1561392 1562597 527986 526781 1563
1620 1562832 1564286 526546 525092 990
1621 1564489 1564938 524889 524440 265
1622 1564960 1565772 524418 523606 1190
1623 1565943 1569653 523435 519725 991
1624 1569699 1571144 519679 518234 1562
1625 1570858 1571220 518520 518158 266
1626 1571217 1572563 518161 516815 1561
1627 1572612 1573637 516766 515741 1560
1628 1573641 1573748 515737 515630 1559
1629 1573710 1575680 515668 513698 992
1630 1575753 1577099 513625 512279 993
1631 1577138 1578040 512240 511338 623
1632 1578037 1579284 511341 510094 267
1633 1579294 1582596 510084 506782 268
1634 1582707 1583825 506671 505553 994
1635 1583858 1584259 505520 505119 624
1636 1584289 1585641 505089 503737 269
1637 1585646 1586575 503732 502803 1900
1638 1586361 1588547 503017 500831 99
51639158859715889625007815004162701640158891915902145004594991646251641159029815915784990804978002711642159190215923724974764970061558164315927691593515496609495863996164415936821594884495696494494118916451595017159532549436149405327216461596465159705849291349232015571647159775115985094916274908691899164815986761599902490702489476997164915998861600935489492488443273165016012201601777488158487601998165116037271603786485651485592626165216040881604264485290485114155616531604708160604848467048333062716541606039160690248333948247611881655160691216076854824664816931187165616076631607971481715481407189816571608213160922048116548015815551658160923116101904801474791881186165916102021611623479176477755155416601611635161268447774347669418971661161286516153124765134740661896166216156531616882473725472496999166316168601617561472518471817274166416175581618517471820470861100016651617756161781547162247156315531666161857816192764708004701021001166716192631621227470115468151118516681621305162193446807346744415521669162273516229204666434664586281670162292216241124664564652661002167116241331625287465245464091629167216253211625563464057463815630167316256281625717463750463661100316741625816162592946356246344963116751625919162682446345946255415511676162700916276144623694617641184167716277931629337461585460041632167816294351630595459943458783100416791630596163172045878245765810051680163063716307054587414586731895168116317991633073457579456305100616821633129163325745624945612127516831634125163473945525345463927616841634253163436945512545500915501685163474416350464546344543326331686163504916363654543294530131183168716363761637356453002452022634168816373361638673452042450705189416891638670163975545070844962311821690163975216408164496264485621549169116409371641557448441447821154816921641581164354544779744583318931693164371216440384456664453401007169416440351644664445343444714189216951644711164583244466744354610081696164584216461954435364431831009169716465501647749442828441629101016981651192165
26914381864366871181169916528421653462436536435916277170016534431654624435935434754635170116546761655512434702433866636170216559241656976433454432402189117031657257165821043212143116815471704165863316588574307454305211890170516595401660034429838429344101117061660137166061642924142876210121707166060516610334287734283451546170816612931661439428085427939278170916615191662583427859426795188917101662585166601942679342335915451711166618516665054231934228731544171216670461668500422332420878154317131668573166891442080542046410131714166887116699444205074194342791715166994116718964194374174821542171616718561672545417522416833118017171672642167268641673641669211791718167271316730964166654162821541171916739651674999415413414379117817201675448167654541393041283363717211676630167779041274841158863817221677812167863641156641074263917231678705167955341067340982528017241679540168037040983840900864017251680367168112840901140825028117261681383168173040799540764810141727168174016823334076384070451015172816824281682817406950406561282172916828181683495406560405883117717301683568168457840581040480011761731168443916845644049394048146411732168553516866894038434026891540173316868691687045402509402333642173416870891687931402289401447101617351687932168929940144640007915391736168939916901753999793992031017173716910031692442398375396936188817381692515169318039686339619864317391693184169348939619439588964417401693499169405639587939532264517411694157169562939522139374910181742169564216962653937363931131538174316962751697726393103391652153717441697807169814539157139123364617451699092169917839028639020010191746169962217001733897563892051887174717002101701493389168387885188617481703531170416338584738521564717491704224170497038515438440818851750170498917051413843893842371884175117053671706314384011383064188317521706139170698438323938239410201753170698617073783823923820002831754170737517081333820033812451536175517081681710714381210378664117517561710855171148737852337789115351757171277817140403766003753
38102117581714040171624737533837313164817591716248172164437313036773464917601721669172240636770936697265017611722894172343636648436594210221762172522217258603641563635181023176317258571726705363521362673188217641727964172902236141436035610241765172902917297873603493595911025176617297841730227359594359151651176717302701731955359108357423652176817319451732280357433357098153417691732332173298235704635639615331770173299817331203563803562581532177117334731734267355905355111284177217342551735046355123354332153117731735212173579335416635358510261774173641917365203529593528582851775173645617368963529223524826531776173689317374233524853519551174177717376201738414351758350964188117781738777173950535060134987311731779173950217398523498763495261530178017399351740549349443348829117217811740792174182634858634755210271782174192617437043474523456741028178317436941743957345684345421117117841743938174424334544034513518801785174424517455913451333437871529178617456501746300343728343078286178717468941747268342484342110102917881747308174866034207034071810301789174975517499313396233394471879179017499001749992339478339386103117911750416175154333896233783515281792175171717527933376613365851878179317527951753493336583335885152717941753468175529133591033408711701795175544417561003339343332781526179617561331756924333245332454187717971757029175746033234933191811691798175749417587353318843306431168179917588701758998330508330380152518001760394176073532898432864310321801176216617625583272123268201876180217626761762846326702326532654180317628431763493326535325885116718041763590176414132578832523728718051764136176460932524232476911661806176470417658043246743235746551807176584017666823235383226962881808176667917670683226993223101033180917670791767885322299321493116518101767919176826932145932110911641811176827117693503211073200281875181217694691770143319909319235152418131770892177216931848631720928918141772144177271931723431665918741815177265317733033167253160751163181617735711774485315807314893116218
17177448917751453148893142331161181817751391776068314239313310152318191776073177654031330531283811601820177658617772933127923120852901821177728117778113120973115671034182217777991778830311579310548656182317790691779554310309309824103518241779558177992330982030945515221825177997917816193093993077591159182617815971782928307781306450657182717828661783828306512305550187318281784010178459430536830478410361829178477417849533046043044256581830178495517861513044233032271037183117861481787092303230302286659183217871471787473302231301905660183317874851788669301893300709291183417886711789675300707299703661183517897141790697299664298681292183617907051791568298673297810662183717916241791959297754297419103818381791963179276929741529660910391839179279217933282965862960502931840179332517945242960532948541521184117945211794823294857294555187218421794964179612429441429325429418431796129179715429324929222418711844179723517975612921432918171158184517975611797665291817291713152018461797874179811629150429126211571847179815818005452912202888331519184818006861801306288692288072187018491801592180212528778628725366318501802245180336328713328601511561851180336318036022860152857761518185218036661804280285712285098104018531804317180453528506128484315171854180457118050472848072843311869185518055211805853283857283525115518561805911180665728346728272111541857180665418070732827242823051516185818071611808084282217281294104118591808249180840428112928097466418601808394180881928098428055915151861180898518116182803932777601042186218117441812487277634276891665186318125181813510276860275868186818641813353181355027602527582810431865181363818140542757402753241514186618141411814644275237274734186718671814559181464827481927473010441868181482918159622745492734161045186918159591817002273419272376666187018169991817745272379271633295187118177561818715271622270663667187218195701819776269808269602115318731820187182093626919126844215131874182096118216592684172677191512187518216591821841267719267537186618761822105182
30732672732663052961877182370218237822656762655961865187818238571824675265521264703297187918246621825624264716263754186418801825648182615126373026322729818811826226182650426315226287415111882182657218268862628062624922991883182685918274702625192619081046188418275631828408261815260970186318851828493182969826088525968066818861829731183055825964725882030018871830621183111525875725826315101888183107618316452583022577331862188918316991832772257679256606301189018327771833709256601255669669189118337061834158255672255220115218921834155183485625522325452215091893183499218356032543862537751047189418355811836201253797253177302189518362391837111253139252267670189618371081838508252270250870115118971838515183984625086324953211501898183984318428212495352465571508189918429961844864246382244514150719001844947184527324443124410530319011845241184594224413724343611491902184593218461682434462432106711903184626718471842431112421941148190418471911848111242187241267114719051848117184966424126123971415061906185343718537422359412356361146190718538261853894235552235484104819081853933185460723544523477118611909185461218558322347662335461505191018559281857586233450231792186019111857656185801223172223136667219121858017185930023136123007815041913185938018596072299982297711145191418596951860141229683229237114419151860556186074122882222863711431916186081418621002285642272781142191718620971862900227281226478150319181862902186378622647622559211411919186378318648952255952244831502192018656561866711223722222667304192118666931867223222685222155104919221867473186866622190522071210501923186869618696372206822197416731924186964318701432197352192353051925187083318718612185452175171051192618720151872557217363216821105219271872533187281121684521656767419281872808187317921657021619930619291873176187344221620221593610531930187343918737352159392156436751931187373218741812156462151973071932187416918745372152092148411054193318745341876078214844213300676193418760711876427213307212951105519351876465187699521291321238
33081936187699218775612123862118171056193718775581878838211820210540677193818788431879835210535209543105719391879832188026320954620911567819401880264188079720911420858118591941188078418812782085942081001501194218812711881759208107207619114019431881790188227220758820710611391944188233418835422070442058366791945188354318840762058352053026801946188415718851492052212042293091947188528118866272040972027511058194818866711887270202707202108310194918872671887560202111201818150019501887544188821820183420116011381951188872418900252006541993536811952189000618905571993721988211499195318906341894026198744195352311195418943181894365195060195013312195518944421895158194936194220682195618952221895692194156193686185819571895730189628419364819309414981958189633018968181930481925601497195918968861897806192492191572313196018978031898744191575190634149619611898830189925519054819012311371962189930919001781900691892001059196319001711900881189207188497113619641901205190172018817318765814951965190178319027061875951866726831966190274619032731866321861056841967190327719044341861011849446851968190443119054621849471839163141969190550119063371838771830411060197019063341907098183044182280185719711907089190806618228918131211351972190812719094611812511799171134197319095171910014179861179364686197419100231910727179355178651315197519120101912546177368176832687197619126511912902176727176476316197719129211913589176457175789113319781913472191405017590617532814941979191438719148121749911745661493198019148821916204174496173174149219811916252191647917312617289968819821916521191735117285717202731719831917310191787917206817149911321984191821519187091711631706691061198519186931920390170685168988113119861920429192133116894916804714911987192140719230651679711663131490198819233771923970166001165408185619891923967192431716541116506111301990192447819262501649001631286891991192625219265661631261628121062199219267071929025162671160353690199319290371930491160341158887112919941930573193092015880515845831819951930917193
15881584611577901063199619315351932002157843157376148919971932193193292715718515645131919981932928193323615645015614211281999193330619335781560721558003202000193367119340511557071553271064200119340291935735155349153643112720021935745193665015363315272811262003193688819378351524901515431125200419379651939305151413150073112420051941378194186314800014751510652006194218419425071471941468716912007194261819445761467601448021123200819447291945865144649143513148820091945993194634914338514302911222010194732819484461420501409323212011194836819498341410101395441066201219497881951875139590137503112120131951825195319213755313618632220141953189195447813618913490010672015195454019552081348381341703232016195525319573941341251319841068201719573971958206131981131172185520181958454195897513092413040314872019195938419599801299941293981486202019599971960209129381129169112020211961911196569012746712368811192022196222619623601271521270183242023196456719646291248111247496922024196587319666581235051227201069202519668991969403122479119975107020261969396197065211998211872632520271970804197126211857411811669320281971328197167211805011770632620291971682197239511769611698332720301972493197385111688511552769420311974299197535711507911402118542032197569519770171136831123611071203319769711977399112407111979111820341977396197770411198211167414852035197781919784001115591109781484203619783971978993110981110385185320371978966197976911041210960911172038197986619804891095121088893282039198048419809421088941084361116204019809461981878108432107500111520411981986198289710739210648110722042198289419833071064841060716952043198357319843251058051050531483204419843691985724105009103654111420451985942198752210343610185669620461987535198884810184310053018522047198888319896711004959970714822048198971219907019966698677111320491991043199202998335973491481205019921781993323972009605511122051199332019939289605895450148020521993956199468495422946941479205319946811995694946979368418512054199573119970629364792316185020551
99706219997139231689665111120561999710200109289668882861478205720012332003020881458635818492058200313620037118624285667107320592003696200421785682851616972060200422020045768515884802111020612004890200494384488844356982062200518820066158419082763147720632006536200913682842802423292064200913320106418024578737107420652010697201201378681773653302066201207220123147730677064699206720123112012514770677686411092068201271220135727666675806147620692013609201466175769747171475207020145252015568748537381011082071201563220165647374672814110720722016684201742172694719571075207320173782018802720007057633120742019182201940670196699721848207520197632020425696156895311062076202043520210766894368302110520772021157202152268221678561076207820214952022214678836716470020792022269202311167109662677012080202534020254176403863961332208120286312028912607476046633320822028914202948960464598897022083202948320300945989559284110420842030142203102359236583551474208520311382032727582405665110772086203273420334205664455958147320872033501203446655877549127032088203433020356105504853768107820892035637203625453741531247042090203633120365945304752784107920912036609203724452769521347052092203729020382195208851159706209320382192039394511594998433420942039429204004049949493387072095203999420403264938449052108020962040316204081649062485621103209720407972041732485814764618472098204301020442034636845175110220992044340204517045038442087082100204512720460324425143346147221012046077204739943301419797092102204740620477804197241598710210320477772048313416014106511012104204832020490994105840279110021052049106204947140272399071099210620506972051614386813776471121072051664205190037714374781081210820518882052298374903708071221092052295205301437083363643352110205312520531903625336188108221112055992205714633386322321846211220572042057467321743191118452113205747720586553190130723184421142058742205914930636302291098211520593102059501300682987771321162059560206080129818285771083211720608192061598285592778071421182061501206
19112787727467108421192061997206244627381269321097212020624482062966269302641218432121206296620636072641225771109621222063612206421425766251641842212320642802065428250982395010952124206547120667782390722600109421252066863206755822515218203362126206762320683842175520994715212720683842069838209941954033721282069828207018419550191941841212920701892070728191891865014712130207077820715991860017779109321312071722207206917656173091085213220720662072986173121639271621332073002207349016376158887172134207353420737371584415641147021352074012207542415366139543382136207555720761621382113216339213720761992076411131791296710922138207652820769591285012419108621392076986207766312392117157182140207770320781521167511226719214120781642078964112141041410912142207900120800261037793521090214320803192082169905972097202144208237620828977002648134021452082919208328464596094108921462083288208400760905371108821472084057208531653214062184021482085470208711039082268721214920872162088568216281018392150208867020889217084573412151208890520893784730722


In one embodiment, such a region is selected from the group consisting of genes (1) through (2151).


In the above Table, translated amino acid sequences usually start with methionine and are identified as “amino acid SEQ ID NO: Y” (SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837, and 1839-2157); however, the other reading frames may also be readily translated using known molecular biological techniques. It is also understood that a polypeptide produced from another open reading frame is also encompassed in the scope of the present invention.


The accuracy of the sequence disclosed herein is sufficient and suitable for a variety of applications well known in the art and further described hereinbelow. For example, the sequence of the open reading frame region of SEQ ID NO: 1 is useful for designing a nucleic acid hybridization probe for detecting cDNA containing the nucleic acid sequence of the open reading frame. These probes also hybridize with a nucleic acid molecule in a biological sample, thereby enabling a variety of forensic and diagnostic methods of the present invention. Similarly, the polypeptide identified by SEQ ID NO: Z may be used, for example, to produce an antibody specifically binding to a protein (including a polypeptide and a secreted protein) encoded by an open reading frame identified herein.


Although we have analyzed the sequence of the present invention with special care, DNA sequences produced by sequencing reactions may contain sequencing errors. Such an error may be present as an incorrectly identified nucleotide, or as an insertion or a deletion of a nucleotide, in the DNA sequence produced. An incorrectly inserted or deleted nucleotide causes a frame shift in the deduced amino acid sequence of the reading frame. In such cases, the DNA sequence produced may be more than 99.9% identical to the actual sequence (for example, one base insertion or deletion in an open reading frame of over 1000 bases), but the deduced amino acid sequence may differ from the actual amino acid sequence.
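The effect of a single inserted base can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the reading frame below is a made-up 1002-base sequence (not taken from SEQ ID NO: 1), the standard genetic code is assumed, and the names `translate`, `true_orf` and `read_orf` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA string in frame 0, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Hypothetical 1002-base reading frame: ATG followed by 333 GCT (Ala) codons.
true_orf = "ATG" + "GCT" * 333
# Simulated sequencing error: one spurious base inserted at position 500.
read_orf = true_orf[:500] + "A" + true_orf[500:]

# Aligning the two sequences needs a single gap column, so identity is
# (matching columns) / (alignment length).
nucleotide_identity = 100 * len(true_orf) / (len(true_orf) + 1)

true_protein = translate(true_orf)
read_protein = translate(read_orf)
# Index of the first residue at which the two deduced proteins disagree.
first_mismatch = next(
    i for i, (a, b) in enumerate(zip(true_protein, read_protein)) if a != b
)

print(f"nucleotide identity: {nucleotide_identity:.2f}%")
print(f"proteins diverge from residue index {first_mismatch} of {len(true_protein)}")
```

In this toy case the nucleotide sequences stay above 99.9% identical, while every residue deduced downstream of the insertion changes.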


Accordingly, for applications where accuracy is required in the nucleotide or amino acid sequence, the present invention also provides the nucleic acid sequence and the amino acid sequence encoded by the genome of Thermococcus kodakaraensis KOD1 of the present invention, which was deposited in the International Patent Organism Depositary (IPOD). Those skilled in the art may determine a more accurate sequence by sequencing the deposited Thermococcus kodakaraensis KOD1 of the present invention. Also provided in the present invention are allelic variants, orthologs, and/or species homologs.


In another aspect, the present invention provides a nucleic acid molecule per se having a sequence set forth in SEQ ID NO: 1 or 1087. The nucleic acid molecule per se is useful in the targeted gene disruption method of the present invention.


In another aspect, the present invention provides a nucleic acid molecule comprising a sequence of at least eight contiguous nucleotides of the sequence set forth in SEQ ID NO: 1 or 1087.
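As a rough illustration of what "at least eight contiguous nucleotides" means in practice, the following Python sketch checks whether a candidate molecule shares any run of eight bases with a reference sequence. The reference used here is a short made-up string standing in for SEQ ID NO: 1 or 1087, the function names are hypothetical, and only the given strand is compared (the reverse complement is not considered).

```python
def kmers(sequence, k):
    """Return the set of all contiguous k-base windows in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shares_contiguous_run(candidate, reference, k=8):
    """True if the candidate contains at least k contiguous bases of the reference."""
    return not kmers(candidate, k).isdisjoint(kmers(reference, k))


# Hypothetical stand-in for the genomic sequence.
toy_reference = "ATGGCTAAGGTTCCAGATCTGAAC"

# A candidate carrying an 8-base fragment of the reference between flanks.
with_fragment = "TTTT" + toy_reference[5:13] + "CCCC"
# A candidate with no 8-base run in common with the reference.
unrelated = "ACACACACACAC"

print(shares_contiguous_run(with_fragment, toy_reference))
print(shares_contiguous_run(unrelated, toy_reference))
```

Any run longer than eight bases necessarily contains an eight-base window, so testing windows of exactly `k` bases is enough.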


As used herein, the term “probe” refers to a substance for use in searching, which is a nucleic acid sequence having a variable length. Probes vary depending on the use thereof. Examples of a nucleic acid molecule as a common probe include one having a nucleic acid sequence of at least about 8 nucleotides in length, preferably at least about 10 nucleotides, preferably at least about 15 nucleotides, preferably at least about 20 nucleotides, preferably at least about 30 nucleotides, preferably at least about 40 nucleotides, preferably at least about 50 nucleotides, preferably at least about 100 nucleotides, or may be at least about 6000 nucleotides. Probes are used for detecting an identical, similar or complementary nucleic acid sequence. Longer probes are usually obtained from natural or recombinant sources, are very specific, and hybridize much more slowly than oligomers. Probes may be single- or double-stranded, and are designed to have specificity in technologies such as PCR, membrane-based hybridization, ELISA and the like.


As used herein, the term “primer” refers to a nucleic acid sequence having variable length, and serves for initiation of elongation of a polynucleotide strand in a synthetic reaction of a nucleic acid such as a PCR. Examples of a nucleic acid molecule as a common primer include one having a nucleic acid sequence having a length of at least about 6 nucleotides, at least about 7 nucleotides, at least about 8 nucleotides, preferably at least about 10 nucleotides, preferably at least about 15 nucleotides, at least about 17 nucleotides, preferably at least about 20 nucleotides, preferably at least about 30 nucleotides, preferably at least about 40 nucleotides, preferably at least about 50 nucleotides, preferably at least about 100 nucleotides, or may be at least about 6000 nucleotides.


In one aspect, the present invention provides a polypeptide having an amino acid sequence selected from a group consisting of any Gene ID (1) through (2151) as listed in Table 1 (namely, SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837, and 1839-2157). The polypeptide of the present invention is preferably fused to another protein. These fusion proteins may be used for a variety of applications. For example, fusion of His tag, HA tag, Protein A, IgG domain and maltose binding protein to the polypeptide of the present invention facilitates purification (see also EP A 394,827, Traunecker et al., Nature, 331:84-86(1988)).


In another aspect, the present invention provides a peptide molecule comprising at least a portion of an amino acid sequence selected from a group consisting of any Gene ID (1) through (2151) as listed in Table 1 (namely, SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837, and 1839-2157). Such peptide molecules may be used as an epitope. Preferably, such a peptide molecule may comprise a sequence of at least about 4 amino acids, at least about 5 amino acids, at least about 6 amino acids, at least about 7 amino acids, at least about 8 amino acids, at least about 9 amino acids, at least about 10 amino acids, at least about 15 amino acids, at least about 20 amino acids, at least about 30 amino acids, at least about 40 amino acids, at least about 50 amino acids, or at least about 100 amino acids. The longer the peptide, the higher its specificity.
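The peptide fragments of a given length described above can be enumerated with a simple sliding window. The Python sketch below is illustrative only: the protein string is invented rather than taken from Table 1, and `peptide_windows` is a hypothetical helper name.

```python
def peptide_windows(protein, length):
    """Return every contiguous peptide of the given length, in order."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# Hypothetical 14-residue protein used only for illustration.
toy_protein = "MKVLAAGIVGLSRE"

octamers = peptide_windows(toy_protein, 8)
print(len(octamers), "candidate 8-residue peptides")
print(octamers[0], octamers[-1])
```

A protein of N residues yields N - L + 1 contiguous peptides of length L, so longer windows give fewer but more specific candidates.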


As used herein, the term “epitope” refers to a portion of a polypeptide having antigenicity or immunogenicity in an animal, preferably a mammal, and most preferably a human. In a preferable embodiment, the invention comprises a polypeptide comprising an epitope, and a polynucleotide encoding the polypeptide. As used herein, the term “immunogenic epitope” is defined as a portion of a protein inducing an antibody response in an animal, as determined by any method known in the art, such as those for producing an antibody described hereinbelow (see, for example, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983)). As used herein, the term “antigenic epitope” refers to a portion of a protein capable of binding to an antibody in an immunologically specific manner, as determined by any method well known in the art, such as an immunoassay as described herein. Immunologically specific binding excludes non-immunological binding, but does not necessarily exclude cross-reaction with different antigens. Antigenic epitopes are not necessarily immunogenic.


Fragments functioning as an epitope may be produced by any method conventionally known in the art (for example, see Houghten, Proc. Natl. Acad. Sci. USA 82:5131-5135 (1985); see also U.S. Pat. No. 4,631,211).


As used herein, an antigenic epitope usually comprises at least 3 amino acids, preferably at least 4 amino acids, at least 5 amino acids, at least 6 amino acids, at least 7 amino acids, more preferably at least 8 amino acids, at least 9 amino acids, at least 10 amino acids, at least 11 amino acids, at least 12 amino acids, at least 13 amino acids, at least 14 amino acids, at least 15 amino acids, at least 20 amino acids, at least 25 amino acids, at least 30 amino acids, at least 40 amino acids, at least 50 amino acids, and most preferably comprises a sequence of between about 15 amino acids and 30 amino acids. Preferable polypeptides comprising an immunogenic epitope or antigenic epitope are at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length. Further, non-exclusively preferable antigenic epitopes comprise the antigenic epitopes disclosed herein and portions thereof. Antigenic epitopes are useful for raising an antibody (including a monoclonal antibody) capable of specifically binding to an epitope. Preferable antigenic epitopes comprise the antigenic epitopes disclosed herein and any combination of 2, 3, 4, 5 or more of these antigenic epitopes. Antigenic epitopes may be used as a target molecule in an immunoassay (see, for example, Wilson et al., Cell 37:767-778 (1984); Sutcliffe et al., Science 219:660-666 (1983)).


Similarly, with respect to the use of an immunogenic epitope, for example, an antibody may be induced according to a method well known in the art (see, for example, Sutcliffe et al., (ibid.); Wilson et al., (ibid.); Chow et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle et al., J. Gen. Virol. 66:2347-2354 (1985)). Preferable immunogenic epitopes are the immunogenic epitopes disclosed herein, and any combination of two, three, four, five or more of these immunogenic epitopes. Polypeptides comprising one or more immunogenic epitopes may be presented to an animal system (for example, rabbit or mouse) together with a carrier protein (for example, albumin) to raise an antibody response, or, if the polypeptide is sufficiently long (at least about 25 amino acids), the polypeptide may be presented without a carrier. However, immunogenic epitopes as short as 8-10 amino acids have been shown to be sufficient for raising an antibody capable of binding to (at least) a linear epitope of a modified polypeptide (for example, by Western blotting).


Epitope-containing polypeptides of the present invention may be used for inducing an antibody according to a technology well known in the art. Such methods include, but are not limited to, in vivo immunization, in vitro immunization, and phage display. For example, see Sutcliffe et al., ibid.; Wilson et al., ibid.; and Bittle et al., J. Gen. Virol., 66:2347-2354 (1985). When using in vivo immunization, an animal may be immunized using a free peptide. However, anti-peptide antibody titer may be boosted by binding a peptide to a macromolecular carrier (for example, keyhole limpet hemocyanin (KLH) or tetanus toxoid). For example, a peptide comprising a cysteine residue may be bound to a carrier by the use of a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS). On the other hand, another peptide may be bound to a carrier by the use of a more general coupling agent such as glutaraldehyde. An animal such as a rabbit, rat, or mouse may be immunized by intraperitoneal injection and/or intradermal injection of, for example, an emulsion (containing about 100 μg of a peptide or carrier protein and Freund's adjuvant or any other adjuvant known to stimulate an immune response). Several booster injections, for example, at intervals of about two weeks, may be necessary to provide an effective titer of anti-peptide antibody. This titer may be detected by an ELISA assay using a free peptide adsorbed onto a solid surface. The titer of such anti-peptide antibodies in the serum derived from an immunized animal may be enhanced by selecting anti-peptide antibodies (for example, by adsorption of the peptide on a solid support and elution of the selected antibody according to a method well known in the art).


As can be understood by those skilled in the art, and as discussed hereinabove, the polypeptide of the present invention comprising an immunogenic or antigenic epitope, may be fused to another polypeptide. For example, the polypeptide of the present invention may be fused to a constant domain or a portion thereof (CH1, CH2, CH3 or any combination or fragment thereof), or albumin (including, but not limited to, for example, recombinant albumin (see, for example, U.S. Pat. No. 5,876,969 (issued Mar. 2, 1999), EP 0 413 622 and U.S. Pat. No. 5,766,883 (issued Jun. 16, 1998), which are herein incorporated as reference in their entireties) to result in a chimeric protein. Such a fusion protein may facilitate purification, and enhance half-life in vivo. This has been demonstrated for the first two domains of a human CD4-polypeptide, and a chimeric protein consisting of a variety of domains from heavy chain or light chain constant regions of an immunoglobulin of a mammal. For example, see EP 394,827; Traunecker et al., Nature, 331: 84-86 (1988). An enhanced delivery of an antigen into the immune system across the epidermal barrier, has been demonstrated for an antigen (for example, insulin) bound to an IgG or a FcRn binding partner such as Fc fragment (see, PCT publications WO 96/22024 and WO 99/04812). IgG fusion proteins having a dimeric structure due to disulfide bonding of the IgG portions have also been demonstrated to be more effective in binding and neutralizing of another molecule, than a monomer polypeptide or a fragment thereof alone. See Fountoulakis et al., J.Biochem., 270: 3958-3964 (1995). A nucleic acid encoding the epitope may be recombined as a gene of interest as an epitope tag (for example, hemagglutinin “HA” or flag tag) to assist detection and purification of the expressed polypeptide. 
For example, a system described by Janknecht et al. allows simple purification of a non-denatured fusion protein expressed in a human cell line (see Janknecht et al., 1991, Proc. Natl. Acad. Sci. USA 88: 8972-897). In this system, a gene of interest may be subcloned into a vaccinia recombination plasmid such that, upon translation, the open reading frame of the gene is fused to an amino terminal tag consisting of six histidine residues. This tag functions as a matrix-binding domain for the fusion protein. An extract from a cell infected with the recombinant vaccinia virus may be loaded onto a Ni2+ nitrilotriacetic acid-agarose column, and the histidine tagged protein may be selectively eluted using an imidazole containing buffer.


An “isolated” nucleic acid molecule is separated from the other nucleic acid molecules present in the natural source of the subject nucleic acid molecule. Examples of such isolated nucleic acid molecules include, but are not limited to, recombinant DNA molecules contained in a vector, recombinant DNA molecules maintained in a heterologous host cell, partially or substantially purified nucleic acid molecules, and synthetic DNA or RNA molecules. Preferably, an “isolated” nucleic acid is free of the sequences that naturally flank the subject nucleic acid in the genomic DNA of the organism from which the subject nucleic acid is derived (i.e., sequences located at the 5′ and 3′ termini of the subject nucleic acid). For example, in a variety of embodiments, isolated novel nucleic acid molecules may include a nucleotide sequence of less than about 50 kb, 25 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb. Further, “isolated” nucleic acid molecules, for example, cDNA molecules, may be substantially free of other cellular materials or culture medium when recombinantly produced, or of chemical precursors or other chemical substances when chemically synthesized.


In one aspect, the present invention provides a nucleic acid molecule comprising a sequence encoding an amino acid sequence having at least one amino acid sequence selected from the group consisting of Gene ID No. 1-2151 of Table 1 (at least one sequence selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157); or a sequence having 70 % homology thereto.


In another aspect, the present invention provides a polypeptide having at least one amino acid sequence selected from the group consisting of Gene ID No. 1-2151 of Table 1 (comprising at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157), or a sequence having at least 70% homology thereto.


In another aspect, the present invention provides an epitope or a variant thereof, having at least one amino acid sequence selected from the group consisting of Gene ID No. 1-2151 of Table 1 (at least one amino acid sequence consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157), or a sequence having at least 70 % homology thereto, or a portion thereof.
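The 70% threshold recited in the aspects above can be illustrated with a small calculation. The following is only a minimal sketch: it assumes that "homology" is approximated by simple percent identity over two sequences that have already been aligned to equal length, which is a simplification chosen for illustration and not necessarily the measure intended in this specification. The function names and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns carrying the same residue.

    Both sequences must already be aligned to equal length; '-' marks a gap.
    Columns in which both sequences have a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 70.0) -> bool:
    """True when the percent identity is at least the given threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at two positions.
    print(percent_identity("MKVLAAGIVG", "MKVLSAGIVA"))
    print(meets_threshold("MKVLAAGIVG", "MKVLSAGIVA"))
```

In practice an alignment program would first produce the aligned sequences; the sketch covers only the final counting step.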


In another aspect, the present invention provides a method for screening for a thermostable protein. The present method comprises A) providing the entire sequence of the genome of a thermoresistant living organism; B) selecting at least one arbitrary region of the sequence; C) providing a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the thermostable protein; D) transforming the thermoresistant living organism with the vector; E) placing the thermoresistant living organism under conditions allowing homologous recombination to occur; F) selecting the thermoresistant living organism in which homologous recombination has occurred; and G) performing an assay to identify the thermostable protein. As used herein, the entire sequence of the genome need not necessarily be a complete sequence, but is preferably a complete sequence. Two or more regions may be selected as the selected region. The region may be of any length, as long as homologous recombination occurs, and includes, for example, at least about 500 bases, at least about 600 bases, at least about 700 bases, at least about 800 bases, at least about 900 bases, at least about 1000 bases, at least about 2000 bases, and the like. The candidate for the above thermostable protein may be any protein of the present invention, as long as its expression is expected. Any vector may be used, as long as it can express the protein of interest.
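Step B) and the region lengths mentioned above can be sketched in code. This is an illustrative sketch only, under stated assumptions: the genome is handled as a plain linear string with 1-based coordinates (circularity is ignored), the smallest length listed in the text (about 500 bases) is used as a hard lower bound, and the function names `select_region` and `flanking_regions` are hypothetical.

```python
MIN_REGION_LENGTH = 500  # smallest length listed in the text ("at least about 500 bases")


def select_region(genome: str, start: int, length: int) -> str:
    """Return the region of `length` bases beginning at 1-based position `start`."""
    if length < MIN_REGION_LENGTH:
        raise ValueError(f"region must be at least {MIN_REGION_LENGTH} bases long")
    if start < 1 or start + length - 1 > len(genome):
        raise ValueError("region falls outside the supplied sequence")
    return genome[start - 1 : start - 1 + length]


def flanking_regions(genome: str, site: int, length: int = 1000) -> tuple:
    """Return two regions on either side of a chosen site.

    `site` is the 1-based position of the first base of the downstream region,
    so the upstream region ends at position `site - 1`.
    """
    upstream = select_region(genome, site - length, length)
    downstream = select_region(genome, site, length)
    return upstream, downstream


if __name__ == "__main__":
    toy_genome = "ACGT" * 1000  # 4,000-base toy sequence, not real genome data
    up, down = flanking_regions(toy_genome, site=2001, length=1000)
    print(len(up), len(down))
```

Selecting two regions, as the text permits, gives a pair of sequences that could be placed on either side of the gene carried by the vector; that arrangement is an assumption of this sketch.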


Vectors may preferably comprise gene regulation elements such as a promoter. Transformation may be performed under any conditions, as long as they are appropriate therefor.


Conditions causing homologous recombination may be any condition, as long as homologous recombination occurs under such conditions. Usually, the following condition may be used:

  • Culture Tk-pyrF deleted strains No. 25 and No. 27 in 20 ml of ASW-YT liquid medium.
  • Collect the bacteria from 3 ml of the culture medium per sample (No. 25 and No. 27, five samples for each).
  • Suspend the cells in 200 μl of 0.8×ASW+80 mM CaCl2, and let stand on ice for 30 minutes.
  • Mix in 3 μg of pUC118/DS and 3 μg of pUC118/DD and let stand on ice for 1 hour (two samples for each; samples to which an equivalent volume of TE buffer was added were used as controls).
  • Heat shock at 85° C. for 45 seconds.
  • Let stand on ice for 10 minutes.
  • Preculture in Ura-ASW-AA liquid medium (proliferation occurs based on the incorporated uracil).
  • Culture in Ura-ASW-AA liquid medium (enrichment for the PyrF+ strain).
  • Culture on Ura-ASW-AA solid medium.


The present invention is not limited to the above conditions. As used herein, the composition of ASW (artificial sea water) is as follows: 1×Artificial sea water (ASW) (/L): NaCl 20 g; MgCl2.6H2O 3 g; MgSO4.7H2O 6 g; (NH4)2SO4 1 g; NaHCO3 0.2 g; CaCl2.2H2O 0.3 g; KCl 0.5 g; NaBr 0.05 g; SrCl2.6H2O 0.02 g; and Fe(NH4) citrate 0.01 g.
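Because the protocol above uses ASW at a strength other than 1× (0.8×ASW), it can be convenient to scale the recipe. The following is a minimal sketch that multiplies the per-litre amounts given above by a strength factor and a volume. The function name `asw_amounts` and the dictionary keys are hypothetical labels for the listed components, and the strontium and iron entries follow this sketch's reading of the recipe (strontium chloride hexahydrate and iron ammonium citrate).

```python
# Grams per litre for 1x ASW, taken from the composition given in the text.
ASW_1X_G_PER_L = {
    "NaCl": 20.0,
    "MgCl2.6H2O": 3.0,
    "MgSO4.7H2O": 6.0,
    "(NH4)2SO4": 1.0,
    "NaHCO3": 0.2,
    "CaCl2.2H2O": 0.3,
    "KCl": 0.5,
    "NaBr": 0.05,
    "SrCl2.6H2O": 0.02,       # strontium chloride hexahydrate (assumed reading)
    "Fe(NH4) citrate": 0.01,  # iron ammonium citrate entry of the recipe
}


def asw_amounts(strength: float = 1.0, volume_l: float = 1.0) -> dict:
    """Grams of each component for `volume_l` litres of `strength`x ASW."""
    if strength <= 0 or volume_l <= 0:
        raise ValueError("strength and volume must be positive")
    return {name: grams * strength * volume_l for name, grams in ASW_1X_G_PER_L.items()}


if __name__ == "__main__":
    # Amounts for 1 litre of 0.8x ASW, the strength used in the suspension step.
    for name, grams in asw_amounts(strength=0.8, volume_l=1.0).items():
        print(f"{name}: {grams:.3f} g")
```

The sketch covers only the ASW salts; the 80 mM CaCl2 added in the suspension step is a separate addition and is not computed here.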


Selection of an organism in which homologous recombination has occurred may be performed by detecting a marker specific for the organism in which homologous recombination has occurred. Accordingly, it is preferable that the above-mentioned vector carry a marker that is expressed in the organism upon occurrence of homologous recombination.


Identification of a thermostable protein may be performed by determining that the protein of interest exhibits activity under the same conditions under which the protein usually attains the activity, except that the temperature alone is changed to about 50° C., preferably to about 60° C., more preferably to about 70° C., still more preferably to about 80° C., most preferably to about 90° C.
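The graded temperature criteria above can be expressed as a simple lookup. The following sketch is illustrative only: it assumes that activity has already been measured as a yes/no result at each tested temperature, it treats each "about" value as an exact lower bound, and the function name, the tier labels and the example measurements are hypothetical.

```python
from typing import Mapping, Optional

# Temperature thresholds (degrees C) paired with the wording used in the text,
# ordered from the strictest criterion to the least strict.
TIERS = [
    (90.0, "most preferably"),
    (80.0, "still more preferably"),
    (70.0, "more preferably"),
    (60.0, "preferably"),
    (50.0, "baseline"),
]


def thermostability_tier(activity_by_temp: Mapping[float, bool]) -> Optional[str]:
    """Return the strictest tier met by the highest temperature showing activity.

    `activity_by_temp` maps an assay temperature (degrees C) to whether activity
    was observed. Returns None when no activity is seen at 50 C or above.
    """
    active = [temp for temp, observed in activity_by_temp.items() if observed]
    if not active:
        return None
    highest = max(active)
    for threshold, label in TIERS:
        if highest >= threshold:
            return f"{label} (about {threshold:.0f} C)"
    return None


if __name__ == "__main__":
    # Hypothetical assay results: active at 37 C and 70 C, inactive at 90 C.
    print(thermostability_tier({37: True, 70: True, 90: False}))
```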


In another aspect, the present invention provides a kit for screening for a thermoresistant protein. The kit comprises A) a thermoresistant living organism; and B) a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the thermoresistant protein.


In a preferable embodiment, the thermoresistant living organism is a hyperthermophilic archaebacterium, and more preferably, Thermococcus kodakaraensis KOD1.


In a preferable embodiment, the kit of the present invention further comprises C) an assay system for identifying the thermoresistant protein. The assay system may vary depending on the activity of the thermostable protein of interest.


(Description of each Gene)


Hereinafter, each gene comprised in the genomic sequence of the Thermococcus kodakaraensis KOD1 strain, as identified in the present invention, is described.


(Overview of the Genome of Hyperthermophilic Bacteria)


Chromosomal DNA of hyperthermophilic bacteria is stable. Because double stranded DNA is held together by hydrogen bonds, the question arises whether it dissociates into single strands at high temperatures. The KOD1 strain has two types of basic histone-like proteins, which bind to the negatively charged DNA to form a compact nucleosome-like complex and thereby stabilize it. In the present invention, polyamines may be used to further enhance stabilization by binding to this complex. Acetylated polyamine (acetyl polyamine) binds only weakly to the nucleosome-like complex, whereas the polyamine obtained by the action of a deacetylating enzyme binds more firmly. Generally, hyperthermophilic bacteria have a much higher intracellular K+ ion concentration than bacteria that grow at ordinary temperatures, and this should contribute to the stabilization of double-stranded DNA. Indeed, this property is clearly demonstrated when the melting curve of such DNA is observed.


(Universality of Thermophilic Property)


The present inventors have found universal properties in proteins from hyperthermophilic bacteria through studies of glutamate dehydrogenase (GDH) of the KOD-1 strain. That is, it has been demonstrated that, whereas proteins from bacteria that grow at ordinary temperatures are generally denatured by heat, recombinant proteins from hyperthermophilic bacteria mature once heat is applied. GDH synthesized in the KOD-1 strain under high temperature conditions has a hexameric structure and high specific activity. On the other hand, when the GDH gene is expressed in E. coli as a host, the resulting GDH has weaker enzymatic activity than the natural form and is a monomeric protein having a different structure. It was demonstrated that, after heat treatment at 70° C. for twenty minutes, the recombinant GDH acquired a specific activity and three-dimensional structure similar to those of the natural GDH. Once heat-treated, the enzyme behaved similarly to the natural GDH even in the lower temperature range. Such features were observed not only for GDH but also for all the enzymes from hyperthermophilic bacteria analyzed by the present inventors. As such, heat is important for maturation of thermostable proteins, and it was determined that this is due to an irreversible structural change of the enzymatic proteins caused by heat.


(Discovery of Enzymes having New Structures and Functions)


Ribulose 1,5-bisphosphate carboxylase (Rubisco) is present in all plants, algae, and cyanobacteria, and plays an important role in fixing carbon dioxide into organic material. Rubisco is the most abundant enzyme on earth, and is expected to contribute greatly to solving global warming (the greenhouse effect) and food problems. Archaebacteria, which are close to primordial living organisms, had been believed not to possess a Rubisco; however, the present inventors have discovered a Rubisco having high carbon dioxide fixation ability in the KOD-1 strain. The present enzyme (Tk-Rubisco) has twenty times greater activity than conventional Rubisco, and its specificity for carbon dioxide is extremely high. Tk-Rubisco is also structurally novel, having a pentagonal decamer structure. At present, the physiological role of the present enzyme is being analyzed and its introduction into plants and the like is being studied.


(Analysis of Thermostable Mechanism of Proteins from Hyperthermophilic Bacteria based on Three-Dimensional Structure)


The high thermostability exhibited by proteins derived from hyperthermophilic bacteria is of interest not only to the basic field of protein science but also to a variety of applied fields using the enzymes. The present inventors have determined the three dimensional structures of a number of enzymes derived from the KOD-1 strain, and have also clarified a number of thermostabilizing mechanisms. A typical example is O6-methylguanine-DNA methyltransferase (Tk-MGMT). Comparison of the three dimensional structures of Tk-MGMT and its counterpart derived from E. coli (AdaC) demonstrated that Tk-MGMT has a number of intrahelical ionic bonds stabilizing its alpha-helices. Further, there were also a number of interhelical ionic bonds stabilizing the global protein structure. It was shown that AdaC derived from E. coli has fewer such ionic bonds, and thus that enzymes derived from hyperthermophilic bacteria attain high thermostability through numerous ionic bonds and ionic bond networks. This is also true of the above-mentioned GDH, and has also been demonstrated biochemically. That is, when site-directed mutations disrupting the ionic bond networks present inside GDH were introduced, the thermostability of the variant enzyme was greatly reduced. On the other hand, a variant enzyme with increased ionic bonds showed enhanced thermostability.


(Use of Useful Enzymes)


The polymerase chain reaction (PCR) method is an essential technology for genetic engineering, and its applications range from medicine and environmental fields to the food industry and the like. The improvements presently required for PCR methods are the shortening of amplification time, the prevention of misamplification, and the amplification of long DNA fragments. In particular, clinical or food tests require DNA polymerases that synthesize DNA rapidly and accurately. As a result of our functional analysis of the DNA polymerase (KOD DNA polymerase) from the KOD-1 strain, we found that the present enzyme has an improved ability to synthesize longer DNA, and a higher speed of DNA synthesis, in comparison with conventional enzymes. In fact, when the DNA polymerase from the KOD-1 strain is used, the reaction time for PCR is only 25 minutes, while the conventional Taq enzyme takes two hours. Further, a form of the KOD DNA polymerase modified with respect to its 3′→5′ exonuclease activity and the wild type enzyme can be mixed in an appropriate ratio to yield significantly superior reaction efficiency and amplification properties. Further, the present inventors have succeeded in using an antibody to the KOD DNA polymerase to suppress the misamplification which is often seen in the initial period of PCR reactions, and thus could establish an extremely efficient DNA amplification system. The present system is now commercially available from TOYOBO as “KOD-Plus-” in Japan, and is available elsewhere, including Europe and America, through Life Technologies/GIBCO BRL as “Platinum™ Pfx DNA polymerase”. Recently, the present inventors have further analyzed the KOD DNA polymerase to determine its three dimensional structure. The detailed three dimensional structure allowed analysis of how the structure relates to the speed of the elongation reaction of the present enzyme, the accuracy of its replication capability, and the like.


The present inventors have identified and analyzed a number of useful thermophilic enzymes other than DNA polymerases. DNA ligases catalyze the joining of the termini of two DNA fragments, and thus are essential enzymes for genetic engineering. Most conventional enzymes from bacteria and phages are sensitive to heat and unstable. However, the DNA ligase from the KOD-1 strain (Tk-Lig) exhibited high DNA ligase activity from 30 to 100° C. Further, the substrate specificity of Tk-Lig at the nick site (base-pairing) was interesting: it turned out that accurate base-pairing was necessary at the 3′ terminus, while the substrate specificity was loose at the 5′ terminus. No DNA ligases having such features have been reported to date, and Tk-Lig is expected to be applicable to the detection of single nucleotide polymorphisms (SNPs). Sugar-related enzymes whose biochemical properties have been identified include alpha-amylase, which digests the alpha(1-4) bonds that appear in starch and the like; cyclodextrin glucanotransferase, which catalyzes cyclization to synthesize cyclodextrin; and 4-alpha-glucanotransferase, which catalyzes a transfer reaction. Beta-glucosidase and chitinase, which digest the beta(1-4) bonds that appear in cellulose and chitin, were also analyzed in detail. In the chitinase from the KOD-1 strain, two chitinase activities are present on the same polypeptide chain; one is responsible for endochitinase activity, while the other is responsible for exochitinase activity. These catalytic domains attain extremely high chitin degrading activity by synergy.


(Genomic Analysis of Thermococcus Kodakaraensis KOD-1 Strain and Development of Gene Introduction Technology)


Through the present studies, the present inventors have analyzed substantially all the genes of the KOD-1 strain, and revealed detailed biochemical properties of a huge variety of proteins. The KOD-1 strain is a simple organism, located near the bottom of the evolutionary tree of organisms, and thus is believed to be a good tool for understanding the basic mechanisms of life. Further, the KOD-1 strain produces a number of thermostable enzymes with broad applicability and novel enzymes with novel features, as described above. Against this background, the present inventors have proceeded with analysis of the entire genome of the KOD-1 strain. The genome of the KOD-1 strain consists of 2,076,138 base pairs, and is very short, as expected (40% or less of that of E. coli). Further, there were about 1,500 genes. As the KOD-1 strain maintains its life with such a low number of genes, research on the present bacterium is expected to allow analysis of the basic principles of life.


The most important object of research in the post-genomic era is to analyze the physiological role of unknown genes. Exhaustive gene expression analysis by DNA chips and exhaustive protein analysis by proteomics are effective analysis methods for these purposes. The present inventors have proceeded using these methods, and have recently succeeded in constructing a novel system, which is an important new technology for specifically disrupting any gene of interest on the genome of the KOD-1 strain. This technology is used to disrupt a functionally-known gene to allow analysis and clarification of the physiological role thereof.


Genes comprised in the genome of KOD1 encompass a variety of types as listed in Table 2 below. Such genes are described in biochemistry references well known in the art, such as Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (2001); Ausubel, F. et al., Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, NJ, USA (1999); Ausubel, F. et al., Current Protocols in Molecular Biology, John Wiley & Sons, NJ, USA (1988); Jiro Ota ed., Biochemistry Handbook, Asakura Shoten (1987); Kazutomo Imabori, Tamio Yamakawa ed., Seikagaku Jiten (Dictionary of BIOCHEMISTRY), Third Edition, Tokyo Kagaku Dojin (1998); Yasudomi NISHIDZUKA ed., Saibokino to Taisha mappu (Cellular Functions and Metabolism map), Tokyo Kagaku Dojin (1997); Lewin, Genes VII, Oxford University Press, Oxford, UK (2000) and the like. Further, methods for measuring the function of such a protein are described in, for example, Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (2001); Frank T. et al., Thermophiles (Archaea: A Laboratory Manual 3), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (1995); KOSOGAKU HANDOBUKKU (Enzyme handbook), edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura Shoten (1982); Methods in Enzymology series, Academic Press; Kazutomo Imabori, Tamio Yamakawa ed., Seikagaku Jiten (Dictionary of BIOCHEMISTRY), Third Edition, Tokyo Kagaku Dojin (1998); Yasudomi NISHIDZUKA ed., Saibokino to Taisha mappu (Cellular Functions and Metabolism map), Tokyo Kagaku Dojin (1997); Lengeler, J. et al., Biology of the Prokaryotes, Blackwell Science, Oxford, UK (1998); Lewin, Genes VII, Oxford University Press, Oxford, UK (2000) and the like.


As such, the functions of the genes comprised in the genome of KOD1 are revealed by the present invention, and are summarized in the following Table. Table 2 describes the genes by region; for example, the gene defined by region (1) in Table 2 is hereinafter referred to as Gene ID No. (1), and so on, and the amino acid sequence of each gene is the sequence of the corresponding SEQ ID NO: given in the table.
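Judging from the first rows of Table 2, the sense strand and antisense strand positions of a region appear to be mirror images of each other (sense position 1 pairs with antisense position 2089377, and sense position 5016 with 2084362), and the reading frame label appears to follow from the position at which the gene starts on its own strand. The following sketch encodes that reading. It is an inference from the table entries and not a rule stated in the text; the constant `TOTAL_LENGTH` and the function names are assumptions of this sketch.

```python
# Total numbering length inferred from row 1 of Table 2 (sense 1 <-> antisense 2089377).
# This value is an assumption read off the table, not a figure stated in the text.
TOTAL_LENGTH = 2_089_377


def to_antisense(sense_pos: int) -> int:
    """Convert a 1-based sense strand position to its antisense strand position."""
    if not 1 <= sense_pos <= TOTAL_LENGTH:
        raise ValueError("position outside the numbered sequence")
    return TOTAL_LENGTH + 1 - sense_pos


def reading_frame(sense_start: int, sense_stop: int, strand: str) -> str:
    """Return a label such as 'f-2' or 'r-1' for a gene on the given strand.

    For a forward ('f') gene the frame is taken from the sense start position;
    for a reverse ('r') gene it is taken from the antisense position of the
    sense stop, which is where the gene begins on the antisense strand.
    """
    if strand == "f":
        first_base = sense_start
    elif strand == "r":
        first_base = to_antisense(sense_stop)
    else:
        raise ValueError("strand must be 'f' or 'r'")
    return f"{strand}-{(first_base - 1) % 3 + 1}"


if __name__ == "__main__":
    print(to_antisense(5016))             # antisense position paired with sense 5016
    print(reading_frame(6079, 6543, "r"))  # region listed as Gene ID No. 3
```

The coordinates used in the example are those of Gene ID Nos. 1 and 3 as read from the table.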

TABLE 2DECRIPTION OF GENES COMPRISED IN THE GENOME OF Thermococcus kodakaraensis KOD1startstopnucleicnucleicacidacidnumbernumberhavinghavingNucleicNucleicNucleicNucleichighhighacid No.acid No.acid No.acid No.correspondinghomologyhomologyGENE(sense(sense(antisense(antisenseSEQtotoaminoIDstrand,strand,strand,strand,IDreadingknownknownacidgeneNO:start)stop)stop)start)NO:framegenesgeneslengthnomenclatureclassificationDescription115016208937720843622f-111216702PolBLDNA polymerase elongationsubunit (family B) (homingendonuclease)251345733208424420836453f-151345707165RPredicted metal-dependenthydrolase360796543208329920828351468r-16424654133CarBEFCarbamoylphosphate synthaselarge subunit (split gene in MJ)COG0458 CarB465867014208279220823644f-165867012262RPredicted CoA-binding protein571527391208222620819871837r-27170733830RPredicted ATPase or kinase673997614208197920817641467r-17399754929RpoZKDNA-directed RNA polymerasesubunit K/omega776558755208172320806232157r-376588726470LPredicted DNA modificationmethylase888431009320805352079285343f-29011957234GPredicted N-acetylglucosaminyltransferase9100951037920792832078999724f-3101041029930PutACNAD-dependent aldehydedehydrogenases10103761080720790022078571344f-21038510787161SUncharacterized ACR111080811416207857020779622156r-31085911414277RGTPases12114061172620779722077652725f-3114451164630UgpQCGlycerophosphoryl diesterphosphodiesterase13117231228620776552077092345f-21175912275150RPredicted hydrolases of HDsuperfamily14123381341120770402075967346f-21240413391550ModAPABC-type molybdate transportsystem151339213841207598620755371836r-21342513833146RPredicted nucleic acid-bindingprotein161380814056207557020753222155r-3138411400057AbrBKRegulators ofstationary/sporulation geneexpression17141531489620752252074482347f-21415914885379CysUPABC-type sulfate/molybdatetransport systems18152391596420741392073414348f-21537115962266RPredicted ATPases19161511669920732272072679349f-2165051664929RPredicted ATPases of 
PP-loopsuperfamily201669617697207268220716815f-11670817686448CysAPABC-type sulfate/molybdatetransport systems211778018793207159820705852154r-3178791843740HflCOMembrane protease subunits221878619280207059220700981835r-2187921925129NqrACNa+-transportingNADH:ubiquinoneoxidoreductase alpha subunit231929020183207008820691951834r-2192931940732LArchaea-specific RecJ-likeexonuclease242018321187206919520681912153r-3206452088540PnpJPolyribonucleotidenucleotidyltransferase(polynucleotide phosphorylase)252126621919206811220674592152r-32126921908223GphRPredicted phosphatases262191322569206746520668091466r-12193122552320SUncharacterized ACR272259724195206678120651831465r-12292124193691SAM1HS-adenosylhomocysteinehydrolase282394724834206543120645446f-12395324808141GloBRZn-dependent hydrolases29248132545120645652063927726f-32487925446218RUncharacterized ACR302541325811206396520635671833r-22547625770159RPR2JRNAse P protein subunit RPR2312581327396206356520619821464r-12593027364295MCM2LPredicted ATPase involved inreplication control322756528620206181320607587f-1275682801242SbcCLATPase involved in DNA repair332859129334206078720600441463r-1287772911633UshAF5′-nucleotidase/2′342978230681205959620586978f-12979130655227SUncharacterized proteins ofWD40-like repeat family353110231266205827620581129f-1311023126494SUncharacterized ArCR3631414322352057964205714310f-13141432182270SmtAQRSAM-dependentmethyltransferases COG0500SmtA37323673325120570112056127727f-33238233087202FlaBNArchaeal flagellins(flagellin)38332913503320560872054345728f-33330933636125FlaBNArchaeal flagellins(flagellin)39350483582420543302053554350f-23504835804206FlaBNArchaeal flagellins(flagellin)40358823654120534962052837351f-23588836533262FlaBNArchaeal flagellins(flagellin)4136553373802052825205199811f-13655337378290FlaBNArchaeal flagellins(flagellin)42373943787020519842051508352f-23754137868181FlaCNPutative archaeal flagellar protein C43378743929820515042050080353f-23887039296258FlaDNPutative archaeal flagellar 
proteinD/E4439760403322049618204904612f-13986240318194FlaGNPutative archaeal flagellar protein G4540360410702049018204830813f-14037241068385FlaHNPredicted ATPases involved inbiogenesis of archaeal flagella46410724269420483062046684354f-24107242692905VirB11NPredicted ATPases involved inpili and flagella biosynthesis47426964444420466822044934729f-34269644436656FlaJNUncharacterized membranecomponent of archaeal flagella48444414643520449372042943355f-2458694607336RPredicted helicases49464704699120429082042387730f-34649746986294PcmOProtein-L-isoaspartatecarboxylmethyltransferase50471714741620422072041962356f-2471714732160SerBEPhosphoserine phosphatase5147317477992042061204157914f-14732047794143SerBEPhosphoserine phosphatase524793749139204144120402391832r-24794349128224PppANSignal peptidase534915349329204022520400491462r-15449393497312039985203964715f-1495284966928SPS1TSerine/threonine protein kinases55497285029720396502039081731f-34972850292246SUncharacterized ACR565027850559203910020388191461r-1502905046129RSTAS domain protein57506935141220386852037966357f-25070551410276RPredicted hydrolases of the HADsuperfamily585148352061203789520373171831r-25149252056219PgsAIPhosphatidylglycerophosphatesynthase595206352605203731520367731460r-15206952603276SUncharacterized ArCR605260253792203677620355861830r-2535235371532DnaXLDNA polymerase III6154169550202035209203435816f-15425055018407SUncharacterized ACR62550585560620343202033772358f-2553225549944RPredicted nucleotidyltransferases63557465601820336322033360732f-3557495601043SUncharacterized ACR64561325626320332462033115359f-265562445670820331342032670733f-3562445666199RPredicted nucleic acid-bindingprotein6656674572672032704203211117f-15671057265320NadRHNicotinamide mononucleotideadenylyltransferase675726457584203211420317941829r-2574085752828AlsTENa+/alanine symporter685759958276203177920311022151r-3577225815736RPredicted helicases6958855597032030523202967518f-15886759701481RPredicted 
methyltransferase705970459868202967420295101459r-1597255985127FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG715989861799202948020275791828r-25991061719390CAldehyde:ferredoxinoxidoreductase7262830637232026548202565519f-1629416337640XerCLIntegrase73642266599220251522023386360f-2646976498535XynBGBeta-xylosidase74660456738220233332021996734f-3663306674134FliDNFlagellar capping protein7567399689732021979202040520f-16808068833173AprEOSubtilisin-like serine proteases76691176937420202612020004735f-3692406932732RPredicted membrane protein7769583697952019795201958321f-178697927051120195862018867736f-3699037029636FtsWDBacterial cell division membraneprotein7970504711122018874201826622f-1708857097232QPhytoene dehydrogenase andrelated proteins80711177124520182612018133361f-2711237123729GcvPEGlycine cleavage system proteinP (pyridoxal-binding)81716797259320176992016785737f-3719227217438IleSJIsoleucyl-tRNA synthetase82727647333920166142016039362f-2730497323534KPredicted transcriptional regulator8373336746432016042201473523f-1740057411035GloBRZn-dependent hydrolases84746037576020147752013618363f-285757537602520136252013353738f-3757867597228FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG86760227745820133562011920364f-2762117647534SUncharacterized BCR87777357904520116432010333365f-2778047800534UshAF5′-nucleotidase/2′887962279726200975620096522150r-389799688012920094102009249739f-3799688005831AbrBKRegulators ofstationary/sporulation geneexpression90802468042820091322008950366f-2803188040229CaiCIQAcyl-CoA synthetases(AMP-forming)/AMP-acidligases II COG0318 CaiC91804328317620089462006202367f-28110183075233MCM2LPredicted ATPase involved inreplication control9283431836282005947200575024f-1834408360233GlpCCFe—S oxidoreductases9383908842672005470200511125f-1839478410928ESerine proteases of the peptidasefamily S9A94842648444020051142004938740f-3843038442026DnaJOMolecular 
chaperones (containC-terminal Zn finger domain)95844618501820049172004360368f-284530847312996849998534020043792004038741f-3850028517628RNa+-dependent transporters of theSNF family97854218594820039572003430369f-28544885847100XerCLIntegrase988633387139200304520022392149r-38634587128428DPH5JDiphthamide biosynthesismethyltransferase DPH59987211876632002167200171526f-18722687619221TroRKMn-dependent transcriptionalregulator100876638826520017152001113742f-3879128822439NorMQNa+-driven multidrug effluxpump101882668927920011122000099743f-3883958885132PolCLDNA polymerase III alphasubunit102893079005920000711999319744f-38931990003286RPredicted hydrolases of the HADsuperfamily10390079902671999299199911127f-19008890265131JPredicted Zn-ribbonRNA-binding protein with afunction in translation104902769056019991021998818745f-39028590558167EFB1JTranslation elongation factorEF-1beta1059058391056199879519983221458r-1908119097632WecDKRHistone acetyltransferase HPA2and related acetyltransferasesCOG0454 WecD106911789136619982001998012370f-2912689135528AroCEChorismate synthase10791363929791998015199639928f-19136392974892PutPEHRNa+/proline108930729455019963061994828746f-3930729453917HcaDRUncharacterizedNAD(FAD)-dependentdehydrogenases10994552957121994826199366629f-19456795710635DadAEGlycine/D-amino acid oxidases(deaminating)110961859763619931931991742371f-29618597601702HcaDRUncharacterizedNAD(FAD)-dependentdehydrogenases111976209814719917581991231747f-39762998127287HycBCFe—S-cluster-containinghydrogenase components 2112984179958319909611989795372f-29847499581464DadAEGlycine/D-amino acid oxidases(deaminating)1139964810089219897301988486748f-399654100881398114100915101205198846319881731457r-110097510109830SUncharacterized ACR115101224101733198815419876451456r-1101239101695212WecDKRHistone acetyltransferase HPA2and related acetyltransferasesCOG0454 WecD11610179610234719875821987031749f-3101805102315206KPredicted transcription 
factor11710239310256319869851986815750f-3118102986103432198639219859462148r-3103016103364182SUncharacterized ArCR11910347610431819859021985060751f-3103539104313429SppANOPeriplasmic serine proteases(ClpP class) COG0616 SppA1201043981061011984980198327730f-1104398106099723SUncharacterized ACR1211062101067791983168198259931f-1106210106759316SPT15KTranscription initiation factorTFIID (TATA-binding protein)1221068341074541982544198192432f-110689410710430RAD55TRecA-superfamily ATPasesimplicated in signal transduction12310763710845519817411980923752f-3107640108435354AcuCTQDeacetylases124108482109099198089619802792147r-3108491109097374PorGCPyruvate:ferredoxinoxidoreductase and related2-oxoacid:ferredoxinoxidoreductases(Indole-pyruvate ferredoxinoxidoreductase)125109092111035198028619783431827r-2109092110067452PorACPyruvate:ferredoxinoxidoreductase and related2-oxoacid:ferredoxinoxidoreductases (Indole-pyruvateferredoxin oxidoreductase)126111643113019197773519763591455r-1111652113017732CAcyl-CoA synthetase (NDPforming)12711320511456319761731974815753f-3113205114555724RPredicted ATPase of the AAAsuperfamily12811466811535119747101974027373f-2114677115346390RPredicted Zn-dependenthydrolases of the beta-lactamasefold12911539711640119739811972977374f-2115490116378284LytBMPutative cell wall-binding domain130116482116634197289619727441454r-111652411659627RPredicted nucleic-acid-bindingprotein containing a Zn-ribbon131116676117494197270219718841826r-211670011705434RecNLATPases involved in DNA repair132117475118242197190319711361453r-111755611783534SPredicted membrane protein133118178118711197120019706672146r-311823511837930PitAPPhosphate/sulphate permeases134119061119939197031719694391825r-2119100119931416SpeEESpermidine synthase13511997312048519694051968893754f-312015612042035RHydrolases of the alpha/betasuperfamily136120479120952196889919684262145r-3120479120947269SUncharacterized 
ACR137121121121192196825719681862144r-313812140412185619679741967522755f-3121443121854245GcvHEGlycine cleavage system Hprotein (lipoate-binding)13912200712243819673711966940756f-312200712225690PspCKTPutative stress-responsivetranscriptional regulatorCOG1983 PspC1401224311226671966947196671133f-11411226681235941966710196578434f-1122680123508313CitGHTriphosphoribosyl-dephospho-CoAsynthetase142123578123868196580019655102143r-312359912371029LArchaea-specific RecJ-likeexonuclease143123932126157196544619632212142r-31239321261461300LArchaea-specific RecJ-likeexonuclease14412630612856119630721960817757f-3126333128553448TarNMethyl-accepting chemotaxisprotein145128631130013196074719593651824r-2128640130011628RPermeases146130150131154195922819582241452r-1130150131110392MalKGABC-typesugar/spermidine/putrescine/iron/thiaminetransport systems147131148133049195823019563291823r-2131409133029584ThiPHABC-type thiamine transportsystem1481327451338901956633195548835f-1132856133831394DmpAEQL-aminopeptidase/D-esteraseCOG3191 DmpA149133885134547195549319548311451r-1133900134527182CcmAQABC-type multidrug transportsystem150134544134834195483419545441822r-213458913476330FhaBMPutative hemagglutinin/hemolysin151134978135754195440019536242141r-313502013521533RPermeases of the major facilitatorsuperfamily152137477138172195190119512062140r-313782813800532SUncharacterized BCR153138521138676195085719507022139r-313859013867128MapJMethionine aminopeptidase15413936514097219500131948406758f-3139365140970914CFe—S oxidoreductases family 215514107814131119483001948067759f-314108714129446LrpKTranscriptional regulators15614133514185619480431947522375f-2141335141797147KPredicted transcriptionalregulators157141853142707194752519466711450r-1141862142702474NfoLEndonuclease IV158142732143793194664619455851449r-114290314360240SbcCLATPase involved in DNA repair159143756144931194562219444472138r-3143765144896451SPredicted membrane protein160144924145235194445419441431821r-2144936145224134SUncharacterized 
ACR16114533414595119440441943427376f-2145334145949383SUncharacterized ACR162146007146603194337119427751820r-2146016146553261SUncharacterized ACR163147207149273194217119401051819r-2147309149253934LSuperfamily I DNA and RNAhelicases and helicase subunits164149293149697194008519396811448r-1149293149695230RPredicted nucleic-acid-bindingprotein containing a Zn-ribbon165149699150874193967919385042137r-3149708150872612PaaJIAcetyl-CoA acetyltransferases166150876151928193850219374501818r-2150876151926582PksGI3-hydroxy-3-methylglutaryl CoAsynthase16715207615247119373021936907760f-3152076152433157SUncharacterized ACR16815241715274319369611936635377f-2152417152738164SUncharacterized ACR169152801153490193657719358882136r-3152810153485416NOP1JFibrillarin-like rRNA methylase170153487154752193589119346261447r-1153487154609553SIK1JProtein implicated in ribosomalbiogenesis171154844155881193453419334972135r-3154919155879578GCD2JTranslation initiation factoreIF-2B delta subunit17215604415730919333341932069378f-2156056157292602ARO8KETranscriptional regulatorscontaining a DNA-binding HTHdomain and an aminotransferasedomain (MocR family) and theireukaryotic orthologs COG1167ARO817315736815822819320101931150761f-3157452157953129RPredicted glutamineamidotransferases174158158159018193122019303601446r-1158179159016422SplBLDNA repair photolyase17515898215946419303961929914762f-3159054159462216SUncharacterized ACR176159517160083192986119292951445r-1159517160081350GuaAFGMP synthase - Glutamineamidotransferase domain17716020616025619291721929122763f-3178160526160744192885219286342134r-316061916073327CAcyl-CoA synthetase (NDPforming)179160787161719192859119276592133r-3160799161717567GuaAFGMP synthase - PP-ATPasedomain180161795163255192758319261232132r-3162410163253495GuaBFIMP dehydrogenase/GMPreductase18116336216440519260161924973764f-316350316376132182164398165393192498019239851444r-1164398165388544RATP-utilizing enzymes ofATP-grasp superfamily 
(probablycarboligases)183165390167531192398819218471817r-21653901675051051PurLFPhosphoribosylformylglycinamidine(FGAM) synthase184168881170377192049719190012131r-3169019169826162PpsAGPhosphoenolpyruvatesynthase/pyruvate phosphatedikinase185170457171128191892119182501816r-2170457171126385PurLFPhosphoribosylformylglycinamidine(FGAM) synthase186171130171381191824819179971443r-1171139171376110PurSFPhosphoribosylformylglycinamidine(FGAM) synthase187171383172534191799519168442130r-3171392172532673RATP-utilizing enzymes ofATP-grasp superfamily (probablycarboligases)188172527173834191685119155441815r-2172539173829602PurDFPhosphoribosylamine-glycineligase189173896173985191548219153931442r-119017440417460119149741914777379f-217443417459929PolBLDNA polymerase elongationsubunit (family B)19117458517534919147931914029765f-317459717487634RAD55TRecA-superfamily ATPasesimplicated in signal transduction192175740177038191363819123401814r-2175749177036781PurTFFormate-dependentphosphoribosylglycinamideformyltransferase (GARtransformylase)19317713817815119122401911227766f-3177147178146545PurMFPhosphoribosylaminoimidazol(AIR) synthetase19417818417834819111941911030380f-217821717833128PyrFFOrotidine-5′-phosphatedecarboxylase195178320179039191105819103391813r-2178332179028341PurCFPhosphoribosylaminoimidazolesuccinocarboxamide(SAICAR)synthase19617919518055319101831908825381f-2179195180551661PurFFGlutaminephosphoribosylpyrophosphateamidotransferase197180543181031190883519083471812r-2180543181002102RPredicted nucleic acid-bindingprotein198181028181288190835019080902129r-318102818127773SUncharacterized ACR199181345183324190803319060541441r-1181345183322984BisCCAnaerobic dehydrogenases200183436184935190594219044431440r-118412918427333MalGGSugar permeases201185362185955190401619034231439r-1185365185953330PDX2HPredicted glutamineamidotransferase involved inpyridoxine biosynthesis202185988187004190339019023741811r-2185997186966536SNZ1HPyridoxine biosynthesis 
enzyme203187111187953190226719014251438r-1187120187939410NadCHNicotinate-nucleotidepyrophosphorylase2041880741893151901304190006336f-1188083189256188GCD1MJNucleoside-diphosphate-sugarpyrophosphorylases involved inlipopolysaccharidebiosynthesis/translation initiationfactor eIF2B subunits COG1208GCD12051898651902781899513189910037f-1189865190276167SUncharacterized ACR20619025319062118991251898757382f-2190253190583154RPredicted nucleotidyltransferases207190630191799189874818975791437r-119063019178571520819187419250918975041896869767f-3191889192489256SmtAQRSAM-dependentmethyltransferases COG0500SmtA2091925351929811896843189639738f-119255319276329PilONFimbrial assembly protein21019297119348618964071895892383f-219300419334942SbcCLATPase involved in DNA repair211193701194033189567718953451810r-2193740194025117WecDKRHistone acetyltransferase HPA2and related acetyltransferasesCOG0454 WecD212194152194358189522618950201436r-119424219435028RimLJAcetyltransferases2131950971954051894281189397339f-119509719531346CcmAQABC-type multidrug transportsystem214195742195846189363618935321435r-121519599519611118933831893267384f-2216196138196959189324018924191434r-1196138196951291WecDKRHistone acetyltransferase HPA2and related acetyltransferasesCOG0454 WecD217197032197625189234618917531433r-1197044197563125RimLJAcetyltransferases21819774719836718916311891011385f-219783719818565SmtAQRSAM-dependentmethyltransferases COG0500SmtA219198495199754189088318896241809r-219854919899675220199748200686188963018886922128r-319990120036333RfaGMPredicted glycosyltransferases22120074220109818886361888280768f-320093120100327BtuCPHABC-typecobalamin/Fe3+-siderophorestransport systems2222010672017381888311188764040f-1201067201727360RPredicted amidohydrolase22320169220210218876861887276386f-2201773202100181ARC1REMAP domain22420210320292418872751886454387f-2202313202922229SpeBEArginase/agmatinase/formimionoglutamatehydrolase22520292920337218864491886006769f-3202944203361187CDC14TPredicted 
protein-tyrosinephosphatase22620358520447518857931884903388f-220363320417082HisSJHistidyl-tRNA synthetase2272044722050831884906188429541f-1204484205048155HisGEATP phosphoribosyltransferase(histidine biosynthesis)22820507020620018843081883178389f-2205079206111276HisDEHistidinol dehydrogenase22920628020681318830981882565770f-3206280206766117HisBEImidazoleglycerol-phosphatedehydratase23020681020739718825681881981390f-2206810207380182HisHEGlutamine amidotransferase23120739920810018819791881278771f-3207405208038162HisAEPhosphoribosylformimino-5-aminoimidazolecarboxamideribonucleotide (ProFAR)isomerase23220808220884018812961880538391f-2208082208826310HisFEImidazoleglycerol-phosphatesynthase23320885020947918805281879899392f-2208898209171119HisIEPhosphoribosyl-AMPcyclohydrolase2342094762104861879902187889242f-1209542210427184HisCEHistidinol-phosphateaminotransferase/Tyrosineaminotransferase23521047021119818789081878180393f-221047621099537GphRPredicted phosphatases23621129621198218780821877396772f-3211296211980355TrpCEIndole-3-glycerol phosphatesynthase23721197921295618773991876422394f-2211985212951415TrpDEAnthranilatephosphoribosyltransferase2382129382142391876440187513943f-1212980214228610TrpEEHAnthranilate/para-aminobenzoatesynthases component I COG0147TrpE23921423621481418751421874564773f-3214236214806326PabAEHAnthranilate/para-aminobenzoatesynthases component IICOG0512 PabA2402148072154331874571187394544f-1214816215428253TrpFEPhosphoribosyl anthranilateisomerase24121542621659518739521872783395f-2215435216587676TrpBETryptophan synthase beta chain24221658821734318727901872035774f-3216588217323370TrpAETryptophan synthase alpha chain243217325218095187205318712832127r-321732821791385TyrAEPrephenate dehydrogenase244218020219114187135818702641432r-1218029218971191AvtAEPLP-dependent aminotransferases245219077219253187030118701252126r-321908021922135PheAEChorismate mutase246219407220474186997118689042125r-3219407220457530AroCEChorismate 
synthase247220471221718186890718676601431r-1220513221710470AroAE5-enolpyruvylshikimate-3-phosphatesynthase248221676222236186770218671421808r-2221742222234175EHArchaeal shikimate kinaseCOG1685 -249222472222852186690618665261430r-1222472222850161AroEEShikimate 5-dehydrogenase250222879223259186649918661191807r-2222879223197142AroEEShikimate 5-dehydrogenase251223282223923186609618654551429r-1223282223894207AroDE3-dehydroquinate dehydratase252223877225022186550118643562124r-3223985224876350AroBE3-dehydroquinate synthetase253224890225804186448818635741428r-1224965225682395AroAE3-Deoxy-D-arabino-heptulosonate7-phosphate (DAHP) synthase254225801226844186357718625341806r-2225924226824426TktAGTransketolase255226718227377186266018620012123r-3226742227369278TktAGTransketolase256227370227741186200818616371805r-222746322754730AraCKAraC-type DNA-bindingdomain-containing proteins25722793122824218614471861136775f-322798522823771ProCEPyrroline-5-carboxylate reductase25822825722871818611211860660396f-2228257228701136ProCEPyrroline-5-carboxylate reductase259228710229147186066818602312122r-3228710229079201ArgEEAcetylornithinedeacetylase/Succinyl-diaminopimelatedesuccinylase and relateddeacylases260229347229745186003118596331804r-2229347229716195ArgEEAcetylornithinedeacetylase/Succinyl-diaminopimelatedesuccinylase and relateddeacylases261229732230820185964618585581427r-1229732230809523ArgDEPLP-dependent aminotransferases262230826231581185855218577971803r-2230826231579315ArgBEAcetylglutamate kinase263231591232583185778718567951802r-2231591232578564ArgCEAcetylglutamate semialdehydedehydrogenase264232580233410185679818559682121r-3232589233405437RimKHJGlutathione synthase/Ribosomalprotein S6 modification enzyme(glutaminyl transferase)COG0189 RimK265233428233589185595018557891426r-123343123351228PqiASUncharacterizedparaquat-inducible protein 
A266233684234727185569418546512120r-3233684234692456LeuBEIsocitrate/isopropylmalatedehydrogenase267234715235206185466318541721425r-1234715235201256LeuDE3-isopropylmalate dehydratasesmall subunit268235203236345185417518530331801r-2235203236337595LeuCE3-isopropylmalate dehydrataselarge subunit269236342237427185303618519512119r-3236342237425536LeuAEIsopropylmalate/homocitrate/citramalatesynthases270237653238216185172518511622118r-3237653238214297NfnBCNitroreductase27123850923952818508691849850776f-3238581239505289RPredicted ATPase of the AAAsuperfamily27223948923968618498891849692397f-223949523967276RPredicted ATPase of the AAAsuperfamily273239677240426184970118489521424r-1239677240424406PhnPRMetal-dependent hydrolases ofthe beta-lactamase superfamily I27424056024302818488181846350398f-2240662242990424PflDCPyruvate-formate lyase27524397724452518454011844853399f-224411824432235ArpRAnkyrin repeat proteins2762445912450551844787184432345f-1244591245044228SUncharacterized ACR27724505224574718443261843631777f-3245052245736322SUncharacterized ArCR278245738246229184364018431492117r-324574424588833279246239246340184313918430382116r-324623924632626TehAPTellurite resistance protein andrelated permeases280247226248134184215218412442115r-3247241248132503NadAHQuinolinate synthase281248197249606184118118397721423r-1248275249586598NadBHAspartate oxidase2822511612512651838217183811346f-128325139425147718379841837901778f-32842515572517601837821183761847f-125160225173132GpmAGPhosphoglycerate mutase 1285254653255162183472518342161422r-1254653255151248KptASUncharacterized ACR286255227256987183415118323912114r-325630425691957ElsHRMetal-dependent hydrolase287257124258452183225418309261800r-2257133258450728HcaDRUncharacterizedNAD(FAD)-dependentdehydrogenases288258556259233183082218301451421r-1258556259231310PyrHFUridylate kinase28926070326192318286751827455779f-3260703261798430SrmBLKJSuperfamily II DNA and RNAhelicases COG0513 
SrmB290262176262484182720218268941799r-2262176262482183RpsJJRibosomal protein S10291262544263830182683418255482113r-3262544263828762TufBJEGTPases - translation elongationfactors COG0050 TufB292264065265165182531318242132112r-3264065265157634FusAJTranslation elongation and releasefactors (GTPases)293264895266262182448318231161420r-1264895265954642FusAJTranslation elongation and releasefactors (GTPases)294266696266977182268218224012111r-3295267002268075182237618213032110r-3267005267965260RHD superfamilyphosphohydrolases296268109269197182126918201812109r-3268109269156619ArgEEAcetylornithinedeacetylase/Succinyl-diaminopimelatedesuccinylase and relateddeacylases29726929727006418200811819314400f-2269378270059270GloBRZn-dependent hydrolases2982700522703061819326181907248f-1270061270304147SUncharacterized ArCR299270301271278181907718181001419r-1270331270853117SUncharacterized ACR30027136127211918180171817259401f-2271361272117317TatDLMg-dependent DNase30127212127242918172571816949780f-327220827242158SmtAQRSAM-dependentmethyltransferases COG0500SmtA302272525274057181685318153212108r-3272534274055679FolPHDihydropteroate synthase30327424427496318151341814415402f-2274244274955417SUncharacterized ACR30427534027556418140381813814781f-327546327553827RPredicted nucleic acid-bindingprotein3052766882777581812690181162049f-127703027716533CAldehyde:ferredoxinoxidoreductase3062777592785261811619181085250f-127831427848528ThiPHABC-type thiamine transportsystem30727845427898118109241810397782f-327870027879329KRNA-binding proteins (RRMdomain)30827896927973618104091809642403f-2279002279638156CcmAQABC-type multidrug transportsystem309279859280521180951918088571418r-1279883280513255HIS2ERHistidinol phosphatase andrelated hydrolases of the PHPfamily COG1387 HIS231028062928107218087491808306783f-3280638281070251SbmIMethylmalonyl-CoA mutase3112811042820721808274180730651f-1281113282061494ArgKEPutative periplasmic proteinkinase ArgK and related GTPasesof G3E 
family31228206928246718073091806911784f-3282069282462233GloAELactoylglutathione lyase andrelated lyases313282544283272180683418061061417r-1282544283186182WecDKRHistone acetyltransferase HPA2and related acetyltransferasesCOG0454 WecD314283421284416180595718049622107r-3283421284405414DUR1EAllophanate hydrolase subunit 2315284413285099180496518042791416r-1284419285085318DUR1EAllophanate hydrolase subunit 1316285104285292180427418040862106r-328510728525739VapCRPredicted nucleic acid-bindingprotein317285716286492180366218028862105r-3285725286487455RUncharacterized proteins3182865432870791802835180229952f-1286570287005214RPredicted nucleic acid-bindingprotein319287046287645180233218017331798r-2287112287643244FPredicted nucleotide kinase(related to CMP and AMPkinases)320287758288153180162018012251415r-128778828788128RpoDKDNA-directed RNA polymerasesigma subunits(sigma70/sigma32)321288150288437180122818009411797r-228815928842344SUncharacterized ACR322288505289047180087318003311414r-128872428890442RPredicted nucleotidyltransferases323289173289493180020517998851796r-2324289490289948179988817994302104r-328950228987433RPredicted nucleic acid-bindingprotein325290136291029179924217983491795r-2290193291024363AlkAL3-Methyladenine DNAglycosylase326290939291157179843917982212103r-329097529106530GlgBG132729135329269617980251796682404f-2291431292670516NOMembrane-bound serine protease(ClpP class) COG1030-32829270329350917966751795869405f-2292763293507374HflCOMembrane protease subunits329293510293593179586817957852102r-333029362729441517957511794963406f-2293636294413406DATPases involved in chromosomepartitioning3312943462946631795032179471553f-133229475029500117946281794377785f-329480129496928SecANPreprotein translocase subunitSecA (ATPase33329511529662617942631792752407f-2295115296624782DeoAFThymidine phosphorylase334296627297139179275117922392101r-329688229701730UvrALExcinuclease ATPase subunit335297204297731179217417916471794r-2297270297720278MoaCHMolybdenum cofactorbiosynthesis 
enzyme33629777329870217916051790676408f-2297785298694452CcmAQABC-type multidrug transportsystem3372986993008251790679178855354f-1298768300298273SPredicted membrane protein33830079530174817885831787630786f-3300822301671226NosYRABC-type transport systeminvolved in multi-copper enzymematuration339301803303251178757517861271793r-2302097303249645RtcBSUncharacterized ACR340303305303766178607317856122100r-3303374303752140SUncharacterized ACR341303750304688178562817846901792r-2303750304662427SunJtRNA and rRNAcytosine-C5-methylases342304698305126178468017842521791r-2304698305124183SUncharacterized ACR34330533930619317840391783185409f-2305339306185437PanBHKetopantoatehydroxymethyltransferase3443061903068581783188178252055f-1306193306853272WcaAMGlycosyltransferases involved incell wall biogenesis34530747330770017819051781678787f-330752730765626BaeSTSensory transduction histidinekinases346308311308886178106717804921413r-1308311308875240ThiIHThiamine biosynthesis ATPpyrophosphatase347308930309406178044817799722099r-3308930309377139SPredicted membrane protein348309492310637177988617787411790r-2309498310497350ThiIHThiamine biosynthesis ATPpyrophosphatase349310642311016177873617783621412r-131070831089431ThiPHABC-type thiamine transportsystem350311017311625177836117777531411r-131103531156962NfnBCNitroreductase351312108312536177727017768421789r-231239931252829PhoUPPhosphate uptake regulator3523126373129031776741177647556f-135331295331330617764251776072410f-231319331330132RATPases of the PilT family35431334431412017760341775258788f-3313407314118356QMaleate cis-trans isomerase35531420531444717751731774931789f-331431331443630AraCKAraC-type DNA-bindingdomain-containing proteins35631442931558917749491773789411f-231445331476539GloBRZn-dependent hydrolases357315618316058177376017733201788r-231576231585832KatEPCatalase358316245316973177313317724051787r-2316245316971423Spo0JKPredicted transcriptionalregulators35931712431827217722541771106790f-3317136318267480SUncharacterized 
ACR360318265319239177111317701391410r-1318388319225367SUncharacterized ACR361319807319851176957117695271409r-13623202393209281769139176845057f-132030832052138XerCLIntegrase36332137432151117680041767867412f-23643215083216961767870176768258f-132151732164928RPredicted nucleic acid-bindingprotein3653220123223651767366176701359f-132206032222831CysZEUncharacterized protein involvedin cysteine biosynthesis36632226532425617671131765122413f-232298232326136SPredicted membrane protein36732426132639917651171762979791f-332488232507434ArpRAnkyrin repeat proteins36832655232693517628261762443414f-232663932679231AmtBPAmmonia permeases3693270133272821762365176209660f-132704932721728ZntAPCation transport ATPases37032728432751417620941761864415f-232738632748827DraGOADP-ribosylglycohydrolase37132751832832117618601761057416f-232815732831330BioDHDethiobiotin synthetase3723283333288151761045176056361f-132833332849229SUncharacterized BCR37332881232928817605661760090792f-332900432911829NPredicted secreted acidphosphatase3743292903300901760088175928862f-132938032992944SmcDChromosome segregationATPases37533022433168717591541757691417f-233082733140642RfaGMPredicted glycosyltransferases37633169133245217576871756926418f-233215333231232GlmUMN-acetylglucosamine-1-phosphateuridyltransferase (containsnucleotidyltransferase and I-patchacetyltransferase domains)3773324493327361756929175664263f-137833417533494517552031754433419f-233422333431931CirAPOuter membrane receptor proteins3793350683356641754310175371464f-133515833543435RUncharacterized CBSdomain-containing proteins3803370453372601752333175211865f-133708733722228GCGlycosyl transferases381337711338295175166717510831408r-133805033828437LMutS-like ATPases involved inmismatch repair38233936333978817500151749590793f-333944133963934LReplication factor A large subunitand related ssDNA-bindingproteins38334064134072717487371748651794f-338434155834199517478201747383420f-234160034174742AbrBKRegulators ofstationary/sporulation 
geneexpression3853423973434611746981174591766f-134312634336336MarRKTranscriptional regulators38634345434389117459241745487421f-234353834376032SUncharacterized BCR3873438883440761745490174530267f-134391234398729PyrGFCTP synthase (UTP-ammonialyase)38834409034440117452881744977422f-238934528134547217440971743906423f-234535034546426NlpDMMembrane proteins related tometalloendopeptidases390345566345622174381217437562098r-339134561534574017437631743638795f-33923461743463561743204174302268f-134618334629728NrfGRTPR-repeat-containing proteins3933465283468811742850174249769f-134665134683728LReplication factor A large subunitand related ssDNA-bindingproteins394346606346668174277217427101407r-139534713834846317422401740915424f-2347351348461427SUncharacterized ACR396348567350417174081117389611786r-23485673504031032ESerine proteases of the peptidasefamily S9A39735053735159817388411737780425f-2350537350981162RibDHPyrimidine deaminase3983515923521551737786173722370f-1351601352150191RibCHRiboflavin synthase alpha chain39935241935298517369591736393796f-335246135264730RPredicted membrane-associated4003539233541021735455173527671f-135401035409725LytRKTranscriptional regulator40135417435533417352041734044797f-3354723355320243RibAHGTP cyclohydrolase II4023553933558721733985173350672f-1355414355849170RibHHRiboflavin synthase beta-chain403355856356452173352217329262097r-3355862356387125SUncharacterized ArCR404356449357381173292917319971406r-1356455357211170RATP-utilizing enzymes ofATP-grasp superfamily (probablycarboligases)405357378358037173200017313411785r-2357378357969140PurCFPhosphoribosylaminoimidazolesuccinocarboxamide(SAICAR)synthase406358034359329173134417300492096r-3358043359312651ThiCHThiamine biosynthesis proteinThiC4073594073601711729971172920773f-1359416360163386RFlavoproteins40836016836146617292101727912798f-3360171360888200ThiDHHydroxymethylpyrimidine/phosphomethylpyrimidinekinase40936149736340717278811725971799f-33615063633781016RUncharacterized 
ABC-typetransporter410366699367151172267917222271784r-236687936705033RPredicted metal-dependentmembrane protease411367290368240172208817211381783r-236793236819035HypFOHydrogenase maturation factor412368237369289172114117200892095r-3368243368948301SlpAOFKBP-type peptidyl-prolylcis-trans isomerases 241337063437144917187441717929426f-237121637136330CaiAIAcyl-CoA dehydrogenases41437148137292017178971716458800f-3371490372918859CysSJCysteinyl-tRNA synthetase4153744883745501714890171482874f-141637458337484017147951714538801f-3374583374832129SUncharacterized ACR417374833375534171454517138441405r-137524737542732LPredicted transposase418375535376308171384317130701404r-1375535376294105SUncharacterized ACR4193760003760921713378171328675f-1420376298376771171308017126072094r-3376298376769238KPredicted transcriptional regulator421379177380310171020117090681403r-137975637998438TarNMethyl-accepting chemotaxisprotein422380366381109170901217082692093r-338055838104732SPS1TSerine/threonine protein kinases423381111382313170826717070651782r-2381642382305360SUncharacterized ACR424382310382675170706817067032092r-338245438260429HisSJHistidyl-tRNA synthetase425382850383839170652817055392091r-3382859383837516SUncharacterized ACR426384244384471170513417049071402r-138424438430442AbrBKRegulators ofstationary/sporulation geneexpression427384528385040170485017043381781r-2384534385035239LRecB family exonuclease428385030386139170434817032391401r-138513838584340RPredicted ATPase of the AAAsuperfamily429389056390132170032216992461400r-1389056390127503SUncharacterized ACR430390129391328169924916980501780r-239045039063032SUncharacterized proteins ofWD40-like repeat family431391570392187169780816971911399r-1391570392140247SUncharacterized ACR432392614393321169676416960571398r-1392674393319399CAcyl-CoA synthetase (NDPforming)43339344939475016959291694628427f-239441539468830WcaGMGNucleoside-diphosphate-sugarepimerases COG0451 WcaG4343948943981091694484169126976f-139690139737842TarNMethyl-accepting 
chemotaxisprotein435398178398471169120016909071779r-239820239835227SmsOPredicted ATP-dependent serineprotease43639850239901116908761690367802f-339877239890430EmrKQMultidrug resistance efflux pump43739905040418516903281685193428f-23990504019331348LReverse gyrase43840448440529016848941684088803f-3404487405282409KPredicted transcriptionalregulators439405419405631168395916837472090r-340542240555438KPredicted transcriptional regulator440405628405963168375016834151397r-1405640405955155RUncharacterized Zn-fingercontaining protein441405960406709168341816826691778r-2405975406707256SpeBEArginase/agmatinase/formimionoglutamatehydrolase44240683540805516825431681323429f-2406835407465358SgbHG3-hexulose-6-phosphate synthaseand related proteins4434080524088071681326168057177f-1408082408796262FtsZDCell division GTPase44440880940946216805691679916430f-2408818409448248RPredicted hydrolases of the HADsuperfamily4454094594096471679919167973178f-140949540964530WcaGMGNucleoside-diphosphate-sugarepimerases COG0451 WcaG44640964741045916797311678919804f-340990241030733QPolyketide synthase modules andrelated proteins44741046041108016789181678298805f-3410499411027205RPredicted HD superfamilyhydrolase44841117641168816782021677690431f-2411176411686227NusAKTranscription terminator44941187841329316775001676085432f-241249041304536KPredicted transcriptionalregulators45041341541391516759631675463806f-341352341375439GyrALDNA gyrase (topoisomerase II) Asubunit4514139264142521675452167512679f-141393841417530SurAOParvulin-like peptidyl-prolylisomerase4524148774152091674501167416980f-141487741512331ArgSJArginyl-tRNA synthetase4534171094172701672269167210881f-141711541725927PutABProline dehydrogenase45441729141792916720871671449807f-341733041746230MetCECystathioninebeta-lyases/cystathioninegamma-synthases4554186364191751670742167020382f-141866341901733SUncharacterized proteins ofWD40-like repeat family45641924742056316701311668815808f-3419247420561771AsnSJAspartyl/asparaginyl-tRNAsynthetases 
(Aspartyl-tRNAsynthetase)45742062742213216687511667246809f-342163542191733RUncharacterized membraneprotein45842233342271916670451666659433f-2459422876424030166650216653482089r-3422876424019541AbgBRMetal-dependentamidase/aminoacylase/carboxypeptidase4604265474267111662831166266783f-146142674742774216626311661636810f-3426750427734452RPredicted methyltransferase46242779942906416615791660314434f-2427820429011224RUncharacterized ATPases of theAAA superfamily463429065430390166031316589882088r-3429065430388624TldDRPredicted Zn-dependent proteasesand their inactivated homologs464430394430633165898416587452087r-343049043059230SpoUJrRNA methylases465430618430785165876016585931396r-143065443072025PncAQAmidases related tonicotinamidase466430883432259165849516571192086r-3430883432257780TldDRPredicted Zn-dependent proteasesand their inactivated homologs4674323974327381656981165664084f-1432397432733176SUncharacterized ACR4684327514334491656627165592985f-1432760433429319RacXMAspartate racemase469433446434621165593216547571777r-2433650434616391CorAPMg2+ and Co2+ transporters4704345304357351654848165364386f-1434542435733681RPredicted GTPase471435779436300165359916530782085r-3435779436295208CyaBFAdenylate cyclase472436300436812165307816525661395r-1436339436810201LrpKTranscriptional regulators47343740943820916519691651169811f-3437415438207286SUncharacterized ACR474438222439658165115616497201776r-2438222439650588PykFGPyruvate kinase475439696440403164968216489751394r-1439696440368147RPredicted Zn-dependent proteases4764405784414441648800164793487f-1440578441442390SUncharacterized ArCR4774415114418821647867164749688f-1441511441880136CrcBDIntegral membrane proteinpossibly involved in chromosomecondensation47844188744226716474911647111435f-2441887442262231SUncharacterized ACR47944235844287316470201646505436f-244244844263429G2-Phosphoglycerate 
kinase48044292244414216464561645236437f-2442931444140630DfpHPhosphopantothenoylcysteinesynthetase/decarboxylase4814442204446811645158164469789f-144429544460739ZntAPCation transport ATPases48244497244531016444061644068812f-344497244527869SUncharacterized ACR483446197448899164318116404791393r-1446209448864962RDistinct helicase family with aunique C-terminal domainincluding a metal-bindingcysteine cluster484448945450294164043316390841392r-1449620450244148RPredicted hydrolase of thealpha/beta superfamily4854504814509961638897163838290f-1450481450994274CRubrerythrin48645107745123816383011638140813f-3451077451236111CRubredoxin48745125045159716381281637781438f-2451250451595224CDesulfoferrodoxin4884527704531231636608163625591f-145281845292933MrpDATPases involved in chromosomepartitioning48945318345460116361951634777814f-3453318454590772GlyAEGlycinehydroxymethyltransferase49045483545534116345431634037439f-245495245523433RLarge extracellular alpha-helicalprotein4914553384555021634040163387692f-145536245543725GCellobiose phosphorylase49245633045666216330481632716815f-3456330456660174RPB9KDNA-directed RNA polymerasesubunit M/Transcriptionelongation factor TFIIS49345662345683516327551632543440f-245665945673428WecDKRHistone acetyltransferase HPA2and related acetyltransferasesCOG0454 WecD4944568384575871632540163179193f-1456838457585358DnaNLDNA polymerase III beta subunit(Proliferating cell nuclearantigen = PCNA)4954576184581841631760163119494f-1457618458128140SUncharacterized ArCR4964584764591261630902163025295f-1458476459124417AhpCOPeroxiredoxin497459138459680163024016296981775r-2459147459678164RimLJAcetyltransferases4984597184606741629660162870496f-1459718460603345KPredicted transcriptionalregulators499460667461935162871116274432084r-3460670461927532RHD superfamilyphosphohydrolases500462618463808162676016255701774r-2462624463764576MoeAHMolybdopterin biosynthesisenzyme501464266464421162511216249571391r-146432046438026RplWJRibosomal protein 
L23502464460464972162491816244061773r-2464460464970218MoaBHMolybdopterin biosynthesisenzymes50346533646656216240421622816816f-3465360466560653SUncharacterized ACR504466632466847162274616225311772r-25054669754676311622403162174797f-1466975467581273RPredicted phosphoesterases506467628468806162175016205721771r-2467637468804686AvtAEPLP-dependent aminotransferases507471018472637161836016167411770r-2471027472629799OPredicted carbamoyl transferase508472691474145161668716152332083r-3472706474143726ProSJProlyl-tRNA synthetase50947423947524016151391614138441f-2474239475193469LdhACHRLactate dehydrogenase andrelated dehydrogenases COG1052LdhA51047525047570816141281613670442f-247540347554145FrvXGCellulase M and related proteins5114757024770421613676161233698f-1475768477031662RPredicted DNA-binding proteincontaining a Zn-ribbon domain5124770494776571612329161172199f-1477061477640249SUncharacterized ACR51347773847803116116401611347817f-3514477971479050161140716103282082r-3477980479039533GCN3JTranslation initiation factoreIF-2B alpha subunit51547888147963916104971609739818f-3479103479622191RPredicted ATPases or kinases516479629480162160974916092161390r-1479635480148228RCBS domains517480198480755160918016086231769r-248021948050152ArsRKPredicted transcriptionalregulators518480843481127160853516082511768r-2480852481119129Ssh10bKArchaeal DNA-binding protein51948131548267916080631606699100f-1481315482656775PurBFAdenylosuccinate lyase52048498148544516043971603933101f-1485002485437219H6-pyruvoyl-tetrahydropterinsynthase521485442486008160393616033701767r-248552948579031TrpDEAnthranilatephosphoribosyltransferase52248606548648416033131602894443f-2486080486473167RPredicted DNA-binding proteinswith PD1-like DNA-bindingmotif523486481488979160289716003991389r-14864814889771328RSpecific archaeal helicases524489517490644159986115987341388r-1489604490642651TyrSJTyrosyl-tRNA synthetase52549074449184415986341597534102f-149175549184238OppAEPABC-typedipeptide/oligopeptide/nickeltransport 
systems52649192249337615974561596002819f-3492033493350412TbpAHABC-type iron/thiamine transportsystems52749356149540815958171593970103f-1493843495388396ThiPHABC-type thiamine transportsystem52849541049648015939681592898444f-2495419496436314MalKGABC-typesugar/spermidine/putrescine/iron/thiaminetransport systems52949709049918615922881590192445f-2497276498920114IccRPredicted phosphohydrolases530499596499949158978215894291766r-249964749979730MipBGTransaldolase531500938501252158844015881261387r-150097150108529SpoUJrRNA methylases532501249501479158812915878991765r-250131250142028AceECPyruvate dehydrogenase533501658502464158772015869141386r-1501703502453241DnaNLDNA polymerase III beta subunit(Proliferating cell nuclearantigen = PCNA)534502547502792158683115865862081r-350266150278430XylBGSugar (pentulose and hexulose)kinases535502785502967158659315864111764r-250282150295932RpoCKDNA-directed RNA polymerasebeta' subunit/160 kD subunit(split gene in archaea and Syn)53650318750335415861911586024820f-350324150332526AcrAQMembrane-fusion protein53750497150509915844071584279446f-250497150509426MarRKTranscriptional regulators538506242506664158313615827141385r-150640450663535DeoAFThymidine phosphorylase53950750650759215818721581786447f-2540508803509420158057515799581763r-250909150931333LInteins541510163510879157921515784991384r-151023551049034SUncharacterized ACR542511923512477157745515769011762r-251203451229834MtlACPhosphotransferase system54351310451348115762741575897448f-251326951338630ClpAOATPases with chaperone activity544513710514261157566815751172080r-351395351416330HitFGRDiadenosine tetraphosphate(Ap4A) hydrolase and other HITfamily hydrolases COG0537 Hit545514843515223157453515741551383r-151487351502929RUncharacterized proteins of theAP superfamily546515543515791157383515735872079r-351563651569928RplCJRibosomal protein L3547517003517803157237515715751382r-151727651761844SmcDChromosome segregationATPases548517805518281157157315710972078r-351799751811129NPredicted secreted 
acidphosphatase549518278518760157110015706181381r-151829651851528RPredicted hydrolase of alkalinephosphatase superfamily550518772519575157060615698031761r-2551519579519809156979915695691760r-251973551979826RbnJtRNA-processing ribonucleaseBN552520158520541156922015688371759r-252024552039831AmtBPAmmonia permeases553520694522628156868415667502077r-352111152130334ArpRAnkyrin repeat proteins554522837524828156654115645501758r-252361752385435SPredicted membrane protein555524728525042156465015643361380r-152473752490531CysZEUncharacterized protein involvedin cysteine biosynthesis556525397525585156398115637931379r-152540652553828RPredicted nucleic acid-bindingprotein557525884526483156349415628952076r-352600452619929KPredicted RNA-binding proteinhomologous to eukaryotic snRNP55852719952746815621791561910821f-3527208527451153RPL43AJRibosomal protein L37AE/L43A55952768952832415616891561054104f-1527698528319339IMP4JProtein containing the IMP4domain present in small nuclearribonucleoproteins; implicated inRNA processing56052836452896915610141560409105f-1528364528967266MnhEPMultisubunit Na+/H+ antiporter56152898452921715603941560161822f-352899352921284MnhFPMultisubunit Na+/H+ antiporter56252921452952815601641559850449f-252928052952697MnhGPMultisubunit Na+/H+ antiporter56352950952973915598691559639823f-352950952973761MnhBPMultisubunit Na+/H+ antiporter56452973652998115596421559397450f-252981752997959MnhBPMultisubunit Na+/H+ antiporter56552997853038515594001558993106f-1529978530383122MnhBPMultisubunit Na+/H+ antiporter56653065953214615587191557232107f-1530749531982315HyfBCPFormate hydrogenlyase subunit3/Multisubunit Na+/H+ antiporter567532123532530155725515568481378r-1532123532525172IlvHEAcetolactate synthase56853261553375415567631555624108f-153268453352177KefBPKef-type K+ transport systems56953378953491615555891554462451f-253457553490533SmcDChromosome segregationATPases570534917535363155446115540152075r-3534926535361249CheWNChemotaxis signal 
transductionprotein571535366536694155401215526841377r-1535876536542231TarNMethyl-accepting chemotaxisprotein572536818536871155256015525071376r-157353699853784615523801551532109f-1537025537838375CheRNTMethylase of chemotaxismethyl-accepting proteinsCOG1352 CheR57453784753820915515311551169110f-1537847538207224CheYTCheY-like receiver domains57553823053929715511481550081824f-3538230539286509CheBNTChemotaxis response regulatorCheB57653930454095015500741548428825f-3539304540906521CheANChemotaxis protein histidinekinase and related kinases57754098654168115483921547697452f-2540986541628349CheANChemotaxis protein histidinekinase and related kinases57854167154229415477071547084826f-3541680542289293CheCNTChemotaxis protein CheC57954229154291415470871546464453f-2542291542903303CheCNTChemotaxis protein CheC58054290454515915464741544219827f-3542916545154640TarNMethyl-accepting chemotaxisprotein58154519154568815441871543690111f-1545206545686259CheDNTChemotaxis protein; stimulatesmethylation of MCP proteinsCOG1871 CheD58254570654645515436721542923828f-354589254641140SUncharacterized archaealcoiled-coil domain58354646854750215429101541876829f-3546477547491366SUncharacterized ACR58454749954775915418791541619454f-254753854775792SUncharacterized ArCR58554783054818315415481541195830f-3547830548181136GimCOPrefoldin58654821854855315411601540825112f-154822754838632TasCPredicted oxidoreductases (relatedto aryl-alcohol dehydrogenases)58754853154951415408471539864455f-2548531549509423RExopolyphosphatase-relatedproteins58854951554985015398631539528456f-254955754982430ClsIPhosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes58955008055115015392981538228831f-355016455049432TatANSec-independent protein secretionpathway components59055124955246015381291536918457f-255127055229074NrfGRTPR-repeat-containing proteins59155230955304315370691536335832f-3552318553041399RUncharacterized ArCR (containsC-terminal EMAP 
domain)59255313355369915362451535679458f-2553214553697265SUncharacterized ACR593553745554734153563315346442074r-3553745554720466MviMRPredicted dehydrogenases andrelated proteins59455485555567615345231533702459f-2554867555674401PPredicted divalent heavy-metalcations transporter595555783556910153359515324681757r-2555882556908419FtsYNSignal recognition particleGTPase596556879558105153249915312731375r-1556879558076334LPredicted transposases597558125558196153125315311822073r-3598558864559322153051415300561756r-255889755900231LSuperfamily I DNA and RNAhelicases and helicase subunits59955950656079815298721528580833f-3560307560760144MedNSurface lipoprotein60056083856236415285401527014834f-3560865562350525MglAGABC-type sugar (aldose)transport system60156236156339515270171525983460f-2562454563390164RUncharacterized ABC-typetransport system60256337156430315260071525075113f-1563407564241201RUncharacterized ABC-typetransport system603564310565311152506815240671374r-1564310565306276ZnuAPABC-type Mn/Zn transportsystem60456540956754115239691521837461f-256664856716434AceECPyruvate dehydrogenase605567556567786152182215215921373r-156756556766428SUncharacterized stress-inducedprotein606567865568512152151315208661372r-1567865568507355RPredictedphosphoribosyltransferases60756871157012915206671519249114f-1568747570121813CAcyl-CoA synthetase(NDPforming)608570172570729151920615186491371r-157036457049330ChaCPUncharacterized protein involvedin cation transport60957089857095715184801518421115f-161057103157173815183471517640462f-2571031571736351ApaHTDiadenosine tetraphosphatase andrelated serine/threonine proteinphosphatases611571735572070151764315173081370r-157173557198142SUncharacterized ACR612572149574656151722915147221369r-15721495746361272SpoVKOATPases of the AAA+ class(celldivision control protein A)613574653575411151472515139671755r-257473457510332SUncharacterized ACR614575490576503151388815128751754r-2575502576498595DYS1JDeoxyhypusine 
synthase615576540577586151283815117921753r-2576540577428182GltDERNADPH-dependent glutamatesynthase beta chain and relatedoxidoreductases COG0493 GltD61657775057856515116281510813116f-1577786578563355FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG61757861257902515107661510353463f-2578621578960151RPredicted nucleotidyltransferases61857939257945415099861509924464f-2619580461580553150891715088251752r-2620581070581168150830815082101751r-2621582573583445150680515059331750r-2582573583443326HtpXOZn-dependent protease withchaperone function622583582585228150579615041501368r-1583582585172854GroLOChaperonin GroEL (HSP60family) (Chaperonin A)62358539658638215039821502996835f-3585717586377332TMn2+-dependent serine/threonineprotein kinase624587383587667150199515017111367r-158740458762029TyrBBAspartate/aromaticaminotransferase625588220589968150115814994101366r-1588244589963615LMutS-like ATPases involved inmismatch repair626590029591039149934914983391365r-1590041591037552LdhACHRLactate dehydrogenase andrelated dehydrogenases COG1052LdhA627591078592301149830014970771749r-2591276592218147SdaCEAmino acid permeases62859219059319114971881496187465f-2592418593168346SIR2HNAD-dependent proteindeacetylases62959321459395714961641495421836f-3593229593949332RPredicted hydrolases of the HADsuperfamily63059391459449514954641494883117f-1593923594493259SUncharacterized ACR631594739594795149463914945831364r-163259532959561014940491493768837f-3595338595602124SUncharacterized membraneprotein63359542759755014939511491828466f-25956165975091017BisCCAnaerobic dehydrogenases634597520597798149185814915801363r-159754759773030PlsXIFatty acid/phospholipidbiosynthesis enzyme635598695599399149068314899791748r-259870459928338NatBCABC-type Na+ efflux pump636599396600097148998214892812072r-359943259999642RABC-type multidrug transportsystem637600094600945148928414884331362r-1600139600934281CcmAQABC-type multidrug 
transportsystem638600958600999148842014883791361r-163960138860182814879901487550467f-2601388601826188RPredicted nucleic acid-bindingprotein640601912602571148746614868071360r-160238660256365RPredicted DNA binding domain641602643603974148673514854041747r-2602643603972762TldDRPredicted Zn-dependent proteasesand their inactivated homologs642603976605406148540214839721359r-1603985605404756TldDRPredicted Zn-dependent proteasesand their inactivated homologs64360550660582314838721483555118f-1605530605815174MazGRPredicted pyrophosphatase644605856606749148352214826291746r-2605859606744522CMinD superfamily P-loop ATPasecontaining an inserted ferredoxindomain645606746607678148263214817002071r-3606806607664427CMinD superfamily P-loop ATPasecontaining an inserted ferredoxindomain646607678608625148170014807531358r-1607678608620476CFe—S oxidoreductases64760872060934914806581480029468f-2608720609347295SmtAQRSAM-dependentmethyltransferases COG0500SmtA64860966561120014797131478178469f-2609749611192473PutACNAD-dependent aldehydedehydrogenases64961128161292414780971476454119f-1612169612835124FecBPABC-type Fe3+-siderophorestransport systems65061292161386814764571475510838f-3612963613839185BtuCPHABC-typecobalamin/Fe3+-siderophorestransport systems65161385561461614755231474762120f-1613858614590160FepCPHABC-typecobalamin/Fe3+-siderophorestransport systems65261461361537414747651474004839f-361485061499432RPutative homoserine kinase typeII (protein kinase fold)65361537961611614739991473262121f-1615379616108323SUncharacterized ACR654616117616626147326114727521357r-1616150616618275SUncharacterized ACR65561671361737514726651472003840f-3616716617373325RMetal-dependent hydrolases ofthe beta-lactamase superfamily II656617430618005147194814713731745r-2657617873619891147150514694872070r-3617873619829739FeoBPFerrous ion uptake system proteinFeoB (predicted 
GTPase)658619888620115146949014692631356r-161988862010455FeoAPProtein659620116620346146926214690321355r-162019762034155FeoAPProtein66062052662158114688521467797841f-3620853621561229ModAPABC-type molybdate transportsystem66162155462236614678241467012470f-2621668622349238CysUPABC-type sulfate/molybdatetransport systems66262233862340214670401465976842f-3622377623397335CysAPABC-type sulfate/molybdatetransport systems663623814624353146556414650251744r-262407862427332ARA1RAldo/keto reductases664624301624510146507714648681354r-162430162450270STE14OPutativeprotein-S-isoprenylcysteinemethyltransferase665624735625205146464314641731743r-262506562514628GspDNGeneral secretory pathwayprotein D66662522362589114641551463487471f-2625268625595146SUncharacterized ACR66762591662617014634621463208472f-2668626202626936146317614624421742r-262623262679055RABC-type multidrug transportsystem669626909627853146246914615252069r-3626918627773206CcmAQABC-type multidrug transportsystem670627832628989146154614603891353r-162796462860344SUncharacterized proteins ofWD40-like repeat family671629061629687146031714596911741r-2629088629673198SmtAQRSAM-dependentmethyltransferases COG0500SmtA672629684631024145969414583542068r-3629684631022771RPredicted membrane componentsof an uncharacterizediron-regulated ABC-typetransporter SufB673631021631839145835714575391352r-1631099631822386RIron-regulated ABC transporterATPase subunit SufC67463187163235014575071457028473f-2631886632231196SUncharacterized ACR67563243063263014569481456748843f-363243063262546SUncharacterized ArCR67663261763309914567611456279122f-1632617633070203RPredicted nucleic acid-bindingprotein67763311263393314562661455445123f-1633121633931381RMetal-dependent hydrolases ofthe beta-lactamase superfamily II67863396463476414554141454614124f-1633973634762469FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG679634815635330145456314540481740r-263489363501630DnaXLDNA polymerase 
III680635934636071145344414533071739r-263598263606027CUncharacterized Fe—S protein68163714363745114522351451927844f-363732963742529ArtIEABC-type amino acid transportsystem68263748763806214518911451316474f-2637520638036145SPredicted membrane protein683638134639000145124414503781351r-1638206638998409SPredicted membrane proteins68463955363965114498251449727125f-1685639626640396144975214489822067r-3639641640298219CbiQPABC-type cobalt transport system686640393641181144898514481971350r-1640393641167299CbiOPABC-type cobalt transport system687641204641923144817414474552066r-364143864190984BirAHBiotin-(acetyl-CoA carboxylase)ligase68864197264249014474061446888475f-2641981642464146BioYRUncharacterized ACR689642511643098144686714462801349r-1642511643081162MobAHMolybdopterin-guaninedinucleotide biosynthesis protein A69064320964367014461691445708845f-364322164339831HHT1LHistones H3 and H4691644598646496144478014428821738r-26445986464881164DAP2EDipeptidylaminopeptidases/acylaminoacyl-peptidases69264757365001714418051439361476f-26475826500061260RPredicted P-loop ATPase fused toan acetyltransferase69365007865058414393001438794477f-2650099650570241SUncharacterized ACR69465058765108714387911438291126f-1650656651073236SUncharacterized ACR69565119865234014381801437038846f-3651285652236390TbpAHABC-type iron/thiamine transportsystems696652343653548143703514358302065r-3652400653513272SsnAFRCytosine deaminase and relatedmetal-dependent hydrolasesCOG0402 SsnA69765378465507914355941434299847f-3653784655065724AsnSJAspartyl/asparaginyl-tRNAsynthetases698655937657688143344114316902064r-3655958657119612TgtJQueuine/archaeosinetRNA-ribosyltransferase699657722658642143165614307362063r-3657722658622210PitAPPhosphate/sulphate permeases700658773659825143060514295531737r-2658797659823362MGlycosyltransferases701659850660155142952814292231736r-265985066012059RPredicted acetyltransferase70266024666441814291321424960848f-3662859664401827LhrRLhr-like 
helicases70366449866558614248801423792127f-1664582665584608GapAGGlyceraldehyde-3-phosphatedehydrogenase/erythrose-4-phosphatedehydrogenase70466562766599514237511423383478f-266575366590028ThrAEHomoserine dehydrogenase705666332666616142304614227622062r-3666341666608120SUncharacterized ACR706666618667169142276014222091735r-2666663667155258SUncharacterized ACR70766712366717614222551422202128f-1708667218667724142216014216541734r-266733266762953KPredicted transcriptionalregulators70966782466948814215541419890849f-366791466880536RPredicted drug exporters of theRND superfamily71066973567191814196431417460850f-3670269671868169RPredicted drug exporters of theRND superfamily71167370767398514156711415393851f-367370767392632SUncharacterized BCR71267403367491114153451414467479f-267403967485879RPredicted permeases71367495767597014144211413408480f-2674957675962570FrvXGCellulase M and related proteins71467642567729414129531412084852f-3676440677232177RPredicted ATPase of the AAAsuperfamily715677302678150141207614112281348r-1677314678145374XerCLIntegrase716678143679063141123514103152061r-367832967898945KPredicted transcriptionalregulators717679100679813141027814095652060r-3679127679811161SfsAGSugar fermentation stimulationprotein (uncharacterized)71867985067992414095281409454481f-271968015668047014092221408908482f-268023168028528RPredicted DNA-binding proteinswith PD1-like DNA-bindingmotif72068060668175414087721407624483f-2680708681752617FrvXGCellulase M and related proteins72168240168249614069771406882853f-3722682446682799140693214065791733r-268251268264128SUncharacterized ACR72368271768471114066611404667129f-1682804684694883DinGLRad3-related DNA helicases724684698685174140468014042042059r-368471968490233LAdenine-specific DNA methylase725686253686873140312514025051732r-2686274686841135GlpGRUncharacterized membraneprotein (homolog of Drosophilarhomboid)726686863687633140251514017451347r-1686875687622273SuhBGArchaeal 
fructose-1727687638688447140174014009312058r-3687644688424265SPredicted membrane proteins72868851668957114008621399807130f-1688525689569528GldACGlycerol dehydrogenase andrelated enzymes72968956869002913998101399349854f-3689601690024210SUncharacterized ArCR730690316690513139906213988651346r-169033469050227AceFCDihydrolipoamideacyltransferases731690550691353139882813980251345r-1690550691351381SUncharacterized ACR732691387692820139799113965581344r-169146269179834SppANOPeriplasmic serine proteases(ClpP class) COG0616 SppA733692817694928139656113944501731r-2694260694908170McrBLGTPase subunit of restrictionendonuclease734694986695405139439213939731730r-2694986695361160SUncharacterized ArCR735695410696654139396813927241343r-1695410696643487SUncharacterized ArCR736696651697808139272713915701729r-2696663697806699LDNA topoisomerase VI737697801699510139157713898681342r-1697807699451866LDNA topoisomerase VI738699507700274138987113891041728r-2699561700224275RPredicted RNA-binding protein(contains KH domains)739700228701004138915013883741341r-1700237700993413RIO1TPredicted serine/threonine proteinkinases740701037701399138834113879791727r-2701061701394198InfAJTranslation initiation factor IF-174170155070235913878281387019855f-3701577702336277ZnuCPABC-type Mn/Zn transportsystems74270235670317713870221386201484f-2702356703175241ZnuBPABC-type Mn2+/Zn2+ transportsystems74370315270386813862261385510856f-3703182703782262RnhBLRibonuclease HII744703837705249138554113841291340r-170429970457851PMT1ODolichyl-phosphate-mannose--proteinO-mannosyl transferasePMT174570530970646013840691382918857f-3705321706449537SmtAQRSAM-dependentmethyltransferases COG0500SmtA746706455706655138292313827231726r-270645570665029AroBE3-dehydroquinate synthetase74770673970855613826391380822485f-2706748708554805GlmSMGlucosamine 6-phosphatesynthetase74870855871156913808201377809858f-3708582711462590RUncharacterized membraneprotein74971185971244013775191376938131f-171198571231530RpoEKDNA-directed RNA 
polymerasespecialized sigma subunits750712445713191137693313761872057r-3712517713177349AdkFAdenylate kinase and relatedkinases75171314271363313762361375745859f-371328071359243SmcDChromosome segregationATPases752713693714955137568513744232056r-3713726714947684CUncharacterized flavoproteins753715024715470137435413739081339r-1715024715438110AhpCOPeroxiredoxin754715543716427137383513729511338r-1715597716419370PorBCPyruvate:ferredoxinoxidoreductase and related2-oxoacid:ferredoxinoxidoreductases755716424718136137295413712421725r-2717030718128453PorACPyruvate:ferredoxinoxidoreductase and related2-oxoacid:ferredoxinoxidoreductases75671831771933913710611370039860f-3718353718866213MsrAOPeptide methionine sulfoxidereductase75771950771978813698711369590486f-271956771973233AvtAEPLP-dependent aminotransferases758719790720593136958813687851724r-271997372052832XynBGBeta-xylosidase759720689721426136868913679522055r-37207047209623576072178972230413675891367074132f-172187072229970SUncharacterized ACR761722344722481136703413668971337r-172235972247032VacBKExoribonucleases76272259272311613667861366262861f-372259572308777SUncharacterized ACR763723142724314136623613650641336r-1723160724303528764724419725573136495913638051723r-2724488725553393HcaDRUncharacterizedNAD(FAD)-dependentdehydrogenases76572570472624913636741363129133f-1725713726238271SPredicted membrane protein76672645872664313629201362735487f-272646772661469RAD55TRecA-superfamily ATPasesimplicated in signal transduction76772874572879813606331360580862f-3768729082729786136029613595921335r-1729259729748167LrpKTranscriptional regulators76972984473098913595341358389134f-1729859730951395PurKFPhosphoribosylaminoimidazolecarboxylase (NCAIR synthetase)77073096173148513584171357893488f-2730961731462193PurEFPhosphoribosylcarboxyaminoimidazole(NCAIR) mutase77173158673398513577921355393863f-3731799733923812ZntAPCation transport ATPases77273401673433613553621355042864f-373404673425950TrxAOCThiol-disulfide isomerase andthioredoxins COG0526 
TrxA773734349734939135502913544391722r-2734349734931238NfnBCNitroreductase77473521573576013541631353618489f-2735215735749288NfnBCNitroreductase77573576273594113536161353437865f-373579873592429KefBPKef-type K+ transport systems776735965737146135341313522322054r-3736043737078368ACR3PArsenite efflux pump ACR3 andrelated permeases77773721073768313521681351695490f-2737234737618110WzbTProtein-tyrosine-phosphatase778737822739696135155613496822053r-37378287396791055CAldehyde:ferredoxinoxidoreductase779739687740523134969113488551334r-1739711740518459ARA1RAldo/keto reductases78074058474129413487941348084135f-1740716741283283SUncharacterized ACR78174132974154113480491347837491f-274141974151827CUncharacterized conservedprotein containing aferredoxin-like domain78274192074208413474581347294492f-274194474207628SdrCTPredicted secreted proteincontaining a PDZ domain78374268474337613466941346002136f-1742684743185259LPredicted transposases78474342474360913459541345769866f-374348174358628VapCRPredicted nucleic acid-bindingprotein785743587744603134579113447751333r-1743596744598558CobTHNaMN:DMBphosphoribosyltransferase78674456074537213448181344006493f-274469874520870PflAOPyruvate-formate lyase-activatingenzyme78774536974682613440091342552137f-1745381746665377CobQHCobyric acid synthase788746823747761134255513416171721r-274686274717137SurAOParvulin-like peptidyl-prolylisomerase789747766748353134161213410251332r-1747778748315251HGTP:adenosylcobinamide-phosphateguanylyltransferase790748338749033134104013403451720r-2748338749013272CobSHCobalamin-5-phosphate synthase(Cobalamin synthase)791749030749443134034813399352052r-3749042749438201PgpAIPhosphatidylglycerophosphatase A792749440749877133993813395011331r-174954874962928SUncharacterized ACR793750208750714133917013386641330r-1750211750661238RPredicted ATPases of 
PP-loopsuperfamily79475195475296713374241336411138f-1751999752965486HisCEHistidinol-phosphateaminotransferase/Tyrosineaminotransferase79575304675411013363321335268139f-1753067754081386FecBPABC-type Fe3+-siderophorestransport systems796754166755410133521213339682051r-3754226755408708GPredicted phosphoglyceratemutase79775549675643113338821332947867f-3755586756408195ECM27PCa2+/Na+ antiporter79875647775696813329011332410868f-3756477756957304HitFGRDiadenosine tetraphosphate(Ap4A) hydrolase and other HITfamily hydrolases COG0537 Hit799756958757629133242013317491329r-175699475715632RPredicted amidohydrolase800757712758458133166613309202050r-3757733758453417THY1FPredicted alternative thymidylatesynthase80175868975964513306891329733140f-1758698759640549ArgFEOrnithine carbamoyltransferase80275976276069113296161328687869f-3759762760689549SunJtRNA and rRNAcytosine-C5-methylases803760688761674132869013277042049r-376072476113533HslUOATP-dependent protease80476232776341813270511325960870f-3762327763383518LYS9ESaccharopine dehydrogenase andrelated proteins80576339676405813259821325320141f-1763399764041323Mra1SUncharacterized ACR806765200765316132417813240622048r-380776563776604713237411323331142f-1765637766045238EfpJTranslation elongation factorP/translation initiation factoreIF-5A80876613876668313232401322695143f-176619576650434SUncharacterized ACR80976668576797413226931321404494f-2766703767969542ArsBPNa+/H+ antiporter NhaD andrelated arsenite permeases81076797676843413214021320944871f-3767985768432223UspATUniversal stress protein UspA andrelated nucleotide-bindingproteins81176847776934313209011320035872f-3768486769323387SpeBEArginase/agmatinase/formimionoglutamatehydrolase81276945976996213199191319416144f-1769459769954190RCBS domains81376995077126913194281318109873f-3770010771258553KefBPKef-type K+ transport systems814771283771807131809513175711328r-177133477146931ZntAPCation transport ATPases81577182077354113175581315837145f-1772069773122177EriCPChloride channel protein 
EriC81677354377481713158351314561495f-2773552774800647SUncharacterized ACR81777483877508913145401314289146f-177484777506652AbrBKRegulators ofstationary/sporulation geneexpression81877549377642213138851312956496f-2775493776399327ThiLHThiamine monophosphate kinase81977648077764313128981311735497f-2776480777614382RfaGMPredicted glycosyltransferases82077817677834613112021311032874f-377817677832962CDA1GPredicted xylanase/chitindeacetylase82177836277941113110161309967875f-3778362779409622PflAOPyruvate-formate lyase-activatingenzyme82277933678024713100421309131498f-277938477956432RUncharacterized protein82378043878227613089401307102876f-378208578220534LArchaea-specific RecJ-likeexonuclease82478232978310813070491306270147f-178277378298629GgtEGamma-glutamyltranspeptidase825783098784927130628013044512047r-3783182784919922CUncharacterized Fe-Soxidoreductases826785382786104130399613032741719r-2785382786081310KsgAJDimethyladenosine transferase(rRNA methylation)827786218786838130316013025402046r-3786218786833337JPredicted RNA-binding protein828786930787286130244813020921718r-2786936787230135SUncharacterized ArCR829787283787609130209513017692045r-3787313787604189RPL21AJRibosomal protein L21E830787749788930130162913004481717r-2787749788916492JPredicted pseudouridylatesynthase83178897578926813004031300110499f-2788975789266138SUncharacterized ArCR832789317789460130006112999182044r-378935078944027RfeMUDP-N-acetylmuramylpentapeptidephosphotransferase/UDP-N-acetylglucosamine-1-phosphatetransferase833789852790022129952612993561716r-278985578999356NfiLDeoxyinosine 3′endonuclease(endonuclease V)834790438791058129894012983201327r-1790438791038264LTranslin (RNA-binding protein83579067279073712987061298641148f-183679111779246912982611296909500f-2791156792467683AnsBEJL-asparaginase/archaealGlu-tRNAGln amidotransferasesubunit D COG0252 AnsB83779250579267512968731296703149f-179250579261034SUncharacterized ArCR83879266579311412967131296264501f-279266579307977RPredicted nucleic 
acid-bindingprotein83979311179500012962671294378150f-1793111794998997GatEJArchaeal Glu-tRNAGlnamidotransferase subunit E(contains GAD domain)84079503879554412943401293834502f-279535679549134FtsWDBacterial cell division membraneprotein841796310797536129306812918422043r-3796310797534710HMG1IHydroxymethylglutaryl-CoAreductase842797552798316129182612910622042r-3797570798311335DATPases involved in chromosomepartitioning84379847379953412909051289844503f-2798482799517596TdhERThreonine dehydrogenase andrelated Zn-dependentdehydrogenases COG1063 Tdh84479961079985812897681289520504f-279962579983855SUncharacterized ACR84579984880032712895301289051877f-379984880032591RPredicted nucleic acid-bindingprotein846800324800425128905412889532041r-380032480040226UupRATPase components of ABCtransporters with duplicatedATPase domains847800450800518128892812888602040r-384880091980242412884591286954878f-3800919802422753PheSJPhenylalanyl-tRNA synthetasealpha subunit84980243680267212869421286706505f-280247880264932RPredicted ATPase of the AAAsuperfamily85080266980289012867091286488151f-180281680287626RPredicted RNA-binding protein(contains KH domains)85180288780329712864911286081879f-380288780327745RPredicted nucleic acid-bindingprotein85280329480502712860841284351506f-2803303805010933PheTJPhenylalanyl-tRNA synthetasebeta subunit85380522080606812841581283310507f-2805265806051266TruAJPseudouridylate synthase (tRNApsi55)854806024807415128335412819632039r-3806030807359722SSL2LDNA or RNA helicases ofsuperfamily II85580736680874512820121280633880f-3807480808743673UbiDH3-polyprenyl-4-hydroxybenzoatedecarboxylase and relateddecarboxylases856808746809576128063212798021715r-280887580904330RimIRAcetyltransferases857810847811266127853112781121326r-1810856811252127LPredicted transposase85881136781160612780111277772508f-281139181153230HfqRUncharacterized ACR85981160881235112777701277027881f-3811620812340392MobBHMolybdopterin-guaninedinucleotide biosynthesis 
protein86081263581364812767431275730152f-1812755813613280RPredicted periplasmic bindingprotein86181365281411312757261275265153f-181373081388932UvrBLHelicase subunit of the DNAexcision repair complex86281407781641912753011272959882f-3814140816300432SIntegral membrane protein86381650181665012728771272728883f-386481675481772812726241271650154f-1816754817711403RPredicted archaeal sugar kinases86581772581851912716531270859884f-381774681796233FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG86681862381946812707551269910155f-181865081930149NosYRABC-type transport systeminvolved in multi-copper enzymematuration86781947582039512699031268983156f-1819475820381317CcmAQABC-type multidrug transportsystem868820410821180126896812681981714r-2820458821160412CAcyl-CoA synthetase (NDPforming)869821146822570126823212668081325r-1821146822553724CAcyl-CoA synthetase (NDPforming)870822810823514126656812658641713r-2822810823500395RPredicted nucleotidyltransferase87182359982402112657791265357885f-382381582394729ARA1RAldo/keto reductases872824015825196126536312641822038r-3824069825182278NrfGRTPR-repeat-containing proteins873825266826294126411212630842037r-3825275826289485SUA5JPutative translation factor (SUA5)874826379827413126299912619652036r-3826379827411358RfaGMPredicted glycosyltransferases875827435828904126194312604742035r-3827453828887543AsnBEAsparagine synthase(glutamine-hydrolyzing)876828985829728126039312596501324r-1828985829720355RGTPases877829725830471125965312589071712r-2829734830466361DATPases involved in chromosomepartitioning87883055183236812588271257010157f-1830560832363924RATPases of the PilT family87983233783303512570411256343509f-2832469833018196MafDNucleotide-binding proteinimplicated in inhibition of septumformation880836010837260125336812521181711r-2836019837258744GCD1MJNucleoside-diphosphate-sugarpyrophosphorylases involved inlipopolysaccharidebiosynthesis/translation initiationfactor eIF2B subunits 
COG1208GCD1881837335837601125204312517772034r-383734183745835MCM2LPredicted ATPase involved inreplication control882837647839638125173112497402033r-3837677839612820FeoBPFerrous ion uptake system proteinFeoB (predicted GTPase)883839649839885124972912494931710r-283966483988383FeoAPProtein88484009784047112492811248907158f-184010384027129RfeMUDP-N-acetylmuramylpentapeptidephosphotransferase/UDP-N-acetylglucosamine-1-phosphatetransferase88584050384132112488751248057510f-2840503841277389MesJDPredicted ATPase of the PP-loopsuperfamily implicated in cellcycle control88684129384228812480851247090886f-3841305842244209HypEOHydrogenase maturation factor88784227584262812471031246750159f-184237784261750RPredicted nucleotidyltransferases888842986844059124639212453191323r-1843040843955457RPredicted RNA-binding proteins889844320844517124505812448611709r-2890844597845652124478112437261322r-1844597845650473PepPEXaa-Pro aminopeptidase89184572584638712436531242991160f-184572884627796RPredicted hydrolases of the HADsuperfamily89284642284672712429561242651511f-2846500846725100JRibosomal protein L35AE/L33A89384677384790312426051241475512f-2846773847895484TRM1JN289484789684899012414821240388887f-3847896848988450SUncharacterized membraneproteins895848774848884124060412404942032r-384877784887026RPredicted alternative tryptophansynthase beta-subunit (paralog ofTrpB)896848987849100124039112402782031r-3897849375849638124000312397401708r-284938784954043UvrCLNuclease subunit of theexcinuclease complex898849669851036123970912383421707r-2849678851004614NorMQNa+-driven multidrug effluxpump899851134851325123824412380531321r-1851134851317115RPL37AJRibosomal protein L37E900851346851582123803212377961706r-2851352851574114LSM1KSmall nuclear ribonucleoprotein(snRNP) homolog90185173885403512376401235343513f-2852581854012262AmyAGGlycosidases902851818851883123756012374951320r-190385412685584112352521233537514f-2854129855836978GRS1JGlycyl-tRNA 
synthetase90485588885665212334901232726888f-3855975856650291RPredicted permeases905856637856798123274112325802030r-385663785676327PotBEABC-type spermidine/putrescinetransport system90685715185822712322271231151889f-3857238858216375LPredicted DNA modificationmethylase90785872885893412306501230444515f-290886008086034012292981229038161f-186012886026629MrcAMMembrane carboxypeptidase(penicillin-binding protein)909860404861084122897412282941319r-1860443861079402RPredicted metal-dependenthydrolases related to alanyl-tRNAsynthetase HxxxH domain910861133862545122824512268331318r-186247486254340OppAEPABC-typedipeptide/oligopeptide/nickeltransport systems911862729864021122664912253571317r-1862744864004586GltPCNa+/H+-dicarboxylate symporters912864121864819122525712245591316r-1864133864793199BirAHBiotin-(acetyl-CoA carboxylase)ligase91386500286545412243761223924890f-386510786531430RUncharacterized FAD-dependentdehydrogenases91486538786630412239911223074162f-1865489866302457FbaBGDhnA-type fructose-191586649686831312228821221065891f-3866535868305800PycACPyruvate carboxylase916868296868430122108212209481705r-286833886841326SUncharacterized ACR91786844487022212209341219156163f-1868483870106640CstATCarbon starvation protein91887026387054712191151218831516f-287037487053330OmpRTKResponse regulators consisting ofa CheY-like receiver domain anda HTH DNA-binding domainCOG0745 OmpR91987053287084012188461218538164f-187058687076929OppFEPABC-typedipeptide/oligopeptide/nickeltransport system92087084287184612185361217532517f-2870851871838451ArsAPArsenite transporting ATPase92187183687212012175421217258892f-387184587207938PaaDRPutative aromatic ringhydroxylating enzyme92287194287277512174361216603165f-187257887275833SmcDChromosome segregationATPases92387283387311712165451216261166f-187286387306150AbrBKRegulators ofstationary/sporulation geneexpression92487352487430612158541215072518f-2873530874292400PldBILysophospholipase92587470787494012146711214438893f-387474987490226DIntracellular septation 
protein A
926 | 875022 | 875840 | 1214356 | 1213538 | 894f-3 | 875025 | 875277 | 33 | ValS | J | Valyl-tRNA synthetase
927 | 875837 | 876856 | 1213541 | 1212522 | 2029r-3 | 875837 | 876854 | 603 | PurA | F | Adenylosuccinate synthase
928 | 877020 | 877235 | 1212358 | 1212143 | 895f-3 | 877107 | 877197 | 31 | Hmp | C | Flavodoxin reductases (ferredoxin-NADPH reductases) family 1
929 | 877271 | 878197 | 1212107 | 1211181 | 519f-2 | 877274 | 878180 | 435 | WcaG | MG | Nucleoside-diphosphate-sugar epimerases COG0451 WcaG
930 | 878209 | 878658 | 1211169 | 1210720 | 1315r-1 | 878317 | 878650 | 145 | GIM5 | O | Predicted prefoldin
931 | 878718 | 878765 | 1210660 | 1210613 | 896f-3 | | | | | |
932 | 878886 | 879182 | 1210492 | 1210196 | 897f-3 | | | | | |
933 | 879211 | 880500 | 1210167 | 1208878 | 167f-1 | 879229 | 880453 | 249 | | |
934 | 880506 | 881387 | 1208872 | 1207991 | 898f-3 | 880518 | 881385 | 365 | EutG | C | Alcohol dehydrogenase IV
935 | 881550 | 881654 | 1207828 | 1207724 | 899f-3 | 881550 | 881646 | 43 | EutG | C | Alcohol dehydrogenase IV
936 | 882812 | 882925 | 1206566 | 1206453 | 2028r-3 | | | | | |
937 | 885694 | 886539 | 1203684 | 1202839 | 1314r-1 | 885694 | 886495 | 110 | | |
938 | 886567 | 887178 | 1202811 | 1202200 | 1313r-1 | 886657 | 887176 | 174 | | |
939 | 887275 | 887487 | 1202103 | 1201891 | 168f-1 | 887284 | 887434 | 40 | | S | Uncharacterized ArCR
940 | 887717 | 887920 | 1201661 | 1201458 | 520f-2 | 887720 | 887915 | 54 | | R | Predicted nucleic acid-binding protein
941 | 887924 | 890701 | 1201454 | 1198677 | 521f-2 | 887924 | 890642 | 1093 | Lhr | R | Lhr-like helicases
942 | 891114 | 891398 | 1198264 | 1197980 | 900f-3 | 891159 | 891396 | 31 | Nfo | L | Endonuclease IV
943 | 891434 | 895009 | 1197944 | 1194369 | 522f-2 | 891443 | 894968 | 1392 | Smc | D | Chromosome segregation ATPases
944 | 895013 | 895678 | 1194365 | 1193700 | 523f-2 | 895022 | 895667 | 248 | | S | Uncharacterized ACR
945 | 895675 | 896097 | 1193703 | 1193281 | 1312r-1 | 895888 | 896050 | 30 | AcyP | C | Acylphosphatases
946 | 896626 | 899040 | 1192752 | 1190338 | 169f-1 | 896632 | 898126 | 684 | MPH1 | L | ERCC4-like helicases
947 | 899156 | 900004 | 1190222 | 1189374 | 2027r-3 | 899165 | 899987 | 342 | DppA | E | Uncharacterized protein associated with dipeptide transport
948 | 900134 | 900385 | 1189244 | 1188993 | 524f-2 | 900230 | 900314 | 30 | MglA | G | ABC-type sugar (aldose) transport system
949 | 901696 | 902574 | 1187682 | 1186804 | 1311r-1 | 901891 | 901987 | 30 | TenA | K | Putative transcription activator
950 | 902700 | 903458 | 1186678 | 1185920 | 1704r-2 | 902703 | 903450 | 387 | | R | Predicted phosphate-binding enzymes
951 | 903912 | 904115 | 1185466 | 1185263 | 1703r-2 | 903912 | 904077 | 45 | | S | Uncharacterized ArCR
952 | 904127 | 904555 | 1185251 | 1184823 | 2026r-3 | 904127 | 904520 | 173 | | S | Uncharacterized
ACR
953 | 904610 | 905026 | 1184768 | 1184352 | 525f-2 | 904871 | 904967 | 28 | TFA1 | K | Transcription initiation factor IIE
954 | 905105 | 906898 | 1184273 | 1182480 | 526f-2 | 905105 | 906887 | 998 | | R | RNase L inhibitor homolog
955 | 906982 | 907974 | 1182396 | 1181404 | 170f-1 | 906994 | 907963 | 387 | HypE | O | Hydrogenase maturation factor
956 | 907975 | 908217 | 1181403 | 1181161 | 1310r-1 | 907975 | 908215 | 98 | | S | Uncharacterized ACR
957 | 908370 | 909260 | 1181008 | 1180118 | 1702r-2 | 908463 | 909246 | 221 | | L | Predicted type IV restriction endonuclease
958 | 909301 | 910116 | 1180077 | 1179262 | 171f-1 | 909313 | 910093 | 189 | | R | Predicted glutamine amidotransferase
959 | 910097 | 910516 | 1179281 | 1178862 | 527f-2 | 910106 | 910514 | 190 | | R | CBS domains
960 | 910513 | 912024 | 1178865 | 1177354 | 172f-1 | 910531 | 912016 | 744 | Icc | R | Predicted phosphohydrolases
961 | 912021 | 912893 | 1177357 | 1176485 | 1701r-2 | 912021 | 912879 | 311 | | G | 2-Phosphoglycerate kinase
962 | 912890 | 914188 | 1176488 | 1175190 | 2025r-3 | 913589 | 913814 | 45 | | S | Uncharacterized ACR
963 | 914305 | 914493 | 1175073 | 1174885 | 173f-1 | 914389 | 914491 | 27 | HHT1 | L | Histones H3 and H4
964 | 914711 | 915121 | 1174667 | 1174257 | 528f-2 | 914711 | 915119 | 153 | ArsR | K | Predicted transcriptional regulators
965 | 915118 | 916428 | 1174260 | 1172950 | 174f-1 | 915148 | 915403 | 37 | | S | Uncharacterized ArCR
966 | 916589 | 917257 | 1172789 | 1172121 | 529f-2 | 916604 | 917246 | 142 | | S | Uncharacterized BCR
967 | 917348 | 918352 | 1172030 | 1171026 | 530f-2 | 917357 | 918311 | 400 | | |
968 | 918655 | 918705 | 1170723 | 1170673 | 1309r-1 | | | | | |
969 | 918719 | 919171 | 1170659 | 1170207 | 2024r-3 | 918779 | 919163 | 149 | | S | Uncharacterized ACR
970 | 919305 | 923264 | 1170073 | 1166114 | 901f-3 | 920052 | 920499 | 60 | | L | Micrococcal nuclease (thermonuclease) homologs
971 | 924116 | 924814 | 1165262 | 1164564 | 2023r-3 | 924128 | 924773 | 140 | RAD55 | T | RecA-superfamily ATPases implicated in signal transduction
972 | 925010 | 927244 | 1164368 | 1162134 | 531f-2 | 925019 | 926708 | 1043 | MetG | J | Methionyl-tRNA synthetase
973 | 927249 | 927578 | 1162129 | 1161800 | 1700r-2 | 927339 | 927576 | 92 | | S | Uncharacterized membrane-associated protein/domain
974 | 928257 | 929309 | 1161121 | 1160069 | 1699r-2 | 928353 | 929178 | 45 | SbcC | L | ATPase involved in DNA repair
975 | 929424 | 929705 | 1159954 | 1159673 | 1698r-2 | 929538 | 929697 | 33 | Dcp | E | Zn-dependent oligopeptidases
976 | 930480 | 931013 | 1158898 | 1158365 | 1697r-2 | 930486 | 930996 | 219 | WecD | KR | Histone acetyltransferase HPA2 and related acetyltransferases COG0454
WecD
977 | 931103 | 931576 | 1158275 | 1157802 | 532f-2 | 931145 | 931556 | 147 | Bcp | O | Peroxiredoxin
978 | 931594 | 932070 | 1157784 | 1157308 | 175f-1 | 931651 | 932068 | 190 | | S | Uncharacterized ACR
979 | 932526 | 933086 | 1156852 | 1156292 | 902f-3 | 932535 | 933084 | 180 | | S | Uncharacterized ACR
980 | 933128 | 933430 | 1156250 | 1155948 | 533f-2 | 933128 | 933428 | 153 | | S | Uncharacterized ACR
981 | 933728 | 933904 | 1155650 | 1155474 | 534f-2 | 933779 | 933902 | 32 | | S | Uncharacterized ACR
982 | 933919 | 934392 | 1155459 | 1154986 | 1308r-1 | 933925 | 934387 | 75 | | S | Uncharacterized ACR
983 | 934564 | 935379 | 1154814 | 1153999 | 176f-1 | 934612 | 935371 | 180 | MscS | M | Small-conductance mechanosensitive channel
984 | 935513 | 936664 | 1153865 | 1152714 | 2022r-3 | 935549 | 936659 | 541 | | R | Predicted Fe-S oxidoreductases
985 | 936666 | 936944 | 1152712 | 1152434 | 1696r-2 | 936696 | 936942 | 94 | MoaD | H | Molybdopterin converting factor
986 | 936987 | 938822 | 1152391 | 1150556 | 1695r-2 | 937005 | 938814 | 977 | | C | Aldehyde:ferredoxin oxidoreductase
987 | 938954 | 940192 | 1150424 | 1149186 | 535f-2 | 938969 | 940178 | 572 | | S | Uncharacterized ACR
988 | 940239 | 940469 | 1149139 | 1148909 | 903f-3 | | | | | |
989 | 940803 | 940937 | 1148575 | 1148441 | 904f-3 | | | | | |
990 | 940934 | 942055 | 1148444 | 1147323 | 536f-2 | 940943 | 942050 | 604 | | R | Uncharacterized proteins of the AP superfamily
991 | 942591 | 942917 | 1146787 | 1146461 | 905f-3 | 942627 | 942897 | 93 | | R | Predicted nucleotidyltransferases
992 | 942914 | 943306 | 1146464 | 1146072 | 2021r-3 | 943067 | 943286 | 28 | TrmA | J | SAM-dependent methyltransferases related to tRNA(uracil-5-)-methyltransferase
993 | 943357 | 943545 | 1146021 | 1145833 | 1307r-1 | 943357 | 943528 | 32 | PyrE | F | Orotate phosphoribosyltransferase
994 | 943533 | 943778 | 1145845 | 1145600 | 1694r-2 | 943542 | 943677 | 46 | AbrB | K | Regulators of stationary/sporulation gene expression
995 | 943889 | 944536 | 1145489 | 1144842 | 2020r-3 | 943889 | 944534 | 335 | RpsG | J | Ribosomal protein S7
996 | 944542 | 944994 | 1144836 | 1144384 | 1306r-1 | 944542 | 944992 | 263 | RpsL | J | Ribosomal protein S12
997 | 944996 | 945436 | 1144382 | 1143942 | 2019r-3 | 944999 | 945434 | 255 | NusA | K | Transcription terminator
998 | 945433 | 945741 | 1143945 | 1143637 | 1305r-1 | 945436 | 945727 | 145 | RPL30 | J | Ribosomal protein L30E
999 | 945755 | 946939 | 1143623 | 1142439 | 2018r-3 | 945764 | 946931 | 652 | RpoC | K | DNA-directed RNA polymerase beta′ subunit/160 kD subunit (split gene in archaea and Syn)
1000 | 946932 | 948164 | 1142446 | 1141214 | 1693r-2 | 947001 | 948162 | 674 | RpoC | K | DNA-directed RNA polymerase beta′ subunit/160 kD subunit (split gene in archaea
and Syn)
1001 | 948079 | 949662 | 1141299 | 1139716 | 1304r-1 | 948088 | 949645 | 961 | RpoC | K | DNA-directed RNA polymerase beta′ subunit/160 kD subunit (split gene in archaea and Syn)
1002 | 949659 | 953030 | 1139719 | 1136348 | 1692r-2 | 949665 | 953028 | 1967 | RpoB | K | DNA-directed RNA polymerase beta subunit/140 kD subunit (split gene in Mjan
1003 | 953048 | 953296 | 1136330 | 1136082 | 2017r-3 | 953048 | 953294 | 118 | RPB5 | K | DNA-directed RNA polymerase
1004 | 953495 | 954190 | 1135883 | 1135188 | 2016r-3 | 953510 | 954185 | 408 | TrxA | OC | Thiol-disulfide isomerase and thioredoxins COG0526 TrxA
1005 | 954301 | 955020 | 1135077 | 1134358 | 177f-1 | 954316 | 955009 | 290 | | K | Predicted transcriptional regulators
1006 | 955204 | 956391 | 1134174 | 1132987 | 178f-1 | 955213 | 956347 | 629 | FixC | C | Dehydrogenases (flavoproteins)
1007 | 956375 | 956533 | 1133003 | 1132845 | 2015r-3 | 956402 | 956498 | 26 | | S | Uncharacterized BCR
1008 | 957270 | 957638 | 1132108 | 1131740 | 906f-3 | 957477 | 957579 | 28 | | R | Predicted integral membrane protein
1009 | 957640 | 961329 | 1131738 | 1128049 | 1303r-1 | 957649 | 958597 | 493 | TopA | L | Topoisomerase IA
1010 | 961407 | 962324 | 1127971 | 1127054 | 907f-3 | 961689 | 961947 | 35 | FepC | PH | ABC-type cobalamin/Fe3+-siderophores transport systems
1011 | 962372 | 962575 | 1127006 | 1126803 | 537f-2 | 962372 | 962573 | 108 | ThiS | H | Sulfur transfer protein involved in thiamine biosynthesis
1012 | 962593 | 963804 | 1126785 | 1125574 | 1302r-1 | 962605 | 963799 | 691 | AvtA | E | PLP-dependent aminotransferases
1013 | 964168 | 964827 | 1125210 | 1124551 | 179f-1 | 964495 | 964822 | 139 | | S | Uncharacterized membrane protein
1014 | 964831 | 965430 | 1124547 | 1123948 | 1301r-1 | 965176 | 965329 | 36 | SbcC | L | ATPase involved in DNA repair
1015 | 965603 | 965896 | 1123775 | 1123482 | 538f-2 | 965612 | 965894 | 188 | RPL42A | J | Ribosomal protein L44E
1016 | 965901 | 966098 | 1123477 | 1123280 | 908f-3 | 965901 | 966096 | 128 | RPS27A | J | Ribosomal protein S27E
1017 | 966166 | 967002 | 1123212 | 1122376 | 180f-1 | 966175 | 966955 | 461 | SUI2 | J | Translation initiation factor eIF2alpha
1018 | 967002 | 967181 | 1122376 | 1122197 | 909f-3 | 967002 | 967176 | 120 | | J | Predicted Zn-ribbon RNA-binding protein
1019 | 967184 | 967987 | 1122194 | 1121391 | 539f-2 | 967184 | 967985 | 394 | | R | Uncharacterized proteins of the ATP-grasp superfamily
1020 | 968134 | 968757 | 1121244 | 1120621 | 181f-1 | 968143 | 968734 | 142 | CbiM | H | Cobalamin biosynthesis protein CbiM
1021 | 968754 | 969002 | 1120624 | 1120376 | 910f-3 | 968760 | 968970 | 33 | CbiM | H | Cobalamin biosynthesis
protein CbiM
1022 | 968995 | 969663 | 1120383 | 1119715 | 182f-1 | 969193 | 969643 | 72 | CbiQ | P | ABC-type cobalt transport system
1023 | 969660 | 970463 | 1119718 | 1118915 | 911f-3 | 969660 | 970404 | 233 | CbiO | P | ABC-type cobalt transport system
1024 | 970555 | 971892 | 1118823 | 1117486 | 183f-1 | 971431 | 971527 | 33 | AprE | O | Subtilisin-like serine proteases
1025 | 971952 | 973340 | 1117426 | 1116038 | 1691r-2 | 971970 | 973332 | 786 | CpsG | G | Phosphomannomutase
1026 | 973366 | 974772 | 1116012 | 1114606 | 1300r-1 | 973375 | 974356 | 455 | CpsB | M | Mannose-1-phosphate guanylyltransferase
1027 | 974823 | 976277 | 1114555 | 1113101 | 1690r-2 | 975489 | 975720 | 32 | | K | RNA-binding proteins (RRM domain)
1028 | 976234 | 976803 | 1113144 | 1112575 | 1299r-1 | 976240 | 976795 | 340 | | GR | Thermophilic glucose-6-phosphate isomerase and related metalloenzymes COG2140 -
1029 | 976871 | 977053 | 1112507 | 1112325 | 2014r-3 | 976880 | 977042 | 59 | NusA | K | Transcription terminator
1030 | 977082 | 977765 | 1112296 | 1111613 | 1689r-2 | 977082 | 977730 | 174 | | S | Uncharacterized ACR
1031 | 977762 | 978706 | 1111616 | 1110672 | 2013r-3 | 977762 | 978671 | 401 | ElaC | R | Metal-dependent hydrolases of the beta-lactamase superfamily III
1032 | 978776 | 979747 | 1110602 | 1109631 | 540f-2 | 978791 | 979706 | 234 | NrfG | R | TPR-repeat-containing proteins
1033 | 979826 | 981100 | 1109552 | 1108278 | 541f-2 | 979841 | 981095 | 488 | TrmA | J | SAM-dependent methyltransferases related to tRNA(uracil-5-)-methyltransferase
1034 | 981159 | 981425 | 1108219 | 1107953 | 1688r-2 | 981168 | 981357 | 28 | DapD | E | Tetrahydrodipicolinate N-succinyltransferase
1035 | 981762 | 981815 | 1107616 | 1107563 | 1687r-2 | | | | | |
1036 | 982136 | 982483 | 1107242 | 1106895 | 542f-2 | 982136 | 982481 | 168 | | H | 6-pyruvoyl-tetrahydropterin synthase
1037 | 982480 | 982953 | 1106898 | 1106425 | 1298r-1 | 982480 | 982822 | 142 | | S | Uncharacterized ACR
1038 | 983025 | 983486 | 1106353 | 1105892 | 912f-3 | 983058 | 983460 | 115 | GIM5 | O | Predicted prefoldin
1039 | 983483 | 983821 | 1105895 | 1105557 | 543f-2 | 983516 | 983723 | 35 | GimC | O | Prefoldin
1040 | 983802 | 984371 | 1105576 | 1105007 | 1686r-2 | 983802 | 984354 | 278 | PorG | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1041 | 984359 | 985399 | 1105019 | 1103979 | 2012r-3 | 984554 | 985397 | 537 | PorB | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1042 | 985204 | 986352 | 1104174 | 1103026 | 1297r-1 | 985204 | 986338 | 639 | PorA | C | Pyruvate:ferredoxin oxidoreductase and
related 2-oxoacid:ferredoxin oxidoreductases
1043 | 986349 | 986912 | 1103029 | 1102466 | 1685r-2 | 986400 | 986904 | 284 | PorG | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1044 | 986851 | 987246 | 1102527 | 1102132 | 1296r-1 | 986935 | 987235 | 96 | | S | Uncharacterized ACR
1045 | 987243 | 987566 | 1102135 | 1101812 | 1684r-2 | 987297 | 987375 | 32 | | R | Predicted nucleotidyltransferases
1046 | 987517 | 988383 | 1101861 | 1100995 | 1295r-1 | 987517 | 988369 | 501 | PorB | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1047 | 988383 | 989573 | 1100995 | 1099805 | 1683r-2 | 988383 | 989571 | 743 | PorA | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1048 | 989577 | 989894 | 1099801 | 1099484 | 1682r-2 | 989577 | 989877 | 125 | | C | Ferredoxin 3
1049 | 990762 | 991511 | 1098616 | 1097867 | 913f-3 | 991125 | 991500 | 33 | CcmA | Q | ABC-type multidrug transport system
1050 | 991803 | 991991 | 1097575 | 1097387 | 914f-3 | | | | | |
1051 | 992036 | 993010 | 1097342 | 1096368 | 2011r-3 | 992042 | 993002 | 446 | | R | Predicted Fe-S oxidoreductases
1052 | 994241 | 995020 | 1095137 | 1094358 | 544f-2 | 994241 | 994985 | 244 | SurE | R | Survival protein
1053 | 995047 | 995112 | 1094331 | 1094266 | 184f-1 | | | | | |
1054 | 995380 | 995844 | 1093998 | 1093534 | 185f-1 | 995419 | 995779 | 78 | | S | Predicted membrane protein
1055 | 995878 | 996558 | 1093500 | 1092820 | 1294r-1 | 995881 | 996550 | 278 | SpoVK | O | ATPases of the AAA+ class
1056 | 997037 | 998464 | 1092341 | 1090914 | 545f-2 | 997097 | 998456 | 785 | SerS | J | Seryl-tRNA synthetase
1057 | 998525 | 999265 | 1090853 | 1090113 | 2010r-3 | 998588 | 999200 | 298 | Nth | L | Predicted EndoIII-related endonuclease
1058 | 999750 | 1000229 | 1089628 | 1089149 | 915f-3 | 999843 | 1000212 | 168 | | K | Predicted transcriptional regulators
1059 | 1000226 | 1001212 | 1089152 | 1088166 | 546f-2 | 1000235 | 1001201 | 503 | CcmA | Q | ABC-type multidrug transport system
1060 | 1001217 | 1001987 | 1088161 | 1087391 | 916f-3 | 1001217 | 1001982 | 355 | | R | ABC-type multidrug transport system
1061 | 1002002 | 1003240 | 1087376 | 1086138 | 2009r-3 | 1002005 | 1003226 | 590 | Pgk | G | 3-phosphoglycerate kinase
1062 | 1003253 | 1005466 | 1086125 | 1083912 | 547f-2 | 1003355 | 1003715 | 40 | Mrr | L | Restriction endonuclease
1063 | 1005467 | 1006087 | 1083911 | 1083291 | 2008r-3 | 1005581 | 1005884 | 38 | LeuA | E | Isopropylmalate/homocitrate/citramalate synthases
1064 | 1006202 | 1007890 | 1083176 | 1081488 | 2007r-3 | 1006202 | 1007888 | 1040 | Sbm | I | Methylmalonyl-CoA
mutase
1065 | 1007979 | 1010192 | 1081399 | 1079186 | 1681r-2 | 1008876 | 1009398 | 41 | AlsD | H | Glutamate-1-semialdehyde aminotransferase
1066 | 1010189 | 1010956 | 1079189 | 1078422 | 2006r-3 | 1010246 | 1010591 | 94 | NosY | R | ABC-type transport system involved in multi-copper enzyme maturation
1067 | 1011011 | 1011949 | 1078367 | 1077429 | 2005r-3 | 1011011 | 1011938 | 464 | CcmA | Q | ABC-type multidrug transport system
1068 | 1012013 | 1012879 | 1077365 | 1076499 | 548f-2 | 1012013 | 1012862 | 332 | YSH1 | J | Predicted exonuclease of the beta-lactamase fold involved in RNA processing
1069 | 1012961 | 1013278 | 1076417 | 1076100 | 549f-2 | 1013114 | 1013255 | 29 | MdlB | Q | ABC-type multidrug/protein/lipid transport system
1070 | 1013371 | 1013883 | 1076007 | 1075495 | 186f-1 | 1013407 | 1013806 | 214 | IbpA | O | Molecular chaperone (small heat shock protein)
1071 | 1013995 | 1014411 | 1075383 | 1074967 | 1293r-1 | 1014265 | 1014361 | 30 | FlgD | N | Flagellar hook capping protein
1072 | 1014829 | 1017228 | 1074549 | 1072150 | 187f-1 | 1014829 | 1017226 | 1310 | SpoVK | O | ATPases of the AAA+ class
1073 | 1017331 | 1020711 | 1072047 | 1068667 | 188f-1 | 1018411 | 1018645 | 56 | | L | Type II restriction enzyme
1074 | 1020821 | 1020970 | 1068557 | 1068408 | 2004r-3 | 1020854 | 1020962 | 26 | | R | Predicted hydrolase of alkaline phosphatase superfamily
1075 | 1021424 | 1022338 | 1067954 | 1067040 | 550f-2 | 1021535 | 1022261 | 177 | FolP | H | Dihydropteroate synthase
1076 | 1022319 | 1023311 | 1067059 | 1066067 | 1680r-2 | 1022328 | 1023294 | 249 | PerM | R | Predicted permease
1077 | 1023301 | 1023780 | 1066077 | 1065598 | 1292r-1 | 1023463 | 1023637 | 32 | TldD | R | Predicted Zn-dependent proteases and their inactivated homologs
1078 | 1023781 | 1024785 | 1065597 | 1064593 | 1291r-1 | 1023781 | 1024759 | 278 | SppA | NO | Periplasmic serine proteases (ClpP class) COG0616 SppA
1079 | 1024877 | 1025692 | 1064501 | 1063686 | 551f-2 | 1024886 | 1025681 | 417 | IolE | G | Sugar phosphate isomerases/epimerases
1080 | 1025682 | 1026086 | 1063696 | 1063292 | 1679r-2 | 1025892 | 1026018 | 29 | LeuB | E | Isocitrate/isopropylmalate dehydrogenase
1081 | 1026083 | 1026376 | 1063295 | 1063002 | 2003r-3 | 1026122 | 1026374 | 146 | RPB11 | K | DNA-directed RNA polymerase
1082 | 1026357 | 1026986 | 1063021 | 1062392 | 1678r-2 | 1026357 | 1026984 | 248 | | S | Uncharacterized ArCR
1083 | 1026983 | 1027579 | 1062395 | 1061799 | 2002r-3 | 1026986 | 1027571 | 280 | | J | Predicted RNA-binding protein (consists of S1 domain and a Zn-ribbon
domain)
1084 | 1027657 | 1029558 | 1061721 | 1059820 | 189f-1 | 1027678 | 1029556 | 1040 | ThrS | J | Threonyl-tRNA synthetase
1085 | 1029517 | 1030068 | 1059861 | 1059310 | 1290r-1 | 1029589 | 1029943 | 34 | HsdM | L | Type I restriction-modification system methyltransferase subunit
1086 | 1030276 | 1030950 | 1059102 | 1058428 | 1289r-1 | 1030711 | 1030900 | 32 | UvrC | L | Nuclease subunit of the excinuclease complex (TBP-interacting protein)
1087 | 1031013 | 1031807 | 1058365 | 1057571 | 1677r-2 | 1031013 | 1031805 | 431 | UppS | I | Undecaprenyl pyrophosphate synthase
1088 | 1031814 | 1032344 | 1057564 | 1057034 | 1676r-2 | 1031823 | 1032336 | 291 | PaaY | R | Carbonic anhydrases/acetyltransferases
1089 | 1032406 | 1032792 | 1056972 | 1056586 | 190f-1 | 1032412 | 1032781 | 137 | | L | Holliday junction resolvase - archaeal type
1090 | 1032841 | 1034373 | 1056537 | 1055005 | 191f-1 | 1032913 | 1033582 | 45 | | S | Predicted membrane protein
1091 | 1034458 | 1035498 | 1054920 | 1053880 | 192f-1 | 1034458 | 1035493 | 551 | FrvX | G | Cellulase M and related proteins
1092 | 1035541 | 1036101 | 1053837 | 1053277 | 193f-1 | 1035547 | 1036087 | 185 | | R | Predicted Zn-dependent proteases
1093 | 1036098 | 1036649 | 1053280 | 1052729 | 917f-3 | 1036104 | 1036623 | 254 | CyaB | F | Adenylate cyclase
1094 | 1036636 | 1037469 | 1052742 | 1051909 | 194f-1 | 1037026 | 1037341 | 48 | NrfG | R | TPR-repeat-containing proteins
1095 | 1037390 | 1038229 | 1051988 | 1051149 | 2001r-3 | 1037390 | 1038167 | 275 | CbiO | P | ABC-type cobalt transport system
1096 | 1038226 | 1039704 | 1051152 | 1049674 | 1288r-1 | 1038226 | 1039687 | 621 | TrkG | P | Trk-type K+ transport systems
1097 | 1039796 | 1040683 | 1049582 | 1048695 | 552f-2 | 1039808 | 1040681 | 417 | Map | J | Methionine aminopeptidase
1098 | 1041012 | 1041071 | 1048366 | 1048307 | 918f-3 | | | | | |
1099 | 1041624 | 1041935 | 1047754 | 1047443 | 919f-3 | 1041705 | 1041822 | 31 | SurA | O | Parvulin-like peptidyl-prolyl isomerase
1100 | 1042133 | 1042384 | 1047245 | 1046994 | 553f-2 | 1042145 | 1042382 | 141 | | R | Predicted nucleic acid-binding protein
1101 | 1042526 | 1043701 | 1046852 | 1045677 | 554f-2 | 1042526 | 1043696 | 659 | | R | CBS domains
1102 | 1043676 | 1044812 | 1045702 | 1044566 | 1675r-2 | 1043805 | 1044027 | 34 | NuoL | CP | NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter
1103 | 1044809 | 1046068 | 1044569 | 1043310 | 2000r-3 | 1044809 | 1046030 | 664 | GCD1 | MJ | Nucleoside-diphosphate-sugar pyrophosphorylases involved in lipopolysaccharide biosynthesis/translation initiation factor eIF2B subunits
COG1208 GCD1
1104 | 1047016 | 1048092 | 1042362 | 1041286 | 195f-1 | 1047016 | 1048078 | 543 | | R | Predicted GTPase
1105 | 1048209 | 1048610 | 1041169 | 1040768 | 1674r-2 | 1048218 | 1048596 | 207 | RPS8A | J | Ribosomal protein S8E
1106 | 1048684 | 1048761 | 1040694 | 1040617 | 1287r-1 | | | | | |
1107 | 1048718 | 1049599 | 1040660 | 1039779 | 555f-2 | 1049000 | 1049093 | 30 | HypF | O | Hydrogenase maturation factor
1108 | 1049596 | 1051275 | 1039782 | 1038103 | 1286r-1 | 1049674 | 1051264 | 897 | PyrG | F | CTP synthase (UTP-ammonia lyase)
1109 | 1051307 | 1051711 | 1038071 | 1037667 | 1999r-3 | 1051316 | 1051682 | 168 | | S | Uncharacterized ArCR
1110 | 1051708 | 1051995 | 1037670 | 1037383 | 1285r-1 | 1051720 | 1051993 | 150 | | S | Uncharacterized ArCR
1111 | 1052192 | 1052701 | 1037186 | 1036677 | 556f-2 | 1052495 | 1052684 | 32 | PtsA | G | Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria)
1112 | 1052753 | 1053022 | 1036625 | 1036356 | 557f-2 | 1052792 | 1053005 | 29 | Tar | N | Methyl-accepting chemotaxis protein
1113 | 1053032 | 1053793 | 1036346 | 1035585 | 558f-2 | 1053032 | 1053791 | 411 | NrdG | O | Organic radical activating enzymes
1114 | 1053859 | 1055274 | 1035519 | 1034104 | 196f-1 | 1053952 | 1055269 | 727 | TIP49 | L | DNA helicase TIP49
1115 | 1055358 | 1055663 | 1034020 | 1033715 | 920f-3 | 1055370 | 1055445 | 28 | AlsT | E | Na+/alanine symporter
1116 | 1056285 | 1056395 | 1033093 | 1032983 | 921f-3 | | | | | |
1117 | 1056392 | 1057381 | 1032986 | 1031997 | 1998r-3 | 1056605 | 1056746 | 33 | Rpe | G | Pentose-5-phosphate-3-epimerase
1118 | 1057362 | 1057835 | 1032016 | 1031543 | 1673r-2 | 1057494 | 1057680 | 31 | Ffh | N | Signal recognition particle GTPase
1119 | 1057832 | 1058302 | 1031546 | 1031076 | 1997r-3 | 1058003 | 1058102 | 28 | | S | Uncharacterized ACR
1120 | 1058495 | 1059043 | 1030883 | 1030335 | 559f-2 | 1058543 | 1059041 | 260 | | R | Phospholipid-binding protein
1121 | 1059047 | 1059307 | 1030331 | 1030071 | 1996r-3 | 1059104 | 1059284 | 30 | RfaG | M | Predicted glycosyltransferases
1122 | 1059399 | 1059863 | 1029979 | 1029515 | 1672r-2 | 1059465 | 1059795 | 40 | NrfG | R | TPR-repeat-containing proteins
1123 | 1059921 | 1060517 | 1029457 | 1028861 | 922f-3 | 1059933 | 1060434 | 108 | GrxC | O | Glutaredoxin and related proteins
1124 | 1060582 | 1061310 | 1028796 | 1028068 | 197f-1 | 1060582 | 1061296 | 247 | CcdA | O | Cytochrome c biogenesis protein
1125 | 1061307 | 1061768 | 1028071 | 1027610 | 1671r-2 | 1061322 | 1061766 | 237 | Lrp | K | Transcriptional regulators
1126 | 1061878 | 1063221 | 1027500 | 1026157 | 198f-1 | 1061878 | 1063186 | 614 | ArgD | E | PLP-dependent
aminotransferases
1127 | 1063298 | 1064599 | 1026080 | 1024779 | 560f-2 | 1063325 | 1064597 | 535 | UraA | F | Xanthine/uracil permeases
1128 | 1064656 | 1065000 | 1024722 | 1024378 | 1284r-1 | | | | | |
1129 | 1065370 | 1066023 | 1024008 | 1023355 | 1283r-1 | 1065370 | 1065943 | 316 | NuoI | C | Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I)
1130 | 1066020 | 1067213 | 1023358 | 1022165 | 1670r-2 | 1066053 | 1067211 | 652 | NuoD | C | NADH:ubiquinone oxidoreductase 49 kD subunit 7
1131 | 1067215 | 1067811 | 1022163 | 1021567 | 1282r-1 | 1067317 | 1067797 | 180 | NuoC | C | NADH:ubiquinone oxidoreductase 27 kD subunit
1132 | 1067793 | 1068392 | 1021585 | 1020986 | 1669r-2 | 1067838 | 1068390 | 335 | NuoB | C | NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases
1133 | 1068394 | 1069287 | 1020984 | 1020091 | 1281r-1 | 1068406 | 1069240 | 367 | HyfC | C | Formate hydrogenlyase subunit 4
1134 | 1069288 | 1071138 | 1020090 | 1018240 | 1280r-1 | 1069288 | 1071115 | 678 | HyfB | CP | Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter
1135 | 1070858 | 1070965 | 1018520 | 1018413 | 561f-2 | | | | | |
1136 | 1071135 | 1072622 | 1018243 | 1016756 | 1668r-2 | 1071186 | 1072614 | 713 | HyfB | CP | Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter
1137 | 1072619 | 1072963 | 1016759 | 1016415 | 1995r-3 | 1072619 | 1072961 | 194 | MnhC | P | Multisubunit Na+/H+ antiporter
1138 | 1072960 | 1073688 | 1016418 | 1015690 | 1279r-1 | 1072963 | 1073686 | 333 | MnhB | P | Multisubunit Na+/H+ antiporter
1139 | 1073670 | 1073954 | 1015708 | 1015424 | 1667r-2 | 1073745 | 1073919 | 68 | | P | Predicted subunit of the Multisubunit Na+/H+ antiporter
1140 | 1073951 | 1074343 | 1015427 | 1015035 | 1994r-3 | 1073951 | 1074290 | 168 | MnhG | P | Multisubunit Na+/H+ antiporter
1141 | 1074340 | 1074594 | 1015038 | 1014784 | 1278r-1 | 1074340 | 1074592 | 133 | MnhF | P | Multisubunit Na+/H+ antiporter
1142 | 1074591 | 1075124 | 1014787 | 1014254 | 1666r-2 | 1074591 | 1075119 | 258 | MnhE | P | Multisubunit Na+/H+ antiporter
1143 | 1075360 | 1075860 | 1014018 | 1013518 | 1277r-1 | 1075360 | 1075858 | 305 | | E | Predicted regulator of amino acid metabolism (contains the ACT domain)
1144 | 1076013 | 1077278 | 1013365 | 1012100 | 923f-3 | 1076019 | 1077276 | 687 | | R | Predicted Fe-S oxidoreductase
1145 | 1077432 | 1077986 | 1011946 | 1011392 | 924f-3 | 1077708 | 1077936 | 32 | RibF | H | FAD
synthase
1146 | 1078071 | 1079189 | 1011307 | 1010189 | 1665r-2 | 1078071 | 1079187 | 569 | WecB | M | UDP-N-acetylglucosamine 2-epimerase
1147 | 1079201 | 1080472 | 1010177 | 1008906 | 1993r-3 | 1079219 | 1080467 | 577 | WecC | M | UDP-N-acetyl-D-mannosaminuronate dehydrogenase
1148 | 1080723 | 1081862 | 1008655 | 1007516 | 925f-3 | 1080759 | 1081797 | 524 | | S | Uncharacterized ArCR
1149 | 1082285 | 1084639 | 1007093 | 1004739 | 562f-2 | 1082735 | 1084637 | 891 | ArgS | J | Arginyl-tRNA synthetase
1150 | 1082363 | 1082779 | 1007015 | 1006599 | 1992r-3 | 1082441 | 1082765 | 123 | LplA | H | Lipoate-protein ligase A
1151 | 1084640 | 1085716 | 1004738 | 1003662 | 1991r-3 | 1084640 | 1085696 | 377 | | R | Predicted ATPase of the AAA superfamily
1152 | 1085820 | 1086698 | 1003558 | 1002680 | 926f-3 | 1085820 | 1086684 | 375 | DapA | EM | Dihydrodipicolinate synthase/N-acetylneuraminate lyase COG0329 DapA
1153 | 1086762 | 1086986 | 1002616 | 1002392 | 927f-3 | 1086765 | 1086870 | 25 | PhrB | L | Deoxyribodipyrimidine photolyase
1154 | 1087256 | 1088512 | 1002122 | 1000866 | 1990r-3 | 1087265 | 1088507 | 746 | eRF1 | J | Peptide chain release factor eRF1
1155 | 1088568 | 1088813 | 1000810 | 1000565 | 1664r-2 | | | | | |
1156 | 1088815 | 1089384 | 1000563 | 999994 | 1276r-1 | 1089229 | 1089355 | 32 | | S | Uncharacterized ArCR
1157 | 1089160 | 1089210 | 1000218 | 1000168 | 199f-1 | | | | | |
1158 | 1089484 | 1089639 | 999894 | 999739 | 1275r-1 | 1089532 | 1089634 | 26 | Fba | G | Fructose/tagatose bisphosphate aldolase (fructose 1,6-bisphosphate aldolase)
1159 | 1089909 | 1090604 | 999469 | 998774 | 1663r-2 | 1090068 | 1090266 | 37 | BaeS | T | Sensory transduction histidine kinases
1160 | 1091118 | 1091525 | 998260 | 997853 | 1662r-2 | 1091292 | 1091415 | 33 | GloB | R | Zn-dependent hydrolases
1161 | 1091646 | 1092197 | 997732 | 997181 | 928f-3 | 1091877 | 1092138 | 37 | | S | Uncharacterized ACR
1162 | 1092206 | 1093522 | 997172 | 995856 | 1989r-3 | 1092212 | 1093496 | 443 | | M | Predicted membrane-associated Zn-dependent proteases 1
1163 | 1093556 | 1093957 | 995822 | 995421 | 1988r-3 | 1093556 | 1093952 | 189 | | S | Uncharacterized ACR
1164 | 1093967 | 1095127 | 995411 | 994251 | 1987r-3 | 1093967 | 1095125 | 593 | | S | Uncharacterized ACR
1165 | 1096375 | 1096839 | 993003 | 992539 | 200f-1 | 1096384 | 1096816 | 242 | RpsO | J | Ribosomal protein S15P/S13E
1166 | 1096870 | 1098303 | 992508 | 991075 | 201f-1 | 1096870 | 1098295 | 681 | RecJ | L | Single-stranded DNA-specific exonuclease
1167 | 1098281 | 1098538 | 991097 | 990840 | 563f-2 | 1098317 | 1098458 | 29 | | C | Phycocyanin alpha-subunit phycocyanobilin lyase and
related proteins
1168 | 1098554 | 1099156 | 990824 | 990222 | 564f-2 | 1098614 | 1099148 | 310 | RPS1A | J | Ribosomal protein S3AE
1169 | 1099220 | 1099486 | 990158 | 989892 | 565f-2 | 1099274 | 1099469 | 32 | HtpG | O | Molecular chaperone
1170 | 1099468 | 1099908 | 989910 | 989470 | 202f-1 | 1099483 | 1099906 | 165 | | R | Predicted nucleic acid-binding protein
1171 | 1099954 | 1100991 | 989424 | 988387 | 203f-1 | 1099954 | 1100962 | 527 | | S | Uncharacterized protein sharing a conserved domain with thiamine biosynthesis protein ThiI
1172 | 1101073 | 1101510 | 988305 | 987868 | 1274r-1 | 1101076 | 1101448 | 136 | | S | Predicted membrane protein
1173 | 1101868 | 1102326 | 987510 | 987052 | 1273r-1 | 1101886 | 1102324 | 133 | Lrp | K | Transcriptional regulators
1174 | 1102786 | 1103181 | 986592 | 986197 | 1272r-1 | 1102795 | 1103179 | 136 | ArsR | K | Predicted transcriptional regulators
1175 | 1103673 | 1104461 | 985705 | 984917 | 1661r-2 | 1104120 | 1104330 | 31 | | P | Putative silver efflux pump
1176 | 1104585 | 1106492 | 984793 | 982886 | 929f-3 | 1104651 | 1106463 | 742 | LonB | O | Predicted ATP-dependent protease (Lon protease)
1177 | 1106686 | 1107264 | 982692 | 982114 | 1271r-1 | 1106686 | 1107262 | 272 | | K | Predicted transcriptional regulator with C-terminal CBS domains
1178 | 1107524 | 1108015 | 981854 | 981363 | 1986r-3 | 1107524 | 1108007 | 160 | RhaT | GER | Permeases of the drug/metabolite transporter (DMT) superfamily COG0697 RhaT
1179 | 1108559 | 1110253 | 980819 | 979125 | 1985r-3 | 1108979 | 1109507 | 38 | | S | Uncharacterized archaeal coiled-coil domain
1180 | 1110347 | 1111819 | 979031 | 977559 | 566f-2 | 1110839 | 1111814 | 442 | | R | Exopolyphosphatase-related proteins
1181 | 1111862 | 1112080 | 977516 | 977298 | 1984r-3 | 1111871 | 1112075 | 97 | | S | Uncharacterized ACR
1182 | 1112624 | 1113001 | 976754 | 976377 | 1983r-3 | 1112627 | 1112996 | 204 | | K | Transcriptional regulator of a riboflavin/FAD biosynthetic operon
1183 | 1113459 | 1114217 | 975919 | 975161 | 930f-3 | 1113468 | 1114212 | 405 | SmtA | QR | SAM-dependent methyltransferases COG0500 SmtA
1184 | 1114407 | 1117082 | 974971 | 972296 | 931f-3 | 1114416 | 1117071 | 1584 | ValS | J | Valyl-tRNA synthetase
1185 | 1117577 | 1118029 | 971801 | 971349 | 567f-2 | 1117577 | 1118027 | 289 | RPS19A | J | Ribosomal protein S19E (S16A)
1186 | 1118086 | 1119738 | 971292 | 969640 | 1270r-1 | 1119022 | 1119301 | 33 | | R | Predicted metal-dependent RNase
1187 | 1119840 | 1120178 | 969538 | 969200 | 932f-3 | 1119840 | 1120176 | 182 | | R | DNA-binding protein
1188 | 1120172 | 1120504 | 969206 | 968874 | 568f-2 | 1120172 | 1120442 | 30 | Lig | L | NAD-dependent DNA ligase (contains BRCT
domain type II)
1189 | 1120505 | 1121407 | 968873 | 967971 | 569f-2 | 1120514 | 1121402 | 506 | SUA7 | K | Transcription initiation factor IIB
1190 | 1121408 | 1122520 | 967970 | 966858 | 1982r-3 | 1121498 | 1122512 | 451 | Exo | L | 5′-3′ exonuclease (including N-terminal domain of PolI)
1191 | 1122517 | 1123746 | 966861 | 965632 | 1269r-1 | 1122544 | 1123741 | 591 | MoeA | H | Molybdopterin biosynthesis enzyme
1192 | 1123810 | 1124472 | 965568 | 964906 | 204f-1 | 1123828 | 1124440 | 299 | | J | Predicted subunit of tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase
1193 | 1124569 | 1125114 | 964809 | 964264 | 1268r-1 | 1124614 | 1125112 | 284 | ThiJ | R | Putative intracellular protease/amidase
1194 | 1125170 | 1125637 | 964208 | 963741 | 1981r-3 | 1125197 | 1125635 | 194 | Lrp | K | Transcriptional regulators
1195 | 1125727 | 1126902 | 963651 | 962476 | 205f-1 | 1125736 | 1126900 | 666 | | R | Predicted GTPase
1196 | 1128262 | 1128495 | 961116 | 960883 | 1267r-1 | 1128271 | 1128466 | 102 | Upp | F | Uracil phosphoribosyltransferase
1197 | 1128535 | 1128972 | 960843 | 960406 | 1266r-1 | 1128544 | 1128967 | 233 | Upp | F | Uracil phosphoribosyltransferase
1198 | 1129034 | 1130476 | 960344 | 958902 | 1980r-3 | 1129043 | 1130459 | 688 | NorM | Q | Na+-driven multidrug efflux pump
1199 | 1130532 | 1131944 | 958846 | 957434 | 1660r-2 | 1130547 | 1131936 | 587 | NorM | Q | Na+-driven multidrug efflux pump
1200 | 1132006 | 1132422 | 957372 | 956956 | 1265r-1 | 1132006 | 1132420 | 200 | | R | Predicted nucleic acid-binding protein
1201 | 1132432 | 1132659 | 956946 | 956719 | 1264r-1 | 1132438 | 1132630 | 69 | AbrB | K | Regulators of stationary/sporulation gene expression
1202 | 1132744 | 1135125 | 956634 | 954253 | 1263r-1 | 1132753 | 1135042 | 1319 | PpsA | G | Phosphoenolpyruvate synthase/pyruvate phosphate dikinase
1203 | 1135154 | 1135213 | 954224 | 954165 | 570f-2 | | | | | |
1204 | 1135255 | 1137741 | 954123 | 951637 | 1262r-1 | 1136407 | 1136665 | 50 | | R | Uncharacterized membrane protein
1205 | 1138634 | 1138867 | 950744 | 950511 | 571f-2 | | | | | |
1206 | 1139159 | 1142494 | 950219 | 946884 | 572f-2 | 1141529 | 1141982 | 35 | SrmB | LKJ | Superfamily II DNA and RNA helicases COG0513 SrmB
1207 | 1142537 | 1142836 | 946841 | 946542 | 573f-2 | 1142540 | 1142834 | 165 | | S | Uncharacterized ACR
1208 | 1142873 | 1144054 | 946505 | 945324 | 574f-2 | 1142891 | 1144034 | 531 | NMD3 | J | NMD protein affecting ribosome stability and mRNA decay
1209 | 1144054 | 1145121 | 945324 | 944257 | 206f-1 | 1144054 | 1145044 | 228 | | C | Predicted butyrate
kinase
1210 | 1145177 | 1146514 | 944201 | 942864 | 575f-2 | 1145180 | 1146512 | 743 | HcaD | R | Uncharacterized NAD(FAD)-dependent dehydrogenases
1211 | 1146553 | 1148040 | 942825 | 941338 | 207f-1 | 1146592 | 1148029 | 539 | Kch | P | Kef-type K+ transport systems
1212 | 1148086 | 1149231 | 941292 | 940147 | 208f-1 | 1148095 | 1149226 | 549 | | S | Uncharacterized ACR
1213 | 1150093 | 1151094 | 939285 | 938284 | 209f-1 | 1150891 | 1151044 | 33 | AsnB | E | Asparagine synthase (glutamine-hydrolyzing)
1214 | 1151091 | 1154534 | 938287 | 934844 | 1659r-2 | 1152798 | 1154532 | 958 | InfB | J | Translation initiation factor 2 (GTPase)
1215 | 1155108 | 1155464 | 934270 | 933914 | 933f-3 | 1155324 | 1155450 | 29 | NhaB | P | Na+/H+ antiporter
1216 | 1155466 | 1155999 | 933912 | 933379 | 1261r-1 | 1155487 | 1155940 | 256 | Ndk | F | Nucleoside diphosphate kinase
1217 | 1157418 | 1157627 | 931960 | 931751 | 1658r-2 | 1157424 | 1157625 | 136 | RPL24A | J | Ribosomal protein L24E
1218 | 1157624 | 1157836 | 931754 | 931542 | 1979r-3 | 1157630 | 1157792 | 77 | RPS28A | J | Ribosomal protein S28E/S33
1219 | 1157916 | 1158293 | 931462 | 931085 | 1657r-2 | 1157922 | 1158291 | 226 | RPL8A | J | Ribosomal protein HS6-type (S12/L30/L7a)
1220 | 1158361 | 1159554 | 931017 | 929824 | 1260r-1 | 1158373 | 1159537 | 321 | RAD55 | T | RecA-superfamily ATPases implicated in signal transduction
1221 | 1159686 | 1160306 | 929692 | 929072 | 1656r-2 | 1159695 | 1160295 | 277 | | S | Uncharacterized archaeal Zn-finger family
1222 | 1161299 | 1161634 | 928079 | 927744 | 1978r-3 | 1161314 | 1161596 | 128 | | R | Uncharacterized ATPases of the AAA superfamily
1223 | 1161690 | 1163606 | 927688 | 925772 | 1655r-2 | 1162347 | 1163139 | 448 | CysH | EH | 3′-phosphoadenosine 5′-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes COG0175 CysH
1224 | 1163703 | 1164656 | 925675 | 924722 | 934f-3 | 1163775 | 1164561 | 466 | HflC | O | Membrane protease subunits
1225 | 1164663 | 1165082 | 924715 | 924296 | 935f-3 | 1164663 | 1165077 | 148 | | NO | Membrane protein implicated in regulation of membrane protease activity COG1585 -
1226 | 1165121 | 1165714 | 924257 | 923664 | 576f-2 | 1165130 | 1165706 | 202 | Tdk | F | Thymidine kinase
1227 | 1165724 | 1165948 | 923654 | 923430 | 577f-2 | 1165793 | 1165946 | 81 | RPL39 | J | Ribosomal protein L39E
1228 | 1165959 | 1166231 | 923419 | 923147 | 936f-3 | 1165959 | 1166217 | 136 | RPL31A | J | Ribosomal protein L31E
1229 | 1166259 | 1166948 | 923119 | 922430 | 937f-3 | 1166259 | 1166943 | 329 | TIF6 | J | Eukaryotic translation initiation factor 6
(EIF6)
1230 | 1167001 | 1167234 | 922377 | 922144 | 210f-1 | 1167001 | 1167232 | 91 | RPL20A | J | Ribosomal protein L20A (L18A)
1231 | 1167503 | 1168657 | 921875 | 920721 | 1977r-3 | 1167503 | 1168655 | 468 | RfaG | M | Predicted glycosyltransferases
1232 | 1168678 | 1169472 | 920700 | 919906 | 1259r-1 | 1168747 | 1169299 | 87 | UbiA | H | 4-hydroxybenzoate polyprenyltransferase
1233 | 1169576 | 1171024 | 919802 | 918354 | 1976r-3 | 1169591 | 1170995 | 718 | GltD | ER | NADPH-dependent glutamate synthase beta chain and related oxidoreductases COG0493 GltD
1234 | 1171021 | 1171905 | 918357 | 917473 | 1258r-1 | 1171021 | 1171894 | 441 | UbiB | HC | 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases COG0543 UbiB
1235 | 1172047 | 1172277 | 917331 | 917101 | 211f-1 | 1172059 | 1172224 | 35 | PotE | E | Amino acid transporters
1236 | 1172264 | 1173025 | 917114 | 916353 | 1975r-3 | 1172264 | 1173023 | 330 | GCD14 | J | Predicted SAM-dependent methyltransferase involved in tRNA-Met maturation
1237 | 1173022 | 1173636 | 916356 | 915742 | 1257r-1 | 1173112 | 1173265 | 32 | NemA | C | NADH:flavin oxidoreductases
1238 | 1173687 | 1174022 | 915691 | 915356 | 938f-3 | 1173699 | 1173975 | 120 | SEC65 | N | Signal recognition particle 19 kDa protein
1239 | 1174023 | 1174274 | 915355 | 915104 | 1654r-2 | 1174041 | 1174227 | 47 | Lrp | K | Transcriptional regulators
1240 | 1174284 | 1174388 | 915094 | 914990 | 1653r-2 | | | | | |
1241 | 1174493 | 1177870 | 914885 | 911508 | 578f-2 | 1174493 | 1175486 | 467 | | R | Predicted helicases
1242 | 1178296 | 1178862 | 911082 | 910516 | 212f-1 | 1178305 | 1178854 | 198 | CoaE | H | Dephospho-CoA kinase
1243 | 1178840 | 1179322 | 910538 | 910056 | 579f-2 | 1178906 | 1179320 | 232 | | S | Uncharacterized ArCR
1244 | 1179335 | 1180606 | 910043 | 908772 | 1974r-3 | 1179335 | 1180583 | 409 | NatB | C | ABC-type Na+ efflux pump
1245 | 1180603 | 1181361 | 908775 | 908017 | 1256r-1 | 1180609 | 1181317 | 376 | CcmA | Q | ABC-type multidrug transport system
1246 | 1181719 | 1181916 | 907659 | 907462 | 1255r-1 | 1181776 | 1181914 | 82 | | K | Predicted transcriptional regulators
1247 | 1182281 | 1182673 | 907097 | 906705 | 1973r-3 | 1182308 | 1182527 | 32 | CbiM | H | Cobalamin biosynthesis protein CbiM
1248 | 1182899 | 1183855 | 906479 | 905523 | 580f-2 | 1183346 | 1183523 | 33 | | S | Uncharacterized BCR
1249 | 1184435 | 1184731 | 904943 | 904647 | 1972r-3 | 1184531 | 1184717 | 29 | InfB | J | Translation initiation factor 2 (GTPase)
1250 | 1184832 | 1185752 | 904546 | 903626 | 1652r-2 | 1185366 | 1185510 | 32 | MurG | M | UDP-N-acetylglucosamine:LPS N-acetylglucosamine
transferase
1251 | 1186264 | 1186524 | 903114 | 902854 | 1254r-1 | | | | | |
1252 | 1187372 | 1187653 | 902006 | 901725 | 1971r-3 | | | | | |
1253 | 1188250 | 1188906 | 901128 | 900472 | 1253r-1 | 1188304 | 1188649 | 36 | GyrA | L | DNA gyrase (topoisomerase II) A subunit
1254 | 1188962 | 1189906 | 900416 | 899472 | 1970r-3 | 1188983 | 1189385 | 35 | GcvP | E | Glycine cleavage system protein P (pyridoxal-binding)
1255 | 1189940 | 1190062 | 899438 | 899316 | 1969r-3 | 1190009 | 1190057 | 26 | MalF | G | ABC-type sugar transport systems
1256 | 1191309 | 1191941 | 898069 | 897437 | 1651r-2 | 1191474 | 1191585 | 29 | MalK | G | ABC-type sugar/spermidine/putrescine/iron/thiamine transport systems
1257 | 1195773 | 1195841 | 893605 | 893537 | 939f-3 | | | | | |
1258 | 1196421 | 1196939 | 892957 | 892439 | 1650r-2 | 1196724 | 1196871 | 33 | Gmk | F | Guanylate kinase
1259 | 1197121 | 1197330 | 892257 | 892048 | 1252r-1 | 1197211 | 1197322 | 30 | FecB | P | ABC-type Fe3+-siderophores transport systems
1260 | 1197327 | 1197827 | 892051 | 891551 | 1649r-2 | 1197588 | 1197801 | 31 | UvrA | L | Excinuclease ATPase subunit
1261 | 1197859 | 1198116 | 891519 | 891262 | 1251r-1 | 1197958 | 1198078 | 26 | | T | SH3 domain protein
1262 | 1198129 | 1198395 | 891249 | 890983 | 1250r-1 | 1198141 | 1198300 | 30 | AlkA | L | 3-Methyladenine DNA glycosylase
1263 | 1198775 | 1198969 | 890603 | 890409 | 581f-2 | 1198808 | 1198907 | 33 | AbrB | K | Regulators of stationary/sporulation gene expression
1264 | 1199210 | 1199536 | 890168 | 889842 | 1968r-3 | 1199303 | 1199522 | 31 | Smc | D | Chromosome segregation ATPases
1265 | 1200465 | 1200542 | 888913 | 888836 | 940f-3 | | | | | |
1266 | 1202741 | 1204258 | 886637 | 885120 | 1967r-3 | 1202750 | 1204256 | 910 | GcvP | E | Glycine cleavage system protein P (pyridoxal-binding)
1267 | 1204260 | 1205624 | 885118 | 883754 | 1648r-2 | 1204269 | 1205598 | 727 | GcvP | E | Glycine cleavage system protein P (pyridoxal-binding)
1268 | 1205780 | 1207075 | 883598 | 882303 | 1966r-3 | 1206086 | 1206206 | 32 | FliI | N | Flagellar biosynthesis/type III secretory pathway ATPase
1269 | 1207362 | 1207793 | 882016 | 881585 | 941f-3 | 1207452 | 1207662 | 32 | PorG | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1270 | 1207790 | 1208482 | 881588 | 880896 | 582f-2 | 1207790 | 1208444 | 312 | | R | Predicted hydrolases of the HAD superfamily
1271 | 1209464 | 1210141 | 879914 | 879237 | 583f-2 | 1209512 | 1210130 | 239 | | R | Predicted ICC-like phosphoesterases
1272 | 1210174 | 1210893 | 879204 | 878485 | 213f-1 | 1210189 | 1210885 | 275 | | S | Uncharacterized membrane protein
1273 | 1210890 | 1211111 | 878488 | 878267 | 942f-3 | 1210890 | 1211058 | 33 | Smc | D | Chromosome
segregation ATPases
1274 | 1211128 | 1211787 | 878250 | 877591 | 214f-1 | 1211251 | 1211392 | 33 | XerC | L | Integrase
1275 | 1211850 | 1212755 | 877528 | 876623 | 943f-3 | 1211949 | 1212030 | 33 | FecB | P | ABC-type Fe3+-siderophores transport systems
1276 | 1212760 | 1213104 | 876618 | 876274 | 1249r-1 | 1212775 | 1212850 | 36 | NuoG | C | NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G)
1277 | 1213101 | 1214369 | 876277 | 875009 | 1647r-2 | 1213137 | 1214364 | 572 | HcaD | R | Uncharacterized NAD(FAD)-dependent dehydrogenases
1278 | 1214366 | 1215214 | 875012 | 874164 | 1965r-3 | 1214366 | 1215206 | 475 | | R | Predicted dehydrogenase
1279 | 1215250 | 1215861 | 874128 | 873517 | 1248r-1 | 1215259 | 1215793 | 272 | | R | Predicted dehydrogenase
1280 | 1217374 | 1217490 | 872004 | 871888 | 215f-1 | 1217374 | 1217461 | 26 | | R | CBS domains
1281 | 1219074 | 1219190 | 870304 | 870188 | 944f-3 | | | | | |
1282 | 1219197 | 1220690 | 870181 | 868688 | 1646r-2 | 1219197 | 1220676 | 790 | GlpK | C | Glycerol kinase
1283 | 1220740 | 1221513 | 868638 | 867865 | 1247r-1 | 1220767 | 1221511 | 387 | UgpQ | C | Glycerophosphoryl diester phosphodiesterase
1284 | 1221503 | 1222201 | 867875 | 867177 | 1964r-3 | 1221509 | 1222124 | 92 | UgpQ | C | Glycerophosphoryl diester phosphodiesterase
1285 | 1222282 | 1223655 | 867096 | 865723 | 216f-1 | 1222297 | 1223653 | 582 | | |
1286 | 1223758 | 1225113 | 865620 | 864265 | 217f-1 | 1223821 | 1225096 | 605 | | |
1287 | 1225113 | 1225991 | 864265 | 863387 | 945f-3 | 1225179 | 1225965 | 379 | | R | Hydrolases of the alpha/beta superfamily
1288 | 1226169 | 1226861 | 863209 | 862517 | 946f-3 | 1226217 | 1226835 | 187 | | R | Predicted deacetylase
1289 | 1227076 | 1227702 | 862302 | 861676 | 1246r-1 | 1227088 | 1227691 | 290 | Tmk | F | Thymidylate kinase
1290 | 1227756 | 1228466 | 861622 | 860912 | 1645r-2 | 1227756 | 1228449 | 365 | CpsG | G | Phosphomannomutase
1291 | 1228622 | 1230493 | 860756 | 858885 | 584f-2 | 1228631 | 1230482 | 1088 | PckA | C | Phosphoenolpyruvate carboxykinase (GTP)
1292 | 1230580 | 1233081 | 858798 | 856297 | 218f-1 | 1230592 | 1233058 | 1177 | GlgP | G | Glucan phosphorylase
1293 | 1233236 | 1234546 | 856142 | 854832 | 585f-2 | 1233818 | 1234340 | 44 | | R | Na+-dependent transporters of the SNF family
1294 | 1234563 | 1236284 | 854815 | 853094 | 1644r-2 | 1234569 | 1236282 | 931 | GlnS | J | Glutamyl- and glutaminyl-tRNA synthetases
1295 | 1236584 | 1237978 | 852794 | 851400 | 1963r-3 | 1236584 | 1237964 | 630 | DnaG | L | DNA primase (bacterial type)
1296 | 1237975 | 1238376 | 851403 | 851002 | 1245r-1 | 1237975 | 1238371 | 177 | | L | Small primase-like proteins (Toprim domain)
1297 | 1238433 | 1239707 | 850945 | 849671 | 1643r-2 | 1238439 | 1239702 | 677 | | C | Fe-S
oxidoreductases family 21298123979112399948495878493841962r-31239791123999292HHT1LHistones H3 and H4 (HistonA&B)129912401251240214849253849164947f-31300124080112408968485778484821244r-11301124159212419218477868474571642r-21241601124176998RPP1AJRibosomal proteinL12E/L44/L45/RPP1/RPP21302124198312430148473958463641243r-112419921243009402RplJJRibosomal protein L101303124301112436618463678457171641r-212430111243656327RplAJRibosomal protein L11304124369212437788456868456001640r-21305124377512442728456038451061961r-312437811244264223RplKJRibosomal protein L111306124430712447658450718446131639r-212443161244763257NusGKTranscription antiterminator1307124478812449738445908444051242r-11244788124489349Sss1NProtein translocase subunit Sss11308124500412461258443748432531241r-112450041246123536FtsZDCell division GTPase1309124624112470598431378423191960r-312462411247057446SUncharacterized ArCR1310124736912487098420098406691959r-312475671248584105FucIGL-fucose isomerase and relatedproteins131112486211249226840757840152948f-312486301249179314SUncharacterized ACR1312125049912511888388798381901638r-212504991251186333RpiAGRibose 5-phosphate isomerase1313125119312515618381858378171240r-11251223125137929SUL1PSulfate permease and relatedtransporters (MFS superfamily)1314125163212535788377468358001958r-3125163212535761146RPredicted metal-dependent RNase1315125358812537888357908355901957r-31253588125375074HslVOProteasome protease subunit131612543041255470835074833908219f-11254304125474237AdkAFArchaeal adenylate kinase1317125558212564368337968329421239r-112555941256431481GdhAEGlutamate dehydrogenase/leucinedehydrogenase1318125637912568468329998325321637r-212563791256808256GdhAEGlutamate dehydrogenase/leucinedehydrogenase131912574021258961831976830417949f-312574111258956828RNa+-dependent transporters of theSNF family132012589721259079830406830299220f-11258972125903826SurAOParvulin-like peptidyl-prolylisomerase132112591241259858830254829520950f-31259490125971233SgaTSUncharacterized 
BCR1322125985512601728295238292061956r-312598551260143100GlnKENitrogen regulatory protein PII1323126022912622568291498271221238r-112602291261816720SUncharacterized ACR132412623881262651826990826727951f-3132512627091264661826669824717952f-312627091264623880KPredicted RNA-binding proteinhomologous to eukaryotic snRNP1326126465812650748247208243041955r-312646581265072231NikRKPredicted transcriptionalregulators containing theCopG/Arc/MetJ DNA-bindingdomain and a metal-bindingdomain132712651451265591824233823787953f-31265307126540929HsdRLRestriction enzymes type Ihelicase subunits and relatedhelicases132812655931266390823785822988221f-11266082126625931UgpBGSugar-binding periplasmicproteins/domains132912667501267955822628821423954f-312667501267941638RPredicted alternative tryptophansynthase beta-subunit (paralog ofTrpB)1330126813012691378212488202411636r-212681301269132523AsdEAspartate-semialdehydedehydrogenase1331126915512700428202238193361954r-312691671270037312ThrBEHomoserine kinase1332127006212711628193168182161635r-212700831271085242LysCEAspartokinases1333127116212721818182168171971953r-312711711272170567MetEEMethionine synthase II(cobalamin-independent)1334127217412731038172048162751634r-212721741273068462MetEEMethionine synthase II(cobalamin-independent)1335127310012741588162788152201952r-312731091274144296MetFE51336127415112752818152278140971633r-212741541275270484MetCECystathioninebeta-lyases/cystathioninegamma-synthases1337127546112761358139178132431951r-312755091276133239JRibonuclease P subunit Rpp301338127612012766898132588126891237r-112762401276684210SUncharacterized ArCR1339127672712783018126518110771950r-312768921278245140MdlBQABC-type multidrug/protein/lipidtransport system1340127863612795358107428098431632r-21279008127914332LivGEABC-type branched-chain aminoacid transport systems1341127995812805878094208087911949r-312799581280585320RPL15AJRibosomal protein L15E134212806611281740808717807638955f-312806701281729544PepPEXaa-Pro 
aminopeptidase1343128180412823978075748069811631r-212818041282356295SUncharacterized ACR1344128238412830348069948063441236r-112824171283032320SUncharacterized ACR1345128305512842518063238051271630r-212832051284249291PPermease134612846671285869804711803509222f-11285024128570245SUncharacterized archaealcoiled-coil domain134712859751289823803403799555223f-112881441289155166LType II restriction enzyme134812900191292922799359796456224f-1129001912929201723LeuSJLeucyl-tRNA synthetase1349129339612938607959827955181629r-21293606129377430ErfKSUncharacterized BCR135012948921295722794486793656586f-21295033129533644SmtAQRSAM-dependentmethyltransferases COG0500SmtA135112957481297115793630792263956f-312957601297065379RPredicted ATPase of the AAAsuperfamily1352129711612984447922627909341628r-212971611298433640ArgEEAcetylornithinedeacetylase/Succinyl-diaminopimelatedesuccinylase and relateddeacylases135312986251298846790753790532957f-31298646129880527PanCHPanthothenate synthetase1354129918913002207901897891581627r-212991891300218487IspAHGeranylgeranyl pyrophosphatesynthase1355130029013016247890887877541626r-213002901301619738RPredicted hydrolase of themetallo-beta-lactamasesuperfamily1356130175913029347876197864441948r-313018251302926586LldDCL-lactate dehydrogenase(FMN-dependent) and relatedalpha-hydroxy aciddehydrogenases1357130293113036177864477857611235r-113029401303612268RacXMAspartate racemase1358130369013044547856887849241234r-113036991304449388CinARPredicted nucleotide-utilizingenzyme related tomolybdopterin-biosynthesisenzyme MoeA1359130445113052397849277841391625r-213044511305222243RPredicted archaeal kinases1360130523613062497841427831291947r-313052511306247484ERG12IMevalonate kinase1361130624613067227831327826561233r-113063121306711150SUncharacterized ACR1362130666513070397827137823391624r-213067041307028107RPredicted nucleotidyltransferases1363130707613079637823027814151623r-213070881307961485RPredicted 
dioxygenase1364130798913090537813897803251232r-113079891309027408ThrCEThreonine synthase136513091061309948780272779430587f-213091331309940284UdpFUridine phosphorylase136613099501311020779428778358958f-31310643131100636SUncharacterized archaealcoiled-coil domain1367131196513133177774137760611946r-313119741313285489HcaDRUncharacterizedNAD(FAD)-dependentdehydrogenases1368131341213142247759667751541622r-213134211314216415PnpFPurine nucleoside phosphorylase1369131566113158797737177734991945r-31315679131576329PrmAJRibosomal protein L11 methylase1370131604113161517733377732271231r-1137113164101317765772968771613225f-113164191317742693FfhNSignal recognition particleGTPase137213177621318001771616771377959f-31317765131799395LrpKTranscriptional regulators137313179981318528771380770850588f-213180041318424189RPredicted Fe—S-clusteroxidoreductase137413185851319298770793770080226f-113185851319296316AptFAdenine/guaninephosphoribosyltransferases andrelated PRPP-binding proteins137513193081319637770070769741227f-11319491131960829SrmBLKJSuperfamily II DNA and RNAhelicases COG0513 SrmB1376131962013200787697587693001230r-113196291320064179LrpKTranscriptional regulators137713213261322096768052767282960f-313213351322010346ApaHTDiadenosine tetraphosphatase andrelated serine/threonine proteinphosphatases1378132210213224017672767669771944r-313221021322399150SUncharacterized ACR1379132284013230047665387663741943r-313228491323002105RPL40AJRibosomal protein L40E1380132318313237887661957655901621r-213231861323783368RpsBJRibosomal protein S21381132380213248277655767645511229r-113238021324822474EnoGEnolase1382132513913253367642397640421620r-213251391325334122RPB10KDNA-directed RNA polymerase1383132536913258007640097635781942r-313253931325798217RpsIJRibosomal protein S91384132578713262157635917631631619r-213257871326213254RplMJRibosomal protein L131385132622213265937631567627851618r-213262311326591187RPL18AJRibosomal protein 
L18E1386132673813275267626407618521617r-213267471327521411RpoAKDNA-directed RNA polymerasealpha subunit/40 kD subunit1387132754813279707618307614081616r-213275481327944188RpsKJRibosomal protein S111388132796713285097614117608691941r-313279671328507239RpsDJRibosomal protein S4 and relatedproteins1389132852013290777608587603011615r-213286371329075235RpsMJRibosomal protein S131390132908413296717602947597071614r-213290841329669327RsmCJ16S RNA G1207 methylaseRsmC139113300581330213759320759165589f-21392133054013315657588387578131228r-113305491331551632TruBJPseudouridine synthase1393133177713320077576017573711940r-31331810133198740SUncharacterized ACR1394133204313327537573357566251227r-113320941332751201FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG1395133286113331127565177562661613r-213328611333107142RPL14AJRibosomal protein L14E1396133311313336947562657556841612r-213331131333644327CmkFCytidylate kinase 21397133370613339997556727553791939r-313337271333991175RPL34AJRibosomal protein L34E1398133402013345507553587548281226r-113340261334542194SUncharacterized membraneprotein1399133453713351367548417542421938r-313345461335134290AdkAFArchaeal adenylate kinase1400133521013366677541687527111611r-213352191336659665SecYNPreprotein translocase subunitSecY1401133669913371457526797522331225r-113366991337143155RplOJRibosomal protein L151402133715713376247522217517541610r-213371571337622269RpmDJRibosomal protein L30/L7E1403133763613383437517427510351937r-313376481338341426RpsEJRibosomal protein S51404133834013389547510387504241224r-113383401338946302RplRJRibosomal protein L181405133895613394117504227499671936r-313389591339409213RPL19AJRibosomal protein L19E1406133941313397937499657495851609r-213394731339791194RPL32JRibosomal protein L32E1407133981013403737495687490051223r-113398101340371302RplFJRibosomal protein L61408134037513407677490037486111935r-313403751340765243RpsHJRibosomal protein 
S81409134077913409497485997484291222r-113407791340947122RpsNJRibosomal protein S141410134095113415027484277478761934r-313409601341491307RplEJRibosomal protein L51411134151613422477478627471311608r-213415161342245444RPS4AJRibosomal protein S4E1412134224713426127471317467661933r-313422471342574189RplXJRibosomal protein L241413134262413430497467547463291221r-113426241343047203RplNJRibosomal protein L141414134305313434067463257459721220r-113430621343389195RpsQJRibosomal protein S171415134339413436607459847457181607r-213433941343655127POP4JRNAse P protein subunitP29/POP41416134365713439537457217454251932r-313436571343951170SUI1JTranslation initiation factor(SUI1)1417134396013441607454187452181931r-313439841344158101RpmCJRibosomal protein L291418134414713447857452317445931606r-213441561344729316RpsCJRibosomal protein S31419134478213452527445967441261930r-313447941345250241RplVJRibosomal protein L221420134526313456737441157437051605r-213452811345671203RpsSJRibosomal protein S191421134567013463987437087429801929r-313456791346396438RplBJRibosomal protein L21422134640313466637429757427151604r-213464031346661153RplWJRibosomal protein L231423134667013474377427087419411603r-213466701347435415RplDJRibosomal protein L41424134744813484887419307408901219r-113474481348435509RplCJRibosomal protein L31425134849013493447408887400341928r-313485741349333394SUncharacterized ACR1426134988213512587394967381201927r-313498821351238706RPredicted ATPase of the AAAsuperfamily1427135132213525067380567368721926r-313513581352504501LATP-dependent DNA ligase1428135261313532697367657361091602r-213527211353255301RplPJRibosomal protein L16/L10E142913545741355740734804733638590f-213546011355738619ESerine-pyruvateaminotransferase/archaealaspartate aminotransferase1430135582113564027335577329761218r-113558211356397256VirB11NPredicted ATPases involved inpili and flagella 
biosynthesis143113566061357514732772731864961f-313566151357512426AsnSJAspartyl/asparaginyl-tRNAsynthetases1432135751713583507318617310281925r-313575201358333394MesJDPredicted ATPase of the PP-loopsuperfamily implicated in cellcycle control1433135844113594337309377299451924r-31358945135911336LacAGBeta-galactosidase143413611811362461728197726917962f-313611811362417612MGlycosyltransferases143513624491362523726929726855591f-21362449136252143MGlycosyltransferases1436136301013639307263687254481923r-313630161363925512MesJDPredicted ATPase of the PP-loopsuperfamily implicated in cellcycle control1437136397213654657254067239131217r-113640291365457858RUncharacterized FAD-dependentdehydrogenases143813655891366155723789723223228f-113656431366150228RCBS domains143913661951367346723183722032229f-113662041367341495KefBPKef-type K+ transport systems144013673571368481722021720897592f-213673571368416353KefBPKef-type K+ transport systems144113685821369193720796720185963f-313686361369188221MarCSIntegral membrane proteins of theMarC family144213692481370567720130718811964f-313692661370559647HisSJHistidyl-tRNA synthetase1443137062713709897187517183891922r-31370681137097251ClsIPhosphatidylserine/phosphatidylglycerophosphate/cardioli pinsynthases and related enzymes144413718471372125717531717253230f-11371853137203034SUncharacterized archaealcoiled-coil domain144513723221373752717056715626593f-21372358137263732PheSJPhenylalanyl-tRNA synthetasealpha subunit144613739021376664715476712714231f-1137391113766591504AlaSJAlanyl-tRNA synthetase144713769211378402712457710976594f-213769361378388653PutPEHRNa+/proline1448137847013795347109087098441601r-213784701379532568EutGCAlcohol dehydrogenase IV144913796491380014709729709364965f-31379802137991328HemBHDelta-aminolevulinic aciddehydratase1450137998113804457093977089331921r-31380098138024833FlgHNFlagellar basal body L-ringprotein1451138053213812847088467080941216r-113805321381279332SUncharacterized 
ACR1452138128113826877080977066911600r-213812961382565209RPredicted ATPase of the AAAsuperfamily145313827671384572706611704806232f-1138280913845701039ELP3KELP3 component of the RNApolymerase II complex1454138456913853547048097040241599r-21385043138529544SUncharacterized ACR1455138535113859147040277034641920r-313853601385834101HdeDSUncharacterized BCR1456138606113875787033177018001215r-113860791387129150SPredicted membrane protein145713879221388011701456701367595f-21458138800413890507013747003281598r-21388016138882696NosYRABC-type transport systeminvolved in multi-copper enzymematuration145913884851388589700893700789233f-11388485138858426SUncharacterized ArCR1460138904713899827003316993961919r-313890591389962268CcmAQABC-type multidrug transportsystem146113901081390617699270698761234f-113901081390498229RPredicted Fe-S-clusteroxidoreductase146213906561391165698722698213966f-313906681391157246NIP7JProtein involved in ribosomalbiogenesis146313913971391669697981697709967f-31391445139151128GloBRZn-dependent hydrolases146413939801394540695398694838968f-313939801394523160CcmAQABC-type multidrug transportsystem146513961691396951693209692427596f-213962051396946461RAD55TRecA-superfamily ATPasesimplicated in signal transduction146613969651397522692413691856969f-313969771397328206KPredicted transcriptionalregulators1467139752813979686918506914101918r-313975461397951245SpeDES-adenosylmethioninedecarboxylase146813982711399176691107690202235f-113983281399144272SecFNPreprotein translocase subunitSecF146913991731400693690205688685970f-313991881400673452SecDNPreprotein translocase subunitSecD147014006901401382688688687996597f-214006931401374330TrkAPK+ transport systems147114015021401813687876687565236f-11401502140180262NtpFCArchaeal/vacuolar-typeH+-ATPase subunit H147214018151403806687563685572598f-214018151403789681NtpICArchaeal/vacuolar-typeH+-ATPase subunit I147314038241404309685554685069237f-114038241404286171AtpECF0F1-type ATP synthase csubunit/Archaeal/vacuolar-typeH+-ATPase 
subunit K147414043491404960685029684418238f-114043491404958186NtpECArchaeal/vacuolar-typeH+-ATPase subunit E147514049571406060684421683318971f-314049841406046407NtpCCArchaeal/vacuolar-typeH+-ATPase subunit C147614060571406365683321683013599f-214060571406360146NtpGCArchaeal/vacuolar-typeH+-ATPase subunit F147714063721407382683006681996600f-214063721407344399NtpACArchaeal/vacuolar-typeH+-ATPase subunit A147814074751408257681903681121239f-114074751408255481NtpACArchaeal/vacuolar-typeH+-ATPase subunit A147914082541409654681124679724972f-314082571409646864NtpBCArchaeal/vacuolar-typeH+-ATPase subunit B148014096741410327679704679051240f-114096831410316318NtpDCArchaeal/vacuolar-typeH+-ATPase subunit D148114104131411189678965678189601f-214104221411187442CUncharacterized flavoproteins148214111991411954678179677424602f-214111991411943322TarNMethyl-accepting chemotaxisprotein148314119381413167677440676211973f-314119471413159442RPredicted metal-dependenthydrolases related to alanyl-tRNAsynthetase HxxxH domain148414132351413960676143675418241f-11413274141376334MetCECystathioninebeta-lyases/cystathioninegamma-synthases148514139351414642675443674736603f-21414058141429530AsnBEAsparagine synthase(glutamine-hydrolyzing)148614149431415797674435673581604f-214149521415792507RPredicted metal-dependenthydrolases of the ureasesuperfamily1487141580014186586735786707201214r-114160941417195315GltDERNADPH-dependent glutamatesynthase beta chain and relatedoxidoreductases COG0493 GltD1488141865514204576707236689211597r-214187001420224632NuoFCNADH:ubiquinoneoxidoreductase1489142045014209236689286684551213r-114204891420888150NuoECNADH:ubiquinoneoxidoreductase 24 kD subunit1490142104914220806683296672981596r-214210581422069493RCL1KRNA phosphate cyclase149114222171422759667161666619242f-11422355142244830SbcCLATPase involved in DNA repair1492142274014235946666386657841917r-31423205142334035LysCEAspartokinases1493142361714241296657616652491595r-214236171424127253RPredicted 
phosphoesterase149414242661424787665112664591243f-11424407142451830RUncharacterized CBSdomain-containing proteins149514247871428260664591661118974f-314247871425792442MCM2LPredicted ATPase involved inreplication control149614283061428734661072660644975f-314283151428732250GCD7JTranslation initiation factor eIF-2149714288421430410660536658968605f-214296131430408486AccAIAcetyl-CoA carboxylase alphasubunit149814304211430807658957658571976f-31430433143079052OadGCNa+-transportingmethylmalonyl-CoA/oxaloacetatedecarboxylase149914308011431283658577658095606f-214308761431281129AccBIBiotin carboxyl carrier protein150014312901432483658088656895607f-214313021432481628OadBCNa+-transportingmethylmalonyl-CoA/oxaloacetatedecarboxylase150114325471433398656831655980608f-214325561433390422RCBS domains150214334321434445655946654933609f-214334471434437291ThrAEHomoserine dehydrogenase150314348741435398654504653980244f-11434985143524633TPeriplasmic ligand-binding sensordomain1504143539514361086539836532701594r-214354341436022315RC4-type Zn finger1505143618014365936531986527851916r-314361801436591124SUncharacterized ACR1506143664514369356527336524431915r-31436774143690031SPredicted membrane protein1507143695814377766524206516021593r-214369581437774418JRNase PH-relatedexoribonuclease1508143776914385276516096508511212r-114377781438525467RphJRNase PH1509143850214392756508766501031914r-314385021439237411RRP4JRNA-binding protein Rrp4 andrelated proteins (contain S1domain and KH domain)1510143927214399826501066493961211r-114392721439980424SUncharacterized ACR1511143999414407766493846486021592r-214399941440774389HslVOProteasome protease subunit151214411151441582648263647796610f-214411151441553219HitFGRDiadenosine tetraphosphate(Ap4A) hydrolase and other HITfamily hydrolases COG0537 Hit1513144155714419766478216474021591r-21441659144196599MazGRPredicted pyrophosphatase1514144188814421846474906471941210r-11441981144211630SerCHEPhosphoserine aminotransferaseCOG1932 
SerC151514422681442525647110646853977f-3151614426021444524646776644854245f-114426711443574550PorACPyruvate:ferredoxinoxidoreductase and related2-oxoacid:ferredoxinoxidoreductases1517144452114449676448576444111590r-214445211444953102RPredicted nucleic acid-bindingprotein1518144528814460016440906433771913r-31445507144584031CcmAQABC-type multidrug transportsystem1519144642114467446429576426341209r-11446487144661028SUncharacterized ACR152014470181447827642360641551246f-114470571447756221PerMRPredicted permease1521144776314482996416156410791912r-314477631448297325PncAQAmidases related tonicotinamidase1522144835414485276410246408511911r-31448354144852279SUncharacterized ACR152314487331449227640645640151978f-314488051449219164PaaIQUncharacterized protein152414497641450072639614639306611f-214497731450067143KPredicted transcriptionalregulators152514500761451272639302638106612f-214501031451219516152614513621452348638016637030247f-114513621452337398AnsBEJL-asparaginase/archaealGlu-tRNAGln amidotransferasesubunit D COG0252 AnsB1527145234514525666370336368121589r-21528145292114535716364576358071588r-214529301453569229MarCSIntegral membrane proteins of theMarC family152914537391453954635639635424613f-21453805145390428CheANChemotaxis protein histidinekinase and related kinases1530145465814547536347206346251587r-21531145578014574956335986318831586r-21456269145654533LAP4EAspartyl aminopeptidase1532145837314585166310056308621208r-11533146085914613716285196280071585r-21461048146127030GlnQEABC-type polar amino acidtransport system1534146134314617266280356276521207r-11461454146161330UvrALExcinuclease ATPase subunit1535146249414631086268846262701584r-21462509146268028VacBKExoribonucleases1536146310514642836262736250951910r-314631411464236580FtsZDCell division GTPase1537146425514664926251236228861583r-21464516146470235RnhALRibonuclease HI1538146659914676096227796217691206r-114666141467604607CFe—S 
oxidoreductases153914676551467744621723621634248f-1154014677691467906621609621472249f-11541146789114686766214876207021582r-214680921468650200HemKJPredicted rRNA or tRNAmethylase1542146849814690196208806203591205r-114685011469002255RConserved protein/domaintypically associated withflavoprotein oxygenases154314692651470533620113618845979f-314693311470465343AprEOSubtilisin-like serine proteases1544147060914717906187696175881581r-214706181471788664PncBHNicotinic acidphosphoribosyltransferase1545147181214719376175666174411580r-2154614718701472673617508616705250f-114719121472653149FabGQRDehydrogenases with differentspecificities (related toshort-chain alcoholdehydrogenases) COG1028 FabG1547147473114749286146476144501579r-21474809147489327PcmOProtein-L-isoaspartatecarboxylmethyltransferase1548147507214759836143066133951909r-314750841475972427RPredicted hydrolase of themetallo-beta-lactamasesuperfamily154914771071477574612271611804980f-31477110147739830DeoRKTranscriptional regulator1550147758414790296117946103491578r-214775991479027735GltDERNADPH-dependent glutamatesynthase beta chain and relatedoxidoreductases COG0493 GltD1551147903014798846103486094941577r-214790301479882446UbiBHC2-polyprenylphenol hydroxylaseand related flavodoxinoxidoreductases˜ COG0543 UbiB155214800881480873609290608505614f-214800881480871429SUncharacterized ArCR1553148096014817816084186075971204r-114809601481779378CysKECysteine synthase1554148175314818696076256075091908r-31481759148184031CysKECysteine synthase1555148204914827806073296065981203r-114820491482757382KPredicted transcriptionalregulators155614844221486413604956602965251f-114849501485667224AprEOSubtilisin-like serine proteases155714864481488211602930601167615f-21487183148772933SqhCISqualene cyclase1558148825314893086011256000701202r-114882531489306553RPredicted methyltransferases155914894171490157599961599221252f-114894171490146257RUncharacterized ATPases of thePP-loop 
superfamily156014902111490753599167598625981f-314902981490748206PaaDRPutative aromatic ringhydroxylating enzyme156114908961491087598482598291253f-11490896149107399FerCFerredoxin 11562149122214913955981565979831576r-214912491491393103RPS31JRibosomal protein S27AE1563149140614917385979725976401201r-114914421491733159RPS24AJRibosomal protein S24E1564149169214922255976865971531907r-314916921492217199SUncharacterized ArCR1565149222214924315971565969471200r-11492237149242699KDNA-directed RNA polymerasesubunit E″1566149242814930005969505963781575r-214924281492941261RPB7KDNA-directed RNA polymerasesubunit E′1567149303714935735963415958051574r-214930371493571312PpaCInorganic pyrophosphatase1568149363114945935957475947851573r-21494243149442033AcnACAconitase A1569149461314955605947655938181199r-11494913149511133TGAF domain-containing proteins1570149555714965645938215928141572r-214955631496529235LepBNSignal peptidase I1571149667714972165927015921621198r-11496755149697732LeuAEIsopropylmalate/homocitrate/citramalatesynthases1572149723114979025921475914761571r-21497582149781633H6-pyruvoyl-tetrahydropterinsynthase1573149801514985065913635908721197r-11498126149840231DcuCCC4-dicarboxylate transporter1574149989315009545894855884241196r-115000041500946498WcaGMGNucleoside-diphosphate-sugarepimerases COG0451 WcaG157515009751501334588403588044982f-315009781501332167RPredicted nucleotidyltransferases157615012341501755588144587623254f-115013121501732222SUncharacterized ACR157715017521502747587626586631983f-315017521502745510GCD1MJNucleoside-diphosphate-sugarpyrophosphorylases involved inlipopolysaccharidebiosynthesis/translation initiationfactor eIF2B subunits COG1208GCD1157815027821504029586596585349255f-115027821503988650RfbXRMembrane protein involved in theexport of O-antigen and teichoicacid1579150370515038815856735854971570r-21503741150386727CysNPGTPases - Sulfate adenylatetransferase subunit 
1158015064541507683582924581695256f-115064961507669617TagBMPutativeglycosyl/glycerophosphatetransferases involved in teichoicacid biosynthesisTagF/TagB/EpsJ/RodC158115076801508369581698581009984f-315076801508364371IspDI4-diphosphocytidyl-2-methyl-D-erithritolsynthase158215085131509250580865580128616f-215085131509248404WcaAMGlycosyltransferases involved incell wall biogenesis1583150928415115845800945777941906r-315093111511570800RUncharacterized membraneprotein158415129861513759576392575619617f-215130401513637119WcaAMGlycosyltransferases involved incell wall biogenesis158515137561514835575622574543257f-115137561514773191RfaGMPredicted glycosyltransferases158615158771516842573501572536258f-11516165151679293RfaGMPredicted glycosyltransferases1587151851015185695708685708091569r-2158815198161521600569562567778259f-11520431152062032LolAMOuter membranelipoprotein-sorting protein1589151982415199255695545694531568r-2159015217351522592567643566786985f-31521990152240137HsdRLRestriction enzymes type Ihelicase subunits and relatedhelicases159115232101524667566168564711618f-21523219152362431SUncharacterizedmembrane-associatedprotein/domain159215250751526076564303563302260f-11525372152571435SPredicted archaeal membraneprotein1593152606615264495633125629291905r-31526066152643284RfaGMPredicted glycosyltransferases159415294891530295559889559083619f-215295011530284389NagDGPredicted sugar phosphatases ofthe HAD superfamily159515302961530733559082558645620f-21530557153072233SUncharacterized ACR159615308941536164558484553214986f-315348121536162744NrdAFRibonucleotide reductase alphasubunit159715362981536771553080552607261f-115363071536769230RPredictedphosphoribosyltransferases159815368111537365552567552013262f-115368111537363268LigTJ2′-5′ RNA ligase159915403261541702549052547676987f-315403261541697582CCA1JtRNA nucleotidyltransferase(CCA-adding enzyme)1600154190115436915474775456871567r-21542636154283433GmdMGDP-D-mannose 
dehydratase160115437541544062545624545316621f-21543862154405428DPH2JDiphthamide synthase subunitDPH2160215440931544920545285544458622f-215440961544915261RhaTGERPermeases of the drug/metabolitetransporter (DMT) superfamilyCOG0697 RhaT160315449701545347544408544031988f-31545231154532432SUncharacterized BCR1604154543215459685439465434101566r-215454321545966183SUncharacterized ACR160515461651549362543213540016263f-1154616515493601910IleSJIsoleucyl-tRNA synthetase1606154937015495225400085398561904r-31549385154949027CobQHCobyric acid synthase1607155019515514545391835379241903r-31550882155129032BaeSTSensory transduction histidinekinases160815513841551506537994537872989f-31609155163715520085377415373701195r-115516371552006162VapCRPredicted nucleic acid-bindingprotein1610155197515522175374035371611565r-215519751552212105SUncharacterized ACR161115523301553088537048536290264f-11552351155252533QcrBCCytochrome b subunit of the bccomplex1612155310815554805362705338981902r-3155312615554661072LacAGBeta-galactosidase(exo-beta-D-glucosaminidase)1613155547415562955339045330831194r-115554741556287359AgaSMPredicted phosphosugarisomerases1614155645515574385329235319401193r-115564821557424491OppFEPABC-typedipeptide/oligopeptide/nickeltransport system1615155741615585075319625308711901r-315575391558493497DppDEPABC-typedipeptide/oligopeptide/nickeltransport system1616155839015593345309885300441192r-115584081559320357DppCEPABC-typedipeptide/oligopeptide/nickeltransport systems1617155933715603505300415290281564r-215593641560345529DppBEPABC-typedipeptide/oligopeptide/nickeltransport systems1618156038215610115289965283671191r-115603821560955219OppAEPABC-typedipeptide/oligopeptide/nickeltransport systems1619156139215625975279865267811563r-215613921562439468OppAEPABC-typedipeptide/oligopeptide/nickeltransport 
systems
1620 | 1562832 | 1564286 | 526546 | 525092 | 990 | f-3 | 1562838 | 1564281 | 790 | BglB | G | Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
1621 | 1564489 | 1564938 | 524889 | 524440 | 265 | f-1 | 1564489 | 1564933 | 158 | VapC | R | Predicted nucleic acid-binding protein
1622 | 1564960 | 1565772 | 524418 | 523606 | 1190 | r-1 | 1564972 | 1565767 | 355 | | S | deacetylase
1623 | 1565943 | 1569653 | 523435 | 519725 | 991 | f-3 | 1566258 | 1567437 | 330 | ChiA | G | Chitinase
1624 | 1569699 | 1571144 | 519679 | 518234 | 1562 | r-2 | 1570038 | 1571139 | 557 | | R | Uncharacterized ACR related to pyruvate formate-lyase activating enzyme
1625 | 1570858 | 1571220 | 518520 | 518158 | 266 | f-1 | 1570867 | 1571218 | 169 | POP5 | L | RNase P subunit P14 and its archaeal orthologs
1626 | 1571217 | 1572563 | 518161 | 516815 | 1561 | r-2 | 1571217 | 1572540 | 557 | GlgA | G | Glycogen synthase
1627 | 1572612 | 1573637 | 516766 | 515741 | 1560 | r-2 | 1572624 | 1573587 | 119 | | K | Predicted transcriptional regulators
1628 | 1573641 | 1573748 | 515737 | 515630 | 1559 | r-2
1629 | 1573710 | 1575680 | 515668 | 513698 | 992 | f-3 | 1574037 | 1575441 | 267 | AmyA | G | Glycosidases
1630 | 1575753 | 1577099 | 513625 | 512279 | 993 | f-3 | 1575753 | 1577070 | 692 | MalE | G | Maltose-binding periplasmic proteins/domains
1631 | 1577138 | 1578040 | 512240 | 511338 | 623 | f-2 | 1577138 | 1578032 | 480 | MalF | G | ABC-type sugar transport systems
1632 | 1578037 | 1579284 | 511341 | 510094 | 267 | f-1 | 1578049 | 1579279 | 466 | MalG | G | Sugar permeases
1633 | 1579294 | 1582596 | 510084 | 506782 | 268 | f-1 | 1579300 | 1582387 | 1626 | | G | Alpha-amylase/alpha-mannosidase
1634 | 1582707 | 1583825 | 506671 | 505553 | 994 | f-3 | 1582707 | 1583823 | 623 | MalK | G | ABC-type sugar/spermidine/putrescine/iron/thiamine transport systems
1635 | 1583858 | 1584259 | 505520 | 505119 | 624 | f-2 | 1583870 | 1584245 | 146 | | S | Uncharacterized ArCR
1636 | 1584289 | 1585641 | 505089 | 503737 | 269 | f-1 | 1584292 | 1585606 | 321 | CpsG | G | Phosphomannomutase
1637 | 1585646 | 1586575 | 503732 | 502803 | 1900 | r-3 | 1585760 | 1586573 | 431 | PhnP | R | Metal-dependent hydrolases of the beta-lactamase superfamily I
1638 | 1586361 | 1588547 | 503017 | 500831 | 995 | f-3 | 1586673 | 1588470 | 865 | | S | Uncharacterized membrane protein
1639 | 1588597 | 1588962 | 500781 | 500416 | 270 | f-1 | 1588741 | 1588915 | 31 | LysR | K | Transcriptional regulator
1640 | 1588919 | 1590214 | 500459 | 499164 | 625 | f-2 | 1588952 | 1590212 | 639 | ArgE | E | Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
1641 | 1590298 | 1591578 | 499080 | 497800 | 271 | f-1 | 1590586 | 1590886 | 31 | ThrS | J | Threonyl-tRNA synthetase
1642 | 1591902 | 1592372 | 497476 | 497006 | 1558 | r-2 | 1592157 | 1592334 | 29 | BglC | G | Endoglucanase
1643 | 1592769 | 1593515 | 496609 | 495863 | 996 | f-3 | 1592769 | 1593501 | 411 | SmtA | QR | SAM-dependent methyltransferases COG0500 SmtA
1644 | 1593682 | 1594884 | 495696 | 494494 | 1189 | r-1 | 1593694 | 1594882 | 644 | | R | Predicted SAM-dependent methyltransferases
1645 | 1595017 | 1595325 | 494361 | 494053 | 272 | f-1 | 1595017 | 1595104 | 30 | | R | Predicted phosphate-binding enzymes
1646 | 1596465 | 1597058 | 492913 | 492320 | 1557 | r-2 | 1596477 | 1596711 | 30 | DltE | R | Short-chain dehydrogenases of various substrate specificities
1647 | 1597751 | 1598509 | 491627 | 490869 | 1899 | r-3 | 1597778 | 1598507 | 387 | RAD55 | T | RecA-superfamily ATPases implicated in signal transduction
1648 | 1598676 | 1599902 | 490702 | 489476 | 997 | f-3 | 1598700 | 1599873 | 396 | PRI2 | L | Eukaryotic-type DNA primase
1649 | 1599886 | 1600935 | 489492 | 488443 | 273 | f-1 | 1599904 | 1600903 | 474 | PRI1 | L | Eukaryotic-type DNA primase
1650 | 1601220 | 1601777 | 488158 | 487601 | 998 | f-3 | 1601223 | 1601760 | 67 | RhaT | GER | Permeases of the drug/metabolite transporter (DMT) superfamily COG0697 RhaT
1651 | 1603727 | 1603786 | 485651 | 485592 | 626 | f-2
1652 | 1604088 | 1604264 | 485290 | 485114 | 1556 | r-2 | 1604088 | 1604154 | 26 | | S | Uncharacterized ArCR
1653 | 1604708 | 1606048 | 484670 | 483330 | 627 | f-2 | 1604768 | 1606046 | 714 | GlnA | E | Glutamine synthase
1654 | 1606039 | 1606902 | 483339 | 482476 | 1188 | r-1 | 1606045 | 1606855 | 363 | RhaT | GER | Permeases of the drug/metabolite transporter (DMT) superfamily COG0697 RhaT
1655 | 1606912 | 1607685 | 482466 | 481693 | 1187 | r-1 | 1606921 | 1607683 | 375 | NadE | H | NAD synthase
1656 | 1607663 | 1607971 | 481715 | 481407 | 1898 | r-3 | 1607762 | 1607855 | 30 | FUI1 | FH | Cytosine/uracil/thiamine/allantoin permeases COG1953 FUI1
1657 | 1608213 | 1609220 | 481165 | 480158 | 1555 | r-2 | 1608213 | 1609215 | 592 | OppF | EP | ABC-type dipeptide/oligopeptide/nickel transport system
1658 | 1609231 | 1610190 | 480147 | 479188 | 1186 | r-1 | 1609231 | 1610188 | 581 | DppD | EP | ABC-type dipeptide/oligopeptide/nickel transport system
1659 | 1610202 | 1611623 | 479176 | 477755 | 1554 | r-2 | 1610202 | 1611618 | 657 | DppC | EP | ABC-type dipeptide/oligopeptide/nickel transport systems
1660 | 1611635 | 1612684 | 477743 | 476694 | 1897 | r-3 | 1611635 | 1612673 | 540 | DppB | EP | ABC-type dipeptide/oligopeptide/nickel transport systems
1661 | 1612865 | 1615312 | 476513 | 474066 | 1896 | r-3 | 1613654 | 1614983 | 57 | OppA | EP | ABC-type dipeptide/oligopeptide/nickel transport systems
1662 | 1615653 | 1616882 | 473725 | 472496 | 999 | f-3 | 1615659 | 1616868 | 523 | PyrC | F | Dihydroorotase
1663 | 1616860 | 1617561 | 472518 | 471817 | 274 | f-1 | 1616860 | 1617553 | 338 | UbiB | HC | 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases COG0543 UbiB
1664 | 1617558 | 1618517 | 471820 | 470861 | 1000 | f-3 | 1617615 | 1618512 | 516 | | R | Predicted Fe-S oxidoreductases
1665 | 1617756 | 1617815 | 471622 | 471563 | 1553 | r-2
1666 | 1618578 | 1619276 | 470800 | 470102 | 1001 | f-3 | 1618647 | 1619130 | 33 | DppC | EP | ABC-type dipeptide/oligopeptide/nickel transport systems
1667 | 1619263 | 1621227 | 470115 | 468151 | 1185 | r-1 | 1619266 | 1621183 | 975 | | G | Alpha-amylase/alpha-mannosidase (4-alpha-glucanotransferase)
1668 | 1621305 | 1621934 | 468073 | 467444 | 1552 | r-2 | 1621305 | 1621890 | 216 | SEC59 | I | Dolichol kinase
1669 | 1622735 | 1622920 | 466643 | 466458 | 628 | f-2 | 1622735 | 1622909 | 33 | | S | Uncharacterized archaeal membrane protein
1670 | 1622922 | 1624112 | 466456 | 465266 | 1002 | f-3 | 1622940 | 1624086 | 499 | KefB | P | Kef-type K+ transport systems
1671 | 1624133 | 1625287 | 465245 | 464091 | 629 | f-2 | 1624136 | 1625279 | 536 | GadB | E | Glutamate decarboxylase and related PLP-dependent proteins
1672 | 1625321 | 1625563 | 464057 | 463815 | 630 | f-2 | 1625339 | 1625441 | 39 | | K | Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain
1673 | 1625628 | 1625717 | 463750 | 463661 | 1003 | f-3 | 1625631 | 1625709 | 30 | MazF | T | Growth inhibitor
1674 | 1625816 | 1625929 | 463562 | 463449 | 631 | f-2
1675 | 1625919 | 1626824 | 463459 | 462554 | 1551 | r-2 | 1625964 | 1626810 | 346 | MMT1 | P | Predicted Co/Zn/Cd cation transporters
1676 | 1627009 | 1627614 | 462369 | 461764 | 1184 | r-1 | 1627279 | 1627477 | 32 | RpoB | K | DNA-directed RNA polymerase beta subunit/140 kD subunit (split gene in Mjan
1677 | 1627793 | 1629337 | 461585 | 460041 | 632 | f-2 | 1627817 | 1629101 | 316 | RfbX | R | Membrane protein involved in the export of O-antigen and teichoic acid
1678 | 1629435 | 1630595 | 459943 | 458783 | 1004 | f-3 | 1629435 | 1630491 | 336 | | M | Predicted membrane-associated Zn-dependent proteases 1
1679 | 1630596 | 1631720 | 458782 | 457658 | 1005 | f-3 | 1630749 | 1631694 | 526 | MesJ | D | Predicted ATPase of the PP-loop superfamily implicated in cell cycle control
1680 | 1630637 | 1630705 | 458741 | 458673 | 1895 | r-3
1681 | 1631799 | 1633073 | 457579 | 456305 | 1006 | f-3 | 1631853 | 1633008 | 232 | | R | Uncharacterized ATPases of the AAA superfamily
1682 | 1633129 | 1633257 | 456249 | 456121 | 275 | f-1 | 1633156 | 1633240 | 30 | IleS | J | Isoleucyl-tRNA synthetase
1683 | 1634125 | 1634739 | 455253 | 454639 | 276 | f-1 | 1634227 | 1634494 | 33 | AraJ | G | Arabinose efflux permease
1684 | 1634253 | 1634369 | 455125 | 455009 | 1550 | r-2 | 1634256 | 1634337 | 27 | SUL1 | P | Sulfate permease and related transporters (MFS superfamily)
1685 | 1634744 | 1635046 | 454634 | 454332 | 633 | f-2 | 1634744 | 1635005 | 108 | MarR | K | Transcriptional regulators
1686 | 1635049 | 1636365 | 454329 | 453013 | 1183 | r-1 | 1635139 | 1636348 | 703 | BglB | G | Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase
1687 | 1636376 | 1637356 | 453002 | 452022 | 634 | f-2 | 1636376 | 1637351 | 544 | GalT | C | Galactose-1-phosphate uridylyltransferase
1688 | 1637336 | 1638673 | 452042 | 450705 | 1894 | r-3 | 1637342 | 1638653 | 675
1689 | 1638670 | 1639755 | 450708 | 449623 | 1182 | r-1 | 1638670 | 1639744 | 536 | | S | Uncharacterized ACR
1690 | 1639752 | 1640816 | 449626 | 448562 | 1549 | r-2 | 1639764 | 1640805 | 404 | GalK | G | Galactokinase
1691 | 1640937 | 1641557 | 448441 | 447821 | 1548 | r-2 | 1641177 | 1641468 | 34 | | S | Predicted membrane protein
1692 | 1641581 | 1643545 | 447797 | 445833 | 1893 | r-3 | 1641581 | 1643381 | 744 | | S | Uncharacterized ACR
1693 | 1643712 | 1644038 | 445666 | 445340 | 1007 | f-3 | 1643826 | 1644036 | 33 | ArgS | J | Arginyl-tRNA synthetase
1694 | 1644035 | 1644664 | 445343 | 444714 | 1892 | r-3 | 1644044 | 1644641 | 198 | Pcp | O | Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase)
1695 | 1644711 | 1645832 | 444667 | 443546 | 1008 | f-3 | 1644717 | 1645830 | 464 | FixC | C | Dehydrogenases (flavoproteins)
1696 | 1645842 | 1646195 | 443536 | 443183 | 1009 | f-3 | 1645923 | 1646169 | 33 | BisC | C | Anaerobic dehydrogenases
1697 | 1646550 | 1647749 | 442828 | 441629 | 1010 | f-3 | 1647372 | 1647549 | 32 | UgpB | G | Sugar-binding periplasmic proteins/domains
1698 | 1651192 | 1652691 | 438186 | 436687 | 1181 | r-1 | 1651192 | 1652689 | 865 | | E | Zn-dependent carboxypeptidases
1699 | 1652842 | 1653462 | 436536 | 435916 | 277 | f-1 | 1652848 | 1653448 | 222 | | L | Predicted site-specific integrase-resolvase
1700 | 1653443 | 1654624 | 435935 | 434754 | 635 | f-2 | 1653509 | 1654499 | 137 | | L | Predicted transposases
1701 | 1654676 | 1655512 | 434702 | 433866 | 636 | f-2 | 1654808 | 1655423 | 74 | RbsK | G | Sugar kinases
1702 | 1655924 | 1656976 | 433454 | 432402 | 1891 | r-3 | 1655990 | 1656971 | 407 | | S | Uncharacterized ACR
1703 | 1657257 | 1658210 | 432121 | 431168 | 1547 | r-2 | 1657269 | 1658208 | 465 | | R | MoxR-like ATPases
1704 | 1658633 | 1658857 | 430745 | 430521 | 1890 | r-3 | 1658633 | 1658831 | 97 | PppA | N | Signal peptidase
1705 | 1659540 | 1660034 | 429838 | 429344 | 1011 | f-3 | 1659564 | 1659858 | 32 | | S | Uncharacterized ArCR
1706 | 1660137 | 1660616 | 429241 | 428762 | 1012 | f-3 | 1660143 | 1660560 | 142 | SlpA | O | FKBP-type peptidyl-prolyl cis-trans isomerases 2
1707 | 1660605 | 1661033 | 428773 | 428345 | 1546 | r-2 | 1660605 | 1661031 | 155 | | S | Predicted membrane protein
1708 | 1661293 | 1661439 | 428085 | 427939 | 278 | f-1
1709 | 1661519 | 1662583 | 427859 | 426795 | 1889 | r-3 | 1661531 | 1662581 | 392 | | S | Predicted membrane protein
1710 | 1662585 | 1666019 | 426793 | 423359 | 1545 | r-2 | 1663962 | 1665537 | 735 | | L | Inteins
1711 | 1666185 | 1666505 | 423193 | 422873 | 1544 | r-2 | 1666254 | 1666413 | 29 | AcoA | C | Thiamine pyrophosphate-dependent dehydrogenases
1712 | 1667046 | 1668500 | 422332 | 420878 | 1543 | r-2 | 1667046 | 1668477 | 231 | | S | Uncharacterized ArCR
1713 | 1668573 | 1668914 | 420805 | 420464 | 1013 | f-3 | 1668708 | 1668849 | 30 | | L | Predicted transposase
1714 | 1668871 | 1669944 | 420507 | 419434 | 279 | f-1 | 1668952 | 1669942 | 506 | | R | Predicted GTPases
1715 | 1669941 | 1671896 | 419437 | 417482 | 1542 | r-2 | 1670538 | 1670883 | 48 | | R | ABC-type transport systems
1716 | 1671856 | 1672545 | 417522 | 416833 | 1180 | r-1 | 1671859 | 1672504 | 200 | PhnL | R | ABC-type transport systems
1717 | 1672642 | 1672686 | 416736 | 416692 | 1179 | r-1
1718 | 1672713 | 1673096 | 416665 | 416282 | 1541 | r-2 | 1672713 | 1673079 | 144 | DppC | EP | ABC-type dipeptide/oligopeptide/nickel transport systems
1719 | 1673965 | 1674999 | 415413 | 414379 | 1178 | r-1 | 1673965 | 1674997 | 226 | DppB | EP | ABC-type dipeptide/oligopeptide/nickel transport systems
1720 | 1675448 | 1676545 | 413930 | 412833 | 637 | f-2 | 1675448 | 1676543 | 556 | | L | Predicted N6-adenine-specific DNA methylases
1721 | 1676630 | 1677790 | 412748 | 411588 | 638 | f-2 | 1676780 | 1677785 | 572 | PstS | P | ABC-type phosphate transport system
1722 | 1677812 | 1678636 | 411566 | 410742 | 639 | f-2 | 1677812 | 1678583 | 259 | IolE | G | Sugar phosphate isomerases/epimerases
1723 | 1678705 | 1679553 | 410673 | 409825 | 280 | f-1 | 1678705 | 1679548 | 414 | PstC | P | ABC-type phosphate transport system
1724 | 1679540 | 1680370 | 409838 | 409008 | 640 | f-2 | 1679555 | 1680299 | 326 | PstA | P | ABC-type phosphate transport system
1725 | 1680367 | 1681128 | 409011 | 408250 | 281 | f-1 | 1680373 | 1681126 | 395 | PstB | P | ABC-type phosphate transport system
1726 | 1681383 | 1681730 | 407995 | 407648 | 1014 | f-3 | 1681476 | 1681683 | 44 | PhoU | P | Phosphate uptake regulator
1727 | 1681740 | 1682333 | 407638 | 407045 | 1015 | f-3 | 1681740 | 1682328 | 251 | PhoU | P | Phosphate uptake regulator
1728 | 1682428 | 1682817 | 406950 | 406561 | 282 | f-1 | 1682536 | 1682704 | 33 | WcaA | M | Glycosyltransferases involved in cell wall biogenesis
1729 | 1682818 | 1683495 | 406560 | 405883 | 1177 | r-1 | 1682821 | 1683493 | 387 | MhpD | Q | 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1
1730 | 1683568 | 1684578 | 405810 | 404800 | 1176 | r-1 | 1683847 | 1684462 | 56 | PepN | E | Aminopeptidase N
1731 | 1684439 | 1684564 | 404939 | 404814 | 641 | f-2 | 1684475 | 1684559 | 26 | Lon | O | ATP-dependent Lon protease
1732 | 1685535 | 1686689 | 403843 | 402689 | 1540 | r-2 | 1685535 | 1686684 | 652 | TrpS | J | Tryptophanyl-tRNA synthetase
1733 | 1686869 | 1687045 | 402509 | 402333 | 642 | f-2 | 1686875 | 1687043 | 62 | | S | Uncharacterized ArCR
1734 | 1687089 | 1687931 | 402289 | 401447 | 1016 | f-3 | 1687152 | 1687899 | 185 | RhaT | GER | Permeases of the drug/metabolite transporter (DMT) superfamily COG0697 RhaT
1735 | 1687932 | 1689299 | 401446 | 400079 | 1539 | r-2 | 1687932 | 1689249 | 416 | | R | Predicted ATPase of the AAA superfamily
1736 | 1689399 | 1690175 | 399979 | 399203 | 1017 | f-3 | 1689399 | 1690173 | 345 | PhnP | R | Metal-dependent hydrolases of the beta-lactamase superfamily I
1737 | 1691003 | 1692442 | 398375 | 396936 | 1888 | r-3 | 1691042 | 1692428 | 796 | | C | Acyl-CoA synthetase (NDP forming)
1738 | 1692515 | 1693180 | 396863 | 396198 | 643 | f-2 | 1692605 | 1693172 | 303 | ArsR | K | Predicted transcriptional regulators
1739 | 1693184 | 1693489 | 396194 | 395889 | 644 | f-2 | 1693184 | 1693484 | 186 | | S | Uncharacterized ArCR
1740 | 1693499 | 1694056 | 395879 | 395322 | 645 | f-2 | 1693508 | 1694048 | 163 | ArsR | K | Predicted transcriptional regulators
1741 | 1694157 | 1695629 | 395221 | 393749 | 1018 | f-3 | 1694355 | 1695186 | 159 | AmyA | G | Glycosidases
1742 | 1695642 | 1696265 | 393736 | 393113 | 1538 | r-2 | 1695957 | 1696233 | 33 | PurC | F | Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase
1743 | 1696275 | 1697726 | 393103 | 391652 | 1537 | r-2 | 1696845 | 1697721 | 342 | | G | Predicted sugar kinase
1744 | 1697807 | 1698145 | 391571 | 391233 | 646 | f-2 | 1697810 | 1697912 | 30 | MelB | G | Na+/melibiose symporter and related transporters
1745 | 1699092 | 1699178 | 390286 | 390200 | 1019 | f-3
1746 | 1699622 | 1700173 | 389756 | 389205 | 1887 | r-3 | 1699640 | 1700171 | 246 | | S | Uncharacterized ACR related to the C-terminal domain of histone macroH2A1
1747 | 1700210 | 1701493 | 389168 | 387885 | 1886 | r-3 | 1700210 | 1701479 | 464 | SsnA | FR | Cytosine deaminase and related metal-dependent hydrolases COG0402 SsnA
1748 | 1703531 | 1704163 | 385847 | 385215 | 647 | f-2 | 1703534 | 1704155 | 92 | | R | Predicted transglutaminase-like proteases
1749 | 1704224 | 1704970 | 385154 | 384408 | 1885 | r-3 | 1704326 | 1704965 | 243 | GckA | G | Putative glycerate kinase
1750 | 1704989 | 1705141 | 384389 | 384237 | 1884 | r-3 | 1704989 | 1705127 | 25 | | S | Uncharacterized membrane protein
1751 | 1705367 | 1706314 | 384011 | 383064 | 1883 | r-3 | 1705532 | 1706312 | 441 | Pnp | F | Purine nucleoside phosphorylase
1752 | 1706139 | 1706984 | 383239 | 382394 | 1020 | f-3 | 1706256 | 1706982 | 384 | | R | Archaeal enzymes of ATP-grasp superfamily
1753 | 1706986 | 1707378 | 382392 | 382000 | 283 | f-1 | 1706995 | 1707373 | 151 | | S | Uncharacterized ACR
1754 | 1707375 | 1708133 | 382003 | 381245 | 1536 | r-2 | 1707387 | 1708125 | 346 | | L | Predicted nuclease of the RecB family
1755 | 1708168 | 1710714 | 381210 | 378664 | 1175 | r-1 | 1710097 | 1710712 | 349 | RecA | L | RecA/RadA recombinase
1756 | 1710855 | 1711487 | 378523 | 377891 | 1535 | r-2 | 1710987 | 1711224 | 54 | Kch | P | Kef-type K+ transport systems
1757 | 1712778 | 1714040 | 376600 | 375338 | 1021 | f-3 | 1712805 | 1713984 | 651 | CDC6 | LO | Cdc6-related protein
1758 | 1714040 | 1716247 | 375338 | 373131 | 648 | f-2 | 1714652 | 1716230 | 621 | HYS2 | L | DNA polymerase small subunit
1759 | 1716248 | 1721644 | 373130 | 367734 | 649 | f-2 | 1716272 | 1719128 | 1536 | | L | Novel archaeal DNA polymerase (contains Zn-fingers)
1760 | 1721669 | 1722406 | 367709 | 366972 | 650 | f-2 | 1721813 | 1722029 | 31 | CpsG | G | Phosphomannomutase
1761 | 1722894 | 1723436 | 366484 | 365942 | 1022 | f-3 | 1723122 | 1723365 | 39 | | L | Predicted nuclease of the RecB family
1762 | 1725222 | 1725860 | 364156 | 363518 | 1023 | f-3 | 1725222 | 1725828 | 250 | | S | Uncharacterized ACR
1763 | 1725857 | 1726705 | 363521 | 362673 | 1882 | r-3 | 1725956 | 1726703 | 376 | LplA | H | Lipoate-protein ligase A
1764 | 1727964 | 1729022 | 361414 | 360356 | 1024 | f-3 | 1727964 | 1728660 | 358 | WcaA | M | Glycosyltransferases involved in cell wall biogenesis
1765 | 1729029 | 1729787 | 360349 | 359591 | 1025 | f-3 | 1729104 | 1729779 | 218 | RAD55 | T | RecA-superfamily ATPases implicated in signal transduction
1766 | 1729784 | 1730227 | 359594 | 359151 | 651 | f-2 | 1729898 | 1730222 | 41 | RacX | M | Aspartate racemase
1767 | 1730270 | 1731955 | 359108 | 357423 | 652 | f-2 | 1730270 | 1731941 | 651 | Iap | R | Predicted aminopeptidases
1768 | 1731945 | 1732280 | 357433 | 357098 | 1534 | r-2 | 1731963 | 1732158 | 40 | | K | Predicted transcriptional regulators
1769 | 1732332 | 1732982 | 357046 | 356396 | 1533 | r-2 | 1732377 | 1732974 | 216 | | R | Predicted ICC-like phosphoesterases
1770 | 1732998 | 1733120 | 356380 | 356258 | 1532 | r-2
1771 | 1733473 | 1734267 | 355905 | 355111 | 284 | f-1 | 1733473 | 1734256 | 398 | | R | Predicted amidohydrolase
1772 | 1734255 | 1735046 | 355123 | 354332 | 1531 | r-2 | 1734255 | 1735020 | 255 | SmtA | QR | SAM-dependent methyltransferases COG0500 SmtA
1773 | 1735212 | 1735793 | 354166 | 353585 | 1026 | f-3 | 1735221 | 1735443 | 29 | PstA | P | ABC-type phosphate transport system
1774 | 1736419 | 1736520 | 352959 | 352858 | 285 | f-1
1775 | 1736456 | 1736896 | 352922 | 352482 | 653 | f-2 | 1736540 | 1736717 | 32 | | K | Predicted transcriptional regulator
1776 | 1736893 | 1737423 | 352485 | 351955 | 1174 | r-1 | 1737130 | 1737328 | 30 | CcmB | O | ABC-type transport system involved in cytochrome c biogenesis
1777 | 1737620 | 1738414 | 351758 | 350964 | 1881 | r-3 | 1738181 | 1738397 | 33 | FeoB | P | Ferrous ion uptake system protein FeoB (predicted GTPase)
1778 | 1738777 | 1739505 | 350601 | 349873 | 1173 | r-1 | 1738843 | 1738912 | 33 | ChrA | P | Chromate transport protein ChrA
1779 | 1739502 | 1739852 | 349876 | 349526 | 1530 | r-2 | 1739508 | 1739850 | 169 | | S | Uncharacterized ACR
1780 | 1739935 | 1740549 | 349443 | 348829 | 1172 | r-1 | 1740337 | 1740451 | 32 | CarA | EF | Carbamoylphosphate synthase small subunit COG0505 CarA
1781 | 1740792 | 1741826 | 348586 | 347552 | 1027 | f-3 | 1740801 | 1741818 | 515 | DPH2 | J | Diphthamide synthase subunit DPH2
1782 | 1741926 | 1743704 | 347452 | 345674 | 1028 | f-3 | 1742919 | 1743285 | 38 | FlaD | N | Putative archaeal flagellar protein D/E
1783 | 1743694 | 1743957 | 345684 | 345421 | 1171 | r-1 | 1743727 | 1743910 | 31 | RpoE | K | DNA-directed RNA polymerase specialized sigma subunits
1784 | 1743938 | 1744243 | 345440 | 345135 | 1880 | r-3 | 1744073 | 1744232 | 30 | SIR2 | H | NAD-dependent protein deacetylases
1785 | 1744245 | 1745591 | 345133 | 343787 | 1529 | r-2 | 1744263 | 1745559 | 346 | RAD55 | T | RecA-superfamily ATPases implicated in signal transduction
1786 | 1745650 | 1746300 | 343728 | 343078 | 286 | f-1 | 1745671 | 1746277 | 250 | | J | Predicted RNA methylase
1787 | 1746894 | 1747268 | 342484 | 342110 | 1029 | f-3 | 1746915 | 1747134 | 31 | | L | Superfamily I DNA and RNA helicases and helicase subunits
1788 | 1747308 | 1748660 | 342070 | 340718 | 1030 | f-3 | 1747314 | 1748610 | 504 | Sun | J | tRNA and rRNA cytosine-C5-methylases
1789 | 1749755 | 1749931 | 339623 | 339447 | 1879 | r-3 | 1749755 | 1749899 | 26 | TatC | N | Sec-independent protein secretion pathway component TatC
1790 | 1749900 | 1749992 | 339478 | 339386 | 1031 | f-3
1791 | 1750416 | 1751543 | 338962 | 337835 | 1528 | r-2 | 1750896 | 1751238 | 32 | CirA | P | Outer membrane receptor proteins
1792 | 1751717 | 1752793 | 337661 | 336585 | 1878 | r-3 | 1751852 | 1752785 | 449 | MscS | M | Small-conductance mechanosensitive channel
1793 | 1752795 | 1753493 | 336583 | 335885 | 1527 | r-2 | 1752852 | 1753491 | 155 | | S | Uncharacterized ACR
1794 | 1753468 | 1755291 | 335910 | 334087 | 1170 | r-1 | 1755019 | 1755211 | 37 | Mfd | LK | Transcription-repair coupling factor - superfamily II helicase COG1197 Mfd
1795 | 1755444 | 1756100 | 333934 | 333278 | 1526 | r-2 | 1755450 | 1756041 | 210 | | S | Uncharacterized ACR
1796 | 1756133 | 1756924 | 333245 | 332454 | 1877 | r-3 | 1756133 | 1756826 | 127 | | S | Uncharacterized ACR
1797 | 1757029 | 1757460 | 332349 | 331918 | 1169 | r-1 | 1757053 | 1757452 | 175 | | R | Uncharacterized proteins of PilT N-term./Vapc superfamily
1798 | 1757494 | 1758735 | 331884 | 330643 | 1168 | r-1 | 1757503 | 1758730 | 716 | TufB | JE | GTPases - translation elongation factors COG0050 TufB
1799 | 1758870 | 1758998 | 330508 | 330380 | 1525 | r-2
1800 | 1760394 | 1760735 | 328984 | 328643 | 1032 | f-3 | 1760619 | 1760721 | 27 | | L | Adenine-specific DNA methylase
1801 | 1762166 | 1762558 | 327212 | 326820 | 1876 | r-3 | 1762181 | 1762556 | 176 | RPS6A | J | Ribosomal protein S6E (S10)
1802 | 1762676 | 1762846 | 326702 | 326532 | 654 | f-2 | 1762772 | 1762844 | 27 | Kup | P | K+ transporter
1803 | 1762843 | 1763493 | 326535 | 325885 | 1167 | r-1 | 1763275 | 1763446 | 33 | Smc | D | Chromosome segregation ATPases
1804 | 1763590 | 1764141 | 325788 | 325237 | 287 | f-1 | 1763593 | 1764109 | 251 | | R | Predicted GTPases
1805 | 1764136 | 1764609 | 325242 | 324769 | 1166 | r-1 | 1764163 | 1764607 | 251 | Lrp | K | Transcriptional regulators
1806 | 1764704 | 1765804 | 324674 | 323574 | 655 | f-2 | 1764752 | 1765748 | 348 | | R | Predicted GTPase or GTP-binding protein
1807 | 1765840 | 1766682 | 323538 | 322696 | 288 | f-1 | 1765849 | 1766680 | 343 | UbiA | H | 4-hydroxybenzoate polyprenyltransferase
1808 | 1766679 | 1767068 | 322699 | 322310 | 1033 | f-3 | 1766814 | 1766988 | 29 | Arp | R | Ankyrin repeat proteins
1809 | 1767079 | 1767885 | 322299 | 321493 | 1165 | r-1 | 1767079 | 1767619 | 281 | | S | Uncharacterized ArCR
1810 | 1767919 | 1768269 | 321459 | 321109 | 1164 | r-1 | 1768081 | 1768183 | 32 | MukB | D | Uncharacterized protein involved in chromosome partitioning
1811 | 1768271 | 1769350 | 321107 | 320028 | 1875 | r-3 | 1768280 | 1769300 | 431 | | L | Replication factor A large subunit and related ssDNA-binding proteins
1812 | 1769469 | 1770143 | 319909 | 319235 | 1524 | r-2 | 1769559 | 1770099 | 308 | | K | Predicted transcriptional regulator containing the HTH domain
1813 | 1770892 | 1772169 | 318486 | 317209 | 289 | f-1 | 1770901 | 1772104 | 447 | SfcA | C | Malic enzyme
1814 | 1772144 | 1772719 | 317234 | 316659 | 1874 | r-3 | 1772201 | 1772711 | 199 | FumA | C | Tartrate dehydratase beta subunit/Fumarate hydratase class I
1815 | 1772653 | 1773303 | 316725 | 316075 | 1163 | r-1 | 1772680 | 1773301 | 226 | TtdA | C | Tartrate dehydratase alpha subunit/Fumarate hydratase class I
1816 | 1773571 | 1774485 | 315807 | 314893 | 1162 | r-1 | 1773571 | 1774483 | 523 | SerA | E | Phosphoglycerate dehydrogenase and related dehydrogenases
1817 | 1774489 | 1775145 | 314889 | 314233 | 1161 | r-1 | 1774504 | 1775140 | 266 | | P | Phosphate transport regulator (distant homolog of PhoU)
1818 | 1775139 | 1776068 | 314239 | 313310 | 1523 | r-2 | 1775139 | 1776039 | 357 | ApbA | H | Ketopantoate reductase
1819 | 1776073 | 1776540 | 313305 | 312838 | 1160 | r-1
1820 | 1776586 | 1777293 | 312792 | 312085 | 290 | f-1 | 1776589 | 1777270 | 186 | LasT | J | rRNA methylase
1821 | 1777281 | 1777811 | 312097 | 311567 | 1034 | f-3 | 1777287 | 1777806 | 173 | Ada | L | Methylated DNA-protein cysteine methyltransferase (O6 methylguanine DNA methyltransferase)
1822 | 1777799 | 1778830 | 311579 | 310548 | 656 | f-2 | 1777799 | 1778813 | 413 | NrfG | R | TPR-repeat-containing proteins
1823 | 1779069 | 1779554 | 310309 | 309824 | 1035 | f-3 | 1779219 | 1779549 | 131 | EGD2 | K | Transcription factor homologous to NACalpha-BTF3
1824 | 1779558 | 1779923 | 309820 | 309455 | 1522 | r-2 | 1779657 | 1779912 | 68 | | S | Uncharacterized ACR
1825 | 1779979 | 1781619 | 309399 | 307759 | 1159 | r-1 | 1780849 | 1781521 | 40 | NtpC | C | Archaeal/vacuolar-type H+-ATPase subunit C
1826 | 1781597 | 1782928 | 307781 | 306450 | 657 | f-2 | 1781600 | 1782872 | 573 | HflX | R | GTPases
1827 | 1782866 | 1783828 | 306512 | 305550 | 1873 | r-3 | 1782914 | 1783826 | 312 | PitA | P | Phosphate/sulphate permeases
1828 | 1784010 | 1784594 | 305368 | 304784 | 1036 | f-3 | 1784037 | 1784592 | 213 | PorG | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1829 | 1784774 | 1784953 | 304604 | 304425 | 658 | f-2 | 1784774 | 1784951 | 125 | | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1830 | 1784955 | 1786151 | 304423 | 303227 | 1037 | f-3 | 1784964 | 1786149 | 643 | PorA | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1831 | 1786148 | 1787092 | 303230 | 302286 | 659 | f-2 | 1786157 | 1787090 | 559 | PorB | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1832 | 1787147 | 1787473 | 302231 | 301905 | 660 | f-2 | 1787156 | 1787471 | 207 | | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1833 | 1787485 | 1788669 | 301893 | 300709 | 291 | f-1 | 1787485 | 1788664 | 609 | PorA | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1834 | 1788671 | 1789675 | 300707 | 299703 | 661 | f-2 | 1788677 | 1789673 | 537 | PorB | C | Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases
1835 | 1789714 | 1790697 | 299664 | 298681 | 292 | f-1 | 1790005 | 1790227 | 33 | MhpC | R | Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily)
1836 | 1790705 | 1791568 | 298673 | 297810 | 662 | f-2 | 1791065 | 1791434 | 32 | HemE | H | Uroporphyrinogen-III decarboxylase
1837 | 1791624 | 1791959 | 297754 | 297419 | 1038 | f-3 | 1791801 | 1791948 | 29 | RfaG | M | Predicted glycosyltransferases
1838 | 1791963 | 1792769 | 297415 | 296609 | 1039 | f-3 | 1792029 | 1792191 | 32 | MoaD | H | Molybdopterin converting factor
1839 | 1792792 | 1793328 | 296586 | 296050 | 293 | f-1 | 1792792 | 1793008 | 33 | Qor | CR | NADPH:quinone reductase and related Zn-dependent oxidoreductases COG0604 Qor
1840 | 1793325 | 1794524 | 296053 | 294854 | 1521 | r-2 | 1793325 | 1794519 | 702 | CsdB | E | Selenocysteine lyase
1841 | 1794521 | 1794823 | 294857 | 294555 | 1872 | r-3 | 1794566 | 1794758 | 35 | | S | Uncharacterized ACR
1842 | 1794964 | 1796124 | 294414 | 293254 | 294 | f-1 | 1794988 | 1796095 | 285
1843 | 1796129 | 1797154 | 293249 | 292224 | 1871 | r-3 | 1796147 | 1797152 | 553 | HypE | O | Hydrogenase maturation factor
1844 | 1797235 | 1797561 | 292143 | 291817 | 1158 | r-1 | 1797256 | 1797493 | 70 | | R | Predicted nucleotidyltransferases
1845 | 1797561 | 1797665 | 291817 | 291713 | 1520 | r-2 | 1797561 | 1797663 | 43 | VapC | R | Predicted nucleic acid-binding protein
1846 | 1797874 | 1798116 | 291504 | 291262 | 1157 | r-1
1847 | 1798158 | 1800545 | 291220 | 288833 | 1519 | r-2 | 1798227 | 1800540 | 1259 | HypF | O | Hydrogenase maturation factor
1848 | 1800686 | 1801306 | 288692 | 288072 | 1870 | r-3 | 1800704 | 1801031 | 33 | Eno | G | Enolase
1849 | 1801592 | 1802125 | 287786 | 287253 | 663 | f-2 | 1801595 | 1802084 | 187 | Ftn | P | Ferritin-like protein
1850 | 1802245 | 1803363 | 287133 | 286015 | 1156 | r-1 | 1802260 | 1803361 | 605 | HypD | O | Hydrogenase maturation factor
1851 | 1803363 | 1803602 | 286015 | 285776 | 1518 | r-2 | 1803375 | 1803597 | 108 | HypC | O | Hydrogenase maturation factor
1852 | 1803666 | 1804280 | 285712 | 285098 | 1040 | f-3 | 1803675 | 1804212 | 246 | MobA | H | Molybdopterin-guanine dinucleotide biosynthesis protein A
1853 | 1804317 | 1804535 | 285061 | 284843 | 1517 | r-2 | 1804335 | 1804389 | 25 | CelB | G | Phosphotransferase system cellobiose-specific component IIC
1854 | 1804571 | 1805047 | 284807 | 284331 | 1869 | r-3 | 1804607 | 1804994 | 135 | HyaD | C | Ni (Hydrogenase maturation factor)
1855 | 1805521 | 1805853 | 283857 | 283525 | 1155 | r-1 | 1805653 | 1805707 | 28 | SrmB | LKJ | Superfamily II DNA and RNA helicases COG0513 SrmB
1856 | 1805911 | 1806657 | 283467 | 282721 | 1154 | r-1 | 1805920 | 1806655 | 359 | Mrp | D | ATPases involved in chromosome partitioning (Hydrogenase maturation factor)
1857 | 1806654 | 1807073 | 282724 | 282305 | 1516 | r-2 | 1806654 | 1807068 | 204 | HybF | R | Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) (Hydrogenase maturation factor)
1858 | 1807161 | 1808084 | 282217 | 281294 | 1041 | f-3 | 1807203 | 1808076 | 384 | CzcD | P | Co/Zn/Cd efflux system component
1859 | 1808249 | 1808404 | 281129 | 280974 | 664 | f-2 | 1808249 | 1808387 | 80 | | S | Uncharacterized ACR
1860 | 1808394 | 1808819 | 280984 | 280559 | 1515 | r-2 | 1808403 | 1808814 | 190 | | C | Ferredoxin 3
1861 | 1808985 | 1811618 | 280393 | 277760 | 1042 | f-3 | 1810719 | 1811187 | 32 | ArtI | E | ABC-type amino acid transport system
1862 | 1811744 | 1812487 | 277634 | 276891 | 665 | f-2 | 1811753 | 1812473 | 339 | | R | Predicted permeases
1863 | 1812518 | 1813510 | 276860 | 275868 | 1868 | r-3 | 1812518 | 1813508 | 476 | TehA | P | Tellurite resistance protein and related permeases
1864 | 1813353 | 1813550 | 276025 | 275828 | 1043 | f-3 | 1813368 | 1813533 | 29 | ZntA | P | Cation transport ATPases
1865 | 1813638 | 1814054 | 275740 | 275324 | 1514 | r-2 | 1813665 | 1814004 | 163 | | S | Uncharacterized ACR
1866 | 1814141 | 1814644 | 275237 | 274734 | 1867 | r-3 | 1814216 | 1814633 | 227 | | S | Uncharacterized ACR
1867 | 1814559 | 1814648 | 274819 | 274730 | 1044 | f-3
1868 | 1814829 | 1815962 | 274549 | 273416 | 1045 | f-3 | 1814829 | 1815960 | 486 | FecB | P | ABC-type Fe3+-siderophores transport systems
1869 | 1815959 | 1817002 | 273419 | 272376 | 666 | f-2 | 1815974 | 1816997 | 415 | BtuC | PH | ABC-type cobalamin/Fe3+-siderophores transport systems
1870 | 1816999 | 1817745 | 272379 | 271633 | 295 | f-1 | 1817017 | 1817737 | 273 | FepC | PH | ABC-type cobalamin/Fe3+-siderophores transport systems
1871 | 1817756 | 1818715 | 271622 | 270663 | 667 | f-2 | 1817828 | 1818653 | 497 | Mrp | D | ATPases involved in chromosome partitioning
1872 | 1819570 | 1819776 | 269808 | 269602 | 1153 | r-1 | 1819570 | 1819675 | 30 | | S | Uncharacterized BCR
1873 | 1820187 | 1820936 | 269191 | 268442 | 1513 | r-2 | 1820226 | 1820424 | 35 | XerC | L | Integrase
1874 | 1820961 | 1821659 | 268417 | 267719 | 1512 | r-2 | 1821201 | 1821552 | 171 | TFA1 | K | Transcription initiation factor IIE
1875 | 1821659 | 1821841 | 267719 | 267537 | 1866 | r-3 | 1821659 | 1821827 | 32 | DnaG | L | DNA primase (bacterial type)
1876 | 1822105 | 1823073 | 267273 | 266305 | 296 | f-1 | 1822105 | 1823071 | 471 | CcmA | Q | ABC-type multidrug transport system
1877 | 1823702 | 1823782 | 265676 | 265596 | 1865 | r-3
1878 | 1823857 | 1824675 | 265521 | 264703 | 297 | f-1 | 1823857 | 1824673 | 314 | | R | ABC-type multidrug transport system
1879 | 1824662 | 1825624 | 264716 | 263754 | 1864 | r-3 | 1824740 | 1825610 | 353 | RbsK | G | Sugar kinases
1880 | 1825648 | 1826151 | 263730 | 263227 | 298 | f-1 | 1825648 | 1826035 | 153 | | E | Predicted regulator of amino acid metabolism (contains the ACT domain)
1881 | 1826226 | 1826504 | 263152 | 262874 | 1511 | r-2 | 1826229 | 1826502 | 167 | AcyP | C | Acylphosphatases
1882 | 1826572 | 1826886 | 262806 | 262492 | 299 | f-1 | 1826581 | 1826866 | 147 | CutA | P | Uncharacterized protein involved in tolerance to divalent cations
1883 | 1826859 | 1827470 | 262519 | 261908 | 1046 | f-3 | 1826883 | 1827468 | 267 | | S | Uncharacterized ACR
1884 | 1827563 | 1828408 | 261815 | 260970 | 1863 | r-3 | 1827782 | 1828406 | 229 | UspA | T | Universal stress protein UspA and related nucleotide-binding proteins
1885 | 1828493 | 1829698 | 260885 | 259680 | 668 | f-2 | 1828493 | 1829693 | 715 | GcvT | E | Glycine cleavage system T protein (aminomethyltransferase)
1886 | 1829731 | 1830558 | 259647 | 258820 | 300 | f-1 | 1829740 | 1830544 | 264 | RhaT | GER | Permeases of the drug/metabolite transporter (DMT) superfamily COG0697 RhaT
1887 | 1830621 | 1831115 | 258757 | 258263 | 1510 | r-2 | 1830621 | 1831113 | 183 | LepB | N | Signal peptidase I
1888 | 1831076 | 1831645 | 258302 | 257733 | 1862 | r-3 | 1831085 | 1831622 | 216 | | S | Uncharacterized ArCR
1889 | 1831699 | 1832772 | 257679 | 256606 | 301 | f-1 | 1831702 | 1832746 | 182 | NrfG | R | TPR-repeat-containing proteins
1890 | 1832777 | 1833709 | 256601 | 255669 | 669 | f-2 | 1832777 | 1833704 | 455 | | E | Zn-dependent dipeptidase
1891 | 1833706 | 1834158 | 255672 | 255220 | 1152 | r-1 | 1833727 | 1834135 | 32 | | K | Predicted transcriptional regulators
1892 | 1834155 | 1834856 | 255223 | 254522 | 1509 | r-2 | 1834173 | 1834839 | 282 | RAD55 | T | RecA-superfamily ATPases implicated in signal transduction
1893 | 1834992 | 1835603 | 254386 | 253775 | 1047 | f-3 | 1835103 | 1835448 | 32 | EntF | Q | Non-ribosomal peptide synthetase modules and related proteins
1894 | 1835581 | 1836201 | 253797 | 253177 | 302 | f-1 | 1835662 | 1835971 | 31 | MalG | G | Sugar permeases
1895 | 1836239 | 1837111 | 253139 | 252267 | 670 | f-2 | 1836248 | 1837079 | 383 | | R | Predicted archaeal methyltransferase
1896 | 1837108 | 1838508 | 252270 | 250870 | 1151 | r-1 | 1838029 | 1838308 | 37 | | S | Uncharacterized ACR
1897 | 1838515 | 1839846 | 250863 | 249532 | 1150 | r-1 | 1838542 | 1839790 | 132 | | S | Predicted membrane protein
1898 | 1839843 | 1842821 | 249535 | 246557 | 1508 | r-2 | 1840932 | 1841325 | 43 | | L | Micrococcal nuclease (thermonuclease) homologs
1899 | 1842996 | 1844864 | 246382 | 244514 | 1507 | r-2 | 1842996 | 1844859 | 1005 | DAP2 | E | Dipeptidyl aminopeptidases/acylaminoacyl-peptidases
1900 | 1844947 | 1845273 | 244431 | 244105 | 303 | f-1 | 1845022 | 1845157 | 33 | RhaT | GER | Permeases of the drug/metabolite transporter (DMT) superfamily COG0697 RhaT
1901 | 1845241 | 1845942 | 244137 | 243436 | 1149 | r-1 | 1845325 | 1845895 | 161 | | T | Predicted Ser/Thr protein kinase
1902 | 1845932 | 1846168 | 243446 | 243210 | 671 | f-2 | 1845941 | 1846166 | 142 | Lrp | K | Transcriptional regulators
1903 | 1846267 | 1847184 | 243111 | 242194 | 1148 | r-1 | 1846267 | 1847173 | 317 | CcmA | Q | ABC-type multidrug transport system
1904 | 1847191 | 1848111 | 242187 | 241267 | 1147 | r-1 | 1847221 | 1847701 | 73 | NosY | R | ABC-type transport system involved in multi-copper enzyme maturation
1905 | 1848117 | 1849664 | 241261 | 239714 | 1506 | r-2 | 1849086 | 1849260 | 38 | NosY | R | ABC-type transport system involved in multi-copper enzyme maturation
1906 | 1853437 | 1853742 | 235941 | 235636 | 1146 | r-1 | 1853590 | 1853701 | 36 | Lon | O | ATP-dependent Lon protease
1907 | 1853826 | 1853894 | 235552 | 235484 | 1048 | f-3
1908 | 1853933 | 1854607 | 235445 | 234771 | 1861 | r-3 | 1853933 | 1854602 | 294 | | P | Phosphate transport regulator (distant homolog of PhoU)
1909 | 1854612 | 1855832 | 234766 | 233546 | 1505 | r-2 | 1854621 | 1855830 | 596 | PitA | P | Phosphate/sulphate permeases
1910 | 1855928 | 1857586 | 233450 | 231792 | 1860 | r-3 | 1856972 | 1857395 | 47 | Icc | R | Predicted phosphohydrolases
1911 | 1857656 | 1858012 | 231722 | 231366 | 672 | f-2 | 1857656 | 1857998 | 178 | | S | Uncharacterized ACR
1912 | 1858017 | 1859300 | 231361 | 230078 | 1504 | r-2 | 1858017 | 1859286 | 652 | MiaB | J | 2-methylthioadenine synthetase
1913 | 1859380 | 1859607 | 229998 | 229771 | 1145 | r-1 | 1859389 | 1859596 | 64 | | S | Uncharacterized ArCR
1914 | 1859695 | 1860141 | 229683 | 229237 | 1144 | r-1 | 1859701 | 1860133 | 179 | HyaD | C | Ni (Hydrogenase maturation factor)
1915 | 1860556 | 1860741 | 228822 | 228637 | 1143 | r-1
1916 | 1860814 | 1862100 | 228564 | 227278 | 1142 | r-1 | 1860814 | 1862098 | 674 | | C | Coenzyme F420-reducing hydrogenase (hydrogenase subunit)
1917 | 1862097 | 1862900 | 227281 | 226478 | 1503 | r-2 | 1862118 | 1862898 | 438 | | C | Coenzyme F420-reducing hydrogenase (hydrogenase subunit)
1918 | 1862902 | 1863786 | 226476 | 225592 | 1141 | r-1 | 1862908 | 1863784 | 571 | UbiB | HC | 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases COG0543 UbiB (hydrogenase subunit)
1919 | 1863783 | 1864895 | 225595 | 224483 | 1502 | r-2 | 1863783 | 1864887 | 705 | NapF | C | Ferredoxin 2 (hydrogenase subunit)
1920 | 1865656 | 1866711 | 223722 | 222667 | 304 | f-1 | 1865683 | 1866691 | 263 | GltD | ER | NADPH-dependent glutamate synthase beta chain and related oxidoreductases COG0493 GltD
1921 | 1866693 | 1867223 | 222685 | 222155 | 1049 | f-3 | 1866717 | 1867119 | 156 | HybA | C | Fe-S-cluster-containing hydrogenase components 1
1922 | 1867473 | 1868666 | 221905 | 220712 | 1050 | f-3 | 1867578 | 1868649 | 350 | BisC | C | Anaerobic dehydrogenases (formate dehydrogenase)
1923 | 1868696 | 1869637 | 220682 | 219741 | 673 | f-2 | 1868696 | 1869554 | 303 | BisC | C | Anaerobic dehydrogenases (formate dehydrogenase)
1924 | 1869643 | 1870143 | 219735 | 219235 | 305 | f-1 | 1869652 | 1870060 | 172 | HybA | C | Fe-S-cluster-containing hydrogenase components 1 (formate dehydrogenase)
1925 | 1870833 | 1871861 | 218545 | 217517 | 1051 | f-3 | 1871043 | 1871682 | 145 | FocA | P | Formate/nitrite family of transporters (formate dehydrogenase)
1926 | 1872015 | 1872557 | 217363 | 216821 | 1052 | f-3 | 1872054 | 1872555 | 286 | MnhE | P | Multisubunit Na+/H+ antiporter
1927 | 1872533 | 1872811 | 216845 | 216567 | 674 | f-2 | 1872563 | 1872809 | 128 | MnhF | P | Multisubunit Na+/H+ antiporter
1928 | 1872808 | 1873179 | 216570 | 216199 | 306 | f-1 | 1872817 | 1873159 | 172 | MnhG | P | Multisubunit Na+/H+ antiporter
1929 | 1873176 | 1873442 | 216202 | 215936 | 1053 | f-3 | 1873251 | 1873440 | 35 | | P | Predicted subunit of the Multisubunit Na+/H+ antiporter
1930 | 1873439 | 1873735 | 215939 | 215643 | 675 | f-2 | 1873439 | 1873733 | 66 | MnhB | P | Multisubunit Na+/H+ antiporter
1931 | 1873732 | 1874181 | 215646 | 215197 | 307 | f-1 | 1873741 | 1874176 | 199 | MnhB | P | Multisubunit Na+/H+ antiporter
1932 | 1874169 | 1874537 | 215209 | 214841 | 1054 | f-3 | 1874178 | 1874535 | 167 | MnhC | P | Multisubunit Na+/H+ antiporter
1933 | 1874534 | 1876078 | 214844 | 213300 | 676 | f-2 | 1874546 | 1876073 | 720 | HyfB | CP | Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter
1934 | 1876071 | 1876427 | 213307 | 212951 | 1055 | f-3 | 1876080 | 1876188 | 30 | WcaJ | M | Sugar transferases involved in lipopolysaccharide synthesis
1935 | 1876465 | 1876995 | 212913 | 212383 | 308 | f-1 | 1876465 | 1876993 | 309 | | C | Ni
1936 | 1876992 | 1877561 | 212386 | 211817 | 1056 | f-3 | 1877043 | 1877556 | 248 | HycE | C | Ni
1937 | 1877558 | 1878838 | 211820 | 210540 | 677 | f-2 | 1877567 | 1878836 | 699 | HycE | C | Ni
1938 | 1878843 | 1879835 | 210535 | 209543 | 1057 | f-3 | 1878861 | 1879833 | 389 | HyfC | C | Formate hydrogenlyase subunit 4
1939 | 1879832 | 1880263 | 209546 | 209115 | 678 | f-2 | 1879847 | 1880195 | 198 | NuoI | C | Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I)
1940 | 1880264 | 1880797 | 209114 | 208581 | 1859 | r-3 | 1880270 | 1880729 | 91 | | S | Uncharacterized ACR
1941 | 1880784 | 1881278 | 208594 | 208100 | 1501 | r-2 | 1880790 | 1881246 | 85 | | S | Uncharacterized ACR
1942 | 1881271 | 1881759 | 208107 | 207619 | 1140 | r-1 | 1881289 | 1881745 | 103 | | S | Uncharacterized ACR
1943 | 1881790 | 1882272 | 207588 | 207106 | 1139 | r-1 | 1881790 | 1882261 | 149 | | S | Uncharacterized protein sharing a conserved domain with thiamine biosynthesis protein ThiI
1944 | 1882334 | 1883542 | 207044 | 205836 | 679 | f-2 | 1882352 | 1883525 | 602 | HolB | L | ATPase involved in DNA replication
1945 | 1883543 | 1884076 | 205835 | 205302 | 680 | f-2 | 1883549 | 1884074 | 176 | | R | Predicted membrane-bound metal-dependent hydrolases
1946 | 1884157 | 1885149 | 205221 | 204229 | 309 | f-1 | 1884157 | 1885144 | 503 | TrxB | O | Thioredoxin reductase
1947 | 1885281 | 1886627 | 204097 | 202751 | 1058 | f-3 | 1885290 | 1886607 | 544 | ArgD | E | PLP-dependent aminotransferases
1948 | 1886671 | 1887270 | 202707 | 202108 | 310 | f-1 | 1886914 | 1886980 | 30 | NarK | P | Nitrate/nitrite transporter
1949 | 1887267 | 1887560 | 202111 | 201818 | 1500 | r-2 | 1887291 | 1887549 | 33 | | R | Predicted RNA-binding proteins
1950 | 1887544 | 1888218 | 201834 | 201160 | 1138 | r-1 | 1887553 | 1888216 | 254 | DeoC | F | Deoxyribose-phosphate aldolase
1951 | 1888724 | 1890025 | 200654 | 199353 | 681 | f-2 | 1888727 | 1890020 | 724 | Eno | G | Enolase
1952 | 1890006 | 1890557 | 199372 | 198821 | 1499 | r-2 | 1890105 | 1890522 | 58 | | K | Predicted transcriptional regulators
1953 | 1890634 | 1894026 | 198744 | 195352 | 311 | f-1 | 1891621 | 1893961 | 221 | | R | Predicted drug exporters of the RND superfamily
1954 | 1894318 | 1894365 | 195060 | 195013 | 312 | f-1
1955 | 1894442 | 1895158 | 194936 | 194220 | 682 | f-2 | 1894442 | 1895156 | 386 | | S | Uncharacterized ACR
1956 | 1895222 | 1895692 | 194156 | 193686 | 1858 | r-3 | 1895252 | 1895690 | 245 | Lrp | K | Transcriptional regulators
1957 | 1895730 | 1896284 | 193648 | 193094 | 1498 | r-2 | 1895730 | 1896279 | 270 | | F | Xanthosine triphosphate pyrophosphatase
1958 | 1896330 | 1896818 | 193048 | 192560 | 1497 | r-2 | 1896330 | 1896813 | 298 | | S | Uncharacterized ACR
1959 | 1896886 | 1897806 | 192492 | 191572 | 313 | f-1 | 1896895 | 1897795 | 332 | Lrp | K | Transcriptional regulators
1960 | 1897803 | 1898744 | 191575 | 190634 | 1496 | r-2 | 1897803 | 1898718 | 293 | | R | Predicted Fe-S oxidoreductases
1961 | 1898830 | 1899255 | 190548 | 190123 | 1137 | r-1 | 1898833 | 1899241 | 162 | MoaE | H | Molybdopterin converting factor
1962 | 1899309 | 1900178 | 190069 | 189200 | 1059 | f-3 | 1899738 | 1899900 | 33 | Acs | I | Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases
1963 | 1900171 | 1900881 | 189207 | 188497 | 1136 | r-1 | 1900183 | 1900876 | 335 | ThiF | H | Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2
1964 | 1901205 | 1901720 | 188173 | 187658 | 1495 | r-2 | 1901214 | 1901718 | 248 | CdsA | I | CDP-diglyceride synthetase
1965 | 1901783 | 1902706 | 187595 | 186672 | 683 | f-2 | 1901933 | 1902416 | 32 | BaeS | T | Sensory transduction histidine kinases
1966 | 1902746 | 1903273 | 186632 | 186105 | 684 | f-2 | 1902941 | 1903163 | 32 | | R | Predicted methyltransferase
1967 | 1903277 | 1904434 | 186101 | 184944 | 685 | f-2 | 1903283 | 1904432 | 596 | Sun | J | tRNA and rRNA cytosine-C5-methylases
1968 | 1904431 | 1905462 | 184947 | 183916 | 314 | f-1 | 1904446 | 1905403 | 212 | | R | Predicted integral membrane protein
1969 | 1905501 | 1906337 | 183877 | 183041 | 1060 | f-3 | 1905501 | 1906332 | 397 | | R | Predicted kinase
1970 | 1906334 | 1907098 | 183044 | 182280 | 1857 | r-3 | 1906616 | 1906817 | 32 | AcrR | K | Transcriptional regulator
1971 | 1907089 | 1908066 | 182289 | 181312 | 1135 | r-1 | 1907089 | 1908061 | 538 | QRI7 | O | Metal-dependent proteases with possible chaperone activity
1972 | 1908127 | 1909461 | 181251 | 179917 | 1134 | r-1 | 1908145 | 1909459 | 683 | | C | Acyl-CoA synthetase (NDP forming)
1973 | 1909517 | 1910014 | 179861 | 179364 | 686 | f-2 | 1909526 | 1909982 | 250 | | R | Predicted nucleotidyltransferase
1974 | 1910023 | 1910727 | 179355 | 178651 | 315 | f-1 | 1910053 | 1910725 | 372 | TpiA | G | Triosephosphate isomerase
1975 | 1912010 | 1912546 | 177368 | 176832 | 687 | f-2 | 1912019 | 1912544 | 278 | BtuR | H | ATP:corrinoid adenosyltransferase
1976 | 1912651 | 1912902 | 176727 | 176476 | 316 | f-1 | 1912651 | 1912900 | 138 | | S | Uncharacterized ArCR
1977 | 1912921 | 1913589 | 176457 | 175789 | 1133 | r-1 | 1913035 | 1913575 | 240 | AraD | G | Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
1978 | 1913472 | 1914050 | 175906 | 175328 | 1494 | r-2 | 1913595 | 1913922 | 33 | RplV | J | Ribosomal protein L22
1979 | 1914387 | 1914812 | 174991 | 174566 | 1493 | r-2 | 1914387 | 1914810 | 226 | Lrp | K | Transcriptional regulators
1980 | 1914882 | 1916204 | 174496 | 173174 | 1492 | r-2 | 1914954 | 1916193 | 541 | TrmA | J | SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase
1981 | 1916252 | 1916479 | 173126 | 172899 | 688 | f-2 | 1916282 | 1916402 | 28 | MarR | K | Transcriptional regulators
1982 | 1916521 | 1917351 | 172857 | 172027 | 317 | f-1 | 1916572 | 1917262 | 240 | | D | ATPases involved in chromosome partitioning
1983 | 1917310 | 1917879 | 172068 | 171499 | 1132 | r-1 | 1917334 | 1917847 | 221 | PyrE | F | Orotate phosphoribosyltransferase
1984 | 1918215 | 1918709 | 171163 | 170669 | 1061 | f-3 | 1918230 | 1918401 | 32 | | R | Predicted metal-dependent membrane protease
1985 | 1918693 | 1920390 | 170685 | 168988 | 1131 | r-1 | 1918711 | 1920385 | 880 | CDC9 | L | ATP-dependent DNA ligase
1986 | 1920429 | 1921331 | 168949 | 168047 | 1491 | r-2 | 1920429 | 1921329 | 375 | | R | Predicted archaeal kinases of the sugar kinase superfamily
1987 | 1921407 | 1923065 | 167971 | 166313 | 1490 | r-2 | 1921407 | 1923051 | 700 | NhaC | C | Na+/H+ antiporter
1988 | 1923377 | 1923970 | 166001 | 165408 | 1856 | r-3 | 1923425 | 1923968 | 301 | | L | Uracil-DNA glycosylase
1989 | 1923967 | 1924317 | 165411 | 165061 | 1130 | r-1 | 1924060 | 1924255 | 31 | Spo0J | K | Predicted transcriptional regulators
1990 | 1924478 | 1926250 | 164900 | 163128 | 689 | f-2 | 1924478 | 1926233 | 1040 | | R | Predicted Fe-S oxidoreductases
1991 | 1926252 | 1926566 | 163126 | 162812 | 1062 | f-3 | 1926297 | 1926447 | 28 | LysR | K | Transcriptional regulator
1992 | 1926707 | 1929025 | 162671 | 160353 | 690 | f-2 | 1926872 | 1929020 | 723 | Tar | N | Methyl-accepting chemotaxis protein
1993 | 1929037 | 1930491 | 160341 | 158887 | 1129 | r-1 | 1930174 | 1930438 | 30 | LysU | J | Lysyl-tRNA synthetase class II
1994 | 1930573 | 1930920 | 158805 | 158458 | 318 | f-1 | 1930582 | 1930909 | 125 | | R | Putative effector of murein hydrolase LrgA
1995 | 1930917 | 1931588 | 158461 | 157790 | 1063 | f-3 | 1930917 | 1931586 | 258 | LrgB | M | Putative effector of murein hydrolase
1996 | 1931535 | 1932002 | 157843 | 157376 | 1489 | r-2 | 1931541 | 1931976 | 224 | | S | Uncharacterized ArCR
1997 | 1932193 | 1932927 | 157185 | 156451 | 319 | f-1 | 1932292 | 1932925 | 325 | | S | Uncharacterized ACR
1998 | 1932928 | 1933236 | 156450 | 156142 | 1128 | r-1 | 1932997 | 1933207 | 32 | PheS | J | Phenylalanyl-tRNA synthetase alpha subunit
1999 | 1933306 | 1933578 | 156072 | 155800 | 320 | f-1 | 1933306 | 1933561 | 93 | | S | Uncharacterized ACR
2000 | 1933671 | 1934051 | 155707 | 155327 | 1064 | f-3 | 1933671 | 1934034 | 98 | | R | Predicted nucleic acid-binding protein
2001 | 1934029 | 1935735 | 155349 | 153643 | 1127 | r-1 | 1934029 | 1935685 | 764 | | J | Queuine tRNA-ribosyltransferases
2002 | 1935745 | 1936650 | 153633 | 152728 | 1126 | r-1 | 1935754 | 1936648 | 433 | | S | Uncharacterized archaeal coiled-coil domain
2003 | 1936888 | 1937835 | 152490 | 151543 | 1125 | r-1 | 1936891 | 1937824 | 501 | ArcC | B | Carbamate kinase
2004 | 1937965 | 1939305 | 151413 | 150073 | 1124 | r-1 | 1938043 | 1939021 | 52 | HemY | H | Protoporphyrinogen oxidase
2005 | 1941378 | 1941863 | 148000 | 147515 | 1065 | f-3 | 1941390 | 1941849 | 78 | CcmA | Q | ABC-type multidrug transport system
2006 | 1942184 | 1942507 | 147194 | 146871 | 691 | f-2 | 1942184 | 1942454 | 32 | CstA | T | Carbon starvation protein
2007 | 1942618 | 1944576 | 146760 | 144802 | 1123 | r-1 | 1942618 | 1944571 | 1032 | | C | Aldehyde:ferredoxin oxidoreductase
2008 | 1944729 | 1945865 | 144649 | 143513 | 1488 | r-2 | 1944729 | 1945863 | 697 | | S | Fructose 1,6-bisphosphatase
2009 | 1945993 | 1946349 | 143385 | 143029 | 1122 | r-1 | 1946074 | 1946263 | 31 | BglX | G | Beta-glucosidase-related glycosidases
2010 | 1947328 | 1948446 | 142050 | 140932 | 321 | f-1 | 1947346 | 1948276 | 98 | ArgE | E | Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
2011 | 1948368 | 1949834 | 141010 | 139544 | 1066 | f-3 | 1949061 | 1949766 | 320 | CysH | EH | 3′-phosphoadenosine 5′-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes COG0175 CysH
2012 | 1949788 | 1951875 | 139590 | 137503 | 1121 | r-1 | 1949938 | 1951828 | 691 | | R | Archaeal serine proteases
2013 | 1951825 | 1953192 | 137553 | 136186 | 322 | f-1 | 1951831 | 1953190 | 555 | TldD | R | Predicted Zn-dependent proteases and their inactivated homologs
2014 | 1953189 | 1954478 | 136189 | 134900 | 1067 | f-3 | 1953189 | 1954458 | 345 | TldD | R | Predicted Zn-dependent proteases and their inactivated homologs
2015 | 1954540 | 1955208 | 134838 | 134170 | 323 | f-1 | 1954828 | 1955083 | 30 | PPX1 | C | Inorganic pyrophosphatase/exopolyphosphatase
2016 | 1955253 | 1957394 | 134125 | 131984 | 1068 | f-3 | 1955337 | 1957014 | 271 | AmyA | G | Glycosidases (cyclodextrin glucanotransferase)
2017 | 1957397 | 1958206 | 131981 | 131172 | 1855 | r-3 | 1957754 | 1958027 | 31 | AlsD | H | Glutamate-1-semialdehyde aminotransferase
2018 | 1958454 | 1958975 | 130924 | 130403 | 1487 | r-2 | 1958538 | 1958862 | 29 | ELP3 | K | ELP3 component of the RNA polymerase II complex
2019 | 1959384 | 1959980 | 129994 | 129398 | 1486 | r-2 | 1959423 | 1959549 | 29 | GCD1 | MJ | Nucleoside-diphosphate-sugar pyrophosphorylases involved in lipopolysaccharide biosynthesis/translation initiation factor eIF2B subunits COG1208 GCD1
2020 | 1959997 | 1960209 | 129381 | 129169 | 1120 | r-1 | 1960015 | 1960108 | 26 | Smc | D | Chromosome segregation ATPases
2021 | 1961911 | 1965690 | 127467 | 123688 | 1119 | r-1 | 1963837 | 1964131 | 36 | RluA | J | Pseudouridylate synthases
2022 | 1962226 | 1962360 | 127152 | 127018 | 324 | f-1 | 1962229 | 1962334 | 28
2023 | 1964567 | 1964629 | 124811 | 124749 | 692 | f-2
2024 | 1965873 | 1966658 | 123505 | 122720 | 1069 | f-3 | 1965879 | 1966644 | 381 | SgcQ | R | Predicted TIM-barrel enzyme
2025 | 1966899 | 1969403 | 122479 | 119975 | 1070 | f-3 | 1968654 | 1968987 | 35 | RecB | L | ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains)
2026 | 1969396 | 1970652 | 119982 | 118726 | 325 | f-1 | 1969603 | 1969909 | 35 | AprE | O | Subtilisin-like serine proteases
2027 | 1970804 | 1971262 | 118574 | 118116 | 693 | f-2 | 1970918 | 1971155 | 40 | MazG | R | Predicted pyrophosphatase
2028 | 1971328 | 1971672 | 118050 | 117706 | 326 | f-1 | 1971481 | 1971613 | 37 | IlvE | EH | Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase COG0115 IlvE
2029 | 1971682 | 1972395 | 117696 | 116983 | 327 | f-1 | 1971904 | 1972216 | 32 | MetG | J | Methionyl-tRNA synthetase
2030 | 1972493 | 1973851 | 116885 | 115527 | 694 | f-2 | 1972502 | 1973849 | 709 | CpsG | G | Phosphomannomutase
2031 | 1974299 | 1975357 | 115079 | 114021 | 1854 | r-3 | 1975178 | 1975346 | 32 | | S | Uncharacterized BCR
2032 | 1975695 | 1977017 | 113683 | 112361 | 1071 | f-3 | 1975734 | 1976082 | 30 | BacA | S | Uncharacterized ACR
2033 | 1976971 | 1977399 | 112407 | 111979 | 1118 | r-1 | 1977055 | 1977343 | 31
2034 | 1977396 | 1977704 | 111982 | 111674 | 1485 | r-2 | 1977402 | 1977678 | 118 | ArsR | K | Predicted transcriptional regulators
2035 | 1977819 | 1978400 | 111559 | 110978 | 1484 | r-2 | 1977819 | 1978377 | 218 | | S | Uncharacterized ACR
2036 | 1978397 | 1978993 | 110981 | 110385 | 1853 | r-3 | 1978406 | 1978982 | 263 | CoaE | H | Dephospho-CoA kinase
2037 | 1978966 | 1979769 | 110412 | 109609 | 1117 | r-1 | 1978966 | 1979275 | 76 | | R | Uncharacterized ATPases of the PP-loop superfamily
2038 | 1979866 | 1980489 | 109512 | 108889 | 328 | f-1 | 1979929 | 1980376 | 134 | | S | Uncharacterized membrane protein
2039 | 1980484 | 1980942 | 108894 | 108436 | 1116 | r-1 | 1980496 | 1980937 | 229 | PyrI | F | Aspartate carbamoyltransferase regulatory subunit
2040 | 1980946 | 1981878 | 108432 | 107500 | 1115 | r-1 | 1980946 | 1981843 | 487 | PyrB | F | Aspartate carbamoyltransferase
2041 | 1981986 | 1982897 | 107392 | 106481 | 1072 | f-3 | 1982367 | 1982880 | 159 | | S | Uncharacterized ACR
2042 | 1982894 | 1983307 | 106484 | 106071 | 695 | f-2 | 1982894 | 1983305 | 193 | | S | Uncharacterized ACR
2043 | 1983573 | 1984325 | 105805 | 105053 | 1483 | r-2 | 1983972 | 1984284 | 35 | | R | Predicted metal-binding domain (associated with helicases in Pyrococcus and Mtub)
2044 | 1984369 | 1985724 | 105009 | 103654 | 1114 | r-1 | 1984369 | 1985722 | 822 | | S | Uncharacterized ACR
2045 | 1985942 | 1987522 | 103436 | 101856 | 696 | f-2 | 1986548 | 1986680 | 33 | TehA | P | Tellurite resistance protein and related permeases
2046 | 1987535 | 1988848 | 101843 | 100530 | 1852 | r-3 | 1987562 | 1988771 | 205 | | R | Uncharacterized ATPases of the AAA superfamily
2047 | 1988883 | 1989671 | 100495 | 99707 | 1482 | r-2 | 1988907 | 1989048 | 30 | CpsG | G | Phosphomannomutase
2048 | 1989712 | 1990701 | 99666 | 98677 | 1113 | r-1 | 1990111 | 1990264 | 30 | | R | ATPase components of various ABC-type transport systems
2049 | 1991043 | 1992029 | 98335 | 97349 | 1481 | r-2 | 1991049 | 1991937 | 223 | ThrC | E | Threonine synthase
2050 | 1992178 | 1993323 | 97200 | 96055 | 1112 | r-1 | 1992334 | 1992553 | 32 | | F | Deoxyguanosine/deoxyadenosine kinase
2051 | 1993320 | 1993928 | 96058 | 95450 | 1480 | r-2 | 1993362 | 1993914 | 320 | HslV | O | Proteasome protease subunit
2052 | 1993956 | 1994684 | 95422 | 94694 | 1479 | r-2 | 1993974 | 1994667 | 297 | FepC | PH | ABC-type cobalamin/Fe3+-siderophores transport systems
2053 | 1994681 | 1995694 | 94697 | 93684 | 1851 | r-3 | 1994681 | 1995686 | 301 | BtuC | PH | ABC-type cobalamin/Fe3+-siderophores transport systems
2054 | 1995731 | 1997062 | 93647 | 92316 | 1850 | r-3 | 1995761 | 1997033 | 280 | | S | Uncharacterized ArCR
2055 | 1997062 | 1999713 | 92316 | 89665 | 1111 | r-1 | 1997062 | 1998448 | 284 | SbcC | L | ATPase involved in DNA repair
2056 | 1999710 | 2001092 | 89668 | 88286 | 1478 | r-2 | 1999710 | 2000895 | 354 | SbcD | L | DNA repair exonuclease
2057 | 2001233 | 2003020 | 88145 | 86358 | 1849 | r-3 | 2001272 | 2002916 | 595 | | R | Predicted ATPase
2058 | 2003136 | 2003711 | 86242 | 85667 | 1073 | f-3 | 2003229 | 2003700 | 233 | RimI | R | Acetyltransferases
2059 | 2003696 | 2004217 | 85682 | 85161 | 697 | f-2 | 2003705 | 2004215 | 243 | SEN2 | J | tRNA splicing endonuclease
2060 | 2004220 | 2004576 | 85158 | 84802 | 1110 | r-1 | 2004421 | 2004565 | 31 | AcrR | K | Transcriptional regulator
2061 | 2004890 | 2004943 | 84488 | 84435 | 698 | f-2
2062 | 2005188 | 2006615 | 84190 | 82763 | 1477 | r-2 | 2005419 | 2006613 | 699 | BioF | H | 7-keto-8-aminopelargonate synthetase and related enzymes
2063 | 2006536 | 2009136 | 82842 | 80242 | 329 | f-1 | 2006722 | 2008342 | 773 | | L | Inteins
2064 | 2009133 | 2010641 | 80245 | 78737 | 1074 | f-3 | 2009142 | 2010378 | 666 | HolB | L | ATPase involved in DNA replication
2065 | 2010697 | 2012013 | 78681 | 77365 | 330 | f-1 | 2010787 | 2011984 | 213 | | R | Uncharacterized ATPases of the AAA superfamily
2066 | 2012072 | 2012314 | 77306 | 77064 | 699 | f-2 | 2012099 | 2012246 | 34 | | S | Uncharacterized ACR
2067 | 2012311 | 2012514 | 77067 | 76864 | 1109 | r-1 | 2012377 | 2012512 | 66 | | R | Predicted ATPase of the 
AAAsuperfamily20682012712201357276666758061476r-220128142013549133RPredicted ATPase of the AAAsuperfamily20692013609201466175769747171475r-220136482014656423RPredicted methyltransferase20702014525201556874853738101108r-12014672201482230NosYRABC-type transport systeminvolved in multi-copper enzymematuration20712015632201656473746728141107r-120156412016559429MoaAHMolybdenum cofactorbiosynthesis enzyme20722016684201742172694719571075f-32016699201724241SmcDChromosome segregationATPases2073201737820188027200070576331f-120173782018800704CafAJRibonucleases G and E20742019182201940670196699721848r-320191822019401108AbrBKRegulators ofstationary/sporulation geneexpression20752019763202042569615689531106r-120197662020420286RecALRecA/RadA recombinase20762020435202107668943683021105r-120204412021074272RPredicted Zn-dependenthydrolases of the beta-lactamasefold20772021157202152268221678561076f-32021199202133435RPredicted GTPases2078202149520222146788367164700f-22021807202212833LrgBMPutative effector of mureinhydrolase2079202226920231116710966267701f-220222692023103422PrsAFEPhosphoribosylpyrophosphatesynthetase COG0462 PrsA(ribose phosphatepyrophosphokinase)2080202534020254176403863961332f-12081202863120289126074760466333f-12028631202881432BaeSTSensory transduction histidinekinases2082202891420294896046459889702f-220289232029481274SUncharacterized ACR20832029483203009459895592841104r-12029573203003247SEC59IDolichol kinase20842030142203102359236583551474r-22030157203040035FadRKTranscriptional regulators20852031138203272758240566511077f-320311472032725770LysSJLysyl-tRNA synthetase class I20862032734203342056644559581473r-220327342033415334SmtAQRSAM-dependentmethyltransferases COG0500SmtA2087203350120344665587754912703f-220335192034458515RPredicted archaeal sugar kinases20882034330203561055048537681078f-320344592035602596CPredicted butyrate kinase2089203563720362545374153124704f-220356702036246336PorGCPyruvate:ferredoxinoxidoreductase and 
related2-oxoacid:ferredoxinoxidoreductases20902036331203659453047527841079f-320363312036574124PhoUPPhosphate uptake regulator2091203660920372445276952134705f-220366092037239296PhoUPPhosphate uptake regulator2092203729020382195208851159706f-220372992038217544EAsparaginase2093203821920393945115949984334f-1203823120393684422094203942920400404994949338707f-220394292040026255RBiotin synthase-related enzyme20952039994204032649384490521080f-320400092040312111RBiotin synthase-related enzyme20962040316204081649062485621103r-12040316204073945NrfGRTPR-repeat-containing proteins20972040797204173248581476461847r-320407972041718498TPredicted serine/threonine proteinkinases20982043010204420346368451751102r-120430102044201669RPT1OATP-dependent 26S proteasomeregulatory subunit2099204434020451704503844208708f-220444212045141252SUncharacterized ACR21002045127204603244251433461472r-220451542045985298RfeMUDP-N-acetylmuramylpentapeptidephosphotransferase/UDP-N-acetylglucosamine-1-phosphatetransferase2101204607720473994330141979709f-220466772047397303WcaAMGlycosyltransferases involved incell wall biogenesis2102204740620477804197241598710f-22047478204775175SUncharacterized ACR21032047777204831341601410651101r-120477832048305325ComEBFDeoxycytidylate deaminase21042048320204909941058402791100r-120484822049088175HtpXOZn-dependent protease withchaperone function21052049106204947140272399071099r-120491062049469184KPredicted transcriptional regulator2106205069720516143868137764711f-220507212051612493PyrDFDihydroorotate dehydrogenase21072051664205190037714374781081f-32051664205183885AbrBKRegulators ofstationary/sporulation geneexpression2108205188820522983749037080712f-22051894205225732RUncharacterized proteins of PilTN-term./Vapc superfamily2109205229520530143708336364335f-120522952053012391RPredicted ATPase (PP-loopsuperfamily)21102053125205319036253361881082f-321112055992205714633386322321846r-32055992205714155421122057204205746732174319111845r-32057216205744153SPredicted membrane 
protein21132057477205865531901307231844r-320574862058653561AvtAEPLP-dependent aminotransferases21142058742205914930636302291098r-12058769205913289SUncharacterized ACR2115205931020595013006829877713f-22059310205942759KPredicted transcriptionalregulators containing theCopG/Arc/MetJ DNA-bindingdomain21162059560206080129818285771083f-320595602060775454FtsZDCell division GTPase2117206081920615982855927780714f-220608282061596420SojDATPases involved in chromosomepartitioning21182061501206191127877274671084f-32061690206186132RWD40 repeat protein21192061997206244627381269321097r-120620122062444222TagDMICytidylyltransferase COG0615TagD21202062448206296626930264121843r-320624482062964292JPUA domain (predictedRNA-binding domain)21212062966206360726412257711096r-120629812063593312PyrFFOrotidine-5′-phosphatedecarboxylase21222063612206421425766251641842r-32063678206385835DeoRKTranscriptional regulator21232064280206542825098239501095r-120642802065423586INO1IMyo-inositol-1-phosphatesynthase21242065471206677823907226001094r-120654922066215311MPredicted sugarnucleotidyltransferases2125206686320675582251521820336f-120668782067541320RPredicted ATPases of PP-loopsuperfamily2126206762320683842175520994715f-220676232068379355CcmAQABC-type multidrug transportsystem2127206838420698382099419540337f-120683872069740140RPredicted permease21282069828207018419550191941841r-320698282070182176SUncharacterized ACR21292070189207072819189186501471r-220702162070720238FADP-ribose pyrophosphatase21302070778207159918600177791093r-120707782071522124RbsKGSugar kinases21312071722207206917656173091085f-320717222071995130GAR1JRNA-binding protein involved inrRNA processing2132207206620729861731216392716f-220720752072978343SUA7KTranscription initiation factor IIB2133207300220734901637615888717f-220730022073488145RPredicted phosphoesterase21342073534207373715844156411470r-220735342073735114HHT1LHistones H3 and H4 (HistonA&B)2135207401220754241536613954338f-120741112075422649RbcLGRibulose 
12136207555720761621382113216339f-120755692076085224KPredicted transcriptionalregulators21372076199207641113179129671092r-120762082076409113RPS17AJRibosomal protein S17E21382076528207695912850124191086f-320765282076909182SUncharacterized ArCR2139207698620776631239211715718f-220769952077661351SUncharacterized ACR2140207770320781521167511226719f-22077772207793131CcmCOABC-type transport systeminvolved in cytochrome cbiogenesis21412078164207896411214104141091r-120781672078932275SplBLDNA repair photolyase2142207900120800261037793521090r-120790192080021335NrfGRTPR-repeat-containing proteins21432080319208216990597209720f-2208031920821641008NrdDFOxygen-sensitiveribonucleoside-triphosphatereductase21442082376208289770026481340f-120823762082874194PflAOPyruvate-formate lyase-activatingenzyme214520829192083284645960941089r-120829252083282171SUncharacterized ACR214620832882084007609053711088r-120832882083987359CofRPredicted hydrolases of the HADsuperfamily214720840572085316532140621840r-320840902085308503SUncharacterized ACR21482085470208711039082268721f-220854702087042899GroLOChaperonin GroEL (HSP60family)(Chaperonin B)21492087216208856821628101839r-320872162088566753SunJtRNA and rRNAcytosine-C5-methylases215020886702088921708457341f-12088691208882330FliAKDNA-directed RNA polymerasespecialized sigma subunit2151208890520893784730722f-22088911208936473RPredicted nucleic acid-bindingprotein


In Table 2, the reading frames f-1 through f-3 refer to open reading frames in the sense strand, and r-1 through r-3 refer to open reading frames in the antisense strand. In the classification, J refers to polypeptides relating to translation, ribosome structure or biogenesis; K refers to polypeptides relating to transcription; L refers to polypeptides relating to DNA replication, recombination or repair; D refers to polypeptides relating to chromosome partitioning; O refers to polypeptides relating to post-translational events, protein turnover or chaperone proteins; M refers to polypeptides relating to cell envelope biogenesis or outer membranes; N refers to polypeptides relating to cellular movement or secretion; P refers to polypeptides relating to inorganic ion transport or metabolism; T refers to polypeptides relating to signaling mechanisms; C refers to polypeptides relating to energy production and conversion; G refers to polypeptides relating to carbohydrate transport and metabolism; E refers to polypeptides relating to amino acid transport and metabolism; F refers to polypeptides relating to nucleotide transport and metabolism; H refers to polypeptides relating to coenzyme metabolism; I refers to polypeptides relating to lipid metabolism; Q refers to polypeptides relating to secondary metabolite biosynthesis, transport or catabolism; R refers to polypeptides predicted to have general function; and S refers to polypeptides with an unknown function. The classification is provisional; where two or more classifications may be appropriate, all applicable letters are listed.
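The classification letters described above can be held in a simple lookup. The following is a minimal sketch and not part of the original disclosure: the names `CLASSIFICATION` and `describe_classification` are hypothetical, and the short descriptions are paraphrased from the paragraph above.

```python
# Hypothetical lookup for the single-letter classifications of Table 2.
# Descriptions are shortened paraphrases of the text above.
CLASSIFICATION = {
    "J": "translation and ribosome structure",
    "K": "transcription",
    "L": "DNA replication, recombination or repair",
    "D": "chromosome partitioning",
    "O": "post-translational events, protein turnover or chaperones",
    "M": "cell envelope or outer membranes",
    "N": "cellular movement or secretion",
    "P": "inorganic ion transport or metabolism",
    "T": "signaling mechanisms",
    "C": "energy production and conversion",
    "G": "carbohydrate transport and metabolism",
    "E": "amino acid transport and metabolism",
    "F": "nucleotide transport and metabolism",
    "H": "coenzyme metabolism",
    "I": "lipid metabolism",
    "Q": "secondary metabolite biosynthesis, transport or catabolism",
    "R": "general function predicted",
    "S": "unknown function",
}


def describe_classification(code):
    """Return one description per letter; a code such as 'EH' has two letters."""
    return [CLASSIFICATION[letter] for letter in code]
```

For an entry classified as both E and H, `describe_classification("EH")` returns the two corresponding descriptions in order. An unrecognized letter raises `KeyError`, which seemed preferable in a sketch to silently guessing a category.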


(Biomolecule Chip)


In another aspect, the present invention provides a biomolecule chip. The present biomolecule chip comprises a substrate and at least one nucleic acid molecule having at least eight contiguous or non-contiguous nucleotides of the sequence set forth in SEQ ID NO: 1 or 1087, or a variant thereof, located therein.


Accordingly, in one embodiment, the present invention provides a nucleic acid molecule comprising: (a) a sequence set forth in SEQ ID NO: 1 or 1087, or a complementary sequence or fragment thereof; (b) a polynucleotide encoding a polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a fragment thereof; (c) a polynucleotide encoding a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a variant thereof having at least one mutation selected from the group consisting of one or more amino acid substitutions, additions, and deletions, wherein the variant polypeptide has biological activity; or (d) a polynucleotide capable of hybridizing to a polynucleotide of any of (a)-(c), and encoding a polypeptide having an amino acid sequence having at least 70% identity to any one of the polypeptides of (a) to (c), wherein the polypeptide has biological activity.


In one preferred embodiment, the number of substitutions, additions and deletions described in (c) above may be limited to, for example, preferably 50 or less, 40 or less, 30 or less, 20 or less, 15 or less, 10 or less, 9 or less, 8 or less, 7 or less, 6 or less, 5 or less, 4 or less, 3 or less, or 2 or less. The number of substitutions, additions and deletions is preferably small, but may be large as long as the biological activity is maintained (preferably, the activity is similar to or substantially the same as that set forth in Table 2, or is an abnormal activity thereof, for example, inhibition of normal biological activity).
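One way to check a variant against such a limit is to count the minimum number of single-residue changes separating it from the reference sequence. The sketch below uses the standard edit (Levenshtein) distance as that count; this choice is an assumption made for illustration, since the text does not prescribe a counting method, and the function names `edit_count` and `within_limit` are hypothetical. It says nothing about whether biological activity is retained.

```python
def edit_count(reference, variant):
    """Minimum number of single-residue substitutions, additions and
    deletions needed to turn `reference` into `variant`."""
    prev = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        curr = [i]
        for j, var_res in enumerate(variant, start=1):
            cost = 0 if ref_res == var_res else 1
            curr.append(min(
                prev[j] + 1,         # residue deleted from the reference
                curr[j - 1] + 1,     # residue added in the variant
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def within_limit(reference, variant, limit=50):
    """True when the variant differs by no more than `limit` changes."""
    return edit_count(reference, variant) <= limit
```

The default of 50 mirrors the loosest limit listed above; a stricter embodiment would pass, for example, `limit=2`.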


In other preferable embodiments, the biological activities possessed by the polypeptides of the present invention include, but are not limited to, for example, interactions with specific antibodies against at least one polypeptide selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157; a biological activity listed in Table 2, and the like. These may be measured by, for example, immunological assays, labeling assays and the like.


In other preferable embodiments, allelic gene variants as described in (d) above, advantageously have at least 99% homology to the nucleic acid sequences set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)).


If a gene sequence database for the subject species is available, the above-mentioned species homologs may be identified by searching against the database using a gene sequence of the present invention as a query sequence. Alternatively, a nucleic acid sequence of the present invention, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)) may be used as a probe or primer to screen a genetic library of the subject species for identification thereof. Such identification methods are well known in the art, and are also described in references cited herein. Species homologs have preferably at least 30% homology to a nucleic acid sequence set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)). 
Preferably, the species homologs of the present invention may have at least about 40% homology, at least about 50% homology, at least about 60% homology, at least about 70% homology, at least about 80% homology, at least about 90% homology, at least about 95% homology, or at least about 98% homology with the above-mentioned standard sequence.
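The percentage thresholds above can be illustrated with a small calculation. The following is a sketch under stated assumptions: it uses percent identity over two sequences that have already been aligned (equal length, `-` for gaps) as a stand-in for the "homology" figures in the text, and the denominator convention (all columns that are not gap-only) is one of several in use. The names `percent_identity`, `HOMOLOGY_TIERS` and `highest_tier_met` are hypothetical.

```python
# Thresholds named in the text for species homologs, in percent.
HOMOLOGY_TIERS = (30, 40, 50, 60, 70, 80, 90, 95, 98)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both rows hold a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_tier_met(pct):
    """Largest listed threshold that `pct` reaches, or None if below 30%."""
    met = [tier for tier in HOMOLOGY_TIERS if pct >= tier]
    return max(met) if met else None
```

Two ten-residue sequences differing at one position give 90% identity and therefore meet the 90% tier but not the 95% tier. In practice the alignment itself would come from a dedicated alignment tool, which this sketch does not attempt to reproduce.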


In preferable embodiments, identity to at least one polynucleotide of the above (a)-(e), or to the complementary sequence thereto, may be at least about 80%, more preferably at least 90%, still more preferably at least about 98%, most preferably at least about 99%. Most preferably, the gene sequence of the present invention has a sequence 100% identical to a nucleic acid sequence set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)).


In a preferred embodiment, the nucleic acid molecule of the present invention encoding the gene of the present invention may have a length of at least 8 contiguous nucleotides. The appropriate nucleotide length of the nucleic acid molecule of the present invention may vary depending on the purpose of use of the present invention. More preferably, the nucleic acid molecule of the present invention may have a length of at least 10 contiguous nucleotides, even more preferably at least 15 contiguous nucleotides, still even more preferably at least 20 contiguous nucleotides, and yet still even more preferably at least 30 contiguous or non-contiguous nucleotides. These lower limits of the nucleotide length may also fall between the above-specified numbers (e.g., 9, 11, 12, 13, 14, 16, and the like) or above the above-specified numbers (e.g., 21, 22, . . . 30, and the like). The upper limit of the length of the polynucleotide of the present invention may be greater than or equal to the full length of the sequence as set forth in SEQ ID NO: 1, as long as the polynucleotide can be used for the intended purpose (e.g., antisense, RNAi, marker, primer, probe, capable of interacting with a given agent). Alternatively, when the nucleic acid molecule of the present invention is used as a primer, the nucleic acid molecule typically may have a nucleotide length of at least about 8, preferably a nucleotide length of about 10. When used as a probe, the nucleic acid molecule typically may have a nucleotide length of at least about 15, and preferably a nucleotide length of about 17.


In one embodiment, the nucleic acid molecule encoding the gene of the present invention comprises the entire range of the open reading frame of SEQ ID NO: 1. More preferably, the nucleic acid molecule of the present invention consists of at least one sequence set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)).
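The rule repeated above (sense-strand positions of SEQ ID NO: 1 for frames f-1 to f-3, antisense-strand positions of SEQ ID NO: 1087 for frames r-1 to r-3) can be sketched as a small selection function. This is an illustration only, under several assumptions: positions are treated as 1-based and inclusive, the two strands are passed in as plain strings, the toy antisense strand is built as the reverse complement of the toy sense strand, and the names `extract_orf` and `reverse_complement` are hypothetical. Because the antisense start listed in Table 2 is numerically larger than the antisense stop, the sketch orders each pair before slicing.

```python
def reverse_complement(seq):
    """Toy helper used only to build an antisense strand for the demo."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def extract_orf(frame, sense_seq, antisense_seq, sense_range, antisense_range):
    """Return the subsequence described by one table row.

    frame           -- 'f-1'..'f-3' (sense) or 'r-1'..'r-3' (antisense)
    sense_range     -- (start, stop) positions on the sense strand
    antisense_range -- (start, stop) positions on the antisense strand
    Positions are ASSUMED to be 1-based and inclusive.
    """
    if frame in ("f-1", "f-2", "f-3"):
        seq, (start, stop) = sense_seq, sense_range
    elif frame in ("r-1", "r-2", "r-3"):
        seq, (start, stop) = antisense_seq, antisense_range
    else:
        raise ValueError(f"unknown reading frame: {frame!r}")
    low, high = sorted((start, stop))  # start may exceed stop in the table
    return seq[low - 1:high]


# Toy 12-nucleotide "genome"; not taken from the sequence listing.
sense = "ATGAAATAGCCC"
antisense = reverse_complement(sense)
forward_orf = extract_orf("f-1", sense, antisense, (1, 9), (12, 4))
reverse_orf = extract_orf("r-1", sense, antisense, (1, 9), (12, 4))
```

With the toy data, the f-1 call returns the first nine nucleotides of the sense strand and the r-1 call returns positions 4 to 12 of the antisense strand.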


Accordingly, the biomolecule chip of the present invention preferably uses nucleic acid molecules or variants thereof which encompass the sequence set forth in SEQ ID NO: 1 or 1087. By using nucleic acid molecules of such an encompassing nature, it is possible to analyze functions of the genome in an exhaustive manner. Such analysis became possible only once the entire sequence of the genome had been read; it has not been attained by prior art technologies and thus presents a significant effect.


In other embodiments, the nucleic acid molecules, or variants thereof of the present invention, to be used in the biomolecule chip, comprise any open reading frame set forth in SEQ ID NO: 1 or 1087. The ability to select any open reading frame on the genome should be recognized as a significant effect, as this has not been possible using prior art technology. In particular, it should be noted that analysis of the entire genome of an organism living in high temperature environments, such as at 90° C., is possible.


In another embodiment, the nucleic acid molecules or variants thereof, to be used in the biomolecule chip of the present invention, preferably comprise substantially all the open reading frames set forth in SEQ ID NO: 1 or 1087. As used herein the term “substantially all” refers to a number sufficient for global genomic needs. Accordingly, the term “substantially all” does not necessarily mean all, and depending on the purpose of interest, those skilled in the art may select an appropriate number therefor. Exemplary “substantially all” includes, but is not limited to, for example, at least about 30%, preferably at least about 40%, more preferably at least about 50%, even more preferably at least about 80%, still more preferably at least about 90%, yet more preferably at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and the like, of the total number of open reading frames. In other typical examples of the present invention, substantially all may be about 900 genes whose function has already been identified in the present application. Analysis of substantially all the open reading frames is an effect not attainable using prior art technologies.
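The percentage reading of "substantially all" can be expressed as a coverage calculation. The sketch below is illustrative only: open reading frames are represented by arbitrary identifiers, the function names `orf_coverage` and `is_substantially_all` are hypothetical, and the default threshold of 30% is simply the loosest figure listed above.

```python
def orf_coverage(all_orfs, orfs_on_chip):
    """Percent of the genome's open reading frames represented on the chip."""
    all_orfs = set(all_orfs)
    if not all_orfs:
        raise ValueError("the genome must list at least one open reading frame")
    # Identifiers on the chip that are not genome ORFs do not count.
    represented = all_orfs & set(orfs_on_chip)
    return 100.0 * len(represented) / len(all_orfs)


def is_substantially_all(coverage_pct, threshold_pct=30.0):
    """True when coverage reaches the chosen 'substantially all' threshold."""
    return coverage_pct >= threshold_pct
```

For a toy genome of ten open reading frames of which eight are on the chip, the coverage is 80%, which satisfies an 80% threshold but not a 90% one.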


Accordingly, in another preferable embodiment, the nucleic acid molecule or variants thereof, to be used in the biomolecule chip of the present invention, comprises a sequence encoding at least one sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
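Because the same group of sequence identifiers recurs throughout this section, a membership check can make the ranges easier to work with. The following is a minimal sketch; the names `POLYPEPTIDE_SEQ_ID_RANGES`, `is_listed_polypeptide_seq_id` and `count_listed_seq_ids` are hypothetical, and only the numeric ranges themselves come from the text.

```python
# Inclusive SEQ ID NO ranges recited for the polypeptide sequences.
POLYPEPTIDE_SEQ_ID_RANGES = (
    (2, 341),
    (343, 722),
    (724, 1086),
    (1088, 1468),
    (1470, 1837),
    (1839, 2157),
)


def is_listed_polypeptide_seq_id(seq_id):
    """True when `seq_id` falls inside one of the recited ranges."""
    return any(low <= seq_id <= high
               for low, high in POLYPEPTIDE_SEQ_ID_RANGES)


def count_listed_seq_ids():
    """Total number of identifiers covered by the recited ranges."""
    return sum(high - low + 1 for low, high in POLYPEPTIDE_SEQ_ID_RANGES)
```

The identifiers 1 and 1087, which the text uses for the nucleic acid sequences, fall outside these ranges, as do 342, 723, 1469 and 1838.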


In other preferable embodiments, the nucleic acid molecules or variants thereof comprise substantially all sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.


In more preferable embodiments, the nucleic acid molecule or the variant thereof, to be used as the biomolecule of the present invention, comprises at least an eight contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. As used herein the selection of the sequence may be determined in consideration of a variety of factors as described above. A nucleic acid molecule at least eight contiguous nucleotides in length may comprise a sequence unique to the hyperthermophilic archaebacteria, and thus is advantageous for such analyses.


In another preferable embodiment, the nucleic acid molecule or the variant thereof to be used as the biomolecule of the present invention, comprises at least a fifteen contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. A nucleic acid molecule at least fifteen nucleotides in length allows substantially specific identification of sequences unique to the hyperthermophilic archaebacteria, and thus is advantageous for such analyses.
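The relationship between fragment length and specificity can be explored by listing the contiguous windows of an open reading frame that occur exactly once in a genome. The sketch below is a simplification under stated assumptions: it searches a single strand given as a plain string, it requires exact matches, and the function names are hypothetical. The toy sequences are not taken from the sequence listing.

```python
def contiguous_windows(seq, length):
    """All contiguous fragments of `seq` with the given length, in order."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def count_occurrences(genome, fragment):
    """Number of (possibly overlapping) exact matches of `fragment`."""
    count = 0
    position = genome.find(fragment)
    while position != -1:
        count += 1
        position = genome.find(fragment, position + 1)
    return count


def unique_windows(orf, genome, length):
    """Fragments of `orf` that match the genome at exactly one position."""
    return [window for window in contiguous_windows(orf, length)
            if count_occurrences(genome, window) == 1]
```

With the toy genome `"ATGCGTATGCGA"` and the fragment source `"ATGCGT"`, only one of the three four-nucleotide windows is unique, while the full six-nucleotide sequence is unique. This mirrors, at toy scale, the point that longer fragments tend to identify a sequence more specifically.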


In another more preferable embodiment, the nucleic acid molecule or the variant thereof, to be used in the biomolecule chip of the present invention, comprises at least a thirty contiguous or non-contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. A nucleic acid molecule at least thirty contiguous or non-contiguous nucleotides in length allows substantially specific identification of sequences unique to the hyperthermophilic archaebacteria, even when used as a probe, and thus is advantageous for such analyses.


In another more preferable embodiment, the nucleic acid molecule or the variant thereof to be used in the biomolecule chip of the present invention, comprises substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitutions, additions and/or deletions thereto. Such sequences allow exhaustive analyses of nucleic acid molecules encoding polypeptides included or suspected to be included in an archaebacterium, and thus are advantageous for such analyses.


In another more preferable embodiment, the nucleic acid molecule or the variant thereof to be used in the biomolecule chip of the present invention, comprises at least an eight contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto. Chips containing such sequences may be used for analysis of the behavior of all genes.


In another more preferable embodiment, the nucleic acid molecule or the variant thereof to be used in the biomolecule chip of the present invention, comprises a molecule which, when the reading frame of Table 2 is f-1, f-2 or f-3, has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or which, when the reading frame of Table 2 is r-1, r-2 or r-3, has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto. Such sequences contain open reading frames actually possessed by hyperthermophilic archaebacteria and thus provide an accurate assay at the genomic level. Thus, the present embodiment may be used for global analysis at such a genomic level.


In another embodiment, the substrate comprising the biomolecule of the present invention is addressable. Assigning addresses facilitates the analysis of all of the nucleic acid molecules. Methods for addressing are well known in the art.
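As one illustration of what an addressable layout might look like in software, the sketch below assigns each probe a (row, column) position on a grid in row-major order. The text does not specify an addressing scheme, so the grid layout, the function name `assign_addresses` and the probe labels are all assumptions made for this example.

```python
def assign_addresses(probe_ids, columns):
    """Map (row, column) grid addresses to probe identifiers, row by row."""
    if columns < 1:
        raise ValueError("the grid needs at least one column")
    return {
        (index // columns, index % columns): probe_id
        for index, probe_id in enumerate(probe_ids)
    }


# Hypothetical probe labels placed on a grid two spots wide.
layout = assign_addresses(["probe_a", "probe_b", "probe_c"], columns=2)
```

A signal read at address `(1, 0)` can then be traced back to `probe_c`, which is the practical benefit of addressing that the paragraph above describes.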


In another aspect, the present invention provides a biomolecule chip with a polypeptide or a variant thereof, having at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.


Accordingly, in one embodiment, the present invention provides a polypeptide selected from: (a) a polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a fragment thereof; (b) a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a variant thereof having at least one mutation selected from the group consisting of one or more amino acid substitutions, additions, and deletions, wherein the variant polypeptide has a biological activity; (c) a polypeptide encoded by a sequence or splicing variants or allelic variants thereof, wherein the nucleic acid molecule or the variant thereof, when the reading frame of Table 2 is f-1, f-2 or f-3, has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop); (d) a polypeptide of at least one species homolog of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157; or (e) a polypeptide having an amino acid sequence having at least 70% identity to any one of the polypeptides of (a) to (c), wherein the polypeptide has biological activity.


In one preferred embodiment, the number of substitutions, additions and deletions described in (b) above may be limited to, for example, preferably 50 or less, 40 or less, 30 or less, 20 or less, 15 or less, 10 or less, 9 or less, 8 or less, 7 or less, 6 or less, 5 or less, 4 or less, 3 or less, or 2 or less. The number of substitutions, additions and deletions is preferably small, but may be large as long as biological activity is maintained (preferably, the activity is similar to or substantially the same as the biological activity of a normal genetic type of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or an abnormal activity of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157).


In another preferred embodiment, the above-described splicing or allelic variants of the polypeptides described in (c) above preferably have at least about 99% homology to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.


In another preferable embodiment, the above-mentioned species homologs preferably have at least about 30% homology to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. More preferably, the species homologs have at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, or at least about 98% homology to the above standard sequence.


When a genetic sequence database of the species exists, the above species homologs may be identified by performing a search against the database using a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, as a query sequence. Alternatively, the entire amino acid sequence of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a portion thereof, may be used as a probe or primer for screening a genetic library of the species. Such methods for identification are well known in the art, and are described in the references cited herein. Species homologs preferably have at least about 30% homology to: when the reading frame of Table 2 is f-1, f-2 or f-3, a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop); when the reading frame of Table 2 is r-1, r-2 or r-3, a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop); or an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. More preferably, the species homologs may have at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, or at least about 98% homology to the above standard sequence.


In another preferable embodiment, the biological activity possessed by the variant polypeptide in (e) above, includes, but is not limited to, for example, interaction with an antibody specific to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a fragment thereof; an enzymatic function as described in Table 2; and the like. Such functions may be measured by enzymatic assays, immunological assays, fluorescence assays and the like.


In preferable embodiments, the above-described homology to any one of the polypeptides described in (a) to (d) above may be at least about 80%, more preferably at least about 90%, even more preferably at least about 98%, and most preferably at least about 99%. Most preferably, the genetic product of the present invention is a sequence consisting of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
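The percentage thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only, under the assumption that homology is read as percent identity over a gapped pairwise alignment in which "-" marks a gap; this section does not fix a particular alignment program or scoring convention, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a gapped pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both rows are gaps; count every other column.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column is a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the alignment reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 7-column alignment: 5 identical columns, 1 mismatch, 1 gap.
print(round(percent_identity("MKVLA-G", "MKILAAG"), 1))
print(meets_threshold("MKVLA-G", "MKILAAG", 70.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the denominator chosen here is an assumption of the sketch rather than a definition taken from the specification.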


The polypeptide of the present invention typically has a sequence of at least 3 contiguous amino acids. The amino acid length of the polypeptide of the present invention may be short as long as the peptide is suitable for an intended application, but preferably a longer sequence may be used. Therefore, the amino acid length may be preferably at least 4, more preferably at least 5, at least 6, at least 7, at least 8, at least 9 and at least 10, even more preferably at least 15, and still even more preferably at least 20. These lower limits of the amino acid length may be present between the above-specified numbers (e.g., 11, 12, 13, 14, 16, and the like) or above the above-specified numbers (e.g., 21, 22, . . . , 30, and the like). The upper limit of the length of the polypeptide of the present invention may be greater than or equal to the full length of the sequence as set forth in amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157 as long as the peptide is capable of interacting with a given agent. As used herein, more preferable forms and constitutions with respect to the sequence to be included, may take any embodiment described herein above for preferable forms and constitutions.


The genetic product of the polypeptide form of the present invention is preferably labeled or may be capable of being labeled. Such a genetic product which is labeled or may be capable of being labeled, may be used to measure the antibody levels against the genetic product, thereby allowing indirect measurement of the level of expression of the genetic product.


In another preferable embodiment, the polypeptide or the variant thereof to be located on a support of the biomolecule chip of the present invention has a length of at least three contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. By having a sequence of at least three contiguous amino acids, it is possible to constitute a specific epitope. As used herein, preferable forms of the sequence to be used may take any form described herein above.


In preferable embodiments, the polypeptide or the variant thereof to be located on a support of the biomolecule chip of the present invention, has a length of at least eight contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. By having a sequence of at least eight contiguous amino acids, it is possible to constitute specific epitopes in a more efficient manner. As used herein, preferable forms and constitutions of the sequence to be used, takes any form described herein above.


In preferable embodiments, the polypeptide or the variant thereof to be located on a support of the biomolecule chip of the present invention, has a length of at least three contiguous or non-contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, and having a biological function. As used herein, the biological activities preferably include a function described in Table 2. In another embodiment, the biological activity includes epitope activity. As used herein, preferable forms and constitutions relating to preferable sequences may have the advantage of any of the forms and constitutions described herein above.


In another aspect, the present invention provides a storage medium having stored therein, information about a nucleic acid sequence of a nucleic acid molecule having a sequence of at least eight contiguous or non-contiguous nucleotides of the sequence set forth in SEQ ID NOs: 1 or 1087, or a variant thereof. As used herein, the information about the nucleic acid sequence includes, in addition to information about the nucleic acid sequence per se, information relating to that set forth in a conventional sequence listing. Such additional information includes, but is not limited to, for example, coding region, intron region, specific expression, promoter sequence and activity, biological function, similar sequences, homologs, reference information, and the like.


In a preferable embodiment, the nucleic acid molecule or the variant thereof to be stored in the storage medium of the present invention comprises a sequence of at least eight contiguous nucleotides of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitutions, additions and/or deletions thereto. Such information could not be provided by prior art technologies, and thus should be recognized to be an effect attained for the first time by the present invention.


In other embodiments, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule or the variant thereof to be recorded in the storage medium of the present invention has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto. Such a storage medium with information recorded thereon has never been conventionally provided, and thus the storage medium of the present invention has an advantageous effect in allowing analysis of the entire genome. Preferably, the storage medium of the present invention includes information about substantially all the open reading frame sequences. As used herein, preferable forms and constitutions relating to such preferable sequences may take advantage of any forms and constitutions described herein above.


In another aspect, the present invention provides a storage medium, comprising information about a polypeptide or a variant thereof having at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein. As used herein, preferable forms and constitutions relating such preferable sequences may take advantage of any forms and constitutions described herein above.


In another embodiment, the polypeptide or the variant thereof to be stored in the storage medium of the present invention with respect to information thereabout, has a sequence of at least three contiguous amino acids of at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. As used herein, the preferable forms and constitutions of such preferable sequences may take advantage of any of the forms and constitutions described herein above.


In another embodiment, the polypeptide or the variant thereof to be stored in the storage medium of the present invention with respect to information thereabout, has a sequence of at least eight contiguous amino acids of at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. As used herein, the preferable forms and constitutions of such preferable sequences may take advantages of any of the forms and constitutions described herein above.


In another embodiment, the polypeptide or the variant thereof to be stored in the storage medium of the present invention with respect to information thereabout, has a sequence of at least three contiguous or non-contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, having biological function. As used herein, preferable forms and constitutions of such preferable sequences may take advantages of any of the forms and constitutions described herein above.


In another embodiment, the biological activity to be included in the storage medium of the present invention with respect to information thereof, comprises a function set forth in Table 2. As used herein, preferable forms and constitutions of such preferable activities may take advantage of any forms and constitutions described herein above.


In another aspect, the present invention provides a biomolecule chip having at least one antibody against a polypeptide or a variant thereof, located on a substrate, the polypeptide or the variant thereof comprises at least one amino acid sequence of sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. As used herein, preferable forms and constitutions of preferable sequences may take advantage of any forms and constitutions described herein above.


In another aspect, the present invention provides an RNAi molecule having a sequence homologous to a reading frame sequence wherein, when the reading frame of Table 2 is f-1, f-2 or f-3, the reading frame sequence has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the reading frame sequence has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto. As used herein, such an RNAi molecule may take any form described herein above in detail, and those skilled in the art may make and use any appropriate RNAi molecule once the sequence information of the present invention is given.


In preferable embodiments, the RNAi molecule of the present invention is an RNA or a variant thereof comprising a double-stranded portion of at least 10 nucleotides in length.


In a more preferable embodiment, the RNAi molecule comprises a 3′ overhang.


In another preferable embodiment, the above 3′ overhang has a DNA molecule of two or more nucleotides in length.


In other preferable embodiments, the 3′ overhang has a DNA molecule of 2-4 nucleotides.


Such RNAi molecules may be used for suppressing particular functions of hyperthermophilic archaebacteria. Such RNAi molecules were not attainable by the prior art, and thus the present invention attains significant effects in this regard.


All patents, patent applications, journal articles and other references mentioned herein are incorporated by reference in their entirety.


The present invention is heretofore described with reference to preferred embodiments to facilitate understanding of the present invention. Hereinafter, the present invention will be described by way of examples. Examples described below are provided for illustrative purposes only. Accordingly, the scope of the present invention is limited only by the appended claims.


EXAMPLES

Hereinafter, the present invention will be described in more detail by way of examples. Thus it should be understood that the present invention is not limited to the examples below.


EXAMPLE 1
Genomic Sequencing

(Preparation of Chromosomal DNA of the KOD-1 Strain)


The KOD-1 strain was inoculated into 1000 ml of 0.5×2216 Marine Broth medium as described in Appl. Environ. Microbiol. 60 (12), 4559-4566 (1994) (2216 Marine Broth: 18.7 g/L, PIPES 3.48 g/L, CaCl2.H2O 0.725 g/L, 0.4 mL 0.2% resazurin, 475 mL artificial sea water (NaCl 28.16 g/L, KCl 0.7 g/L, MgCl2.6H2O 5.5 g/L, MgSO4.7H2O 6.9 g/L), distilled water 500 mL, pH 7.0) and cultured in a 2-liter fermenter. During culture, nitrogen gas was introduced into the fermenter, and the internal pressure was maintained at 0.1 kg/cm2. The culture was maintained at a temperature of 85±1° C. for fourteen hours. The culture was static; no aeration or agitation with the nitrogen gas was performed. After culture, the bacteria (from about 1,000 ml of culture) were recovered by centrifugation at 10,000 rpm for 10 minutes.


One gram of the resulting bacterial pellet was suspended in 10 ml of Solution A (50 mM Tris-HCl, 50 mM EDTA, pH 8.0) and centrifuged (8,000 rpm, 5 minutes, 4° C.) to pellet the bacteria. The pellet was suspended in 3 ml of Solution A containing 15% sucrose and maintained at 37° C. for 30 minutes, and 3 ml of Solution A containing 1% N-lauryl sarcosine was added thereto. Then 5.4 g of cesium chloride and 300 μl of 10 mg/ml ethidium bromide were added to the solution, which was ultracentrifuged at 55,000 rpm for 16 hours at 18° C. to fractionate the chromosomal DNA. The resultant chromosomal DNA fractions were subjected to n-butanol extraction to remove ethidium bromide, and dialyzed against TE solution (10 mM Tris-HCl (pH 8.0), 0.1 mM EDTA) to yield chromosomal DNA.


(Screening/Sequencing Analysis of the Chromosomal Library)


Determination of the genomic sequence was performed according to the bottom-down approach, as generally performed in the art. In brief, the outline is as follows: first, isolated DNA was fragmented and cloned into a cloning vector such as pUC. Next, the cloned fragments were sequenced by shot-gun sequencing. These sequencing reactions were performed at about 15,000 per 1 Mbp. The sequences determined in each reaction were assembled into groups of sequences called “contigs”. Thereafter, the gaps between the contigs (physical and sequence gaps) were cloned and sequenced to fill the gaps. Thereafter, the base sequence data was analyzed to identify open reading frames and to perform annotation. The details are as follows:


First, genomic libraries were constructed. Here, in order to prevent bias derived from genetic sequences, physical digestion rather than partial digestion using restriction enzymes was performed. In this case, libraries of a plurality of fragment lengths were constructed: plasmid libraries containing 2-3 kbp fragments, and lambda phage libraries containing fragments of about 20 kbp.


Second, shot gun sequencing of the plasmid libraries was performed. A sequencer commercially available from Applied Biosystems was used for sequencing. The sequencing was performed so that base sequences of 400-500 bp were obtained for about 15,000 reactions per 1 Mbp. Similarly, terminal shot gun sequencing of the lambda phage library was performed. As such, it was calculated that, theoretically, the entire full-length genome was sequenced six times or more.
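The statement that the genome was theoretically sequenced six times or more can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the figure of about 15,000 sequencing reactions per 1 Mbp stated in the outline of this Example and read lengths of 400-500 bp, ignores the additional terminal reads from the lambda phage library, and uses a hypothetical function name.

```python
def fold_coverage(reads_per_mbp: float, read_length_bp: float) -> float:
    """Average sequencing depth: total bases read divided by bases in 1 Mbp."""
    return reads_per_mbp * read_length_bp / 1_000_000


# Assumed inputs: about 15,000 reactions per 1 Mbp, each yielding 400-500 bp.
low = fold_coverage(15_000, 400)
high = fold_coverage(15_000, 500)
print(f"theoretical coverage: {low:.1f}x to {high:.1f}x")
```

With these inputs the depth falls between 6.0-fold and 7.5-fold, which agrees with the "six times or more" figure.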


Third, the base sequence data (about 40,000 pieces of data for the genome of about 2 Mbp) was assembled and the gaps were filled in. In this instance, terminal sequence data from the lambda phage library consisting of long fragments was used to determine the relative position and direction of each region. What is obtained by this procedure is usually called a “contig”. In the present Example, a number of contigs were obtained. Sequence-undetermined regions (gaps) therebetween were filled. Gaps for which fragments spanning the gap between contigs were identified are called sequence gaps, and gaps for which no such fragments were cloned are called physical gaps. Physical gaps were filled by engineering techniques such as amplification by LA-PCR and the like, followed by base sequence determination and the like. As such, substantially all the sequencing data fell within one contig, and the sequencing was thus completed.


Fourth, the sequence data was analyzed. Open reading frames (ORF) were identified and the annotation thereof was performed. In this task, programs such as Hidden Markov model (HMM) and Interpolated Markov model (GLIMMER) and the like were used for identification of ORFs. Thereafter, the search functions of BLAST, BLASTX and FASTA and the like were used to identify the function of each ORF. Thereafter, genetic and biochemical analyses were performed (see, for example, Fraser C. M., Res Microbiol., 151, 79-84 (2000); Fraser C. M. et al., Nature, 406, 799-803 (2000); Nelson et al., Nat Biotechnol., 18, 1049-1054 (2000); Kawarabayasi Y. et al., DNA Res., 6, 83-101, 145-222 (1999) and the like).
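To illustrate what scanning the six reading frames involves, the following is a deliberately naive sketch. It is not GLIMMER or a hidden Markov model, which score candidate genes statistically; it merely reports ATG-to-stop stretches in each frame. The mapping of offsets to the labels f-1 to f-3 and r-1 to r-3, the use of ATG as the only start codon, the 1-based coordinates on the scanned strand, and the minimum length are all assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_codons: int = 3):
    """List (frame, start, stop) for simple ATG...stop ORFs in all six frames.

    Coordinates are 1-based and refer to the strand being scanned; 'stop' is
    the last base of the stop codon. Frame labels are an assumed convention.
    """
    orfs = []
    forward = genome.upper()
    for label, seq in (("f", forward), ("r", reverse_complement(forward))):
        for offset in range(3):
            start = None
            for i in range(offset, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((f"{label}-{offset + 1}", start + 1, i + 3))
                    start = None  # resume the search after the stop codon
    return orfs


# Hypothetical 19-nt sequence containing one short ORF on the forward strand.
print(find_orfs("CCATGAAACCCGGGTAGTT"))
```

In practice each candidate found this way would still need to be scored and then annotated by similarity searches such as the BLAST and FASTA searches mentioned above.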


The nucleic acid sequences determined as above are sequences set forth in SEQ ID NO: 1 (SEQ ID NOs: 1, 342, and 723 are plus (sense) strand, and SEQ ID NOs: 1087, 1469 and 1838 are minus (antisense) strand).


(Functional Analysis of Each Gene)


Next, the amino acid sequence of each gene was compared to those known in the art, as registered in databases such as EMBL, PDB and the like, by using software such as DNASIS, BLAST, and CLUSTAL W. As a result, a variety of polypeptides having high homology with said amino acid sequences were identified, and the function of each gene inferred therefrom (see Table 2).


EXAMPLE 2
Targeting

(Double Cross-Over Disruption)


(Bacterial Strains and Growth Conditions)



T. kodakaraensis KOD1 and derivatives thereof were cultured under stringent anaerobic conditions at 85° C. in rich growth medium (ASW-YT) and amino acid-containing synthetic medium (ASW-AA). ASW-YT medium contains 5.0 g/L yeast extract, 5.0 g/L tryptone and 0.2 g/L sulfur (pH 6.6) in artificial sea water diluted 1.25-fold (0.8×ASW). The composition of ASW is as follows: NaCl 20 g; MgCl2.6H2O 3 g; MgSO4.7H2O 6 g; (NH4)2SO4 1 g; NaHCO3 0.2 g; CaCl2.2H2O 0.3 g; KCl 0.5 g; NaBr 0.05 g; SrCl2.6H2O 0.02 g; and Fe(NH4) citrate 0.01 g. ASW-AA medium is 0.8×ASW supplemented with 5.0 ml/L modified Wolfe minor mineral solution (containing, in 1 L, 0.5 g MnSO4.2H2O; 0.1 g CoCl2; 0.1 g ZnSO4; 0.01 g CuSO4.5H2O; 0.01 g AlK(SO4)2; 0.01 g H3BO3; and 0.01 g NaMoO4.2H2O), 5.0 ml/L vitamin mixture (see the following literature), twenty amino acids (containing, in 1 L, 250 mg cysteine.HCl; 75 mg alanine; 125 mg arginine.HCl; 100 mg asparagine.H2O; 50 mg aspartic acid; 50 mg glutamine; 200 mg glutamic acid; 200 mg glycine; 100 mg histidine.HCl.H2O; 100 mg isoleucine; 100 mg leucine; 100 mg lysine.HCl; 75 mg methionine; 75 mg phenylalanine; 125 mg proline; 75 mg serine; 100 mg threonine; 75 mg tryptophan; 100 mg tyrosine; and 50 mg valine) and 0.2 g/L elemental sulfur (pH adjusted to 6.9 with NaOH) (Robb, F. T., and A. R. Place. 1995. Media for Thermophiles, p. 167-168. In F. T. Robb and A. R. Place (ed.) Archaea: a laboratory manual-Thermophiles. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.). Optionally, 5-FOA (Wako Pure Chemical, Osaka, Japan) and uracil (Kojin, Tokyo, Japan) were added to ASW-AA medium at the concentrations described in Robb. In order to examine the tryptophan nutrient requirement, tryptophan-free ASW-AA medium (ASW-AAW) was used. In order to reduce dissolved oxygen in the medium, 5.0% Na2S.9H2O was added until the color of sodium resazurin salt (1.0 mg/L) disappeared.
In the case of plate culture, 1.0% (w/v) Gelrite (Wako Pure Chemical) was added for solidification, and 2.0 ml/L polysulfide solution (10 g Na2S.9H2O and 3.0 g elemental sulfur/15 ml) was used in lieu of the elemental sulfur and the 5.0% Na2S.9H2O solution. The cells were incubated in an anaerobic chamber (Tabai Espec, Osaka, Japan) at 85° C.


DH5-alpha, an E. coli used for general DNA engineering, was routinely cultured on LB medium (Sambrook, J., and D. Russel. 2001. Molecular cloning: a laboratory manual, 3rd edn. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.) which was supplemented with 50 μg/ml ????? as necessary.


(Mutation by UV Radiation and Isolation of 5-FOA Resistant Variants)



T. kodakaraensis KOD1 was cultured in 2.0 L of ASW-AA liquid medium for 39 hours. Cells in the stationary phase were recovered by centrifugation (6,000×g, 30 minutes). The following procedures were performed anaerobically in an anaerobic chamber: cells were resuspended in 60 mL of ASW, and a portion of the suspension (10 mL) was placed into a petri dish. The suspension was UV-irradiated with agitation for an appropriate time (0, 30, 60, 90 and 120 seconds) at a distance of 20 cm from a 15 W sterilization lamp. Aliquots (200 μl) were plated on ASW-AA plate medium containing 0.75% 5-FOA, and uracil-requiring (Pyr−) variants were positively selected. In order to support growth of the resultant variants, 10 μg/ml uracil was included in the growth media. The cells were incubated at 85° C. for five days. The number of viable cells was determined by inoculation onto an ASW-AA plate medium free of 5-FOA at an appropriate dilution ratio, and counting the number of colonies formed.


5-FOA-resistant colonies were isolated and cultured in ASW-YT liquid medium. The cells were then incubated in ASW-AA liquid medium for two days in order to avoid carry-over of uracil, and passaged into ASW-AA liquid medium with or without 5 μg/ml uracil to study the nutritional requirement of the isolates for uracil.


(Enzymatic Assay)


Cell-free extracts of T. kodakaraensis KOD1 and variants thereof were prepared as follows: cells were cultured in ASW-Y liquid medium for twenty hours and collected by centrifugation (6,000×g, 30 minutes), and the cells were resuspended in 50 mM Tris-HCl (pH 7.5) containing 0.1% v/v Triton X-100. The samples were vortexed for ten minutes and centrifuged at 3,000×g for twenty minutes, and the resultant supernatant was retained as the cell-free extract. Protein concentration was determined using the Bio-Rad Protein Assay System (Bio-Rad, Hercules, Calif., USA) with bovine serum albumin as a standard.


Orotidine-5′-monophosphate decarboxylase (OMPdecase, PyrF) activity was determined by monitoring the decrease in optical density at 285 nm (ODλ285nm) resulting from the conversion of orotidine-5′-monophosphate (OMP) into uridine-5′-monophosphate (UMP) (Beckwith, J. R., A. B. Pardee, R. Austrian, and F. Jacob. 1962. Coordination of the synthesis of the enzymes in the pyrimidine pathway of E. coli. J. Mol. Biol. 5: 618-634.). The assay mixture consisted of 100 mM Tris-HCl (pH 8.6), 1.5 mM MgCl2, 0.125 mM OMP and enzyme solution in a total volume of 1 ml. The mixture without the enzyme solution was preincubated at 85° C. for 5 minutes in a capped cuvette, and the reaction was initiated by adding the enzyme solution and monitored for 10 minutes at the same temperature.


Orotate phosphoribosyltransferase (OPRTase, PyrE) activity was assayed by spectrophotometrically measuring orotic acid at 295 nm. When measuring enzyme samples from a pyrE+ strain, continuous decarboxylation of the reaction product OMP by intrinsic OMPdecase should be taken into account. As OMPdecase activity is higher than OPRTase activity in T. kodakaraensis, OPRTase activity may be determined using a Δε295 of 3,670 M−1cm−1, which corresponds to the conversion of orotic acid to UMP via OMP. In the case of the pyrF strain, the conversion of the starting substrate to OMP was monitored using a Δε295 of 2,520 M−1cm−1. The reaction was performed in a 1 ml mixture comprising Tris-HCl (pH 8.6), 1.5 mM MgCl2, 0.125 mM orotic acid, cell-free extract, and 1.6 mM 5-phosphoribosylpyrophosphate (PRPP). The same assay mixture free of PRPP was placed in a capped cuvette and preincubated at 85° C. for 10 minutes, and the reaction was initiated by the addition of PRPP. The decrease in A295 was measured at the same temperature for three minutes.
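The conversion of a measured decrease in A295 into a reaction rate follows the Beer-Lambert relation. The sketch below illustrates that arithmetic using the two extinction coefficient differences given above and the 1 ml assay volume. The 1 cm optical path length, the function and constant names, and the absorbance slope in the usage line are assumptions introduced for illustration and are not measured data.

```python
# Extinction coefficient differences at 295 nm given above (M^-1 cm^-1).
DELTA_EPSILON_PYRE_PLUS = 3670.0  # orotic acid -> UMP via OMP (pyrE+ samples)
DELTA_EPSILON_PYRF = 2520.0       # orotic acid -> OMP (pyrF strain)


def oprtase_rate_umol_per_min(delta_a295_per_min: float,
                              delta_epsilon: float,
                              volume_ml: float = 1.0,
                              path_cm: float = 1.0) -> float:
    """Convert an absorbance slope into micromoles of substrate used per minute.

    Beer-Lambert: delta_A = delta_epsilon * path * delta_c, so
    delta_c (mol/L per min) = |delta_A per min| / (delta_epsilon * path).
    The 1 cm path length default is an assumption of this sketch.
    """
    molar_change_per_min = abs(delta_a295_per_min) / (delta_epsilon * path_cm)
    litres = volume_ml / 1000.0
    return molar_change_per_min * litres * 1e6  # mol -> micromol


# Hypothetical slope of -0.0367 absorbance units per minute in a 1 ml assay.
rate = oprtase_rate_umol_per_min(-0.0367, DELTA_EPSILON_PYRE_PLUS)
print(f"{rate:.4f} umol/min")
```

Dividing the result by the amount of protein in the assay, as determined by the protein assay described above, would give a specific activity.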


(DNA Engineering and Sequencing)


General DNA engineering was performed as described in Sambrook and Russel (Sambrook, J., and D. Russel. 2001. Molecular cloning: a laboratory manual, 3rd edn. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.). The genomic DNA of T. kodakaraensis was isolated as described above. PCR was performed using KOD-Plus- (TOYOBO, Osaka, Japan) as the DNA polymerase. The sequences of the primers used for PCR are shown below. Optionally, DNA fragments amplified by PCR were phosphorylated by T4 kinase (TOYOBO). Restriction enzymes and modification enzymes were purchased from TaKaRa (Kyoto, Japan) or Toyobo. DNA fragments were collected after agarose gel electrophoresis and purified using a GFX PCR DNA and Gel Band Purification Kit (Amersham Pharmacia Biotech, Uppsala, Sweden). Plasmid DNA was isolated using Qiagen Plasmid Kits (Qiagen, Hilden, Germany). DNA sequencing was performed using an ABI PRISM kit and a Model 3100 capillary sequencer (Applied Biosystems, Foster City, Calif., USA).


(Construction of pUDT1 and pUDT2)


Two disruption vectors, pUDT1 (SEQ ID NO: 2158) and pUDT2 (SEQ ID NO: 2159), were constructed for homologous recombination by single and double cross-over events, respectively, in T. kodakaraensis. They were constructed as follows: a DNA fragment (676 bp) containing Tk-pyrF was amplified from T. kodakaraensis KOD1 genomic DNA using the following primers:

  • TK1-DUR/TK1-DUF: 5′-GGGCATATGGAGGAGAGCAGGCTCATTCTGGCG-3′ (SEQ ID NO: 2160)/5′-CTGAGGGGGTGTTTGACTTTCAA-3′ (SEQ ID NO: 2161), wherein underlined sequences indicate NdeI sites.


The deduced promoter region (130 bp) was amplified using primers TK2-DPR/TK2-DPF:

  • TK2-DPR/TK2-DPF: 5′-GGGCTGCAGCCGCAACGCGCATTTTGCTCACCCGAAAA-3′ (SEQ ID NO: 2162)/5′-GGGCATATGCATCACCTTTTTAACGGCCCTCTCCAAGAG-3′ (SEQ ID NO: 2163), wherein underlined sequences indicate PstI and NdeI sites, respectively.


Both fragments were subcloned into pUC118 in the appropriate promoter-pyrF orientation. The resultant plasmid was designated pUD (3,944 bp). A short fragment (788 bp) of Tk-trpE was amplified using the following primers TK3-DTR/TK3-DTF:

  • TK3-DTR/TK3-DTF: 5′-GGGGCATGCGGTGGCTTCGTTGGCTACGTCTCCTACG-3′ (SEQ ID NO: 2164)/5′-GGGCTGCAGTTCGGGGCTCCGGTTAGTGTTCCCGCCG-3′ (SEQ ID NO: 2165), wherein underlined sequences indicate SphI and PstI sites, respectively. Next, this fragment was ligated with pUD at the SphI and PstI sites to yield pUDT1 (4,732 bp).


In order to construct pUDT2, fragments containing Tk-trpE and flanking regions (2223 bp) were amplified using the following primers TK4-DT2R/TK4-DT2F:

  • TK4-DT2R/TK4-DT2F: 5′-GGGGTCGACCGGGTCTGGCGAGGGCAATGAGGGAC-3′ (SEQ ID NO: 2166)/5′-GGGGAATTCGGTTATAGTGTTCGGAACGACCTTCACTC-3′ (SEQ ID NO: 2167), wherein underlined sequences indicate SalI and EcoRI sites, respectively.
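Because the underlining that marked the restriction sites is not preserved in this text, the sites can be located again by searching each primer for the standard recognition sequence of the named enzyme. The sketch below is illustrative only: the primer sequences are copied from the primer pairs listed above, the assignment of the first sequence of each pair to the first primer name is an assumption, and the function and variable names are hypothetical.

```python
# Standard recognition sequences of the enzymes named in the primer descriptions.
SITES = {
    "NdeI": "CATATG",
    "PstI": "CTGCAG",
    "SphI": "GCATGC",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
}

# Primer sequences (5' to 3') from the listed primer pairs; names assumed in order.
PRIMERS = {
    "TK1-DUR": "GGGCATATGGAGGAGAGCAGGCTCATTCTGGCG",
    "TK1-DUF": "CTGAGGGGGTGTTTGACTTTCAA",
    "TK2-DPR": "GGGCTGCAGCCGCAACGCGCATTTTGCTCACCCGAAAA",
    "TK2-DPF": "GGGCATATGCATCACCTTTTTAACGGCCCTCTCCAAGAG",
    "TK3-DTR": "GGGGCATGCGGTGGCTTCGTTGGCTACGTCTCCTACG",
    "TK3-DTF": "GGGCTGCAGTTCGGGGCTCCGGTTAGTGTTCCCGCCG",
    "TK4-DT2R": "GGGGTCGACCGGGTCTGGCGAGGGCAATGAGGGAC",
    "TK4-DT2F": "GGGGAATTCGGTTATAGTGTTCGGAACGACCTTCACTC",
}


def site_positions(primer: str) -> dict:
    """Map enzyme name to the 0-based index of its first site in the primer."""
    found = {}
    for enzyme, site in SITES.items():
        index = primer.find(site)
        if index != -1:
            found[enzyme] = index
    return found


for name, sequence in PRIMERS.items():
    print(name, site_positions(sequence))
```

In each primer that carries a site, the recognition sequence begins at the fourth nucleotide, immediately after a 5′-GGG clamp. The second primer of the first pair contains none of the five sites.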


This was subcloned into SalI and EcoRI sites of pUC119. The resultant plasmid was designated pUT4 (5,340 bp). pUD was digested with PvuII, and the fragment containing pyrF and the deduced promoter region (1104 bp) was isolated. pUDT2 (6,012 bp) was obtained by inserting the isolated fragment in pUT4, into the blunt ended SacI sites of Tk-trpE.


Linear DNA fragments for homologous recombination in T. kodakaraensis were prepared by PCR using pUDT2 as a template, and purified after agarose gel electrophoresis.


(Transformation of T. kodakaraensis)


The calcium chloride method for Methanococcus voltae PS (Bertani, G., and L. Baresi. 1987. Genetic transformation in the methanogen Methanococcus voltae PS. J. Bacteriol. 169: 2730-2738.) was modified for transformation of T. kodakaraensis. T. kodakaraensis KU25 was cultured for twelve hours in ASW-YT liquid medium, and cells were collected from 3 ml of broth during late log phase (17,000×g, 5 minutes) and resuspended in 200 μl of transformation buffer (80 mM CaCl2 in 0.8× modified ASW free of KH2PO4, in order to avoid precipitation between calcium cations and phosphate groups) (1/15 vol.). The suspension was maintained on ice for 30 minutes. Next, 3 μg of DNA dissolved in TE buffer was added to the suspension. The cells were further incubated on ice for one hour, followed by heat shock at 85° C. for 45 seconds, and then incubated on ice for 10 minutes. As a control experiment, an equal volume of TE buffer was added to the cells in lieu of DNA. Processed cells were screened for Pyr+ transformants by passaging for two generations in the absence of uracil in 20 ml of ASW-AA liquid medium. Next, the cells were spread on a uracil-free ASW-AA plate and incubated for 5-8 days at 85° C. The resultant Pyr+ strains were analyzed by colony PCR and by Southern hybridization using a DIG-DNA labeling and detection kit (Boehringer Mannheim, Mannheim, Germany).


(Experimental Procedures)


Double targeting disruption was performed using circular DNA molecules for double cross-over gene disruption. The exemplary scheme is shown in FIG. 1.


(Preparation of a Disruption Vector)


(Preparation of KOD-1)


The KOD-1 strain was prepared as described above.


(Transformation and Homologous Recombination)


As described above, transformed KOD-1 strain was maintained in ASW-AA. In this instance, KOD-1 strain growth is sustained by carried-over uracil.


Next, the KOD-1 strain was inoculated into fresh amino acid liquid medium. Only PyrF+ strains, in which homologous recombination has occurred, grow in fresh amino acid liquid medium; this allowed screening and isolation of strains in which homologous recombination had occurred.


Next, isolated strains were inoculated into ASW-AA. Colonies grown on solid medium were confirmed with colony PCR and Southern blotting analysis. The procedure therefor is described as follows:


Reaction mixture: 2.5 unit KOD polymerase (TOYOBO) 0.5 μl; 10× KOD polymerase buffer (TOYOBO) 5.0 μl; 25 mM MgCl2 4.0 μl; dNTP mixture 4.0 μl; 20 pmol/μl primer 1 0.5 μl; 20 pmol/μl primer 2 0.5 μl; sterilized water 37.0 μl; cell suspension 0.5 μl.
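
For reference, the total volume and the final concentrations of the added MgCl2 and primers in this mixture can be worked out as in the following minimal sketch. The volumes are taken from the list above; the helper name is hypothetical, and any magnesium already present in the 10× buffer is not counted.

```python
# Component volumes in microliters, copied from the reaction mixture above.
VOLUMES_UL = {
    "KOD polymerase": 0.5,
    "10x KOD polymerase buffer": 5.0,
    "25 mM MgCl2": 4.0,
    "dNTP mixture": 4.0,
    "primer 1 (20 pmol/ul)": 0.5,
    "primer 2 (20 pmol/ul)": 0.5,
    "sterilized water": 37.0,
    "cell suspension": 0.5,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilute a stock concentration into the total volume (same unit as stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(VOLUMES_UL.values())

# Added MgCl2 only; the buffer's own contribution is unknown here.
mgcl2_mM = final_concentration(25.0, VOLUMES_UL["25 mM MgCl2"], total_ul)

# 20 pmol/ul equals 20 uM, so this result is in micromolar.
primer_uM = final_concentration(20.0, VOLUMES_UL["primer 1 (20 pmol/ul)"], total_ul)

print(f"total volume: {total_ul:.1f} ul")
print(f"added MgCl2: {mgcl2_mM:.2f} mM, each primer: {primer_uM:.3f} uM")
```

The listed volumes sum to 52.0 μl, so the final concentrations are slightly lower than they would be in a nominal 50 μl reaction.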


This reaction mixture was incubated under the following reaction conditions: 96° C. for 2 minutes; 30 cycles of 96° C. for 30 seconds, 55° C. for 3 seconds, and 72° C. for 30 seconds; and 72° C. for 3 minutes.
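
The cycling conditions can be written down as a simple program description, as in the following sketch. It assumes that the 30 cycles apply to the 96° C./55° C./72° C. steps, with a single initial denaturation and a single final extension; the names are hypothetical and ramp times between temperatures are ignored.

```python
# (temperature in deg C, hold time in seconds), as read from the text.
INITIAL_DENATURATION = (96, 120)
CYCLE_STEPS = [(96, 30), (55, 3), (72, 30)]
CYCLE_COUNT = 30
FINAL_EXTENSION = (72, 180)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + CYCLE_COUNT * per_cycle + FINAL_EXTENSION[1]


print(f"programmed hold time: {total_hold_seconds() / 60:.1f} minutes")
```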


Colony PCR and Southern blotting analyses were performed to yield the following results:

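TABLE 3: Double cross-over gene targeted disruption

| | Control | Transformant 1 | Transformant 2 |
|---|---|---|---|
| CaCl2 | + | + | + |
| DNA | TE buffer | pUDT2 | pUDT2 |
| Growth in amino acid liquid medium in the presence of carried-over uracil | No growth | Growth | Growth |
| T/C | not available | 12/12 | 5/12 |
| Total T/C | not available | 17/24 (Transformants 1 and 2 combined) | |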
T/C refers to the number of transformants of interest (i.e., PyrF+ strain) relative to the number of clones screened by colony PCR.


As shown in the above results, it was demonstrated that targeted double cross-over disruption of genes using circular molecules proceeds at a very high ratio.


Example 3
Examples of Double Cross-Over Disruption; Cases Where Linear DNA was Used

Next, examples of double cross-over using linear DNA molecules are shown.


(Production of the Disruption Vector)


Linear DNA was prepared as shown in FIG. 2 as a linear disruption vector. Linear DNA was obtained by amplification using pUDT2 prepared in Example 2 as a template using appropriate primers.


(Preparation of KOD1)


The KOD-1 strain was prepared as described in Example 2.


(Transformation and Homologous Recombination)


Prepared KOD-1 strain was transformed using the calcium chloride method. The transformed KOD-1 strain was maintained in ASW-AA. In this instance, KOD-1 strain growth is sustained by carried-over uracil.


Next, the KOD-1 strain was inoculated into fresh amino acid liquid medium. Only PyrF+ strains, in which homologous recombination has occurred, grow in fresh amino acid liquid medium, allowing screening and isolation of strains in which homologous recombination has occurred.


Next, isolated strains were inoculated into ASW-AA. Then colonies grown on the solid medium were confirmed by colony PCR and Southern blotting analysis. The procedure therefor is described as follows:


Colony PCR and Southern blotting were performed as described above.


As analyzed above, the following results were obtained.

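TABLE 4: Gene targeted disruption by double cross-over

| | Control | Transformant 3 | Transformant 4 |
|---|---|---|---|
| CaCl2 | + | + | + |
| DNA | TE buffer | Linear DNA | Linear DNA |
| Growth in amino acid liquid medium in the presence of carried-over uracil | No growth | Growth | Growth |
| T/C | not available | 7/12 | 0/12 |
| Total T/C | not available | 7/24 (Transformants 3 and 4 combined) | |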
T/C refers to the number of transformants of interest (i.e., PyrF+ strain) relative to the number of clones screened by colony PCR.


As shown in the above results, it was demonstrated that targeted double cross-over disruption of genes using linear molecules proceeds at a sufficiently high ratio, although lower than that obtained using circular molecules. One reason for the lower ratio compared with circular molecules is thought to be digestion of the linear molecules by host nucleases.


Further, in light of the above-mentioned results, when determining a preferable length for linear DNA, if there are at least 500 bases at each terminus, targeted disruption progresses at about 5% or more, and if there are at least 1,000 bases at each terminus, targeted disruption progresses at about 20% or more. Accordingly, it is understood that targeted disruption using a linear molecule requires at least 500 bases, and preferably at least 1,000 bases, of nucleic acid sequence at each terminus.
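
The length guideline above can be expressed as a simple check, as in the following sketch. The function name and the example arm lengths are hypothetical; the thresholds and the approximate lower bounds are those stated in the text, and the shorter of the two termini is assumed to be limiting.

```python
from typing import Optional


def expected_min_efficiency(left_arm_bp: int, right_arm_bp: int) -> Optional[float]:
    """Approximate lower bound on targeted disruption efficiency for a linear
    molecule, or None when a terminus is shorter than 500 bases."""
    shortest = min(left_arm_bp, right_arm_bp)  # assumption: shorter arm limits
    if shortest >= 1000:
        return 0.20
    if shortest >= 500:
        return 0.05
    return None


# Hypothetical arm lengths, for illustration only.
for arms in [(1200, 1050), (700, 1500), (300, 1000)]:
    print(arms, expected_min_efficiency(*arms))
```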


Example 4
Examples of Double Cross-Over Disruption; Other Genes

A gene other than the above-mentioned genes (for example, a sequence encoding SEQ ID NO: 395 (tryptophan synthase)) was selected to perform similar experiments based on the nutritional requirement for tryptophan, and similar targeted disruption was performed.


Example 5
Single Cross-Over Disruption

Gene targeted disruption was performed with a circular molecule using a single cross-over disruption system. A schematic drawing is shown in FIG. 3. pUDT (SEQ ID NO: 2158) was prepared as described above.


(Preparation of KOD1)


The KOD-1 strain was prepared as described in Example 2.


(Transformation and Homologous Recombination)


Prepared KOD-1 strain was transformed with the calcium chloride method. The transformed KOD-1 strain was maintained in ASW-AA. In this instance, the KOD-1 strain grows with carried-over uracil.


Next, the KOD-1 strain was inoculated into fresh amino acid liquid medium. As only the PyrF+ strain, in which homologous recombination has occurred, grows in fresh amino acid liquid medium, this allows screening and enrichment of strains in which homologous recombination has occurred.


Next, grown strains were inoculated into ASW-AA. Then colonies grown on the solid medium were confirmed with colony PCR and Southern blotting analysis. The procedure therefor is described as follows:


Colony PCR and Southern blotting were performed as described above.


As analyzed above, the following results were obtained.

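TABLE 5: Gene targeted disruption by single cross-over

| | Control | Transformant 5 | Transformant 6 |
|---|---|---|---|
| CaCl2 | + | + | + |
| DNA | TE buffer | pUDT1 | pUDT1 |
| Growth in amino acid liquid medium in the presence of carried-over uracil | No growth | Growth | Growth |
| T/C | not available | 1/96 | 2/96 |
| Total T/C | not available | 3/192 (Transformants 5 and 6 combined) | |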
T/C refers to the number of transformants of interest (i.e., PyrF+ strain) relative to the number of clones screened by colony PCR.


As described above, it is understood that gene targeted disruption by single cross-over using a circular molecule progresses at a much lower rate than the gene targeted disruption by double cross-over. A reason why efficiency by single cross-over is lower than that by double cross-over is believed to be the digestion of pUDT1 by restriction enzymes from the host.
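
To make the comparison concrete, the total T/C values of Tables 3 to 5 can be converted into percentages, as in the following minimal sketch. The counts are taken from the tables; the dictionary and function names are hypothetical.

```python
# (transformants of interest, clones screened), from the Total T/C rows.
RESULTS = {
    "double cross-over, circular (Table 3)": (17, 24),
    "double cross-over, linear (Table 4)": (7, 24),
    "single cross-over, circular (Table 5)": (3, 192),
}


def efficiency_percent(positive: int, screened: int) -> float:
    """Share of screened clones that were the desired PyrF+ transformant."""
    return 100.0 * positive / screened


efficiencies = {
    label: efficiency_percent(*counts) for label, counts in RESULTS.items()
}
for label, value in efficiencies.items():
    print(f"{label}: {value:.1f}%")
```

With these counts, double cross-over with a circular molecule gives roughly 71%, double cross-over with a linear molecule roughly 29%, and single cross-over roughly 1.6%.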


As such, the present invention is demonstrated to work in a system using single cross-over disruption. Further, when using a linear molecule, the system using single cross-over disruption works, although at a much lower rate.


Example 6
Examples of Single Cross-Over Disruption; Other Genes

Genes were disrupted by single cross-over as in Example 4, and it was demonstrated that disruption was possible, although the efficiency thereof was not as good as in Example 5.


Example 7
Expression of DNA Ligase Gene

In order to express an ATP dependent DNA ligase in Escherichia coli, the following protocols were used. Fragments of the phage clone comprising the sequence of the DNA ligase identified in the present invention (for example, SEQ ID NO: 1131) were used as a template to yield fragments of two types of DNA ligase coding regions, which were inserted into pUC18. The sequences of the inserted fragments were confirmed, and the fragments comprising the DNA ligase were taken from the plasmid and inserted into the plasmid pET21a (Novagen) to construct the expression plasmids. The expression and the activity were confirmed as follows:



Escherichia coli BL21 (DE3) was transformed with the plasmid. The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, disrupted by sonication, and centrifuged to yield a cell extract; this extract was sonicated again and centrifuged again to recover the soluble fraction. The resultant fraction was treated at 70° C. for ten minutes, and the thermostable soluble fraction was centrifuged again to yield a sample. This sample may be further purified using a variety of well known purification methods or a combination thereof.


Enzymatic activities are measured either by a method in which the obtained sample is reacted with lambda phage DNA digested with HindIII and the change in mobility of the DNA fragments is observed by agarose gel electrophoresis, or by a method in which the obtained sample is reacted with oligo dT labeled with 32P, unreacted 32P is removed with alkaline phosphatase, and the radioactivity is then measured (see Rossi, R. et al., (1997) Nucleic Acids Research, 25(11):2106-2113; Odell, M. et al., (1996) Virology 221:120-129; Sriskanda, V. et al., (1998) Nucleic Acids Research, 26(20):4618-4625; Takahashi, M. et al., (1984) The Journal of Biological Chemistry, 259(16):10041-10047).


Example 8
Expression and Confirmation of Formic Acid Dehydrogenase

Formic acid dehydrogenase is an enzyme catalyzing a reaction oxidizing formate ion into CO2. The reaction thereof is represented by the formula: HCOO⁻ + NAD⁺ ⇄ CO2 + NADH. As used herein, NAD (nicotinamide adenine dinucleotide; the reduced form is NADH) is one of the coenzymes involved in redox reactions.


Formic acid dehydrogenase activity is measured using, for example, NADP+ (340 nm, ε=6.22×10³), methyl viologen (600 nm, ε=1.13×10⁴), or benzyl viologen (605 nm, ε=1.47×10⁴) (Andreesen, J. R. et al., (1974) J. Bacteriol., 120:6-14).
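
The extinction coefficients above allow an absorbance change to be converted into a reaction rate through the Beer-Lambert law. The following is a minimal sketch under stated assumptions: ε is taken to be in M⁻¹ cm⁻¹ (the text does not give units), the light path is 1 cm, and the absorbance slope used in the example is hypothetical.

```python
EPSILON_340 = 6.22e3  # value from the text; assumed unit: M^-1 cm^-1


def rate_umol_per_min_per_ml(
    delta_a_per_min: float,
    epsilon: float = EPSILON_340,
    path_cm: float = 1.0,  # assumed cuvette path length
) -> float:
    """Convert an absorbance slope (per minute) into micromoles of
    chromophore formed per minute per ml of assay mixture."""
    molar_per_min = delta_a_per_min / (epsilon * path_cm)  # mol/L per minute
    return molar_per_min * 1e6 / 1e3  # mol/L -> umol/L -> umol/ml


# Hypothetical slope of 0.0622 absorbance units per minute at 340 nm.
example_rate = rate_umol_per_min_per_ml(0.0622)
print(f"{example_rate:.4f} umol per minute per ml")
```

The same function can be used with the methyl viologen or benzyl viologen coefficients by passing a different `epsilon`.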


Known formic acid dehydrogenases include a homodimer consisting only of alpha subunits, a heterodimer and a heterotetramer consisting of alpha and beta subunits, and a dodecamer consisting of alpha, beta and gamma subunits.


Formic acid dehydrogenases of the present invention may consist of single or plural subunits. Preferably, the formic acid dehydrogenases consist of two or more subunits.


(Expression of Thermostable Formic Acid Dehydrogenase)


In order to express the formic acid dehydrogenases (SEQ ID NO: 305, 673, 1050 and 1051) encoded by open reading frames obtained by the present invention in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, disrupted by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


The crude enzyme solution was measured for its formic acid dehydrogenase enzymatic activity according to a routine method (Andreesen, J. R. et al., (1974) J. Bacteriol., 120: 6-14). Further, the enzyme has an optimum temperature of 90° C.


Example 9
Hyperthermostable Beta-Glycosidase

Beta-glycosidases collectively refer to a group of enzymes hydrolyzing a beta-glycoside bond. Beta-glycosidases include, for example, beta-glucosidase, beta-galactosidase, beta-mannosidase, beta-fructosidase and the like.


Beta-galactosidase, a type of beta-glycosidase, is an enzyme hydrolyzing beta-D-galactoside to yield D-galactose. Degrading lactose (glucose-beta-D-galactoside) into glucose and galactose using a galactosidase is a method for producing low-lactose milk by processing the lactose in cow milk. For this purpose, in addition to adding the enzyme to milk, the use of a fixed enzyme is also considered. Generally, an enzyme used as a fixed enzyme preferably exhibits high activity under the reaction conditions used (pH, temperature and the like) and is structurally stable.


As used herein, beta-galactosidase is an enzyme hydrolyzing beta-D-galactoside to produce D-galactose, and is systematically called beta-D-galactoside galactohydrolase. Beta-glycosidase of the present invention may have beta-glucosidase, beta-mannosidase and/or beta-xylosidase activities in addition to beta-galactosidase activity. Beta-glycosidase of the present invention may have transferring activity in addition to hydrolyzing activity of oligosaccharides.


(Expression of Beta-Glycosidase)


Beta-glycosidase (SEQ ID NO: 1122) was expressed using the same method as described above in the Examples. The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) containing ampicillin (50 μg/ml) and cultured at 37° C. until the OD660 reached 0.5. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, disrupted by sonication in 100 mM Bicine/KOH (pH 8.3)/10 mM MgCl2, and centrifuged again to yield a soluble fraction, which was then heated at 85° C. for thirty minutes. Heat-stable soluble fractions were centrifuged and concentrated, and then subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). A band of the expected molecular weight was detected, and the band was seen to increase over time after induction by IPTG.


The sample was heat treated as above and used for determining the enzymatic chemical properties of the beta-glycosidase of the present invention. As for methods of measuring enzymatic activities, see Pisani, F. M. et al., Eur. J. Biochem., 187, 321-328 (1990). The enzymatic activity liberating 1 μmol of p-nitrophenol per minute was defined as 1 U.


The optimum pH of the beta-glycosidase of the present invention was examined. The reaction was performed at 75° C. in a variety of buffers containing 1.5 μg/ml of the enzyme, with 2.8 mM pNp beta-glucopyranoside as the substrate. The buffers used were sodium phosphate buffer (pH 6-8), citrate buffer (pH 4-6), borate buffer (pH 8-9), and glycine buffer (pH 8.5-10) (data not shown). These results show that the beta-glycosidase has its optimum pH at around pH 6.5.


The optimum temperature for the beta-glycosidase of the present invention was also examined. Reactions were performed in sodium phosphate buffer (pH 6.5) containing 1.5 μg/ml of the enzyme with 2.8 mM pNp beta-glucopyranoside as the substrate at a variety of temperatures (data not shown). As a result, the beta-glycosidase of the present invention has its optimum temperature at around 100° C. Further, Arrhenius plotting was performed using this result, and it was demonstrated that the gradient of the line changes at around 75° C. (1/T = 2.87×10⁻³ K⁻¹). When the results were applied to the formula k = A·e^(−E/RT) (wherein k is the reaction rate constant, E is the activation energy, R is the gas constant, T is the absolute temperature, and A is the frequency factor), it was calculated that E=53.4 kJ/mol in the range of 25-75° C., and E=17.7 kJ/mol in the range of 75-100° C.
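
The meaning of the two activation energies can be illustrated with the Arrhenius formula given above. The following minimal sketch computes the position of the break point on the 1/T axis and the fold change in rate constant across each temperature range, assuming that A and E are constant within a range; the function names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def kelvin(celsius: float) -> float:
    return celsius + 273.15


def rate_ratio(e_kj_per_mol: float, t_low_c: float, t_high_c: float) -> float:
    """k(T_high) / k(T_low) from k = A*exp(-E/(R*T)), with A and E constant."""
    e_joule = e_kj_per_mol * 1000.0
    inverse_difference = 1.0 / kelvin(t_low_c) - 1.0 / kelvin(t_high_c)
    return math.exp(e_joule / GAS_CONSTANT * inverse_difference)


# Break point of the Arrhenius plot, expressed in units of 10^-3 K^-1.
inverse_t_break = 1000.0 / kelvin(75.0)

ratio_25_75 = rate_ratio(53.4, 25.0, 75.0)
ratio_75_100 = rate_ratio(17.7, 75.0, 100.0)

print(f"1/T at 75 C: {inverse_t_break:.2f} x 10^-3 K^-1")
print(f"k(75 C)/k(25 C): {ratio_25_75:.1f}")
print(f"k(100 C)/k(75 C): {ratio_75_100:.2f}")
```

Under these assumptions the rate constant rises by roughly twenty-fold between 25° C. and 75° C., but only by about one and a half times between 75° C. and 100° C., which reflects the shallower gradient above the break point.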


The thermostability of the beta-glycosidase of the present invention was examined. After the above samples were incubated for a variety of times at 90 or 100° C., enzymatic activity was measured at 80° C. in 50 mM sodium phosphate buffer (pH 6.5) containing 1.5 μg/ml of the enzyme, using 2.8 mM pNp-beta-glucopyranoside as a substrate (data not shown). This result indicates that the beta-glycosidase retains thermostability for about 18 hours at 90° C. and for about 1 hour at 100° C. When similar experiments were performed at 110° C., the enzyme was inactivated after about 15 minutes.


Substrate specificity of beta-glycosidase of the present invention was examined. Activities against a variety of substrates at 2.8 mM were measured at 80° C. in 50 mM sodium phosphate buffer (pH 6.5) containing 1.5 μg/ml of enzyme, and it was demonstrated that the beta-glycosidase of the present invention has high beta-glycosidase activity, and further, has beta-mannosidase, beta-glycosidase and beta-xylosidase activities.


Reaction rate constants for these four enzymatic activities were determined by measuring the activity against substrates at concentrations of 0.28 mM to 5.6 mM in 50 mM sodium phosphate buffer (pH 6.5) containing 1.5 μg/ml of enzyme at 80° C. In addition, 2 mM of each oligosaccharide (beta-lactose, cellobiose, cellotriose, cellotetraose and cellopentaose) was incubated with 3.0 μg/ml of enzyme at 80° C. for seven hours. Next, the reaction solution was subjected to thin layer chromatography (TLC) (data not shown). Spots of glucose were observed in lanes other than the beta-lactose lane. Cellotetraose, a tetrasaccharide, was cleaved into a trisaccharide and a monosaccharide, and cellopentaose, a pentasaccharide, was cleaved into a tetrasaccharide and a monosaccharide. These results show that the beta-glycosidase of the present invention has an exo-type hydrolyzing activity.


5 mM solutions of cellobiose, cellotriose, cellotetraose and cellopentaose in 50 mM sodium phosphate buffer (pH 6.5) containing 3 μg/ml of enzyme were incubated at 80° C. for four hours. Cellotetraose was also incubated for 0, 1, 2, 4 and 7 hours in a similar reaction system. Next, the reaction solution was subjected to thin layer chromatography (TLC). Cellobiose, cellotriose, cellotetraose and cellopentaose are a disaccharide, a trisaccharide, a tetrasaccharide and a pentasaccharide, respectively, and spots corresponding to saccharides larger than each of these substrates were observed after the reaction. This result demonstrates that the beta-glycosidase of the present invention has sugar-transferase activity in addition to an exo-type sugar-degrading activity. Under these reaction conditions, glucose and cellobiose increased over time, which means that hydrolyzing activity, rather than transferring activity, increases over time. That is, the beta-glycosidase of the present invention can be applied to the synthesis of oligosaccharides having any combination of beta linkages, such as an oligosaccharide in which cellobiose is linked to mannose, and the like.


Example 10
Hyperthermophilic Chitinase

Chitin is a type of mucopolysaccharide and has the structure of beta-poly-N-acetylglucosamine. Chitin is present in abundant amounts as a cell-wall substance of arthropods, molluscs, crustaceans, insects, fungi, bacteria and the like. Chitinase is an enzyme which hydrolyzes chitin, and is found in the gastric juice of snails, the exuvial fluid of insects, fruit skin, microorganisms and the like. This enzyme produces N-acetylglucosamine by hydrolysis of the beta-1,4 linkages of chitin, and has the systematic name poly(1,4-beta-(2-acetamide-2-deoxy-D-glucoside)) glucanohydrolase.


Chitinase may be industrially useful for the purpose of decomposing chitin, which is present in an abundant amount in nature, into forms more available to microorganisms and the like. Further, chitinase is also believed to play an important role as a protection mechanism against pathogens in plants, and thus attempts have been made to develop a disease-resistant plant by introducing a gene encoding the subject enzyme.


(Expression of Hyperthermophilic Chitinase)


As described in the above-mentioned Examples, hyperthermophilic chitinase (SEQ ID NO: 991) was expressed. The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.3. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, disrupted by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 70° C. for ten minutes, and the obtained thermostable fraction was centrifuged to yield the supernatant thereof as a sample. This sample was subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), and the expected band was detected at about 130 kDa.


The sample was heat-processed as above and purified using ammonium sulfate precipitation (40% saturation), an anion exchange column (HiTrapQ), a gel filtration column, and an anion exchange column (MonoQ) so that only a single band was observed on SDS-PAGE.


The enzymatic activities were measured using colloidal chitin in accordance with a method described in “Chitin, Chitosan Experimental Manual” (Chitin Chitosan Research Ed., Gihodo Publishing). The amount of enzyme required to produce reducing sugar corresponding to 1 μmol of N-acetylglucosamine per minute was defined as 1 U.


Colloidal chitin as a substrate was prepared as follows: 10 g of chitin (Wako Pure Chemical) was solubilized in 500 ml of 85% phosphoric acid and agitated for 24 hours at −4° C. The viscous liquid was added to a ten-fold volume of deionized water while agitating. The precipitate was obtained by centrifugation and was repeatedly washed with deionized water until the pH thereof was 5.0 or higher. The pH was then adjusted to 7.0 with NaOH, and the precipitate was washed with deionized water one more time. This was suspended in a small volume of water and autoclaved.


The optimum temperature of hyperthermostable chitinase of the present invention was determined by measuring the activities of the above-mentioned purified enzymes in 50 mM sodium phosphate (pH 7.0) for sixty minutes at a variety of temperatures. The reaction was terminated by cooling on ice (data not shown). The hyperthermostable chitinase of the present invention was shown to have an optimum temperature at about 80° C.


The optimum pH of the hyperthermostable chitinase of the present invention was determined by measuring the activities of the above-mentioned purified enzymes for sixty minutes at a variety of pH levels using the following buffers: 50 mM disodium hydrogen citrate-HCl (pH 2.5-4.0); 50 mM sodium acetate (pH 4.0-5.5); 50 mM MES-NaOH (pH 5.5-7.0); 50 mM Tris-HCl (pH 7.0-9.0); 50 mM glycine-NaOH (pH 9.0-10.0). The reaction was terminated by cooling on ice. The result is shown in FIG. 5. The hyperthermostable chitinase of the present invention was demonstrated to have an optimum pH at about 4.0. Further, peaks were observed at about pH 8.0.


The effects of salt on the activity of the hyperthermostable chitinase of the present invention were studied by measuring the activities of the above-mentioned purified enzymes in 50 mM sodium phosphate (pH 7.0) with a variety of concentrations of salt (NaCl or KCl) added thereto for 120 minutes at 80° C. The reaction was terminated by cooling on ice (data not shown). The activity of the hyperthermostable chitinase of the present invention was increased by the addition of the salt, and in particular, the addition of KCl increased the activity about two-fold.


The hyperthermostable chitinase of the present invention was studied for the effects thereof on oligosaccharides and colloidal chitin. The oligosaccharides used were N-acetyl-D-glucosamine (G1), di-N-acetyl-chitobiose (G2), tri-N-acetyl-chitotriose (G3), tetra-N-acetyl-chitotetraose (G4), penta-N-acetyl-chitopentaose (G5) and hexa-N-acetyl-chitohexaose (G6). Fifty μl of reaction mixture containing 0.7 mg of each oligosaccharide, 70 mM sodium acetate buffer (pH 6.0), 200 mM KCl, and purified enzyme (0.9 μg for G1-G3, and 1.8 μg for G4-G6) was incubated at 80° C. and sampled at 0, 5, 15, 30, 60 or 120 minutes thereafter. As for colloidal chitin, 1 ml of total reaction mixture containing 0.16 mg of colloidal chitin, 50 mM sodium acetate buffer (pH 5.0), and 0.6 μg of purified enzyme was incubated at 80° C., sampled at 1.5, 3.0 and 4.5 hours thereafter, centrifuged, and concentrated 20-fold. Next, the samples were subjected to TLC as follows: the sampled solution was spotted on a Kieselgel 60 silica gel plate (Merck), and a development solution (n-butanol:methanol:25% ammonia solution:water=5:4:2:1) was used for the development thereof. After development, the plates were dried, sprayed with a developing reagent (prepared by mixing 4 ml of aniline, 4 g of diphenylamine, 200 ml of acetone, and 30 ml of 85% phosphoric acid), and heated at 180° C. for about five minutes for coloring (data not shown).


From this result, it was demonstrated that the hyperthermophilic chitinase of the present invention has no degrading action against disaccharides or smaller saccharides, and that when chitin was used as a substrate, the enzyme produced chitobiose, a disaccharide, as the main product.


The hyperthermostable chitinase of the present invention was also studied for effects on 4-methylumbelliferone (4-MU) derivatives. GlcNAc-4-MU, GlcNAc2-4-MU or GlcNAc3-4-MU (0.01 mM) 10 μl, 100 mM acetate buffer (pH 5.0) 990 μl, and the purified enzyme 20 μl (18 ng) were incubated at 80° C. At 0, 5, 15, 30, 45, 60, or 180 minutes, 100 μl of the reaction solution was sampled and added to 900 μl of ice-cold 100 mM glycine-NaOH (pH 11) to terminate the reaction. The fluorescence of the samples was measured at 440 nm with excitation at 350 nm using a spectrofluorometer (data not shown). As a result, the reaction rates against each substrate were determined.
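
The final concentrations in this assay follow directly from the listed volumes, as in the following minimal sketch. The volumes and amounts are those given above; the variable names are hypothetical.

```python
# Volumes in microliters, as listed for the 4-MU assay.
SUBSTRATE_UL = 10.0
BUFFER_UL = 990.0
ENZYME_UL = 20.0

SUBSTRATE_STOCK_NM = 0.01 * 1e6  # 0.01 mM expressed in nanomolar
ENZYME_NG = 18.0

total_ul = SUBSTRATE_UL + BUFFER_UL + ENZYME_UL

# Final substrate concentration after dilution into the whole mixture.
substrate_nM = SUBSTRATE_STOCK_NM * SUBSTRATE_UL / total_ul

# Final enzyme concentration in ng per ml of reaction mixture.
enzyme_ng_per_ml = ENZYME_NG / (total_ul / 1000.0)

print(f"total volume: {total_ul:.0f} ul")
print(f"substrate: {substrate_nM:.1f} nM, enzyme: {enzyme_ng_per_ml:.1f} ng/ml")
```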


It has been reported that the digestion type of an enzyme, i.e., endo-type or exo-type, can be determined by comparing the reaction rates against disaccharide derivatives and against trisaccharide derivatives (Robbins, P. W., J. Biol. Chem., 263 (1), 443-447 (1988)). In this case, when the reaction rate against the disaccharide derivative is greater than that against the trisaccharide derivative, the enzyme is expected to be exo-type, whereas when the reaction rate against the trisaccharide derivative is greater than that against the disaccharide derivative, the enzyme is expected to be endo-type. Based on this description, the hyperthermostable chitinase of the present invention is determined to be endo-type.
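
The decision rule described above can be sketched as a small function. The function name and the example rates are hypothetical, since the measured rates are not shown in the text; only the comparison itself is taken from the description.

```python
def digestion_type(rate_disaccharide: float, rate_trisaccharide: float) -> str:
    """Classify an enzyme from its reaction rates against the disaccharide
    derivative and the trisaccharide derivative."""
    if rate_disaccharide > rate_trisaccharide:
        return "exo"
    if rate_trisaccharide > rate_disaccharide:
        return "endo"
    return "undetermined"  # equal rates give no basis for a decision


# Hypothetical rates in arbitrary units, for illustration only.
print(digestion_type(rate_disaccharide=1.0, rate_trisaccharide=4.0))
```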


Functions possessed by each domain of the hyperthermostable chitinase of the present invention were studied by creating a variety of deletion mutants. Deletion mutants Pk-ChiAΔ1 (containing the first Bacillus circulans chitinase homologous region and two cellulose binding domains), Pk-ChiAΔ2 (containing the fourth Streptomyces erythraeus chitinase homologous region and two cellulose binding domains), Pk-ChiAΔ3 (containing the first Bacillus circulans chitinase homologous region), and Pk-ChiAΔ4 (containing the fourth Streptomyces erythraeus chitinase homologous region), were produced based on the previous reference (Japanese Laid-Open Publication 11-313688).


From the culture of E. coli transformant strains possessing each plasmid, crude enzyme solution was obtained by heat treating at 70° C. for 10 minutes. This crude enzyme solution was spotted on a colloidal chitin plate (0.5% colloidal chitin, 1.5% agar) and was incubated to study the activities thereof (data not shown). Deletion mutants having only the first chitinase homologous region showed some activity, and the deletion mutants having the fourth chitinase homologous region only showed little activity. All of the deletion mutants having any chitinase homologous regions and the two cellulose binding domains showed high activities.


Thirty μl of the crude enzyme solution of deletion mutants Pk-ChiAΔ2 and Pk-ChiAΔ4 was mixed with 30 μl of 1% colloidal chitin and incubated at 70° C. for one hour. Next, the reaction solution was centrifuged, and the supernatant and a precipitate containing the colloidal chitin were obtained. The precipitate was washed twice with 50 mM sodium phosphate (pH 7.0) and was subjected to SDS-PAGE (data not shown). This result shows that the two cellulose binding domains are necessary for binding to chitin and for chitinase activity.


Example 11
Hyperthermostable Ribulose Bisphosphate Carboxylase

Ribulose bisphosphate carboxylase is an enzyme catalyzing photosynthetic reactions and is present in plant chloroplasts and microorganisms having photosynthetic ability. Ribulose bisphosphate carboxylase of higher plants is a macromolecule consisting of eight large subunits and eight small subunits (Type I), and is a major soluble protein in leaves of plants. On the other hand, ribulose bisphosphate carboxylase of microorganisms such as bacteria consists of only large subunits (Type II).


Ribulose bisphosphate carboxylase is used as a marker for plant classification and, for example, as a cell marker for cell fusion. Further, in view of the possible improvement of the global environment, attempts have been made to modify the ribulose bisphosphate carboxylase gene to produce a plant with increased ability to fix CO2 in the air. The breeding of photosynthetic bacteria and the development of devices having photosynthetic ability are also envisaged. For such purposes, it is useful to have a gene encoding ribulose bisphosphate carboxylase having increased enzymatic activity and structural stability.


As used herein, the term “ribulose bisphosphate carboxylase” refers to an enzyme adding CO2 to ribulose bisphosphate to produce two molecules of 3-phosphoglyceric acid. Further, ribulose bisphosphate carboxylase has an activity of adding O2 to ribulose bisphosphate to produce 2-phosphoglycolic acid and 3-phosphoglyceric acid (oxygenase activity).


(Expression of Hyperthermostable Ribulose Bisphosphate Carboxylase)


According to the method as described in the Examples above, hyperthermostable ribulose bisphosphate carboxylase (SEQ ID NO: 338) was expressed using the PCR method. The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) containing ampicillin (50 μg/ml) and cultured at 37° C. until the OD660 reached 0.5. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, disrupted by sonication in 100 mM Bicine/KOH (pH 8.3)/10 mM MgCl2, and centrifuged again to yield a soluble fraction, which was then heated at 85° C. for thirty minutes. Heat-stable soluble fractions were centrifuged and concentrated, and then subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). A band of the expected molecular weight was detected, and the band increased over time after induction by IPTG (data not shown).


The samples obtained by centrifugation of the above-mentioned heat-stable soluble fractions were further purified using the anion exchange column Resource Q (Amersham Pharmacia Biotech, Uppsala, Sweden) and the gel filtration column Superdex 200 HR 10/30 (Amersham Pharmacia Biotech, Uppsala, Sweden), and a single band was confirmed by SDS-PAGE (data not shown).


Purification was performed using AKTA explorer 10S (Amersham Pharmacia Biotech, Uppsala, Sweden). For the anion exchange column, separation was performed using a gradient of 0-1.0 M NaCl in a buffer of 100 mM Bicine/KOH (pH 8.3)/10 mM MgCl2. For gel filtration, 50 mM sodium phosphate/0.15 M NaCl buffer was used.


Analysis using gel filtration suggests that the expressed enzyme forms an octamer consisting of only large subunits.


The carboxylase activity of the samples purified as above was measured using D-ribulose 1,5-bisphosphate (RuBP) (Sigma) as the substrate, in accordance with a method described in Uemura, K. et al., Plant Cell Physiol., 37(3), 325-331 (1996).


First, the optimal pH of the hyperthermostable ribulose bisphosphate carboxylase of the present invention was studied. Reactions were performed in citrate buffer (pH 5.6), sodium phosphate buffer (pH 6.3), bicine buffer (pH 7.3, 7.8, 8.0 or 8.3), or glycine buffer (pH 9.1 or 10.1), each containing 10 mM MgCl2, with 30 mM RuBP as substrate, at a variety of temperatures. One unit of activity was defined as fixing 1 μmol CO2 per mg per minute. The results were expressed as a ratio against the activity at pH 8.3. These results demonstrate that the hyperthermostable ribulose bisphosphate carboxylase has an optimum pH of about 8.3.
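The unit definition and the normalization against pH 8.3 described above can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the readings in the usage section are invented placeholders, not data from the present specification.

```python
def specific_activity(umol_co2_fixed, enzyme_mg, minutes):
    """Units per the text: umol CO2 fixed per mg enzyme per minute."""
    if enzyme_mg <= 0 or minutes <= 0:
        raise ValueError("enzyme_mg and minutes must be positive")
    return umol_co2_fixed / (enzyme_mg * minutes)


def relative_activity(activity_by_ph, reference_ph=8.3):
    """Express each activity as a ratio against the activity at the reference pH."""
    reference = activity_by_ph[reference_ph]
    return {ph: value / reference for ph, value in activity_by_ph.items()}


if __name__ == "__main__":
    # HYPOTHETICAL readings (umol CO2, mg enzyme, min); not data from the source.
    readings = {7.3: (1.2, 0.5, 2.0), 8.3: (2.0, 0.5, 2.0), 9.1: (1.0, 0.5, 2.0)}
    activities = {ph: specific_activity(*r) for ph, r in readings.items()}
    for ph, ratio in sorted(relative_activity(activities).items()):
        print(f"pH {ph}: {ratio:.2f}")
```

Normalizing to the reference pH makes curves from separate enzyme preparations comparable, because the ratio no longer depends on the absolute amount of enzyme in each preparation.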


The hyperthermostable ribulose bisphosphate carboxylase of the present invention was investigated for its optimum temperature. Reactions were performed in a buffer containing 100 mM bicine-KOH (pH 8.3) and 10 mM MgCl2, using 30 mM RuBP as substrate, at a variety of temperatures (data not shown). It was demonstrated that the hyperthermostable ribulose bisphosphate carboxylase of the present invention has an optimum temperature of about 90° C.


The thermostability of the hyperthermostable ribulose bisphosphate carboxylase of the present invention was studied. The residual activity of the purified enzyme was measured after incubation for a variety of time periods at 80° C. and 100° C. (data not shown). It was demonstrated that the hyperthermostable ribulose bisphosphate carboxylase of the present invention has a half-life of about 15 hours at 80° C.
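To show what a half-life of about 15 hours implies for residual activity, the following sketch assumes that thermal inactivation follows first-order (single-exponential) kinetics. That kinetic model is an assumption made here for illustration; the text reports only the half-life, and the function names are hypothetical.

```python
import math


def decay_constant(half_life_hours):
    """First-order rate constant k (per hour) from a half-life."""
    return math.log(2) / half_life_hours


def residual_fraction(hours, half_life_hours=15.0):
    """Fraction of initial activity left after `hours`, assuming first-order decay."""
    return math.exp(-decay_constant(half_life_hours) * hours)


if __name__ == "__main__":
    for t in (0.0, 5.0, 15.0, 30.0):
        print(f"{t:5.1f} h at 80 C: {residual_fraction(t):.3f} of initial activity")
```

Under this assumption the activity halves every 15 hours, so one quarter of the initial activity would remain after 30 hours at 80° C.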


The carboxylase activity and oxygenase activity of the hyperthermostable ribulose bisphosphate carboxylase of the present invention were measured at 50-90° C. Further, the τ value, which is the ratio of carboxylase activity to oxygenase activity, was calculated (see Ezaki et al., J. Biol. Chem. 274(8): 5078-5082 (1999)).


The increase in atmospheric carbon dioxide has given rise to environmental problems such as the greenhouse effect. As a solution thereto, ribulose bisphosphate carboxylase, which catalyzes carbon dioxide fixation, has attracted attention. The ratio of oxygen to carbon dioxide in the air is about 20:0.03, so oxygen is much more abundant than carbon dioxide. Accordingly, for this purpose, a high specificity for the carboxylase reaction, that is, a greater τ value, is required. The enzymes from the KOD-1 strain have higher τ values than those of conventional type II enzymes (about 30-200×) or those of type I enzymes (about 10×), and thus are expected to be useful for more efficient carbon dioxide fixation.
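The reason a greater τ value matters at an oxygen to carbon dioxide ratio of about 20:0.03 can be sketched numerically. The example below uses the standard relation for the specificity factor, v_c/v_o = τ × [CO2]/[O2], which is not written out in the text. As a simplification it uses the gas-phase ratio from the text rather than dissolved concentrations, and the τ values in the usage section are hypothetical, not measured values from the present specification.

```python
O2_PARTS = 20.0    # from the text: oxygen : carbon dioxide is about 20 : 0.03
CO2_PARTS = 0.03


def tau(carboxylase_activity, oxygenase_activity):
    """tau as described in the text: carboxylase activity / oxygenase activity."""
    return carboxylase_activity / oxygenase_activity


def carboxylation_per_oxygenation(tau_value, co2=CO2_PARTS, o2=O2_PARTS):
    """Ratio of carboxylation to oxygenation rates: v_c / v_o = tau * [CO2] / [O2]."""
    return tau_value * co2 / o2


def break_even_tau(co2=CO2_PARTS, o2=O2_PARTS):
    """tau at which carboxylation and oxygenation proceed at equal rates."""
    return o2 / co2


if __name__ == "__main__":
    for hypothetical_tau in (10.0, 100.0, 300.0):  # illustrative values only
        ratio = carboxylation_per_oxygenation(hypothetical_tau)
        print(f"tau = {hypothetical_tau:6.1f}: v_c / v_o = {ratio:.3f}")
    print(f"break-even tau is about {break_even_tau():.0f}")
```

Because carbon dioxide is so scarce relative to oxygen, the rate ratio scales directly with τ, which is why an enzyme with a higher τ value fixes more carbon dioxide per oxygenation event under atmospheric conditions.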


Example 12
Fructose 1,6-bisphosphate Aldolase

In order to express the fructose 1,6-bisphosphate aldolase (SEQ ID NO: 1275) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


The crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the fructose 1,6-bisphosphate aldolase activity of interest. Further, the enzyme has an optimum temperature of 90° C.


Example 13
Glycerol Kinase

In order to express the glycerol kinase (SEQ ID NO: 1646) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as crude enzyme solutions.


The crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, the enzyme has an optimum temperature at 90° C.


Example 14
Glutamate Dehydrogenase

In order to express the glutamate dehydrogenases (SEQ ID NOs: 1239 and 1637) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which were used as crude enzyme solutions.


These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.


Example 15
Pyruvate Kinase

In order to express the pyruvate kinase (SEQ ID NO: 1776) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 16
Enolase

In order to express the enolase (SEQ ID NO: 681) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 17
Fructose 1,6-bisphosphatase

In order to express the fructose 1,6-bisphosphatase (SEQ ID NO: 1488) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 18
Hydrogenase

In order to express the hydrogenase (the subunits correspond to SEQ ID NOs: 1141, 1142, 1502, and 1503) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 19
β-glycosidase

In order to express the β-glycosidase (SEQ ID NO: 990) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 20
α-amylase

In order to express the α-amylase (SEQ ID NO: 268) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 21
Deacetylase

In order to express the deacetylase (SEQ ID NO: 1190) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 22
Cyclodextrin Glucanotransferase

In order to express the cyclodextrin glucanotransferase (SEQ ID NO: 1068) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 23
4-α-D-glucanotransferase

In order to express the 4-α-D-glucanotransferase (SEQ ID NO: 1185) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 24
DNA Polymerases

In order to express the DNA polymerases (SEQ ID NOs: 2, 93, 379, 648, 649, 743, 1386, 1740 and 1830) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest for the respective sequences. Further, these enzymes have an optimum temperature at 90° C. for the respective sequences.


Example 25
Homing Endonuclease

In order to express the homing endonuclease (SEQ ID NO: 2) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured by a modified endonuclease assay according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 26
Histones

In order to express the histones (SEQ ID NOs: 173, 1470 and 1963 and the like) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.


This crude protein solution was measured by a method using histone kinase as described in KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solution has an activity as a substrate for the activity of interest. Further, this protein was stable at 90° C.


Example 27
Histones A&B

In order to express the histones A and B (SEQ ID NOs: 1470 and 1962) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.


These crude protein solutions were measured by a method using histone kinase as described in KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solutions have an activity as a substrate for the activity of interest. Further, these proteins were stable at 90° C.


Example 28
Rec Protein

In order to express the Rec protein (SEQ ID NO: 1106) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.


This crude protein solution was measured according to Methods in Enzymology 262 (1995) to confirm that the crude protein solution has an activity of the Rec protein. Further, this protein was stable at 90° C.


Example 29
O6-methylguanine DNA methyl transferase

In order to express the O6-methylguanine DNA methyl transferase (SEQ ID NO: 1034) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to Methods in Enzymology 262 (1995) to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 30
PCNA

In order to express the PCNA (Proliferating Cell Nuclear Antigen) (SEQ ID NO: 93) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.


This crude protein solution was measured according to Methods in Enzymology 262 (1995) to confirm that the crude protein solution has the activity of the PCNA protein. Further, this protein was stable at 90° C.


Example 31
Indole Pyruvate Ferredoxin Oxidoreductases

In order to express the indole pyruvate ferredoxin oxidoreductases (SEQ ID NOs:) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook), edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest for the respective sequences. Further, these enzymes have an optimum temperature at 90° C. for the respective sequences.


Example 32
Glutamine Synthase

In order to express the glutamine synthase (SEQ ID NO: 627) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 33
Anthranilate Phosphoribosyl Transferases

In order to express the anthranilate phosphoribosyl transferases (SEQ ID NO: 394 and 1767) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 34
Cobyric Acid Synthase

In order to express the cobyric acid synthases (SEQ ID NO:137 and 1904) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to Methods in Enzymology, Academic Press, to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, this enzyme has an optimum temperature of 90° C.


Example 35
Phosphoribosyl Anthranilate Isomerase

In order to express the phosphoribosyl anthranilate isomerase (SEQ ID NO:44) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature of 90° C.


Example 36
Cobalamin Synthase

In order to express the cobalamin synthase (SEQ ID NO:181, 910, 1720 and 1973) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to Methods in Enzymology, Academic Press, to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature of 90° C.


Example 37
indole-3-glycerol-phosphate synthase

In order to express the indole-3-glycerol-phosphate synthase (SEQ ID NO: 772) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature of 90° C.


Example 38
Tryptophan Synthase

In order to express the tryptophan synthase (SEQ ID NO:395, 774, 954 and 2032) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.


Example 39
Ribose Phosphate Pyrophosphokinase

In order to express the ribose phosphate pyrophosphokinase (SEQ ID NO: 701) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 40
Glutamate Synthase

In order to express the glutamate synthase (SEQ ID NO: 1578) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 41
orotidine-5′-phosphate decarboxylase

In order to express the orotidine-5′-phosphate decarboxylase (SEQ ID NO: 1096) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 42
Anthranilate Synthase

In order to express the anthranilate synthase (SEQ ID NO:43 and 773) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.


Example 43
Aspartyl-tRNA Synthase

In order to express the aspartyl-tRNA synthase (SEQ ID NO: 808) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 44
Phenylalanyl-tRNA-Synthase

In order to express the phenylalanyl-tRNA-synthase (SEQ ID NO:506 and 878) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.


Example 45
Chaperonins

In order to express the chaperonin A (SEQ ID NO: 1368) and the chaperonin B (SEQ ID NO: 721) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.


These crude protein solutions were measured by a method described in Frydman, J. et al. (1994) Nature 370, 111, to confirm that the crude protein solutions have activity as a substrate for the enzyme of interest. Further, these proteins were stable at 90° C.


Example 46
TATA Binding Protein

In order to express the TATA binding protein (SEQ ID NO: 31) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.


This crude protein solution was measured according to Methods in Enzymology, Academic Press, to confirm that the crude protein solution has the activity of the protein. Further, this protein was stable at 90° C.


Example 47
TBP-Interacting Protein

In order to express the TBP-interacting protein (SEQ ID NO: 1289) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.


This crude protein solution was measured according to Methods in Enzymology, Academic Press, to confirm that the crude protein solution has the activity of the protein. Further, this protein was stable at 90° C.


Example 48
RNase HII

In order to express the RNase HII (SEQ ID NO:856) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 49
Hydrogenase Maturation Factor

In order to express the hydrogenase maturation factors (SEQ ID NO: 1144, 1154, 1156, 1516, 1518, 1519, 1869 and 1871) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.


These crude protein solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solutions have activity as substrates for the enzyme of interest. Further, these proteins were stable at 90° C.


Example 50
Lon Protease

In order to express the Lon protease (SEQ ID NO: 929) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 51
Thiol Protease

In order to express the thiol protease encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 52
Flagellins

In order to express the flagellins (SEQ ID NO: 11, 350, 351, 727, and 728) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.


These crude protein solutions were measured according to Aldridge P, Hughes K T., Curr Opin Microbiol. 2002 April; 5(2):160-5 and the references cited therein, to confirm that the crude protein solutions have activity as a substrate for the protein of interest. Further, these proteins were stable at 90° C.


Example 53
Subtilisin-Like Protease

In order to express the subtilisin-like protease (SEQ ID NO: 979) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.


This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.


Example 54
Cell Division Control Protein A

In order to express the cell division control protein A (SEQ ID NO: 1369) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.


This crude protein solution was measured for cell division controlling activity, to confirm that the crude protein solution has the activity of the protein of interest. Further, this protein was stable at 90° C.


Example 55
Endonucleases

In order to express the endonucleases (SEQ ID NOs: 547, 697, 900, 1450, 1702, 1716, 1731, and 2010) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.


The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.


These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest for the respective sequences. Further, these enzymes have an optimum temperature at 90° C. for the respective sequences.


Example 56
Ferredoxin

In order to express the ferredoxin (SEQ ID NO:253) encoded by an open reading frame obtained by the present invention in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added and the culture was continued at 37° C. for four hours. After culture, the cells were collected by centrifugation and disrupted by sonication, and the lysate was centrifuged to yield a cell extract. The cell extract was heated at 80° C. for fifteen minutes and then centrifuged, and the resulting supernatant was used as a crude protein solution.


This crude protein solution was assayed according to KOSOGAKU HANDOBUKKU (Enzyme handbook), edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that it has the activity of the protein of interest. Further, this protein was stable at 90° C.


Example 57
exo-β-D-glucosaminidase

In order to express the exo-β-D-glucosaminidase (SEQ ID NO:1902) encoded by an open reading frame obtained by the present invention in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformant was inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added and the culture was continued at 37° C. for four hours. After culture, the cells were collected by centrifugation and disrupted by sonication, and the lysate was centrifuged to yield a cell extract. The cell extract was heated at 80° C. for fifteen minutes and then centrifuged, and the resulting supernatant was used as a crude enzyme solution.


This crude enzyme solution was assayed according to KOSOGAKU HANDOBUKKU (Enzyme handbook), edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that it has the enzymatic activity of interest. Further, this enzyme has an optimum temperature of 90° C.


Example 58
Confirmation of Other Deduced Functions

In order to express the gene products encoded by open reading frames obtained by the present invention in Escherichia coli, the following operations are performed: fragments containing the open reading frames are amplified by PCR and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids are used to transform the Escherichia coli BL21(DE3) strain.


The resultant ampicillin resistant transformants are inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reaches 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) is then added and the culture is continued at 37° C. for four hours. After culture, the cells are collected by centrifugation and disrupted by sonication, and the lysates are centrifuged to yield cell extracts. The cell extracts are heated at 80° C. for fifteen minutes and then centrifuged, and the resulting supernatants are used as crude enzyme solutions.


These crude enzyme solutions are assayed according to KOSOGAKU HANDOBUKKU (Enzyme handbook), edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that each crude enzyme solution has the activity of interest for the respective sequence. Further, each gene product has an optimum temperature of 90° C. or is stable at 90° C.


Example 59
Biomolecule Chip—DNA Chip

Next, an exemplary preparation of a biomolecule chip is demonstrated. In this Example, methods for aligning DNAs having different sequences on a substrate and immobilizing them thereon are described.


Aggregates of DNA fragments having specific sequences of the present invention are immobilized on a substrate in the form of DNA spots. Glass is usually used as the substrate, but plastic may also be used. The DNA chip may be rectangular or circular. Each DNA spot comprises a DNA encoding a different gene of the present invention and is immobilized onto the substrate. Each DNA spot is 100-200 μm in diameter in the case of microarrays, and about 10-30 μm in the case of a DNA chip.
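The spot diameters quoted above determine how many genes fit on a given substrate. The following is a minimal sketch of that arithmetic, assuming that spots are laid out on a square grid and that each spot occupies one pitch-by-pitch cell; the pitch values, the printable area, and the function name are hypothetical and are not taken from the description.

```python
# Illustrative only: estimates how many spots fit on a rectangular printable
# area when spots are placed on a square grid.
# Assumptions: all lengths are integers in micrometres, and each spot
# occupies one pitch x pitch cell (pitch = centre-to-centre spacing).


def spots_on_grid(area_width_um, area_height_um, spot_diameter_um, pitch_um):
    """Return the number of whole grid cells that fit in the printable area."""
    if pitch_um < spot_diameter_um:
        raise ValueError("pitch must be at least the spot diameter")
    cols = area_width_um // pitch_um
    rows = area_height_um // pitch_um
    return rows * cols


if __name__ == "__main__":
    # Hypothetical 20 mm x 20 mm printable area, pitch set to twice the diameter.
    for diameter_um in (200, 100, 30, 10):
        count = spots_on_grid(20_000, 20_000, diameter_um, 2 * diameter_um)
        print(f"{diameter_um} um spots: {count} spots")
```

Under these assumptions, shrinking the spot from 200 μm to 20 μm raises the capacity of the same area a hundredfold, which is the practical difference between the two size ranges given in the text.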


Next, methods for forming each DNA spot are described. For example, a DNA solution of interest is deposited onto the substrate using a pin method, an inkjet method, or the like.


An exemplary DNA chip prepared in this manner is shown in FIG. 7.


Example 60
Biomolecule Chip—Protein Chip

Next, an exemplary preparation of a biomolecule chip is demonstrated. In this Example, methods for aligning proteins having different sequences on a substrate and immobilizing them thereon are described.


Aggregates of protein fragments having specific sequences of the present invention are immobilized on a substrate in the form of dots. Glass is usually used as the substrate, but plastic may also be used. The chip may be rectangular, as with a DNA chip, or circular. Each protein dot comprises a protein from a different gene of the present invention and is immobilized onto the substrate. Each protein dot is 100-200 μm in diameter in the case of microarrays, and about 10-30 μm in the case of a DNA chip.


Next, methods for forming each protein spot are described. For example, a protein solution of interest is deposited onto the substrate using a pin method, an inkjet method, or the like.


An exemplary protein chip prepared in this manner is shown in FIG. 7. Its appearance is similar to that of the DNA chip.


Although certain preferred embodiments have been described herein, it is not intended that such embodiments be construed as limitations on the scope of the invention except as set forth in the appended claims. Various other modifications and equivalents will be apparent to and can be readily made by those skilled in the art, after reading the description herein, without departing from the scope and spirit of this invention. All patents, published patent applications and publications cited herein are incorporated by reference as if set forth fully herein.


EFFECTS OF THE INVENTION

The present invention provides a method and kit for gene targeting in an efficient and accurate manner at any position in the genome of an organism. Further, information on the entire genomic sequence of Thermococcus kodakaraensis KOD1 and the gene information contained therein are also provided.


INDUSTRIAL APPLICABILITY

The present invention provides a variety of hyperthermostable gene products, and thus is useful in providing a method and kit for gene targeting in an efficient and accurate manner at any position in the genome of an organism. Such a variety of hyperthermostable gene products are applicable to global analysis of a hyperthermostable organism in genomic analysis and the like.

Claims
  • 1. A method for targeted-disruption of an arbitrary gene in the genome of a living organism, comprising the steps of: A) providing information of the entire sequence of the genome of the living organism; B) selecting at least one arbitrary region of the sequence; C) providing a vector comprising a sequence complementary to the selected region and a marker gene; D) transforming the living organism with the vector; and E) placing the living organism in a condition allowing homologous recombination.
  • 2. The method according to claim 1 wherein in step B, the region comprises at least two regions.
  • 3. The method according to claim 1, wherein the vector further comprises a promoter.
  • 4. The method according to claim 1 further comprising the step of detecting an expression product of the marker gene.
  • 5. The method according to claim 5 wherein the marker gene is located in the selected region.
  • 6. The method according to claim 1, wherein the marker gene is located outside of the selected region.
  • 7. The method according to claim 1, wherein the genome is the genome of Thermococcus kodakaraensis KOD1.
  • 8.-51. (canceled)
  • 52. An RNAi molecule having a sequence homologous to a reading frame sequence wherein, when the reading frame of Table 2 is f-1, f-2 or f-3, the reading frame sequence has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto.
  • 53. The RNAi molecule according to claim 52, which is an RNA or a variant thereof comprising a double-stranded portion of at least 10 nucleotide length.
  • 54. The RNAi molecule according to claim 52, comprising a 3′ overhang terminus.
  • 55. The RNAi molecule according to claim 54, wherein the 3′ overhang terminus is a DNA having at least 2 nucleotides in length.
  • 56. The RNAi molecule according to claim 54, wherein the 3′ overhang terminus is a DNA of two to four nucleotides in length.
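The numerical limits recited in claims 52, 53 and 56 (at least 70% homology, a double-stranded portion of at least 10 nucleotides, and a 3′ overhang of two to four nucleotides) can be illustrated with a simple check. The following is a minimal sketch only: it assumes that "homology" can be approximated by ungapped percent identity between two sequences of equal length, which is a simplification and not the definition given in the specification, and all function names are hypothetical. It is not a claim construction.

```python
# Illustrative only: checks a candidate against the numerical limits recited
# in claims 52, 53 and 56.
# Assumption: "homology" is approximated here by ungapped percent identity
# over equal-length sequences; the specification may define it differently.


def percent_identity(seq_a, seq_b):
    """Return the ungapped percent identity of two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def within_recited_limits(identity_pct, duplex_len_nt, overhang_len_nt):
    """Return True if all three numerical limits are satisfied."""
    return (
        identity_pct >= 70.0            # claim 52: at least 70% homology
        and duplex_len_nt >= 10         # claim 53: duplex of at least 10 nt
        and 2 <= overhang_len_nt <= 4   # claim 56: overhang of 2 to 4 nt
    )


if __name__ == "__main__":
    # Hypothetical sequences, not taken from the sequence listing.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, within_recited_limits(identity, 19, 2))
```

A real assessment would use the alignment method and parameters defined in the specification rather than this position-by-position comparison.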
Priority Claims (1)
Number Date Country Kind
2002-319011 Aug 2002 JP national
PCT Information
Filing Document Filing Date Country Kind 371c Date
PCT/IB03/03597 8/29/2003 WO 4/19/2006