DESCRIPTION (provided by applicant): Innovative computational approaches are critical for extracting relevant biological information from the massive amount of raw data produced by international genomics efforts. The working draft of the human genome, single nucleotide polymorphism (SNP) databases, and gene expression profile libraries are redefining the goals and opportunities of biomedical research. We propose to derive from these databases sequence- and structure-based annotations of whole protein families. We will search for all human nuclear hormone receptors (hNRs), cluster them into subfamilies of close homologues, build a corresponding database of tissue expression profiles, retrieve all known related genetic disorders, and map all associated SNPs onto atomic models of their 3D structures. This annotation scheme should reveal novel therapeutic targets and pharmacogenomic approaches for the development of improved and customized therapeutics. In phase II, we will apply a similar approach to other protein families, including kinases, phosphatases, proteases, and MHC molecules. The intellectual support of international experts in genomics and computational biology, together with Molsoft's state-of-the-art technology in computational biology, genomic data management, and online publishing, will be important assets for our research program. PROPOSED COMMERCIAL APPLICATION: The systematic search, classification, and tissue expression profile analysis of therapeutically relevant protein families will produce a valuable database for the identification of novel therapeutic targets. The construction of a structure database and the mapping of all SNPs onto our structures will enable the rational design of improved and customized therapeutics. The product (sequence- and structure-based annotated protein families) will be maintained and made available online by annual subscription.