The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase I project is to enhance the impact of precision medicine by addressing large-scale medical data aggregation and cost-effective, optimized computation together, and to extend the utility of medical informatics well beyond current practice. Patient medical information comes in many diverse forms, including genomic sequences, medical images, and clinical observations. Integrating these data sources across patient populations has been shown to reveal patterns and similarities among patients, which inform treatment options. With advances in imaging and genomic sequencing technologies, the volume of available information is growing exponentially, straining current computational approaches and creating an imminent need for scalable data integration. Overcoming this data mountain opens the door to advanced analytics that support precision medicine and provide enhanced services to medical institutions. With these innovations, patients receive faster and more accurate diagnoses and treatments, clinicians deliver treatment decisions verified through patient cohort comparison, hospitals provide a better standard of care, and society benefits from support for global treatment options and well-informed pharmaceutical development.<br/><br/>The proposed project will develop a scalable aggregation and analysis framework that integrates various patient data modalities to inform personalized diagnosis and therapy in precision medicine. Currently, information from different modalities exists in silos, hindering joint analysis and insight. While prior research has applied machine learning techniques in medical imaging, these efforts have generally focused on a single domain and have not integrated information from other domains.
This project will aggregate features from genomics, imaging, and clinical characterization of patients into scalable databases and then use a distributed, parallel framework to enable efficient analytics on the resulting joint representation. The resulting platform will enable identification of cohorts based on both genotypes and phenotypes and support machine learning analyses to inform clinical decision systems or the identification of new personalized therapies.