This Small Business Innovation Research (SBIR) Phase I research will establish a rapid method for identifying a suitable production system for a heterologous protein. Commercial production of a protein generally requires the use of an organism other than the natural source, a so-called heterologous host. Finding such a host is often a time-consuming trial-and-error process that can significantly hinder commercialization of the protein. Even when a production system is found, it is often far from optimized, because optimization is prohibitively slow and costly and because the critical parameters are currently poorly understood. We propose to use a machine learning approach to identify the critical variables, and the correlations among them, that determine optimal protein expression. Test gene sets that vary systematically in a number of relevant parameters will be synthesized and used for protein production in a bacterial system. Protein production will then be measured to generate a multidimensional sequence-expression landscape, which will be modeled. The resulting model will then be incorporated into synthetic gene design software.

The broader impact of this research is to provide a simple means for cost-effective production of valuable therapeutic proteins in easily cultured and manipulated organisms. Recent technologies, including gene synthesis, have greatly improved our ability to recognize, isolate, study, and engineer proteins of value, expanding the field of candidate proteins for commercialization. To capture the potential of this immense and expanding market, we must have reliable means of producing proteins in heterologous systems. Gene synthesis allows us to approach this problem systematically. We expect the tools and correlations gained from this project to drastically improve the speed, reduce the cost, and remove the uncertainties of modern protein manufacturing.