Project Summary

The proposed work will address a critical gap in our understanding of neuronal phenotypes and cell types by developing machine learning algorithms and cloud-based software for the integration of multi-modality characterizations of large and growing datasets of cortical neurons in mouse and human. Through optimal and innovative use of potentially incomplete data and an emphasis on automated morphological characterization, the proposed tools will enable richer and more consistent characterizations of neurons from transcriptomic, anatomical, or electrophysiological profiling.

While large-scale, BICCN-funded cell type research programs rely on the notion of a unique neuronal identity that determines the cell's phenotype across different observation modalities, overarching agreement across physiological, anatomical, and molecular characterizations remains elusive. Although these large-scale programs have succeeded in generating extensive multi-modality datasets, the lack of principled, accurate, and widely available computational tools for alignment and inference presents a roadblock to the success of the overall program. A second issue is that anatomical characterization, despite being the classical approach to understanding cell types, lags significantly behind molecular and physiological methods in throughput.

The research proposed here addresses the alignment problem by building on the coupled autoencoder approach, which provides an efficient optimization framework centered on the ubiquity of neuronal identity. Importantly, the proposed software can utilize incompletely characterized data points, which are common in practice, to produce a unified visualization and analysis of abstract neuronal identity. This tool will be both flexible (e.g., the feature set can be changed) and extensible (e.g., more observation modalities can be added for joint alignment). The aligned representations enable consistent clustering of the neuronal population across the different observation modalities, a pressing problem in modern neuroscience.

We propose to address the anatomical throughput issue with an end-to-end computational pipeline, from the raw image of local neuronal arbors to an anatomical descriptor that can be readily aligned and interpreted by the coupled autoencoder software. By utilizing our extensive gold-standard manual reconstructions, we will train supervised deep artificial neural networks to segment neuronal arbors in sparse labeling scenarios. The rich set of training examples, together with algorithmic innovations, will endow this automated segmentation tool with superior generalizability, accelerating light microscopy-based studies.
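To make the alignment idea concrete, the following is a minimal sketch of a two-modality coupled autoencoder in PyTorch. The module names, network sizes, and coupling weight are illustrative assumptions rather than the proposal's actual implementation; it only shows how per-modality encoders can map cells into a shared identity space while boolean masks let incompletely characterized cells contribute to whichever loss terms their measurements support.

```python
# Minimal sketch of a coupled autoencoder for two modalities (illustrative only;
# class names, dimensions, and the coupling weight are assumptions, not taken
# from the proposal).
import torch
import torch.nn as nn

class ModalityAutoencoder(nn.Module):
    """Per-modality encoder/decoder mapping into a shared low-dimensional identity space."""
    def __init__(self, in_dim, latent_dim=3, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def coupled_loss(x_t, x_e, ae_t, ae_e, mask_t, mask_e, lam=1.0):
    """Per-modality reconstruction losses plus a coupling term that pulls the
    latent representations of the same cell together.  Boolean masks flag which
    cells were actually measured in each modality, so incompletely characterized
    cells still contribute to the terms they support."""
    z_t, xhat_t = ae_t(x_t)
    z_e, xhat_e = ae_e(x_e)
    mse = nn.functional.mse_loss
    loss = 0.0
    if mask_t.any():
        loss = loss + mse(xhat_t[mask_t], x_t[mask_t])
    if mask_e.any():
        loss = loss + mse(xhat_e[mask_e], x_e[mask_e])
    both = mask_t & mask_e
    if both.any():
        loss = loss + lam * mse(z_t[both], z_e[both])
    return loss

if __name__ == "__main__":
    # Synthetic transcriptomic (t) and electrophysiological (e) feature matrices.
    n, d_t, d_e = 128, 50, 40
    x_t, x_e = torch.randn(n, d_t), torch.randn(n, d_e)
    mask_t = torch.rand(n) > 0.2   # ~80% of cells have transcriptomic data
    mask_e = torch.rand(n) > 0.5   # ~50% have electrophysiological data
    ae_t, ae_e = ModalityAutoencoder(d_t), ModalityAutoencoder(d_e)
    opt = torch.optim.Adam(list(ae_t.parameters()) + list(ae_e.parameters()), lr=1e-3)
    for _ in range(5):
        opt.zero_grad()
        loss = coupled_loss(x_t, x_e, ae_t, ae_e, mask_t, mask_e)
        loss.backward()
        opt.step()
```

In this sketch the shared latent space is what plays the role of the abstract neuronal identity: once trained, either encoder places a cell into the same coordinates for visualization and consistent clustering, and adding a third modality would amount to adding another autoencoder and another coupling term.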
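The anatomical pipeline's segmentation step can be sketched in the same spirit. Below is a small, supervised 3D encoder/decoder network trained on a synthetic image volume and binary mask that stand in for patches cut from gold-standard manual reconstructions; the architecture, patch size, and loss weighting are assumptions for illustration, not the proposal's actual model.

```python
# Minimal sketch of supervised segmentation of neuronal arbors in light-microscopy
# stacks (illustrative; architecture and training details are assumed, not the
# proposal's actual pipeline).
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv3d(c_out, c_out, kernel_size=3, padding=1), nn.ReLU(),
    )

class TinyUNet3D(nn.Module):
    """Single-scale encoder/decoder with one skip connection; a stand-in for the
    deeper U-Net-style models typically used for sparse arbor segmentation."""
    def __init__(self):
        super().__init__()
        self.enc = conv_block(1, 16)
        self.down = nn.MaxPool3d(2)
        self.mid = conv_block(16, 32)
        self.up = nn.ConvTranspose3d(32, 16, kernel_size=2, stride=2)
        self.dec = conv_block(32, 16)
        self.head = nn.Conv3d(16, 1, kernel_size=1)  # one foreground (arbor) logit per voxel

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        u = self.up(m)
        d = self.dec(torch.cat([u, e], dim=1))
        return self.head(d)

if __name__ == "__main__":
    # One synthetic 64^3 image volume and a binary gold-standard mask.
    img = torch.randn(1, 1, 64, 64, 64)
    mask = (torch.rand(1, 1, 64, 64, 64) > 0.95).float()
    model = TinyUNet3D()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(20.0))  # arbor voxels are sparse
    for _ in range(3):
        opt.zero_grad()
        loss = loss_fn(model(img), mask)
        loss.backward()
        opt.step()
```

The voxel-wise mask predicted by such a network would then be skeletonized and summarized into the anatomical descriptors that feed the coupled autoencoder; those downstream steps are outside the scope of this sketch.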