This is a comprehensive research proposal on statistical modeling and analysis for educational assessment. The research addresses fundamental statistical problems that arise in the analysis of Big Data in education. Its focus is on modeling and inference for large-scale data with complex dependence and structure, such as high-dimensional response data and process data. These data arise from new methods of testing student knowledge: scenarios presented to students, and simulation-based environments in which student responses to the simulated environment are tested. The research is a collaboration between Columbia University and the Educational Testing Service.<br/><br/>The topics studied include latent graphical modeling for high-dimensional item response data, modeling and segmentation of process data via dictionary models, estimation of item-attribute relationships, dimension reduction, and theoretical analysis and computational methods for the proposed models. The analysis combines techniques and concepts from mathematics and probability and applies them to nonlinear statistical models and data analysis. For high-dimensional data, the proposed model combines latent variable and graphical approaches. For process data, recent advances in modeling and segmentation techniques for natural language processing will be investigated. In the theoretical development, several algebraic concepts will be studied in order to formulate model identifiability and to perform combinatorial analysis on high-dimensional discrete spaces. In addition, optimization algorithms will be developed using recent advances in numerical methods.