No evidence currently exists to support the contention that the psychometric differences between performance (portfolio) assessment and multiple-choice assessment can be reconciled, or that different performance assessments administered to different groups of students can be combined and summarized in educationally meaningful ways. The proposed study is a multi-year investigation of the ability of elementary and middle school science and mathematics teachers, students, parents, and evaluators to develop and reach consensus on portfolio assessments that produce aggregate data within individual schools and across schools that exhibit diversity along key dimensions (size, location, racial/ethnic makeup of the student population, etc.). Formative and summative measures (ranging from low-inference to high-inference) will be used to evaluate the proposed activities in terms of four targets: the utility of the model, the feasibility of the consensus-building process, the meaningfulness of the data for multiple users, and the extent to which the project was implemented as proposed.