Conversational models are permeating all aspects of society, including systems for clinical support, education, and gaming. Such systems have a remarkable capacity to personalize their outputs. For example, a user seeking movie recommendations might describe their current circumstances and the types of movies they normally enjoy. In response, a language model might suggest a list of movies with descriptions similar to those in the request. This project will explore the limitations of conversational models for problems related to personalized recommendation. Specific areas of focus include collecting new datasets for model training and evaluation; ensuring that language models recommend items fairly; and exploring evaluation protocols to make sure recommendations are aligned with human preferences. This project will have benefits in settings where language models are used to make personalized recommendations, spanning applications from movie recommendation to personalized clinical support.<br/><br/>General-purpose language models act as powerful conversational recommenders despite never having been trained for that specific purpose. This project will explore new approaches to conversational recommendation, revisiting issues of data, methodology, and evaluation. First, the project will collect new datasets that are large, naturalistic, highly contextual, and carefully annotated. Second, the project will explore knowledge grounding and controllability in conversational recommenders, especially with the goal of performing fairness and bias interventions. Finally, the project will explore model evaluation, especially by developing simulation approaches that allow conversational models to be evaluated offline by mimicking the behavior of real users. This project extends research in each of the areas on which it builds: recommender systems, NLP, controllability, and fairness. 
Closing the research gap between these areas enables a host of exciting new applications dealing with mixed datasets that combine language with user interaction data, ranging from standard recommendation problems to personalized language understanding tasks in fashion, health, and therapy, among others. The project aims to foster the retention and involvement of groups including underrepresented minorities, high-schoolers, and engineering professionals.<br/><br/>This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.