EAGER: Integrating Dense Paraphrased-Enriched Representations with Large Language Models

Information

  • Award Id
    2326985
  • Award Effective Date
    6/1/2023
  • Award Expiration Date
    5/31/2024
  • Award Amount
    $149,997.00
  • Award Instrument
    Standard Grant

Abstract

Current large language models such as ChatGPT have captured attention with their impressive ability to answer a broad range of user questions. But these models can still be confused by content that is not explicitly stated in the text. For example, while a recipe may say “cut the onions”, it does not say that one ends up with onion pieces or that a knife was used. It is this natural economy of language that often makes life hard for current language processing tools. This project develops an approach called Dense Paraphrasing that enriches the surface language with paraphrases revealing the deeper relations between words in a body of text, providing the information current models need to make better inferences and judgments from the text.

In this project, we develop Dense Paraphrasing, a model for text enrichment based on the observation that a single meaning can be expressed through many different syntactic patterns in language. The project develops the process of integrating Dense Paraphrasing with large language models, followed by an evaluation of the resulting system. This involves four subtasks: (1) creating a dataset with rich annotation that makes explicit the hidden semantic relations in the text; (2) developing baseline Dense Paraphrasing text generation systems that translate the source text into its “dense” form by fine-tuning large language models (a sketch of this step follows below); (3) applying Dense Paraphrasing to downstream NLP tasks, starting with question answering; and (4) applying it to a coreference resolution problem involving objects as they move or change through a sequence of events. The ultimate goal of the project is to build a novel framework that combines advanced large language models with deeper semantics and to integrate it into downstream applications, advancing the state of the art in NLP and making these systems more generally useful for real-world problems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
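
The following is a minimal sketch of what the generation step in subtask (2) could look like, written against the Hugging Face transformers API. The checkpoint name is hypothetical, standing in for the fine-tuned model the project would produce, and the example output is illustrative only.

    # Minimal sketch of subtask (2): translating source text into its
    # "dense" paraphrased form with a fine-tuned sequence-to-sequence model.
    # The checkpoint name is hypothetical; the project's own fine-tuned
    # model would be substituted here.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    MODEL_NAME = "example-org/dense-paraphrase-generator"  # hypothetical

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    def dense_paraphrase(text: str) -> str:
        """Rewrite text so implicit arguments and results become explicit."""
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        output_ids = model.generate(**inputs, max_new_tokens=128)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # Illustrative only: "Cut the onions." might become
    # "Cut the onions with a knife, yielding onion pieces."
    print(dense_paraphrase("Cut the onions."))

A downstream application such as question answering (subtask 3) would then run over the enriched text rather than the original, so that questions like “What tool was used?” can be answered directly from the dense form.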

  • Program Officer
    Eleni Miltsakaki
    emiltsak@nsf.gov
    (703) 292-2972
  • Min Amd Letter Date
    5/18/2023
  • Max Amd Letter Date
    5/18/2023

Institutions

  • Name
    Brandeis University
  • City
    WALTHAM
  • State
    MA
  • Country
    United States
  • Address
    415 SOUTH ST
  • Postal Code
    02453-2728
  • Phone Number
    (781) 736-2121

Investigators

  • First Name
    James
  • Last Name
    Pustejovsky
  • Email Address
    pustejovsky@gmail.com
  • Start Date
    5/18/2023

Program Element

  • Robust Intelligence (Code 7495)

Program Reference

  • ROBUST INTELLIGENCE (Code 7495)
  • EAGER (Code 7916)