This project develops an evaluation methodology around the implementation of Large Language Model (LLM)-based tools to support human experts working in federal, state, and local government programs. The complexity of rules in these programs means that mistakes are common, often creating more work for staff and participants. LLM-based tools have the potential to reduce that complexity by reflecting relevant rules and information from internal knowledge bases back to staff, along with citations. They can also be used to automate and simplify steps that require significant time and introduce potential for human error. For instance, rather than having staff manually key in data from scanned documents, these tools can assist by categorizing and populating data fields, which staff then review. This project will conduct a study that reflects the breadth of households across the country and evaluate the trade-offs of implementing these systems: from the cost of development, to quantifying errors in LLM responses, to evaluating the potential burden on human experts of correcting those errors. The experiment compares performance (measured via metrics such as answer accuracy and time-to-answer) across responses generated under three conditions: 1) LLMs only, 2) humans only, and 3) humans working together with LLMs.

The project goal is to evaluate, with a replicable approach, whether LLMs ought to be applied to specific use cases within public services, and to provide structured summaries of findings for decision-makers weighing LLM adoption. Natural language processing methods are used to generate nationally representative synthetic prompts based on program rules and demographics across states (e.g., eligibility for different programs given different situations). The project will collect responses, in the form of a next step or decision, from three experimental conditions: a hypothetical LLM-only condition, a human-only condition, and a human-supported-by-LLM condition. The LLM tool itself was created with partner organizations using retrieval-augmented generation methods and is fine-tuned on similar question-and-answer data regarding government services. The project relies on gold-standard responses from quality-control auditors to evaluate correctness. The answers from each condition will be analyzed for subgroup disparities based on prompt characteristics (e.g., type of employment, age group, housing circumstances), allowing for granular reporting of the potential trade-offs of LLM adoption.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
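
The following is a minimal, illustrative sketch of how the evaluation described above might be scored: responses from the three conditions are compared against the auditors' gold-standard answers, with accuracy and time-to-answer summarized per condition and broken out by a prompt characteristic. The file name, column names, and subgroup variable (housing status) are assumptions for illustration, not the project's actual pipeline.

```python
"""Hypothetical scoring sketch for the three-condition evaluation."""
import pandas as pd

# Assumed layout: one row per (prompt, condition) response, with the
# quality-control auditors' gold-standard answer already joined in.
# Assumed columns: prompt_id, condition ("llm_only" | "human_only" |
# "human_plus_llm"), answer, gold_answer, time_to_answer_sec,
# employment_type, age_group, housing_status.
responses = pd.read_csv("responses.csv")  # hypothetical file

# Mark each response correct if it matches the gold-standard answer.
responses["correct"] = responses["answer"] == responses["gold_answer"]

# Headline metrics per condition: answer accuracy and mean time-to-answer.
by_condition = responses.groupby("condition").agg(
    accuracy=("correct", "mean"),
    mean_time_to_answer_sec=("time_to_answer_sec", "mean"),
    n=("correct", "size"),
)
print(by_condition)

# Subgroup disparity check: accuracy by one prompt characteristic
# (here, housing circumstances) within each condition.
by_subgroup = (
    responses.groupby(["condition", "housing_status"])["correct"]
    .mean()
    .unstack("housing_status")
)
print(by_subgroup)
```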