Collaborative Research: CSR: Medium: DISCO: Disciplined Data Science Framework for Storage I/O Management

Information

  • NSF Award
  • 2402328
  • Award Id
    2402328
  • Award Effective Date
    10/1/2024
  • Award Expiration Date
    9/30/2028
  • Award Amount
    $ 312,066.00
  • Award Instrument
    Continuing Grant

In the multi-billion-dollar storage industry, efficient operation of systems is essential for achieving application accuracy, reliability, and performance. Traditionally, this efficiency has relied on heuristics with adjustable parameters. However, as workloads and devices become increasingly complex, manual tuning becomes impractical. The DISCO project (which stands for “disciplined data science framework for storage I/O management”) will address how to systematically leverage data science (DS) to revolutionize the many facets of storage I/O decision making. More specifically, DISCO’s research objectives are to (a) pioneer a comprehensive data science pipeline tailored to enhance the storage I/O decision-making process through in-depth exploration of intricate concepts such as data augmentation, precise labeling, noise filtration, meticulous model engineering, drift detection, and many others; (b) target both classical I/O policies (e.g., I/O admission, prefetching) and open problems in the context of modern device features (multi-stream and KV-SSDs), and venture into “uncharted territories” such as investigating what data science can reveal from billions of performance data points; and (c) comprehensively encompass high-, medium-, and low-frequency decision making, addressing the unique challenges of each while also addressing cross-cutting concerns such as all-in-one integration.

The DISCO project will bring significant broader impacts, especially in training future storage data scientists. The Data Storage Research Vision 2025 (DSRV) paper from an NSF workshop emphasized "the deficit of the professionals who are knowledgeable in both storage and AI areas" where "the number of fresh graduate students with this combination of skills is small, and training existing staff takes time and effort" and "storage companies are also experiencing significant competition from other industries that require AI/ML knowledge." In this context, the DISCO project will train graduate and undergraduate students to become part of the next generation of storage data scientists. The project will also release open ML-for-storage testbeds along with a public storage data science curriculum. In terms of technology transfer, the DSRV workshop paper also states that “storage companies are excited by the opportunities of using ML to improve performance and reliability, and develop quality products.” The DISCO project will produce sophisticated ML-for-storage solutions for solid-state drive (SSD) systems, potentially making a positive impact on the SSD market, which is forecast to reach over $50 billion by 2025.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

  • Program Officer
    Erik Brunvand, ebrunvan@nsf.gov, 7032928950
  • Min Amd Letter Date
    7/11/2024
  • Max Amd Letter Date
    7/11/2024
  • ARRA Amount

Institutions

  • Name
    Florida International University
  • City
    MIAMI
  • State
    FL
  • Country
    United States
  • Address
    11200 SW 8TH ST
  • Postal Code
    331992516
  • Phone Number
    3053482494

Investigators

  • First Name
    Janki
  • Last Name
    Bhimani
  • Email Address
    janki.bhimani@fiu.edu
  • Start Date
    7/11/2024 12:00:00 AM

Program Element

  • Text
    CSR-Computer Systems Research
  • Code
    735400

Program Reference

  • Text
    MEDIUM PROJECT
  • Code
    7924