Data Science Core

Information

  • Research Project
  • 10047728
  • ApplicationId
    10047728
  • Core Project Number
    U19NS118284
  • Full Project Number
    1U19NS118284-01
  • Serial Number
    118284
  • FOA Number
    RFA-NS-19-003
  • Sub Project Id
    8070
  • Project Start Date
    9/17/2021
  • Project End Date
    8/31/2026
  • Program Officer Name
  • Budget Start Date
    6/1/2020
  • Budget End Date
    5/31/2021
  • Fiscal Year
    2021
  • Support Year
    01
  • Suffix
  • Award Notice Date
    9/17/2021
Organizations

Data Science Core

Data Science Core (DSC) Leads: Krishna Shenoy PhD and Chris Roat PhD (with Surya Ganguli PhD)

Project Summary

Given the large volumes of optical, electrical, genetic and behavioral data that will be generated, stored and computationally analyzed, it is essential to establish a comprehensive yet streamlined DSC. There are four major data challenges that the DSC will address.

(1) Data size. Each experimental lab will generate very large, and rapidly increasing, datasets. We must contend with storing, pre-processing (e.g., spike sorting) and processing (e.g., single-trial analyses) these large and growing datasets.

(2) Metadata. Collaborations between groups are often hampered by not fully capturing, in a searchable database and linked to the bulk data, all animal and experiment conditions, or so-called metadata. We will build in capabilities and requirements to electronically capture full metadata.

(3) Data format. Collaborations are also often hampered by the effort required to understand each lab's dataset format. Data format often depends on whether a given measurement system was custom built or relies on a commercial system. We will capture this information as part of the metadata for historical data relevant to this U19, and moving forward we will adopt the increasingly popular Neurodata Without Borders (NWB) data format.

(4) Across animals and labs. Performing large-scale analyses across many animals and labs is often truly onerous. This is because all three of the challenges listed above combine, causing one to shy away from anything other than essential analyses (e.g., pooling results across just a few mice in one specific condition). We will both build our own data pipelines, which automatically query our metadata database and then retrieve the indicated experimental data, and adopt the increasingly popular DataJoint pipeline.

Our DSC will be led by Prof. Shenoy, Dr. Roat (with considerable industrial-scale data handling experience, and now at Stanford) and Prof. Ganguli (RP3 lead). Two full-time software engineers (TBD) will implement the DSC architecture, including bulk data server, relational meta-database, data standards and data pipeline. The software engineers will work closely with the rest of the team to help assure good communication, and to help migrate analysis code and documentation into professional software standards for dissemination. This will enable storage, retrieval and analysis of data in an efficient and modular way, which enables rapid replacement of any piece of the data analysis pipeline, as is essential for a creative environment that also promotes rapid feedback of emerging ideas to subsequent experiments.

We believe in Open Science, including open source code (e.g., GitHub) and data formats. We will share data with the broader community, including with other U19 consortia. Thus our DSC is critical to the success of our proposed research, and serves as the central hub of our U19 research.
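
As a rough illustration of the "query the metadata database, then retrieve the indicated experimental data" idea described in the summary, the sketch below uses Python's standard sqlite3 module. It is not the DSC's actual design: the table name, column names, lab and animal identifiers, and file paths are all hypothetical, chosen only to show how a relational metadata table can point to bulk data files across labs.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory metadata database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE sessions (
               session_id TEXT PRIMARY KEY,
               lab        TEXT,
               animal_id  TEXT,
               condition  TEXT,
               data_path  TEXT
           )"""
    )
    # data_path stands in for the location of a bulk data file (e.g. an NWB file).
    rows = [
        ("s001", "lab_a", "mouse01", "baseline", "/bulk/lab_a/s001.nwb"),
        ("s002", "lab_a", "mouse02", "stim", "/bulk/lab_a/s002.nwb"),
        ("s003", "lab_b", "mouse07", "baseline", "/bulk/lab_b/s003.nwb"),
    ]
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def find_data_paths(conn, condition):
    """Return bulk-data paths for all sessions with the given condition, across labs."""
    cur = conn.execute(
        "SELECT data_path FROM sessions WHERE condition = ? ORDER BY session_id",
        (condition,),
    )
    return [row[0] for row in cur.fetchall()]


conn = build_demo_db()
paths = find_data_paths(conn, "baseline")
print(paths)
```

In this toy setup a single query pools sessions from two labs by experimental condition, which is the kind of cross-animal, cross-lab retrieval the summary says is onerous without captured metadata.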

IC Name
NATIONAL INSTITUTE OF NEUROLOGICAL DISORDERS AND STROKE
  • Activity
    U19
  • Administering IC
    NS
  • Application Type
    1
  • Direct Cost Amount
    275233
  • Indirect Cost Amount
    157984
  • Total Cost
  • Sub Project Total Cost
    433217
  • ARRA Funded
    False
  • CFDA Code
  • Ed Inst. Type
  • Funding ICs
    NINDS:433217
  • Funding Mechanism
    Non-SBIR/STTR RPGs
  • Study Section
    ZNS1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    STANFORD UNIVERSITY
  • Organization Department
  • Organization DUNS
    009214214
  • Organization City
    STANFORD
  • Organization State
    CA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    943052004
  • Organization District
    UNITED STATES