Least Angle Regression

Information

  • Research Project
  • ApplicationId
    7219604
  • Core Project Number
    R44GM074313
  • Full Project Number
    2R44GM074313-02
  • Serial Number
    74313
  • FOA Number
    PA-06-20
  • Sub Project Id
  • Project Start Date
    5/15/2005
  • Project End Date
    9/28/2008
  • Program Officer Name
    COUCH, JENNIFER A
  • Budget Start Date
    9/29/2006
  • Budget End Date
    9/28/2007
  • Fiscal Year
    2006
  • Support Year
    2
  • Suffix
  • Award Notice Date
    9/29/2006

DESCRIPTION (provided by applicant): This SBIR project aims to produce superior methods and software for classification and regression when there are many potential predictor variables to choose from. The methods should (1) produce stable results, where small changes in the data do not produce major changes in the variables selected or in model predictions; (2) produce accurate predictions; (3) facilitate scientific interpretation by selecting a smaller subset of predictors that provides the best predictions; (4) allow continuous and categorical variables; and (5) support linear regression, logistic regression (predicting a binary outcome), survival analysis, and other types of regression. This project is based on least angle regression, which unifies and provides a fast implementation for a number of modern regression techniques. Least angle regression has great potential, but currently available software is limited in scope and robustness. The outcome of this project should be software that is more robust and more widely applicable. This software would apply broadly, including to medical diagnosis, cancer detection, feature selection in microarrays, and modeling patient characteristics such as blood pressure. Phase I work demonstrates feasibility by extending least angle regression in three key directions: categorical predictors, logistic regression, and a numerically accurate implementation. Phase II goals include extensions to other types of explanatory variables (e.g., polynomial or spline functions, and interactions between variables), to survival and other additional regression models, and to handling missing data and massive data sets. The proposed software will enable medical researchers to obtain high prediction accuracy, along with stable and interpretable results, in high-dimensional situations.
Predicting outcomes based on covariates, determining which covariates most affect outcomes, and adjusting treatment effect estimates for covariates are among the most important problems in biostatistics. Prediction and feature selection are particularly difficult when there are more possible features than samples; gene microarrays and protein mass spectrometry are extreme examples of this, producing thousands to millions of measurements per sample. Least angle regression (LARS) excels at feature selection; the proposed software should enable medical researchers to obtain stable and interpretable models with better prediction accuracy in high-dimensional situations.
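
As a rough illustration of feature selection when there are more predictors than samples, the sketch below implements incremental forward stagewise regression, one of the techniques that least angle regression is known to unify. It is not the project's software and it is not LARS itself: it takes many tiny steps where LARS computes the step length exactly. The function names (standardize, forward_stagewise), the step size eps, and the toy data are all assumptions made for this example.

```python
import math

def standardize(columns):
    """Center each predictor column and scale it to unit Euclidean norm."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        out.append([v / norm for v in centered])
    return out

def forward_stagewise(columns, y, eps=0.01, max_steps=10000):
    """Incremental forward stagewise regression (hypothetical helper).

    Returns (beta, order): coefficients on the standardized scale and the
    order in which predictors first entered the model.
    """
    X = standardize(columns)
    y_mean = sum(y) / len(y)
    residual = [v - y_mean for v in y]
    beta = [0.0] * len(X)
    order = []
    for _ in range(max_steps):
        # Inner product of every predictor with the current residual.
        corr = [sum(x * r for x, r in zip(col, residual)) for col in X]
        j = max(range(len(X)), key=lambda k: abs(corr[k]))
        if abs(corr[j]) < eps:
            break  # nothing left that a step of size eps can explain
        step = eps if corr[j] > 0 else -eps
        beta[j] += step
        residual = [r - step * x for r, x in zip(residual, X[j])]
        if j not in order:
            order.append(j)
    return beta, order

# Invented toy data: 4 samples, 6 candidate predictors (more features than samples).
columns = [
    [1, 2, 3, 4],   # predictor 0: the only one that drives the outcome
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [2, 1, 1, 3],
    [0, 0, 1, 0],
]
y = [4, 7, 10, 13]  # y = 3 * predictor 0 + 1

beta, order = forward_stagewise(columns, y)
print("predictors selected, in order of entry:", order)
print("standardized coefficients:", [round(b, 2) for b in beta])
```

Predictors are standardized first so that the inner product with the residual is comparable across columns. Coefficients that are still exactly zero when the loop stops mark predictors that were never selected, which is the sense in which such methods yield a smaller, more interpretable subset.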

IC Name
NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
  • Activity
    R44
  • Administering IC
    GM
  • Application Type
    2
  • Direct Cost Amount
  • Indirect Cost Amount
  • Total Cost
    374846
  • Sub Project Total Cost
  • ARRA Funded
  • CFDA Code
    859
  • Ed Inst. Type
  • Funding ICs
    NIGMS:374846
  • Funding Mechanism
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    INSIGHTFUL CORPORATION
  • Organization Department
  • Organization DUNS
    150683779
  • Organization City
    SEATTLE
  • Organization State
    WA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    98109
  • Organization District
    UNITED STATES