Bayesian Textual and Multimedia Information Retrieval

Information

  • Research Project
  • ApplicationId
    6528391
  • Core Project Number
    R44LM006520
  • Full Project Number
    5R44LM006520-03
  • Serial Number
    6520
  • FOA Number
  • Sub Project Id
  • Project Start Date
    9/15/1997
  • Project End Date
    9/29/2004
  • Program Officer Name
    YE, JANE
  • Budget Start Date
    9/30/2002
  • Budget End Date
    9/29/2004
  • Fiscal Year
    2002
  • Support Year
    3
  • Suffix
  • Award Notice Date
    9/13/2002
Organizations

DESCRIPTION (provided by applicant): Industry transformation and rapid technical advances are creating tremendous amounts of electronic multimedia information. We are developing a multimedia search agent for health data networks and data banks. The search agent employs a generalized probabilistic model that bridges the gap between automatic feature extraction and semantic understanding. What distinguishes our approach is the integration of Bayesian methodology with a fast and scalable semantic interpreter. The semantic interpreter can generate a bootstrap database of prior probabilities, overcoming a major weakness of the traditional probabilistic model. We also provide a principled approach to user feedback, in contrast to existing ad hoc methods, both probabilistic and nonprobabilistic. Relevance feedback on an initial database of prior probabilities can incrementally improve retrieval results to unprecedented levels of precision and recall. The model supports interactive definition and training of new semantic labels in a collaborative environment. Semantic labels organize index term dependencies in a tree-like structure with probabilities at each node, and allow the user to define concepts that match specific information needs more closely than the raw feature information found in an indexed database. Finally, we propose a comprehensive approach to the difficult problem of combining probability distributions, or relevance judgements, from different search engines.

PROPOSED COMMERCIAL APPLICATIONS: The proposed methodology adds a customizable interpretive layer to electronic collections and archival multimedia databases. The Phase I prototype has opened an immediate partnership opportunity in this area. The Bayesian search and retrieval functions will become an add-on module to many database management systems. The ability to train the system to recognize visual concepts gives us a definite advantage in image mining. The decision-maker software can merge the results of different experts or search engines, a problem for which there are presently only unsatisfactory ad hoc solutions and which offers great opportunities on the Web and in e-commerce.
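
The abstract does not say how relevance feedback updates the bootstrap prior probabilities. As an illustration only, the sketch below assumes a conjugate Beta-Bernoulli model, which is one common way to do this kind of update and is not necessarily the project's method. The class name `RelevanceBelief` and the `strength` pseudo-count are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RelevanceBelief:
    """Beta(alpha, beta) belief that a document is relevant to a semantic label."""

    alpha: float
    beta: float

    @classmethod
    def from_prior(cls, prior: float, strength: float = 2.0) -> "RelevanceBelief":
        # `strength` is a hypothetical pseudo-count: how many feedback
        # judgements the bootstrap prior is assumed to be worth.
        if not 0.0 < prior < 1.0:
            raise ValueError("prior must be strictly between 0 and 1")
        if strength <= 0.0:
            raise ValueError("strength must be positive")
        return cls(alpha=prior * strength, beta=(1.0 - prior) * strength)

    def update(self, relevant: bool) -> None:
        # Conjugate Beta-Bernoulli update: each judgement adds one count
        # to the matching parameter.
        if relevant:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean probability of relevance.
        return self.alpha / (self.alpha + self.beta)


# Start from an assumed bootstrap prior of 0.25, then apply four judgements.
belief = RelevanceBelief.from_prior(0.25, strength=4.0)
for judgement in (True, True, True, False):
    belief.update(judgement)
print(belief.mean)
```

Under these assumptions, a larger `strength` makes the bootstrap prior harder for user feedback to override, while a smaller one lets a few judgements dominate.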

IC Name
NATIONAL LIBRARY OF MEDICINE
  • Activity
    R44
  • Administering IC
    LM
  • Application Type
    5
  • Direct Cost Amount
  • Indirect Cost Amount
  • Total Cost
    371863
  • Sub Project Total Cost
  • ARRA Funded
  • CFDA Code
    879
  • Ed Inst. Type
  • Funding ICs
    NCMHD: 330645; NLM: 41218
  • Funding Mechanism
  • Study Section
    ZRG1
  • Study Section Name
    Special Emphasis Panel
  • Organization Name
    INSIGHTFUL CORPORATION
  • Organization Department
  • Organization DUNS
    150683779
  • Organization City
    SEATTLE
  • Organization State
    WA
  • Organization Country
    UNITED STATES
  • Organization Zip Code
    98109
  • Organization District
    UNITED STATES