SBIR Phase IB: Units-based numeric data extraction with knowledge of scientific context

Information

  • NSF Award
  • 1003361
Owner
  • Award Id
    1003361
  • Award Effective Date
    1/1/2010
  • Award Expiration Date
    6/30/2010
  • Award Amount
    $ 50,000.00
  • Award Instrument
    Standard Grant

This Small Business Innovation Research Phase I project focuses on a novel approach to develop units-based numeric indexing and search tools. The goal is to extract numeric quantities from technical literature, identify each with a corresponding physical unit, and further relate these to other identified semantic entities such as device properties. Unlike generic semantic information extraction strategies, which attempt to identify ambiguous structure with AI learning algorithms, Entanglement Technologies' solution capitalizes on the standardization and universality of units. Physical quantities are identified accurately with scientific heuristics in knowledge-rich contexts, and thus numeric search can be more efficient than keyword searches. The research objectives are to design and optimize these scientific heuristics across a wide array of physical units to intelligently extract numeric data. Concurrently, Entanglement will develop detailed scientific ontologies for identifying the context of an indexed number-unit pair.

Successful demonstration of this project offers the potential for rapid and accurate data mining for technical and scientific specifications. This will have broad applications in industrial and scientific research. Entanglement Technologies anticipates generating licensing revenue from access to this search technology, targeting financial institutions involved in high-tech investments and academic libraries providing scientific search capabilities. The ability to define comprehensive, yet objective, heuristics for contextualizing a number removes much of the user's responsibility for identifying the correctness of the search engine's retrieval. This inherent feature provides a non-expert with the capability for searching, aggregating and analyzing technical data currently only processed by experts. Furthermore, units-based numeric search offers the potential for automated number extraction and aggregation. Entanglement will utilize this capability by integrating its search functionality into a front-end user-friendly package, allowing a customer to benefit not only from the search but also from streamlined graph and report generation. If successful, this potential for automation will reduce the cost of such services currently provided by technical consulting firms.
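
The abstract describes extracting number-unit pairs from text and then searching by numeric value. As a rough illustration only, the idea can be sketched in Python as below. This is not the awardee's method: the unit table `UNITS`, the `Quantity` record, and the functions `extract_quantities` and `search` are all hypothetical names chosen for this sketch, and a single regular expression stands in for the far richer scientific heuristics and ontologies the project proposes.

```python
import re
from typing import List, NamedTuple

# Hypothetical unit table (an assumption for this sketch):
# unit symbol -> (physical dimension, factor converting to the SI base unit).
UNITS = {
    "nm": ("length", 1e-9),
    "um": ("length", 1e-6),
    "mm": ("length", 1e-3),
    "m": ("length", 1.0),
    "mV": ("voltage", 1e-3),
    "V": ("voltage", 1.0),
    "kHz": ("frequency", 1e3),
    "MHz": ("frequency", 1e6),
    "GHz": ("frequency", 1e9),
}


class Quantity(NamedTuple):
    value: float      # number as written in the text
    unit: str         # unit symbol as written in the text
    dimension: str    # physical dimension looked up in UNITS
    si_value: float   # value converted to the SI base unit


# Try longer symbols first so that "mm" is preferred over "m".
_UNIT_ALT = "|".join(sorted(map(re.escape, UNITS), key=len, reverse=True))

# A number (optional sign, decimals, exponent) followed by a known unit symbol.
# The lookarounds reject matches glued to other word characters.
_PATTERN = re.compile(
    r"(?<![\w.])(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*(" + _UNIT_ALT + r")(?![A-Za-z])"
)


def extract_quantities(text: str) -> List[Quantity]:
    """Return every number-unit pair found in text, in order of appearance."""
    results = []
    for match in _PATTERN.finditer(text):
        value = float(match.group(1))
        unit = match.group(2)
        dimension, factor = UNITS[unit]
        results.append(Quantity(value, unit, dimension, value * factor))
    return results


def search(text: str, dimension: str, low: float, high: float) -> List[Quantity]:
    """Return quantities of one dimension whose SI value lies in [low, high]."""
    return [
        q for q in extract_quantities(text)
        if q.dimension == dimension and low <= q.si_value <= high
    ]


if __name__ == "__main__":
    sample = "The laser emits at 780 nm with a 2.4 GHz modulation and a 5 V bias."
    for q in extract_quantities(sample):
        print(q)
    # Numeric search: lengths between 100 nm and 1 um, expressed in metres.
    print(search(sample, "length", 1e-7, 1e-6))
```

Because every quantity is normalized to an SI base unit, a range query such as "lengths between 100 nm and 1 um" matches values regardless of which unit the source text used, which is the kind of numeric search a keyword index cannot provide directly.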

  • Program Officer
    Errol Arkilic
  • Min Amd Letter Date
    12/18/2009
  • Max Amd Letter Date
    12/18/2009
  • ARRA Amount

Institutions

  • Name
    Quantifind Inc.
  • City
    Palo Alto
  • State
    CA
  • Country
    United States
  • Address
    2470 El Camino Real #214
  • Postal Code
    94306-1716
  • Phone Number
    6508044179

Investigators

  • First Name
    Ari
  • Last Name
    Tuchman
  • Email Address
    ari.tuchman@gmail.com
  • Start Date
    12/18/2009 12:00:00 AM