SBIR Phase II: A Multithreaded Storage Engine Using Highly-Concurrent Fractal Trees

Information

  • NSF Award
  • 1058565
Owner
  • Award Id
    1058565
  • Award Effective Date
    2/1/2011
  • Award Expiration Date
    7/31/2013
  • Award Amount
    $ 425,000.00
  • Award Instrument
    Standard Grant

This Small Business Innovation Research (SBIR) Phase II project will apply multithreading techniques to provide multi-terabyte (and larger) high-performance databases in MySQL. The company has developed a high-performance storage engine for MySQL that maintains indexes on live data 100 times faster than current commonly used structures. The technology solves the problem of maintaining indexes on large databases in the face of high trickle-load indexing rates. In Phase I, the company developed a multithreaded bulk loader to solve the problem of how to load data quickly. The next significant research problems for large MySQL databases are to allow online, or "hot", schema changes, in which, for example, an index can be added without taking the database down, and to use multithreading to speed up joins and reductions so that large data sets can be queried quickly. In this project, the researchers will investigate the use of multithreading to support hot indexing and parallel joins and reductions.

If successful, multi-terabyte and larger databases will be manageable and fast on modest hardware, and will scale with both CPU cores and disks. The broader impact of this work is driven by faster, cheaper, lower-power on-disk storage. Organizations that have very large databases will be able to use much less hardware, both saving money and reducing power consumption significantly. Currently, many application areas do not employ databases because database performance is too slow. Speeding up databases by two orders of magnitude can help grow the market. Many organizations also fail to make good use of the data they have collected because they cannot manage it, index it, or query it fast enough to be useful. Applications in finance, retail, homeland security, telecommunications, and scientific computing will benefit from improved manageability and performance. As users' appetite for data continues to outstrip the availability of fast memory, organizing multithreaded queries on disk-based data for performance will continue to grow in importance.

  • Program Officer
    Glenn H. Larsen
  • Min Amd Letter Date
    1/31/2011
  • Max Amd Letter Date
    8/8/2014
  • ARRA Amount

Institutions

  • Name
    Tokutek, Inc.
  • City
    Lexington
  • State
    MA
  • Country
    United States
  • Address
    1 Militia Drive, Suite 11
  • Postal Code
    024214703
  • Phone Number
    3392230680

Investigators

  • First Name
    Bradley
  • Last Name
    Kuszmaul
  • Email Address
    bradley@mit.edu
  • Start Date
    1/31/2011 12:00:00 AM

Program Element

  • Text
    SMALL BUSINESS PHASE II
  • Code
    5373

Program Reference

  • Text
    SMALL BUSINESS PHASE II
  • Code
    5373
  • Text
    DIGITAL SOCIETY&TECHNOLOGIES
  • Code
    6850
  • Text
    INFORMATION INFRASTRUCTURE & TECH APPL
  • Code
    9139
  • Text
    HIGH PERFORMANCE COMPUTING & COMM