III: Small: Collaborative Research: Cost-Efficient Sampling and Estimation from Large-Scale Networks

Information

  • NSF Award
  • 1908375
Owner
  • Award Id
    1908375
  • Award Effective Date
    10/1/2019
  • Award Expiration Date
    9/30/2022
  • Award Amount
    $ 249,999.00
  • Award Instrument
    Standard Grant

Sampling and estimating structural information from large-scale networks or graphs has been central to our understanding of network dynamics and its rich set of applications. Markov Chain Monte Carlo (MCMC) has been the key enabler for a broad range of graph sampling tasks, including estimating the properties of large graphs, sampling the corpus of documents indexed by search engines, sampling records from hidden databases behind Web forms, identifying subgraphs with certain characteristics, and frequent graph pattern matching. Despite the versatile applications of MCMC methods and their customized algorithms for analyzing graph-structured data in various forms, critical challenges and limitations remain in the literature centered on MCMC methods. One is the 'cost' of, and the constraints on, the sampling operation, which limit the total number of samples obtained and reduce the accuracy of any estimator based on them. Another limitation is that recent advances in MCMC, especially those built upon favorable non-reversible Markov chains, cannot be leveraged for the various large-graph sampling tasks, due to their required global knowledge of the underlying state space, lack of distributed implementation, unconstrained state space, and simplified cost assumption. The goal of this research is to fully exploit the potential of a set of crawling samplers by making the samplers adaptive and possibly interactive on a properly constructed graph domain, to transcend the current status quo in a wide range of graph sampling tasks.

Specifically, the project aims to: (i) build a theoretical framework to construct a suite of cost-efficient sampling policies by optimally balancing the tradeoff between sample quality and quantity under challenged access environments with a given cost budget, (ii) design a class of adaptive random walks by fully exploiting past information to achieve minimal temporal correlations over the obtained samples and by controlling the random walks collectively to enable maximal space exploration, and (iii) extend the standard MCMC toolkits toward faster and more cost-efficient exploration of feasible subgraphs/configurations and computing/optimization on a graph, along with extensive validation to create practical and usable solutions. This research has a high potential impact on a vast range of multi-disciplinary applications, including sampling large-scale graphs for statistical inference and efficient estimation, and randomized algorithms for combinatorial optimization in various disciplines, where standard MCMC methods have been dominant but have also constrained our understanding.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
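
For readers unfamiliar with the crawling samplers the abstract refers to, the following is a minimal illustrative sketch of a standard Metropolis-Hastings random walk on a graph under a fixed query budget. It is not the project's method. The function names (`mhrw_samples`, `estimate_average_degree`), the `neighbors` callable, and the cost model (one unit per neighbor-list query) are assumptions made for illustration, and the graph is assumed to be undirected and connected with no isolated nodes.

```python
import random


def mhrw_samples(neighbors, start, budget, seed=0):
    """Crawl a graph with a Metropolis-Hastings random walk (uniform target).

    neighbors: hypothetical callable, node -> list of adjacent nodes.
               Each call is counted as one unit of sampling cost (assumption).
    budget:    maximum number of neighbor queries allowed.
    Returns a list of (node, degree) pairs, one per query spent.
    """
    if budget < 1:
        raise ValueError("budget must allow at least one query")
    rng = random.Random(seed)

    current = start
    current_nbrs = neighbors(current)  # first query
    queries = 1
    samples = [(current, len(current_nbrs))]

    while queries < budget:
        # Propose a uniformly random neighbor; using only local information.
        proposal = rng.choice(current_nbrs)
        proposal_nbrs = neighbors(proposal)  # costs a query even if rejected
        queries += 1

        # Accept with probability min(1, deg(current) / deg(proposal)),
        # which makes the uniform distribution over nodes stationary.
        if rng.random() < len(current_nbrs) / len(proposal_nbrs):
            current, current_nbrs = proposal, proposal_nbrs

        # A rejected move repeats the current node as the next sample.
        samples.append((current, len(current_nbrs)))

    return samples


def estimate_average_degree(neighbors, start, budget, seed=0):
    """Plain sample mean of degrees over the walk's (correlated) samples."""
    samples = mhrw_samples(neighbors, start, budget, seed)
    return sum(degree for _, degree in samples) / len(samples)


if __name__ == "__main__":
    # Toy undirected graph given as an adjacency list (illustrative data only).
    toy_graph = {
        "a": ["b", "c", "d"],
        "b": ["a", "c"],
        "c": ["a", "b", "d"],
        "d": ["a", "c"],
    }
    print(estimate_average_degree(lambda v: toy_graph[v], "a", budget=200))
```

In this sketch a rejected proposal still consumes a query, and consecutive samples are correlated. That is the quality-versus-quantity tension under a cost budget that the abstract describes.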

  • Program Officer
    Sylvia Spengler
  • Min Amd Letter Date
    9/7/2019
  • Max Amd Letter Date
    9/7/2019
  • ARRA Amount

Institutions

  • Name
    Florida Institute of Technology
  • City
    MELBOURNE
  • State
    FL
  • Country
    United States
  • Address
    150 W UNIVERSITY BLVD
  • Postal Code
    329016975
  • Phone Number
    3216748000

Investigators

  • First Name
    Chul-Ho
  • Last Name
    Lee
  • Email Address
    clee@fit.edu
  • Start Date
    9/7/2019 12:00:00 AM

Program Element

  • Text
    Info Integration & Informatics
  • Code
    7364

Program Reference

  • Text
    INFO INTEGRATION & INFORMATICS
  • Code
    7364
  • Text
    SMALL PROJECT
  • Code
    7923