This Small Business Innovation Research (SBIR) Phase I project addresses the problem of integrating information about named entities, such as people, companies, and products, from numerous data sources. Such integration is difficult because sources may use different formats and terminology to describe the same entity, a problem referred to as "entity resolution". Most existing commercial enterprise systems rely on rule-based matching techniques for entity resolution. This project investigates statistical learning techniques that allow a system to estimate the probability of a match, rather than computing a score based on ad hoc rules or weights. Because the approach is based on sound statistical principles and uses evidence compiled from large datasets, it can produce more accurate results than existing methods. Moreover, these advantages are amplified when handling data that has highly variable, missing, or noisy attributes, such as data extracted from Web sites.

The broader impact/commercial potential of this project lies in enabling enterprises to perform more accurate and reliable data integration. There are many potential target markets that need better technology for integrating information about businesses, products, people, locations, and other entities. This capability is critical for some of the nation's largest companies and institutions, from search engines, to the U.S. intelligence and law enforcement communities, to financial institutions. In particular, large enterprises often have difficulty utilizing data extracted from news, foreign-language data sources, and social media, because the extracted data is noisy and not well structured.
The technology developed in this project will help enterprises make use of the growing amount of information on the Web, so that they can take advantage of the network of relationships that link people, companies, and other entities to serve their customers better.
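
To illustrate the contrast drawn above between estimating a match probability and computing an ad hoc rule-based score, the following is a minimal sketch that uses a naive Bayes (Fellegi-Sunter style) model. This is a stand-in technique chosen for illustration, not necessarily the method the project investigates. The attribute names, the per-attribute probabilities in `M_U`, the prior `PRIOR_MATCH`, and the function `match_probability` are all hypothetical; in practice such parameters would be estimated from data rather than set by hand.

```python
import math

# Hypothetical per-attribute probabilities (illustrative values only):
#   m = P(attribute agrees | records refer to the same entity)
#   u = P(attribute agrees | records refer to different entities)
M_U = {
    "name": (0.95, 0.01),
    "city": (0.90, 0.10),
    "phone": (0.85, 0.001),
}
PRIOR_MATCH = 0.01  # assumed prior probability that a random pair is a match


def match_probability(rec_a, rec_b, m_u=M_U, prior=PRIOR_MATCH):
    """Return the posterior probability that two records describe one entity."""
    # Start from the prior log-odds and add one log-likelihood ratio
    # per attribute, assuming attributes are conditionally independent.
    log_odds = math.log(prior / (1.0 - prior))
    for attr, (m, u) in m_u.items():
        a, b = rec_a.get(attr), rec_b.get(attr)
        if a is None or b is None:
            continue  # a missing attribute contributes no evidence
        if a.strip().lower() == b.strip().lower():
            log_odds += math.log(m / u)
        else:
            log_odds += math.log((1.0 - m) / (1.0 - u))
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Example records; the phone number is missing from the first source.
a = {"name": "Acme Corp", "city": "Boston", "phone": None}
b = {"name": "acme corp", "city": "Boston", "phone": "555-0100"}
c = {"name": "Zenith LLC", "city": "Denver", "phone": "555-0199"}

print(match_probability(a, b))  # agreeing name and city: high probability
print(match_probability(a, c))  # disagreeing name and city: low probability
```

Because the output is a probability rather than an arbitrary score, a missing attribute simply leaves the estimate unchanged instead of requiring a special-case rule, which is one reason this style of model suits noisy or incomplete data.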