SDCI Data: Improvement: Java Graphical Authorship Attribution Program (JGAAP)


NSF Award

  • Award Id
    1032683
  • Award Effective Date
    9/1/2010
  • Award Expiration Date
    8/31/2014
  • Award Amount
    $ 1,622,036.00
  • Award Instrument
    Standard Grant

Recent developments in machine learning and corpus linguistics have shown that it is possible to make automatic determinations about authorship using statistics; the NSF-funded JGAAP (Java Graphical Authorship Attribution Program) system has been part of these developments. JGAAP has helped support the emerging authorship attribution community and has created a useful tool for a wide variety of scholarly specialties.

Although JGAAP incorporates thousands of possible methods, many more have been proposed in the literature but not rigorously tested. Comparative testing on a large scale will require the development of new methods and test corpora. In addition, there are many key problems to address to meet the needs of the community, such as the open class problem, the adversarial problem, and the coauthorship problem. Finally, we will examine applications of JGAAP and similar systems to key areas in linguistic profiling, such as determining gender, education, native language, psychological profile, medical condition, age (of document or writer), or even attempted deceptiveness. Again, by applying a rigorous testing method to these new problems and corpora, the project can establish accuracy benchmarks for various techniques (under the various testing conditions), find new combinations resulting in improved techniques, and establish a recommendation for 'best practices.'

Improved authorship attribution will be immediately useful both to scholars and in broader social contexts, such as law enforcement and forensics, where there are direct demands for this kind of security technology. The historical/social analysis will also provide better access between the related disciplines of digital humanities, sociology, history, and computer science, providing the basis for a better understanding of traditional humanities issues. Profiling work can help medical and psychological practitioners by providing a non-invasive method to detect certain aspects of a person's mind. The software developed (and the planned development/distribution process) will help improve the effectiveness of both digital humanities scholarship and computer science, especially through the establishment of software review standards and processes. In particular, by providing direct evidence of the conditions and expected error rates involved in various techniques, the information gained will help authorship attribution meet the Daubert criteria for expert evidence, allowing authorship attribution to be used in a formal legal setting. Finally, the funding of this research will help support the unique interdisciplinary Duquesne University Computational Mathematics program, broadening access to technological education for an atypical audience.

  • Program Officer
    Amy Walton
  • Min Amd Letter Date
    8/30/2010
  • Max Amd Letter Date
    8/30/2010
  • ARRA Amount


  • Name
    Duquesne University
  • City
  • State
  • Country
    United States
  • Address
    Room 310 Administration Building
  • Postal Code
  • Phone Number


  • First Name
  • Last Name
  • Email Address
  • Start Date
    8/30/2010 12:00:00 AM

Program Element

  • Text
  • Code

Program Reference

  • Text
  • Code