Digging into Human Rights Violations

EXECUTIVE SUMMARY

Digging into Human Rights Violations was an international effort linking scholarly and industry investigations into the application of Natural Language Processing (NLP) techniques and tools to advance current human rights violation research. Focusing broadly on global crises, the project began in 2012 to expand models for information extraction from witness statements and government reports, for visualizing that data to facilitate better event understanding, for understanding how researchers and investigators used computational methods in their human rights work, and for developing methods to model speakers’ expressed certainty about their statements. The results of this project appeared in publications and presentations ranging from IEEE Big Data to Digital Humanities to the Association for Computational Linguistics to the American Comparative Literature Association. By focusing on the subdomain of anaphora resolution known as cross-document co-reference, this project led to better tools for navigating large corpora by finding similar story elements throughout a corpus. This correlation of elements across a corpus helped piece together multiple fragmented narratives, enabling the identification of emergent witnesses through the creation of transversal narratives.

To assist in the correlation of events through cross-document co-reference, or the recognition that a person mentioned in one report is the same as a person from another report, the project created and implemented a novel visualization technique entitled Storygraph. This technique helps investigators trace individual actors’ movements over time throughout a corpus. Storygraph was developed in concert with Storygram, a tri-gram visualization that facilitated similarity comparisons of the narrative elements of time, location, and person. The visualization research led to methods for comparing narrative fragments using graph similarity techniques, the results of which were shared via venues like the Computational Models of Narrative Workshop, the Association for Computational Linguistics, and Digital Humanities.

Further investigations into the expression of individual narratives within violation reports also required research into an aspect of attribution known as speaker veridicality. Every measurement tool includes some margin of error: speech, more so than most. As witness statements move into the more quantitative realm of data, having metrics to indicate the “error,” or certainty, of a statement would allow investigators to better appreciate what they see. This area of further study led to the development and testing of a model for assessing the veridicality of witness statements.

Working across international borders, the project personnel met with human rights organizations and researchers in Switzerland, the Netherlands, Canada, Central America, South America, and the United States. The greatest impact of this project lay in its potential to introduce computational research methods to human rights investigations. Specifically, the project personnel were able to offer guidance on methodologies for navigating large corpora of witness testimony, mining those texts for relevant information using existing NLP techniques, and visualizing that information for further analysis.

Over the three years of this project, many undergraduate and graduate student researchers contributed to the project’s outcomes. These students, many of whom have now gone on to more advanced degrees and positions, contributed skills from a variety of disciplines: computer science, applied linguistics, computational linguistics, interactive computing, English, Spanish, communication studies, and political science.

 

PROJECT TEAM, US AND CANADA
 

Faculty 

Ben Miller (Overall Project Lead), Assistant Professor, Georgia State University
Lu Xiao (Canadian Team Lead), Assistant Professor, Western University
Karthikeyan Umapathy, Associate Professor, University of North Florida
Fuxin Li, Research Scientist, Georgia Institute of Technology
George Pullman, Associate Professor, Georgia State University
Mary Beth Rosson, Professor, Pennsylvania State University
Ágnes Sándor, Project Leader, Xerox Research Centre Europe (XRCE)

 

Advisors

Dan Jurafsky, Professor, Stanford University
Ben Kiernan, A. Whitney Griswold Professor of History, Yale University
Nancy Marrelli, Archivist Emerita, Concordia University
 

Graduate and Postdoctoral Research Assistants
Lindsay Baker, MIS candidate, Western University
Patrick Crenshaw, MA student, Georgia State University
Jill R. Kavanaugh, MLIS candidate, Western University
Kristopher Kyle, PhD candidate, Georgia State University
Shakthidhar Gopavaram, PhD student, Indiana University Bloomington
Jennifer Olive, PhD student, Georgia State University
Ayush Shrestha, PhD, Georgia State University
Nicholas Subtirelu, PhD student, Georgia State University
Tatiana Vashchilko, Research Associate, Western University
Vicky Yan, MLIS candidate, Western University
Jin Zhao, PhD, Georgia State University
Yanjun Zhao, PhD, Georgia State University
 

Industry Partners
Benetech Human Rights Project (Palo Alto, CA)
Human Rights Data Analysis Group (San Francisco, CA)
Invisible Children (Los Angeles, CA)
The Resolve (Washington, D.C.) 

 

For more information, please see the project white paper.