CAREER: Enabling Reproducibility of Interactive Visual Data Analysis

Abstract

Reproducibility and justifiability are widely recognized as critical aspects of data-driven decision making in fields as varied as scientific research, business, healthcare, and intelligence analysis. This project is concerned with enabling reproducibility and justifiability of decisions in the data analysis process, specifically as it relates to visual data analysis. Visualization is an important tool for discovery, yet decisions made by humans based on visualizations of data are difficult to capture and to justify. This project will develop methods to justify, communicate, and audit decisions made based on visual analysis. This, in turn, will lead to better outcomes, achieved with less effort and cost. The increasing use of visual analysis tools for decision making will make data analysis accessible to a broad variety of people, as visual analysis tools are generally easier to use than scripting languages and do not require extensive computational and statistical training. This research and its related activities increase accessibility and enhance the data analysis infrastructure for research and education.

To achieve these goals, this research will develop a framework for making visual analysis sessions not only reproducible but also reusable. The approach is based on tracking semantically meaningful provenance data during an interactive visual analysis session. Once a discovery is made, analysts can use this history to curate a succinct analysis story, adding justifications and explanations to make their analysis reproducible by others. Using a semi-automatic process, analysts will be able to make their actions data-aware, so that their analysis processes become robust to changes, such as updates in the data. A second contribution of the proposed work is the integration of visual analysis into computational analysis processes. While visualization is commonly used to present computational analysis results, the results of a visual analysis session are rarely fed into further computational processes. The techniques developed in this project will allow analysts to feed analysis results (selections, aggregations, filters, etc.) back into a computational environment. This will make it possible to use interactive visualization at any point in the data analysis process while maintaining reproducibility and enabling reuse. The expected results include new methods to capture user intent, to create data stories from analysis processes, and to integrate computational and visual data analysis, leveraging the strengths of both human abilities and computational power. The results will be disseminated in publications and as open source software, and will be accessible via this website.
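As a rough illustration of the idea of data-aware provenance described above, the following minimal Python sketch records each analysis step as a rule over the data rather than as a fixed set of selected rows, so the session can be replayed on an updated dataset. It is not the project's framework: the names `Action`, `Provenance`, `record`, `replay`, and `story`, as well as the example fields `age` and `score`, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Action:
    """One semantically meaningful step in an analysis session (hypothetical)."""
    label: str                        # human-readable description of the step
    predicate: Callable[[Row], bool]  # data-aware rule instead of fixed row ids
    note: str = ""                    # the analyst's justification for the step


@dataclass
class Provenance:
    """Ordered history of the actions taken in one session (hypothetical)."""
    actions: List[Action] = field(default_factory=list)

    def record(self, action: Action) -> None:
        self.actions.append(action)

    def replay(self, rows: List[Row]) -> List[Row]:
        """Re-apply every recorded filter, in order, to the given rows."""
        for action in self.actions:
            rows = [r for r in rows if action.predicate(r)]
        return rows

    def story(self) -> List[str]:
        """Return the recorded steps with their justifications as plain text."""
        return [f"{i}. {a.label}: {a.note}" for i, a in enumerate(self.actions, 1)]


session = Provenance()
session.record(Action("keep age >= 40", lambda r: r["age"] >= 40, "cohort of interest"))
session.record(Action("keep score < 50", lambda r: r["score"] < 50, "low outcome scores"))

original = [{"age": 35, "score": 40}, {"age": 52, "score": 45}, {"age": 61, "score": 70}]
updated = original + [{"age": 47, "score": 30}]  # the data changed after the session

selected_original = session.replay(original)
selected_updated = session.replay(updated)
```

Because each step stores a rule instead of row identifiers, replaying the session on `updated` also picks up the newly added row that matches both filters. The resulting lists are ordinary Python values, so they could in principle be passed on to further computational steps.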

Reproducibility Framework Concept

Publications

Composer screenshot

Jen Rogers, Nicholas Spina, Ashley Neese, Rachel Hess, Darrel Brodke, Alexander Lex
Composer: Visual Cohort Analysis of Patient Outcomes
Workshop on Visual Analytics in Healthcare at AMIA (VAHC 2018), to appear, 2018

Juniper screenshot

Carolina Nobre, Marc Streit, Alexander Lex
Juniper: A Tree+Table Approach to Multivariate Graph Visualization
IEEE Transactions on Visualization and Computer Graphics (InfoVis ’18), to appear, 2018

VDL Project Staff

Alexander Lex, Jen Rogers

Funded by

The National Science Foundation

Number

NSF IIS 1751238

Program

IIS CAREER

Period

04/01/2018-03/31/2023

Principal Investigator

Awarded Amount

$512,245

Last Updated

2018-08-02