CAREER: Enabling Reproducibility of Interactive Visual Data Analysis

Abstract

Reproducibility and justifiability are widely recognized as critical aspects of data-driven decision making in fields as varied as scientific research, business, healthcare, and intelligence analysis. This project is concerned with enabling reproducibility and justifiability of decisions in the data analysis process, specifically as it relates to visual data analysis. Visualization is an important tool for discovery, yet decisions that humans make based on visualizations of data are difficult to capture and to justify. This project will develop methods to justify, communicate, and audit decisions made based on visual analysis. This, in turn, will lead to better outcomes, achieved with less effort and cost. The increasing use of visual analysis tools for decision making will make data analysis accessible to a broad variety of people, as visual analysis tools are generally easier to use than scripting languages and do not require extensive computational and statistical training. This research and its related activities increase accessibility and enhance the data analysis infrastructure for research and education.

To achieve these goals, this research will develop a framework for making visual analysis sessions not only reproducible but also reusable. The approach is based on tracking semantically meaningful provenance data during an interactive visual analysis session. Once a discovery is made, analysts can use this history to curate a succinct analysis story, adding justifications and explanations to make their analysis reproducible by others. Using a semi-automatic process, analysts will be able to make their actions data-aware, so that their analysis processes become robust to changes, such as updates in the data. A second contribution of the proposed work is the integration of visual analysis into computational analysis processes. While visualization is commonly used to present computational analysis results, the results of a visual analysis session are rarely fed into further computational processes. The techniques developed in this project will allow analysts to feed analysis results (selections, aggregations, filters, etc.) back into a computational environment. This will make it possible to use interactive visualization at any point in the data analysis process while maintaining reproducibility and enabling reuse. The expected results include new methods to capture user intent, to create data stories from analysis processes, and to integrate computational and visual data analysis, leveraging the strengths of both human abilities and computational power. The results will be disseminated in publications and in the form of open-source software, and made accessible via this website.
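As a rough illustration of what "data-aware" actions could mean in practice, the following minimal sketch contrasts a literal selection (the ids of the brushed points) with a rule-based selection (the value range the brush covered) and replays both on an updated dataset. All names here (`Action`, `Session`, `select_ids`, `select_range`) and the toy data are hypothetical and chosen for illustration only; this is not the API of the project's provenance library, and the linear history is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Action:
    """One semantically meaningful step in an analysis session (hypothetical)."""
    label: str                                # human-readable description
    apply: Callable[[List[Row]], List[Row]]   # effect of the step on the data
    note: str = ""                            # justification added during curation


@dataclass
class Session:
    """A simplified, linear provenance history (an assumption of this sketch)."""
    actions: List[Action] = field(default_factory=list)

    def record(self, action: Action) -> None:
        self.actions.append(action)

    def replay(self, rows: List[Row]) -> List[Row]:
        # Re-run every recorded step, in order, on the given data.
        for action in self.actions:
            rows = action.apply(rows)
        return rows

    def story(self) -> List[str]:
        # The curated analysis story: each step with its justification, if any.
        return [f"{a.label}: {a.note}" if a.note else a.label for a in self.actions]


def select_ids(ids):
    """Literal selection: remembers which rows were brushed, by id."""
    wanted = set(ids)
    return lambda rows: [r for r in rows if r["id"] in wanted]


def select_range(column, low, high):
    """Data-aware selection: remembers the rule behind the brush."""
    return lambda rows: [r for r in rows if low <= r[column] <= high]


original = [
    {"id": 1, "dose": 5.0},
    {"id": 2, "dose": 12.0},
    {"id": 3, "dose": 18.0},
]
# The dataset is later updated with a new row that falls inside the brushed range.
updated = original + [{"id": 4, "dose": 15.0}]

literal = Session()
literal.record(Action("Brush points 2 and 3", select_ids([2, 3])))

aware = Session()
aware.record(Action("Select dose in [10, 20]",
                    select_range("dose", 10.0, 20.0),
                    note="Mid-range doses cluster together"))

literal_ids = [r["id"] for r in literal.replay(updated)]
aware_ids = [r["id"] for r in aware.replay(updated)]
print(literal_ids)  # [2, 3]
print(aware_ids)    # [2, 3, 4]
```

In this sketch the literal selection misses the new row after the update, while the rule-based selection picks it up. Because `replay` returns ordinary rows, its result could in principle be handed to any downstream computation, which is one simple way to think about feeding visual analysis results back into a computational environment.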

Reproducibility Framework Concept

Software

We are developing a provenance tracking library for integration with web applications. The source code is available here, and a blog post is also available.

We are also working on a visualization tool that captures analysis intent using the provenance library discussed above. Find the code here, and a live demo of the system at this page.

The following image illustrates the interface:

The predicting intent visualization user interface

Check out the two core papers for this project, on predicting intent and reusing workflows.

Publications

reVISit screenshot

Carolina Nobre, Dylan Wootton, Zach Cutler, Lane Harrison, Hanspeter Pfister, Alexander Lex
reVISit: Looking Under the Hood of Interactive Visualization Studies
SIGCHI Conference on Human Factors in Computing Systems (CHI), 2021

Taggle screenshot

Katarina Furmanova, Samuel Gratzl, Holger Stitz, Thomas Zichner, Miroslava Jaresova, Alexander Lex, Marc Streit
Taggle: Scalable Visualization of Tabular Data through Aggregation
Information Visualization, 2019

Origraph screenshot

Alex Bigelow, Carolina Nobre, Miriah Meyer, Alexander Lex
Origraph: Interactive Network Wrangling
IEEE Conference on Visual Analytics Science and Technology (VAST), 2019

Juniper screenshot

Carolina Nobre, Marc Streit, Alexander Lex
Juniper: A Tree+Table Approach to Multivariate Graph Visualization
IEEE Transactions on Visualization and Computer Graphics (InfoVis), 2019

Composer screenshot

Jen Rogers, Nicholas Spina, Ashley Neese, Rachel Hess, Darrel Brodke, Alexander Lex
Composer: Visual Cohort Analysis of Patient Outcomes
Applied Clinical Informatics, 2019

Composer screenshot

Jen Rogers, Nicholas Spina, Ashley Neese, Rachel Hess, Darrel Brodke, Alexander Lex
Composer: Visual Cohort Analysis of Patient Outcomes
Workshop on Visual Analytics in Healthcare at AMIA (VAHC 2018), 2018

VDL Project Staff

  • Alexander Lex
  • Kiran Gadhave
  • Devin Lange

VDL Project Alumni

  • Jen Rogers
  • Zach Cutler
  • Hannah Bruns
  • Jochen Görtler
  • Pranav Rajan

Funded by

The National Science Foundation

Number

 NSF IIS 1751238

Program

IIS CAREER

Period

04/01/2018-03/31/2024

Principal Investigator

Awarded Amount

$512,245

Last Updated

2022-05-09