Abstract
With data growing in scale and complexity, interactive visualizations are increasingly important in data analysis. However, interactive visual analysis lacks the reproducibility and reusability of computational analysis using code or scripts. This dissertation aims to improve the reproducibility and reusability of visual analysis, thereby increasing trust in the visual analysis process and allowing analyses to be reused in different contexts.
While computational analysis using languages like R or Python is inherently reproducible and reusable, interactive visual analysis often remains ad hoc and difficult to capture, reproduce, and reuse. Hybrid approaches, such as using multiple tools, have compatibility issues and lack reproducibility within the interactive components. In computational notebooks, two major gaps separate code and interactive visualizations: the results of interactions cannot be used in code, and interactions with visualizations are lost upon cell re-execution or notebook restarts. Both gaps hinder the reproducibility and reusability of visual analysis in notebooks.
The dissertation makes four contributions toward addressing these issues: 1) a software library to capture and replay interaction provenance; 2) techniques to capture the analyst's pattern-based intent and annotations, making the interaction provenance semantically meaningful; 3) techniques to reuse the semantically meaningful interaction provenance on updated datasets and to curate workflows that can be reused on updated datasets as well as in different analysis environments; and 4) techniques to leverage the interaction provenance to bridge the gaps between code and interaction in computational notebooks. These techniques improve the reproducibility and reusability of visual analysis, thereby improving trust, reducing the gaps between computational and interactive analysis, and moving us closer to a literate visual analysis framework.
Citation
Kiran Gadhave
Toward Reproducible and Reusable Visual Analysis
Advisors: Alexander Lex, Miriah Meyer, Jeff Phillips, Marc Streit, Vivek Srikumar
University of Utah, PhD Thesis, March 2024.
BibTeX
@phdthesis{2024-thesis-gadhave,
  title = {Toward Reproducible and Reusable Visual Analysis},
  author = {Kiran Gadhave},
  school = {University of Utah},
  month = {December},
  year = {2024}
}