Introducing Trrack – A Library for Tracking Provenance on the Web

Tracking a history of actions of an interactive visual analysis session (i.e., its provenance) has benefits ranging from simple undo/redo, to enabling reproducibility, to making post-hoc analysis of user sessions possible. However, there are no established provenance tracking libraries that visualization developers can use with their web-based tools. This blog post introduces Trrack – a web-based library designed for easy integration in existing and newly developed visualization systems.

Various visualization application that implement trrack, and a trrack provenance graph

How did we get to that particular analysis result? That’s a question that’s easy to answer when we do data analysis with scripting languages, such as R and Python. In fact, computational notebooks such as Jupyter notebooks, Observable, or R Markdown have come remarkably far in fulfilling Knuth’s vision of literate programming – programming that emphasizes readability and understandability.

In contrast, if we use a visual analysis system – with its many advantages – to investigate a research question, we might arrive at a conclusion through a sequence of actions (view specifications, filters, brushes) but these sequences are typically lost after the analysis and can’t be easily reproduced and scrutinized. And while computational notebooks make it easy to add comments to justify analysis decisions, most visual analysis systems can’t capture an analyst’s thought process.

So can we achieve reproducibility in interactive web-based visualizations? How? These are questions we are looking at in a larger research project, and which were also the subject of a paper we wrote a while ago.

We believe that the answer to these challenges is tracking analysis provenance. Provenance describes the history of all actions that lead to a specific state. Provenance tracking is a critical part of enabling reproducible analysis processes, but provenance tracking also enables simple undo/redo, collaboration, and logging for post-hoc analysis of users sessions.

The Trrack Provenance Tracking Library

There are various visualization systems that track provenance, but most use ad-hoc solutions created for a specific visualization. Developing these solutions can be time consuming, and custom one-off approaches likely lack some of the key benefits that provenance can provide. To this overhead, we developed Trrack – a library for tracking provenance in web-based systems to enable action recovery, reproducibility, and logging.

We will present a paper about Trrack this week at IEEE VIS, you can check out the video here:

Track is designed to be lightweight and modular, to be easy to integrate in existing or new tools, to support sharing of states out of the box (copy URLs), and to provide server integration (either through Firebase or with custom servers), if desired.

A sister library, Trrack-Vis, provides customizable visualization of the provenance data, and also enables analyst to bookmark and annotate states. Trrack-Vis also has some features, such as aggregation of provenance nodes, to ensure scalability, though we’re working on dealing with even larger graphs!

Storage Model

Trrack is using a unique storage mode. Traditionally, there are three different ways to store provenance data. The first approach is to store the complete state of an application every time something changes. This approach is fast and straightforward, but it consumes a lot of memory.

The second approach is to store a sequence of actions, i.e., one keeps track of the actions that change the state of an application. This approach results in light-weight provenance data, but the downsides are that one has to execute long sequences of actions to load a particular state, which can be slow. Furthermore, if we were to export data captured like this for post-hoc analysis, we would find that the data is very specific to the implementation of the application, which can be challenging.

Finally, hybrids between these approaches that store entire states at ‘checkpoints’, but also stores actions in between are also common. The hybrid approach provides a good compromise between speed and memory usage but still has the problem of being application dependent. It also adds a layer of complexity to the state management.

Differential States Model

Trrack introduces a new model we call the differential state model. Trrack occasionally stores complete state in ‘state nodes’, but mostly stores ‘differential nodes’ in between. The differential nodes only contain the changes to the state relative to the previous node. The differential approach reduces the storage required, but the provenance data is still application-independent. The decision for when to store the entire state vs. differential state is abstracted away from the developer.

Technical Background

Trrack and Trrack-Vis are both written in TypeScript and provide types with generics for complete customization. Both libraries feature a clean API, rich documentation, and usage examples for multiple scenarios. Trrack is framework agnostic and can work in conjunction with any UI framework like React or Vue and state management solutions like Mobx or Redux. In the documentation, we also give examples with React and Mobx. Trrack-Vis is a React component. It is easy to use Trrack-Vis in vanilla JavaScript, but it needs React as a dependency. Both the libraries are published via npm.

To install Trrack run:

yarn add @visdesignlab/trrack

To install TrrackVis run:

yarn add @visdesignlab/trrack-vis

Or, you can load Trrack and its dependencies directly:

<script src="//cdn.jsdelivr.net/combine/npm/firebase@7/firebase-app.min.js,
npm/firebase@7/firebase-database.min.js,
npm/mobx@6/dist/mobx.umd.production.min.js,
npm/lz-string@1/libs/lz-string.min.js,
npm/deep-diff@1/dist/deep-diff.min.js,
npm/@visdesignlab/trrack/dist/trrack.umd.production.min.js"></script>

How to Use Trrack

To use trrack we show a minimal example of a todo list in vanilla JavasScript. Here we track actions that (1) add items to the list, and (2) select existing items; and provide undo for these actions. Click on Open Sandbox to open an editable version to see the code and experiment.

To use Trrack, developers have to create and maintain a state for their application. This state should define variables for anything that designers would like to be tracked. Defining state explicitly is a critical part of implementing Trrack, but we often found that it also encourages good software engineering.

State is usually a javascript object. If using typescript we can use interface or type to create a shape for our state . Here’s an example tracking todos, and which item is selected:

// Only if using typescript
interface State {
  todos: string[];
  selected: number;
};

const initialState: State = {
  todos: [],
  selected: -1
}

Now we are ready to setup Trrack itself!

let prov = initProvenance(initialState, { loadFromUrl: false });

For the second parameter we pass in a config option loadFromUrl. We are setting this to false, it is true by default. We will talk about URL sharing later.

Trrack takes advantage of an observer software design pattern. To identify changes that are made to the state, an observer must be assigned to that part of the state. For example, here is an empty observer which would be called when changes are made to the “todos” part of the previously defined state.

prov.addObserver((state) => state.todos, () => {
  // Update the todo list here
});

The observers will be called when your state changes – either because you’re changing it because of a user action, or because undo/redo is called. Because of this we recommend only updating your visualization front-end within these observers, to avoid code-duplication and inconsistent changes. So, instead of updating the visualization when a user makes changes, developers should add and apply a provenance action. The provenance library then calls the observer that changes the visualization.

When creating an action you can also add a label, specify an event type, or pass in metadata. Here’s the example adding a new todo.

First, we create a action.

  const addTodoAction = createAction((state, todo) => {
        state.todos.push(todo);
      })
      .setLabel("Add Todo");

Second, we pass in the arguments and apply it when we want to execute it.

prov.apply(addTodoAction("New Todo"))

When an action is applied, a new provenance node is created, and any observers whose state were changed are triggered. Thus, applying an action is indirectly updating the visualization.

And that’s it! For a simple implementation that allows for undo/redo, that is everything required. However, there are still many helpful features that can be added to the basic implementation.

Ephemeral Nodes

Sometimes we want to track things that we don’t want on an undo/redo stack, for example for detailed logging. To do that, we can label actions as ephemeral. Ephemeral nodes are automatically grouped in Trrack-Vis so that they don’t consume too much space.

  const selectTodoAction = createAction((state, todoId) => {
        state.selected = todoId;
      })
      .setLabel("Select Todo")
      .setActionType("Ephemeral");

Adding Trrack-Vis

When you want to include provenance visualization, you can add Trrack-Vis is an easy one liner. The Library exports a ProvVisCreator function, which takes in the div that you want Trrack-Vis added to, your provenance graph from Trrack, and a callback function. Once Trrack-Vis is set up, it will automatically update when the Trrack provenance graph changes, and will change the current node in Trrack whenever a node in the graph is clicked.

    const prov = initProvenance(state, config)

    ProvVisCreator(document.getElementById('provDiv')!, prov, visCallback);

    // Function called when a node is clicked.
    const visCallback = function (clickedNode: NodeID) {
      prov.goToNode(clickedNode);
    };

This simple setup enables most features in Trrack-Vis. You can navigate the graph, as well as bookmark and annotate any node.

You can customize Trrack-Vis with icons and logic to group nodes. Below is an example with custom icons and grouping, based on our predicting intent project. For more information, see the documentation.

Intent Trrack Vis

Enabling URL Sharing

A simple way of sharing something you did in an interactive web-based system is just copying a URL that contains the state and sending it to someone. Trrack makes this easy to achieve: just change the second parameter of initProvenance as shown earlier to true. Or, simply remove the parameter, and by default URL sharing will be enabled. The only further code required from the developer is a call to done() after the observers are setup, Telling trrack that it is okay to load the state stored in the URL.

Basic example with Typescript and D3

Here’s an example using TypeScript and D3, where you can sort and select bar charts.

Resources

We have compiled more examples here, live versions of those are hosted here

If you’re interested in using Trrack, also check out the documentation and code of the Trrack and Trrack-Vis Library.

We hope Trrack is useful to you! Send us feedback or leave an issue on GitHub if you have any problems!

Update — Trrack v2

Trrack supported state-based provenance tracking very well. We are excited to announce Trrack v2 which introduces a hybrid approach to provenance tracking, where you can mix between state-based tracking and action-based tracking. Be on the lookout for a new blog which discusses the changes and design decisions in Trrack v2. Meanwhile click here to read more.

The Paper

This blog post is based on the following paper: