Visual Analytics and Imaging Laboratory (VAI Lab)
Computer Science Department, Stony Brook University, NY
Jun Wang, Klaus Mueller
Abstract: Uncovering the causal relations that exist among variables in multivariate datasets is one of the ultimate goals in data analytics. Causation is related to correlation, but correlation does not imply causation. While a number of causal discovery algorithms have been devised that eliminate spurious correlations from a network, there are no guarantees that all of the inferred causations are indeed true. Hence, bringing a domain expert into the causal reasoning loop can be of great benefit in identifying erroneous causal relationships suggested by the discovery algorithm. To address this need we present the Visual Causality Analyst, a novel visual causal reasoning framework that allows users to apply their expertise, verify and edit causal links, and collaborate with the causal discovery algorithm to identify a valid causal network. Its interface consists of both an interactive 2D graph view and a numerical presentation of salient statistical parameters, such as regression coefficients and p-values. Both help users gain a good understanding of the landscape of causal structures, particularly when the number of variables is large. Our framework is also novel in that it can handle both numerical and categorical variables within one unified model and return plausible results. We demonstrate its use via a set of case studies using multiple practical datasets.
Teaser: Causal analysis of a car dataset:
Causal reasoning with the Visual Causality Analyst using a car dataset. (left) The unfiltered causal graph. (center) The causal graph with the regression coefficient threshold set to 0.4; weak causal relations are filtered away. (right) The subgraph relevant to mpg (miles per gallon of fuel), which is a chain of causal relationships running from number of cylinders through engine displacement and car weight to mpg. The correct technical causal direction was inferred: having more cylinders increases displacement (and not the other way around). More displacement then leads to a higher car weight, which is the next link in the chain. Finally, a higher car weight leads to lower fuel efficiency (lower mpg). The red arrow on the link indicates a negative (inverse) causal effect on mpg.
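The filtering and focusing steps described in the caption can be sketched in a few lines of Python. This is only an illustration, not the framework's implementation: the edge list, the coefficient values, and the helper names `filter_edges` and `ancestors_subgraph` are all hypothetical, and the numbers are placeholders rather than results from the paper. The sketch assumes that "weak" means an absolute regression coefficient below the threshold.

```python
from collections import defaultdict

# Hypothetical edges: (cause, effect, regression coefficient).
# The values are illustrative placeholders, not results from the paper.
EDGES = [
    ("cylinders", "displacement", 0.9),
    ("displacement", "weight", 0.8),
    ("weight", "mpg", -0.7),
    ("acceleration", "mpg", 0.2),
    ("year", "weight", -0.1),
]


def filter_edges(edges, threshold):
    """Keep edges whose absolute coefficient is at least `threshold`."""
    return [edge for edge in edges if abs(edge[2]) >= threshold]


def ancestors_subgraph(edges, target):
    """Return the edges that lie on directed paths ending at `target`."""
    parents = defaultdict(list)
    for cause, effect, coef in edges:
        parents[effect].append((cause, coef))

    kept, stack, seen = [], [target], {target}
    while stack:
        node = stack.pop()
        for cause, coef in parents[node]:
            kept.append((cause, node, coef))
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return kept


# Center panel: drop weak relations (assumed threshold of 0.4).
strong = filter_edges(EDGES, 0.4)

# Right panel: keep only what is causally upstream of mpg.
mpg_chain = ancestors_subgraph(strong, "mpg")

for cause, effect, coef in mpg_chain:
    sign = "negative" if coef < 0 else "positive"
    print(f"{cause} -> {effect} ({sign}, {coef:+.1f})")
```

With these placeholder values, the two weak edges fall below the threshold and the remaining edges form the cylinders, displacement, weight, mpg chain, with the last link carrying a negative sign.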
Video: Watch it to get a quick overview:
Paper: J. Wang, K. Mueller, "The Visual Causality Analyst: An Interactive Interface for Causal Reasoning," IEEE Trans. on Visualization and Computer Graphics, 22(1): 230-239, 2016.