Visual Analytics and Imaging Laboratory (VAI Lab)
Computer Science Department, Stony Brook University, NY
Abstract: There are many applications where users seek to explore the impact of the settings of several categorical variables with respect to one dependent numerical variable. For example, a computer systems analyst might want to study how the type of file system or storage device affects system performance. A usual choice is the method of Parallel Sets designed to visualize multivariate categorical variables. However, we found that the magnitude of the parameter impacts on the numerical variable cannot be easily observed here. We also attempted a dimension reduction approach based on Multiple Correspondence Analysis but found that the SVD-generated 2D layout resulted in a loss of information. We hence propose a novel approach, the Interactive Configuration Explorer (ICE), which directly addresses the need of analysts to learn how the dependent numerical variable is affected by the parameter settings given multiple optimization objectives. No information is lost as ICE shows the complete distribution and statistics of the dependent variable in context with each categorical variable. Analysts can interactively filter the variables to optimize for certain goals such as achieving a system with maximum performance, low variance, etc. Our system was developed in tight collaboration with a group of systems performance researchers and its final effectiveness was evaluated with expert interviews, a comparative user study, and two case studies.
Teaser: This image shows the use of the ICE tool in a computer systems performance optimization scenario:
A is the Parameter Explorer. It shows the distribution and statistics of the numerical target variable in the context of the various categorical variables (or parameters), labeled by the green buttons at the bottom of the interface (e.g., Workload, File System). Each parameter has levels; e.g., Workload has 4 levels (dbsrvr, filesrvr, mailsrvr, and websrvr). Each level has an associated bar displaying statistical information about the numerical target variable (here, system throughput) for that level. Analysts can interactively deselect (and reselect) parameter levels to filter out the associated parameter configurations throughout the interface. B is the Aggregate View, which visualizes the joint distribution of all currently selected parameter levels. C is the Provenance Terminal, which keeps track of the changes in the target variable over the course of the user interactions. D shows the information contained in each bar inside the Parameter Explorer and Aggregate View.
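The per-level bars and the filtering interaction described above can be illustrated with a minimal Python sketch. This is not the ICE implementation: the function names (`filter_rows`, `level_stats`, `aggregate_stats`), the file system levels (`ext4`, `xfs`), and all throughput values are hypothetical and chosen only for illustration. Only the parameter names and the Workload levels come from the description above.

```python
from collections import defaultdict
from statistics import mean, median, pstdev

# Hypothetical data: one row per parameter configuration with its measured
# throughput. The file system levels and all numbers are made up.
ROWS = [
    {"Workload": "dbsrvr", "File System": "ext4", "throughput": 120.0},
    {"Workload": "dbsrvr", "File System": "xfs", "throughput": 150.0},
    {"Workload": "filesrvr", "File System": "ext4", "throughput": 300.0},
    {"Workload": "filesrvr", "File System": "xfs", "throughput": 340.0},
    {"Workload": "mailsrvr", "File System": "ext4", "throughput": 200.0},
    {"Workload": "mailsrvr", "File System": "xfs", "throughput": 220.0},
    {"Workload": "websrvr", "File System": "ext4", "throughput": 900.0},
    {"Workload": "websrvr", "File System": "xfs", "throughput": 700.0},
]


def summarize(values):
    """Summary statistics for a non-empty list of target values."""
    return {
        "n": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
        "std": pstdev(values),  # population standard deviation
    }


def filter_rows(rows, deselected):
    """Keep only configurations that use no deselected level.

    `deselected` maps a parameter name to the set of levels switched off.
    """
    return [
        row for row in rows
        if all(row[param] not in levels for param, levels in deselected.items())
    ]


def level_stats(rows, parameter, target="throughput"):
    """Statistics of the target per level of one parameter (one 'bar' each)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[parameter]].append(row[target])
    return {level: summarize(values) for level, values in groups.items()}


def aggregate_stats(rows, target="throughput"):
    """Statistics of the target over all remaining configurations."""
    return summarize([row[target] for row in rows])


if __name__ == "__main__":
    # Deselect the hypothetical ext4 level, then recompute both views.
    kept = filter_rows(ROWS, {"File System": {"ext4"}})
    for level, stats in level_stats(kept, "Workload").items():
        print(level, stats)
    print("aggregate", aggregate_stats(kept))
```

Deselecting a level simply removes every configuration that uses it, so the per-level and aggregate statistics are always recomputed from the same filtered set of rows.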
Video: Watch it to get a quick overview:
Paper: A. Tyagi, Z. Cao, T. Estro, E. Zadok, K. Mueller, "ICE: An Interactive Configuration Explorer for High Dimensional Parameter Spaces," IEEE Visual Analytics Science and Technology Conference (VAST), Vancouver, Canada, October 2019. pdf ppt talk-video github