Visual Analytics and Imaging Laboratory (VAI Lab)
Computer Science Department, Stony Brook University, NY

Into the Void: Mapping the Unseen Gaps in High Dimensional Data

Abstract: We present a comprehensive pipeline, integrated with a visual analytics system called GapMiner, capable of exploring and exploiting untapped opportunities within the empty regions of high-dimensional datasets. Our approach utilizes a novel Empty-Space Search Algorithm (ESA) to identify the center points of these uncharted voids, which represent reservoirs for potentially valuable new configurations. Initially, this process is guided by user interactions through GapMiner, which visualizes Empty-Space Configurations (ESCs) within the context of the dataset and allows domain experts to explore and refine ESCs for subsequent validation in domain experiments or simulations. These activities iteratively enhance the dataset and contribute to training a connected deep neural network (DNN). As training progresses, the DNN gradually assumes the role of identifying and validating high-potential ESCs, reducing the need for direct user involvement. Once the DNN achieves sufficient accuracy, it autonomously guides the exploration of optimal configurations by predicting performance and refining configurations through a combination of gradient ascent and improved empty-space searches. Domain experts were actively involved throughout the system’s development. Our findings demonstrate that this methodology consistently generates superior novel configurations compared to conventional randomization-based approaches. We illustrate its effectiveness in multiple case studies with diverse objectives.
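To make the idea of searching for "center points of uncharted voids" concrete, here is a minimal sketch in plain Python. It is not the paper's ESA: it is a deliberately simple stand-in, closer to the randomization-based baselines mentioned above. It samples points inside the data's bounding box and keeps those farthest from any existing configuration. The function names, parameters, and toy dataset are all hypothetical.

```python
import math
import random


def nearest_distance(point, data):
    """Euclidean distance from `point` to its closest row in `data`."""
    return min(math.dist(point, row) for row in data)


def find_empty_space_candidates(data, n_candidates=2000, top_k=3, seed=0):
    """Return the `top_k` sampled points that lie farthest from all data rows.

    Hypothetical stand-in for an empty-space search (NOT the paper's ESA):
    sample uniformly inside the data's bounding box, then rank the samples
    by their distance to the nearest existing configuration.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(row[d] for row in data) for d in range(dims)]
    highs = [max(row[d] for row in data) for d in range(dims)]
    samples = [
        tuple(rng.uniform(lows[d], highs[d]) for d in range(dims))
        for _ in range(n_candidates)
    ]
    # Largest nearest-neighbor distance first: these sit deepest in a void.
    samples.sort(key=lambda p: nearest_distance(p, data), reverse=True)
    return samples[:top_k]


# Toy dataset (an assumption for illustration): points near the four
# corners of the unit square, leaving the middle of the square empty.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
        (0.1, 0.1), (0.9, 0.9), (0.1, 0.9), (0.9, 0.1)]
best = find_empty_space_candidates(data)[0]
print(best)  # expected to land close to the center of the square
```

In a workflow like the one described, such candidates would only be starting points: a domain expert or a trained model would still need to refine and validate them.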

Teaser: Shown here is the GapMiner visual interface:

Teaser image

In the GapMiner visual interface, a selected Empty-Space Configuration (ESC) is reflected in all displays.

(A) Control Panel. From top to bottom: (a) File Selector to load a dataset of initial verified configurations with values for all parameter variables. (b) Target Variable Configurator with an interface for breaking the target variable's value range into discrete intervals. (c) Empty-Space Search Algorithm (ESA) Configurator with a selector for the ESA and a slider to set the ESC batch size. (d) Empty-Space Configuration (ESC) Range Selector to control which target variable intervals are used for display and ESC proposals. (e) Overview Quality Monitor, a scree plot that shows the amount of data variance captured by the Overview (PCA) Display.

(B) Overview (PCA) Display with data distribution contours, raw or modified ESCs rendered as points, and a color legend.

(C) Empty-Space Configuration (ESC) Editor. From left to right: (a) Parallel Coordinate Plot Display where users can configure ESCs starting from a raw ESC or an existing configuration. (b) Neighbor Display of the selected ESC, providing a local view of the distribution of its nearest existing configurations.

(D) Progress Tracker. From top to bottom: (a) Budget/Reward Display that captures the aggregated evaluation cost and merit of the ESC exploration so far. (b) Training Status Display of the assistive DNN. (c) Pareto Frontier plot that shows the Pareto frontiers of existing configurations (red) and ESCs (gray) with respect to two user-chosen merit (target) variables.
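For readers unfamiliar with the Pareto Frontier plot, the following sketch shows one way such a frontier over two merit variables can be computed. It assumes that both variables are to be maximized and that each configuration is reduced to a pair of merit values. The function name and the sample values are hypothetical, not taken from GapMiner.

```python
def pareto_frontier(points):
    """Return the points that no other point dominates.

    Assumption: both merit values are to be maximized. A point q dominates p
    when q is at least as good as p in both values and is not identical to p.
    """
    frontier = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            frontier.append(p)
    # Sort by the first merit variable so the frontier can be drawn as a line.
    return sorted(frontier)


# Hypothetical (merit_1, merit_2) pairs for six configurations.
configs = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
print(pareto_frontier(configs))  # [(1, 5), (2, 4), (3, 3)]
```

Computing one frontier for the existing configurations and another for the ESCs would give the two curves that the plot contrasts. If a merit variable should be minimized instead, its values can be negated before calling the function.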

Video: Watch for a quick overview of how a user would find promising new configurations with the GapMiner interface:

Paper: X. Zhang, T. Estro, G. Kuenning, E. Zadok, K. Mueller, “Into the Void: Mapping the Unseen Gaps in High Dimensional Data,” IEEE Transactions on Visualization and Computer Graphics, 31(10): 8578-8591, 2025. PDF

Funding: NSF grant CNS 1900706