Interactive Subspace Cluster Analysis Guided by Semantic Attribute Associations

Abstract: Multivariate datasets with many variables are increasingly common in many application areas. Most methods approach multivariate data from a singular perspective. Subspace analysis techniques, on the other hand, provide the user with a set of subspaces which can be used to view the data from multiple perspectives. However, many subspace analysis methods produce a huge number of subspaces, a number of which are usually redundant. The sheer number of subspaces can overwhelm analysts, making it difficult for them to find informative patterns in the data. In this paper, we propose a new paradigm that constructs semantically consistent subspaces. These subspaces can then be expanded into more general subspaces by way of conventional techniques. Our framework uses the labels/meta-data of a dataset to learn the semantic meanings and associations of the attributes. We employ a neural network to learn a semantic word embedding of the attributes and then divide this attribute space into semantically consistent subspaces. The user is provided with a visual analytics interface that guides the analysis process. We show via various examples that these semantic subspaces can help organize the data and guide the user in finding interesting patterns in the dataset.
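To make the idea of dividing an embedded attribute space into semantically consistent subspaces more concrete, here is a minimal sketch. It is not the paper's implementation. The embedding vectors are hand-made stand-ins for a learned word embedding. The attribute labels are illustrative and are not necessarily columns of the actual dataset. The grouping rule (single-linkage over a cosine-similarity threshold) and the names `EMBEDDING`, `cosine`, and `semantic_subspaces` are assumptions chosen for brevity.

```python
import math

# Hypothetical 3-d "semantic" vectors for attribute labels. In the paper these
# would come from a learned word embedding; the numbers below are made up.
EMBEDDING = {
    "food expenditure":        (0.9, 0.1, 0.0),
    "restaurant expenditure":  (0.8, 0.2, 0.1),
    "total income":            (0.1, 0.9, 0.1),
    "wage income":             (0.2, 0.8, 0.0),
    "number of televisions":   (0.0, 0.1, 0.9),
    "number of refrigerators": (0.1, 0.0, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_subspaces(embedding, threshold=0.8):
    """Group attributes into subspaces by single-linkage over cosine similarity.

    Two attributes land in the same subspace if they are connected by a chain
    of pairs whose similarity is at least `threshold` (an assumed cutoff).
    """
    labels = list(embedding)
    parent = {label: label for label in labels}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every pair of attributes that is similar enough.
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if cosine(embedding[a], embedding[b]) >= threshold:
                parent[find(a)] = find(b)

    # Collect the attributes under each root into one subspace.
    groups = {}
    for label in labels:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    for subspace in semantic_subspaces(EMBEDDING):
        print(subspace)
```

With these toy vectors the attributes fall into three groups (expenditure, income, and household appliances), each of which could then be explored as its own subspace. With a real embedding, the threshold and the clustering rule would need tuning.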

Teaser: The figure below shows an overview of the system visualizing the Filipino Family Income and Expenditure dataset:

The user interface consists of five coordinated views. (a) Control Panel, used to change the various settings of the visual analytics tool. (b) Dimensionality View, provides diagnostics about the fidelity of the subspace visualizations. (c) Semantic Space View, visualizes the semantic space of the data using a scatter plot of attribute labels. (d) Subspace View, shows a user-selected subspace in more detail. (e) Subspace Organizer, shows an overview of all of the subspaces generated by the algorithm.

Video: Watch the video for a quick overview:

Paper: S. Mahmood, K. Mueller, "Interactive Subspace Cluster Analysis Guided by Semantic Attribute Associations," IEEE Trans. on Visualization and Computer Graphics, (to appear) 2023 PDF PPT

Funding: NSF grants IIS 1941613 and IIS 1527200