Graph Spectral Feature Learning for Mixed Data of Categorical and Numerical Type
Feature learning in the presence of a mixed type of variables, numerical and categorical types, is an important issue for related modeling problems. For simple neighborhood queries under mixed data space, standard practice is to consider numerical and categorical variables separately and combining them based on some suitable distance functions. Alternatives, such as Kernel learning or Principal Component do not explicitly consider the inter-dependence structure among the mixed type of variables. In this work, we propose a novel strategy to explicitly model the probabilistic dependence structure among the mixed type of variables by an undirected graph. Spectral decomposition of the graph Laplacian provides the desired feature transformation. The Eigen spectrum of the transformed feature space shows increased separability and more prominent clusterability among the observations. The main novelty of our paper lies in capturing interactions of the mixed feature type in an unsupervised framework using a graphical model. We numerically validate the implications of the feature learning strategy
READ FULL TEXT