On spectral clustering, informativeness and seriation

Date
2022
Journal Title
Journal ISSN
Volume Title
Publisher
University of Delaware
Abstract
This thesis studies spectral clustering and seriation, which have very similar relaxed objective functions. We analyzed these problems from the standpoint of informativeness, an unsupervised measure of the similarity within a data set. We are motivated by the fact that spectral clustering depends on the choice of similarity and it does not perform well for all values of Gaussian kernel bandwidth parameter. We proposed three alternative fixes to spectral clustering algorithm, such that it could be used for all kind of similarities. We also implemented state-of-the-art seriation algorithms, GNCR and FAQ. For both clustering and seriation, we analyzed the relationship between the Gaussian kernel bandwidth parameter with informativeness and observed that, in order to find the optimal Gaussian kernel bandwidth parameter, one can use informativeness as a guiding principle to get a candidate range of values. We also observed that GNCR is more robust to variations in the kernel bandwidth parameter as compared to FAQ.
Description
Keywords
Seriation algorithms, Spectral clustering, Informativeness
Citation