Feature Relevance in Ward’s Hierarchical Clustering Using the Lp Norm
Cordeiro De Amorim, Renato
In this paper we introduce a new hierarchical clustering algorithm called Ward_p. Unlike the original Ward method, Ward_p generates feature weights, which can be seen as feature rescaling factors thanks to the use of the Lp norm. The feature weights are cluster dependent, allowing a feature to have different degrees of relevance in different clusters. We validate our method with experiments on a total of 75 real-world and synthetic datasets, with and without added features made of uniformly random noise. Our experiments show that: (i) our feature weighting method produces results superior to those of the original Ward method on datasets containing noise features; (ii) it is indeed possible to estimate a good exponent p in a totally unsupervised framework. The clusterings produced by Ward_p depend on p. This makes the estimation of a good value for this exponent a requirement for this algorithm, and indeed for any other algorithm based on the Lp norm.
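
The remark that feature weights can be seen as rescaling factors can be illustrated with a small numerical check. The following is a minimal sketch, not the algorithm itself: it assumes the weights enter the dissimilarity as w_v^p |x_v - y_v|^p, and the function names, the weight values, and the choice p = 1.5 are all hypothetical. The abstract does not state how the weights are computed, so they are simply fixed by hand here.

```python
def minkowski_p(x, y, p):
    """Minkowski distance raised to the power p (no weights)."""
    return sum(abs(xv - yv) ** p for xv, yv in zip(x, y))


def weighted_minkowski_p(x, y, w, p):
    """Assumed weighted form: sum over features of w_v^p * |x_v - y_v|^p."""
    return sum((wv ** p) * abs(xv - yv) ** p for xv, yv, wv in zip(x, y, w))


def rescale(x, w):
    """Multiply each feature value by its (non-negative) weight."""
    return [wv * xv for xv, wv in zip(x, w)]


# Hypothetical data: the third feature plays the role of a noise feature,
# so it is given a small, hand-picked weight.
x = [1.0, 2.0, 10.0]
y = [2.0, 0.0, -5.0]
w = [0.6, 0.3, 0.1]
p = 1.5

d_weighted = weighted_minkowski_p(x, y, w, p)
d_rescaled = minkowski_p(rescale(x, w), rescale(y, w), p)

# The two quantities agree up to floating-point error, because
# |w*x - w*y|^p = w^p * |x - y|^p for any w >= 0.
print(abs(d_weighted - d_rescaled) < 1e-9)
```

Under this assumed form, weighting a feature is equivalent to multiplying its values by the weight before taking the unweighted distance, so a feature with a small weight contributes little to the dissimilarity.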