dc.contributor.author: Cordeiro De Amorim, Renato
dc.contributor.author: Hennig, Christian
dc.date.accessioned: 2016-03-30T14:12:26Z
dc.date.available: 2016-03-30T14:12:26Z
dc.date.issued: 2015-12-10
dc.identifier.citation: Cordeiro De Amorim, R & Hennig, C 2015, 'Recovering the number of clusters in data sets with noise features using feature rescaling factors', Information Sciences, vol. 324, pp. 126-145. https://doi.org/10.1016/j.ins.2015.06.039
dc.identifier.issn: 0020-0255
dc.identifier.uri: http://hdl.handle.net/2299/16864
dc.description.abstract: In this paper we introduce three methods for re-scaling data sets aiming at improving the likelihood of clustering validity indexes to return the true number of spherical Gaussian clusters with additional noise features. Our method obtains feature re-scaling factors taking into account the structure of a given data set and the intuitive idea that different features may have different degrees of relevance at different clusters. We experiment with the Silhouette (using squared Euclidean, Manhattan, and the pth power of the Minkowski distance), Dunn’s, Calinski–Harabasz and Hartigan indexes on data sets with spherical Gaussian clusters with and without noise features. We conclude that our methods indeed increase the chances of estimating the true number of clusters in a data set. [en]
dc.format.extent: 396865
dc.language.iso: eng
dc.relation.ispartof: Information Sciences
dc.title: Recovering the number of clusters in data sets with noise features using feature rescaling factors [en]
dc.contributor.institution: School of Computer Science
dc.description.status: Peer reviewed
rioxxterms.versionofrecord: 10.1016/j.ins.2015.06.039
rioxxterms.type: Journal Article/Review
herts.preservation.rarelyaccessed: true


