Robust data clustering by learning multi-metric Lq-norm distances (Journal article)

Type
Label
  • Robust data clustering by learning multi-metric Lq-norm distances (Journal article) (literal)
Year
  • 2012-01-01T00:00:00+01:00 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
  • 10.1016/j.eswa.2011.07.023 (literal)
Alternative label
  • Zhang J., Peng L., Zhao X., Kuruoglu E. E. (2012)
    Robust data clustering by learning multi-metric Lq-norm distances
    in Expert systems with applications
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • Zhang J., Peng L., Zhao X., Kuruoglu E. E. (literal)
First page
  • 335 (literal)
Last page
  • 349 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#altreInformazioni
  • Project: MAE - Chinese Academy of Sciences Bilateral Project; Project type: NC (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#url
  • http://www.sciencedirect.com/science/article/pii/S0957417411009857 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
  • 39 (literal)
Journal
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroFascicolo
  • 1 (literal)
Note
  • ISI Web of Science (WOS) (literal)
  • PuMa (literal)
  • Scopus (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • School of Computer Science and Technology, Xidian University, Xi'an, China; Department of Physics, Fudan University, Shanghai, China; CNR-ISTI, Pisa, Italy (literal)
Title
  • Robust data clustering by learning multi-metric Lq-norm distances (literal)
Abstract
  • Unsupervised clustering of datasets that contain severe outliers is a difficult task. In this paper, we propose a cluster-dependent multi-metric clustering approach that is robust to severe outliers. A dataset is modeled as a set of clusters, each contaminated by noise of an unknown, cluster-dependent level that gives rise to the outliers of that cluster. With such a model, a multi-metric Lp-norm transformation is proposed and learnt which maps each cluster to the most Gaussian distribution by minimizing a non-Gaussianity measure. The approach is composed of two consecutive phases: multi-metric location estimation (MMLE) and multi-metric iterative chi-square cutoff (ICSC). Algorithms for MMLE and ICSC are proposed. It is proved that the MMLE algorithm searches for the solution of a multi-objective optimization problem and in fact learns a cluster-dependent multi-metric Lq-norm distance and/or a cluster-dependent multikernel defined in data space for each cluster. Experiments on heavy-tailed alpha-stable mixture datasets, Gaussian mixture datasets with radial and diffuse outliers added respectively, and the real Wisconsin breast cancer and lung cancer datasets show that the proposed method outperforms many existing robust clustering and outlier detection methods in both clustering and outlier detection performance. (literal)
Product of
CNR author
Keyword set
