http://www.cnr.it/ontology/cnr/individuo/prodotto/ID92091
Mining top-K patterns from binary datasets in presence of noise (Contribution in conference proceedings)
- Type
- Label
- Mining top-K patterns from binary datasets in presence of noise (Contribution in conference proceedings) (literal)
- Year
- 2010-01-01T00:00:00+01:00 (literal)
- Alternative label
Lucchese C.; Orlando S.; Perego R. (2010)
Mining top-K patterns from binary datasets in presence of noise
in Tenth SIAM International Conference on Data Mining, Columbus, Ohio, US
(literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
- Lucchese C.; Orlando S.; Perego R. (literal)
- Start page
- End page
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#url
- http://www.siam.org/proceedings/datamining/2010/dm10_015_lucchesec.pdf (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#note
- In: SDM10 - Tenth SIAM International Conference on Data Mining (Columbus, Ohio, US, April 29 - May 1 2010). Proceedings, pp. 165 - 176. SIAM, 2010. (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#descrizioneSinteticaDelProdotto
- ABSTRACT: The discovery of patterns in binary datasets has many applications, e.g. in electronic commerce, TCP/IP networking, Web usage logging, etc. Still, this is a very challenging task in many respects: overlapping vs. non-overlapping patterns, presence of noise, extraction of the most important patterns only. In this paper we formalize the problem of discovering the Top-K patterns from binary datasets in presence of noise, as the minimization of a novel cost function. According to the Minimum Description Length principle, the proposed cost function favors succinct pattern sets that may approximately describe the input data. We propose a greedy algorithm for the discovery of Patterns in Noisy Datasets, named PaNDa, and show that it outperforms related techniques on both synthetic and real-world data. (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
- CNR-ISTI, Pisa; Dipartimento di Informatica, Università Ca' Foscari di Venezia; CNR-ISTI, Pisa (literal)
- Title
- Mining top-K patterns from binary datasets in presence of noise (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#isbn
- 978-0-898717-03-7 (literal)
- Abstract
- The discovery of patterns in binary datasets has many applications, e.g. in electronic commerce, TCP/IP networking, Web usage logging, etc. Still, this is a very challenging task in many respects: overlapping vs. non-overlapping patterns, presence of noise, extraction of the most important patterns only. In this paper we formalize the problem of discovering the Top-K patterns from binary datasets in presence of noise, as the minimization of a novel cost function. According to the Minimum Description Length principle, the proposed cost function favors succinct pattern sets that may approximately describe the input data. We propose a greedy algorithm for the discovery of Patterns in Noisy Datasets, named PaNDa, and show that it outperforms related techniques on both synthetic and real-world data. (literal)
- Product of
- CNR author
- Keyword set
Incoming links:
- CNR author of
- Product
- Keyword set of