Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution (Journal article)

Label
  • Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution (Journal article) (literal)
Year
  • 2006-01-01T00:00:00+01:00 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
  • 10.1007/11880561_3 (literal)
Alternative label
  • [1] Geraci F., [1] Pellegrini M., [2] Sebastiani F., [3] Maggini M. (2006)
    Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution
    in Lecture Notes in Computer Science; Springer, Berlin (Germany)
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • [1] Geraci F., [1] Pellegrini M., [2] Sebastiani F., [3] Maggini M. (literal)
Start page
  • 25 (literal)
End page
  • 36 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
  • 4209 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#note
  • In Proceedings of the 13th Symposium on String Processing and Information Retrieval (SPIRE 2006), Glasgow, UK (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • [1] CNR-IIT, Pisa, Italy; [2] CNR-ISTI, Pisa, Italy; [3] Dipartimento di Ingegneria dell’Informazione, Università di Siena, Italy (literal)
Title
  • Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution (literal)
Abstract
  • This paper describes Armil, a meta-search engine that groups into disjoint labelled clusters the Web snippets returned by auxiliary search engines. The cluster labels generated by Armil provide the user with a compact guide to assessing the relevance of each cluster to her information need. Striking the right balance between running time and cluster well-formedness was a key point in the design of our system. Both the clustering and the labelling tasks are performed on the fly by processing only the snippets provided by the auxiliary search engines, and use no external sources of knowledge. Clustering is performed by means of a fast version of the furthest-point-first algorithm for metric k-center clustering. Cluster labelling is achieved by combining intra-cluster and inter-cluster term extraction based on a variant of the information gain measure. We have tested the clustering effectiveness of Armil against Vivisimo, the de facto industrial standard in Web snippet clustering, using as benchmark a comprehensive set of snippets obtained from the Open Directory Project hierarchy. According to two widely accepted external metrics of clustering quality, Armil achieves better performance levels by 10%. We also report the results of a thorough user evaluation of both the clustering and the cluster labelling algorithms. (literal)
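The abstract names the furthest-point-first (FPF) algorithm for metric k-center clustering. As a rough illustration only, the sketch below shows the textbook FPF heuristic, not the fast variant used in Armil, whose details are not given in this record. The function name `furthest_point_first`, the use of 2-D points, and the Euclidean distance are assumptions made for the example; a snippet clusterer would use a distance defined over text.

```python
import math


def furthest_point_first(points, k, dist):
    """Textbook furthest-point-first heuristic for metric k-center.

    Hypothetical helper for illustration: returns (centers, assign), where
    centers holds indices into points and assign[i] is the position in
    centers of the center closest to points[i].
    """
    if not points or k <= 0:
        return [], []
    k = min(k, len(points))

    centers = [0]  # the first center is arbitrary; here, the first point
    nearest = [dist(p, points[0]) for p in points]  # distance to closest center
    assign = [0] * len(points)

    while len(centers) < k:
        # The next center is the point furthest from all current centers.
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(far)
        # Reassign every point that is closer to the new center.
        for i, p in enumerate(points):
            d = dist(p, points[far])
            if d < nearest[i]:
                nearest[i] = d
                assign[i] = len(centers) - 1
    return centers, assign


pts = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
centers, assign = furthest_point_first(pts, 3, math.dist)
print(centers, assign)
```

Each round costs one pass over the points, so the sketch runs in O(nk) distance computations; the resulting clusters are disjoint, matching the "disjoint labelled clusters" the abstract describes.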