Sorting out the document identifier assignment problem (Journal article)

Label
  • Sorting out the document identifier assignment problem (Journal article) (literal)
Year
  • 2007-01-01T00:00:00+01:00 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
  • 10.1007/978-3-540-71496-5_12 (literal)
Alternative label
  • Silvestri F. (2007)
    Sorting out the document identifier assignment problem
    in Lecture notes in computer science
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • Silvestri F. (literal)
Start page
  • 101 (literal)
End page
  • 112 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#altreInformazioni
  • Best Paper Award (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#url
  • http://www.springerlink.com/content/y0755644n8n48627/ (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
  • 4425 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#note
  • In: Advances in Information Retrieval. 29th European Conference on IR Research, ECIR 2007 (Rome, 2-5 April 2007). Proceedings, pp. 101-112. Giambattista Amati, Claudio Carpineto and Giovanni Romano (eds.). (Lecture Notes in Computer Science, vol. 4425). Springer Verlag, 2007. (literal)
Note
  • Scopus (literal)
  • ISI Web of Science (WOS) (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • ISTI - CNR, Pisa (literal)
Title
  • Sorting out the document identifier assignment problem (literal)
Abstract
  • The compression of inverted file indexes in Web search engines has received a lot of attention in recent years. Compressing the index not only reduces space occupancy but also improves overall retrieval performance, since it allows better exploitation of the memory hierarchy. In this paper we show empirically that, for collections of Web documents, the performance of compression algorithms can be enhanced simply by assigning identifiers to documents according to the lexicographical ordering of their URLs. We validate this assumption by comparing several assignment techniques and several compression algorithms on a fairly large collection of about six million documents. The results are very encouraging: the compression ratio improves by up to 40% using an algorithm that takes about ninety seconds to finish and uses only 100 MB of main memory. (literal)
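    The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy collection, the hypothetical host names (site000.example and so on), the round-robin "crawl order" baseline, and the choice of variable-byte coding for the gaps are all assumptions made for illustration. The sketch assigns identifiers by sorting URLs lexicographically, then compares the encoded size of one posting list under both assignments.

```python
from typing import Dict, Iterable, List


def assign_ids(urls_in_order: Iterable[str]) -> Dict[str, int]:
    """Assign consecutive document identifiers in the given order."""
    return {url: doc_id for doc_id, url in enumerate(urls_in_order)}


def vbyte_len(n: int) -> int:
    """Bytes needed to store n with variable-byte coding (7 payload bits per byte)."""
    length = 1
    while n >= 128:
        n >>= 7
        length += 1
    return length


def posting_list_bytes(doc_ids: Iterable[int]) -> int:
    """Size of a posting list stored as variable-byte coded gaps between sorted IDs."""
    total = 0
    prev = -1  # so that every gap, including the first, is at least 1
    for doc_id in sorted(doc_ids):
        total += vbyte_len(doc_id - prev)
        prev = doc_id
    return total


# Toy collection (assumption): 200 hosts with 5 pages each, listed in a
# round-robin order that mimics a crawler visiting one page per host in turn.
crawl_order: List[str] = [
    f"http://site{h:03d}.example/page{p}" for p in range(5) for h in range(200)
]

# A term that occurs only on the pages of one host (assumption).
matching = [u for u in crawl_order if u.startswith("http://site042.example/")]

crawl_ids = assign_ids(crawl_order)          # baseline: IDs in crawl order
sorted_ids = assign_ids(sorted(crawl_order)) # IDs in lexicographic URL order

crawl_bytes = posting_list_bytes(crawl_ids[u] for u in matching)
sorted_bytes = posting_list_bytes(sorted_ids[u] for u in matching)

print(crawl_bytes, sorted_bytes)  # prints "9 6"
```

    In this toy case the URL-sorted assignment places the five matching pages at consecutive identifiers, so four of the five gaps are 1 and fit in a single byte each. The figures here say nothing about the gains reported in the paper, which depend on the real collection and the compression algorithms evaluated there.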