Fast Detection of XML Structural Similarity (Journal article)

Type
Label
  • Fast Detection of XML Structural Similarity (Journal article) (literal)
Year
  • 2005-01-01T00:00:00+01:00 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
  • 10.1109/TKDE.2005.27 (literal)
Alternative label
  • Sergio Flesca; Giuseppe Manco; Elio Masciari; Luigi Pontieri; Andrea Pugliese (2005)
    Fast Detection of XML Structural Similarity
    in IEEE transactions on knowledge and data engineering (Print)
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • Sergio Flesca; Giuseppe Manco; Elio Masciari; Luigi Pontieri; Andrea Pugliese (literal)
Start page
  • 160 (literal)
End page
  • 175 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
  • 17 (literal)
Journal
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#pagineTotali
  • 15 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroFascicolo
  • 2 (literal)
Notes
  • Scopus (literal)
  • DBLP (literal)
  • Google Scholar (literal)
  • IEEE Xplore digital library (literal)
  • ISI Web of Science (WOS) (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • DEIS, University of Calabria; ICAR-CNR; ICAR-CNR; ICAR-CNR; DEIS, University of Calabria (literal)
Title
  • Fast Detection of XML Structural Similarity (literal)
Abstract
  • Because of the widespread diffusion of semistructured data in XML format, much research effort is currently devoted to support the storage and retrieval of large collections of such documents. XML documents can be compared as to their structural similarity, in order to group them into clusters so that different storage, retrieval, and processing techniques can be effectively exploited. In this scenario, an efficient and effective similarity function is the key of a successful data management process. We present an approach for detecting structural similarity between XML documents which significantly differs from standard methods based on graph-matching algorithms, and allows a significant reduction of the required computation costs. Our proposal roughly consists of linearizing the structure of each XML document, by representing it as a numerical sequence and, then, comparing such sequences through the analysis of their frequencies. First, some basic strategies for encoding a document are proposed, which can focus on diverse structural facets. Moreover, the theory of Discrete Fourier Transform is exploited to effectively and efficiently compare the encoded documents (i.e., signals) in the domain of frequencies. Experimental results reveal the effectiveness of the approach, also in comparison with standard methods. (literal)
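The pipeline outlined in the abstract (linearize each document's tag structure into a numerical sequence, then compare the sequences in the frequency domain) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' encoding or distance function: the names `linearize`, `dft_magnitudes`, and `structural_distance` are hypothetical, tags are numbered in order of first appearance, end tags are encoded as the negated start-tag code, shorter signals are zero-padded to a common length, and the distance is a plain Euclidean distance between DFT magnitude spectra.

```python
import cmath
import math
import xml.etree.ElementTree as ET


def linearize(elem, tag_ids, out):
    """Append one signed integer per start/end tag, in document order.

    Assumption: tags get ids 1, 2, 3, ... in order of first appearance,
    and an end tag is encoded as the negated id of its start tag.
    """
    code = tag_ids.setdefault(elem.tag, len(tag_ids) + 1)
    out.append(code)           # start tag
    for child in elem:
        linearize(child, tag_ids, out)
    out.append(-code)          # end tag
    return out


def dft_magnitudes(signal, n):
    """Magnitudes of the n-point DFT of `signal`, zero-padded to length n."""
    padded = list(signal) + [0] * (n - len(signal))
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(padded)))
        for k in range(n)
    ]


def structural_distance(xml_a, xml_b):
    """Euclidean distance between the magnitude spectra of two documents."""
    tag_ids = {}               # shared, so equal tags map to equal codes
    sig_a = linearize(ET.fromstring(xml_a), tag_ids, [])
    sig_b = linearize(ET.fromstring(xml_b), tag_ids, [])
    n = max(len(sig_a), len(sig_b))
    mag_a = dft_magnitudes(sig_a, n)
    mag_b = dft_magnitudes(sig_b, n)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(mag_a, mag_b)))
```

In this sketch only element tags contribute to the signal, so two documents that differ in text content but share the same tag structure are at distance zero, while rearranging the nesting changes the spectrum. The direct DFT used here costs O(n²) per document; a fast Fourier transform routine would be the natural replacement for large collections.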
Product of
CNR author
Keyword set
