XML Document Co-clustering via Non-negative Matrix Tri-factorization (Conference proceedings contribution)

Type
Label
  • XML Document Co-clustering via Non-negative Matrix Tri-factorization (Conference proceedings contribution) (literal)
Year
  • 2014-01-01T00:00:00+01:00 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
  • 10.1109/ICTAI.2014.96 (literal)
Alternative label
  • Gianni Costa and Riccardo Ortale (2014)
    XML Document Co-clustering via Non-negative Matrix Tri-factorization
    in International Conference on Tools with Artificial Intelligence (ICTAI)
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • Gianni Costa and Riccardo Ortale (literal)
Start page
  • 607 (literal)
End page
  • 614 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#url
  • http://dx.doi.org/10.1109/ICTAI.2014.96 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#titoloVolume
  • 26th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2014, Limassol, Cyprus, November 10-12, 2014 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • ICAR-CNR (literal)
Title
  • XML Document Co-clustering via Non-negative Matrix Tri-factorization (literal)
Abstract
  • XML co-clustering is a promising method to overcome the limited effectiveness of traditional XML clustering approaches, because it exploits the mutual relationships between XML documents and their respective XML features while clustering both simultaneously. To shed light on this so far unexplored research direction, we conduct a systematic study of the effectiveness of XML co-clustering, by viewing the task as parametric with respect to the XML features. Thus, the definition and exploitation of three distinct types of XML features, which are respectively informative of the content, the structure, and both aspects of the XML documents, allow an in-depth investigation of all three instances of the XML co-clustering task, i.e., XML co-clustering by content alone, by structure alone, and by both structure and content. XML co-clustering relies on a non-negative matrix tri-factorization technique that efficiently processes large-scale input data, which is especially useful with large corpora of text-centric XML documents. The relevance of the structural and content features of the XML documents is assessed through a new weighting scheme. An intensive experimental evaluation on real-world benchmark XML corpora reveals a higher effectiveness of XML co-clustering in comparison with state-of-the-art approaches to XML clustering. Insights are also provided on the effectiveness of XML feature clustering. (literal)
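The abstract names non-negative matrix tri-factorization as the co-clustering engine without giving details. As a rough illustration only, the sketch below factorizes a non-negative document-by-feature matrix X into F S G^T using generic textbook multiplicative updates for the squared Frobenius error. It is not the authors' algorithm: the function name `nmtf`, its parameters, and the toy matrix are assumptions made for this example, and the paper's feature extraction and weighting scheme are not reproduced.

```python
import numpy as np

def nmtf(X, k_docs, k_feats, n_iter=200, eps=1e-9, seed=0):
    """Generic NMTF sketch: X (n x m) ~= F (n x k_docs) @ S @ G.T (k_feats x m).

    Hypothetical helper for illustration; uses plain multiplicative updates
    that keep all three factors non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_docs)) + eps       # document-cluster indicators
    S = rng.random((k_docs, k_feats)) + eps # cluster-to-cluster associations
    G = rng.random((m, k_feats)) + eps      # feature-cluster indicators
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive part).
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G

# Toy document-by-feature matrix (assumed data): two blocks of co-occurring
# documents and features.
X = np.array([[5, 4, 0, 0],
              [4, 5, 0, 0],
              [0, 0, 3, 4],
              [0, 0, 4, 3]], dtype=float)

F, S, G = nmtf(X, k_docs=2, k_feats=2)
doc_clusters = F.argmax(axis=1)   # one cluster label per document (row of X)
feat_clusters = G.argmax(axis=1)  # one cluster label per feature (column of X)
```

Reading cluster labels off the largest entry in each row of F and G is one common convention; documents and features are clustered simultaneously because both factors are fitted against the same matrix.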