http://www.cnr.it/ontology/cnr/individuo/prodotto/ID287415
Supervised DNA Barcodes species classification: Analysis, comparisons and results (Journal article)
- Type
- Label
- Supervised DNA Barcodes species classification: Analysis, comparisons and results (Journal article) (literal)
- Year
- 2014-01-01T00:00:00+01:00 (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
- 10.1186/1756-0381-7-4 (literal)
- Alternative label
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
- Weitschek E.; Fiscon G.; Felici G. (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#url
- http://www.scopus.com/inward/record.url?eid=2-s2.0-84899472938&partnerID=q2rCbXpz (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
- Journal
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroFascicolo
- Note
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
- Department of Engineering, Roma Tre University, Via della Vasca Navale, 79, 00146 Rome, Italy; Institute of Systems Analysis and Computer Science Antonio Ruberti, National Research Council, Viale Manzoni, 30, 00185 Rome, Italy; Department of Computer, Control, and Management Engineering, Sapienza University, Via Ariosto, 25, 00185 Rome, Italy (literal)
- Title
- Supervised DNA Barcodes species classification: Analysis, comparisons and results (literal)
- Abstract
- Background: Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcodes and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcodes: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods: In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with those of ad-hoc and well-established DNA Barcode classification methods. Results: Software that converts DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human-interpretable classification model. Rule-based methods have slightly inferior classification performance, but deliver the species-specific positions and nucleotide assignments. On synthetic data the supervised machine learning methods obtain superior classification performance with respect to the traditional DNA Barcode classification methods. On empirical data their classification performance is comparable to that of the other methods. Conclusions: The classification analysis shows that supervised machine learning methods are promising candidates for successfully handling the DNA Barcoding species classification problem, obtaining excellent performance. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. © 2014 Weitschek et al.; licensee BioMed Central Ltd. (literal)
- Product of
- CNR author
- Keyword set
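
The abstract mentions software that converts DNA Barcode FASTA sequences to the Weka format. The following is only a minimal sketch of what such a conversion could look like, not the authors' released tool. It assumes that sequences are already aligned to equal length, that each FASTA header follows a hypothetical `>specimen_id|species name` convention, and that each alignment position becomes one nominal ARFF attribute. The function names `read_fasta` and `fasta_to_arff` are illustrative.

```python
from io import StringIO

# Assumed nucleotide alphabet; any other symbol (gaps, ambiguity codes)
# is written as "?", which ARFF treats as a missing value.
NUCLEOTIDES = ["A", "C", "G", "T"]


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_arff(handle, relation="barcodes"):
    """Return ARFF text with one nominal attribute per aligned position."""
    records = []
    for header, seq in read_fasta(handle):
        # Hypothetical header convention: "specimen_id|species name"
        _specimen_id, species = header.split("|", 1)
        # Replace spaces so the class label needs no quoting in ARFF
        records.append((species.strip().replace(" ", "_"), seq))
    if not records:
        raise ValueError("no sequences found")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")

    nominal = "{" + ",".join(NUCLEOTIDES) + "}"
    species_names = sorted({species for species, _ in records})
    lines = [f"@RELATION {relation}", ""]
    for i in range(1, length + 1):
        lines.append(f"@ATTRIBUTE pos{i} {nominal}")
    lines.append("@ATTRIBUTE species {" + ",".join(species_names) + "}")
    lines += ["", "@DATA"]
    for species, seq in records:
        values = [c if c in NUCLEOTIDES else "?" for c in seq]
        lines.append(",".join(values + [species]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = ">s1|Homo sapiens\nACGT\n>s2|Pan troglodytes\nACGN\n"
    print(fasta_to_arff(StringIO(sample)))
```

Encoding each position as its own nominal attribute is one plausible choice here, since it would let rule-based and tree-based classifiers report species-specific positions and nucleotides of the kind the abstract describes.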
Incoming links:
- CNR author of
- Product
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#rivistaDi
- Keyword set of