http://www.cnr.it/ontology/cnr/individuo/prodotto/ID128275
Investigation of named entity recognition in molecular biology by data fusion (Conference paper)
- Type
- Label
- Investigation of named entity recognition in molecular biology by data fusion (Conference paper) (literal)
- Year
- 2006-01-01T00:00:00+01:00 (literal)
- Alternative label
P. Arrigo, P. P. Cardo (2006)
Investigation of named entity recognition in molecular biology by data fusion
in Fifth Conference on Bioinformatics of Genome Regulation and Structure, Novosibirsk, July 16-22
(literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
- P. Arrigo, P. P. Cardo (literal)
- Start page
- End page
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#pagineTotali
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
- Istituto di Dermatologia, Università di Genova
CNR ISMAC (literal)
- Title
- Investigation of named entity recognition in molecular biology by data fusion (literal)
- Abstract
- Motivation: The amount of published scientific literature is expanding fast, and its
management and processing have become a burdensome task. Text Mining (TM) is acquiring a
key role in bioinformatics; it seems to be one of the more suitable approaches for integrating
heterogeneous data sources. Textual data has recently been used to support the generation of
scientific hypotheses (Literature Based Discovery). In this work we have considered the
screening of previously unknown molecular in a literature-based discovery perspective. The
identification of molecular species that could interact with one another is the first step in the
in silico design of molecular interaction networks. To achieve this goal, we
need to extract the most standardized set of potentially interacting molecules. The
published articles reflect the fragmentation of biomedical research; this situation could
affect the reliability of the set.
Over time, newly published papers can modify the knowledge about
molecular interactions. Evaluating these changes in knowledge is important from a model
development perspective. The screening of potentially interacting molecules can be
considered equivalent to the linguistic process of named entity recognition. In this paper we have
applied an ensemble of unsupervised learning machines to the selection and extraction of
named entities associated with potentially interacting molecules; the analysis focused
on the changes that emerged in the PubMed repository during the period 1985-2000.
Results: A set of PubMed queries, each of which was a molecular entity, has been
analyzed. Each corresponding set of PubMed abstracts has been separately
retrieved and processed; the retrieval phase was limited to the period 1985-2000. Each set
has been split into three chunks; each chunk represents a five-year subinterval. This
procedure allowed us to screen named entities, specific to each time interval, associated
with potentially interacting molecules. The recognition of time-invariant named entities is
essential for subsequent molecular interaction screening. A data-fusion system, based on
the self-organization paradigm, seems to be able to evaluate the temporal modification of
textual information. Our system has detected, in this preliminary analysis, several named
entities that can be functionally related with the original query. Availability: http://biocomp.ge.ismac.cnr.it/ (literal)
- Product of
- CNR author
- Keyword set
Incoming links:
- Product
- CNR author of
- Keyword set of