BioCreAtIvE II (2006)
The Second BioCreAtIvE (Critical Assessment for Information Extraction in
Biology) challenge (2006-2007) is a community-wide effort for
evaluating text mining and information extraction systems applied to the biological domain.
The second BioCreAtIvE challenge will focus on:
(1) Gene mention tagging [GM]
(2) Gene normalization [GN]
(3) Extraction of protein-protein interactions from text [PPI]
Background
BioCreAtIvE is a community-wide effort for evaluating text
mining and information extraction systems applied to the
biological domain. BioCreAtIvE arose out of the needs of working
biologists, biological curators and bioinformaticians to access the
wealth of information in the literature, and to link this information to
biological databases and ontologies. BioCreAtIvE focuses on the
comparison of methods and community assessment of scientific
progress, rather than on the purely competitive aspects. The first
BioCreAtIvE challenge evaluation in 2003-2004 [1] attracted broad
interest within the bioinformatics and biomedical text mining community,
with participation from 27 groups from 10 countries. BioCreAtIvE is
organized through collaborations between text mining groups,
biological database curators and bioinformatics researchers.
BioCreAtIvE II Information
BioCreAtIvE II will take place in October 2006, with the workshop
to follow in spring 2007. It will consist of three tracks. The first will
focus on finding the mentions of genes and proteins in sentences
drawn from MEDLINE abstracts and is the same as Task 1A (Tanabe,
Xie et al. 2005) from BioCreAtIvE I [2]. The second track will involve
producing a list of the EntrezGene identifiers for all the human
genes/proteins mentioned in a collection of MEDLINE abstracts and is
similar to BioCreAtIvE I Task 1B (Hirschman, Colosimo et al. 2005). The
third track of BioCreAtIvE II is new and will involve identifying protein-
protein interactions from full text papers, including extraction of excerpts
from those papers that describe experimentally derived interactions, for
curation into one of two interaction databases: IntAct (Hermjakob,
Montecchi-Palazzi et al. 2004) [3] and MINT (Zanzoni, Montecchi-Palazzi
et al. 2002).
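As a rough illustration of what the gene normalization track asks for, the sketch below maps gene names found in an abstract to EntrezGene identifiers by simple lexicon lookup. It is a minimal sketch under stated assumptions, not the challenge's submission format or any participant's method: the function name normalize_genes, the toy lexicon and the example sentence are all hypothetical, and a real system would need a full lexicon and disambiguation.

```python
import re

# Toy lexicon (an assumption for illustration only): lowercase gene names
# and synonyms mapped to EntrezGene identifiers.
TOY_LEXICON = {
    "brca1": "672",
    "tp53": "7157",
    "p53": "7157",
}


def normalize_genes(abstract, lexicon=TOY_LEXICON):
    """Return the sorted, de-duplicated identifiers of lexicon names in the text."""
    # Split on non-alphanumeric characters and compare case-insensitively.
    tokens = re.findall(r"[A-Za-z0-9]+", abstract.lower())
    # A set collapses synonyms (e.g. "p53" and "TP53") onto one identifier.
    return sorted({lexicon[t] for t in tokens if t in lexicon}, key=int)


example = "Mutations in BRCA1 and p53 (TP53) were observed."
print(normalize_genes(example))
```

Exact token lookup of this kind misses spelling variants and multi-word names and cannot resolve ambiguous symbols, which is part of what makes the normalization task worth evaluating.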
References
[1] Hirschman, L., M. Colosimo, et al. (2005). "Overview of BioCreAtIvE
task 1B: normalized gene lists." BMC Bioinformatics 6 Suppl 1: S11.
[2] Tanabe, L., N. Xie, et al. (2005). "GENETAG: a tagged corpus for
gene/protein named entity recognition." BMC Bioinformatics 6
Suppl 1: S3.
[3] Hermjakob, H., L. Montecchi-Palazzi, et al. (2004). "IntAct: an open
source molecular interaction database." Nucleic Acids Res
32(Database issue): D452-5.