BioCreAtIvE challenge evaluation
The BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology)
challenge evaluation is a community-wide effort to evaluate text mining
and information extraction systems applied to the biological domain.
The organization of BioCreAtIvE was motivated by the increasing number of groups
working in the area of text mining. However, despite increased activity in this area,
there were no common standards or shared evaluation criteria to enable comparison
among the different approaches. The various groups were addressing different
problems, often using private data sets, and as a result, it was impossible to
determine how good the existing systems were, whether they would scale to real
applications, and what performance could be expected.
The main emphasis of BioCreAtIvE is on the comparison of methods and the
community assessment of scientific progress, rather than on the purely competitive
aspects.
There is considerable difficulty in constructing suitable "gold standard" data for
training and testing new information extraction systems that handle life science
literature. The data sets derived from the BioCreAtIvE challenge have been examined
by biological database curators and domain experts, and thus serve as useful
resources for developing new applications as well as improving existing ones.
Two main issues are addressed at BioCreAtIvE, both concerned with the extraction of
biologically relevant and useful information from the literature. The first is the
detection of biologically significant entities (names), such as gene and protein
names, and their association with existing database entries. The second is the
detection of entity-fact associations (e.g. associations between proteins and
functional terms).
The first BioCreAtIvE challenge evaluation in 2003-2004 attracted considerable
attention within the bioinformatics and biomedical text mining community. Overall,
27 groups from some 10 countries participated in the evaluation. The first
BioCreAtIvE was organized through collaborations between text mining and NLP
groups, biological database curators and bioinformatics researchers, and has served
as the driving force behind the organization of the second BioCreAtIvE challenge.