HVP project corpus (15-01-2013)

Reference:

Verspoor, K., Jimeno Yepes, A., Cavedon, L., McIntosh, T., Herten-Crabb, A., Thomas, Z., and Plazzer, J. (2013).
Annotating the biomedical literature for the human variome. Database: The Journal of Biological Databases and Curation.

Files included in the release
=============================

BRAT configuration files
------------------------

brat_files\entity_types.conf -- entities allowed in the annotation
brat_files\relations_types.conf -- relations allowed in the annotation

Annotated documents
-------------------

The names of the document files are composed of the following fields:

PMCID - PubMed Central ID; a list of the IDs included in the release is available below
serial - serial number; it allows the sections to be ordered as they appear in the paper
section - section name (e.g. Introduction, Background, ...)
paragraph - paragraph number within the section (e.g. p01, p02, ...)

data\PMCID-serial-section-paragraph.txt - files with the extension txt contain the text of one paragraph of the paper
data\PMCID-serial-section-paragraph.ann - files with the extension ann contain the standoff annotation of that paragraph
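
The naming scheme above can be split back into its fields with a few lines of Python. This is only a minimal sketch, not part of the release: the function name parse_name, the ParagraphFile tuple and the example file name are hypothetical, and it assumes that the fields are joined by single hyphens, that PMCID and serial contain no hyphen, and that the paragraph field is the last hyphen-separated part (so a section name may itself contain hyphens).

```python
from pathlib import PureWindowsPath
from typing import NamedTuple


class ParagraphFile(NamedTuple):
    """Fields recovered from one corpus file name (hypothetical helper type)."""
    pmcid: str
    serial: str
    section: str
    paragraph: str
    extension: str


def parse_name(path: str) -> ParagraphFile:
    """Split 'PMCID-serial-section-paragraph.ext' into its fields.

    Assumption: PMCID and serial contain no hyphen; the section name may.
    """
    p = PureWindowsPath(path)  # accepts both '\\' and '/' separators
    # The first two hyphens delimit PMCID and serial.
    pmcid, serial, rest = p.stem.split("-", 2)
    # The last hyphen delimits the paragraph field.
    section, paragraph = rest.rsplit("-", 1)
    return ParagraphFile(pmcid, serial, section, paragraph, p.suffix.lstrip("."))


# Hypothetical file name, for illustration only.
example = parse_name(r"data\1266026-03-Results-p01.txt")
print(example.pmcid, example.serial, example.section, example.paragraph)
```

Sorting the parsed tuples by (pmcid, serial, paragraph) should then restore the reading order of each paper, provided the serial and paragraph fields are zero-padded as in the p01, p02 examples.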


Articles annotated
==================

PubMed ID PubMed Central ID
--------- -----------------
16202134  1266026
16356174  1334229
16403224  1360090
16426447  1373649
16879751  1557864
16982006  1601966
16879389  1619718
18257912  2275286
18433509  2386495
21247423  3034663

