Sushil Tripathi, Steven Vercruysse, Konika Chawla, Karen R Christie, Judith A Blake, Rachael P Huntley, Sandra Orchard, Henning Hermjakob, Liv Thommesen, Astrid Lægreid, Martin Kuiper.
Abstract
A large gap remains between the amount of knowledge in the scientific literature and the fraction that gets curated into standardized databases, despite many curation initiatives. Yet the availability of comprehensive knowledge in databases is crucial for exploiting existing background knowledge, both for designing follow-up experiments and for interpreting new experimental data. Structured resources also underpin the computational integration and modeling of regulatory pathways, which further aids our understanding of regulatory dynamics. We argue that cooperation between the scientific community and professional curators can increase the capacity to capture precise knowledge from the literature. We demonstrate this with a project in which we mobilize biological domain experts to curate large numbers of DNA-binding transcription factors, and show that they, although new to the field of curation, can make valuable contributions by harvesting reported knowledge from scientific papers. Such community curation can enhance the scientific epistemic process.
Database URL: http://www.tfcheckpoint.org
Year: 2016 PMID: 27270715 PMCID: PMC4911790 DOI: 10.1093/database/baw088
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Overview of resources for mammalian transcription factors
| Resources | Description | Entries | URL/PMID |
|---|---|---|---|
| AnimalTFDB | Animal transcription factor database | 1682 | |
| CIS-BP | Determination and inference of eukaryotic transcription factor sequence specificity | 1017 | |
| DBD | Database of predicted transcription factors in completely sequenced genomes | 1395 | |
| footprintDB | Database of transcription factors with annotated cis elements and binding interfaces | 2422 | |
| GO database | Community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies | 1121 | |
| HOCOMOCO | Comprehensive collection of human transcription factor binding sites models | 601 | |
| HTRIdb | Repository of experimentally verified interactions among human TFs and their respective target genes | 284 | |
| IntAct | Molecular interaction database populated by data either curated from the literature or from direct data depositions | 607 | |
| JASPAR | Matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species | 202 | |
| PAZAR | Transcription factor and regulatory sequence annotation. Unites independently created and maintained data collections | 708 | |
| TcoF-DB | Human transcription co-factors and transcription factor interacting proteins. | 1864 | |
| TFCat | Mouse and human TFs based on a reliable core collection of annotations obtained by expert review of the scientific literature database | 1052 | |
| TFcheckpoint | Curated compendium of specific DNA-binding RNA polymerase II transcription factors | 3480 | |
| TFClass | Classification of human transcription factors and their rodent orthologs | 1558 | |
| TFe | Compendium of mini review articles on transcription factors (TFs) that is founded on the principles of open access and collaboration | 803 | |
| TRANSFAC | Transcription factors, their binding sites, nucleotide distribution matrices and regulated genes | 1040 | |
| TRED | Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies | 36 | |
| Jolma et al. | DNA-binding specificities of human transcription factors | 411 | PMID: 23332764 |
| Messina et al. | ORFeome-based analysis of human transcription factor genes | 1770 | PMID: 15489324 |
| Ravasi et al. | Atlas of combinatorial transcriptional regulation in mouse and man; physical interactions among the majority of human and mouse DNA-binding transcription factors | 1967 | PMID: 20211142 |
| TFCONES | Vertebrate transcription factor-encoding genes and their associated conserved non-coding elements. Content integrated with AnimalTFDB. | 1962 | PMID: 18045502 |
| Vaquerizas et al. | Census of human transcription factors: function, expression and evolution; analysis of 1391 manually curated sequence-specific DNA-binding transcription factors, their functions, genomic organization and evolutionary conservation | 1909 | PMID: 19274049 |
The contents of the individual resources are summarized with a brief description and the number of entries. Links to the resources are provided as a URL or PMID.
Numbers obtained from these resources on 3 March 2016.
Entries annotated with GO term GO:0003700 or a more specific term.
Interactions where A is a protein annotated to GO:0000981 (or a child term thereof) and B is a gene.
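The selection rule in the two notes above ("a GO term or a more specific term") can be illustrated with a minimal Python sketch. It is not the procedure used by the authors or by any of the listed resources. The toy hierarchy, the placeholder term IDs (`GO:TOY_CHILD`, `GO:TOY_OTHER`), the entry names and both function names are assumptions made for illustration; only the identifiers GO:0003700 and GO:0000981 appear in the text, and the parent-child link between them is assumed here.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: child term -> set of parent terms (is_a links).
# Only GO:0003700 and GO:0000981 are taken from the text; the link between
# them and the "TOY" terms are assumptions for illustration.
PARENTS_OF = {
    "GO:0000981": {"GO:0003700"},
    "GO:TOY_CHILD": {"GO:0000981"},
    "GO:TOY_OTHER": set(),
}

# Hypothetical annotations: entry name -> set of GO terms it is annotated with.
ANNOTATIONS = {
    "ENTRY_A": {"GO:0003700"},
    "ENTRY_B": {"GO:TOY_CHILD"},
    "ENTRY_C": {"GO:TOY_OTHER"},
}


def term_and_descendants(term, parents_of):
    """Return the term itself plus every more specific term below it."""
    # Invert the child -> parents map so the hierarchy can be walked downwards.
    children_of = defaultdict(set)
    for child, parents in parents_of.items():
        for parent in parents:
            children_of[parent].add(child)

    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children_of[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def entries_annotated_to(term, annotations, parents_of):
    """List entries annotated with the term or any of its descendants."""
    wanted = term_and_descendants(term, parents_of)
    return sorted(name for name, terms in annotations.items() if terms & wanted)


print(entries_annotated_to("GO:0003700", ANNOTATIONS, PARENTS_OF))
print(entries_annotated_to("GO:0000981", ANNOTATIONS, PARENTS_OF))
```

In this toy data, querying the broader term also picks up entries annotated only to its more specific descendants, which is why counts for GO:0003700 would be at least as large as counts for GO:0000981.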
Figure 1. Contents of TF resources. For each TF database resource two bars are shown: the total number of unique entries is indicated by blue bars, the dark blue part of which indicates specific DNA-binding transcription factors (DbTFs) for which we have found literature evidence (3). The green bars below each blue bar represent the numbers of DbTFs present within that resource that are corroborated in the GO database by annotation with experimental evidence to the GO term GO:0000981, or child terms thereof. Dark green: DbTFs documented in the GO database at the start of our project in March 2013 (205); light green: new entries after March 2013 (328). Numbers in parentheses give the cumulative total in TFcheckpoint and refer to human, mouse or rat DbTFs, with orthologues counted only once. Of the 328 new experimentally documented DbTF annotations (light green), 301 were uniquely provided by our current project. The GO database version referenced here, which includes our new annotations, is dated 6 December 2014. Data versions for the other sources are given at www.tfcheckpoint.org.
Figure 2. Overview of the curation status of DbTFs. In the pie chart, blue represents the total number of candidate TFs, and the dark blue part indicates DbTFs with a literature reference (3). Note that only 1700–1800 of the candidate TFs (blue) are considered DbTFs (1, 2). In the bar to the right of the pie chart, green represents the number of curated DbTFs in the GO database (dark green: before March 2013; light green: after March 2013, when we started our community curation efforts). Orange indicates the number of DbTFs with a literature reference (3) that still need to be curated.