
The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes.

Michael Y Galperin¹, Xosé M Fernández-Suárez², Daniel J Rigden³.

Abstract

This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein-protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as 'breakthrough' contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the 'golden set' of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/. Published by Oxford University Press on behalf of Nucleic Acids Research 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.


Year: 2017 PMID: 28053160 PMCID: PMC5210597 DOI: 10.1093/nar/gkw1188

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

NEW AND UPDATED DATABASES

The current 2017 Nucleic Acids Research Database Issue is the 24th annual collection of bioinformatic databases on various areas of molecular biology. It includes 152 papers, of which 54 describe newly created databases (Table 1), 82 papers provide updates on the databases that have been previously described in NAR and 16 contain updates on the databases whose descriptions have previously been published elsewhere (Table 2).

Table 1.

Descriptions of new online databases in the 2017 NAR Database issue

Database name	URL	Brief description^a
3DSNP	http://biotech.bmi.ac.cn/3dsnp/	Human noncoding SNPs: interactions with genes and other SNPs
AAgAtlas	http://aagatlas.ncpsb.org	Human AutoAntigen database
ADPriboDB	http://adpribodb.leunglab.org/	ADP-ribosylated proteins and sites
antiSMASH	http://antismash-db.secondarymetabolites.org	antibiotics and Secondary Metabolite Analysis SHell
AraPheno	https://arapheno.1001genomes.org	Phenotypic data for Arabidopsis thaliana
ccNET	http://structuralbiology.cau.edu.cn/gossypium/	Co-expression networks for diploid and polyploid Gossypium
CeNDR	http://www.elegansvariation.org	C. elegans Natural Diversity Resource
CGDB	http://cgdb.biocuckoo.org/	Circadian Gene database
CistromeDB	http://cistrome.org/db	ChIP-Seq and DNase-Seq data in human and mouse
Coexpedia	http://www.coexpedia.org	Gene co-expression data mapped to medical subject headings (MeSH).
dbSAP	http://www.megabionet.org/dbSAP	Single Amino acid Polymorphisms: SNP-derived variation in human proteins
denovo-db	http://denovo-db.gs.washington.edu	Human de novo gene variants detected by parent-child sequencing
DrugCentral	http://drugcentral.org	Active ingredients of approved pharmaceutical products, indications and mode of action
EURISCO	http://eurisco.ecpgr.org/	European catalogue for plant genetic resources
ExAC browser	http://exac.broadinstitute.org	Exome Aggregation Consortium sequence data
Exposome-Explorer	http://exposome-explorer.iarc.fr	Biomarkers of exposure to disease risk factors
FAIRDOMHub	https://fairdomhub.org/	Findable, Accessible, Interoperable and Reusable Data, Operating procedures and Models
FuzDB	http://protdyn-database.org	Database of fuzzy protein complexes
GenomeCRISPR	http://genomecrispr.org	High-throughput screening using the CRISPR/Cas-9 system
GTRD	http://gtrd.biouml.org	Gene Transcription Regulation Database
HieranoiDB	http://hieranoidb.sbc.su.se/	Ortholog groups and trees inferred by Hieranoid2 software
IGSR	http://www.1000genomes.org/data-portal	International Genome Sample Resource
IMG/VR	https://img.jgi.doe.gov/vr/	DOE Joint Genome Institute Viral Resource
JET2 Viewer	http://www.lcqb.upmc.fr/jet2_viewer/	Joint Evolutionary Trees: protein-protein interaction patches in known structures
jPOSTrepo	https://repository.jpostdb.org/	Japanese ProteOme STandard repository
KERIS	http://igenomed.org/KERIS	Kaleidoscope of gEne Responses to Inflammation among Species
LinkProt	http://linkprot.cent.uw.edu.pl/	Topologically complex protein structures
LNCediting	http://bioinfo.life.hust.edu.cn/LNCediting/	RNA editing sites in lncRNAs from human, monkey, mouse and fly
MEGaRes	https://meg.colostate.edu/MEGaRes/	Mechanisms of antimicrobial resistance
Membranome	http://membranome.org/	A database of single-pass membrane proteins
MethSMRT	http://sysbio.sysu.edu.cn/methsmrt	DNA methylation data from Single Molecule, Real-Time sequencing
mirDNMR	https://www.wzgenomics.cn/mirdnmr/	Background de novo mutation rates in human genes
Monarch Initiative	http://monarchinitiative.org	Human disease-related genotypes and phenotypes in model organisms
MRPrimerV	http://infolab.dgist.ac.kr/MRPrimerV	PCR primer pairs for detecting RNA virus-mediated infectious diseases
mutLBSgeneDB	http://www.zhaobioinfo.org/mutLBSgeneDB/	Mutations in Ligand Binding Sites gene DataBase
NSDNA	http://www.bio-bigdata.net/nsdna/	Nervous System Disease NcRNA Atlas
Ontobee	http://www.ontobee.org/	Ontology database server of OBO Foundry
Open Targets	https://targetvalidation.org	Target validation platform: links between potential drug targets and diseases
pathDIP	http://ophid.utoronto.ca/pathDIP	Pathway data integration and analysis portal
PathoYeastract	http://pathoyeastract.org/index.php	Transcription regulation in pathogenic yeasts
PceRBase	http://bis.zju.edu.cn/pcernadb/index.jsp	Plant competing endogenous RNAs
Pharos	https://pharos.nih.gov/idg/index	Data on unstudied and understudied drug targets
PLaMoM	http://www.byanbioinfo.org/plamom/	Plant Mobile Macromolecules: Extracellular siRNAs, microRNAs, mRNAs and proteins in plants
Plant Reactome	http://plantreactome.gramene.org/	Plant metabolic, regulatory and signaling pathways
PMDBase	http://www.sesame-bioinfo.org/PMDBase	Plant microsatellites and marker development
POSTAR	http://POSTAR.ncrnalab.org	Post-transcriptional regulation by RNA-binding proteins
proGenomes	http://van.embl.de/progene/	Consistently annotated bacterial and archaeal genomes
Proteome-pI	http://isoelectricpointdb.org/	Pre-computed isoelectric points for >5000 proteomes
REDIportal	http://srv00.recas.ba.infn.it/atlas/	A-to-I RNA editing events in human
RNALocate	http://www.rna-society.org/rnalocate/	RNA localization in the cell
SNP2TFBS	http://ccg.vital-it.ch/snp2tfbs/	Regulatory SNPs affecting predicted transcription factor binding sites
SoyNet	http://www.inetbio.org/soynet/	Co-functional networks for soy bean Glycine max
TFBSbank	http://tfbsbank.co.uk/	Transcription Factor Binding Site profiles deduced from ChIP-seq or ChIP-chip data
TSTMP	http://tstmp.enzim.ttk.mta.hu	Target Selection database for human TransMembrane Proteins
Uniclust	http://uniclust.mmseqs.com/	Clustered protein sequences and multiple sequence alignments
WERAM	http://weram.biocuckoo.org/	Writers, Erasers and Readers of histone Acetylation and Methylation

^a At the time of this writing, references to the databases featured in this issue have not yet been finalized; please see the Database Issue Table of Contents.

Table 2.

Updated descriptions of databases most recently published elsewhere

Database	URL	Brief description^a
CARD	http://arpcard.mcmaster.ca	Comprehensive Antibiotic Research Database
dbDEMC	http://www.picb.ac.cn/dbDEMC	Differentially expressed miRNAs in human cancers
DisGeNET	http://www.disgenet.org/	Genetic determinants of human diseases
ECOD	http://prodata.swmed.edu/ecod/	Evolutionary Classification Of protein Domains
GETPrime	http://bbcftools.epfl.ch/getprime	Gene- or transcript-specific primers for qPCR
HIPPIE	http://cbdm.uni-mainz.de/hippie/	Human Integrated Protein–Protein Interaction rEference
HipSci	http://www.hipsci.org/	Human induced pluripotent Stem cells initiative
IMG-ABC	https://img.jgi.doe.gov/abc-public/	Integrated Microbial Genomes—Atlas of Biosynthetic gene Clusters
Influenza Research Database	http://www.fludb.org	All data on influenza: sequences, strains, alignments, trees, variation, epitopes, classification and surveillance
LincSNP	http://bioinfo.hrbmu.edu.cn/LincSNP	Association of human lncRNAs with disease-related SNPs
MalaCards	http://www.malacards.org/	Human maladies and their annotations, organized into ‘disease cards’
pVOGs	http://research.engineering.uiowa.edu/kristensenlab/VOG	Prokaryotic Virus Orthologous Groups of proteins
ProteomeXchange	http://www.proteomexchange.org/	Proteomics resources portal
RAID	http://www.rna-society.org/raid	Human RNA–RNA and RNA–protein interactions
SZGR	https://bioinfo.uth.edu/SZGR/	SchiZophrenia Gene Resource
WDCM	http://www.wdcm.org	World Data Center of Microorganisms collections
XTalkDB	http://www.xtalkdb.org	Crosstalk among signaling pathways

^a For full references to the databases featured in this issue, please see the Table of Contents.

As previously, the issue is organized according to subject categories covering (i) nucleic acid sequence and structure, transcriptional regulation; (ii) protein sequence and structure; (iii) metabolic and signaling pathways, protein–protein interactions; (iv) genomics of viruses, bacteria, protozoa and fungi; (v) genomics of human and model organisms; (vi) human diseases and drugs; (vii) plants and (viii) other topics, such as proteomics databases. Unsurprisingly, many resources straddle multiple categories and defy easy classification, so we encourage readers to browse the whole issue rather than limiting themselves to a single section. The databases listed in the Nucleic Acids Research online Molecular Biology Database Collection, which is available at http://www.oxfordjournals.org/nar/database/a/, are split into the same 15 categories and 41 subcategories as before. In this year's issue, the usual annual survey of the progress in databases held by the U.S. National Center for Biotechnology Information (NCBI) is supplemented by a report from the Beijing Institute of Genomics, Chinese Academy of Sciences, on their BIG Data Center, which hosts a variety of genomic databases. [Because of the high number of references to the databases in the NAR ‘golden set’ (Table 3), we could not properly cite most of the papers included in the current Database Issue. Please refer to this issue's Table of Contents.]

Table 3.

The ‘golden set’ of the most popular databases featured in multiple NAR issues^a

No.^b	Database name	Current URL	Brief description	NAR publications, reference^c
Annual updates
1	DDBJ	http://www.ddbj.nig.ac.jp	All known nucleotide and protein sequences	2000, 2002–2017 (10)
2	ENA	http://www.ebi.ac.uk/ena	All known nucleotide and protein sequences	1986, 1990, 1997–2017 (11)
3	GenBank	https://www.ncbi.nlm.nih.gov/genbank/	All known nucleotide and protein sequences	1986, 1988, 1990–1994, 1996–2000, 2002–2017 (12)
27	Ensembl	http://www.ensembl.org/	Annotated information on eukaryotic genomes	2002-2017 (13)
87	Mouse Genome Database	http://www.informatics.jax.org	Mouse genome database	1997-2017 (14)
316	UCSC Genome Browser	http://genome.ucsc.edu/	A universal genome viewing and analysis platform	2006-2017 (15)
318	UniProt	http://www.uniprot.org	A universal database of protein sequences (includes Swiss-Prot and TrEMBL)	1991-1994, 1996–2000, 2003, 2004–2010, 2012–2015, 2017 (16)
Regular updates
338	ArrayExpress	http://www.ebi.ac.uk/arrayexpress	Array-based gene expression data	2003, 2005, 2007, 2009, 2011, 2013, 2015 (17)
420	BioCyc^d	http://biocyc.org/	Pathway information for sequenced genomes	2005, 2008, 2010, 2012, 2014, 2016 (18)
800	BioGRID	http://www.thebiogrid.org	Genetic and physical interactions in yeast, worm and fly	2006, 2008, 2011, 2013, 2015, 2017 (19)
421	BRENDA	http://www.brenda-enzymes.info	Enzyme names and biochemical properties	2002, 2004, 2007, 2009, 2011, 2013, 2015, 2017 (20)
645	CGD	http://www.candidagenome.org/	Candida Genome Database	2005, 2007, 2010, 2012, 2014, 2016 (21)
1531	CanSAR	http://cansar.icr.ac.uk	Cancer research and drug discovery resource	2012, 2014, 2016 (22)
258	CATH	http://www.cathdb.info	Protein domain structure database	1999, 2000, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017 (23)
1211	CAZy	http://www.cazy.org	Carbohydrate-Active enZymes database	2009, 2014 (24)
204	CDD	http://www.ncbi.nlm.nih.gov/cdd	Conserved Domain Database	2002, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017 (25)
646	ChEBI	http://www.ebi.ac.uk/chebi	Chemical Entities of Biological Interest	2008, 2013, 2016 (26)
1548	ChEMBL	https://www.ebi.ac.uk/chembldb	Interaction of drugs and compounds with their targets	2012, 2014, 2017 (27)
803	ChimerDB	http://ercsb.ewha.ac.kr/fusiongene	Chromosome translocations and gene fusions	2006, 2010, 2017 (28)
7	COG	http://www.ncbi.nlm.nih.gov/COG	Clusters of Orthologous Groups of proteins	2000, 2001, 2015 (29)
1188	Comparative Toxicogenomics Database	http://ctdbase.org	A knowledgebase for curated chemical-gene-disease networks	2009, 2011, 2013, 2017 (30)
651	COSMIC	http://cancer.sanger.ac.uk	Catalogue of Somatic Mutations in Cancer	2010, 2011, 2015, 2017 (31)
68	CyanoBase	http://genome.microbedb.jp/cyanobase	Cyanobacterial genomes	1998, 1999, 2000, 2010, 2014, 2017 (32)
885	dbPTM	http://dbPTM.mbc.nctu.edu.tw/	Post-translational modification of proteins	2006, 2013, 2014, 2016 (33)
591	DBTSS	http://dbtss.hgc.jp/	Database of transcriptional start sites	2002, 2004, 2006, 2008, 2010, 2012, 2015 (34)
445	DEG	http://www.essentialgene.org	Database of essential genes	2004, 2009, 2014 (35)
446	DictyBase	http://dictybase.org	Model organism database for Dictyostelium discoideum	2004, 2006, 2009, 2011, 2013 (36)
811	DrugBank	http://www.drugbank.ca/	Drug and drug target database	2006, 2008, 2011, 2014 (37)
108	EcoCyc	http://ecocyc.org/	E. coli K12 genes, metabolic pathways, transporters, and gene regulation	1996, 1997, 1998, 2000, 2002, 2005, 2009, 2011, 2013 (38)
1068	eggNOG	http://eggnog.embl.de/	Evolutionary genealogy of genes: Non-supervised Orthologous Groups	2008, 2010, 2012, 2014, 2016 (39)
1347	ELM	http://elm.eu.org/	Eukaryotic Linear Motif: functional sites in eukaryotic proteins	2003, 2008, 2010, 2011, 2012, 2014, 2016 (40)
812	EMAGE	http://www.emouseatlas.org/emage/	e-Mouse Atlas of Gene Expression	2006, 2008, 2010, 2014 (41)
985	ENCODE project at UCSC	http://genome.ucsc.edu/ENCODE	Encyclopedia of DNA Elements, functional elements in human genome	2007, 2010–2013 (42)
33	EPD	http://epd.vital-it.ch	Eukaryotic Promoter Database	1998, 1999, 2000, 2002, 2004, 2006, 2013, 2015, 2017 (43)
91, 969, 1219	EuPathDB	http://eupathdb.org/	Unified genome databases on eukaryotic pathogens (includes PlasmoDB, ToxoDB, ApiDB, TrichDB, TriTrypDB, GiardiaDB, etc.)	2002, 2003, 2007–2013, 2017 (44)
1294	Expression Atlas	http://www.ebi.ac.uk/gxa/	Gene expression patterns deduced from microarray and RNA-seq data	2010, 2012, 2014, 2016 (45)
465	FANTOM	http://fantom.gsc.riken.jp/	Functional annotation of mouse full-length cDNA clones	2002, 2011, 2016, 2017 (46)
1020	FINDBase	http://www.findbase.org	Frequencies of INherited Disorders	2007, 2011, 2014, 2017 (47)
71	FlyBase	http://flybase.org/	Drosophila sequences and genomic information	1994, 1996–1999, 2002, 2003, 2005–2009, 2012–2017 (48)
817	FlyRNAi	http://flyrnai.org/	Genome-wide RNAi analysis in Drosophila	2006, 2012, 2017 (49)
472	Gene3D	http://gene3d.biochem.ucl.ac.uk	Structural domain assignments for protein sequences	2003, 2005, 2006, 2008, 2010, 2012, 2014, 2016 (50)
73	Genenames	http://www.genenames.org/	The HGNC human gene nomenclature database	2008, 2011, 2013, 2015, 2017 (51)
989	GenomeRNAi	http://www.genomernai.org	RNA interference data for human and Drosophila	2007, 2010, 2013, 2017 (52)
603	GEO	http://www.ncbi.nlm.nih.gov/geo/	NCBI's Gene Expression Omnibus	2005, 2007, 2009, 2011, 2013 (53)
487	GO	http://www.geneontology.org	Gene Ontology Database	2004, 2006, 2008, 2010, 2012, 2013, 2015, 2017 (54)
389	GOA	http://www.ebi.ac.uk/GOA	Gene Ontology annotations for proteins in UniProt	2004, 2009, 2015 (55)
75	GOLD	https://gold.jgi.doe.gov/	Genomes online database: completed and ongoing genome projects	2001, 2006, 2008, 2010, 2012, 2015, 2017 (56)
166	GPCRdb	http://gpcrdb.org/	Data and tools for studying G protein-coupled receptors	1998, 2001, 2003, 2011, 2014, 2016 (57)
607	Gramene	http://www.gramene.org	Comparative genomics of crops and model plant species	2002, 2006, 2008, 2010, 2013, 2016 (58)
15	GXD	http://www.informatics.jax.org/expression.shtml	Mouse Gene Expression Database	1999-2001, 2004, 2007, 2011, 2014, 2017 (59)
1210	HAMAP	http://hamap.expasy.org/	High-quality Automated and Manual Annotation of Proteins	2009, 2013, 2015 (60)
991	HMDB	http://www.hmdb.ca	Human Metabolome Database	2007, 2009, 2013 (61)
779	IEDB	http://www.iedb.org/	Immune Epitope Database	2008, 2012, 2015 (62)
1089	IMG/M	http://img.jgi.doe.gov/m	JGI's Integrated Microbial Genomics and Metagenomics	2006, 2008, 2012, 2014, 2017 (63)
172	IMGT	http://www.imgt.org	International ImMunoGeneTics database.	1997-2001, 2003–2006, 2008–2010, 2015 (64)
690	InParanoid	http://InParanoid.sbc.su.se	Orthologous relationships between eukaryotic proteomes	2005, 2008, 2010, 2015 (65)
507	IntAct	http://www.ebi.ac.uk/intact/	Protein–Protein INTerACTion data	2004, 2007, 2010, 2012, 2014 (66)
207	InterPro	http://www.ebi.ac.uk/interpro	Integrated resource of protein families, domains and functional sites	2001, 2003, 2005, 2007, 2009, 2012, 2015, 2017 (67)
367	IPD	http://www.ebi.ac.uk/ipd	Immuno Polymorphism database (includes IMGT/HLA)	2001, 2003, 2005, 2009, 2010, 2011, 2013, 2015 (68)
516	JASPAR	http://jaspar.genereg.net/	PSSMs for transcription factor DNA-binding sites	2004, 2006, 2008, 2010, 2014, 2016 (69)
112	KEGG	http://www.genome.ad.jp/kegg	Kyoto Encyclopedia of Genes and Genomes: genes, proteins, pathways	1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012–2014, 2016, 2017 (70)
177	MEROPS	http://merops.sanger.ac.uk/	Database of proteases (peptidases)	1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016 (71)
114	MetaCyc	http://metacyc.org/	Metabolic pathways and enzymes in various organisms	2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016 (18)
529	miRBase	http://www.mirbase.org/	MicroRNA sequences, names and predicted targets in animals	2006, 2008, 2011, 2014 (72)
1098	miRGator	http://mirgator.kobic.re.kr	MicroRNA expression profiles and mRNA targets	2008, 2011, 2013 (73)
994	miRGen	http://www.microrna.gr/mirgen	MicroRNA promoters and transcription start sites	2007, 2010, 2016 (74)
1423	miRTarBase	http://miRTarBase.mbc.nctu.edu.tw/	Experimentally validated microRNA–target interactions	2011, 2013, 2016 (75)
270	MMDB	http://www.ncbi.nlm.nih.gov/Structure	Molecular Modeling Database of protein structures	1999, 2000, 2002, 2003, 2007, 2012, 2014 (76)
840	MODOMICS	http://genesilico.pl/modomics/	RNA modification pathways	2006, 2009, 2013 (77)
152	Mouse Tumor Biology Database	http://tumor.informatics.jax.org/mtbwi/	Mouse as a model system of human cancers	1999, 2000, 2007, 2015 (78)
1453	neXtProt	https://www.nextprot.org/	A database of human proteins	2012, 2015, 2017 (79)
705	NONCODE	http://noncode.org/	A database of noncoding RNAs	2005, 2008, 2012, 2014, 2016 (80)
143	OMIM	http://www.omim.org	Online Mendelian inheritance in man: A catalog of human genetic and genomic disorders	1994, 2002, 2005, 2009, 2015 (81)
1108	OrthoDB	http://www.orthodb.org	A hierarchical catalog of orthologous proteins	2008, 2011, 2013, 2015, 2017 (82)
552	PANTHER	http://www.pantherdb.org	Protein sequence evolution mapped to functions and pathways	2003, 2005, 2007, 2010, 2013, 2016, 2017 (83)
1000	PATRIC	http://www.patricbrc.org	PathoSystems Resource Integration Center	2007, 2014, 2017 (84)
276	PDB	http://rcsb.org/pdb	Protein DataBank: All biological macromolecular structures	2000-2002, 2004–2006, 2011, 2013, 2015, 2017 (85)
456	PDBe	http://www.ebi.ac.uk/pdbe/	Protein Databank in Europe	2010-2012, 2014, 2016 (86)
278	PDBsum	http://www.ebi.ac.uk/pdbsum	Summaries and analyses of PDB structures	2001, 2005, 2009, 2014 (87)
210	Pfam	http://pfam.xfam.org	Protein families: Multiple sequence alignments and profile hidden Markov models of protein domains	1998-2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016 (8)
852	PHI-base	http://www4.rothamsted.bbsrc.ac.uk/phibase/	Genes affecting fungal pathogen–host interactions	2006, 2008, 2015, 2017 (88)
194	PIR	http://pir.georgetown.edu/	Protein Information Resource, part of UniProt	1986, 1988, 1991–1994, 1996–2004 (89)
857	PRIDE	http://www.ebi.ac.uk/pride/	Proteomics peptide identification database	2006, 2008, 2013, 2016 (90)
212	PRINTS	http://www.bioinf.man.ac.uk/dbbrowser/PRINTS	Protein fingerprints, conserved motifs used to characterise a protein family	1994, 1996–2000, 2002, 2003 (91)
215	Prosite	http://www.expasy.org/prosite	Biologically-significant protein patterns and profiles	1991-1994, 1996, 1997, 1999, 2002, 2004, 2006, 2008, 2010, 2013 (92)
735	PubChem	http://pubchem.ncbi.nlm.nih.gov/	Structures and biological activities of small organic molecules	2009, 2010, 2014, 2016, 2017 (93)
93	RGD	http://rgd.mcw.edu/	Rat Genome Database	2002, 2005, 2007, 2009, 2015 (94)
243	RDP	http://rdp.cme.msu.edu	Ribosomal Database Project: Bacterial and archaeal 16S rRNA and fungal 28S rRNA sequences	1991-1994, 1996, 1997, 1999–2001, 2003, 2005, 2007, 2009, 2014 (95)
612	Reactome	http://www.reactome.org	A database of metabolic and signaling pathways	2005, 2009, 2011, 2014, 2016 (96)
224	REBASE	http://rebase.neb.com/rebase/	Restriction enzyme database	1993, 1994, 1996–2001, 2003, 2005, 2007, 2010, 2015 (97)
391	RefSeq	https://www.ncbi.nlm.nih.gov/refseq/	NCBI Reference Sequence Database	2000, 2001, 2005, 2007, 2009, 2012, 2014–2016 (98)
382	Rfam	http://rfam.xfam.org	RNA families with multiple sequence alignments	2003, 2005, 2009, 2011, 2013, 2015 (99)
282	SCOP	http://scop.mrc-lmb.cam.ac.uk/	Structural Classification Of Proteins	1997, 1999, 2000, 2002, 2004, 2008, 2014 (100)
352	SGD	http://www.yeastgenome.org	Saccharomyces Genome Database	1998, 1999, 2002–2008, 2010, 2012, 2014, 2016 (101)
1183	SILVA	http://www.arb-silva.de/	Aligned small- and large subunit rRNA sequences	2007, 2013, 2014 (102)
867	SIMAP	http://mips.gsf.de/simap/	Similarity Matrix of Proteins	2006, 2008, 2010, 2014 (103)
218	SMART	http://smart.embl-heidelberg.de	Simple Modular Architecture Research Tool: signalling, extracellular and chromatin-associated protein domains	1999, 2000, 2002, 2004, 2006, 2009, 2012, 2015 (104)
1134	STITCH	http://stitch-db.org/	Search Tool for Interactions of Chemicals	2008, 2010, 2012, 2014, 2016 (105)
582	STRING	http://string.embl.de/	Predicted functional associations between proteins	2000, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017 (106)
285	SUPERFAMILY	http://supfam.org	Genome-wide identification of protein domains of known structure	2002, 2004, 2007, 2009, 2011, 2015 (107)
585	SWISS-MODEL	http://swissmodel.expasy.org/	3D models for proteins of unknown structure	2003, 2004, 2006, 2009, 2014, 2017 (108)
97	TAIR	http://www.arabidopsis.org/	The Arabidopsis information resource	2001, 2003, 2008, 2012 (109)
1264	TarBase	http://microrna.gr/tarbase	Database of experimentally supported microRNA targets	2006, 2009, 2012, 2015 (110)
790	TCDB	http://www.tcdb.org/	Transporter protein classification database	2006, 2009, 2014, 2016 (111)
1452	UCSC Cancer Genomics Browser	https://genome-cancer.ucsc.edu/	Visualization of cancer genomic datasets	2011, 2013, 2015 (15)
1031	VectorBase	https://www.vectorbase.org/	Invertebrate vectors of human pathogens	2007, 2009, 2012, 2015 (112)
51	WormBase	http://www.wormbase.org	Community portal on all aspects of C. elegans biology	2001, 2003–2008, 2010, 2012, 2014, 2016 (113)
1151	XenBase	http://www.xenbase.org	Xenopus frog database	2008, 2010, 2013, 2015 (114)
792	YEASTRACT	http://www.yeastract.com	Transcriptional regulation in Saccharomyces cerevisiae	2006, 2008, 2011, 2014 (115)
101	ZFIN	http://zfin.org/	Zebrafish information network	2001, 2003, 2011, 2013 (116)

^b The database entry in the NAR online Database Collection. For example, the summary for ArrayExpress (no. 338) is available at http://www.oxfordjournals.org/nar/database/summary/338.

^c The reference to the most recent database description that is available in PubMed (excludes the current issue).

^d This database has switched to subscription-based service and is no longer available without registration.

^a This list includes databases that have been featured in the NAR Database Issue multiple times as separate papers. This listing omits many NCBI databases whose updated descriptions are published in annual NCBI overview papers.

In the ‘Nucleic acid databases’ section, several resources emphasize the complexity of regulatory processes. Examples include SNP2TFBS, a database of SNPs in predicted transcription factor binding sites (TFBSs); LincSNP, a database that links SNPs to long noncoding RNAs and their TFBSs; LNCediting, a database of RNA editing in lncRNAs; and POSTAR, a resource on post-transcriptional regulation by RNA-binding proteins.

Major protein sequence databases include updates from UniProt and InterPro, the latter encompassing ever more component databases, most recently the NCBI's Conserved Domain Database (CDD), which is also described in a separate paper in this issue, and the Structure–Function Linkage Database (SFLD), which has been featured in NAR previously (1). Accordingly, as described in the UniProt paper, InterPro now serves as a major source of protein functional annotation for the UniProt entries. Updates on primary protein structure databases include papers on the RCSB Protein Data Bank (PDB) and PDBj. The latter reports on the integration of previously separate visualizations, allowing a single tool to display macromolecular structures not just from the PDB, but also from EMDB and SASBDB, containing structural information obtained, respectively, from cryo-EM and small angle solution scattering experiments.
PDBj also now allows searching the same three databases by shape similarity. In the area of modelled structures, the hugely popular Swiss-Model repository reports new features and policies, including a weekly update of modelled proteomes of 12 ‘core species’ to recognize the possible emergence of better templates in weekly PDB releases. Reacting quickly to the ever-expanding PDB is also a preoccupation of two protein structural domain databases, CATH and ECOD, reporting in update papers here. The CATH paper reports a new, daily-generated automatic supplement CATH-B, as well as developments of its functional families, or FunFams, whose value in sequence annotation has become clear in competitive blind CAFA tests. The ECOD paper describes a weekly release cycle as well as new ways to search the database and convenient means to superimpose and visualize the search results. Class-specific protein databases include updates from RepeatsDB and DisProt and an interesting new arrival, FuzDB, cataloguing protein complexes whose components remain ‘fuzzy’, or locally disordered, even when interacting with other proteins. Another new database, LinkProt, features protein structures with topologically complex shapes.

Metabolic and signaling pathway databases include updates on major resources KEGG and BioGRID. The update from the BRENDA database of enzymes, one of the most venerable in the collection, dating as it does from 1987, describes new means of visualization: pathway maps and metabolic overviews. An interesting new arrival, XTalkDB, focuses specifically on cross-talk between signaling pathways.

Microbe-related databases in the following section include heavily-used resources for influenza (Influenza Research Database) and Escherichia coli (EcoCyc). Other databases focus significantly on pathogens, or on antimicrobial resistance.
Eukaryotic pathogens are strongly represented by EuPathDB, PHI-base and a newcomer, PathoYeastract, focusing on transcription regulation in pathogenic yeasts.

In the section covering genomics and comparative genomics, important updates from Ensembl, FlyBase, STRING and the UCSC Genome Browser are included. Easy access to orthologous genes across species is provided by the well-established OrthoDB and the new arrival HieranoiDB, which offers beautifully presented trees of orthologues. Another important cross-species analysis is represented by the Monarch Initiative, highlighted by NAR reviewers and editors as a ‘Breakthrough’ article (2). Working with the Human Phenotype Ontology, also reporting an update in this issue, the Monarch Initiative aims to link mutations in orthologous genes to the similar phenotypes often observed in different species. This ambitious objective requires the careful use and integration of ontologies to precisely describe anatomy, diseases and phenotypes, but the pay-off is an ability to link from human diseases to disease models in various model organisms, maximizing the value of data obtained for any given organism (2).

As ever, this issue covers important databases supporting research in the molecular basis of disease and treatment. Cancer is covered not only by the major resource COSMIC, reporting interesting new coverage of the genetics of drug resistance, but also by updates to ChimerDB, recording chimeric transcripts, dbDEMC, containing information on miRNA expression levels in cancer, and YM500, focusing on small RNA sequences relevant to cancer. Furthermore, the update paper from OGEE, the gene essentiality database, includes an interesting focus on genes that are differentially essential in different cancers. More generally, DisGeNET and Open Targets [another new database designated as a ‘Breakthrough’ paper, (3)] both offer comprehensive resources linking pathogenic gene variants to a variety of other data.
The third database in this issue recognized by the NAR reviewers and editors with the ‘Breakthrough’ designation is denovo-db, a database of mutations that have been found in human subjects but which were missing in both of their parents (4). The database lists ∼32 000 sites in the genome with data obtained from >16 000 patients with some kind of disease and >17 000 control individuals. The majority of disease variants were from individuals with autism and congenital heart disease, with smaller samples coming from schizophrenia, epilepsy, and other neurodevelopmental disorders (4). There is no doubt that this collection will find a variety of uses, from analyzing de novo mutations linked to a particular disease to studying the frequencies of mutations in certain parts of the genome. The mirDNMR database is also a collection of de novo mutations, with a specific focus on the background mutation rates calculated by several statistical approaches (5). Finally, in the genomic variation section, a major new arrival is the ExAC browser, providing access to exome sequences from over 60 000 human genomes. This unprecedented depth of sampling of human genome data has important implications for attempts to predict observed SNP sequence variants as benign or damaging (6).

Plant databases represented here include an update to the popular PlantTFDB, collecting information on plant transcription factors, and SUBA, recording plant subcellular localization data in Arabidopsis. Important new databases here include AraPheno, dealing with phenotypic data for the same model plant, and the intriguing PLaMoM, which covers macromolecules, nucleic acids and proteins, that are mobile over long distances in plants.
Finally, this issue includes descriptions of two important proteomics databases, an update on the widely used ProteomeXchange, dealing with standards and dissemination of proteomics data, and a first paper from one of its members describing the Japanese Proteomics Standards (jPOSTrepo) repository.

UPDATED NAR ONLINE MOLECULAR BIOLOGY DATABASE COLLECTION

This year's update of the NAR online Molecular Biology Database Collection (which is freely available at http://www.oxfordjournals.org/nar/database/a/) involved inclusion of 54 new databases (Table 1) and 16 databases that have been previously described elsewhere and were not part of this Collection (Table 2). In the current update, 18 duplicate entries and 30 obsolete databases have been removed from the Collection, and five new databases have been added to the list. Suggestions for inclusion of additional databases in the NAR Collection should be addressed to Xosé M. Fernández-Suárez at xose.m.fernandez@gmail.com and should include database summaries in plain text, organized in accordance with the http://www.oxfordjournals.org/nar/database/summary/1 template.

LOOKING BACK: WHAT HAS CHANGED, WHAT HAS NOT

The 2006 editorial by MYG (7) included the following paragraph: ‘After 12 years of database issues and 8 years of the accompanying web supplement, it was interesting to check if they are really having an impact. In other words, how many people really care about them and use them? To evaluate the impact of the NAR database issues, I have used a tool that, despite all complaints and caveats, is commonly utilized for evaluating research productivity, namely the Science Citation Index® produced by the Institute for Scientific Information (ISI). If databases are put on the web for the benefit of the research community, the frequency with which people use (and cite) a given database could serve as an indication of whether this database serves a useful purpose. An inspection of the citation figures for the 141 papers published 2 years ago in the 2004 NAR Database Issue (all citation data are as of 15 October 2005) revealed a very encouraging trend. Most of the papers were well—or very well—cited. Only five papers have not been cited at all and the same number of database descriptions — five — have been cited >100 times, becoming, in ISI parlance, instant ‘citation classics’. Whatever the caveats, the fact that the paper describing the Pfam domain database [http://www.sanger.ac.uk/Software/Pfam/, NAR Collection entry no. 210, (6)] has been cited 375 times in <2 years definitely indicates that this database is widely used by the research community. Indeed, comparing a protein sequence against Pfam has become standard practice in sequence analysis, particularly in genome annotation. It is probably no coincidence that the first author of the Pfam paper also serves as the Editor of the NAR database issues. In the interest of full disclosure, I have cited this Pfam paper myself eight times since its publication in 2004.’ We hope the readers will excuse this small piece of self-plagiarism, which shows how little has changed in more than a decade. 
Pfam remains our citation leader, even though it has moved from the Wellcome Trust Sanger Institute to the EMBL-European Bioinformatics Institute and its URL has changed to http://pfam.xfam.org (8). The NAR Database Issue as a whole is still very well cited and serves as a publication venue for a wide variety of hugely popular databases. In response to repeated requests from various researchers, Table 3 presents a list of such perennial favorites published in NAR three or more times.

SOME LESSONS LEARNT

Ten years ago Alex Bateman published an editorial that included a section on ‘What makes a good database?’ (9). This paper remains a must-read for anyone planning to submit a paper to the NAR Database Issue. It is available online at http://nar.oxfordjournals.org/content/35/suppl_1/D1.full and is linked from the NAR Instructions to authors page https://academic.oup.com/nar/pages/ms_prep_database. Here are some additional recommendations that might be useful for future database authors.
The database is expected to be maintained for many years, so it is worth spending some effort on finding it a proper name. Names that use words from standard vocabulary or well-known commercial terms make it difficult for potential users to find the database URL on the web. Among the databases published this year, the name of the Japanese Proteomics Standard repository has been changed from jPOST (a name shared with the web site of the Jerusalem Post newspaper) to jPOSTrepo, which does not have such a connotation. However, the previously published HIPPIE (Human Integrated Protein–Protein Interaction rEference) and RAID (RNA-associated interactions database) retained their original names. While it is quite unlikely that anyone would confuse the name of the former with the hippie lifestyle or the latter with the name of the popular pest killer, it is equally unlikely that either database will show up near the top in any web search. Therefore, it is always a good idea to do a Google search on the proposed database name: who knows what it might mean in other languages or even in urban slang.
That said, creative mnemonics can be helpfully memorable, with notable examples including HIC-Up (Hetero-compound Information Centre-Uppsala), InParanoid (In-paralogs and orthologs in mammalian genomes), COSMIC (Catalog Of Somatic Mutations In Cancer), DARNED (DAtabase of RNA EDiting), FINDbase (Frequencies of INherited Disorders database), and YEASTRACT (YEASt Transcriptional Regulators And Consensus Tracking).
To warrant publication, a new database must offer carefully curated data and provide substantial improvement in coverage and convenience over all previously created databases (including previous versions of that same database). Many prospective authors have been utterly surprised by this requirement and disappointed to see that there already existed similar (or better) databases created elsewhere. It was always hard for us to understand why someone would commit time and effort to constructing a database without even checking what is already available on the web. The curation requirement means that mere integration of previously created databases is not going to be welcomed, no matter how complicated and successful that integration might have been. The only exceptions we have considered were consortium projects (such as RNAcentral and ProteomeXchange in this issue) where authors of diverse databases committed to jointly maintaining their resources and exchanging the data for the benefit of the community.
Scientists, like everybody else, are not immune to fashion. In the past years, we have seen rapidly rising, and then quickly falling, numbers of databases dedicated to protein–protein interactions, noncoding RNA, microRNA, their targets, long noncoding RNA, disease-related genes, drugs and drug targets, and so on. In cases like that, we used to refrain from choosing the best database among several created at the same time. Instead, we accepted two or three similar papers and allowed the respective databases to prove themselves. In the emerging areas of research, we see nothing wrong with a bit of competition, as long as these databases remain functional and regularly updated, and continue offering a useful service to the community. While it is hard for any single group to compete with such database juggernauts as NCBI, EMBL-EBI, BGI or the Swiss Institute of Bioinformatics, some of the most successful databases, such as CAZy and GPCRdb (Table 3), have been created and are being maintained by relatively small groups.

CHANGING OF THE GUARD

This issue has been jointly edited by Drs. Michael Y. Galperin (Bethesda, MD, USA) and Daniel J. Rigden (Liverpool, UK). At the end of 2016, the former retired from editing NAR and the latter assumed full responsibility for the NAR Database Issue. We are going to continue using the same e-mail address, nardatabase@gmail.com, and will adhere to the same database selection principles that were introduced by the founding editor Sir Richard J. Roberts and successfully continued by Drs. Andreas D. Baxevanis, Alex Bateman, and, most recently, Michael Y. Galperin.
