ICG: a wiki-driven knowledgebase of internal control genes for RT-qPCR normalization.

Jian Sang1,2,3, Zhennan Wang3,4, Man Li1,2,3, Jiabao Cao1,2,3, Guangyi Niu1,2,3, Lin Xia1,2,3, Dong Zou1,2, Fan Wang1,2, Xingjian Xu1,2, Xiaojiao Han5, Jinqi Fan6, Ye Yang7, Wanzhu Zuo7, Yang Zhang1,2,3, Wenming Zhao1,3, Yiming Bao1,2,3, Jingfa Xiao1,2,3,8, Songnian Hu2,3,8, Lili Hao1,2, Zhang Zhang1,2,3,8.   

Abstract

Real-time quantitative PCR (RT-qPCR) has become a widely used method for accurate expression profiling of targeted mRNA and ncRNA. Selection of appropriate internal control genes for RT-qPCR normalization is an elementary prerequisite for reliable expression measurement. Here, we present ICG (http://icg.big.ac.cn), a wiki-driven knowledgebase for community curation of experimentally validated internal control genes as well as their associated experimental conditions. Unlike extant related databases that focus on qPCR primers in model organisms (mainly human and mouse), ICG features harnessing collective intelligence in community integration of internal control genes for a variety of species. Specifically, it integrates a comprehensive collection of more than 750 internal control genes for 73 animals, 115 plants, 12 fungi and 9 bacteria, and incorporates detailed information on recommended application scenarios corresponding to specific experimental conditions, which, collectively, are of great help for researchers to adopt appropriate internal control genes for their own experiments. Taken together, ICG serves as a publicly editable and open-content encyclopaedia of internal control genes and accordingly bears broad utility for reliable RT-qPCR normalization and gene expression characterization in both model and non-model organisms.
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2018        PMID: 29036693      PMCID: PMC5753184          DOI: 10.1093/nar/gkx875

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Real-time quantitative PCR (RT-qPCR) is one of the most powerful molecular techniques for accurate expression profiling of targeted nucleic acids in a wide range of biological research (1,2). To reduce experimental bias and produce accurate expression levels, several variables (such as operator variability and the amount and variability of RNA extraction yield) need to be taken into account for normalization (3–5). Currently, the most frequently used approach for RT-qPCR normalization is the use of internal control genes (or reference genes) (6), which ideally should have relatively stable expression levels across all samples from different tissues, during all developmental stages and in response to distinct experimental treatments (7,8). Thus, housekeeping genes, such as ACT, 18S rRNA and GAPDH, have frequently been used for RT-qPCR normalization (9). However, evidence has accumulated that some traditional housekeeping genes used to control for experimental bias are expressed at relatively constant levels only under certain conditions (10,11). It is clear that internal control genes are condition-specific and that no universal gene can serve as internal control for all application scenarios (12), which strongly indicates the need to select internal control gene(s) properly before performing any RT-qPCR experiment. Over the past decade, with the ever-increasing number of RT-qPCR expression analyses carried out in both model and non-model organisms, advances have been made in the identification and validation of appropriate internal control genes for specific tissues, developmental stages and experimental treatments (Figure 1). However, characterizing internal control genes is an onerous task requiring well-designed molecular experiments followed by a series of elaborate computational analyses (3,13,14). 
Therefore, it is necessary to comprehensively integrate experimentally validated internal control genes from published literature and to make these genes and their associated experimental conditions well organized and publicly accessible to the whole scientific community. Although valuable efforts have been made in building related databases, including RTPrimerDB (15), PrimerBank (16), qPrimerDepot (17) and GETPrime (18), these focus solely on RT-qPCR primers in model organisms (mainly human and mouse) and do not collect internal control genes or their associated application scenarios. To date, there is still no unified knowledgebase that integrates internal control genes for various tissues, developmental stages and experimental treatments across a wide variety of species.
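To make the role of internal control genes in normalization concrete, the following minimal Python sketch applies the comparative CT (2^-ΔΔCT) calculation with the Ct values of several internal control genes averaged. It is an illustration only and not part of ICG: the function names and all Ct values are hypothetical, and an amplification efficiency of about 100% is assumed for every assay.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target minus the mean Ct of the internal control genes."""
    return target_ct - mean(reference_cts)


def fold_change(sample, calibrator):
    """Relative expression 2^-(ddCt) of a sample versus a calibrator.

    Each argument is a (target_ct, [reference_cts]) pair. Assumes ~100%
    amplification efficiency for every assay (an assumption of this sketch).
    """
    dd_ct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only.
treated = (24.0, [18.0, 20.0])    # target Ct, Cts of two control genes
untreated = (26.0, [18.0, 20.0])
print(fold_change(treated, untreated))  # 4.0
```

Because the reference Ct enters the exponent, an unstable control gene biases the result directly: a one-cycle shift in the mean reference Ct alone changes the computed fold change twofold.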
Figure 1.

Cumulative numbers of relevant publications on internal control genes from 2004∼2016. All statistics were extracted from NCBI PubMed by using the search terms: (‘internal control genes’ OR ‘reference genes’) AND (‘qPCR’ OR ‘qRT-PCR’) occurring in Title/Abstract.
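The text does not describe how the search was run. One possible way to reproduce such per-year counts is through the NCBI E-utilities `esearch` endpoint, sketched below in Python. The sketch only builds the request URLs and a running total; the helper names `count_url` and `cumulative` are hypothetical, and fetching and parsing the responses is left out.

```python
from itertools import accumulate
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search terms from the caption, restricted to the Title/Abstract field.
TERM = (
    '("internal control genes"[Title/Abstract] OR "reference genes"[Title/Abstract]) '
    'AND ("qPCR"[Title/Abstract] OR "qRT-PCR"[Title/Abstract])'
)


def count_url(year):
    """URL asking PubMed for the number of matching records published in `year`."""
    params = {
        "db": "pubmed",
        "term": TERM,
        "datetype": "pdat",   # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",   # return only the hit count
    }
    return f"{ESEARCH}?{urlencode(params)}"


def cumulative(yearly_counts):
    """Running total of per-year counts, as plotted in a cumulative curve."""
    return list(accumulate(yearly_counts))


urls = [count_url(y) for y in range(2004, 2017)]  # 2004 through 2016
```

Counts obtained this way today may differ from those in the figure, since PubMed indexing changes over time.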

To fill this gap and provide molecular biologists with informative guidance on selecting internal control genes for their RT-qPCR experiments, here we present ICG (http://icg.big.ac.cn), a wiki-based, publicly editable and open-content resource for community curation of internal control genes across a diversity of species. Unlike extant relevant databases, ICG harnesses collective intelligence for the collaborative integration of experimentally validated internal control genes as well as their associated application scenarios in both model and non-model organisms, and is therefore of great utility for proper selection of internal control genes and for reliable gene expression normalization and characterization.

IMPLEMENTATION

ICG is built on MediaWiki (http://www.mediawiki.org; version 1.28.2), one of the most popular open-source wiki engines, which originally provided the collaborative framework for Wikipedia. Most content in ICG is stored as wiki-markup text, which is organized by MediaWiki concepts such as ‘template scheme’ and ‘content page’. Additionally, Category, a software feature of MediaWiki, is used extensively for automatic indexing and classification of content pages in ICG. To increase usability and searchability, a series of extensible plugins are installed to support content presentation and customized functionalities (http://icg.big.ac.cn/index.php/Special:Version). ICG is implemented with Apache (https://httpd.apache.org; an open-source HTTP server; version 2.2.15), PHP (http://www.php.net; a widely-used general-purpose scripting language; version 7.0.19) and MySQL (http://www.mysql.org; a free and popular relational database management system; version 5.7.13) on a CentOS release 6.5 Linux server. Because it is powered by MediaWiki, ICG allows any registered user to edit any content simply via a web browser and enables internal control genes to be edited and updated by multiple users. For each page, ICG records all revisions together with the users responsible for them and, most importantly, any earlier revision can be easily restored, which helps to limit the impact of invalid or incorrect edits.
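Revision histories and category listings of this kind are normally exposed by the MediaWiki Action API. The Python sketch below builds two such queries. It rests on an assumption that is not confirmed in the text, namely that ICG serves the API at `api.php` next to `index.php`; the helper names are hypothetical, while the query parameters are standard MediaWiki ones.

```python
from urllib.parse import urlencode

# Assumption: the API endpoint sits next to index.php; not confirmed for ICG.
API = "http://icg.big.ac.cn/api.php"


def revisions_url(title, limit=10):
    """URL listing the latest revisions of a page with their authors."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def category_members_url(category, limit=50):
    """URL listing the pages that belong to a MediaWiki category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


print(revisions_url("Glycine_max"))
print(category_members_url("Non-coding_RNA"))
```

The two page names used here (`Glycine_max` and `Category:Non-coding_RNA`) are taken from URLs cited elsewhere in this article.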

DATABASE CONTENT AND USAGE

To facilitate appropriate selection of internal control genes for accurate RT-qPCR normalization, ICG integrates >750 experimentally validated internal control genes manually curated from 283 publications, corresponding to a wide range of specific tissues, developmental stages and experimental treatments and covering a wide variety of species including 73 animals, 115 plants, 12 fungi and 9 bacteria. ICG provides two major categories, namely, Species and Genes, to allow users to access internal control genes and their associated specific experimental conditions. ICG organizes experimentally validated internal control genes as well as their associated experimental conditions in terms of ‘Species’ (http://icg.big.ac.cn/index.php/Species), where each species corresponds to a wiki page (Figure 2). Specifically, the content of a species page is structured into multiple sections, namely, basic description, experimental condition(s), reference(s) and category. For each experimental condition, ICG incorporates a wealth of information, covering internal control genes (e.g. gene symbol, full name, accession number), primers (e.g. validated primer sequence, amplicon size) and RT-qPCR conditions (e.g. recommended application scopes, detection chemistry, annealing temperature), which, collectively, help researchers to select appropriate internal control genes for their own experiments. Additionally, ICG specifies the evaluation methods used for identification of internal control genes and provides relevant publications, citations and contact information of the corresponding authors.
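The structure of a species page described above can be pictured as a small nested record. The Python sketch below is one hypothetical way to model it. ICG itself stores this information as wiki-markup text, so the class names, field names and all example values are illustrative placeholders rather than the actual schema or data.

```python
from dataclasses import dataclass, field


@dataclass
class ControlGene:
    symbol: str
    full_name: str
    accession: str
    forward_primer: str
    reverse_primer: str
    amplicon_size_bp: int


@dataclass
class ExperimentalCondition:
    application_scope: str      # e.g. a tissue, stage or treatment
    detection_chemistry: str
    annealing_temp_c: float
    evaluation_methods: list
    genes: list = field(default_factory=list)


@dataclass
class SpeciesPage:
    name: str
    description: str
    conditions: list = field(default_factory=list)
    references: list = field(default_factory=list)
    categories: list = field(default_factory=list)

    def genes_for(self, keyword):
        """Symbols of control genes whose application scope mentions `keyword`."""
        kw = keyword.lower()
        return [g.symbol
                for c in self.conditions if kw in c.application_scope.lower()
                for g in c.genes]


# Placeholder values only; not taken from ICG.
page = SpeciesPage(
    name="Example species",
    description="placeholder",
    conditions=[ExperimentalCondition(
        application_scope="salt stress, leaf",
        detection_chemistry="SYBR Green",
        annealing_temp_c=60.0,
        evaluation_methods=["geNorm"],
        genes=[ControlGene("GENE_A", "placeholder gene A", "ACC0001",
                           "ACGT", "TGCA", 120)],
    )],
)
print(page.genes_for("salt"))  # ['GENE_A']
```

Keeping genes nested under conditions mirrors the point made in the text: a gene is recommended for a specific experimental condition, not for a species as a whole.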
Figure 2.

Screenshots of a species page for Glycine max (http://icg.big.ac.cn/index.php/Glycine_max). (A) Table of contents; (B) Description of the species as well as its common name and a hyperlink to NCBI Taxonomy; (C) Detailed information on internal control genes for a specific experimental condition; (D) References associated with this species; (E) Categories associated with this species.

Meanwhile, considering that one gene is likely to be used as internal control in multiple species, ICG sets up a specific page for each collected gene (http://icg.big.ac.cn/index.php/ICG:Genes). For any given gene, ICG integrates a wide range of related information, including its synonyms, applicable species, recommended application scenarios, sequence from representative species, conserved domains and external hyperlinks (Figure 3), which together provide a complete picture of the use of this gene across different species and thus help users to investigate it systematically. For instance, according to the statistics as of 10 August 2017 (http://icg.big.ac.cn/index.php/ICG:Statistics), the most popular internal control gene collected in ICG is EF1α (Elongation factor 1-alpha), which has been widely adopted for controlling experimental bias in 79 species (http://icg.big.ac.cn/index.php/Gene:EF1A). Additionally, ICG collects internal control genes for non-coding RNA normalization, which can be accessed through a specific category of non-coding RNA (http://icg.big.ac.cn/index.php/Category:Non-coding_RNA).
Figure 3.

Screenshots of a gene page for Actin (http://icg.big.ac.cn/index.php/Gene:ACT). (A) Synonymous names; (B) A tabulated form summarizing its utilization as internal control in all relevant species as well as recommended application scenarios and related references; (C) Sequence for Actin in a representative species; (D) Gene structure; (E) Links to external resources.

In the era of big data, community curation has the potential to cope with the flood of data (19). Based on MediaWiki, ICG enables users to take part easily in an ongoing process of collaboration that adds newly identified internal control genes and frequently updates the contents for all collected genes. Thus, ICG can significantly ease the process of data collection, curation and sharing, in keeping with the exploding volume of biological knowledge. Moreover, ICG features user-friendly web interfaces for data search and retrieval simply by specifying a gene name or a species name. To give an overview of all collected data, ICG also provides statistics for species, internal control genes and experimental conditions and generates a word cloud for visualizing the most prominent terms (http://icg.big.ac.cn/index.php/ICG:Statistics). In addition, molecular sequences of validated internal control genes are collected and publicly available at http://icg.big.ac.cn/index.php/Downloads.

DISCUSSION AND FUTURE DEVELOPMENTS

ICG, to our knowledge, is the first knowledgebase integrating a comprehensive collection of experimentally validated internal control genes as well as their associated application scenarios across a wide range of species. Currently, it has integrated >750 experimentally validated internal control genes covering 209 species, thereby providing valuable guidance for researchers in choosing proper genes for their own RT-qPCR experiments. Considering the continuous accumulation of newly characterized internal control genes in the published literature, ICG will continue to be updated regularly with experimentally verified genes for newly studied species and/or conditions, not only for linear RNAs but also for circular RNAs (20). As a core resource of BIG Data Center (http://bigd.big.ac.cn) (21), ICG serves as a publicly editable and open-content encyclopedia of internal control genes and thus bears broad utility for reliable RT-qPCR normalization and gene expression characterization in both model and non-model organisms. Future directions of ICG include integration of more internal control genes through literature curation and development of new functionalities for inviting authors of recent relevant publications to get involved in community curation. We will also develop tools to support literature mining and community curation in order to facilitate automatic information retrieval and improve the reliability of community-provided contents.
  21 in total

1.  Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper--Excel-based tool using pair-wise correlations.

Authors:  Michael W Pfaffl; Ales Tichopad; Christian Prgomet; Tanja P Neuvians
Journal:  Biotechnol Lett       Date:  2004-03       Impact factor: 2.461

2.  Analyzing real-time PCR data by the comparative C(T) method.

Authors:  Thomas D Schmittgen; Kenneth J Livak
Journal:  Nat Protoc       Date:  2008       Impact factor: 13.491

3.  Big data: The future of biocuration.

Authors:  Doug Howe; Maria Costanzo; Petra Fey; Takashi Gojobori; Linda Hannick; Winston Hide; David P Hill; Renate Kania; Mary Schaeffer; Susan St Pierre; Simon Twigger; Owen White; Seung Yon Rhee
Journal:  Nature       Date:  2008-09-04       Impact factor: 49.962

4.  GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR.

Authors:  Carine Gubelmann; Alexandre Gattiker; Andreas Massouras; Korneel Hens; Fabrice David; Frederik Decouttere; Jacques Rougemont; Bart Deplancke
Journal:  Database (Oxford)       Date:  2011-09-14       Impact factor: 3.451

5.  RTPrimerDB: the real-time PCR primer and probe database, major update 2006.

Authors:  Filip Pattyn; Piet Robbrecht; Anne De Paepe; Frank Speleman; Jo Vandesompele
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

6.  qPrimerDepot: a primer database for quantitative real time PCR.

Authors:  Wenwu Cui; Dennis D Taub; Kevin Gardner
Journal:  Nucleic Acids Res       Date:  2006-10-26       Impact factor: 16.971

7.  CIRI: an efficient and unbiased algorithm for de novo circular RNA identification.

Authors:  Yuan Gao; Jinfeng Wang; Fangqing Zhao
Journal:  Genome Biol       Date:  2015-01-13       Impact factor: 13.583

8.  The BIG Data Center: from deposition to integration to translation.

Authors: 
Journal:  Nucleic Acids Res       Date:  2016-11-28       Impact factor: 16.971

9.  Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes.

Authors:  Jo Vandesompele; Katleen De Preter; Filip Pattyn; Bruce Poppe; Nadine Van Roy; Anne De Paepe; Frank Speleman
Journal:  Genome Biol       Date:  2002-06-18       Impact factor: 13.583

10.  Selection and validation of reference genes for real-time quantitative PCR in hyperaccumulating ecotype of Sedum alfredii under different heavy metals stresses.

Authors:  Jian Sang; Xiaojiao Han; Mingying Liu; Guirong Qiao; Jing Jiang; Renying Zhuo
Journal:  PLoS One       Date:  2013-12-10       Impact factor: 3.240
