Literature-based discovery of diabetes- and ROS-related targets.

Junguk Hur, Kelli A Sullivan, Adam D Schuyler, Yu Hong, Manjusha Pande, David J States, H V Jagadish, Eva L Feldman.

Abstract

BACKGROUND: Reactive oxygen species (ROS) are known mediators of cellular damage in multiple diseases including diabetic complications. Despite their importance, no comprehensive database is currently available for the genes associated with ROS.
METHODS: We present ROS- and diabetes-related targets (genes/proteins) collected from the biomedical literature using text mining. A web-based literature mining tool, SciMiner, was applied to 1,154 biomedical papers indexed with diabetes and ROS by PubMed to identify relevant targets. Over-represented targets in the ROS-diabetes literature were obtained through comparisons against randomly selected literature. The expression levels of nine genes, selected from the top ranked ROS-diabetes set, were measured in the dorsal root ganglia (DRG) of diabetic and non-diabetic DBA/2J mice in order to evaluate the biological relevance of literature-derived targets in the pathogenesis of diabetic neuropathy.
RESULTS: SciMiner identified 1,026 ROS- and diabetes-related targets from the 1,154 biomedical papers (http://jdrf.neurology.med.umich.edu/ROSDiabetes/). Fifty-three targets were significantly over-represented in the ROS-diabetes literature compared to randomly selected literature. These over-represented targets included well-known members of the oxidative stress response including catalase, the NADPH oxidase family, and the superoxide dismutase family of proteins. Eight of the nine selected genes exhibited significant differential expression between diabetic and non-diabetic mice. For six genes, the direction of expression change in diabetes paralleled enhanced oxidative stress in the DRG.
CONCLUSIONS: Literature mining compiled ROS-diabetes related targets from the biomedical literature and led us to evaluate the biological relevance of selected targets in the pathogenesis of diabetic neuropathy.

Year:  2010        PMID: 20979611      PMCID: PMC2988702          DOI: 10.1186/1755-8794-3-49

Source DB:  PubMed          Journal:  BMC Med Genomics        ISSN: 1755-8794            Impact factor:   3.063


Background

Diabetes is a metabolic disease in which the body does not produce or properly respond to insulin, a hormone required to convert carbohydrates into energy for daily life. According to the American Diabetes Association, 23.6 million children and adults, approximately 7.8% of the population in the United States, have diabetes [1]. The cost of diabetes in 2007 was estimated to be $174 billion [1]. The micro- and macro-vascular complications of diabetes are the most common causes of renal failure, blindness and amputations leading to significant mortality, morbidity and poor quality of life; however, incomplete understanding of the causes of diabetic complications hinders the development of mechanism-based therapies. In vivo and in vitro experiments implicate a number of enzymatic and non-enzymatic metabolic pathways in the initiation and progression of diabetic complications [2] including: (1) increased polyol pathway activity leading to sorbitol and fructose accumulation, NAD(P)-redox imbalances and changes in signal transduction; (2) non-enzymatic glycation of proteins yielding "advanced glycation end-products" (AGEs); (3) activation of protein kinase C (PKC), initiating a cascade of intracellular stress responses; and (4) increased hexosamine pathway flux [2,3]. Only recently has a link among these pathways been established that provides a unified mechanism of tissue damage. Each of these pathways directly and indirectly leads to overproduction of reactive oxygen species (ROS) [2,3]. ROS are highly reactive ions or small molecules including oxygen ions, free radicals and peroxides, formed as natural byproducts of cellular energy metabolism. ROS are implicated in multiple cellular pathways such as mitogen-activated protein kinase (MAPK) signaling, c-Jun amino-terminal kinase (JNK), cell proliferation and apoptosis [4-6]. Due to the highly reactive properties of ROS, excessive ROS may cause significant damage to proteins, DNA, RNA and lipids. 
All cells express enzymes capable of neutralizing ROS. In addition to the maintenance of antioxidant systems such as glutathione and thioredoxins, primary sensory neurons express two main detoxifying enzymes: superoxide dismutase (SOD) [7] and catalase [8]. SOD converts superoxide (O2-) to H2O2, which is reduced to H2O by glutathione and catalase [8]. SOD1 is the main form of SOD in the cytoplasm; SOD2 is located within the mitochondria. In neurons, SOD1 activity represents approximately 90% of total SOD activity and SOD2 approximately 10% [9]. Under diabetic conditions, this protective mechanism is overwhelmed due to the substantial increase in ROS, leading to cellular damage and dysfunction [10]. The idea that increased ROS and oxidative stress contribute to the pathogenesis of diabetic complications has led scientists to investigate different oxidative stress pathways [7,11]. Inhibition of ROS or maintenance of euglycemia restores metabolic and vascular imbalances and blocks both the initiation and progression of complications [12,13]. Despite the significant implications and extensive research into the role of ROS in diabetes, no comprehensive database regarding ROS-related genes or proteins is currently available. In the present study, a comprehensive list of ROS- and diabetes-related targets (genes/proteins) was compiled from the biomedical literature through text mining technology. SciMiner, a web-based literature mining tool [14], was used to retrieve and process documents and identify targets from the text. SciMiner provides a convenient web-based platform for target-identification within the biomedical literature, similar to other tools including EBIMed [15], ALI BABA [16], and PolySearch [17]; however, SciMiner is unique in that it searches full text documents, supports free-text PubMed query style, and allows the comparison of target lists from multiple queries. 
The ROS-diabetes targets collected by SciMiner were further tested against randomly selected non-ROS-diabetes literature to identify targets that are significantly over-represented in the ROS-diabetes literature. Functional enrichment analyses were performed on these targets to identify significantly over-represented biological functions in terms of Gene Ontology (GO) terms and pathways. In order to confirm the biological relevance of the over-represented ROS-diabetes targets, the gene expression levels of nine selected targets were measured in dorsal root ganglia (DRG) from mice with and without diabetes. DRG contain primary sensory neurons that relay information from the periphery to the central nervous system (CNS) [7,10,18]. Unlike the CNS, DRG are not protected by a blood-nerve barrier, and are consequently vulnerable to metabolic and toxic injury [19]. We hypothesize that differential expression of identified targets in DRG would confirm their involvement in the pathogenesis of diabetic neuropathy.

Methods

Defining ROS-diabetes literature

To retrieve the list of biomedical literature associated with ROS and diabetes, PubMed was queried using ("Reactive Oxygen Species"[MeSH] AND "Diabetes Mellitus"[MeSH]). This query yielded 1,154 articles as of April 27, 2009. SciMiner, a web-based literature mining tool [14], was used to retrieve and process the abstracts and available full text documents to identify targets (full text documents were available for approximately 40% of the 1,154 articles). SciMiner-identified targets, reported in the form of HGNC [HUGO (Human Genome Organization) Gene Nomenclature Committee] genes, were confirmed by manual review of the text.
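The query above can also be expressed programmatically. The following is a minimal sketch that only builds a request URL for the NCBI E-utilities ESearch endpoint; it does not contact the network, and it is not a description of how SciMiner itself retrieves documents. The helper name `build_esearch_url` and the `retmax` default are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The PubMed query quoted in the text.
ROS_DIABETES_QUERY = '"Reactive Oxygen Species"[MeSH] AND "Diabetes Mellitus"[MeSH]'


def build_esearch_url(term: str, retmax: int = 2000) -> str:
    """Build an ESearch URL for a PubMed query (hypothetical helper, no network access)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(ROS_DIABETES_QUERY)
print(url)
```

The number of hits returned today will differ from the 1,154 articles reported for April 27, 2009, because PubMed indexing is continuously updated.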

Comparison with human curated data (NCBI Gene2PubMed)

The NCBI Gene database provides links between Gene and PubMed. The links are the result of (1) manual curation within the NCBI via literature analysis as part of generating a Gene record, (2) integration of information from other public databases, and (3) GeneRIF (Gene Reference Into Function) in which human experts provide a brief summary of gene functions and make the connections between citation (PubMed) and Gene databases. For the 1,154 ROS-diabetes articles, gene-paper associations were retrieved from the NCBI Gene database. Non-human genes were mapped to homologous human genes through the NCBI HomoloGene database. The retrieved genes were compared against the SciMiner derived targets. Any genes missed by SciMiner were added to the ROS-diabetes target set.
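The comparison described above amounts to set operations on gene symbols. A minimal sketch, assuming both lists have already been mapped to human gene symbols; the function name and the example symbols are hypothetical and are not taken from the study data.

```python
def compare_with_curated(mined: set, curated: set):
    """Return (recall of the mined set against the curated set, curated genes the miner missed)."""
    if not curated:
        raise ValueError("curated set must not be empty")
    found = mined & curated    # curated genes that text mining also identified
    missed = curated - mined   # curated genes that text mining did not identify
    return len(found) / len(curated), missed


# Hypothetical symbols, for illustration only.
mined = {"SOD1", "CAT", "INS", "XDH"}
curated = {"SOD1", "CAT", "INS", "NOS3", "TNF"}

recall, missed = compare_with_curated(mined, curated)
merged = mined | missed  # missed curated genes are added to the final target set
print(f"recall = {recall:.2f}, missed = {sorted(missed)}, merged size = {len(merged)}")
```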

Protein-protein interactions among ROS-diabetes targets

To indirectly examine the association of literature-derived targets (by SciMiner and NCBI Gene2PubMed) with ROS and diabetes, protein-protein interactions (PPIs) among the targets were surveyed. This was based on an assumption that targets are more likely to have PPIs with each other if they are truly associated within the same biological functions/pathways. A PPI network of the ROS-diabetes targets was generated using the Michigan Molecular Interactions (MiMI, http://mimi.ncibi.org/) database [20] and compared against 100 PPI networks of randomly drawn sets (each the same size as the ROS-diabetes target set) from HUGO. A standard Z-test and one sample T-test were used to calculate the statistical significance of the ROS-diabetes PPI network with respect to the random PPI networks.
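As a rough illustration of the two statistics, the sketch below recomputes them from the rounded summary values reported in Table 2. The sign convention of the t statistic (random mean minus observed value) and the one-sided normal tail are assumptions chosen because they appear consistent with the reported numbers; the exact procedure used in the study is not spelled out in the text.

```python
import math


def z_score(observed: float, mean: float, sd: float) -> float:
    """Z-score of the observed value against the random-network distribution."""
    return (observed - mean) / sd


def one_sample_t(observed: float, mean: float, sd: float, n: int) -> float:
    """One-sample t statistic: does the mean of n random networks equal the observed value?"""
    return (mean - observed) / (sd / math.sqrt(n))


def upper_tail_p(z: float) -> float:
    """One-sided (upper tail) p-value under the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Rounded inputs from Table 2, so results only approximate the reported values.
z_any = z_score(983, 528.9, 16.0)            # targets with any interaction
t_any = one_sample_t(983, 528.9, 16.0, 100)  # same measurement, 100 random networks
z_deg = z_score(173, 25.0, 39.7)             # max degree
p_deg = upper_tail_p(z_deg)

print(f"Z = {z_any:.1f}, t = {t_any:.1f}, Z(max degree) = {z_deg:.1f}, p = {p_deg:.2e}")
```

With the rounded standard deviation of 16, the first Z-score comes out near 28.4 rather than the reported 28.5; the difference is presumably due to rounding in the published table.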

Functional enrichment analysis

Literature-derived ROS-diabetes targets (by SciMiner and NCBI Gene2PubMed) were subjected to functional enrichment analyses to identify significantly over-represented biological functions in terms of Gene Ontology [21] and pathways (Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/) [22] and Reactome (http://www.reactome.org/) [23]). Fisher's exact test [24] was used to calculate the statistical significance of these biological functions, with a Benjamini-Hochberg (BH) adjusted p-value < 0.05 [25] as the cut-off.
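The enrichment test and the multiple-testing correction can be sketched with the standard library alone. This is a minimal illustration, assuming the one-sided (enrichment) form of Fisher's exact test, which equals the upper tail of the hypergeometric distribution; the text does not state whether a one- or two-sided test was used. Function names and the example numbers are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact p-value for enrichment.

    k: targets annotated with the term, n: size of the target set,
    K: background genes annotated with the term, N: size of the background.
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 3 of 5 targets carry a term found in 4 of 20 background genes.
p_term = fisher_enrichment_p(k=3, n=5, K=4, N=20)
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [a < 0.05 for a in adj]
print(p_term, adj, significant)
```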

Over-represented ROS-diabetes targets

Defining background corpora

To identify a subset of targets that are highly over-represented within the ROS-diabetes targets, the frequency of each target (defined as the number of documents in which the target was identified divided by the total number of documents in the query) was compared against the frequencies in randomly selected background corpora. Depending on how the background set is defined, over-represented targets may vary widely; therefore, to keep the background corpora close to the ROS and diabetes context, documents were selected from the same journal, volume, and issue as the 1,154 ROS-diabetes documents, but were indexed with neither "Reactive Oxygen Species"[MeSH] nor "Diabetes Mellitus"[MeSH]. For example, one of the ROS-diabetes articles (PMID: 18227068) was published in the Journal of Biological Chemistry, Volume 283, Issue 16. This issue contained 85 papers, 78 of which were not indexed with either "Reactive Oxygen Species"[MeSH] or "Diabetes Mellitus"[MeSH]. One of these 78 papers was randomly selected as a background document. Three sets of 1,154 documents were selected using this approach and processed using SciMiner. Identified targets were confirmed by manual review for accuracy.
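The sampling scheme described above can be sketched as follows. The record layout (dictionaries with `pmid`, `journal`, `volume`, `issue` and `mesh` keys) and the function names are assumptions for illustration. The sketch also compares MeSH headings by exact name and ignores the MeSH hierarchy, and it does not prevent the same background paper from being drawn twice when several ROS-diabetes papers share an issue; the study may have handled both points differently.

```python
import random
from collections import defaultdict

# Papers indexed with either heading are excluded from the background pool.
EXCLUDED_MESH = {"Reactive Oxygen Species", "Diabetes Mellitus"}


def issue_key(doc):
    """Identify the journal issue a document was published in."""
    return (doc["journal"], doc["volume"], doc["issue"])


def sample_background(seed_docs, all_docs, rng):
    """For each seed paper, draw one eligible paper at random from the same journal issue."""
    by_issue = defaultdict(list)
    for doc in all_docs:
        if not (doc["mesh"] & EXCLUDED_MESH):
            by_issue[issue_key(doc)].append(doc)
    background = []
    for seed in seed_docs:
        candidates = by_issue.get(issue_key(seed), [])
        if candidates:  # skip issues with no eligible paper
            background.append(rng.choice(candidates))
    return background


# Hypothetical records, for illustration only.
all_docs = [
    {"pmid": "A1", "journal": "J Hypothetical", "volume": 1, "issue": 2,
     "mesh": {"Reactive Oxygen Species", "Diabetes Mellitus"}},
    {"pmid": "A2", "journal": "J Hypothetical", "volume": 1, "issue": 2, "mesh": {"Apoptosis"}},
    {"pmid": "A3", "journal": "J Hypothetical", "volume": 1, "issue": 2, "mesh": {"Diabetes Mellitus"}},
    {"pmid": "A4", "journal": "J Hypothetical", "volume": 1, "issue": 2, "mesh": {"Kinetics"}},
    {"pmid": "B1", "journal": "J Hypothetical", "volume": 1, "issue": 3, "mesh": {"Apoptosis"}},
]
seed_docs = [all_docs[0]]
background = sample_background(seed_docs, all_docs, random.Random(0))
print([doc["pmid"] for doc in background])
```

Repeating the call with three different random seeds would give three background sets, analogous to the three sets used in the study.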

Identifying significantly over-represented targets

ROS-diabetes targets were tested for over-representation against targets identified from the three background sets. Fisher's exact test was used to determine if the frequency of each target in the ROS-diabetes target set was significantly different from that of the background sets. Any targets with a BH adjusted p-value < 0.05 in at least two of the three comparisons were deemed to be an over-represented ROS-diabetes target. Functional enrichment analyses were performed on these over-represented ROS-diabetes targets as described above.
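The frequency definition and the "at least two of three" decision rule can be written down compactly. A minimal sketch with hypothetical target names and p-values; it assumes the BH-adjusted p-values against each background set have already been computed.

```python
def target_frequency(docs_with_target: int, total_docs: int) -> float:
    """Fraction of documents in a corpus in which the target was identified."""
    return docs_with_target / total_docs


def over_represented(adjusted_p, alpha=0.05, min_hits=2):
    """Select targets whose BH-adjusted p-value is below alpha in at least min_hits comparisons.

    adjusted_p maps each target to its adjusted p-values, one per background set.
    """
    return sorted(
        target
        for target, pvals in adjusted_p.items()
        if sum(p < alpha for p in pvals) >= min_hits
    )


# Hypothetical values, for illustration only.
adjusted_p = {
    "GENE_A": [1e-8, 3e-6, 2e-7],  # significant in all three comparisons
    "GENE_B": [0.01, 0.20, 0.03],  # significant in two of three
    "GENE_C": [0.04, 0.30, 0.50],  # significant in only one
}
selected = over_represented(adjusted_p)
freq = target_frequency(368, 1154)  # e.g. a target found in 368 of 1,154 documents
print(selected, round(freq, 3))
```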

Selecting targets for real-time RT-PCR

A subset of targets was selected for RT-PCR from the top 10 over-represented ROS-diabetes targets, excluding insulin and NADPH oxidase 5 (NOX5); NOX5 does not have a mouse ortholog. Nitric oxide synthase 1 (NOS1), the main generator of nitric oxide, ranked 15th and was additionally selected for inclusion in the test set.

Differential gene expression by real-time RT-PCR

Mice

DBA/2J mice were purchased from the Jackson Laboratory (Bar Harbor, ME). Mice were housed in a pathogen-free environment and cared for following the University of Michigan Committee on the Care and Use of Animals guidelines. Mice were fed AIN76A chow (Research Diets, New Brunswick, NJ). Male mice were used for this study.

Induction of diabetes

Two treatment groups were defined: control (n = 4) and diabetic (n = 4). Diabetes was induced at 13 weeks of age by low-dose streptozotocin (STZ) injections, 50 mg/kg/day for five consecutive days. All diabetic mice received LinBit sustained release insulin implants (LinShin, Toronto, Canada) at 8 weeks post-STZ treatment. Insulin implants were replaced every 4 weeks, at 12 and 16 weeks post-STZ treatment. At 20 weeks post-STZ treatment, mice were euthanized by sodium pentobarbital overdose and DRG were harvested as previously described [26].

Real-time RT-PCR

The gene expression of the selected nine literature-derived ROS-diabetes targets in DRG was measured using real-time RT-PCR in duplicate. The amount of mRNA isolated from each DRG was normalized to an endogenous reference [Tbp: TATA box binding protein; Δ cycle threshold (CT)].
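The normalization step can be illustrated as follows. The text states only that expression was normalized to Tbp as ΔCT; converting ΔCT to a relative expression value as 2^(-ΔCT), which assumes roughly 100% amplification efficiency, is a common convention and is an assumption here. The CT values and function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of technical replicates."""
    return sum(values) / len(values)


def delta_ct(target_cts, reference_cts):
    """ΔCT: mean CT of the target gene minus mean CT of the reference gene (here Tbp)."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(target_cts, reference_cts):
    """Expression relative to the reference gene, assuming 2^(-ΔCT)."""
    return 2.0 ** (-delta_ct(target_cts, reference_cts))


# Hypothetical duplicate CT readings for one sample.
target_cts = [24.0, 24.5]
tbp_cts = [26.0, 26.5]

d_ct = delta_ct(target_cts, tbp_cts)
rel = relative_expression(target_cts, tbp_cts)
print(d_ct, rel)
```

A lower CT means more template, so a negative ΔCT corresponds to a target that is more abundant than the reference.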

Results

Identification of ROS-diabetes targets

A total of 1,021 unique targets were identified by SciMiner from the 1,154 ROS-diabetes papers defined by the query of ("Reactive Oxygen Species"[MeSH] AND "Diabetes Mellitus"[MeSH]) and confirmed by manual review. Table 1 contains the top 10 most frequently mentioned targets in the ROS-diabetes papers. Insulin was the most frequently mentioned target, followed by superoxide dismutase 1 and catalase.
Table 1

Top 10 most frequent ROS-diabetes targets

Symbol | Name | #Paper | Match Strings
INS | insulin | 503 | INS, insulin, proinsulin
SOD1 | superoxide dismutase 1 | 368 | Sod1, SOD1, SOD1, *
CAT | catalase | 241 | CAT, catalase, *
PRKCA | protein kinase C, alpha | 194 | PKCA, PKC-alpha, *
ALB | albumin | 179 | albumin, serum albumin
NOX5 | NADPH oxidase 5 | 177 | NOX5, nadph oxidase
NOS2A | nitric oxide synthase 2A | 144 | NOS, iNOS, Nos2, *
XDH | xanthine dehydrogenase | 133 | XOR, xanthine dehydrogenase, *
AGT | angiotensinogen | 131 | Ang-II, ANG, AGT, AngI, *
TNF | tumor necrosis factor | 120 | TNFA, TNF, TNF-alpha, *

* Matching strings are truncated to fit in the table. The full contents are available in the Additional File 2. '#Paper' refers to the number of documents in which each target was mentioned at least once.

The NCBI Gene2PubMed database, containing expert-curated associations between the NCBI Gene and PubMed databases, revealed 90 unique genes associated with the 1,154 ROS-diabetes papers (Additional File 1). SciMiner identified 85 out of these 90 targets, indicating a 94% recall rate. Five targets missed by SciMiner were added to the initial ROS-diabetes target set to result in 1,026 unique targets (Additional File 2).

PPI network of the ROS-diabetes targets

The PPI network among the ROS-diabetes targets was evaluated using MiMI interaction data. This was based on the assumption that targets commonly related to a certain topic are more likely to have frequent interactions with each other. One hundred PPI networks were generated for comparison using the same number of genes (1,026) randomly selected from the complete HUGO gene set (25,254). The PPI network of the ROS-diabetes targets was significantly different from the randomly generated networks, indicating their strong association with the topic "ROS and Diabetes". Table 2 shows that the mean number of targets with any PPI interaction in the randomly generated target sets was 528.9 (approximately 52% of 1,026 targets), while the number of targets with any PPI interaction in the ROS-diabetes target set was 983 (96%). The number of targets interacting with each other was also significantly different between the random networks (mean = 155.4) and the ROS-diabetes network (879). Figure 1 illustrates the distributions of these measurements from the 100 random networks with the ROS-diabetes set depicted as a red vertical line. For each measurement, the ROS-diabetes network lies well outside the distribution of the random networks.
Table 2

Summary of 100 randomly generated PPI networks

Measure | # of targets with any interaction | # of targets interacting with each other | # of direct interactions among targets | Max degree*
ROS-diabetes Targets | 983 | 879 | 5002 | 173
Mean (100 networks) | 528.9 | 155.4 | 165.4 | 25
STDEV (100 networks) | 16 | 36.2 | 54.2 | 39.7
Z-Score | 28.5 | 20 | 89.2 | 3.7
P-value(Z) | 0 | 0 | 0 | 9.60E-05
T-Statistics | -284.8 | -200 | -891.9 | -37.3
P-value(T) | 4.60E-146 | 6.70E-131 | 4.00E-195 | 4.20E-60

* Max degree refers to the number of interactions of the most highly interacting target.

Figure 1

Histograms of randomly generated PPI networks. The histograms illustrate the distributions of 100 randomly generated networks, while the red line indicates the ROS-diabetes targets. The network of the ROS-diabetes targets is significantly different from the 100 randomly generated networks, indicating the overlap of ROS-diabetes targets with respect to the topic "Reactive Oxygen Species and Diabetes".


Functional enrichment analyses of the ROS-diabetes targets

Functional enrichment analyses of the 1,026 ROS-diabetes targets were performed to identify over-represented biological functions of the ROS-diabetes targets. After Benjamini-Hochberg correction, a total of 189 molecular functions, 450 biological processes, 73 cellular components and 341 pathways were significantly enriched in the ROS-diabetes targets when compared against all the HUGO genes (see Additional Files 3, 4, 5 and 6 for the full lists). Table 3 lists the top 3 most over-represented GO terms and pathways ranked by p-values of Fisher's exact test: e.g., apoptosis, oxidoreductase activity and insulin signaling pathway.
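Table 3

Enriched functions of 1,026 ROS-diabetes targets

Category | Term | #target | p-value | Fold
Biological Processes GO | metabolic process | 113 | 3.40E-26 | 3.3
Biological Processes GO | protein amino acid phosphorylation | 98 | 2.90E-24 | 3.5
Biological Processes GO | response to hypoxia | 36 | 8.80E-24 | 12
Molecular Functions GO | protein binding | 514 | 2.80E-71 | 2.1
Molecular Functions GO | oxidoreductase activity | 103 | 1.50E-31 | 4.2
Molecular Functions GO | transferase activity | 148 | 1.70E-26 | 2.7
Cellular Components GO | cytoplasm | 381 | 1.50E-57 | 2.3
Cellular Components GO | extracellular region | 220 | 9.10E-44 | 2.9
Cellular Components GO | mitochondrion | 154 | 6.30E-43 | 3.9
Pathway | Focal adhesion | 75 | 2.40E-42 | 9.4
Pathway | Apoptosis | 49 | 6.70E-35 | 14.5
Pathway | MAPK signaling pathway | 73 | 4.30E-34 | 6.9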

'#target' refers to the number of ROS-diabetes targets with each biological function with Benjamini-Hochberg adjusted p-values. Fold is the ratio of targets from the ROS-diabetes set to the complete HUGO gene set.


Identification of over-represented ROS-diabetes targets

To identify the ROS-diabetes targets highly over-represented in ROS-diabetes literature, three sets of background corpora of the same size (n = 1,154 documents) were generated using the same journal, volume and issue approach. The overlap among the three background sets in terms of documents and identified targets is illustrated in Figure 2. Approximately 90% of the selected background documents were unique to the individual set, while 50% of the identified targets were identified in at least one of the three background document sets. The frequencies of the identified targets were compared among the background sets for significant differences. None of the targets had a BH adjusted p-value < 0.05, indicating no significant difference among the targets from the three different background sets (see Additional File 7).
Figure 2

Venn diagrams of document compositions and identified targets of the randomly generated background sets. Approximately 90% of the selected background documents were unique to individual set (A), while 50% of the identified targets were identified in at least one of the three background document sets (B).

Comparisons of the ROS-diabetes targets against these background sets revealed 53 highly over-represented ROS-diabetes targets as listed in Table 4. These 53 targets were significant (p-value < 0.05) against all three background sets and significant following Benjamini-Hochberg multiple testing correction (BH adjusted p-value < 0.05) against at least two of the three background sets. SOD1 was the most over-represented in the ROS-diabetes targets.
Table 4

53 targets over-represented in ROS-diabetes literature

Rank | Symbol | HUGO_ID | Name | #Paper | BG #1 | BG #2 | BG #3
1 | SOD1 | 11179 | superoxide dismutase 1, soluble (amyotrophic lateral sclerosis 1 (adult)) | 368 | 3.1E-84 | 2.0E-78 | 2.0E-78
2 | CAT | 1516 | catalase | 241 | 2.1E-50 | 3.9E-44 | 3.9E-44
3 | NOX5 | 14874 | NADPH oxidase, EF-hand calcium binding domain 5 | 177 | 3.1E-42 | 3.6E-39 | 2.1E-37
4 | INS | 6081 | insulin | 503 | 5.9E-41 | 2.0E-43 | 2.3E-39
5 | XDH | 12805 | xanthine dehydrogenase | 133 | 1.5E-30 | 1.2E-28 | 8.8E-28
6 | PRKCA | 9393 | protein kinase C, alpha | 194 | 7.1E-23 | 6.4E-26 | 8.9E-24
7 | NCF1 | 7660 | neutrophil cytosolic factor 1, (chronic granulomatous disease, autosomal 1) | 72 | 7.6E-19 | 7.7E-16 | 8.7E-16
8 | NOS3 | 7876 | nitric oxide synthase 3 (endothelial cell) | 115 | 1.6E-18 | 3.9E-16 | 7.6E-18
9 | SOD2 | 11180 | superoxide dismutase 2, mitochondrial | 85 | 2.1E-18 | 7.7E-16 | 3.8E-15
10 | CYBA | 2577 | cytochrome b-245, alpha polypeptide | 69 | 4.2E-17 | 5.0E-13 | 6.9E-14
11 | NOS2A | 7873 | nitric oxide synthase 2A (inducible, hepatocytes) | 144 | 3.9E-16 | 5.2E-12 | 4.5E-14
12 | AGT | 333 | angiotensinogen (serpin peptidase inhibitor, clade A, member 8) | 131 | 1.8E-14 | 1.4E-09 | 3.5E-08
13 | AKR1B1 | 381 | aldo-keto reductase family 1, member B1 (aldose reductase) | 61 | 8.0E-13 | 9.5E-13 | 3.6E-11
14 | CYBB | 2578 | cytochrome b-245, beta polypeptide (chronic granulomatous disease) | 49 | 4.0E-12 | 2.6E-09 | 5.8E-11
15 | NOS1 | 7872 | nitric oxide synthase 1 (neuronal) | 82 | 4.9E-12 | 3.7E-10 | 4.7E-09
16 | NCF2 | 7661 | neutrophil cytosolic factor 2 (65 kDa, chronic granulomatous disease, autosomal 2) | 50 | 2.4E-11 | 1.5E-09 | 3.8E-08
17 | CYCS | 19986 | cytochrome c, somatic | 81 | 8.7E-10 | 2.2E-10 | 2.1E-10
18 | HBB | 4827 | hemoglobin, beta | 101 | 1.4E-08 | 5.9E-10 | 2.2E-08
19 | GSR | 4623 | glutathione reductase | 61 | 1.4E-08 | 4.8E-08 | 4.8E-08
20 | UCP1 | 12517 | uncoupling protein 1 (mitochondrial, proton carrier) | 38 | 4.1E-07 | 2.1E-06 | 9.7E-06
21 | NOX4 | 7891 | NADPH oxidase 4 | 31 | 6.2E-07 | 2.3E-04 | 2.7E-05
22 | PARP1 | 270 | poly (ADP-ribose) polymerase family, member 1 | 37 | 7.1E-07 | 1.1E-07 | 5.3E-05
23 | UCP2 | 12518 | uncoupling protein 2 (mitochondrial, proton carrier) | 34 | 7.0E-07 | 4.5E-06 | 2.1E-05
24 | HBA1 | 4823 | hemoglobin, alpha 1 | 30 | 1.1E-06 | 1.2E-06 | 9.3E-06
25 | ALB | 399 | albumin | 179 | 7.0E-06 | 4.9E-06 | 1.7E-06
26 | NOX1 | 7889 | NADPH oxidase 1 | 30 | 8.2E-06 | 8.6E-06 | 9.7E-06
27 | NFKB1 | 7794 | nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (p105) | 90 | 9.4E-06 | 1.2E-04 | 4.5E-04
28 | VEGFA | 12680 | vascular endothelial growth factor A | 57 | 2.6E-04 | 1.9E-04 | 4.1E-03
29 | SOD3 | 11181 | superoxide dismutase 3, extracellular | 18 | 2.5E-04 | 8.1E-02 | 3.4E-02
30 | REN | 9958 | renin | 51 | 3.6E-04 | 2.2E-02 | 7.2E-02
31 | MPO | 7218 | myeloperoxidase | 28 | 5.7E-04 | 2.4E-01 | 5.1E-02
32 | SORD | 11184 | sorbitol dehydrogenase | 15 | 1.8E-03 | 1.9E-03 | 1.8E-03
33 | COL4A1 | 2202 | collagen, type IV, alpha 1 | 15 | 1.8E-03 | 1.3E-02 | 1.8E-03
34 | TGFA | 11765 | transforming growth factor, alpha | 46 | 2.1E-03 | 3.5E-02 | 3.5E-04
35 | ACE | 2707 | angiotensin I converting enzyme (peptidyl-dipeptidase A) 1 | 69 | 3.8E-03 | 1.1E-02 | 1.1E-02
36 | AGTR1 | 336 | angiotensin II receptor, type 1 | 36 | 3.7E-03 | 4.9E-02 | 1.8E-03
37 | G6PD | 4057 | glucose-6-phosphate dehydrogenase | 19 | 5.6E-03 | 3.7E-01 | 2.1E-01
38 | CP | 2295 | ceruloplasmin (ferroxidase) | 13 | 6.2E-03 | 3.1E-01 | 2.9E-01
39 | NCF4 | 7662 | neutrophil cytosolic factor 4, 40kDa | 16 | 6.7E-03 | 9.9E-04 | 9.9E-04
40 | MT-CYB | 7427 | mitochondrially encoded cytochrome b | 15 | 1.3E-02 | 1.3E-02 | 1.3E-01
41 | DUOX1 | 3062 | dual oxidase 1 | 11 | 2.2E-02 | 2.9E-01 | 1.1E-01
42 | SERPINE1 | 8583 | serpin peptidase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1 | 37 | 2.4E-02 | 2.5E-02 | 1.1E-03
43 | GSTCD | 25806 | glutathione S-transferase, C-terminal domain containing | 37 | 2.4E-02 | 3.8E-01 | 9.1E-02
44 | COQ7 | 2244 | coenzyme Q7 homolog, ubiquinone (yeast) | 16 | 2.8E-02 | 1.9E-01 | 3.1E-02
45 | RAC1 | 9801 | ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1) | 18 | 3.0E-02 | 4.3E-01 | 7.8E-02
46 | MAOB | 6834 | monoamine oxidase B | 10 | 3.9E-02 | 4.1E-01 | 4.4E-01
47 | UCP3 | 12519 | uncoupling protein 3 (mitochondrial, proton carrier) | 17 | 4.7E-02 | 1.7E-02 | 1.8E-02
48 | VCAM1 | 12663 | vascular cell adhesion molecule 1 | 29 | 5.4E-02 | 6.3E-02 | 3.5E-02
49 | AKT1 | 391 | v-akt murine thymoma viral oncogene homolog 1 | 75 | 5.5E-02 | 4.9E-02 | 6.4E-02
50 | LEPR | 6554 | leptin receptor | 21 | 8.7E-02 | 3.1E-01 | 1.4E-02
51 | EDN1 | 3176 | endothelin 1 | 38 | 8.8E-02 | 3.8E-01 | 2.6E-02
52 | COL1A1 | 2197 | collagen, type I, alpha 1 | 84 | 8.7E-02 | 2.6E-02 | 1.7E-01
53 | CCL2 | 10618 | chemokine (C-C motif) ligand 2 | 38 | 2.0E-01 | 4.9E-02 | 1.0E-02

'#Paper' is the number of papers in ROS-diabetes corpus; BG#1, BG#2 and BG#3 are Benjamini-Hochberg adjusted p-values between ROS-diabetes targets and background sets.


Functional enrichment analyses of the over-represented ROS-diabetes targets

Functional enrichment analyses of the 53 ROS-diabetes targets were performed to identify over-represented biological functions. Following Benjamini-Hochberg correction, a total of 65 molecular functions, 209 biological processes, 26 cellular components and 108 pathways were significantly over-represented when compared against all the HUGO genes (see Additional Files 8, 9, 10 and 11 for the full lists). Table 5 shows the top 3 most significantly over-represented GO terms and pathways ranked by p-values of Fisher's exact test. GO terms related to oxidative stress such as "superoxide metabolic process", "superoxide release", "electron carrier activity" and "mitochondrion" were highly over-represented in the 53 ROS-diabetes targets.
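Table 5

Enriched functions of the 53 over-represented targets in diabetes

Category | Term | # target | p-value | Fold
Biological Processes GO | superoxide metabolic process | 7 | 3.70E-15 | 303
Biological Processes GO | electron transport | 13 | 1.50E-12 | 16
Biological Processes GO | superoxide release | 5 | 4.20E-11 | 298
Molecular Functions GO | electron carrier activity | 15 | 1.80E-17 | 27
Molecular Functions GO | oxidoreductase activity | 18 | 2.20E-16 | 14
Molecular Functions GO | iron ion binding | 15 | 4.20E-16 | 21
Cellular Components GO | mitochondrion | 13 | 9.90E-08 | 6
Cellular Components GO | extracellular space | 10 | 6.60E-07 | 8
Cellular Components GO | soluble fraction | 7 | 3.20E-06 | 11
Pathway | Leukocyte transendothelial migration | 9 | 6.40E-12 | 36
Pathway | Small cell lung cancer | 7 | 1.00E-09 | 38
Pathway | Formation of Platelet plug | 6 | 1.10E-08 | 41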

'#target' refers to the number of ROS-diabetes targets with each biological function with Benjamini-Hochberg adjusted p-values. Fold is the ratio of targets from the ROS-diabetes set to the complete HUGO gene set.


Gene expression change in diabetes

The two groups of DBA/2J mice exhibited significantly different levels of glycosylated hemoglobin (%GHb). The mean ± SEM values were 6.2 ± 0.3 for the non-diabetic control group and 14.0 ± 0.8 for the diabetic group (p-value < 0.001), indicative of prolonged hyperglycemia in the diabetic group [26]. DRG were harvested from these animals for gene expression assays. Nine genes were selected from the top ranked ROS-diabetes targets: superoxide dismutase 1 (Sod1), catalase (Cat), xanthine dehydrogenase (Xdh), protein kinase C alpha (Prkca), neutrophil cytosolic factor 1 (Ncf1), nitric oxide synthase 3 (Nos3), superoxide dismutase 2 (Sod2), cytochrome b-245 alpha (Cyba), and nitric oxide synthase 1 (Nos1). Eight genes exhibited differential expression between diabetic and non-diabetic mice (p-value < 0.05) as shown in Figure 3. Cat, Sod1, Sod2, Prkca, and Nos1 expression levels were decreased, while Ncf1, Xdh, and Cyba expression levels were increased in diabetes.
Figure 3

Gene expression levels of selected ROS-diabetes targets in DRG examined by real-time RT-PCR. Expression levels are relative to Tbp, an internal control (error bar = SEM) (*, p < 0.05; **, p < 0.01; ***, p < 0.001). Eight (Cat, Sod1, Ncf1, Xdh, Sod2, Cyba, Prkca, and Nos1) out of the nine selected ROS-diabetes genes were significantly regulated by diabetes.


Discussion

Reactive oxygen species (ROS) are products of normal energy metabolism and play important roles in many other biological processes such as the immune response and signaling cascades [4-6]. As mediators of cellular damage, ROS are implicated in the pathogenesis of multiple diseases including diabetic complications [27-30]. With the aid of literature mining technology, we collected 1,026 possible ROS-related targets from a set of biomedical literature indexed with both ROS and diabetes. Fifty-three targets were significantly over-represented in the ROS-diabetes papers when compared against three background sets. Depending on how the background set is defined, the over-represented targets may vary widely. An ideal background set would be the entire PubMed set; however, this is not possible due to limited access to full texts and intense data processing. An alternative method would be to use only abstracts in PubMed, but this may not fully represent the literature. Using only the abstracts, our target identification method identified 21 (39%) of the 53 key ROS-diabetes targets (Additional File 12), suggesting the benefit of rich information in full text documents. In the present study, background documents were randomly selected from the same journal, volume, and issue as the 1,154 ROS-diabetes documents, and were indexed with neither "Reactive Oxygen Species"[MeSH] nor "Diabetes Mellitus"[MeSH]. This approach kept the background corpora close to the ROS and diabetes context.
The gene expression levels of nine targets selected from the 53 over-represented ROS-diabetes targets were measured in diabetic and non-diabetic DRG. Our laboratory is particularly interested in deciphering the underlying mechanisms of diabetic neuropathy, a major complication of diabetes. Data published by our laboratory both in vitro and in vivo confirm the negative impact of oxidative stress in complication-prone neuronal tissues like DRG [7,10,18,31].
In an effort to obtain diabetic neuropathy specific targets, SciMiner was employed to further analyze a subset of the ROS-diabetes papers (data not shown). Nerve growth factor (NGF) was identified as the most over-represented target in this subset when compared to the full ROS-diabetes set; however, NGF did not reach statistical significance (BH adjusted p-value = 0.06). The relatively small numbers of papers and associated targets may have contributed to this non-significance. Therefore, the candidate targets for gene expression validation were selected from among the 53 over-represented ROS-diabetes targets derived from the full ROS-diabetes corpus. Among the tested genes, the expression levels of Cat, Sod1, Sod2, Prkca, and Nos1 were decreased, while the expression levels of Ncf1, Xdh, and Cyba were increased under diabetic conditions. Cat, Sod1, and Sod2 are responsible for protecting cells from oxidative stress by destroying superoxide and hydrogen peroxide [8-11]. Decreased expression of these genes may result in oxidative stress [32]. Increased expression of Cyba and Ncf1, subunits of the superoxide-generating nicotinamide adenine dinucleotide phosphate (NADPH) oxidase complex [30], also supports enhanced oxidative stress. Xdh and its inter-convertible form, Xanthine oxidase (Xod), showed increased activity in various rat tissues under oxidative stress conditions with diabetes [33], and also showed increased expression in diabetic DRG in the current study. Unlike the above concordant genes, protein kinase C and nitric oxide synthases did not exhibit predicted expression changes in diabetes. Protein kinase C activates NADPH oxidase, further promoting oxidative stress in the cell [34,35]. The decreased expression of Prkca in our diabetic DRG does not parallel the expression levels of other enzymes expected to increase oxidative stress.
Of the two nitric oxide synthases tested in the present study, Nos1 (neuronal) expression was significantly decreased (p-value < 0.001) in diabetes, while the change in Nos3 (endothelial) expression was not significant (p-value = 0.06). The neuronal Nos1 is expected to play a major role in producing nitric oxide, another type of highly reactive free radical. Thus, with some exceptions, the expression changes of the majority of the differentially expressed genes in DRG parallel the known activities of these targets in diabetes, suggesting enhanced oxidative stress in the diabetic DRG.
Assessment of antioxidant enzyme expression in diabetes has yielded a variety of results [36-40] depending upon the duration of diabetes, the tissue studied and other factors. In diabetic mice and rats, it is commonly reported that superoxide dismutases are down-regulated [37-40], whereas data regarding catalase are variable [36,40]. PKC is activated in diabetes, but most papers that examined mRNA demonstrated that its expression is largely unchanged [41].
Among the 53 over-represented ROS-diabetes targets, SOD1 was the most over-represented and was differentially expressed between diabetic and non-diabetic conditions. To the best of our knowledge, no published study has investigated the role of SOD1 in the onset and/or progression of diabetic neuropathy. Mutations of SOD1 have long been associated with the inherited form of amyotrophic lateral sclerosis (ALS) [42] and the theory of oxidative stress-based aging [43]. Early reports indicate that knockout (KO) of the SOD1 gene does not affect nervous system development [44], although recovery following injury is slow and incomplete [45,46]. With respect to diabetes, SOD1 KO accelerates the development of diabetic nephropathy [47] and cataract formation [48]. Thus, examining the SOD1 KO mouse as a model of diabetic neuropathy would be a reasonable follow-up study.
One limitation of the current approach using literature mining technology is incorrect or missed identification of the mentioned targets within the literature. Based on a performance evaluation using a standard text set, BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology) version 2 [49], SciMiner achieved 87.1% recall (percentage of targets in the given text that were identified), 71.3% precision (percentage of identified targets that were correct) and 75.8% F-measure (harmonic mean of recall and precision = (2 × recall × precision)/(recall + precision)) before manual revision [14]. In order to improve the accuracy of SciMiner's results, each target was manually reviewed and corrected by checking the sentences in which each target was identified. Approximately 120 targets (~10% of the initially identified targets from the ROS-diabetes papers) were removed during the manual review process. The overall accuracy is expected to improve through the review process; however, the review process did not address targets missed by SciMiner, since we did not thoroughly review individual papers. Instead, 5 missed targets, whose associations with ROS-diabetes literature were available in the NCBI Gene2PubMed database, were added to the final ROS-diabetes target list (Additional File 2).
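The recall, precision, and F-measure definitions given above can be written out as a short sketch. The function name `evaluate` and the gene sets in the usage example are hypothetical and are not taken from the BioCreAtIvE evaluation; the code only illustrates how the three quantities relate.

```python
def evaluate(predicted, gold):
    """Return (recall, precision, f_measure) for a set of predicted targets.

    predicted: targets reported by the mining tool
    gold:      targets that are truly mentioned in the text
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Recall: fraction of the true targets that were found.
    recall = true_positives / len(gold) if gold else 0.0
    # Precision: fraction of the reported targets that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # F-measure: harmonic mean of recall and precision.
    if recall + precision == 0:
        return recall, precision, 0.0
    f_measure = 2 * recall * precision / (recall + precision)
    return recall, precision, f_measure


# Hypothetical example: two of three true targets found, plus two false hits.
recall, precision, f_measure = evaluate(
    predicted={"SOD1", "CAT", "NGF", "XDH"},
    gold={"SOD1", "CAT", "SOD2"},
)
```

In these terms, the manual review described above removes false hits and so can raise precision, but it cannot raise recall, because targets the tool never reported are not revisited.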

Conclusions

The present approach enabled us to collect a comprehensive list of ROS- and diabetes-related targets and led us to confirm the biological relevance of selected ROS-diabetes targets to diabetic neuropathy. Using SciMiner to identify significantly enriched targets is applicable to other disease topics of interest: it provides a more focused subset of the literature for review and highlights targets common to multiple manuscripts.

List of abbreviations

NAD(P): nicotinamide adenine dinucleotide (phosphate); AGE: advanced glycation end-products; PKC: protein kinase C; ROS: reactive oxygen species; MAPK: mitogen-activated protein kinase; JNK: c-Jun amino-terminal kinase; SOD1: superoxide dismutase 1; SOD2: superoxide dismutase 2; CAT: catalase; XDH: xanthine dehydrogenase; NCF1: neutrophil cytosolic factor 1; NOS3: nitric oxide synthase 3; CYBA: cytochrome b-245 alpha; NOS1: nitric oxide synthase 1; ALS: amyotrophic lateral sclerosis; BioCreAtIvE: Critical Assessment of Information Extraction systems in Biology; MiMI: Michigan Molecular Interactions; KEGG: Kyoto Encyclopedia of Genes and Genomes; STZ: streptozotocin; GO: gene ontology; DRG: dorsal root ganglia; CNS: central nervous system; HGNC: HUGO (Human Genome Organization) Nomenclature Committee; PPI: protein-protein interaction; SEM: standard error of the mean.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

JH participated in the study design, performed the literature mining and functional enrichment analyses, and drafted the manuscript. KAS participated in the study design and drafted the manuscript. ADS participated in the statistical analysis. YH carried out the quantitative RT-PCR assay. MP participated in the manuscript revision. DJS participated in the study design and manuscript revision. HVJ participated in the study design and manuscript revision. ELF participated in the study design and manuscript revision. All authors read and approved the final manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1755-8794/3/49/prepub

Additional file 1

The list of 90 genes from the NCBI Gene2PubMed database for the ROS-Diabetes literature (1,154 papers).

Additional file 2

The list of 1,026 ROS-Diabetes targets.

Additional file 3

The enriched Molecular Functions Gene Ontology Terms in the 1,026 ROS-Diabetes targets.

Additional file 4

The enriched Biological Processes Gene Ontology Terms in the 1,026 ROS-Diabetes targets.

Additional file 5

The enriched Cellular Components Gene Ontology Terms in the 1,026 ROS-Diabetes targets.

Additional file 6

The enriched pathways in the 1,026 ROS-Diabetes targets.

Additional file 7

Comparisons of target frequencies among three background sets.

Additional file 8

The enriched Molecular Functions Gene Ontology Terms in the Over-represented 53 ROS-Diabetes targets.

Additional file 9

The enriched Biological Processes Gene Ontology Terms in the Over-represented 53 ROS-Diabetes targets.

Additional file 10

The enriched Cellular Components Gene Ontology Terms in the Over-represented 53 ROS-Diabetes targets.

Additional file 11

The enriched pathways in the Over-represented 53 ROS-Diabetes targets.

Additional file 12

The Key 53 ROS-Diabetes Targets Identifiable Using Only the Abstracts.