Bhanu C Vanteru, Jahangheer S Shaik, Mohammed Yeasin.
Abstract
BACKGROUND: Technological advances over the past decade have led to massive progress in the field of biotechnology, and that progress is documented in research articles. PubMed is currently the most widely used repository for bio-literature. As of 2007 it held about 17 million abstracts, a volume that calls for methods to retrieve and browse relevant information efficiently. State-of-the-art tools such as GOPubmed use simple keyword-based techniques to retrieve abstracts from PubMed and link them to the Gene Ontology (GO). This paper introduces a semantics-enabled technique, called SEGOPubmed, that links PubMed to the Gene Ontology for ontology-based browsing. A Latent Semantic Analysis (LSA) framework is used to semantically interface PubMed abstracts with the Gene Ontology.
Year: 2008 PMID: 18366599 PMCID: PMC2386052 DOI: 10.1186/1471-2164-9-S1-S10
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
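The abstract describes using Latent Semantic Analysis to relate PubMed abstracts to GO terms. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `lsa_rank`, the toy documents, the raw term counts (no weighting or stop-word removal), and the choice `k=2` are all assumptions made for illustration.

```python
import numpy as np

# Toy stand-ins for abstracts; hypothetical text, not taken from PubMed.
DOCS = [
    "mitochondrion inheritance in budding yeast",
    "inheritance of the mitochondrion during yeast cell division",
    "zinc ion transporter activity in the plasma membrane",
    "high affinity zinc uptake transporter",
]

def lsa_rank(docs, query, k=2):
    """Rank docs against a query by cosine similarity in a k-dimensional latent space."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}

    def bag(text):
        # Raw term-count vector; words outside the vocabulary are ignored.
        v = np.zeros(len(vocab))
        for w in text.lower().split():
            if w in index:
                v[index[w]] += 1.0
        return v

    # Term-document matrix: rows are terms, columns are documents.
    A = np.column_stack([bag(d) for d in docs])
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    Uk, sk = U[:, :k], s[:k]

    def fold(v):
        # Project a term vector into the latent space: S_k^{-1} U_k^T v.
        return (Uk.T @ v) / sk

    q = fold(bag(query))
    sims = []
    for j in range(len(docs)):
        d = fold(A[:, j])
        denom = np.linalg.norm(q) * np.linalg.norm(d)
        sims.append(float(q @ d / denom) if denom > 0 else 0.0)
    order = sorted(range(len(docs)), key=lambda j: sims[j], reverse=True)
    return order, sims

# Use a GO term name as the query and list the documents, best match first.
order, sims = lsa_rank(DOCS, "mitochondrion inheritance", k=2)
for j in order:
    print(f"{sims[j]:+.3f}  {DOCS[j]}")
```

In this sketch the GO term name serves as the query and each abstract is a column of the term-document matrix. A real system would need a weighting scheme and a much larger `k`, neither of which is specified in this record.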
Figure 1. ROC curves showing the performance of SEGOPubmed on a) training data and b) test data
TPF and FPF values for all the 60 GO terms in the training data
| mitochondrion inheritance | 10 | 6 | 4 | 0.6 | 0.005442177 |
| mitochondrial genome maintenance | 29 | 9 | 6 | 0.6 | 0.008219178 |
| reproduction | 9 | 7 | 2 | 0.777777778 | 0.002717391 |
| biological process unknown | 13 | 6 | 7 | 0.461538462 | 0.009562842 |
| ribosomal chaperone activity | 10 | 2 | 8 | 0.2 | 0.010884354 |
| high affinity zinc uptake transporter activity | 20 | 8 | 7 | 0.533333333 | 0.009589041 |
| low-affinity zinc ion transporter activity | 7 | 3 | 4 | 0.428571429 | 0.005420054 |
| thioredoxin | 19 | 0 | 15 | 0 | 0.020547945 |
| alpha-1,6-mannosyltransferase activity | 10 | 3 | 7 | 0.3 | 0.00952381 |
| trans-hexaprenyltranstransferase activity | 13 | 0 | 13 | 0 | 0.017759563 |
| vacuole inheritance | 10 | 7 | 3 | 0.7 | 0.004081633 |
| single strand break repair | 18 | 4 | 11 | 0.266666667 | 0.015068493 |
| single-stranded DNA specific | 7 | 5 | 2 | 0.714285714 | 0.002710027 |
| phosphopyruvate hydratase complex | 10 | 0 | 10 | 0 | 0.013605442 |
| lactase activity | 11 | 8 | 3 | 0.727272727 | 0.004087193 |
| alpha-glucoside transport | 17 | 3 | 12 | 0.2 | 0.016438356 |
| regulation of DNA recombination | 30 | 13 | 2 | 0.866666667 | 0.002739726 |
| regulation of mitotic recombination | 34 | 9 | 6 | 0.6 | 0.008219178 |
| negative regulation of recombination | 19 | 6 | 9 | 0.4 | 0.012328767 |
| mitotic spindle elongation | 10 | 2 | 8 | 0.2 | 0.010884354 |
| maltose metabolism | 7 | 4 | 3 | 0.571428571 | 0.004065041 |
| maltose biosynthesis | 9 | 0 | 9 | 0 | 0.012228261 |
| maltose catabolism | 16 | 13 | 2 | 0.866666667 | 0.002739726 |
| alpha-1,2-mannosyltransferase activity | 17 | 0 | 15 | 0 | 0.020547945 |
| ribosomal large subunit assembly | 22 | 4 | 11 | 0.266666667 | 0.015068493 |
| ribosomal small subunit assembly | 10 | 1 | 9 | 0.1 | 0.012244898 |
| mannosyltransferase activity | 26 | 9 | 6 | 0.6 | 0.008219178 |
| mannosylphosphate transferase activity | 14 | 5 | 9 | 0.357142857 | 0.012311902 |
| cell wall mannoprotein biosynthesis | 12 | 6 | 6 | 0.5 | 0.008185539 |
| alpha-1,3-mannosyltransferase activity | 7 | 4 | 3 | 0.571428571 | 0.004065041 |
| adenine deaminase activity | 23 | 9 | 6 | 0.6 | 0.008219178 |
| acyl binding | 21 | 7 | 8 | 0.466666667 | 0.010958904 |
| acyl carrier activity | 7 | 6 | 1 | 0.857142857 | 0.001355014 |
| very-long-chain fatty acid metabolism | 24 | 4 | 11 | 0.266666667 | 0.015068493 |
| plasma membrane long-chain | 12 | 7 | 5 | 0.583333333 | 0.006821282 |
| low affinity iron ion transport | 7 | 0 | 7 | 0 | 0.009485095 |
| transition metal ion transport | 32 | 7 | 8 | 0.466666667 | 0.010958904 |
| protein targeting to Golgi | 17 | 2 | 13 | 0.133333333 | 0.017808219 |
| ascorbate stabilization | 17 | 0 | 15 | 0 | 0.020547945 |
| autophagic vacuole formation | 21 | 6 | 9 | 0.4 | 0.012328767 |
| autophagic vacuole fusion | 14 | 3 | 11 | 0.214285714 | 0.01504788 |
| Rieske iron-sulfur protein | 17 | 1 | 14 | 0.066666667 | 0.019178082 |
| peptidyltransferase activity | 31 | 11 | 4 | 0.733333333 | 0.005479452 |
| tRNA binding | 14 | 2 | 12 | 0.142857143 | 0.016415869 |
| urea cycle | 9 | 0 | 9 | 0 | 0.012228261 |
| urea cycle intermediate metabolism | 31 | 6 | 9 | 0.4 | 0.012328767 |
| citrulline metabolism | 10 | 0 | 10 | 0 | 0.013605442 |
| argininosuccinate metabolism | 11 | 0 | 11 | 0 | 0.014986376 |
| ribosome export from nucleus | 11 | 9 | 2 | 0.818181818 | 0.002724796 |
| ribosomal large subunit export from nucleus | 17 | 13 | 2 | 0.866666667 | 0.002739726 |
| ribosomal small subunit export from nucleus | 9 | 1 | 8 | 0.111111111 | 0.010869565 |
| protein import into nucleus, docking | 13 | 5 | 8 | 0.384615385 | 0.010928962 |
| protein import into nucleus, translocation | 18 | 0 | 15 | 0 | 0.020547945 |
| protein import into nucleus, substrate release | 9 | 6 | 3 | 0.666666667 | 0.004076087 |
| acyl-CoA binding | 8 | 3 | 5 | 0.375 | 0.006784261 |
| L-ornithine transporter activity | 7 | 5 | 2 | 0.714285714 | 0.002710027 |
| mitochondrial ornithine transport | 29 | 7 | 8 | 0.466666667 | 0.010958904 |
TPF and FPF values for all the 60 GO terms in the test data
| mitochondrion inheritance | 4 | 2 | 2 | 0.5 | 0.006802721 |
| mitochondrial genome maintenance | 12 | 5 | 7 | 0.416666667 | 0.024475524 |
| reproduction | 3 | 3 | 0 | 1 | 0 |
| biological process unknown | 5 | 3 | 2 | 0.6 | 0.006825939 |
| ribosomal chaperone activity | 4 | 2 | 2 | 0.5 | 0.006802721 |
| high affinity zinc uptake transporter activity | 8 | 8 | 0 | 1 | 0 |
| low-affinity zinc ion transporter activity | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| thioredoxin | 8 | 0 | 8 | 0 | 0.027586207 |
| alpha-1,6-mannosyltransferase activity | 3 | 0 | 3 | 0 | 0.010169492 |
| trans-hexaprenyltranstransferase activity | 5 | 0 | 5 | 0 | 0.017064846 |
| vacuole inheritance | 3 | 3 | 0 | 1 | 0 |
| single strand break repair | 7 | 3 | 4 | 0.428571429 | 0.013745704 |
| single-stranded DNA specific | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| phosphopyruvate hydratase complex | 3 | 1 | 2 | 0.333333333 | 0.006779661 |
| lactase activity | 4 | 3 | 1 | 0.75 | 0.003401361 |
| alpha-glucoside transport | 6 | 1 | 5 | 0.166666667 | 0.017123288 |
| regulation of DNA recombination | 12 | 7 | 5 | 0.583333333 | 0.017482517 |
| regulation of mitotic recombination | 14 | 3 | 11 | 0.214285714 | 0.038732394 |
| negative regulation of recombination | 7 | 1 | 6 | 0.142857143 | 0.020618557 |
| mitotic spindle elongation | 3 | 1 | 2 | 0.333333333 | 0.006779661 |
| maltose metabolism | 3 | 0 | 3 | 0 | 0.010169492 |
| maltose biosynthesis | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| maltose catabolism | 6 | 5 | 1 | 0.833333333 | 0.003424658 |
| alpha-1,2-mannosyltransferase activity | 6 | 0 | 6 | 0 | 0.020547945 |
| ribosomal large subunit assembly | 9 | 3 | 6 | 0.333333333 | 0.020761246 |
| ribosomal small subunit assembly | 4 | 0 | 4 | 0 | 0.013605442 |
| mannosyltransferase activity | 11 | 7 | 4 | 0.636363636 | 0.013937282 |
| mannosylphosphate transferase activity | 5 | 2 | 3 | 0.4 | 0.010238908 |
| cell wall mannoprotein biosynthesis | 4 | 2 | 2 | 0.5 | 0.006802721 |
| alpha-1,3-mannosyltransferase activity | 3 | 3 | 0 | 1 | 0 |
| adenine deaminase activity | 9 | 8 | 1 | 0.888888889 | 0.003460208 |
| acyl binding | 8 | 4 | 4 | 0.5 | 0.013793103 |
| acyl carrier activity | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| very-long-chain fatty acid metabolism | 10 | 1 | 9 | 0.1 | 0.03125 |
| plasma membrane long-chain | 4 | 3 | 1 | 0.75 | 0.003401361 |
| low affinity iron ion transport | 3 | 0 | 3 | 0 | 0.010169492 |
| transition metal ion transport | 13 | 5 | 8 | 0.384615385 | 0.028070175 |
| protein targeting to Golgi | 6 | 2 | 4 | 0.333333333 | 0.01369863 |
| ascorbate stabilization | 6 | 0 | 6 | 0 | 0.020547945 |
| autophagic vacuole formation | 8 | 3 | 5 | 0.375 | 0.017241379 |
| autophagic vacuole fusion | 5 | 2 | 3 | 0.4 | 0.010238908 |
| Rieske iron-sulfur protein | 6 | 2 | 4 | 0.333333333 | 0.01369863 |
| peptidyltransferase activity | 13 | 5 | 8 | 0.384615385 | 0.028070175 |
| tRNA binding | 6 | 1 | 5 | 0.166666667 | 0.017123288 |
| urea cycle | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| urea cycle intermediate metabolism | 13 | 1 | 12 | 0.076923077 | 0.042105263 |
| citrulline metabolism | 4 | 0 | 4 | 0 | 0.013605442 |
| argininosuccinate metabolism | 4 | 0 | 4 | 0 | 0.013605442 |
| ribosome export from nucleus | 4 | 4 | 0 | 1 | 0 |
| ribosomal large subunit export from nucleus | 6 | 3 | 3 | 0.5 | 0.010273973 |
| ribosomal small subunit export from nucleus | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| protein import into nucleus, docking | 5 | 2 | 3 | 0.4 | 0.010238908 |
| protein import into nucleus, translocation | 7 | 1 | 6 | 0.142857143 | 0.020618557 |
| protein import into nucleus, substrate release | 3 | 3 | 0 | 1 | 0 |
| acyl-CoA binding | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| L-ornithine transporter activity | 3 | 2 | 1 | 0.666666667 | 0.003389831 |
| mitochondrial ornithine transport | 12 | 4 | 8 | 0.333333333 | 0.027972028 |
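This record does not label the three count columns, but the last two columns of each row can be reproduced from them. The sketch below rests on assumptions inferred from the arithmetic rather than stated in the source: the third and fourth columns are treated as correct and incorrect assignments among the abstracts considered for a term, and the first and second tables are taken to draw on 745 and 298 abstracts in total. The names `rates`, `correct`, and `incorrect` are hypothetical.

```python
# Corpus sizes obtained by back-solving the FPF column; assumptions, not stated in the source.
FIRST_TABLE_TOTAL = 745
SECOND_TABLE_TOTAL = 298

def rates(correct, incorrect, total):
    """Return (TPF, FPF) computed the way the tabulated values appear to be derived."""
    considered = correct + incorrect
    tpf = correct / considered                # fraction of considered abstracts that are correct
    fpf = incorrect / (total - considered)    # incorrect ones relative to the remaining abstracts
    return tpf, fpf

# "mitochondrion inheritance", first table: counts 6 and 4
tpf, fpf = rates(6, 4, FIRST_TABLE_TOTAL)
print(f"{tpf:.9f} {fpf:.9f}")   # 0.600000000 0.005442177

# "mitochondrial genome maintenance", second table: counts 5 and 7
tpf, fpf = rates(5, 7, SECOND_TABLE_TOTAL)
print(f"{tpf:.9f} {fpf:.9f}")   # 0.416666667 0.024475524
```

Each (FPF, TPF) pair obtained this way corresponds to one point in ROC space for a single GO term.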
Figure 2. Schematic diagram of the proposed SEGOPubmed
Figure 3. Box plots of the possible ranks for one query
Figure 4. Box plots of the documents under null hypothesis for one query