Qian Chen¹, Ni Ai², Jie Liao², Xin Shao², Yufeng Liu², Xiaohui Fan².
Abstract
BACKGROUND: Biomedical research has produced a wealth of valuable results, but they are scattered widely across the literature. Topic modeling lets researchers discover themes in an unstructured collection of documents without any prior annotations or labels. In this paper, taking ginseng as an example, a biological dynamic topic model (Bio-DTM) is proposed to conduct a retrospective study and interpret the temporal evolution of ginseng research.
Year: 2017 PMID: 28919923 PMCID: PMC5596940 DOI: 10.1186/s13020-017-0148-7
Source DB: PubMed Journal: Chin Med ISSN: 1749-8546 Impact factor: 5.455
Fig. 1 Overview of Bio-DTM. The system consists of four components: document pre-processing, Bio-dictionary construction, dynamic topic modeling, and topic analysis and visualization
The number of documents in each time slice
| Slice no. | Publication years | No. of papers | Slice no. | Publication years | No. of papers |
|---|---|---|---|---|---|
| 1 | 1975–1995 | 389 | 9 | 2010 | 248 |
| 2 | 1996–2000 | 402 | 10 | 2011 | 343 |
| 3 | 2001–2003 | 411 | 11 | 2012 | 382 |
| 4 | 2004–2005 | 340 | 12 | 2013 | 410 |
| 5 | 2006 | 212 | 13 | 2014 | 426 |
| 6 | 2007 | 206 | 14 | 2015 | 439 |
| 7 | 2008 | 210 | 15 | 2016 | 432 |
| 8 | 2009 | 239 | 16 | 2017 | 295 |
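
The slice boundaries in the table can be turned into a small lookup that assigns a paper to its time slice from its publication year. The following is a minimal sketch, not the authors' code: the names `SLICE_END_YEARS`, `FIRST_YEAR`, and `slice_for_year` are hypothetical, and only the year ranges are taken from the table.

```python
from bisect import bisect_left

# Last publication year covered by each of the 16 slices, in slice order.
# Slices 1-4 span several years; slices 5-16 cover one year each (2006-2017).
SLICE_END_YEARS = [1995, 2000, 2003, 2005] + list(range(2006, 2018))
FIRST_YEAR = 1975  # first year of slice 1


def slice_for_year(year: int) -> int:
    """Return the 1-based slice number for a publication year."""
    if not FIRST_YEAR <= year <= SLICE_END_YEARS[-1]:
        raise ValueError(f"year {year} is outside the 1975-2017 collection window")
    # The first slice whose end year is >= the given year contains it.
    return bisect_left(SLICE_END_YEARS, year) + 1
```

For example, `slice_for_year(2002)` falls in the 2001–2003 range and so maps to slice 3.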
The most frequent words for each of the 20 topics after fitting Bio-DTM to the ginseng-related articles
| Topic ID | Most frequent words |
|---|---|
| 1 | Renal, genom, amino, biosynthesi, clone, cultivar, yeast, cdna, athlet, polymorph |
| 2 | Stem, skin, leaf, bone, dri, phenol, marrow, stage, fibroblast, wound |
| 3 | Infect, radic, scaveng, heat, virus, substrat, cyp3a4, influenza, hydroxyl, pgp |
| 4 | Diet, muscl, cholesterol, fat, glycosid, adipocyt, ppargamma, antibodi, ampk, insulin |
| 5 | Fraction, polysaccharid, ferment, pesticid, residu, column, frg, transit, neutral, pectin |
| 6 | Strain, soil, genus, nov, warfarin, polar, bacterium, minor, dsm, genom |
| 7 | Cam, tcm, irradi, hair, radiat, sperm, exhaust, alkalin, train, contamin |
| 8 | Channel, ca2, cardiac, relax, oocyt, contract, muscl, eno, lpa, ion |
| 9 | Injuri, women, fatigu, myocardi, estrogen, ischemia, menopaus, syndrom, cerebr, ischem |
| 10 | Pharmacokinet, ion, formula, prescript, ppd, rhizom, ppt, urin, raw, excret |
| 11 | Cultiv, temperatur, wild, seed, white, embryo, genet, flower, somat, pathogen |
| 12 | Cancer, lung, colon, intak, metastasi, colorect, ventricular, pulmonari, failur, mmp9 |
| 13 | Diabet, intestin, insulin, toler, digoxin, absorpt, pancreat, beta, grg1, morphin |
| 14 | Apoptosi, macrophag, nfkappab, ros, cox2, cancer, mitochondri, angiogenesi, arrest, p38 |
| 15 | Transform, medium, hairi, transgen, cold, degrad, spectroscopi, biomass, ml1, callus |
| 16 | Breast, vaccin, adjuv, discrimin, prostat, skin, fingerprint, antibodi, metabolom, antigen |
| 17 | Pressur, hepat, oil, ethanol, metal, aqueous, allerg, hepatotox, fibrosi, asthma |
| 18 | Platelet, lymphocyt, spleen, aggreg, pain, cisplatin, nrf2, antiplatelet, milk, camp |
| 19 | Berri, behavior, alcohol, depress, memori, swim, ach, learn, avoid, erectil |
| 20 | Neuron, cognit, memori, hippocampus, drink, behavior, glutam, energi, astrocyt, task |
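
A word list like those in the table is typically read off a fitted topic by ranking its vocabulary by weight and keeping the first ten entries. The sketch below assumes a topic is available as a plain word-to-weight mapping. The function name `top_words` is hypothetical, the weights in the usage note are invented for illustration, and the source does not state how Bio-DTM exposes its topics.

```python
import heapq


def top_words(word_weights: dict[str, float], n: int = 10) -> list[str]:
    """Return the n highest-weighted words of one topic, heaviest first."""
    # nlargest iterates over the dict's keys and ranks them by their weight.
    return heapq.nlargest(n, word_weights, key=word_weights.get)
```

With made-up weights such as `{"neuron": 0.05, "cognit": 0.04, "memori": 0.03, "task": 0.001}`, calling `top_words(weights, 2)` returns `["neuron", "cognit"]`.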
Fig. 2 Visualization of ginseng-related topics discovered by Bio-DTM. The river shows the 20 topics and their evolution over the 16 time slices. Flowing horizontally from left to right, it spans the whole collection period from 1975 to 2017, and each colored current represents one of the 20 topics. The changing width of each current depicts changes in the strength or popularity of the corresponding topic across time slices
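
The geometry of such a river chart can be sketched in a few lines: within each time slice the topic strengths are stacked on top of one another, and the stack is centred on the horizontal axis so that each current's width equals its strength. This is only an illustrative layout with a symmetric baseline. The source does not say how the figure was drawn or how topic strength was computed, and `river_bands` is a hypothetical name.

```python
def river_bands(strengths: list[list[float]]) -> list[list[tuple[float, float]]]:
    """Compute (lower, upper) edges of each topic's current in each slice.

    strengths[t][k] is the strength of topic k in time slice t.
    """
    bands = []
    for row in strengths:
        lower = -sum(row) / 2.0  # centre the whole stack on the axis
        slice_bands = []
        for width in row:
            slice_bands.append((lower, lower + width))
            lower += width  # the next current starts where this one ends
        bands.append(slice_bands)
    return bands
```

Plotting the lower and upper edges of one topic across all slices as a filled region gives that topic's current.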
Fig. 3 Topics from the biological dynamic topic model on ginseng-related literature. The top 10 words at each time slice are shown for topic 9 (a) and topic 6 (b); the topic score shows how each topic has changed over time, for topic 9 (c) and topic 6 (d)