Full text clustering and relationship network analysis of biomedical publications.

Renchu Guan1, Chen Yang2, Maurizio Marchese3, Yanchun Liang4, Xiaohu Shi4.   

Abstract

Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, the Cosine Coefficient is computed on a sub-space spanned by only the two document vectors being compared, instead of computing the Euclidean distance within the space of all vectors. A strategy and algorithm for Semi-supervised Affinity Propagation (SSAP) are then introduced to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that, by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. A directed relationship network and a distribution matrix constructed from the clustering results make overlaps in scope and interests among BioMed publications easy to identify, providing a valuable analytical tool for editors, authors and readers.

Year:  2014        PMID: 25250864      PMCID: PMC4177555          DOI: 10.1371/journal.pone.0108847

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

With the proliferation of biomedical research and publications, individual scientists can no longer keep track of relevant articles through reading alone. Instead, they must rely on text mining tools to explore the implicit knowledge and hypotheses presented in biomedical texts and to extract the data and concepts relevant to their work [1], [2]. In this regard, a number of new challenges have emerged, especially in relation to full text analysis [3], complex relation extraction [4], [5], and information fusion [6]. To date, research approaches to biomedical text mining have been based exclusively on article abstracts (Iliopoulos et al., 2001 [7]; Yu and Lee, 2006 [8]; Zhu et al., 2009 [9]), culminating most recently in 2011, when Boyack et al. clustered about two million MEDLINE abstracts [10]. While such clustering can provide hints about primary results and conclusions, the full texts of articles are where more detailed methodologies, experimental results, critical discussions and interpretations are found. Rodriguez-Esteban pointed out that full article texts are the only source for certain crucial information, such as experimental measurements [11]. Recent research initiatives in bioinformatics, such as TREC Genomics (http://ir.ohsu.edu/ge-nomics/) and BioCreAtIvE (http://www.biocreative.org/), also recommend migrating text analysis from abstracts to full texts [12]. Intuitively, consulting only abstracts is inadequate for judicious analysis of biomedical articles, and can even lead to inappropriate clinical decisions [13]–[15]. In the Genomics2005 data collected in [9], the mean number of documents in the 100 datasets was 690.7 and the average number of unique words was 2214.4. By contrast, in our experiments using a BioMed dataset containing 600 texts, the unique word count was 54367, approximately 24 times greater than the count for Genomics2005. 
With the addition of these terms, it is expected that biomedical text mining will provide more relevant information and more complete analysis. Unfortunately, access to the full text and citations of biomedical papers remains limited, and full text mining, which must contend with large amounts of information noise, is much more complex than abstract text mining. While the number of non-redundant words (terms) in a biomedical abstract is typically less than 100, that of a full text is often much greater than 1000. Since most mining frameworks employ the vector space model (VSM) [16], which treats a document as a bag of words and uses plain language terms as features, the dimensionality of a full text corpus can be several times greater than that of abstract data. Clustering is one of the most popular techniques for data analysis in many disciplines [17]. The basic strength of text clustering is its capacity to automatically organize texts into meaningful groups. As such, it has been applied in a number of ways, including cluster-based retrieval [18], key sentence extraction [19], and concept discovery in molecular biology [7]. Traditional clustering also forms an important component of unsupervised learning algorithms, which, along with various supervised techniques, play an important role in biomedical research, including studies of cancer [20], Bayesian analysis of HIV drug resistance [21], and linear regression frameworks for motif finding [22]. In supervised text mining algorithms, a training process is first performed on a large set of known predefined topics and text labels. This generally results in better topic detection [23] than unsupervised approaches, for which there is no prior training step. Supervised and unsupervised learning algorithms can also work together to improve learning processes or handle more complex problems, such as phosphorylation site prediction [24]. 
However, training set labeling is in general very costly, and labeled training sets are frequently unavailable in practical scenarios. In recent years, semi-supervised clustering approaches have caught the attention of the machine learning community. These approaches make use of a smaller and more easily obtained set of labeled samples to guide the clustering strategy. They have been applied in many domains, including text clustering [25], gene expression analysis [26], and image processing [27]. However, to the best of our knowledge, semi-supervised learning has not yet been applied to full text mining of real biomedical publications. In this paper, the Semi-supervised Affinity Propagation (SSAP) method of text clustering is proposed. The method is then applied to a real biomedical text database, the BioMed Central open-access full-text corpus, and its clustering performance is compared with that of two classical clustering algorithms. Finally, a directed relationship network and a cluster distribution matrix are constructed from the SSAP clustering results, and these are used to reveal publication interests among the top 10 journals. The source code and datasets used in this paper are available in the Supporting Information.
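
As a rough illustration of the vector space model discussed in the Introduction, the following sketch builds a shared vocabulary for three toy documents and measures how sparse the resulting count vectors are. The tokenizer, the function name build_vsm and the example sentences are assumptions made for illustration only; they are not taken from the paper's source code.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vsm(documents):
    """Return the shared vocabulary and one dense count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    # Every document is expressed over the FULL vocabulary, so most entries are 0.
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


# Hypothetical toy corpus.
docs = [
    "gene expression in tumor cells",
    "malaria parasite gene sequencing",
    "public health survey of malaria cases",
]
vocabulary, vectors = build_vsm(docs)
zeros = sum(v.count(0) for v in vectors)
sparsity = zeros / (len(vectors) * len(vocabulary))
print(len(vocabulary), round(sparsity, 2))
```

Even in this tiny corpus most vector entries are zero. With tens of thousands of unique terms, as in full texts, the shared space becomes far larger and sparser.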

Materials and Methods

Datasets

The dataset used in the experiments was the BioMed Central open-access corpus (BioMed), which can be downloaded at http://www.biomedcentral.com/about/datamining. From the BioMed corpus, 110,369 articles published between January and June 2012 were downloaded. Each BioMed file is a well-formatted XML document containing various tags, such as <bdy> and <issn>. Because many articles contain only XML tags and abstracts, files smaller than 4 KB were removed in the pre-processing phase. The remaining files were then divided into different ‘topics’ according to their journal names, and the title, abstract and plain text were extracted. Finally, the 10 topics with the most papers (from 1966 to 5022 each) were selected as the test corpus. Detailed information on these 10 topics is listed in Table 1. In our experiments, to evaluate the performance of the different approaches across different dataset sizes, we randomly select sub-datasets at 5 scales (400, 500, 600, 700 and 800 documents) for comparison. 
For each scale, the random selection is repeated 5 times and the results are averaged to avoid accidental results.
<div class="xtable"><div class="fig"><b>Table 1</b><p><span><b>Top 10 topics in the BioMed corpus.</b></span></p><table xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" frame="hsides" rules="groups"><colgroup span="1"><col align="left" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"></colgroup><thead><tr><td align="left" rowspan="1" colspan="1">Serial number</td><td align="left" rowspan="1" colspan="1">Journal name</td><td align="left" rowspan="1" colspan="1">Abbreviated name</td><td align="left" rowspan="1" colspan="1">Number of documents</td></tr></thead><tbody>
<tr><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">BMC Bioinformatics</td><td align="left" rowspan="1" colspan="1">BMC Bioinformatics</td><td align="left" rowspan="1" colspan="1">5022</td></tr>
<tr><td align="left" rowspan="1" colspan="1">2</td><td align="left" rowspan="1" colspan="1">BMC Genomics</td><td align="left" rowspan="1" colspan="1">BMC Genomics</td><td align="left" rowspan="1" colspan="1">4121</td></tr>
<tr><td align="left" rowspan="1" colspan="1">3</td><td align="left" rowspan="1" colspan="1">BMC Public Health</td><td align="left" rowspan="1" colspan="1">BMC Public Health</td><td align="left" rowspan="1" colspan="1">3758</td></tr>
<tr><td align="left" rowspan="1" colspan="1">4</td><td align="left" rowspan="1" colspan="1">BMC Cancer</td><td align="left" rowspan="1" colspan="1">BMC Cancer</td><td align="left" rowspan="1" colspan="1">3025</td></tr>
<tr><td align="left" rowspan="1" colspan="1">5</td><td align="left" rowspan="1" colspan="1">Journal of Cardiovascular Magnetic Resonance</td><td align="left" rowspan="1" colspan="1">J Cardiovas Magn R</td><td align="left" rowspan="1" colspan="1">2538</td></tr>
<tr><td align="left" rowspan="1" colspan="1">6</td><td align="left" rowspan="1" colspan="1">Retrovirology</td><td align="left" rowspan="1" colspan="1">Retrovirology</td><td align="left" rowspan="1" colspan="1">2478</td></tr>
<tr><td align="left" rowspan="1" colspan="1">7</td><td align="left" rowspan="1" colspan="1">BMC Neuroscience</td><td align="left" rowspan="1" colspan="1">BMC Neuroscience</td><td align="left" rowspan="1" colspan="1">2454</td></tr>
<tr><td align="left" rowspan="1" colspan="1">8</td><td align="left" rowspan="1" colspan="1">BMC Evolutionary Biology</td><td align="left" rowspan="1" colspan="1">BMC Evo Biol</td><td align="left" rowspan="1" colspan="1">1973</td></tr>
<tr><td align="left" rowspan="1" colspan="1">9</td><td align="left" rowspan="1" colspan="1">Malaria Journal</td><td align="left" rowspan="1" colspan="1">Malaria J</td><td align="left" rowspan="1" colspan="1">1968</td></tr>
<tr><td align="left" rowspan="1" colspan="1">10</td><td align="left" rowspan="1" colspan="1">Journal of Medical Case Reports</td><td align="left" rowspan="1" colspan="1">J Med Case Rep</td><td align="left" rowspan="1" colspan="1">1966</td></tr>
</tbody></table></div></div>
<pxy><span>Before beginning the clustering process, all journal-name-related information was eliminated from the target texts. After clustering, this information was used to evaluate the clustering performance of each algorithm. Stop words were removed according to the list at http://norm.al/2009/04/14/list-of-english-stop-words/. 
Each document is then represented by a set of two-tuples, and the whole dataset has the form $D = \{d_1, d_2, \dots, d_N\}$, where $N$ is the number of documents in the dataset, $d_i = \{\langle f_1, n_1\rangle, \langle f_2, n_2\rangle, \dots, \langle f_M, n_M\rangle\}$ denotes the $i$th document, $M$ is the number of term features (unique words or phrases) in the $i$th document, and $f_k$ and $n_k$ represent the $k$th term in the $i$th document and its normalized frequency, respectively. The normalized frequency is computed by [equation missing in source], where $\mathrm{Count}_k$ is the number of occurrences of the $k$th term in the $i$th document.</span></pxy>
<h2><span>The Semi-Supervised Affinity Propagation Method</span></h2>
<pxy><span>Based on the state-of-the-art unsupervised Affinity Propagation clustering algorithm [19], the SSAP method is proposed as a means of addressing the complexity problems posed by full text clustering of biomedical publications. In conventional vector space model (VSM) based methods, each document is represented in the feature space constructed from all the unique words or phrases of all the documents. The similarity between any pair of documents can therefore be computed with any similarity measure defined on this common space. 
However, this space is high-dimensional and the document representation is severely sparse. To avoid the large dimensions and sparse matrix computation, the proposed method represents each document in a tiny sub-space of the VSM when computing its similarity to another document. The sub-space is spanned only by the features of the two documents being compared, and is therefore much smaller than the original vector space. If the dataset includes thousands of words or phrases, the sub-space restriction reduces computational complexity significantly. The classical similarity measure in text clustering, the Cosine Coefficient, is used. To achieve even better performance, a semi-supervision strategy that makes use of known information is introduced. The detailed steps of the SSAP method are as follows:</span></pxy>
<pxy><span>a. Initialization of dataset: Let the dataset $D = \{d_1, d_2, \dots, d_N\}$ be a set of $N$ ($N>0$) elements, where each element consists of a sequence of two-tuples;</span></pxy>
<pxy><span>b. Seed Construction: Add a small number of initially labeled objects, in two-tuple form, to the dataset D to obtain a new dataset D′ containing N′ elements (N′≥N).</span></pxy>
<pxy><span>c. Similarity Computation: Compute the similarities among objects in D′ using the Cosine Coefficient similarity metric [equation missing in source], where s(i, j) is the similarity between the ith and jth documents in D′, and |d| is the number of unique terms in document d.</span></pxy>
<pxy><span>d. Self-Similarity Computation: Compute the self-similarity s(l, l) of each object in D′ using [equation missing in source], where i≠j.</span></pxy>
<pxy><span>e. Initialization of availability matrix: a(i, j) = 0 (i, j = 1, 2, …, N′).</span></pxy>
<pxy><span>f. Message Matrix Computation: Compute the message matrices using: [equations missing in source]</span></pxy>
<pxy><span>g. Exemplar Selection: Add the two message matrices and search for the exemplar of each object i, defined as the j that maximizes r(i, j)+a(i, j).</span></pxy>
<pxy><span>h. Updating Messages, using: [equations missing in source]</span></pxy>
<pxy><span>i. Iterate Steps f to h for a fixed number of iterations or until the exemplar selection remains constant for some number of iterations.</span></pxy>
<pxy><span>j. Merge small clusters with the same labels into larger clusters to produce the clustering output.</span></pxy>
<h2><span>Comparison Methods</span></h2>
<pxy><span>SSAP is based on the Affinity Propagation (AP) method, which was proposed by Frey and Dueck in Science in 2007 [19] and is regarded as a convincing clustering algorithm in a range of applications. To investigate the benefit of semi-supervised learning in SSAP, AP is used as a comparison method in the experiments. 
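</span></pxy>
<pxy><span>To make the sub-space Cosine Coefficient and the AP message passing that SSAP builds on more concrete, here is a minimal sketch. It rests on several assumptions that are not confirmed by the text: the normalized frequency is taken to be the term count divided by the document length, the self-similarity (preference) is set to the median off-diagonal similarity, and the updates are the standard damped responsibility and availability rules of Frey and Dueck. Seed construction and the final merging of clusters with the same labels are omitted, and all function names are hypothetical.</span></pxy>

```python
import math
from statistics import median


def to_two_tuples(tokens):
    """Map a token list to {term: normalized frequency}.

    Assumption: frequency = term count / total tokens in the document.
    """
    counts = {}
    for term in tokens:
        counts[term] = counts.get(term, 0) + 1
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}


def cosine_subspace(d_i, d_j):
    """Cosine Coefficient using only the terms of the two documents."""
    dot = sum(d_i[t] * d_j[t] for t in d_i.keys() & d_j.keys())
    norm_i = math.sqrt(sum(v * v for v in d_i.values()))
    norm_j = math.sqrt(sum(v * v for v in d_j.values()))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def similarity_matrix(docs):
    """Pairwise similarities; the diagonal holds the preference
    (assumed here: median off-diagonal similarity)."""
    n = len(docs)
    s = [[cosine_subspace(docs[i], docs[j]) for j in range(n)] for i in range(n)]
    pref = median(s[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        s[k][k] = pref
    return s


def affinity_propagation(s, damping=0.5, max_iter=200, stable_iters=15):
    """Standard damped AP on a full similarity matrix (needs n >= 2)."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    last, stable = None, 0
    for _ in range(max_iter):
        for i in range(n):
            vals = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = s[i][k] - (second if k == best else first)
                r[i][k] = damping * r[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # sum over i' != k
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
        # Exemplar of i: the k that maximizes r(i, k) + a(i, k).
        exemplars = [max(range(n), key=lambda k: r[i][k] + a[i][k])
                     for i in range(n)]
        stable = stable + 1 if exemplars == last else 0
        last = exemplars
        if stable >= stable_iters:
            break
    return last


# Hypothetical toy corpus: two well-separated topics.
corpus = [
    "gene genome", "gene genome expression", "genome expression",
    "malaria parasite", "malaria parasite mosquito", "parasite mosquito",
]
docs = [to_two_tuples(text.split()) for text in corpus]
exemplars = affinity_propagation(similarity_matrix(docs))
print(exemplars)
```

<pxy><span>Each similarity touches only the terms of the two documents involved, so no corpus-wide sparse matrix is ever built. The dense lists used for the messages are only practical for small collections like this one.</span></pxy>
<pxy><span>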
K-means was first proposed by MacQueen [28] and has been recognized as one of the top 10 algorithms in data mining [29]. In particular, it has proven to be a simple but very reliable method for document clustering [30], [31]. 
Therefore, it is also used as a baseline for comparison.</span></pxy>
<h2><span>Evaluation</span></h2>
<pxy><span>To evaluate the performance of the three clustering methods (k-means, Affinity Propagation and SSAP), their respective values for F-measure, entropy, and CPU execution time are compared. F-measure is the most commonly used evaluation measure, particularly for information retrieval, entity recognition, and information extraction. The global F-measure for the entire clustering result is defined as [equation missing in source], where $N$ is the total number of documents in the dataset, $N_{gh}$ is the number of objects of class $h$ in cluster $g$, $N_g$ is the number of objects in cluster $g$, and $N_h$ is the number of objects of class $h$.</span></pxy>
<pxy><span>In contrast to F-measure, entropy provides a measure of the homogeneity or purity of a cluster. The total entropy for a set of clusters is calculated as [equation missing in source], where $G$ is the total number of clusters and $H$ is the number of predefined classes. 
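</span></pxy>
<pxy><span>The two measures can be sketched in code under commonly used definitions, which are assumed here because the defining equations are not available in this copy. The global F-measure is taken as the class-size-weighted best F score of each class over all clusters. The total entropy is taken as the cluster-size-weighted entropy of the class distribution within each cluster, normalized by log H so that it lies between 0 and 1. The function names are hypothetical.</span></pxy>

```python
import math
from collections import Counter


def f_measure(clusters, classes):
    """Global F-measure: sum over classes h of (N_h / N) * max_g F(g, h)."""
    n = len(classes)
    n_g = Counter(clusters)                 # cluster sizes N_g
    n_h = Counter(classes)                  # class sizes N_h
    n_gh = Counter(zip(clusters, classes))  # overlaps N_gh
    total = 0.0
    for h, size_h in n_h.items():
        best = 0.0
        for g, size_g in n_g.items():
            overlap = n_gh[(g, h)]
            if overlap:
                precision = overlap / size_g
                recall = overlap / size_h
                best = max(best, 2 * precision * recall / (precision + recall))
        total += size_h / n * best
    return total


def entropy(clusters, classes):
    """Cluster-size-weighted entropy, normalized by log(H) (an assumption)."""
    n = len(classes)
    n_g = Counter(clusters)
    n_gh = Counter(zip(clusters, classes))
    n_classes = len(set(classes))
    norm = math.log(n_classes) if n_classes > 1 else 1.0
    total = 0.0
    for (g, _h), overlap in n_gh.items():
        p = overlap / n_g[g]
        total -= n_g[g] / n * p * math.log(p) / norm
    return total


# Hypothetical example: cluster 0 mixes two classes, cluster 1 is pure.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "b"]
print(round(f_measure(clusters, classes), 3), round(entropy(clusters, classes), 3))
```

<pxy><span>A perfect clustering gives an F-measure of 1 and an entropy of 0 under these definitions.</span></pxy>
<pxy><span>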
The smaller the entropy, the better the clustering performance.</span></pxy>
<pxy><span>The generated clusters are compared with the set of journal-based topic categories in BioMed. To check effectiveness across different collection sizes, all three algorithms are applied to sub-datasets at 5 scales from 400 to 800 documents, and for each scale the sub-datasets are randomly selected 5 times for averaging. All topics in the datasets had discrete uniform distributions, and all experiments were run on a PC equipped with an Intel(R) Xeon(R) CPU X3430 @ 2.40 GHz and 2 GB of RAM, with no parallel computing.</span></pxy>
<h1>Results</h1>
<pxy><span>Figures 1, 2, and 3 compare the F-measure, entropy and CPU execution time measurements, respectively, for the three algorithms. From Figure 1 and the summary results in Table 2, it can be seen that the mean F-measure value of SSAP is 0.674, an improvement of 0.290 (75.4%) over k-means. 
Similarly, AP achieves a mean F-measure of 0.650, an improvement of 0.266 (69.1%) over k-means.</span></pxy>
<div class="fig"><img src="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4177555/bin/pone.0108847.g001.jpg" style="width:99%;" /><b>Figure 1</b><p><span><b>F-measure comparison.</b></span></p><p><span>K-means: k-means clustering; AP: Affinity Propagation clustering; SSAP: Semi-supervised Affinity Propagation.</span></p></div>
<div class="fig"><img src="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4177555/bin/pone.0108847.g002.jpg" style="width:99%;" /><b>Figure 2</b><p><span><b>Entropy comparison.</b></span></p><p><span>K-means: k-means clustering; AP: Affinity Propagation clustering; SSAP: Semi-supervised Affinity Propagation.</span></p></div>
<div class="fig"><img src="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4177555/bin/pone.0108847.g003.jpg" style="width:99%;" /><b>Figure 3</b><p><span><b>CPU execution time comparison.</b></span></p><p><span>K-means: k-means clustering; AP: Affinity Propagation clustering; SSAP: Semi-supervised Affinity Propagation.</span></p></div>
<div class="xtable"><div class="fig"><b>Table 2</b><p><span><b>The mean values over all experiments.</b></span></p><table xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" frame="hsides" rules="groups"><colgroup span="1"><col align="left" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"></colgroup><thead><tr><td align="left" rowspan="1" colspan="1">Method</td><td align="left" rowspan="1" colspan="1">Mean F-measure</td><td align="left" rowspan="1" colspan="1">Mean entropy</td><td align="left" rowspan="1" colspan="1">Mean CPU execution time (min)</td></tr></thead><tbody>
<tr><td align="left" rowspan="1" colspan="1">SSAP</td><td align="left" rowspan="1" colspan="1"> <bold>0.674</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>0.429</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>498.9</bold> </td></tr>
<tr><td align="left" rowspan="1" colspan="1">AP</td><td align="left" rowspan="1" colspan="1">0.650</td><td align="left" rowspan="1" colspan="1"> <bold>0.429</bold> </td><td align="left" rowspan="1" colspan="1">539.6</td></tr>
<tr><td align="left" rowspan="1" colspan="1">k-means</td><td align="left" rowspan="1" colspan="1">0.384</td><td align="left" rowspan="1" colspan="1">0.721</td><td align="left" rowspan="1" colspan="1">2866.8</td></tr>
</tbody></table></div></div>
<pxy><span>Figure 2 and Table 2 show the entropy values of the three methods: k-means has the highest entropy, while SSAP and AP both achieve around 40.0% lower entropy than k-means.</span></pxy>
<pxy><span>From Figure 3, it is clear that the CPU execution times for AP and SSAP are much lower than that of k-means, for equivalent performance results. Moreover, the gap widens sharply as the dataset size increases. For example, k-means took 2.4 times longer to execute than SSAP for the 400-document dataset, and 8.2 times longer for the 800-document dataset. This result agrees with the conclusions of other studies, such as Frey and Dueck's papers in Science [19], [32]. Furthermore, even after 10000 runs of 400-document clustering, k-means could only achieve an F-measure of 0.384 and an entropy of 0.721, well below the performance of SSAP. To obtain the best result, k-means would need to execute all possible solutions, the equivalent of about $C_{400}^{10} \approx 2.57\times10^{19}$ runs for 400 documents [33].</span></pxy>
<pxy><span>In addition, the CPU run time comparison in Figure 3 and the summary results in Table 2 indicate that SSAP is about 7.5% faster than AP. This is because, with the added labeled texts and the semi-supervised strategy, unlabeled texts find their clusters more easily and AP converges more quickly. Moreover, SSAP performs better than AP (by 3.8%) with respect to F-measure and is similar to AP in average entropy.</span></pxy>
<pxy><span>For the 5 data sizes (400, 500, 600, 700 and 800 texts), the adjustable factor φ was 0.5∼1 for AP and 3 for SSAP. The parameter k in k-means was set to 10 for all experiments, because the documents are expected to be clustered into 10 classes corresponding to the top 10 journals. It should be noted that several methods have been developed to select or construct seeds in semi-supervised methods, which could improve performance when more complicated prior knowledge is available. However, the seed construction strategy is not the focus of this paper and is not discussed here. 
In the experiments, we simply select 4 documents at random from each cluster to guide the seed construction in the <span class="Chemical">SSAP</span> algorithm.</span></pxy> <h2><span>Relationship Network of Publications </span></h2> <pxy><span>Based on the <span class="Chemical">SSAP</span> clustering results, an interesting application of our method is demonstrated. Figure 4 describes the relationships among the top 10 biomedical journals, which together contribute 400 articles. Within the graph, each node represents a biomedical journal. A directed edge from journal A to journal B exists if at least two articles of journal A are clustered into the class dominated by journal B, and the weight of the edge is the number of such articles.</span></pxy> <div class="fig"><img src="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4177555/bin/pone.0108847.g004.jpg" style="width:99%;" /><b>Figure 4</b><p><span><b>Directed relationship network based on SSAP clustering of BioMed journals.</b></span></p><p><span>Each node indicates a biomedical journal.</span></p></div> <pxy><span>In Figure 4, it can be seen that most of these journals share topics with other journals. A detailed analysis of the graph shows that “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>” has an out-degree of eight, which means that it has outward connections with all of the other journals except the “Journal of Cardiovascular Magnetic Resonance”. By contrast, the “Journal of Medical Case Reports” has an out-degree of zero. The median out-degree of the 10 journals in the network is 3; “<span class="Chemical">BMC</span> Bioinformatics”, “<span class="Chemical">BMC</span> Public Health”, “<span class="Disease">Malaria Journal</span>” and “Retrovirology” all have this value. The out-degrees of “<span class="Chemical">BMC</span> Genomics” and “Journal of Cardiovascular Magnetic Resonance” are 4 and 2, respectively.</span></pxy> <pxy><span>Table 3 gives the detailed parameter information of the relationship network. It can be seen that the clustering coefficient is 0.36, which is larger than 0, the value that most random networks tend toward, but also much smaller than 1, which is a typical feature of ring-lattice networks. 
It can also be noted that the characteristic path length is 1.89, which means that these nodes are well connected. Based on these two parameters, it can be concluded that the relationship network resembles a small-world network [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="34. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">34</a>].</span></pxy> <div class="xtable"><div class="fig"><b>Table 3</b><p><span><b>Parameter analysis of the directed relationship network for SSAP clustering of BioMed journals.</b></span></p><table xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" frame="hsides" rules="groups"><colgroup span="1"><col align="left" span="1"><col align="center" span="1"></colgroup><thead><tr><td align="left" rowspan="1" colspan="1">Number of nodes: 10</td><td align="left" rowspan="1" colspan="1">Number of inter edges: 27</td></tr></thead><tbody><tr><td align="left" rowspan="1" colspan="1"> <bold>Clustering coefficient: 0.36</bold> </td><td align="left" rowspan="1" colspan="1">Connected components: 1</td></tr><tr><td align="left" rowspan="1" colspan="1">Network radius: 2</td><td align="left" rowspan="1" colspan="1">Network diameter: 5</td></tr><tr><td align="left" rowspan="1" colspan="1">Shortest paths: 73</td><td align="left" rowspan="1" colspan="1">Network density: 0</td></tr><tr><td align="left" rowspan="1" colspan="1"> 
<bold>Characteristic path length: 1.89</bold> </td><td align="left" rowspan="1" colspan="1">Avg. number of neighbors: 4.4</td></tr></tbody></table></div></div><pxy><span>Table 4 shows the cluster distribution matrix. Each entry on the principal diagonal is the number of articles whose cluster matches their journal in the BioMed corpus; these articles form the main part of each cluster. Homologous texts are articles that belong to other journals but are assigned to the given cluster. For example, cluster 1 contains 19 homologous texts: nine articles from “<span class="Chemical">BMC</span> Evolutionary Biology”, seven from “<span class="Chemical">BMC</span> Genomics”, and three from “<span class="Chemical">BMC</span> Neuroscience”.</span></pxy> <div class="xtable"><div class="fig"><b>Table 4</b><p><span><b>Cluster distribution matrix.</b></span></p><table xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" frame="hsides" rules="groups"><colgroup span="1"><col align="left" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"><col align="center" span="1"></colgroup><thead><tr><td align="left" rowspan="1" colspan="1">Cluster</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">2</td><td align="left" 
rowspan="1" colspan="1">3</td><td align="left" rowspan="1" colspan="1">4</td><td align="left" rowspan="1" colspan="1">5</td><td align="left" rowspan="1" colspan="1">6</td><td align="left" rowspan="1" colspan="1">7</td><td align="left" rowspan="1" colspan="1">8</td><td align="left" rowspan="1" colspan="1">9</td><td align="left" rowspan="1" colspan="1">10</td><td align="left" rowspan="1" colspan="1">Amount</td></tr></thead><tbody><tr><td align="left" rowspan="1" colspan="1">BMC Bioinformatics</td><td align="left" rowspan="1" colspan="1"> <bold>34</bold> </td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">3</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">2</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">BMC Evo Biol</td><td align="left" rowspan="1" colspan="1">9</td><td align="left" rowspan="1" colspan="1"> <bold>16</bold> </td><td align="left" rowspan="1" colspan="1">3</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">5</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">6</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">BMC Genomics</td><td align="left" rowspan="1" colspan="1">7</td><td align="left" rowspan="1" colspan="1">12</td><td align="left" rowspan="1" colspan="1"> <bold>7</bold> </td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">9</td><td align="left" rowspan="1" 
colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">4</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">BMC Neuroscience</td><td align="left" rowspan="1" colspan="1">3</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1"> <bold>15</bold> </td><td align="left" rowspan="1" colspan="1"> <italic><underline>11</underline></italic> </td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">5</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">5</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">BMC Cancer</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1"> <italic><underline>8</underline></italic> </td><td align="left" rowspan="1" colspan="1"> <bold>21</bold> </td><td align="left" rowspan="1" colspan="1">6</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">5</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">BMC Public Health</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">2</td><td align="left" rowspan="1" colspan="1"> <bold>31</bold> </td><td align="left" rowspan="1" colspan="1"> <italic><underline>6</underline></italic> 
</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">Malaria J</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">2</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">2</td><td align="left" rowspan="1" colspan="1"> <italic><underline>8</underline></italic> </td><td align="left" rowspan="1" colspan="1"> <bold>25</bold> </td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">J Cardiovas Magn R</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1"> <bold>38</bold> </td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">Retrovirology</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">2</td><td align="left" rowspan="1" colspan="1">6</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1"> <bold>30</bold> </td><td align="left" rowspan="1" colspan="1">0</td><td 
align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">J Med Case Rep</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">16</td><td align="left" rowspan="1" colspan="1">1</td><td align="left" rowspan="1" colspan="1">0</td><td align="left" rowspan="1" colspan="1">6</td><td align="left" rowspan="1" colspan="1">3</td><td align="left" rowspan="1" colspan="1"> <bold>14</bold> </td><td align="left" rowspan="1" colspan="1">40</td></tr><tr><td align="left" rowspan="1" colspan="1">Amount</td><td align="left" rowspan="1" colspan="1">53</td><td align="left" rowspan="1" colspan="1">31</td><td align="left" rowspan="1" colspan="1">16</td><td align="left" rowspan="1" colspan="1">25</td><td align="left" rowspan="1" colspan="1">70</td><td align="left" rowspan="1" colspan="1">53</td><td align="left" rowspan="1" colspan="1">44</td><td align="left" rowspan="1" colspan="1">51</td><td align="left" rowspan="1" colspan="1">43</td><td align="left" rowspan="1" colspan="1">14</td><td align="left" rowspan="1" colspan="1">400</td></tr><tr><td align="left" rowspan="1" colspan="1">Homologous texts</td><td align="left" rowspan="1" colspan="1"> <bold>19</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>15</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>9</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>10</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>49</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>22</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>19</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>13</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>13</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>0</bold> </td><td align="left" rowspan="1" colspan="1"> <bold>169</bold> 
</td></tr></tbody></table></div></div><h1>Discussions</h1> <h2><span>Discussions on Full Text Clustering Results </span></h2> <pxy><span>The full text clustering experiments show that <span class="Chemical">SSAP</span> is superior to the classical k-means and AP algorithms for full text biomedical literature clustering. With the help of local similarity computation and semi-supervision, <span class="Chemical">SSAP</span> greatly improves clustering performance relative to k-means. When AP clustering replaces k-means clustering, both AP-based algorithms obtain higher F-measures and lower entropies. This is because the Cosine Coefficient similarity captures both each document's own information and a portion of the mutual information between the two vectors, which would be omitted by Euclidean distance. By introducing the semi-supervised strategy, the <span class="Chemical">SSAP</span> algorithm outperforms the unsupervised AP algorithm, achieving a higher F-measure than AP while maintaining similar computation times. In addition, because the k-means method treats each document as a 54367-dimensional vector (i.e. 54367 unique words in 600 texts) under Euclidean distance, the problem is mapped onto a large sparse matrix, which dramatically increases the computation time. By contrast, <span class="Chemical">SSAP</span> and AP treat each document as a vector in a much smaller space (i.e. 
at most 5274 dimensions), which allows them to execute much more quickly.</span></pxy> <h2><span>Discussions on Relationship Network Analysis </span></h2> <pxy><span>In relationship network analysis, it is assumed that manuscripts in the same journal share similar topics and are likely to be clustered into the same class. However, similar papers may also exist in different journals. Since the selected dataset is uniformly distributed across journals, an edge between two journals in the relationship network can reveal similarity in their publishing scopes and strategies. For example, the scope of “<span class="Chemical">BMC</span> Bioinformatics” is mostly defined by “computational and statistical methods for the modeling and analysis of all kinds of biological data” [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="35. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">35</a>], and it can be seen that the journal has strong relationships with most other <span class="Chemical">BMC</span> series journals, including out-degrees for “<span class="Chemical">BMC</span> Genomics”, “<span class="Chemical">BMC</span> Neuroscience”, and “<span class="Chemical">BMC</span> Evolutionary Biology”, as well as an in-degree with “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>”. 
This is because these journals belong to the same publisher and “<span class="Chemical">BMC</span> Bioinformatics” defines a broad publishing scope, including all computing-related models for all kinds of biological data. Moreover, with <span class="Disease">cancer</span> research attracting so much attention and covering such diverse biomedical topics, the papers published by “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>” span many different research areas, including tissue analysis, diagnosis, and treatment of <span class="Disease">tumors</span>. This explains why “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>” forms a central hub connected to most of the other journals. By contrast, the “Journal of Cardiovascular Magnetic Resonance” and “Journal of Medical Case Reports” are more focused on specific fields (e.g., “magnetic resonance methods applied to the cardiovascular system” [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="36. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">36</a>], or “case report that expands the field of general medical knowledge, and original research relating to case reports” [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="37. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">37</a>]). 
As a result, these journals have fewer in-degrees and fewer out-degrees.</span></pxy> <pxy><span>The cluster distribution matrix is an intuitive and informative tool for analysis. From it, the following can be noted:</span></pxy> <pxy><span>Research area or publishing strategy relationships. “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>” has the most homologous texts (49 texts), while “Journal of Medical Case Reports” has the fewest (0 texts). This may be due to the latter's relatively narrow research area (medical case study), which excludes the majority of biology manuscripts. It also indicates that the editors and reviewers of this journal concentrate on particular medical cases.</span></pxy> <pxy><span>Publishing scope relationships. The wider the publishing scope, the more homologous texts can be found. For example, the number of homologous texts for “<span class="Chemical">BMC</span> Bioinformatics” is greater than that of “<span class="Chemical">BMC</span> Evolutionary Biology,” which is in turn greater than that of “<span class="Chemical">BMC</span> Genomics”. 
This finding <span class="Disease">fits</span> with the scope of these journals' research areas: “<span class="Chemical">BMC</span> Bioinformatics” focuses on “the development, testing, and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology” [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="35. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">35</a>]; “<span class="Chemical">BMC</span> Evolutionary Biology” focuses on “molecular and nonmolecular evolution of all organisms, as well as phylogenetics and palaeontology” [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="38. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">38</a>]; and “<span class="Chemical">BMC</span> Genomics” focuses on “genome-scale analysis, functional genomics, and proteomics” [<a title="" data-container="body" data-toggle="popover" data-placement="right" data-html="true" data-trigger="hover click" data-content="39. . . <i></i>. <a target='_blank' style='cursor:pointer;' href='si.php?db=pubmed&id='><span class='glyphicon glyphicon-share-alt'></span></a>">39</a>]. 
A publishing scope encompassing “all kinds of biological data, as well as other areas of computational biology” is clearly larger than one encompassing just “molecular and nonmolecular evolution” or “genome-scale analysis”.</span></pxy> <pxy><span>Mutual cross-relationships. The mutual cross-relationship between two journals comprises the weights of both directed edges between the two nodes in the journal relationship network (Figure 4), or equivalently, the two off-diagonal elements shared by these two journals in the cluster distribution matrix (Table 4). It reflects the similarity in publishing scope between the two journals. The mutual cross-relationships between “<span class="Chemical">BMC</span> Public Health” and “<span class="Disease">Malaria Journal</span>” and between “<span class="Chemical">BMC</span> Neuroscience” and “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>” are considerably larger than the others (see the italicized and underlined numbers in Table 4), which means that each journal has homologous texts in its partner's cluster and suggests that the publishing scopes of the two journals overlap. For example, it is well known that <span class="Disease">malaria</span> is one of the epidemic diseases that threaten <span class="Species">human</span> public health, so it is easy to understand why “<span class="Chemical">BMC</span> Public Health” and “<span class="Disease">Malaria Journal</span>” publish some similar papers. 
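</span></pxy> <pxy><span>The quantities discussed in this section can be recomputed directly from Table 4. The following Python sketch is not part of the original study: the variable and function names are hypothetical, the matrix is copied from Table 4, the edge threshold of two articles follows the definition given for Figure 4, and scoring a journal pair by the smaller of its two directed counts is an assumption made here so that a pair only ranks highly when both directions are substantial.</span></pxy>

```python
# Cluster distribution matrix copied from Table 4.
# Rows are journals, columns are clusters 1..10, in the same order.
JOURNALS = [
    "BMC Bioinformatics", "BMC Evo Biol", "BMC Genomics", "BMC Neuroscience",
    "BMC Cancer", "BMC Public Health", "Malaria J", "J Cardiovas Magn R",
    "Retrovirology", "J Med Case Rep",
]
MATRIX = [
    [34, 1, 3, 0, 2, 0, 0, 0, 0, 0],
    [9, 16, 3, 0, 5, 0, 6, 1, 0, 0],
    [7, 12, 7, 0, 9, 0, 1, 0, 4, 0],
    [3, 0, 1, 15, 11, 0, 5, 0, 5, 0],
    [0, 0, 0, 8, 21, 6, 0, 5, 0, 0],
    [0, 0, 0, 0, 2, 31, 6, 1, 0, 0],
    [0, 1, 2, 1, 2, 8, 25, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 38, 0, 0],
    [0, 1, 0, 1, 2, 6, 0, 0, 30, 0],
    [0, 0, 0, 0, 16, 1, 0, 6, 3, 14],
]

MIN_ARTICLES = 2  # edge threshold stated in the text for Figure 4


def directed_edges(matrix, threshold=MIN_ARTICLES):
    """Return (source, target, weight) for each off-diagonal count at or above threshold."""
    n = len(matrix)
    return [
        (i, j, matrix[i][j])
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] >= threshold
    ]


def homologous_counts(matrix):
    """Articles from other journals in each cluster: column sum minus the diagonal entry."""
    n = len(matrix)
    return [sum(matrix[i][j] for i in range(n)) - matrix[j][j] for j in range(n)]


def mutual_pairs(matrix):
    """Rank journal pairs by the smaller of their two directed counts (assumed score)."""
    n = len(matrix)
    pairs = [
        (min(matrix[i][j], matrix[j][i]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sorted(pairs, reverse=True)


edges = directed_edges(MATRIX)
homologous = homologous_counts(MATRIX)
top_two = mutual_pairs(MATRIX)[:2]

print("edges:", len(edges))
print("homologous texts per cluster:", homologous)
for score, i, j in top_two:
    print(JOURNALS[i], "and", JOURNALS[j], "score", score)
```

<pxy><span>Under these assumptions, the sketch yields 27 edges (the edge count reported in Table 3), reproduces the “Homologous texts” row of Table 4, and ranks “BMC Neuroscience” with “BMC Cancer” and “BMC Public Health” with “Malaria Journal” as the two strongest mutual pairs.</span></pxy> <pxy><span>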
For the second pair of journals, <span class="Disease">cancers of the brain and nervous system</span> (a subcategory of <span class="Disease">cancer</span>) are the second most common type of childhood <span class="Disease">cancer</span>, which helps explain why the mutual cross-relationship between “<span class="Chemical">BMC</span> Neuroscience” and “<span class="Chemical">BMC</span> <span class="Disease">Cancer</span>” is the strongest among all the journals.</span></pxy> <h1>Conclusions</h1> <pxy><span>In this paper, we addressed the difficulties of full text clustering of biomedical publications by proposing a new method and algorithm known as <span class="Chemical">SSAP</span>. To limit the dimensionality, which grows rapidly with the number of target texts, the new method replaces the entire Euclidean space used by classic algorithms with the sub-space spanned by each pair of vectors. To handle nonmetric similarities, the proposed algorithm employs Affinity Propagation clustering, the performance of which is further improved by a small sample of labels. 
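</span></pxy> <pxy><span>As an illustration of the pair-wise sub-space idea, the similarity of two documents can be computed from only the words that occur in those two documents, without building a corpus-wide vector. The Python sketch below is an illustration under stated assumptions rather than the SSAP implementation: it uses raw term counts and a plain cosine, and the function name and sample tokens are hypothetical.</span></pxy>

```python
import math
from collections import Counter


def pairwise_cosine(tokens_a, tokens_b):
    """Cosine similarity computed only over the words of the two given documents.

    Assumption: raw term counts are used as weights; the weighting used in the
    paper may differ.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both documents contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical token lists standing in for two preprocessed full texts.
doc_a = "gene expression gene".split()
doc_b = "gene cluster".split()
similarity = pairwise_cosine(doc_a, doc_b)
print(round(similarity, 4))
```

<pxy><span>Each comparison in this sketch touches at most the vocabulary of two documents, so no 54367-column sparse matrix is ever materialized; the resulting pair-wise similarities are the kind of input that Affinity Propagation accepts.</span></pxy> <pxy><span>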
Using the real-world corpus of BioMed Central as a target dataset, the performance of <span class="Chemical">SSAP</span> is compared to that of two classical clustering algorithms, and it can be seen that: (1) <span class="Chemical">SSAP</span> clustering thoroughly outperforms k-means clustering, and (2) <span class="Chemical">SSAP</span> clustering improves upon unsupervised AP clustering, with minimal impact on computation time.</span></pxy> <pxy><span>The study of the directed relationship network and distribution matrix based on <span class="Chemical">SSAP</span> clustering results also demonstrated the utility and applicability of the proposed method within the biomedical field. In particular, our identification of publishing scope and mutual cross-relationships among journals provided a case study of how a powerful clustering algorithm can be used to detect meaningful links among various scientific journals.</span></pxy> <pxy><span>Sub-datasets with the scale of 400. 
Each journal contains 40 manuscripts.</span></pxy> <pxy><span>(RAR)</span></pxy> <pxy><span>Sub-datasets with the scale of 500. Each journal contains 50 manuscripts.</span></pxy> <pxy><span>(RAR)</span></pxy> <pxy><span>Sub-datasets with the scale of 600. Each journal contains 60 manuscripts.</span></pxy> <pxy><span>(RAR)</span></pxy> <pxy><span>Sub-datasets with the scale of 700. Each journal contains 70 manuscripts.</span></pxy> <pxy><span>(RAR)</span></pxy> <pxy><span>Sub-datasets with the scale of 800. Each journal contains 80 manuscripts.</span></pxy> <pxy><span>(RAR)</span></pxy> <pxy><span>Source code of <span class="Chemical">SSAP</span> algorithm.</span></pxy> <pxy><span>(HPP)</span></pxy> </div> <div class="tab-pane fade" id="refx"> <div style="padding-top:15px;"> <span style="font-size:13px;"> <span class="glyphicon glyphicon-stats"> </span>   21 in total</span></div> <div style="padding-top:4px;"> </div> <span class="literature_info"></span> <h2><span class="s2"></span> <a href="si.php?db=pubmed&id=11262957" target="_blank" style="cursor:pointer;"><span 
style="font-weight:500;font-size:15px;color:#337AB7;">1.  Textquest: document clustering of Medline abstracts for concept discovery in molecular biology.</span></a></h2> <span class="author">Authors:  I Iliopoulos; A J Enright; C A Ouzounis </span><br> <span class="journal">Journal:  Pac Symp Biocomput </span>      <span class="year">Date:  2001 </span><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=16500932" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">2.  Incorporating biological knowledge into distance-based clustering analysis of microarray gene expression data.</span></a></h2> <span class="author">Authors:  Desheng Huang; Wei Pan </span><br> <span class="journal">Journal:  Bioinformatics </span>      <span class="year">Date:  2006-02-24 </span>      <span class="year">Impact factor: 6.937 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=16873519" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">3.  Accessing bioscience images from abstract sentences.</span></a></h2> <span class="author">Authors:  Hong Yu; Minsuk Lee </span><br> <span class="journal">Journal:  Bioinformatics </span>      <span class="year">Date:  2006-07-15 </span>      <span class="year">Impact factor: 6.937 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <span class="review">Review</span> <a href="si.php?db=pubmed&id=16418747" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">4.  
Literature mining for the biologist: from information retrieval to biological discovery.</span></a></h2> <span class="author">Authors:  Lars Juhl Jensen; Jasmin Saric; Peer Bork </span><br> <span class="journal">Journal:  Nat Rev Genet </span>      <span class="year">Date:  2006-02 </span>      <span class="year">Impact factor: 53.242 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=17218491" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">5.  Clustering by passing messages between data points.</span></a></h2> <span class="author">Authors:  Brendan J Frey; Delbert Dueck </span><br> <span class="journal">Journal:  Science </span>      <span class="year">Date:  2007-01-11 </span>      <span class="year">Impact factor: 47.728 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=17674632" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">6.  Co-clustering: a versatile tool for data analysis in biomedical informatics.</span></a></h2> <span class="author">Authors:  Sungroh Yoon; Luca Benini; Giovanni De Micheli </span><br> <span class="journal">Journal:  IEEE Trans Inf Technol Biomed </span>      <span class="year">Date:  2007-07 </span><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=17912969" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">7.  
User-centered evaluation of Arizona BioPathway: an information extraction, integration, and visualization system.</span></a></h2> <span class="author">Authors:  Karin D Quiñones; Hua Su; Byron Marshall; Shauna Eggers; Hsinchun Chen </span><br> <span class="journal">Journal:  IEEE Trans Inf Technol Biomed </span>      <span class="year">Date:  2007-09 </span><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=9623998" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">8.  Collective dynamics of 'small-world' networks.</span></a></h2> <span class="author">Authors:  D J Watts; S H Strogatz </span><br> <span class="journal">Journal:  Nature </span>      <span class="year">Date:  1998-06-04 </span>      <span class="year">Impact factor: 49.962 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=21711548" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">9.  Regulation of gene expression in ovarian cancer cells by luteinizing hormone receptor expression and activation.</span></a></h2> <span class="author">Authors:  Juan Cui; Brooke M Miner; Joanna B Eldredge; Susanne W Warrenfeltz; Phuongan Dam; Ying Xu; David Puett </span><br> <span class="journal">Journal:  BMC Cancer </span>      <span class="year">Date:  2011-06-28 </span>      <span class="year">Impact factor: 4.430 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=16539745" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">10.  
Exploring supervised and unsupervised methods to detect topics in biomedical text.</span></a></h2> <span class="author">Authors:  Minsuk Lee; Weiqing Wang; Hong Yu </span><br> <span class="journal">Journal:  BMC Bioinformatics </span>      <span class="year">Date:  2006-03-16 </span>      <span class="year">Impact factor: 3.169 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /> </div> <div class="tab-pane fade" id="citex"> <div style="padding-top:15px;"> <span style="font-size:13px;"> <span class="glyphicon glyphicon-stats"> </span>   3 in total</span></div> <div style="padding-top:4px;"> </div> <h2><span class="s2"></span> <a href="si.php?db=pubmed&id=26578908" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">1.  Analyzing 7000 texts on deep brain stimulation: what do they tell us?</span></a></h2> <span class="author">Authors:  Christian Ineichen; Markus Christen </span><br> <span class="journal">Journal:  Front Integr Neurosci </span>      <span class="year">Date:  2015-10-26 </span><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <a href="si.php?db=pubmed&id=29321535" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">2.  
Multi-label Deep Learning for Gene Function Annotation in Cancer Pathways.</span></a></h2> <span class="author">Authors:  Renchu Guan; Xu Wang; Mary Qu Yang; Yu Zhang; Fengfeng Zhou; Chen Yang; Yanchun Liang </span><br> <span class="journal">Journal:  Sci Rep </span>      <span class="year">Date:  2018-01-10 </span>      <span class="year">Impact factor: 4.379 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /><h2><span class="s2"></span> <span class="review">Review</span> <a href="si.php?db=pubmed&id=35994484" target="_blank" style="cursor:pointer;"><span style="font-weight:500;font-size:15px;color:#337AB7;">3.  Identification of technology frontiers of artificial intelligence-assisted pathology based on patent citation network.</span></a></h2> <span class="author">Authors:  Ting Zhang; Juan Chen; Yan Lu; Xiaoyi Yang; Zhaolian Ouyang </span><br> <span class="journal">Journal:  PLoS One </span>      <span class="year">Date:  2022-08-22 </span>      <span class="year">Impact factor: 3.752 </span><br><hr style="padding:0px;margin:10px;margin-left:0px;" /> </div> </div> </div>
<div class="col-sm-4" style=""> <div id="myNav"> <span class="con"> </span> </div> <span class="con2"></span> <span class="con3"></span> </div> </div> </div>
<br> <div id="autoHeightDiv"></div> <div class="footLineGray" style="border:none;"></div> <div class="lineWhite" style="border:none;"></div> <div class="webFoot"> <div class="foot middle" style="text-align:center;padding-right:10px;;padding-top:19px;background:white;border:none;"> 北京卡尤迪生物科技股份有限公司 © 2022-2023. 
</div> </div> </body> </html>