Jing Wang, Huan Deng, Bangtao Liu, Anbin Hu, Jun Liang, Lingye Fan, Xu Zheng, Tong Wang, Jianbo Lei.
Abstract
BACKGROUND: Natural language processing (NLP) is an important traditional field in computer science, but its application in medical research has faced many challenges. With the extensive digitalization of medical information globally and the increasing importance of understanding and mining big data in the medical field, NLP is becoming more crucial.
Keywords: clinical; electronic medical record; information extraction; medicine; natural language processing
Year: 2020; PMID: 32012074; PMCID: PMC7005695; DOI: 10.2196/16816
Source DB: PubMed; Journal: J Med Internet Res; ISSN: 1438-8871; Impact factor: 5.428
Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow diagram depicting the screening procedure for articles on natural language processing (NLP) in the medical field.
Figure 2. Graph showing the number of articles published over time.
Medical natural language processing journal rankings (n=2336).
| Rank | Journal or proceedings | Publications, n (%) |
| 1 | Studies in Health Technology and Informatics | 408 (17.47) |
| 2 | AMIA Annual Symposium Proceedings | 386 (16.53) |
| 3 | Journal of the American Medical Informatics Association | 256 (10.96) |
| 4 | Journal of Biomedical Informatics | 223 (9.55) |
| 5 | International Journal of Medical Informatics | 54 (2.31) |
| 6 | BMC Medical Informatics and Decision Making | 50 (2.14) |
| 7 | BMC Bioinformatics | 43 (1.84) |
| 8 | AMIA Joint Summits on Translational Science Proceedings | 31 (1.33) |
| 9 | PLOS ONE | 31 (1.33) |
| 10 | Journal of Digital Imaging | 30 (1.28) |
Top authors ranked by total number of articles published and by number of articles published as first or corresponding author.
| Rank (total) | Author | Total publications (first + corresponding + coauthor) | Publications as first or corresponding author (first + corresponding) | Rank (first + corresponding) |
| 1 | Hongfang Liu | 70 | 21 (7+14) | 6 |
| 2 | Hua Xu | 66 | 48 (15+33) | 1 |
| 3 | Joshua C Denny | 64 | 26 (12+14) | 4 |
| 4 | Carol Friedman | 60 | 20 (6+14) | 7 |
| 5 | Wendy W Chapman | 55 | 25 (11+14) | 5 |
| 6 | Guergana Savova | 45 | — | — |
| 6 | Christopher G Chute | 45 | — | — |
| 8 | Serguei Pakhomov | 43 | — | — |
| 9 | Özlem Uzuner | 37 | — | — |
| 9 | George Hripcsak | 37 | — | — |
| 9 | Thomas C Rindflesch | 37 | — | — |
| — | Stéphane Meystre | — | 32 (17+15) | 2 |
| — | Özlem Uzuner | — | 30 (16+14) | 3 |
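The tied ranks in the table above (two authors at rank 6 followed by rank 8, then three authors at rank 9) are consistent with standard competition ranking, in which tied entries share a rank and the next rank is skipped. The source does not state its ranking rule, so the following Python sketch is an assumption that merely reproduces the published ranks from the total publication counts; the function name `competition_ranks` is illustrative.

```python
def competition_ranks(counts):
    """Return 1-based competition ranks for counts sorted in descending order.

    Tied counts share the rank of the first tied entry, and the ranks that
    the ties would otherwise have occupied are skipped.
    """
    ranks = []
    for position, count in enumerate(counts, start=1):
        if ranks and count == counts[position - 2]:
            # Same count as the previous entry: reuse its rank.
            ranks.append(ranks[-1])
        else:
            # New count: the rank equals the 1-based position in the list.
            ranks.append(position)
    return ranks


# Total publications from the table above, in table order.
total_publications = [70, 66, 64, 60, 55, 45, 45, 43, 37, 37, 37]
print(competition_ranks(total_publications))
```

The sketch assumes the counts are already sorted from highest to lowest, as they are in the table.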
Top first authors and corresponding authors.
| Author designation | Author | Rank | Publications |
| First author | Stéphane Meystre | 1 | 17 |
| First author | Özlem Uzuner | 2 | 16 |
| First author | Hua Xu | 3 | 15 |
| First author | Louise Deleger | 4 | 13 |
| First author | Joshua C Denny | 5 | 12 |
| First author | Serguei Pakhomov | 5 | 12 |
| First author | Wendy W Chapman | 7 | 11 |
| First author | Sunghwan Sohn | 8 | 10 |
| First author | Li Zhou | 9 | 9 |
| First author | Guergana Savova | 9 | 9 |
| Corresponding author | Hua Xu | 1 | 33 |
| Corresponding author | Stéphane Meystre | 2 | 15 |
| Corresponding author | Özlem Uzuner | 3 | 14 |
| Corresponding author | Carol Friedman | 3 | 14 |
| Corresponding author | Hongfang Liu | 3 | 14 |
| Corresponding author | Wendy W Chapman | 3 | 14 |
| Corresponding author | Joshua C Denny | 3 | 14 |
| Corresponding author | Imre Solti | 8 | 11 |
| Corresponding author | Genevieve B Melton | 9 | 10 |
| Corresponding author | Hong Yu | 9 | 10 |
Ranking of first authors' countries (top 10, n=2336).
| Rank | Country | Publications, n (%) |
| 1 | United States | 1472 (63.01) |
| 2 | France | 127 (5.44) |
| 3 | United Kingdom | 82 (3.51) |
| 4 | China | 71 (3.04) |
| 5 | Germany | 57 (2.44) |
| 6 | Australia | 56 (2.40) |
| 7 | Japan | 52 (2.23) |
| 8 | Switzerland | 44 (1.88) |
| 9 | Canada | 33 (1.41) |
| 10 | Spain | 28 (1.20) |
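Each percentage in the table above is the country's publication count divided by the 2336 included articles. A minimal Python sketch of that arithmetic, using the counts from the table, is shown here; the variable and function names are illustrative and do not come from the source.

```python
TOTAL_ARTICLES = 2336  # n reported in the table caption

# Publication counts by first author's country, copied from the table above.
first_author_country_counts = {
    "United States": 1472,
    "France": 127,
    "United Kingdom": 82,
    "China": 71,
    "Germany": 57,
    "Australia": 56,
    "Japan": 52,
    "Switzerland": 44,
    "Canada": 33,
    "Spain": 28,
}


def percentage_share(count, total=TOTAL_ARTICLES):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


for country, count in first_author_country_counts.items():
    # Print in the same "n (%)" layout the table uses.
    print(f"{country}: {count} ({percentage_share(count):.2f})")
```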
Figure 3. Trend in the number of articles published over 20 years in the top five countries with the most articles published.
Ranking of institutions to which the first authors belonged (n=2336).
| Rank | Institution name | Publications, n (%) |
| 1 | Columbia University | 106 (4.54) |
| 2 | University of Utah | 97 (4.15) |
| 3 | Mayo Clinic | 90 (3.85) |
| 4 | Vanderbilt University | 59 (2.53) |
| 5 | National Library of Medicine | 57 (2.31) |
| 6 | Brigham and Women’s Hospital | 52 (2.24) |
| 7 | University of California | 47 (2.01) |
| 8 | University of Pittsburgh | 38 (1.63) |
| 9 | Massachusetts General Hospital | 37 (1.58) |
| 10 | University of Minnesota | 32 (1.37) |
Distribution of departments to which the first authors belonged (n=2336).
| Rank | Name of department | Publications, n (%) |
| 1 | Department of biomedical informatics | 334 (14.30) |
| 2 | Department of computer science | 141 (6.04) |
| 3 | Department of radiology | 75 (3.21) |
| 4 | Department of medical informatics | 55 (2.35) |
| 5 | Department of psychiatry | 37 (1.58) |
| 6 | Department of neuroscience | 35 (1.50) |
| 7 | Department of nursing | 30 (1.28) |
| 8 | Department of health sciences | 28 (1.20) |
| 9 | Department of medicine | 22 (0.94) |
| 10 | Department of health informatics | 19 (0.81) |
Figure 4. (A) Network visualization of author co-occurrences analyzed using VOSviewer. A circle represents an author, the size of the circle represents the importance, and the thickness of the link connecting the circles represents the relatedness of the connections. Circles with the same color belong to the same cluster. (B) Overlay visualization generated in VOSviewer (Centre for Science and Technology Studies, Leiden University). A color closer to blue represents an earlier time and closer to red represents a time closer to 2018 (note: refer to Multimedia Appendix 1 for details on the two diagrams and related discussions).
Figure 5. (A) Distribution of keywords. A circle represents an identified keyword, the size of the circle represents the importance, and the thickness of the link connecting the circles represents the relatedness of the connections among the keywords. Circles with the same color belong to the same cluster. (B) Changes in keywords over time. A color closer to blue represents an earlier time and closer to red represents a time closer to 2018 (note: refer to Multimedia Appendix 1 for details on the two diagrams and related discussions).
Figure 6. Ranking of disease categories based on studies that used natural language processing for the investigation of disease cases.
Figure 7. Temporal distribution of studies that used natural language processing for the investigation of disease cases (note: this figure shows the names of the top three diseases in studies that used natural language processing to investigate disease cases each year. Fewer than three disease types indicates that only one or two diseases were studied in the year. The term cancer in the figure indicates the article only mentioned the term cancer, without specifying the type of cancer).
Figure 8. Distribution of diseases in studies that used natural language processing for the investigation of disease cases in the United States, China, United Kingdom, and Australia.
Figure 9. Top five ranks of the research tasks of natural language processing (NLP) in the medical field.