Translatability Analysis of National Institutes of Health-Funded Biomedical Research That Applies Artificial Intelligence.

Feyisope R Eweje1, Suzie Byun1, Rajat Chandra1, Fengling Hu1, Ihab Kamel2, Paul Zhang3, Zhicheng Jiao4, Harrison X Bai2.   

Abstract

Importance: Despite the rapid growth of interest and diversity in applications of artificial intelligence (AI) to biomedical research, there are limited objective ways to characterize the potential for use of AI in clinical practice.
Objective: To examine what types of medical AI have the greatest estimated potential for translational impact (ie, the ability to lead to development that has measurable value for human health).
Design, Setting, and Participants: In this cohort study, research grants related to AI awarded between January 1, 1985, and December 31, 2020, were identified from a National Institutes of Health (NIH) award database. The text content for each award was entered into a natural language processing (NLP) clustering algorithm. An NIH database was also used to extract citation data, including the number of citations and approximate potential to translate (APT) score for published articles associated with the granted awards, to create proxies for translatability.
Exposures: Unsupervised assignment of AI-related research awards to application topics using NLP.
Main Outcomes and Measures: Annualized citations per $1 million funding (ACOF) and average APT score for award-associated articles, grouped by application topic. The APT score is a machine learning-based metric created by the NIH Office of Portfolio Analysis that quantifies the likelihood of future citation by a clinical article.
Results: A total of 16 629 NIH awards related to AI were included in the analysis, and 75 applications of AI were identified. Total annual funding for AI grew from $17.4 million in 1985 to $1.43 billion in 2020. By average APT, interpersonal communication technologies (0.488; 95% CI, 0.472-0.504) and population genetics (0.463; 95% CI, 0.453-0.472) had the highest translatability; by ACOF, environmental health (ACOF, 1038) and applications focused on the electronic health record (ACOF, 489) also had high translatability. The category of applications related to biochemical analysis was found to have low translatability by both metrics (average APT, 0.393; 95% CI, 0.388-0.398; ACOF, 246).
Conclusions and Relevance: Based on this study's findings, data on grants from the NIH can apparently be used to identify and characterize medical applications of AI to understand changes in academic productivity, funding support, and potential for translational impact. This method may be extended to characterize other research domains.

Year:  2022        PMID: 35072720      PMCID: PMC8787619          DOI: 10.1001/jamanetworkopen.2021.44742

Source DB:  PubMed          Journal:  JAMA Netw Open        ISSN: 2574-3805


Introduction

Artificial intelligence (AI) has the potential for transformational changes in health care. As early as the 1980s,[1] it was understood that AI tools could eventually play a major role as expert consultants to physicians by using insights from data that may not be deemed actionable by human interpretation. From convolutional neural networks for imaging-based solid organ cancer screening[2,3,4] to natural language processing (NLP) to estimate the probability of diagnoses with data from the electronic health record,[5,6,7] the breadth of AI-powered technologies affecting our understanding of human health and health care delivery processes has rapidly expanded in recent years.[8] Yet despite the exponential growth in academic research involving AI in medicine,[9] it remains difficult to understand which applications have been associated with the greatest clinical impact and which applications have the greatest potential for future impact. Maximizing clinical translation (ie, ability to lead to development that has measurable value for human health) is a challenge for AI-powered biomedical research with many hurdles limiting innovations, including prospective studies, external cohort generalization, and difficulties integrating into existing clinical workflow.[10,11,12] This problem has been described as the growing excitement around AI in health care despite limited examples of ways AI has tangibly changed clinical practice.[13,14] A potential means of characterizing translation of AI is through research funding and related bibliometric data provided by the National Institutes of Health (NIH). 
As the world’s largest public funder of health research, the NIH has among its mandates to “expand the knowledge base in medical and associated sciences...and ensure continued high return on the public investment in research.”[15] Studies have used NIH grant data to investigate academic productivity and translational value in biomedical research, from studying patent generation per unit of NIH funding to investigating the contribution of NIH funding in new drug approvals.[16,17] Another study analyzed NIH funding for machine learning, but did not assess translational value.[18] However, it has been reported that unsupervised NLP (ie, automatic generation of document categories without a priori knowledge) can segment NIH awards by topic similarity using text descriptions of the awards.[19] By combining elements of each of these analyses, including translational value metrics, focus on AI-related NIH awards, and unsupervised NLP, it may be possible to quantifiably address the issue of which applications of AI have the greatest potential translational impact. In this cohort study, we strictly define the scope of AI applications in health care using unsupervised categorization of NIH awards and use bibliometric data provided by the NIH to investigate potential translational impact of various applications.

Methods

Data Collection

The NIH Research Portfolio Online Reporting Tools Expenditures and Results (RePORTER) search engine was queried for awards related to AI from January 1, 1985, to December 31, 2020, using a query of AI-related terms (eTable 1 and eMethods in the Supplement, defining artificial intelligence). Awards under activity codes T (training programs) and Z (intramural awards to NIH institutes) were excluded from the analysis because these awards often do not detail a focused area of proposed study. Subprojects, individual projects within multicomponent award applications, were also excluded from analysis. In addition, the RePORTER query returns a collection of academic articles that were produced in relation to the awards. These data include PubMed identification numbers, which were used to separately query the NIH iCite platform[20] for citation information related to these articles. This cohort study involved nonhuman data and, per Common Rule 45 CFR 46.116(d)(4), was exempted from institutional review board review and the requirement for informed consent. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline for cohort studies.

Feature Extraction

eFigure 1 in the Supplement depicts the NLP pipeline. The title, abstract, and public health relevance statement from each award were combined as the input for text analysis. Stop words, which are common words with little semantic value (eg, the), were removed. Training features consisted of lemmatized unigrams (1-word sequences) and bigrams (2-word sequences) vectorized with term frequency-inverse document frequency (TF-IDF) weighting. Terms present across more than 10% of the document corpus were excluded from the feature set. The feature set was further narrowed to the 500 terms with the highest TF-IDF values summed across the corpus.
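The feature-extraction steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the stop-word list is a toy placeholder, lemmatization is omitted, the smoothed IDF formula (log(N/df) + 1) is one common convention chosen here, and the function names `ngrams` and `tfidf_features` are hypothetical.

```python
import math
import re
from collections import Counter

# Toy stop-word list (assumption); a real pipeline would use a fuller list.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "with"}

def ngrams(text):
    """Return unigrams and bigrams after stop-word removal (no lemmatization here)."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams

def tfidf_features(docs, max_df=0.10, top_k=500):
    """Weight terms by TF-IDF, drop terms in more than max_df of documents,
    and keep the top_k terms by TF-IDF summed across the corpus."""
    doc_terms = [Counter(ngrams(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for counts in doc_terms for term in counts)
    idf = {t: math.log(n_docs / n) + 1.0 for t, n in df.items() if n / n_docs <= max_df}
    totals = Counter()
    rows = []
    for counts in doc_terms:
        row = {t: c * idf[t] for t, c in counts.items() if t in idf}
        rows.append(row)
        totals.update(row)  # accumulate each term's TF-IDF across the corpus
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    keep = [t for t, _ in ranked]
    keep_set = set(keep)
    return keep, [{t: w for t, w in row.items() if t in keep_set} for row in rows]
```

With the thresholds reported in the text, the call would be `tfidf_features(docs, max_df=0.10, top_k=500)`, where `docs` holds one combined title, abstract, and public health relevance string per award.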

Unsupervised Clustering

The topic identification algorithm was implemented with k-means clustering, an unsupervised machine learning algorithm that identifies clusters of related data points by minimizing the geometric distance between points assigned to a given cluster. As such, each award was assigned to the single topic that best characterized its text content. The k-means algorithm was implemented with minibatches,[21] in which each trial runs multiple iterations on randomly selected partitions of the data set, with a batch size of 1024 and 100 iterations. The number of clusters (K) was determined empirically by monitoring the silhouette score, a metric that reflects minimization of the mean intracluster distance and maximization of the mean nearest-cluster distance, as K was varied (eFigure 2 in the Supplement). After K was selected, 50 training trials were conducted to create the final clusters, with the trial that maximized the silhouette coefficient chosen as the representative output.
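The clustering and model-selection logic can be illustrated with a small pure-Python sketch. It substitutes plain full-batch k-means for the minibatch variant used in the study, and the function names (`kmeans`, `silhouette`, `best_of_trials`) are hypothetical; it is meant only to show how repeated trials can be ranked by silhouette score.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, n_iter=100, seed=0):
    """Plain (full-batch) k-means; the study used a minibatch variant instead."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean of (b - a) / max(a, b), where a is the mean intracluster distance
    and b is the mean distance to the nearest other cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

def best_of_trials(points, k, n_trials=50):
    """Run several seeded trials and keep the labeling with the highest silhouette."""
    trials = [kmeans(points, k, seed=s) for s in range(n_trials)]
    return max(trials, key=lambda labels: silhouette(points, labels))
```

The same `silhouette` function could be evaluated over a range of K values to choose the number of clusters, and per-cluster averages of the individual scores would identify clusters with a score below 0.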

Cluster Validation

Each cluster created by the k-means algorithm was manually assigned a descriptive label based on the words selected as cluster-characteristic features by the algorithm and the content of award abstracts assigned to the cluster. Clusters with a silhouette score less than 0 (indicating poor assignment of the constituent awards) were excluded from further analysis. To validate the cluster topics, 2 of us (S.B. and R.C.) blinded to the awards’ cluster assignments manually assigned 200 randomly selected awards according to the k-means–defined topics. We also determined the fraction of awards in select k-means categories that were assigned to similarly defined research, condition, and disease categories, categories of research funding first generated by the NIH in 2008 (eMethods in the Supplement, cluster validation).

Statistical Analysis

The overall award sample was characterized by comparing awards granted in 2008 and earlier with awards granted in 2009 and later, which is an inflection point corresponding with the passage of the Health Information Technology for Economic and Clinical Health (HITECH) act in the US.[22] We applied log-likelihood ratios of document frequency to the TF-IDF feature set to determine the 10 most comparatively enriched terms within the 2 time periods. Two metrics were used to quantify the likelihood of translational clinical impact based on articles associated with each NIH award. These metrics were applied to 3 collections of data: awards grouped by the funding NIH institute (eg, National Cancer Institute, National Institute on Aging), k-means–identified individual applications of AI, and applications grouped by general category. First, we calculated annualized citations per $1 million of funding (ACOF). For example, an article receiving 100 citations during 5 years received 20 annualized citations. Second, we calculated the average approximate potential to translate (APT) score for associated articles. The APT is a metric created by the NIH Office of Portfolio Analysis that has been demonstrated to be predictive of the likelihood of future citation by a clinical research article as an indicator of translation. Generated using a machine learning approach, the APT score is based on a data set of more than 9 million published biomedical research articles and outperformed academic experts in predicting clinical translation.[23] We also sought to characterize funding growth for each identified application of AI. An exponential fit was made of each cluster’s annual funding over the study period to evaluate the estimated annual growth rate. 
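As a rough sketch of the two funding-based calculations, the code below computes ACOF from article citation counts and estimates a growth rate from annual funding. Both are assumptions about details the text leaves open: ACOF is taken as the sum of annualized citations divided by funding in millions of dollars, and the exponential fit is done by least squares on log funding, which may not match the authors' fitting procedure or reproduce the reported rates. All function names are hypothetical.

```python
import math

def annualized_citations(citations, years_since_publication):
    """Citations per year, eg, 100 citations over 5 years gives 20."""
    return citations / years_since_publication

def acof(articles, total_funding_usd):
    """Annualized citations per $1 million of funding.
    articles: iterable of (citations, years_since_publication) pairs."""
    total = sum(annualized_citations(c, y) for c, y in articles)
    return total / (total_funding_usd / 1_000_000)

def exponential_growth_rate(funding_by_year):
    """Fit funding ~ A * exp(r * t) by least squares on log(funding); return r.
    funding_by_year: dict mapping year to positive funding amount."""
    years = sorted(funding_by_year)
    xs = [y - years[0] for y in years]
    ys = [math.log(funding_by_year[y]) for y in years]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope
```

For the example in the text, `acof([(100, 5)], 1_000_000)` evaluates to 20.0. The returned exponent `r` can be converted to a year-over-year fractional increase with `math.exp(r) - 1` if that reading of "annual growth rate" is preferred.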
In addition, to identify significant proportionality differences between general application categories by funding mechanism (eg, R01), the distributions of awards by funding mechanism were compared using exact binomial tests at significance level α = .05 with post hoc Bonferroni correction. All values with uncertainty are reported with a 95% CI. All analyses were conducted using Python, version 3.8 (Python Software Foundation). All code used to conduct analyses in this study is publicly available.
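The funding-mechanism comparison can be illustrated with a two-sided exact binomial test followed by a Bonferroni threshold. This is a sketch under assumptions: the two-sided P value sums the probabilities of all outcomes no more likely than the observed one (a common convention), the null proportion is supplied by the caller, and the function names are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability computed in log space so large n does not overflow.
    Requires 0 < p < 1."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def exact_binomial_p(k, n, p):
    """Two-sided exact binomial P value for k successes in n trials under rate p."""
    observed = binom_pmf(k, n, p)
    # Small relative tolerance so outcomes tied with the observed one are included.
    cutoff = observed * (1 + 1e-7)
    total = sum(pm for pm in (binom_pmf(i, n, p) for i in range(n + 1)) if pm <= cutoff)
    return min(1.0, total)

def bonferroni_significant(p_values, alpha=0.05):
    """Flag P values that fall below alpha divided by the number of comparisons."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

For instance, the number of R01 awards in one application category could be passed as `k`, the category's award count as `n`, and the overall R01 share as `p`, with the resulting P values from all categories then passed together to `bonferroni_significant`.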

Results

Study Sample

Table 1 describes the sample of awards, characterized as pre-HITECH and post-HITECH. A total of 16 629 awards were identified for inclusion in the study. The awards totaled $7 177 080 553 in funding and had associated articles with an average APT of 0.422 (95% CI, 0.421-0.423) and ACOF of 301. The average APT significantly increased between the pre- and post-HITECH periods (0.390 vs 0.433; P < .001), but the ACOF decreased by 38% (from 444 to 275). The most enriched TF-IDF features by log-likelihood ratio included lesion, physician, and interpretation pre-HITECH and ehr, big, and deep post-HITECH. Overall funding grew from $17.4 million in 1985 to $1.43 billion in 2020 (eFigure 3 in the Supplement).
Table 1.

National Institutes of Health–Funded Research Applying Artificial Intelligence for the HITECH Act

Variable | Pre-HITECH (2008 and earlier) | Post-HITECH (2009 and later) | Overall
No. of awards | 1818 | 14 811 | 16 629
Total funding, $ | 1 090 391 998 | 6 086 688 555 | 7 177 080 553
Annualized citations per $1 million funding | 444 | 275 | 301
Average approximate potential to translate (95% CI) | 0.390 (0.388-0.393) [a] | 0.433 (0.432-0.434) [a] | 0.422 (0.421-0.423) [a]
Enriched features | base, prototype, artificial, lesion, physician, intelligence, mass, simulation, interpretation, procedure | ehr, big, deep, asd, youth, leverage, personalized, trajectory, autism, inform | NA

Abbreviations: HITECH, Health Information Technology for Economic and Clinical Health; NA, not applicable.

[a] Significant at P < .001.

Translatability by NIH Institute

The ACOF and average APT among NIH institutes are shown in Figure 1 and eTable 2 in the Supplement. Among institutes that granted more than 100 AI-related awards over the study period, the National Center for Advancing Translational Sciences funded the most translatable awards by ACOF (n, 392). Other high translatability institutes by ACOF included the National Institute of Environmental Health Sciences (ACOF, 190) and the National Institute of Biomedical Imaging and Bioengineering (ACOF, 157). The National Institute of Dental and Craniofacial Research (average APT, 0.445; 95% CI, 0.417-0.473) and National Eye Institute (average APT, 0.441; 95% CI, 0.426-0.455) produced the most translatable awards by average APT. The Agency for Healthcare Research and Quality and Office of the Director, institutes with research supportive missions, had the lowest translatability by both metrics.
Figure 1.

Translatability of National Institutes of Health–Funded Biomedical Research Applying Artificial Intelligence, by Institute

The areas of the bubbles reflect the relative amount of research funding allocated by each National Institutes of Health institute. Only institutes that granted more than 100 awards during the study period are shown. AHRQ indicates Agency for Healthcare Research and Quality; NCI, National Cancer Institute; NCRR, National Center for Research Resources; NEI, National Eye Institute; NHGRI, National Human Genome Research Institute; NHLBI, National Heart, Lung, and Blood Institute; NIA, National Institute on Aging; NIAAA, National Institute on Alcohol Abuse and Alcoholism; NIAID, National Institute of Allergy and Infectious Diseases; NIAMS, National Institute of Arthritis and Musculoskeletal and Skin Diseases; NIBIB, National Institute of Biomedical Imaging and Bioengineering; NICHD, Eunice Kennedy Shriver National Institute of Child Health and Human Development; NIDA, National Institute on Drug Abuse; NIDCD, National Institute on Deafness and Other Communication Disorders; NIDCR, National Institute of Dental and Craniofacial Research; NIDDK, National Institute of Diabetes and Digestive and Kidney Diseases; NIEHS, National Institute of Environmental Health Sciences; NIGMS, National Institute of General Medical Sciences; NIMH, National Institute of Mental Health; NINDS, National Institute of Neurological Disorders and Stroke; NINR, National Institute of Nursing Research; and NLM, National Library of Medicine.


Applications of AI

Of the total 16 629 awards, 12 459 were sorted into 75 meaningfully descriptive applications of AI in biomedical research (Table 2); the remaining awards were assigned to clusters with silhouette scores less than 0 (ie, ill-defined topic assignments). The defined applications showed frequent overlap with the research, condition, and disease categories (eTable 3 in the Supplement) and fair to moderate agreement in award assignment with the manual raters (eTable 4 in the Supplement). Cluster-characteristic terms (eTable 5 in the Supplement) and a list of awards representative of each cluster (eTable 6 in the Supplement) were also described.
Table 2.

National Institutes of Health–Funded Applications of Artificial Intelligence in Biomedical Research

Application | No. of granted awards | Total funding (1985-2020), $ | Annualized citations per $1 million funding | Average APT (95% CI) | Estimated annual growth rate (95% CI) | Silhouette score
Neurologic
Total | 1552 | 812 472 532 | 222 | 0.426 (0.422-0.430) | 0.457 (0.429-0.485)
Alzheimer disease | 321 | 241 056 913 | 158 | 0.421 (0.413-0.430) | 0.482 (0.424-0.540) | 0.255
Neural circuits | 361 | 177 492 200 | 135 | 0.437 (0.427-0.447) | 0.312 (0.265-0.359) | 0.067
Other dementia | 148 | 99 430 729 | 609 | 0.436 (0.429-0.443) | 0.609 (0.512-0.705) | 0.086
Stroke | 176 | 88 644 658 | 194 | 0.377 (0.366-0.389) | 0.508 (0.470-0.546) | 0.279
Motor function | 192 | 80 329 648 | 157 | 0.429 (0.415-0.443) | 0.460 (0.335-0.585) | 0.083
Memory | 149 | 55 412 488 | 162 | 0.421 (0.405-0.437) | 0.462 (0.387-0.537) | 0.158
EEG | 104 | 36 783 357 | 317 | 0.442 (0.424-0.459) | 0.206 (0.155-0.256) | 0.173
Sleep | 101 | 33 322 539 | 211 | 0.447 (0.424-0.469) | 0.869 (0.667-1.072) | 0.372
Genetics
Total | 1632 | 732 428 485 | 352 | 0.428 (0.425-0.431) | 0.157 (0.139-0.176)
Regulatory genetics | 280 | 134 538 181 | 198 | 0.429 (0.420-0.439) | 0.193 (0.166-0.220) | 0.062
Clinically significant genetic variation | 236 | 117 453 892 | 327 | 0.442 (0.433-0.451) | 0.203 (0.158-0.249) | 0.153
Molecular genetics | 246 | 105 378 437 | 421 | 0.418 (0.409-0.426) | 0.103 (0.084-0.123) | 0.100
Population genetics | 130 | 76 684 671 | 563 | 0.463 (0.453-0.472) | 0.056 (0.002-0.109) | 0.153
Familial genetics | 129 | 73 961 475 | 374 | 0.441 (0.430-0.453) | 0.244 (0.144-0.344) | 0.088
Gene mapping | 166 | 61 918 552 | 485 | 0.399 (0.389-0.410) | 0.303 (0.248-0.358) | 0.052
Mouse modeling | 167 | 60 846 677 | 207 | 0.401 (0.386-0.416) | 0.155 (0.104-0.206) | 0.087
Functional mutations | 162 | 56 859 122 | 393 | 0.404 (0.394-0.415) | 0.230 (0.181-0.278) | 0.195
RNA analysis | 116 | 44 787 478 | 287 | 0.439 (0.423-0.456) | 0.179 (0.153-0.206) | 0.228
Mental health
Total | 1361 | 648 405 425 | 269 | 0.426 (0.422-0.430) | 0.337 (0.272-0.402)
Pain | 188 | 145 739 494 | 142 | 0.431 (0.418-0.443) | 0.441 (0.173-0.709) | 0.414
Autism spectrum disorder | 192 | 97 331 715 | 243 | 0.458 (0.447-0.469) | 0.230 (0.196-0.265) | 0.236
Alcohol use | 218 | 86 802 222 | 402 | 0.412 (0.403-0.422) | 0.177 (0.144-0.209) | 0.192
Other mental health | 148 | 79 154 534 | 255 | 0.446 (0.433-0.459) | 0.447 (0.384-0.509) | 0.126
Adolescent psychiatry | 155 | 63 322 086 | 275 | 0.387 (0.373-0.400) | 0.233 (0.202-0.265) | 0.100
Other child development | 164 | 53 656 918 | 272 | 0.431 (0.417-0.445) | 0.216 (0.190-0.243) | 0.164
Depression | 104 | 45 168 514 | 147 | 0.414 (0.391-0.438) | 0.699 (0.498-0.900) | 0.185
Suicidality | 79 | 40 633 475 | 170 | 0.380 (0.362-0.398) | 0.855 (0.632-1.079) | 0.455
Schizophrenia | 113 | 36 596 467 | 806 | 0.438 (0.427-0.448) | 0.105 (0.044-0.166) | 0.164
Knowledge frameworks
Total | 693 | 387 505 702 | 482 | 0.411 (0.408-0.415) | 0.152 (0.113-0.191)
Centers for translational and computational research | 144 | 181 240 503 | 670 | 0.418 (0.413-0.422) | 0.208 (0.132-0.283) | 0.085
Ontology generation | 99 | 64 277 669 | 212 | 0.428 (0.414-0.441) | 0.078 (0.028-0.129) | 0.312
Knowledge bases | 176 | 61 076 599 | 471 | 0.383 (0.374-0.392) | 0.834 (−0.233 to 1.902) | 0.064
Knowledge representation and reasoning | 122 | 37 427 050 | 395 | 0.407 (0.393-0.420) | 0.585 (0.337-0.833) | 0.074
Literature review | 76 | 25 836 844 | 197 | 0.415 (0.392-0.437) | 0.104 (0.053-0.155) | 0.212
Intelligent search engines and data visualization | 76 | 17 647 037 | 174 | 0.393 (0.368-0.418) | 0.098 (0.064-0.133) | 0.037
Biochemical analysis
Total | 788 | 364 738 899 | 246 | 0.393 (0.388-0.398) | 0.109 (0.086-0.131)
Protein structure and binding prediction | 130 | 94 382 477 | 417 | 0.412 (0.403-0.420) | 0.054 (0.022-0.086) | 0.061
Drug discovery | 156 | 84 737 388 | 116 | 0.404 (0.390-0.419) | 0.115 (0.079-0.152) | 0.068
Other chemical compound characterization | 159 | 68 832 086 | 191 | 0.377 (0.365-0.390) | 0.220 (0.166-0.275) | 0.078
Mass spectroscopy | 145 | 46 167 945 | 269 | 0.402 (0.388-0.417) | 0.077 (0.052-0.102) | 0.162
Cell signaling pathways | 117 | 36 077 544 | 241 | 0.368 (0.352-0.383) | 0.181 (0.132-0.230) | 0.151
Small molecule interactions | 81 | 34 541 459 | 181 | 0.337 (0.320-0.354) | 0.409 (0.256-0.562) | 0.111
Infectious disease/immunologic
Total | 655 | 317 711 916 | 216 | 0.422 (0.415-0.429) | 0.366 (0.327-0.405)
HIV | 243 | 115 512 623 | 261 | 0.417 (0.406-0.428) | 0.266 (0.230-0.303) | 0.240
Other infectious disease | 242 | 102 079 866 | 202 | 0.421 (0.410-0.433) | 0.550 (0.511-0.589) | 0.088
Immunology | 170 | 100 119 427 | 179 | 0.429 (0.417-0.441) | 0.322 (0.235-0.408) | 0.072
Cancer
Total | 799 | 307 395 742 | 209 | 0.424 (0.417-0.430) | 0.164 (0.143-0.184)
Other | 375 | 165 458 242 | 201 | 0.436 (0.427-0.445) | 0.187 (0.167-0.207) | 0.141
Breast | 297 | 97 856 805 | 215 | 0.406 (0.396-0.416) | 0.130 (0.100-0.160) | 0.244
Prostate | 127 | 44 080 695 | 228 | 0.424 (0.410-0.439) | 0.111 (0.066-0.156) | 0.242
Language and communication
Total | 785 | 294 191 413 | 244 | 0.427 (0.421-0.434) | 0.259 (0.230-0.289)
Language development and reading comprehension | 271 | 108 380 111 | 157 | 0.420 (0.408-0.433) | 0.264 (0.206-0.322) | 0.101
Social media and social behavior | 212 | 84 674 749 | 219 | 0.423 (0.410-0.436) | 0.310 (0.280-0.340) | 0.103
Speech | 210 | 66 336 577 | 346 | 0.409 (0.398-0.420) | 0.174 (0.147-0.202) | 0.239
Interpersonal communication technologies | 92 | 34 799 976 | 376 | 0.488 (0.472-0.504) | 0.261 (0.210-0.311) | 0.108
Data types
Total | 698 | 270 275 389 | 343 | 0.420 (0.414-0.426) | 0.207 (0.183-0.230)
Wearable devices and mobile technology | 193 | 81 082 336 | 358 | 0.419 (0.408-0.429) | 0.354 (0.264-0.444) | 0.106
Text mining | 220 | 80 300 898 | 278 | 0.423 (0.411-0.435) | 0.060 (0.040-0.080) | 0.076
Motion tracking and artifact reduction | 150 | 60 926 617 | 212 | 0.413 (0.398-0.427) | 0.694 (0.569-0.819) | 0.108
Big data | 135 | 47 965 538 | 594 | 0.423 (0.413-0.433) | 0.199 (0.125-0.274) | 0.174
Patient safety
Total | 517 | 193 476 745 | 225 | 0.427 (0.419-0.436) | 0.403 (0.323-0.482)
Adverse drug events/drug safety | 265 | 93 105 852 | 260 | 0.422 (0.410-0.434) | 0.238 (0.206-0.271) | 0.035
Surgical planning | 137 | 64 262 285 | 146 | 0.455 (0.437-0.474) | 0.843 (0.656-1.029) | 0.118
Other patient safety | 115 | 36 108 608 | 274 | 0.414 (0.399-0.430) | 0.131 (0.097-0.166) | 0.088
Population health
Total | 385 | 163 133 545 | 236 | 0.424 (0.414-0.434) | 0.419 (0.384-0.453)
Older adults | 171 | 77 836 693 | 370 | 0.430 (0.418-0.443) | 0.335 (0.298-0.371) | 0.191
Population health screening | 151 | 62 881 100 | 98 | 0.406 (0.383-0.429) | 0.561 (0.418-0.705) | 0.087
Pediatrics | 63 | 22 415 752 | 161 | 0.421 (0.394-0.448) | 0.716 (0.590-0.842) | 0.158
Model types
Total | 473 | 151 710 103 | 307 | 0.423 (0.414-0.431) | 0.362 (0.301-0.423)
Deep learning | 221 | 72 890 312 | 164 | 0.428 (0.413-0.443) | 0.725 (0.630-0.820) | 0.044
Natural language processing | 121 | 48 501 377 | 200 | 0.442 (0.427-0.457) | 0.121 (0.106-0.136) | 0.193
Unspecified classification models | 131 | 30 318 414 | 821 | 0.402 (0.389-0.416) | 0.086 (0.055-0.117) | 0.084
Respiratory
Total | 310 | 147 146 357 | 221 | 0.413 (0.403-0.422) | 0.306 (0.230-0.382)
Asthma | 93 | 75 131 908 | 159 | 0.441 (0.425-0.456) | 0.253 (0.159-0.348) | 0.408
Lung cancer and COPD | 217 | 72 014 449 | 285 | 0.397 (0.386-0.409) | 0.396 (0.320-0.471) | 0.180
Electronic health record
Total | 296 | 111 660 265 | 489 | 0.431 (0.424-0.439) | 0.381 (0.356-0.406)
Electronic health record | 296 | 111 660 265 | 489 | 0.431 (0.424-0.439) | 0.381 (0.356-0.406) | 0.068
Vision
Total | 359 | 110 404 148 | 434 | 0.412 (0.405-0.420) | 0.337 (0.272-0.402)
Visual processing | 168 | 50 365 533 | 443 | 0.397 (0.386-0.408) | 0.098 (0.073-0.124) | 0.134
Object tracking and recognition | 113 | 34 940 758 | 515 | 0.420 (0.407-0.432) | 0.102 (0.072-0.132) | 0.121
Visual impairment | 78 | 25 097 857 | 306 | 0.442 (0.423-0.460) | 0.296 (0.187-0.405) | 0.122
Endocrine
Total | 213 | 91 175 183 | 138 | 0.430 (0.415-0.444) | 0.217 (0.139-0.296)
Diabetes | 114 | 52 974 185 | 114 | 0.429 (0.408-0.450) | 0.313 (0.156-0.470) | 0.166
Metabolic syndrome and metabolic processes | 99 | 38 200 998 | 171 | 0.430 (0.411-0.450) | 0.149 (0.107-0.190) | 0.165
Environmental health
Total | 203 | 88 913 320 | 1038 | 0.423 (0.417-0.429) | 0.270 (0.236-0.303)
Environmental health | 203 | 88 913 320 | 1038 | 0.423 (0.417-0.429) | 0.270 (0.236-0.303) | 0.120
Cardiovascular
Total | 186 | 85 488 684 | 231 | 0.430 (0.418-0.442) | 0.291 (0.232-0.351)
Cardiovascular disease | 186 | 85 488 684 | 231 | 0.430 (0.418-0.442) | 0.291 (0.232-0.351) | 0.048
Injuries/trauma
Total | 159 | 82 210 223 | 144 | 0.414 (0.396-0.431) | 0.217 (0.154-0.280)
Trauma | 159 | 82 210 223 | 144 | 0.414 (0.396-0.431) | 0.217 (0.154-0.280) | 0.075
Renal
Total | 123 | 43 663 402 | 365 | 0.436 (0.422-0.451) | 0.733 (0.637-0.829)
Kidney disease | 123 | 43 663 402 | 365 | 0.436 (0.422-0.451) | 0.733 (0.637-0.829) | 0.211
Hepatic
Total | 123 | 39 489 735 | 256 | 0.459 (0.441-0.477) | 0.140 (0.103-0.177)
Liver disease | 123 | 39 489 735 | 256 | 0.459 (0.441-0.477) | 0.140 (0.103-0.177) | 0.211
Training and education
Total | 149 | 35 235 945 | 288 | 0.424 (0.406-0.441) | 0.099 (0.069-0.130)
Student training and education | 149 | 35 235 945 | 288 | 0.424 (0.406-0.441) | 0.099 (0.069-0.130) | 0.156

Abbreviations: APT, approximate potential to translate; COPD, chronic obstructive pulmonary disease; EEG, electroencephalogram.

The clusters were further grouped by general application categories. Some categories were clinically focused, such as neurologic disease, cancer, and mental health, and others were technically focused, such as data types and model types. The estimated annual growth rate for NIH-funded AI research overall was 0.274 (95% CI, 0.240-0.309), with the fastest growing application categories including kidney disease (0.733; 95% CI, 0.637-0.829), neurologic disease (0.457; 95% CI, 0.429-0.485), and population health (0.419; 95% CI, 0.384-0.453) (Figure 2). Applications with low estimated annual growth rate included training and education (0.099; 95% CI, 0.069-0.130), biochemical analysis (0.109; 95% CI, 0.086-0.131), and vision (0.120; 95% CI, 0.100-0.140).
Figure 2.

Estimated National Institutes of Health Funding Annual Growth Rate, by Category of Artificial Intelligence Applications

AGR indicates annual growth rate; NIH, National Institutes of Health.


Applications by Funding Mechanisms

The 4 most frequent funding mechanisms in the award sample were R01 (investigator-initiated research projects, 48% of all awards), U01 (research project cooperative agreements between the funding NIH institute and investigators, 4.8%), R44 (small business innovation research grants, 4.2%), and R21 (exploratory/developmental research grants, 6.8%) (eTable 7 in the Supplement). Application categories that tended to have a higher proportion of R01 grants include biochemical analysis (53.3%), genetics (55.9%), and language and communication (59.0%). The cancer (10.5%) and hepatic disease (19.5%) categories tended to have a higher proportion of U01 grants than other categories (eTable 8 and eTable 9 in the Supplement).

Applications by Translational Impact Potential Metrics

Translatability was assessed at both the specific application (Table 2) and general category (Figure 3) levels. General categories, such as liver disease (average APT, 0.459; 95% CI, 0.441-0.477), kidney disease (average APT, 0.436; 95% CI, 0.422-0.451), and the electronic health record (average APT, 0.431; 95% CI, 0.424-0.439), had high translatability. Specific applications with high APT included interpersonal communication technologies (average APT, 0.488; 95% CI, 0.472-0.504) and population genetics (average APT, 0.463; 95% CI, 0.453-0.472). The biochemical analysis category (comprising protein structure and binding prediction, drug discovery, other chemical compound characterization, mass spectroscopy, cell signaling pathways, and small molecule interactions as applications) had the lowest translatability (average APT, 0.393; 95% CI, 0.388-0.398).
Figure 3.

Translatability of National Institutes of Health–Funded Biomedical Research Applying Artificial Intelligence, by Category of Artificial Intelligence Applications

The areas of the bubbles reflect the relative amount of National Institutes of Health research funding allocated to each category of artificial intelligence applications.

General categories with high translatability by ACOF included environmental health (ACOF, 1038), the electronic health record (ACOF, 489), and knowledge frameworks (ACOF, 482). The biochemical analysis category (ACOF, 246) was determined to be among the least translatable by ACOF as well. At both the specific application and general category levels, applications related to primary care practice were found to have the lowest translatability by ACOF. These areas of primary care included diabetes (ACOF, 114) and metabolic syndrome (ACOF, 171) in the endocrine category and population health screening in the population health category (ACOF, 98).

Discussion

By applying unsupervised machine learning techniques to text data from NIH-funded AI research grants, we identified meaningful differences in NIH funding trends, funding types, and translational potential for various applications of AI in biomedical research. There was an 80-fold increase in annual NIH funding for AI over the 35-year study period. The terms enriched in the post-HITECH subset of awards, ehr, big (ie, big data), and deep (ie, deep learning), reflect how the widespread adoption of electronic medical records in the 21st century partially catalyzed this expansion by facilitating access to large, multimodal data sets for the development of AI models.[24] Although the average APT increased between the pre- and post-HITECH periods, the ACOF decreased. This discrepancy may be explained by a decrease in the number of breakthrough articles on biomedical AI as the technologies have become more pervasive, but an increase in the potential for such technologies to create clinical value.
Identifying applications of AI with the potential to produce high returns in the quality and efficiency of health care delivery is necessary. Some NIH institutes seem to have contributed more toward this goal than others. In our analysis, the National Center for Advancing Translational Sciences, National Institute of Dental and Craniofacial Research, National Eye Institute, and National Institute of Biomedical Imaging and Bioengineering were identified as granting highly translatable AI awards. The National Institute of Biomedical Imaging and Bioengineering and National Institute of Dental and Craniofacial Research have successfully implemented academic-industry partnerships to facilitate translation of AI technologies.[25] Among the National Center for Advancing Translational Sciences core technologies are machine learning methods for prediction of chemical properties and NLP for extracting knowledge from data in rare diseases.[26] The National Eye Institute has funded the development of an AI imaging technology to detect retinopathy of prematurity that was recently granted breakthrough status by the US Food and Drug Administration.[27]
Specific applications that were identified as having high translational potential include environmental health and interpersonal communication technologies. Interpersonal communication technologies can aid with communication for patients with functional impairments through brain-computer interfaces, hand sign analysis, and eye tracking, among other methods.[28] Environmental health AI can help elucidate how environmental toxin exposures contribute to the development of disease and promote preventive policy and infrastructural changes.[29,30]
Applications focused on primary care settings seemed to have lower translatability. One such application in diabetes was a proposal to generate policy recommendations that reduce preventive care disparities in patients with diabetes using Medicare claims data and Markov decision process analysis (eTable 5 in the Supplement, detecting, understanding, and reducing diabetes belt preventive care disparities). The population health screening application included studies focused on identifying cost-effective screening methods and developing personalized screening recommendations for colon cancer, among others.[31,32] Poor translatability could arise because, in the primary care setting, there is a greater need for patients and health care professionals to change their behavior before these types of AI technologies can have an effect. Poor patient adherence to preventive health care measures and poor health care professional adoption of novel screening tools are well-recognized issues.[33,34]
Applications in the biochemical analysis category (protein structure and binding prediction, drug discovery, other chemical compound characterization, mass spectroscopy, cell signaling pathways, and small molecule interactions) were deemed less translatable by both metrics. Although such applications have an intuitively longer path to clinical utility, these are nonetheless domains in which discoveries can be made that transform our understanding of disease and result in novel methods of diagnosis and treatment. Artificial intelligence approaches for predicting protein structure and functional properties from amino acid sequence include popular models such as AlphaFold (DeepMind), a deep neural network.[35] Analyzing RNA sequence patterns at the cellular level with machine learning models has improved understanding of gene interactions and expression.[36,37] Our analysis may be biased against these types of applications in favor of technologies with more immediate potential for translation.
Herein, we present, to our knowledge, a novel method for characterizing estimated biomedical research impact that uses NLP and data from NIH-awarded grants. We applied this approach to segment applications of AI in medicine and analyzed these applications with funding and citation data. The software developed for this study is open-source, making replication of these results and transfer of this method to other domains of interest straightforward. Other subjects could include an analysis of awards related to health disparities to highlight the inequities deemed most pressing based on academic interest or an analysis of what types of COVID-19–related research received the most NIH funding over the course of the pandemic.
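As a quick arithmetic check on the funding growth described in this Discussion, the sketch below recomputes the fold increase from the two endpoint totals reported in the abstract ($17.4 million in 1985 and $1.43 billion in 2020). The implied annual growth rate and doubling time assume a constant year-over-year rate between those endpoints, which is a simplification made here and not a result reported by the study; the variable names are illustrative.

```python
import math

# Endpoint totals taken from the abstract (nominal dollars).
funding_1985 = 17.4e6
funding_2020 = 1.43e9
years = 2020 - 1985  # 35-year span between the two endpoints

# Fold increase between the endpoints (the text rounds this to 80-fold).
fold_increase = funding_2020 / funding_1985

# ASSUMPTION: constant compound annual growth between the endpoints.
annual_growth = fold_increase ** (1 / years) - 1
doubling_time_years = math.log(2) / math.log(1 + annual_growth)

print(f"fold increase: {fold_increase:.1f}")
print(f"implied annual growth: {annual_growth:.1%}")
print(f"implied doubling time: {doubling_time_years:.1f} years")
```

The ratio comes out at a little over 82, consistent with the rounded 80-fold figure. Because the totals are nominal, the calculation does not adjust for inflation.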

Limitations

This study has limitations. First, although our approach identified a number of recognized medical applications of AI, it is hindered by the lack of a standardized definition of AI. Lack of a standardized definition limits our ability to determine whether our query captured the most complete set of AI-related NIH awards or whether awards were included with aims that could be deemed unrelated to AI. Second, the k-means algorithm is an imperfect method for unsupervised clustering, and there were likely varying degrees of topic overlap between the generated clusters. Conversely, the awards that were sorted into ill-defined clusters and subsequently excluded from the application analysis might have been appropriately included in a described cluster or separated into smaller defined clusters had a manual review been performed. Third, citation counts are an imperfect proxy for research impact: citation of an academic article may also be influenced by author reputation, research domain, negative citations, and self-citations.[38,39] Both citation-based translational impact metrics used in this study are simply proxies for the true measure of biomedical research impact: improvement in human health.
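To illustrate the second limitation, the following toy sketch runs k-means (Lloyd's algorithm) on term-count vectors built from four invented award titles. It is not the study's pipeline: the titles, the whitespace tokenizer, the choice of two clusters, and the hand-picked initial centroids are all assumptions made so that the example is small and deterministic.

```python
from collections import Counter


def vectorize(texts):
    """Turn texts into term-count vectors over a shared, sorted vocabulary."""
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, max_iter=100):
    """Lloyd's algorithm with caller-chosen initial centroids (deterministic)."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical award titles: two imaging-related, two protein-related.
titles = [
    "deep learning retinal image screening",
    "retinal image deep learning diagnosis",
    "protein structure prediction model",
    "protein binding structure prediction",
]
labels, centroids = kmeans(vectorize(titles), init_indices=(0, 2))
```

In this toy case the imaging titles and the protein titles separate cleanly. Because k-means makes hard assignments, a title that mixes both vocabularies would still be forced into exactly one cluster, which is the kind of topic overlap the limitation describes.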

Conclusions

Findings from this study suggest that numerous applications of AI in biomedical research are receiving exponentially increasing amounts of grant funding from the NIH and that these applications show varying degrees of estimated translational impact. Domains of biomedical research can be categorized using NIH research grant data to understand differences in academic productivity, funding support, and clinical translation.
  33 in total

Review 1.  Artificial intelligence in healthcare.

Authors:  Kun-Hsing Yu; Andrew L Beam; Isaac S Kohane
Journal:  Nat Biomed Eng       Date:  2018-10-10       Impact factor: 25.671

2.  Accelerating the Translation of Artificial Intelligence From Ideas to Routine Clinical Workflow.

Authors:  MingDe Lin
Journal:  Acad Radiol       Date:  2020-01       Impact factor: 3.173

3.  Improved protein structure prediction using potentials from deep learning.

Authors:  Andrew W Senior; Richard Evans; John Jumper; James Kirkpatrick; Laurent Sifre; Tim Green; Chongli Qin; Augustin Žídek; Alexander W R Nelson; Alex Bridgland; Hugo Penedones; Stig Petersen; Karen Simonyan; Steve Crossan; Pushmeet Kohli; David T Jones; David Silver; Koray Kavukcuoglu; Demis Hassabis
Journal:  Nature       Date:  2020-01-15       Impact factor: 49.962

4.  A New Comprehensive Colorectal Cancer Risk Prediction Model Incorporating Family History, Personal Characteristics, and Environmental Factors.

Authors:  Mark A Jenkins; Polly A Newcomb; Yingye Zheng; Xinwei Hua; Aung K Win; Robert J MacInnis; Steven Gallinger; Loic Le Marchand; Noralane M Lindor; John A Baron; John L Hopper; James G Dowty; Antonis C Antoniou; Jiayin Zheng
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  2020-01-13       Impact factor: 4.254

5.  Comparative effectiveness of screening strategies for colorectal cancer.

Authors:  Afsaneh Barzi; Heinz-Josef Lenz; David I Quinn; Sarmad Sadeghi
Journal:  Cancer       Date:  2017-01-24       Impact factor: 6.860

6.  Use of Natural Language Processing to Improve Identification of Patients With Peripheral Artery Disease.

Authors:  E Hope Weissler; Jikai Zhang; Steven Lippmann; Shelley Rusincovitch; Ricardo Henao; W Schuyler Jones
Journal:  Circ Cardiovasc Interv       Date:  2020-10-12       Impact factor: 6.546

7.  RNA-seq assistant: machine learning based methods to identify more transcriptional regulated genes.

Authors:  Likai Wang; Yanpeng Xi; Sibum Sung; Hong Qiao
Journal:  BMC Genomics       Date:  2018-07-20       Impact factor: 3.969

8.  Global Evolution of Research in Artificial Intelligence in Health and Medicine: A Bibliometric Study.

Authors:  Bach Xuan Tran; Giang Thu Vu; Giang Hai Ha; Quan-Hoang Vuong; Manh-Tung Ho; Thu-Trang Vuong; Viet-Phuong La; Manh-Toan Ho; Kien-Cuong P Nghiem; Huong Lan Thi Nguyen; Carl A Latkin; Wilson W S Tam; Ngai-Man Cheung; Hong-Kong T Nguyen; Cyrus S H Ho; Roger C M Ho
Journal:  J Clin Med       Date:  2019-03-14       Impact factor: 4.241

9.  Into the Black Box: What Can Machine Learning Offer Environmental Health Research?

Authors:  Charles W Schmidt
Journal:  Environ Health Perspect       Date:  2020-02-26       Impact factor: 9.031

10.  Key challenges for delivering clinical impact with artificial intelligence.

Authors:  Christopher J Kelly; Alan Karthikesalingam; Mustafa Suleyman; Greg Corrado; Dominic King
Journal:  BMC Med       Date:  2019-10-29       Impact factor: 8.775
