Literature DB >> 34720254

Tracking developments in artificial intelligence research: constructing and applying a new search strategy.

Na Liu1, Philip Shapira2,3, Xiaoxu Yue4.   

Abstract

Artificial intelligence, as an emerging and multidisciplinary domain of research and innovation, has attracted growing attention in recent years. Delineating the domain composition of artificial intelligence is central to profiling and tracking its development and trajectories. This paper puts forward a bibliometric definition for artificial intelligence which can be readily applied, including by researchers, managers, and policy analysts. Our approach starts with benchmark records of artificial intelligence captured by using a core keyword and specialized journal search. We then extract candidate terms from high frequency keywords of benchmark records, refine keywords and complement with the subject category "artificial intelligence". We assess our search approach by comparing it with other three recent search strategies of artificial intelligence, using a common source of articles from the Web of Science. Using this source, we then profile patterns of growth and international diffusion of scientific research in artificial intelligence in recent years, identify top research sponsors in funding artificial intelligence and demonstrate how diverse disciplines contribute to the multidisciplinary development of artificial intelligence. We conclude with implications for search strategy development and suggestions of lines for further research.
© The Author(s) 2021.

Entities:  

Keywords:  Artificial intelligence; Bibliometric analysis; Emerging technology; Research trends; Search strategy

Year:  2021        PMID: 34720254      PMCID: PMC8550099          DOI: 10.1007/s11192-021-03868-4

Source DB:  PubMed          Journal:  Scientometrics        ISSN: 0138-9130            Impact factor:   3.238


Introduction

Artificial intelligence is considered as a cutting-edge technology that is increasingly driving developments and innovations in a wide range of scientific, technological, business, and government fields (WIPO 2019a). The domain is experiencing a worldwide surge in attention from policymakers, universities and institutes, corporations and the public. However, what is artificial intelligence and how can it be defined for bibliometric searches? Computer scientist John McCarthy and colleagues introduced the term “artificial intelligence” in a proposal for a conference held at Dartmouth College in 1956 (McCarthy et al. 1955). He later described artificial intelligence as “the science and engineering of making intelligent machines, especially computer programs” (McCarthy 2007). Subsequently, further perspectives have been put forward on what constitutes artificial intelligence. While so far there does not appear to be a universally accepted definition of artificial intelligence (Buiten 2019; Wang 2019), there is convergence on core attributes. In his classic text, Nilsson (1998) maintains that “artificial intelligence … is concerned with intelligent behavior in artifacts” including through the development of machines that can perceive, reason, learn, communicate and act in complex environments “as well as humans can, or possibly better.” Similarly, artificial intelligence is discussed as a branch of computer science that focuses on creating systems that perform tasks usually requiring human intelligence (Chartrand et al. 2017; Russell and Bohannon 2015) or as the endowment of machines with human-like capabilities through simulating human consciousness and thinking processes using advanced algorithms or models (Jakhar and Kaur 2020). Other scholars describe artificial intelligence as a set of technologies or applications which enable machines or computers able to mimic the cognitive functions of the human brain (Tran et al. 2019). Although there are differences in standpoints as to the specific technologies and algorithmic approaches that be encompassed within meanings of artificial intelligence, examples often highlighted include machine learning, neural networks, deep learning, support vector machines, and inductive logic programming (WIPO 2019a; Morabit et al. 2019). Meanwhile, the burgeoning in recent years of artificial intelligence applications promises to reshape economies, employment, society and governance across the world (West and Allen 2018; Dang 2019). Far reaching developments are anticipated as artificial intelligence is applied to applications such as face recognition, computer vision, biometrics, monitoring, prediction, and decision-making and transforms fields including those of finance, medicine, e-commerce, traffic management, and public security (CBInsights 2019; Zhang et al. 2019). There are expectations that artificial intelligence will free humans from repetitive tasks, generate new insights and user engagements, and boost productivity (Davenport and Ronanki 2018; Uria-Recio 2019). However, widespread concerns have also been raised about the implications of artificial intelligence for the future of work and employment as well as for widening inequities in society, ethics and bias, threats to data security, privacy, and civil liberties (British Academy 2020; Morgan et al. 2020). The growth of artificial intelligence has been fueled by a series of scientific and technological advances across many disciplines, such as computer science, mathematics, neurosciences, engineering and linguistics, and massive improvements in computational power that enables the compilation, analysis and sharing of large volumes of data (WIPO 2019a). Public research funding and public policies have also stimulated and shaped the progression of artificial intelligence around the world (Loucks et al. 2019). While countries typically seek to deploy artificial intelligence to promote productivity, competitiveness and economic development, other goals are also variously pursued. For example, in the United States, innovation, technological leadership and national security have been emphasized; China now seeks these objectives too, alongside the use of artificial intelligence to boost manufacturing power and promote smart cities; and Japan highlights goals to bolster an aging but smart society through artificial intelligence (Appelbaum et al. 2018; Cath et al. 2018; OECD 2019; Mashiko 2020). Notably in China but also in multiple countries elsewhere, artificial intelligence for surveillance has been fostered (Feldstein 2019; Roberts et al. 2020). At the same time, in Europe, several US states, and in other countries, guidelines and policies now aim to address the ethical, data security, and privacy risks of artificial intelligence (AI HLEG 2019; EPIC 2020). Artificial intelligence has further been spurred by a ramp-up of venture capital and start-up businesses (OECD 2018; Walch 2020) as well as by massive private R&D investments especially from large corporations in the US such as Amazon, Apple, Facebook, IBM, Microsoft, and Google and in China by Alibaba, Baidu and Tencent (Webb 2019). In this context of the worldwide rise of artificial intelligence, increasing public and private investment, anticipations of widespread applications, national strategy development, and on-going debate about its regulation and governance, approaches that can clarify the scope of this broad field and trace its research and innovation pathways are fundamental. Insights from such research and innovation mapping and tracking are vital in informing researchers, funders, companies, policymakers and other stakeholders. However, because this field is broad, dynamic and fast-moving, there are fuzzy boundaries between legacy technologies, emerging technologies and other related technologies in the artificial intelligence field (WIPO 2019a). Artificial intelligence has a legacy in computer science stretching back over seven decades. At the same time, artificial intelligence has absorbed knowledge derived from many other fields, including probability statistics, mathematics, information engineering, linguistics, game theory and neuroscience (Jackson 2019). Artificial intelligence techniques and methods are also applied in a further wide and expanding array of fields, such as speech recognition, computer vision, robotics and operations management. In order to delineate the scope of artificial intelligence, we construct a new search strategy for bibliometric analyses of research and innovation that is able to robustly capture the variety and spread of artificial intelligence and related concepts and procedures. Our approach aims to improve upon the limited set of bibliometric approaches published to date and avoid being either too narrow or too broad. We apply a multi-stage and hybrid approach to determine relevant terms to be included in the bibliometric definition. The process involves building on, and extending from, a core corpus of scientific publications extracted from the Web of Science (WoS). The next section of this paper details our bibliometric search strategy for artificial intelligence and the steps and procedures involved. This is followed by an assessment where we undertake a comparative analysis to investigate how our search results compare with the search approaches put forward in a set of previous studies. We then use the search definition to undertake an analysis of key global trends, including growth over the last three decades, leading publishing countries and organizations, subfields, and key funding agencies. Finally, the last section of the paper highlights conclusions, limitations and some ideas for future work.

Construction of the bibliometric search query for artificial intelligence

Bibliometric methods that analyze publications and patents are commonly used to quantitatively profile and track the development and trajectories of science and technology, including in emerging fields (Guan and Liu 2014; Liu and Guan 2016; Shapira et al. 2017; Glänzel et al. 2019). These methods typically build on search strategies that can capture relevant publications or patents in emerging fields with high recall and precision. However, the intrinsic characteristics of emerging technology domains, including their novelty, boundary ambiguities and uncertain development trajectories, present significant definitional challenges (Rotolo et al. 2015). Among the bibliometric search approaches that are available to address these challenges are those that involve lexical keyword-based searches, the use of target domain journals, subject-category schemes, and citation and co-citation analyses (Huang et al. 2011; Arora et al. 2013). Lexical queries, using keywords, are relatively straightforward but depend on the reliability and objectivity of the expertise involved in defining keyword sets. A variation is an evolutionary lexical query with semi-automated iteration, for example by identifying core publications in an emerging field with a simple search strategy, identifying keywords and their frequency rank, repeating the search with highly-ranked keywords until convergence and involving experts in reviewing expanded keyword groups. This method still relies on the reliability of keyword selection and expert input (Huang et al. 2011, 2015). Search approaches using specific journal titles or subject categories in bibliographic databases are easily operationalized but face limitations for emerging technologies that are distributed or expanding across multiple disciplines and subject domains with outputs appearing in a widening array of journals (Huang et al. 2011; Shapira et al. 2017; Muñoz-Écija et al. 2019). Citation or co-citation search approaches start with a core set of articles exemplifying the emerging technology, adding in papers identified through citation networks and bibliographic coupling (Zitt and Bassecoulard 2006). Citation or co-citation approaches are sensitive to the starting corpus definition, have citation time-lag limitations (an issue in a fast-emerging field), and require a high level of proprietary data access (Mogoutov and Kahane 2007). Noting that each of these methods has advantages and disadvantages, it has been recognized that bibliometric search strategies do not necessarily have to employ only one approach. Greater attention has been focused in recent years on combining methods, particularly in developing search strategies for emerging fields (Huang et al. 2015; Shapira et al. 2017; Muñoz-Écija et al. 2019; Wang et al. 2019). We similarly adopt a hybrid approach to constructing a search strategy for emerging artificial intelligence through a systematic process that takes advantage of multiple methods. Our search approach seeks to capture not only publications clearly acknowledged as artificial intelligence but also publications that should be included in the artificial intelligence field, even though their titles, abstracts or keywords may not involve the core term “artificial intelligence”. There are four key steps in the procedure we use to build a search strategy (Fig. 1). First, we generate a benchmark set of artificial intelligence publications. We use the core lexical query “artificial intelligence” as a topic search as well as a query of specialized artificial intelligence journals as a source search. Second, from these benchmark records, we extract “Author Keywords” and “Keywords Plus” and derive the frequencies of these keywords. We confirm the precise meanings of high-frequency keywords from descriptions found in online sources. This process leads to a retained list of high-frequency “candidate keywords” related to artificial intelligence. Third, to maintain balance between recall and precision, we test and refine this set of terms through co-occurrence analysis and manual checking identification. Fourth, we augment our strategy by combining the final term set with the use of a subject category search. These procedures are consecutive and are detailed in the next section.
Fig. 1

Overview of artificial intelligence search strategy

Overview of artificial intelligence search strategy

Artificial intelligence bibliometric search strategy

Retrieving artificial intelligence benchmark records

Gathering benchmark records is the essential first step in our bibliometric search strategy. In the artificial intelligence field, the term “artificial intelligence” itself is extremely central. Accordingly, we use it directly as a seed search term in the “Topic” field of the WoS Science Citation Index Expanded (SCI-Expanded) and Social Sciences Citation Index (SSCI) databases. An initial search was conducted for all publication years on 21 February 2020, resulting in 24,807 records. In viewing these publications, we found that many are concerned with the application of artificial intelligence technologies in specific industrial contexts. Such papers were not relevant to our purpose of developing a conceptual search strategy. A similar observation is found in Zhou et al. (2019) in their search of “artificial intelligence” in the WoS “Title” field. To anchor our search for additional keywords germane to the core of research on artificial intelligence, we focused on the WoS subject categories of “Computer Science, Artificial Intelligence”, “Computer Science, Information Systems”, “Computer Science, Interdisciplinary Applications”, “Computer Science, Theory & Methods”, “Computer Science, Software Engineering”, “Computer Science, Hardware & Architecture”, “Computer Science, Cybernetics”, and “Robotics”. For the 9422 publication records in these eight WoS subject categories, we manually reviewed their titles and abstracts and deleted 818 records that dealt with applications. This refining process reduced the set to 8604 records. The concept of “artificial intelligence”, as discussed in the opening parts of this paper, refers to the design of machines, programs and systems that can act with human-like reasoning and decision-making capabilities. While “artificial intelligence” is a central term, we recognized that would miss other relevant core publications if we used only this umbrella topic to identify benchmark records. To extend our core search, we also included specialized journals at the epicenter of the artificial intelligence domain. We identified 19 specialized journals that focus on artificial intelligence (Table 1). These specialized journals were chosen from the Scimago Journal Rankings for artificial intelligence (SJR 2020) and the recommended journal list of the China Computer Federation (CCF 2019). We only selected top-tier journals that focus on core artificial intelligence technologies; we eschewed journals that emphasized functional applications of artificial intelligence (for example, the journal Artificial Intelligence in Medicine was not selected). Of the 19 chosen top-tier journals, all are international journals; 11 are identified by both Scimago and the China Computer Federation, while the other eight are from Scimago; and all are found in the WoS and located in the subject category of “artificial intelligence”. We searched these specialized journals (all years) in the WoS on 26 February 2020. The specialized journal search resulted in a set of 32,640 records of all publication types after cleaning duplicated records.
Table 1

Specialized artificial intelligence journals

No.JournalPublisherYear foundedWebsitePublication periodSource
1Artificial intelligenceElsevier1970https://www.journals.elsevier.com/artificial-intelligence/MonthlyBoth
2Journal of machine learning researchMicrotome2001http://jmlr.org/BimonthlyBoth
3Autonomous agents and multi-agent systemsSpringer1998https://www.springer.com/journal/10458BimonthlyBoth
4IEEE transactions on neural networks and learning systemsIEEE2012https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=5962385MonthlyBoth
5Journal of artificial intelligence researchAAAI1993https://www.jair.org/index.php/jairIrregularBoth
6Machine learningSpringer1990https://www.springer.com/journal/10994MonthlyBoth
7Computational intelligenceWiley-Blackwell1995https://onlinelibrary.wiley.com/journal/14678640QuarterlyBoth
8Expert systemsWiley-Blackwell1994https://onlinelibrary.wiley.com/journal/14680394BimonthlyBoth
9International journal of intelligent systemsWiley1987https://onlinelibrary.wiley.com/journal/1098111xMonthlyBoth
10NeurocomputingElsevier1992https://www.journals.elsevier.com/neurocomputing/BimonthlyBoth
11Journal of experimental and theoretical artificial intelligenceTaylor and Francis1993https://www.tandfonline.com/toc/teta20/currentQuarterlyBoth
12IEEE computational intelligence magazineIEEE2006https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=10207QuarterlyScimago
13Artificial intelligence reviewSpringer1988https://www.springer.com/journal/10462BimonthlyScimago
14Autonomous robotsSpringer1996https://www.springer.com/journal/10514BimonthlyScimago
15International journal of machine learning and cyberneticsSpringer2010https://www.springer.com/journal/13042MonthlyScimago
16ACM transactions on intelligent systems and technologyAssociation for Computing Machinery2010https://dl.acm.org/journal/tistBimonthlyScimago
17AI magazineAAAI1987https://www.aaai.org/Magazine/magazine.phpQuarterlyScimago
18Progress in artificial intelligenceSpringer2015https://www.springer.com/journal/13748QuarterlyScimago
19Swarm intelligenceSpringer2010https://www.springer.com/journal/11721QuarterlyScimago

Derived from top-tier artificial intelligence journal listings in Scimago Journal Rankings (SJR 2020) and the China Computer Federation (2019). See discussion in text. “Both” indicates nomination from both Scimago and CCF

Specialized artificial intelligence journals Derived from top-tier artificial intelligence journal listings in Scimago Journal Rankings (SJR 2020) and the China Computer Federation (2019). See discussion in text. “Both” indicates nomination from both Scimago and CCF

Adding keywords from (co-)occurrence analysis

In the second major step of our search strategy, we extracted “Author Keywords” and “Keywords Plus” from our corpus of benchmark records and counted the frequencies of these two types of keywords. We eliminated keywords that appeared fewer than three times and also some generic phases such as “system”, “design”, “information”, “complexity” and “dynamic”. Additionally, the precise meaning of each of these keywords was ascertained by checking online web sources and Wikipedia. This led to a set of high-frequency “candidate keywords” related specifically to artificial intelligence. In total, 214 candidate terms were retained, comprising 111 keywords derived from the “artificial intelligence” topic search, with a balance of 103 non-duplicated candidate terms added through the specialized journal search.

Refining candidate keywords using (co-)occurrence analysis and hit ratio screening

Keywords that most commonly co-occurred with the central term “artificial intelligence” should themselves become part of the core lexical query. Hence, from our focal data set, we extracted benchmark records that included the keyword “artificial intelligence”. We performed a keyword co-occurrence analysis for these extracted records. This process allowed us to identify nine keywords as core lexical because they frequently co-occurred with the term “artificial intelligence”. This enabled the following “Topic Search” (TS) core lexical query for our artificial intelligence search strategy: TS = (“Artificial Intelligen*” or “Neural Net*” or “Machine* Learning” or “Expert System$” or “Natural Language Processing” or “Deep Learning” or “Reinforcement Learning” or “Learning Algorithm$” or “*supervised Learning” or “Intelligent Agent*”). To determine which search terms to accept among the remaining 204 candidate keywords, we introduced a simple “Hit Ratio” and performed manual checking. The search result obtained by using the ten-term core lexical query (as above) is denoted as group A. The search result obtained by using each of the remaining 204 candidate keywords forms group B. We then defined the “Hit Ratio” for each candidate keyword C as: HitRatio = (A ∩ B)/B. The ratio signifies how many records captured by a candidate keyword are also captured by our core lexical query. We proceed as in Huang et al. (2015) by adopting a two-step process to assess whether a candidate term should be accepted or not into the next stage of our expanded lexical query. To be specific, if HitRatio ≥ 70%, then we directly included the candidate keyword C into the expanded lexical query that is part of our final search strategy. If HitRatio ≤ 30%, we excluded the candidate keyword. If 30% < HitRatio < 70%, then a manual check was performed. For the manual check, we reviewed the search records captured by the candidate keyword C in the area of (B not (A ∩ B)). Specifically, we manually checked the abstracts of a random sample of 25 WoS records captured by the candidate keyword C but not captured by the core lexical query. To acquire the random sample, we sorted the records falling in the area of (B not (A ∩ B)) alphabetically by authors. This avoids clustering of usage changes of terms over time if sorted by publication date. We randomly selected abstracts to read and estimated how many out of each 25-record sample were related to artificial intelligence. If greater than 50% of the sample comprised publications relevant to artificial intelligence, the candidate keyword was included in our final search strategy, deeming this candidate keyword as having a low noise ratio (LR). If less than 50% of the sample were relevant artificial intelligence records, then we excluded that candidate keyword from our final search list and deemed it as having a high noise ratio (HR). After applying the Hit Ratio procedure to the set of 204 candidate keywords, 28 candidate keywords have Hit Ratios in the range of 70.53 to 97.90% (Table 2). This indicates that more than 70% of the records searched by each keyword are also captured by our core lexical query, revealing that these keywords have a high relatedness to the field of artificial intelligence. For a further 84 candidate keywords, we find Hit Ratios between 30 and 70%. These candidate keywords were each subject to a manual check, as described above. As an example, “adaptive learning” is one of these candidate keywords. This keyword appears in 1514 published records in WoS SCI-Expanded and SSCI in the period 2010 though to 23 March 2020, of which 912 are not captured by our core lexical query. However, only 12/25 of the random record sample taken from the non-captured records were deemed to be on target and relevant for artificial intelligence research. This keyword was not added to our final search query. Another example, “multiple kernel learning” (or “multi-kernel learning” or “multikernel learning”), appears in 694 published records in the WoS databases over the same period, of which 435 are not captured by the core lexical query. In the manual check of 25 records sampled at random from the non-captured set, all were found to be related to artificial intelligence. This keyword is included in the final search set. After performing manual checks, 61 of the 84 candidate keywords were added to the final search query (Table 3).
Table 2

Candidate keywords directly included in the search strategy

NumberKeywordsCandidate termsBA ∩ BHit ratio (%)Final decision
1Backpropagation Learning“Backpropagation Learning” or “Back-propagation Learning” or “Bp Learning”38137397.9Include
2Backpropagation Algorithm“Backpropagation Algorithm*” or “Back-propagation Algorithm*”1348125292.9Include
3Long Short-term Memory“Long Short-term Memory”2316211191.2Include
4Pcnn(Pcnn$ not Pcnnt) or “Pulse Coupled Neural Net*”32128689.1Include
5Perceptron“Perceptron$”5836504286.4Include
6Neuro Evolution“Neuro-evolution” or Neuroevolution13211486.4Include
7Liquid State Machine“Liquid State Machine*”474085.1Include
8Deep Belief Net“Deep Belief Net*”86172384.0Include
9Radial Basis Function Network“Radial Basis Function Net*” or Rbfnn* or “Rbf Net*”1985165483.3Include
10Deep Network“Deep Net*”111993083.1Include
11AutoencoderAutoencoder*1996164482.4Include
12Committee Machine“Committee Machine*”14011582.1Include
13Training Algorithm“Training Algorithm$”1533125281.7Include
14Backpropagation Network“Backpropagation Net*” or “Back-propagation Net*” or “Bp Network*”56645680.6Include
15Q learning“Q learning”121898080.5Include
16Convolutional Network“Convolution* Net*”1796144380.4Include
17Actor-critic Algorithm“Actor-critic Algorithm$”695579.7Include
18Feedforward Network“Feedforward Net*” or “Feed-Forward Net*”116892979.5Include
19Hopfield Network“Hopfield Net*”19815779.3Include
20NeocognitronNeocognitron*463678.3Include
21XgboostXgboost*37228877.4Include
22Boltzmann Machine“Boltzmann Machine*”84965577.2Include
23Activation Function“Activation Function$”2337180077.0Include
24Neurodynamic Programming“Neurodynamic Programming” or “Neuro dynamic Programming”403075.0Include
25Learning Model“Learning Model*”8007579072.3Include
26NeurocomputingNeurocomputing or “Neuro-Computing”14810671.6Include
27Temporal Difference Learning“Temporal Difference Learning”1218671.1Include
28Echo State Network“Echo State* Net*”43130470.5Include

Analysis of articles in SCI-E and SSCI in WoS core collection (2010-March 2020). Document type: articles; Language: English

Table 3

Candidate keywords subject to manual review

NumberKeywordsCandidate terms B A ∩ BHit ratio (%)NNoise RatioFinal decision
1Transfer Learning“Transfer Learning”2269158870.021LRInclude
2Gradient Boosting“Gradient Boosting”115280469.825LRInclude
3Adversarial Learning“Adversarial Learning”18712969.025LRInclude
4Feature Learning“Feature Learning”1574108568.925LRInclude
5Heuristic Dynamic Programming“Heuristic Dynamic Programming”996868.75HRExclude
6Generative Adversarial Network“Generative Adversarial Net*”108073868.323LRInclude
7Representation Learning“Representation Learning”79353267.124LRInclude
8Multiagent Learning“Multiagent Learning” or “Multi-agent Learning”1067167.025LRInclude
9Reservoir Computing“Reservoir Computing”36123865.918LRInclude
10Co-training“Co-training”18211462.624LRInclude
11Pac Learning“Pac Learning” or “Probabl* Approximate* Correct Learning”644062.525LRInclude
12Extreme Learning Machine“Extreme Learning Machine*”3842239462.324LRInclude
13Instance-based Learning“Instance-based Learning”1528958.610HRExclude
14Recurrent Network“Recurrent* Net*”71241658.44HRExclude
15Competitive Learning“Competitive Learning”24513457.511HRExclude
16Ensemble Learning“Ensemble Learning”1935111057.425LRInclude
17Learning Rule“Learning Rule*”113263956.59HRExclude
18Propagation Algorithm“Propagation Algorithm$”163792056.25HRExclude
19Machine Intelligence“Machine* Intelligen*”29116255.724LRInclude
20Neuro fuzzy“Neuro fuzzy” or Neurofuzzy4324237955.025LRInclude
21Stochastic gradient descent“Stochastic gradient descent”32158554.911HRExclude
22Lazy Learning“Lazy Learning”643554.725LRInclude
23Multiple-instance Learning“Multi* instance Learning” or “Multiinstance Learning”39521353.925LRInclude
24Multi-task Learning“Multi* task Learning” or “Multitask Learning”92850053.925LRInclude
25Computational Intelligence“Computation* Intelligen*”151181353.825LRInclude
26Neural Model“Neural Model*”141175653.625LRInclude
27Multi Label Learning“Multi* Label Learning” or “Multilabel Learning”42022553.625LRInclude
28Similarity Learning“Similarity Learning”1527851.325LRInclude
29Statistical Relational Learning“Statistical Relation* Learning”804151.325LRExclude
30Support Vector Regression“Support* Vector* Regression”4655235950.725LRInclude
31Manifold Regularization“Manifold Regulari?ation”31015750.725LRInclude
32Decision Forest“Decision Forest*”1919650.324LRInclude
33Generalization Error“Generali?ation Error*”46923249.524LRInclude
34Adaptive Dynamic Programming“Adaptive Dynamic Programming” or “Approximat* Dynamic Programming”92645749.45HRExclude
35Transductive Learning“Transductive Learning”1226049.225LRInclude
36NeuroroboticsNeurorobotic* or “Neuro-robotic*”1105449.125LRInclude
37Inductive Logic Programming“Inductive Logic Programming”1225948.425LRInclude
38Natural Language Understanding“Natural Language Understanding”1205747.524LRInclude
39AdaboostAdaboost* or “Adaptive Boosting”170780146.923LRInclude
40Incremental Learning“Incremental Learning”96745246.716LRInclude
41Random Forest“Random Forest*”14,190659446.523LRInclude
42Cognitive Computing“Cognitive Computing”1908846.37HRExclude
43Metric Learning“Metric Learning”89040745.725LRInclude
44Neural Gas“Neural Gas”1657545.524LRInclude
45Grammatical Inference“Grammatical Inference”622845.225LRInclude
46Support Vector Machine“Support* Vector* Machine*”34,27815,25044.520LRInclude
47Multi Label Classification“Multi* Label Classification” or “Multilabel Classification”66829744.518LRInclude
48ChatbotChatbot*1536743.88HRExclude
49Conditional Random Field“Conditional Random Field*”129656243.419LRInclude
50Intelligent System“Intelligent System*”2365101843.011HRExclude
51Multi Class Classification“Multi* Class Classification” or “Multiclass Classification”126254243.017LRInclude
52Mixture Of Experts“Mixture Of Expert*”1737442.823LRInclude
53Concept Drift“Concept* Drift”44719142.725LRInclude
54Genetic Programming“Genetic Programming”226795742.218LRInclude
55String Kernel“String Kernel*”883742.114LRInclude
56Learning To Rank“Learning To Rank*” or “Machine-learned ranking”39516441.525LRInclude
57Boosting Algorithm“Boosting Algorithm$”43618141.525LRInclude
58Robot Learning“Robot* Learning”2008341.521LRInclude
59Relevance Vector Machine“Relevance Vector* Machine*”55022841.525LRInclude
60Feature Selection“Feature Selection”14,472583340.312HRExclude
61Computational Learning“Computational Learning”1335339.99HRExclude
62Adaptive Learning“Adaptive Learning”151460239.812HRExclude
63Gradient Descent“Gradient Descent”3454132738.47HRExclude
64Pattern Classification“Pattern Classification”249795238.111HRExclude
65ConnectionismConnectionis*1395338.120LRInclude
66Multiple Kernel Learning“Multi* Kernel$ Learning” or “Multikernel$ Learning”69425937.325LRInclude
67Graph Learning“Graph Learning”1726437.217LRInclude
68Naive Bayes Classifier“Naive Bayes* Classifi*”111941236.814LRInclude
69Rule-based System“Rule-based System$”76827435.721LRInclude
70Classification Algorithm“Classification Algorithm*”5510196035.615LRInclude
71Graph Kernel“Graph* Kernel*”1986934.921LRInclude
72Rule Induction“Rule* Induction”31611034.822LRInclude
73Feature Extraction“Feature Extraction”18,493636834.412HRExclude
74Decision Tree“Decision Tree*”11,257384834.211HRExclude
75Generative Model“Generative Model*”170256933.410HRExclude
76Intelligent Control“Intelligent Control*”146548733.27HRExclude
77Manifold Learning“Manifold Learning”133144233.221LRInclude
78Structured Learning“Structur* Learning”105935133.19HRExclude
79Label Propagation“Label Propagation”54117832.925LRInclude
80Hypergraph Learning“Hypergraph* Learning”672232.825LRInclude
81Case-based Reasoning“Case-based Reasoning”100732732.58HRExclude
82One Class Classifiers“One Class Classifi*”48215632.424LRInclude
83Intelligent Algorithm“Intelligent Algorithm*”88428532.225LRInclude
84Bio Inspired Computing“Bio* Inspired Computing” or “Bioinspired Computing”2006130.512HRExclude

Analysis of articles in SCI-E and SSCI in WoS core collection (2010-March 2020). Document type: article. Language: English. N represents the number of records out of a 25-record random sample falling in the area of (B not A ∩ B) relevant artificial intelligence records. HR represents “High noise ratio”, with less than 50% of the 25-record random sample falling in the area of (B not A ∩ B) relevant artificial intelligence records. LR represents “Low noise ratio”, with more than 50% of the 25-record random sample falling in the area of (B not A ∩ B) relevant artificial intelligence records

Candidate keywords directly included in the search strategy Analysis of articles in SCI-E and SSCI in WoS core collection (2010-March 2020). Document type: articles; Language: English Candidate keywords subject to manual review Analysis of articles in SCI-E and SSCI in WoS core collection (2010-March 2020). Document type: article. Language: English. N represents the number of records out of a 25-record random sample falling in the area of (B not A ∩ B) relevant artificial intelligence records. HR represents “High noise ratio”, with less than 50% of the 25-record random sample falling in the area of (B not A ∩ B) relevant artificial intelligence records. LR represents “Low noise ratio”, with more than 50% of the 25-record random sample falling in the area of (B not A ∩ B) relevant artificial intelligence records There were 92 candidate keywords with a Hit Ratio lower than 30% (Table 4). These keywords captured records with a low degree of overlap (A ∩ B) with those captured by the core lexical query. These keywords were deemed as low relevance to artificial intelligence and were not included in the final search strategy. A particular example is the term “AI”, which is common abbreviation for artificial intelligence. As a candidate keyword, the Hit Ratio for “AI” is only 17.8% in terms of overlap with those records captured by the core lexical query. Based on the 30% criteria, “AI” is not included for consideration in the final search set. On investigation, we find that “AI” has multiple meanings. A search of the Acronym Finder (AF 2020), finds 164 meanings for “AI”. Of these, 50 are in science and medicine, including Adequate Intake, Adaptive Iteration, Aridity Index, Artificial Insemination, Active Ingredient, Avian Influenza, Aromatase Inhibitor and Associative Ionization. These multiple meanings of “AI” are frequent in the titles, abstracts or keywords of WoS publications. For example, “Artificial Insemination” and “AI” have a high co-occurrence (in more than 10,500 WoS publications at the time our search). This confirms that “AI” is a poor identifier for artificial intelligence publications (and we do not include it in our final search set).
Table 4

Candidate keywords excluded from the search strategy

NumberKeywordsCandidate termsBA ∩ BHit ratio (%)Final decision
1Cognitive Robotics“Cognitive Robotic*”1835429.5Exclude
2Knowledge-based System“Knowledge-based System$”69220229.2Exclude
3Affective Computing“Affective Computing”60317428.9Exclude
4Computer Vision“Computer Vision”11,386326828.7Exclude
5Text Mining“Text Mining”5123146728.6Exclude
6Natural Language Generation“Natural Language Generation”1303728.5Exclude
7Supervised Classification“*supervised Classification”357899827.9Exclude
8Dictionary Learning“Dictionary Learning”192251927.0Exclude
9Online Learning“Online Learning”4199112926.9Exclude
10Preference Learning“Preference Learning”2336226.6Exclude
11Kernel Pca“Kernel* Pca” or “Kernel* Principal Component Analys*”75019425.9Exclude
12Data Mining“Data Mining”18,117462625.5Exclude
13Anomaly Detection“Anomaly Detection”352587224.7Exclude
14Artificial Immune System“Artificial Immune System*”68916223.5Exclude
15Kernel Method“Kernel* Method*”220249322.4Exclude
16Fuzzy Logic“Fuzzy Logic”12,350276222.4Exclude
17Latent Dirichlet Allocation“Latent Dirichlet Allocation”108423421.6Exclude
18Gaussian Kernel“Gaussian Kernel*”128427521.4Exclude
19Autonomous Learning“Autonomous Learning”2635621.3Exclude
20Regression Tree“Regression Tree*”5394113721.1Exclude
21Pattern Recognition“Pattern Recognition”19,626413621.1Exclude
22Evolutionary Computation“Evolutionary Comput*”255953821.0Exclude
23Automated Planning“Automated Planning”2485221.0Exclude
24Firefly Algorithm“Firefly Algorithm$”128827021.0Exclude
25Learning Automata“Learning Automata” or “Learning Automaton”52310920.8Exclude
26Bayesian Learning“Bayes* Learning”111723220.8Exclude
27Topic Model“Topic Model*”205642220.5Exclude
28Knowledge Representation“Knowledge Representation”200740920.4Exclude
29Machine Vision“Machine* Vision”266654020.3Exclude
30Granular Computing“Granular Computing”55611220.1Exclude
31Clonal Selection Algorithm“Clonal Selection Algorithm$”2244520.1Exclude
32Active Learning“Active Learning”388977920.0Exclude
33Speech Recognition“Speech Recognition”501299519.9Exclude
34Markov Decision Process“Markov Decision Process*”303259619.7Exclude
35Probabilistic Relational Model“Probabilistic Relational Model*”31619.4Exclude
36Game Tree“Game Tree*”881719.3Exclude
37Big Data“Big Data”16,201302718.7Exclude
38Bayesian Network“Bayes* Net*”6079110318.1Exclude
39Gaussian Process“Gaussian Process*”6329113918.0Exclude
40Classification Tree“Classification Tree*”178731617.7Exclude
41Commonsense Reasoning“Commonsense Reasoning”51917.7Exclude
42Particle Swarm Optimization“Particle Swarm Optimi?ation”21,909385417.6Exclude
43Autonomous Robot“Autonomous Robot*”116820117.2Exclude
44Genetic Algorithm“Genetic Algorithm$”49,488833016.8Exclude
45Face Recognition“Face Recognition”7813128716.5Exclude
46Probabilistic Logic“Probabilistic Logic”2183516.1Exclude
47Latent Semantic Analys“Latent Semantic Analys*”69211116.0Exclude
48Recommendation System“Recommender System$” or “Recommendation System$”423966715.7Exclude
49Junction Tree“Junction Tree*”771215.6Exclude
50Ambient Intelligence“Ambient Intelligen*”65010015.4Exclude
51Kernel Regression“Kernel* Regression”68110415.3Exclude
52Swarm Intelligence“Swarm Intelligen*”240336415.2Exclude
53Hidden Markov Model“Hidden Markov Model*”6672100815.1Exclude
54Logic Programming“Logic Programming”73610914.8Exclude
55Artificial Bee Colony“Artificial Bee Colony”256937814.7Exclude
56Association Rule“Association Rule*”237733714.2Exclude
57Autonomous Agent“Autonomous Agent$”92312813.9Exclude
58Ant Colony Optimization“Ant Colony Optimi?ation”370449013.2Exclude
59Expectation Propagation“Expectation Propagation”1291713.2Exclude
60Automated Reasoning“Automated Reasoning”2553312.9Exclude
61Collaborative Filtering“Collaborative Filtering”194825012.8Exclude
62Flower Pollination Algorithm“Flower Pollination Algorithm$”2923712.7Exclude
63Evolutionary Algorithm“Evolution* Algorithm*”13,331165112.4Exclude
64Discriminant Analysis“Discriminant Analys*”18,374221712.1Exclude
65Heuristic Search“Heuristic Search”102412211.9Exclude
66Emotion Recognition“Emotion* Recognition”432250811.8Exclude
67Proximal Gradient“Proximal Gradient”4365111.7Exclude
68Multi-agent System“Multi* Agent System*” or “Multiagent System*”9776111811.4Exclude
69Bee Colony Algorithm“Bee Colony Algorithm$”176520111.4Exclude
70Matrix Factorization“Matrix Factori?ation”638968210.7Exclude
71Graph Mining“Graph$ Mining” or “Graphic* Mining”368369.8Exclude
72Memetic Algorithm“Memetic Algorithm$”11471069.2Exclude
73Multi Robot System“Multi* Robot* System*” or “Multirobot* System*”947879.2Exclude
74Anytime Algorithm“Anytime Algorithm$”8078.8Exclude
75Coordinate Descent“Coordinate Descent”1052908.6Exclude
76Graphical Model“Graph* Model*”56274688.3Exclude
77Swarm Robotics“Swarm Robotic*”277238.3Exclude
78Pattern Mining“Pattern Mining”1115877.8Exclude
79Structured Prediction“Structur* Prediction”67864797.1Exclude
80Spatial Reasoning“Spatial Reasoning”358257.0Exclude
81Cloud Computing“Cloud Computing”11,5157686.7Exclude
82Belief Propagation“Belief Propagation”1430946.6Exclude
83Bayesian Model“Bayes* Model*”78594655.9Exclude
84Em Algorithm“Em Algorithm$”43912395.4Exclude
85Heuristic Algorithm“Heuristic Algorithm$”69983635.2Exclude
86Clique Tree“Clique Tree*”4124.9Exclude
87Bayesian Inference“Bayes* Inference”10,9525104.7Exclude
88Markov Chain“Markov Chain*”20,0587553.8Exclude
89Agent-based Model“Agent-based Model*”51811653.2Exclude
90Description Logic“Descripti* Logic”361113.1Exclude
91Logistic Regression“Logistic Regression”177,86936202.0Exclude
92AI“AI”17,949311917.4Exclude

Analysis of articles in SCI-E and SSCI in WoS core collection (2010-March 2020). Document type: article. Language: English

Candidate keywords excluded from the search strategy Analysis of articles in SCI-E and SSCI in WoS core collection (2010-March 2020). Document type: article. Language: English

Final search approach

The full set of keywords for our artificial intelligence search strategy encompasses one core lexical query and two expanded lexical queries. The core lexical query is comprised of the ten core keywords identified at the second step of our procedure. Expanded lexical query 1 is made up of 28 keywords whose Hit Ratio compared with the set of records generated by our core lexical query is greater than 70% (Table 2). Expanded lexical query 2 consists of 61 manually-checked keywords with low noise ratios (Table 3). To complete the strategy, we also included the WoS subject category of “artificial intelligence” in the final search set. Scientific journals are assigned to specific categories in the WoS following consideration of their titles, scopes and citation patterns (Muñoz-Écija et al. 2019). It is recognized that subject category schemes are most helpful in delineating mature fields with relatively well-defined boundaries but insufficient for demarcating dynamic and multidisciplinary domains (Wang et al. 2019). It is thus not advisable to exclusively use subject categories in defining artificial intelligence. But, as a complement to the keyword-based approach that we have derived, the inclusion of the WoS artificial intelligence subject category adds a curated and peer-reviewed set of publications in journals that have been separately evaluated as within the field of artificial intelligence. The three lexical queries derived through the systematic procedure described in this section provide the capability to capture artificial intelligence publications across other WoS subject categories. The specialized artificial intelligence journals we identified in Table 1 are not included in the final search strategy because all of their records can be captured by the WoS subject category “artificial intelligence”. The final search approach for artificial intelligence is set out in Table 5.
Table 5

Final search approach for artificial intelligence

NoSearch strategySearch terms
# 1Core lexical queryTS = (“Artificial Intelligen*” or “Neural Net*” or “Machine* Learning” or “Expert System$” or “Natural Language Processing” or “Deep Learning” or “Reinforcement Learning” or “Learning Algorithm$” or “*Supervised Learning” or “Intelligent Agent*”)
# 2Expanded lexical query 1TS = ((“Backpropagation Learning” or “Back-propagation Learning” or “Bp Learning”) or (“Backpropagation Algorithm*” or “Back-propagation Algorithm*”) or “Long Short-term Memory” or ((Pcnn$ not Pcnnt) or “Pulse Coupled Neural Net*”) or “Perceptron$” or (“Neuro-evolution” or Neuroevolution) or “Liquid State Machine*” or “Deep Belief Net*” or (“Radial Basis Function Net*” or Rbfnn* or “Rbf Net*”) or “Deep Net*” or Autoencoder* or “Committee Machine*” or “Training Algorithm$” or (“Backpropagation Net*” or “Back-propagation Net*” or “Bp Network*”) or “Q learning” or “Convolution* Net*” or “Actor-critic Algorithm$” or (“Feedforward Net*” or “Feed-Forward Net*”) or “Hopfield Net*” or Neocognitron* or Xgboost* or “Boltzmann Machine*” or “Activation Function$” or (“Neurodynamic Programming” or “Neuro dynamic Programming”) or “Learning Model*” or (Neurocomputing or “Neuro-Computing”) or “Temporal Difference Learning” or “Echo State* Net*”)
# 3Expanded lexical query 2TS = (“Transfer Learning” or “Gradient Boosting” or “Adversarial Learning” or “Feature Learning” or “Generative Adversarial Net*” or “Representation Learning” or (“Multiagent Learning” or “Multi-agent Learning”) or “Reservoir Computing” or “Co-training” or (“Pac Learning” or “Probabl* Approximate* Correct Learning”) or “Extreme Learning Machine*” or “Ensemble Learning” or “Machine* Intelligen*” or (“Neuro fuzzy” or Neurofuzzy) or “Lazy Learning” or (“Multi* instance Learning” or “Multiinstance Learning”) or (“Multi* task Learning” or “Multitask Learning”) or “Computation* Intelligen*” or “Neural Model*” or (“Multi* label Learning” or “Multilabel Learning”) or “Similarity Learning” or “Statistical Relation* Learning” or “Support* Vector* Regression” or “Manifold Regulari?ation” or “Decision Forest*” or “Generali?ation Error*” or “Transductive Learning” or (Neurorobotic* or “Neuro-robotic*”) or “Inductive Logic Programming” or “Natural Language Understanding” or (Adaboost* or “Adaptive Boosting”) or “Incremental Learning” or “Random Forest*” or “Metric Learning” or “Neural Gas” or “Grammatical Inference” or “Support* Vector* Machine*” or (“Multi* label Classification” or “Multilabel Classification”) or “Conditional Random Field*” or (“Multi* class Classification” or “Multiclass Classification”) or “Mixture Of Expert*” or “Concept* Drift” or “Genetic Programming” or “String Kernel*” or (“Learning To Rank*” or “Machine-learned Ranking”) or “Boosting Algorithm$” or “Robot* Learning” or “Relevance Vector* Machine*” or Connectionis* or (“Multi* Kernel$ Learning” or “Multikernel$ Learning”) or “Graph Learning” or “Naive bayes* Classifi*” or “Rule-based System$” or “Classification Algorithm*” or “Graph* Kernel*” or “Rule* induction” or “Manifold Learning” or “Label Propagation” or “Hypergraph* Learning” or “One class Classifi*” or “Intelligent Algorithm*”)
#4WoS categoryWC = (“Artificial Intelligence”)
#5Total#1 OR #2 OR #3 OR #4
Final search approach for artificial intelligence

Comparative analysis of different search strategies for artificial intelligence

In the context of discussion about contrasting bibliometric search strategies and methods to define emerging fields, as highlighted earlier in this paper, it is appropriate and desirable to compare results from new approaches with those available in other studies. To undertake such benchmarking for our search approach, we undertook a comparative analysis of search strategies and results with three other recent bibliometric studies of artificial intelligence. In the first study—an analysis of research on artificial intelligence—Gao et al. (2019) acknowledged the wide range of the artificial intelligence research domain, although they use a fairly straightforward and restricted topic search in the WoS based on TS = (“artificial intelligence”). In the second study, where artificial intelligence was examined to detect technological recombination, Zhou et al. (2019) apply a title search TI = (AI or “artificial intelligence”) in the WoS. For the time period of their search, 374 publications were found, from which 23 core papers were identified. Keywords were extracted from these articles and also combined with expert review to add expanded search terms. In the third study, presenting worldwide trends in innovation in artificial intelligence, WIPO (2019a) applied a search strategy based on patent classification codes and an extended keyword list drawing on literature review, established hierarchies, web resources, and manual checking. An artificial intelligence publication search strategy was derived from this, querying about 60 words or phrases specific to artificial intelligence concepts across all subject areas in the Scopus scientific publication database and about 35 words or phrases related to artificial intelligence applied to the Scopus subject areas of Mathematics, Computer Science, and Engineering (WIPO 2019b). We compared these three search strategies with ours. (For convenience, we refer to our approach as Liu et al.) As published, there are variations among these search strategies by bibliographic record sources, time periods, and document types analyzed. Hence, to normalize the comparison, we applied all search strategies to articles in WoS SCI-Expanded and SSCI for the time period 2010 to 28 May 2020. An analysis of the results obtained reveals overlaps as well as significant differences (Fig. 2). Gao et al.'s simple and limited approach returned 13,310 articles. The search strategy of Zhou et al. garnered 57,993 articles. But less than one third of the articles identified by Gao et al. can be captured by Zhou et al., notwithstanding that multiple additional keywords were included in the search strategy of Zhou et al. In contrast, WIPO’s broad search strategy returned the largest set, comprising 532,314 articles, covering all records captured by Gao et al. and 76% of the records captured by Zhou et al. The search strategy (Liu et al.) put forward in this paper yielded 337,174 articles, covering all the records captured by Gao et al. and just over 78% of the records captured by Zhou et al.
Fig. 2

Comparison of four artificial intelligence bibliometric search strategies. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles (2010–28 May 2020). See text for details including references for search strategies

Comparison of four artificial intelligence bibliometric search strategies. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles (2010–28 May 2020). See text for details including references for search strategies The WIPO search strategy has a total return that is 37% larger than our search strategy. There is a core of shared records between these two approaches: 72% of the records identified by our search strategy are also identified by WIPO, although only 46% of the records returned by WIPO are covered by our search strategy (given WIPO’s larger total return). Put another way, more than one-half (54%) of WIPO’s search result is comprised of records not included in our definition of “artificial intelligence”. We investigated the causes of this significant difference. Several generic statistical and mathematical terms such as “logistic regression”, “hidden markov model” and “fuzzy logic” are included by WIPO but excluded by us. These three terms returned 195,477 article records in the search period. The largest subject categories captured were in the fields of public health and medicine, where a manual check indicated very few papers related to artificial intelligence. About 2% of the 195,477 records were in the WoS subject category of “artificial intelligence” and just 4% in the more comprehensive WoS research area of “computer science”. Only 5334 of these records are identified in our search strategy. Overall, the simple definition of Gao et al., with the use of just one search term “artificial intelligence”, appears to have relatively high precision but rather low recall in its limited return of article records. Zhou et al. include additional keywords, but their search also performs weakly in recall because they fail to capture artificial intelligence articles that explicitly use the term “artificial intelligence” in the “Topic” field. Conversely, WIPO’s approach has broad recall, but at the expense of precision, as a significant number of records captured are evidently extraneous to the domain of artificial intelligence. Our approach not in the arithmetic middle in this comparison of search approaches: it is in the third upper quintile of the range. While we independently include many artificial intelligence terms also identified by WIPO, our careful checking of all candidate terms means that we only include those that perform well with low noise, resulting in a search strategy that we would maintain has an appropriate balance between recall and precision.

Trends and patterns of research in artificial intelligence

In this section, we profile and track the development and patterns of scientific research in artificial intelligence by analyzing the publication records derived from our search strategy. We investigate publication outputs and growth, citations, co-author collaborations across countries, research sponsors and scientific disciplines. The record set used for these analyses stems from applying our search strategy (Table 5) to the WoS SCI-Expanded and SSCI databases for publication years covering the last three decades. The specific period covered is 1991 (1 January) to 2020 (24 May), an inclusive period of 29 years and 4.8 months. (In the balance of this paper, reference to 2020* denotes the period from 1 January 2020 until 24 May 2020.) After limiting our search to journal articles, excluding proceedings papers, book chapters, retracted papers, and other miscellaneous or duplicated records, our dataset of artificial intelligence scientific articles comprised 464,373 articles.

Artificial intelligence publication outputs

An analysis of publication trends, worldwide, for artificial intelligence articles shows continuous growth from 1991 through to 2020* (Fig. 3). An exponential growth trajectory is evident, beginning with a relatively slower growth in the first 10 years from 1991, accelerating from the mid-to-late 2000s, with a further boost in momentum from 2016. Almost half of all artificial intelligence articles produced between 1991 and 2020* were published in the most recent five years.
Fig. 3

Artificial intelligence publication outputs, 1991–2020*. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Columns represent annual article output. Dotted line represents cumulative percent of articles. Annualized total for 2020 estimated from averaged annual growth rates for prior 3 years. Countries identified by author affiliations. 2020* = 24 May 2020

Artificial intelligence publication outputs, 1991–2020*. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Columns represent annual article output. Dotted line represents cumulative percent of articles. Annualized total for 2020 estimated from averaged annual growth rates for prior 3 years. Countries identified by author affiliations. 2020* = 24 May 2020 Our artificial intelligence publication dataset includes articles from 195 countries and territories, with more than 750,000 authors reported (without disambiguation). Yet, while researchers worldwide are involved in scientific publishing on artificial intelligence, a large proportion of the publication output is associated with a small group of leading countries. The top ten countries, by author affiliations, contributed to more than 70% of total worldwide artificial intelligence articles published in the period 1991–2020*. China and the US are the two most productive countries by the total number of artificial intelligence articles published, followed by the UK (Table 6). By world share of artificial intelligence articles, US-based authors were by far the leading producers in the first decade from 1991, rising to about one-third of all articles published by the end of the 1990s; there was then a decline in share in the next decade (Fig. 4). Since 2009, the US has maintained a share of about 20% of worldwide artificial intelligence article outputs. The trend is similar for the UK, with a rise to nearly 11% by the early 2000s, then declining towards the end of that decade but maintaining a consistent level of just under 7% throughout the 2010s. The greatest change in position is that of China, which has sharply increased its world share of artificial intelligence publications. By output volume, China passed the UK in 2003 and the US in 2011. Authors based in China are now the largest producers of artificial intelligence articles, contributing to just under 45% of the world’s output by 2020*. (In this paper, China refers to mainland China, Hong Kong, and Macau.)
Table 6

Publications and citations of artificial intelligence articles, top 10 countries, 1991–2020*

MeasureChinaUSUKIndiaGermanySpainCanadaIranFranceItaly
Articles (× 1000)118.099.432.821.520.419.619.318.218.016.5
All citations (× 1000)1791.03385.9941.1327.0534.4376.5538.6250.8482.2356.5
Uncited articles (%)21.511.911.619.512.812.012.414.813.011.8
Citations per article (mean)15.234.128.715.226.219.227.913.826.921.6
H-index294549306159242178227124229184
Hm2.85.54.82.94.63.44.42.54.63.8
Top 10% cited (% country papers)7.415.313.36.612.98.912.26.512.310.4
Top 1% cited (% country papers)2.77.15.92.15.53.05.21.55.33.9

Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). See text for added details. 2020* = 24 May 2020. Countries identified by author affiliations

Fig. 4

Annual world share of artificial intelligence articles for top ten countries, 1991–2020*. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Countries identified by author affiliations. 2020* = 24 May 2020

Publications and citations of artificial intelligence articles, top 10 countries, 1991–2020* Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). See text for added details. 2020* = 24 May 2020. Countries identified by author affiliations Annual world share of artificial intelligence articles for top ten countries, 1991–2020*. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Countries identified by author affiliations. 2020* = 24 May 2020 Of the other leading countries in the top ten, Canada, Germany, France, Italy and Spain each now contribute between 3.0 to 4.2% of the world total. India has seen steady growth in its share of world artificial intelligence articles, with its output very close to the UK by the end of the 2010s. Iran has also emerged as a noticeable producer of articles in artificial intelligence, although it reached its peak global share in 2013 and has since seen a declining global share (Fig. 4). Beyond the top ten, Taiwan, South Korea, Japan, Singapore, and Brazil are among the top twenty leading producers of artificial intelligence articles. The dramatic rise of China in terms of the volume of artificial intelligence articles published is further evidenced by the significant presence of Chinese universities and institutes in the top thirty most productive organizations by artificial intelligence articles published from 1991 through to 2020* (Fig. 5). This analysis is based on the identification and aggregation by organization, city and country of author affiliations. Thirteen of the top 30 are universities or institutes based in mainland China, led by the Chinese Academy of Sciences (Beijing), Tsinghua University (Beijing), and Zhejiang University (Hangzhou), with a further two based in Hong Kong, led by Hong Kong Polytechnic University. Five of the top 30 productive organizations are in the US, including MIT, Stanford, and Carnegie Mellon University. Singapore, the UK, and Canada each have two organizations, including Nanyang Technological University (Singapore), University College London, and the University of Alberta (Edmonton). Iran and Japan each have one university among the top 30 most productive organizations, respectively the University of Tehran and the University of Tokyo.
Fig. 5

Top 30 organizations producing artificial intelligence articles, 1991–2020*. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Countries identified by author affiliations. Identified and aggregated by organization, city and country of author affiliations. 2020* = 24 May 2020

Top 30 organizations producing artificial intelligence articles, 1991–2020*. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Countries identified by author affiliations. Identified and aggregated by organization, city and country of author affiliations. 2020* = 24 May 2020

Citations to artificial intelligence articles

While volume of publication output is an important indicator of the scale of research activity, it is also vital to look at the quality of those outputs. While the drawbacks of using citation measures to assess publication quality are well recognized (Phelan 1999; van Raan 2019), citation data are widely used by scholars to assess the scientific influence of publications. To avoid limitations of using only one indicator, we calculate several citation-based indicators for artificial intelligence scientific articles for the top 10 countries (by author affiliations). We report total times cited and publication mean citations, noting that the first is related to the total number of publications while the second is susceptible to extreme citation values. Hence, we calculate composite citation-based indicators that consider both the quantity and quality of publications: the H- index, where H is the number of articles cited at least H times (Hirsch 2005); and the H = H-index/TN0.4 derived from the H-index and adjusted by the total number (TN) of articles (Molinari and Molinari 2008). Also computed is the share of worldwide highly cited articles for each country (Bornmann et al. 2012): we present measures of each country’s article outputs that are in the top 10% and top 1% of the most cited articles worldwide. Countries are identified by author affiliations. Looking across these reported measures (Table 6), the US maintains the highest scientific influence in artificial intelligence: its total times cited, average times cited, H and H indices, and share of its output among the 10% and 1% worldwide most frequently-cited articles all rank first among the benchmark countries. The UK also performs strongly by these measures of scientific influence: for its artificial intelligence articles, measures for average citations, H and H indices, and share of output in the top 10% and top 1% of the most cited articles worldwide are high, coming in below the US but higher than the next group comprising of Germany, Canada, and France. In contrast, while China now leads by the absolute number of artificial intelligence articles produced over this nearly three-decade period, it lags in terms of its average article citation level, H and H indices, and share of output in the top 10% and top 1% of the most cited articles worldwide. China also has the highest number of uncited articles, at a rate that is almost twice as great as for the US and the UK. Two other Asian countries—India and Iran—are among the top ten countries by numbers of artificial intelligence articles published, although both also perform less strongly (and behind China) on most of the reported measures of scientific influence. To observe dynamic changes in the scientific influences of the top countries (by volume of output) in the artificial intelligence field over successive time periods, we provide quinquennial calculations of the share of each country’s article output that is in the top 10% of the most cited articles worldwide (Table 7). In interpreting results, it should be noted that citation patterns are still formative in the early years after publication, although there is evidence of more reliability in citation impact measurement after a window of about three years (Adams 2005; Bornmann 2013). Over the long-run, the analysis confirms US leadership in the artificial intelligence field by this measure of scientific influence, ranking first among the compared countries in each five-year period. In the periods from 2000 to 2014, over 15% of US papers were in the top 10% most cited articles worldwide, although in the most recent 2015–2019 period, the US position diminished by more than two percentage points. Ranked second by this scientific influence measure, the UK broadly follows the US trend, rising in the share of its output in the top 10% most cited articles worldwide for the three quinquennial periods from 2000 to 2014, then dipping. However, in the 2015–2019 period, the gap between the US and the UK closed to just 0.4 percentage points. Three countries—Canada, Italy, and Iran—each saw increases in every five-year period in their share of outputs in the top 10% most cited articles worldwide, respectively ranking 3rd, 4th and 5th by this measure of scientific influence in the 2015–2019 period. Germany, which placed third by this measure in 2000–2004, saw its ranking fall to 6th place in 2015–2019. China’s share of outputs in the top 10% most cited articles worldwide grew noticeably in each of the three quinquennial periods from 2000 to 2014. In the most recent 2015–2019 period, there was no further growth (indeed a slight dip) in the share of China’s outputs in the top 10% most cited articles worldwide, although it might be noted that China’s performance on this metric was largely upheld notwithstanding a more than three-fold increase in annual article output in 2019 when compared with 2015. By share of outputs in the top 10% most cited artificial intelligence articles worldwide, China has narrowed the gap with the US, from 5.9 percentage points in the early 2000s to 1.5 percentage points towards the end of the 2010s. In this group of the leading 10 countries by article quantity, India demonstrated the weakest performance in the share of outputs in the top 10% most cited articles worldwide, although there was some modest improvement over the first three quinquennials of the twenty-year period (Table 7).
Table 7

Country share of top 10% of the most cited artificial intelligence articles worldwide, 2000–2019

2000–20042005–20092010–20142015–2019
All articles, worldwide (× 1000)36.058.9102.8195.8
In worldwide top 10% most-cited%%%%
US15.215.315.413.0
UK12.413.414.012.6
Canada9.812.212.912.2
Italy8.09.210.211.8
Iran4.06.07.811.7
Germany10.614.014.511.6
China9.310.211.611.5
France10.511.811.610.9
Spain7.17.89.29.5
India7.48.08.88.5

Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2019 (N = 393,439). Top ten countries by output of articles. Countries identified by author affiliations

Country share of top 10% of the most cited artificial intelligence articles worldwide, 2000–2019 Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2019 (N = 393,439). Top ten countries by output of articles. Countries identified by author affiliations

Co-author collaboration across countries

Researchers increasingly collaborate in teams within and across institutional and national boundaries in order to leverage knowledge, disciplinary and interdisciplinary capabilities, scientific infrastructure, reputational benefits, and other resources (Glänzel and Schubert 2004; Bozeman and Youtie 2017; Chen et al. 2019). Consistent with this broad trend, the co-authorship of scientific publications is predominant in the artificial intelligence research domain. In our WoS dataset of over 464,000 artificial intelligence articles (1991–2020*), just 8.6% are single authored, nearly a half (48.9%) have two or three authors, more than one third (34.9%) have four-to-six authors, and 7.7% have seven or more authors. Many of these co-authorships are multi-institutional. More than one-half (53.8%) of artificial intelligence articles involve authors with two or more organizational affiliations. We also find that co-authorships for artificial intelligence research are frequently international, although there are differences among the leading producers of scientific articles in this domain. For the period 1991–2019, about 41% of US artificial intelligence articles are internationally co-authored, most noticeably with China (accounting for 14% of all US artificial intelligence papers), followed by the UK (4%) and Canada (3%) (Table 8). International co-authorship is noticeably lower for China, where about 31% of artificial intelligence articles are internationally co-authored, with the USA contributing to over one-tenth of Chinese publications in the field. The percent of internationally co-authored publications for Iran is just below the Chinese level, at about 30%, while for India it is 23%—the lowest among the top ten publishing countries. The UK has the highest level of international co-authorship, with nearly three-fifths of its artificial intelligence papers being international co-authored. The UK’s international partners are led by China (15% of UK papers) and the US (12%), followed by Germany (6%). Canada, Germany and France also have a high international co-authorship rate (all over 50%), with the US, the UK and China as their leading collaborators.
Table 8

International co-authoring for top 10 artificial intelligence publishing countries, 1991–2020*

International co-authored articlesLeading co-authoring countries
 × 1000PercentCountriesFirstSecondThird
CountryPercentCountryPercentCountryPercent
USA40.941.1164China14.0UK4.1Canada3.3
China36.330.7127US11.8UK4.2Australia3.7
UK18.957.7151China15.2US12.4Germany6.3
Germany11.054.1142US15.2UK10.1China5.8
Canada10.856.0130US16.9China16.6UK4.5
France9.753.9138US11.7UK6.8China6.3
Spain8.241.7129US7.9UK7.7France4.5
Italy7.646.2133US12.1UK9.1France6.6
Iran5.429.796US5.8Canada4.2Malaysia3.7
India4.922.9112US6.4China3.3South Korea2.1

Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Top ten countries by output of articles. 2020* = 24 May 2020. Countries identified by author affiliations. Percent refers to portion of article output of each top ten country

International co-authoring for top 10 artificial intelligence publishing countries, 1991–2020* Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Top ten countries by output of articles. 2020* = 24 May 2020. Countries identified by author affiliations. Percent refers to portion of article output of each top ten country Patterns of collaboration between countries in artificial intelligence scientific research are further revealed through an international co-authorship network map for the top 30 countries (by volume of output, 1991–2020*) (Fig. 6). The US, as the leading partner of most other top countries, plays a dominant role in artificial intelligence transnational co-authorship linkages. China and the UK also serve as next tier hubs in transnational networks. China and the US are the most linked pair of countries, by volume of co-authored articles. With China and the US as dual hubs, there is an Asia–Pacific cluster, also involving Australia, Singapore, Canada, Japan and Taiwan. A clustered European network is also evident, with the UK, Germany, and France as key nodes.
Fig. 6

Artificial intelligence co-author collaboration networks, top 30 countries. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). 2020* = 24 May 2020. Visualization using VOSviewer, nodes represent countries (identified by author affiliations) and linkages represent co-authorship relationships between countries

Artificial intelligence co-author collaboration networks, top 30 countries. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). 2020* = 24 May 2020. Visualization using VOSviewer, nodes represent countries (identified by author affiliations) and linkages represent co-authorship relationships between countries

Research sponsors of artificial intelligence

Further insights into the landscape of artificial intelligence research can be gleaned by investigating research sponsors. Research sponsors are influential in guiding what research is supported, who gets support, and how they are supported. Funding acknowledgement information is first available in the WoS from mid-2008. Research in papers that do not report funding acknowledgements may have been aided through institutional resources rather than specific grant award. However, if a particular grant or funding source was received, it is likely to be reported, as funding sponsors and journals now typically require that recipients acknowledge funding support. The organizational name of the funding sponsor and often the specific grant program and award number is reported, although not the amount of funding. Individual papers may acknowledge more than one funding sponsor from one or more countries, depending on their co-authorship arrangements. Since the same funding sponsor may be reported by authors and journals in varied ways, we applied a text matching, cleaning and manual review process to our WoS dataset to develop a robust and validated set of sponsor names (Wang and Shapira 2011). Beginning from the subsequent first full year of information on funding in the WoS, we find that 66.9% of 339,347 artificial intelligence articles published during the period 2009–2020* report funding acknowledgements information. Among the leading countries by output of artificial intelligence articles, China has the highest share (88.6%) of articles that report funding acknowledgements. For the US and the UK, respectively 72.5% and 69.8% of articles report funding acknowledgements. Just over 70% of articles by authors with affiliations in Germany and Canada report funding acknowledgements. At the lowest end are India and Iran, where respectively 30.6% and 21.6% of articles report funding acknowledgements. A relatively small group of sponsors are prominent (by number of funding acknowledgements reported) in their support of funded research in the artificial intelligence research domain. The top 30 sponsors are acknowledged in more than four-fifths (82.8%) of articles that report funding acknowledgements. All are public research support bodies or agencies associated with government. We focus on the top 15 research sponsors, which are acknowledged in more than 158,000 artificial intelligence articles published between 2009–2020*—equivalent to 69.6% of all papers in this period that report funding acknowledgements. Overall, China has five sponsors among these top 15 funders of artificial intelligence research, the US has three, two are in Europe, and Taiwan, Canada, South Korea, Brazil and Japan each have one (Fig. 7). The growth of the National Natural Science Foundation (NNSF) of China as a funder of artificial intelligence research is particularly noticeable. By 2014, NNSF was already the world’s largest sponsor of research in this domain, as reported by funding acknowledgements; by 2020*, it had moved yet further ahead. Between 2015 and 2020*, more than 56,000 artificial intelligence articles acknowledged NNSF support—a sum that was greater than the number of papers supported during this period from the other 14 sponsors combined. Other leading funding agencies outside of China also increased the number of artificial intelligence papers supported, but not at the same rate. When the first period (2009–2014) is compared with the second period (2015–2020*), artificial intelligence articles acknowledging NNSF support increased by 242%. For the two largest US sponsors, the National Science Foundation (NSF) and the National Institutes of Health (NIH), the equivalent growth rate was 62% and 72% respectively, while for the UK Engineering and Physical Sciences Research Council (EPSRC), the growth rate was 29%. Other Asian funding sponsors saw higher growth rates in funding acknowledgements between these two time periods, for example South Korea’s National Research Foundation increased by 204%, but from a much lower base than for NNSF.
Fig. 7

Top 15 funding sponsors acknowledged in artificial intelligence articles, 2009–2020*. Note Analysis of WoS (SCI-E and SSCI) articles, 2009–2020*, AI search (N = 339,347). 2020* = 24 May 2020. Data label to right of each bar is average citations through to 2020* for articles published in 2016 and 2017 acknowledging that funding sponsor

Top 15 funding sponsors acknowledged in artificial intelligence articles, 2009–2020*. Note Analysis of WoS (SCI-E and SSCI) articles, 2009–2020*, AI search (N = 339,347). 2020* = 24 May 2020. Data label to right of each bar is average citations through to 2020* for articles published in 2016 and 2017 acknowledging that funding sponsor While NNSF and other sponsors in China and elsewhere have increased the quantity of research outputs supported in the artificial intelligence domain, we also probe the quality of recent publications underwritten by the top 15 research sponsors. Given the rapid growth of research outputs, we sought an appropriate time window that would capture relatively recent publications yet allow sufficient time for citation patterns to emerge. As noted in the earlier discussion on citations to artificial intelligence articles, a 3-year citation window can be viewed as appropriate. We thus focus on articles published in 2016–2017, which (given our data end point of 24 May 2020) provides an average article age of 3.3 years. In this period, almost 15,000 articles published in 2016 and 2017 acknowledge NNSF funding support, with just over 2,800 articles acknowledging support from Fundamental Research Funds from the Central Universities (FRFCU) of China. Over 2100 artificial intelligence articles published in 2016 and 2017 acknowledge funding support from each of the US NSF and NIH, with about 1000 acknowledging support from European Union sources. The other non-Chinese research bodies are acknowledged in the range of 500 to just under 800 articles published in 2016 and 2017. In the subsequent three-year period through to 2020*, publications funded by the US NIH garner the highest average citations with 27.4 per article; publications supported by the UK EPSRC attract an average of 18.1 citations per article, while for the US NSF the average is 18.0 citations per article (Fig. 7). Articles supported by China’s NNSF and FRFCU attract fewer cites on average, at 16.7 and 16.6 citations per article. Nonetheless, papers that acknowledge NNSF and FRFCU funding are cited, on average and in our three-year time window, at higher rates than for publications supported by the European Union and sponsors in Canada, South Korea, Japan and Brazil. Additionally, for articles supported by China’s 973 Program and by the Jiangsu Province National Science Foundation, average citation levels are comparable to those of EPSRC and the US NSF. This analysis does not take into account field differences in citation patterns and distributions around the mean for citations. Nor does it adjust for different patterns in citations within countries. However, it does suggest that the massive push to expand support for artificial intelligence scientific research in China has not necessarily come at the expense of quality, at least as measured by average citations to relatively recent papers.

Scientific disciplines of artificial intelligence

The inherently multidisciplinary nature of artificial intelligence (Sombattheera et al. 2012) is clearly evident by the range of WoS subject categories involved in artificial intelligence publications. Each journal in which a paper is published is classified by the WoS into one or more of over 250 granular subject categories (including multidisciplinary sciences if a journal covers more than six subject categories). Some 243 WoS subject categories are represented by the articles captured in our data set. However, a smaller number of subject categories encompasses a majority of these articles. The top 15 subject categories together cover 69.4% of all WoS artificial intelligence articles in the period 1991–2020* (Table 9). The leading subject category is “computer science, artificial intelligence”, covering about 40% of artificial intelligence articles in the most recent period of 2011–2020*, followed by “engineering, electrical & electronic” and “computer science, information systems” with 23% and 10% respectively. There is also the suggestion of a diffusion of artificial intelligence concepts and methods into other subject categories. The core topic of “computer science, artificial intelligence” dropped down in its share of artificial intelligence articles by about 11 percentage points between 1991–2000 and 2011–2020*, even though increasing in absolute numbers of publications, as other subject categories grew over these periods, including “telecommunications”, “computer science, information systems” and other non-computer science related categories.
Table 9

Top 15 WoS subject categories of artificial intelligence articles, 1991–2020*

Publication year
Total1991–20002001–20102011–2020*
Articles (× 1000)464.451.8104.5308.1
Web of science categoryPercentage of total articles (%)
Computer science, artificial intelligence43.851.150.740.2
Engineering, electrical and electronic23.326.024.022.6
Computer science, information systems8.96.36.910.0
Computer science, interdisciplinary applications7.65.47.48.1
Automation and control systems6.56.77.56.1
Computer science, theory and methods6.29.27.15.3
Neurosciences4.87.05.64.1
Operations research and management science4.64.36.04.1
Telecommunications3.50.91.04.8
Computer science, software engineering3.44.03.53.3
Engineering, multidisciplinary3.22.82.63.4
Instruments and instrumentation3.13.32.63.2
Computer science, cybernetics2.64.83.61.9
Mathematics, applied2.42.43.12.2
Chemistry, analytical2.32.82.42.2
Computer science related categories55.363.359.452.5
Non-computer science related categories76.173.576.276.5

Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Total of 243 subject categories. Computer Science related categories include “Computer Science, Artificial Intelligence”, “Computer Science, Information Systems”, “Computer Science, Interdisciplinary Applications”, “Computer Science, Theory & Methods”, “Computer Science, Software Engineering”, “Computer Science, Cybernetics”, “Computer Science, Hardware & Architecture” and “Robotics”

Top 15 WoS subject categories of artificial intelligence articles, 1991–2020* Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). Total of 243 subject categories. Computer Science related categories include “Computer Science, Artificial Intelligence”, “Computer Science, Information Systems”, “Computer Science, Interdisciplinary Applications”, “Computer Science, Theory & Methods”, “Computer Science, Software Engineering”, “Computer Science, Cybernetics”, “Computer Science, Hardware & Architecture” and “Robotics” To further explore the distribution of subject categories and the linkages among them, we constructed a co-occurrence network map which we visualize using VOSviewer (Fig. 8). We can observe five clusters in this map. A first (purple) cluster involves computer science and engineering related categories including “computer science, artificial intelligence”, “engineering, electrical & electronic”, “computer science, theory & methods”, “telecommunications” and “cybernetics”. A second (red) cluster involves “computer science, interdisciplinary applications”, “neurosciences” and multiple medical and biology related categories. A third (yellow) cluster involves “automation & control systems”, “instruments & instrumentation” and linked categories of mathematics, chemistry and physics. A fourth (blue) cluster includes categories related to engineering, manufacturing and materials science. Finally, a fifth (green) cluster includes “environmental sciences”, “remote sensing”, “engineering environmental”, “engineering, civil” and “water resources” and social sciences such as “management”, “business, finance” and “economics”. This co-occurrence visualization of subject categories shows a wide spread of artificial intelligence publications across macro-disciplines and subject categories. The map also highlights the emergence of multi-disciplinary assemblages of scientific activities engaged not only in the development of artificial intelligence concepts and hardware and control systems but also and in artificial intelligence applications especially in industrial, materials, environmental, and life science areas.
Fig. 8

Profile of artificial intelligence research by clusters and subject categories. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). 2020* = 24 May 2020. Total of 243 WoS subject categories, visualization using VOSviewer, nodes represent subject categories and linkages represent co-occurrence relationships among them

Profile of artificial intelligence research by clusters and subject categories. Note Analysis of WoS (SCI-E and SSCI) artificial intelligence articles published 1991–2020* (N = 464,373). 2020* = 24 May 2020. Total of 243 WoS subject categories, visualization using VOSviewer, nodes represent subject categories and linkages represent co-occurrence relationships among them

Discussion

As explained in the paper, we develop and apply a new search approach to map the global landscape of artificial intelligence scientific research. We analyzed articles published in the artificial intelligence domain, examining outputs over time, by countries and organizations, citations, transnational co-author collaborations, research sponsorship and the distribution of scientific disciplines. We find a sustained growth in artificial intelligence scientific research outputs over the last three decades, with a significant acceleration in the last five years (Since 2016). The US and UK were early movers in artificial intelligence scientific research, and their outputs continue to grow. However, the largest quickening of output is seen in China, which now leads all other countries by volume of papers produced in artificial intelligence. The increasing level of scientific capability that China is building, and which can be observed in research publications, is likely to have spillover effects, through knowledge and human capital development, for its governmental and industrial efforts in artificial intelligence. Although China’s scientific influence, as measured by citations to published articles, still trails the US and the UK, there has been a clear rise in citation quality of Chinese papers in artificial intelligence to levels that in recent years are higher than for Canada and some other European countries. Yet, notwithstanding that individual countries have sought to promote their capabilities in artificial intelligence research, we also find widespread international co-author collaboration in this field, with the US, China, and the UK among the hubs for international collaboration networks. The growth of scientific research in artificial intelligence is primarily supported through public funding, as we highlighted by identifying the leading research sponsors acknowledged in published articles. Additionally, while scientific research in artificial intelligence clusters in computer science and information technology areas, we see that artificial intelligence concepts and methods are spreading to other field including those related to automation, biomedicine, materials, and manufacturing.

Conclusions

In this final section, we provide concluding comments, consider limitations, and highlight further research opportunities. The paper has put forward a systematic method for constructing a bibliometric definition for the field of artificial intelligence. We explained the stages in our process in detail, making it possible for others to replicate the approach. The resulting search strategy was evaluated by comparing its search records and search terms with the counterparts of three other search strategies used in previous bibliometric analyses of artificial intelligence. This comparison suggests that these extant search strategies for artificial intelligence are either too narrow or too broad. This benchmark assessment indicated that our search strategy offers an appropriate and justified balance between recall and precision. We position the artificial intelligence search strategy defined in this paper as a public tool. It is available for other researchers to use and refine. The search approach can also be employed by technology managers, research funders, policy analysts, and others interested in research publication activity in the artificial intelligence domain. The steps involved in applying it to the Web of Science are straightforward (directly using the search strategy as defined in Table 5 involving search keywords and a subject category). The search is readily adaptable for use in other bibliometric databases, such as Scopus or in patent databases. We note that there may be a need to adjust how the search strategy is inputted. For example, to use the search strategy in Scopus, for the equivalent of the “artificial intelligence” subject category, the All Science Journal Classification Code (ASJC) for “artificial intelligence” can be applied to develop an appropriate journal list. Additionally, for patent databases (such as Derwent Innovations, PATSTAT or PatentSight), the International Patent Classification (IPC) or Cooperative Patent Classification (CPC) codes can be used to refine the keyword-based search. There are limitations that should be kept in mind when interpreting or applying our approach. The limitations of the Web of Science in terms of global journal coverage, subject category representation, and over-representation of English language publications are well-documented (Mongeon and Paul-Hus 2016). Our focus is on artificial intelligence scientific research outputs as published in articles in journals in the WoS SCI-Expanded and SSCI databases; while we contend that this is an appropriate source, especially to indicate trends and patterns, we note that we do not analyze non-journal preprints, non-journal conference papers, books, or other databases. We further note that artificial intelligence is an evolving domain and will surely give rise to search terms that we do not currently capture. Moreover, while we maintain that the “Hit Ratio” provides a rational way to assess the relevance of candidate terms in a specific field, there is no agreed standard for its threshold values. The inclusion, review, and exclusion values we use are based on judgement and iterative trial and error. Other researchers can update the search strategy by adding new artificial intelligence terms or journals using the bibliometric search process that we have described, and they can also apply variations to Hit Ratios to see if recall and precision in future searches can be improved. The construction and application of our bibliometric definition to track the profile of scientific developments in artificial intelligence is a contribution to what must be an ongoing domain of study. Artificial intelligence is developing as one of the key platform technologies of our generation, accompanied by both promise and concern about its design and implementation. In our own work, we intend to apply the search approach to analyze patents; this will assist in mapping inventions, applications, and corporate activities that use artificial intelligence concepts and methods. We are engaging in work to explore emerging innovation ecosystems at regional, national, and international levels and in how artificial intelligence is being applied in laboratory sciences. There are many other opportunities for future studies of artificial intelligence research and innovation. We trust that the bibliometric search approach presented in this study can help to inform these studies.
  7 in total

1.  An index to quantify an individual's scientific research output.

Authors:  J E Hirsch
Journal:  Proc Natl Acad Sci U S A       Date:  2005-11-07       Impact factor: 11.205

2.  Artificial intelligence. Fears of an AI pioneer.

Authors:  Stuart Russell; John Bohannon
Journal:  Science       Date:  2015-07-17       Impact factor: 47.728

3.  Artificial intelligence, machine learning and deep learning: definitions and differences.

Authors:  D Jakhar; I Kaur
Journal:  Clin Exp Dermatol       Date:  2019-06-24       Impact factor: 3.470

4.  Artificial Intelligence and the 'Good Society': the US, EU, and UK approach.

Authors:  Corinne Cath; Sandra Wachter; Brent Mittelstadt; Mariarosaria Taddeo; Luciano Floridi
Journal:  Sci Eng Ethics       Date:  2017-03-28       Impact factor: 3.525

Review 5.  Deep Learning: A Primer for Radiologists.

Authors:  Gabriel Chartrand; Phillip M Cheng; Eugene Vorontsov; Michal Drozdzal; Simon Turcotte; Christopher J Pal; Samuel Kadoury; An Tang
Journal:  Radiographics       Date:  2017 Nov-Dec       Impact factor: 5.333

6.  Tracking the emergence of synthetic biology.

Authors:  Philip Shapira; Seokbeom Kwon; Jan Youtie
Journal:  Scientometrics       Date:  2017-07-01       Impact factor: 3.238

7.  The Current Research Landscape on the Artificial Intelligence Application in the Management of Depressive Disorders: A Bibliometric Analysis.

Authors:  Bach Xuan Tran; Roger S McIntyre; Carl A Latkin; Hai Thanh Phan; Giang Thu Vu; Huong Lan Thi Nguyen; Kenneth K Gwee; Cyrus S H Ho; Roger C M Ho
Journal:  Int J Environ Res Public Health       Date:  2019-06-18       Impact factor: 3.390

  7 in total
  2 in total

1.  Mapping technological innovation dynamics in artificial intelligence domains: Evidence from a global patent analysis.

Authors:  Na Liu; Philip Shapira; Xiaoxu Yue; Jiancheng Guan
Journal:  PLoS One       Date:  2021-12-31       Impact factor: 3.240

2.  Using a Virtual Patient via an Artificial Intelligence Chatbot to Develop Dental Students' Diagnostic Skills.

Authors:  Ana Suárez; Alberto Adanero; Víctor Díaz-Flores García; Yolanda Freire; Juan Algar
Journal:  Int J Environ Res Public Health       Date:  2022-07-18       Impact factor: 4.614

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.