Kathrin Blagec, Adriano Barbosa-Silva, Simon Ott, Matthias Samwald.
Abstract
Research in artificial intelligence (AI) is addressing a growing number of tasks through a rapidly growing number of models and methodologies. This makes it difficult to keep track of where novel AI methods are successfully, or still unsuccessfully, applied, how progress is measured, how different advances might synergize with each other, and how future research should be prioritized. To help address these issues, we created the Intelligence Task Ontology and Knowledge Graph (ITO), a comprehensive, richly structured and manually curated resource on artificial intelligence tasks, benchmark results and performance metrics. The current version of ITO contains 685,560 edges, 1,100 classes representing AI processes and 1,995 properties representing performance metrics. The primary goal of ITO is to enable analyses of the global landscape of AI tasks and capabilities. ITO is based on technologies that allow for easy integration and enrichment with external data, automated inference and continuous, collaborative expert curation of underlying ontological models. We make the ITO dataset and a collection of Jupyter notebooks utilizing ITO openly available.
Year: 2022 PMID: 35715466 PMCID: PMC9205953 DOI: 10.1038/s41597-022-01435-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Overview of the main classes and properties.
| Main classes | |
|---|---|
| AI process | Subclasses of |
| Data | Subclasses of the |
| - Benchmark dataset | |
| - Article | |
| edam:Data format | Subclasses of |
| Software | Instances of |
| Topic | Subclasses of the |
| involves data | The |
| - edam:has input | The |
| - edam:has output | The |
| Performance measure | The rich hierarchy of subproperties of the |
| edam:has topic | The |
| edam:is format of | The |
| rdfs:seeAlso | The |
| foaf:page | The |
| obo:date | The |
| obo:creation date | The |
Prefixes denote classes or entities derived from established vocabularies, namely the EDAM ontology (edam), Resource Description Framework Schema (rdfs), Open Biomedical Ontologies (obo) and Friend of a Friend (foaf).
Fig. 1 Example of a benchmark result for a specific model (‘DeBERTa-1.5B’) on a specific dataset (‘Words in Context’, Word sense disambiguation) embedded in ITO. Solid orange lines represent subclass relations, dashed orange lines represent instance relations.
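The subclass and instance relations described in the Fig. 1 caption can be approximated with plain subject-predicate-object triples. This is a minimal sketch under stated assumptions, not ITO's actual serialization: every `ex:` identifier, the `ex:Accuracy` property name and the placeholder score are hypothetical, and Python tuples stand in for an RDF library.

```python
# Hypothetical identifiers; ITO's real IRIs and property names will differ.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    # Subclass relations (solid lines in Fig. 1).
    ("ex:WordSenseDisambiguation", SUBCLASS_OF, "ex:AIProcess"),
    ("ex:WordsInContextBenchmark", SUBCLASS_OF, "ex:WordSenseDisambiguation"),
    # Instance relation (dashed lines in Fig. 1): assumed here to link a
    # benchmark result individual to its benchmark class.
    ("ex:result_DeBERTa_1_5B", TYPE, "ex:WordsInContextBenchmark"),
    # Placeholder score attached via a performance-measure property;
    # 0.0 is NOT a reported value.
    ("ex:result_DeBERTa_1_5B", "ex:Accuracy", 0.0),
}


def ancestors(cls, triples):
    """Return every superclass reachable from cls via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def instances_of(cls, triples):
    """Return individuals typed with cls or with any subclass of cls."""
    return {
        s
        for s, p, o in triples
        if p == TYPE and (o == cls or cls in ancestors(o, triples))
    }


if __name__ == "__main__":
    print(sorted(ancestors("ex:WordsInContextBenchmark", triples)))
    print(sorted(instances_of("ex:AIProcess", triples)))
```

Walking up the subclass chain is what lets a query for a broad class such as the hypothetical `ex:AIProcess` also return results recorded against a narrow benchmark class.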
Basic ontology metrics of ITO (v1.01).
| Entities | Count |
|---|---|
| Total triples (i.e. edges in the RDF graph) | 685,560 |
| Logical axioms count | 116,828 |
| Classes (total) | 9,037 |
| Classes (AI process classes) | 1,100 |
| Individuals | 50,826 |
| Object properties | 16 |
| Data properties (i.e. AI performance measures) | 1,995 |
| Annotation properties | 32 |
| Maximum depth | 11 |
| DL expressivity | ALCHOI(D) |
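Several of the metrics in this table (triple count, class count, individual count, maximum depth) can be derived directly from a set of triples. The sketch below illustrates the idea on a toy graph with hypothetical `ex:` identifiers; it is not how the published figures were computed, and the depth convention (a root class has depth 1) is an assumption, since ontology tools differ on this.

```python
from collections import defaultdict

SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

# Toy graph with hypothetical identifiers; this is not ITO data.
toy_triples = [
    ("ex:B", SUBCLASS_OF, "ex:A"),
    ("ex:C", SUBCLASS_OF, "ex:B"),
    ("ex:D", SUBCLASS_OF, "ex:A"),
    ("ex:r1", TYPE, "ex:C"),
    ("ex:r2", TYPE, "ex:D"),
]


def basic_metrics(triples):
    """Count triples, classes and individuals, and find the deepest class."""
    parents = defaultdict(set)
    classes, individuals = set(), set()
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents[s].add(o)
            classes.update((s, o))
        elif p == TYPE:
            individuals.add(s)
            classes.add(o)

    def depth(cls, seen=frozenset()):
        # Assumed convention: a class without parents has depth 1.
        if cls in seen:  # guard against accidental subclass cycles
            return 0
        supers = parents.get(cls, ())
        if not supers:
            return 1
        return 1 + max(depth(sup, seen | {cls}) for sup in supers)

    return {
        "triples": len(triples),
        "classes": len(classes),
        "individuals": len(individuals),
        "max_depth": max((depth(c) for c in classes), default=0),
    }


if __name__ == "__main__":
    print(basic_metrics(toy_triples))
```

For the toy graph the longest chain is `ex:C` → `ex:B` → `ex:A`, so the maximum depth is 3 under the stated convention.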
Content metrics (v1.1).
| Metric | Count |
|---|---|
| Total number of papers covered | 7,649 |
| Time span of publications covered | 2000–8/2021 |
| Total number of benchmark results | 26,495 |
| Total number of benchmark datasets | 3,633 |
Fig. 2 Number of papers covered by ITO per year. The y-axis is scaled logarithmically. Publications of the year 2021 are covered until the latest import in August 2021.
Fig. 3 Number of distinct benchmarks and benchmark results per ‘AI process’ class. The x-axis is scaled logarithmically.
Fig. 4 Performance measures property hierarchy. The left side of the image shows an excerpt of the list of performance metric properties; the right side shows an excerpt of the list of subclasses for the parent class ‘accuracy’.
Ontology evaluation metrics.
| 0.22 | |
| 1.73 | |
| 0.002 | |
| 75.62 | |
| 5.62 | |
| 0.49 | |
| 7,328 | |
| 5.36 | |
| 11 | |
| 18,849 | |
| 7.37 | |
| 4,590 | |
| 9,037 | |
| 0.81 | |
| 0.68 | |
| Measurement(s) | Artificial Intelligence • Benchmark |
| Technology Type(s) | digital curation |
| Sample Characteristic - Location | Globally |