Annotations, Ontologies, and Whole Slide Images - Development of an Annotated Ontology-Driven Whole Slide Image Library of Normal and Abnormal Human Tissue.

Karin Lindman1, Jerómino F Rose2, Martin Lindvall3, Claes Lundström4, Darren Treanor1,5,6.   

Abstract

OBJECTIVE: Digital pathology is today a widely used technology, and the digitalization of microscopic slides into whole slide images (WSIs) allows the use of machine learning algorithms as a tool in the diagnostic process. In recent years, "deep learning" algorithms for image analysis have been applied to digital pathology with great success. The training of these algorithms requires a large volume of high-quality images and image annotations. These large image collections are a potent source of information, and to use and share the information, standardization of the content through a consistent terminology is essential. The aim of this project was to develop a pilot dataset of exhaustively annotated WSIs of normal and abnormal human tissue and link the annotations to appropriate ontological information.
MATERIALS AND METHODS: Several biomedical ontologies and controlled vocabularies were investigated with the aim of selecting the most suitable ontology for this project. The selection criteria required an ontology that covered anatomical locations, histological subcompartments, histopathologic diagnoses, histopathologic terms, and generic terms such as normal, abnormal, and artifact. WSIs of normal and abnormal tissue from 50 colon resections and 69 skin excisions, diagnosed 2015-2016 at the Department of Clinical Pathology in Linköping, were randomly collected. These images were manually and exhaustively annotated at the level of major subcompartments, including normal or abnormal findings and artifacts.
RESULTS: Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) was chosen, and the annotations were linked to its codes and terms. Two hundred WSIs were collected and annotated, resulting in 17,497 annotations, covering a total area of 302.19 cm2, equivalent to 107.7 gigapixels. Ninety-five unique SNOMED CT codes were used. The time taken to annotate a WSI varied from 45 s to over 360 min, a total time of approximately 360 h.
CONCLUSION: This work resulted in a dataset of 200 exhaustively annotated WSIs of normal and abnormal tissue from the colon and skin, and it has informed plans to build a comprehensive library of annotated WSIs. SNOMED CT was found to be the best ontology for annotation labeling. This project also demonstrates the need for future development of annotation tools in order to make the annotation process more efficient.

Keywords:  Annotation; digital pathology; image database; ontology; whole slide images

Year:  2019        PMID: 31523480      PMCID: PMC6669998          DOI: 10.4103/jpi.jpi_81_18

Source DB:  PubMed          Journal:  J Pathol Inform


INTRODUCTION

Digital pathology is today a widely used technology. It includes whole slide imaging (“virtual microscopy”), in which glass slides are scanned to create whole slide images (WSIs): high-resolution images that can be viewed on screen, annotated with computer-based annotation tools, and/or analyzed by computer-based image analysis tools.[1,2,3] WSIs, WSI markups, and WSI annotations can be integrated into databases and accessed through a local intranet or the internet for primary diagnosis, quality assurance, consultation, teaching, research, and image analysis.[2,3,4] The digitalization of histopathology slides to WSIs allows the use of machine learning algorithms as a tool in the diagnostic process to make a more precise assessment of findings, for example, quantification of immunohistochemistry findings, nuclei detection, gland segmentation, or identification (ID) of other morphological features.[5,6,7,8] In recent years, “deep learning” algorithms for image analysis have been applied to digital pathology with great success. However, the training of these algorithms requires a large volume of high-quality images and image annotations.[8,9] These large image collections are a very potent source of information, and to use, reuse, and share the information, standardization of the content through a consistent terminology is essential.[10,11,12,13] In radiology, where digital images are standard today, large annotated image datasets exist, such as the Lung Image Database Consortium and the Image Database Resource Initiative.[12,14] In biomedicine, an example of a large annotated image database is the Human Protein Atlas.[15,16,17] In the histopathology area, there are examples of large image datasets.
The International Society of Urological Pathology has established a reference image database of representative images of several pathological entities in kidney, urinary bladder, and prostate.[18] Kostopoulos et al. have built an image collection library covering brain, breast, and laryngeal tumors.[19] The University of Leeds has developed an extensive and expanding database of pathology WSIs.[20] However, most of these image databases do not cover the entire WSI, have no images of normal tissue, and are rarely annotated; where image annotations exist, they refer to the quality and content of the image rather than to the different tissue structures. To the best of our knowledge, there are no existing large annotated image databases of different tissues and organs in histopathology today, even though Royal Philips and LABPON have announced their plan to create a digital database of annotated pathology images.[1,21] For image annotations to be useful, data have to be coupled to the annotations to provide information related to them. One important challenge is to have a uniform system of nomenclature coupled to the annotations, creating homogeneous and easily reproducible information and allowing others to contribute to or continue the annotation process. Ontologies are an example of systematic and consistent nomenclatures; they are structured vocabularies consisting of terms designed to represent the types of entities in the domain of reality that each ontology has been devised to capture.
These terms are organized hierarchically, ordered by subtype relations.[22,23] In medicine, many different ontologies and controlled vocabularies exist and are evolving: the International Classification of Diseases (ICD), Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), Generalized Architecture for Languages Encyclopedias and Nomenclature in Medicine (GALEN), Medical Subject Headings (MeSH), Foundational Model of Anatomy Ontology (FMA), Unified Medical Language System (UMLS), the Open Biomedical Ontologies (OBO), National Cancer Institute Thesaurus (NCIt), and so on.[24,25,26,27] An example of a more specific diagnostic ontology is the well-known radiology ontology RadLex.[28,29] To the best of our knowledge, there is no well-known, specific ontology in the histopathology area, although the Quantitative Histopathology Image Ontology (QHIO) is under development. QHIO is an ontology covering terms representing the different types and subtypes of histopathological images, imaging processes and techniques, and computational algorithms.[22,30] To date, machine learning development has focused on specific diseases, abnormalities, or simple quantification of immunohistochemical stains. The image data has consisted of limited, manually selected regions of WSIs or tissue microarrays rather than exhaustive WSI annotations.[5,6,8,14,31] These limited regions do not provide complete information when compared with the information a pathologist obtains while examining microscope slides or WSIs. In this context, exhaustive WSI annotations, in which all tissue pixels of the WSI are included in the annotations, could be more useful. As far as we know, no publicly available dataset of exhaustively annotated WSIs of normal human tissue exists, even though examination of normal tissue is a large and time-consuming part of the histopathological analytic process.
To develop machine learning algorithms for diagnosing and classifying different types of tissue, a large database of WSIs of different types of normal and abnormal human tissue will be required. The aim of this project is to develop a pilot dataset of exhaustively annotated WSIs of normal and abnormal human tissue and link the annotations produced by this process to appropriate ontological information.

MATERIALS AND METHODS

Ontology investigation

A systematic search of different biomedical ontologies and controlled vocabularies was made at the BioPortal webpage by a specialist in clinical pathology (KL): http://bioportal.bioontology.org/. This webpage is a comprehensive repository of biomedical ontologies and is provided by the National Center for Biomedical Ontology (NCBO). The goal of NCBO is to support biomedical researchers by providing online tools and a web portal enabling them to access, review, and integrate ontological resources. The ontologies and controlled vocabularies were investigated at the BioPortal webpage, and a PubMed search was also made, to examine their content and structure. The goal of this investigation was to find the most suitable ontology for the project's purpose, and the selection criteria required an ontology that covered anatomical locations, histological subcompartments, histopathologic diagnoses, histopathologic terms, and generic terms such as normal, abnormal, and artifact.

Collection of cases

To decide which tissues or organs to annotate, the specialist in clinical pathology (KL) made annotation suggestions for different organs and tissue types: colon, bladder, bone, breast, bronchus, ductus deferens, lung, ileum, liver, lymph node, pancreas, prostate, salivary gland, seminal vesicles, skin, spleen, stomach, thyroid, and uterus. These suggestions and annotations were discussed with a consultant pathologist (DT), and decisions were made by consensus. Colon and skin were chosen because of their well-defined, histologically layered structures, which make them very suitable for exhaustive and reproducible annotation. The colon cases were randomly collected from colon resections diagnosed at the Department of Clinical Pathology in Linköping in 2015. Small resections of adenomas in the colon were excluded. The skin cases were randomly collected from skin excisions, including pouches, diagnosed at the Department of Clinical Pathology in Linköping in 2016. Normal skin excisions and skin excisions diagnosed with neoplasia were included. A total of 200 WSIs was judged sufficient for the study objective, given the time and effort required to create manual, exhaustive annotations. One hundred and one WSIs from the colon and 99 WSIs from the skin were collected. To make the collection random, colon and skin cases whose clinical case ID number ended in 1, 5, or 8 were chosen. In cases with both normal and abnormal tissue, one WSI of each type was chosen. In cases with only normal or only abnormal tissue, one WSI was chosen. The chosen WSI was the one with the best quality, i.e., the fewest artifacts. The WSIs were manually selected by the specialist in clinical pathology (KL).
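The ID-based selection rule described above can be sketched in a few lines of Python. This is only an illustration: the case ID format, the example IDs, and the names `SELECTED_LAST_DIGITS` and `select_cases` are hypothetical and do not come from the study.

```python
# Illustrative sketch of the selection rule: keep cases whose clinical
# case ID ends in 1, 5, or 8. The ID format below is an assumption.
SELECTED_LAST_DIGITS = {"1", "5", "8"}


def select_cases(case_ids):
    """Return the case IDs whose final character is 1, 5, or 8."""
    return [cid for cid in case_ids if cid and cid[-1] in SELECTED_LAST_DIGITS]


# Hypothetical example IDs (not real clinical identifiers).
example_ids = [
    "C-2015-0011",
    "C-2015-0012",
    "C-2015-0015",
    "S-2016-0478",
    "S-2016-0479",
]

selected = select_cases(example_ids)
print(selected)
```

A rule of this kind is deterministic rather than truly random, but it is independent of the histological content of the case, which is presumably why it was used as a practical stand-in for random sampling.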

Staining, scanning, image retrieval, and workstation

All of the slides were stained with hematoxylin and eosin stain and scanned by Scanscope AT (Aperio, US), NanoZoomer XR (Hamamatsu, Japan), or NanoZoomer XRL (Hamamatsu, Japan) at a resolution equivalent to 20 times magnification (approximately 0.5 microns per pixel). All of the cases were viewed and selected in Sectra workstation IDS7 Px (Sectra, Sweden), and the WSIs were retrieved from the digital image archive in the clinical pathology picture archiving and communication system (PACS) at the Department of Clinical Pathology in Linköping during 2016–2017. All the annotations were made with the Sectra workstation IDS7 Px (Sectra, Sweden) and stored in the Sectra IDS PACS system. The computer screen used was an EIZO RadiForce RX850 monitor (EIZO, Japan), and the annotations were made on a Wacom Cintiq 27QHD Touch display (Wacom, Germany) with a Wacom Pro Pen (Wacom, Germany) [Figure 1]. Each WSI from colon covered one tissue level. In skin, each WSI covered 1–6 tissue levels, but only one level was annotated in each WSI.
Figure 1

Workstation. The complexity of the annotation process and workflow required an adequate and ergonomic workstation, which was composed of several tools, including a high-resolution screen, high-resolution touch screen, precision pen, computer, ergonomic mouse, keyboard, chair, and table. The height of the table and the chair were able to be adjusted to the proportions of the person doing the annotations

Annotator contribution

The colon cases were annotated by the specialist pathologist (KL). The skin cases were annotated by a research assistant (JR), after initial training in the annotation of WSIs (supervised by both KL and DT). The specialist pathologist (KL) had regular follow-ups with the consultant pathologist (DT) and the research assistant (JR) regarding the annotation procedure and annotation rules. When annotation difficulties occurred (e.g., how to annotate particular structures or abnormalities), the annotator suggested how to annotate the difficult area. In the colon cases, the specialist pathologist (KL) and the consultant pathologist (DT) decided on annotation rules by consensus. In the skin cases, the specialist pathologist (KL), the consultant pathologist (DT), and the research assistant (JR) decided on annotation rules by consensus.

Annotation rules

The primary aim was to identify the predominant tissue patterns, as discerned by a human observer, in each WSI. The annotations were intended to include 50% normal and 50% abnormal areas. The annotations also covered a range of appearances of the same tissue types (e.g., dark and light staining). Background pixels such as white areas (glass) were not annotated. In the exhaustive annotation strategy, each pixel of the tissue image was allocated to a nonoverlapping class, the annotations delineated morphologically different subcompartments, and the entire tissue image was annotated. Annotations were made at as low a magnification as possible, but still high enough to delineate major anatomic subcompartments in the tissue. The major subcompartments annotated in the colon WSIs were mucosa, submucosa, muscularis propria, and fatty tissue. The major subcompartments annotated in the skin WSIs were epidermis, dermis, adnexal structures, and subcutaneous fatty tissue. In the skin, cartilage tissue in excisions from the ear was also annotated. In both the colon and skin, the subcompartments were annotated as normal or abnormal. If abnormal, the abnormality was specified. Artifacts were also annotated. Normal tissue was defined as tissue including expected structures without neoplasia, fibrosis, edema, hemorrhage, or inflammation. Abnormal tissue was defined as morphologically abnormal-looking tissue (e.g., diverticular mucosa may represent disease, but many of the subcompartments of the lesion are morphologically normal). If an abnormality involved multiple subcompartments and the subcompartment borders could not be morphologically distinguished, the abnormality was annotated as a whole, but the annotation labels included all the subcompartments. If an abnormality involved multiple and easily distinguished subcompartments, each subcompartment was annotated and labeled.
Continuous lesions and subcompartments were annotated as a whole, and discontinuous structures were annotated separately. If a tumor included tumor stroma (e.g., basal cell carcinoma), the stroma was annotated as a part of the tumor. In diffuse lesions, where exact borders and tumor cells were hard to distinguish, the borders were defined as the region where the normal tissue started/ended. In lesions with abnormal architecture but normal cell morphology, the annotations were labeled as normal. Tissue folds, focal thick areas caused by tissue preparation, and separately lying tissue parts were annotated as artifacts.
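The two core properties of the exhaustive strategy (every tissue pixel is annotated, and no pixel belongs to two classes) can be expressed as a simple check. The sketch below is a hypothetical illustration, not the tooling used in the study: it assumes annotations have already been rasterized into sets of pixel coordinates, and it does not model the permitted exception for adnexal structures lying within the dermis.

```python
def check_exhaustive(tissue_pixels, annotations):
    """Check an annotation set against the exhaustive, nonoverlapping rule.

    tissue_pixels: set of (x, y) coordinates that contain tissue
                   (background/glass pixels are excluded beforehand).
    annotations:   dict mapping a class label to its set of (x, y) pixels.

    Returns (uncovered, overlapping): tissue pixels with no annotation,
    and pixels claimed by more than one annotation class.
    """
    seen = set()
    overlapping = set()
    for pixels in annotations.values():
        overlapping |= seen & pixels  # pixels already claimed by another class
        seen |= pixels
    uncovered = tissue_pixels - seen
    return uncovered, overlapping


# Toy 4 x 2 "tissue image": the left half is mucosa, the right half submucosa.
tissue = {(x, y) for x in range(4) for y in range(2)}
toy_annotations = {
    "mucosa": {(x, y) for x in range(0, 2) for y in range(2)},
    "submucosa": {(x, y) for x in range(2, 4) for y in range(2)},
}

uncovered, overlapping = check_exhaustive(tissue, toy_annotations)
print("uncovered:", uncovered, "overlapping:", overlapping)
```

Both returned sets are empty only when the annotation set is exhaustive and nonoverlapping over the tissue area.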

Annotation strategy

Image annotations were made in a systematic way. The first step in the annotation process was to identify the parts of the tissue and the structures that were to be annotated. Then, the artifacts were annotated, both those resulting from specimen preparation and those resulting from the scanning process. After that, the tissue itself was annotated from the epithelial side to the innermost layers, and from left to right. For the colon WSIs, the process ran from the mucosa to the serosa/adventitia, with individual annotations for each layer, starting with the abnormalities found in the tissue and then continuing with the normal colonic tissue. For the skin WSIs, the epidermis was annotated first, followed by the tumor or abnormality present and then the adnexal structures. After that, any inflammatory tissue present was annotated; otherwise, the dermis was annotated as a whole, excluding areas with abnormal tissue and with areas with inked margins annotated separately. Finally, the subcutaneous tissue was annotated, again with areas with inked margins annotated separately. During the annotation process, all the pixels of the WSI that contained tissue were included, and overlay and crossing of different annotations were avoided to prevent conflicting annotations (except for structures normally contained within a tissue layer, e.g., adnexal structures of the skin, which are normally found in the dermis) [Figures 2 and 3].
Figure 2

Annotated whole slide images from the colon. The annotations were done focusing on different subcompartments found in the colon, as well as pointing out abnormalities if present

Figure 3

Annotated whole slide images from the skin. The annotations were done focusing on different subcompartments found in the skin, as well as pointing out abnormalities if present. When comparing the annotations from colon and skin whole slide images, the skin annotations were more complicated to perform

Annotation labeling rules

Terms and codes for organ, anatomic location and sub-compartment were taken from the SNOMED CT hierarchy “Body structure” class. The most specific SNOMED CT concept and code was used for organ, anatomic location, sub-compartments, abnormalities and disease (e.g. descending colon instead of colon, submucosa of colon instead of submucosa etc.). The normal, abnormal and artifact concepts and codes were also included in the annotation labelling.
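The "most specific concept" rule can be illustrated with a toy subtype hierarchy. The sketch below is an assumption-laden illustration: the codes for "Descending colon" and "Colon" are taken from the appendix of this article, but the parent link between them is an assumed edge for demonstration and is not read from a SNOMED CT release; the names `IS_A`, `ancestors`, and `most_specific` are hypothetical.

```python
# Toy "is-a" map: child code -> list of parent codes.
# The edge below is an assumption for illustration only.
IS_A = {
    "32622004": ["71854001"],  # Descending colon -> Colon (assumed edge)
    "71854001": [],            # Colon (treated as a root in this toy example)
}


def ancestors(code, is_a):
    """Return all codes reachable from `code` by following is-a links upward."""
    found = set()
    stack = list(is_a.get(code, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(is_a.get(parent, []))
    return found


def most_specific(candidates, is_a):
    """Drop every candidate that is an ancestor of another candidate."""
    candidates = set(candidates)
    covered = set()
    for code in candidates:
        covered |= ancestors(code, is_a)
    return sorted(candidates - covered)


print(most_specific(["71854001", "32622004"], IS_A))  # ['32622004']
```

With both "Colon" and "Descending colon" as candidate labels, only the more specific code is kept, which mirrors the labeling rule stated above.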

Annotation labeling information

The annotations were stored in the Sectra IDS PACS system. An individual ID number was assigned to every annotation, and the information linked to each ID number was composed of the different ontology concepts and codes describing the content of the annotation. All this information was saved in an Excel file with the following label headings: “organ,” “SNOMED CT code organ,” “sub-compartment,” “SNOMED CT code sub-compartment,” “SNOMED CT code combined organ and sub-compartment,” “normal/abnormal including specific abnormalities,” and “SNOMED CT codes for normal/abnormal including specific abnormalities.” A link to the skin WSI in the software where the annotations were made was also coupled with this information, to offer rapid accessibility.
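One way to picture a single label record with the headings listed above is as a small data structure. The following is a minimal sketch under stated assumptions: the class name `AnnotationLabel`, the field names, the annotation ID, and the link value are hypothetical, and the actual file layout used in the study is not specified beyond the column headings. The SNOMED CT codes shown are those listed in the appendix of this article; the generic sub-compartment code is left empty because it is not given in the text.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class AnnotationLabel:
    """One row of label information for a single annotation (hypothetical layout)."""
    annotation_id: str                      # hypothetical ID format
    organ: str
    organ_code: str
    sub_compartment: str
    sub_compartment_code: Optional[str]     # not given in the text, left empty
    combined_code: Optional[str]            # organ + sub-compartment concept
    finding: str                            # normal/abnormal or a specific abnormality
    finding_codes: List[str] = field(default_factory=list)
    wsi_link: Optional[str] = None          # link to the WSI in the viewer, if any


record = AnnotationLabel(
    annotation_id="A-0001",
    organ="Colon",
    organ_code="71854001",
    sub_compartment="Submucosa",
    sub_compartment_code=None,
    combined_code="61647009",               # Colonic submucosa
    finding="Normal",
    finding_codes=["17621005"],
)


def to_json(label):
    """Serialize a label record to a JSON string."""
    return json.dumps(asdict(label), indent=2)


print(to_json(record))
```

Keeping the codes as strings avoids accidental numeric formatting of long identifiers when the records are exported or re-imported.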

RESULTS

From the ontology search at the BioPortal webpage, four candidate biomedical “ontologies” were identified out of a total of 314 ontologies. These four ontologies are listed in Table 1: FMA, NCIt, MeSH, and SNOMED CT [Appendix A].[24,25,26,27,32,33,34,35]
Table 1

Summary of pros and cons of each ontology

| Ontology | Pros | Cons |
|---|---|---|
| NCIt | Well-known and evolving reference terminology/biomedical ontology; good coverage for cancer | Not a formal ontology; does not cover abnormalities other than cancer; lacks some morphological structures; no coverage for the pathology laboratory process |
| FMA | Well-known and evolving reference ontology; good coverage for anatomical structures | No coverage for disease or for terms such as abnormal, normal, or artifact; no coverage for the pathology laboratory process |
| MeSH | Well-known and evolving vocabulary thesaurus; good coverage for diseases, anatomy, and tissue subcompartments | Not an ontology; does not cover all concepts needed; no logical hierarchy for anatomy and tissue subcompartments; low coverage for the pathology laboratory process |
| SNOMED CT | Well-known and evolving hierarchy of concepts/ontology; includes all concepts needed | Low formality; low coverage for the pathology laboratory process |

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms, MeSH: Medical Subject Headings, FMA: Foundational Model of Anatomy, NCIt: National Cancer Institute Thesaurus

SNOMED CT was the only ontology fulfilling all the selection criteria and was chosen as the most suitable ontology for the project's purpose; it is a well-known, widely used, and evolving ontology with anatomical, histological, and pathological concepts. The disadvantages of SNOMED CT are that it lacks formality and that, as a clinical ontology, it does not cover the laboratory process or imaging technology in the histopathology area. The majority of the SNOMED CT concepts were taken from the “body structure” hierarchy [Figure 4]; the exceptions were normal and abnormal (“qualifier value” class) and artifact (“clinical finding” class).
Figure 4

Image of a branch of the body structure hierarchy in Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), including the SNOMED CT codes. The different structures and diagnoses are ordered in a structured way, originating from the “body structure” concept

Fifty cases from different parts of the colon [Appendix B] were collected, giving 101 WSIs. Of these, 49 WSIs were assessed as normal and 52 as abnormal. A total of 756 annotations were made in the colon WSIs, and 39 unique SNOMED CT concepts and codes were used [Appendix B]. Sixty-nine skin cases from different parts of the body were collected, giving 99 WSIs, of which 50 were assessed as normal and 49 as abnormal. In the skin WSIs, a total of 16,741 annotations were made, and 56 unique SNOMED CT concepts and codes were used [Appendix C]. A total area of 302.19 cm2 (127.7 gigapixels) was annotated. The highest magnification used was 6 times in the colon and 40 times in the skin, the latter because of the small adnexal structures in the skin. The smallest annotated finding measured 300 microns in largest diameter in the colon and 14 microns in largest diameter in the skin. Each slide took from 45 s to 360 min to annotate. The total time taken to annotate all of the WSIs was approximately 360 h. The full summary of results can be seen in Table 2.
Table 2

Summary of results

| Result type | Value |
|---|---|
| Total number of WSIs | 200 |
| Normal WSIs | 99 |
| Abnormal WSIs | 101 |
| Total number of SNOMED CT codes | 95 |
| Anatomical locations | 37 |
| Subcompartments | 10 |
| Specific abnormality codes | 44 |
| Total number of annotations | 17,497 |
| Total area annotated | 302.19 cm2 (127.7 gigapixels) |
| Total time taken to annotate | 360 h (approximately) |

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms, WSI: Whole slide images
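The relationship between annotated area and pixel count, and the per-WSI averages implied by the totals above, can be worked through with simple arithmetic. The sketch below assumes a pixel size of exactly 0.5 microns, whereas the scanning resolution is only stated as approximately 0.5 microns per pixel; the function names are hypothetical. Under that assumption the annotated area corresponds to about 120.9 gigapixels, and the reported 127.7 gigapixels would correspond to an effective pixel size of roughly 0.49 microns.

```python
import math

UM_PER_CM = 10_000  # microns per centimetre


def gigapixels(area_cm2, um_per_pixel):
    """Pixel count (in gigapixels) covering area_cm2 at a given pixel size."""
    area_um2 = area_cm2 * UM_PER_CM ** 2
    return area_um2 / um_per_pixel ** 2 / 1e9


def implied_pixel_size(area_cm2, reported_gigapixels):
    """Pixel size (microns) that would make the area equal the reported pixel count."""
    area_um2 = area_cm2 * UM_PER_CM ** 2
    return math.sqrt(area_um2 / (reported_gigapixels * 1e9))


AREA_CM2 = 302.19        # total annotated area, from the results
REPORTED_GP = 127.7      # reported pixel count, from the results

gp_at_half_micron = gigapixels(AREA_CM2, 0.5)          # assumes exactly 0.5 um/pixel
effective_size = implied_pixel_size(AREA_CM2, REPORTED_GP)

# Per-WSI averages derived from the reported totals.
colon_per_wsi = 756 / 101            # annotations per colon WSI
skin_per_wsi = 16_741 / 99           # annotations per skin WSI
minutes_per_wsi = 360 * 60 / 200     # mean annotation time per WSI

print(f"{gp_at_half_micron:.1f} GP at 0.5 um/pixel")
print(f"{effective_size:.3f} um/pixel implied by {REPORTED_GP} GP")
print(f"{colon_per_wsi:.1f} vs {skin_per_wsi:.1f} annotations per WSI (colon vs skin)")
print(f"{minutes_per_wsi:.0f} min per WSI on average")
```

The averages make the contrast between the two organs concrete: a skin WSI carried on the order of twenty times as many annotations as a colon WSI.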

Digital object identifiers (DOIs) have been assigned to the datasets for future reference.[36,37] The data (WSIs and annotation information in JSON format) are being shared within the AIDA consortium.[38] The dataset is not publicly available because of data regulation laws, but inquiries about access can be directed to AIDA management.

DISCUSSION

The aim of this project was to develop a pilot dataset of exhaustively annotated WSIs of normal and abnormal human tissue and link the annotations produced by this process to appropriate ontological information. Our dataset is scalable, both in terms of adding new domains (diseases, tissues, etc.) and in terms of growing into a larger high-quality dataset. We strongly believe in the necessity of standardizing annotation and annotation labeling, and of creating reusable processes to generate high-quality, large-scale annotations, which will enable future reusability and interoperability. We think this work has contributed to that. Our results show the distribution of anatomical locations, diagnoses, and annotations from the colon and skin in our dataset. The results also show a marked difference between colon and skin in the number of annotations and in the time taken to annotate a WSI. These findings can be explained by the difference in architectural complexity between a WSI from the colon, with its simple, layered subcompartment architecture, and a WSI from adnexal-rich skin, with its more complex subcompartment architecture due to numerous small adnexal structures. To annotate the important parts of the tissues effectively, a structured methodology had to be implemented in the annotation process. The most effective way that the research group found was to start from the outermost layers of the tissue and then move into the innermost layers, and from the left side of the tissue to the right. The ordered pattern in which the annotation process was performed allowed a fluent workflow, which at the same time avoided overlay and crossing of different annotations. In addition, by going from left to right and from the epithelium to the innermost layers, the process was faster, since the location of each annotation led to the location of the next, improving the workflow even further.
This highlights the importance of developing and implementing efficient workflow strategies when creating exhaustive and detailed annotation databases. To find, use, and share information in an annotated image database, it is essential to standardize the content.[2,3,4] In this study, we examined ontologies and structured vocabularies for this purpose and found SNOMED CT to be the best ontology to use, since it covers anatomical locations, histological subcompartments, histopathologic diagnoses, histopathologic terms, and generic terms such as normal, abnormal, and artifact. SNOMED CT was originally created by the College of American Pathologists and is the world's largest clinical terminology, with broad coverage of clinical medicine including disease and phenotypes. SNOMED CT is a class hierarchy of concepts which includes high-level categories such as body structure, clinical findings, and so on. Within each hierarchy, the concepts are organized from the general to the more detailed concept through “is-a” relationships. The concepts can also be linked to different hierarchies through attribute relationships. SNOMED CT is not a specific histopathology ontology, but we could find all the terms we needed for this project. SNOMED CT is also well-known and widely used, which means it is constantly improving and evolving, and with pathology input it can develop and become more suitable for the histopathology area.[5,6] In addition, SNOMED CT terms do not cover the histopathological laboratory process or histopathological imaging technology, which are of importance in sharing annotated datasets. To date, QHIO is the only ontology covering terms representing the different types and subtypes of histopathological images, imaging processes and techniques, and computational algorithms, although it is not yet ready for use.[30] We think a future combination of SNOMED CT and QHIO would be of great benefit.
One significant application of machine learning tools in the future could be to screen normal slides of tissue to minimize manual work, allowing pathologists to have more time to focus on the diagnosis and classification of disease. To train machine learning algorithms for image analysis software, a large amount of data is required to achieve a high accuracy rate; image databases like this one, which include a large number of exhaustive and detailed annotations, can be useful. Another important use is educational, that is, helping students and trainees learn histology and histopathology. This study had some challenges and limitations. One of the main challenges in the creation of the dataset was the time taken to annotate. The annotations were made manually, and complicated WSIs of adnexal-rich skin took a long time to annotate accurately, since they contain additional small structures that are not found in colon tissue. We believe that automatic or semiautomatic annotation tools have an important role in a more efficient annotation workflow, and we will investigate this in future research. We had a small dataset of 200 WSIs. One hundred and one WSIs were abnormal, with a wide variety of diagnoses, many covered by only one WSI. Our study was a pilot project, exhaustive annotation takes time and effort, and our annotations were general and not made for specific machine learning tasks. Approximately 100 WSIs from each organ were judged sufficient for the objective of this pilot project. Despite our rather small dataset, we believe that the images and annotations can be useful for others to contribute to or combine with other datasets. Our lack of a specific machine learning task when creating our annotation rules made it difficult to define which level of detail the annotations should cover.
We decided to aim at major subcompartments in the colon and skin and to annotate at as low a magnification as possible, because of the effort and time exhaustive annotation entails. The low magnification made it difficult to cover all pixels of the image and to make nonoverlapping annotations. Each WSI was annotated by one annotator; consensus annotations from multiple annotators would have enabled more objective and higher-quality annotations. Another challenge in the annotation process was the annotation of diffuse lesions, such as dermatofibroma, where the exact extent of the lesion was hard to define. We delineated these lesions as precisely as we could, with lesion on one side and normal tissue on the other. Lesions involved by diffuse inflammation, where the borders were hard to distinguish, were also difficult to annotate. In these cases, we annotated the lesion as precisely as we could, including part of the inflammation. The inflammation outside the lesion was annotated separately. A further uncertainty in the annotation process was deciding whether some tissue changes, for example reactive changes, were normal or abnormal. These decisions were made by consensus, following the procedure described under Annotator contribution in the Methods.

CONCLUSION

This pilot project has resulted in a dataset of 200 exhaustively annotated WSIs of normal and abnormal human tissue from the colon and skin. The project illustrates the process of building an exhaustively annotated dataset of WSIs. It also illustrates the use of a systematized nomenclature in labelling the annotations, with the aim of facilitating future contributions to, and sharing of, the annotated image data. The 200 WSIs gathered from the colon and skin resulted in 17,497 ontology-linked annotations, covering anatomical location, histological subcompartments, normal/abnormal tissue, and more specific diagnoses as well as tissue abnormalities. SNOMED CT proved to be the best ontology for the objective of this project. This work has informed plans to build a comprehensive library of annotated WSIs. It also shows the need for further development of annotation tools to make the annotation process faster and more efficient.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.
A
SNOMED CT concept | Code
Abnormal | 263654008
Acute and chronic inflammation | 75889009
Acute inflammation | 4532008
Adenocarcinoma | 35917007
Artifact | 47973001
Ascending colon | 9040008
Atrophy | 13331008
Cecum | 32713005
Chronic inflammation | 84499006
Colon | 71854001
Colonic mucous membrane | 68502009
Colonic muscularis propria | 41948009
Colonic submucosa | 61647009
Colonic subserosa | 52010009
Descending colon | 32622004
Diverticula | 31113003
Diverticulitis | 18126004
Dysplasia | 25723000
Edema | 79654002
Fibrosis | 112674009
Granulation tissue | 61363009
Hemorrhage | 50960005
Hyalinization | 19010006
Hyperplasia | 76197007
Hyperplastic polyp | 62047007
Ileum | 34516001
Inflammation | 23583003
Lymphoma | 21964009
Mucinous adenocarcinoma | 72495009
Necrosis | 6574001
Normal | 17621005
Rectum | 34402009
Serrated adenoma | 128653004
Sigmoid colon | 60184004
Stasis | 19685008
Transverse colon | 485005
Tubular adenoma | 444408007
Tubulovillous adenoma | 61722000
Ulcer | 56208002

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms
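As an illustration of how an annotation might be linked to a code from the concept list above, the following minimal sketch stores each outlined region together with its SNOMED CT term and code. It is an assumption for illustration, not the storage format used in this project: the `Annotation` record, the `make_annotation` and `count_by_code` helpers, the slide identifiers, and the polygon coordinates are all hypothetical. Only the term-to-code pairs are taken from the table.

```python
from dataclasses import dataclass

# Subset of the concept list above (term -> SNOMED CT code as printed there).
SNOMED_CODES = {
    "Normal": "17621005",
    "Abnormal": "263654008",
    "Artifact": "47973001",
    "Colon": "71854001",
    "Colonic submucosa": "61647009",
    "Adenocarcinoma": "35917007",
}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one outlined region plus its ontology link."""
    slide_id: str
    term: str
    code: str
    polygon: tuple  # ((x, y), ...) vertices in slide pixel coordinates


def make_annotation(slide_id, term, polygon):
    """Build an annotation, rejecting terms missing from the vocabulary."""
    if term not in SNOMED_CODES:
        raise KeyError(f"no SNOMED CT code registered for term: {term!r}")
    return Annotation(slide_id, term, SNOMED_CODES[term], tuple(polygon))


def count_by_code(annotations):
    """Tally annotations per SNOMED CT code, e.g. for a dataset summary."""
    counts = {}
    for a in annotations:
        counts[a.code] = counts.get(a.code, 0) + 1
    return counts


# Illustrative data only; slide IDs and coordinates are made up.
annotations = [
    make_annotation("slide-001", "Colonic submucosa", [(0, 0), (40, 0), (40, 30)]),
    make_annotation("slide-001", "Adenocarcinoma", [(10, 10), (20, 10), (20, 20)]),
    make_annotation("slide-002", "Adenocarcinoma", [(5, 5), (9, 5), (9, 9)]),
]
counts = count_by_code(annotations)
print(counts)
```

Keeping the code alongside the free-text term means that datasets labelled with different wording can still be merged on the code, which is the kind of sharing a consistent terminology is intended to support.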

Table B1

Number of cases from different parts of the colon, or from the colon as a whole, with the corresponding Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) code

Part of colon | Number of cases | SNOMED CT code
Right colon | 16 | 51342009
Transverse colon | 3 | 485005
Left colon | 7 | 55572008
Sigmoid colon | 18 | 60184004
Rectum | 2 | 34402009
Colon unspecified | 4 | 71854001

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms

Table B2

Number of cases per diagnosis in the colon cases, with the corresponding Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) code

Diagnosis | Number of cases | SNOMED CT code
High-grade adenocarcinoma | 15 | 413447005
Low-grade adenocarcinoma | 14 | 413449008
Mucinous adenocarcinoma | 5 | 72495009
Lymphoma | 1 | 21964009
Tubulovillous adenoma | 1 | 61722000
Tubular adenoma | 2 | 19665009
Serrated adenoma | 2 | 128653004
Ulcerative colitis | 1 | 64766004
Morbus Crohn | 1 | 34000006
Hyperplastic polyp | 2 | 62047007
Hyperplasia | 1 | 76197007
Diverticulosis | 6 | 397881000
Necrosis | 1 | 6574001
Ulceration/hemorrhage | 1 | 56208002/50960005
Inflammation | 1 | 23583003

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms

A
SNOMED CT concept | Code
Acanthosis | 23620008
Actinic keratosis | 856006
Basal cell carcinoma | 1338007
Dermatofibroma | 72079004
Dermis | 53534000
Dysplastic nevus | 61814002
Epidermis | 55988001
Fibrosis | 112674009
Fibrin body | 45619005
Granuloma | 45647009
Inflammation | 23583003
Inflammatory edema | 103619005
Intradermal nevus | 112681002
Keratoacanthoma | 416378000
Lentigo maligna melanoma | 44474009
Malignant melanoma | 2092003
Melanocytic nevus | 400101001
Malignant melanoma in situ | 77986002
Normal skin | 225544001
Perichondrium | 11881003
Reactive cellular changes | 125513006
Scar | 12402003
Seborrheic keratosis | 25499005
Skin and subcutaneous tissue structure | 127856007
Skin and subcutaneous tissue structure of back | 417286006
Skin and subcutaneous tissue structure of head | 389074000
Skin and subcutaneous tissue structure of trunk | 389072001
Skin appendage structure | 7748002
Skin structure | 39937001
Skin structure of back | 66643007
Skin structure of breast | 82038008
Skin structure of calf of leg | 51059006
Skin structure of cheek | 36141000
Skin structure of ear | 1902009
Skin structure of eyebrow | 367577003
Skin structure of eyelid and periocular area | 399996007
Skin structure of face | 73897004
Skin structure of female genitalia | 19938000
Skin structure of forehead | 68698007
Skin structure of foot | 60496002
Skin structure of hand | 33712006
Skin structure of head | 70762009
Skin structure of lip | 88089004
Skin structure of lower extremity | 371304004
Skin structure of neck | 43081002
Skin structure of nose | 113179006
Skin structure of scalp | 43067004
Skin structure of scapular region of back | 45980000
Skin structure of shoulder | 76552005
Skin structure of temple | 244081009
Skin structure of thigh | 371305003
Skin structure of upper extremity | 371311000
Structure of cartilage of auditory canal | 83543000
Squamous cell carcinoma | 28899001
Squamous cell carcinoma in situ | 59529006
Subcutaneous fatty tissue | 67769002
Subcutaneous tissue | 71966008
Surgical margins | 82868003

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms

Table C1

Number of cases from different parts of the skin, with the corresponding Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) code

Part of skin | Number of cases | SNOMED CT code
Back | 5 | 66643007
Head | 4 | 70762009
Trunk | 3 | 389072001
Breast | 2 | 82038008
Calf of leg | 1 | 51059006
Cheek | 6 | 36141000
Ear | 9 | 1902009
Eyebrow | 1 | 367577003
Eyelid and periocular area | 3 | 399996007
Face | 5 | 73897004
Female genitalia | 1 | 19938000
Forehead | 2 | 68698007
Foot | 1 | 60496002
Hand | 1 | 33712006
Lip | 2 | 88089004
Lower extremity | 6 | 371304004
Neck | 2 | 43081002
Nose | 6 | 113179006
Scapular region of the back | 1 | 45980000
Shoulder | 3 | 76552005
Temple | 5 | 244081009
Thigh | 2 | 371305003
Upper extremity | 4 | 371311000

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms

Table C2

Number of cases per diagnosis in the skin cases, with the corresponding Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) code

Diagnosis | Number of cases | SNOMED CT code
Actinic keratosis | 1 | 856006
Basal cell carcinoma | 31 | 1338007
Dermatofibroma | 2 | 72079004
Dysplastic nevus | 1 | 61814002
Intradermal nevus | 2 | 112681002
Keratoacanthoma | 1 | 416378000
Lentigo maligna melanoma | 1 | 44474009
Malignant melanoma | 3 | 2092003
Malignant melanoma in situ | 1 | 77986002
Scar | 3 | 12402003
Seborrheic keratosis | 2 | 25499005
Squamous cell carcinoma | 3 | 28899001
Squamous cell carcinoma in situ | 3 | 59529006

SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms
