Youqing Mu, Hamid R Tizhoosh, Rohollah Moosavi Tayebi, Catherine Ross, Monalisa Sur, Brian Leber, Clinton J V Campbell.
Abstract
Background: Pathology synopses consist of semi-structured or unstructured text summarizing visual information obtained by observing human tissue. Experts write and interpret these synopses with high domain-specific knowledge to extract tissue semantics and formulate a diagnosis in the context of ancillary testing and clinical information. The limited number of specialists available to interpret pathology synopses restricts the utility of the information they contain. Deep learning offers a tool for information extraction and automatic feature generation from complex datasets.
Keywords: Haematological cancer; Pathology
Year: 2021 PMID: 35602188 PMCID: PMC9053264 DOI: 10.1038/s43856-021-00008-0
Source DB: PubMed Journal: Commun Med (Lond) ISSN: 2730-664X
Fig. 1 Generation of semantic labels for bone marrow aspirate synopses and modeling process.
An expert reader (a clinical hematologist) interprets semi-structured bone marrow aspirate synopses and maps their contents to one or more semantic labels, which impact clinical decision-making. To train a model to assign semantic labels to bone marrow aspirate synopses, each synopsis is first converted to a single text string and then tokenized into an input vector. The input vector passes through BERT and the classifier. The final output is a vector of size 21 (the number of semantic labels in our study). This output is then compared with the ground-truth vector to adjust the network weights.
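The comparison between the size-21 output and the ground-truth vector can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the caption does not name the loss function, so the sigmoid with binary cross-entropy used here is an assumption (a common choice when one synopsis can carry several labels), and the label indices, logits, and 0.5 threshold are hypothetical.

```python
import math

NUM_LABELS = 21  # size of the output vector stated in the caption


def sigmoid(x: float) -> float:
    """Map a raw classifier score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multi_hot(label_indices, num_labels=NUM_LABELS):
    """Ground-truth vector: 1.0 at each assigned label index, 0.0 elsewhere."""
    target = [0.0] * num_labels
    for i in label_indices:
        target[i] = 1.0
    return target


def bce_loss(logits, target):
    """Mean binary cross-entropy over all labels (assumed loss, see lead-in)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, target):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(target)


def predict(logits, threshold=0.5):
    """Indices of labels whose probability reaches the (hypothetical) threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


# Hypothetical synopsis carrying two labels, at indices 1 and 8.
target = multi_hot([1, 8])

# Hypothetical classifier outputs: one set agrees with the target, one contradicts it.
good_logits = [-4.0] * NUM_LABELS
good_logits[1] = 4.0
good_logits[8] = 4.0
bad_logits = [-z for z in good_logits]

print(bce_loss(good_logits, target), bce_loss(bad_logits, target))
print(predict(good_logits))
```

A lower loss for `good_logits` than for `bad_logits` is the signal that training would use to adjust the weights; in practice the logits would come from the classifier on top of BERT rather than being written by hand.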
The evolution of the semantic labels.
| Iteration | New labels | Label count | Sample count |
|---|---|---|---|
| 1 | Acute lymphoblastic leukemia, acute myeloid leukemia, inadequate, lymphoproliferative disorder, mastocytosis, metastatic, myelodysplastic syndrome, myeloproliferative neoplasm, normal, plasma cell neoplasm | 10 | 50 |
| 2 | Erythroid hyperplasia, iron deficiency | 12 | 83 |
| 3 | Acute leukemia, acute promyelocytic leukemia, chronic myeloid leukemia, hemophagocytosis, hypercellular, hypocellular | 18 | |
| 4 | Basophilia, eosinophilia | 20 | 282 |
| 5 | | 20 | 296 |
| 6 | Granulocytic hyperplasia | 21 | 344 |
| 7 | | 21 | 393 |
| 8 | | 21 | 408 |
| 9 | | 21 | 500 |
In each iteration, new cases and/or new labels are added to the dataset. In some iterations, we reviewed the labeled cases and added new labels to the previous cases, or added a small number of new semantic labels.
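One way to picture the growing label set is as a vocabulary that only ever gains entries, so that vectors encoded in an earlier iteration remain a prefix of later ones. The sketch below is illustrative only: the label names and the iterations in which they appear are taken from the table above, while the function names, the append-only ordering, and the example case are hypothetical.

```python
# New labels per iteration, copied from the table above.
# Iterations that added no labels (5, 7, 8, 9) are simply absent.
NEW_LABELS_BY_ITERATION = {
    1: [
        "acute lymphoblastic leukemia", "acute myeloid leukemia", "inadequate",
        "lymphoproliferative disorder", "mastocytosis", "metastatic",
        "myelodysplastic syndrome", "myeloproliferative neoplasm", "normal",
        "plasma cell neoplasm",
    ],
    2: ["erythroid hyperplasia", "iron deficiency"],
    3: [
        "acute leukemia", "acute promyelocytic leukemia",
        "chronic myeloid leukemia", "hemophagocytosis",
        "hypercellular", "hypocellular",
    ],
    4: ["basophilia", "eosinophilia"],
    6: ["granulocytic hyperplasia"],
}


def label_vocabulary(up_to_iteration):
    """All labels introduced up to and including the given iteration, in order of introduction."""
    vocab = []
    for iteration in sorted(NEW_LABELS_BY_ITERATION):
        if iteration <= up_to_iteration:
            vocab.extend(NEW_LABELS_BY_ITERATION[iteration])
    return vocab


def encode(case_labels, vocab):
    """Multi-hot vector for one case against the given vocabulary."""
    index = {name: i for i, name in enumerate(vocab)}
    vector = [0] * len(vocab)
    for name in case_labels:
        vector[index[name]] = 1
    return vector


# Label counts per iteration, for comparison with the "Label count" column.
counts = {it: len(label_vocabulary(it)) for it in range(1, 10)}
print(counts)

# Hypothetical case: labeled in iteration 1, then reviewed after iteration 3
# and given an additional label that did not exist when it was first read.
early = encode(["myelodysplastic syndrome"], label_vocabulary(1))
late = encode(["myelodysplastic syndrome", "hypercellular"], label_vocabulary(9))
print(early)
print(late)
```

Because new labels are appended rather than inserted, the index of every existing label stays fixed across iterations, and a previously labeled case only needs its vector extended, and updated where a reviewer adds a label.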