Xin Huang1,2, Hui Chen3, Jing-Dong Yan4.
Abstract
BACKGROUND: Image text is an important type of text data in the medical field, as it can assist clinicians in making a diagnosis. However, due to the diversity of languages, most descriptions in the image text are unstructured data. The same medical phenomenon may also be described in various ways, so text structure analysis remains challenging. The aim of this research is to develop a feasible approach that can automatically convert nasopharyngeal cancer reports into structured text and build a knowledge network.
Keywords: Knowledge network; Named entity recognition; Structured medical text
Year: 2021 PMID: 34330269 PMCID: PMC8323197 DOI: 10.1186/s12911-021-01547-1
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Overall framework. The overall framework can be divided into three parts: data preprocessing, NER model training, and information extraction and network construction
Example of triplet
Sentence: The scan showed that the nasopharyngeal cavity was asymmetry and severely narrowed, tumor size was seen in 1.2 cm × 0.9 cm

PSAV vocabulary:

| Primary entity | Subsidiary entity | Attribute | Value |
|---|---|---|---|
| Nasopharyngeal cavity | Tumor | Size | Asymmetry, severely narrowed, 1.2 cm × 0.9 cm |
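
A minimal Python sketch of how one such < (P,S),A,V > record could be held in memory. The class name `Triplet`, its field names, and the `render` helper are illustrative assumptions and are not taken from the paper; `None` stands in for the "–" placeholder used in the tables.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Triplet:
    """One < (P, S), A, V > record (hypothetical container, not the authors' code)."""
    primary: Optional[str]      # P: primary entity
    subsidiary: Optional[str]   # S: subsidiary entity
    attribute: Optional[str]    # A: attribute
    values: Tuple[str, ...]     # V: one or more values

    def render(self) -> str:
        # Missing slots are shown with the same dash the tables use.
        dash = "–"
        p = self.primary or dash
        s = self.subsidiary or dash
        a = self.attribute or dash
        return f"< ({p}, {s}), {a}, {', '.join(self.values)} >"


# The row from the "Example of triplet" table.
example = Triplet(
    primary="Nasopharyngeal cavity",
    subsidiary="Tumor",
    attribute="Size",
    values=("Asymmetry", "severely narrowed", "1.2 cm × 0.9 cm"),
)
print(example.render())
```

Keeping the values as a tuple lets a single record carry several findings for the same entity, as in the example row.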
Example of conjunction rule base
| Entity labels | Tokens | Triplet < (P,S),A,V > |
|---|---|---|
| PCP | No abnormalities/V in ethmoid sinus/P and/C maxillary sinus/P | < (Ethmoid sinus,–),–, no abnormalities > |
| | | < (Maxillary sinus,–),–, no abnormalities > |
| ACA | No change/V in thyroid shape/A and/C signal/A | < (Thyroid,–),shape, no change > |
| | | < (Thyroid,–),signal, no change > |
| VCV | Nasopharyngeal cavity/P Asymmetry/V with/C mild stenosis/V | < (Nasopharyngeal cavity,–),–, Asymmetry, mild stenosis > |
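
The three conjunction patterns in the table can be sketched in Python as follows. This is a simplified illustration, not the authors' rule base: the function name `expand_conjunctions` and the record layout are assumptions, the subsidiary entity is omitted because it is "–" in every row of the table, the conjunction token (label C) is simply skipped, and the ACA example assumes that "thyroid" has been tagged as a separate P token.

```python
from itertools import product
from typing import List, Optional, Tuple

Tagged = Tuple[str, str]  # (token text, label), labels P, A, V, C as in the table
Record = Tuple[Optional[str], Optional[str], Tuple[str, ...]]  # (P, A, values)


def expand_conjunctions(tokens: List[Tagged]) -> List[Record]:
    """Expand conjoined entities or attributes into one record each.

    Conjoined P tokens (PCP) and conjoined A tokens (ACA) each yield their
    own record; conjoined V tokens (VCV) stay together in a single record.
    """
    primaries = [text for text, label in tokens if label == "P"] or [None]
    attributes = [text for text, label in tokens if label == "A"] or [None]
    values = tuple(text for text, label in tokens if label == "V")
    return [(p, a, values) for p, a in product(primaries, attributes)]


pcp = [("no abnormalities", "V"), ("ethmoid sinus", "P"),
       ("and", "C"), ("maxillary sinus", "P")]
aca = [("no change", "V"), ("thyroid", "P"), ("shape", "A"),
       ("and", "C"), ("signal", "A")]
vcv = [("nasopharyngeal cavity", "P"), ("asymmetry", "V"),
       ("with", "C"), ("mild stenosis", "V")]

for case in (pcp, aca, vcv):
    print(expand_conjunctions(case))
```

With these inputs, the PCP and ACA cases each produce two records that share the same value, while the VCV case produces one record holding both values.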
Fig. 2 Word frequency of the primary entities. The entities from left to right are: nasopharyngeal cavity, nasopharyngeal, parapharyngeal space, carotid sheath, pharyngeal crypts, ethmoid sinus, musculus longus capitis, maxillary sinus, thyroid, pterygoid muscle, clivus, mastoid, neck, sphenoid sinus, skull base, internal and external pterygoid muscles, throat, tensor veli palatini, sphenoid body, sphenoid wing plate, nasopharynx, petrous bone, sternocleidomastoid muscle, levator veli palatini, lateral ventricle, inferior turbinate, pterygoid process, submandibular, posterolateral pharyngeal space, petrous apex, vertebral body, semiovale, nasal septum
Fig. 3 Performance of the NER model. Accuracy (left) and recall (right) of the NER model. BERT-CRF performs best. Among the CNN architectures, IDCNN-CRF outperforms CNN-CRF; among the RNN architectures, BiLSTM-CRF outperforms LSTM-CRF
Example of structure
Sentence: The nasopharyngeal cavity is slightly asymmetrical and slightly narrow. The posterior wall and bilateral walls of the nasopharynx are thickened. A lump formed on the posterior wall of the nasopharynx. The bilateral pharyngeal recesses are narrowed. The T1WI of the lump shows a uniform signal, and the T2WI becomes a high signal

| Location | Primary entity | Location | Subsidiary entity | Attribute | Value |
|---|---|---|---|---|---|
| – | Nasopharyngeal cavity | – | – | – | Asymmetrical |
| – | Nasopharyngeal cavity | – | – | – | Slightly narrow |
| – | Nasopharynx | Posterior wall | – | – | Thickened |
| – | Nasopharynx | Bilateral walls | – | – | Thickened |
| – | – | Posterior wall | Lump | – | Formed |
| Bilateral | Pharyngeal | – | – | – | Narrowed |
| – | – | – | Lump | T1WI | Uniform signal |
| – | – | – | Lump | T2WI | High signal |
Limitation of structured algorithm
| Description | Triplet < (P,S),A,V > | Notes |
|---|---|---|
| There exist multiple lymph nodes in the parapharyngeal space, the size of which is 2.6 × 1.5 cm | < (Parapharyngeal space,–), lymph nodes, multiple > | After dividing the description by punctuation, each token has only one triplet |
| | < (–, Lymph nodes), size, 2.6 × 1.5 cm > | |
| There exist multiple lymph nodes with the size of 2.6 × 1.5 cm in the parapharyngeal space | < (Parapharyngeal space, lymph nodes), size, common > | This description has two triplets |
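
The punctuation-based splitting mentioned in the notes can be illustrated with a short Python sketch. The delimiter set (commas and semicolons, ASCII or full-width) and the function name `split_by_punctuation` are assumptions made for illustration; the source does not state which punctuation marks the algorithm splits on.

```python
import re
from typing import List

# Assumption: commas and semicolons (ASCII or full-width) delimit segments.
# Periods are left out so that decimals such as "2.6" are not broken apart.
_DELIMITERS = re.compile(r"\s*[,;，；]\s*")


def split_by_punctuation(description: str) -> List[str]:
    """Split a report description into segments, dropping empty pieces."""
    return [part for part in _DELIMITERS.split(description.strip()) if part]


first = ("There exist multiple lymph nodes in the parapharyngeal space, "
         "the size of which is 2.6 × 1.5 cm")
second = ("There exist multiple lymph nodes with the size of 2.6 × 1.5 cm "
          "in the parapharyngeal space")

print(split_by_punctuation(first))   # two segments
print(split_by_punctuation(second))  # one segment
```

Under this assumption the first description yields two segments, one per finding, whereas the second stays a single segment even though it carries both the count and the size, which is the situation the table flags as a limitation.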
Fig. 4 Knowledge network of nasopharyngeal cancer