
A new dataset of computed-tomography angiography images for computer-aided detection of pulmonary embolism.

Mojtaba Masoudi1, Hamid-Reza Pourreza1, Mahdi Saadatmand-Tarzjan2, Noushin Eftekhari1, Fateme Shafiee Zargar3, Masoud Pezeshki Rad3.   

Abstract

The lack of publicly available datasets of computed-tomography angiography (CTA) images for pulmonary embolism (PE) is a problem felt by physicians and researchers. Although a number of computer-aided detection (CAD) systems have been developed for PE diagnosis, their performance is often evaluated using private datasets. In this paper, we introduce a new public dataset called FUMPE (standing for Ferdowsi University of Mashhad's PE dataset) which consists of three-dimensional PE-CTA images of 35 different subjects with 8792 slices in total. For each benchmark image, two expert radiologists provided the ground-truth with the assistance of a semi-automated image processing software tool. FUMPE is a challenging benchmark for CAD methods because of the large number (i.e., 3438) of PE regions and, more especially, because of the location of most of them (i.e., 67%) in lung peripheral arteries. Moreover, due to the reporting of the Qanadli score for each PE-CTA image, FUMPE is the first public dataset which can be used for the analysis of mortality and morbidity risks associated with PE. We also report some complementary prognosis information for each subject.

Year:  2018        PMID: 30179235      PMCID: PMC6122162          DOI: 10.1038/sdata.2018.180

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Pulmonary embolism (PE) is a sudden blockage of a lung artery by a deep vein thrombosis (DVT) clot, typically originating in the pelvic veins and carried by the blood flow through the heart into the lung. Since PE may reduce respiratory capability through pulmonary artery (PA) closure, early diagnosis and treatment of DVT can decrease the risk of PE. However, once arterial obstruction exceeds 50% of the cross-sectional area, massive PE may occur with acute and severe cardiopulmonary failure because of right ventricular overload. It was reported that 70% of patients died within the first hour after onset of the above symptoms. Therefore, early and precise diagnosis of PE is important, due to the high morbidity and mortality risk[1,2]. Contrast-enhanced computed tomography (called CT angiography or CTA) images have been widely used for PE diagnosis[3-5] because of their suitable lesion discrimination in blood vessels[6]. Specifically, PE regions appear as dark spots among the bright regions of blood arteries in CTA images[7]. The radiologist should record the CTA image within a suitable time interval after injection of the contrast material and before it travels from the arteries to the veins. In this case, although the vein and PE regions may have similar gray-levels in the CTA image, the latter can be distinguished from the former by its higher contrast. Nevertheless, lymphatic tissue, parenchymal disease, and the partial volume effect may also produce similar dark regions (especially on artery boundaries) in CTA images[8]. This is why the manual delineation of PE regions is a time-consuming task that depends on expert insight[9]. In recent years, with advances in computing technologies, computer-aided detection (CAD) systems have gained increasing impact in clinical and research applications[5].
However, due to the above challenges, automated or semi-automated detection of PE is still a challenging endeavor for radiologists, physicians, and biomedical engineers. These groups are unable to precisely evaluate and compare their results with each other, due to the lack of a proper dataset of PE-CTA images with suitable ground-truth, evaluation scores, and prognosis information. To tackle this problem, some researchers have generated private datasets, which are not widely shared[4,8,10]. Recently, the Madrid-MIT M+Visión Consortium[11] supplied a public dataset of 20 PE-CTA images with ground-truth. However, they reported neither the clinical information of subjects nor the evaluation scores of PE-CTA images. In this paper, we present a new dataset of three-dimensional (3D) PE-CTA images, called FUMPE (standing for Ferdowsi University of Mashhad's PE dataset), for computer-aided detection with research and education purposes. It includes 35 PE-CTA images with a total of 8792 slices. Furthermore, an expert radiologist manually and precisely delineated the PE regions in every slice of each CTA image as the ground-truth. We took advantage of a semi-automated software tool to enhance the segmentation results. The final PE regions were re-examined by another expert radiologist. In addition, for further evaluation, the first radiologist provided five CTA measurements for every benchmark image.

Methods

We first obtained ethical approval from the ethics committee of Mashhad University of Medical Sciences. Although all images have been published anonymously in the proposed dataset to avoid the risk of a privacy breach, we obtained signed informed consent from every patient. As shown in Fig. 1, the development process of the proposed dataset consists of contrast material injection, image acquisition, image selection, image segmentation, and Qanadli scoring, as described in detail below.
Figure 1

Five steps of the development process of the proposed dataset.

Contrast material injection

In a normal PE-CTA, the pulmonary arteries should be full of the contrast material while the aorta should be empty of it. Therefore, a total of 70-100 mL of non-ionic contrast material (containing 300-370 milligrams of iodine per milliliter) was injected into the right antecubital vein by using gauge-18 or -16 catheters (with the flow of 4-5 mm per second) at 10-12 seconds before imaging.

Prognosis symptoms

To collect the FUMPE dataset, 400 PE-CTA images were initially recorded from different patients. The most common patient complaints were dyspnea, tachypnea, and pleuritic chest pain with haemoptysis. Moreover, some patients had non-specific signs and symptoms, such as tachycardia, palpitations, wheezing, and cough. However, patients with massive PE had hypotension, extreme hypoxemia, cyanosis, syncope, or even cardiac arrest. Furthermore, for non-urgent patients, the DVT test was performed.

Imaging

CT scanning was performed in Emam-Reza and Ghaem Medical Centers (http://quaem.mums.ac.ir/index.php/en) by using the NeuViz 16 multi-slice helical CT scanners of Philips and Neusoft Medical System Co., Ltd with 120 kVp, 0.75 mm × 16 collimation, a gantry rotation time of 0.75 s, and a beam pitch of 1.2. Also, in order to automatically adjust the tube current, the scanners took advantage of both dose modulation and angular/longitudinal tube-current modulation (with automatic current selection) for all subjects except Patient21. The range of the tube current variations for each subject is reported in Table 1. All PE-CTA images of FUMPE were acquired in one breath hold with:
Table 1

Different characteristics of CTA images reported in FUMPE.

Including the subject gender and age, deep vein thrombosis (DVT) test, slice thickness and interval, tube current, imaging direction, total number of slices (#Slice), total number of regions of interest with pulmonary embolism (#PE-ROIs) in main arteries, #PE-ROIs in peripheral arteries, and #PE-ROIs in the total arteries.

Case | Gender | Age | DVT test | Slice Thickness (Interval) | Tube Current (mA) | Imaging Direction | #Slice | #PE-ROIs (Main) | #PE-ROIs (Peripheral) | #PE-ROIs (Total)
Patient01 | M | 32 | X | 1.0 (1.0) | [198, 297] | Caudocranial | 201 | 46 | 34 | 80
Patient02 | F | 70 | | 1.0 (1.0) | [123, 155] | Caudocranial | 185 | 0 | 21 | 21
Patient03 | M | 42 | NA | 1.0 (4.0) | [210, 282] | Caudocranial | 210 | 39 | 55 | 94
Patient04 | F | 70 | X | 1.0 (1.0) | [183, 239] | Caudocranial | 197 | 17 | 29 | 46
Patient05 | F | NR | NA | 1.0 (1.0) | [147, 210] | Caudocranial | 217 | 71 | 92 | 163
Patient06 | F | NR | NA | 1.0 (1.0) | [135, 167] | Caudocranial | 232 | 0 | 15 | 15
Patient07 | M | 52 | | 1.0 (1.0) | [196, 251] | Caudocranial | 197 | 95 | 147 | 242
Patient08 | M | 28 | | 1.0 (1.0) | [160, 257] | Caudocranial | 273 | 23 | 36 | 59
Patient09 | F | 69 | NA | 1.0 (1.0) | [124, 184] | Caudocranial | 237 | 0 | 53 | 53
Patient10 | M | 71 | NA | 1.0 (4.0) | [142, 224] | Caudocranial | 204 | 0 | 8 | 8
Patient11 | F | 63 | NA | 1.0 (1.0) | [60, 87] | Caudocranial | 217 | 0 | 18 | 18
Patient12 | F | 71 | | 1.0 (1.0) | [130, 160] | Craniocaudal | 178 | 0 | 77 | 77
Patient13 | M | 27 | | 1.0 (1.0) | [235, 303] | Caudocranial | 189 | 30 | 23 | 53
Patient14 | F | 29 | NA | 1.0 (1.0) | [231, 281] | Caudocranial | 217 | 0 | 98 | 98
Patient15 | M | 61 | | 1.0 (1.0) | [145, 204] | Caudocranial | 250 | 15 | 31 | 46
Patient16 | F | 53 | NA | 1.0 (1.0) | [142, 236] | Caudocranial | 235 | 18 | 59 | 77
Patient17 | F | 62 | | 1.0 (0.5) | [232, 325] | Caudocranial | 475 | 194 | 60 | 254
Patient18 | M | 82 | | 1.0 (0.5) | [165, 295] | Caudocranial | 452 | 54 | 102 | 156
Patient19 | M | 76 | NA | 1.0 (1.0) | [206, 266] | Caudocranial | 370 | 51 | 300 | 351
Patient20 | M | 39 | NA | 1.0 (0.5) | [168, 321] | Caudocranial | 424 | 19 | 48 | 67
Patient21 | M | 80 | NA | 1.0 (1.0) | 242 | Caudocranial | 217 | 51 | 0 | 51
Patient22 | F | 80 | NA | 1.0 (1.0) | [174, 246] | Caudocranial | 185 | 0 | 22 | 22
Patient23 | M | 54 | | 1.0 (1.0) | [199, 276] | Caudocranial | 297 | 0 | 116 | 116
Patient24 | F | 31 | NA | 2.0 (1.5) | [121, 175] | Caudocranial | 139 | 0 | 0 | 0
Patient25 | F | 33 | NA | 1.0 (1.0) | [184, 285] | Caudocranial | 221 | 0 | 67 | 67
Patient26 | F | 24 | NA | 1.0 (1.0) | [134, 283] | Caudocranial | 213 | 77 | 131 | 208
Patient27 | M | 28 | | 1.0 (1.0) | [107, 207] | Caudocranial | 277 | 53 | 88 | 141
Patient28 | F | 31 | | 1.0 (4.0) | [182, 264] | Caudocranial | 183 | 19 | 0 | 19
Patient29 | M | 70 | NA | 1.0 (1.0) | [152, 203] | Caudocranial | 197 | 24 | 72 | 96
Patient30 | F | 77 | | 1.0 (1.0) | [120, 228] | Caudocranial | 205 | 35 | 64 | 99
Patient31 | M | 70 | | 1.0 (1.0) | [144, 227] | Craniocaudal | 268 | 54 | 103 | 157
Patient32 | F | 74 | NA | 2.0 (1.5) | [106, 132] | Caudocranial | 155 | 0 | 0 | 0
Patient33 | F | 72 | | 1.0 (0.5) | [192, 277] | Caudocranial | 435 | 115 | 60 | 175
Patient34 | M | 80 | NA | 1.0 (1.0) | [182, 279] | Caudocranial | 189 | 34 | 121 | 155
Patient35 | M | 66 | NA | 1.0 (0.5) | [217, 328] | Caudocranial | 451 | 0 | 154 | 154
Total | | | | | | | 8792 | 1134 (33%) | 2304 (67%) | 3438
slice thickness ≤ 1 mm (except for Patient24 and Patient32, with a slice thickness of 2 mm); slice interval ≤ 1.5 mm (except for Patient03, Patient10, and Patient28, with a slice interval of 4 mm); and the caudocranial imaging direction (except for Patient12 and Patient31, which are listed with the craniocaudal direction in Table 1).

Image selection

It has frequently been demonstrated that CAD systems extract PE regions better in the main arteries than in the peripheral vessels, due to higher contrast and better discrimination[8]. Thus, a suitable benchmark dataset for the evaluation of CAD systems should include a large number of PE regions in the peripheral arteries. Therefore, from among all the recorded PE-CTA images, we chose by visual inspection the 35 images with the largest number of PE regions in peripheral arteries to make the proposed dataset.

Image segmentation

To establish the ground-truth, a board-certified radiologist (with over 5 years of experience in PE-CTA analysis) first delineated all PE regions of interest (PE-ROIs) in each PE-CTA image. He also took advantage of a semi-automated software tool called MIS (standing for medical image segmenter), which supports the coronal and sagittal reconstructions (in addition to the original axial view), to ensure delineation accuracy. Finally, the delineated PE-ROIs were re-examined and approved by the head of the radiology department (with 18 years of experience) of Emam-Reza Medical Center (http://emamreza.mums.ac.ir/index.php/en).

Code availability

We developed the MIS software tool in the MATLAB R2017 environment. It consists of a GUI window in which the user can see a 3D DICOM image in the axial, coronal, and sagittal views. Also, the user can choose the region of interest in each slice by multiple mouse selections. The software uses a semi-automated segmentation algorithm which consists of thresholding and connected-component analysis steps[12]. It determines the local region connected to a seed point, chosen by the user, through a gray-level similarity criterion. The source code (in the MATLAB environment), compiled executable file, and pictorial user manual of MIS are publicly available at: https://doi.org/10.6084/m9.figshare.6289085 (with the Figshare Repository).
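The seed-based step described above can be illustrated with a minimal Python sketch. This is not the MIS source code (which is written in MATLAB); the function name grow_region, the 4-neighbour connectivity, the tolerance parameter, and the toy gray levels are all assumptions made for illustration only.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Collect the pixels connected to `seed` whose gray level differs
    from the seed's gray level by at most `tolerance`.

    `image` is a 2D list of gray levels and `seed` is a (row, col) tuple.
    4-neighbour connectivity is an assumption of this sketch.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                # Gray-level similarity criterion (thresholding step).
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Toy slice (hypothetical values): a dark clot inside a bright artery,
# plus one isolated dark pixel that is not connected to the seed.
toy_slice = [
    [300, 300, 300, 300, 300],
    [300,  60,  70, 300, 300],
    [300,  65, 300, 300,  60],
    [300, 300, 300, 300, 300],
]
region = grow_region(toy_slice, seed=(1, 1), tolerance=20)
```

In this toy slice the region grows over the three connected dark pixels, while the isolated dark pixel at (2, 4) is excluded because it is not connected to the seed, even though its gray level is similar.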

CTA measurements

We provided five measurements for each PE-CTA image, as follows:
RV/LV Ratio: Right ventricular (RV) failure is one of the most important causes of early death after PE[13]. CTA enables the radiologist to assess RV dysfunction by calculating the ratio of the RV diameter to the left ventricular (LV) diameter (called the RV/LV ratio) in the reconstructed four-chamber views.
Reflux into IVC: Reflux of the contrast material into the inferior vena cava (IVC), which can be observed in CTA images, is associated with right heart failure due to PE[14].
Straight Septum & PA Diameter: Severe PE increases the right heart pressure. In this case, the interventricular septum may be abnormally shifted toward the left ventricle[15], and the diameter of the main pulmonary artery (lateral to the ascending aorta and at the level of its bifurcation) may also be increased[16].
Q-score: After image segmentation, we assessed the arterial clots of each subject according to the Qanadli scoring system (Q-score). Generally, the Q-score is computed as the superposition $Q=\sum_{k=1}^{n} d_k$, where n indicates the total number of proximal clot sites and $d_k$ determines the obstruction index of the k-th one. In more detail, in the left lung, the upper, lingular, and lower lobar arteries branch into three (apical, posterior, and anterior), two (superior and inferior), and five (superior, medial, lateral, posterior, and anterior) segments, respectively. Similarly, the lobar arteries of the right lung are also separated into 10 segments, in the same manner. Thus, as illustrated in Fig. 2, we have a total of n=20 segments in both lungs. For the k-th segment (k=1,2,…,n), $d_k$ is set equal to 0, 1, and 2 for the clot-free, partial obstruction, and total occlusion situations, respectively[17]. When there is an embolus at the most proximal arterial level, its corresponding index is computed as the superposition of the obstruction indices of all segmental arteries arising distally. For example, Fig. 2 illustrates the obstruction indices of all arterial segments in the left and right lungs of Patient16. The Q-score can be used for prognosis evaluation, assessment of treatment response, and determining the anti-coagulant treatment period[18]. Also, patients with Q-scores larger than 18 have high mortality and morbidity rates[19].
Figure 2

Sample Q-Score computation.

The arterial segments of the left and right lungs of Patient16 (used for Q-score computation) are illustrated as an example. The obstruction index of each segment is indicated in the figure. The total Q-score (i.e. 19) was computed as the superposition of all the obstruction indices.
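The Q-score computation described above amounts to summing 20 per-segment obstruction indices. A minimal Python sketch follows; the function names and the example indices are hypothetical (they are not the values of Patient16 or of any FUMPE subject), and only the index values 0, 1, 2, the 20-segment layout, and the threshold of 18 are taken from the text.

```python
HIGH_RISK_THRESHOLD = 18  # Q-scores larger than 18 are reported as high risk
SEGMENTS_PER_LUNG = 10    # 3 upper + 2 lingular + 5 lower segmental arteries


def q_score(obstruction_indices):
    """Sum the obstruction indices of all 20 segmental arteries.

    Each index must be 0 (clot-free), 1 (partial obstruction),
    or 2 (total occlusion).
    """
    if len(obstruction_indices) != 2 * SEGMENTS_PER_LUNG:
        raise ValueError("expected 20 segmental obstruction indices")
    if any(d not in (0, 1, 2) for d in obstruction_indices):
        raise ValueError("each obstruction index must be 0, 1, or 2")
    return sum(obstruction_indices)


def is_high_risk(score):
    """Apply the reported threshold: strictly larger than 18."""
    return score > HIGH_RISK_THRESHOLD


# Hypothetical example. A total occlusion of the left upper lobar artery is
# encoded by setting its three distal segments to 2, following the rule for
# proximal emboli.
left_lung = [2, 2, 2] + [0, 0] + [1, 0, 1, 0, 0]
right_lung = [1, 1, 1] + [0, 0] + [2, 2, 2, 2, 2]
example_score = q_score(left_lung + right_lung)
```

With these hypothetical indices the score is 8 + 13 = 21, which is larger than 18 and would therefore fall in the high-risk group described in the text.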

Data Records

All data records described in this paper are available on the Figshare Repository, organized in 35 different patients (Patient01 to Patient35, Data Citation 1) and one ground-truth archive (Ground Truth, Data Citation 1). Each patient archive includes all slices of the corresponding 3D CTA image (stored in the DICOM file format) while the ground-truth archive consists of all the 3D ground-truth images of FUMPE in the MAT file format. MAT files can be simply loaded to the MATLAB programming environment by using the function load. In every ground-truth image, the foreground and background voxels were indicated by the gray-levels 1 and 0, respectively.
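For readers working outside MATLAB, the ground-truth MAT files can probably also be read in Python. The sketch below assumes that SciPy is installed, that each MAT file holds a single array variable, and that the file is not saved in the MATLAB v7.3 (HDF5) format, which scipy.io.loadmat does not read. The helper names, the variable name, and the file path are hypothetical; the actual variable names inside the FUMPE archive are not stated in this paper.

```python
def find_data_variables(mat_contents):
    """Return the user variable names in a dict produced by scipy.io.loadmat.

    loadmat adds bookkeeping keys such as '__header__', '__version__',
    and '__globals__', which are skipped here.
    """
    return [name for name in mat_contents if not name.startswith("__")]


def load_ground_truth(mat_path):
    """Load one 3D ground-truth volume as a boolean mask (True = PE voxel).

    Assumes the MAT file stores exactly one array whose foreground and
    background voxels are 1 and 0, as described in Data Records.
    """
    from scipy.io import loadmat  # third-party dependency, imported lazily

    contents = loadmat(mat_path)
    names = find_data_variables(contents)
    if len(names) != 1:
        raise ValueError(f"expected one variable, found: {names}")
    return contents[names[0]] > 0


# Stand-in for a loadmat result, with a hypothetical variable name.
fake_contents = {
    "__header__": b"",
    "__version__": "1.0",
    "__globals__": [],
    "mask_volume": [[0, 1], [1, 0]],
}
variable_names = find_data_variables(fake_contents)
```

A call such as load_ground_truth("GroundTruth/Patient16.mat") would then return the mask for one subject; that path is only a placeholder and should be replaced by the actual file name in the downloaded archive.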

Technical Validation

Each image was visually checked by an experienced CT technologist to be artifact-free and have sufficient contrast for image analysis. If the image quality was not acceptable, he repeated the image acquisition process.

Summary of the dataset

Table 1 reports the characteristics of all CTA images of the proposed dataset including the subject gender and age, DVT test, slice thickness and interval, range of the tube current, imaging direction, number of slices, number of PE regions in the main arteries, and that in the peripheral arteries. FUMPE includes the PE-CTA images of 17 male and 18 female patients (aged 24-82 years). In addition, from among all FUMPE images (with 8792 slices in total), only Patient24 and Patient32 have no PE-clots (false positives) in both the main and peripheral arteries. For example, Fig. 3 illustrates the source and ground-truth images of the 77th, 80th, 106th, 116th, 119th, 133rd, 139th, and 151st slices of Patient16. Note that here, all PE regions of the ground-truth are indicated by a semi-transparent green color over the source image for better visual inspection. Also, the size of the PE regions varied considerably, from a few to hundreds of voxels.
Figure 3

A sample CTA image of FUMPE.

Including the source (left-hand side in each pair) and ground-truth (right-hand side in the same pair) images corresponding to the (a) 77th, (b) 80th, (c) 106th, (d) 116th, (e) 119th, (f) 133rd, (g) 139th, and (h) 151st slices of Patient16.

For every ground-truth image of the dataset, we counted the number of PE regions in the main and peripheral arteries. As reported in Table 1, the proposed dataset includes 3438 PE-ROIs in total, most of which (i.e. 67%) are located in the peripheral arteries. Therefore, FUMPE is a challenging benchmark for evaluation of CAD systems. Finally, Table 2 reports the five specified CTA measurements (including RV/LV ratio, reflux into IVC, straight septum, PA diameter, and Q-score) for all FUMPE images. As further illustrated in Fig. 4, the Q-scores ranged from 0 to 31. Also, the most frequent Q-scores were 20, 7, and 3, occurring in 5, 4, and 3 subjects, respectively. Moreover, 11 patients, with Q-scores larger than 18, had high mortality/morbidity risk.
Table 2

Five different measurements reported for each CTA image of FUMPE.

Including the ratio of right ventricular to left ventricular diameter (RV/LV ratio), reflux into the inferior vena cava (IVC), straight septum, pulmonary artery (PA) diameter, and Qanadli score (Q-Score).

Case | RV/LV ratio | Reflux into IVC, Straight Septum | PA Diameter | Q-Score
Patient01 | 0.86 (=31/35) | X, X | 33 | 20
Patient02 | 1.13 (=27/24) | X | 24 | 1
Patient03 | 1.03 (=33/32) | X | 33 | 20
Patient04 | 1.33 (=24/18) | | 31 | 29
Patient05 | 0.89 (=42/47) | X, X | 28 | 20
Patient06 | 1.03 (=31/30) | X | 38 | 3
Patient07 | 2.00 (=40/20) | | 31 | 31
Patient08 | 1.04 (=25/24) | X, X | 30 | 7
Patient09 | 1.18 (=20/17) | X | 30 | 4
Patient10 | 0.78 (=21/27) | X, X | 26 | 7
Patient11 | 0.72 (=23/32) | X, X | 27 | 2
Patient12 | 0.75 (=24/32) | X, X | 28 | 6
Patient13 | 0.65 (=20/31) | | 27 | 14
Patient14 | 0.73 (=22/30) | X | 25 | 8
Patient15 | 0.87 (=27/31) | X, X | 24 | 8
Patient16 | 1.65 (=28/17) | | 28 | 19
Patient17 | 1.46 (=41/28) | | 37 | 20
Patient18 | 1.19 (=25/21) | X | 22 | 12
Patient19 | 1.47 (=22/15) | X | 24 | 16
Patient20 | 0.75 (=12/16) | X, X | 19 | 5
Patient21 | 1.56 (=67/43) | | 36 | 10
Patient22 | 0.93 (=27/29) | X, X | 28 | 3
Patient23 | 1.12 (=28/25) | | 41 | 17
Patient24 | 0.97 (=38/39) | X | 25 | 0
Patient25 | 0.40 (=19/48) | X, X | 31 | 3
Patient26 | 1.65 (=28/17) | | 32 | 27
Patient27 | 0.81 (=17/21) | X, X | 20 | 18
Patient28 | 0.59 (=23/39) | X | 25 | 7
Patient29 | 1.30 (=39/30) | | 27 | 22
Patient30 | 0.63 (=25/40) | X, X | 32 | 7
Patient31 | 1.19 (=32/27) | X | 23 | 20
Patient32 | 1.21 (=34/28) | X | 28 | 0
Patient33 | 1.33 (=20/15) | | 26 | 14
Patient34 | 1.94 (=35/18) | | 27 | 22
Patient35 | 0.91 (=29/32) | X, X | 26 | 9
Figure 4

The Q-score histogram of FUMPE.
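The summary statistics quoted for the Q-scores can be recomputed from the Q-Score column of Table 2. The short Python sketch below does so with the standard library; the list is transcribed from Table 2 in the order Patient01 to Patient35, and the variable names are illustrative.

```python
from collections import Counter

# Q-scores of Patient01..Patient35, transcribed from Table 2.
q_scores = [
    20, 1, 20, 29, 20, 3, 31, 7, 4, 7,
    2, 6, 14, 8, 8, 19, 20, 12, 16, 5,
    10, 3, 17, 0, 3, 27, 18, 7, 22, 7,
    20, 0, 14, 22, 9,
]

histogram = Counter(q_scores)               # Q-score -> number of subjects
most_frequent = histogram.most_common(3)    # three most frequent Q-scores
score_range = (min(q_scores), max(q_scores))
high_risk_count = sum(1 for q in q_scores if q > 18)
```

Under this transcription, the three most frequent scores are 20, 7, and 3 (with 5, 4, and 3 subjects), the range is 0 to 31, and 11 subjects have a Q-score larger than 18, in agreement with the figures stated in the text.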

Comparing with other PE datasets

As shown in Table 3, FUMPE is further compared with 14 PE-CTA datasets reported in[3-5,8,10,11,20-27]. All the counterpart datasets, except m+visión[11], are private (i.e. with non-public accessibility). Clearly, FUMPE includes a large number of PE-ROIs compared to the other datasets. Furthermore, it is the first public PE dataset with the Q-score evaluation, which can be used for development of automatic scoring algorithms and medical education purposes. It is also the only dataset which provides appropriate complementary prognosis information such as DVT test, RV/LV ratio, reflux into IVC, straight septum, and PA diameter.
Table 3

Comparing FUMPE with 14 different PE-CTA datasets.

PE-CTA Dataset | #Subjects | #Clots | #PE-ROIs | Public Accessibility | Available Scoring | Prognosis Information
In terms of the number of subjects (#Subjects), number of clots (#Clots), number of regions of interest with pulmonary embolism (#PE-ROIs), public accessibility, available scoring, and prognosis information.      
Masutani et al.[20]1121XXX
Pichon et al.[21]322XXX
Das et al.[22]33168XXX
Digumarthy et al.[23]39270XXX
Maizlin et al.[24]845XXX
Kiraly et al.[25]869XXX
Zhou et al.[26]14225XXX
Buhmann et al.[3]40352XXX
Wittenberg et al.[4]11938XXX
Bouma et al.[8]19116XXX
Özkan et al.[5]33450XXX
Park et al.[10]2044648XXX
Tajbakhsh et al.[27]121326XXX
m+visión[11]201055521XX
FUMPE351193438Q-scoreDVT Test, RV/LV Ratio, Reflux into IVC, Straight Septum, PA Diameter

Additional information

How to cite this article: Masoudi, M. et al. A new dataset of computed-tomography angiography images for computer-aided detection of pulmonary embolism. Sci. Data 5:180180 doi: 10.1038/sdata.2018.180 (2018). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
  20 in total

1.  Interventricular septal shift due to massive pulmonary embolism shown by CT pulmonary angiography: an old sign revisited.

Authors:  T B Oliver; J H Reid; J T Murchison
Journal:  Thorax       Date:  1998-12       Impact factor: 9.139

2.  Computerized detection of pulmonary embolism in spiral CT angiography based on volumetric image analysis.

Authors:  Yoshitaka Masutani; Heber MacMahon; Kunio Doi
Journal:  IEEE Trans Med Imaging       Date:  2002-12       Impact factor: 10.048

3.  Individually tailored contrast enhancement in CT pulmonary angiography.

Authors:  Babs M F Hendriks; Madeleine Kok; Casper Mihl; Sebastiaan C A M Bekkers; Joachim E Wildberger; Marco Das
Journal:  Br J Radiol       Date:  2016-01-22       Impact factor: 3.039

4.  Acute massive pulmonary embolism: role of the cardiac surgeon.

Authors:  Allreza Sadeghi; Gregory R Brevetti; Sanghyun Kim; Joshua H Burack; Mark H Genovese; Dale A Distant; Ramesh Kodavatiganti; Robert C Lowery
Journal:  Tex Heart Inst J       Date:  2005

Review 5.  Major pulmonary embolism: review of a pathophysiologic approach to the golden hour of hemodynamically significant pulmonary embolism.

Authors:  Kenneth E Wood
Journal:  Chest       Date:  2002-03       Impact factor: 9.410

6.  New CT index to quantify arterial obstruction in pulmonary embolism: comparison with angiographic index and echocardiography.

Authors:  S D Qanadli; M El Hajjam; A Vieillard-Baron; T Joseph; B Mesurolle; V L Oliva; O Barré; F Bruckert; O Dubourg; P Lacombe
Journal:  AJR Am J Roentgenol       Date:  2001-06       Impact factor: 3.959

7.  Prediction of moderate or severe pulmonary hypertension by main pulmonary artery diameter and main pulmonary artery diameter/ascending aorta diameter in pulmonary embolism.

Authors:  Shirin Sanal; Wilbert S Aronow; Gautham Ravipati; George P Maguire; Robert N Belkin; Stuart G Lehrman
Journal:  Cardiol Rev       Date:  2006 Sep-Oct       Impact factor: 2.644

8.  A multistage approach to improve performance of computer-aided detection of pulmonary embolisms depicted on CT images: preliminary investigation.

Authors:  Sang Cheol Park; Brian E Chapman; Bin Zheng
Journal:  IEEE Trans Biomed Eng       Date:  2010-08-05       Impact factor: 4.538

9.  Computer-aided detection of pulmonary embolism on CT angiography: initial experience.

Authors:  Zeev V Maizlin; Patrick M Vos; Myrna C Godoy; Myrna B Godoy; Peter L Cooperberg
Journal:  J Thorac Imaging       Date:  2007-11       Impact factor: 3.000

10.  Preliminary investigation of computer-aided detection of pulmonary embolism in three-dimensional computed tomography pulmonary angiography images.

Authors:  Chuan Zhou; Heang-Ping Chan; Smita Patel; Philip N Cascade; Berkman Sahiner; Lubomir M Hadjiiski; Ella A Kazerooni
Journal:  Acad Radiol       Date:  2005-06       Impact factor: 3.173
