
fastMRI+, Clinical pathology annotations for knee and brain fully sampled magnetic resonance imaging data.

Ruiyang Zhao, Burhaneddin Yaman, Yuxin Zhang, Russell Stewart, Austin Dixon, Florian Knoll, Zhengnan Huang, Yvonne W Lui, Michael S Hansen, Matthew P Lungren.

Abstract

Improving speed and image quality of Magnetic Resonance Imaging (MRI) using deep learning reconstruction is an active area of research. The fastMRI dataset contains large volumes of raw MRI data, which has enabled significant advances in this field. While the impact of the fastMRI dataset is unquestioned, the dataset currently lacks clinical expert pathology annotations, critical to addressing clinically relevant reconstruction frameworks and exploring important questions regarding rendering of specific pathology using such novel approaches. This work introduces fastMRI+, which consists of 16154 subspecialist expert bounding box annotations and 13 study-level labels for 22 different pathology categories on the fastMRI knee dataset, and 7570 subspecialist expert bounding box annotations and 643 study-level labels for 30 different pathology categories for the fastMRI brain dataset. The fastMRI+ dataset is open access and aims to support further research and advancement of medical imaging in MRI reconstruction and beyond.
© 2022. The Author(s).

Year:  2022        PMID: 35383186      PMCID: PMC8983757          DOI: 10.1038/s41597-022-01255-z

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   8.501


Background & Summary

Magnetic resonance imaging (MRI) is a widely used medical imaging modality that is critically important for a broad range of clinical diagnostic tasks, including stroke, cancer, surgical planning, and acute injuries. Machine learning (ML) techniques have demonstrated opportunities to improve the MRI diagnostic workflow, particularly in the image reconstruction task, by saving time, reducing contrast agent use, and in some cases leading to FDA-cleared solutions[1-4]. Among the many applications of machine learning being explored in medical imaging, deep learning-based MRI reconstruction is showing considerable promise and is moving towards clinical impact. ML-based MRI reconstruction approaches often require “raw” fully sampled k-space data in order to generate ground truth images. Public MRI datasets such as the Calgary-Campinas Public Dataset[5], MRNet[6], OAI[7], SKM-TEA[8], and mridata.org are available to support ML-related research. Further datasets can be found in medical image research challenges, including MC-MRREC and RealNoiseMRI. Most of these datasets provide only reconstructed MRI images or a limited amount of raw data (the SKM-TEA dataset also provides knee tissue labels and pathology detection information). Thus, large datasets of raw MRI measurements are generally not widely available. To address this need and facilitate cross-disciplinary research in accelerated MRI reconstruction using artificial intelligence, the fastMRI initiative was developed. fastMRI is a collaborative project between Facebook AI Research (FAIR), New York University (NYU) Grossman School of Medicine, and NYU Langone Health, which includes the wide release of raw MRI data and image datasets[9].
While the fastMRI data has enabled exploration of ML-driven accelerated MRI reconstruction[10,11], no clinical pathology information accompanies the imaging data. This has limited reconstruction assessment to quantitative metrics such as peak signal-to-noise ratio (pSNR) and structural similarity index measure (SSIM), leaving unanswered important questions about how various pathologies are represented in ML-based reconstructions[12]. For instance, low sensitivity and stability with respect to clinically relevant features hinder the clinical application of these methods[12-14]. In this paper, we present fastMRI+, a complementary, openly available dataset of annotations consisting of bounding box pathology labels placed by human subspecialist experts on knee and brain MRI scans from the fastMRI multi-coil dataset: specifically, 16154 bounding box annotations and 13 study-level labels for 22 different pathology categories on knee MRIs, as well as 7570 bounding box annotations and 643 study-level labels for 30 different pathology categories on brain MRIs. This new dataset is open and accessible to all for educational and research purposes, with the intent to catalyse new avenues of clinically relevant, ML-based reconstruction approaches and evaluation.

Methods

MRI image dataset

The fastMRI dataset is an open-source dataset that contains raw and DICOM data from MRI acquisitions of knees and brains and is described in detail elsewhere[9]. The images used in this study were obtained directly from the fastMRI dataset and reconstructed from fully sampled, multi-coil k-space data (both knee and brain). The fastMRI dataset was managed and anonymized as part of a study approved by the NYU School of Medicine Institutional Review Board. To create the pre-annotation images for fastMRI+, image reconstruction was performed by applying an inverse fast Fourier transform to the data of each individual coil and then combining the coil images by root sum of squares (RSS). The reconstructed images were subsequently converted to DICOM format for annotation by human expert readers (radiologists).
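The reconstruction step described above (per-coil inverse FFT followed by RSS coil combination) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' code: the array layout `(coils, height, width)`, the centred k-space convention, and the synthetic test data are all assumptions made for the example.

```python
import numpy as np

def rss_reconstruct(kspace: np.ndarray) -> np.ndarray:
    """Reconstruct a magnitude image from fully sampled multi-coil k-space.

    Assumes `kspace` is a complex array of shape (coils, height, width)
    with the zero-frequency sample at the centre of each 2D plane.
    """
    axes = (-2, -1)
    # Centred 2D inverse FFT for every coil: undo the centring, transform, re-centre.
    coil_images = np.fft.fftshift(
        np.fft.ifft2(np.fft.ifftshift(kspace, axes=axes), axes=axes, norm="ortho"),
        axes=axes,
    )
    # Root-sum-of-squares combination across the coil axis.
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=0))

# Synthetic check: build coil images from a random image and hypothetical
# coil sensitivity maps, move them to k-space, and reconstruct.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
sensitivities = rng.random((4, 8, 8)) + 0.1  # hypothetical, for illustration only
coil_truth = sensitivities * image
kspace = np.fft.fftshift(
    np.fft.fft2(np.fft.ifftshift(coil_truth, axes=(-2, -1)), axes=(-2, -1), norm="ortho"),
    axes=(-2, -1),
)
recon = rss_reconstruct(kspace)
expected = np.sqrt(np.sum(coil_truth ** 2, axis=0))
```

RSS needs no coil sensitivity estimates, which makes it a simple choice for producing reference images from fully sampled data.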

Annotations

Annotation was performed using a commercial browser-based annotation platform (MD.ai, New York, NY), which allowed adjustment of the brightness, contrast, and magnification of the images. Readers used personal computers to view and annotate the images on this platform. A subspecialist board-certified musculoskeletal radiologist with 6 years of in-practice experience annotated the knee dataset, and a subspecialist board-certified neuroradiologist with 2 years of in-practice experience annotated the brain dataset. Annotation was performed slice by slice using bounding boxes, each carrying the relevant label for a given pathology. When more than one pathology was identified in a single image slice, multiple bounding boxes were used.
All 1172 fastMRI knee MRI raw dataset studies were reconstructed and clinically annotated for fastMRI+. Each knee examination consisted of a single series (either proton density (PD) or T2-weighted) of coronal images, and bounding box labels were placed on each slice where representative pathology was identified[15,16]. Effort was made to include all of the pathology within the bounding box while limiting the normal surrounding anatomy. If the examination contained significant clinically limiting artifacts, the annotation “Artifact” was added as a study-level label. In these instances, an interpolation tool was used in which the first and last slice were each labelled and the user interface interpolated the labels on the intervening slices. If no relevant pathology was identified on an examination, no labels were provided.
A subset of 1001 of the 5847 fastMRI brain MRI raw dataset studies was selected randomly for annotation.
Each brain examination included a single axial series (either T2-weighted FLAIR, T1-weighted without contrast, or T1-weighted with contrast), and bounding box labels were placed on each image in which representative pathology or a normal anatomical variant was identified[17,18]. As in the knee examinations, effort was made to include all of the pathology within the bounding box while limiting the normal surrounding anatomy. In some cases, the pathology or normal anatomic variant displayed within a given examination was so extensive or diffuse that a study-level label was used to characterize the relevant images or the entire exam inclusive of the finding (e.g., diffuse white matter disease). In these instances, the study-level label replaced the use of a bounding box. If no relevant pathology was identified on a given examination, no labels were provided.
Several limitations of this dataset bear acknowledgement. First, while the annotators are subspecialist radiologists in practice at leading academic medical centers, there were no multiple annotators or repeated annotations with which to determine inter-rater or intra-rater reliability or to ensure consensus agreement; this should be considered when using these labels. Further work may include multiple annotations by multiple readers to refine the clinical labels applied in fastMRI+. Additionally, the fastMRI knee MRI raw dataset contained only coronally acquired series, while the brain MRI dataset contained only axially acquired series, each in a variety of pulse sequences and coils. Most knee and brain pathologies that are visible in the non-coronal and non-axial planes, respectively, are also visible in the coronal and axial planes, though they are not as well seen or as well characterized there; examples are patellofemoral cartilage in the knee and optic neuritis in the brain.
While the available series were sufficient for annotation, it is important to note that true diagnostic interpretation in MRI for the included pathologies typically demands multi-sequence and multi-planar images for clinically accurate interpretation. Moreover, only bounding boxes indicating knee and brain diseases were exported and reported in this work, which may limit the research applications of this dataset. Full segmentation of structures would be more laborious and is a potential subject of future work. Thus, the annotations provided by fastMRI+ may be incomplete. In the future, raw MRI datasets containing fully sampled multi-planar and multi-sequence data would enable optimal clinical annotation.
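The source does not state how the annotation platform interpolates labels between the first and last labelled slice. As an illustration only, the sketch below assumes straight linear interpolation of the box parameters (x, y, width, height); the function name and the example coordinates are hypothetical.

```python
def interpolate_boxes(first_slice, first_box, last_slice, last_box):
    """Linearly interpolate (x, y, width, height) boxes for every slice
    from first_slice to last_slice inclusive.

    Linear interpolation is an assumption made for this sketch; the
    behaviour of the actual annotation tool may differ.
    """
    if last_slice <= first_slice:
        raise ValueError("last_slice must be greater than first_slice")
    span = last_slice - first_slice
    boxes = {}
    for s in range(first_slice, last_slice + 1):
        t = (s - first_slice) / span  # 0.0 at the first slice, 1.0 at the last
        boxes[s] = tuple(round(a + t * (b - a)) for a, b in zip(first_box, last_box))
    return boxes

# Hypothetical boxes drawn on slices 10 and 14.
interpolated = interpolate_boxes(10, (100, 120, 40, 30), 14, (120, 100, 60, 50))
```

The end slices keep the boxes the reader drew, and each intervening slice receives a box whose position and size lie proportionally between them.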

Statistical analysis

Label distribution analysis was conducted for both the knee and brain datasets, and the resulting tables also list every label in detail. Table 1 shows the annotation count and subject count for each image-level knee label. Note that ‘Artifact’ is a study-level label for the entire study rather than a label of individual images. Table 2 shows the annotation count and subject count for each image-level brain label. Table 3 shows the subject count for each study-level brain label. Subject counts are provided to show the prevalence of a given pathology.
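The two quantities reported in the tables, annotation count (number of bounding boxes per label) and subject count (number of distinct studies containing that label), can be derived from the annotation records as in the minimal sketch below. The record layout `(filename, label)` and the toy file names are assumptions for illustration.

```python
from collections import defaultdict

def summarize_labels(annotations):
    """Return {label: (annotation_count, subject_count)}.

    `annotations` is an iterable of (filename, label) pairs, one per
    bounding box; each distinct filename is treated as one subject.
    """
    counts = defaultdict(int)
    subjects = defaultdict(set)
    for filename, label in annotations:
        counts[label] += 1
        subjects[label].add(filename)
    return {label: (counts[label], len(subjects[label])) for label in counts}

# Toy records with hypothetical file names.
records = [
    ("file_a", "Meniscus Tear"),
    ("file_a", "Meniscus Tear"),
    ("file_b", "Meniscus Tear"),
    ("file_b", "Joint Effusion"),
]
summary = summarize_labels(records)
```

Because a pathology usually spans several slices, the annotation count is typically much larger than the subject count for the same label.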
Table 1

Knee label summary.

Label                                       Annotation Count  Subject Count
Meniscus
  Meniscus Tear                             5658              663
  Displaced Meniscal Tissue                 232               56
Bones and Cartilage
  Bone-Subchondral Edema                    986               196
  Bone Lesion                               183               29
  Bone-Fracture/Contusion/Dislocation       1060              119
  Cartilage Full Thickness Loss/Defect      615               122
  Cartilage Partial Thickness Loss/Defect   2985              588
Ligaments
  ACL High Grade Sprain                     678               101
  ACL Low-Mod Grade Sprain                  765               153
  MCL High Grade Sprain                     11                4
  MCL Low-Mod Grade Sprain                  285               121
  PCL High Grade Sprain                     18                3
  PCL Low-Mod Grade Sprain                  142               40
  LCL Complex High Grade Sprain             14                3
  LCL Complex Low-Mod Grade Sprain          130               48
Other
  Joint Effusion                            1311              142
  Joint Bodies                              38                11
  Periarticular Cysts                       864               161
  Muscle Strain                             65                11
  Soft Tissue Lesion                        90                10
  Patellar Retinaculum High Grade Sprain    24                4
  Artifact*                                 /                 13

*Artifact is a study-level label.

Table 2

Brain image-level label summary.

Image Level Label                 Annotation Count  Subject Count
Absent Septum Pellucidum          3                 1
Craniectomy                       32                4
Craniotomy                        1025              99
Craniotomy with Cranioplasty      43                3
Dural Thickening                  351               30
Edema                             369               44
Encephalomalacia                  161               18
Enlarged Ventricles               300               38
Extra-Axial Mass                  104               11
Intraventricular Substance        8                 1
Likely Cysts*                     17                5
Lacunar Infarct                   113               32
Mass                              380               46
Nonspecific Lesion                757               124
Nonspecific White Matter Lesion   1826              173
Normal Variant                    73                21
Paranasal Sinus Opacification     40                8
Pineal Cyst                       2                 1
Possible Artifact                 505               52
Posttreatment Change              1262              99
Resection Cavity                  199               27

*Likely Cysts is applied to small lesions (approximately 1 cm or less in diameter) which are difficult to distinguish from parenchymal, simple parenchymal neuronal cyst, and prominent perivascular space.

Table 3

Brain study-level label summary.

Study Level Label                                   Subject Count
Global Ischemia                                     1
Small Vessel Chronic White Matter Ischemic Change   221
Motion Artifact                                     33
Possible Demyelinating Disease                      2
Colpocephaly                                        2
White Matter Disease                                2
Innumerable Bilateral Focal Brain Lesions           2
Extra-Axial Collection                              9
Normal for Age                                      371

Data Records

We created separate annotation files for the 1172 validation knee datasets and 1001 brain datasets, all based on the fastMRI source data[9]. The annotation files (knee.csv and brain.csv) can be accessed in CSV format from both the fastmri-plus Synapse repository[19] and the fastMRI-plus GitHub repository (https://github.com/microsoft/fastmri-plus). Four CSV files are included in the ‘Annotations’ folder. The file names of all radiologist-interpreted datasets are stored in knee_file_list.csv and brain_file_list.csv, respectively. The annotations are contained in knee.csv and brain.csv. In each annotation CSV file, the file names (i.e., column ‘Filename’) are aligned with the naming in the fastMRI dataset. For each annotation, the file name, slice number, bounding box information, and disease label are provided. The bounding box information consists of four parameters, x, y, width, and height, where x and y are the coordinates of the upper-left corner of the bounding box; all four are given in pixels. Study-level labels are marked as ‘Yes’ in column ‘Study Level’ on slice 0 of the corresponding subjects, with no bounding box information specified.
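Reading such a file and separating bounding box rows from study-level rows might look like the sketch below, which uses only the Python standard library. Only the column names ‘Filename’ and ‘Study Level’ are taken from the description above; the other headers (`Slice`, `X`, `Y`, `Width`, `Height`, `Label`), the ‘No’ value, and the sample rows are assumptions for illustration, so check the header row of the real knee.csv and brain.csv before reusing this.

```python
import csv
import io

# Hypothetical sample that mimics the described layout; not real data.
SAMPLE_CSV = """Filename,Slice,Study Level,X,Y,Width,Height,Label
file_a,12,No,100,150,40,30,Meniscus Tear
file_a,13,No,102,148,42,31,Meniscus Tear
file_b,0,Yes,,,,,Artifact
"""

def load_annotations(handle):
    """Split annotation rows into bounding boxes and study-level labels."""
    boxes, study_labels = [], []
    for row in csv.DictReader(handle):
        if row["Study Level"].strip().lower() == "yes":
            # Study-level rows carry no bounding box information.
            study_labels.append((row["Filename"], row["Label"]))
        else:
            boxes.append({
                "filename": row["Filename"],
                "slice": int(row["Slice"]),
                # (x, y) is the upper-left corner; all values are in pixels.
                "box": tuple(int(float(row[k])) for k in ("X", "Y", "Width", "Height")),
                "label": row["Label"],
            })
    return boxes, study_labels

box_rows, study_rows = load_annotations(io.StringIO(SAMPLE_CSV))
```

For the real files, `open("knee.csv", newline="")` would take the place of the `io.StringIO` wrapper.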

Technical Validation

A board-certified radiologist with 10 years of experience reviewed the overall quality of the MRI image dataset prior to annotation, and clinical evaluation was performed by two additional board-certified subspecialist radiologists. We cleaned and validated the raw annotation files following instructions from the MD.ai documentation (https://docs.md.ai/). Creation and publication of the fastMRI+ code repository followed standard practices for the release of open-source software. Specifically, files with annotations and the associated tools and scripts were managed with source code control, continuous integration tests, and code/data reviews.

Usage Notes

The bounding box information can be used to plot bounding boxes overlaid on the images, as shown in Fig. 1. The clinical labels, together with the bounding box coordinates, can also be converted to other formats (e.g., YOLO format[20]) in order to set up a classification or object detection problem. The open-source repository also contains an example Jupyter Notebook (‘ExampleScripts/example.ipynb’) that shows how to read the annotations and plot images with bounding boxes in Python.
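As an illustration of such a conversion, the sketch below turns an upper-left-corner pixel box into the commonly used YOLO label layout (class index followed by the box centre and size, each normalised by the image dimensions). The 320 x 320 image size, the class index, and the function names are assumptions for the example; the image size in particular must be read from the actual reconstructed image.

```python
def to_yolo(x, y, width, height, image_width, image_height):
    """Convert an upper-left-corner pixel box to normalised YOLO values.

    Returns (x_center, y_center, width, height), each in the range 0 to 1.
    """
    x_center = (x + width / 2) / image_width
    y_center = (y + height / 2) / image_height
    return (x_center, y_center, width / image_width, height / image_height)

def yolo_line(class_id, box, image_size):
    """Format one annotation as a line of a YOLO label file."""
    values = to_yolo(*box, *image_size)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)

# Hypothetical box on a hypothetical 320 x 320 image, class index 0.
line = yolo_line(0, (100, 150, 40, 30), (320, 320))
```

The mapping from label names to integer class indices is left to the user, and study-level labels have no box and so cannot be expressed in this format.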
Fig. 1

Example annotations (labels and bounding boxes) from the fastMRI+ dataset shown superimposed on both knee (a) and brain (b) reconstructed images from the fastMRI dataset.

Measurement(s):  Pathology annotations in knee and brain MRI images
Technology Type(s):  Expert delineation

1.  MR imaging of the menisci and cruciate ligaments: a systematic review.

Authors:  Edwin H G Oei; Jeroen J Nikken; Antonia C M Verstijnen; Abida Z Ginai; M G Myriam Hunink
Journal:  Radiology       Date:  2003-01-15       Impact factor: 11.105

2.  An open, multi-vendor, multi-field-strength brain MR dataset and analysis of publicly available skull stripping methods agreement.

Authors:  Roberto Souza; Oeslle Lucena; Julia Garrafa; David Gobbi; Marina Saluzzi; Simone Appenzeller; Letícia Rittner; Richard Frayne; Roberto Lotufo
Journal:  Neuroimage       Date:  2017-08-12       Impact factor: 6.556

3.  The clinical utility and diagnostic performance of magnetic resonance imaging for identification of early and advanced knee osteoarthritis: a systematic review.

Authors:  Carmen E Quatman; Carolyn M Hettrich; Laura C Schmitt; Kurt P Spindler
Journal:  Am J Sports Med       Date:  2011-07       Impact factor: 6.202

4.  fastMRI: A Publicly Available Raw k-Space and DICOM Dataset of Knee Images for Accelerated MR Image Reconstruction Using Machine Learning.

Authors:  Florian Knoll; Jure Zbontar; Anuroop Sriram; Matthew J Muckley; Mary Bruno; Aaron Defazio; Marc Parente; Krzysztof J Geras; Joe Katsnelson; Hersh Chandarana; Zizhao Zhang; Michal Drozdzalv; Adriana Romero; Michael Rabbat; Pascal Vincent; James Pinkerton; Duo Wang; Nafissa Yakubova; Erich Owens; C Lawrence Zitnick; Michael P Recht; Daniel K Sodickson; Yvonne W Lui
Journal:  Radiol Artif Intell       Date:  2020-01-29

5.  Deep-Learning Methods for Parallel Magnetic Resonance Imaging Reconstruction: A Survey of the Current Approaches, Trends, and Issues.

Authors:  Florian Knoll; Kerstin Hammernik; Chi Zhang; Steen Moeller; Thomas Pock; Daniel K Sodickson; Mehmet Akçakaya
Journal:  IEEE Signal Process Mag       Date:  2020-01-20       Impact factor: 12.551

6.  Learning a variational network for reconstruction of accelerated MRI data.

Authors:  Kerstin Hammernik; Teresa Klatzer; Erich Kobler; Michael P Recht; Daniel K Sodickson; Thomas Pock; Florian Knoll
Journal:  Magn Reson Med       Date:  2017-11-08       Impact factor: 4.668

7.  The osteoarthritis initiative: report on the design rationale for the magnetic resonance imaging protocol for the knee.

Authors:  C G Peterfy; E Schneider; M Nevitt
Journal:  Osteoarthritis Cartilage       Date:  2008-09-10       Impact factor: 6.576

8.  Optimal brain MRI protocol for new neurological complaint.

Authors:  William A Mehan; R Gilberto González; Bradley R Buchbinder; John W Chen; William A Copen; Rajiv Gupta; Joshua A Hirsch; George J Hunter; Scott Hunter; Jason M Johnson; Hillary R Kelly; Mykol Larvie; Michael H Lev; Stuart R Pomerantz; Otto Rapalino; Sandra Rincon; Javier M Romero; Pamela W Schaefer; Vinil Shah
Journal:  PLoS One       Date:  2014-10-24       Impact factor: 3.240

9.  Boosting the signal-to-noise of low-field MRI with deep learning image reconstruction.

Authors:  N Koonjoo; B Zhu; G Cody Bagnall; D Bhutto; M S Rosen
Journal:  Sci Rep       Date:  2021-04-15       Impact factor: 4.379

10.  Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet.

Authors:  Nicholas Bien; Pranav Rajpurkar; Robyn L Ball; Jeremy Irvin; Allison Park; Erik Jones; Michael Bereket; Bhavik N Patel; Kristen W Yeom; Katie Shpanskaya; Safwan Halabi; Evan Zucker; Gary Fanton; Derek F Amanatullah; Christopher F Beaulieu; Geoffrey M Riley; Russell J Stewart; Francis G Blankenberg; David B Larson; Ricky H Jones; Curtis P Langlotz; Andrew Y Ng; Matthew P Lungren
Journal:  PLoS Med       Date:  2018-11-27       Impact factor: 11.069
