Dataset on acute stroke risk stratification from CT angiographic radiomics.

Emily W Avery, Jonas Behland, Adrian Mak, Stefan P Haider, Tal Zeevi, Pina C Sanelli, Christopher G Filippi, Ajay Malhotra, Charles C Matouk, Christoph J Griessenauer, Ramin Zand, Philipp Hendrix, Vida Abedi, Guido J Falcone, Nils Petersen, Lauren H Sansing, Kevin N Sheth, Seyedmehdi Payabvash.

Abstract

With advances in high-throughput image processing technologies and the increasing availability of medical mega-data, the growing field of radiomics has opened the door to quantitative analysis of medical images for prediction of clinically relevant information. One clinical area in which radiomics has proven useful is stroke neuroimaging, where rapid treatment triage is vital for patient outcomes and automated decision assistance tools have potential for significant clinical impact. Recent research, for example, has applied radiomics features extracted from CT angiography (CTA) images and a machine learning framework to facilitate risk stratification in acute stroke. We here provide methodological guidelines and radiomics data supporting the referenced article "CT angiographic radiomics signature for risk-stratification in anterior large vessel occlusion stroke." The data were extracted from the stroke center registries at Yale New Haven Hospital between 1/1/2014 and 10/31/2020 and at Geisinger Medical Center between 1/1/2016 and 12/31/2019. They include detailed radiomics features of the anterior circulation territories on admission CTA scans of patients with large vessel occlusion stroke who underwent thrombectomy. We also provide the methodological details of the analysis framework utilized for training, optimization, validation, and external testing of the machine learning and feature selection algorithms. With the goal of advancing the feasibility and quality of radiomics-based analyses to improve patient care within and beyond the field of stroke, the provided data and methodological support can serve as a baseline for future studies applying radiomics algorithms to machine-learning frameworks, and allow for analysis and utilization of the radiomics features extracted in this study.
© 2022 The Author(s).

Keywords:  CTA; Large vessel occlusion; Machine-learning; Radiomics; Stroke; Telestroke

Year:  2022        PMID: 36060820      PMCID: PMC9428796          DOI: 10.1016/j.dib.2022.108542

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Data source location

Institution 1: Yale New Haven Hospital
City/Town/Region: New Haven, CT
Country: USA
Latitude and longitude (and GPS coordinates, if possible) for collected samples/data: 41°18′14.7″N 72°56′07.0″W

Institution 2: Geisinger Medical Center
City/Town/Region: Danville, PA
Country: USA
Latitude and longitude (and GPS coordinates, if possible) for collected samples/data: 40°58′04.0″N 76°36′17.7″W

Value of the Data

The data included in this publication enrich the body of publicly available radiomics data, a field of growing interest in biomedical imaging research. The radiomics data can be used in conjunction with the methodological guide of the related research article to serve as a point of comparison for researchers applying similar machine learning methodologies. The data can benefit researchers and clinicians interested in neuroimaging, stroke, and endovascular mechanical thrombectomy. They are also of use to researchers interested in radiomics, machine learning, and artificial intelligence. Further insights and analyses that research groups may explore from our data include: support for radiomics-based analyses, comparison of the radiomics features of this dataset with those of other datasets, and assessment of clinical and radiomics variables affecting stroke patient outcomes.

Data Description

The data files that appear in this article include:

AnalyzedData.docx: Table 1 summarizes the machine learning and feature selection methods utilized in the related research article. Tables 2A and 2B describe the clinical and demographic characteristics of patients from each study center.

Radiomics_*.csv files: These files provide the values of all extracted radiomics features for the Yale and Geisinger datasets described in the related research article. These radiomics features were extracted from the bilateral middle cerebral artery (MCA) territories of each patient's admission CTA. A complete list of the first-order and texture features used in this study is given in van Griethuysen et al. [2], and exact feature definitions are given in the Pyradiomics documentation [3]. Select first-order and texture features are also described in Supplementary Table 1 of the related research article [1]. A separate file is provided for the discharge (short-term) and 3-month (long-term) outcome cohorts of the Yale training/cross-validation (CV) dataset, the independent Yale dataset, and the external Geisinger dataset (3-month, long-term outcome cohort only). The files are titled accordingly:

Radiomics_YaleTrainingCV_ShortTermFollowUP.csv
Radiomics_YaleTrainingCV_LongTermFollowUP.csv
Radiomics_YaleIndependent_ShortTermFollowUP.csv
Radiomics_YaleIndependent_LongTermFollowUP.csv
Radiomics_Geisinger_LongTermFollowUP.csv

ClinicalData_*.csv files: These files provide the sex and age of each patient. A separate file is provided for the discharge (short-term) and 3-month (long-term) outcome cohorts of the Yale training/CV dataset, the independent Yale dataset, and the external Geisinger dataset (long-term outcome cohort only). The files are titled accordingly:

ClinicalData_YaleTrainingCV_ShortTermFollowUP.csv
ClinicalData_YaleTrainingCV_LongTermFollowUP.csv
ClinicalData_YaleIndependent_ShortTermFollowUP.csv
ClinicalData_YaleIndependent_LongTermFollowUP.csv
ClinicalData_Geisinger_LongTermFollowUP.csv
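
As an illustration of how the paired files might be read, the following minimal Python sketch builds the file names for each cohort and loads both tables. It assumes the files sit together in one local directory and are named without the space that appears after the prefix in the listing above; the helper names (cohort_paths, read_rows, load_cohort) are hypothetical, and no column names are assumed because the file headers are not described here.

```python
import csv
from pathlib import Path

# Cohort suffixes as listed in the Data Description.
COHORTS = [
    "YaleTrainingCV_ShortTermFollowUP",
    "YaleTrainingCV_LongTermFollowUP",
    "YaleIndependent_ShortTermFollowUP",
    "YaleIndependent_LongTermFollowUP",
    "Geisinger_LongTermFollowUP",
]


def cohort_paths(data_dir, cohort):
    """Return (radiomics_csv, clinical_csv) paths for one cohort.

    Assumption: file names contain no space after the prefix.
    """
    base = Path(data_dir)
    return (
        base / f"Radiomics_{cohort}.csv",
        base / f"ClinicalData_{cohort}.csv",
    )


def read_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by its header row."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


def load_cohort(data_dir, cohort):
    """Load the radiomics and clinical tables for one cohort."""
    radiomics_path, clinical_path = cohort_paths(data_dir, cohort)
    return read_rows(radiomics_path), read_rows(clinical_path)
```

Calling load_cohort("path/to/data", COHORTS[4]) would return the Geisinger long-term radiomics and clinical rows, provided the files exist at that hypothetical location.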

Experimental Design, Materials and Methods

Patient Population

The dataset consists of patients from two institutions: Yale New Haven Health (New Haven, CT, USA; n = 597) and Geisinger Health (Danville, PA, USA; n = 232). Yale subjects were identified from the Yale stroke center registry between 1/1/2014 and 10/31/2020, and Geisinger subjects were identified from the Geisinger stroke center registry between 1/1/2016 and 12/31/2019. As depicted in the related research article Supplementary Fig. 1, subjects were included if they (1) suffered an anterior circulation large vessel occlusion (LVO) stroke – including internal carotid artery (ICA) or middle cerebral artery (MCA) M1 or M2 segments, (2) had CTA source images with slice thickness ≤1 mm, (3) underwent endovascular thrombectomy (ET), and (4) had modified Rankin Scale (mRS) assessment of functional outcome recorded at discharge or at 3-month follow-up. Patients were excluded if they had (1) any simultaneous posterior circulation LVO, (2) poor quality CTA not amenable to analysis (due to motion, metal artifact, or scanner-based artifacts), or (3) missing admission clinical information.
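
The inclusion and exclusion criteria above can be read as a single eligibility predicate. The sketch below is only an illustration of that logic: the Candidate record and all of its field names are hypothetical and do not correspond to columns in the shared files or to the registries' actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Anterior circulation occlusion sites named in the inclusion criteria.
ANTERIOR_LVO_SITES = {"ICA", "M1", "M2"}


@dataclass
class Candidate:
    """Hypothetical registry record; field names are illustrative only."""
    occlusion_sites: Set[str]
    posterior_lvo: bool
    cta_slice_thickness_mm: float
    cta_quality_ok: bool
    had_thrombectomy: bool
    mrs_discharge: Optional[int]
    mrs_3_month: Optional[int]
    admission_clinical_complete: bool


def is_eligible(p: Candidate) -> bool:
    """Apply inclusion criteria (1)-(4), then exclusion criteria (1)-(3)."""
    included = (
        bool(p.occlusion_sites & ANTERIOR_LVO_SITES)          # anterior LVO
        and p.cta_slice_thickness_mm <= 1.0                   # thin-slice CTA
        and p.had_thrombectomy                                # underwent ET
        and (p.mrs_discharge is not None or p.mrs_3_month is not None)
    )
    excluded = (
        p.posterior_lvo                      # simultaneous posterior LVO
        or not p.cta_quality_ok              # motion, metal, scanner artifact
        or not p.admission_clinical_complete # missing admission information
    )
    return included and not excluded
```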

Image Processing and Radiomics Feature Extraction

We modified the Brain Extraction Tool (BET) from FSL software (http://www.fmrib.ox.ac.uk/) to perform skull-stripping of each patient's admission CTA [4]. Next, we applied FLIRT from the FSL toolbox to co-register each CTA to the Montreal Neurological Institute (MNI)-152 brain space. We used the brain stroke atlas to generate bilateral MCA territory masks in MNI-152 space [5]. The bilateral MCA territory masks were then reverse-registered to the native CTAs. Trilinear interpolation was used to resample all CTA images within the MCA territory masks to an isotropic 1 × 1 × 1 mm voxel spacing, which ensured rotational invariance of texture features [6], [7], [8]. All images were normalized by centering voxel values on the image mean and scaling by the image standard deviation. To exclude calcified plaques and any remaining skull tissue, only voxels within a 1–500 Hounsfield unit (HU) range were included in the analysis. To compensate for differences in intravenous bolus timing among CTA scans, the voxel values of each patient were normalized to the mean attenuation of the scan during the radiomics feature extraction process. We applied high- and low-pass filters in each spatial direction (“coif-1” wavelet transform [3]) and the “edge-enhancement” Laplacian of Gaussian (LoG) filter (with “sigma” settings of 2, 4, and 6 mm [3]). We then extracted one set of 1116 “first-order” and “texture-matrix” radiomics features per patient from the single volume of interest (VOI), combining the right and left MCA territories [3]. We used a custom Pyradiomics version 2.1.2 pipeline [3] to complete the steps of preprocessing, derivative image generation, and feature extraction. Supplementary Table 1 of the related research article [1] describes the first-order and texture-based features.
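
Two of the steps above lend themselves to a small worked example: the HU-range restriction with image normalization, and the bookkeeping behind the 1116 features. The NumPy sketch below is not the Pyradiomics pipeline used in the study; the function names are hypothetical, the order of masking and normalization is an assumption, and the split of features across derived images assumes that the original image is retained alongside the 8 wavelet sub-bands and 3 LoG images and that each derived image contributes the same number of features.

```python
import numpy as np

# High- or low-pass filtering along each of 3 axes gives 2**3 sub-bands.
WAVELET_SUBBANDS = 2 ** 3
LOG_SIGMAS_MM = (2.0, 4.0, 6.0)

# Assumption: the unfiltered image is kept in addition to the filtered ones.
N_DERIVED_IMAGES = 1 + WAVELET_SUBBANDS + len(LOG_SIGMAS_MM)

TOTAL_FEATURES = 1116  # reported per patient
# Assumption: every derived image yields the same number of features.
FEATURES_PER_IMAGE = TOTAL_FEATURES // N_DERIVED_IMAGES


def hu_range_mask(volume_hu, voi_mask, lo=1.0, hi=500.0):
    """Keep only VOI voxels whose attenuation lies within [lo, hi] HU."""
    return voi_mask & (volume_hu >= lo) & (volume_hu <= hi)


def zscore_normalize(volume):
    """Center on the image mean and scale by the image standard deviation."""
    mean, std = volume.mean(), volume.std()
    if std == 0:
        raise ValueError("constant image cannot be normalized")
    return (volume - mean) / std


if __name__ == "__main__":
    # Toy 1 x 2 x 2 volume in HU with a boolean VOI mask.
    volume = np.array([[[-50.0, 1.0], [250.0, 900.0]]])
    voi = np.array([[[True, True], [True, False]]])
    kept = hu_range_mask(volume, voi)
    print(int(kept.sum()), "voxels kept;", N_DERIVED_IMAGES, "derived images")
```

Under these assumptions the 1116 features divide evenly into 12 derived images, which is consistent with the filters listed in the text.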

Machine Learning Framework

Six dimensionality reduction strategies and six machine learning classifiers appropriate for application to radiomics data are listed in Table 1 of the Analyzed Data file and described in detail in the related research article supplement [1], along with their programming packages. Each of the six dimensionality reduction strategies was paired with each of the six machine learning classifiers, yielding 36 candidate models for prediction of LVO stroke patient outcome in the related research article [1]. The dimensionality reduction methods include: hierarchical clustering, maximum relevance minimum redundancy filtering, no feature selection, principal component analysis, Pearson correlation-based redundancy reduction with mutual information maximization filter, and RIDGE regularized logistic regression for feature selection. The machine learning classifiers include: elastic net regularized logistic regression, Naïve Bayes, random forest, support vector machine with radial kernel, support vector machine with sigmoid kernel, and extreme gradient boosting. The hyperparameters, their ranges, and the tuning repetition counts used for each machine learning classifier are described in Supplementary Table 2 of the related research article [1]. A detailed explanation of the machine learning training and validation methodologies can be found in the methods section of the related research article supplement [1].
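
The 36 candidate models are simply the Cartesian product of the two lists above. The short sketch below enumerates them by name only; it does not instantiate the actual algorithms or packages, which are specified in the related research article supplement [1], and the variable names are illustrative.

```python
from itertools import product

# Names taken from the text; these are labels, not fitted estimators.
FEATURE_SELECTION = [
    "hierarchical clustering",
    "maximum relevance minimum redundancy filtering",
    "no feature selection",
    "principal component analysis",
    "Pearson correlation-based redundancy reduction with mutual information maximization filter",
    "RIDGE regularized logistic regression",
]

CLASSIFIERS = [
    "elastic net regularized logistic regression",
    "Naive Bayes",
    "random forest",
    "support vector machine with radial kernel",
    "support vector machine with sigmoid kernel",
    "extreme gradient boosting",
]

# Every (feature selection, classifier) pairing is one candidate model.
CANDIDATE_MODELS = list(product(FEATURE_SELECTION, CLASSIFIERS))

if __name__ == "__main__":
    print(len(CANDIDATE_MODELS), "candidate models")
    for selector, classifier in CANDIDATE_MODELS[:3]:
        print(f"{selector} + {classifier}")
```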

Ethics Statements

Institutional Review Board approval was obtained for data collection (Yale University protocol number 2000024296), with informed consent waived at the respective institutions due to the retrospective nature of our study. All procedures followed were in accordance with institutional guidelines and the Declaration of Helsinki.

CRediT authorship contribution statement

Emily W. Avery: Conceptualization, Investigation, Data curation, Methodology, Writing – original draft, Writing – review & editing, Visualization. Jonas Behland: Investigation, Data curation. Adrian Mak: Data curation, Investigation, Writing – review & editing. Stefan P. Haider: Methodology, Software, Writing – review & editing. Tal Zeevi: Methodology, Software, Writing – review & editing. Pina C. Sanelli: Supervision, Writing – review & editing. Christopher G. Filippi: Investigation, Data curation, Writing – review & editing. Ajay Malhotra: Conceptualization, Supervision, Writing – review & editing. Charles C. Matouk: . Christoph J. Griessenauer: Investigation, Data curation, Writing – review & editing. Ramin Zand: Investigation, Data curation, Writing – review & editing. Philipp Hendrix: Investigation, Data curation, Writing – review & editing. Vida Abedi: Investigation, Data curation, Writing – review & editing. Guido J. Falcone: Investigation, Writing – review & editing. Nils Petersen: Investigation, Writing – review & editing. Lauren H. Sansing: Investigation, Writing – review & editing. Kevin N. Sheth: Investigation, Writing – review & editing. Seyedmehdi Payabvash: Conceptualization, Supervision, Data curation, Writing – original draft, Writing – review & editing.

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Dr. Christopher G. Filippi receives consulting honoraria from Syntactx, Inc; is a minority stockholder in Avicenna.ai; and receives research funding from the National Multiple Sclerosis Society. Dr. Christoph Griessenauer receives research funding from Medtronic and Penumbra and consulting honoraria from Stryker and MicroVention. Dr. Philipp Hendrix receives salary support from Medtronic, which was used to support this work. Dr. Kevin Sheth receives grant support from Novartis, Biogen, Bard, Hyperfine and Astrocyte. He also reports equity interests in Alva Health.

Subject: Medical Imaging
Specific subject area: Radiomics-based risk stratification in acute large vessel occlusion triage
Type of data: Table, Figure, Text
How the data were acquired: The data were acquired by retrospective electronic health record review at two institutions: Yale New Haven Hospital and Geisinger Medical Center. Patients in the Yale Stroke Center registry who presented between 1/1/2014–10/31/2020 and patients in the Geisinger Stroke Registry who presented between 1/1/2016–12/31/2019 were identified and included in the dataset based on clinical and imaging data availability.
Data format: Raw, Analyzed
Description of data collection: Patients were included if they: (1) suffered anterior circulation large vessel occlusion (LVO), (2) underwent mechanical thrombectomy, (3) had CTA source images with slices ≤1 mm, and (4) had modified Rankin Scale (mRS) assessment of functional outcome recorded at discharge or 3-mo follow-up. Radiomics features were extracted from the anterior circulation territory of each admission CTA using FSL and pyRadiomics software.
Data accessibility: The referenced data are included as supplemental material in the submission and are also available at our GitHub repository: https://github.com/emilywavery/Radiomics-data-sharing/tree/radiomicsdata
Related research article: Avery, E.W., Behland, J., Mak, A., Haider, S.P., Zeevi, T., Sanelli, P.C., Filippi, C.G., Petersen, N.H., Falcone, G.J., Sansing, L.H., Malhotra, A., Griessenauer, C.J., Zand, R., Hendrix, P., Abedi, V., Matouk, C.C., Sheth, K.N., Payabvash, S. CT angiographic radiomics signature for risk-stratification in anterior large vessel occlusion stroke. NeuroImage: Clinical, 2022;34:103034 [1]
1.  Validated automatic brain extraction of head CT images.

Authors:  John Muschelli; Natalie L Ullman; W Andrew Mould; Paul Vespa; Daniel F Hanley; Ciprian M Crainiceanu
Journal:  Neuroimage       Date:  2015-04-07       Impact factor: 6.556

2.  PET/CT radiomics signature of human papilloma virus association in oropharyngeal squamous cell carcinoma.

Authors:  Stefan P Haider; Amit Mahajan; Tal Zeevi; Philipp Baumeister; Christoph Reichel; Kariem Sharaf; Reza Forghani; Ahmet S Kucukkaya; Benjamin H Kann; Benjamin L Judson; Manju L Prasad; Barbara Burtness; Seyedmehdi Payabvash
Journal:  Eur J Nucl Med Mol Imaging       Date:  2020-05-12       Impact factor: 9.236

3.  Computational Radiomics System to Decode the Radiographic Phenotype.

Authors:  Joost J M van Griethuysen; Andriy Fedorov; Chintan Parmar; Ahmed Hosny; Nicole Aucoin; Vivek Narayan; Regina G H Beets-Tan; Jean-Christophe Fillion-Robin; Steve Pieper; Hugo J W L Aerts
Journal:  Cancer Res       Date:  2017-11-01       Impact factor: 12.701

4.  Potential Added Value of PET/CT Radiomics for Survival Prognostication beyond AJCC 8th Edition Staging in Oropharyngeal Squamous Cell Carcinoma.

Authors:  Stefan P Haider; Tal Zeevi; Philipp Baumeister; Christoph Reichel; Kariem Sharaf; Reza Forghani; Benjamin H Kann; Benjamin L Judson; Manju L Prasad; Barbara Burtness; Amit Mahajan; Seyedmehdi Payabvash
Journal:  Cancers (Basel)       Date:  2020-07-03       Impact factor: 6.639

5.  Stroke atlas of the brain: Voxel-wise density-based clustering of infarct lesions topographic distribution.

Authors:  Yanlu Wang; Julia M Juliano; Sook-Lei Liew; Alexander M McKinney; Seyedmehdi Payabvash
Journal:  Neuroimage Clin       Date:  2019-08-13       Impact factor: 4.881

6.  Prediction of post-radiotherapy locoregional progression in HPV-associated oropharyngeal squamous cell carcinoma using machine-learning analysis of baseline PET/CT radiomics.

Authors:  Stefan P Haider; Kariem Sharaf; Tal Zeevi; Philipp Baumeister; Christoph Reichel; Reza Forghani; Benjamin H Kann; Alexandra Petukhova; Benjamin L Judson; Manju L Prasad; Chi Liu; Barbara Burtness; Amit Mahajan; Seyedmehdi Payabvash
Journal:  Transl Oncol       Date:  2020-10-16       Impact factor: 4.243

7.  CT angiographic radiomics signature for risk stratification in anterior large vessel occlusion stroke.

Authors:  Emily W Avery; Jonas Behland; Adrian Mak; Stefan P Haider; Tal Zeevi; Pina C Sanelli; Christopher G Filippi; Ajay Malhotra; Charles C Matouk; Christoph J Griessenauer; Ramin Zand; Philipp Hendrix; Vida Abedi; Guido J Falcone; Nils Petersen; Lauren H Sansing; Kevin N Sheth; Seyedmehdi Payabvash
Journal:  Neuroimage Clin       Date:  2022-05-07       Impact factor: 4.891
