Data-driven clustering approach to identify novel phenotypes using multiple biomarkers in acute ischaemic stroke: A retrospective, multicentre cohort study.

Lingling Ding1,2,3, Ravikiran Mane4, Zhenzhou Wu4, Yong Jiang1,2,3, Xia Meng1,2, Jing Jing1,2,3, Weike Ou4, Xueyun Wang4, Yu Liu4, Jinxi Lin1,2, Xingquan Zhao1,2,3, Hao Li1,2, Yongjun Wang1,2,3, Zixiao Li1,2,3,5.   

Abstract

Background: Acute ischaemic stroke (AIS) is a highly heterogeneous disorder and warrants further investigation to stratify patients with different outcomes and treatment responses. Using a large-scale stroke registry cohort, we applied a data-driven approach to identify novel phenotypes based on multiple biomarkers.
Methods: In a nationwide, prospective, 201-hospital registry study taking place in China between August 01, 2015 and March 31, 2018, patients with AIS who were over 18 years of age and admitted to the hospital within 7 days from symptom onset were included. A total of 92 biomarkers were included in the analysis. In the derivation cohort (n=9539), an unsupervised Gaussian mixture model was applied to categorize patients into distinct phenotypes. A classifier was developed using the most important biomarkers and was applied to categorize patients into their corresponding phenotypes in a validation cohort (n=2496). The differences in biological features, clinical outcomes, and treatment response were compared across the phenotypes.
Findings: We identified four phenotypes with distinct characteristics in 9288 patients with non-cardioembolic ischaemic stroke. Phenotype 1 was associated with abnormal glucose and lipid metabolism. Phenotype 2 was characterized by inflammation and abnormal renal function. Phenotype 3 had the least laboratory abnormalities and small infarct lesions. Phenotype 4 was characterized by disturbance in homocysteine metabolism. Findings were replicated in the validation cohort. In comparison with phenotype 3, the risk of stroke recurrence (adjusted hazard ratio [aHR] 2.02, 95% confidence interval [CI] 1.04-3.94) and mortality (aHR 18.14, 95% CI 6.62-49.71) at 3 months post-stroke were highest in phenotype 2, followed by phenotype 4 and phenotype 1, after adjustment for age, gender, smoking, drinking, history of stroke, hypertension, diabetes mellitus, dyslipidemia, and coronary heart disease. The Monte Carlo simulation showed that patients with phenotype 2 could benefit from high-intensity statin therapy.
Interpretation: A data-driven approach could aid in the identification of patients at a higher risk of adverse clinical outcomes following non-cardioembolic ischaemic stroke. These phenotypes, based on different pathophysiology, can suggest individualized treatment plans.
Funding: Beijing Natural Science Foundation (grant number Z200016), Beijing Municipal Committee of Science and Technology (grant number Z201100005620010), National Natural Science Foundation of China (grant numbers 82101360, 92046016, 82171270), Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences (grant number 2019-I2M-5-029).
© 2022 The Author(s).

Keywords:  Acute ischemic stroke; Biomarkers; Clinical outcome; Machine Learning; Phenotypes

Year:  2022        PMID: 36105873      PMCID: PMC9465270          DOI: 10.1016/j.eclinm.2022.101639

Source DB:  PubMed          Journal:  EClinicalMedicine        ISSN: 2589-5370


Evidence before this study

Acute ischaemic stroke (AIS) is a highly heterogeneous disorder with a high risk of stroke recurrence, disability, and mortality. We searched PubMed using the terms (“ischaemic stroke” or “cerebrovascular disease”), (“biomarker” or “molecular” or “phenotype” or “subtype” or “subgroup”), and (“machine learning” or “artificial intelligence” or “data-driven” or “clustering” or “non-supervised” or “classify”) for articles published up to May 1, 2022 and found no study of clustering analysis based on biomarkers in patients with ischaemic stroke. Although several studies have confirmed the associations of biomarkers with pathogenesis and prognosis in patients with ischaemic stroke, most focused on a single biomarker and neglected the interaction effects of multiple biomarkers. Therefore, it is necessary to identify novel phenotypes of AIS using unsupervised clustering analysis and to further investigate their relationships with treatment responses and clinical outcomes.

Added value of this study

To the best of our knowledge, this is the first study to identify four phenotypes with specific characteristics using 92 biomarkers from a large-scale, multi-centre cohort of patients with AIS. We adopted the Gaussian mixture model and the light gradient boosted machine (LightGBM) model to identify the novel phenotypes. We showed that biological features, clinical outcomes, and treatment response varied across phenotypes. We found that phenotype 2, which was characterized by inflammation and abnormal renal function, had the highest risk of stroke recurrence, disability, and mortality, and was associated with a good response to high-intensity statin therapy. In addition, phenotype 1 (abnormal glucose and lipid metabolism) and phenotype 4 (disturbance in homocysteine metabolism) were also associated with adverse clinical outcomes.

Implications of all the available evidence

This study provides evidence of biological heterogeneity in AIS that may help gain deeper insight into the potential pathogenesis of ischaemic stroke. In addition, we provide a new risk stratification approach to support clinical decision making.

Introduction

Acute ischaemic stroke (AIS) is a highly heterogeneous disorder that is associated with considerably high morbidity, disability, and mortality. Antiplatelet and lipid-lowering drugs are recommended for the prevention of non-cardioembolic ischaemic stroke. Despite strict adherence to current guideline recommendations for the prevention of stroke recurrence, some patients remain at a high risk of recurrent stroke. This suggests a need to reassess the presumptions regarding the pathophysiology of ischaemic stroke and potential therapeutic targets. In addition, the traditional stroke subtypes based on the Trial of Org 10172 in Acute Stroke Treatment (TOAST) and Causative Classification of Stroke (CCS) criteria require a comprehensive and systematic evaluation of intracranial or extracranial arteries, as well as cardiac examination, which makes it difficult to intervene early in patients with AIS. Therefore, stratifying the heterogeneity among patients based on an ensemble of multiple biomarkers can enhance the understanding of acute ischaemic stroke and enable more personalized treatment planning. Many recent works have shown that, rather than relying on expert clinicians’ knowledge, data-driven approaches such as unsupervised machine learning can be used to discover novel phenotypes of patients in various diseases, including diabetes, sepsis, dilated cardiomyopathy, pulmonary arterial hypertension, and heart failure, which may help in understanding disease mechanisms and treatment effects. Therefore, with the availability of a large amount of biomarker data, a comprehensive, data-driven assessment of the heterogeneity using machine learning methods may provide new opportunities to understand AIS, which has not previously been done. This study aims to develop and evaluate novel phenotypes of acute non-cardioembolic ischaemic stroke based on 92 biomarkers using a large-scale multi-centre dataset. Through a machine learning-based unsupervised clustering approach, we aim to identify different phenotypes of patients that share similar pathophysiological characteristics, treatment responses, and clinical outcomes.

Methods

Study design and population

This study retrospectively analysed data from the Third China National Stroke Registry (CNSR-III), a nationwide, multi-centre, prospective, observational registry study of 15,166 patients with AIS or transient ischaemic attack (TIA) enrolled at 201 hospitals in China between August 01, 2015 and March 31, 2018. The patients participating in the CNSR-III study were over 18 years of age and were admitted to the hospital within 7 days of AIS or TIA onset. Further details about the CNSR-III study design and methodology have been described elsewhere. This study was approved by the Institutional Review Board (IRB) of Beijing Tiantan Hospital. Written informed consent was obtained from all included patients or their representatives. The data were reported in adherence to the Strengthening the Reporting of OBservational studies in Epidemiology (STROBE) reporting guidelines. From the CNSR-III dataset, patients with acute non-cardioembolic ischaemic stroke were included in this analysis. To reduce the heterogeneity of the population, patients who experienced TIA or a stroke of other determined etiology (OE) were excluded from the analysis. As we did not collect cardiac-specific biomarkers such as cardiac troponin-T (cTnT), cardiac troponin-I (cTnI) and B-type natriuretic peptide (BNP), we excluded patients diagnosed with cardioembolic stroke from the analysis. Patients who presented with cancer or infection within 2 weeks before stroke onset were also excluded. The baseline characteristics of the included and excluded patients are presented in Supplementary Table 1. A temporal split was applied to the included patients to divide them into the derivation cohort (∼75% of the data, admitted before August 2017) and the validation cohort (∼25% of the data, admitted after August 2017) (Figure 1).
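As an illustration of the temporal split described above, the following minimal Python sketch assigns records to a derivation or validation set by admission date. The exact cutoff day (August 1, 2017), the record layout, and the function name `temporal_split` are assumptions made for this example; the text only states that the split fell at August 2017.

```python
from datetime import date

# Assumed cutoff: the text only says "before/after August 2017",
# so the exact day is a placeholder.
CUTOFF = date(2017, 8, 1)

def temporal_split(patients, cutoff=CUTOFF):
    """Return (derivation, validation) lists split by admission date."""
    derivation = [p for p in patients if p["admitted"] < cutoff]
    validation = [p for p in patients if p["admitted"] >= cutoff]
    return derivation, validation

# Toy records for illustration only (not registry data).
patients = [
    {"id": 1, "admitted": date(2015, 9, 14)},
    {"id": 2, "admitted": date(2016, 12, 2)},
    {"id": 3, "admitted": date(2017, 7, 31)},
    {"id": 4, "admitted": date(2017, 8, 1)},
    {"id": 5, "admitted": date(2018, 3, 20)},
]
derivation, validation = temporal_split(patients)
```

Splitting by time rather than at random means the validation patients are all admitted later than the derivation patients, which gives a more conservative check of generalizability.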
Figure 1

Study flow chart. A. Patient selection. B. Feature selection.

Abbreviations: CNSR-III, Third China National Stroke Registry; TIA, transient ischaemic attack; AIS, acute ischaemic stroke; NIHSS, National Institutes of Health Stroke Scale; BMI, body mass index; SBP, systolic blood pressure; DBP, diastolic blood pressure; sICAS, symptomatic intracranial atherosclerotic stenosis; sECAS, symptomatic extracranial atherosclerotic stenosis; HbA1c, Hemoglobin A1c; TML, trimethyllysine; LightGBM, light gradient boosted machine.

Clinical information about the patients was collected through in-person interviews by trained research coordinators. Stroke severity was assessed within 24 hours of hospital admission using the National Institutes of Health Stroke Scale (NIHSS) score. Stroke etiology was classified into 5 major categories: large artery atherosclerosis (LAA), cardioembolism (CE), small-vessel occlusion (SVO), stroke of other determined etiology (OE) and stroke of undetermined cause (UE), according to the TOAST and CCS criteria.

Blood biomarker

Blood samples were collected on the day of hospital enrollment. All specimens were stored at -80°C until testing was performed. The measurement of blood biomarkers was performed at the central laboratory at Tiantan Hospital, Beijing, China by laboratory staff who were blinded to the patients’ characteristics and clinical outcomes. A total of 83 blood biomarkers were involved in this study, covering blood constituents (n=14), coagulation function (n=6), liver function (n=13), renal function (n=5), inflammation (n=12), electrolytes (n=5), lipid metabolism (n=15), homocysteine metabolism (n=4), glucose metabolism (n=2), and gut microbial metabolites (n=7).

Imaging data

Brain magnetic resonance imaging (MRI) and vascular assessments of the intracranial and extracranial arteries were collected from 13,012 patients in the Digital Imaging and Communications in Medicine (DICOM) format for the extraction of neuroimaging features. The patients were assessed for the presence of symptomatic intracranial atherosclerotic stenosis (sICAS) and symptomatic extracranial atherosclerotic stenosis (sECAS). sICAS and sECAS were defined as severe (50%-99%) stenosis or occlusion of clinically relevant intracranial and extracranial arteries, respectively. sICAS was judged according to the Warfarin-Aspirin Symptomatic Intracranial Disease (WASID) criteria. The North American Symptomatic Carotid Endarterectomy Trial (NASCET) criteria were adopted for the assessment of sECAS. The brain tissue damage caused by the acute ischaemic stroke, measured as the volume of the ischaemic lesions, was calculated from the diffusion-weighted imaging (DWI) and apparent diffusion coefficient (ADC) scans using a deep learning segmentation model.

Clinical outcomes

Recurrent stroke at 3, 6, and 12 months post-stroke were the primary clinical outcomes of this study. The onset of a composite vascular event (stroke, myocardial infarction, or vascular death), all-cause mortality at 3, 6, and 12 months post-stroke, and poor functional outcome (defined as a modified Rankin Scale [mRS] score of 3-6) at 3 months post-stroke were the secondary clinical outcomes. Patients were followed up via in-person interview at 3 months, and via telephone interview at 6 and 12 months, by trained interviewers using a standardized interview protocol to collect the clinical outcomes. Of the patients enrolled in this study, 141 were lost to follow-up, of whom 97 were in the derivation cohort and 44 were in the validation cohort.

Data pre-processing

All the included patients in the study were assessed for the presence of clinical features. In this study, we tried to use features that were objective and could be extracted automatically. A total of 92 biomarkers were included in the analysis (Supplementary Table 2). In the derivation cohort, we excluded 251 patients for whom multiple biomarkers were missing. Features that were missing in more than 35% of the patients were excluded from the clustering analysis (Figure 1). The missing values of the features are shown in Supplementary Figure 2. The remaining missing values were then imputed within subgroups: the patients were divided into six subgroups according to their age (≤60, 61-70 and >70) and gender (male and female), and missing values were imputed with the mode of the respective subgroup for categorical features and with the median of the respective subgroup for numerical features. To focus only on the clinically important features and remove irrelevant features from the analysis, feature selection was performed using the light gradient boosted machine (LightGBM), a gradient boosting decision tree (GBDT) algorithm. Using the data from the derivation cohort, the features were ranked according to their importance in the prediction of stroke recurrence, and the 30 most important features were selected for further clustering analysis (Figure 2A). SHapley Additive exPlanations (SHAP) values were calculated for these 30 features (Figure 2B-C). The selected features were standardized to zero mean and unit standard deviation. Supplementary Figure 3 shows a heatmap representing the correlation between the 30 biomarkers.
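The subgroup imputation and standardization steps above can be sketched as follows. This is a minimal standard-library illustration, not the study's code: the record layout, the feature name `fpg`, the helper names, and the fallback to the overall median or mode when a subgroup has no observed value are all assumptions.

```python
from statistics import mean, median, mode, pstdev

def age_band(age):
    """Age bands used for the six age-by-gender subgroups."""
    if age <= 60:
        return "<=60"
    return "61-70" if age <= 70 else ">70"

def impute_by_subgroup(rows, feature, categorical=False):
    """Fill None values with the subgroup mode (categorical) or median (numeric)."""
    summarize = mode if categorical else median
    observed = [r[feature] for r in rows if r[feature] is not None]
    groups = {}
    for r in rows:
        if r[feature] is not None:
            groups.setdefault((age_band(r["age"]), r["sex"]), []).append(r[feature])
    fill = {key: summarize(vals) for key, vals in groups.items()}
    overall = summarize(observed)  # assumed fallback for empty subgroups
    for r in rows:
        if r[feature] is None:
            r[feature] = fill.get((age_band(r["age"]), r["sex"]), overall)
    return rows

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

# Toy records for illustration only.
rows = [
    {"age": 55, "sex": "M", "fpg": 5.0},
    {"age": 58, "sex": "M", "fpg": 7.0},
    {"age": 59, "sex": "M", "fpg": None},
    {"age": 75, "sex": "F", "fpg": 6.0},
    {"age": 80, "sex": "F", "fpg": None},
]
impute_by_subgroup(rows, "fpg")
z_scores = standardize([r["fpg"] for r in rows])
```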
Figure 2

Importance ranking of features. A. Importance ranking of 89 features according to light gradient boosted machine model. B-C. SHapley Additive exPlanations (SHAP) values for 30 features. D-I. Importance of features for phenotypes according to light gradient boosted machine models.

Abbreviations: RBC, red blood cell; FPG, fasting plasma glucose; APTT, activated partial thromboplastin time; DBP, diastolic blood pressure; BMI, body mass index; ALT, alanine aminotransferase; SBP, systolic blood pressure; GGT, γ-glutamyl transpeptidase; DBIL, direct bilirubin; TBIL, total bilirubin; HDL, high density lipoprotein; MCV, mean corpuscular volume; ALP, alkaline phosphatase; LDL, low density lipoprotein; hs-CRP, high-sensitivity C-reactive protein; MMA, methylmalonic acid; LP(a), lipoprotein (a); WBC, white blood cell; CO2, carbon dioxide combining power; LDH, lactate dehydrogenase; IBIL, indirect bilirubin; MCH, mean corpuscular hemoglobin; GLB, globulin; ALB, albumin; MCHC, mean corpuscular hemoglobin concentration; TMAVA, N,N,N-trimethyl-5-aminovaleric acid; RDWCV, coefficient of variation of RBC distribution width; RDW, RBC distribution width; TBA, total bile acid; AST, aspartate aminotransferase; PLCR, platelet large cell ratio; MPV, mean platelet volume; MCP-1, monocyte chemoattractant protein-1; IL-6, interleukin-6; IL-6R, interleukin-6 receptor; TMAO, trimethylamine-N-oxide; INR, international normalized ratio; LDL-R, low density lipoprotein-receptor; PCSK9, proprotein convertase subtilisin/kexin type 9; HCY, homocysteine; YKL-40, chitinase-3-like protein 1; IL-1Ra, interleukin-1 receptor antagonist; UACR, urine albumin-to-creatinine ratio; UMA, urine microalbumin; NIHSS, National Institutes of Health Stroke Scale; sICAS, symptomatic intracranial atherosclerotic stenosis; sECAS, symptomatic extracranial atherosclerotic stenosis.


Unsupervised clustering analysis

To identify phenotypes of patients with similar clinical characteristics, an unsupervised clustering analysis was performed on the data from the derivation cohort. To extract more generalizable and robust phenotypes, an unsupervised Gaussian mixture model (GMM) clustering method was used. GMM is a probabilistic model that uses a soft clustering approach to group patients into discrete phenotypes, and it assumes that all data samples X are generated by a mixture of K multivariate Gaussian distributions. Here, each phenotype is modeled as one multivariate Gaussian component, whose mean and covariance describe the location and shape of that phenotype. In our analysis, the GMM was trained using the iterative expectation-maximization algorithm for 1000 iterations. The number of phenotypes that optimally describes the derivation cohort data was determined using the Calinski-Harabasz (CH) score and the Davies-Bouldin (DB) score. Once the phenotypes were determined, patterns of biomarkers were visualized using chord plots and an unsupervised hierarchical clustering heat map.
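To make the soft-clustering idea concrete, the sketch below runs expectation-maximization for a Gaussian mixture on one-dimensional toy data. It is a deliberately simplified illustration rather than the study's implementation, which used 30 standardized features with multivariate components. The function name `fit_gmm_1d`, the initialisation of the means, the variance floor, and the toy data are assumptions, and the sketch requires k ≥ 2.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, k=2, n_iter=100, var_floor=1e-6):
    """EM for a 1-D Gaussian mixture (k >= 2).

    Returns (weights, means, variances, responsibilities).
    """
    lo, hi = min(xs), max(xs)
    # Assumed initialisation: means spread evenly across the data range.
    means = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    centre = sum(xs) / len(xs)
    variances = [sum((x - centre) ** 2 for x in xs) / len(xs)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, var_floor)
    return weights, means, variances, resp

# Toy data with two well-separated groups (illustrative only).
xs = [-2.2, -2.1, -2.0, -1.9, -1.8, 1.8, 1.9, 2.0, 2.1, 2.2]
weights, means, variances, resp = fit_gmm_1d(xs)
# Hard labels are the component with the highest responsibility.
labels = [max(range(len(r)), key=lambda j: r[j]) for r in resp]
```

The responsibilities are what make the clustering "soft": each sample carries a probability of belonging to every component, and a hard phenotype label is obtained only at the end by taking the most probable one. Choosing the number of components with the CH and DB scores is not shown here.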

Simplified supervised patient stratification model

The unique phenotypes identified in the clustering analysis were based on 30 features. To further reduce the dependence on multiple features and to simplify the stratification of the patients, we employed a light gradient boosted machine (LightGBM) model to classify the patients into the identified phenotypes with a reduced number of features. We first identified the 10 most important features of the LightGBM model using the information gain criterion. Next, a LightGBM prediction model using these features was developed with the data in the derivation cohort. The performance of the proposed prediction model in assigning the patients to the correct phenotype was assessed in a 10-fold cross-validation analysis using the area under the receiver operating characteristic curve (AUC). Finally, the prediction model was used to stratify the patients from the validation cohort. The clinical characteristics and outcomes in the subgroups of the validation cohort were analysed to validate the generalizability of the proposed phenotypes. For each phenotype, we also drew radar plots based on the 10 key features, using the z-values of each feature.
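For a multi-class classifier such as the one described above, a per-phenotype AUC can be computed one-vs-rest and then averaged. The sketch below shows the macro-average using the rank interpretation of AUC (the probability that a randomly chosen positive scores higher than a randomly chosen negative). The predicted probabilities and labels are toy values, and the function names are assumptions for this example; the study's own evaluation code is not given in the text.

```python
def binary_auc(scores, labels):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(prob_rows, true_classes, n_classes):
    """One-vs-rest AUC per class and their unweighted (macro) mean."""
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in prob_rows]
        labels = [1 if t == c else 0 for t in true_classes]
        aucs.append(binary_auc(scores, labels))
    return aucs, sum(aucs) / n_classes

# Toy predicted probabilities for four classes (illustrative only).
prob_rows = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.7],
    [0.2, 0.1, 0.3, 0.4],
]
true_classes = [0, 0, 1, 2, 3, 3]
per_class_auc, macro_auc = macro_ovr_auc(prob_rows, true_classes, 4)
```

The macro-average weights every phenotype equally, so a small phenotype counts as much as a large one. The micro-average instead pools all patient-class pairs before computing a single AUC and is therefore dominated by the larger phenotypes.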

Monte-Carlo simulation for stratified treatment effect

We used Monte-Carlo simulations to explore how the treatment effect varied with the frequency distribution of these phenotypes. High-intensity statin treatment can provide more clinical benefit than standard statin treatment in patients with high-risk atherosclerotic cardiovascular disease. In this study, we assessed how the benefit of high-intensity statin therapy (atorvastatin 40-80 mg/day or rosuvastatin 20-40 mg/day) during hospitalization in reducing the probability of a recurrent stroke at 3 months could change as the relative distribution of the identified phenotypes was altered (Supplementary Methods).
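The general idea of such a simulation can be sketched as follows. This is only a hypothetical illustration: the actual procedure is described in the Supplementary Methods and is not reproduced here. The baseline risks are loosely rounded from the 3-month recurrence rates in Table 1, the relative risks under high-intensity statin therapy are invented placeholders, and the phenotype mixes, function name, and use of a shared random draw for both treatment arms are assumptions.

```python
import random

# Placeholder inputs. Baseline risks are rounded from Table 1; the relative
# risks are hypothetical and are NOT estimates from the study.
BASELINE_RISK = {1: 0.06, 2: 0.10, 3: 0.05, 4: 0.09}
RELATIVE_RISK = {1: 1.0, 2: 0.5, 3: 1.0, 4: 1.0}

def simulate_arr(mix, n_patients=20000, seed=0):
    """Simulated absolute risk reduction for a given phenotype mix."""
    rng = random.Random(seed)
    phenotypes = list(mix)
    draws = rng.choices(phenotypes, weights=[mix[p] for p in phenotypes],
                        k=n_patients)
    events_standard = events_intensive = 0
    for p in draws:
        u = rng.random()  # the same draw is used for both treatment arms
        events_standard += u < BASELINE_RISK[p]
        events_intensive += u < BASELINE_RISK[p] * RELATIVE_RISK[p]
    return (events_standard - events_intensive) / n_patients

# A mix close to the derivation cohort versus a mix enriched for phenotype 2.
observed_mix = {1: 0.27, 2: 0.05, 3: 0.47, 4: 0.21}
enriched_mix = {1: 0.15, 2: 0.50, 3: 0.20, 4: 0.15}
arr_observed = simulate_arr(observed_mix)
arr_enriched = simulate_arr(enriched_mix)
```

Under these placeholder inputs the simulated benefit grows as the share of phenotype 2 increases, which is the kind of dependence on phenotype distribution that the simulation is meant to expose.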

Statistical analysis

Continuous variables were expressed as means and standard deviations (SD), ranges, or medians and interquartile ranges (IQR). Categorical variables were expressed as frequencies and percentages. Univariate comparisons were done with the Kruskal-Wallis H test for continuous variables and with the chi-square test for categorical data. Spearman's correlation coefficients were calculated for associations between features and were rearranged with hierarchical clustering. Hazard ratios (HRs) and 95% confidence intervals (CIs) for stroke recurrence, composite vascular events, and all-cause mortality were estimated for every phenotype using the Cox regression model. Covariates known to be predictive of outcomes in ischaemic stroke, such as age, gender, smoking, drinking, history of stroke, hypertension, diabetes mellitus, dyslipidemia, and coronary heart disease, were adjusted for in the multivariable models. Crude and multivariable-adjusted odds ratios (ORs) and 95% CIs for poor functional outcomes at 3 months were obtained from a logistic regression model. All data were analysed with SAS version 9.4 (SAS Institute Inc, Cary, NC) or Python 3.7. The level of significance was defined as p < 0.05 (2-sided).
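As a worked example of a crude odds ratio, the sketch below computes the OR and a Wald 95% CI from a 2×2 table, which is the same estimate a logistic regression with a single binary predictor would give. The counts are hypothetical and the function name is an assumption; adjusted ORs require fitting the multivariable model and are not shown.

```python
import math

def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude OR with a Wald CI from a 2x2 table.

    a, b: poor / good outcome counts in the phenotype of interest
    c, d: poor / good outcome counts in the reference phenotype
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (not study data).
odds_ratio, lower, upper = crude_odds_ratio(a=30, b=70, c=10, d=90)
```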

Role of the funding source

The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding authors had full access to all data and final responsibility to submit for publication.

Results

A total of 12,035 patients with acute non-cardioembolic ischaemic stroke were included in this study. Among these patients, 9539 were assigned to the derivation cohort and the remaining 2496 were assigned to the validation cohort. In the derivation cohort, 251 patients for whom multiple biomarkers were missing were excluded (Figure 1 and Supplementary Figure 1). The details of all the biomarkers, along with the demographic details of all patients, are presented in Supplementary Tables 2–4.

Comparison of clinical characteristics among phenotypes

In the clustering analysis, based on the DB score and CH score, four phenotypes were found to best represent the derivation cohort data (Supplementary Figure 4). Thus, we identified four phenotypes with distinctive patterns of clinical features (Supplementary Figure 5), and the summary statistics of these phenotypes are presented in Supplementary Tables 2–3. Figure 3 and Supplementary Figure 6 show the patterns of abnormal features that were characteristic of the observed phenotypes. Post-hoc analysis of the phenotypes indicated that phenotype 1, which included 2475 (26.65%) patients in the derivation cohort, was characterized by a low level of adiponectin and abnormal lipid metabolism, with increased levels of low-density lipoprotein (LDL), triglycerides, and lipoprotein (a) (Lp[a]), and impaired fasting plasma glucose (FPG). Phenotype 2, including 507 (5.46%) patients, was characterized by circulating inflammation, manifested as increased levels of neutrophils, high-sensitivity C-reactive protein (hs-CRP), interleukin-6 (IL-6), chitinase-3-like protein 1 (YKL-40), and interleukin-1 receptor antagonist (IL-1Ra); abnormal renal function, with increased levels of creatinine, cystatin C, urine microalbumin (UMA), and urine albumin-to-creatinine ratio (UACR); and increased levels of proprotein convertase subtilisin/kexin type 9 (PCSK9) and angiopoietin-like 3 (ANGPTL3). The 4392 (47.29%) patients in phenotype 3 had minimal abnormalities in biomarkers of liver and renal function, inflammation, and glucose metabolism. Phenotype 3 had the highest HDL level and a smaller infarct volume than the other phenotypes. Also, the incidence of sICAS or sECAS (16.6%) was the lowest in phenotype 3. Phenotype 4, including 1914 (20.61%) patients, was characterized by disturbance in homocysteine metabolism, with high levels of homocysteine (HCY) and methylmalonic acid (MMA) and low levels of vitamin B12. The incidence of sICAS or sECAS (45.5%) was the highest in phenotype 4.
Medical treatments and adherence did not differ substantially among the four phenotypes (Supplementary Figure 7). However, the patients in phenotype 2 and phenotype 4 were more likely to receive reperfusion therapy than the others (Supplementary Table 3). We analysed the relationship between the newly identified phenotypes and the traditional stroke subtypes based on the TOAST and CCS criteria. The comparison of the novel phenotypes with the CCS classification showed that phenotype 2 and phenotype 4 were marked by large artery atherosclerosis (46.4% and 52.1%, respectively), and phenotype 3 was marked by small artery occlusion (37.4%). The results indicated that the observed phenotypes were significantly different from the traditional ways of stroke stratification (Figure 4, Supplementary Figure 8).
Figure 4

Comparison with traditional stroke subtypes. A. Comparison with CCS classification in the derivation cohort. B. Comparison with CCS in the validation cohort.

Abbreviations: CCS, causative classification of stroke; LAA, large artery atherosclerosis; UE, undetermined etiology; SAO, small artery occlusion.

Supervised prediction model

To further simplify the characterization of the identified phenotypes, the features in the clustering model were evaluated for their importance in clustering decisions, and these feature importance scores are presented in Figure 2D-I. Here, infarct volume, alanine aminotransferase (ALT), hs-CRP, γ-glutamyl transpeptidase (GGT), neutrophil counts, FPG, creatinine, triglyceride, methylmalonic acid (MMA), and Lp(a) were observed to be the 10 most important features. Using these 10 model-derived, routinely collected, important biomarkers, a prediction model that can classify patients into one of the four phenotypes was developed. In the 10-fold cross-validation analysis on the derivation cohort, the supervised prediction model achieved a 4-class micro-average AUC of 0.983 (95% CI 0.980-0.986) and a macro-average AUC of 0.974 (95% CI 0.969-0.979) (individual phenotype AUC: phenotype 1, 0.975, 95% CI 0.971-0.978; phenotype 2, 0.954, 95% CI 0.944-0.963; phenotype 3, 0.986, 95% CI 0.983-0.989; phenotype 4, 0.976, 95% CI 0.969-0.982) (Supplementary Figure 9). Using the same prediction model, the patients from the validation cohort were assigned to one of the four phenotypes. The phenotypes in the validation cohort were observed to have clinical characteristics similar to those in the derivation cohort (Figure 3 and Supplementary Table 4). Radar plots represent the profiles of the four phenotypes based on the 10 key features (Supplementary Figure 10).
Figure 3

Dendrogram and heat map for unsupervised hierarchical clustering. Dendrogram and heat map for unsupervised hierarchical clustering in 4 phenotypes based on all the biomarkers in the derivation cohort (A) and validation cohort (B).


Association of phenotypes with clinical outcomes

The clinical outcomes in all the identified phenotypes were analysed, and the results are presented in Figure 5, Table 1, and Supplementary Table 5. In the derivation cohort, phenotype 3 had the best clinical outcomes, with the lowest rates of stroke recurrence (5.16%), combined vascular events (5.23%), and all-cause mortality (0.38%) at 3-month follow-up. At 3-month follow-up, compared to phenotype 3, patients in phenotype 2 experienced significantly worse outcomes in terms of stroke recurrence (adjusted HR 1.89, 95% CI 1.38-2.57, p<0.0001), combined vascular events (adjusted HR 1.98, 95% CI 1.46-2.68, p<0.0001), and all-cause mortality (adjusted HR 12.92, 95% CI 6.95-24.02, p<0.0001). Also, the adjusted odds of poor functional outcome were more than 3 times higher in phenotype 2 than in phenotype 3 (adjusted OR 3.61, 95% CI 2.96-4.39, p<0.0001). The participants in phenotype 4 (vs. phenotype 3) had a significantly higher risk of all adverse clinical events, including stroke recurrence (adjusted HR 1.77, 95% CI 1.45-2.16, p<0.0001), combined vascular events (adjusted HR 1.79, 95% CI 1.47-2.18, p<0.0001), all-cause mortality (adjusted HR 4.18, 95% CI 2.32-7.55, p<0.0001), and poor functional outcome (adjusted OR 2.31, 95% CI 2.04-2.61, p<0.0001) at 3-month follow-up.
Figure 5

Clinical outcomes stratified by the identified phenotypes. Kaplan-Meier curves of time to stroke recurrence (A), combined vascular events (B), and all-cause mortality (C) within one year after stroke in derivation cohort. D. The distribution of the modified Rankin Scale (mRS) score 90 days after stroke in derivation cohort. Kaplan-Meier curves of time to stroke recurrence (E), combined vascular events (F), and all-cause mortality (G) within one year after stroke in validation cohort. H. The distribution of the mRS score 90 days after stroke in validation cohort.

Table 1

Clinical outcomes in the derivation cohort and validation cohort by phenotypes.

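| Outcome | Follow-up | Phenotype | Derivation: total | Derivation: events, n (%) | Derivation: HR (95% CI) | Derivation: P value | Derivation: adjusted HR (95% CI) | Derivation: P value | Validation: total | Validation: events, n (%) | Validation: HR (95% CI) | Validation: P value | Validation: adjusted HR (95% CI) | Validation: P value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Stroke recurrence | 3 months | Phenotype 1 | 2475 | 148 (5.97%) | 1.16 (0.94-1.43) | 0.161 | 1.10 (0.89-1.36) | 0.385 | 626 | 33 (5.27%) | 1.23 (0.79-1.91) | 0.365 | 1.22 (0.78-1.19) | 0.385 |
| Stroke recurrence | 3 months | Phenotype 2 | 507 | 49 (9.6%) | 1.93 (1.41-2.62) | <0.0001 | 1.89 (1.38-2.57) | <0.0001 | 114 | 11 (9.64%) | 2.32 (1.21-4.45) | 0.011 | 2.02 (1.04-3.94) | 0.038 |
| Stroke recurrence | 3 months | Phenotype 3 | 4392 | 227 (5.16%) | - | - | - | - | 1133 | 49 (4.32%) | - | - | - | - |
| Stroke recurrence | 3 months | Phenotype 4 | 1914 | 169 (8.82%) | 1.74 (1.43-2.12) | <0.0001 | 1.77 (1.45-2.16) | <0.0001 | 623 | 41 (6.58%) | 1.54 (1.02-2.33) | 0.041 | 1.47 (0.97-2.23) | 0.069 |
| Stroke recurrence | 6 months | Phenotype 1 | 2475 | 199 (8.04%) | 1.26 (1.05-1.51) | 0.013 | 1.21 (1.01-1.46) | 0.041 | 626 | 45 (7.18%) | 1.35 (0.92-1.98) | 0.128 | 1.37 (0.92-2.04) | 0.116 |
| Stroke recurrence | 6 months | Phenotype 2 | 507 | 61 (12.03%) | 1.95 (1.48-2.58) | <0.0001 | 1.92 (1.46-2.54) | <0.0001 | 114 | 12 (10.52%) | 2.08 (1.12-3.86) | 0.020 | 1.77 (0.94-3.33) | 0.075 |
| Stroke recurrence | 6 months | Phenotype 3 | 4392 | 282 (6.42%) | - | - | - | - | 1133 | 61 (5.38%) | - | - | - | - |
| Stroke recurrence | 6 months | Phenotype 4 | 1914 | 199 (10.39%) | 1.66 (1.38-1.99) | <0.0001 | 1.67 (1.39-2.00) | <0.0001 | 623 | 54 (8.66%) | 1.65 (1.14-2.38) | 0.0075 | 1.57 (1.08-2.26) | 0.016 |
| Stroke recurrence | 12 months | Phenotype 1 | 2475 | 253 (10.22%) | 1.25 (1.07-1.47) | 0.0059 | 1.227 (1.04-1.45) | 0.014 | 626 | 55 (8.78%) | 1.27 (0.90-1.80) | 0.168 | 1.31 (0.92-1.86) | 0.137 |
| Stroke recurrence | 12 months | Phenotype 2 | 507 | 71 (14.00%) | 1.81 (1.40-2.33) | <0.0001 | 1.77 (1.37-2.28) | <0.0001 | 114 | 14 (12.28%) | 1.93 (1.10-3.42) | 0.022 | 1.72 (0.96-3.06) | 0.066 |
| Stroke recurrence | 12 months | Phenotype 3 | 4392 | 361 (8.21%) | - | - | - | - | 1133 | 79 (6.97%) | - | - | - | - |
| Stroke recurrence | 12 months | Phenotype 4 | 1914 | 231 (12.06%) | 1.52 (1.29-1.79) | <0.0001 | 1.52 (1.29-1.80) | <0.0001 | 623 | 69 (11.07%) | 1.64 (1.19-2.27) | 0.0026 | 1.56 (1.13-2.16) | 0.0072 |
| Combined vascular events | 3 months | Phenotype 1 | 2475 | 151 (6.10%) | 1.17 (0.95-1.43) | 0.138 | 1.10 (0.89-1.36) | 0.362 | 626 | 36 (5.75%) | 1.34 (0.87-2.06) | 0.184 | 1.36 (0.87-2.11) | 0.179 |
| Combined vascular events | 3 months | Phenotype 2 | 507 | 52 (10.25%) | 2.02 (1.49-2.73) | <0.0001 | 1.98 (1.46-2.68) | <0.0001 | 114 | 11 (9.64%) | 2.32 (1.21-4.45) | 0.011 | 2.02 (1.04-3.94) | 0.038 |
| Combined vascular events | 3 months | Phenotype 3 | 4392 | 230 (5.23%) | - | - | - | - | 1133 | 49 (4.32%) | - | - | - | - |
| Combined vascular events | 3 months | Phenotype 4 | 1914 | 173 (9.03%) | 1.76 (1.44-2.14) | <0.0001 | 1.79 (1.47-2.18) | <0.0001 | 623 | 42 (6.74%) | 1.58 (1.05-2.39) | 0.029 | 1.51 (1.00-2.28) | 0.051 |
| Combined vascular events | 6 months | Phenotype 1 | 2475 | 206 (8.32%) | 1.25 (1.05-1.50) | 0.012 | 1.21 (1.01-1.45) | 0.043 | 626 | 48 (7.66%) | 1.42 (0.97-2.06) | 0.070 | 1.46 (0.99-2.15) | 0.056 |
| Combined vascular events | 6 months | Phenotype 2 | 507 | 65 (12.82%) | 2.01 (1.54-2.63) | <0.0001 | 1.98 (1.51-2.60) | <0.0001 | 114 | 13 (11.40%) | 2.23 (1.22-4.05) | 0.0088 | 1.86 (1.01-3.42) | 0.046 |
| Combined vascular events | 6 months | Phenotype 3 | 4392 | 293 (6.67%) | - | - | - | - | 1133 | 62 (5.47%) | - | - | - | - |
| Combined vascular events | 6 months | Phenotype 4 | 1914 | 204 (10.65%) | 1.64 (1.37-1.96) | <0.0001 | 1.65 (1.38-1.98) | <0.0001 | 623 | 57 (9.14%) | 1.71 (1.20-2.46) | 0.0033 | 1.62 (1.13-2.33) | 0.0085 |
| Combined vascular events | 12 months | Phenotype 1 | 2475 | 267 (10.78%) | 1.27 (1.09-1.49) | 0.0025 | 1.24 (1.06-1.46) | 0.0081 | 626 | 59 (9.42%) | 1.35 (0.97-1.89) | 0.079 | 1.40 (0.99-1.99) | 0.054 |
| Combined vascular events | 12 months | Phenotype 2 | 507 | 77 (15.018%) | 1.89 (1.48-2.42) | <0.0001 | 1.85 (1.45-2.37) | <0.0001 | 114 | 15 (13.15%) | 2.06 (1.18-3.57) | 0.010 | 1.79 (1.02-3.14) | 0.041 |
| Combined vascular events | 12 months | Phenotype 3 | 4392 | 375 (8.53%) | - | - | - | - | 1133 | 80 (7.06%) | - | - | - | - |
| Combined vascular events | 12 months | Phenotype 4 | 1914 | 238 (12.43%) | 1.51 (1.28-1.77) | <0.0001 | 1.51 (1.28-1.78) | <0.0001 | 623 | 72 (11.55%) | 1.69 (1.23-2.33) | 0.0011 | 1.60 (1.17-2.21) | 0.0038 |
| Mortality | 3 months | Phenotype 1 | 2475 | 12 (0.48%) | 1.25 (0.60-2.62) | 0.551 | 1.26 (0.59-2.70) | 0.548 | 626 | 8 (1.27%) | 2.41 (0.84-6.95) | 0.102 | 3.44 (1.17-10.09) | 0.024 |
| Mortality | 3 months | Phenotype 2 | 507 | 26 (5.12%) | 13.63 (7.39-25.11) | <0.0001 | 12.92 (6.95-24.02) | <0.0001 | 114 | 13 (11.40%) | 22.64 (8.61-59.59) | <0.0001 | 18.14 (6.62-49.71) | <0.0001 |
| Mortality | 3 months | Phenotype 3 | 4392 | 17 (0.38%) | - | - | - | - | 1133 | 6 (0.52%) | - | - | - | - |
| Mortality | 3 months | Phenotype 4 | 1914 | 32 (1.67%) | 4.35 (2.42-7.83) | <0.0001 | 4.18 (2.32-7.55) | <0.0001 | 623 | 16 (2.56%) | 4.92 (1.92-12.57) | <0.0001 | 4.65 (1.81-11.93) | 0.0014 |
| Mortality | 6 months | Phenotype 1 | 2475 | 24 (0.96%) | 1.37 (0.81-2.34) | 0.243 | 1.49 (0.86-2.57) | 0.156 | 626 | 9 (1.43%) | 1.25 (0.54-2.93) | 0.602 | 1.60 (0.67-3.81) | 0.285 |
| Mortality | 6 months | Phenotype 2 | 507 | 35 (6.90%) | 10.16 (6.27-16.48) | <0.0001 | 9.69 (5.93-15.84) | <0.0001 | 114 | 17 (14.91%) | 14.11 (6.85-29.06) | <0.0001 | 12.33 (5.73-26.51) | <0.0001 |
| Mortality | 6 months | Phenotype 3 | 4392 | 31 (0.71%) | - | - | - | - | 1133 | 13 (1.14%) | - | - | - | - |
| Mortality | 6 months | Phenotype 4 | 1914 | 50 (2.61%) | 3.75 (2.39-5.86) | <0.0001 | 3.60 (2.30-5.64) | <0.0001 | 623 | 22 (3.53%) | 3.15 (1.58-6.24) | 0.0011 | 2.96 (1.48-5.90) | 0.0021 |
| Mortality | 12 months | Phenotype 1 | 2475 | 40 (1.61%) | 1.16 (0.78-1.73) | 0.458 | 1.31 (0.87-1.97) | 0.199 | 626 | 15 (2.39%) | 1.18 (0.62-2.26) | 0.619 | 1.38 (0.71-2.68) | 0.348 |
| Mortality | 12 months | Phenotype 2 | 507 | 46 (9.07%) | 6.90 (4.70-10.11) | <0.0001 | 6.38 (4.32-9.41) | <0.0001 | 114 | 20 (17.54%) | 9.72 (5.33-17.70) | <0.0001 | 8.94 (4.76-16.77) | <0.0001 |
| Mortality | 12 months | Phenotype 3 | 4392 | 61 (1.38%) | - | - | - | - | 1133 | 23 (2.03%) | - | - | - | - |
| Mortality | 12 months | Phenotype 4 | 1914 | 70 (3.65%) | 2.68 (1.90-3.78) | <0.0001 | 2.53 (1.80-3.58) | <0.0001 | 623 | 29 (4.65%) | 2.36 (1.36-4.08) | 0.0021 | 2.16 (1.24-3.74) | 0.0062 |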

Adjusted for age, gender, smoking, drinking, history of stroke, hypertension, diabetes mellitus, dyslipidemia, and coronary heart disease.

Abbreviations: HR, hazard ratio; CI, confidence interval.
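As a rough cross-check on Table 1, crude risk ratios can be computed directly from the event counts. The sketch below is not the time-to-event model behind the reported hazard ratios: it ignores follow-up time and covariate adjustment, and uses a Wald interval on the log scale. The function name `risk_ratio` and the dictionary layout are illustrative assumptions; only the counts are taken from the table (3-month follow-up, derivation cohort, phenotype 2 vs. phenotype 3).

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Crude risk ratio of group A vs. group B with a Wald interval on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# (events, total) at 3 months in the derivation cohort, copied from Table 1.
phenotype_2 = {"stroke recurrence": (49, 507), "mortality": (26, 507)}
phenotype_3 = {"stroke recurrence": (227, 4392), "mortality": (17, 4392)}

for outcome in ("stroke recurrence", "mortality"):
    rr, lower, upper = risk_ratio(*phenotype_2[outcome], *phenotype_3[outcome])
    print(f"{outcome}: crude RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The crude ratios come out near 1.9 for stroke recurrence and near 13 for mortality, in the same range as the unadjusted hazard ratios in Table 1 (1.93 and 13.63). A large gap between a crude ratio and a reported hazard ratio would be a cue to recheck the row.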

A similar pattern was observed in the validation cohort, where patients in phenotype 3 again had the best clinical outcomes, whereas phenotype 2 had the highest risk of stroke recurrence (adjusted HR 2.02, 95% CI 1.04-3.94, p=0.038), all-cause mortality (adjusted HR 18.14, 95% CI 6.62-49.71, p<0.001), and poor functional outcome (adjusted OR 5.62, 95% CI 3.67-8.60, p<0.0001) at 3-month follow-up compared with phenotype 3. Phenotype 1 (adjusted HR 3.44, 95% CI 1.17-10.09, p=0.024) and phenotype 4 (adjusted HR 4.65, 95% CI 1.81-11.93, p=0.0014) were also associated with a significantly higher risk of all-cause mortality at 3-month follow-up (Figure 5, Table 1 and Supplementary Table 5). At one-year follow-up, patients in phenotype 2 had the highest risk of combined vascular events (adjusted HR 1.79, 95% CI 1.02-3.14, p=0.041) and all-cause mortality (adjusted HR 8.94, 95% CI 4.76-16.77, p<0.0001). Patients in phenotype 4 had a higher risk of stroke recurrence (adjusted HR 1.56, 95% CI 1.13-2.16, p=0.0072), combined vascular events (adjusted HR 1.60, 95% CI 1.17-2.21, p=0.0038), and all-cause mortality (adjusted HR 2.16, 95% CI 1.24-3.74, p=0.0062) (Figure 5, Table 1 and Supplementary Table 5).

Differential estimated therapy effects by phenotype distribution

A Monte-Carlo simulation was performed to analyse the effect of high-intensity statin therapy while varying the proportion of each phenotype; the results are presented in Figure 6. In the derivation cohort, with the baseline phenotype distribution, high-intensity statin therapy had a 0.01% chance of benefit, a 76.69% chance of producing no significant effect, and a 23.30% chance of harm for stroke recurrence at 3 months. When phenotype 2 represented the majority of the population, the chance of finding benefit increased to 6.10% and the chance of harm fell to 0.22%. A similar pattern was observed in the validation cohort: with the baseline phenotype distribution, high-intensity statin therapy had a 0.35% chance of benefit, a 96.92% chance of producing no significant effect, and a 2.73% chance of harm for stroke recurrence at 3 months. With phenotype 2 representing the majority of the population, the chance of finding benefit increased to 87.51% and the chance of harm fell to 0.00%. In contrast, changing the proportion of phenotype 1, phenotype 3, or phenotype 4 did not produce a significant benefit from high-intensity statin therapy (Figure 6, Supplementary Figure 11).
Figure 6

Monte-Carlo simulation of response to high-intensity statin therapy with different relative frequencies of phenotypes. For the derivation cohort, panel A presents the actual distribution of the data across the phenotypes and the associated results of the harm, benefit, and neutral effect analysis. Each simulation was conducted with 100000 iterations using sampling with replacement. Panels B and C present the results of the same analysis after changing the phenotype distribution: in panel B the proportion of phenotype 2 was gradually increased, whereas in panel C the proportion of phenotype 4 was gradually increased. Panels D to F present the corresponding results for the validation cohort.

Abbreviations: HBN: harm, benefit, or neutral.
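The harm, benefit, or neutral analysis summarised in Figure 6 can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each iteration resamples patients with replacement at a chosen phenotype mix, compares event rates between treated and untreated patients with a pooled two-proportion z test, and labels the iteration as benefit, harm, or neutral at |z| > 1.96. The function names, the test, the sample size, and the synthetic records are all assumptions made for illustration.

```python
import math
import random


def z_two_proportions(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z statistic for rate_a - rate_b (0.0 if undefined)."""
    if n_a == 0 or n_b == 0:
        return 0.0
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (events_a / n_a - events_b / n_b) / se


def simulate_hbn(patients, proportions, n_sample=400, n_iter=1000, z_crit=1.96, seed=0):
    """Fraction of resampled cohorts in which treatment looks harmful, beneficial, or neutral."""
    rng = random.Random(seed)
    names = list(proportions)
    weights = [proportions[name] for name in names]
    counts = {"harm": 0, "benefit": 0, "neutral": 0}
    for _ in range(n_iter):
        tally = {True: [0, 0], False: [0, 0]}  # treated? -> [events, patients]
        # Draw phenotypes at the requested mix, then one patient (with replacement) from each.
        for name in rng.choices(names, weights=weights, k=n_sample):
            treated, event = rng.choice(patients[name])
            tally[treated][0] += int(event)
            tally[treated][1] += 1
        z = z_two_proportions(*tally[True], *tally[False])
        if z < -z_crit:
            counts["benefit"] += 1  # significantly fewer events among treated patients
        elif z > z_crit:
            counts["harm"] += 1     # significantly more events among treated patients
        else:
            counts["neutral"] += 1
    return {label: count / n_iter for label, count in counts.items()}


def make_group(n_treated, events_treated, n_untreated, events_untreated):
    """Synthetic (treated, event) records; purely illustrative, not study data."""
    return ([(True, i < events_treated) for i in range(n_treated)]
            + [(False, i < events_untreated) for i in range(n_untreated)])


# Hypothetical cohort: treatment helps only in "phenotype_2".
patients = {
    "phenotype_2": make_group(200, 20, 200, 80),
    "other": make_group(200, 20, 200, 20),
}
baseline = simulate_hbn(patients, {"phenotype_2": 0.05, "other": 0.95}, n_iter=300)
enriched = simulate_hbn(patients, {"phenotype_2": 0.90, "other": 0.10}, n_iter=300)
print("baseline:", baseline)
print("enriched:", enriched)
```

In this toy setting, raising the share of the responsive phenotype raises the fraction of iterations labelled as benefit, which is the qualitative behaviour reported for phenotype 2. The percentages it prints have no bearing on the study's estimates.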


Discussion

In this multi-centre study analysing 12035 patients with acute non-cardioembolic ischaemic stroke, we proposed a novel stratification of patients into four biomarker-based phenotypes with unique clinical characteristics, possibly unique disease pathophysiology, and significantly different clinical outcomes. The proposed stratification may provide information about underlying disease mechanisms and aid in guiding the choice of post-stroke therapy. To the best of our knowledge, this is the first study that provides a novel stratification of acute non-cardioembolic ischaemic stroke patients based on 92 biomarkers, including blood constituents, coagulation function, liver function, renal function, inflammation, lipid metabolism, homocysteine metabolism, glucose metabolism, gut microbial metabolites, and neuro-imaging features, and the first that applies machine learning techniques to resolve heterogeneity in AIS using dense phenotypic data. In this study, we derived phenotypes to facilitate the early detection of patients at high risk of unfavorable clinical outcomes. These phenotypes can be identified at the time of hospital admission, and thus could aid in early treatment planning and enrollment of patients in experimental clinical trials. Furthermore, using feature importance analysis and predictive modeling, we showed that patients can be uniquely assigned to the identified phenotypes using only ten biomarkers, which are routinely acquired even in resource-limited settings. This means the proposed method can be made available in remote healthcare centres.

Phenotype 1 was most strongly characterized by abnormal values of glucose and lipid metabolism as well as clinical features associated with liver dysfunction. The results showed that patients in phenotype 1 had a low level of adiponectin. Adiponectin is recognized as a protective adipokine in insulin resistance and liver diseases. Previous studies indicate that decreased levels of adiponectin might play a key role in the development of atherosclerosis and cardiovascular diseases. Changes in gut microbiota-related metabolites, represented by increased levels of TMAO and its precursor, choline, were observed in phenotype 1. Alterations in the gut microbiota composition are known to drive activation of lipopolysaccharide, which might result in hepatic steatosis, adipose tissue macrophage infiltration, dyslipidemia, hyperglycemia, hyperinsulinemia, and obesity. In particular, TMAO has been shown to directly influence the propensity of macrophages to accumulate cholesterol and form foam cells in atherosclerotic lesions, as well as to alter cholesterol and sterol metabolism within multiple compartments including the liver and intestines. The evidence from this study may provide opportunities for the development of new diagnostic tests and therapeutic approaches for individuals classified as phenotype 1.

The participants in phenotype 2 were at the highest risk of recurrent stroke, combined vascular events, poor functional outcomes, and all-cause mortality. This phenotype was primarily characterized by elevated levels of inflammation and a high incidence of sICAS/sECAS. The serum levels of ANGPTL3, Lp(a), and PCSK9 were also increased in phenotype 2. Atherosclerosis is a chronic inflammatory process of the vascular wall that is initiated by excessive LDL-C and is mediated by activated macrophages. Hyperlipidemia elicits a profound enrichment of a pro-inflammatory subset of monocytes. These pro-inflammatory monocytes home to atherosclerotic lesions and give rise to macrophages, which form foam cells in the arterial intima and stimulate the innate immune response by expressing high levels of pro-inflammatory cytokines. Progress in understanding the basic biology of inflammation in atherosclerosis will help to identify potential novel strategies for modulating inflammation in stroke prevention. Phenotype 2 was also characterized by an abnormal renal function index. Inflammation is highly prevalent in patients with chronic kidney disease (CKD) and is consistently associated with cardiovascular events and mortality, supporting the role of kidney dysfunction in the systemic process.

Phenotype 2 had the highest level of inflammatory markers, which may explain the correlation between the new classification and outcomes. In recent years, inflammation has been increasingly recognized as an important contributor to the fate of the ischaemic brain and the survival of people after ischaemic stroke. The concentrations of various inflammatory markers such as neutrophils, high-sensitivity C-reactive protein (hs-CRP), and interleukin-6 (IL-6) could reflect a systemic stress response to injury and have been associated with a high risk of cerebrovascular events [36, 37, 38, 39]. Therefore, anti-inflammatory therapy has been proposed as a potential treatment for preventing stroke recurrence and other vascular events after ischaemic stroke or TIA [40, 41, 42]. The Colchicine Cardiovascular Outcomes Trial (COLCOT) has shown that anti-inflammatory therapy with colchicine can reduce the occurrence of vascular events. Also, recent evidence indicates that statins, in addition to their lipid-lowering properties, can have anti-inflammatory and immunomodulatory effects, and these additional effects may play a vital role in the prevention of vascular events. In this study, Monte Carlo simulation revealed that patients in phenotype 2 could benefit from high-intensity statin therapy. The Justification for the Use of Statins in Prevention: an Intervention Trial Evaluating Rosuvastatin (JUPITER) showed that rosuvastatin (20 mg daily) effectively reduced the incidence of major cardiovascular events as compared with placebo (P<0.001) in 17,802 healthy individuals without hyperlipidemia but with hs-CRP levels of >2 mg/L, and that hs-CRP and LDL concentrations were reduced by 37% and 50%, respectively. Furthermore, in vitro and in vivo experiments have shown that statins can modulate the NLRP3 inflammasome and the release of pro-inflammatory cytokines such as IL-6. The concept of statin pleiotropy has provided a window of opportunity to test and target other nonlipid-lowering signaling pathways that may affect cardiovascular disease. Future prospective intervention studies are needed to explore the therapeutic effect of interventions targeting inflammation in patients with phenotype 2.

In the present analysis, of the four identified phenotypes, the patients in phenotype 3 presented with the fewest laboratory abnormalities. Patients in phenotype 3 also had significantly smaller infarct lesions, and small artery occlusion was the prominent cause of stroke in this phenotype. Consistent with this, the patients in this group were at the lowest risk of recurrent stroke and had relatively better clinical outcomes.

The risks of all adverse clinical events were significantly higher in phenotype 4. Phenotype 4 was characterized by a low level of vitamin B12 and a high level of MMA. Vitamin B12 has shown efficacy as a nitric oxide scavenger. Accumulating evidence suggests that the Cbl-associated metabolites MMA and HCY may promote atherogenesis through their toxic effects on the vascular endothelium, which are likely mediated through oxidative stress. Besides, MMA accumulation reflects the decreased activity of a mitochondrial Cbl-dependent enzyme, which is more sensitive to oxidative damage. Further studies are needed to establish specific therapy for the patients in phenotype 4, and vitamin supplements or antioxidant therapy may prove to be beneficial to these patients.

The results presented in this study have significant implications for understanding the mechanisms of AIS. First, the phenotypes presented in this study can be used to prospectively stratify patients in future clinical research, paving the way to a personalized precision medicine approach in the management of AIS. Second, our findings advance the understanding of circulating biomarker profiles in AIS and suggest that multi-biomarker approaches can be implemented to achieve better risk stratification. More importantly, owing to the multi-biomarker approach, we were able to shed light on more complex pathophysiological pathways associated with AIS, which would not have been possible with single-biomarker analysis methods. Lastly, the phenotypes were derived from a large observational cohort and their generalizability was assessed using a large validation cohort.

Some potential limitations of the study should be noted. First, despite all efforts, the imputation of missing values may affect the results of the study. To reduce missing data, we derived a machine learning model based on 10 key biomarkers, which are basic variables from clinical practice. Second, although the identified phenotypes were found to be generalizable in the validation cohort, further research is needed to determine the utility of these novel phenotypes to optimize clinical care and trial design. Third, the output of machine learning is limited by the limitations of its input. Future studies with large biological data that enable an integrative analysis of multi-omics data (e.g., genomics, transcriptomics, metabolomics) should be conducted to uncover the complex molecular pathways leading to AIS. Fourth, imputing the mode or median by subgroup may incur bias. However, considering that ischaemic stroke is highly correlated with age and gender, we divided the patients into six subgroups according to their age and gender. Other imputation methods such as multiple imputation could also be used to generate accurate estimates of missing values. Fifth, we employed telephone interviews to collect information about cardiovascular events at 6 and 12 months after AIS, which may potentially influence the appraisal of the clinical outcomes. However, previous studies have indicated that the telephone assessment of recurrent ischaemic strokes is reliable and credible. Furthermore, the present study was based on participants from China, which may potentially limit the interethnic extrapolation of the findings. Further studies using independent cohorts and cohorts of other ethnicities are needed to generalize the study's findings.

In conclusion, using data from a nationwide cohort and machine learning methods, we identified four biomarker-based phenotypes that were correlated with specific pathophysiology and clinical outcomes in patients with acute non-cardioembolic ischaemic stroke. With a data-driven approach, this study presents a step towards a more clinically useful stratification of patients, which can play an important role in precision medicine and clinical decision-making in AIS.

Contributors

YW, ZL, and LD designed and implemented the study. RM, ZW, WO, XW, and YL performed the statistical analyses. LD wrote the manuscript. YJ, XM, JJ, JL, XZ, and HL completed the data collection and management. ZL, YW, and YL have accessed and verified the underlying data. All authors contributed to the interpretation of the data and critical revision of the manuscript. All authors approved the final version of the manuscript.

Data sharing statement

The data that support the findings of this study are available from the corresponding author (Prof. YW, yongjunwang@ncrcnd.org.cn) on reasonable request. Interested parties can apply for data access requests from the website of China National Clinical Research Center for Neurological Diseases at https://www.ncrcnd.org.cn.

Declaration of interests

All authors declare no competing interests.