Meifang Wang, Chunxia Dong, Yan Gao, Jianlan Li, Mengru Han, Lijun Wang.
Abstract
Aim: Bone marrow biopsy is essential for the diagnosis of patients with aplastic anemia (AA), myelodysplastic syndromes (MDS), and acute myeloid leukemia (AML). However, a convolutional neural network (CNN) model that automatically distinguishes AA, MDS, and AML based on bone marrow smears has not been reported.
Keywords: aplastic anemia; convolutional neural networks; identification model; myelodysplastic syndromes; myeloid leukemia
Year: 2022 PMID: 35494077 PMCID: PMC9047549 DOI: 10.3389/fonc.2022.844978
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. Sample images of aplastic anemia (AA), myelodysplastic syndromes (MDS), and acute myeloid leukemia (AML).
Figure 2. The effect of applying image transformation to the same image sample. (A) rotation; (B) ZCA whitening; (C) width shift; (D) height shift; (E) shearing; (F) zoom; (G) horizontal flip; (H) vertical flip.
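Four of the transformations listed for Figure 2 (width shift, height shift, horizontal flip, vertical flip) can be illustrated on a tiny grid of pixel values. The following is a minimal sketch in plain Python, not the authors' pipeline: this record does not say which library or parameter ranges were used, and the function names, the `fill` padding value, and the 2x3 sample image are all assumptions made for illustration. Rotation, ZCA whitening, shearing, and zoom are omitted because they need interpolation or dataset-level statistics.

```python
# Illustrative only: images are lists of rows, each row a list of pixel values.

def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def vertical_flip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def width_shift(img, dx, fill=0):
    """Shift pixels dx columns to the right (negative dx shifts left).

    Vacated columns are padded with `fill` (a hypothetical choice).
    """
    w = len(img[0])
    out = []
    for row in img:
        if dx >= 0:
            k = min(dx, w)
            out.append([fill] * k + row[: w - k])
        else:
            k = min(-dx, w)
            out.append(row[k:] + [fill] * k)
    return out

def height_shift(img, dy, fill=0):
    """Shift pixels dy rows down (negative dy shifts up), padding with `fill`."""
    h, w = len(img), len(img[0])
    if dy >= 0:
        k = min(dy, h)
        return [[fill] * w for _ in range(k)] + [row[:] for row in img[: h - k]]
    k = min(-dy, h)
    return [row[:] for row in img[k:]] + [[fill] * w for _ in range(k)]

# Hypothetical 2x3 "image" used only to show the effect of each transform.
sample = [[1, 2, 3],
          [4, 5, 6]]
flipped_h = horizontal_flip(sample)
flipped_v = vertical_flip(sample)
shifted_w = width_shift(sample, 1)
shifted_h = height_shift(sample, 1)
```

Each function returns a new image and leaves the input untouched, so several augmented variants can be derived from the same sample.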
Figure 3. The architectural details of ResNet-50.
Figure 4. The construction process of the recognition model.
Figure 5. Accuracy-Loss curves of the ResNet-50 ImageNet-pretrained model in different outcome weights and epochs. (A) 30 epochs and 5:5 outcome weight; (B) 30 epochs and 2:8 outcome weight; (C) 30 epochs and 1:9 outcome weight; (D) 50 epochs and 1:9 outcome weight; (E) 200 epochs and 1:9 outcome weight.
The performances of the MDS two-classification model with different outcome weights.
| Models | Data set | Sensitivity (95%CI) | Specificity (95%CI) | PPV (95%CI) | NPV (95%CI) | AUC (95%CI) | Accuracy (95%CI) |
|---|---|---|---|---|---|---|---|
| 30 epochs, 5:5 outcome weight | Training set | 0.984 (0.974-0.994) | 0.905 (0.888-0.921) | 0.818 (0.789-0.847) | 0.992 (0.987-0.997) | 0.982 (0.977-0.987) | 0.929 (0.917-0.940) |
| | Testing set | 0.963 (0.939-0.987) | 0.870 (0.842-0.898) | 0.763 (0.715-0.811) | 0.982 (0.970-0.994) | 0.969 (0.959-0.978) | 0.898 (0.877-0.919) |
| | Validate set | 0.852 (0.804-0.900) | 0.934 (0.911-0.957) | 0.856 (0.809-0.904) | 0.932 (0.908-0.955) | 0.931 (0.903-0.959) | 0.908 (0.886-0.930) |
| 30 epochs, 2:8 outcome weight | Training set | 0.989 (0.981-0.998) | 0.898 (0.881-0.914) | 0.808 (0.779-0.838) | 0.995 (0.991-0.999) | 0.985 (0.981-0.989) | 0.925 (0.913-0.937) |
| | Testing set | 0.959 (0.933-0.984) | 0.864 (0.836-0.893) | 0.755 (0.707-0.803) | 0.980 (0.967-0.992) | 0.979 (0.972-0.986) | 0.893 (0.781-0.914) |
| | Validate set | 0.886 (0.843-0.929) | 0.934 (0.911-0.957) | 0.861 (0.815-0.907) | 0.946 (0.925-0.967) | 0.916 (0.886-0.945) | 0.918 (0.898-0.939) |
| 30 epochs, 1:9 outcome weight | Training set | 1.000 (1.000-1.000) | 0.902 (0.886-0.918) | 0.817 (0.788-0.846) | 1.000 (1.000-1.000) | 0.989 (0.986-0.992) | 0.932 (0.920-0.943) |
| | Testing set | 0.971 (0.950-0.992) | 0.882 (0.856-0.909) | 0.783 (0.736-0.829) | 0.986 (0.975-0.996) | 0.984 (0.978-0.990) | 0.909 (0.889-0.929) |
| | Validate set | 0.867 (0.821-0.913) | 0.967 (0.950-0.983) | 0.924 (0.887-0.961) | 0.940 (0.918-0.961) | 0.965 (0.947-0.983) | 0.935 (0.916-0.954) |
CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve.
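The columns of these tables (sensitivity, specificity, PPV, NPV, accuracy) are all proportions derived from a confusion matrix. The sketch below shows how such values and a 95% interval could be computed for a two-class model. It uses the normal-approximation (Wald) interval as an assumption: this record does not state which interval method the authors used, and the counts in the example are hypothetical, not taken from the study.

```python
import math

def wald_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses the normal approximation; bounds are clipped to [0, 1].
    The interval method is an assumption, not the paper's stated method.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

def binary_metrics(tp, fp, tn, fn):
    """Diagnostic metrics for a two-class model from confusion-matrix counts."""
    return {
        "sensitivity": wald_ci(tp, tp + fn),   # true positives among actual positives
        "specificity": wald_ci(tn, tn + fp),   # true negatives among actual negatives
        "ppv": wald_ci(tp, tp + fp),           # true positives among predicted positives
        "npv": wald_ci(tn, tn + fn),           # true negatives among predicted negatives
        "accuracy": wald_ci(tp + tn, tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=10, tn=180, fn=20)
for name, (est, lo, hi) in metrics.items():
    print(f"{name}: {est:.3f} ({lo:.3f}-{hi:.3f})")
```

With this approximation, an estimate of exactly 1 yields a zero-width interval, which is worth keeping in mind when reading entries such as 1.000 (1.000-1.000); other interval methods (for example Wilson or exact binomial) would give a non-degenerate lower bound.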
The performances of the MDS two-classification model with different epochs.
| Models | Data set | Sensitivity (95%CI) | Specificity (95%CI) | PPV (95%CI) | NPV (95%CI) | AUC (95%CI) | Accuracy (95%CI) |
|---|---|---|---|---|---|---|---|
| 30 epochs, 1:9 outcome weight | Training set | 1.000 (1.000-1.000) | 0.902 (0.886-0.918) | 0.817 (0.788-0.846) | 1.000 (1.000-1.000) | 0.989 (0.986-0.992) | 0.932 (0.920-0.943) |
| | Testing set | 0.971 (0.950-0.992) | 0.882 (0.856-0.909) | 0.783 (0.736-0.829) | 0.986 (0.975-0.996) | 0.984 (0.978-0.990) | 0.909 (0.889-0.929) |
| | Validate set | 0.867 (0.821-0.913) | 0.967 (0.950-0.983) | 0.924 (0.887-0.961) | 0.940 (0.918-0.961) | 0.965 (0.947-0.983) | 0.935 (0.916-0.954) |
| 50 epochs, 1:9 outcome weight | Training set | 0.973 (0.960-0.987) | 0.894 (0.878-0.911) | 0.801 (0.771-0.831) | 0.987 (0.981-0.994) | 0.983 (0.979-0.988) | 0.918 (0.906-0.931) |
| | Testing set | 0.950 (0.923-0.978) | 0.870 (0.842-0.898) | 0.761 (0.713-0.809) | 0.976 (0.962-0.989) | 0.975 (0.967-0.983) | 0.894 (0.873-0.916) |
| | Validate set | 0.857 (0.810-0.904) | 0.867 (0.836-0.899) | 0.750 (0.695-0.805) | 0.929 (0.904-0.953) | 0.888 (0.855-0.921) | 0.864 (0.838-0.890) |
| 200 epochs, 1:9 outcome weight | Training set | 0.998 (0.995-1.000) | 0.907 (0.891-0.923) | 0.824 (0.795-0.853) | 0.999 (0.997-1.000) | 0.991 (0.988-0.993) | 0.935 (0.923-0.946) |
| | Testing set | 0.983 (0.967-1.000) | 0.875 (0.848-0.903) | 0.775 (0.728-0.821) | 0.992 (0.984-1.000) | 0.985 (0.979-0.991) | 0.908 (0.888-0.928) |
| | Validate set | 0.895 (0.854-0.937) | 0.942 (0.921-0.964) | 0.879 (0.835-0.922) | 0.951 (0.931-0.971) | 0.924 (0.894-0.953) | 0.927 (0.908-0.947) |
CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve.
The performances of the AA, MDS, and AML three-classification model with different outcome weights.
| Models | Data set | Sensitivity (95%CI) | Specificity (95%CI) | PPV (95%CI) | NPV (95%CI) | AUC (95%CI) | Accuracy (95%CI) |
|---|---|---|---|---|---|---|---|
| 30 epochs, 5:5 outcome weight | Training set | 0.884 (0.867-0.902) | 0.960 (0.952-0.968) | 0.922 (0.907-0.937) | 0.940 (0.930-0.949) | 0.970 (0.965-0.976) | 0.934 (0.926-0.942) |
| | Testing set | 0.834 (0.803-0.865) | 0.952 (0.939-0.965) | 0.902 (0.876-0.928) | 0.915 (0.898-0.931) | 0.945 (0.934-0.957) | 0.911 (0.897-0.925) |
| | Validate set | 0.858 (0.826-0.891) | 0.880 (0.858-0.901) | 0.787 (0.751-0.823) | 0.923 (0.905-0.941) | 0.911 (0.892-0.929) | 0.872 (0.854-0.890) |
| 30 epochs, 2:8 outcome weight | Training set | 0.855 (0.836-0.874) | 0.952 (0.943-0.960) | 0.905 (0.888-0.921) | 0.925 (0.914-0.935) | 0.971 (0.966-0.976) | 0.918 (0.909-0.927) |
| | Testing set | 0.807 (0.774-0.839) | 0.929 (0.913-0.944) | 0.858 (0.828-0.888) | 0.900 (0.882-0.918) | 0.945 (0.933-0.956) | 0.886 (0.870-0.902) |
| | Validate set | 0.823 (0.788-0.858) | 0.889 (0.868-0.910) | 0.793 (0.757-0.830) | 0.906 (0.887-0.926) | 0.905 (0.886-0.924) | 0.866 (0.848-0.885) |
| 30 epochs, 1:9 outcome weight | Training set | 0.890 (0.873-0.907) | 0.986 (0.981-0.990) | 0.970 (0.961-0.980) | 0.944 (0.935-0.953) | 0.976 (0.971-0.981) | 0.952 (0.945-0.959) |
| | Testing set | 0.841 (0.810-0.871) | 0.972 (0.962-0.982) | 0.941 (0.921-0.962) | 0.920 (0.903-0.936) | 0.958 (0.948-0.968) | 0.926 (0.913-0.939) |
| | Validate set | 0.852 (0.819-0.885) | 0.901 (0.882-0.921) | 0.817 (0.783-0.852) | 0.921 (0.903-0.940) | 0.925 (0.909-0.941) | 0.884 (0.867-0.902) |
CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve.
The performances of the AA, MDS, and AML three-classification model with different epochs.
| Models | Data set | Sensitivity (95%CI) | Specificity (95%CI) | PPV (95%CI) | NPV (95%CI) | AUC (95%CI) | Accuracy (95%CI) |
|---|---|---|---|---|---|---|---|
| 30 epochs, 1:9 outcome weight | Training set | 0.890 (0.873-0.907) | 0.986 (0.981-0.990) | 0.970 (0.961-0.980) | 0.944 (0.935-0.953) | 0.976 (0.971-0.981) | 0.952 (0.945-0.959) |
| | Testing set | 0.841 (0.810-0.871) | 0.972 (0.962-0.982) | 0.941 (0.921-0.962) | 0.920 (0.903-0.936) | 0.958 (0.948-0.968) | 0.926 (0.913-0.939) |
| | Validate set | 0.852 (0.819-0.885) | 0.901 (0.882-0.921) | 0.817 (0.783-0.852) | 0.921 (0.903-0.940) | 0.925 (0.909-0.941) | 0.884 (0.867-0.902) |
| 50 epochs, 1:9 outcome weight | Training set | 0.892 (0.875-0.909) | 0.963 (0.956-0.971) | 0.928 (0.914-0.942) | 0.944 (0.934-0.953) | 0.975 (0.970-0.979) | 0.938 (0.931-0.946) |
| | Testing set | 0.850 (0.820-0.880) | 0.958 (0.946-0.971) | 0.916 (0.892-0.940) | 0.923 (0.907-0.939) | 0.951 (0.939-0.963) | 0.921 (0.907-0.934) |
| | Validate set | 0.834 (0.800-0.868) | 0.912 (0.893-0.931) | 0.830 (0.796-0.865) | 0.914 (0.895-0.932) | 0.894 (0.873-0.916) | 0.885 (0.868-0.902) |
| 200 epochs, 1:9 outcome weight | Training set | 0.901 (0.884-0.917) | 0.983 (0.977-0.988) | 0.965 (0.955-0.975) | 0.949 (0.940-0.957) | 0.983 (0.979-0.987) | 0.954 (0.947-0.961) |
| | Testing set | 0.857 (0.828-0.886) | 0.967 (0.956-0.978) | 0.933 (0.911-0.955) | 0.927 (0.911-0.942) | 0.968 (0.960-0.976) | 0.929 (0.916-0.941) |
| | Validate set | 0.887 (0.858-0.916) | 0.929 (0.912-0.946) | 0.866 (0.835-0.897) | 0.941 (0.925-0.957) | 0.948 (0.935-0.961) | 0.915 (0.900-0.930) |
CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve.
The performances of the final two-classification and the three-classification models.
| Models | Data set | Sensitivity (95%CI) | Specificity (95%CI) | PPV (95%CI) | NPV (95%CI) | AUC (95%CI) | Accuracy (95%CI) |
|---|---|---|---|---|---|---|---|
| Two-classification (200 epochs, 1:9) | Training set | 0.996 (0.992-1.000) | 0.911 (0.895-0.926) | 0.830 (0.802-0.858) | 0.998 (0.996-1.000) | 0.991 (0.988-0.993) | 0.937 (0.926-0.948) |
| | Testing set | 0.992 (0.980-1.000) | 0.881 (0.854-0.908) | 0.784 (0.737-0.830) | 0.996 (0.990-1.000) | 0.985 (0.979-0.991) | 0.914 (0.895-0.934) |
| | Validate set | 0.886 (0.843-0.929) | 0.938 (0.916-0.960) | 0.869 (0.824-0.914) | 0.946 (0.926-0.967) | 0.942 (0.918-0.967) | 0.921 (0.901-0.942) |
| Three-classification (200 epochs, 1:9) | Training set | 0.901 (0.884-0.917) | 0.983 (0.977-0.988) | 0.965 (0.955-0.975) | 0.949 (0.940-0.957) | 0.983 (0.979-0.987) | 0.954 (0.947-0.961) |
| | Testing set | 0.857 (0.828-0.886) | 0.967 (0.956-0.978) | 0.933 (0.911-0.955) | 0.927 (0.911-0.942) | 0.968 (0.960-0.976) | 0.929 (0.916-0.941) |
| | Validate set | 0.887 (0.858-0.916) | 0.929 (0.912-0.946) | 0.866 (0.835-0.897) | 0.941 (0.925-0.957) | 0.948 (0.935-0.961) | 0.915 (0.900-0.930) |
CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve.
Figure 6. The receiver operating characteristic (ROC) curves of the final recognition models. (A) ROC curves of the two-classification model; (B) ROC curves of the three-classification model.
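The AUC values reported alongside these ROC curves can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition for a two-class problem; the labels and scores are hypothetical, and this record does not describe how the authors computed AUC or how the three-class ROC curves were constructed (a one-vs-rest treatment per class is one common option, assumed here only as context).

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = positive class) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based formulations give the same value more efficiently on large data sets.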