Jia Guo, Tuotuo Gong, Beina Hui, Xu Zhao, Jing Li.
Abstract
The rapid development of molecular biology and gene chip technology has produced a large amount of gene expression profile data. This article screens tumor-related genes of gallbladder cancer using an AR-based tumor expression profile gene chip. The chip data were first converted into an expression matrix suitable for analysis, and all data were then standardized and normalized. ReliefF, GA, and IReliefF-GA were run on the data set, the size of each feature subset was recorded, and tenfold cross-validation was used to obtain the classification accuracy, specificity, and sensitivity of each method on the classifier. The target genes used in the chip were amplified by PCR with the universal primers used in cDNA library construction, and PCR quality was monitored by agarose gel electrophoresis. The gallbladder cancer gene chip data were processed for missing values, singular values, and so forth, yielding 22294 transcripts. After statistical testing, 9483 transcripts showed statistically significant differences. The results show that as the number of clusters increases, the network can be better reconstructed through decomposition modeling.
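The standardization and tenfold cross-validation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the nearest-centroid classifier is a hypothetical stand-in for whatever classifier the study used, the feature selection step (ReliefF, GA, IReliefF-GA) is not implemented, the folds are not stratified, and the toy matrix is synthetic. All function and variable names are invented for the example.

```python
import random
import statistics


def standardize_genes(matrix):
    """Z-score each gene (column) of a samples-by-genes expression matrix."""
    scaled_columns = []
    for column in zip(*matrix):
        mean = statistics.fmean(column)
        sd = statistics.pstdev(column)
        # A gene with zero variance carries no signal; map it to zeros.
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]


def fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle the sample indices and deal them into n_folds test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def nearest_centroid(train_x, train_y, sample):
    """Stand-in classifier (an assumption): label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [statistics.fmean(col) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cross_validate(matrix, labels, n_folds=10, seed=0):
    """Pool predictions over all folds; label 1 = tumor, label 0 = non-tumor."""
    # All data are standardized before splitting, as the abstract describes.
    x = standardize_genes(matrix)
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for test_idx in fold_indices(len(x), n_folds, seed):
        held_out = set(test_idx)
        train_x = [x[i] for i in range(len(x)) if i not in held_out]
        train_y = [labels[i] for i in range(len(x)) if i not in held_out]
        for i in test_idx:
            predicted = nearest_centroid(train_x, train_y, x[i])
            if labels[i] == 1:
                counts["tp" if predicted == 1 else "fn"] += 1
            else:
                counts["tn" if predicted == 0 else "fp"] += 1
    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / len(x),
        "sensitivity": counts["tp"] / positives if positives else float("nan"),
        "specificity": counts["tn"] / negatives if negatives else float("nan"),
    }


# Synthetic toy data (not from the study): 10 tumor and 10 non-tumor samples,
# 3 genes each; only the first gene separates the two classes.
tumor = [[10 + i % 3, 5, 1 + i % 2] for i in range(10)]
normal = [[2 + i % 3, 5, 1 + i % 2] for i in range(10)]
toy_matrix = tumor + normal
toy_labels = [1] * 10 + [0] * 10
toy_metrics = cross_validate(toy_matrix, toy_labels)
```

In a fuller version, the feature subset would be chosen inside each training fold before the classifier is fitted, so that the held-out samples do not influence which genes are selected.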
Year: 2022 PMID: 36237581 PMCID: PMC9529521 DOI: 10.1155/2022/8579279
Source DB: PubMed Journal: Contrast Media Mol Imaging ISSN: 1555-4309 Impact factor: 3.009
Clinicopathological characteristics of gallbladder cancer patients detected by gene chip.
| No. | Sex | Age | Tumor (T) | Lymph node (N) | Differentiation |
|---|---|---|---|---|---|
| 1 | F | 54 | T3 | N1 | MD |
| 2 | M | 68 | T2 | N2 | WD |
| 3 | M | 50 | T3 | N1 | MD |
| 4 | M | 73 | T4 | N3 | PD |
| 5 | F | 62 | T3 | N0 | PD |
| 6 | M | 45 | T2 | N0 | PD |
Figure 1. Curve of mean value of NRMSE.
The number of feature genes and classification performance of various algorithms on different data sets.
| Algorithm | Lung: Number | Lung: Rate | Colon: Number | Colon: Rate | Leukemia: Number | Leukemia: Rate | Prostate: Number | Prostate: Rate |
|---|---|---|---|---|---|---|---|---|
| ODP | 2880 | 84.62 | 2000 | 81.10 | 7129 | 94.44 | 12600 | 61.90 |
| SNRS | 6 | 85.44 | 15 | 82.26 | 4 | 97.36 | 5 | 91.18 |
| RF | 2880 | 86.37 | 2000 | 84.75 | 7129 | 90.18 | 12600 | 92.54 |
| SVM | 16 | 86.36 | 15 | 84.40 | 10 | 94.10 | 10 | 90.34 |
| SNRRF | 10 | 89.89 | 72 | 87.48 | 26 | 94.77 | 49 | 93.14 |
Comparison of serum CA242, CA125, CA199, and CEA detection levels.
| Group | MFI (CEA) | MFI (CA199) | MFI (CA125) | MFI (CA242) |
|---|---|---|---|---|
| Healthy control group | 3.93 ± 2.04 | 14.97 ± 8.91 | 10.48 ± 6.38 | 9.48 ± 3.43 |
| Benign gallbladder disease group | 3.83 ± 1.85 | 15.17 ± 7.82 | 12.99 ± 6.99 | 10.19 ± 3.08 |
| Gallbladder cancer group | 9.36 ± 3.58 | 238.17 ± 346.36 | 55.34 ± 81.78 | 39.92 ± 45.9 |
Figure 2. Classification accuracy of different algorithms.
Figure 3. Selection results of different genes.
Classification performance of core genes in the validation data set.
| Classification | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUC |
|---|---|---|---|---|
| Random forest | 86.7 | 82.7 | 86.8 | 0.892 |
| Random forest | 88.6 | 85.5 | 87.6 | 0.925 |
| SVM | 87.8 | 87.6 | 87.5 | 0.923 |
| Decision tree | 89.5 | 90.9 | 89.4 | 0.921 |
| Random forest | 93.5 | 91.5 | 90.6 | 0.957 |
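The AUC column above can be read as the probability that a classifier scores a randomly chosen positive (tumor) sample above a randomly chosen negative one, with ties counted as one half. The sketch below computes AUC directly from that definition. The function name and the example scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: classifier outputs, higher meaning "more likely positive".
    labels: 1 for positive (tumor), 0 for negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: one of the four positive/negative pairs is misordered.
example_auc = auc_from_scores([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

The pairwise loop is quadratic in the number of samples, which is fine for small validation sets. A rank-based formulation gives the same value more efficiently on larger ones.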
The effect of NDRG2 expression on the proliferation of GBC-SD cells.
| Day | GBC-SD | GBC-SD-Ve | GBC-SD-NDRG2 | |
|---|---|---|---|---|
| 1 | 100 | 100.42 ± 10.73 | 92.51 ± 6.23 | 12.66 |
| 2 | 100 | 99.18 ± 12.54 | 84.16 ± 7.42 | 23.78 |
| 3 | 100 | 101.36 ± 8.41 | 73.88 ± 10.64 | 122.53 |
| 4 | 100 | 102.58 ± 14.59 | 57.51 ± 9.87 | 131.66 |
| 5 | 100 | 100.88 ± 13.62 | 46.19 ± 3.92 | 266.51 |
| 6 | 100 | 98.59 ± 9.27 | 32.25 ± 7.65 | 639.16 |
Figure 4. The effect of NDRG2 expression on the proliferation of GBC-SD cells.
Figure 5. The accuracy of the four feature selection methods for five data sets.
Figure 6. The average LOESS curve of the absolute value of M for all pairwise comparisons of nonchecked transcripts.
Figure 7. Performance comparison results of different projection methods when training samples are reduced.
The prediction accuracy of the four methods under KNN and SVM classifiers.
| Data set | KNN: PCA | KNN: 2DPCA | KNN: LDA | KNN: 2DLDA | SVM: PCA | SVM: 2DPCA | SVM: LDA | SVM: 2DLDA |
|---|---|---|---|---|---|---|---|---|
| ALL | 86.79 | 94.38 | 94.91 | 95.32 | 87.87 | 93.90 | 94.89 | 95.08 |
| DLBCL | 55.25 | 60.54 | 63.88 | 66.28 | 59.17 | 63.76 | 63.68 | 69.38 |
| Lung | 90.79 | 93.20 | 90.29 | 93.25 | 92.23 | 94.12 | 90.12 | 94.15 |
| Novartis | 89.06 | 94.35 | 93.27 | 96.38 | 89.28 | 94.95 | 90.68 | 96.11 |