Xinliang Yu, Huiqiong Yang, Xianwei Huang.
Abstract
Prostate cancer (PCa) is one of the most common malignancies in men and seriously threatens men's health. Developing aptamer probes for PCa cells is of great significance for early diagnosis and treatment of PCa. This paper reports a classification model for SELEX-based aptamers, which were obtained with PCa cell line PCa-3M-1E8 (highly metastatic tumor cells) as target cells and PCa cell line PCa-3M-2B4 (low metastatic tumor cells) as control cells. On the basis of the SELEX principle, 100 oligonucleotide sequences from the 3rd round of SELEX were defined as low affinity and specificity aptamers, and 100 sequences from the 11th round were defined as high affinity and specificity aptamers. The classification model uses seven molecular descriptors, which were calculated from amino acid sequences translated from the DNA aptamer sequences with DNAMAN software. The model, based on binary logistic regression analysis, has prediction accuracy, sensitivity, and specificity of about 80% for both the training set and the test set. Therefore, it is feasible to calculate molecular descriptors from amino acid sequences translated from DNA aptamer sequences and to develop a classification model for PCa cell line PCa-3M-1E8.
Year: 2018 PMID: 31459128 PMCID: PMC6644987 DOI: 10.1021/acsomega.8b01464
Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343
Twenty-Four Candidate Aptamer Sequences from 12 Rounds
| no. | sequences (5′ → 3′) | exp. | pre. |
|---|---|---|---|
| 1 | TGCAGGGTGAGAGGTTGGCTTTAGAGGGTTAGGGGGAATT | 2 | 1 |
| 2 | GGAGGGCTAGAGTAGGGGGCTGTCAAGGGGTCGGTGGGGA | 2 | 2 |
| 3 | TGCAGGGTGAGAGGTTGGTTTTAGAGGGTTAGGGGGAATT | 2 | 1 |
| 4 | GGAAGGGGCGTGGTTGGTAGAAAGGGAAGGGGAAGTTTAG | 2 | 2 |
| 5 | AGGGGGCAAGAGGGTGGTTTTAGAGGGGCAGGGGGAGTT | 2 | 2 |
| 6 | GGGGGAGGCGGGCGGGGTGCTGACGGGGGAGTTTAGCCGT | 2 | 2 |
| 7 | GAGGAGGTCATGGGAGAGGAGGCGGAGACGGGGAGGGATG | 2 | 2 |
| 8 | GCACGGGATCAGGGTGGGGTGGAGAGGGGAATTTTAGTGG | 2 | 2 |
| 9 | GGGACACGGTTGGGAGTGGGGTTTGGTCGTCCGGGGGATG | 2 | 2 |
| 10 | GAGCATGGAGTACGGGCGGGGTGATGACGGGGGAGTTTAG | 2 | 2 |
| 11 | AGGGGGGGTTGGGTATGCGTCCGGAGAGTTGCTCGAGTTC | 2 | 2 |
| 12 | TTCGGGCGATGGGTTAGGTTGGCGGAGGTGGGAGGGCGCGG | 2 | 2 |
| 13 | CTCTCGGGAAGTACGGTAGGAAGTGGTACCACGGGGTTA | 2 | 2 |
| 14 | TGGGGGCAAGAGGGTGGTTTTAGAGGGGCAGGGGGAGTT | 2 | 2 |
| 15 | GGGGGCGAGAGGGTGGTTTTAGAGGGGATGGGAGGAGTT | 2 | 2 |
| 16 | GGGGAGGCGGGCGGGGTGCTGACGGGGGAGTTTAGCCGT | 2 | 2 |
| 17 | GGGGAGGCGGGCGGGGTGCTGACGGGGGAGTTTAGCCGT | 2 | 2 |
| 18 | GGAGGAGGCGGGCGGGGTGCTGACGGGGGAGTTTAGCCGT | 2 | 2 |
| 19 | GTACCCGGAGACCAGTGACGGGGGTGTTTCGGCTGAAGCT | 2 | 1 |
| 20 | GTACTCTGCGTGCTGGGGGTGTTTGATGTAGTTCAGGCT | 2 | 2 |
| 21 | GTACTTGCGGAGCAGGGGTGGACCGTGTATAGTCGGGACT | 2 | 2 |
| 22 | GAGAGAGTGGGGGAGTGATCGGAGCGTGGGGTGTAGGGC | 2 | 1 |
| 23 | GGATCGGCTCGGGGGGGCAAGGGCCGGCGGGGATGTCATG | 2 | 2 |
| 24 | TAGGACGGAATGGGGGTGTGGGCTGTAGGGGAGGACAAAG | 2 | 2 |
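The abstract states that the descriptors were computed from amino acid sequences translated from the DNA aptamers with DNAMAN. The sketch below illustrates that translation step only, under assumptions that are not taken from the paper: the standard genetic code, the first reading frame, stop codons written as `*`, and trailing bases that do not fill a codon dropped. The function name `translate` is hypothetical, and DNAMAN's actual settings are not given in this record.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
# Assumption: the paper does not state which table or reading frame DNAMAN used.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string codon by codon; stop codons become '*'."""
    seq = dna.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # drop trailing bases that do not fill a codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Sequence no. 1 from the table above (40 nt: 13 full codons, 1 base left over).
aptamer_1 = "TGCAGGGTGAGAGGTTGGCTTTAGAGGGTTAGGGGGAATT"
peptide_1 = translate(aptamer_1)
print(peptide_1)
```

A 40-nt aptamer yields a 13-residue string in this scheme. How the authors handled in-frame stop codons before computing descriptors is not stated here.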
Classification Table Based on the Total Set
| model | observed | predicted class 1 | predicted class 2 | accuracy (%) |
|---|---|---|---|---|
| model 1 | class 1 | 56 | 44 | 56.0 |
| | class 2 | 32 | 68 | 68.0 |
| | overall percentage | | | 62.0 |
| model 2 | class 1 | 69 | 31 | 69.0 |
| | class 2 | 32 | 68 | 68.0 |
| | overall percentage | | | 68.5 |
| model 3 | class 1 | 72 | 28 | 72.0 |
| | class 2 | 26 | 74 | 74.0 |
| | overall percentage | | | 73.0 |
| model 4 | class 1 | 72 | 28 | 72.0 |
| | class 2 | 22 | 78 | 78.0 |
| | overall percentage | | | 75.0 |
| model 5 | class 1 | 76 | 24 | 76.0 |
| | class 2 | 23 | 77 | 77.0 |
| | overall percentage | | | 76.5 |
| model 6 | class 1 | 76 | 24 | 76.0 |
| | class 2 | 20 | 80 | 80.0 |
| | overall percentage | | | 78.0 |
| model 7 | class 1 | 81 | 19 | 81.0 |
| | class 2 | 19 | 81 | 81.0 |
| | overall percentage | | | 81.0 |
Variables in the Classification Models
| model | descriptor | B | SE | Wald | df | sig. | exp(B) |
|---|---|---|---|---|---|---|---|
| model 1 | FCO | –0.125 | 0.027 | 20.919 | 1 | 0.000 | 0.883 |
| | constant | 4.242 | 0.936 | 20.538 | 1 | 0.000 | 69.544 |
| model 2 | SVV | –0.526 | 0.117 | 20.319 | 1 | 0.000 | 0.591 |
| | FCO | –0.221 | 0.038 | 34.229 | 1 | 0.000 | 0.802 |
| | constant | 0.169 | 1.267 | 0.018 | 1 | 0.894 | 1.184 |
| model 3 | SVV | –0.555 | 0.123 | 20.356 | 1 | 0.000 | 0.574 |
| | FCO | –0.228 | 0.039 | 34.630 | 1 | 0.000 | 0.796 |
| | FCS | –0.350 | 0.104 | 11.399 | 1 | 0.001 | 0.705 |
| | constant | 0.442 | 1.334 | 0.110 | 1 | 0.740 | 1.556 |
| model 4 | EED | 23.734 | 13.191 | 3.237 | 1 | 0.072 | 2.031 × 10^10 |
| | SVV | –0.622 | 0.128 | 23.566 | 1 | 0.000 | 0.537 |
| | FCO | –0.280 | 0.046 | 37.076 | 1 | 0.000 | 0.756 |
| | FCS | –0.426 | 0.115 | 13.750 | 1 | 0.000 | 0.653 |
| | constant | –45.591 | 25.627 | 3.165 | 1 | 0.075 | 0.000 |
| model 5 | EED | 42.590 | 15.877 | 7.196 | 1 | 0.007 | 3.139 × 10^18 |
| | SVV | –0.684 | 0.136 | 25.155 | 1 | 0.000 | 0.505 |
| | FCO | –0.326 | 0.052 | 39.040 | 1 | 0.000 | 0.722 |
| | FOS | 1.196 | 0.381 | 9.863 | 1 | 0.002 | 3.306 |
| | FCS | –0.855 | 0.194 | 19.404 | 1 | 0.000 | 0.425 |
| | constant | –82.136 | 30.791 | 7.116 | 1 | 0.008 | 0.000 |
| model 6 | EED | 36.783 | 16.405 | 5.027 | 1 | 0.025 | 9.431 × 10^15 |
| | SVV | –0.853 | 0.157 | 29.673 | 1 | 0.000 | 0.426 |
| | FCO | –0.362 | 0.058 | 39.332 | 1 | 0.000 | 0.696 |
| | FOS | 1.245 | 0.396 | 9.894 | 1 | 0.002 | 3.473 |
| | FCS | –0.901 | 0.201 | 20.087 | 1 | 0.000 | 0.406 |
| | FNN | –0.240 | 0.089 | 7.211 | 1 | 0.007 | 0.787 |
| | constant | –70.938 | 31.822 | 4.969 | 1 | 0.026 | 0.000 |
| model 7 | EED | 36.306 | 16.785 | 4.679 | 1 | 0.031 | 5.852 × 10^15 |
| | SVV | –0.838 | 0.161 | 27.062 | 1 | 0.000 | 0.433 |
| | CWP | –90.770 | 28.598 | 10.074 | 1 | 0.002 | 0.000 |
| | FCO | –0.404 | 0.064 | 40.162 | 1 | 0.000 | 0.667 |
| | FOS | 1.306 | 0.424 | 9.488 | 1 | 0.002 | 3.691 |
| | FCS | –0.995 | 0.216 | 21.164 | 1 | 0.000 | 0.370 |
| | FNN | –0.329 | 0.098 | 11.388 | 1 | 0.001 | 0.719 |
| | constant | –55.171 | 32.756 | 2.837 | 1 | 0.092 | 0.000 |
Variable(s) entered on model 1: FCO.
Variable(s) entered on model 2: SVV.
Variable(s) entered on model 3: FCS.
Variable(s) entered on model 4: EED.
Variable(s) entered on model 5: FOS.
Variable(s) entered on model 6: FNN.
Variable(s) entered on model 7: CWP.
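The columns of the variables table are linked by standard logistic regression relations: exp(B) is the odds ratio for a one-unit increase in the descriptor, the Wald statistic for a single coefficient is (B/SE)², and a predicted probability is the logistic function of the constant plus the weighted descriptors. The sketch below applies these to model 1 (FCO coefficient –0.125, constant 4.242). It assumes the modelled outcome is class 2 with a 0.5 cutoff, which this record does not state. The function names and the FCO input values are hypothetical.

```python
import math


def odds_ratio(b: float) -> float:
    """exp(B): multiplicative change in the odds per unit increase of the descriptor."""
    return math.exp(b)


def wald(b: float, se: float) -> float:
    """Wald statistic for a single coefficient (1 degree of freedom)."""
    return (b / se) ** 2


def probability(constant: float, coefs: dict, values: dict) -> float:
    """Logistic probability of the modelled class for one sequence."""
    logit = constant + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-logit))


# Model 1 from the table: one descriptor (FCO) plus a constant.
coefs = {"FCO": -0.125}
constant = 4.242

print(round(odds_ratio(coefs["FCO"]), 3))
for fco in (20.0, 50.0):  # hypothetical descriptor values, not from the paper
    p = probability(constant, coefs, {"FCO": fco})
    # Assumption: p is the probability of class 2 and 0.5 is the cutoff.
    print(fco, round(p, 3), "class 2" if p >= 0.5 else "class 1")
```

Recomputing the Wald statistic from the rounded B and SE in the table gives only approximate agreement with the printed values (about 21.4 versus 20.919 for model 1), as expected when the inputs are rounded to three decimals.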
Definitions of Molecular Descriptors in Classification Models
| no. | symbol | definition | class |
|---|---|---|---|
| 1 | EED | eigenvalue | edge adjacency indices |
| 2 | SVV | signal 05/weighted by van der Waals volume | 3D-MoRSE descriptors |
| 3 | CWP | 3rd component symmetry directional WHIM index/weighted by polarizability | WHIM descriptors |
| 4 | FCO | frequency of C–O at topological distance 6 | 2D atom pairs |
| 5 | FOS | frequency of O–S at topological distance 7 | 2D atom pairs |
| 6 | FCS | frequency of C–S at topological distance 9 | 2D atom pairs |
| 7 | FNN | frequency of N–N at topological distance 10 | 2D atom pairs |
Statistical Results from Logistic Regression Equation
| data set | label | experiment | predicted class 1 | predicted class 2 | accuracy (%) | sensitivity (%) | specificity (%) |
|---|---|---|---|---|---|---|---|
| training set | class 1 | 67 | 53 | 14 | 79.1 | 81.5 | 79.7 |
| | class 2 | 67 | 12 | 55 | 82.1 | | |
| test set | class 1 | 33 | 27 | 6 | 81.8 | 81.8 | 81.8 |
| | class 2 | 33 | 6 | 27 | 81.8 | | |
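The record does not give the formulas behind the sensitivity and specificity values. A quick check on the training-set counts suggests that the accuracy column is a row-wise ratio (correct predictions over the observed class size), while the reported 81.5% and 79.7% match column-wise ratios (correct predictions over everything predicted as that class). The sketch below reproduces both. The function names are illustrative and the interpretation is an inference from the numbers, not a statement from the authors.

```python
def row_ratios(confusion):
    """Percent of each observed class that was predicted correctly."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(confusion)]


def column_ratios(confusion):
    """Percent of each predicted class that is actually that class."""
    n = len(confusion)
    return [
        100.0 * confusion[j][j] / sum(confusion[i][j] for i in range(n))
        for j in range(n)
    ]


# Training-set counts: rows are observed class 1 and class 2,
# columns are predicted class 1 and class 2.
training = [[53, 14], [12, 55]]

print([round(x, 1) for x in row_ratios(training)])     # accuracy column
print([round(x, 1) for x in column_ratios(training)])  # reported sensitivity, specificity
```

For the test set the confusion matrix is symmetric (27, 6, 6, 27), so both conventions give 81.8%.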