Li-Yeh Chuang, Yu-Da Lin, Hsueh-Wei Chang, Cheng-Hong Yang.
Abstract
BACKGROUND: Possible single nucleotide polymorphism (SNP) interactions in breast cancer are usually not investigated in genome-wide association studies. Previously, we proposed a particle swarm optimization (PSO) method to compute these kinds of SNP interactions. However, this PSO is not guaranteed to find the best result in every run, especially when high-dimensional data are investigated for SNP-SNP interactions.
Year: 2012 PMID: 22623973 PMCID: PMC3356401 DOI: 10.1371/journal.pone.0037018
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Population initialization using conservation of the best 5 results.
Estimated effect (odds ratio and 95% CI) of 23 individual SNPs from steroid hormone metabolism and signalling-related genes on the occurrence of breast cancer in patients.
| SNP (Gene) | Ch/position | SNP type | Case no. | Control no. | Odds ratio | 95% CI | p value |
| 1. rs6269 (COMT) | 22/19949952 | 1-AA | 1694 | 1769 | | | |
| | | 2-AG | 2389 | 2390 | 1.044 | 0.955, 1.140 | 0.337 |
| | | 3-GG | 917 | 841 | 1.139 | 1.013, 1.279 | |
| 2. rs4680 (COMT) | 22/19951271 | 1-GG | 1308 | 1377 | | | |
| | | 2-GA | 2440 | 2417 | 1.063 | 0.966, 1.169 | 0.211 |
| | | 3-AA | 1252 | 1206 | 1.093 | 0.978, 1.221 | 0.118 |
| 3. rs10046 (CYP19A1) | 15/51502986 | 1-CC | 1434 | 1430 | | | |
| | | 2-CT | 2411 | 2497 | 0.963 | 0.877, 1.057 | 0.424 |
| | | 3-TT | 1155 | 1073 | 1.073 | 0.959, 1.201 | 0.214 |
| 4. rs3020314 (ESR1) | 6/152270672 | 1-CC | 2147 | 2343 | | | |
| | | 2-CT | 2280 | 2164 | 1.150 | 1.057, 1.250 | |
| | | 3-TT | 573 | 493 | 1.268 | 1.107, 1.453 | |
| 5. rs2234693 (ESR1) | 6/152163335 | 1-TT | 1446 | 1450 | | | |
| | | 2-TC | 2480 | 2524 | 1.015 | 0.925, 1.113 | 0.761 |
| | | 3-CC | 1074 | 1026 | 1.065 | 0.961, 1.181 | 0.232 |
| 6. rs1543404 (ESR1) | 6/152428838 | 1-TT | 1468 | 1467 | | | |
| | | 2-TC | 2439 | 2441 | 0.999 | 0.910, 1.095 | 0.981 |
| | | 3-CC | 1093 | 1092 | 1.000 | 0.894, 1.119 | 1.000 |
| 7. rs3798577 (ESR1) | 6/152421130 | 1-TT | 1413 | 1406 | | | |
| | | 2-TC | 2494 | 2542 | 0.976 | 0.889, 1.072 | 0.621 |
| | | 3-CC | 1093 | 1052 | 1.034 | 0.922, 1.159 | 0.567 |
| 8. rs2747652 (ESR1) | 6/152437016 | 1-CC | 1377 | 1372 | | | |
| | | 2-CT | 2479 | 2447 | 1.009 | 0.918, 1.109 | 0.849 |
| | | 3-TT | 1144 | 1181 | 0.965 | 0.863, 1.080 | 0.535 |
| 9. rs2077647 (ESR1) | 6/152129077 | 1-AA | 1383 | 1347 | | | |
| | | 2-AG | 2449 | 2589 | 0.921 | 0.838, 1.012 | 0.087 |
| | | 3-GG | 1168 | 1064 | 1.069 | 0.954, 1.198 | 0.242 |
| 10. rs2175898 (ESR1) | 6/152196952 | 1-AA | 1350 | 1353 | | | |
| | | 2-AG | 2507 | 2457 | 0.846 | 0.768, 0.932 | |
| | | 3-GG | 1143 | 1190 | 0.941 | 0.852, 1.040 | 0.238 |
| 11. rs9340799 (ESR1) | 6/152163381 | 1-AA | 2016 | 2107 | | | |
| | | 2-AG | 2360 | 2302 | 1.071 | 0.984, 1.166 | 0.109 |
| | | 3-GG | 624 | 591 | 1.103 | 0.969, 1.257 | 0.133 |
| 12. rs1709182 (ESR1) | 6/152175357 | 1-TT | 1932 | 1988 | | | |
| | | 2-TC | 2326 | 2341 | 1.022 | 0.938, 1.114 | 0.618 |
| | | 3-CC | 742 | 671 | 1.138 | 1.006, 1.288 | |
| 13. rs9478249 (ESR1) | 6/152199431 | 1-TT | 1890 | 1773 | | | |
| | | 2-TG | 2381 | 2430 | 0.919 | 0.843, 1.003 | 0.056 |
| | | 3-GG | 729 | 797 | 0.858 | 0.760, 0.969 | |
| 14. rs1514348 (ESR1) | 6/152182315 | 1-CC | 1717 | 1830 | | | |
| | | 2-CA | 2435 | 2415 | 1.075 | 0.985, 1.173 | 0.107 |
| | | 3-AA | 851 | 755 | 1.201 | 1.066, 1.354 | |
| 15. rs532010 (ESR1) | 6/152130918 | 1-TT | 1848 | 1891 | | | |
| | | 2-TC | 2377 | 2422 | 1.004 | 0.921, 1.095 | 0.930 |
| | | 3-CC | 775 | 687 | 1.154 | 1.021, 1.305 | |
| 16. rs566351 (PGR) | 11/100985014 | 1-TT | 2062 | 2014 | | | |
| | | 2-TC | 2280 | 2326 | 0.957 | 0.879, 1.043 | 0.312 |
| | | 3-CC | 658 | 660 | 0.974 | 0.858, 1.105 | 0.680 |
| 17. rs660149 (PGR) | 11/100934314 | 1-CC | 2708 | 2591 | | | |
| | | 2-CG | 1927 | 2042 | 0.903 | 0.831, 0.981 | |
| | | 3-GG | 365 | 367 | 0.952 | 0.813, 1.114 | 0.554 |
| 18. rs11571171 (PGR) | 11/100974887 | 1-TT | 2419 | 2338 | | | |
| | | 2-TC | 2082 | 2163 | 0.930 | 0.856, 1.012 | 0.091 |
| | | 3-CC | 499 | 499 | 0.967 | 0.841, 1.110 | 0.626 |
| 19. rs500760 (PGR) | 11/100909991 | 1-AA | 2888 | 2994 | | | |
| | | 2-AG | 1866 | 1767 | 1.095 | 1.007, 1.190 | |
| | | 3-GG | 246 | 239 | 1.067 | 0.883, 1.290 | 0.508 |
| 20. rs858518 (SHBG) | 17/7533025 | 1-TT | 1693 | 1597 | | | |
| | | 2-TC | 2412 | 2490 | 0.914 | 0.836, 0.999 | |
| | | 3-CC | 895 | 913 | 0.925 | 0.823, 1.039 | 0.188 |
| 21. rs272428 (SHBG) | 5/179323119 | 1-CC | 1609 | 1523 | | | |
| | | 2-CT | 2438 | 2442 | 0.945 | 0.863, 1.035 | 0.225 |
| | | 3-TT | 953 | 1035 | 0.872 | 0.778, 0.977 | |
| 22. rs858524 (SHBG) | 17/7511287 | 1-AA | 1613 | 1725 | | | |
| | | 2-AG | 2459 | 2393 | 1.099 | 1.005, 1.201 | |
| | | 3-GG | 928 | 882 | 1.125 | 1.002, 1.264 | |
| 23. rs2017591 (STS) | X/7158114 | 1-TT | 1823 | 1760 | | | |
| | | 2-TC | 2258 | 2437 | 0.895 | 0.819, 0.977 | |
| | | 3-CC | 919 | 803 | 1.105 | 0.983, 1.242 | 0.094 |
Data collected from literature [14].
Data highlighted in bold text are statistically significant results.
All the [Ch/position], i.e., [Chromosome no./Chromosome position], information is based on “Assembly GRCh37”.
The contig information is shown as SNP no. (contig accession no.) as follows: SNPs 1–2 (NT_011519.10); SNP 3 (NT_010194.17); SNPs 4–15 (NT_025741.15); SNPs 16–19 (NT_033899.8); SNPs 20–22 (NT_010718.16); SNP 23 (NT_167197.1).
Values with p value<0.05 are highlighted in bold fonts.
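The odds ratios in this table compare each genotype with the first-listed genotype of the same SNP. The sketch below shows how such a value could be recomputed from the case and control counts. It assumes a standard 2x2 odds ratio with a Wald 95% interval on the log scale; the source does not state which interval method was used, so the bounds may differ from the table in the third decimal. The function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(case_alt, control_alt, case_ref, control_ref, z=1.96):
    """Odds ratio of one genotype against the reference genotype.

    Returns (odds_ratio, lower, upper), where the bounds are a Wald
    interval on the log odds ratio (an assumption, not from the source).
    """
    odds_ratio = (case_alt * control_ref) / (control_alt * case_ref)
    se_log_or = math.sqrt(
        1 / case_alt + 1 / control_alt + 1 / case_ref + 1 / control_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# rs6269 (COMT): genotype 2-AG against reference genotype 1-AA,
# counts copied from the table (cases 2389 vs 1694, controls 2390 vs 1769).
or_ag, lo_ag, hi_ag = odds_ratio_ci(
    case_alt=2389, control_alt=2390, case_ref=1694, control_ref=1769
)
print(f"OR = {or_ag:.3f}, 95% CI = ({lo_ag:.3f}, {hi_ag:.3f})")
```

For this row the recomputed odds ratio rounds to the tabulated 1.044, which suggests the first genotype of each SNP serves as the reference category.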
IPSO pseudo-code.
| 02: find the top five 2-SNP barcodes |
| 03: conservation of the best five results Xg≡( |
| 05: P |
| 06: µ |
| 07: µ |
| 08: evaluate |
| 09: find best Xg in |
| 10: the worst five |
| 12: for each swarm P |
| 26: conservation of the top five results Xg≡( |
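The surviving fragments of the IPSO pseudo-code mention conserving the best five results and refer to the worst five particles, but the steps in between are missing from this copy. The sketch below illustrates one way such a conservation step could work. It assumes, without confirmation from the source, that the conserved barcodes overwrite the lowest-fitness particles of the next swarm. The names `conserve_best`, `replace_worst`, and the toy fitness function are hypothetical.

```python
def conserve_best(conserved, candidates, fitness, k=5):
    """Merge new candidates into the conserved list and keep the k fittest.

    Duplicates are dropped by converting every barcode to a tuple.
    """
    pool = {tuple(b) for b in conserved} | {tuple(b) for b in candidates}
    # Sort by fitness, using the barcode itself as a deterministic tie-break.
    return sorted(pool, key=lambda b: (fitness(b), b), reverse=True)[:k]

def replace_worst(swarm, conserved, fitness):
    """Overwrite the lowest-fitness particles with the conserved barcodes."""
    ranked = sorted(swarm, key=fitness)  # ascending, so the worst come first
    return list(conserved) + ranked[len(conserved):]

# Toy example: fitness is just the sum of the particle (an assumption for
# illustration only; the paper's fitness is based on case/control counts).
toy_fitness = sum
swarm = [(1, 1), (3, 3), (2, 2), (0, 0)]
best = conserve_best([], [(9, 9), (8, 8), (1, 1)], toy_fitness, k=2)
new_swarm = replace_worst(swarm, best, toy_fitness)
print(best, new_swarm)
```

With `k=5` this corresponds to the "best five" and "worst five" named in the fragments; the toy call uses `k=2` only to keep the example short.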
Pseudo-code for randomly generated data.
| 02: Set size = 5000 |
| 03: Set number of genotype = 3 |
| 04: Calculate amount of three genotypes |
| 06: Calculate amount of each genotype |
| 07: Calculate numbers of each normalized genotype |
| 09: Randomly create numbers of each normalized genotype |
| 11: end while |
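The pseudo-code above survives only in part, so the following is a minimal sketch of the general idea it names: scale genotype counts to a group of 5000 individuals with three genotypes, then place those genotypes in random order. The largest-remainder rounding, the function names, and the example counts are assumptions made for illustration and are not taken from the source.

```python
import random

def normalized_counts(observed, size=5000):
    """Scale observed genotype counts so that they sum exactly to `size`."""
    total = sum(observed)
    exact = [size * n / total for n in observed]
    counts = [int(x) for x in exact]
    # Give the individuals lost to truncation to the largest fractional parts.
    leftovers = size - sum(counts)
    order = sorted(
        range(len(observed)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in order[:leftovers]:
        counts[i] += 1
    return counts

def random_genotypes(counts, rng):
    """Return genotype codes (1, 2, 3) with the given counts in random order."""
    genotypes = [code for code, n in enumerate(counts, start=1) for _ in range(n)]
    rng.shuffle(genotypes)
    return genotypes

rng = random.Random(0)  # fixed seed so the sketch is reproducible
counts = normalized_counts([339, 478, 183], size=5000)  # hypothetical counts
sample = random_genotypes(counts, rng)
print(counts, len(sample))
```

Repeating `random_genotypes` once per SNP and per group (cases and controls) would give a data set with the same marginal genotype counts as a summary table, while the joint genotype combinations remain random.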
The best estimated protective SNP combinations on the occurrence of breast cancer as determined by IPSO.
| Number of combined SNPs (specific SNPs) | SNP genotypes | Control no./Case no. | Difference (specific SNPs) | Correct | Sen.+Spe. | PPV+NPV | Risk Ratio | Odds Ratio (p value) |
| 2-SNP | Others | 3596/3770 | | | | | | 0.84 |
| SNPs(4-19) | 1-1 | 1404/1230 | 174 | 0.48 | 0.97 | 0.96 | 0.88 | |
| 3-SNP | Others | 4301/4429 | | | | | | 0.79 |
| SNPs(4-19-23) | 1-1-2 | 699/571 | 128 | 0.49 | 0.97 | 0.94 | 0.82 | |
| 4-SNP | Others | 4644/4731 | | | | | | 0.74 |
| SNPs(4-9-19-23) | 1-2-1-2 | 356/269 | 87 | 0.49 | 0.96 | 0.93 | 0.76 | |
| 5-SNP | Others | 4809/4864 | | | | | | 0.70 |
| SNPs(3-4-9-19-23) | 2-1-2-1-2 | 191/136 | 55 | 0.50 | 0.99 | 0.91 | 0.71 | |
| 6-SNP | Others | 4911/4946 | | | | | | 0.60 |
| SNPs(3-4-9-13-19-23) | 2-1-2-2-1-2 | 89/54 | 35 | 0.50 | 0.99 | 0.88 | 0.61 | |
| 7-SNP | Others | 4951/4972 | | | | | | 0.57 |
| SNPs(3-4-9-13-19-20-23) | 2-1-2-2-1-2-2 | 49/28 | 21 | 0.50 | 1.00 | 0.87 | 0.57 | |
| 8-SNP | Others | 4971/4983 | | | | | | 0.59 |
| SNPs(3-4-9-12-13-19-20-23) | 2-1-2-2-2-1-2-2 | 29/17 | 12 | 0.50 | 1.00 | 0.87 | 0.59 | (0.103) |
| 9-SNP | Others | 4986/4994 | | | | | | 0.43 |
| SNPs(3-4-9-12-13-14-19-20-23) | 2-1-2-2-2-2-1-2-2 | 14/6 | 8 | 0.50 | 1.00 | 0.80 | 0.43 | (0.115) |
| 10-SNP | Others | 4994/4999 | | | | | | 0.17 |
| SNPs(3-4-9-12-13-14-19-20-21-23) | 2-1-2-2-2-2-1-2-3-2 | 6/1 | 5 | 0.50 | 1.00 | 0.64 | 0.17 | (0.125) |
The SNP combinations on the occurrence of breast cancer are significantly different (p value<0.05). Sen., sensitivity; Spe., specificity; PPV, positive predictive value; NPV, negative predictive value. The meanings of the SNP and genotype numbers are provided in Table 3. For example, barcode SNPs (4-19) with genotype 1-1 is [rs3020314-CC]-[rs500760-AA], and SNPs (4-19-23) with genotype 1-1-2 is [rs3020314-CC]-[rs500760-AA]-[rs2017591-TC].
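The barcode notation and the summary columns of the IPSO table can be illustrated with a short sketch. The decoder follows the example in the note above. The metric formulas are assumptions: the source does not define them, and the definitions below were chosen because they reproduce the 2-SNP row (difference 174, correct 0.48, Sen.+Spe. 0.97, PPV+NPV 0.96, risk ratio 0.88, odds ratio 0.84). The names `SNPS`, `decode_barcode`, and `barcode_metrics` are hypothetical.

```python
# SNP number -> (rs id, genotype labels 1..3), copied from the SNP table
# for the three SNPs used in the examples.
SNPS = {
    4: ("rs3020314", ("CC", "CT", "TT")),
    19: ("rs500760", ("AA", "AG", "GG")),
    23: ("rs2017591", ("TT", "TC", "CC")),
}

def decode_barcode(snps, genotypes):
    """Turn SNP numbers and genotype codes into the bracketed rs notation."""
    parts = []
    for snp, code in zip(snps, genotypes):
        rs_id, labels = SNPS[snp]
        parts.append(f"[{rs_id}-{labels[code - 1]}]")
    return "-".join(parts)

def barcode_metrics(control_with, case_with, control_others, case_others):
    """Summary values for one barcode (formulas inferred, not from the source)."""
    n_case = case_with + case_others
    n_control = control_with + control_others
    sensitivity = case_with / n_case
    specificity = control_others / n_control
    ppv = case_with / (case_with + control_with)
    npv = control_others / (control_others + case_others)
    return {
        "difference": control_with - case_with,
        "correct": (case_with + control_others) / (n_case + n_control),
        "sen_spe": sensitivity + specificity,
        "ppv_npv": ppv + npv,
        "risk_ratio": (case_with / n_case) / (control_with / n_control),
        "odds_ratio": (case_with * control_others) / (control_with * case_others),
    }

print(decode_barcode((4, 19, 23), (1, 1, 2)))
# 2-SNP barcode SNPs(4-19), genotype 1-1: 1404 controls and 1230 cases carry it.
metrics = barcode_metrics(1404, 1230, 3596, 3770)
print({name: round(value, 2) for name, value in metrics.items()})
```

Under these definitions a protective barcode has more controls than cases carrying it, so the difference is positive and both ratios fall below 1.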
Figure 2. The maximum difference between cases and controls for PSO and IPSO on the best barcodes containing two to ten SNPs.
Figure 3. Boxplots displaying the extremes, the upper and lower quartiles, and the median of the maximum difference between cases and controls for (A) the IPSO algorithm and (B) the PSO algorithm on three to ten combined SNPs over 20 runs.
The boundary of the box closest to zero indicates the 25th percentile, a line within the box marks the median, and the boundary of the box farthest from zero indicates the 75th percentile. Error bars above and below the boxes indicate the 90th and 10th percentiles, respectively. The triangle symbols indicate the 95th and 5th percentiles.
The best estimated protective SNP combinations on the occurrence of breast cancer as determined by PSO.
| Number of combined SNPs (specific SNPs) | SNP genotypes | Control no./Case no. | Difference (specific SNPs) | Correct | Sen.+Spe. | PPV+NPV | Risk Ratio | Odds Ratio (p value) |
| 2-SNP | Others | 3596/3770 | | | | | | 0.84 |
| SNPs(4-19) | 1-1 | 1404/1230 | 174 | 0.48 | 0.97 | 0.96 | 0.88 | |
| 3-SNP | Others | 4427/4505 | | | | | | 0.85 |
| SNPs(4-22-23) | 1-2-2 | 573/495 | 78 | 0.49 | 0.98 | 0.96 | 0.86 | |
| 4-SNP | Others | 4670/4728 | | | | | | 0.81 |
| SNPs(9-18-19-23) | 2-2-1-2 | 330/272 | 58 | 0.49 | 0.99 | 0.95 | 0.82 | |
| 5-SNP | Others | 4885/4911 | | | | | | 0.77 |
| SNPs(3-4-12-20-23) | 2-1-1-2-2 | 115/89 | 26 | 0.50 | 1.00 | 0.93 | 0.77 | (0.077) |
| 6-SNP | Others | 4950/4962 | | | | | | 0.76 |
| SNPs(12-15-17-19-21-22) | 2-2-2-1-2-1 | 50/38 | 12 | 0.50 | 1.00 | 0.93 | 0.76 | (0.239) |
| 7-SNP | Others | 4982/4988 | | | | | | 0.67 |
| SNPs(2-7-14-18-19-21-23) | 2-2-1-2-1-1-2 | 18/12 | 6 | 0.50 | 1.00 | 0.90 | 0.67 | (0.361) |
| 8-SNP | Others | 4990/4995 | | | | | | 0.50 |
| SNPs(9-10-11-13-17-20-21-23) | 3-3-2-2-1-2-2-2 | 10/5 | 5 | 0.50 | 1.00 | 0.83 | 0.50 | (0.301) |
| 9-SNP | Others | 4993/4995 | | | | | | 0.71 |
| SNPs(1-3-4-11-14-16-18-21-22) | 2-2-2-1-2-2-1-1-1 | 7/5 | 2 | 0.50 | 1.00 | 0.92 | 0.71 | (0.774) |
| 10-SNP | Others | 4998/4999 | | | | | | 0.50 |
| SNPs(1-3-5-9-10-13-18-20-21-22) | 2-2-2-2-2-3-1-1-3-1 | 2/1 | 1 | 0.50 | 1.00 | 0.83 | 0.50 | (1.000) |
The SNP combinations on the occurrence of breast cancer are significantly different (p value<0.05). The meanings of the SNP and genotype numbers are provided in Table 3.
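To make the comparison between the two result tables concrete, the sketch below places the reported control-minus-case differences for the best IPSO and PSO barcodes side by side. The values are copied from the "Difference" columns of the two tables; the variable names are hypothetical and nothing is recomputed from raw data.

```python
# Reported difference (controls minus cases) for the best barcode of each size.
ipso_difference = {2: 174, 3: 128, 4: 87, 5: 55, 6: 35, 7: 21, 8: 12, 9: 8, 10: 5}
pso_difference = {2: 174, 3: 78, 4: 58, 5: 26, 6: 12, 7: 6, 8: 5, 9: 2, 10: 1}

# Gain of IPSO over PSO for each number of combined SNPs.
gain = {k: ipso_difference[k] - pso_difference[k] for k in ipso_difference}

print("SNPs  IPSO  PSO  gain")
for k in sorted(gain):
    print(f"{k:>4}  {ipso_difference[k]:>4}  {pso_difference[k]:>3}  {gain[k]:>4}")
```

Both methods report the same 2-SNP barcode, and for every larger barcode the IPSO difference is at least as large as the PSO difference.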