Y-X Lin, V Baladandayuthapani, V Bonato, K-A Do.
Abstract
MOTIVATION: Existing methods for estimating copy number variations in array comparative genomic hybridization (aCGH) data are limited to estimations of the gain/loss of chromosome regions for single sample analysis. We propose the linear-median method for estimating shared copy numbers in DNA sequences across multiple samples, demonstrate its operating characteristics through simulations and applications to real cancer data, and compare it to two existing methods.
Keywords: array CGH; common copy number alterations regions; copy number alterations
Year: 2010 PMID: 21082039 PMCID: PMC2978932 DOI: 10.4137/CIN.S5614
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Histogram of log2[(1 + ɛ)/(2 + η)].
The values of m(π, t) and s²(π, t) (Part A).
| π = 1 | ||||
| 0.9988320 (1.204417e-05) | 0.9999922 (6.722244e-06) | 1.0006107 (5.334313e-06) | 0.9996400 (1.261637e-05) | 1.0010851 (1.167334e-05) |
| 0.9996429 (1.231912e-06) | 1.0007002 (5.472384e-06) | 0.9996422 (5.472957e-06) | 0.9995458 (4.414939e-06) | |
| π= 0.9 | ||||
| 1.0234726 (7.944477e-04) | 0.9999251 (3.122031e-06) | 0.9945141 (7.239544e-05) | 0.9874093 (2.153089e-04) | 0.9815765 (3.306295e-04) |
| 0.9784618 (4.723169e-04) | 0.9754699 (5.685863e-04) | 0.9728850 (5.824456e-04) | 0.9690245 (5.801032e-04) | |
| π = 0.8 | ||||
| 1.0387630 (2.886933e-03) | 0.9996188 (9.836780e-06) | 0.9892445 (2.590746e-04) | 0.9765723 (8.382101e-04) | 0.9673282 (1.415877e-03) |
| 0.9586696 (1.799211e-03) | 0.9522338 (2.116493e-03) | 0.9460126 (2.245196e-03) | 0.9428445 (2.510819e-03) | |
| π = 0.7 | ||||
| 1.0490667 (5.548408e-03) | 1.0001018 (1.432224e-05) | 0.9855943 (5.197429e-04) | 0.9663545 (1.709809e-03) | 0.9524253 (2.958586e-03) |
| 0.9407424 (3.912353e-03) | 0.930110 (4.538055e-03) | 0.9227174 (5.050342e-03) | 0.9165458 (5.479345e-03) | |
| π = 0.6 | ||||
| 1.0488169 (7.753325e-03) | 1.0010726 (6.911221e-06) | 0.9854699 (7.413494e-04) | 0.9623178 (2.893487e-03) | 0.9414670 (4.946303e-03) |
| 0.9257995 (6.682264e-03) | 0.9128190 (8.140030e-03) | 0.9026656 (9.115055e-03) | 0.8949812 (1.010583e-02) |
The values of m(π, t) and s²(π, t) (Part B).
| π = 0.5 | ||||
| 0.9976367 (3.009497e-03) | 1.0008558 (3.751868e-06) | 1.0051801 (1.132037e-03) | 1.0075940 (8.828197e-03) | 1.0563732 (3.143084e-02) |
| 1.0996647 (5.681301e-02) | 0.9949510 (3.069008e-02) | 1.0348440 (5.681301e-02) | 1.2778189 (3.069008e-02) | |
| π = 0.4 | ||||
| 0.9689243 (2.197164e-03) | 1.0004657 (8.269312e-06) | 1.0224327 (1.544194e-03) | 1.0774329 (1.112485e-02) | 1.1533009 (3.060863e-02) |
| 1.2460811 (5.815739e-02) | 1.3484609 (9.379170e-02) | 1.4610660 (1.286891e-01) | 1.5757287 (1.732840e-01) | |
| π = 0.3 | ||||
| 0.9690245 (1.446965e-03) | 1.0008585 (3.876559e-06) | 1.0242324 (1.226318e-03) | 1.0860159 (6.909656e-03) | 1.1647982 (1.727091e-02) |
| 1.2629289 (2.912726e-02) | 1.3679820 (4.057614e-02) | 1.4846238 (5.020234e-02) | 1.5987995 (5.951541e-02) | |
| π = 0.2 | ||||
| 0.9737273 (6.463392e-04) | 1.0001785 (3.456922e-06) | 1.0231539 (5.653691e-04) | 1.0743057 (3.026114e-03) | 1.1446959 (6.303903e-03) |
| 1.2239605 (9.262918e-03) | 1.3104238 (1.097795e-02) | 1.3991247 (1.232603e-02) | 1.4862704 (1.325612e-02) | |
| π = 0.1 | ||||
| 0.9836251 (1.448035e-04) | 0.9996371 (1.181245e-05) | 1.0143460 (1.537200e-04) | 1.0458579 (6.429722e-04) | 1.0869769 (1.166530e-03) |
| 1.1335024 (1.374409e-03) | 1.1808455 (1.518496e-03) | 1.2294815 (1.587512e-03) | 1.2743215 (1.669519e-03) |
The sample mean and sample standard error of the estimated error rate {d(k)} given by different combinations of a and n, where a is the parameter of the uniform distribution U[−a, a] and n is the number of the independent sequences in the realizations.
| 0.5 | 0.00267 (0.00505932) | 0.00021 (0.00143456) | 0.00003 (0.00054717) |
| 0.8 | 0.03080 (0.01634096) | 0.00578 (0.00725564) | 0.00566 (0.01330000) |
| 1 | 0.06900 (0.02388243) | 0.01822 (0.01335702) | 0.00771 (0.00880531) |
| 1.5 | 0.19759 (0.03871163) | 0.08208 (0.02652880) | 0.04332 (0.02063500) |
| 1.9 | 0.30161 (0.04409140) | 0.15426 (0.03687835) | 0.09367 (0.02802528) |
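The summary statistics in the table above can be illustrated with a minimal sketch. It assumes, since the definition is not given in this record, that the error rate d(k) of the k-th realization is the fraction of probe positions where the estimated copy number differs from the true one. The function names `error_rate` and `summarize` are hypothetical, and the sketch reports both the sample standard deviation and the standard error of the mean because the record does not say which of the two appears in parentheses.

```python
from math import sqrt
from statistics import mean, stdev


def error_rate(true_cn, est_cn):
    """Fraction of probe positions where the estimate differs from the truth.

    ASSUMPTION: this per-probe mismatch rate stands in for d(k); the
    original definition is not available in this record.
    """
    if len(true_cn) != len(est_cn) or not true_cn:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(t != e for t, e in zip(true_cn, est_cn))
    return mismatches / len(true_cn)


def summarize(rates):
    """Return (sample mean, sample standard deviation, standard error of the mean)."""
    m = mean(rates)
    sd = stdev(rates) if len(rates) > 1 else 0.0
    return m, sd, sd / sqrt(len(rates))


# One error rate per simulated realization (toy data, not from the paper).
rates = [
    error_rate([2, 2, 3, 3], [2, 3, 3, 3]),
    error_rate([2, 2, 3, 3], [2, 2, 3, 3]),
]
print(summarize(rates))
```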
Figure 2. Plot of the sequence of the true copy numbers.
The true positive (TP) rates and false positive (FP) rates for the linear-median method and the cghMCR method, where n = 20.
| 0.2 | ||||||
| TP | 0.6382 (0.0496) | 0.0714 (0.1101) | 0.7568 (0.0406) | 0.0024 (0.0188) | 0.8096 (0.0414) | 0 (0) |
| FP | 0.3785 (0.0384) | 0.0040 (0.0154) | 0.6549 (0.0384) | 2.83e-04 (0.0041) | 0.7657 (0.0357) | 0 (0) |
| 0.4 | ||||||
| TP | 0.7849 (0.0429) | 0.6760 (0.1830) | 0.7696 (0.0413) | 0.0308 (0.0779) | 0.7616 (0.0453) | 0 (0) |
| FP | 0.0861 (0.0248) | 0.0415 (0.0302) | 0.3827 (0.0402) | 0.0011 (0.0081) | 0.5611 (0.0408) | 0 (0) |
| 0.6 | ||||||
| TP | 0.9503 (0.0227) | 0.9075 (0.0224) | 0.8708 (0.0359) | 0.2759 (0.1129) | 0.8013 (0.0410) | 0 (0) |
| FP | 0.0122 (0.0090) | 2.58e-05 (0.0004) | 0.2000 (0.0310) | 0.0023 (0.0114) | 0.3905 (0.0419) | 0 (0) |
| 0.8 | ||||||
| TP | 0.9966 (0.0060) | 0.9030 (0.0206) | 0.9451 (0.0204) | 0.3877 (0.1308) | 0.8677 (0.0331) | 0 (0) |
| FP | 0.0013 (0.0028) | 0 (0) | 0.0238 (0.0917) | 0 (0) | 0.2617 (0.0358) | 0 (0) |
| 1 | ||||||
| TP | 1 (0) | 0.9490 (0.0147) | 0.9817 (0.0147) | 0.6542 (0.1561) | 0.9237 (0.0287) | 0 (0) |
| FP | 7.74e-05 (0.0007) | 0 (0) | 0.04026 (0.0154) | 0 (0) | 0.1667 (0.0314) | 0 (0) |
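As a rough illustration of how TP and FP rates of this kind can be computed, the sketch below assumes a per-probe definition: a probe is "positive" when its true copy number differs from the neutral value 2, and a call is "aberrant" when the estimated copy number differs from 2. This definition and the function name `tp_fp_rates` are assumptions made for the example, not taken from the paper.

```python
def tp_fp_rates(true_cn, est_cn, neutral=2):
    """Per-probe true positive and false positive rates.

    ASSUMPTION: a probe is truly aberrant when true_cn != neutral, and it is
    called aberrant when est_cn != neutral. Returns (tp_rate, fp_rate); a rate
    is None when its denominator is zero.
    """
    if len(true_cn) != len(est_cn):
        raise ValueError("sequences must have equal length")
    tp = fp = positives = negatives = 0
    for t, e in zip(true_cn, est_cn):
        called = e != neutral
        if t != neutral:
            positives += 1
            tp += called
        else:
            negatives += 1
            fp += called
    tp_rate = tp / positives if positives else None
    fp_rate = fp / negatives if negatives else None
    return tp_rate, fp_rate


# Toy sequences, not data from the paper.
true_cn = [2, 2, 3, 3, 1, 2]
est_cn = [2, 3, 3, 2, 1, 2]
print(tp_fp_rates(true_cn, est_cn))
```

Averaging these two rates over repeated simulated realizations would give a mean and spread comparable in form to the table entries.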
The true positive (TP) rates and false positive (FP) rates for the linear-median method and the cghMCR method, where n = 100.
| 0.2 | ||||||
| TP | 0.6771 (0.0539) | 0.0048 (0.0203) | 0.7438 (0.0505) | 0 (0) | 0.7335 (0.0461) | 0 (0) |
| FP | 0.0561 (0.0187) | 0 (0) | 0.3266 (0.0412) | 0 (0) | 0.5146 (0.0381) | 0 (0) |
| 0.4 | ||||||
| TP | 0.9566 (0.0233) | 0.6650 (0.1317) | 0.9299 (0.02653) | 0.0004 (0.0030) | 0.8718 (0.0341) | 0 (0) |
| FP | 0.0003 (0.0013) | 0.0455 (0.0270) | 0.0578 (0.0196) | 0 (0) | 0.2012 (0.0340) | 0 (0) |
| 0.6 | ||||||
| TP | 0.9998 (0.0015) | 0.9033 (0.0108) | 0.9920 (0.01010) | 0.2804 (0.0706) | 0.9556 (0.0239) | 0 (0) |
| FP | 0 (0) | 0 (0) | 0.0065 (0.0075) | 0 (0) | 0.0621 (0.0224) | 0 (0) |
| 0.8 | ||||||
| TP | 1 (0) | 0.8956 (0.0087) | 0.9998 (0.0015) | 0.3345 (0.0193) | 0.9922 (0.0099) | 0 (0) |
| FP | 0 (0) | 0 (0) | 0.0003 (0.0013) | 0 (0) | 0.0167 (0.0125) | 0 (0) |
| 1 | ||||||
| TP | 1 (0) | 0.8971 (0.0177) | 0.9998 (0.0015) | 0.2263 (0.1156) | 0.9983 (0.0044) | 0 (0) |
| FP | 0 (0) | 0 (0) | 0.0003 (0.0013) | 0 (0) | 0.0043 (0.0056) | 0 (0) |
Figure 3. Application of the CBS method to the sequence of the median of the logarithm of the ratios (top panel). The red bars show the estimated values of log2(t/2). Application of the linear-median method to the data in Example 3 (bottom panel), showing the estimates of t at each probe position.
Figure 4. The output of the linear-median adjusted method is shown in red and that of the cghMCR method is in green.
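The quantity plotted in the top panel can be sketched as follows. The example takes the per-probe median of log2 ratios across samples and inverts m = log2(t/2) to get a copy number t = 2 · 2^m. It illustrates only this median-and-inversion step; it is not the linear-median method itself, it does not include the CBS segmentation, and the function names and data are hypothetical.

```python
from statistics import median


def median_log2_ratios(samples):
    """Per-probe median of log2 ratios across samples.

    samples: list of equal-length lists, one list of log2 ratios per sample.
    """
    if not samples:
        return []
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must cover the same probes")
    return [median(s[j] for s in samples) for j in range(n_probes)]


def copy_number_from_log2(m):
    """Invert m = log2(t / 2) to recover t = 2 * 2**m."""
    return 2.0 * 2.0 ** m


# Three toy samples over three probes (not data from the paper).
samples = [
    [0.0, 0.58, -1.0],
    [0.1, 0.60, -0.9],
    [-0.1, 0.55, -1.1],
]
medians = median_log2_ratios(samples)
print([round(copy_number_from_log2(m), 2) for m in medians])
```

Using the median rather than the mean across samples keeps a single outlying sample from shifting the per-probe summary.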
Number of genes identified by the linear-median method (LM) and the cghMCR method in the regions of shared copy number aberrations with the status of copy number loss, neutrality or gain. NR/U is not cancer-related or unknown function phenotype, CR is cancer-related phenotype (except for lung cancer), and LCR is lung cancer-related phenotype.
| Losses | 670 | 346 | 89 | 33 | 9 | 4 | 768 | 383 |
| Neutral | 342 | 758 | 35 | 103 | 3 | 9 | 380 | 870 |
| Gains | 100 | 8 | 13 | 1 | 1 | 0 | 114 | 9 |
| 1112 | 137 | 13 | ||||||
List of lung cancer-related genes for each phenotypic group identified by the linear-median method (LM) and the cghMCR method.
| Loss | PSIP1, CDKN2A TUSC1, IGFBPL1 TLE1, FRMD3 DAPK1, MIRLET7A1 PTPN3 | PSIP1, CDKN2A TUSC1, IGFBPL1 |
| Neutral | PHF19, DAB2IP RPL12 | PHF19, DAB2IP RPL12, TLE1 FRMD3, DAPK1 MIRLET7A1, PTPN3 GAS1 |
| Gain | GAS1 |
Figure 5. The plot of the estimated copy numbers (<1 or >3) given by the linear-median method for π = 0.2.
| 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 |
| 5 | 5 | 5 | 5 | 5 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 |
| 1 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
The true positive (TP) rates and false positive (FP) rates for the linear-median method and the cghMCR method, where n = 50.
| 0.2 | ||||||
| TP | 0.6309 (0.0547) | 0.02442 (0.0626) | 0.7147 (0.0499) | 0 (0) | 0.7521 (0.0455) | 0 (0) |
| FP | 0.1712 (0.0346) | 6.45e-04 (0.0065) | 0.4866 (0.0488) | 0 (0) | 0.6425 (0.0437) | 0 (0) |
| 0.4 | ||||||
| TP | 0.8895 (0.0347) | 0.6643 (0.1542) | 0.8574 (0.0357) | 0.0019 (0.0109) | 0.7975 (0.0420) | 0 (0) |
| FP | 0.0089 (0.0070) | 0.0439 (0.0297) | 0.1737 (0.0365) | 0 (0) | 0.3603 (0.0416) | 0 (0) |
| 0.6 | ||||||
| TP | 0.9949 (0.0072) | 0.9046 (0.0149) | 0.9581 (0.0212) | 0.2762 (0.0842) | 0.8926 (0.0358) | 0 (0) |
| FP | 6.45e-05 (0.0006) | 0 (0) | 0.0482 (0.0189) | 0 (0) | 0.1814 (0.0364) | 0 (0) |
| 0.8 | ||||||
| TP | 1 (0) | 0.8962 (0.0118) | 0.9912 (0.0100) | 0.3384 (0.0416) | 0.9545 (0.0209) | 0 (0) |
| FP | 0 (0) | 0 (0) | 0.0100 (0.0082) | 0 (0) | 0.0826 (0.0238) | 0 (0) |
| 1 | ||||||
| TP | 1 (0) | 0.9207 (0.0154) | 0.9992 (0.0029) | 0.4155 (0.1679) | 0.9848 (0.0107) | 0 (0) |
| FP | 0 (0) | 0 (0) | 0.0023 (0.0038) | 0 (0) | 0.0348 (0.0153) | 0 (0) |