Meiling Liu, Sanghoon Moon, Longfei Wang, Sulgi Kim, Yeon-Jung Kim, Mi Yeong Hwang, Young Jin Kim, Robert C Elston, Bong-Jo Kim, Sungho Won.
Abstract
BACKGROUND: Copy number variation (CNV) is known to play an important role in the genetics of complex diseases, and several methods have been proposed to detect association of CNV with phenotypes of interest. Statistical methods for CNV association analysis fall into two strategies. In the first, the copy number is estimated by maximum likelihood and the association of the expected copy number with the phenotype is tested. In the second, the observed probe intensity measurements are used directly to detect association of CNV with the phenotypes of interest.
Keywords: Association analysis; CNV; Hematocrit; Score test
Year: 2017 PMID: 28420343 PMCID: PMC5395793 DOI: 10.1186/s12859-017-1622-z
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Basic characteristics of study participants and hematological trait
| Variables | Discovery (family) | Replication (cohort) |
|---|---|---|
| Sample size (n) | 521 | 4694 |
| Age (years) | 38.2 ± 18.3 | 54.0 ± 9.0 |
| Male (%) | 45.7% | 47.1% |
| Hematocrit (%) | 41.3 ± 4.3 | 41.1 ± 4.5 |
Empirical type 1 error estimates (M = 3)
| Scenario | Test | α = .005 | α = .05 | α = .1 | α = .2 |
|---|---|---|---|---|---|
| BSC | T1 | 0.0060 ± 0.0021 | 0.0504 ± 0.0061 | 0.1018 ± 0.0084 | 0.2082 ± 0.0113 |
| BSC | T2 | 0.0056 ± 0.0021 | 0.0550 ± 0.0063 | 0.1008 ± 0.0086 | 0.2072 ± 0.0112 |
| MSC | T1 | 0.0048 ± 0.0019 | 0.0486 ± 0.0060 | 0.1006 ± 0.0083 | 0.2104 ± 0.0113 |
| MSC | T2 | 0.0048 ± 0.0019 | 0.0472 ± 0.0059 | 0.0956 ± 0.0082 | 0.1884 ± 0.0108 |
| WSC | T1 | 0.0056 ± 0.0021 | 0.0516 ± 0.0061 | 0.0962 ± 0.0082 | 0.2006 ± 0.0111 |
| WSC | T2 | 0.0048 ± 0.0019 | 0.0498 ± 0.0060 | 0.0968 ± 0.0082 | 0.1922 ± 0.0109 |
The 95% confidence intervals of empirical type I error estimates for the proposed methods were calculated from 5000 replicates at four significance levels under BSC, MSC and WSC, when there are three copy number clusters
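The half-widths reported for the type I error estimates are consistent with a normal-approximation (Wald) interval for a proportion estimated from 5000 replicates. The sketch below reproduces one of them under that assumption; the record does not state the interval formula, and the function name `wald_half_width` is hypothetical.

```python
import math

def wald_half_width(p_hat: float, n_replicates: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% CI for a proportion.

    Assumption: the estimates are simple rejection fractions over
    n_replicates independent simulated data sets.
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n_replicates)

# BSC, T1, nominal level .05, 5000 replicates (estimate taken from the table).
half_width = wald_half_width(0.0504, 5000)
print(f"0.0504 ± {half_width:.4f}")  # prints 0.0504 ± 0.0061
```

An estimate is compatible with its nominal level when the nominal value lies inside the interval, for example .05 within 0.0504 ± 0.0061.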
Fig. 1 The QQ plots without genomic control for T1 and T2 from simulated data. The empirical p-values adjusted by genomic control for the proposed methods were calculated under the null hypothesis with 5000 replicates under BSC, MSC and WSC, and their QQ plots are shown.
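For readers unfamiliar with genomic control, the following is a minimal sketch of the standard adjustment, assuming each test statistic is chi-square distributed with 1 degree of freedom under the null hypothesis. The record does not say how the adjustment was implemented here, so the function names and the 1-df assumption are illustrative only.

```python
import math
from statistics import NormalDist, median

STD_NORMAL = NormalDist()
CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square variable with 1 df

def p_to_chi2(p):
    """Map a two-sided p-value in (0, 1] to its 1-df chi-square statistic."""
    z = STD_NORMAL.inv_cdf(1.0 - p / 2.0)
    return z * z

def inflation_factor(p_values):
    """Genomic-control lambda: median observed statistic over the null median."""
    return median([p_to_chi2(p) for p in p_values]) / CHI2_1DF_MEDIAN

def gc_adjust(p_values):
    """Rescale statistics by lambda (floored at 1) and convert back to p-values."""
    lam = max(inflation_factor(p_values), 1.0)
    return [2.0 * (1.0 - STD_NORMAL.cdf(math.sqrt(p_to_chi2(p) / lam)))
            for p in p_values]
```

A lambda close to 1 indicates little inflation, in which case the adjusted and unadjusted QQ plots are nearly identical.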
Empirical power estimates (M = 3)
| Significance level | Scenario | Test | β = .1 | β = .2 | β = .3 | β = .4 | β = .5 | β = .6 |
|---|---|---|---|---|---|---|---|---|
| .001 | BSC | T1 | 0.0135 | 0.1390 | 0.4830 | 0.8410 | 0.9830 | 0.9985 |
| .001 | BSC | T2 | 0.0065 | 0.0510 | 0.2370 | 0.5745 | 0.8860 | 0.9795 |
| .001 | BSC | FBAT | 0.0005 | 0.0060 | 0.0340 | 0.1300 | 0.3570 | 0.6005 |
| .001 | MSC | T1 | 0.0160 | 0.1570 | 0.5530 | 0.8740 | 0.9885 | 1.0000 |
| .001 | MSC | T2 | 0.0085 | 0.0685 | 0.3200 | 0.6900 | 0.9290 | 0.9945 |
| .001 | MSC | FBAT | 0.0000 | 0.0100 | 0.0695 | 0.2505 | 0.5575 | 0.8385 |
| .001 | WSC | T1 | 0.0195 | 0.1615 | 0.5375 | 0.8935 | 0.9910 | 0.9980 |
| .001 | WSC | T2 | 0.0075 | 0.0815 | 0.3300 | 0.7240 | 0.9545 | 0.9970 |
| .001 | WSC | FBAT | 0.0010 | 0.0115 | 0.0915 | 0.3240 | 0.6460 | 0.8955 |
| .01 | BSC | T1 | 0.0710 | 0.3585 | 0.7510 | 0.9605 | 0.9990 | 1.0000 |
| .01 | BSC | T2 | 0.0265 | 0.1780 | 0.4630 | 0.7900 | 0.9685 | 0.9950 |
| .01 | BSC | FBAT | 0.0155 | 0.0455 | 0.1540 | 0.3775 | 0.6620 | 0.8430 |
| .01 | MSC | T1 | 0.0725 | 0.3805 | 0.8070 | 0.9690 | 0.9990 | 1.0000 |
| .01 | MSC | T2 | 0.0340 | 0.2100 | 0.5640 | 0.8660 | 0.9790 | 0.9980 |
| .01 | MSC | FBAT | 0.0150 | 0.0550 | 0.2400 | 0.5385 | 0.8155 | 0.9645 |
| .01 | WSC | T1 | 0.0800 | 0.3795 | 0.7925 | 0.9740 | 0.9985 | 1.0000 |
| .01 | WSC | T2 | 0.0370 | 0.2115 | 0.5730 | 0.8855 | 0.9920 | 0.9995 |
| .01 | WSC | FBAT | 0.0175 | 0.0710 | 0.2740 | 0.6350 | 0.8775 | 0.9740 |
| .05 | BSC | T1 | 0.1905 | 0.5930 | 0.9075 | 0.9920 | 1.0000 | 1.0000 |
| .05 | BSC | T2 | 0.0975 | 0.3505 | 0.6880 | 0.9080 | 0.9910 | 0.9990 |
| .05 | BSC | FBAT | 0.0575 | 0.9610 | 0.3660 | 0.6310 | 0.8555 | 0.9525 |
| .05 | MSC | T1 | 0.2050 | 0.6190 | 0.9295 | 0.9900 | 1.0000 | 1.0000 |
| .05 | MSC | T2 | 0.1110 | 0.3915 | 0.7650 | 0.9495 | 0.9970 | 1.0000 |
| .05 | MSC | FBAT | 0.0725 | 0.2080 | 0.4990 | 0.7585 | 0.9305 | 0.9930 |
| .05 | WSC | T1 | 0.2095 | 0.6145 | 0.9260 | 0.9950 | 0.9995 | 1.0000 |
| .05 | WSC | T2 | 0.1255 | 0.3995 | 0.7650 | 0.9530 | 0.9990 | 1.0000 |
| .05 | WSC | FBAT | 0.0790 | 0.2240 | 0.5460 | 0.8415 | 0.9705 | 0.9960 |
The empirical power of the proposed methods was estimated at various significance levels, based on 2000 replicates for each value of β under BSC, MSC and WSC, when there are three copy number clusters. T1 denotes the score test using the inferred CNVs, and T2 denotes the score test using the intensity measurements. For comparison, power was also calculated using FBAT.
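Empirical power in a simulation study of this kind is usually the fraction of replicates whose p-value falls below the significance level. The sketch below illustrates that calculation on made-up p-values; the helper name and the strict inequality are assumptions, not details taken from the record.

```python
def empirical_rejection_rate(p_values, alpha):
    """Fraction of simulation replicates rejected at significance level alpha."""
    p_values = list(p_values)
    if not p_values:
        raise ValueError("at least one replicate is required")
    return sum(p < alpha for p in p_values) / len(p_values)

# Toy p-values for illustration only (not data from the study).
toy_p_values = [0.001, 0.02, 0.2, 0.6]
print(empirical_rejection_rate(toy_p_values, 0.05))  # prints 0.5
```

Applied to replicates simulated under the null hypothesis (β = 0), the same quantity gives the empirical type I error.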
Fig. 2 The QQ plot and Manhattan plots for T1 and T2 from analysis of the family data
The most significant results of T1 and T2 from analyzing the family data
| Chr | Position | 0/1/2 | P-value (T1) | P-value (T2) |
|---|---|---|---|---|
| 8 | 94141469–94142527 | 42/226/253 | 1.38e-03 | 4.67e-02 |
| 5 | 147534018–147534337 | 119/265/137 | 1.19e-02 | 3.92e-03 |