Alison P Klein, Ya-Yu Tsai, Priya Duggal, Elizabeth M Gillanders, Michael Barnhart, Rasika A Mathias, Ian P Dusenberry, Amy Turiff, Peter S Chines, Janet Goldstein, Robert Wojciechowski, Wayne Hening, Elizabeth W Pugh, Joan E Bailey-Wilson.
Abstract
Genome-wide linkage analysis using microsatellite markers has been successful in the identification of numerous Mendelian and complex disease loci. The recent availability of high-density single-nucleotide polymorphism (SNP) maps provides a potentially more powerful option. Using the simulated and Collaborative Study on the Genetics of Alcoholism (COGA) datasets from the Genetic Analysis Workshop 14 (GAW14), we examined how altering the density of SNP marker sets impacted the overall information content, the power to detect trait loci, and the number of false positive results. For the simulated data we used SNP maps with density of 0.3 cM, 1 cM, 2 cM, and 3 cM. For the COGA data we combined the marker sets from Illumina and Affymetrix to create a map with average density of 0.25 cM and then, using a sub-sample of these markers, created maps with density of 0.3 cM, 0.6 cM, 1 cM, 2 cM, and 3 cM. For each marker set, multipoint linkage analysis using MERLIN was performed for both dominant and recessive traits derived from marker loci. Our results showed that information content increased with increased map density. For the homogeneous, completely penetrant traits we created, there was only a modest difference in ability to detect trait loci. Additionally, as map density increased there was only a slight increase in the number of false positive results when there was linkage disequilibrium (LD) between markers. The presence of LD between markers may have led to an increased number of false positive regions but no clear relationship between regions of high LD and locations of false positive linkage signals was observed.
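The abstract does not say how the sub-sample of markers was chosen for each target density. As a rough illustration only, the sketch below thins a dense map on one chromosome with a greedy rule that keeps a marker whenever it lies at least the target spacing beyond the last kept marker. The function names and the greedy rule are assumptions for illustration, not the authors' procedure.

```python
from typing import List, Tuple

# (marker name, genetic position in cM) on a single chromosome
Marker = Tuple[str, float]


def thin_markers(markers: List[Marker], spacing_cm: float) -> List[Marker]:
    """Greedily keep markers that are at least `spacing_cm` apart.

    Hypothetical helper: the source does not describe its selection rule.
    """
    kept: List[Marker] = []
    for name, pos in sorted(markers, key=lambda m: m[1]):
        if not kept or pos - kept[-1][1] >= spacing_cm:
            kept.append((name, pos))
    return kept


def average_spacing(markers: List[Marker]) -> float:
    """Average distance in cM between adjacent markers."""
    if len(markers) < 2:
        return float("nan")
    positions = sorted(pos for _, pos in markers)
    return (positions[-1] - positions[0]) / (len(positions) - 1)


# Toy dense map: one marker every 0.25 cM over 10 cM.
dense_map = [(f"snp{i}", i * 0.25) for i in range(41)]
sparse_map = thin_markers(dense_map, spacing_cm=1.0)
```

A greedy pass is the simplest choice; a real selection would probably also weigh marker heterozygosity and genotyping quality, which this sketch ignores.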
Year: 2005 PMID: 16451629 PMCID: PMC1866766 DOI: 10.1186/1471-2156-6-S1-S20
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Information content
| Data set | Marker set | Number of markers in map | Mean minimum^a (SD) | Overall mean^a (SD) | Mean maximum^a (SD) |
| Simulated | MS (~7.5 cM) | 416 | 0.812 (0.077) | 0.934 (0.004) | 0.9724 (0.005) |
| | SNP 3 cM | 917 | 0.644 (0.084) | 0.833 (0.015) | 0.914 (0.010) |
| | SNP 1 cM^b | 34 | 0.849 (0.018) | 0.937 (0.006) | 0.969 (0.006) |
| | SNP 0.3 cM^b | 201 | 0.933 (0.013) | 0.986 (0.001) | 0.998 (0.001) |
| COGA | MS (~13.5 cM) | 315 | 0.586 (0.078) | 0.744 (0.060) | 0.840 (0.064) |
| | SNP 3 cM | 1103 | 0.674 (0.076) | 0.747 (0.012) | 0.820 (0.015) |
| | SNP 2 cM | 1792 | 0.566 (0.074) | 0.767 (0.010) | 0.825 (0.010) |
| | SNP 1 cM | 2382 | 0.692 (0.055) | 0.868 (0.008) | 0.910 (0.007) |
| | SNP 0.6 cM | 3671 | 0.724 (0.059) | 0.895 (0.006) | 0.930 (0.006) |
| | SNP 0.3 cM | 5405 | 0.751 (0.062) | 0.916 (0.005) | 0.943 (0.006) |
| | SNP 0.25 cM | 15015 | 0.825 (0.046) | 0.939 (0.005) | 0.955 (0.003) |
^a Overall mean, average minimum, and average maximum information content across all 3 populations and replicates for simulated data and across all 22 chromosomes for COGA data.
^b The SNP 1 cM and SNP 0.3 cM maps for the simulated data are based only on the regions for which fine mapping markers were purchased.
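The three summary columns above can be illustrated with a small sketch. It assumes that information content is available as a list of values per analysis unit (a chromosome for COGA, or a population-replicate combination for the simulated data), and that the "overall mean" is the mean of per-unit means. Both points are assumptions; the source does not spell out the exact averaging.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarize_information(
    info_by_unit: Dict[str, List[float]]
) -> Dict[str, Tuple[float, float]]:
    """Return (mean, SD) across units of the per-unit minimum, mean, and maximum.

    `info_by_unit` maps a unit label (hypothetical, e.g. a chromosome name)
    to the information content values along that unit. Needs >= 2 units
    because the sample SD is undefined for a single value.
    """
    minima = [min(values) for values in info_by_unit.values()]
    means = [mean(values) for values in info_by_unit.values()]
    maxima = [max(values) for values in info_by_unit.values()]
    return {
        "mean_minimum": (mean(minima), stdev(minima)),
        "overall_mean": (mean(means), stdev(means)),
        "mean_maximum": (mean(maxima), stdev(maxima)),
    }


# Toy input, not data from the study.
toy_info = {"chr1": [0.5, 0.7, 0.9], "chr2": [0.6, 0.8, 1.0]}
summary = summarize_information(toy_info)
```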
Power in simulated data
| Trait | Marker set | Population disease frequency | Proportion of replicates with p-value below^a | | | | | |
| | | | 0.05 | 0.01 | 0.0017 | 0.001 | 0.0001 | 0.000049 |
| Dominant | | | | | | | | |
| D8044 | MS 7.5 cM | 0.06 | 0.96 | 0.89 | 0.67 | 0.57 | 0.27 | 0.16 |
| | SNP 3 cM | | 0.95 | 0.86 | 0.60 | 0.51 | 0.18 | 0.12 |
| | SNP 1 cM | | 0.98 | 0.90 | 0.72 | 0.63 | 0.29 | 0.21 |
| | SNP 0.3 cM | | 0.96 | 0.92 | 0.77 | 0.69 | 0.37 | 0.27 |
| D8050 | MS 7.5 cM | 0.18 | 1.00 | 1.00 | 0.99 | 0.99 | 0.97 | 0.94 |
| | SNP 3 cM | | 1.00 | 1.00 | 0.99 | 0.99 | 0.93 | 0.91 |
| | SNP 1 cM | | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.98 |
| | SNP 0.3 cM | | 0.99 | 0.99 | 0.99 | 0.99 | 0.97 | 0.96 |
| D8051 | MS 7.5 cM | 0.50 | 0.99 | 0.95 | 0.88 | 0.83 | 0.62 | 0.54 |
| | SNP 3 cM | | 0.99 | 0.95 | 0.85 | 0.81 | 0.59 | 0.51 |
| | SNP 1 cM | | 1.00 | 0.97 | 0.90 | 0.88 | 0.70 | 0.61 |
| | SNP 0.3 cM | | 1.00 | 0.98 | 0.92 | 0.9 | 0.77 | 0.70 |
| Recessive | | | | | | | | |
| R8045 | MS 7.5 cM | 0.08 | 0.99 | 0.93 | 0.74 | 0.67 | 0.32 | 0.22 |
| | SNP 3 cM | | 0.98 | 0.89 | 0.69 | 0.61 | 0.24 | 0.15 |
| | SNP 1 cM | | 0.99 | 0.95 | 0.78 | 0.73 | 0.36 | 0.26 |
| | SNP 0.3 cM | | 0.98 | 0.95 | 0.84 | 0.78 | 0.44 | 0.34 |
| R8050 | MS 7.5 cM | 0.01 | 0.46 | 0.10 | 0.005 | 0 | 0 | 0 |
| | SNP 3 cM | | 0.41 | 0.05 | 0 | 0 | 0 | 0 |
| | SNP 1 cM | | 0.79 | 0.45 | 0.13 | 0.004 | 0 | 0 |
| | SNP 0.3 cM | | 0.47 | 0.13 | 0.005 | 0.005 | 0 | 0 |
| R8051 | MS 7.5 cM | 0.22 | 1.00 | 1.00 | 0.96 | 0.94 | 0.83 | 0.78 |
| | SNP 3 cM | | 1.00 | 1.00 | 0.94 | 0.92 | 0.81 | 0.74 |
| | SNP 1 cM | | 1.00 | 1.00 | 0.98 | 0.98 | 0.88 | 0.85 |
| | SNP 0.3 cM | | 0.99 | 0.99 | 0.99 | 0.98 | 0.90 | 0.88 |
^a Proportion of replicates with a p-value below each threshold within a 20 cM range of the given "true" trait locus. The results were summarized across the 3 simulated populations; each population was analyzed separately.
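The power summary described in the footnote can be sketched as follows. The sketch assumes each replicate is a list of (position in cM, p-value) pairs on the chromosome carrying the trait locus, and it reads "within a 20 cM range" as at most 20 cM on either side of the locus. That reading, the strict inequality, and the function name are assumptions rather than details given in the source.

```python
from typing import List, Tuple

# One replicate: (position in cM, p-value) at each analysis position
Scan = List[Tuple[float, float]]


def power_at_threshold(
    replicates: List[Scan],
    true_locus_cm: float,
    threshold: float,
    window_cm: float = 20.0,  # assumed meaning of "within a 20 cM range"
) -> float:
    """Fraction of replicates with a p-value below `threshold` near the locus."""
    hits = 0
    for scan in replicates:
        nearby = [p for pos, p in scan if abs(pos - true_locus_cm) <= window_cm]
        if nearby and min(nearby) < threshold:
            hits += 1
    return hits / len(replicates)


# Toy replicates, not data from the study.
toy_replicates = [
    [(10.0, 0.2), (50.0, 0.0005)],
    [(55.0, 0.03), (90.0, 0.00001)],  # strong signal, but 40 cM from the locus
    [(45.0, 0.2)],
]
power_strict = power_at_threshold(toy_replicates, true_locus_cm=50.0, threshold=0.001)
power_loose = power_at_threshold(toy_replicates, true_locus_cm=50.0, threshold=0.05)
```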
Power in COGA data
| Trait | Marker set | Disease frequency | LOD | Minimum p-value^a |
| Dominant | | | | |
| Drs0041510 | MS | 0.14 | 3.1 | 0.00008 |
| | SNP 3 cM | | 3.8 | 0.00001 |
| | SNP 2 cM | | 2.8 | 0.0002 |
| | SNP 1 cM | | 4.1 | 0.00001 |
| | SNP 0.6 cM | | 3.0 | 0.00008 |
| | SNP 0.3 cM | | 3.9 | 0.00001 |
| | SNP 0.25 cM | | 4.1 | 0.00001 |
| Dtsc0061481 | MS | 0.31 | 1.0 | 0.02^b |
| | SNP 3 cM | | 5.7 | <0.00001 |
| | SNP 2 cM | | 5.0 | <0.00001 |
| | SNP 1 cM | | 5.9 | <0.00001 |
| | SNP 0.6 cM | | 6.1 | <0.00001 |
| | SNP 0.3 cM | | 6.3 | <0.00001 |
| | SNP 0.25 cM | | 6.1 | <0.00001 |
| Recessive | | | | |
| Rtsc0061481 | MS | 0.03 | 0.81 | 0.03^b |
| | SNP 3 cM | | 1.65 | 0.003 |
| | SNP 2 cM | | 1.70 | 0.003 |
| | SNP 1 cM | | 1.68 | 0.002 |
| | SNP 0.6 cM | | 1.68 | 0.003 |
| | SNP 0.3 cM | | 1.68 | 0.002 |
| | SNP 0.25 cM | | 1.79 | 0.002 |
| Rtsc2832191 | MS | 0.22 | 6.0 | <0.00001 |
| | SNP 3 cM | | 4.5 | <0.00001 |
| | SNP 2 cM | | 6.2 | <0.00001 |
| | SNP 1 cM | | 6.7 | <0.00001 |
| | SNP 0.6 cM | | 7.4 | <0.00001 |
| | SNP 0.3 cM | | 7.5 | <0.00001 |
| | SNP 0.25 cM | | 3.6 | <0.00003 |
^a Minimum p-value within 20 cM of the "true" trait locus.
^b Marker D13S325, located about 12.3 cM from the trait locus, gave a p-value of 0.0004 for trait Dtsc0061481 and a p-value of 0.004 for trait Rtsc0061481.
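The paired LOD scores and p-values in this table look consistent with the common large-sample conversion, in which 2 ln(10) × LOD is treated as a chi-square statistic with 1 degree of freedom and the p-value is one-sided. The source does not state how its p-values were computed, so the sketch below is an assumption offered for orientation, not a description of the MERLIN output used in the study.

```python
import math


def lod_to_p_value(lod: float) -> float:
    """Approximate one-sided p-value for a LOD score.

    Assumes 2*ln(10)*LOD ~ chi-square(1) under no linkage, halved because
    linkage is a one-sided alternative. This is the usual asymptotic rule,
    not a formula quoted from the source.
    """
    if lod < 0:
        raise ValueError("this sketch only handles non-negative LOD scores")
    z = math.sqrt(2.0 * math.log(10.0) * lod)
    # Upper-tail probability of a standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_lod_3_1 = lod_to_p_value(3.1)  # compare with 0.00008 in the MS row above
p_lod_3_3 = lod_to_p_value(3.3)  # close to the 0.000049 threshold
```

Under this rule a LOD of about 3.3 corresponds to a p-value of roughly 0.00005, which may explain the choice of 0.000049 as the most stringent threshold in the tables.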
Type I error count in simulated data for full dataset
| Trait | Marker set | Population disease frequency | Number of replicates with data^a | Mean number of false positives below^b | | | | | |
| | | | | 0.05 | 0.01 | 0.0017 | 0.001 | 0.0001 | 0.000049 |
| Dominant | | | | | | | | | |
| D8044 | MS 7.5 cM | 0.06 | 300 | 8.10 | 1.92 | 0.18 | 0.08 | 0 | 0 |
| | SNP 3 cM | | | 7.73 | 1.79 | 0.16 | 0.09 | 0 | 0 |
| D8050 | MS 7.5 cM | 0.18 | 300 | 7.63 | 1.82 | 0.53 | 0.38 | 0.03 | 0.02 |
| | SNP 3 cM | | | 7.33 | 2.11 | 0.47 | 0.30 | 0.03 | 0.02 |
| D8051 | MS 7.5 cM | 0.50 | 300 | 7.48 | 2.56 | 0.45 | 0.30 | 0.04 | 0.02 |
| | SNP 3 cM | | | 7.22 | 2.19 | 0.45 | 0.30 | 0.04 | 0.02 |
| Recessive | | | | | | | | | |
| R8045 | MS 7.5 cM | 0.08 | 300 | 8.59 | 2.36 | 0.29 | 0.16 | 0.01 | 0 |
| | SNP 3 cM | | | 8.11 | 2.23 | 0.25 | 0.14 | 0.01 | 0 |
| R8050 | MS 7.5 cM | 0.01 | 229 | 2.66 | 0.06 | 0 | 0 | 0 | 0 |
| | SNP 3 cM | | | 2.20 | 0.03 | 0 | 0 | 0 | 0 |
| R8051 | MS 7.5 cM | 0.22 | 300 | 7.57 | 2.02 | 0.39 | 0.27 | 0.03 | 0.02 |
| | SNP 3 cM | | | 7.44 | 2.03 | 0.47 | 0.34 | 0.04 | 0.01 |
^a For rare diseases, not all replicates contained informative pedigrees.
^b Mean number of false positive regions in the 9 unlinked chromosomes per replicate with p-value below the following criteria.
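Counting false positive regions, rather than individual positions, requires some rule for grouping adjacent significant positions. The source does not give its rule, so the sketch below assumes the simplest one: consecutive analysis positions on the same unlinked chromosome with p-values below the threshold form a single region. The data layout and function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

# (chromosome label, position in cM, p-value) on unlinked chromosomes only
ScanRow = Tuple[str, float, float]


def count_false_positive_regions(scan: List[ScanRow], threshold: float) -> int:
    """Count runs of consecutive positions with p < threshold, per chromosome."""
    regions = 0
    ordered = sorted(scan, key=lambda row: (row[0], row[1]))
    for _, rows in groupby(ordered, key=lambda row: row[0]):
        in_region = False
        for _, _, p_value in rows:
            below = p_value < threshold
            if below and not in_region:
                regions += 1  # a new run starts here
            in_region = below
    return regions


def mean_false_positives(replicates: List[List[ScanRow]], threshold: float) -> float:
    """Average region count per replicate, as in the table's summary column."""
    counts = [count_false_positive_regions(scan, threshold) for scan in replicates]
    return sum(counts) / len(counts)


# Toy scan, not data from the study.
toy_scan = [
    ("1", 0.0, 0.2), ("1", 1.0, 0.01), ("1", 2.0, 0.02),
    ("1", 3.0, 0.3), ("1", 4.0, 0.04),
    ("2", 0.0, 0.03), ("2", 1.0, 0.5),
]
regions_at_05 = count_false_positive_regions(toy_scan, 0.05)
regions_at_015 = count_false_positive_regions(toy_scan, 0.015)
```

A distance-based rule (for example, merging signals closer than some number of cM) would give different counts on dense maps, so the grouping rule matters when comparing marker sets.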
Type I error count in densely mapped simulated data
| Trait | Marker set | Population disease frequency | Number of replicates with data | Mean number of false positives below^a | | | | | |
| | | | | 0.05 | 0.01 | 0.0017 | 0.001 | 0.0001 | 0.000049 |
| Dominant | | | | | | | | | |
| D8044 | SNP 1 cM | 0.06 | 300 | 0.157 | 0.063 | 0 | 0 | 0 | 0 |
| | SNP 0.3 cM | | | 0.177 | 0.053 | 0.010 | 0.007 | 0 | 0 |
| D8050 | SNP 1 cM | 0.18 | 300 | 0.146 | 0.003 | 0.007 | 0.007 | 0.003 | 0 |
| | SNP 0.3 cM | | | 0.18 | 0.037 | 0.013 | 0.007 | 0.003 | 0 |
| D8051 | SNP 1 cM | 0.50 | 300 | 0.230 | 0.057 | 0.02 | 0.013 | 0 | 0 |
| | SNP 0.3 cM | | | 0.280 | 0.100 | 0.02 | 0.017 | 0 | 0 |
| Recessive | | | | | | | | | |
| R8045 | SNP 1 cM | 0.08 | 300 | 0.113 | 0.037 | 0.007 | 0.003 | 0 | 0 |
| | SNP 0.3 cM | | | 0.147 | 0.050 | 0.003 | 0 | 0 | 0 |
| R8050 | SNP 1 cM | 0.01 | 229 | 0.039 | 0.004 | 0 | 0 | 0 | 0 |
| | SNP 0.3 cM | | | 0.037 | 0.004 | 0 | 0 | 0 | 0 |
| R8051 | SNP 1 cM | 0.22 | 300 | 0.160 | 0.060 | 0.020 | 0.020 | 0.007 | 0.007 |
| | SNP 0.3 cM | | | 0.183 | 0.063 | 0.020 | 0.003 | 0.007 | 0.007 |
^a Mean number of false positive results in the ~18 cM unlinked region per replicate with p-value below the following criteria.
Type I error in COGA data
| Trait | Marker set | Disease frequency | Number of false positives below^a | | | | | |
| | | | 0.05 | 0.01 | 0.0017 | 0.001 | 0.0001 | 0.000049 |
| Dominant | | | | | | | | |
| Drs0041510 | MS | 0.14 | 7 | 2 | 2 | 2 | 0 | 0 |
| | SNP 3 cM | | 10 | 3 | 1 | 0 | 0 | 0 |
| | SNP 2 cM | | 7 | 1 | 0 | 0 | 0 | 0 |
| | SNP 1 cM | | 15 | 2 | 1 | 1 | 0 | 0 |
| | SNP 0.6 cM | | 12 | 5 | 1 | 1 | 0 | 0 |
| | SNP 0.3 cM | | 14 | 5 | 2 | 1 | 0 | 0 |
| | SNP 0.25 cM | | 18 | 8 | 2 | 1 | 0 | 0 |
| Dtsc0061481 | MS | 0.31 | 6 | 1 | 1 | 0 | 0 | 0 |
| | SNP 3 cM | | 8 | 4 | 2 | 1 | 0 | 0 |
| | SNP 2 cM | | 12 | 6 | 2 | 1 | 0 | 0 |
| | SNP 1 cM | | 17 | 7 | 0 | 0 | 0 | 0 |
| | SNP 0.6 cM | | 16 | 6 | 2 | 2 | 0 | 0 |
| | SNP 0.3 cM | | 15 | 7 | 1 | 1 | 0 | 0 |
| | SNP 0.25 cM | | 24 | 9 | 3 | 1 | 0 | 0 |
| Recessive | | | | | | | | |
| Rtsc0061481 | MS | 0.03 | 8 | 0 | 0 | 0 | 0 | 0 |
| | SNP 3 cM | | 7 | 0 | 0 | 0 | 0 | 0 |
| | SNP 2 cM | | 7 | 0 | 0 | 0 | 0 | 0 |
| | SNP 1 cM | | 9 | 0 | 0 | 0 | 0 | 0 |
| | SNP 0.6 cM | | 10 | 0 | 0 | 0 | 0 | 0 |
| | SNP 0.3 cM | | 9 | 0 | 0 | 0 | 0 | 0 |
| | SNP 0.25 cM | | 13 | 1 | 0 | 0 | 0 | 0 |
| Rtsc2832191 | MS | 0.22 | 6 | 4 | 2 | 0 | 0 | 0 |
| | SNP 3 cM | | 10 | 3 | 1 | 0 | 0 | 0 |
| | SNP 2 cM | | 12 | 3 | 2 | 2 | 0 | 0 |
| | SNP 1 cM | | 13 | 4 | 1 | 1 | 0 | 0 |
| | SNP 0.6 cM | | 14 | 6 | 2 | 1 | 0 | 0 |
| | SNP 0.3 cM | | 16 | 6 | 1 | 1 | 1 | 0 |
| | SNP 0.25 cM | | 23 | 6 | 2 | 2 | 0 | 0 |
^a Number of false positive regions across the 18 unlinked chromosomes with p-value below the following criteria.