Robert Lawrence, Aaron G Day-Williams, Katherine S Elliott, Andrew P Morris, Eleftheria Zeggini.
Abstract
BACKGROUND: Genome-wide association studies have been successful in finding common variants influencing common traits. However, these associations only account for a fraction of trait heritability. There has been a shift in the field towards studying low frequency and rare variants, which are now widely recognised as putative complex trait determinants. Despite this increasing focus on examining the role of low frequency and rare variants in complex disease susceptibility, there is a lack of user-friendly analytical packages implementing powerful association tests for the analysis of rare variants.
Year: 2010 PMID: 20964851 PMCID: PMC2973964 DOI: 10.1186/1471-2105-11-527
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. CCRaVAT and QuTie Workflow. Flowchart summarizing the implementation of the low frequency/rare variant analysis methods in CCRaVAT and QuTie.
Three Column Map File
| Chromosome | Marker name | Base pair position |
|---|---|---|
| 1 | SNP1 | 1111 |
| 1 | SNP2 | 2111 |
| 1 | SNP3 | 3111 |
| 1 | SNP4 | 4111 |
3 column map file that contains the chromosome, marker name, and base pair position of each marker. The header row is for display purposes only and should not appear in the actual file.
Four Column Map File
| Chromosome | Marker name | Genetic position | Base pair position |
|---|---|---|---|
| 1 | SNP1 | 0 | 1111 |
| 1 | SNP2 | 1 | 2111 |
| 1 | SNP3 | 2 | 3111 |
| 1 | SNP4 | 3 | 4111 |
4 column map file that contains the chromosome, marker name, genetic position, and base pair position of each marker. The four column format is the same as that used by the genetics software package PLINK. The header row is for display purposes only and should not appear in the actual file.
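As a minimal sketch (not code from CCRaVAT or QuTie), both map layouts can be read by a single parser that counts the fields on each line. The names `Marker` and `parse_map_line` are hypothetical, and whitespace-delimited fields are an assumption.

```python
from typing import NamedTuple, Optional


class Marker(NamedTuple):
    chromosome: str
    name: str
    genetic_pos: Optional[float]  # None for the three-column layout
    bp_pos: int


def parse_map_line(line: str) -> Marker:
    """Parse one line of a three- or four-column map file (hypothetical helper)."""
    fields = line.split()  # assumption: fields are separated by tabs or spaces
    if len(fields) == 3:
        chrom, name, bp = fields
        return Marker(chrom, name, None, int(bp))
    if len(fields) == 4:
        chrom, name, genetic, bp = fields
        return Marker(chrom, name, float(genetic), int(bp))
    raise ValueError(f"expected 3 or 4 fields, got {len(fields)}: {line!r}")


# The same marker in each layout, taken from the example tables.
three = parse_map_line("1\tSNP2\t2111")
four = parse_map_line("1\tSNP2\t1\t2111")
```

Keeping the genetic position optional lets downstream code treat both layouts uniformly, since only the base pair position is needed to place a marker within a gene or window.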
Pedigree File
| Pedigree ID | Individual ID | Paternal ID | Maternal ID | Sex | Phenotype | SNP1 allele 1 | SNP1 allele 2 | SNP2 allele 1 | SNP2 allele 2 | SNP3 allele 1 | SNP3 allele 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 1 | 1 | A | A | A | C | T | G |
| 2 | 2 | 0 | 0 | 2 | 1 | A | G | A | A | G | G |
| 3 | 3 | 0 | 0 | 2 | 2 | G | G | C | C | T | T |
| 4 | 4 | 0 | 0 | 1 | 2 | A | G | A | C | T | G |
Pedigree file that contains genotype data for 3 SNPs and 4 individuals (2 controls and 2 cases). The first column is for pedigree IDs, the second for individual IDs, the third for paternal ID, the fourth for maternal ID, the fifth for sex code, and the sixth for disease designation or quantitative phenotype value. Column 7 starts the genotype data for the markers, with each allele of each genotype in its own column (e.g. for 3 markers there will be 6 allele columns). The header row is for display purposes only and should not appear in the actual file.
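The column layout described above can be illustrated with a short sketch that splits one pedigree line into its six leading fields and pairs up the remaining allele columns. `Individual` and `parse_ped_line` are hypothetical names, whitespace delimiting is assumed, and the phenotype is kept as text because it may be a disease code or a quantitative value.

```python
from typing import List, NamedTuple, Tuple


class Individual(NamedTuple):
    pedigree_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str
    phenotype: str  # disease designation or quantitative value, kept as text
    genotypes: List[Tuple[str, str]]  # one (allele1, allele2) pair per marker


def parse_ped_line(line: str) -> Individual:
    """Parse one pedigree-file line (hypothetical helper, whitespace-delimited)."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 leading columns")
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("allele columns must come in pairs")
    # Columns 7 and 8 form the first genotype, 9 and 10 the second, and so on.
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return Individual(*fields[:6], genotypes)


# Second individual from the example table.
person = parse_ped_line("2 2 0 0 2 1 A G A A G G")
```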
CCRaVAT Summary Output File
| MGC33212 | 3 | 197456409 | 197583455 | (10/1) | 2 | 1909 | 31 | 2903 | 15.5 | 0.000083 | 1.96E-05 |
| PPIC | 5 | 122336979 | 122450324 | (12/6) | 26 | 1877 | 7 | 2906 | 21.44 | 0.0000037 | 5.94E-06 |
| NR3C1 | 5 | 142589325 | 142813087 | (24/2) | 8 | 1912 | 47 | 2869 | 14.71 | 0.00013 | 7.28E-05 |
| ADAMTS2 | 5 | 178423474 | 178754935 | (30/2) | 62 | 1859 | 44 | 2885 | 16.15 | 0.000059 | No < 30 |
| 3.8_1.5 | 6 | 29790873 | 29892049 | (43/2) | 26 | 1890 | 10 | 2920 | 16.21 | 0.000057 | 9.74E-05 |
| KLF6 | 10 | 3761233 | 3867455 | (14/2) | 14 | 1907 | 1 | 2931 | 18.18 | 0.00002 | 2.13E-05 |
This file provides summary statistics for all genes that achieved a p value ≤ the p value set by the -pout command line option. The summary file is a tab-delimited file with 12 columns: Gene/Window name, Chromosome, Starting bp position, End bp position, Number of SNPs in the Gene/Window and the number that are low frequency/rare, Number of cases with low frequency/rare variant minor alleles, Number of cases without low frequency/rare variant minor alleles, Number of controls with low frequency/rare variant minor alleles, Number of controls without low frequency/rare variant minor alleles, Chi-Squared value, Chi-Squared p value, and Fisher exact p value.
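The four carrier counts in each row form a 2x2 table (cases and controls, with and without low frequency/rare variant minor alleles). The sketch below computes a Pearson chi-squared statistic without continuity correction and its one-degree-of-freedom p value using only the standard library. The chi-squared values shown for the first two example rows are consistent with this form, but whether CCRaVAT computes the statistic exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denominator = row1 * row2 * col1 * col2
    if denominator == 0:
        return 0.0  # assumption: a table with an empty margin is reported as 0
    return n * (a * d - b * c) ** 2 / denominator


def chi_squared_p_value_1df(statistic: float) -> float:
    """Upper-tail p value of a chi-squared statistic with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# MGC33212 row: cases with/without, then controls with/without.
stat = chi_squared_2x2(2, 1909, 31, 2903)
p_value = chi_squared_p_value_1df(stat)
print(round(stat, 2), p_value)
```

The p value relies on the identity that a chi-squared variable with one degree of freedom is a squared standard normal, so its upper tail equals `erfc(sqrt(x / 2))`.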
QuTie Summary Output File
| MIB2 | 1 | 0 | 107622 | (10/1) | 106 | 1125 | -0.367 | 0.037 | 6.77E-05 | 0.404 | 0.101 | [0.206 - 0.602] | -3.975 | 3.72E-05 |
| GOLGA8C | 15 | 18977714 | 19091040 | (29/4) | 127 | 1097 | -0.347 | 0.042 | 3.19E-05 | 0.389 | 0.093 | [0.206 - 0.571] | -4.149 | 1.78E-05 |
| TPO | 2 | 1346242 | 1575502 | (29/6) | 70 | 1158 | 0.478 | -0.028 | 4.00E-05 | -0.506 | 0.122 | [-0.746 - -0.266] | 4.107 | 1.80E-05 |
| EXOC3 | 5 | 446375 | 570407 | (24/1) | 4 | 1228 | 1.935 | -0.006 | 1.00E-04 | -1.941 | 0.498 | [-2.918 - -0.964] | 3.874 | 5.12E-05 |
| C10orf110 | 10 | 1008606 | 1130138 | (66/4) | 111 | 1116 | 0.379 | -0.036 | 4.00E-05 | -0.415 | 0.099 | [-0.609 - -0.221] | 4.161 | 1.48E-05 |
This file provides summary statistics for all genes that achieved a p value ≤ the p value set by the -pout command line option. The summary file is a tab-delimited file with 15 columns: Gene/Window name, Chromosome, Starting bp position, End bp position, Number of SNPs in the Gene/Window and the number that are low frequency/rare, Number of individuals with low frequency/rare variant minor alleles, Number of individuals without low frequency/rare variant minor alleles, Mean QT value of individuals with low frequency/rare variant minor alleles, Mean QT value of individuals without low frequency/rare variant minor alleles, Linear regression p value, Beta coefficient from the linear regression, Standard Error of the Beta coefficient, Lower and upper bounds of the 95% confidence interval of Beta, t-statistic, and t-statistic p value.
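In the example rows, Beta equals the mean QT value of individuals without minor alleles minus the mean of those with them, and the reported interval agrees with Beta ± 1.96 × SE up to rounding. The sketch below checks that relationship for the MIB2 row. It is an observation about the displayed values rather than a description of QuTie's internals, and the function names and the 1.96 multiplier are assumptions.

```python
from typing import Tuple


def beta_from_group_means(mean_with: float, mean_without: float) -> float:
    """Slope of a regression on a binary carrier indicator: the difference in group means.

    Assumption: the sign convention (without minus with) is inferred from the example rows.
    """
    return mean_without - mean_with


def confidence_interval_95(beta: float, std_err: float, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation 95% interval for Beta (assumed multiplier z = 1.96)."""
    return beta - z * std_err, beta + z * std_err


# MIB2 row: mean QT with = -0.367, mean QT without = 0.037, SE = 0.101.
beta = beta_from_group_means(-0.367, 0.037)
lower, upper = confidence_interval_95(beta, 0.101)
print(round(beta, 3), round(lower, 3), round(upper, 3))
```

Because the table values are already rounded to three decimals, recomputed bounds for other rows can differ from the reported ones in the last digit.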
CCRaVAT Permutation Summary Output File
| LOC254099 | 1 | 1012320 | 1219359 | case: (86/1213) cont: (75/581) | 0.00026 | Perm: 0/10 = 0 |
| TTLL10 | 1 | 1055000 | 1261164 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 0/10 = 0 |
| TNFRSF18 | 1 | 1078812 | 1282012 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 1/10 = 0.1 |
| TNFRSF4 | 1 | 1086630 | 1289435 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 1/10 = 0.1 |
| SDF4 | 1 | 1092212 | 1307334 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 0/10 = 0 |
| B3GALT6 | 1 | 1107568 | 1310341 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 1/10 = 0.1 |
| C1QDC2 | 1 | 1117751 | 1318766 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 0/10 = 0 |
| UBE2J2 | 1 | 1129217 | 1349157 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 0/10 = 0 |
| SCNN1D | 1 | 1157499 | 1367332 | case: (164/1133) cont: (104/549) | 0.047 | Perm: 0/10 = 0 |
This file provides summary statistics for all genes that achieved a p value ≤ the p value set by the -pperm command line option, which initiates permutation testing. The summary file is a tab-delimited file with 8 columns: Gene/Window name, Chromosome, Starting bp position, End bp position, Summary of the number of cases and controls that have low frequency/rare variant minor alleles, the original p value, Summary of permutations run, and Permutation p value. The output file for QuTie is the same except that column 5 contains the number of individuals with and without low frequency/rare variant minor alleles and corresponding QT values.
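The permutation field in the example rows has the form `Perm: 1/10 = 0.1`, that is, a count over the number of permutations run followed by their ratio. A minimal sketch for reading that field is given below. The function name is hypothetical and the pattern is based only on the rows displayed here.

```python
import re
from typing import Tuple

# Pattern inferred from the example rows, e.g. "Perm: 1/10 = 0.1".
_PERM_PATTERN = re.compile(r"Perm:\s*(\d+)\s*/\s*(\d+)\s*=\s*(\S+)")


def parse_permutation_field(field: str) -> Tuple[int, int, float]:
    """Return (count, permutations run, reported permutation p value)."""
    match = _PERM_PATTERN.search(field)
    if match is None:
        raise ValueError(f"unrecognised permutation field: {field!r}")
    count = int(match.group(1))
    total = int(match.group(2))
    reported = float(match.group(3))
    return count, total, reported


count, total, reported = parse_permutation_field("Perm: 1/10 = 0.1")
# The reported p value should equal the ratio of the two counts.
recomputed = count / total
```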
CCRaVAT Chromosome Output File
| MIB2 | 1 | 0 | 107622 | (5/0) | 0 | 1924 | 0 | 2938 | 0 | 1 | 1 | |
| OR4G11P | 1 | 2878 | 103747 | (6/1) | 1 | 1922 | 2 | 2935 | 0.05 | 0.82 | 1 | |
| MMP23B | 1 | 9202 | 111672 | (0/0) | 0 | 1924 | 0 | 2938 | 0 | 1 | 1 | |
| MMP23A | 1 | 9225 | 111672 | (12/2) | 112 | 1784 | 167 | 2755 | 0.08 | 0.78 | No < 30 | |
| CDC2L2 | 1 | 12742 | 197336 | (32/1) | 1 | 1921 | 7 | 2926 | 2.46 | 0.12 | 0.158 | |
| LOC440748 | 1 | 39316 | 143660 | (47/12) | 380 | 1531 | 594 | 2329 | 0.14 | 0.71 | No < 30 | |
| NBPF20 | 1 | 114476 | 233524 | (23/1) | 4 | 1919 | 2 | 2932 | 1.84 | 0.17 | 0.222 | |
| CCNL2 | 1 | 115136 | 226993 | (13/0) | 0 | 1924 | 0 | 2938 | 0 | 1 | 1 | |
| OR4F29 | 1 | 357522 | 544452 | (7/3) | 159 | 1756 | 239 | 2685 | 0.03 | 0.86 | No < 30 | |
| LOC440551 | 1 | 519055 | 657573 | (5/0) | 0 | 1924 | 0 | 2938 | 0 | 1 | 1 | |
| LOC440552 | 1 | 558787 | 660167 | (29/9) | 207 | 1679 | 263 | 2640 | 4.74 | 0.029 | No < 30 | Perm: 1/10 = 0.1 |
| FAM87B | 1 | 742614 | 845077 | (28/3) | 15 | 1906 | 21 | 2907 | 0.06 | 0.81 | 0.865 | |
This file provides summary statistics for all genes analyzed on each chromosome and is the most comprehensive output file. The summary file is a tab-delimited file with 13 columns: Gene/Window name, Chromosome, Starting bp position, End bp position, Number of SNPs in the Gene/Window and the number that are low frequency/rare, Number of cases with low frequency/rare variant minor alleles, Number of cases without low frequency/rare variant minor alleles, Number of controls with low frequency/rare variant minor alleles, Number of controls without low frequency/rare variant minor alleles, Chi-Squared Value, Chi-Squared p value, Fisher exact p value, and a description of any permutations run.
QuTie Chromosome Output File
| MIB2 | 924412 | 1025537 | (20/1) | 4 | 597 | 9.014 | -0.137 | 0.222 | -9.151 | 7.484 | [-23.819 - 5.517] | 1.223 | 0.111 |
| OR4G11P | 938946 | 1039986 | (25/1) | 4 | 597 | 9.014 | -0.137 | 0.222 | -9.151 | 7.484 | [-23.819 - 5.517] | 1.223 | 0.111 |
| MMP23B | 945587 | 1081419 | (43/6) | 75 | 281 | 3.308 | -0.354 | 0.06 | -3.662 | 1.94 | [-7.484 - 0.159] | 1.884 | 0.03 |
| MMP23A | 983671 | 1097098 | (46/7) | 89 | 280 | 2.929 | -0.493 | 0.059 | -3.423 | 1.807 | [-6.983 - 0.138] | 1.89 | 0.03 |
| CDC2L2 | 997120 | 1097869 | (43/7) | 89 | 280 | 2.929 | -0.493 | 0.059 | -3.423 | 1.807 | [-6.983 - 0.138] | 1.89 | 0.03 |
| LOC440748 | 1007128 | 1117407 | (47/8) | 115 | 260 | 2.73 | -0.135 | 0.085 | -2.865 | 1.659 | [-6.134 - 0.404] | 1.724 | 0.04 |
| NBPF20 | 1062320 | 1169359 | (32/10) | 159 | 224 | 0.713 | -0.169 | 0.558 | -0.882 | 1.505 | [-3.847 - 2.082] | 0.588 | 0.28 |
| CCNL2 | 1105000 | 1211164 | (35/14) | 150 | 523 | -1.291 | 0.224 | 0.281 | 1.515 | 1.403 | [-1.235 - 4.266] | -1.081 | 0.14 |
| OR4F29 | 1128812 | 1232012 | (30/17) | 142 | 437 | -1.402 | 0.231 | 0.258 | 1.633 | 1.442 | [-1.193 - 4.459] | -1.133 | 0.13 |
| LOC440551 | 1136630 | 1239435 | (30/17) | 143 | 443 | -1.407 | 0.204 | 0.262 | 1.611 | 1.435 | [-1.201 - 4.423] | -1.123 | 0.13 |
| LOC440552 | 1142212 | 1257334 | (28/15) | 139 | 444 | -1.48 | 0.156 | 0.261 | 1.636 | 1.454 | [-1.214 - 4.485] | -1.126 | 0.13 |
| FAM87B | 1157568 | 1260341 | (22/14) | 110 | 449 | -1.612 | 0.209 | 0.254 | 1.822 | 1.596 | [-1.306 - 4.949] | -1.142 | 0.13 |
This file provides summary statistics for all genes analyzed on each chromosome and is the most comprehensive output file. The summary file is a tab-delimited file with 14 columns: Gene/Window name, Starting bp position, End bp position, Number of SNPs in the Gene/Window and the number that are low frequency/rare, Number of individuals with low frequency/rare variant minor alleles, Number of individuals without low frequency/rare variant minor alleles, Mean QT value for individuals with low frequency/rare variant minor alleles, Mean QT value for individuals without low frequency/rare variant minor alleles, Linear regression p value, Beta coefficient from linear regression, Standard Error of the Beta coefficient, Lower and upper bounds of the 95% confidence interval of Beta, t-statistic, and t-statistic p value.
CCRaVAT/QuTie Significant Region Output File
| rs715643 | 1 | 1212830 | 0.042 |
| rs3934834 | 1 | 1045729 | 0.163 |
| rs3737728 | 1 | 1061338 | 0.301 |
| rs6687776 | 1 | 1070488 | 0.158 |
| rs9651273 | 1 | 1071463 | 0.295 |
| rs4970405 | 1 | 1088878 | 0.086 |
| rs12726255 | 1 | 1089873 | 0.125 |
| rs2298217 | 1 | 1104902 | 0.133 |
| rs4970357 | 1 | 1116987 | 0.093 |
| rs4970362 | 1 | 1134661 | 0.378 |
| rs9660710 | 1 | 1139265 | 0.068 |
| rs4970420 | 1 | 1146396 | 0.192 |
| rs1320565 | 1 | 1159781 | 0.095 |
| rs11260549 | 1 | 1161717 | 0.116 |
| rs9729550 | 1 | 1175165 | 0.262 |
| rs11721 | 1 | 1192554 | 0.101 |
| rs2887286 | 1 | 1196054 | 0.17 |
| rs3813199 | 1 | 1198200 | 0.106 |
| rs3766186 | 1 | 1202358 | 0.105 |
| rs7515488 | 1 | 1203727 | 0.158 |
| rs6675798 | 1 | 1216520 | 0.105 |
This file provides summary statistics for all SNPs that reside within a gene or region with a p value ≤ the p value set by the -pout command line option. The file is tab-delimited with 4 columns: Marker name, Chromosome, bp position of the marker, and the Minor Allele Frequency (MAF) calculated across all analyzed individuals.
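To make the MAF column concrete, the following sketch counts alleles across all individuals at one marker and returns the frequency of the rarer allele. It assumes biallelic markers and treats `0` as a missing-allele code (the PLINK default). Neither assumption is stated in the source, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def minor_allele_frequency(genotypes: Iterable[Tuple[str, str]], missing: str = "0") -> float:
    """Frequency of the rarer allele at one marker, ignoring missing alleles."""
    counts = Counter(
        allele for pair in genotypes for allele in pair if allele != missing
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no called alleles at this marker")
    if len(counts) == 1:
        return 0.0  # monomorphic marker: no minor allele observed
    return min(counts.values()) / total


# Four individuals, one heterozygote: 1 minor allele out of 8.
maf = minor_allele_frequency([("A", "A"), ("A", "G"), ("A", "A"), ("A", "A")])
```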
Figure 2. CCRaVAT and QuTie Manhattan Plot. An example Manhattan plot generated by CCRaVAT and QuTie displaying the -log10 p value of all genes/windows analyzed. Each point represents a gene or region, with loci achieving p values below a predefined threshold denoted in red.
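The quantities plotted in such a figure can be sketched as follows: each gene/window p value is transformed to -log10(p) and flagged when it falls at or below a threshold. The threshold value and the names used here are hypothetical, and the p values are taken from the example output tables. Plotting itself is omitted.

```python
import math
from typing import List, Tuple


def neg_log10(p_value: float) -> float:
    """Transform a p value to the -log10 scale used on the plot's vertical axis."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p value must lie in (0, 1]")
    return -math.log10(p_value)


# (gene/window name, chi-squared p value) pairs from the example output tables.
rows = [("MGC33212", 0.000083), ("PPIC", 0.0000037), ("CDC2L2", 0.12)]
threshold = 1e-4  # hypothetical highlight threshold

# Each point: (name, height on the plot, whether it would be highlighted).
points: List[Tuple[str, float, bool]] = [
    (name, neg_log10(p), p <= threshold) for name, p in rows
]
```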
Figure 3. QuTie Quantitative Trait Distribution Histogram. Histogram showing the distribution of the analysed quantitative trait across all individuals (individuals with and without low frequency/rare-variant minor alleles).
Figure 4. QuTie Quantitative Trait Distribution Comparison Histogram. Histogram displaying the distribution of quantitative trait values for individuals that either do (red) or do not (blue) carry at least one low frequency/rare variant minor allele within a region that has a p value ≤ the value set by the -pout option. A histogram is produced for every significant gene/window.
Gene File
| 7293 | TNFRSF4 | 1 | 1136569 | 1139375 |
| 51150 | SDF4 | 1 | 1142151 | 1157274 |
| 126792 | B3GALT6 | 1 | 1157508 | 1160281 |
| 388581 | C1QDC2 | 1 | 1167696 | 1171965 |
| 118424 | UBE2J2 | 1 | 1179155 | 1199097 |
| 6339 | SCNN1D | 1 | 1207439 | 1217272 |
| 116983 | CENTB5 | 1 | 1218807 | 1228503 |
| 126789 | PUSL1 | 1 | 1233857 | 1236920 |
Gene file that defines the genes to be analyzed and their coordinates to allow the collapsing of the correct markers defined in the map file. The first five columns of the file must be: Gene ID, Gene Name/Symbol, Chromosome, Start bp position, End bp position. Additional columns will be ignored.
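A minimal sketch of how gene coordinates might be used to select the markers to collapse is shown below: it reads the first five columns of a gene-file line and keeps the markers on the same chromosome whose positions fall within the gene boundaries. The names `Gene`, `parse_gene_line`, and `markers_in_gene`, the whitespace delimiter, and the optional `flank` parameter are all hypothetical and do not describe the tools' actual interface.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    gene_id: str
    name: str
    chromosome: str
    start: int
    end: int


def parse_gene_line(line: str) -> Gene:
    """Read the first five columns of a gene-file line; extra columns are ignored."""
    fields = line.split()  # assumption: whitespace-delimited fields
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields: {line!r}")
    gene_id, name, chrom, start, end = fields[:5]
    return Gene(gene_id, name, chrom, int(start), int(end))


def markers_in_gene(
    gene: Gene, markers: List[Tuple[str, str, int]], flank: int = 0
) -> List[str]:
    """Names of markers (name, chromosome, bp) inside the gene, widened by `flank` bp."""
    low, high = gene.start - flank, gene.end + flank
    return [
        name
        for name, chrom, bp in markers
        if chrom == gene.chromosome and low <= bp <= high
    ]


gene = parse_gene_line("6339 SCNN1D 1 1207439 1217272 extra_column")
markers = [("rs715643", "1", 1212830), ("rs6675798", "1", 1216520), ("rs11721", "1", 1192554)]
inside = markers_in_gene(gene, markers)
```

Passing a non-zero `flank` widens the interval on both sides, which is one way a fixed window around each gene could be expressed.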