Li Zhang, Mohammed S Orloff, Sean Reber, Shengchao Li, Ye Zhao, Charis Eng.
Abstract
Identification of disease variants via homozygosity mapping and investigation of the effects of genome-wide homozygosity regions on traits of biomedical importance have been widely applied recently. Nonetheless, the existing methods and algorithms for identifying long tracts of homozygosity (TOH) cannot provide efficiently and rigorously defined regions for downstream association investigation. We expanded current methods of identifying TOHs by defining the "surrogate-TOH", a region covering a cluster of TOHs with specific characteristics. Our defined surrogate-TOH includes cTOH, viz. a common TOH region where at least ten TOHs are present; gTOH, whereby a group of highly overlapping TOHs share proximal boundaries; and aTOH, which are allelically matched TOHs. Searching for gTOH and aTOH was based on a repeated binary spectral clustering algorithm, in which a hierarchy of clusters is created and represented by a TOH cluster tree. Based on the proposed method of identifying different species of surrogate-TOH, our cgaTOH software was developed. The software provides an intuitive and interactive visualization tool, with special interactive navigation rings, for better investigation of the high-throughput output; it is applicable to both conventional association studies and more sophisticated downstream analyses. The NCBI genome map viewer is incorporated into the system. Moreover, we discuss the choice of appropriate empirical ranges for critical parameters by applying them to disease models. This method identifies various patterned clusters of SNPs demonstrating extended homozygosity, so that one can observe different aspects of the multi-faceted characteristics of TOHs.
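The cTOH definition in the abstract (a common region where at least ten TOHs are present) can be illustrated with a coverage sweep over TOH intervals. The following is a minimal sketch, not the cgaTOH implementation: the function name `find_ctoh`, the `(start, end)` tuple representation in base pairs, and the choice to treat intervals as closed are all assumptions made for illustration.

```python
def find_ctoh(tohs, min_count=10):
    """Return maximal (start, end) regions covered by at least min_count TOHs.

    tohs: iterable of (start, end) positions on one chromosome (hypothetical
    representation; intervals are treated as closed).
    """
    events = []
    for start, end in tohs:
        events.append((start, 1))    # a TOH opens
        events.append((end, -1))     # a TOH closes
    # At equal positions, process openings before closings so that TOHs
    # touching at a single position still count as overlapping there.
    events.sort(key=lambda e: (e[0], -e[1]))

    regions, depth, region_start = [], 0, None
    for pos, delta in events:
        depth += delta
        if delta == 1 and depth == min_count:
            region_start = pos                    # coverage just reached the threshold
        elif delta == -1 and depth == min_count - 1:
            regions.append((region_start, pos))   # coverage just dropped below it
            region_start = None
    return regions


# Ten nested toy TOHs plus one isolated TOH (made-up coordinates).
toy_tohs = [(100 + i, 1000 - i) for i in range(10)] + [(2000, 3000)]
ctoh_regions = find_ctoh(toy_tohs, min_count=10)
```

With the toy intervals above, only the stretch shared by all ten nested TOHs qualifies; the isolated TOH does not, because its coverage never reaches the threshold.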
Year: 2013 PMID: 23469237 PMCID: PMC3585782 DOI: 10.1371/journal.pone.0057772
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Chromosome and cTOH navigation bars with TOH clusters.
(A) TOH cluster explorer. The top panel is the chromosome navigation bar, with a close-up of the cTOH navigation bar below. Clicking a particular cTOH in the cTOH navigation bar will reveal all TOHs within the region in the bottom panel. (B) Navigation ring of TOH cluster tree. The tree ring navigation panel displays the corresponding clustering information based on the binary tree. Ring segments can be clicked on the tree ring navigation panel to highlight the TOHs belonging to that cluster in the TOH view. All colours are based on a heat map that corresponds to p-values of the regions obtained through the association tests.
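The binary tree behind the navigation ring in panel B comes from the repeated binary spectral clustering named in the abstract. The sketch below shows one way such a tree could be built and is not the authors' algorithm: the Jaccard overlap similarity, the unnormalized graph Laplacian, the power-iteration approximation of the Fiedler vector, the `min_size` stopping rule, and all function names are assumptions chosen to keep the example dependency-free.

```python
import random


def jaccard(a, b):
    """Overlap similarity of two (start, end) intervals: shared / spanned length."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    union = max(a[1], b[1]) - min(a[0], b[0])
    return max(inter, 0) / union if union > 0 else 0.0


def fiedler_split(tohs, idx, iters=500):
    """Bipartition idx by the sign of an approximate Fiedler vector of L = D - W."""
    n = len(idx)
    w = [[jaccard(tohs[i], tohs[j]) if i != j else 0.0 for j in idx] for i in idx]
    deg = [sum(row) for row in w]
    shift = 2 * max(deg) + 1e-9          # Gershgorin bound on the largest eigenvalue of L
    rng = random.Random(0)               # fixed seed keeps the sketch deterministic
    v = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]        # project out the constant eigenvector
        # Power iteration on (shift * I - L), whose dominant eigenvector in the
        # mean-zero subspace corresponds to the second-smallest eigenvalue of L.
        u = [(shift - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sum(x * x for x in u) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in u]
    left = [idx[k] for k in range(n) if v[k] < 0]
    right = [idx[k] for k in range(n) if v[k] >= 0]
    return left, right


def build_tree(tohs, idx=None, min_size=2):
    """Recursively bipartition TOHs into a binary cluster tree (nested dicts)."""
    if idx is None:
        idx = list(range(len(tohs)))
    node = {"members": idx, "children": []}
    if len(idx) > min_size:              # hypothetical stopping rule
        left, right = fiedler_split(tohs, idx)
        if left and right:
            node["children"] = [build_tree(tohs, left, min_size),
                                build_tree(tohs, right, min_size)]
    return node


# Two groups of toy TOHs with near-identical boundaries inside each group.
toy_tohs = [(100, 200), (105, 205), (95, 198), (180, 400), (185, 410), (178, 395)]
tree = build_tree(toy_tohs)
```

Each node lists the TOH indices it contains, so a ring segment in a viewer could map directly to `node["members"]`. A production version would replace the stopping rule with criteria for boundary proximity (gTOH) or allelic matching (aTOH).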
Table 2. aTOHs significantly associated with lung cancer (PLCO data).
| Chr. region | Gene | Start SNP | End SNP | Length (bp) | No. of SNPs | No. of Ca. | No. of Con. | aTOH p-value | cTOH p-value | OR (95% CI) |
| 5q23.1 | | rs2662447 | rs1014643 | 463544 | 129 | 12 | 1 | 0.0014 | 0.659 | 12.8 (1.89, 547.3) |
| 6p22.1 | | rs3130778 | rs376681 | 417514 | 99 | 9 | 1 | 0.0097 | 0.449 | 9.57 (1.32, 419.5) |
| 8q23.3 | | rs7812989 | rs2884258 | 665393 | 102 | 11 | 0 | 0.0004 | 0.643 | Inf (2.67, Inf) |
| 10q23.1 | | rs10887346 | rs4433512 | 761177 | 142 | 8 | 0 | 0.0031 | 0.226 | Inf (1.81, Inf) |
| 16q23.1-q23.2 | | rs4547344 | rs4888070 | 360163 | 118 | 7 | 0 | 0.0064 | 0.629 | Inf (1.53, Inf) |
Gene: genes reported in the GWAS Catalog [17] as related to cancer, lung disease and inflammatory biomarkers.
Start SNP: start SNP of the aTOH region.
End SNP: end SNP of the aTOH region.
Length: length of the aTOH region in bp.
No. of SNPs: the number of SNPs of the aTOH region.
No. of Ca.: the number of cases present in the aTOH region.
No. of Con.: the number of controls present in the aTOH region.
aTOH p-value: p-value of association test for the aTOH region.
cTOH p-value: p-value of association test for the cTOH region where the aTOH resides.
OR (95% CI): Odds ratio with 95% confidence interval of the aTOH region.
doi:10.1371/journal.pone.0057772.t002
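The p-value and odds-ratio columns summarize a 2×2 comparison of carrier counts in cases and controls. This page does not name the association test, so the sketch below uses a two-sided Fisher's exact test and the simple sample odds ratio purely as assumptions. The function names are hypothetical, and the group totals in the usage lines are placeholders rather than study values, so the result is not expected to reproduce the tabulated figures.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: cases with / without the region; c, d: controls with / without.
    """
    cases, controls, carriers = a + b, c + d, a + c
    denom = comb(cases + controls, carriers)

    def prob(x):
        # Hypergeometric probability that x of the carriers are cases.
        return comb(cases, x) * comb(controls, carriers - x) / denom

    lo, hi = max(0, carriers - controls), min(cases, carriers)
    p_obs = prob(a)
    # Sum over all tables that are no more likely than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def sample_odds_ratio(a, b, c, d):
    """Unconditional sample odds ratio (a*d)/(b*c); infinite when b*c is zero."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


# Carrier counts from the 8q23.3 row (11 cases, 0 controls).
# The group sizes are HYPOTHETICAL placeholders, not values from the study.
n_cases, n_controls = 500, 500
carrier_cases, carrier_controls = 11, 0
p = fisher_exact_two_sided(carrier_cases, n_cases - carrier_cases,
                           carrier_controls, n_controls - carrier_controls)
odds = sample_odds_ratio(carrier_cases, n_cases - carrier_cases,
                         carrier_controls, n_controls - carrier_controls)
```

When no controls carry the region, the sample odds ratio is infinite, which matches the "Inf" entries in the OR column.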
Table 1. gTOHs significantly associated with lung cancer.
| Chr. region | Gene | Start SNP | End SNP | Length (bp) | No. of SNPs | No. of Ca. | No. of Con. | gTOH p-value | cTOH p-value | OR (95% CI) |
| 3q13.12-q13.13 | | rs7622560 | rs1348994 | 811233 | 126 | 7 | 0 | 0.0064 | 0.785 | Inf (1.53, Inf) |
| 6p22.1 | | rs198845 | rs12190473 | 916897 | 128 | 0 | 8 | 0.0078 | 0.449 | 0 (0, 0.61) |
| 8p22 | | rs2250991 | rs4831667 | 201087 | 129 | 9 | 1 | 0.0097 | 0.148 | 9.57 (1.32, 419.5) |
| 8q23.3 | | rs7812989 | rs2884258 | 665393 | 102 | 7 | 0 | 0.0064 | 0.643 | Inf (1.53, Inf) |
| 12q24.1-q24.13 | | rs17192160 | rs4767879 | 1610909 | 111 | 18 | 6 | 0.0059 | 0.221 | 3.85 (1.47, 10.05) |
| 14q23.2 | | rs1271565 | rs2295639 | 858019 | 114 | 7 | 0 | 0.0064 | 0.262 | Inf (1.53, Inf) |
| 15q21.3 | | rs2932195 | rs7359330 | 210678 | 106 | 7 | 0 | 0.0064 | 0.032 | Inf (1.53, Inf) |
Gene: genes reported in the GWAS Catalog [17] as related to cancer, lung disease and inflammatory biomarkers.
Start SNP: start SNP of the gTOH region.
End SNP: end SNP of the gTOH region.
Length: length of the gTOH region in bp.
No. of SNPs: the number of SNPs of the gTOH region.
No. of Ca.: the number of cases present in the gTOH region.
No. of Con.: the number of controls present in the gTOH region.
gTOH p-value: p-value of association test for the gTOH region.
cTOH p-value: p-value of association test for the cTOH region where the gTOH resides.
OR (95% CI): Odds ratio with 95% confidence interval of the gTOH region.
doi:10.1371/journal.pone.0057772.t001
Figure 2. gTOH (rs198845-rs12190473) and aTOH (rs3130778-rs376681) regions associated with lung cancer.
(A) –log10-transformed p-values obtained from the association tests. The green, red and blue lines are the p-values for the gTOH (rs198845-rs12190473) region, the aTOH (rs3130778-rs376681) region and their parent cTOH region, respectively. The purple and black dots are p-values <0.05 and ≥0.05, respectively, from single-SNP association tests within the parent cTOH region. (B) The corresponding lung-cancer risks as odds ratios (OR) with 95% confidence intervals (CI). The green solid and dashed lines correspond to the OR and 95% CI for the gTOH, while the red and blue lines are for the aTOH and its parent cTOH. The purple dots represent the OR for single-SNP risk, with grey solid vertical lines showing the 95% CIs.
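The relative heights of the three lines in panel A follow directly from the –log10 transform of the tabulated p-values for the 6p22.1 regions. The short sketch below applies that transform to the values listed in the two tables; the variable names are illustrative, and the 0.05 reference level is the single-SNP cut-off mentioned in the caption.

```python
from math import log10

# p-values for the 6p22.1 regions, taken from the gTOH and aTOH tables.
p_values = {
    "gTOH rs198845-rs12190473": 0.0078,
    "aTOH rs3130778-rs376681": 0.0097,
    "parent cTOH": 0.449,
}

# Smaller p-values map to larger -log10 values, i.e. higher lines in the plot.
neg_log10 = {name: -log10(p) for name, p in p_values.items()}
threshold = -log10(0.05)

# Regions whose line would sit above the 0.05 reference level.
above_threshold = [name for name, value in neg_log10.items() if value > threshold]
```

Both surrogate-TOH regions clear the reference level while the parent cTOH does not, which is the contrast the figure is drawn to show.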
Figure 3. gTOH (rs7812989-rs2884258) and aTOH (rs7812989-rs2884258) regions associated with lung cancer.
(A) –log10-transformed p-values obtained from the association tests. The green, red and blue lines are the p-values for the gTOH (rs7812989-rs2884258) region, the aTOH (rs7812989-rs2884258) region and their parent cTOH region, respectively. The purple and black dots are p-values <0.05 and ≥0.05, respectively, from single-SNP association tests within the parent cTOH region. (B) The corresponding lung-cancer risks as odds ratios (OR) with 95% confidence intervals (CI). The green solid and dashed lines correspond to the OR and 95% CI for the gTOH, while the red and blue lines are for the aTOH and its parent cTOH. The purple dots represent the OR for single-SNP risk, with grey solid vertical lines showing the 95% CIs.