Atsuko Imai-Okazaki, Yi Li, Sukanya Horpaopan, Yasser Riazalhosseini, Masoud Garshasbi, Yael P Mosse, Di Zhang, Isabelle Schrauwen, Aarushi Sharma, Cathy S J Fann, Suzanne M Leal, Mark Lathrop, Jurg Ott.
Abstract
Homozygosity mapping is a well-known technique for identifying runs of homozygous variants that are likely to harbor genes responsible for autosomal recessive disease, but a comparable method for autosomal dominant traits has been lacking. We developed an approach to map dominant disease genes based on heterozygosity frequencies of sequence variants in the immediate vicinity of a dominant trait. We demonstrate through theoretical analysis that DNA variants surrounding an inherited dominant disease variant tend to have increased heterozygosity compared with variants elsewhere in the genome. We confirm the existence of this phenomenon in sequence data with known dominant pathogenic variants obtained on family members and in unrelated population controls. A computer-based approach to estimating empirical significance levels associated with our test statistics shows genome-wide p-values smaller than 0.05 for many but not all of the individuals carrying a pathogenic variant.
Keywords: ALSPAC; computer simulation; gene mapping; genetic association analysis; sequence variants
Year: 2019 PMID: 31018026 PMCID: PMC6617796 DOI: 10.1002/humu.23765
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
Rankings of H (Stage 1) and H max (Stage 2) values for each individual analyzed and for family members analyzed jointly
| ID | Stage 1 (H): Rank | Stage 1 (H): top% | N var | Stage 2 (H max): Rank | d | Rank2 | Rank3 | Rank4 |
|---|---|---|---|---|---|---|---|---|
| S1; A, C | 199,591 | 3.1 | 6,347,882 | 268 | 553 | 252 | 228 | 970 |
| S5; A, C | 78,924 | 1.2 | 6,347,882 | 480 | 533 | 464 | 420 | 138 |
| S9; U, C | 287,746 | 4.5 | 6,347,882 | 468 | 20 | 452 | 432 | 141 |
| S2; U, N | 1,889,519 | 29.8 | 6,347,882 | 366 | 119 | 354 | 332 | 1046 |
| S6; U, N | 1,148,993 | 18.1 | 6,347,882 | 994 | 1,429 | 978 | 923 | 279 |
| S7; A, N | 709,184 | 11.2 | 6,347,882 | 587 | 8 | 577 | 557 | 1176 |
| S1, S5, S7; A | 38,654 | 0.6 | 6,347,882 | 298 | 0.7 | 282 | 254 | 66 |
| S1, S5, S9; C | 78,225 | 1.2 | 6,347,882 | 516 | 3 | 494 | 458 | 1706 |
| S2, S6, S7; N | 207,214 | 3.3 | 6,347,882 | 394 | 115 | 374 | 342 | 102 |
| M1; A, C | 87,720 | 20.7 | 424,175 | 66 | 7,432 | 66 | 60 | 26 |
| M2; A, C | 126,896 | 29.5 | 430,739 | 58 | 7,497 | 56 | 48 | 18 |
| M7; U, C | 87,868 | 18.3 | 481,206 | 32 | 7,618 | 30 | 24 | 10 |
| M9; A, C | 77,205 | 20.1 | 383,563 | 26 | 2,238 | 26 | 20 | 10 |
| M1,2,7,9; C (1) | 39,632 | 3.7 | 1,082,429 | 124 | 67 | 111 | 81 | 33 |
| M1,2,7,9; C (4) | 6,010 | 5.9 | 102,321 | 25 | 2,285 | 20 | 16 | 11 |
| L21; A, C | 1,305,196 | 20.4 | 6,395,904 | 138 | 555 | 105 | 78 | 218 |
| L22; A, C | 172,382 | 2.9 | 5,996,970 | 84 | 556 | 57 | 39 | 18 |
| L21,22; C (1) | 318,778 | 3.0 | 10,549,605 | 1,898 | 0.2 | 1,194 | 768 | 2,896 |
| L21,22; C (2) | 133,870 | 7.3 | 1,843,269 | 93 | 553 | 73 | 205 | 172 |
| Ctrl; BRCA2 | 131,040 | 22.0 | 594,234 | 66 | 66,596 | 66 | 56 | 26 |
| Ctrl; WFS1 | 242,130 | 40.6 | 594,234 | | | | | |
| Ctrl; PHOX2 | 423,768 | 71.0 | 594,234 | 40 | 5,904 | 40 | 31 | 16 |
| M9; NCBI1 | 223,045 | 58.2 | 383,563 | 26 | 15,889 | 26 | 20 | 10 |
| M9; NCBI3 | 179,653 | 46.9 | 383,563 | 26 | 16,062 | 26 | 20 | 10 |
Family S contains three affected females (S1, S5, and S7), two unaffected noncarriers (S2 and S6), and one unaffected carrier (S9) of the BRCA2 pathogenic variant in this family. In family M, individuals M1, M2, and M9 are affected, and M7 is an unaffected obligate carrier. Symbols: A = affected, U = unaffected, C = carrier, N = noncarrier, N var = number of variants (H values), d = absolute difference in kb between estimated and true position of the pathogenic variant, rank2 = rank given that the RIH (region of increased heterozygosity) segment length is at least 25% of the average of all RIH lengths; rank3 = rank given that RIH length is at least 50% of the average RIH length; rank4 = rank given that RIH length is at least equal to the average RIH length; n at Stage 2 means that the H value for the known pathogenic variant does not occur in any RIH. False positives: First item = individual (Ctrl = unaffected in psoriasis family 12), second item = assumed disease variant (NCBI: intronic variants).
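The columns described in this legend can be illustrated with a short sketch. The top% values in the table are numerically consistent with 100 × rank / N var, so the first function below assumes that relationship. The second function is one possible reading of rank2 to rank4: it drops RIH segments shorter than a given fraction of the mean RIH length and then ranks the remaining segments by H max, largest first. The function names, the ranking direction, and the toy segment list are assumptions made for illustration and are not taken from the article.

```python
from statistics import mean


def top_percent(rank, n_var):
    """Stage 1 rank expressed as a percentage of all ranked variants."""
    return 100.0 * rank / n_var


def filtered_rank(segments, target, min_fraction):
    """Rank segments[target] by H max among sufficiently long RIH segments.

    segments: list of (h_max, length) tuples, one per RIH.
    min_fraction: minimum length as a fraction of the mean length of all RIHs
        (0.25, 0.5 and 1.0 would correspond to rank2, rank3 and rank4).
    Returns None when the target segment itself is too short.
    Assumption: rank 1 is the segment with the largest H max.
    """
    threshold = min_fraction * mean(length for _, length in segments)
    target_h, target_len = segments[target]
    if target_len < threshold:
        return None
    kept = [h for h, length in segments if length >= threshold]
    return 1 + sum(h > target_h for h in kept)


# Rank and N var values taken from the table rows for S1 and M1.
print(round(top_percent(199_591, 6_347_882), 1))  # 3.1
print(round(top_percent(87_720, 424_175), 1))     # 20.7

# Hypothetical toy data: (H max, segment length in kb).
toy_rihs = [(0.90, 10), (0.80, 200), (0.70, 150), (0.95, 5), (0.60, 300)]
print(filtered_rank(toy_rihs, target=2, min_fraction=0.0))   # 4
print(filtered_rank(toy_rihs, target=2, min_fraction=0.25))  # 2
```

In the toy data, the two shortest segments have the highest H max, so filtering them out improves the rank of the target segment from 4 to 2.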
Standard parameter settings for haplotypes at a disease variant and a nearby marker variant
| Disease variant | Marker variant: A | Marker variant: not A | Sum |
|---|---|---|---|
| u | ef + D | e(1 − f) − D | e |
| + | (1 − e)f − D | (1 − e)(1 − f) + D | 1 − e |
| Sum | f | 1 − f | 1 |
Symbols: e = disease allele (mutation) frequency, P(u); f = marker variant allele frequency, P(A); D = disequilibrium parameter
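As a rough illustration of how e, f, and D determine the four haplotype frequencies, the sketch below assumes the common convention in which D is added to the frequency of the haplotype carrying both u and A. It then computes the probability that a u/+ individual is heterozygous at the marker, assuming the two haplotypes are drawn independently. The sign convention for D, the label `not_A` for the second marker allele, the function names, and the numeric values are all assumptions for illustration, not the article's implementation.

```python
def haplotype_table(e, f, D):
    """Frequencies of the four disease-marker haplotypes.

    e = P(u), f = P(A), D = disequilibrium parameter.
    Assumed convention: P(u, A) = e*f + D.
    """
    table = {
        ("u", "A"): e * f + D,
        ("u", "not_A"): e * (1 - f) - D,
        ("+", "A"): (1 - e) * f - D,
        ("+", "not_A"): (1 - e) * (1 - f) + D,
    }
    if min(table.values()) < 0:
        raise ValueError("D is outside the range allowed by e and f")
    return table


def carrier_marker_heterozygosity(e, f, D):
    """P(heterozygous at the marker | individual has genotype u/+).

    Assumes one haplotype is drawn from the u haplotypes and the other
    independently from the + haplotypes.
    """
    t = haplotype_table(e, f, D)
    p_a_on_u = t[("u", "A")] / e
    p_a_on_plus = t[("+", "A")] / (1 - e)
    return p_a_on_u * (1 - p_a_on_plus) + (1 - p_a_on_u) * p_a_on_plus


# Illustrative values only: a rare disease allele and a marker allele A
# that is enriched on the disease haplotype when D > 0.
with_ld = carrier_marker_heterozygosity(e=0.01, f=0.2, D=0.006)
without_ld = carrier_marker_heterozygosity(e=0.01, f=0.2, D=0.0)
print(with_ld > without_ld)  # True
```

For these particular values, marker heterozygosity in carriers is higher with disequilibrium than without it. The size and direction of the effect depend on e, f, and D.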
Figure 1. Stage 1 heterozygosity rate, H (y‐axis), plotted against marker positions (x‐axis) surrounding a pathogenic BRCA2 mutation in three females affected with breast cancer in family S. The x‐axis scale is the distance in kb from the pathogenic BRCA2 mutation.
Figure 2. Stage 1 heterozygosity rate, H (y‐axis), plotted against marker positions (x‐axis) surrounding a pathogenic BRCA2 mutation in a control individual, that is, the first individual in our unpublished collection of family members affected with psoriasis. The x‐axis scale is the distance in kb from the pathogenic BRCA2 mutation.
Table 3. Top % results for H values of disease variants in nine carriers (C) and of pseudo‐disease variants in eight noncarriers (N)
| ID | top % | Status | Sensitivity | Specificity |
|---|---|---|---|---|
| S5 | 1.2 | C | 0.111 | 1 |
| L22 | 2.9 | C | 0.222 | 1 |
| S1 | 3.1 | C | 0.333 | 1 |
| S9 | 4.5 | C | 0.444 | 1 |
| S7 | 11.2 | N | 0.444 | 0.875 |
| S6 | 18.1 | N | 0.444 | 0.750 |
| M7 | 18.3 | C | 0.556 | 0.750 |
| M9 | 20.1 | C | 0.667 | 0.750 |
| L21 | 20.4 | C | 0.778 | 0.750 |
| M1 | 20.7 | C | 0.889 | 0.750 |
| Ctrl; BRCA2 disease variant | 22.0 | N | 0.889 | 0.625 |
| M2 | 29.5 | C | 1 | 0.625 |
| S2 | 29.8 | N | 1 | 0.500 |
| Ctrl; WFS1 disease variant | 40.6 | N | 1 | 0.375 |
| M9; NCBI3 | 46.9 | N | 1 | 0.250 |
| M9; NCBI1 | 58.2 | N | 1 | 0.125 |
| Ctrl; PHOX2 disease variant | 71.0 | N | 1 | 0 |
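The sensitivity and specificity columns are consistent with a running tally over the rows sorted by top %: sensitivity is the fraction of the nine carriers at or above the current row, and specificity is the fraction of the eight noncarriers strictly below it. The sketch below reproduces the columns under that reading. The function name and data layout are illustrative assumptions.

```python
# (ID, top %, status) copied from the table; C = carrier, N = noncarrier.
TABLE3 = [
    ("S5", 1.2, "C"), ("L22", 2.9, "C"), ("S1", 3.1, "C"), ("S9", 4.5, "C"),
    ("S7", 11.2, "N"), ("S6", 18.1, "N"), ("M7", 18.3, "C"), ("M9", 20.1, "C"),
    ("L21", 20.4, "C"), ("M1", 20.7, "C"), ("Ctrl; BRCA2", 22.0, "N"),
    ("M2", 29.5, "C"), ("S2", 29.8, "N"), ("Ctrl; WFS1", 40.6, "N"),
    ("M9; NCBI3", 46.9, "N"), ("M9; NCBI1", 58.2, "N"), ("Ctrl; PHOX2", 71.0, "N"),
]


def cumulative_rates(rows):
    """Sensitivity and specificity when the cutoff is placed at each row."""
    n_carriers = sum(status == "C" for _, _, status in rows)
    n_noncarriers = len(rows) - n_carriers
    true_pos = false_pos = 0
    out = []
    for ident, _, status in sorted(rows, key=lambda r: r[1]):
        if status == "C":
            true_pos += 1
        else:
            false_pos += 1
        sensitivity = round(true_pos / n_carriers, 3)
        specificity = round(1 - false_pos / n_noncarriers, 3)
        out.append((ident, sensitivity, specificity))
    return out


for ident, sens, spec in cumulative_rates(TABLE3):
    print(f"{ident:12s} {sens:5.3f} {spec:5.3f}")
```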
Figure 3. Receiver operating characteristic curve based on results of Table 3. The graph shows y = sensitivity plotted against x = 1 − specificity. The area under the curve is AUC = 0.85.
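The reported AUC can be checked directly from the top % values. The sketch below uses the standard pair-counting form of the AUC: the fraction of carrier and noncarrier pairs in which the carrier has the smaller top %. This is an equivalent way to compute the area, not necessarily how the authors computed it, and the variable and function names are illustrative.

```python
# top % values from Table 3, split by carrier status.
CARRIERS = [1.2, 2.9, 3.1, 4.5, 18.3, 20.1, 20.4, 20.7, 29.5]
NONCARRIERS = [11.2, 18.1, 22.0, 29.8, 40.6, 46.9, 58.2, 71.0]


def auc_by_pairs(carriers, noncarriers):
    """Fraction of (carrier, noncarrier) pairs ranked in the expected order.

    A smaller top % is treated as stronger evidence; ties count one half.
    """
    score = 0.0
    for c in carriers:
        for n in noncarriers:
            if c < n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(carriers) * len(noncarriers))


auc = auc_by_pairs(CARRIERS, NONCARRIERS)
print(round(auc, 2))  # 0.85
```

Of the 72 possible pairs, 61 are ordered as expected, which gives 61/72 ≈ 0.847.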