Marianne A Jonker, Priya Vart, Mar Rodriguez Girondo.
Abstract
Information on the age at onset distribution of the asymptomatic stage of a disease can be of paramount importance in early detection and timely management of that disease. However, accurately estimating this distribution is challenging, because the asymptomatic stage is difficult for the patient to recognize and is often detected only as an incidental finding or through recommended screening; as a result, the age at onset is often interval-censored. In this paper, we propose a method for estimating the age at onset distribution of the asymptomatic stage of a genetic disease from ascertained pedigree data. The method takes into account the way the data are ascertained in order to overcome selection bias. Simulation studies suggest that the estimates are asymptotically unbiased. Our work is motivated by the analysis of data on facioscapulohumeral muscular dystrophy, a genetic muscle disorder. In our application, carriers of the genetic causal variant are identified through genetic screening of the relatives of symptomatic carriers, and their disease status is determined by a medical examination. The estimates reveal an early age at onset of the asymptomatic stage of facioscapulohumeral muscular dystrophy.
Keywords: Age at onset; ascertainment; current status data; family data; outcome-dependent sampling
Year: 2019 PMID: 31880204 PMCID: PMC7391479 DOI: 10.1177/0962280219893400
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021
Figure 1. Presentation of the observations in the three configurations. There are no observations after C, and the exact moment at which U took place is not indicated because it is unknown.
Definition of variables and distributions.
| Variable | Meaning | Distribution |
|---|---|---|
| T | Age at onset of symptomatic disease for a carrier | |
| U | Age at onset of asymptomatic disease for a carrier | |
| C | Age at time of examination/screening for a carrier | |
| Δ | Indicator function | |
| Σ | Indicator function | |
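To make the observation scheme concrete, the following is a minimal Python sketch of what might be recorded for one carrier examined at age C. It rests on assumptions that are not stated in this record: that the asymptomatic onset U never comes after the symptomatic onset T, that the two recorded indicators are of the form 1{U ≤ C} and 1{T ≤ C}, and that T is observed exactly only when it precedes C. The names `CarrierObservation` and `observe_carrier` are hypothetical, and the exact definitions of the indicator functions in the table may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CarrierObservation:
    """What is assumed to be recorded for one carrier at the examination."""
    exam_age: float                         # C
    asymptomatic_onset_reached: bool        # assumed indicator 1{U <= C}
    symptomatic_onset_reached: bool         # assumed indicator 1{T <= C}
    symptomatic_onset_age: Optional[float]  # T, only if it occurred by age C


def observe_carrier(u: float, t: float, c: float) -> CarrierObservation:
    """Reduce the latent ages (U, T) to the data available at exam age C.

    U itself is never returned: only whether it occurred by C is known,
    which is what makes the asymptomatic onset age interval-censored.
    """
    if u > t:
        # Assumption: the asymptomatic stage precedes the symptomatic stage.
        raise ValueError("asymptomatic onset U must not come after symptomatic onset T")
    symptomatic = t <= c
    return CarrierObservation(
        exam_age=c,
        asymptomatic_onset_reached=u <= c,
        symptomatic_onset_reached=symptomatic,
        symptomatic_onset_age=t if symptomatic else None,
    )
```

Under these assumptions the three orderings C < U ≤ T, U ≤ C < T, and U ≤ T ≤ C yield three distinct observation patterns, which is one way to read the three configurations of Figure 1.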
Figure 2. Results of the simulation study with m = 500 families. Top: scenario 1. Middle: scenario 2. Bottom: scenario 3. Left panels: small families. Right panels: large families. Solid black and gray lines are the two true distribution functions. Dashed lines form a band given by the estimated pointwise 2.5% and 97.5% percentiles over M = 1000 Monte Carlo trials.
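A pointwise percentile band of the kind described in this caption can be sketched as follows. This is a minimal illustration under assumptions: each Monte Carlo trial is taken to produce one estimated distribution function evaluated on a common grid of ages, and the percentile uses linear interpolation between order statistics, which may not match the rule used for the figure. The function names are hypothetical, and the demo uses an empirical distribution function of gamma samples (arbitrary shape 2 and scale 10) as a stand-in for the paper's estimator.

```python
import random
from typing import List, Sequence, Tuple


def percentile(sorted_values: Sequence[float], p: float) -> float:
    """Percentile p (0 to 100) of an already sorted sequence, linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def pointwise_band(
    curves: Sequence[Sequence[float]], lower: float = 2.5, upper: float = 97.5
) -> Tuple[List[float], List[float]]:
    """Lower and upper percentile of the curves, separately at each grid point."""
    lows, highs = [], []
    for values_at_point in zip(*curves):  # one tuple per grid point
        ordered = sorted(values_at_point)
        lows.append(percentile(ordered, lower))
        highs.append(percentile(ordered, upper))
    return lows, highs


def empirical_cdf(sample: Sequence[float], grid: Sequence[float]) -> List[float]:
    """Fraction of the sample at or below each grid value."""
    n = len(sample)
    return [sum(x <= g for x in sample) / n for g in grid]


def demo_band(trials: int = 200, n: int = 50, seed: int = 0):
    """Stand-in simulation: empirical CDFs of gamma samples, not the paper's estimator."""
    rng = random.Random(seed)
    grid = [5.0, 10.0, 20.0, 40.0]
    curves = [
        empirical_cdf([rng.gammavariate(2.0, 10.0) for _ in range(n)], grid)
        for _ in range(trials)
    ]
    return grid, pointwise_band(curves)
```

Because the band is computed separately at each age, it describes pointwise variability of the estimates and is not a simultaneous band for the whole curve.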
Relative bias (reBias), standard deviation (SD), and coverage probabilities of the 95% confidence intervals (Cov) for the location and scale parameters of the gamma age at onset distributions of U (k1, θ1) and T over 1000 trials, for several sample sizes (m families) and effective sample sizes, defined as the mean number of ascertained families over the 1000 trials.
| Parameter | m | Ascertained families (small) | reBias (small) | SD (small) | Cov (small) | Ascertained families (large) | reBias (large) | SD (large) | Cov (large) |
|---|---|---|---|---|---|---|---|---|---|
| Scenario 1 | | | | | | | | | |
| | 100 | 26 | 0.736 | – | – | 58 | 0.054 | 10.594 | 0.921 |
| | 500 | 132 | 0.076 | 16.088 | 0.802 | 289 | 0.014 | 4.299 | 0.917 |
| | 1000 | 264 | 0.033 | 9.791 | 0.792 | 579 | 0.002 | 3.076 | 0.896 |
| | 100 | 26 | 0.656 | – | – | 58 | −0.013 | 0.244 | 0.900 |
| | 500 | 132 | 0.005 | 0.337 | 0.937 | 289 | −0.006 | 0.105 | 0.926 |
| | 1000 | 264 | 0.002 | 0.225 | 0.956 | 579 | 0.002 | 0.076 | 0.919 |
| | 100 | 26 | 0.093 | – | – | 58 | 0.025 | 8.623 | 0.962 |
| | 500 | 132 | 0.020 | 8.000 | 0.969 | 289 | 0.008 | 3.771 | 0.958 |
| | 1000 | 264 | 0.009 | 5.656 | 0.951 | 579 | 0.003 | 2.485 | 0.960 |
| | 100 | 26 | 0.076 | – | – | 58 | 0.001 | 0.201 | 0.940 |
| | 500 | 132 | 0.003 | 0.189 | 0.932 | 289 | −0.002 | 0.088 | 0.948 |
| | 1000 | 264 | 0.003 | 0.136 | 0.942 | 579 | −0.001 | 0.059 | 0.960 |
| Scenario 2 | | | | | | | | | |
| | 100 | 13 | 1.384 | – | – | 34 | 0.070 | 10.464 | 0.969 |
| | 500 | 65 | 0.186 | 20.123 | 0.941 | 167 | 0.013 | 4.288 | 0.925 |
| | 1000 | 130 | 0.067 | 10.792 | 0.922 | 333 | 0.008 | 2.836 | 0.946 |
| | 100 | 13 | −0.050 | – | – | 34 | −0.013 | 0.268 | 0.939 |
| | 500 | 65 | −0.036 | 0.402 | 0.904 | 167 | −0.002 | 0.125 | 0.939 |
| | 1000 | 130 | <0.001 | 0.279 | 0.945 | 333 | −0.003 | 0.084 | 0.961 |
| | 100 | 13 | 0.389 | – | – | 34 | 0.041 | 15.561 | 0.975 |
| | 500 | 65 | 0.046 | 15.168 | 0.971 | 167 | 0.007 | 6.391 | 0.958 |
| | 1000 | 130 | 0.014 | 10.036 | 0.961 | 333 | 0.005 | 4.483 | 0.957 |
| | 100 | 13 | 0.324 | – | – | 34 | 0.031 | 0.341 | 0.937 |
| | 500 | 65 | 0.036 | 0.400 | 0.928 | 167 | 0.006 | 0.144 | 0.944 |
| | 1000 | 130 | 0.023 | 0.246 | 0.946 | 333 | 0.001 | 0.101 | 0.953 |
| Scenario 3 | | | | | | | | | |
| | 100 | 66 | 0.309 | – | – | 92 | 0.027 | 0.332 | 0.930 |
| | 500 | 329 | 0.046 | 0.320 | 0.891 | 462 | 0.009 | 0.139 | 0.935 |
| | 1000 | 658 | 0.015 | 0.219 | 0.897 | 924 | 0.003 | 0.097 | 0.937 |
| | 100 | 66 | 0.552 | – | – | 92 | 0.061 | 6.241 | 0.946 |
| | 500 | 329 | 0.038 | 5.822 | 0.958 | 462 | 0.009 | 2.378 | 0.954 |
| | 1000 | 658 | 0.024 | 3.888 | 0.966 | 924 | 0.005 | 1.648 | 0.951 |
| | 100 | 66 | 0.021 | – | – | 92 | 0.004 | 0.224 | 0.953 |
| | 500 | 329 | 0.006 | 0.186 | 0.942 | 462 | 0.001 | 0.101 | 0.944 |
| | 1000 | 658 | 0.002 | 0.017 | 0.943 | 924 | <0.001 | 0.068 | 0.954 |
| | 100 | 66 | 0.015 | – | – | 92 | 0.005 | 4.051 | 0.954 |
| | 500 | 329 | <0.001 | 1.786 | 0.942 | 462 | 0.002 | 0.904 | 0.949 |
| | 1000 | 658 | 0.002 | 1.310 | 0.937 | 924 | <0.001 | 0.624 | 0.954 |
For small families and m = 100, the SDs are not reliable and are left out of the table (shown as –). reBias: relative bias.
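The three summary measures reported in the table could be computed from Monte Carlo output roughly as in the sketch below. The definitions are assumptions, since this record does not spell them out: reBias is taken as (mean estimate minus true value) divided by the true value, SD as the sample standard deviation of the estimates across trials, and Cov as the fraction of trials whose Wald-type interval (estimate ± 1.96 × standard error) contains the true value. The function name `summarize_trials` and its arguments are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def summarize_trials(
    estimates: Sequence[float],
    standard_errors: Sequence[float],
    true_value: float,
    z: float = 1.96,
) -> Dict[str, float]:
    """Relative bias, SD, and interval coverage for one parameter across trials."""
    if len(estimates) != len(standard_errors):
        raise ValueError("one standard error is needed per estimate")
    if len(estimates) < 2:
        raise ValueError("at least two trials are needed for a standard deviation")
    if true_value == 0:
        raise ValueError("relative bias is undefined for a true value of zero")

    mean_estimate = statistics.fmean(estimates)
    # Assumed definition: bias expressed as a fraction of the true value.
    re_bias = (mean_estimate - true_value) / true_value
    # Spread of the estimates over the Monte Carlo trials.
    sd = statistics.stdev(estimates)
    # Assumed Wald-type interval; the intervals in the paper may be built differently.
    covered = sum(
        1
        for est, se in zip(estimates, standard_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return {"reBias": re_bias, "SD": sd, "Cov": covered / len(estimates)}
```

For example, `summarize_trials([9.0, 10.0, 11.0, 14.0], [1.0, 1.0, 1.0, 1.0], true_value=10.0)` gives a relative bias of 0.1 and a coverage of 0.75, because only the last interval misses the true value.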
Figure 4. Results of the simulation study based on populations of families. Left: scenario 1. Right: scenario 2. Top: exclusion probability of 0.20. Bottom: exclusion probability of 0.50. Solid black and gray lines are the two true distribution functions. Dashed lines form a band given by the estimated pointwise 2.5% and 97.5% percentiles over M = 1000 Monte Carlo trials.
Overview of data: For every pedigree, the number of units, carriers, and symptomatic and asymptomatic carriers are given.
| Pedigree no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of D4Z4 units among carriers | 4 | 5 | 5 | 6 | 6 | 6 | 7 | 7 | 9 | 9 |
| No. of carriers in the pedigree | 5 | 9 | 8 | 5 | 7 | 13 | 5 | 3 | 8 | 6 |
| No. of symptomatic carriers | 5 | 9 | 7 | 5 | 2 | 3 | 3 | 2 | 2 | 2 |
| No. of asymptomatic carriers | 0 | 0 | 1 | 0 | 5 | 6 | 1 | 1 | 2 | 1 |
A carrier is defined as an individual with a loss of repetitions of the D4Z4 unit.
Figure 3. Solid lines: estimates of the two age at onset distribution functions (one in the top panels, the other in the bottom panels) for 4, 5, or 6 repetitions of the D4Z4 unit (left) and for 7, 8, or 9 repetitions (right). Dashed lines: corresponding pointwise 95% confidence intervals.