
Estimating heritability of height without zygosity information for twins under five years in low- and middle-income countries: An application of normal finite mixture distribution models.

Omar Karlsson, Benjamin W Domingue, Rockli Kim, S V Subramanian.

Abstract

Twin studies are widely used to estimate heritability of traits and typically rely on knowing the zygosity of twin pairs in order to determine variation attributable to genetics. Most twin studies are conducted in high resource settings. Large scale household survey data, such as the Demographic and Health Surveys, collect various biomarkers for children under five years old in low- and middle-income countries. These data include twins but no information on zygosity. We applied mixture models to obtain heritability estimates without knowing zygosity of twins, using 249 Demographic and Health Surveys from 79 low- and middle-income countries (14,524 twin pairs). We focused on height of children, adjusted for age and sex, but also provided estimates for other biomarkers available in the data. We estimated that the heritability of height in our sample was 46%.
© 2022 The Authors.


Keywords:  Height; Heritability; Low- and middle-income countries; Twin studies; Unknown zygosity

Year:  2022        PMID: 35242993      PMCID: PMC8861393          DOI: 10.1016/j.ssmph.2022.101043

Source DB:  PubMed          Journal:  SSM Popul Health        ISSN: 2352-8273


Introduction

Many traits, such as height, are determined by both environmental exposures and genetics (Jelenkovic, Sund, et al., 2016). Twin studies have long been used to assess the extent to which such traits are determined by genetics (Polderman et al., 2015). These studies exploit known differences in genetic relatedness between monozygotic twins, who share essentially 100% of their genes, and dizygotic twins, who share 50% of their genes on average. (They also rely on the assumption that environmental influences on monozygotic and dizygotic twins are equivalent.) Twin studies therefore require information on the zygosity of twins. Twin data from low- and middle-income countries are scarce, so most heritability estimates refer to high-income settings. Large-scale household surveys, such as the Demographic and Health Surveys (DHS), collect various biometric data, such as height, for children under five years old in low- and middle-income countries. These surveys include twins but do not contain information on zygosity. Previous studies have suggested methods to obtain heritability estimates from twin data without zygosity information (Benyamin et al., 2005, 2006; Neale, 2003), and such methods have been used elsewhere (e.g., Conley et al., 2006; Figlio et al., 2017). This paper applied these methods and estimated the heritability of biometric traits of children under five years old using data on 14,524 twin pairs obtained from 249 DHS conducted in 79 low- and middle-income countries. We focused on height, due to its frequent use in twin studies, but also report heritability estimates for weight, birthweight, hemoglobin level, and weight-for-height. We report results for children under five, children under two, and children 2–4 years old.

Material and methods

Data

All data were from the DHS (DHS, 2021). The DHS are nationally representative household surveys collected roughly every five years in many low- and middle-income countries. The surveys used a two-stage stratified sampling design. In the first stage, primary sampling units, typically consisting of villages or neighborhoods, were selected from strata of subnational geographic or administrative regions, separated into urban and rural areas, with a probability proportional to population size. In the second stage, 20–30 households were sampled at random from each primary sampling unit. From these households, women 15–49 years old (a few surveys used a slightly different age range) were interviewed and data were collected on their birth histories and health, as well as their children's health. Response rates generally exceeded 90% (Corsi et al., 2012). We used data from all available DHS (as of October 27, 2020) that recorded child height and retained 249 surveys from 79 countries conducted between 1986 and 2018. The height of children under five years old was measured at the time of the survey. In a few surveys, the age limit was three years old, and in others, height was only measured for a subsample of children in the eligible age group. We excluded 2,169,269 singletons, 1,198 triplets and quadruplets, 19,960 twins where neither twin had a height measure recorded (e.g., because of death, not being sampled, or an implausible height value), 10,196 twins where only one twin had a height measure, and 34 where age did not match between twins. The remaining sample consisted of 29,048 twins, or 14,524 twin pairs.

Outcome

Our main outcome was child height measured in millimeters by an interviewer at the time of the survey. Since children in the DHS are measured at different ages, we factored out age by using a residualized measure of height for the analysis. We used a two-step process, applied separately by sex (which implicitly also adjusts for sex differences): 1) To account for non-linear patterns in growth, we first residualized height on age using a LOESS regression. 2) To account for increasing variation in height as a function of age, we then obtained a projected SD of height by age, using a LOESS model to predict the SD of height as a function of age (with age binned by month). Finally, we took the residual from the first step and standardized it by dividing by the fitted SD from the second step.
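The two-step residualization can be sketched as follows. This is a minimal illustration rather than the authors' code (which is in the linked repository): the smoother `local_linear` is a simple tricube-weighted local linear fit standing in for LOESS, the bandwidth fraction `frac` is an arbitrary choice, and the ages and heights are synthetic. In the paper the procedure is run separately for each sex.

```python
import math
import random
import statistics

def local_linear(x, y, x0, frac=0.5):
    """Tricube-weighted local linear fit at x0 (a simple stand-in for LOESS)."""
    k = max(2, math.ceil(frac * len(x)))
    # Bandwidth: distance to the k-th nearest point, padded so weights stay positive.
    h = sorted(abs(xi - x0) for xi in x)[k - 1] * 1.001 + 1e-12
    w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (x0 - mx)

def residualize(ages, heights, frac_mean=0.2, frac_sd=0.5):
    """Return age-standardized height residuals for one sex."""
    months = sorted(set(ages))
    # Step 1: smooth mean height over age and take residuals.
    mean_fit = {m: local_linear(ages, heights, m, frac_mean) for m in months}
    resid = [h - mean_fit[a] for a, h in zip(ages, heights)]
    # Step 2: SD of height within each monthly bin, smoothed over age.
    by_month = {m: [h for a, h in zip(ages, heights) if a == m] for m in months}
    bins = [m for m in months if len(by_month[m]) > 1]
    sds = [statistics.stdev(by_month[m]) for m in bins]
    sd_fit = {m: local_linear(bins, sds, m, frac_sd) for m in months}
    # Final step: divide each residual by the fitted SD at that age.
    return [r / sd_fit[a] for r, a in zip(resid, ages)]

# Synthetic example (hypothetical growth curve; spread increases with age).
random.seed(1)
ages = [i % 60 for i in range(1200)]
heights = [50 + 0.9 * a - 0.006 * a * a + random.gauss(0, 2 + 0.05 * a) for a in ages]
z = residualize(ages, heights)
```

After this transformation the values in `z` should be roughly centered at zero with unit spread at every age, which is what allows twins measured at different ages to be pooled.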

Model

We describe the approach here for a generic outcome $y$. Following Arbet et al. (2020), we suppose that

$$y = A + C + E,$$

where the variance components $A$ (additive genetic), $C$ (common shared family environment), and $E$ (nonshared environmental) are random effects drawn from distributions that depend on the zygosity of the twin, $z$. The key assumption has to do with the distributions underlying these random effects. We assume that

$$A \sim N(0, \sigma_A^2 K_z), \qquad C \sim N(0, \sigma_C^2 J), \qquad E \sim N(0, \sigma_E^2 I),$$

where $\sigma_A^2$, $\sigma_C^2$, and $\sigma_E^2$ are variance parameters, $K_z$ is a matrix with ones on the diagonal and relatedness on the off-diagonals (0.5 for dizygotic (DZ) twins and 1 for monozygotic (MZ) twins), $J$ is a matrix of ones, and $I$ is the identity matrix. This model, the classic ACE model (Maes, 2014), then defines heritability as

$$h^2 = \frac{\sigma_A^2}{\sigma_A^2 + \sigma_C^2 + \sigma_E^2}.$$

Given the nature of the DHS data, zygosity here is unknown.
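To make the role of zygosity concrete, the covariance of a twin pair implied by the ACE model can be written out. This is a worked sketch under the usual assumption that the three random effects are mutually independent. The labels $K_z$ for the relatedness matrix and $\Sigma_z$ for the pair covariance are notation chosen here for illustration.

```latex
\Sigma_z = \sigma_A^2 K_z + \sigma_C^2 J + \sigma_E^2 I,
\qquad
\Sigma_{MZ} =
\begin{pmatrix}
\sigma_A^2 + \sigma_C^2 + \sigma_E^2 & \sigma_A^2 + \sigma_C^2 \\
\sigma_A^2 + \sigma_C^2 & \sigma_A^2 + \sigma_C^2 + \sigma_E^2
\end{pmatrix},
\qquad
\Sigma_{DZ} =
\begin{pmatrix}
\sigma_A^2 + \sigma_C^2 + \sigma_E^2 & \tfrac{1}{2}\sigma_A^2 + \sigma_C^2 \\
\tfrac{1}{2}\sigma_A^2 + \sigma_C^2 & \sigma_A^2 + \sigma_C^2 + \sigma_E^2
\end{pmatrix}
```

The two pair types share the same variance and differ only in the off-diagonal term, by $\tfrac{1}{2}\sigma_A^2$. That difference is what identifies the additive genetic variance.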

Estimation

We used an approach originally suggested by Neale (2003) to compute h2 without known zygosity by estimating a mixture model. In particular, we used maximum likelihood estimation wherein the likelihood for each observation was weighted by the probability that the pair is MZ or DZ. For different-sex (DS) twins, this was known (i.e., all DS twins are DZ). For same-sex (SS) twins, we relied on an external parameter ρ, which described the proportion of twins that were MZ. In practice, this quantity was unknown, so we used sensitivity analysis to describe the robustness of results to various choices of ρ. The heritability estimates for height were mostly constant over different assumptions for the ρ parameter (Fig. 1). Additional work on this approach is described in Benyamin et al. (2005, 2006). Confidence intervals were constructed via bootstrap resampling of the twin pairs.
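The per-pair mixture likelihood can be sketched as below. This is an illustrative reading of the approach, not the authors' implementation: it assumes a zero-mean (residualized) outcome, treats ρ as the share of same-sex pairs that are MZ, and requires 0 < ρ < 1 and a positive nonshared variance. The function names are hypothetical.

```python
import math

def bvn_logpdf(y1, y2, var, cov):
    """Log density of a zero-mean bivariate normal with equal variances."""
    det = var * var - cov * cov
    quad = (var * y1 * y1 - 2 * cov * y1 * y2 + var * y2 * y2) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def pair_loglik(y1, y2, same_sex, var_a, var_c, var_e, rho):
    """Log-likelihood of one twin pair when zygosity is unobserved."""
    total = var_a + var_c + var_e
    ll_mz = bvn_logpdf(y1, y2, total, var_a + var_c)        # relatedness 1
    ll_dz = bvn_logpdf(y1, y2, total, 0.5 * var_a + var_c)  # relatedness 0.5
    if not same_sex:
        return ll_dz  # different-sex pairs are known to be DZ
    # Same-sex pairs: mixture of MZ and DZ densities, via log-sum-exp.
    m = max(ll_mz, ll_dz)
    return m + math.log(rho * math.exp(ll_mz - m) + (1 - rho) * math.exp(ll_dz - m))

def total_loglik(pairs, var_a, var_c, var_e, rho):
    """Sum over pairs given as (y1, y2, same_sex) tuples."""
    return sum(pair_loglik(y1, y2, ss, var_a, var_c, var_e, rho) for y1, y2, ss in pairs)
```

Maximizing `total_loglik` over the three variance parameters for a fixed ρ, and repeating for several values of ρ, corresponds to the sensitivity analysis described above. Resampling the list of pairs with replacement and refitting would give bootstrap intervals.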
Fig. 1

Heritability estimates for height under different assumptions for ρ.

Notes: ‘Height, flexible residualization’ shows our main measure. ‘HAZ’ shows height-for-age z-score. ‘Height residualized’ shows the alternative, less flexible, residualization. ‘Height’ is based on unadjusted height in cm.

Simulation study

We conducted a simulation study to check that our sample size gave adequate power to estimate the heritability of height. Each iteration of the simulation was based on 15,000 simulated twin pairs with the MZ prevalence at 0.5. In each iteration, we fixed one model quantity and independently sampled two others from the unit interval. For DZ pairs, we also simulated sex for each twin. All code is available at https://github.com/ben-domingue/twins-nozyg. For comparison, we also estimated the relevant variance components using known zygosity, and we compared recovery of the variance components in both cases against their known true values. Fig. 2 (top row) compares estimates of one variance component with the truth (x-axis). On the left, we see estimates with known zygosity; on the right, we have estimates without zygosity (these are the relevant ones for our purposes). Estimates without zygosity are clearly noisier. We can see this visually and also in the mean squared error (MSE) shown at the bottom right of each panel. For the top-row component, the MSE was 0.0024 with zygosity, while it was over ten times greater without, at 0.025. For the bottom-row component, the MSE was 0.0017 with zygosity, while it was over four times greater without zygosity information, at 0.0074. Estimates without zygosity were therefore somewhat better for the bottom-row component than for the top-row component (the MSE falls from 0.025 to 0.007). We then converted these quantities into estimates of h2. Since h2 for height tends to be large, we expect this method to be able to estimate heritability.
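A single simulated dataset of this kind can be generated along the following lines. This is a sketch under stated assumptions rather than the code in the linked repository: outcomes are drawn from the ACE model with standard normal latent factors, each DZ twin's sex is a fair coin flip, and the parameter values in the example are arbitrary.

```python
import math
import random

def simulate_pairs(n_pairs, var_a, var_c, var_e, p_mz=0.5, seed=0):
    """Return a list of (y1, y2, is_mz, same_sex) tuples drawn from an ACE model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        is_mz = rng.random() < p_mz
        shared_g = rng.gauss(0, 1)
        if is_mz:
            g1 = g2 = shared_g  # genetic correlation 1
            same_sex = True
        else:
            # Half shared, half unique genetic factor: genetic correlation 0.5.
            g1 = math.sqrt(0.5) * (shared_g + rng.gauss(0, 1))
            g2 = math.sqrt(0.5) * (shared_g + rng.gauss(0, 1))
            same_sex = (rng.random() < 0.5) == (rng.random() < 0.5)
        c = math.sqrt(var_c) * rng.gauss(0, 1)  # shared environment
        y1 = math.sqrt(var_a) * g1 + c + math.sqrt(var_e) * rng.gauss(0, 1)
        y2 = math.sqrt(var_a) * g2 + c + math.sqrt(var_e) * rng.gauss(0, 1)
        pairs.append((y1, y2, is_mz, same_sex))
    return pairs

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Example with arbitrary variance components.
pairs = simulate_pairs(6000, var_a=0.5, var_c=0.3, var_e=0.2, seed=1)
mz = [(p[0], p[1]) for p in pairs if p[2]]
dz = [(p[0], p[1]) for p in pairs if not p[2]]
r_mz = correlation([p[0] for p in mz], [p[1] for p in mz])
r_dz = correlation([p[0] for p in dz], [p[1] for p in dz])
```

With these inputs the MZ and DZ correlations should land near (A + C)/(A + C + E) = 0.8 and (0.5A + C)/(A + C + E) = 0.55. An estimator that ignores the `is_mz` flag has to recover that gap from same-sex and different-sex pairs alone, which is why its estimates are noisier.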
Fig. 2

Comparison of true and recovered variance components in 300 simulated twin datasets.

Notes: Each simulated dataset consisted of 15,000 pairs. On the left, we estimate with zygosity information. On the right, without zygosity. The top and bottom rows each focus on a different variance component.

Supplementary analysis

As supplementary analysis, we provide heritability estimates for other biomarkers available in the DHS and alternative specifications of height: height without residualization, height using alternative residualization (based on a linear model using age and survey year), height-for-age z-scores, weight (residualized), weight-for-age z-scores, weight-for-height z-scores, birthweight, and hemoglobin levels, using the same approach. All z-scores for anthropometric measures were based on the 2006 WHO growth standards (WHO, 2006). As a supplementary analysis we also provide heritability for height across different survey-level infant mortality rates, which reflect living standards and child health.

Results

Overall, first-born twins had an average height of 80.5 cm (standard deviation [SD] = 14.7) while second-born twins had an average height of 80.2 cm (SD = 14.7), with a correlation of 0.97 between heights of twins (Table 1). First-born twins less than two years old had a mean height of 68.0 cm (SD = 9.5) while second-born twins less than two years old had a mean height of 67.8 cm (SD = 9.5), with a correlation of 0.95 between twins. For twins two years old and older, the average height was 91.1 cm (SD = 8.9) for the first-born twin and 90.1 cm (SD = 8.8) for the second-born twin, with a correlation of 0.91 between twins.
Table 1

Descriptive statistics.

| | First-born twin, mean height in cm (SD) | Second-born twin, mean height in cm (SD) | Pearson's correlation | Number of twin pairs |
|---|---|---|---|---|
| Overall | 80.5 (14.7) | 80.2 (14.7) | 0.97 | 14,524 |
| Less than 24 months old | 68.0 (9.5) | 67.8 (9.5) | 0.95 | 6,696 |
| Over 23 months old | 91.1 (8.9) | 90.1 (8.8) | 0.91 | 7,828 |
Fig. 3 (top row) shows trajectories of height in centimeters as a function of age within rolling 6-month windows, split by sex and by quintiles of a household wealth index (a measure of living standards provided with the DHS). Children in the lower quintiles grew at a slower pace than children in the upper quintiles. Fig. 3 (bottom row) compares squared differences in height across same-sex (SS) and different-sex (DS) pairs. On the left, we observe that there is less variation in squared height differences among SS pairs than among DS pairs. This captures the basic notion of twin-based heritability: DS pairs are all dizygotic (DZ) twins while the SS pairs are a mix of DZ and monozygotic (MZ) twins, the latter of which tend to show reduced variation in height because of their greater genetic similarity. On the right, we look at these differences as a function of age. The differences increased with age, with perhaps some additional increase among DS pairs.
Fig. 3

Top: Height as a function of age by wealth index quintiles. Bottom: Squared differences in height as a function of age.

Notes: Age was used in 6 month rolling birth windows.

The heritability estimate for height (residualized according to age and sex) was 0.46 (95% confidence interval [CI]: 0.44–0.48) overall (Table 2). The heritability was effectively the same (0.46) when computed separately among children younger than two (95% CI: 0.41–0.50) and children two and older (95% CI: 0.43–0.49). The results from our supplementary analysis show that the heritability estimate is highly sensitive to whether age, in some form, was factored out of the height measure (Table 3). When using height in centimeters as an unstandardized measure, the heritability estimate is 0.07 overall. Height-for-age z-score and height using an alternative residualization give results similar to our main estimates.
Table 2

Results: Heritability estimate for residualized height and estimates of variance components (ρ = 0.5).

| | h2 | A | C | E |
|---|---|---|---|---|
| All | 0.46 [0.44–0.48] | 0.46 [0.44–0.49] | 0.52 [0.49–0.55] | 0.026 [0.024–0.028] |
| <24 months | 0.46 [0.41–0.50] | 0.47 [0.43–0.51] | 0.52 [0.48–0.58] | 0.033 [0.029–0.038] |
| ≥24 months | 0.46 [0.43–0.49] | 0.45 [0.42–0.48] | 0.51 [0.47–0.55] | 0.021 [0.018–0.024] |

Notes: h2 shows the heritability estimate. A shows additive genetic variance, C shows common (shared) environmental variance, and E shows nonshared (specific) environmental variance. 95% confidence intervals from 100 bootstrap replications are shown in brackets.
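As a quick consistency check, the overall heritability in Table 2 can be recomputed from the reported variance components, assuming the classic ACE definition of heritability as the additive genetic share of total variance. The inputs are the rounded values from the "All" row, so the result agrees with the reported figure only up to rounding.

```latex
h^2 = \frac{\sigma_A^2}{\sigma_A^2 + \sigma_C^2 + \sigma_E^2}
    = \frac{0.46}{0.46 + 0.52 + 0.026}
    = \frac{0.46}{1.006}
    \approx 0.457 \approx 0.46
```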

Table 3

Supplementary analysis.

All

| | h2 | A | C | E |
|---|---|---|---|---|
| Height (not residualized) | 0.07 | 14.24 | 200.51 | 0.73 |
| Height (alternative residualization) | 0.38 | 14.24 | 22.05 | 0.73 |
| Height-for-age z-score | 0.49 | 1.4 | 1.41 | 0.08 |
| Weight (residualized) | 0.51 | 2.01 | 1.77 | 0.16 |
| Weight-for-age z-score | 0.54 | 1.18 | 0.88 | 0.12 |
| Weight-for-height z-score | 0.93 | 3.47 | 0.12 | 0.15 |
| Birthweight | 0.71 | 365087.41 | 142914.53 | 7022.51 |
| Hemoglobin levels | 0.61 | 2.11 | 0.94 | 0.44 |

<24 months

| | h2 | A | C | E |
|---|---|---|---|---|
| Height (not residualized) | 0.11 | 10.16 | 78.65 | 0.67 |
| Height (alternative residualization) | 0.31 | 10.18 | 22.03 | 0.67 |
| Height-for-age z-score | 0.51 | 1.69 | 1.53 | 0.11 |
| Weight (residualized) | 0.45 | 1.2 | 1.38 | 0.1 |
| Weight-for-age z-score | 0.53 | 1.37 | 1.07 | 0.15 |
| Weight-for-height z-score | 0.99 | 4.75 | −0.17 | 0.2 |
| Birthweight | 0.68 | 289856.62 | 124189.52 | 11988.78 |
| Hemoglobin levels | 0.59 | 2.14 | 1.06 | 0.44 |

≥24 months

| | h2 | A | C | E |
|---|---|---|---|---|
| Height (not residualized) | 0.22 | 17.57 | 59.78 | 0.8 |
| Height (alternative residualization) | 0.43 | 17.55 | 22.19 | 0.8 |
| Height-for-age z-score | 0.45 | 1.14 | 1.32 | 0.05 |
| Weight (residualized) | 0.52 | 2.6 | 2.17 | 0.24 |
| Weight-for-age z-score | 0.55 | 0.98 | 0.71 | 0.1 |
| Weight-for-height z-score | 0.91 | 2.79 | 0.13 | 0.13 |
| Birthweight | 0.74 | 345941.12 | 115849.98 | 7298.88 |
| Hemoglobin levels | 0.69 | 2.05 | 0.46 | 0.45 |

Notes: h2 shows the heritability estimate. A shows additive genetic variance, C shows common (shared) environmental variance, and E shows nonshared (specific) environmental variance.
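The h2 column in Table 3 can be cross-checked against the variance components in the same row. The sketch below assumes the classic ACE definition (additive genetic variance over total variance) and uses the rounded values from the "All" panel, plus the weight-for-height row for children under 24 months. The helper name `heritability` is introduced here for illustration.

```python
def heritability(var_a, var_c, var_e):
    """Share of total variance attributed to additive genetic effects."""
    return var_a / (var_a + var_c + var_e)

# (reported h2, A, C, E) from the "All" panel of Table 3.
table3_all = {
    "Height (not residualized)": (0.07, 14.24, 200.51, 0.73),
    "Height (alternative residualization)": (0.38, 14.24, 22.05, 0.73),
    "Height-for-age z-score": (0.49, 1.4, 1.41, 0.08),
    "Weight (residualized)": (0.51, 2.01, 1.77, 0.16),
    "Weight-for-age z-score": (0.54, 1.18, 0.88, 0.12),
    "Weight-for-height z-score": (0.93, 3.47, 0.12, 0.15),
    "Birthweight": (0.71, 365087.41, 142914.53, 7022.51),
    "Hemoglobin levels": (0.61, 2.11, 0.94, 0.44),
}

recomputed = {name: heritability(a, c, e) for name, (_, a, c, e) in table3_all.items()}
for name, (reported, _, _, _) in table3_all.items():
    print(f"{name}: reported {reported:.2f}, recomputed {recomputed[name]:.3f}")

# Weight-for-height z-score, <24 months: the shared-environment estimate is negative.
whz_under_two = heritability(4.75, -0.17, 0.2)
```

The unresidualized height row shows why factoring out age matters: the additive genetic variance is the same as in the alternative residualization (14.24), but the shared component is about nine times larger (200.51 versus 22.05), because age differences between pairs are absorbed as variance common to both twins. The negative shared-environment estimate in the weight-for-height row pushes the ratio close to 1.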

We found a heritability estimate of 0.54 for weight-for-age z-score and 0.51 for residualized weight. Weight-for-height z-score had a heritability estimate of 0.99 before age two and 0.91 after age two. Birthweight had a heritability estimate of 0.71 and hemoglobin level had a heritability estimate of 0.61. There was no clear link between the heritability estimate for height and the infant mortality rate, although the heritability estimate appears to be greater in surveys where the infant mortality rate was greater (Fig. 4).
Fig. 4

h2 for residualized height across infant mortality rate (per 1,000 births).

Notes: Infant mortality rate (IMR) was calculated for each survey using the synthetic cohort probability method.

Discussion

This paper demonstrated a method for obtaining heritability estimates without information on zygosity, using data on 14,524 twin pairs from 249 DHS conducted in 79 low- and middle-income countries. Our heritability estimate for height indicates that 46% of the variation in height was attributable to genetics, and the estimate remained the same before and after age two. This study had limitations. We did not account for assortative mating, which may lead to underestimation of heritability (Zietsch et al., 2011). Further, we estimated heritability for children at different ages who were still growing: for example, in our supplementary analysis we observed that not factoring out age reduced the heritability estimate for height to 0.07. Although we factored out age variation in height, there may be a residual influence of age on our heritability estimate. Further, although the heritability estimates obtained were plausible in most cases, the weight-for-height z-score estimate indicated a heritability of 99% before age two, which is inconsistent with the much lower heritability of height and weight separately and with the known role that acute undernutrition plays in weight-for-height. Further, the estimate for birthweight indicated 71% heritability, considerably greater than in other studies, which have put the heritability at around 25–40%. Those estimates are based on high-income settings, where environmental insults were presumably less important than in our sample (Wells & Stock, 2011). Previous studies of height have suggested heritability estimates ranging from 0.2 to 0.5 for infants (Dubois et al., 2012; Jelenkovic, Sund, et al., 2016; Mook-Kanamori et al., 2012; Silventoinen et al., 2008). However, other studies have found that the heritability of height increases sharply with age, reaching, for example, 0.7 in the Netherlands at age three (Mook-Kanamori et al., 2012) and 0.9 in Sweden at age two (Silventoinen et al., 2008).
Our estimate remained the same before and after age two. Our heritability estimate of 0.46 for height is at the upper limit of the range suggested for infants by previous studies, while considerably lower than what previous studies have found for children aged 2–4. A few explanations have been offered for the increase in heritability with age observed in other studies. First, in utero exposures act differently on MZ and DZ twins (Phillips, 1993), which is especially compromising when studying a trait such as early-life growth using twin designs. Catch-up growth, where restricted growth in utero is compensated for by faster and longer postnatal growth (Jelenkovic, Sund, et al., 2016), may happen over several years. The observation that the genetically determined component is smaller in infancy may therefore reflect variation in prenatal insults to growth, which affect MZ and DZ twins differently (Phillips, 1993), and the increasing heritability may reflect catch-up growth. Second, the heritability of other traits, such as BMI and cognitive development, has also been observed to increase with age, which has been attributed to heterogeneity in gene expression (Dubois et al., 2012), gene-environment interactions (Lajunen et al., 2009; Purcell, 2002), or gene-environment correlations (Bergen et al., 2007; Jaffee & Price, 2007). Third, higher measurement error in height at earlier ages (Pullum, 2008) may explain the increase in heritability with age (Jelenkovic, Sund, et al., 2016). The absence of a relationship with age in our study, compared to other studies, may relate to differences in study setting. We study low-resource populations where continuous insults to growth, from chronic undernutrition and repeated infections, are more common than in North America, Europe, Australia, and East Asia, where most other studies were conducted.
The absence of an increase in heritability with age in our study may relate to a more adverse environment in our study setting than in previous studies, which may inhibit catch-up growth. However, other studies have found heritability estimates to be fairly stable across various levels of living standards: the genetic and environmental variances were found to be similar across parental education levels (Jelenkovic et al., 2020), and the environmentally determined component of adult height was found to remain similar across cohorts born into vastly different environments over the 20th century within Europe, North America, Australia, and East Asia (Jelenkovic, Hur, et al., 2016). We also found a potential indication that our heritability estimate was higher at higher levels of adversity, as indicated by the infant mortality rate, although this relationship was unclear. Other explanations may relate to the absence of information on zygosity or to children being measured at different ages. Our explanations for the absence of age differences in heritability in our study are highly speculative.

Conclusions

To conclude, we find the heritability estimate for height among children under five years old to be 0.46 in our sample of twins from low- and middle-income countries. Unlike other studies of the heritability of height in children, we find no difference in the heritability estimate by age. We also find an implausible 99% heritability of weight-for-height before age two. We apply a method that may be used to obtain heritability estimates from birth histories in survey data where zygosity is unknown.

Author contributions

Conceptualization and Design: SVS, BD. Data Acquisition: OK. Statistical analysis and interpretation: BD. Drafting of the Manuscript: OK, BD. Critical revisions to Manuscript: OK, BD, RK, SVS. Overall Supervision: SVS.

Data availability

DHS data are available at https://dhsprogram.com (requiring a simple application).

Compliance with ethical standards

This project used publicly accessible secondary data obtained from the DHS website. The DHS data were not collected specifically for this study and no one on the study team has access to identifiers linked to the data. These activities do not meet the regulatory definition of human subject research. As such, an Institutional Review Board (IRB) review is not required. The Harvard Longwood Campus allows researchers to self-determine when their research does not meet the requirements for IRB oversight via an IRB Decision Tool. The ICF IRB and local IRBs approved data collection procedures and questionnaires, and the U.S. Centers for Disease Control and Prevention (CDC) reviewed protocols.

Funding

OK was funded by a Wallander stipendium (W19-0015) from the Jan Wallander and Tom Hedelius foundation.

Code availability

All code used is available at https://github.com/ben-domingue/twins-nozyg.

Declaration of competing interest

None.
References (10 of 21 shown)

1.  A finite mixture distribution model for data collected from twins.

Authors:  Michael C Neale
Journal:  Twin Res       Date:  2003-06

2.  Large, consistent estimates of the heritability of cognitive ability in two entire populations of 11-year-old twins from Scottish mental surveys of 1932 and 1947.

Authors:  Beben Benyamin; Valerie Wilson; Lawrence J Whalley; Peter M Visscher; Ian J Deary
Journal:  Behav Genet       Date:  2005-09       Impact factor: 2.805

3.  Twin differences in birth weight: the effects of genotype and prenatal environment on neonatal and post-neonatal mortality.

Authors:  Dalton Conley; Kate W Strully; Neil G Bennett
Journal:  Econ Hum Biol       Date:  2006-01-24       Impact factor: 2.184

4.  Genetic regulation of growth from birth to 18 years of age: the Swedish young male twins study.

Authors:  Karri Silventoinen; Kirsi H Pietiläinen; Per Tynelius; Thorkild I A Sørensen; Jaakko Kaprio; Finn Rasmussen
Journal:  Am J Hum Biol       Date:  2008 May-Jun       Impact factor: 1.937

5.  Demographic and health surveys: a profile.

Authors:  Daniel J Corsi; Melissa Neuman; Jocelyn E Finlay; S V Subramanian
Journal:  Int J Epidemiol       Date:  2012-11-12       Impact factor: 7.196

6.  Twin studies in medical research: can they tell us whether diseases are genetically determined?

Authors:  D I Phillips
Journal:  Lancet       Date:  1993-04-17       Impact factor: 79.321

7.  Meta-analysis of the heritability of human traits based on fifty years of twin studies.

Authors:  Tinca J C Polderman; Beben Benyamin; Christiaan A de Leeuw; Patrick F Sullivan; Arjen van Bochoven; Peter M Visscher; Danielle Posthuma
Journal:  Nat Genet       Date:  2015-05-18       Impact factor: 38.330

8.  Genetic and environmental contributions to weight, height, and BMI from birth to 19 years of age: an international study of over 12,000 twin pairs.

Authors:  Lise Dubois; Kirsten Ohm Kyvik; Manon Girard; Fabiola Tatone-Tokuda; Daniel Pérusse; Jacob Hjelmborg; Axel Skytthe; Finn Rasmussen; Margaret J Wright; Paul Lichtenstein; Nicholas G Martin
Journal:  PLoS One       Date:  2012-02-08       Impact factor: 3.240

9.  Genetic and environmental influences on human height from infancy through adulthood at different levels of parental education.

Authors:  Aline Jelenkovic; Reijo Sund; Yoshie Yokoyama; Antti Latvala; Masumi Sugawara; Mami Tanaka; Satoko Matsumoto; Duarte L Freitas; José Antonio Maia; Ariel Knafo-Noam; David Mankuta; Lior Abramson; Fuling Ji; Feng Ning; Zengchang Pang; Esther Rebato; Kimberly J Saudino; Tessa L Cutler; John L Hopper; Vilhelmina Ullemar; Catarina Almqvist; Patrik K E Magnusson; Wendy Cozen; Amie E Hwang; Thomas M Mack; Tracy L Nelson; Keith E Whitfield; Joohon Sung; Jina Kim; Jooyeon Lee; Sooji Lee; Clare H Llewellyn; Abigail Fisher; Emanuela Medda; Lorenza Nisticò; Virgilia Toccaceli; Laura A Baker; Catherine Tuvblad; Robin P Corley; Brooke M Huibregtse; Catherine A Derom; Robert F Vlietinck; Ruth J F Loos; S Alexandra Burt; Kelly L Klump; Judy L Silberg; Hermine H Maes; Robert F Krueger; Matt McGue; Shandell Pahlen; Margaret Gatz; David A Butler; Jennifer R Harris; Ingunn Brandt; Thomas S Nilsen; K Paige Harden; Elliot M Tucker-Drob; Carol E Franz; William S Kremen; Michael J Lyons; Paul Lichtenstein; Meike Bartels; Catharina E M van Beijsterveldt; Gonneke Willemsen; Sevgi Y Öncel; Fazil Aliev; Hoe-Uk Jeong; Yoon-Mi Hur; Eric Turkheimer; Dorret I Boomsma; Thorkild I A Sørensen; Jaakko Kaprio; Karri Silventoinen
Journal:  Sci Rep       Date:  2020-05-14       Impact factor: 4.379

10.  Heritability estimates of body size in fetal life and early childhood.

Authors:  Dennis O Mook-Kanamori; Catharina E M van Beijsterveldt; Eric A P Steegers; Yurii S Aulchenko; Hein Raat; Albert Hofman; Paul H Eilers; Dorret I Boomsma; Vincent W V Jaddoe
Journal:  PLoS One       Date:  2012-07-25       Impact factor: 3.240

