Hanxin Zhang, Atif Khan, Andrey Rzhetsky.
Abstract
In complex diseases, phenotypic variability can be explained by genetic variation (G), environmental stimuli (E), and the interaction of genetic and environmental factors (G-by-E effects); of these, the contribution of G-by-E remains largely unknown. In this study, we focus on ten major neuropsychiatric disorders using data for 138,383 United States families with 404,475 unique individuals. We show that, while gene-environment interactions account for only a small portion of the total phenotypic variance for a subset of disorders (depression, adjustment disorder, substance abuse), they explain a rather large portion of the phenotypic variation of the remaining disorders: over 20% for migraine and close to or over 30% for anxiety/phobic disorder, attention-deficit/hyperactivity disorder, recurrent headaches, sleep disorders, and post-traumatic stress disorder. In this study, we have incorporated, in the same analysis, clinical data, family pedigrees, the spatial distribution of individuals, their socioeconomic and demographic confounders, and a collection of environmental measurements.
Keywords: Bayesian inference; etiology; gene-environment interaction; heritability; mixed-effects model; neuropsychiatric disorder; psychiatric disorder
Year: 2022 PMID: 36070757 PMCID: PMC9512674 DOI: 10.1016/j.xcrm.2022.100736
Source DB: PubMed Journal: Cell Rep Med ISSN: 2666-3791
Model setups and statistics
| Model | Fixed effects | Random effects | Statistics |
|---|---|---|---|
| Linear model 0 | Demo + Env | G + E | |
| Linear model 1 | Demo + Env | Geo + G + E | |
| Interaction model 1 | Demo + Env | Geo + G + E + GE | |
| Linear model 2 | Demo + Env | Geo + G + F + C + S + E | |
| Interaction model 2 | Demo + Env | Geo + G + F + C + S + E + GF + GC + GS + GE | |
The fixed-effect terms are sex + age (Demo) and environmental quality indices (Env).
The random-effect terms are defined by the partition of the phenotype explained by the following. Geo, geographic position, described by coordinates (latitude and longitude); G, genetics; E, the individually independent environment; F, the environment shared by family members; C, the environment shared by couples; S, the environment shared by siblings; GE, the interaction between genetics and the individually independent environment; GF, the interaction between genetics and the family-shared environment; GC, the interaction between genetics and the couples-shared environment; GS, the interaction between genetics and the siblings-shared environment.
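As a rough illustration of what "partition of the phenotype" means here, the sketch below turns a set of variance components into percentage shares. It assumes each reported statistic is one component's variance divided by the sum of all random-effect variances; the exact definitions are in the STAR Methods and are not reproduced in this excerpt. This reading is at least consistent with the reported point estimates, which add up to about 100% within a row (for example, 1.40 + 47.50 + 51.10 in the first LM1 row). The function name and the input numbers are hypothetical.

```python
def variance_shares(components):
    """Return each component's share of the total variance.

    `components` maps a label (e.g. "G") to a non-negative variance.
    Assumption: shares are simple ratios to the summed variance.
    """
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: value / total for name, value in components.items()}


# Hypothetical variance components for a Geo + G + E model (made-up numbers).
example = {"Geo": 0.05, "G": 1.20, "E": 1.25}
shares = variance_shares(example)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under this reading, adding an interaction term such as GE can only take variance away from the other components, which is why the G and E shares shrink when moving from a linear model to its interaction counterpart.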
Mean estimates and 95% highest posterior density credible intervals of the heritability and environmental statistics (see Table S1 for details on simulations and inferences under the proposed models)
| Model | Statistics (filled cells, left to right); the last two filled cells in each row are WAIC and PSIS-LOO-CV | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LM0 | 53.52 (44.94, 64.42)% | 46.48 (35.58, 55.06)% | 257,729.31 (256,459.62, 258,999.00) | 291,289.56 (289,805.55, 292,773.57) | ||||||||
| LM1 | 1.40 (0.96, 1.97)% | 47.50 (43.82, 52.54)% | 51.10 (45.92, 54.86)% | 240,648.05 (239,482.50, 241,813.60) | 284,715.61 (283,263.98, 286,167.24) | |||||||
| IM1 | 2.35 (1.53, 3.04)% | 31.92 (20.40, 55.46)% | 36.36 (27.03, 44.87)% | 29.37 (14.29, 42.94)% | 195,306.87 (194,145.39, 196,468.35) | 263,091.00 (261,644.38, 264,537.62) | ||||||
| LM2 | 1.54 (0.98, 2.10)% | 56.44 (50.26, 60.70)% | 1.99 (0.32, 4.06)% | 31.45 (27.93, 33.82)% | 5.12 (1.70, 9.43)% | 3.47 (0.50, 7.39)% | 269,150.17 (267,800.73, 270,499.61) | 296,226.08 (294,703.79, 297,748.37) | ||||
| IM2 | 2.02 (1.29, 2.84)% | 47.42 (26.90, 60.63)% | 1.97 (0.08, 5.22)% | 19.47 (12.41, 25.20)% | 4.66 (0.61, 9.82)% | 3.40 (0.29, 6.94)% | 0.95 (0.00, 4.69)% | 12.70 (4.31, 23.10)% | 4.05 (0.00, 11.12)% | 3.37 (0.00, 10.86)% | 245,390.66 (244,122.11, 246,659.21) | 284,535.98 (283,018.76, 286,053.20) |
| LM0 | 69.48 (61.50, 79.38)% | 30.52 (20.62, 38.50)% | 176,007.77 (174,690.14, 177,325.40) | 197,970.08 (196,439.77, 199,500.39) | ||||||||
| LM1 | 2.55 (1.52, 3.65)% | 68.76 (57.32, 82.22)% | 28.69 (14.56, 40.43)% | 178,182.13 (176,836.49, 179,527.77) | 197,324.49 (195,796.14, 198,852.84) | |||||||
| IM1 | 4.69 (3.21, 6.02)% | 55.35 (44.43, 69.13)% | 12.87 (8.29, 19.87)% | 27.10 (17.38, 38.25)% | 139,055.19 (137,852.02, 140,258.36) | 184,686.67 (183,193.74, 186,179.60) | ||||||
| LM2 | 2.07 (1.34, 2.94)% | 49.53 (43.08, 55.48)% | 8.57 (3.43, 12.94)% | 35.00 (31.65, 37.78)% | 2.97 (0.07, 7.40)% | 1.86 (0.14, 5.47)% | 143,671.22 (142,640.64, 144,711.80) | 177,874.25 (176,496.13, 179,252.37) | ||||
| IM2 | 2.30 (1.25, 3.58)% | 50.30 (43.53, 57.38)% | 7.12 (1.21, 15.90)% | 31.18 (20.32, 38.22)% | 1.68 (0.07, 4.59)% | 1.99 (0.02, 5.57)% | 0.49 (0.00, 1.76)% | 1.98 (0.00, 11.24)% | 0.77 (0.00, 2.57)% | 2.19 (0.00, 7.99)% | 135,787.57 (134,804.02, 136,771.12) | 173,205.69 (171,849.96, 174,561.42) |
| LM0 | 57.31 (44.37, 74.73)% | 42.69 (25.27, 55.63)% | 179,692.88 (178,303.44, 181,082.32) | 194,241.46 (192,718.76, 195,764.16) | ||||||||
| LM1 | 0.60 (0.35, 0.89)% | 48.47 (38.09, 60.94)% | 50.93 (38.44, 61.42)% | 165,004.39 (163,752.75, 166,256.03) | 188,446.05 (186,970.39, 189,921.71) | |||||||
| IM1 | 1.06 (0.63, 1.51)% | 31.14 (17.40, 49.83)% | 45.89 (33.71, 56.61)% | 21.91 (8.35, 43.25)% | 133,837.58 (132,839.27, 134,835.89) | 177,214.09 (175,772.12, 178,656.06) | ||||||
| LM2 | 0.65 (0.35, 0.99)% | 49.41 (38.69, 64.85)% | 3.01 (0.32, 8.21)% | 14.01 (9.56, 18.79)% | 13.84 (4.89, 23.04)% | 19.07 (4.49, 35.67)% | 174,572.51 (173,229.81, 175,915.21) | 192,363.95 (190,855.34, 193,872.56) | ||||
| IM2 | 0.77 (0.42, 1.17)% | 46.82 (35.62, 57.99)% | 4.57 (0.48, 11.64)% | 11.86 (4.53, 20.00)% | 15.54 (5.05, 24.94)% | 13.17 (0.68, 33.14)% | 1.34 (0.00, 4.34)% | 2.23 (0.00, 8.44)% | 1.88 (0.00, 4.93)% | 1.82 (0.00, 6.34)% | 170,124.00 (168,807.66, 171,440.34) | 191,132.71 (189,617.06, 192,648.36) |
| LM0 | 68.94 (60.06, 76.83)% | 31.06 (23.17, 39.94)% | 128,812.67 (127,578.50, 130,046.84) | 146,767.54 (145,311.69, 148,223.39) | ||||||||
| LM1 | 4.00 (2.61, 5.52)% | 62.14 (53.13, 69.43)% | 33.86 (26.15, 42.96)% | 125,231.76 (124,031.40, 126,432.12) | 144,010.34 (142,578.76, 145,441.92) | |||||||
| IM1 | 7.55 (5.08, 10.04)% | 60.65 (41.43, 76.10)% | 14.57 (2.27, 24.19)% | 17.24 (2.02, 31.67)% | 118,529.41 (117,371.56, 119,687.26) | 142,509.46 (141,050.65, 143,968.27) | ||||||
| LM2 | 3.51 (2.47, 4.79)% | 27.82 (17.26, 36.92)% | 21.65 (15.43, 27.97)% | 36.25 (31.03, 41.89)% | 5.04 (0.28, 8.96)% | 5.73 (0.04, 12.01)% | 108,221.89 (107,194.65, 109,249.13) | 131,849.40 (130,522.81, 133,175.99) | ||||
| IM2 | 3.37 (1.90, 4.88)% | 32.53 (22.77, 40.77)% | 13.60 (5.21, 20.90)% | 42.59 (32.21, 51.84)% | 2.02 (0.16, 4.50)% | 2.71 (0.03, 5.98)% | 0.95 (0.00, 3.52)% | 0.64 (0.00, 1.73)% | 0.99 (0.00, 4.68)% | 0.59 (0.00, 1.82)% | 113,680.71 (112,586.79, 114,774.63) | 134,470.01 (133,113.51, 135,826.51) |
| LM0 | 51.67 (40.51, 66.57)% | 48.33 (33.43, 59.49)% | 119,551.43 (118,226.49, 120,876.37) | 128,775.41 (127,324.72, 130,226.10) | ||||||||
| LM1 | 2.10 (0.98, 3.70)% | 53.77 (37.89, 89.74)% | 44.12 (6.73, 60.36)% | 121,334.21 (119,979.12, 122,689.30) | 128,047.10 (126,603.56, 129,490.64) | |||||||
| IM1 | 2.29 (1.07, 3.81)% | 40.66 (31.59, 48.66)% | 53.70 (37.49, 61.66)% | 3.35 (0.00, 13.24)% | 110,533.35 (109,318.80, 111,747.90) | 126,149.23 (124,716.08, 127,582.38) | ||||||
| LM2 | 1.62 (0.96, 2.38)% | 40.13 (32.24, 46.53)% | 3.11 (0.20, 5.81)% | 34.45 (29.05, 40.41)% | 8.03 (2.64, 13.63)% | 12.66 (1.95, 21.80)% | 107,571.67 (106,400.26, 108,743.08) | 123,469.25 (122,078.41, 124,860.09) | ||||
| IM2 | 2.31 (1.18, 3.66)% | 41.65 (30.25, 53.14)% | 3.62 (0.17, 8.70)% | 26.16 (15.56, 36.32)% | 8.49 (2.01, 15.28)% | 6.79 (0.35, 14.17)% | 1.58 (0.00, 5.55)% | 7.58 (0.00, 17.21)% | 1.06 (0.00, 3.60)% | 0.75 (0.00, 2.88)% | 111,049.17 (109,814.90, 112,283.44) | 125,555.95 (124,121.80, 126,990.10) |
| LM0 | 95.22 (88.59, 99.88)% | 4.78 (0.12, 11.41)% | 90,855.79 (89,773.01, 91,938.57) | 106,646.84 (105,399.46, 107,894.22) | ||||||||
| LM1 | 3.48 (2.28, 4.60)% | 88.19 (80.62, 94.62)% | 8.33 (1.39, 15.59)% | 89,893.81 (88,885.66, 90,901.96) | 107,365.15 (106,110.44, 108,619.86) | |||||||
| IM1 | 3.52 (2.25, 4.91)% | 76.58 (69.83, 83.63)% | 4.36 (2.04, 10.07)% | 15.53 (10.16, 21.20)% | 75,706.35 (74,759.16, 76,653.54) | 94,878.98 (93,761.72, 95,996.24) | ||||||
| LM2 | 3.49 (2.26, 4.94)% | 48.23 (36.21, 58.38)% | 24.60 (15.25, 39.05)% | 17.15 (11.38, 21.71)% | 1.67 (0.02, 3.48)% | 4.85 (0.07, 13.24)% | 89,568.83 (88,558.49, 90,579.17) | 106,050.93 (104,810.11, 107,291.75) | ||||
| IM2 | 5.80 (4.09, 8.00)% | 55.22 (42.20, 67.84)% | 3.22 (0.01, 9.37)% | 2.45 (0.00, 10.93)% | 1.72 (0.05, 4.94)% | 1.48 (0.02, 3.44)% | 8.66 (1.04, 20.44)% | 21.11 (9.64, 30.64)% | 0.14 (0.00, 0.68)% | 0.21 (0.00, 1.14)% | 57,894.43 (57,295.36, 58,493.50) | 89,048.44 (88,006.33, 90,090.55) |
| LM0 | 79.66 (72.10, 87.38)% | 20.34 (12.62, 27.90)% | 65,719.01 (64,697.44, 66,740.58) | 75,862.89 (74,636.60, 77,089.18) | ||||||||
| LM1 | 0.97 (0.50, 1.52)% | 77.43 (72.13, 86.46)% | 21.61 (12.68, 27.25)% | 64,106.44 (63,110.64, 65,102.24) | 75,406.16 (74,191.33, 76,620.99) | |||||||
| IM1 | 1.50 (0.68, 2.44)% | 81.57 (73.34, 90.50)% | 4.76 (0.15, 12.39)% | 12.18 (0.80, 22.00)% | 65,983.85 (64,891.17, 67,076.53) | 77,988.95 (76,723.30, 79,254.60) | ||||||
| LM2 | 0.94 (0.49, 1.52)% | 45.68 (27.56, 61.37)% | 16.49 (6.45, 25.97)% | 21.96 (11.29, 28.34)% | 3.65 (0.28, 7.20)% | 11.27 (0.48, 26.99)% | 58,846.71 (57,951.28, 59,742.14) | 72,074.67 (70,913.27, 73,236.07) | ||||
| IM2 | 2.18 (1.02, 3.67)% | 53.22 (25.43, 73.22)% | 4.27 (0.14, 10.92)% | 6.60 (0.01, 20.06)% | 2.02 (0.02, 5.92)% | 3.16 (0.02, 9.74)% | 4.04 (0.00, 12.78)% | 22.43 (7.43, 34.23)% | 0.42 (0.00, 1.74)% | 1.64 (0.00, 6.69)% | 36,754.75 (36,228.37, 37,281.13) | 60,323.29 (59,349.95, 61,296.63) |
| LM0 | 52.90 (24.75, 96.63)% | 47.10 (3.37, 75.25)% | 63,569.71 (62,403.35, 64,736.07) | 65,366.62 (64,182.07, 66,551.17) | ||||||||
| LM1 | 2.34 (1.51, 3.39)% | 40.78 (31.76, 51.20)% | 56.88 (46.03, 66.40)% | 55,287.67 (54,290.48, 56,284.86) | 64,434.32 (63,281.10, 65,587.54) | |||||||
| IM1 | 6.23 (4.26, 8.83)% | 23.77 (6.32, 55.84)% | 33.96 (0.70, 55.89)% | 36.04 (24.66, 43.17)% | 40,592.46 (39,731.29, 41,453.63) | 52,606.16 (51,683.98, 53,528.34) | ||||||
| LM2 | 2.25 (1.43, 3.23)% | 26.62 (8.83, 42.34)% | 8.10 (0.34, 15.60)% | 18.24 (6.89, 28.40)% | 18.79 (2.90, 32.95)% | 26.00 (6.99, 43.40)% | 55,280.79 (54,321.10, 56,240.48) | 63,910.85 (62,759.02, 65,062.68) | ||||
| IM2 | 5.48 (3.30, 7.65)% | 17.68 (3.69, 39.63)% | 3.52 (0.04, 8.36)% | 11.98 (0.01, 25.74)% | 10.57 (0.03, 22.92)% | 14.11 (0.74, 27.30)% | 0.92 (0.00, 4.20)% | 19.91 (5.66, 38.30)% | 6.83 (0.00, 23.99)% | 8.99 (0.00, 25.35)% | 32,654.79 (32,126.08, 33,183.50) | 53,238.40 (52,274.86, 54,201.94) |
| LM0 | 47.50 (33.86, 61.38)% | 52.50 (38.62, 66.14)% | 60,866.74 (59,747.27, 61,986.21) | 65,015.00 (63,799.64, 66,230.36) | ||||||||
| LM1 | 2.11 (1.29, 2.99)% | 40.15 (30.78, 50.45)% | 57.74 (46.95, 67.26)% | 55,129.58 (54,075.63, 56,183.53) | 61,872.96 (60,726.07, 63,019.85) | |||||||
| IM1 | 5.55 (3.45, 7.70)% | 13.61 (1.71, 28.47)% | 45.90 (32.31, 60.71)% | 34.93 (18.56, 50.17)% | 37,786.65 (37,132.72, 38,440.58) | 56,917.80 (55,828.63, 58,006.97) | ||||||
| LM2 | 1.72 (1.07, 2.41)% | 30.80 (19.42, 40.54)% | 4.26 (0.51, 9.21)% | 23.13 (16.25, 30.14)% | 12.99 (2.71, 24.09)% | 27.09 (17.05, 40.25)% | 49,665.71 (48,785.67, 50,545.75) | 59,973.66 (58,857.58, 61,089.74) | ||||
| IM2 | 4.16 (2.22, 6.17)% | 24.29 (2.33, 48.33)% | 3.73 (0.03, 9.92)% | 7.83 (0.01, 22.19)% | 13.66 (3.14, 26.56)% | 14.11 (3.33, 36.75)% | 0.38 (0.00, 1.75)% | 24.22 (6.00, 40.01)% | 3.82 (0.00, 10.80)% | 3.80 (0.00, 14.40)% | 34,098.02 (33,515.27, 34,680.77) | 53,307.66 (52,302.53, 54,312.79) |
| LM0 | 65.28 (52.03, 85.61)% | 34.72 (14.39, 47.97)% | 30,127.48 (29,251.32, 31,003.64) | 32,319.99 (31,361.90, 33,278.08) | ||||||||
| LM1 | 3.10 (1.17, 5.79)% | 63.15 (47.25, 79.69)% | 33.75 (16.16, 50.53)% | 29,902.80 (29,032.46, 30,773.14) | 32,125.35 (31,172.61, 33,078.09) | |||||||
| IM1 | 7.71 (3.36, 13.88)% | 42.44 (18.04, 64.50)% | 11.38 (0.07, 30.44)% | 38.48 (28.64, 49.33)% | 15,312.56 (14,903.66, 15,721.46) | 25,411.63 (24,645.78, 26,177.48) | ||||||
| LM2 | 2.51 (0.95, 4.66)% | 28.30 (10.50, 44.30)% | 16.92 (7.78, 25.64)% | 25.84 (17.80, 33.92)% | 13.21 (3.44, 23.84)% | 13.22 (2.63, 25.75)% | 25,494.21 (24,771.62, 26,216.80) | 30,175.70 (29,285.47, 31,065.93) | ||||
| IM2 | 6.44 (2.84, 10.82)% | 30.43 (2.81, 53.74)% | 5.14 (0.04, 15.64)% | 6.09 (0.01, 20.57)% | 5.61 (0.17, 12.64)% | 3.93 (0.03, 9.87)% | 1.50 (0.00, 6.09)% | 36.29 (23.24, 54.61)% | 2.90 (0.00, 12.62)% | 1.67 (0.00, 8.40)% | 10,777.49 (10,510.50, 11,044.48) | 18,752.11 (18,208.68, 19,295.54) |
ADHD, attention-deficit hyperactivity disorder; PTSD, post-traumatic stress disorder.
The statistics are defined by the partition of the phenotype explained by the following. p2, geographic position, described by coordinates (latitude and longitude); h2, genetics; e2, the individually independent environment; f2, the environment shared by family members; c2, the environment shared by couples; s2, the environment shared by siblings; he2, the interaction between genetics and the individually independent environment; hf2, the interaction between genetics and the family-shared environment; hc2, the interaction between genetics and the couples-shared environment; hs2, the interaction between genetics and the siblings-shared environment. WAIC and PSIS-LOO-CV data are presented with 95% confidence intervals.
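The variance statistics above are reported with 95% highest posterior density (HPD) credible intervals. As a minimal sketch of how such an interval can be obtained from posterior draws, the function below finds the narrowest window of sorted samples that covers the requested mass. This is a common sample-based approach for unimodal posteriors, not necessarily the routine the authors used; the function name and the simulated draws are hypothetical.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < mass <= 1:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # draws the interval must cover
    best_low, best_high = ordered[0], ordered[window - 1]
    # Slide the window across the sorted draws and keep the narrowest one.
    for start in range(1, n - window + 1):
        low, high = ordered[start], ordered[start + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Simulated, skewed "posterior" draws of a proportion (illustration only).
random.seed(0)
draws = [random.betavariate(2, 8) for _ in range(5000)]
low, high = hpd_interval(draws)
print(f"95% HPD interval: ({low:.3f}, {high:.3f})")
```

For skewed posteriors, such as interaction shares that pile up near zero, the HPD interval can differ noticeably from an equal-tailed interval, which may explain why several lower bounds in the table sit at 0.00.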
The widely applicable information criterion (WAIC) and Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO-CV) reward goodness of fit but penalize more complex models. The lower the WAIC or PSIS-LOO-CV, the better the model.
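To make the trade-off between fit and complexity concrete, here is a minimal sketch of WAIC computed from pointwise log-likelihoods. It follows the commonly used definition (log pointwise predictive density minus a variance-based effective number of parameters, multiplied by -2 to put it on the deviance scale where lower is better). Whether the authors used exactly this variant is an assumption; the function names and the toy input are hypothetical.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def waic(log_lik):
    """WAIC on the deviance scale.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s; at least two draws are required.
    """
    n_obs = len(log_lik[0])
    lppd = 0.0    # goodness of fit
    p_waic = 0.0  # complexity penalty
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        lppd += log_mean_exp(column)
        p_waic += variance(column)  # sample variance across draws
    return -2.0 * (lppd - p_waic)


# Toy input: 3 posterior draws, 2 observations (made-up numbers).
toy = [[-0.7, -1.2], [-0.6, -1.5], [-0.8, -1.1]]
print(f"WAIC = {waic(toy):.3f}")
```

The penalty term grows when the log-likelihood of an observation varies a lot across posterior draws, which is typical of flexible models; that is how a richer model can fit better and still score worse.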
Figure 1. Model WAIC estimates and the mean estimates of heritability and environmental statistics
Bar plots show the posterior mean heritability estimates (h2, gray), variance explained by the geographic location (p2, violet), variances explained by shared environments (f2, familial; c2, couple-shared; s2, sibling-shared: yellow/orange-colored bars), and variances explained by gene-environment interactions (hf2, gene-familial; hc2, gene-couple-shared-environmental; hs2, gene-sibling-shared environmental: blue-colored bars) given in the five models for the ten most diagnosed neuropsychiatric diseases in our data. LM0, LM1, and LM2 are the three linear models that consider only the additive effects of genetics and shared environments as defined in Table 1. IM1 and IM2 are the two interaction models. Corresponding to their linear counterpart (LM1 and LM2), these two models consider the gene-environmental interactions as defined in Table 1. The five models form two forward selection traces: Linear model 0 (LM0) ⟶ Linear model 1 (LM1) ⟶ Interaction model 1 (IM1) and Linear model 2 (LM2) ⟶ Interaction model 2 (IM2). Within each trace, the later models encompass all the preceding models’ variables. The widely applicable information criterion (WAIC) rewards goodness of fit but penalizes more complex models. The lower the WAIC, the better the model. The scatterplot above each bar plot illustrates which model could be considered as the best one for each disease.
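The model choice described above amounts to picking the lowest WAIC among the five candidates. The short sketch below does this for the WAIC point estimates of the first five-row block of the estimates table (the disease label for that block is not preserved in this excerpt). It compares point estimates only and ignores the reported intervals, so it is a simplification; the helper name is hypothetical.

```python
# WAIC point estimates copied from the first five rows of the estimates table.
waic_first_block = {
    "LM0": 257729.31,
    "LM1": 240648.05,
    "IM1": 195306.87,
    "LM2": 269150.17,
    "IM2": 245390.66,
}


def best_model(scores):
    """Return the model label with the lowest score (lower WAIC is better)."""
    return min(scores, key=scores.get)


best = best_model(waic_first_block)
print(f"WAIC-best model for this block: {best}")

# Rank all candidates from best to worst for a quick overview.
for name in sorted(waic_first_block, key=waic_first_block.get):
    print(f"{name}: {waic_first_block[name]:,.2f}")
```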
Figure 2. Mean estimates of the geographic random effects under the best-fit model for each disease
These plots show the posterior mean estimates of the geographic random effects for each disease’s WAIC-best model (see Equation 8 in the models section of STAR Methods). We modeled the geographic random effects using a Gaussian process, assuming that adjacent geographic locations have similar random effects (assumption of smoothness). We included residents of Hawaii and Alaska in our estimation process, but their results are not shown here: the discontinuity between these geographic locations violates the Gaussian process’s assumptions, and the Gaussian process’s poor extrapolation power makes it difficult to estimate the random effects for outlying states such as Alaska and Hawaii. Our data do not record any residents of other non-continental US islands.
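The smoothness assumption can be illustrated with a covariance function that decays with distance. The sketch below uses a squared-exponential kernel over latitude and longitude, treating degrees as planar coordinates. Both choices are assumptions made for illustration; the kernel actually used is specified in the STAR Methods and is not reproduced in this excerpt. The function name, the default parameters, and the approximate city coordinates are illustrative only.

```python
import math


def sq_exp_cov(p, q, amplitude=1.0, length_scale=5.0):
    """Squared-exponential covariance between two (lat, lon) points.

    Assumption: plain Euclidean distance in degrees, which ignores the
    curvature of the Earth and is only meant as an illustration.
    """
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return amplitude ** 2 * math.exp(-d2 / (2.0 * length_scale ** 2))


# Approximate coordinates (latitude, longitude).
chicago = (41.88, -87.63)
milwaukee = (43.04, -87.91)
honolulu = (21.31, -157.86)

print(f"Chicago-Milwaukee: {sq_exp_cov(chicago, milwaukee):.3f}")
print(f"Chicago-Honolulu:  {sq_exp_cov(chicago, honolulu):.3e}")
```

Nearby locations share most of their random effect, while a far-away location is essentially uncorrelated with the mainland, so its estimate borrows almost no information from neighbors. That is one way to see why estimates for outlying states are hard to pin down.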
Figure 3. Posterior distribution of the log-odds (logit) change contributed by one’s sex according to each disease’s WAIC-best model
For each disease’s WAIC-best model, this figure delineates the posterior distribution of the regression coefficient estimate associated with dummy-coded sex (female = 0, male = 1). Because we used a logit link, the coefficient estimate represents the difference in the log-odds of diagnosis between males and females. High values in this figure indicate higher risk for males, and low values indicate higher risk for females, after controlling for other effects in the regression.
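A log-odds coefficient is easier to read once it is converted to an odds ratio or to a change in probability. The sketch below shows both conversions under a logit link. The coefficient and the baseline probability are made-up values, not estimates from this study, and the function names are hypothetical.

```python
import math


def odds_ratio(coef):
    """Odds ratio implied by a log-odds (logit) coefficient."""
    return math.exp(coef)


def shifted_probability(baseline_prob, coef):
    """Probability after adding `coef` to the baseline log-odds."""
    logit = math.log(baseline_prob / (1.0 - baseline_prob))
    return 1.0 / (1.0 + math.exp(-(logit + coef)))


# Hypothetical values: sex coefficient of 0.5, 10% baseline risk for females.
coef_male = 0.5
baseline_female = 0.10
print(f"odds ratio (male vs female): {odds_ratio(coef_male):.2f}")
print(f"implied male risk: {shifted_probability(baseline_female, coef_male):.3f}")
```

The same conversion applies to any coefficient on the logit scale; for a continuous covariate the odds ratio refers to a one-unit increase.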
Figure 4. Posterior distribution of the log-odds (logit) change contributed by patient’s numeric age according to each disease’s WAIC-best model
For each disease’s WAIC-best model, this figure delineates the posterior distribution of the regression coefficient estimate associated with numeric age. Because we used a logit link, the coefficient estimate represents the change in the log-odds of diagnosis for each additional year of age. High values in this figure indicate higher risk for older individuals, and low values indicate higher risk for younger individuals, after controlling for other effects in the regression.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| The data necessary to reproduce our analysis is publicly available at Mendeley Data. | Mendeley Data | |
| The source code used for this analysis is publicly available at Mendeley Data. | Mendeley Data | |