Anna Filippova, Connor Gilroy, Ridhi Kashyap, Antje Kirchner, Allison C Morgan, Kivan Polimis, Adaner Usmani, Tong Wang.
Abstract
Survey data sets are often wider than they are long. This high ratio of variables to observations raises concerns about overfitting during prediction, making informed variable selection important. Recent applications in computer science have sought to incorporate human knowledge into machine-learning methods to address these problems. The authors implement such a "human-in-the-loop" approach in the Fragile Families Challenge. The authors use surveys to elicit knowledge from experts and laypeople about the importance of different variables to different outcomes. This strategy offers the option to subset the data before prediction or to incorporate human knowledge as scores in prediction models, or both together. The authors find that human intervention is not obviously helpful. Human-informed subsetting reduces predictive performance, and considered alone, approaches incorporating scores perform marginally worse than approaches that do not. However, incorporating human knowledge may still improve predictive performance, and future research should consider new ways of doing so.
Keywords: Fragile Families Challenge; machine learning; missing data; prediction; surveys
Year: 2019 PMID: 33981842 PMCID: PMC8112737 DOI: 10.1177/2378023118820157
Source DB: PubMed Journal: Socius ISSN: 2378-0231
Figure 1. Missing data.
Note: The Fragile Families and Child Wellbeing Study data set includes 12,943 variables for 4,232 observations. Within this data set, 74 percent of our observations are missing completely at random, missing at random, or missing not at random.
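As a rough illustration of the quantity described in this note, the sketch below computes the share of missing cells in a small table. The function name `missing_share`, the use of `None` as the missing marker, and the toy values are assumptions made for illustration. Only the two dimensions at the bottom come from the note.

```python
from typing import Optional, Sequence


def missing_share(rows: Sequence[Sequence[Optional[float]]]) -> float:
    """Return the fraction of cells that are None (treated here as missing)."""
    total = sum(len(row) for row in rows)
    if total == 0:
        raise ValueError("empty data set")
    missing = sum(1 for row in rows for cell in row if cell is None)
    return missing / total


# Toy data (hypothetical values, not from the study): 3 observations, 4 variables.
toy = [
    [1.0, None, None, 2.0],
    [None, None, 3.0, None],
    [None, 4.0, None, None],
]
toy_share = missing_share(toy)

# Dimensions reported in the note: variables (columns) and observations (rows).
N_VARIABLES = 12_943
N_OBSERVATIONS = 4_232
total_cells = N_VARIABLES * N_OBSERVATIONS

print(f"toy missing share: {toy_share:.2%}")
print(f"cells in a {N_OBSERVATIONS} x {N_VARIABLES} table: {total_cells:,}")
```

The sketch does not distinguish between the three missingness mechanisms named in the note. It only counts empty cells.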
Figure 2. MSEs from approaches relevant to human-in-the-loop rankings.
Note: The mean squared errors (MSEs) from the approaches when evaluating which “human-in-the-loop” strategy performed best are shown. As explained in “Results,” because the permutation space is not filled, this comparison must be conducted on a restricted set of seven approaches (i.e., each “human-in-the-loop” approach fit to a data set imputed by linear regression accounting for variable types). Here we plot the scores from that restricted set by outcome.
Figure 3. Permutation space of possible and relevant approaches.
Note: The choice of five different imputation strategies, three different subsetting strategies, and three different score incorporation strategies yields a permutation space of 45 different approaches. Computational resources were available to explore only 25. Moreover, as explained in “Results,” the fact that not all 45 approaches were explored made it necessary to restrict to a subset of the explored approaches when comparing choices made in one or more dimensions. This figure shows the relevant set to which the analysis was restricted when addressing one of the four questions asked in this article. Those questions are given in the facet titles on the right-hand side of the figure.
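The 5 × 3 × 3 space described in this note can be enumerated directly. The sketch below is illustrative only: the strategy labels and the set of explored combinations are read off the rows of the Holdout Scores table, and the variable names are made up for this example.

```python
from itertools import product

# Labels as they appear in the Holdout Scores table.
IMPUTATION = ["MI", "LASSO", "LM-untyped", "LM", "Mean"]
SUBSETTING = ["No subsetting", "Wiki surveyed", "Constructed"]
SCORES = ["Experts", "MTurk users", "No scores"]

# Every combination of one choice per dimension: 5 * 3 * 3 = 45.
approaches = list(product(IMPUTATION, SUBSETTING, SCORES))

# (subsetting, scores) pairs that recur across imputation strategies.
NO_SUBSET_ALL = [("No subsetting", s) for s in SCORES]
WIKI_ALL = [("Wiki surveyed", s) for s in SCORES]
CONSTRUCTED = [("Constructed", "No scores")]

# Combinations with a row in the Holdout Scores table.
EXPLORED_BY_IMPUTATION = {
    "MI": WIKI_ALL + CONSTRUCTED,
    "LASSO": [("No subsetting", "No scores")] + WIKI_ALL + CONSTRUCTED,
    "LM-untyped": WIKI_ALL,
    "LM": NO_SUBSET_ALL + WIKI_ALL + CONSTRUCTED,
    "Mean": NO_SUBSET_ALL + WIKI_ALL,
}
explored = {
    (imputation, subsetting, scores)
    for imputation, pairs in EXPLORED_BY_IMPUTATION.items()
    for subsetting, scores in pairs
}
unexplored = [a for a in approaches if a not in explored]

print(len(approaches), len(explored), len(unexplored))
```

Because the explored set is not a full grid, a comparison along one dimension is only fair within a slice where the other dimensions are held fixed, which is the restriction the note describes.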
Figure 4. Rankings by lowest average and median MSE.
Note: The figure illustrates the lessons the results yield when researchers are confronted with any one of four questions: how to impute, whether to subset, whether to score, and whether to involve humans in the loop. Approaches are ranked by both average and median mean squared error (MSE), across outcomes. See “Results” for a complete discussion.
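A minimal sketch of ranking by average and by median MSE across outcomes follows. It uses three rows from the Holdout Scores table (LM imputation, no scores, each subsetting choice). It assumes that raw MSEs are averaged across the six outcomes, and the excerpt does not confirm that this is exactly how the figure was built. The dictionary and function names are illustrative.

```python
from statistics import mean, median

# MSEs in table column order:
# Eviction, GPA, Grit, Job Training, Layoff, Material Hardship.
HOLDOUT_MSE = {
    ("LM", "No subsetting", "No scores"):
        [0.05454, 0.35537, 0.24468, 0.17797, 0.16560, 0.01999],
    ("LM", "Wiki surveyed", "No scores"):
        [0.05492, 0.36387, 0.25062, 0.18509, 0.16462, 0.02267],
    ("LM", "Constructed", "No scores"):
        [0.05546, 0.37797, 0.24939, 0.18195, 0.16255, 0.02282],
}


def rank_by(summary, scores):
    """Order approaches from lowest to highest summarized MSE."""
    return sorted(scores, key=lambda approach: summary(scores[approach]))


rank_by_mean = rank_by(mean, HOLDOUT_MSE)
rank_by_median = rank_by(median, HOLDOUT_MSE)

for label, ranking in (("mean", rank_by_mean), ("median", rank_by_median)):
    print(label)
    for position, approach in enumerate(ranking, start=1):
        print(f"  {position}. {' / '.join(approach)}")
```

With these three rows, both criteria put the unsubsetted approach first but order the two subsetted approaches differently, which is one reason to report both summaries. Averaging raw MSEs also lets outcomes with larger errors, such as GPA, dominate the mean.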
Figure 5. Average percentage reduction in MSE.
Note: Figure 4 gives the relevant rankings, but it does not convey how much these choices matter. Here, the average percentage reduction in mean squared error (MSE) is plotted relative to baseline, across outcomes, for the same four questions that surface in the main article.
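The percentage reduction in MSE relative to a baseline can be sketched as 100 × (baseline − MSE) / baseline, averaged over outcomes. The excerpt does not say which approach serves as the baseline, so the choice below (LM imputation, no subsetting, no scores) is an assumption for illustration. The two rows of values come from the Holdout Scores table.

```python
from statistics import mean

OUTCOMES = ["Eviction", "GPA", "Grit", "Job Training", "Layoff", "Material Hardship"]

# Assumed baseline for this example: LM / No subsetting / No scores.
BASELINE = [0.05454, 0.35537, 0.24468, 0.17797, 0.16560, 0.01999]
# Comparison: LM / Wiki surveyed / No scores.
CANDIDATE = [0.05492, 0.36387, 0.25062, 0.18509, 0.16462, 0.02267]


def pct_reduction(baseline_mse: float, mse: float) -> float:
    """Percentage reduction in MSE relative to baseline (negative means worse)."""
    return 100.0 * (baseline_mse - mse) / baseline_mse


reductions = {
    outcome: pct_reduction(base, cand)
    for outcome, base, cand in zip(OUTCOMES, BASELINE, CANDIDATE)
}
average_reduction = mean(reductions.values())

for outcome, value in reductions.items():
    print(f"{outcome:18s} {value:+.2f}%")
print(f"{'Average':18s} {average_reduction:+.2f}%")
```

A negative value means the candidate approach has a higher MSE than the baseline for that outcome. Because each outcome is scaled by its own baseline, small absolute differences on low-MSE outcomes such as material hardship can produce large percentages.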
Predictors and Resulting Variables.
| Predictor | Variables |
|---|---|
| Number of times father has missed work | {f2b30a, f3b22} |
| Child’s IQ | {hv5_ppvtpr} |
| Foreign-born mother | {m3h1b, m3h1a} |
| Parents are in cohabiting partnership | {cf2cohm} |
| Private school | {p5l1a, p5l23} |
| Parents’ substance abuse | {cm3drug_case, cf3alc_case, cm3alc_case, cf3drug_case, m4j21, m5g20, f5g20, f4j21} |
| Number of books at home | {f5k14e, m4b27, f4b27, m5k14e} |
| Existing number of siblings | {cm3kids, cf4kids, cf3kids, cm1kids, cm4kids, cm5kids, cf5kids, cf2kids, cf1kids, cm2kids} |
| Father’s interest in sports or entertainment | {f5k14b} |
| Home schooling | {p5l1a} |
| Mother’s mental health | {f4c38, f5b31x} |
| Child’s perseverance | {k5g1e, k5g1d, k5g1a, k5g1b} |
| Father’s nonstandard work hours | {f4k16a, f5k17l, f3k17a, f5i16a, f2k18a} |
| Child’s birth weight | {cm1lbw} |
| Families on block known well | {m4i0, m4i0l, f4i0l, p5m1, m4i0n2, m4i0n3} |
| Child’s physical disability | {hv3a2} |
| Number of child’s emergency room visits | {p5h10, f2b8, m2b8, hv3a9, hv4a14} |
| Domestic violence | {m4a8b_7, f3d7n1, m3d7n1, f5f26b2_10, f4a8b_7, m3a8b_7, m3d7p, f5f28a_10, m5f28a_10, m3d7m, m3d7n, m3d7o, m3e23q, m3d9p1, m3e23o, m3d9p, f3d9i, f3d9h, f3d9m, f3d9o, f3d9n, f3d9p, m5b30a1_10, f3a8b_7, m3d9n, m3d9o, m3d9m, m3e23p, m3d9i, m3d9n1, m5f26b2_10, f3d9n1, f3d7o, f3d7n, f3d7m, m3d7p1, m3e23p1, f5b26x_10} |
| Household income | {cf1hhinc, cm5hhinc, cm2hhinc, cf5hhinc, cf3hhinc, cm1hhinc, cf4hhincb, cf2hhinc, cf5hhincb, cf3hhincb, cf4hhinc, cf2hhincb, cm4hhinc, cm3hhinc} |
| Child’s gender | {cm1bsex, hv4sex_child} |
| Mother’s employment | {m2k12, m2k8, m3k15, m4k15} |
| Income-to-poverty ratio | {cf3povco, cf1inpov, cf4povcob, cm3povco, cf3p… |
| Mother’s education | {cm5edu} |
| Teacher quality | {t5g7} |
| Child’s participation in sports | {p5i1b, m5k14b} |
| Grandparents are present in household | {cm3gmom, cm3gdad} |
| Number of parental romantic relationships | {m3a13, f3a13, m5a10, f5a101, f5a10, m5a101} |
| Child’s exposure to someone smoking | {hv4a24, p5q3cr, p5h15c} |
| Mother’s incarceration | {m3i29} |
| Child’s race | {m1h3, f1h3} |
| Mother’s substance abuse | {cm3drug_case, cm3alc_case} |
| Mother’s age at childbirth | {cm2fbir, cm1age} |
| Child’s participation in chores | {f3b32d, f3b4d, f3c3d, f3e18d, f5k14a, m3b32d, m3b4d, m3c3d, m3e18d, m5k14a, p5i1a, p5i31a, p5i40a} |
| Teacher says child works independently | {t5b2c} |
| Domestic abuse in family | {m4a8b_7, f3d7n1, m3d7n1, f5f26b2_10, f4a8b_7, m3a8b_7, m3d7p, f5f28a_10, m5f28a_10, m3d7m, m3d7n, m3d7o, m3e23q, m3d9p1, m3e23o, m3d9p, f3d9i, f3d9h, f3d9m, f3d9o, f3d9n, f3d9p, m5b30a1_10, f3a8b_7, m3d9n, m3d9o, m3d9m, m3e23p, m3d9i, m3d9n1, m5f26b2_10, f3d9n1, f3d7o, f3d7n, f3d7m, m3d7p1, m3e23p1, f5b26x_10} |
| Divorce or separation | {m2a8c} |
| Father’s age at childbirth | {cf1age} |
| Father’s education | {cf5edu} |
| Mother has chronic illness | {m5g2a_107} |
| Parent’s mental health | {m5g2a_101} |
| Amount of parental involvement in school | {m4i0d, f4i0d, m4i0} |
| Mother’s nonstandard work hours | {m3k16a, m2k13a, m5k17l, m4k16a, m5i16a} |
| Parent’s chronic illness | {f5a3a1_9, m5g2a_107} |
| Parent impulsivity | {p5q3an} |
| Number of books in the home | {f5k14e, m4b27, f4b27, m5k14e} |
| School quality | {t5g4_104, m4i0, m4i0b, f5k5d, p5l1a, f5k5e, p5l13f, t5c7f} |
| Foreign-born father | {f3h1a, f3h1b} |
| Father’s sense of familial responsibility | {p5i37, p5i32b, f5k15, m2e4b, n5c3f, m2c3b, f2b17a, f2b17b, f2b17c, f2b17d, f2b17e, f2b17f, f2b17g, f2b17h, p5i32a, p5i32c, n5c3e} |
| Availability of extended family | {cf4gdad, cf1gdad, cm1gmom, cm2gdad, cm5gmom, cm4gmom, cf5gmom, cf3gmom, cf3gdad, cm5gdad, cf5gdad, cm4gdad, cm3gmom, cf1gmom, cm1gdad, cm3gdad, cf4gmom, cm2gmom, cf2gmom, cf2gdad} |
| Father’s incarceration | {f3i29} |
| Mother’s prenatal smoking | {m1g4} |
| Family on welfare | {m3i8c1, f3i8c1, m5f8c1, m4i8c1, m5f8b1, m3i8b1, m4i8b1, m2h9c1, m2h9b1, f5f8c1, f1k2a, m1j2b, f2h8c1, f4i8c1} |
| Child’s learning disability | {kind_a13} |
| Child makes friends easily | {t5b1h} |
| Mother’s multiple job holding | {m3k18, m3k17, m2k14a} |
| Father’s substance abuse | {cf3alc_case, cf3drug_case} |
| Parents have savings account | {m5j6h} |
| Father’s multiple job holding | {f2k19a, f3k18} |
| Father absent at time of birth | {m1a6} |
| Household size | {cf5adult, cm4adult, cm3kids, cm5kids, cf1adul… |
| Parent’s religion | {f3r1, m3r1} |
| Neighborhood crime | {t5f4a} |
| Father’s unemployment | {m2c33, m3c41} |
| Childcare center enrollment | {m4b13} |
| Never-married mother | {cm5relf} |
| Child’s health | {f5a3i_10, f5a6g02_10, m5a3a1_10, f5a3a1_10, m5a6g01_10, m5a6g03_10, f5a6g03_10, m5a3i_10, m5a6g02_10, f5a6g01_10} |
| Grandparents in the household | {cm3gmom, cm3gdad} |
| Multigenerational household | {cf4gdad, cm4gdad, cf4gmom, cm4gmom} |
Ideas with which each survey was seeded.
Participation in Wiki Survey by Outcome.
| Outcome | Experts: Votes | Experts: Voters | MTurk Users: Votes | MTurk Users: Voters |
|---|---|---|---|---|
| Grade point average | 530 | 24 | 5,130 | 137 |
| Grit | 299 | 9 | 4,419 | 110 |
| Material Hardship | 777 | 31 | 5,533 | 127 |
| Eviction | 980 | 32 | 3,741 | 113 |
| Layoff | 32 | 4 | 3,964 | 115 |
| Job Training | 33 | 4 | 4,434 | 129 |
Holdout Scores.
| Imputation | Subsetting | Scores | Eviction | GPA | Grit | Job Training | Layoff | Material Hardship |
|---|---|---|---|---|---|---|---|---|
| MI | Wiki surveyed | Experts | 0.05443 | 0.36480 | 0.25285 | 0.18084 | 0.16320 | 0.02268 |
| MI | Wiki surveyed | MTurk users | 0.05471 | 0.37015 | 0.24998 | 0.18114 | 0.16334 | 0.02265 |
| MI | Wiki surveyed | No scores | 0.05465 | 0.36588 | 0.25013 | 0.18206 | 0.16360 | 0.02256 |
| MI | Constructed | No scores | 0.05533 | 0.37728 | 0.24839 | 0.18237 | 0.16265 | 0.02265 |
| LASSO | No subsetting | No scores | 0.05491 | 0.35455 | 0.24754 | 0.17909 | 0.16611 | 0.02056 |
| LASSO | Wiki surveyed | Experts | 0.05497 | 0.37064 | 0.25509 | 0.18220 | 0.16399 | 0.02309 |
| LASSO | Wiki surveyed | MTurk users | 0.05546 | 0.37512 | 0.25064 | 0.18209 | 0.16574 | 0.02318 |
| LASSO | Wiki surveyed | No scores | 0.05539 | 0.36750 | 0.25139 | 0.18318 | 0.16570 | 0.02300 |
| LASSO | Constructed | No scores | 0.05546 | 0.37793 | 0.24976 | 0.18367 | 0.16467 | 0.02320 |
| LM-untyped | Wiki surveyed | Experts | 0.05507 | 0.37089 | 0.25292 | 0.18246 | 0.16558 | 0.02329 |
| LM-untyped | Wiki surveyed | MTurk users | 0.05546 | 0.37540 | 0.25041 | 0.18278 | 0.16704 | 0.02331 |
| LM-untyped | Wiki surveyed | No scores | 0.05537 | 0.36781 | 0.25171 | 0.18377 | 0.16573 | 0.02310 |
| LM | No subsetting | Experts | 0.05454 | 0.35425 | 0.24777 | 0.17810 | 0.16379 | 0.02012 |
| LM | No subsetting | MTurk users | 0.05462 | 0.35444 | 0.24664 | 0.17962 | 0.16431 | 0.02000 |
| LM | No subsetting | No scores | 0.05454 | 0.35537 | 0.24468 | 0.17797 | 0.16560 | 0.01999 |
| LM | Wiki surveyed | Experts | 0.05481 | 0.36279 | 0.25310 | 0.18355 | 0.16457 | 0.02263 |
| LM | Wiki surveyed | MTurk users | 0.05464 | 0.36439 | 0.25078 | 0.18306 | 0.16434 | 0.02264 |
| LM | Wiki surveyed | No scores | 0.05492 | 0.36387 | 0.25062 | 0.18509 | 0.16462 | 0.02267 |
| LM | Constructed | No scores | 0.05546 | 0.37797 | 0.24939 | 0.18195 | 0.16255 | 0.02282 |
| Mean | No subsetting | Experts | 0.05446 | 0.35449 | 0.24913 | 0.17771 | 0.16375 | 0.02000 |
| Mean | No subsetting | MTurk users | 0.05448 | 0.35258 | 0.24629 | 0.17909 | 0.16427 | 0.01993 |
| Mean | No subsetting | No scores | 0.05424 | 0.35182 | 0.24566 | 0.17704 | 0.16473 | 0.01985 |
| Mean | Wiki surveyed | Experts | 0.05493 | 0.36457 | 0.25324 | 0.18290 | 0.16426 | 0.02267 |
| Mean | Wiki surveyed | MTurk users | 0.05477 | 0.36477 | 0.25064 | 0.18311 | 0.16444 | 0.02268 |
| Mean | Wiki surveyed | No scores | 0.05498 | 0.36353 | 0.25033 | 0.18371 | 0.16481 | 0.02269 |
Note: Each row displays mean squared errors from 1 of the 25 different strategies we used. The first column describes the imputation strategy used. The second describes whether we subsetted the data and, if so, to which set. The third describes which set of scores was used, if any. GPA = grade point average; LASSO = least absolute shrinkage and selection operator; LM = linear model; MI = multiple imputation.
Original Holdout Scores.
| Imputation | Subsetting | Scores | Eviction | GPA | Grit | Job Training | Layoff | Material Hardship |
|---|---|---|---|---|---|---|---|---|
| MI | Wiki surveyed | Experts | 0.0545 | 0.3642 | 0.2522 | 0.1817 | 0.1637 | 0.0226 |
| MI | Wiki surveyed | MTurk users | 0.0547 | 0.3700 | 0.2498 | 0.1813 | 0.1643 | 0.0226 |
| MI | Wiki surveyed | No scores | 0.0547 | 0.3645 | 0.2506 | 0.1827 | 0.1644 | 0.0225 |
| MI | Constructed | No scores | 0.0548 | 0.3761 | 0.2480 | 0.1826 | 0.1627 | 0.0227 |
| LASSO | No subsetting | No scores | 0.0543 | 0.3587 | 0.2453 | 0.1795 | 0.1672 | 0.0203 |
| LASSO | Wiki surveyed | Experts | 0.0555 | 0.3779 | 0.2533 | 0.1825 | 0.1672 | 0.0249 |
| LASSO | Wiki surveyed | MTurk users | 0.0555 | 0.3729 | 0.2537 | 0.1822 | 0.1670 | 0.0248 |
| LASSO | Wiki surveyed | No scores | 0.0554 | 0.3721 | 0.2517 | 0.1832 | 0.1668 | 0.0241 |
| LASSO | Constructed | No scores | 0.0555 | 0.3803 | 0.2516 | 0.1840 | 0.1662 | 0.0236 |
| LM-untyped | Wiki surveyed | Experts | 0.0550 | 0.3734 | 0.2533 | 0.1822 | 0.1642 | 0.0231 |
| LM-untyped | Wiki surveyed | MTurk users | 0.0555 | 0.3804 | 0.2505 | 0.1820 | 0.1656 | 0.0232 |
| LM-untyped | Wiki surveyed | No scores | 0.0554 | 0.3718 | 0.2504 | 0.1834 | 0.1658 | 0.0230 |
| LM | No subsetting | Experts | | | | | | |
| LM | No subsetting | MTurk users | | | | | | |
| LM | No subsetting | No scores | 0.0545 | 0.3551 | 0.2445 | 0.1779 | 0.1654 | 0.0200 |
| LM | Wiki surveyed | Experts | 0.0549 | 0.3633 | 0.2531 | 0.1824 | 0.1649 | 0.0226 |
| LM | Wiki surveyed | MTurk users | 0.0547 | 0.3644 | 0.2509 | 0.1831 | 0.1643 | 0.0226 |
| LM | Wiki surveyed | No scores | 0.0549 | 0.3639 | 0.2506 | 0.1851 | 0.1646 | 0.0226 |
| LM | Constructed | No scores | 0.0555 | 0.3764 | 0.2495 | 0.1818 | 0.1635 | 0.0228 |
| Mean | No subsetting | Experts | | | | | | |
| Mean | No subsetting | MTurk users | | | | | | |
| Mean | No subsetting | No scores | 0.0548 | 0.3521 | 0.2471 | 0.1773 | 0.1648 | 0.0199 |
| Mean | Wiki surveyed | Experts | | | | | | |
| Mean | Wiki surveyed | MTurk users | 0.0548 | 0.3708 | 0.2509 | 0.1838 | 0.1646 | 0.0228 |
| Mean | Wiki surveyed | No scores | 0.0550 | 0.3658 | 0.2507 | 0.1846 | 0.1650 | 0.0227 |
Note: Each row displays mean squared errors (MSEs) from our original submission to the Fragile Families Challenge. To explore an additional four strategies (which are blank here), we obtained new holdout MSEs after the original challenge had closed. These are shown in Table D1. We include these original MSEs as a reference for the interested reader. GPA = grade point average; LASSO = least absolute shrinkage and selection operator; LM = linear model; MI = multiple imputation.