Alex James, Jeanette McLeod, Shaun Hendy, Kip Marks, Delia Rusu, Syen Nik, Michael J Plank.
Abstract
Preventing child abuse is a unifying goal. Making decisions that affect the lives of children is an unenviable task assigned to social services in countries around the world. The consequences of incorrectly labelling children as being at risk of abuse or missing signs that children are unsafe are well-documented. Evidence-based decision-making tools are increasingly common in social services provision but few, if any, have used social network data. We analyse a child protection services dataset that includes a network of approximately 5 million social relationships collected by social workers between 1996 and 2016 in New Zealand. We test the potential of information about family networks to improve accuracy of models used to predict the risk of child maltreatment. We simulate integration of the dataset with birth records to construct more complete family network information by including information that would be available earlier if these databases were integrated. Including family network data can improve the performance of models relative to using individual demographic data alone. The best models are those that contain the integrated birth records rather than just the recorded data. Having access to this information at the time a child's case is first notified to child protection services leads to a particularly marked improvement. Our results quantify the importance of a child's family network and show that a better understanding of risk can be achieved by linking other commonly available datasets with child protection records to provide the most up-to-date information possible.
Year: 2019 PMID: 31661513 PMCID: PMC6818793 DOI: 10.1371/journal.pone.0224554
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. A child’s family network growing over time.
Panels (a)-(c) show networks constructed using recorded relationships and panels (d)-(f) show networks constructed using whole-life-relationships. The networks are shown at: (a,d) the time of the child’s first notification; (b,e) the time of the child’s second notification; (c,f) the end of the data collection period. Individuals who are part of the child’s family network at each time point are shown in dark colours; individuals who exist but are not part of the network are shown in light colours; individuals who are not yet born are not shown. The values of the network predictor variables (see Table 1 for definitions) are provided in each case. This is an illustration only and does not correspond to a real child.
Table 1. Summary of predictor variables used in the statistical classification models.
Event variables are the numbers of various types of event that occurred for the focal child prior to the notification. Recorded relationship variables are based on relationships recorded prior to the notification (see Fig 1A–1C); whole-life relationship variables are based on close family relationships, whether they were recorded before or after the notification (see Fig 1D–1F). A target event is defined to be a serious intervention by child protection services or a substantiated finding of maltreatment. Abbreviations are included for reference for Table 2 and Figs 1 and 2.
| Variable type | Predictor variables | Abbreviation | Details |
|---|---|---|---|
| Age | Age at time of notification | Age | |
| Event-based | # prior notifications | nNotes | Not included in the first notifications model |
| | # prior target events | nTargets | |
| | # notifications in the last 2 years | nNotesRecent | |
| | # target events in the last 2 years | nTargetsRecent | |
| Recorded network | # individuals in network | nNeighsR | Network constructed using relationships recorded prior to the notification |
| | # individuals in network with a prior notification | nWithNotesR | |
| | # individuals in network with a prior target event | nWithTargetR | |
| | # known abusers in network | nAbusersR | |
| Whole-life network | # individuals in network with a prior notification | nWithNotesW | Network constructed using whole-life relationships (parent, child, sibling, half-sibling relationships only, backdated to the DOB of the younger of the two individuals in the relationship) |
| | # individuals in network with a prior target event | nWithTargetW | |
| | # known abusers in network | nAbusersW | |
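To illustrate how the recorded and whole-life network definitions in Table 1 can give different predictor values for the same child, here is a minimal Python sketch. The `Relationship` record, its field names, the toy data, and the choices to count only direct links and to treat "prior" as strictly before the notification date are all assumptions made for illustration; none of them is taken from the study's dataset or code.

```python
from dataclasses import dataclass
from datetime import date

# Relationship kinds counted as close family in the whole-life network (per Table 1).
CLOSE_FAMILY = {"parent", "child", "sibling", "half-sibling"}

@dataclass(frozen=True)
class Relationship:          # hypothetical record layout
    a: str                   # person id
    b: str                   # person id
    kind: str                # e.g. "parent", "sibling", "caregiver"
    recorded_on: date        # date the relationship was recorded
    backdated_to: date       # DOB of the younger of the two individuals

def neighbours(child, rels, at, whole_life):
    """Ids directly linked to `child` at time `at` under the chosen definition."""
    out = set()
    for r in rels:
        if child not in (r.a, r.b):
            continue
        if whole_life:
            # Close family only, active from the backdated date onwards.
            active = r.kind in CLOSE_FAMILY and r.backdated_to <= at
        else:
            # Any relationship, but only once it has been recorded.
            active = r.recorded_on <= at
        if active:
            out.add(r.b if r.a == child else r.a)
    return out

def count_with_prior_event(people, event_dates, at):
    """Number of people with at least one event strictly before `at`."""
    return sum(any(d < at for d in event_dates.get(p, [])) for p in people)

# Invented toy data for a single focal child.
rels = [
    Relationship("child", "mum", "parent", date(2010, 6, 1), date(2008, 3, 1)),
    Relationship("child", "sib", "sibling", date(2012, 1, 1), date(2008, 3, 1)),
    Relationship("child", "carer", "caregiver", date(2009, 1, 1), date(2008, 3, 1)),
]
notifications = {"sib": [date(2007, 5, 1)], "carer": [date(2011, 1, 1)]}
first_notification = date(2009, 6, 1)

recorded = neighbours("child", rels, first_notification, whole_life=False)
whole = neighbours("child", rels, first_notification, whole_life=True)
n_with_notes_r = count_with_prior_event(recorded, notifications, first_notification)
n_with_notes_w = count_with_prior_event(whole, notifications, first_notification)
```

In this toy case the sibling's earlier notification is visible only through the whole-life network, because the sibling relationship was recorded after the focal child's first notification.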
Table 2. Models using whole-life relationships are always better than those without whole-life relationships.
Results for selected logistic regression models for estimated concern. Columns show the dataset used, the predictor variables included (see Table 1 for definitions), the number of predictor variables and the ROC score. For each dataset, results are shown for the best overall model (i.e. highest ROC score), the best four-variable model, and the best model that does not have any whole-life network predictor variables. Note that the model using events-based predictor variables only was not applied to the first notifications subsample as these predictor variables are all equal to zero (see Methods).
| Sample | Predictor variables | Number of predictors | ROC score | Comments |
|---|---|---|---|---|
| All notifications | All | 12 | 0.675 | Best model |
| | nWithContactW | 4 | 0.672 | Best four variable model |
| | nWithNoteR | 4 | 0.650 | Best four variable model using events and recorded relationship predictors only |
| | nTargets | 4 | 0.639 | Best four variable model using events predictors only |
| First notifications | All | 8 | 0.636 | Best model |
| | nWithNoteW | 4 | 0.635 | Best four variable model |
| | nNeighsR | 4 | 0.593 | Best four variable model using recorded relationship predictors only |
| Subsequent notifications | All | 12 | 0.657 | Best model |
| | nWithTargetW | 4 | 0.653 | Best four variable model |
| | nWithTargetR | 4 | 0.636 | Best four variable model using events and recorded relationship predictors only |
| | nTargets | 4 | 0.628 | Best four variable model using events predictors only |
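As a reading aid for the ROC scores above, the following Python sketch computes the area under the ROC curve for a set of predicted risks. It assumes that "ROC score" means the area under the curve (AUC), which the text does not state explicitly; the function name `roc_auc` and the toy labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """Probability that a random positive case is scored above a random
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: 1 = high estimated concern; scores = model-predicted risk.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.40]
auc = roc_auc(labels, scores)
```

Under this reading, a score of 0.5 corresponds to ranking cases no better than chance and 1.0 to a perfect ranking, so the values in the table lie between those two reference points.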
Fig 2. Whole-life relationships are consistently better than recorded relationships for identifying high-risk groups.
Each panel shows the incidence rate of high estimated concern when children are split into two groups according to a threshold value (determined by a single-split classification tree) of a single network-based predictor variable at the time of their first notification. Above each graph is shown the predictor variable being used (see Table 1 for definitions), the threshold value used to group the children, and the proportion of children in the above-threshold (high-risk) group. The difference in the incidence rate between the high-risk and low-risk groups is always greater using whole-life relationships (red) than using recorded relationships (blue).
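The grouping described in this caption can be sketched as a one-level decision tree (a "stump") on a single predictor, followed by a comparison of incidence rates in the two groups. The sketch below assumes the Gini impurity as the split criterion, which the caption does not specify, and uses invented toy data; the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(x, y):
    """Threshold t minimising the weighted Gini impurity of the groups
    x < t and x >= t. Returns None if x has only one distinct value."""
    n = len(x)
    best_t, best_imp = None, float("inf")
    for t in sorted(set(x))[1:]:   # every distinct value except the smallest
        low = [yi for xi, yi in zip(x, y) if xi < t]
        high = [yi for xi, yi in zip(x, y) if xi >= t]
        imp = (len(low) * gini(low) + len(high) * gini(high)) / n
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t

def incidence(x, y, t):
    """Outcome rate below t, outcome rate at or above t, and the share of
    cases in the at-or-above (high-risk) group."""
    low = [yi for xi, yi in zip(x, y) if xi < t]
    high = [yi for xi, yi in zip(x, y) if xi >= t]
    return sum(low) / len(low), sum(high) / len(high), len(high) / len(x)

# Invented toy data: x = a network count at first notification,
# y = 1 if the child has high estimated concern.
x = [0, 0, 0, 1, 1, 2, 3, 3]
y = [0, 0, 1, 0, 0, 1, 1, 1]
t = best_split(x, y)
low_rate, high_rate, high_share = incidence(x, y, t)
```

Running the same procedure once with a recorded-network predictor and once with its whole-life counterpart would give the pair of incidence-rate gaps that each panel compares.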