| Literature DB >> 32251439 |
Ashley E Tate, Ryan C McCabe, Henrik Larsson, Sebastian Lundström, Paul Lichtenstein, Ralf Kuja-Halkola.
Abstract
BACKGROUND: Predicting which children will go on to develop mental health symptoms as adolescents is critical for early intervention and preventing future, severe negative outcomes. Although many aspects of a child's life, personality, and symptoms have been flagged as indicators, there is currently no model created to screen the general population for the risk of developing mental health problems. Additionally, the advent of machine learning techniques represents an exciting way to potentially improve upon the standard prediction modelling technique, logistic regression. Therefore, we aimed to I.) develop a model that can predict mental health problems in mid-adolescence II.) investigate if machine learning techniques (random forest, support vector machines, neural network, and XGBoost) will outperform logistic regression.Entities:
Mesh:
Year: 2020 PMID: 32251439 PMCID: PMC7135284 DOI: 10.1371/journal.pone.0230389
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Descriptive information from the partitioned data.
| Data set | N | Birth year, mean (SD) | Sex, male % | SDQ cutoff reached % |
|---|---|---|---|---|
| Train set | 4554 | 1996.5 (1.69) | 48.4% | 12.1% |
| Tune set | 804 | 1996.3 (1.68) | 49.6% | 12.3% |
| Test set | 2280 | 1996.5 (1.68) | 48.1% | 11.5% |
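A minimal sketch of how a three-way split of roughly these proportions (about 60%/10%/30% of the 7,638 children) might be produced in R. The data frame name, outcome name, and proportions are illustrative assumptions, not taken from the paper's code.

```r
# Illustrative train/tune/test split; 'dat' is an assumed data frame with one
# row per child and the binary outcome 'sdq_cutoff' (0/1).
set.seed(2020)
n <- nrow(dat)
idx <- sample(seq_len(n))                 # shuffle row indices
n_train <- floor(0.60 * n)
n_tune  <- floor(0.10 * n)

train <- dat[idx[1:n_train], ]
tune  <- dat[idx[(n_train + 1):(n_train + n_tune)], ]
test  <- dat[idx[(n_train + n_tune + 1):n], ]
```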
Information on techniques (an illustrative R sketch of each fit follows the table).
| Technique | R package used | Description |
|---|---|---|
| Random Forest | randomForest | Decision trees are a model type that groups data into a tree-like structure based on if-then-else decisions. At each decision point (node), the data are split into smaller subgroups based on one of the predictor variables. Random forest aggregates the results of many decision trees, and the prediction is determined by the majority decision across trees. |
| XGBoost | xgboost | XGBoost, or extreme gradient boosting, applies gradient boosting to an ensemble of decision trees. Gradient boosting assigns scores to each leaf of a tree and builds new trees based on the performance of previously created trees, so trees receive varying weights. This is in contrast to random forest, where all trees are weighted equally. |
| Logistic Regression | Base R | Logistic regression represents the standard method in epidemiology for analyzing binary outcomes. |
| Neural Network | neuralnet | A neural network consists of numerous interconnected processors, or “neurons”, organized in multiple layers: input, hidden, and output. |
| Support Vector Machines | e1071 | Support vector machines work by dividing the classes, i.e., cases versus non-cases, with a boundary called a hyperplane. The hyperplane is chosen to maximize the distance to the nearest neighboring predictor data points of each class. Data of higher complexity that cannot be separated in the original dimensions can be lifted to a higher dimension through a process called kernelling. |
*mlr [42] was also used for all techniques.
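The paper states that the packages above were used via mlr; the exact code is not reproduced here, so the following is only a minimal sketch of how the five learners might be fit by calling the named packages directly. The data frame `train`, the outcome `sdq_cutoff`, and all tuning settings are assumptions for illustration.

```r
# Illustrative fits of the five learners on an assumed data frame 'train'
# with a binary outcome 'sdq_cutoff' coded 0/1; all settings are placeholders.
library(randomForest)
library(xgboost)
library(neuralnet)
library(e1071)

# Logistic regression (base R)
fit_lr <- glm(sdq_cutoff ~ ., data = train, family = binomial)

# Random forest: majority vote over many decision trees
fit_rf <- randomForest(factor(sdq_cutoff) ~ ., data = train,
                       ntree = 500, importance = TRUE)

# XGBoost: gradient-boosted trees, fed a numeric predictor matrix
x_train <- model.matrix(sdq_cutoff ~ . - 1, data = train)
fit_xgb <- xgboost(data = x_train, label = train$sdq_cutoff,
                   nrounds = 100, objective = "binary:logistic", verbose = 0)

# Neural network: neuralnet expects numeric inputs and the predictors
# spelled out in the formula; one hidden layer of 5 units is arbitrary
predictors <- setdiff(names(train), "sdq_cutoff")
nn_formula <- as.formula(paste("sdq_cutoff ~", paste(predictors, collapse = " + ")))
fit_nn <- neuralnet(nn_formula, data = train, hidden = 5, linear.output = FALSE)

# Support vector machine with a radial (kernelled) decision boundary
fit_svm <- svm(factor(sdq_cutoff) ~ ., data = train,
               kernel = "radial", probability = TRUE)
```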
Fig 1. Learning curve.
The learning curve shows the performance of each technique without any data or hyperparameter modification (y axis) against the percentage of the dataset used to train the models (x axis).
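One way such a learning curve could be produced (not necessarily the authors' procedure) is to refit a model on growing fractions of the training data and score it on the tune set each time. The sketch below does this for the random forest only and defines a simple rank-based AUC helper; all names are assumptions carried over from the earlier sketches.

```r
# Illustrative learning curve: refit on growing fractions of 'train'
# and score AUC on 'tune' (random forest only, for brevity).
library(randomForest)

auc_rank <- function(scores, labels) {
  # Mann-Whitney formulation of the AUC
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(2020)
fractions <- seq(0.1, 1.0, by = 0.1)
curve_auc <- sapply(fractions, function(p) {
  idx  <- sample(nrow(train), size = floor(p * nrow(train)))
  fit  <- randomForest(factor(sdq_cutoff) ~ ., data = train[idx, ], ntree = 500)
  pred <- predict(fit, newdata = tune, type = "prob")[, "1"]
  auc_rank(pred, tune$sdq_cutoff)
})
plot(fractions, curve_auc, type = "b",
     xlab = "Fraction of training data", ylab = "AUC on tune set")
```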
Fig 2. AUC curves for tune set.
The AUC performance for each technique using the tune set.
Model performance on tune set.
| Learner | AUC | 95% bootstrap interval |
|---|---|---|
| Logistic Regression | 0.750 | 0.693–0.805 |
| XGBoost | 0.723 | 0.662–0.778 |
| Random Forest | 0.754 | 0.698–0.804 |
| Support Vector Machines | 0.754 | 0.701–0.802 |
| Neural Network | 0.715 | 0.658–0.769 |
Model performance on test set.
| Learner | AUC | 95% bootstrap interval |
|---|---|---|
| Logistic Regression | 0.700 | 0.665–0.734 |
| XGBoost | 0.692 | 0.660–0.723 |
| Random Forest | 0.739 | 0.708–0.769 |
| Support Vector Machines | 0.736 | 0.707–0.765 |
| Neural Network | 0.705 | 0.671–0.737 |
Fig 3. AUC curves for test set.
The AUC performance for each technique using the test set.
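The 95% bootstrap intervals in the two tables above could, for instance, be obtained by resampling the evaluation set with replacement and recomputing the AUC in each resample. The percentile-interval sketch below reuses the `auc_rank()` helper from the learning-curve sketch and assumes a vector of predicted probabilities `pred` aligned with the test set; it is an illustration, not the authors' exact procedure.

```r
# Percentile bootstrap interval for the AUC on an evaluation set;
# 'pred' and 'test$sdq_cutoff' are assumed to be aligned vectors.
set.seed(1)
boot_auc <- replicate(2000, {
  i <- sample(length(pred), replace = TRUE)   # resample cases with replacement
  auc_rank(pred[i], test$sdq_cutoff[i])
})
c(auc = auc_rank(pred, test$sdq_cutoff),
  quantile(boot_auc, c(0.025, 0.975)))        # point estimate and 95% interval
```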
Variable importance in random forest.
| Predictor (Source) | Importance |
|---|---|
| Oppositional Defiant symptoms | 136.97 |
| Impulsivity symptoms | 94.05 |
| Inattention symptoms | 92.66 |
| Executive dysfunction | 87.72 |
| Emotional symptoms | 76.82 |
| Neighborhood deprivation | 64.03 |
| Peer difficulty | 53.22 |
| Parity | 44.17 |
| Gestational age at birth | 43.71 |
| Separation anxiety | 43.13 |
1 Autism—Tics, AD/HD and other Comorbidities inventory [58]
2 The Longitudinal Integration Database for Health Insurance and Labor Market Studies [35]
3 Medical Birth Register [33]
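The importance scores above come from the random forest. With the randomForest package they could be extracted roughly as below (using `fit_rf` from the earlier sketch); whether the paper reports mean decrease in accuracy or in Gini impurity is not stated here, so the column choice is an assumption.

```r
# Extract and rank variable importance from the fitted random forest
# (fit_rf was trained with importance = TRUE in the earlier sketch).
imp <- randomForest::importance(fit_rf)   # matrix of importance measures
imp_sorted <- imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE), ]
head(imp_sorted, 10)                      # top 10 predictors
varImpPlot(fit_rf)                        # quick visual check
```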