Angela Pinot de Moira, Sido Haakma, Katrine Strandberg-Larsen, Esther van Enckevort, Marjolein Kooijman, Tim Cadman, Marloes Cardol, Eva Corpeleijn, Sarah Crozier, Liesbeth Duijts, Ahmed Elhakeem, Johan G Eriksson, Janine F Felix, Sílvia Fernández-Barrés, Rachel E Foong, Anne Forhan, Veit Grote, Kathrin Guerlich, Barbara Heude, Rae-Chi Huang, Marjo-Riitta Järvelin, Anne Cathrine Jørgensen, Tuija M Mikkola, Johanna L T Nader, Marie Pedersen, Maja Popovic, Nina Rautio, Lorenzo Richiardi, Justiina Ronkainen, Theano Roumeliotaki, Theodosia Salika, Sylvain Sebert, Johan L Vinther, Ellis Voerman, Martine Vrijheid, John Wright, Tiffany C Yang, Faryal Zariouh, Marie-Aline Charles, Hazel Inskip, Vincent W V Jaddoe, Morris A Swertz, Anne-Marie Nybo Andersen.
Abstract
The Horizon2020 LifeCycle Project is a cross-cohort collaboration that brings together data from multiple birth cohorts from across Europe and Australia to facilitate studies on the influence of early-life exposures on later health outcomes. A major product of this collaboration has been the establishment of a FAIR (findable, accessible, interoperable and reusable) data resource known as the EU Child Cohort Network. Here we focus on the EU Child Cohort Network's core variables. These are a set of basic variables, derivable by the majority of participating cohorts and frequently used as covariates or exposures in lifecourse research. First, we describe the process by which the list of core variables was established. Second, we explain the protocol according to which these variables were harmonised in order to make them interoperable. Third, we describe the catalogue developed to ensure that the network's data are findable and reusable. Finally, we describe the core data, including the proportion of variables harmonised by each cohort and the number of children for whom harmonised core data are available. EU Child Cohort Network data will be analysed using a federated analysis platform, removing the need to physically transfer data and thus making the data more accessible to researchers. The network will add value to participating cohorts by increasing statistical power and exposure heterogeneity, as well as facilitating cross-cohort comparisons, cross-validation and replication. Our aim is to motivate other cohorts to join the network and encourage the use of the EU Child Cohort Network by the wider research community.
Keywords: Birth cohort; Cross-cohort collaboration; Data harmonisation; FAIR (findable, accessible, interoperable and reusable) principles; Lifecourse epidemiology
Year: 2021 PMID: 33884544 PMCID: PMC8159791 DOI: 10.1007/s10654-021-00733-9
Source DB: PubMed Journal: Eur J Epidemiol ISSN: 0393-2990 Impact factor: 12.434
Pregnancy and child cohorts contributing data to the EU Child Cohort Network as of June 2020
| Cohort (full name) | Country | Recruitment | Enrolment period | Age at last follow-up (y) | Nᵃ |
|---|---|---|---|---|---|
| ALSPAC (Avon Longitudinal Study of Parents & Children) | UK | 1991–1992 | Pregnancy | 25 | 10,742 |
| BiB (Born in Bradford) | UK | 2007–2011 | Pregnancy | 9 | 12,397 |
| CHOP (The EU Childhood Obesity Programme) | Germany, Belgium, Italy, Spain and Poland | 2002–2004 | Birth | 11 | 1280 |
| DNBC (Danish National Birth Cohort) | Denmark | 1996–2002 | Pregnancy | 18 | 72,157 |
| EDEN (Study on the pre- & early postnatal determinants of child health & development) | France | 2003–2005 | Pregnancy | 8 | 1676 |
| ELFE (Etude Longitudinale Francaise depuis l’Enfance) | France | 2011 | Birth | 7 | 10,825 |
| GECKO (Groningen Expert Center for Kids with Obesity Drenthe Cohort) | The Netherlands | 2006–2007 | Pregnancy | 10 | 2682 |
| Gen R (Generation R) | The Netherlands | 2002–2006 | Pregnancy | 17 | 8534 |
| HBCS (Helsinki Birth Cohort Study) | Finland | 1934–1944 | Birth | 76 | 13,343 |
| INMA (INMA-Infancia y Medio Ambiente (Environment and Childhood Project)) | Spain | 1997–2008 | Pregnancy | 18 | 1900 |
| MoBa (Norwegian Mother, Father and Child Cohort Study) | Norway | 1999–2008 | Pregnancy | 14 | 76,569 |
| NFBC1966 (Northern Finland Birth Cohort 1966) | Finland | 1966 | Pregnancy | 46–48 | 7810 |
| NFBC1986 (Northern Finland Birth Cohort 1986) | Finland | 1985–1986 | Pregnancy | 33–35 | 8372 |
| NINFEA (Nascita e INFanzia: gli Effetti dell’Ambiente) | Italy | 2005–2016 | Pregnancy | 13 | 6018 |
| Raine (The Raine Study) | Australia | 1989–1992 | Pregnancy | 26 | 2491 |
| Rhea (Mother Child Cohort in Crete) | Greece | 2007–2008 | Pregnancy | 7 | 967 |
| SWS (Southampton Women’s Survey) | UK | 1998–2007 | Preconception | 9 | 2921 |
ᵃ Number of children from the cohort contributing data to the EU Child Cohort Network and with all three of the following variables harmonised: (1) birth weight, (2) sex, (3) at least one height or weight measurement taken at ≥ 1 year
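The inclusion rule in the footnote above can be expressed as a small predicate. The following is a minimal sketch only: the function name, argument names and the `(age_years, kind)` record layout are hypothetical and are not taken from the LifeCycle harmonisation manual.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical record layout: (age in years, "height" or "weight").
Measurement = Tuple[float, str]


def is_included(
    birth_weight_g: Optional[float],
    sex: Optional[str],
    measurements: Sequence[Measurement],
) -> bool:
    """Return True if a child meets all three criteria in the table footnote.

    (1) birth weight harmonised, (2) sex harmonised,
    (3) at least one height or weight measurement taken at >= 1 year.
    """
    has_later_measure = any(
        age_years >= 1 and kind in ("height", "weight")
        for age_years, kind in measurements
    )
    return birth_weight_g is not None and sex is not None and has_later_measure
```

A child with a birth weight, a recorded sex and a height at exactly 1 year is counted; a child whose only measurement was taken at 6 months is not.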
Fig. 1 The process adopted in LifeCycle to establish and harmonise the core variables for the EU Child Cohort Network
A glossary of the key elements and concepts in LifeCycle
| Term | Definition |
|---|---|
| Complete harmonisation | The ability to derive the variable as described in the harmonisation manual, both in definition and format |
| Data harmonisation | The process of creating a common dataset from disparate datasets |
| DataSHIELD | An infrastructure and series of R packages that enables the remote and non-disclosive analysis of individual participant data |
| EU Child Cohort Network | A network bringing together existing data from more than 250,000 European and Australian children and their parents |
| Federated data analysis | Centralised analysis of individual participant data where data are stored on local servers and do not leave the host institution |
| Harmonisation manual | A manual containing a list of target variables together with instructions for their harmonisation |
| Impossible harmonisation | The complete inability to derive the variable due to no or limited information |
| Horizon2020 LifeCycle Project | A collaboration between scientists from more than 17 existing pregnancy and child cohort studies |
| EU Child Cohort Network Variable Catalogue | An online catalogue providing an overview of available data in the EU Child Cohort Network, including details of how data have been created ( |
| LifeCycle core variables | A set of basic variables, derivable by the majority of cohorts participating in LifeCycle and frequently required in lifecourse analyses |
| Opal | A data warehouse that is integrated with R and the DataSHIELD platform, allowing the analysis of data without the physical sharing or disclosing of individual participant data |
| Partial harmonisation | The ability to derive the variable as described but with some loss of information |
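To make the glossary entry on federated data analysis concrete, the sketch below pools a mean across sites using only per-site aggregates, so individual-level values never leave the host. It is an illustration of the general idea only and does not use or reproduce the DataSHIELD or Opal APIs; the names `SiteSummary`, `summarise_locally` and `pooled_mean` and the birth weights are hypothetical. Real platforms would typically add disclosure controls, which are not modelled here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class SiteSummary:
    """Non-disclosive aggregate returned by one host institution."""
    n: int
    total: float


def summarise_locally(values: Sequence[Optional[float]]) -> SiteSummary:
    """Runs at the host site: only the count and sum leave the server."""
    observed = [v for v in values if v is not None]
    return SiteSummary(n=len(observed), total=float(sum(observed)))


def pooled_mean(summaries: Iterable[SiteSummary]) -> float:
    """Runs centrally: combines the aggregates into one overall mean."""
    summaries = list(summaries)
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no observed values at any site")
    return sum(s.total for s in summaries) / n


# Hypothetical birth weights (g) held at two sites; None marks a missing value.
site_a = [3400, 3550, None, 3100]
site_b = [3300, 3650]
result = pooled_mean([summarise_locally(site_a), summarise_locally(site_b)])
```

Because each site reports both a count and a sum, the pooled result is identical to the mean that would be obtained from the combined individual-level data.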
Fig. 2 An illustration of the EU Child Cohort Network Variable Catalogue displaying the LifeCycle variable “maternal history of asthma before pregnancy”. Displayed is a description of the target EU Child Cohort Network variable and how the variable was harmonised in two separate cohorts. Note: descriptions from two separate cohorts are displayed on the same page for illustrative purposes only
Fig. 3 An illustration of the EU Child Cohort Network Variable Catalogue’s menu structure giving an overview of the themes included in the EU Child Cohort Network and the number of variables included in each theme. ¹ Including yearly-repeated variables with up to 18 measures between the ages of 0 and < 18 years. ² Including weekly-repeated variables with up to 43 measures taken between gestational weeks 0 and < 43. ³ Including trimester-repeated variables with separate measures for the first, second and third trimesters. ⁴ Including separate variables indicating the type of father the variable relates to (biological, social father, social mother, unknown). ⁵ Including separate variables relating to secondary father-figures. ⁶ Including monthly-repeated variables with up to 216 measures between the ages of 0 and < 216 months. ⁷ Including yearly-repeated variables with up to four measures between the ages of 0 and < 4 years. ⁸ Including yearly-repeated variables with up to 13 measures between the ages of 0 and < 13 years
Child-related characteristics of cohorts contributing data to the EU Child Cohort Network
| Cohort | Nᵃ | Female, n (%) | GA (weeks), mean (SD) | Birth weight (g), mean (SD) | SGAᵇ, n (%) | LGAᶜ, n (%) | Ever breastfed, n (%) |
|---|---|---|---|---|---|---|---|
| ALSPAC | 10,742 | 5313 (49.5) | 40.0 (1.9) | 3408 (555) | 644 (6.0) | 1015 (9.5) | 7213 (75.8) |
| BiB | 12,397 | 5980 (48.2) | 39.5 (1.8) | 3212 (557) | 1385 (11.2) | 562 (4.5) | 3228 (78.7) |
| CHOP | 1280 | 659 (51.5) | 40.4 (1.2) | 3297 (351) | 28 (2.2) | 34 (2.7) | 901 (70.4) |
| DNBC | 72,157 | 35,464 (49.1) | 39.9 (1.8) | 3565 (582) | 2281 (3.2) | 10,046 (14.0) | 55,214 (98.3) |
| EDEN | 1676 | 802 (47.9) | 39.7 (1.7) | 3283 (506) | 118 (7.0) | 60 (3.6) | 1230 (73.4) |
| ELFE | 10,825 | 5277 (48.7) | 39.6 (1.5) | 3322 (488) | 644 (6.0) | 535 (5.0) | 7858 (74.8) |
| GECKO | 2682 | 1332 (49.7) | 39.8 (1.6) | 3542 (548) | 87 (3.3) | 357 (13.4) | 1938 (79.4) |
| Gen R | 8534 | 4229 (49.6) | 40.3 (1.9) | 3400 (576) | 615 (7.4) | 541 (6.5) | 6013 (91.8) |
| HBCS | 13,343 | 6369 (47.7) | 39.8 (1.8) | 3407 (479) | NA | NA | 11,110 (99.6) |
| INMA | 1900 | 923 (48.6) | 39.9 (1.6) | 3263 (467) | 139 (7.3) | 70 (3.7) | 1648 (88.6) |
| MoBa | 76,569 | 37,390 (48.8) | 39.8 (1.9) | 3576 (578) | 2725 (3.6) | 7377 (9.6) | 71,768 (93.7) |
| NFBC1966 | 7810 | 3628 (46.5) | 40.5 (1.9) | 3491 (530) | 378 (5.3) | 703 (9.9) | 4550 (86.0) |
| NFBC1986 | 8372 | 4112 (49.1) | 39.8 (1.7) | 3560 (546) | 259 (3.1) | 1186 (14.2) | NA |
| NINFEA | 6018 | 2951 (49.0) | 39.7 (1.7) | 3238 (493) | 471 (7.9) | 200 (3.3) | 5502 (92.1) |
| Raine | 2491 | 1218 (48.9) | 39.1 (2.3) | 3299 (602) | 142 (7.0) | 146 (7.2) | 2082 (89.7) |
| Rhea | 967 | 459 (47.5) | 38.7 (1.5) | 3183 (455) | 56 (5.9) | 51 (5.3) | 805 (86.5) |
| SWS | 2921 | 1411 (48.3) | 39.7 (1.8) | 3441 (547) | 126 (4.3) | 259 (8.9) | 2376 (82.5) |
Values are mean (standard deviation) or n (valid percent)
GA gestational age at birth, SGA small for gestational age, LGA large for gestational age, NA data not available
ᵃ Number of children from the cohort contributing data to the EU Child Cohort Network and with all three of the following variables harmonised: (1) birth weight, (2) sex, (3) at least one height or weight measurement taken at ≥ 1 year
ᵇ Birth weight ≤ 5th percentile for gestational age (in completed weeks) using the WHO fetal growth charts [52] as the growth standard
ᶜ Birth weight ≥ 95th percentile for gestational age (in completed weeks) using the WHO fetal growth charts [52] as the growth standard
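The SGA and LGA definitions in the footnotes reduce to two threshold comparisons once the 5th and 95th percentile birth weights for the child's completed gestational week are known. The sketch below assumes those cut-offs are supplied by the caller. The numbers in the usage line are placeholders and are not values from the WHO fetal growth charts, and the label "AGA" (appropriate for gestational age) is introduced here for the remaining category.

```python
def classify_size_for_ga(birth_weight_g: float, p5_g: float, p95_g: float) -> str:
    """Classify size for gestational age using the table's definitions.

    p5_g and p95_g are the 5th and 95th percentile birth weights (g) for the
    child's completed gestational week, taken from the chosen growth standard.
    """
    if p5_g >= p95_g:
        raise ValueError("p5_g must be smaller than p95_g")
    if birth_weight_g <= p5_g:  # footnote b: at or below the 5th percentile
        return "SGA"
    if birth_weight_g >= p95_g:  # footnote c: at or above the 95th percentile
        return "LGA"
    return "AGA"


# Placeholder cut-offs, for illustration only (not WHO chart values).
example = classify_size_for_ga(3400, p5_g=2600, p95_g=4000)
```

Both comparisons are inclusive, matching the "≤" and "≥" in the definitions, so a birth weight exactly on a cut-off is classified as SGA or LGA.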
Mother-related characteristics of cohorts contributing data to the EU Child Cohort Network
| Cohort | Nᵃ | Maternal age at birth (y), mean (SD) | Education level: high, n (%) | Education level: medium, n (%) | Education level: low, n (%) | Ethnicity: White, n (%) | Ethnicity: Black, Asian or minority ethnic, n (%) | Ethnicity: mixed, n (%) | Multiparous, n (%) | Smoked in pregnancy, n (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| ALSPAC | 10,742 | 29.2 (4.6) | 1444 (14.2) | 6954 (68.6) | 1741 (17.2) | 9874 (98.3) | 169 (1.7) | – | 5629 (54.8) | 2468 (26.0) |
| BiB | 12,397 | 27.6 (5.6) | 2534 (26.8) | 1502 (15.9) | 5420 (57.3) | 4290 (41.8) | 5783 (56.3) | 200 (1.9) | 7259 (60.8) | 1659 (16.2) |
| CHOP | 1280 | 30.2 (5.0) | 336 (26.3) | 640 (50.2) | 300 (23.5) | 1232 (96.4) | 46 (3.6) | – | 652 (51.0) | 416 (32.6) |
| DNBC | 72,157 | 30.1 (4.2) | 33,700 (52.3) | 14,067 (21.8) | 16,655 (25.9) | NA | NA | NA | 37,964 (52.6) | 17,580 (24.7) |
| EDEN | 1676 | 29.7 (4.8) | 938 (56.2) | 636 (38.1) | 94 (5.6) | 1437 (99.1) | 7 (0.5) | 6 (0.4) | 911 (54.5) | 413 (24.7) |
| ELFE | 10,825 | 30.8 (4.7) | 7240 (66.9) | 3063 (28.3) | 521 (4.8) | 8706 (83.9) | 963 (9.3) | 705 (6.8) | 5673 (53.0) | 1779 (16.6) |
| GECKO | 2682 | 30.7 (4.4) | 900 (35.9) | 724 (28.9) | 885 (35.3) | 2400 (95.5) | 70 (2.8) | 43 (1.7) | 1591 (59.9) | 411 (15.4) |
| Gen R | 8534 | 30.7 (5.2) | 3448 (45.3) | 3380 (44.4) | 778 (10.2) | 4606 (57.1) | 2665 (33.0) | 799 (9.9) | 3691 (44.8) | 1888 (25.9) |
| HBCS | 13,343 | 28.4 (5.4) | NA | NA | NA | NA | NA | NA | 6861 (51.4) | NA |
| INMA | 1900 | 31.8 (4.2) | 661 (35.2) | 768 (40.9) | 449 (23.9) | 1802 (95.7) | 80 (4.3) | – | 810 (44.5) | 588 (31.4) |
| MoBa | 76,569 | 30.4 (4.4) | 48,804 (67.5) | 22,166 (30.6) | 1354 (1.9) | NA | NA | NA | 39,262 (51.7) | 6194 (8.1) |
| NFBC1966 | 7810 | 28.1 (6.7) | 254 (3.3) | 1033 (13.5) | 6387 (83.2) | NA | NA | NA | 5387 (69.1) | 1569 (20.7) |
| NFBC1986 | 8372 | 27.8 (5.5) | 1735 (23.7) | 2744 (37.4) | 2856 (38.9) | NA | NA | NA | 5499 (65.9) | 1975 (23.7) |
| NINFEA | 6018 | 33.2 (4.2) | 3799 (63.6) | 1923 (32.2) | 253 (4.2) | NA | NA | NA | 1548 (27.0) | 453 (7.6) |
| Raine | 2491 | 27.9 (5.8) | 465 (20.1) | 633 (27.3) | 1221 (52.7) | 2175 (89.2) | 264 (10.8) | – | 1275 (52.3) | 666 (27.3) |
| Rhea | 967 | 29.7 (4.9) | 304 (32.1) | 481 (50.7) | 163 (17.2) | 926 (99.8) | 2 (0.2) | – | 524 (54.9) | 290 (33.1) |
| SWS | 2921 | 30.2 (3.8) | 837 (28.7) | 1730 (59.2) | 345 (11.8) | 2799 (95.8) | 105 (3.6) | 16 (0.5) | 1409 (48.3) | 428 (15.4) |
Values are mean (standard deviation) or n (valid percent)
ᵃ Number of children from the cohort contributing data to the EU Child Cohort Network and with all three of the following variables harmonised: (1) birth weight, (2) sex, (3) at least one height or weight measurement taken at ≥ 1 year. Mothers who contributed more than one child to a cohort are counted more than once in the table
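Because the tables report valid percentages, the denominator for each percentage is the number of children with a non-missing value, not the cohort N. As a worked example, the sketch below back-calculates the approximate denominator behind the ALSPAC entry for smoking in pregnancy, 2468 (26.0), from the table above. The function names are hypothetical, and since the published percentage is rounded to one decimal place, the implied denominator is only approximate.

```python
def valid_percent(n_with_trait: int, n_non_missing: int) -> float:
    """Percentage among children with a non-missing value, to one decimal."""
    return round(100 * n_with_trait / n_non_missing, 1)


def implied_valid_n(n_with_trait: int, reported_percent: float) -> int:
    """Approximate non-missing denominator implied by a reported 'n (%)'."""
    return round(n_with_trait / (reported_percent / 100))


# ALSPAC, smoked in pregnancy: 2468 (26.0), with cohort N = 10,742.
denominator = implied_valid_n(2468, 26.0)
check = valid_percent(2468, denominator)
```

The implied denominator is well below the cohort N of 10,742, which is why 2468 corresponds to 26.0% rather than the roughly 23% that dividing by N would give.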
Fig. 4 Percentage of EU Child Cohort Network core variables harmonised by each cohort. The figure displays the percentage of the 123 core variables listed in Online Resource 1 (excluding meta-variables) harmonised by each cohort. Shading of bars displays the degree of matching within each cohort: black bars represent percentage of completely harmonised variables; dark grey bars represent percentage of partially harmonised variables; light grey bars represent percentage of variables that were not harmonisable (impossible harmonisation)
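The percentages plotted in this figure can be derived from a per-variable harmonisation status for each cohort. The following is a minimal sketch under stated assumptions: the status labels mirror the glossary terms, but the function name and the example counts (80 complete, 30 partial, 13 impossible) are hypothetical and do not describe any actual cohort.

```python
from collections import Counter
from typing import Dict, Iterable

STATUSES = ("complete", "partial", "impossible")
N_CORE_VARIABLES = 123  # core variables, excluding meta-variables


def harmonisation_percentages(statuses: Iterable[str]) -> Dict[str, float]:
    """Percentage of core variables in each harmonisation category."""
    counts = Counter(statuses)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unexpected status labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total != N_CORE_VARIABLES:
        raise ValueError(f"expected {N_CORE_VARIABLES} statuses, got {total}")
    return {s: round(100 * counts[s] / total, 1) for s in STATUSES}


# Hypothetical cohort: one status per core variable.
example_statuses = ["complete"] * 80 + ["partial"] * 30 + ["impossible"] * 13
percentages = harmonisation_percentages(example_statuses)
```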
Fig. 5 Harmonised non-repeated core variables in the EU Child Cohort Network. Bars display the number of children with either a partially (grey bars) or completely (black bars) harmonised core variable for each of the main themes/exposures. The dashed line represents the total number of children (240,684), as of June 2020, contributing data to the EU Child Cohort Network with all three of the following variables harmonised: (1) birth weight, (2) sex, (3) at least one height or weight measurement taken at ≥ 1 year. COB country of birth, PE pre-eclampsia, gest. HT gestational hypertension, size for GA size for gestational age
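The total of 240,684 children quoted in this caption can be reproduced by summing the per-cohort N values from the table of contributing cohorts. The short check below uses those published figures; only the variable names are introduced here.

```python
# Per-cohort N as listed in the table of contributing cohorts (June 2020).
cohort_n = {
    "ALSPAC": 10_742,
    "BiB": 12_397,
    "CHOP": 1_280,
    "DNBC": 72_157,
    "EDEN": 1_676,
    "ELFE": 10_825,
    "GECKO": 2_682,
    "Gen R": 8_534,
    "HBCS": 13_343,
    "INMA": 1_900,
    "MoBa": 76_569,
    "NFBC1966": 7_810,
    "NFBC1986": 8_372,
    "NINFEA": 6_018,
    "Raine": 2_491,
    "Rhea": 967,
    "SWS": 2_921,
}

total_children = sum(cohort_n.values())
print(f"{len(cohort_n)} cohorts, {total_children:,} children")
```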
Fig. 6 Number of children in the EU Child Cohort Network with yearly-repeated measure core variables. Bars display the number of children with at least one measure between the ages of zero and three (child-care variables) or zero and seventeen (all other variables), either partially (grey bars) or completely (black bars) harmonised. The dashed line represents the total number of children (240,684), as of June 2020, contributing data to the EU Child Cohort Network with all three of the following variables harmonised: (1) birth weight, (2) sex, (3) at least one height or weight measurement taken at ≥ 1 year
Fig. 7 Weight and height data in the EU Child Cohort Network. Graphs display (a) the number of children in the network with at least one weight (dark grey bars) or height (light grey bars) measure at < 3 months, 3–6 months, 6–12 months and yearly intervals from 1 to 17 years; (b) the total number of weight (dark grey bars) and height (light grey bars) measurements within each age band (i.e. one child may contribute multiple measurements within each age band)