Jelena Bešević, Ben Lacey, Megan Conroy, Wemimo Omiyale, Qi Feng, Rory Collins, Naomi Allen.
Abstract
UK Biobank is an intensively characterized prospective study of 500 000 men and women, aged 40 to 69 years when recruited between 2006 and 2010 from the general population of the United Kingdom. Established as an open-access resource for researchers worldwide to perform health research that is in the public interest, UK Biobank has collected (and continues to collect) a vast amount of data on genetic, physiological, lifestyle, and environmental factors, with prolonged follow-up of health conditions through linkage to administrative electronic health records. The study has already demonstrated its unique value in enabling research into the determinants of common endocrine and metabolic diseases. The importance of UK Biobank, heralded as a flagship project for UK health research, will only increase over time as the number of incident disease events accrues, and as the study is enhanced with additional data from blood assays (such as whole-genome sequencing, metabolomics, and proteomics), wearable technologies (including physical activity and cardiac monitors), and body imaging (magnetic resonance imaging and dual-energy X-ray absorptiometry). This unique research resource is likely to transform our understanding of the causes, diagnosis, and treatment of many endocrine and metabolic disorders.
Keywords: UK Biobank; cohort; endocrinology; metabolic disorders; review
Year: 2022 PMID: 35793237 PMCID: PMC9387695 DOI: 10.1210/clinem/dgac407
Source DB: PubMed Journal: J Clin Endocrinol Metab ISSN: 0021-972X Impact factor: 6.134
Figure 1. Breadth and depth of data in UK Biobank.
Baseline and resurvey data in UK Biobank
| Data type | Details | Number of participants | Date of collection | Date first available |
|---|---|---|---|---|
| Baseline questionnaire | Sociodemographic factors, family history, psychosocial factors, local environment, lifestyle, health status, medical history, cognitive function | Whole cohort (baseline) | 2006-2010 | 2012 |
| | | 20 000 (first resurvey) | 2012-2013 | 2013 |
| | | 100 000 target (imaging visit) | 2014- | 2014 |
| | | 60 000 target (repeat imaging) | 2019- | 2019 |
| Baseline physical measures | Blood pressure and heart rate, hand grip strength, anthropometry (including bio-impedance), spirometry, heel bone density, arterial stiffness, hearing, eye examination, ECG (at rest and during activity) | Whole cohort (baseline) | 2006-2010 | 2012 |
| | | 20 000 (first resurvey) | 2012-2013 | 2013 |
| | | 100 000 target (imaging visit) | 2014- | 2014 |
| | | 60 000 target (repeat imaging) | 2019- | 2019 |
| Web-based questionnaires | 24-h diet recall (4 occasions) | 210 000 | 2011 | 2012 |
| | Cognitive function | 121 000 | 2014 | 2015 |
| | Occupational history | 121 500 | 2015 | 2015 |
| | Mental health | 158 000 | 2017 | 2017 |
| | Digestive health | 176 000 | 2017 | 2018 |
| | Food preferences | 174 000 | 2019 | 2020 |
| | Pain | 169 000 | 2019 | 2021 |
| Physical activity monitor | Accelerometer data on duration and intensity of physical activity | 100 000 | 2013-2016 | 2016 |
| | | 2500 (repeat measurements) | 2018 | 2019 |
| Imaging assessment | Abdominal, brain, and heart MRI; full-body DEXA; carotid ultrasound; ECG | 100 000 target (imaging visit) | 2014- | 2014 |
| | | 60 000 target (repeat imaging) | 2019- | 2019 |
| Cardiac monitor | 14 days continual ECG to assess atrial fibrillation | 36 000 target | Pending | Pending |
Abbreviations: DEXA, dual-energy X-ray absorptiometry; ECG, electrocardiogram; MRI, magnetic resonance imaging.
Data currently available for 50 000 participants. Detailed information on the data available in UK Biobank can be found on the UK Biobank data showcase: https://biobank.ndph.ox.ac.uk/showcase/.
Health record linkage data in UK Biobank
| Data type | Details | Number of participants | Date of collection | Date first available |
|---|---|---|---|---|
| Death registrations | ICD-coded cause-specific mortality | Whole cohort | 2006- | 2012- |
| Cancer registrations | ICD-coded cancer diagnoses | Whole cohort | England 1971- | 2012- |
| Hospital admissions | ICD-coded diagnoses, and OPCS-coded procedures, from hospital inpatient records, including critical care | Whole cohort | England 1997- | 2012- |
| Primary care | Includes Read-coded data on diagnoses, prescriptions, and referrals | 230 000 | England 1938- | 2019- |
| Primary care (COVID-19 research only) | Includes Read-coded data on diagnoses, prescriptions, and referrals | Whole cohort | England 1938- | 2020- |
| SARS-CoV-2 antigen tests | Data on test result and date, and laboratory | Whole cohort | 2020- | 2020- |
Abbreviations: ICD, International Classification of Diseases; OPCS, Office of Population Censuses and Surveys Classification of Interventions and Procedures.
Sample assay data in UK Biobank
| Data type | Details | Number of participants | Date of collection | Date first available |
|---|---|---|---|---|
| Biochemistry markers | Biomarkers assayed in plasma, serum, red blood cells, and urine samples; includes established risk factors for disease (eg, lipids for vascular disease, sex hormones for cancer), diagnostic measures (eg, HbA1c for diabetes and rheumatoid factor for arthritis), and other measures (such as liver and renal function tests) | Whole cohort (baseline) | 2006-2010 | 2016 |
| | | 20 000 (first resurvey) | 2012-2013 | |
| Infectious agents | Measurement of antibody sero-positivity status of 20 pathogens | 10 000 (baseline) | 2006-2010 | 2019 |
| Genotyping | Genome-wide genotyping was performed using the UK BiLEVE Axiom array (~50 000 participants) and the UK Biobank Axiom Array (~450 000 participants). Approximately 850 000 variants were directly measured, with > 90 million variants imputed | Whole cohort (baseline) | 2006-2010 | 2017 |
| Whole exome sequencing | Whole exome sequencing measures the regions of the genome (about 2%) that are involved in coding for proteins and is particularly suitable for identifying disease-causing and/or rare genetic variants | Whole cohort (baseline) | 2006-2010 | 2019 |
| Whole genome sequencing | Whole genome sequencing measures the entire genome and will provide information that will complement and enhance the existing genotyping and exome data | Whole cohort (baseline) | 2006-2010 | 2021 |
| Telomeres | Telomere length | Whole cohort (baseline) | 2006-2010 | 2021 |
| Plasma metabolites | >200 circulating metabolites (predominantly lipids) measured using NMR metabolomics platform | Whole cohort (baseline) | 2006-2010 | 2021 |
| | | 20 000 (first resurvey) | 2012-2013 | |
| Plasma proteins | Approx. 3000 circulating proteins | 57 000 (baseline) | 2006-2010 | Pending |
Abbreviations: HbA1c, hemoglobin A1c test; NMR, nuclear magnetic resonance.
Date data made available to the wider research community (ie, not including the exclusive period of access offered to research groups who have funded some of the sample assays).
Data expected to be available end-2022.
Figure 2. UK Biobank publications and citations 2012-2021.