
Cohort Profile: Extended Cohort for E-health, Environment and DNA (EXCEED).

Catherine John, Nicola F Reeve, Robert C Free, Alexander T Williams, Ioanna Ntalla, Aliki-Eleni Farmaki, Jane Bethea, Linda M Barton, Nick Shrine, Chiara Batini, Richard Packer, Sarah Terry, Beverley Hargadon, Qingning Wang, Carl A Melbourne, Emma L Adams, Catherine E Bee, Kyla Harrington, José Miola, Nigel J Brunskill, Christopher E Brightling, Julian Barwell, Susan E Wallace, Ron Hsu, David J Shepherd, Edward J Hollox, Louise V Wain, Martin D Tobin.

Year: 2019; PMID: 31062032; PMCID: PMC6659362; DOI: 10.1093/ije/dyz073

Source DB: PubMed; Journal: Int J Epidemiol; ISSN: 0300-5771; Impact factor: 7.196



Why was the cohort set up?

EXCEED aims to develop understanding of the genetic, environmental and lifestyle-related causes of health and disease. Cohorts like EXCEED, with broad consent to study multiple phenotypes related to onset and progression of disease and drug response, have a role to play in medicines development, by providing genetic evidence that can identify, support or refute putative drug efficacy or identify possible adverse effects. Furthermore, such cohorts are well suited to the study of multimorbidity. Multimorbidity describes the presence of multiple diseases or conditions in one patient, though definitions in the literature vary widely. It demands a holistic approach to optimize care and avoid iatrogenic complications, such as drug interactions. In the context of increasing specialisation of many health care systems and high health care use among people with multimorbidity, providing such care poses a complex challenge. In high-income countries, multimorbidity is particularly common among more deprived socioeconomic groups and may even be considered as the norm amongst older people, and an ageing global population and a growing burden of non-communicable diseases in low- and middle-income countries compound its global importance. An expert working group convened by the UK Academy of Medical Sciences recently highlighted the lack of available evidence relating to the burden, determinants, prevention and treatment of multimorbidity, and recommended the prioritisation of research on multimorbidity spanning the translational pathway from understanding of its biological mechanisms to health services research. Studies designed to investigate multimorbidity, rather than considering individual conditions in relative isolation, are therefore vital. Linkage to electronic health records (EHR) has enabled information on a broad range of diseases and risk factors to be studied in EXCEED and places multimorbidity at the study’s heart.
The EHR linkage also facilitates longitudinal follow-up over an extended period, enabling, for example, the investigation of lifestyle factors and other exposures on healthy ageing and outcomes in later life. Combining wide-ranging data from EHR with genome-wide genotyping is also central to EXCEED’s purpose. In recent years, our understanding of which genes are associated with both rare and common diseases has advanced rapidly as available sample sizes for genome-wide association studies (GWAS) have grown. For example, there are now 279 genetic variants associated with lung function and chronic obstructive pulmonary disease (COPD). However, in many cases, our understanding of the mechanisms through which these variants influence disease risk (and which could therefore be therapeutic targets) is relatively limited. An efficient design to inform this understanding is to stratify participants based on available study data on their health status (phenotype) or genetic risk factors (genotype), and then to recall them for further detailed investigations which would be impracticable across a whole cohort. EXCEED was purposely designed as a resource for recall-by-genotype sub-studies, and all participants have consented to be recalled on this basis. The study is led by the University of Leicester, in partnership with University Hospitals of Leicester NHS Trust and in collaboration with Leicestershire Partnership NHS Trust, local general practices and smoking cessation services.

Who is in the cohort?

Recruitment to the cohort has taken place since 22 November 2013, primarily from the general population through general practices in Leicester City, Leicestershire and Rutland. In total, 10 156 participants have been recruited to 4 December 2018. Of these, 445 were recruited through smoking cessation services in Leicester City, Leicestershire and Rutland, 44 through targeted recruitment of those with a recorded diagnosis of COPD in their electronic primary care record and 117 through additional community-based recruitment focused on Leicester’s South Asian communities (see Figure 1). Although a recruitment target of 10 000 has now been reached, community-based recruitment particularly focusing on minority ethnic groups will continue subject to further funding.
Figure 1.

Recruitment methods and numbers.

All tables in this paper present participants recruited via primary care or smoking cessation, whose data were collected and quality control undertaken at 4 December 2018 (9384 participants for questionnaire data, 8930 participants for primary care data). Quality control for the remaining questionnaires and linkage to primary care data are ongoing. Around 400 participants do not have questionnaire data but were counted as recruited because they provided consent and a saliva sample.
In the UK, over 98% of the population is registered with a National Health Service (NHS) general practitioner. For recruitment through primary care, all registered patients aged between 40 and 69 years in participating general practices were eligible for recruitment. Exclusion criteria were minimal: those receiving palliative care, those with learning disabilities or dementia and those whose records indicated they had declined consent for record sharing for research. All eligible patients identified through primary care were sent an initial letter with brief information about the study and a reply slip to indicate their interest. For participants recruited via smoking cessation services, the lower age limit was reduced to 30 years because of the higher risk of disease among smokers. Initial eligibility screening and information provision were either undertaken through electronic client records followed by a letter to the client (as in primary care) or face-to-face by a smoking cessation adviser during a routine appointment. Additionally, patients with a recorded diagnosis of COPD were invited from four local general practices with a higher prevalence of COPD, to boost the numbers available for a sub-study of respiratory disease. For this group, the lower age limit was 30 years, and all other exclusion criteria were identical to the main primary care recruitment.
All those who responded to indicate they were interested in taking part were sent full written information on the study, in addition to a study consent form. Full information regarding participant consent can be found at [http://www.leicsrespiratorybru.nihr.ac.uk/our-research/our-research-studies/exceed]. All participants consent to follow-up of their electronic health care records for up to 25 years, to storage and analysis of their DNA sample and to being contacted for further studies on the basis of their genetic data (recall-by-genotype) or health status (recall-by-phenotype). They may also consent or decline to be contacted regarding genetic variants which may, in the future, be considered clinically relevant. Participants proceeded via one of two routes depending on their location and personal preference: a face-to-face appointment with a research professional, or by post. The flow of participants through the main primary care recruitment route is illustrated in Figure 2. Approximately 8% of those who received an initial invitation via primary care completed recruitment. Table 1 gives an overview of the demographic characteristics of the primary care population sampled, compared with the characteristics of those recruited to the study via primary care. Participants in the study were older and more likely to be female than the primary care population from which they were drawn. This reflects well-known patterns of participation in similar cohorts. The local primary care population includes a large proportion of minority ethnic groups, especially Asian and Asian British. These groups are under-represented among study participants, although the proportion of study participants of Asian and Asian British ethnicity (5%) is higher than many UK cohorts, including UK Biobank. This reflects experience of similar recruitment methods in other studies.
Explanations for the under-representation of minority ethnic groups in medical research more generally include language barriers, inequitable access to health care services, cultural sensitivities and a lack of awareness of medical research and its purpose. Community-based recruitment has been introduced to EXCEED to improve representation of these groups.
Figure 2.

Recruitment via primary care.

Table 1.

Demographic characteristics of the primary care population sampled for the study and those who participated (via the primary care recruitment route only)

Characteristic | Primary care population(a): n | % | Recruited: n | % | Difference in proportions, % (95% confidence interval)
Age (years) | N = 117 965 | | N = 8979 | |
  <45 | 21 057 | 17.9 | 661 | 7.4 | −10.5 (−9.9 to −11.1)
  45–54 | 44 559 | 37.8 | 2264 | 25.2 | −12.6 (−11.6 to −13.5)
  55–64 | 36 133 | 30.6 | 3365 | 37.5 | 6.9 (5.8 to 7.9)
  ≥65 | 16 216 | 13.7 | 2689 | 29.9 | 16.2 (15.2 to 17.2)
Sex | N = 117 965 | | N = 8979 | |
  Male | 59 003 | 50.0 | 3993 | 44.5 | −5.5 (−4.5 to −6.6)
  Female | 58 962 | 50.0 | 4986 | 55.5 | 5.5 (4.5 to 6.6)
Ethnicity | N = 81 947 | | N = 8937 | |
  White | 59 576 | 72.7 | 8284 | 92.7 | 20.0 (19.4 to 20.6)
  Asian/Asian British | 17 670 | 21.6 | 427 | 4.8 | −16.8 (−16.3 to −17.3)
  Black/African/Caribbean/Black British | 2763 | 3.4 | 12 | 0.1 | −3.3 (−3.1 to −3.4)
  Mixed | 686 | 0.8 | 93 | 1.0 | 0.2 (0.02 to 0.4)
  Chinese | 301 | 0.4 | 56 | 0.6 | 0.2 (0.08 to 0.4)
  Other | 951 | 1.2 | 65 | 0.7 | −0.5 (−0.2 to −0.6)

(a) Primary care population is all patients within the eligible age range in the practices sampled, and includes those who were excluded at the next step (codes for palliative care, dementia, learning disability, or lack of consent to share data for research).
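
The differences in the final column of Table 1 can be checked from the counts. The sketch below assumes a simple Wald (normal-approximation) interval for the difference between two proportions treated as independent; the paper does not state which method was used, and the function name is illustrative only.

```python
import math


def diff_in_proportions(x1, n1, x2, n2, z=1.96):
    """Return (difference, lower, upper) for p2 - p1 in percentage points.

    Assumption: Wald interval for two independent proportions. This is an
    illustrative choice, not a method confirmed by the paper.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return tuple(round(100 * v, 1) for v in (diff, diff - z * se, diff + z * se))


# Female row of Table 1: 58 962 of 117 965 in the primary care population,
# 4986 of 8979 among those recruited.
female = diff_in_proportions(58962, 117965, 4986, 8979)
print(female)
```

Under this assumption the Female row of Table 1 is reproduced; other rows may differ slightly in the last digit depending on the interval method and rounding.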


How often have they been followed up?

Participants have consented to follow-up through linkage to EHR for up to 25 years. Linkage to electronic primary care records (i.e. records from the participant’s general practice) is undertaken upon completion of recruitment at each practice and has been completed for 8930 participants to 4 December 2018. As participants are prospectively followed up, we expect losses due to deaths (to date less than 1% of participants), withdrawals (to date less than 0.1% of participants), relatively few losses due to house moves within the UK or changing general practitioner (as NHS patients retain the same NHS number and their electronic records move with them) and some losses due to emigration. Analyses of historical health care records to track disease development and progression may be subject to selection bias, in particular survivor bias.

What has been measured?

There are several phases of data collection, summarised in Table 2. Linked primary care data provide historical cohort data. Since the mid-1990s, prospectively recorded consultations enable the retrieval of information not only on symptoms for which participants have visited their general practitioner and diagnoses which have been made, but also on examination findings (including blood pressure readings and spirometry results, for example), laboratory test results, drug prescriptions and secondary care referrals. Major diagnoses documented on paper records before the mid-1990s were retrospectively coded at the time of computerization and so can also be retrieved.
Table 2.

Summary of data collected at each phase

Phase | Measurements
Historical cohort data | Historically coded primary care data, transferred from paper records at the time of practice computerization (approximately mid-1990s), and prospectively recorded consultations since the mid-1990s, with coded: symptoms; diagnoses; measurements, such as blood pressure and spirometry; laboratory test results; drug prescriptions; secondary care referrals
Baseline | All participants: questionnaire, including smoking and alcohol use; DNA saliva sample. Postal participants only: self-measured height, weight, waist circumference. Examination by research professional only: height, weight, waist circumference, hip circumference and spirometry
Ongoing | Planned updates to primary care record linkage (detailed above), with consent to follow-up for 25 years, to track health longitudinally. Ongoing linkage to: admissions, accident and emergency attendances and outpatient appointments via hospital episode statistics; pathology data (East Midlands Pathology Service); Myocardial Ischaemia National Audit (MINAP)

Baseline data collection for all participants included a self-completion questionnaire which collected detailed information on current and past smoking habits, smoking cessation attempts, e-cigarette and shisha usage, environmental tobacco smoke (second-hand smoke) exposure and alcohol use. For those recruited via a face-to-face appointment, this was undertaken during the appointment. Those participating by post completed the questionnaire online using their own computer, with a paper version available if necessary. Height, weight and waist circumference were either measured by a research professional or self-reported by postal participants. Those recruited face-to-face also had their hip circumference measured and, where feasible, underwent spirometric measurement of lung function. Finally, a saliva sample was collected from all participants either at their appointment or returned by post, for extraction of DNA. The samples are stored at the NIHR Biocentre (Milton Keynes, UK), providing industrial-scale laboratory information management and automated robotic systems which have been shown to facilitate efficient error-free sample storage and extraction from freezers in the UK Biobank study. To date, genome-wide genotype data (using the Affymetrix UK Biobank Axiom Array) are available for 5216 participants after quality control, enabling analysis of over 40 million variants after imputation to the Haplotype Reference Consortium (HRC) panel.
Planned updates to linked primary care records will enable longitudinal tracking of health. There is also ongoing linkage to other sources of health data including: admissions, accident and emergency attendances and outpatient appointments via hospital episode statistics; pathology data (East Midlands Pathology Service); and the Myocardial Ischaemia National Audit (MINAP).

What has it found? Key findings and publications

Table 3 shows that, in general, our cohort is slightly healthier than average for common health risk factors and behaviours. This is similar to findings by other cohort studies. For example, the total proportion of participants who were overweight, obese or morbidly obese (64.2%) was slightly lower than similar age groups in Health Survey for England 2016, where it was above 70% for all ages from 45 upwards.
Table 3.

Prevalence of risk factors and health behaviours

Characteristic | n | %
Deprivation(a) (n = 9171) | |
  1 (most deprived) | 1204 | 13.1
  2 | 1112 | 12.1
  3 | 1659 | 18.1
  4 | 2507 | 27.3
  5 (least deprived) | 2689 | 29.3
BMI (n = 9285) | |
  Underweight (<18.5) | 98 | 1.1
  Normal (18.5–24.9) | 3221 | 34.7
  Overweight (25–29.9) | 3576 | 38.5
  Obese (30–39.9) | 2092 | 22.5
  Morbidly obese (≥40) | 298 | 3.2
Waist circumference (n = 9103) | |
  Low risk (males <94 cm; females <80 cm) | 2954 | 32.5
  Increased risk (males 94–102 cm; females 80–88 cm) | 2442 | 26.8
  High risk (males >102 cm; females >88 cm) | 3707 | 40.7
Smoking status(b) (n = 9381) | |
  Current smoker | 912 | 9.7
  Ex-smoker (regular or occasional) | 3678 | 39.2
  Never smoker | 4791 | 51.1
Alcohol intake (units/week) (n = 9335) | |
  None | 1792 | 19.2
  Lower risk (<14 u) | 4515 | 48.4
  Increasing risk (females 14–35 u; males 14–50 u) | 2374 | 25.4
  Higher risk (females >35 u; males >50 u) | 657 | 7.0

BMI, body mass index; u, units.

(a) Index of multiple deprivation national quintiles by postcode.

(b) Includes cigarettes, cigars, cigarillos, pipes or shisha.
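
The BMI bands in Table 3 can be expressed as a small classifier. This is a minimal sketch: the function name is hypothetical, and treating the bands as half-open intervals (for example, 24.95 counts as Normal) is an assumption, since the table lists cut points to one decimal place only.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify body mass index (kg/m^2) using the cut points in Table 3.

    Assumption: bands are half-open, e.g. Normal is 18.5 <= BMI < 25.
    """
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:
        return "Normal"
    if bmi < 30:
        return "Overweight"
    if bmi < 40:
        return "Obese"
    return "Morbidly obese"


# Example: 70 kg at 1.75 m gives a BMI of about 22.9.
print(bmi_category(70, 1.75))
```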

Similarly, the proportion of EXCEED participants who currently smoke is 9.7%, considerably lower than the national average (15.8%) and comparable only to the oldest age group (65 and over) in the national Annual Population Survey, among whom smoking prevalence was 8.3%. Smoking prevalence among all younger age groups nationally is 15% or above. On the other hand, the proportion of people who report never smoking is also lower than in national population surveys. This may be influenced by question wording and interpretation: whereas the relevant national survey asked if people had ever ‘regularly’ smoked, the EXCEED questionnaire included occasional use in the definition of ever smokers.
Table 4 presents more detailed information on smoking habits among current and ex-smokers. The vast majority of both current and ex-smokers reported smoking cigarettes, but cigar/cigarillo and pipe smoking was less common amongst current than ex-smokers. Alcohol intake for our cohort is comparable to that of similar age groups in Health Survey for England 2016.
Table 4.

Smoking history (self-reported by current or ex-smokers)

Characteristic | Current smokers: n | % | Ex-smokers: n | %
Type of tobacco used(a) | (n = 905) | | (n = 3675) |
  Cigarettes(b) | 857 | 93.6 | 3603 | 97.8
  Shisha | 1 | 0.1 | 6 | 0.2
  Cigars/cigarillos | 46 | 5.0 | 252 | 6.8
  Pipe | 15 | 1.6 | 147 | 4.0
  Other | 13 | 1.4 | 2 | 0.1
Use of electronic cigarettes | (n = 909) | | (n = 3675) |
  Ever | 238 | 26.2 | 204 | 5.6
  Never | 671 | 73.8 | 3471 | 94.4
Smoking cessation aids used (ever)(c) | (n = 377) | | (n = 3636) |
  NRT | 117 | 31.0 | 464 | 12.6
  Bupropion | 5 | 1.3 | 40 | 1.1
  Varenicline | 54 | 14.3 | 231 | 6.3
  Other | 48 | 12.7 | 304 | 8.3
  None | 193 | 51.2 | 2655 | 72.2

Characteristic | Current smokers: mean | SD | Ex-smokers: mean | SD
Pack-years of smoking(d) | 27.2 (n = 520) | 18.9 | 18.4 (n = 3167) | 20.6
Cigarettes per day | 13.5 (n = 562) | 8.6 | 14.8 (n = 3197) | 11.9
Age at smoking initiation (years) | 18.4 (n = 776) | 5.8 | 17.1 (n = 3645) | 3.8

(a) People may use more than one type of tobacco, so percentages will not add up to 100.

(b) Filtered, unfiltered and hand-rolled.

(c) Only for quit attempts lasting at least 6 months. Denominator for percentages is current smokers who have made a quit attempt lasting at least 6 months, or total number of ex-smokers. People may have used more than one aid, so percentages will not add up to 100.

(d) Only for cigarette smokers.
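
Table 4 reports pack-years without giving a formula. A common definition, assumed here, is packs smoked per day (taking 20 cigarettes as one pack) multiplied by the number of years smoked; the study's exact derivation from its questionnaire items is not described and may differ.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = 20) -> float:
    """Pack-years = (cigarettes per day / cigarettes per pack) * years smoked.

    Assumption: a pack contains 20 cigarettes, the conventional definition.
    """
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day / cigarettes_per_pack * years_smoked


# Example: 10 cigarettes a day for 30 years.
print(pack_years(10, 30))
```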

Only 25.2% of participants are in the two most deprived national quintiles and 29.3% are in the least deprived quintile. For Leicester City, 75.9% of the population are in the two most deprived quintiles and only 1.4% are in the least deprived quintile. Though this reflects the whole Leicester population, not just those aged 40–69 and registered with the general practices that agreed to take part in EXCEED, it indicates that the most deprived communities are under-represented in the cohort.
The Quality and Outcomes Framework (QOF), introduced in 2004, aims to improve the quality of care patients are given by rewarding practices for meeting specified standards of care. Prevalence of 16 chronic conditions prioritized for management in primary care by QOF is presented in Table 5. The figures presented are based on presence of any qualifying diagnostic code [QOF business rules v37, 2017/18] at any time in the patient’s record, with no further exclusions or restrictions. For those conditions where the national QOF prevalence is calculated in a comparable way, prevalence in EXCEED is generally slightly higher and in some cases markedly so. For example, prevalence of hypertension in EXCEED was 28.2% compared with 13.9% nationally. This is likely to be largely due to our older population, since QOF covers all ages.
Table 5.

Prevalence of chronic conditions

Condition | n(a) | %
Atrial fibrillation | 243 | 2.7
Asthma | 1138 | 12.7
Cancer | 619 | 6.9
Coronary heart disease | 393 | 4.4
Chronic kidney disease (3a-5) | 280 | 3.1
Chronic obstructive pulmonary disease | 301 | 3.4
Depression | 2023 | 22.7
Diabetes | 826 | 9.2
Epilepsy | 103 | 1.2
Heart failure | 107 | 1.2
Hypertension | 2516 | 28.2
Mental health (psychosis, schizophrenia and bipolar affective disorder) | 71 | 0.8
Osteoporosis | 285 | 3.2
Peripheral arterial disease | 51 | 0.6
Rheumatoid arthritis | 122 | 1.4
Stroke | 108 | 1.2

(a) Number of participants with one occurrence at any time of a diagnostic code listed in the Quality and Outcomes Framework for that condition; % is out of all participants for whom primary care data were available (8930).

The number of conditions per individual is summarized in Table 6. We found that, overall, 27.4% of our participants had a recorded diagnostic code for more than one QOF condition (2449 of the 8930 participants with primary care data). This is in line with findings from a large study of almost 100 000 individuals in the Clinical Practice Research Datalink by Salisbury and colleagues, who used a similar approach to define multimorbidity. They found that 16% of their population had a code for more than one QOF condition, but this rose sharply with age, reaching around 20% among 55- to 64-year-olds and over 30% in 65- to 74-year-olds. Two further large UK-based studies have used more comprehensive lists of conditions to define multimorbidity, but limited their focus to active morbidity only, and found prevalence of multimorbidity between 23.2% and 27.2% across all ages, rising substantially with age to 50% or more among 65- to 74-year-olds.
Table 6.

Proportion of participants with multiple chronic conditions(a)

Number of chronic conditions | n | %(b)
1 | 2942 | 32.9
2 | 1531 | 17.1
3 | 610 | 6.8
4 | 218 | 2.4
5 | 70 | 0.8
6 or more | 20 | 0.2

(a) 16 chronic conditions prioritized for management in primary care by the Quality and Outcomes Framework (see Table 5).

(b) Of participants with primary care data.
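
The share of participants with more than one QOF condition follows from summing the rows of Table 6. The sketch below uses the counts from the table and the denominator of 8930 participants with primary care data; the function name and the dictionary layout are illustrative.

```python
# Counts from Table 6; the key 6 stands for "6 or more" conditions.
CONDITION_COUNTS = {1: 2942, 2: 1531, 3: 610, 4: 218, 5: 70, 6: 20}
PARTICIPANTS_WITH_PRIMARY_CARE_DATA = 8930


def share_with_at_least(k, counts=CONDITION_COUNTS,
                        denominator=PARTICIPANTS_WITH_PRIMARY_CARE_DATA):
    """Return (n, percent) of participants with at least k recorded conditions."""
    n = sum(v for conditions, v in counts.items() if conditions >= k)
    return n, round(100 * n / denominator, 1)


# Multimorbidity defined as two or more of the 16 QOF conditions.
print(share_with_at_least(2))
```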

We specifically examined primary care diagnoses of one condition, COPD, for which we had independent diagnostic information from baseline spirometry. Diagnosis of COPD defined by presence of a COPD code in primary care data compared with COPD defined by baseline spirometry results indicates that there is underdiagnosis of COPD in EXCEED participants: 84.8% of those with GOLD stage 1–4 COPD and 71.9% of GOLD stage 2–4 were undiagnosed (Table 7). Existing estimates of the proportion of COPD which is undiagnosed range from around 60% to over 80%, depending on the setting and population studied. Using comparable methodology in a similar population to ours, Shahab and colleagues found that 81.2% of those with spirometric COPD had no respiratory diagnosis at all and over 95% had not been diagnosed with COPD. The slightly lower level of underdiagnosis in EXCEED (84.8% of those with spirometric COPD had not received a COPD diagnosis) may be partially attributable to recent improvements in case-finding, and to our use of primary care records rather than self-report to define diagnoses. Reasons posited for such extensive underdiagnosis include a perception among clinicians that COPD is solely a disease of elderly smokers, pessimistic views of treatment, lack of availability or underuse of spirometry, and the unreliability of self-reported smoking status in clinical practice.
Table 7.

Comparison of diagnosis of COPD as defined by COPD codes in primary care data and defined by baseline spirometry(a)

COPD defined by baseline spirometry using GOLD criteria

COPD code in primary care | GOLD 1–4 Yes: n | % | GOLD 1–4 No: n | % | GOLD 2–4 Yes: n | % | GOLD 2–4 No: n | %
Yes | 86 | 15.2 | 19 | 0.7 | 76 | 28.1 | 29 | 1.0
No | 479 | 84.8 | 2643 | 99.3 | 194 | 71.9 | 2928 | 99.0

(a) For participants with linked primary care data and baseline spirometry measures (n = 3227). All percentages are column percentages.
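
The column percentages in Table 7 can be recomputed from the four cell counts. The sketch below treats baseline spirometry as the reference standard, which is an assumption made for illustration, and reports the proportion of spirometric COPD without a primary care code alongside the sensitivity and specificity of the code; the function and key names are hypothetical.

```python
def underdiagnosis_summary(coded_copd, coded_no_copd, uncoded_copd, uncoded_no_copd):
    """Summarise a 2x2 comparison of primary care COPD codes against spirometry.

    Arguments are cell counts: "coded"/"uncoded" refers to the primary care
    record, "copd"/"no_copd" to the spirometric definition (assumed reference).
    Returns percentages rounded to one decimal place.
    """
    with_copd = coded_copd + uncoded_copd
    without_copd = coded_no_copd + uncoded_no_copd
    return {
        "undiagnosed_pct": round(100 * uncoded_copd / with_copd, 1),
        "sensitivity_pct": round(100 * coded_copd / with_copd, 1),
        "specificity_pct": round(100 * uncoded_no_copd / without_copd, 1),
    }


# GOLD 1-4 and GOLD 2-4 columns of Table 7.
gold_1_4 = underdiagnosis_summary(86, 19, 479, 2643)
gold_2_4 = underdiagnosis_summary(76, 29, 194, 2928)
print(gold_1_4)
print(gold_2_4)
```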

To demonstrate the utility of EXCEED for enabling cross-sectional or longitudinal studies of quantitative traits, we examined some of the most common measures available in the primary care data and the numbers of participants with one or more recordings of these measures (Table 8). Table 9 shows the average values of these measures. For example, 98.0% of participants have two or more recordings of blood pressure and over 90% have four or more recordings. Mean systolic blood pressure was 129.9 [standard deviation (sd) 13.9] and mean diastolic blood pressure was 78.3 (sd 8.6) (Table 9).
Table 8.

Numbers of participants with multiple occurrences of a Read code for a quantitative measurement

Measure | ≥1 record: n | % | ≥2 records: n | % | ≥3 records: n | % | ≥4 records: n | %
Blood pressure reading | 8895 | 99.6 | 8748 | 98.0 | 8487 | 95.0 | 8093 | 90.6
Serum creatinine | 8422 | 94.3 | 7492 | 83.9 | 6468 | 72.4 | 5599 | 62.7
Serum sodium | 8417 | 94.3 | 7477 | 83.7 | 6447 | 72.2 | 5582 | 62.5
Serum potassium | 8389 | 93.9 | 7435 | 83.3 | 6400 | 71.7 | 5546 | 62.1
Serum urea level | 8415 | 94.2 | 7460 | 83.5 | 6414 | 71.8 | 5537 | 62.0
eGFR(a) | 8227 | 92.1 | 7045 | 78.9 | 5883 | 65.9 | 4946 | 55.4
Serum triglyceride levels | 8389 | 93.9 | 6950 | 77.8 | 5633 | 63.1 | 4669 | 52.3
Serum cholesterol level | 8287 | 92.8 | 6852 | 76.7 | 5586 | 62.6 | 4680 | 52.4
Platelet count | 8054 | 90.2 | 6918 | 77.5 | 5827 | 65.3 | 4873 | 54.6
Serum HDL cholesterol level | 8340 | 93.4 | 6770 | 75.8 | 5411 | 60.6 | 4412 | 49.4
Serum LDL cholesterol level | 8050 | 90.1 | 6415 | 71.8 | 5054 | 56.6 | 4091 | 45.8
Serum bilirubin level | 7074 | 79.2 | 5889 | 65.9 | 4835 | 54.1 | 3989 | 44.7
Haemoglobin A1c level | 6782 | 75.9 | 4800 | 53.8 | 3342 | 37.4 | 2383 | 26.7
Total white blood count | 7991 | 89.5 | 6804 | 76.2 | 5674 | 63.5 | 4731 | 53.0
Eosinophil count | 8030 | 89.9 | 6860 | 76.8 | 5728 | 64.1 | 4761 | 53.3

HDL and LDL, high- and low-density lipoprotein.

(a) GFR, glomerular filtration rate calculated by abbreviated Modification of Diet in Renal Disease Study Group calculation.

Table 9.

Summary of values for selected measures(a)

Term | n(b) | % | Mean | sd
Blood pressure reading (systolic, mmHg) | 8895 | 99.6 | 129.9 | 14.0
Blood pressure reading (diastolic, mmHg) | 8895 | 99.6 | 78.1 | 8.8
Serum creatinine level (µmol/L) | 8422 | 94.3 | 74.1 | 21.3
Serum sodium level (mmol/L) | 8417 | 94.3 | 140.2 | 2.2
Serum potassium level (mmol/L) | 8389 | 93.9 | 4.4 | 0.4
Serum urea level (mmol/L) | 8415 | 94.2 | 5.7 | 1.6
eGFR(c) (mL/min/1.73 m^2) | 8227 | 92.1 | 82.4 | 10.5
Serum triglyceride levels (mmol/L) | 8389 | 93.9 | 1.5 | 0.8
Serum cholesterol level (mmol/L) | 8287 | 92.8 | 5.1 | 1.1
Platelet count observation (× 10^9/L) | 8054 | 90.2 | 252.5 | 64.7
Serum HDL cholesterol level (mmol/L) | 8340 | 93.4 | 1.6 | 0.5
Serum LDL cholesterol level (mmol/L) | 8050 | 90.1 | 2.9 | 0.9
Serum bilirubin level (µmol/L) | 7074 | 79.2 | 10.5 | 6.8
Haemoglobin A1c level (%) | 6782 | 75.9 | 5.7 | 0.8

Term | n(b) | % | Median | IQR
Total white blood count (× 10^9/L) | 7991 | 89.5 | 6.2 | 5.2–7.4
Eosinophil count (× 10^9/L) | 8030 | 89.9 | 0.16 | 0.10–0.24

(a) Where participants have more than one recording of a measure, the most recent value for each participant was used.

(b) Number of participants for whom values were available.

(c) Glomerular filtration rate calculated by abbreviated Modification of Diet in Renal Disease Study Group calculation.
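
The two summaries in Tables 8 and 9 (how many participants have at least k recordings of a measure, and the most recent value per participant) can be derived from a flat list of coded measurements. This is a minimal sketch: the record layout `(participant_id, date, value)`, the function name and the toy data are all hypothetical and do not reflect the study's actual data model.

```python
from collections import defaultdict
from datetime import date


def summarise_recordings(records, max_k=4):
    """Summarise repeated measurements of one measure.

    records: iterable of (participant_id, date, value) tuples (hypothetical layout).
    Returns (latest, at_least) where latest maps each participant to their most
    recent value and at_least[k] counts participants with >= k recordings.
    """
    by_participant = defaultdict(list)
    for participant_id, when, value in records:
        by_participant[participant_id].append((when, value))

    # Most recent recording per participant, as used for Table 9.
    latest = {
        pid: max(rows, key=lambda row: row[0])[1]
        for pid, rows in by_participant.items()
    }
    # Participants with at least k recordings, as counted in Table 8.
    at_least = {
        k: sum(1 for rows in by_participant.values() if len(rows) >= k)
        for k in range(1, max_k + 1)
    }
    return latest, at_least


# Toy systolic blood pressure readings for two invented participants.
toy_records = [
    ("p1", date(2015, 1, 5), 130.0),
    ("p1", date(2017, 3, 2), 126.0),
    ("p2", date(2016, 6, 1), 141.0),
]
latest, at_least = summarise_recordings(toy_records, max_k=2)
print(latest, at_least)
```

Keeping the full list of dated values per participant, rather than only the latest one, leaves the same structure usable for longitudinal analyses of repeat measures.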


Recall-by-phenotype study

Recalling by phenotype (see Figure 3) facilitates in-depth study of disease mechanisms, with a reduced risk of bias as with nested case-control studies. One such sub-study has recalled EXCEED participants to take part in a study examining the microbiome in COPD cases and in smoking and non-smoking controls.
Figure 3.

Examples of potential recall-by-phenotype (top) and recall-by-genotype studies (bottom).


Potential for recall-by-genotype studies

Future recall-by-genotype studies (Figure 3) are expected to contribute to a deeper understanding of genetic variants which may be potential therapeutic targets, by bringing back participants for detailed assessments on the basis of the known or suspected mechanism of the relevant gene. Such recall-by-genotype sub-studies may investigate disease susceptibility, disease progression or drug response, and though they could be interventional in design, most will be observational studies. Because genotype is fixed at conception, observational studies of this kind can, as in Mendelian randomization, provide evidence which is less susceptible to reverse causation and to confounding by lifestyle factors. Nested designs are also feasible: these do not rely on recall of participants and could be undertaken quickly and inexpensively using stored biological samples and linked electronic data, selecting samples based on either phenotype or genotype. Small-scale intervention-by-genotype studies could, for example, evaluate response to a treatment with a known safety profile in participants with a specific genetic variant.

What are the main strengths and weaknesses?

Linkage to EHR is a key strength of EXCEED, enabling the study of a wide range of risk factors and diseases, even where data have not been specifically collected at baseline or precede enrolment as a study participant. UK general practice has had over 20 years of near-universal computerized records. These records have been further enhanced with the introduction of the QOF in 2004, which incentivized GPs to keep comprehensive records of several chronic diseases. Some of these indicators incentivize the recording of quantitative traits relevant to the chronic disease diagnosed, such as blood pressure, lung function, estimated glomerular filtration rate, glycated haemoglobin (HbA1c) and cholesterol measures. That these are expected to be recorded approximately annually means that registered patients often have many repeat measures within linked EHRs, providing an excellent opportunity to study trends in control of conditions such as hypertension or progression of diseases such as COPD. Previous studies have validated some of these primary care measures; for example, routinely recorded spirometry has shown good validity when compared with study-specific measures. Other more complex longitudinal outcomes (for example, those related to healthy ageing) can also be measured using EHRs.
The use of EHR can have limitations. Misclassification and miscoding of diagnoses may occur, and it is particularly likely that the true prevalence of many diseases will be underestimated (the ‘clinical iceberg’), as demonstrated by a comparison of COPD diagnoses in primary care data in EXCEED with COPD from spirometry (Table 7). However, the availability of repeat recordings and multiple types of data (including examination findings, pathology results and onwards referrals) over a long period of time can be used to improve and validate the classification of diagnoses and other important exposures and outcomes.
Many disease definitions have been validated already—for example, definitions of COPD and asthma in the GOLD-CPRD database—and EXCEED will contribute further to this important area of study. In addition to disease status validation, combining records of drug prescriptions and diagnostic and symptom codes can be used to define complex phenotypes that it has not been possible to study previously. The cohort recruited adults aged between 30 and 69 years, mostly aged 40 or over, and therefore permits the study of a wide range of questions pertaining to health and disease in adulthood. The absence of younger participants renders it less suitable to study the evolution of disease before age 40. This is mitigated by the availability of linked health care data from EHR. These records include data prospectively coded by primary care practitioners from the mid-1990s onwards, and will therefore include extensive data from early adulthood for those in middle age when recruited. Earlier life events, for example in childhood and adolescence, are likely to be captured only when they were transferred from paper to electronic health care records in the early 1990s, and will therefore include major events such as childhood pneumonia but not more minor illnesses. The very elderly are also currently absent from the cohort, but data will become available through follow-up of those recruited at the older end of the age spectrum. More generally, the cohort’s age, sex and ethnicity distribution will influence generalizability of research findings to other population groups, and for some research questions, validation in other cohorts may be required. Minority ethnic groups, notably Leicester’s Asian and Asian British population, are currently under-represented in EXCEED. This reflects the recruitment methods used to date. 
We have extended recruitment to increase minority ethnic participant numbers and have adapted our recruitment methods to achieve this, for example by undertaking recruitment at community events. Minority ethnic groups are also substantially under-served in the availability of samples with genome-wide genotype data worldwide. Although the situation has improved in recent years for Asian populations, only 14% of individuals included in genome-wide association studies worldwide up to 2016 were from Asian backgrounds. This situation is replicated in UK-based studies. In UK Biobank, only 2% of participants are from Asian or Asian British ethnic groups, despite this group representing around 7% of the UK population. It is essential that representation of minority ethnic groups increases substantially in genomic studies if these communities are to realise the benefits of genomically-informed advances in precision medicine. EXCEED aims to contribute towards this important goal. The utility of combining EHR and genetic data for efficient and flexible genetic studies has been highlighted by the eMERGE network of biobanks and Geisinger MyCode. The comprehensive nature and near-universal coverage of NHS health records add further strength to this study design. In particular, the ability to capture virtually all primary and secondary care contacts over decades of the lifespan enables longitudinal studies with a depth of data available in relatively few studies. Strengths of the study also include consent from all participants to be contacted to participate in recall-by-genotype studies, a type of consent which is not yet widely sought in cohort studies. Recall-by-genotype studies are expected to be highly valuable to identify and validate drug targets and to inform targeting of therapeutics in a precision medicine approach. Maintaining the engagement of cohort participants is important for such studies.
Potential strategies include incentivizing and/or reducing barriers to further participation, building a study ‘community’ through study branding, newsletters and events, and efforts to trace participants where contact details have lapsed. In EXCEED, in addition to a study newsletter, we are devising approaches with our patient and public involvement (PPI) group, including planned focus groups on dynamic consent approaches. Some studies incorporating genetic analyses (such as Genomics England) actively seek clinically actionable variants, whereas most cohort studies may not seek to identify these but may discover them as incidental findings. Anticipating this potential, at the time of consent we asked whether participants would wish to be notified about clinically actionable variants; 99.5% of participants stated that they would wish to be informed in this situation. Clinically actionable variants will be discussed with the regional clinical genetics department of University Hospitals Leicester NHS Trust and then reported back to participants on request for NHS validation. Understanding the reasons for participants’ preferences, how these change over time and how these can best be supported by future policies and procedures, will be of key importance for EXCEED and other longitudinal cohort studies.

Can I get hold of the data? Where can I find out more?

Participants have consented to their pseudonymized data being made available to other approved researchers, and we welcome requests for collaboration and data access. Access to the resource requires completion of a proposal form, including a lay summary of the proposed research. Applications to access the resource will be assessed for consistency with the data access policy and with the guidance of the Scientific Committee, which has participant representation. Access to the data will be subject to completion of an appropriate Data/Materials Transfer Agreement and to necessary funding being in place. Requests to collect new data or to use biological samples may be subject to additional requirements. Interested researchers are encouraged to contact the study management team via exceed@le.ac.uk.

Profile in a nutshell

EXCEED is a longitudinal population-based cohort which facilitates investigation of genetic, environmental and lifestyle-related determinants of a broad range of diseases and of multiple morbidity, through data collected at baseline and via electronic health care record linkage. Recruitment has taken place in Leicester, Leicestershire and Rutland since 2013 and is ongoing, with 10 156 participants aged 30-69 to date. The population of Leicester is diverse and additional recruitment from Black, Asian and minority ethnic (BAME) communities is ongoing. Participants have consented to follow-up for up to 25 years through electronic health records (EHR). Data available include baseline demographics, anthropometry, spirometry, lifestyle factors (smoking and alcohol use) and longitudinal health information from primary care records, with additional linkage to other EHR datasets planned. Participants have consented to be contacted for recall-by-genotype and recall-by-phenotype sub-studies, providing an important resource for precision medicine research. We welcome requests for collaboration and data access; please contact the study management team via exceed@le.ac.uk.

Funding

The study has been supported by the University of Leicester, the NIHR Leicester Biomedical Research Centre, the NIHR Clinical Research Network East Midlands, Leicester City Council, the Medical Research Council (grant G0902313 to MDT), the Wellcome Trust (grant 202849 to MDT) and a respiratory genomic collaboration with GSK. C.J. holds a Medical Research Council Clinical Research Training Fellowship (MR/P00167X/1). C.B. holds a UKRI Innovation Fellowship at Health Data Research UK (MR/S003762/1). L.V.W. holds a GSK/British Lung Foundation Chair in Respiratory Research (grant C17-1).