Elodie Caboux, Christophe Lallemand, Gilles Ferro, Bertrand Hémon, Maimuna Mendy, Carine Biessy, Matt Sims, Nick Wareham, Abigail Britten, Anne Boland, Amy Hutchinson, Afshan Siddiq, Paolo Vineis, Elio Riboli, Isabelle Romieu, Sabina Rinaldi, Marc J Gunter, Petra H M Peeters, Yvonne T van der Schouw, Ruth Travis, H Bas Bueno-de-Mesquita, Federico Canzian, Maria-José Sánchez, Guri Skeie, Karina Standahl Olsen, Eiliv Lund, Roberto Bilbao, Núria Sala, Aurelio Barricarte, Domenico Palli, Carmen Navarro, Salvatore Panico, Maria Luisa Redondo, Silvia Polidoro, Laure Dossus, Marie Christine Boutron-Ruault, Françoise Clavel-Chapelon, Antonia Trichopoulou, Dimitrios Trichopoulos, Pagona Lagiou, Heiner Boeing, Eva Fisher, Rosario Tumino, Claudia Agnoli, Pierre Hainaut.
Abstract
The European Prospective Investigation into Cancer and nutrition (EPIC) is a long-term, multi-centric prospective study in Europe investigating the relationships between cancer and nutrition. This study has served as a basis for a number of Genome-Wide Association Studies (GWAS) and other types of genetic analyses. Over a period of 5 years, 52,256 EPIC DNA samples have been extracted using an automated DNA extraction platform. Here we have evaluated the pre-analytical factors affecting DNA yield, including anthropometric, epidemiological and technical factors such as center of subject recruitment, age, gender, body-mass index, disease case or control status, tobacco consumption, number of aliquots of buffy coat used for DNA extraction, extraction machine or procedure, DNA quantification method, degree of haemolysis and variations in the timing of sample processing. We show that the largest significant variations in DNA yield were observed with degree of haemolysis and with center of subject recruitment. Age, gender, body-mass index, cancer case or control status and tobacco consumption also significantly impacted DNA yield. Feedback from laboratories which have analyzed DNA with different SNP genotyping technologies demonstrates that the vast majority of samples (approximately 88%) performed adequately in different types of assays. To our knowledge this study is the largest to date to evaluate the sources of pre-analytical variations in DNA extracted from peripheral leucocytes. The results provide a strong evidence-based rationale for standardized recommendations on blood collection and processing protocols for large-scale genetic studies.
Year: 2012 PMID: 22808065 PMCID: PMC3396633 DOI: 10.1371/journal.pone.0039821
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
DNA extractions generated in the course of 12 distinct projects using specimens of the EPIC cohort.
| Study code | Study name | Objectives | Number of DNA extractions |
| BLAD | Participation in GWAS for bladder cancer | Nested case-control study aimed at identifying novel genetic variants which are worthy of intensive pursuit in epidemiological, genetic mapping, clinical and laboratory investigations on bladder cancer. | 950 |
| BRCD | Participation in the Breast and Prostate Cancer Cohort Consortium (BPC3) – Breast cancer component | Nested case-control study aimed at the analysis of genes related to steroid hormone and insulin-like growth factor-1 metabolism and breast cancer risk in EPIC, which is part of the NCI breast and prostate cancer cohort consortium and GWAS study of ER-negative breast cancer. | 8071 |
| CORD | The Influence of Vitamin D and Polymorphisms of the Vitamin D Receptor and Calcium Sensing Receptor on Colorectal Cancer Risk | Nested case-control study aimed at evaluating the roles of both vitamin D (important in calcium homeostasis/cell cycle kinetics) and calcium (role in cell cycle kinetics) in colorectal cancer prevention. | 2177 |
| EGAD | Genetic susceptibility, environmental factors and the gastric cancer risk in European populations (EUR-GAST II) | Nested case-control study aimed at (a) evaluating the effect of dietary and environmental exposures by histological and anatomical subtypes of gastric cancer; (b) evaluating the effect of dietary and environmental factors on esophageal adenocarcinomas; (c) evaluating the main effect of genetic polymorphisms in several candidate genes. | 1444 |
| EPHD | Study of the interplay of genetic, biochemical and lifestyle factors in coronary heart disease (EPIC-HEART) | Nested case-control study aimed at investigating the separate and combined influences of genetic, biochemical and major lifestyle factors (notably diet) on the incidence of coronary heart disease (CHD). | 7643 |
| HPVD | HPV and cervical: the role of diet, environmental and infectious cofactors, and genetic susceptibility | Nested case-control study aimed at evaluating the association between serological markers of HPV infection and cervical cancer as well as the role in cervical carcinogenesis of: (a) environmental cofactors (diet, tobacco, parity, use of hormonal contraceptives), (b) infectious cofactors (HSV-2 and | 664 |
| INTD | Examination of the interaction of genetic and lifestyle factors on the incidence of type 2 diabetes (INTERACT) | Nested case-control study aimed at evaluating gene-lifestyle interactions in relation to type 2 diabetes. | 18439 |
| KIDD | Genome Wide Association Study of kidney cancer | The aims of this study are to (i) immediately replicate approximately the top 30 variants in a large follow-up series, and (ii) substantially replicate between 20,000 and 317,000 variants following the GWAS of kidney cancer recently completed involving 1400 cases and 2800 controls from an IARC Central Europe study. | 792 |
| LUND | DNA methylation changes associated with cancer risk factors and blood levels of vitamin metabolites | The aim of this study is to investigate the contribution of common human genetic variation to susceptibility to lung cancer. The association between lung cancer and DNA methylation patterns in a panel of candidate genes is examined. It is also investigated whether blood levels of vitamin metabolites modify DNA methylation levels in blood cells. DNA methylation levels are quantitatively determined in blood cells of nested cases and controls. | 2450 |
| LYMD | EPIC Nested case-control investigation on lymphomas | Nested case-control study aimed at elucidating whether risk factors for lymphoma exert their effect by modulation of the immune system by studying the inherited and acquired immune response in non-Hodgkin lymphoma (NHL) cases and controls. | 1789 |
| PAND | Genome Wide Association Study and pancreatic cancer (PanScan) | Nested case-control study aimed at conducting a whole genome scan (WGS) of common genetic variants to identify genetic markers of susceptibility to pancreatic cancer. | 504 |
| PROD | Participation in the Breast and Prostate Cancer Cohort Consortium (BPC3) – Prostate cancer component | Nested case-control study aimed at the analysis of genes related to steroid hormone and insulin-like growth factor-1 metabolism and prostate cancer risk in EPIC, which is part of the NCI breast and prostate cancer cohort consortium and GWAS study of aggressive prostate cancer. | 2238 |
Technical, epidemiological and anthropometric factors analyzed for evaluation of DNA yield variations.
| Variables | | N | % |
| Gender | Men | 18680 | 39.6 |
| | Women | 28481 | 60.4 |
| Age | <45 | 7054 | 15.0 |
| | 45–49 | 6631 | 14.1 |
| | 50–54 | 8835 | 18.7 |
| | 55–59 | 9514 | 20.1 |
| | 60–64 | 8567 | 18.2 |
| | ≥65 | 6560 | 13.9 |
| Body Mass Index | Normal (<25) | 14449 | 30.7 |
| | Moderate pre-obesity (25–27.5) | 12784 | 27.1 |
| | Overweight (27.5–30) | 7185 | 15.2 |
| | Moderate obesity (30–35) | 8953 | 19.0 |
| | Obesity (≥35) | 2934 | 6.2 |
| | Missing | 856 | 1.8 |
| Cancer | Incident | 10954 | 23.2 |
| | Non Incident | 36207 | 76.8 |
| | Prevalent | 1311 | 2.8 |
| | Non Prevalent | 45850 | 97.2 |
| Time from blood collection to incident cancer diagnosis | <2 years | 1813 | 3.8 |
| | 2–5 years | 3299 | 7.0 |
| | 5–10 years | 4460 | 9.5 |
| | ≥10 years | 1081 | 2.3 |
| | Missing | 36508 | 77.4 |
| Time from prevalent cancer diagnosis to blood collection | <2 years | 240 | 0.5 |
| | 2–5 years | 298 | 0.6 |
| | 5–10 years | 340 | 0.7 |
| | ≥10 years | 427 | 0.9 |
| | Missing | 45856 | 97.3 |
| Tobacco consumption | Never | 21290 | 45.1 |
| | Former | 13847 | 29.4 |
| | Current | 11067 | 23.5 |
| | Missing | 957 | 2.0 |
| Number of straws | 1 | 11838 | 25.1 |
| | 2 | 35323 | 74.9 |
| Extraction method | Extractor LS1 | 29441 | 62.4 |
| | Extractor LS2 | 17305 | 36.7 |
| | Manual | 415 | 0.9 |
| Quantification method | Nanodrop | 33805 | 71.7 |
| | Picogreen | 13356 | 28.3 |
| Haemolysis | Yes | 3337 | 7.1 |
| | No | 21379 | 45.3 |
| | Missing | 22445 | 47.6 |
| Haemolysis gradient | Light haemolysis | 2870 | 6.08 |
| | Medium haemolysis | 445 | 0.94 |
| | Heavy haemolysis | 20 | 0.05 |
| | Missing | 43826 | 92.93 |
| Time from collection to refrigeration | <5 min | 2704 | 5.7 |
| | 5 min – 1 hour | 5421 | 11.5 |
| | 1–3 hours | 3934 | 8.3 |
| | >3 hours | 3987 | 8.5 |
| | Missing | 31115 | 66.0 |
| Time from refrigeration to centrifugation | <1.5 hours | 498 | 1.0 |
| | 1.5–2 hours | 3482 | 7.4 |
| | 2–6 hours | 2153 | 4.6 |
| | ≥6 hours | 1865 | 4.0 |
| | Missing | 39163 | 83.0 |
| Time from centrifugation to freezing | <45 min | 6179 | 13.1 |
| | 45–59 min | 6296 | 13.4 |
| | 1–2 hours | 6800 | 14.4 |
| | ≥2 hours | 7598 | 16.1 |
| | Missing | 20288 | 43.0 |
First incident cancer case.
Last prevalent cancer case.
Effects of individual characteristics and processing variations on DNA yield (µg).
| | Estimated coefficient for effect (c) | SE | P value |
| | | | |
| Men | reference | | |
| Women | 1.437 | 0.388 | <0.01 |
| | −0.107 | 0.020 | <0.01 |
| | 0.390 | 0.039 | <0.01 |
| | | | |
| Never smoker | reference | | |
| Former smoker | 0.366 | 0.375 | 0.33 |
| Current smoker | 10.871 | 0.515 | <0.01 |
| | | | |
| No | reference | | |
| Yes | 2.494 | 0.363 | <0.01 |
| | | | |
| No | reference | | |
| Yes | 1.252 | 1.157 | 0.28 |
| | | | |
| One straw | reference | | |
| Two straws | 30.276 | 0.489 | <0.01 |
| | | | |
| Autopure LS 1 | reference | | |
| Autopure LS 2 | −2.439 | 0.434 | <0.01 |
| Manual | −6.757 | 0.994 | <0.01 |
| | | | |
| Nanodrop | reference | | |
| Picogreen | 6.449 | 0.516 | <0.01 |
| | | | |
| No | reference | | |
| Yes | −7.895 | 0.814 | <0.01 |
| | | | |
| Light haemolysis | reference | | |
| Medium haemolysis | −5.370 | 1.685 | <0.01 |
| Heavy haemolysis | −9.509 | 9.221 | 0.30 |
| | 0.482 | 0.213 | 0.02 |
| | 0.227 | 0.036 | <0.01 |
| | 0.057 | 0.138 | 0.68 |
| | −0.138 | 0.116 | 0.24 |
| | 0.358 | 0.165 | 0.03 |
The estimated coefficients for effect reflect the increasing (positive value) or decreasing (negative value) concentration response to the lifestyle/exposure factor, adjusted for the other lifestyle/exposure factors.
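Because each coefficient is adjusted for the other factors, the effects of several factors can be combined by simple addition, provided the underlying model is additive with no interaction terms. The record does not state the model form, so that is an assumption here. The sketch below is illustrative only: the factor names used as dictionary keys (`gender`, `tobacco`, and so on) are hypothetical labels chosen for this example, and only levels that are named in the table are included.

```python
# Coefficients (µg) copied from the table above; reference levels are 0.
# Assumption: the model is additive, with no interaction terms.
# The factor keys below are hypothetical labels, not taken from the source.
COEFFICIENTS = {
    "gender": {"Men": 0.0, "Women": 1.437},
    "tobacco": {"Never smoker": 0.0, "Former smoker": 0.366, "Current smoker": 10.871},
    "straws": {"One straw": 0.0, "Two straws": 30.276},
    "extraction": {"Autopure LS 1": 0.0, "Autopure LS 2": -2.439, "Manual": -6.757},
    "quantification": {"Nanodrop": 0.0, "Picogreen": 6.449},
}


def yield_difference(profile):
    """Expected yield difference (µg) relative to the all-reference profile.

    Factors missing from `profile` are treated as being at their reference level.
    """
    return sum(COEFFICIENTS[factor][level] for factor, level in profile.items())


example = {
    "gender": "Women",
    "tobacco": "Current smoker",
    "straws": "Two straws",
    "extraction": "Autopure LS 2",
    "quantification": "Picogreen",
}
diff = yield_difference(example)
print(f"Expected difference vs. reference profile: {diff:.3f} µg")
```

The result is a difference from the reference profile, not an absolute yield, since the intercept is not reported in the table.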
Figure 1. Distribution of yield (µg) for DNA samples extracted with 2 aliquots of buffy coat.
Representation of DNA yield for DNA extractions performed from 2 buffy coat aliquots. Boxes extend from 25th to 75th percentiles and are divided by a solid line representing the median of each center. Whiskers extend from lower to upper adjacent values as defined by Tukey. Outliers are denoted by a dot.
Inclusion criteria for genotyping projects.
| Project | Number of samples | Criteria: quantity | Criteria: concentration | Excluded: insufficient yield | Excluded: low concentration | % of samples failed | % of samples qualified for genotyping |
| Kidney (KIDD) | 258 | 50 ng | 50 ng/µl | 0 | 0 | 0 | 100 |
| PanScan (PAND) | 489 | 1250 ng | 25 ng/µl | n/a | 18 | 3.68 | 96.32 |
| BPC3 (BRCD+PROD) | 5684 | 250 ng | 50 ng/µl | 0 | 0 | 0 | 100 |
| Interact (INTD) | 20794 | 10 µg | 50 ng/µl | 433 | 2889 | 15.98 | 84.02 |
Samples not having the required amount of DNA (with less than 10 µg of DNA) were sent to the laboratory.
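The failure and qualification percentages in the inclusion-criteria table follow directly from the exclusion counts. A minimal sketch of that arithmetic is shown here, under two assumptions: the "n/a" entry for PanScan is treated as zero exclusions, and the two exclusion categories are taken to be non-overlapping so that they can be summed.

```python
# Counts copied from the inclusion-criteria table above.
# Assumption: "n/a" (PanScan, insufficient yield) is treated as 0.
# Assumption: the two exclusion categories do not overlap.
projects = {
    "KIDD": {"n": 258, "insufficient_yield": 0, "low_concentration": 0},
    "PAND": {"n": 489, "insufficient_yield": 0, "low_concentration": 18},
    "BPC3": {"n": 5684, "insufficient_yield": 0, "low_concentration": 0},
    "INTD": {"n": 20794, "insufficient_yield": 433, "low_concentration": 2889},
}


def qualification_summary(n, insufficient_yield, low_concentration):
    """Return (excluded, qualified, % failed, % qualified) for one project."""
    excluded = insufficient_yield + low_concentration
    qualified = n - excluded
    pct_failed = round(100 * excluded / n, 2)
    pct_qualified = round(100 * qualified / n, 2)
    return excluded, qualified, pct_failed, pct_qualified


summary = {code: qualification_summary(**row) for code, row in projects.items()}
for code, (excluded, qualified, failed, ok) in summary.items():
    print(f"{code}: excluded={excluded}, qualified={qualified}, "
          f"failed={failed}%, qualified={ok}%")
```

The qualified counts computed this way (471 for PanScan, 17472 for Interact) match the numbers of samples selected for genotyping reported for those projects.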
Qualification for different genotyping methods.
| Project | Genotyping method | Platform/technology | Site | Number of samples selected for genotyping | % of samples genotyped that passed | Failed genotyping | Criteria |
| Kidney (KIDD) | GWAS | Illumina Infinium 610 K | CNG, Evry, France | 258 | 100.00 | 0 | |
| PanScan (PAND) | GWAS | Illumina Infinium II Human 550 K Bead | NCI, Bethesda, USA | 471 | 96.82 | 15 | <98% call rate (n = 13, 2.76%), gender (n = 2, 0.42%) |
| BPC3 (BRCD+PROD) | GWAS | Illumina Golden Gate | ICL, London, UK | 5684 | 99.47 | 30 | <75% call rate (n = 30, 0.53%) |
| Interact (INTD) | I-plex | Sequenom | MRC, Cambridge, UK | 17472 | 98.48 | 265 | <75% call rate (n = 96, 0.55%), gender (n = 169, 0.97%) |
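The pass rates and the per-criterion failure shares in the genotyping table can be reproduced from the raw counts. The sketch below is one way to do that check; the dictionary layout and the reason keys (`call_rate`, `gender`) are illustrative names, not part of the source.

```python
# Counts copied from the genotyping table above.
# Each entry: samples selected, and failures broken down by criterion.
genotyping = {
    "KIDD": {"selected": 258, "failures": {}},
    "PAND": {"selected": 471, "failures": {"call_rate": 13, "gender": 2}},
    "BPC3": {"selected": 5684, "failures": {"call_rate": 30}},
    "INTD": {"selected": 17472, "failures": {"call_rate": 96, "gender": 169}},
}


def genotyping_rates(selected, failures):
    """Return (total failed, % passed, {criterion: % of selected samples})."""
    total_failed = sum(failures.values())
    pct_passed = round(100 * (selected - total_failed) / selected, 2)
    shares = {reason: round(100 * count / selected, 2)
              for reason, count in failures.items()}
    return total_failed, pct_passed, shares


rates = {code: genotyping_rates(**row) for code, row in genotyping.items()}
for code, (failed, passed, shares) in rates.items():
    print(f"{code}: failed={failed}, passed={passed}%, by criterion={shares}")
```

Percentages are expressed relative to the number of samples selected for genotyping, which is the denominator the table's figures are consistent with.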