Integrity of genome-wide genotype data from low passage lymphoblastoid cell lines.

Nina S McCarthy, Spencer M Allan, David Chandler, Assen Jablensky, Bharti Morar.

Abstract

We compared genotype data from the HumanCoreExome array in peripheral blood mononuclear cells and low passage lymphoblastoid cell lines from the same 24 individuals to test for genotypic errors caused by the Epstein-Barr Virus transformation process. Genotype concordance across the 24 comparisons was 99.57% for unfiltered genotype data, and 99.63% following standard genotype quality control filters. Mendelian error rates and levels of heterozygosity were not significantly different between lymphoblastoid cell lines and their parent peripheral blood mononuclear cells. These results show that at low passage numbers, genotype discrepancies are minimal even before stringent quality control, and extend current evidence qualifying the use of low-passage lymphoblastoid cell lines as a reliable DNA source for genotype analysis.

Keywords:  Genotyping; Lymphoblastoid cell line; Single nucleotide polymorphism

Year:  2016        PMID: 27330997      PMCID: PMC4909818          DOI: 10.1016/j.gdata.2016.05.006

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Introduction

Lymphoblastoid cell lines (LCLs), which are human B lymphocytes immortalized by in vitro infection with Epstein-Barr Virus (EBV), are a renewable alternative to primary cells or tissue samples as a source of genomic DNA. Increasing amounts of DNA will be required as the genomics era progresses from genome-wide association studies to whole exome sequencing and whole genome sequencing, and studies are likely to utilise diverse sets of samples, possibly comprising combinations of DNA from peripheral blood mononuclear cells (PBMCs) and LCLs. A number of studies have shown that the immortalisation and/or subsequent passaging of these LCLs may lead to new mutations and genomic instability, including extended homozygosity, structural genomic variation and changes in DNA methylation patterns [1], [2], [3], [4], [5], [6], [7], [8]. These new non-germline mutations would confound association studies of human disease, especially as the field moves towards rare variant analysis. Other studies, however, have indicated that LCLs with low passage numbers display good genomic stability, with the EBV-transformation process producing minor, if any, artefacts on genomic structure [9], [10]. Two recent studies have reported high genotype concordance (> 99%) between DNA from LCLs and their parent PBMCs at low passage numbers, especially after genotype quality control filtering has been applied [7], [11], though there is evidence that high numbers of cell passages may produce instability [7]. Similarly, a small number of next-generation sequencing studies have reported that stringent filtering parameters significantly reduce discordant calls, and that validation experiments indicate minimal differences between PBMC–LCL pairs [6], [10], [12].
To add strength to the findings of the limited number of studies that have assessed the validity of using LCL DNA in genetic studies, we have tested for genotypic errors potentially induced by the EBV transformation process by comparing single nucleotide polymorphism (SNP) genotype calls in PBMCs and LCLs from the same individuals (N = 24). Our cohort included two family groups, allowing for the detection of Mendelian errors. All samples were at early passage (< 5) and were genotyped on the Illumina HumanCoreExome array, which contains > 500,000 common and ‘rarish’ SNPs. We report high concordance between PBMC–LCL pairs and, contrary to previous studies, our data do not show marked overall improvement in concordance after application of genotype quality control filtering. These data support the use of low passage LCL DNA in genetic studies where PBMC DNA from an individual is unavailable or depleted.

Materials and methods

Sample collection and generation of LCLs

The study sample comprised 24 individuals from the Western Australian Family Study of Schizophrenia [13] (WAFSS), including 16 unrelated individuals and two nuclear families: one trio, and one family consisting of parents and 3 offspring. DNA was extracted using standard protocols and stored frozen in 1× TE buffer. LCLs were generated as described in Verbrugghe et al. [14]. Briefly, lymphocytes were isolated from whole blood samples using Ficoll Lymphocyte Separation Medium (MP Biochemicals). LCLs were established by transformation of fresh lymphocytes with EBV and cultured in a T25 flask in advanced RPMI medium supplemented with 2% fetal calf serum, 2 mM L-glutamax, 50 units/ml penicillin/50 μg/ml streptomycin and 2% crude phytohemagglutinin (M Form) [Invitrogen, Carlsbad, CA, USA] in a humidified environment at 37 °C in 5% (v/v) carbon dioxide. The culture was maintained in this medium until it reached a cell density of 0.5–1 × 10^6 cells/ml in 20 ml. The cells were then transferred to a T75 flask and allowed to reach a density of ~ 1 × 10^6 cells/ml in 50–55 ml. At this stage, an aliquot of cells was removed for DNA isolation and the remaining cells were cryopreserved in duplicate in liquid nitrogen. DNA was isolated from the cells using standard protocols and stored at −80 °C. The study was approved by the Human Research Ethics Committee of The University of Western Australia. Written informed consent was obtained from all subjects.

Genotyping

Genotyping on the Illumina HumanCoreExome beadchip-12v1-1_A was performed at Pathwest, QEII Medical Centre (Nedlands, WA, Australia) according to manufacturer's instructions. The chip assays approximately 250,000 common [minor allele frequency (MAF) > 5%] and 250,000 ‘rarish’ exonic SNPs (MAF 1–5%); details on the history and content of the chip can be found at http://genome.sph.umich.edu/wiki/Exome_Chip_Design.

Quality control

For the ‘unfiltered’ analysis, all 542,585 SNPs on the chip were compared between the 24 PBMC–LCL pairs. In order to assess whether concordance levels were improved following standard genotyping quality control measures, the quality control filters described in Table 1 were applied to the genotype data using PLINK [15] (http://pngu.mgh.harvard.edu/purcell/plink/). Rates of SNP heterozygosity were also calculated for each sample across the 232,171 autosomal markers which were polymorphic in this population. In addition, to provide a baseline error rate for this assay, 6 PBMC samples were genotyped in duplicate on the same 542,585 SNPs.
Table 1

Sample (upper panel) and SNP (lower panel) quality control exclusions for the 24 PBMC-LCL pairs. P values are for 2-sample tests for equality of proportions between the PBMC and LCL values. MAF — minor allele frequency; HWE — Hardy Weinberg equilibrium.

| Samples | N |
|---|---|
| Total samples | 24 |
| Sample exclusions | |
| Genotypes inconsistent with phenotypic sex | 0 |
| Samples with > 10% of SNP genotypes missing | 0 |
| Samples with > 5% SNPs showing Mendelian errors | 0 |
| Final samples remaining after exclusions | 24 |
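
The sample-level metrics used in these checks (call rate, heterozygosity and Mendelian errors) can be illustrated with a minimal Python sketch. This is not the PLINK implementation used in the study: the 0/1/2 allele-count encoding, the use of `None` for a missing call, and all function names are assumptions made for illustration, and the Mendelian check applies to autosomal biallelic SNPs only.

```python
from typing import Optional, Sequence

# Assumed encoding: number of copies of one allele (0, 1 or 2); None = no call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Proportion of SNPs with a non-missing genotype call."""
    if not genotypes:
        return float("nan")
    return sum(g is not None for g in genotypes) / len(genotypes)


def heterozygosity(genotypes: Sequence[Genotype]) -> float:
    """Proportion of called genotypes that are heterozygous."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return float("nan")
    return sum(g == 1 for g in called) / len(called)


def transmissible(g: int) -> set:
    """Allele counts (0 or 1) that a parent with genotype g can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[g]


def mendelian_errors(father: Sequence[Genotype],
                     mother: Sequence[Genotype],
                     child: Sequence[Genotype]) -> int:
    """Count SNPs where the child's genotype cannot arise from the parents'."""
    errors = 0
    for f, m, c in zip(father, mother, child):
        if f is None or m is None or c is None:
            continue  # SNPs with any missing call in the trio are skipped
        possible = {a + b for a in transmissible(f) for b in transmissible(m)}
        if c not in possible:
            errors += 1
    return errors
```

For example, `mendelian_errors([0, 2, 1], [0, 2, 1], [1, 1, 2])` flags the first two SNPs, because two homozygous parents of the same genotype cannot have a heterozygous child.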

Statistical analysis

Tests of equal proportions were performed using the two-sample test for equality of proportions as implemented in the prop.test function in the R package ‘stats’, or, when comparing groups, the pairwise test pairwise.prop.test with correction for multiple testing.
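
As a rough illustration of what the two-sample test computes, the sketch below reimplements the chi-squared test for equality of two proportions with Yates' continuity correction in plain Python. It is intended to mirror the default behaviour of R's prop.test for two groups, but it is not the code used in the study; the function name and the example counts (derived from rounded call rates) are assumptions.

```python
import math


def two_sample_prop_test(x1: int, n1: int, x2: int, n2: int,
                         correct: bool = True):
    """Chi-squared test (1 df) that two proportions x1/n1 and x2/n2 are equal.

    Returns (statistic, p_value). With correct=True, Yates' continuity
    correction is applied, capped so it never exceeds the observed difference.
    """
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        return 0.0, 1.0  # no variation at all, nothing to test
    diff = abs(x1 / n1 - x2 / n2)
    yates = min(0.5, diff / (1 / n1 + 1 / n2)) if correct else 0.0
    stat = 0.0
    for x, n in ((x1, n1), (x2, n2)):
        # Compare observed successes and failures with their expected counts.
        for observed, expected in ((x, n * pooled), (n - x, n * (1 - pooled))):
            stat += (abs(observed - expected) - yates) ** 2 / expected
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Illustrative counts only: call rates of about 0.983 and 0.999 on 542,585 SNPs.
stat, p = two_sample_prop_test(533361, 542585, 542042, 542585)
print(f"X-squared = {stat:.1f}, p < 2.2e-16: {p < 2.2e-16}")
```

With SNP counts in the hundreds of thousands, even small differences in call rate give very small P values, which is worth keeping in mind when reading the per-sample comparisons.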

Results

Call rate as a proportion of the 542,585 SNPs on the chip was significantly lower overall for PBMCs than for LCLs (P < 2.2 × 10^-16; Table 1), and the difference remained significant after adjusting for between-sample variability (paired test of proportions, P = 0.001). Application of the quality control filters (Table 1) to the PBMC and LCL datasets resulted in significantly different proportions of SNPs being removed due to MAF filtering and missingness. Following these exclusions, 237,429 and 239,448 SNPs remained in the PBMC and LCL datasets, respectively. Call rate was significantly different between most individual PBMC–LCL pairs (P < 0.05; Table 2 and Fig. 1), whereas genome-wide rates of heterozygosity and numbers of Mendelian errors based on the filtered data were not significantly different between individual pairs (Table 2), or overall.
Table 2

Comparative data for individual PBMC–LCL pairs analysed in the study (N = 24). The first panel shows the comparison in call rate in the unfiltered (n = 542,585) SNP set. The second panel shows a comparison of heterozygosity levels based on QC-filtered autosomal SNPs common to both PBMC and LCL datasets (n = 232,171). In the third panel, Mendelian errors are reported for the two nuclear families present in the sample: one trio (FID F_3), and one family consisting of parents and 3 offspring (FID F_15). For these first three panels, P values are for 2-sample tests for equality of proportions between individual PBMC and LCL pairs. The fourth panel shows concordance rates between PBMC–LCL pairs before and after QC filtering of SNPs. P values are for 2-sample tests for equality of proportions (concordance) between unfiltered and filtered data. IID: individual ID; FID: family ID; PID: paternal ID; MID: maternal ID; Sex: M, male; F, female; Age: age of the individual at the time of blood collection.

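Column groups: call rate (nSNPs = 542,585); heterozygosity (nSNPs = 232,171); Mendelian errors (nSNPs = 237,429); concordance rate between PBMC–LCL pairs (unfiltered, nSNPs = 542,585; QC filtered, nSNPs = 237,429). Call rate, heterozygosity and concordance are given as proportions.

| IID | FID | PID | MID | Sex | Age | Call rate PBMC | Call rate LCL | Call rate P | Het. PBMC | Het. LCL | Het. P | Mend. err. PBMC | Mend. err. LCL | Mend. err. P | Conc. unfiltered | Conc. QC filtered | Conc. P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | F_1 | 25 | 26 | M | 31 | 0.983 | 0.999 | < 2e-16 | 0.379 | 0.380 | 0.834 | | | | 0.995 | 0.996 | < 2.2e-16 |
| 2 | F_2 | 27 | 28 | M | 33 | | 0.999 | < 2e-16 | 0.381 | 0.381 | 0.946 | | | | 0.995 | 0.996 | 9.8e-15 |
| 3 | F_3 | 5 | 4 | F | 34 | 0.999 | 0.999 | 1.5e-03 | 0.428 | 0.428 | 0.678 | 58 | 60 | 1 | 1.000 | 0.998 | < 2.2e-16 |
| 4 | F_3 | | | F | 62 | 0.999 | 0.999 | 8.5e-01 | 0.395 | 0.394 | 0.674 | 31 | 32 | 1 | 1.000 | 0.998 | < 2.2e-16 |
| 5 | F_3 | | | M | 64 | 0.999 | 0.999 | 2.7e-05 | 0.456 | 0.455 | 0.630 | 29 | 30 | 1 | 1.000 | 0.998 | < 2.2e-16 |
| 6 | F_4 | 29 | 30 | M | 25 | 0.983 | 0.999 | < 2e-16 | 0.376 | 0.376 | 0.889 | | | | 0.995 | 0.997 | < 2.2e-16 |
| 7 | F_5 | 31 | 32 | M | 26 | 0.999 | 0.999 | 6.9e-01 | 0.378 | 0.378 | 0.946 | | | | 0.999 | 0.998 | < 2.2e-16 |
| 8 | F_6 | 33 | 34 | M | 31 | 0.981 | 0.995 | < 2e-16 | 0.377 | 0.377 | 0.946 | | | | 0.992 | 0.996 | < 2.2e-16 |
| 9 | F_7 | 35 | 36 | M | 37 | 0.999 | 0.999 | 3.9e-05 | 0.379 | 0.379 | 0.946 | | | | 1.000 | 0.998 | < 2.2e-16 |
| 10 | F_8 | 37 | 38 | M | 28 | 0.999 | 0.999 | 1.2e-01 | 0.380 | 0.380 | 0.889 | | | | 1.000 | 0.998 | < 2.2e-16 |
| 11 | F_9 | 39 | 40 | M | 20 | 0.982 | 0.999 | < 2e-16 | 0.372 | 0.372 | 0.946 | | | | 0.993 | 0.997 | < 2.2e-16 |
| 12 | F_10 | 41 | 42 | M | 25 | 0.981 | 0.999 | < 2e-16 | 0.380 | 0.380 | 1.000 | | | | 0.993 | 0.996 | < 2.2e-16 |
| 13 | F_11 | 43 | 44 | M | 33 | 0.999 | 0.999 | 1.3e-01 | 0.380 | 0.380 | 0.889 | | | | 1.000 | 0.998 | < 2.2e-16 |
| 14 | F_12 | 45 | 46 | M | 30 | 0.959 | 0.999 | < 2e-16 | 0.380 | 0.381 | 0.437 | | | | 0.969 | 0.979 | < 2.2e-16 |
| 15 | F_13 | 47 | 48 | F | 45 | 0.999 | 0.999 | 6.7e-01 | 0.382 | 0.381 | 0.946 | | | | 1.000 | 0.998 | < 2.2e-16 |
| 16 | F_14 | 49 | 50 | F | 38 | 0.979 | 0.999 | < 2e-16 | 0.376 | 0.375 | 0.621 | | | | 0.986 | 0.987 | 4.2e-05 |
| 17 | F_15 | 19 | 18 | M | 23 | 0.983 | 0.999 | < 2e-16 | 0.416 | 0.416 | 0.782 | 61 | 64 | 1 | 0.994 | 0.997 | < 2.2e-16 |
| 18 | F_15 | | | F | 55 | 0.983 | 0.999 | < 2e-16 | 0.416 | 0.416 | 0.947 | 61 | 60 | 1 | 0.995 | 0.997 | < 2.2e-16 |
| 19 | F_15 | | | M | 54 | 0.983 | 0.999 | < 2e-16 | 0.416 | 0.416 | 0.891 | 57 | 52 | 1 | 0.995 | 0.997 | < 2.2e-16 |
| 20 | F_15 | 19 | 18 | M | 27 | 0.999 | 0.999 | 5.0e-01 | 0.414 | 0.414 | 0.891 | 27 | 19 | 1 | 1.000 | 0.998 | < 2.2e-16 |
| 21 | F_15 | 19 | 18 | F | 25 | 0.999 | 0.999 | 3.2e-01 | 0.415 | 0.415 | 0.891 | 26 | 26 | 1 | 1.000 | 0.998 | < 2.2e-16 |
| 22 | F_16 | 51 | 52 | F | 32 | 0.999 | 0.999 | 2.1e-01 | 0.381 | 0.381 | 1.000 | | | | 0.999 | 0.998 | < 2.2e-16 |
| 23 | F_16 | 51 | 52 | F | 35 | 0.999 | 0.999 | 1.0e+00 | 0.384 | 0.384 | 0.946 | | | | 1.000 | 0.998 | < 2.2e-16 |
| 24 | F_17 | 53 | 54 | M | 20 | 0.999 | 0.999 | 1.9e-05 | 0.378 | 0.377 | 0.889 | | | | 1.000 | 0.998 | < 2.2e-16 |
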
Fig. 1

Upper panel: call rates for PBMCs and LCLs across all genotyped SNPs (N = 542,585). Lower panel: concordance rates between PBMC and LCL genotypes for all SNPs before and after QC filtering (237,429 SNPs common to both datasets after filtering).

Genotype concordance between individual PBMC–LCL pairs was high across unfiltered (range 0.969–1.000, mean = 0.996, SD = 0.007) and QC filtered (range 0.979–0.998, mean = 0.996, SD = 0.004) datasets (Table 2 and Fig. 1). However, concordance between each individual pair for unfiltered and filtered SNP sets was significantly different (Table 2), although the direction of effect varied between samples. On average, there was a non-significant increase in concordance across all 24 pairs following quality control filtering (paired t-test, P = 0.715). By comparison, genotyping rate was 99.21% in the 6 PBMCs genotyped in duplicate (12 samples in total). Concordance between the genotypes in each replicate pair was 100%. There were no associations with sample age or sex for any of the quality control measures or concordance rates (linear/logistic regression, P > 0.05).
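
The concordance measure reported above can be sketched in a few lines of Python. The source does not state how missing calls were handled when concordance was computed, so this sketch assumes that a SNP is compared only when both samples have a call; the genotype strings and the function name are likewise illustrative.

```python
import statistics
from typing import Optional, Sequence


def concordance(pbmc: Sequence[Optional[str]],
                lcl: Sequence[Optional[str]]) -> float:
    """Proportion of SNPs called in both samples whose genotypes agree.

    Assumption: SNPs with a missing call (None) in either sample are skipped.
    """
    if len(pbmc) != len(lcl):
        raise ValueError("genotype vectors must cover the same SNPs")
    compared = 0
    matches = 0
    for a, b in zip(pbmc, lcl):
        if a is None or b is None:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else float("nan")


# Toy data for two hypothetical PBMC-LCL pairs (not taken from the study).
pairs = [
    (["AA", "AB", "BB", None], ["AA", "AB", "AB", "BB"]),
    (["AA", "AB", "BB", "BB"], ["AA", "AB", "BB", "BB"]),
]
rates = [concordance(p, l) for p, l in pairs]
print("mean:", statistics.mean(rates), "SD:", statistics.stdev(rates))
```

Per-pair rates computed this way can then be summarised by their range, mean and standard deviation, as in the text.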

Discussion

This study provides further evidence for minimal rates of discordant genotypes between PBMC and LCL pairs at low passage numbers, supporting the use of low-passage LCLs as a reliable DNA source for genotype analysis. Contrary to previous reports, there was no significant increase in concordance rates after stringent quality control filtering of the genotype data. We were able to check Mendelian error rates in our two family groups, and report comparable rates of Mendelian error in PBMC and LCL DNA. Surprisingly, we report significantly higher genotype call rates in the LCL DNA, which may indicate some degradation of the PBMC DNA.

Conflict of interest

The authors declare no conflict of interest.
References

1.  Concerns regarding "Whole exome sequencing reveals minimal differences between cell line and whole blood derived DNA".

Authors:  Matthew D Shirley
Journal:  Genomics       Date:  2013-08-01       Impact factor: 5.736

2.  Chromosomal rearrangements after ex vivo Epstein-Barr virus (EBV) infection of human B cells.

Authors:  S Lacoste; E Wiechec; A G Dos Santos Silva; A Guffei; G Williams; M Lowbeer; K Benedek; M Henriksson; G Klein; S Mai
Journal:  Oncogene       Date:  2009-11-02       Impact factor: 9.867

3.  Genotype instability during long-term subculture of lymphoblastoid cell lines.

Authors:  Ji Hee Oh; Young Jin Kim; Sanghoon Moon; Hye-Young Nam; Jae-Pil Jeon; Jong Ho Lee; Jong-Young Lee; Yoon Shin Cho
Journal:  J Hum Genet       Date:  2012-11-22       Impact factor: 3.172

4.  Subtyping schizophrenia: implications for genetic research.

Authors:  A Jablensky
Journal:  Mol Psychiatry       Date:  2006-06-27       Impact factor: 15.992

5.  Variation in genome-wide mutation rates within and between human families.

Authors:  Donald F Conrad; Jonathan E M Keebler; Mark A DePristo; Sarah J Lindsay; Yujun Zhang; Ferran Casals; Youssef Idaghdour; Chris L Hartl; Carlos Torroja; Kiran V Garimella; Martine Zilversmit; Reed Cartwright; Guy A Rouleau; Mark Daly; Eric A Stone; Matthew E Hurles; Philip Awadalla
Journal:  Nat Genet       Date:  2011-06-12       Impact factor: 38.330

6.  Global variation in copy number in the human genome.

Authors:  Richard Redon; Shumpei Ishikawa; Karen R Fitch; Lars Feuk; George H Perry; T Daniel Andrews; Heike Fiegler; Michael H Shapero; Andrew R Carson; Wenwei Chen; Eun Kyung Cho; Stephanie Dallaire; Jennifer L Freeman; Juan R González; Mònica Gratacòs; Jing Huang; Dimitrios Kalaitzopoulos; Daisuke Komura; Jeffrey R MacDonald; Christian R Marshall; Rui Mei; Lyndal Montgomery; Kunihiro Nishimura; Kohji Okamura; Fan Shen; Martin J Somerville; Joelle Tchinda; Armand Valsesia; Cara Woodwark; Fengtang Yang; Junjun Zhang; Tatiana Zerjal; Jane Zhang; Lluis Armengol; Donald F Conrad; Xavier Estivill; Chris Tyler-Smith; Nigel P Carter; Hiroyuki Aburatani; Charles Lee; Keith W Jones; Stephen W Scherer; Matthew E Hurles
Journal:  Nature       Date:  2006-11-23       Impact factor: 49.962

7.  EBV transformation and cell culturing destabilizes DNA methylation in human lymphoblastoid cell lines.

Authors:  D Grafodatskaya; S Choufani; J C Ferreira; D T Butcher; Y Lou; C Zhao; S W Scherer; R Weksberg
Journal:  Genomics       Date:  2009-12-18       Impact factor: 5.736

8.  In depth comparison of an individual's DNA and its lymphoblastoid cell line using whole genome sequencing.

Authors:  Dorothee Nickles; Lohith Madireddy; Shan Yang; Pouya Khankhanian; Steve Lincoln; Stephen L Hauser; Jorge R Oksenberg; Sergio E Baranzini
Journal:  BMC Genomics       Date:  2012-09-14       Impact factor: 3.969

9.  Whole-exome sequencing of DNA from peripheral blood mononuclear cells (PBMC) and EBV-transformed lymphocytes from the same donor.

Authors:  Eric R Londin; Margaret A Keller; Michael R D'Andrea; Kathleen Delgrosso; Adam Ertel; Saul Surrey; Paolo Fortina
Journal:  BMC Genomics       Date:  2011-09-26       Impact factor: 3.969

10.  Fidelity of SNP array genotyping using Epstein Barr virus-transformed B-lymphocyte cell lines: implications for genome-wide association studies.

Authors:  Joshua T Herbeck; Geoffrey S Gottlieb; Kim Wong; Roger Detels; John P Phair; Charles R Rinaldo; Lisa P Jacobson; Joseph B Margolick; James I Mullins
Journal:  PLoS One       Date:  2009-09-04       Impact factor: 3.240
