Gianmarco Contino, Matthew D Eldridge, Maria Secrier, Lawrence Bower, Rachael Fels Elliott, Jamie Weaver, Andy G Lynch, Paul A W Edwards, Rebecca C Fitzgerald.
Abstract
Esophageal adenocarcinoma (EAC) is highly mutated and molecularly heterogeneous. The number of cell lines available for study is limited and their genomes have been only partially characterized. An accurate annotation of their mutational landscape is crucial for sound experimental design and correct interpretation of genotype-phenotype findings. We performed high-coverage, paired-end whole genome sequencing on eight EAC cell lines (ESO26, ESO51, FLO-1, JH-EsoAd1, OACM5.1 C, OACP4 C, OE33 and SK-GT-4), all verified against original patient material, and on one esophageal high-grade dysplasia cell line, CP-D. We have made the aligned sequence data available and report single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number alterations, identified by comparison with the human reference genome and known single nucleotide polymorphisms (SNPs). We compare these putative mutations with mutations found in primary tissue EAC samples, to inform the use of these cell lines as a model of EAC.
Keywords: Esophageal adenocarcinoma; cancer genome; cell line; copy number alteration; high-grade dysplasia; single nucleotide variant; whole genome sequencing
Year: 2016 PMID: 27594985 PMCID: PMC4991527 DOI: 10.12688/f1000research.7033.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Characteristics and clinico-pathological features of the EAC cell lines analysed.
Verified origin identifies cell lines whose pathological origin from EAC has been verified in Boonstra et al.
| Cell line | Alternative | Age | Sex | Ethnicity | Histology | Date | Stage | Ploidy | Commercial | Verified origin | Ref |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CP-D | CP-18821 | Adult | M | | hTERT immortalized | 1995 | HGD | hypotetraploid | ATCC | | |
| ESO26 | | 56 | M | Caucasian | GOJ | 2000 | Stage IV | hypodiploid (1.8) | Public Health | YES | |
| ESO51 | | 74 | M | Caucasian | Distal Oesophageal | 2000 | Stage IV | hypotriploid (2.75) | Public Health | YES | |
| FLO-1 | | 68 | M | Caucasian | Distal Oesophageal | 1991 | | hypodiploid (1.9) | Public Health | YES | |
| JH-EsoAd1 | JHAD1 | 66 | M | Caucasian | Moderately to | 1997 | Stage IIA | triploid | No, due to be | YES | |
| OACM5.1 C | | 47 | F | Caucasian | Lymph node | 2001 | Stage IV | hypodiploid | Public Health | YES | |
| OACP4 C | | 55 | M | Caucasian | Gastric cardia | 2001 | Stage IV | Aneuploidy (53–57 | Public Health | YES | |
| OE33 | JROECL33 | 73 | F | | Distal Oesophageal | 1993 | Stage IIA | hypotetraploid (3.5) | Public Health | YES | |
| SK-GT-4 | | 83 | M | | Distal Oesophageal | 1989 | Stage IIB | Aneuploid (mode 59 | Public Health | YES | |
Figure 1. Distribution of detected variants and coding sequence consequences (mean percentage value).
A) Bar chart showing the distribution of called variants across various regions of the genome as indicated; B) Details of the coding sequence variants identified by the Variant Effect Predictor (Ensembl) expressed as a mean percentage value of all cell lines (values were not statistically different among samples).
Detailed distribution of identified variants for each cell line.
Absolute number, median, median absolute deviation and range are listed for each category of mutation according to the Variant Effect Predictor (Ensembl) classification.
| Category | Consequence | CP-D | ESO26 | ESO51 | FLO-1 | JH-EsoAD1 | OACM5.1 | OACP4C | OE33 | SK-GT-4 | Median | Median absolute deviation | Min | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| UTR | 5 prime UTR | 229 | 301 | 262 | 191 | 206 | 264 | 229 | 216 | 305 | 229 | 33 | 191 | 305 |
| | 3 prime UTR | 979 | 1097 | 1002 | 926 | 929 | 1026 | 848 | 986 | 1113 | 986 | 57 | 848 | 1113 |
| Start/Stop | initiator codon | 1 | 3 | 2 | 2 | 3 | 2 | 1 | 0 | 1 | 2 | 1 | 0 | 3 |
| | stop lost | 2 | 2 | 4 | 2 | 2 | 2 | 3 | 3 | 2 | 2 | 0 | 2 | 4 |
| | stop retained | 2 | 1 | 4 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 0 | 1 | 4 |
| | stop gained | 10 | 14 | 17 | 16 | 14 | 17 | 9 | 14 | 24 | 14 | 3 | 9 | 24 |
| Missense | missense | 385 | 496 | 497 | 436 | 435 | 481 | 431 | 446 | 454 | 446 | 15 | 385 | 497 |
| Splice Sites | splice | 4 | 11 | 7 | 8 | 11 | 11 | 9 | 7 | 7 | 8 | 1 | 4 | 11 |
| | splice donor | 5 | 7 | 6 | 10 | 6 | 9 | 6 | 5 | 18 | 6 | 1 | 5 | 18 |
| | splice region | 105 | 113 | 107 | 92 | 96 | 95 | 83 | 103 | 102 | 102 | 6 | 83 | 113 |
| Frameshift | frameshift | 42 | 52 | 41 | 45 | 34 | 34 | 49 | 46 | 54 | 45 | 4 | 34 | 54 |
| In Frame | inframe | 11 | 10 | 15 | 18 | 15 | 14 | 10 | 15 | 20 | 15 | 3 | 10 | 20 |
| | inframe | 10 | 17 | 19 | 8 | 14 | 10 | 11 | 8 | 16 | 11 | 3 | 8 | 19 |
| Synonymous | | 199 | 278 | 284 | 259 | 221 | 283 | 202 | 208 | 242 | 242 | 36 | 199 | 284 |
| Other | | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 |
| Gene | downstream | 19197 | 20411 | 18927 | 18009 | 17711 | 19363 | 16202 | 18463 | 20318 | 18927 | 918 | 16202 | 20411 |
| | upstream | 19197 | 20761 | 19332 | 18122 | 18196 | 20182 | 16825 | 18944 | 21239 | 19197 | 1001 | 16825 | 21239 |
| Intergenic | | 29694 | 38091 | 34040 | 31999 | 27269 | 31875 | 21550 | 32985 | 33380 | 31999 | 2041 | 21550 | 38091 |
| Introns | | 55372 | 61682 | 56671 | 54869 | 51163 | 56193 | 43210 | 55945 | 61374 | 55945 | 1076 | 43210 | 61682 |
| Non-coding | Mature | 8 | 13 | 6 | 6 | 5 | 10 | 5 | 8 | 4 | 6 | 2 | 4 | 13 |
| | non-coding | 1 | 2 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 2 |
| | non coding | 2149 | 2200 | 2116 | 1868 | 1920 | 2113 | 1811 | 2095 | 2310 | 2113 | 87 | 1811 | 2310 |
| Regulatory | TF binding | 404 | 453 | 469 | 431 | 413 | 500 | 408 | 440 | 486 | 440 | 29 | 404 | 500 |
| | regulatory | 4667 | 5863 | 5301 | 4686 | 4512 | 5011 | 3582 | 4778 | 6158 | 4778 | 266 | 3582 | 6158 |
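The summary columns of the table above can be checked directly from the per-cell-line counts. The following is a minimal sketch, not the authors' code: the function name `summarize` is hypothetical, and it assumes the median absolute deviation is the unscaled median of absolute deviations from the median, which reproduces the tabulated values for the two rows used here.

```python
from statistics import median


def summarize(counts):
    """Return (median, MAD, min, max) for a list of per-cell-line counts.

    MAD is the unscaled median absolute deviation: the median of
    |x - median(x)|. No consistency constant is applied (an assumption).
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    return med, mad, min(counts), max(counts)


# Rows copied from the table, in column order CP-D ... SK-GT-4.
five_prime_utr = [229, 301, 262, 191, 206, 264, 229, 216, 305]
introns = [55372, 61682, 56671, 54869, 51163, 56193, 43210, 55945, 61374]

print(summarize(five_prime_utr))  # (229, 33, 191, 305)
print(summarize(introns))         # (55945, 1076, 43210, 61682)
```

The median and MAD are less sensitive than the mean and standard deviation to a single outlying cell line, such as the low intron count in OACP4C.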
Figure 2. Analysis of SNVs and CNAs in putative EAC genes identified in Dulak et al. and Weaver et al.
A) Log ratio of copy number status of the selected genes computed with Control-FREEC (green indicates CN gain and red CN loss). Genome-wide CN for each line is available in Supplementary material 1 and Supplementary material 3. B) SNVs identified by our pipelines and annotated by Variant Effect Predictor analysis (Ensembl). When more than one variant was present in a single gene, the most deleterious was annotated according to the color-coded legend reported at the bottom of the figure. A complete annotation of the identified SNVs is available in Supplementary material 2. C) Blue and red bars indicate the mutation rate of EAC genes reported in Dulak et al. and Weaver et al., respectively.
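Panel B reports one consequence per gene, the most deleterious among the variants found in it. A minimal sketch of that kind of selection is shown here. The severity order, the function name `most_deleterious` and the gene labels are all hypothetical and chosen for illustration; the ranking actually used for the figure is given only by its color-coded legend.

```python
# Hypothetical severity order, most to least deleterious. This is an
# assumption for illustration, not the ranking used in the figure.
SEVERITY = [
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "missense_variant",
    "synonymous_variant",
]
RANK = {term: i for i, term in enumerate(SEVERITY)}
UNKNOWN = len(SEVERITY)  # terms not in the list rank last


def most_deleterious(variants):
    """Map each gene to its most severe consequence.

    variants: iterable of (gene, consequence) pairs.
    """
    best = {}
    for gene, consequence in variants:
        rank = RANK.get(consequence, UNKNOWN)
        if gene not in best or rank < RANK.get(best[gene], UNKNOWN):
            best[gene] = consequence
    return best


# Placeholder gene labels, not real calls from the cell lines.
calls = [
    ("GENE_A", "missense_variant"),
    ("GENE_A", "stop_gained"),
    ("GENE_B", "synonymous_variant"),
]
print(most_deleterious(calls))
# {'GENE_A': 'stop_gained', 'GENE_B': 'synonymous_variant'}
```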