Weerachai Jaratlerdsiri, Eva K F Chan, Desiree C Petersen, Claire Yang, Peter I Croucher, M S Riana Bornman, Palak Sheth, Vanessa M Hayes.
Abstract
Complex genomic rearrangements are common molecular events driving prostate carcinogenesis. Clinical significance, however, has yet to be fully elucidated. Detecting the full range and subtypes of large structural variants (SVs), greater than one kilobase in length, is challenging using clinically feasible next generation sequencing (NGS) technologies. Next generation mapping (NGM) is a new technology that allows for the interrogation of megabase length DNA molecules outside the detection range of single-base resolution NGS. In this study, we sought to determine the feasibility of using the Irys (Bionano Genomics Inc.) nanochannel NGM technology to generate whole genome maps of a primary prostate tumor and matched blood from a Gleason score 7 (4 + 3), ETS-fusion negative prostate cancer patient. With an effective mapped coverage of 35X and sequence coverage of 60X, and an estimated 43% tumor purity, we identified 85 large somatic structural rearrangements and 6,172 smaller somatic variants, respectively. The vast majority of the large SVs (89%), of which 73% are insertions, were not detectable ab initio using high-coverage short-read NGS. However, guided manual inspection of single NGS reads and de novo assembled scaffolds of NGM-derived candidate regions allowed for confirmation of 94% of these large SVs, with over a third impacting genes with oncogenic potential. From this single-patient study, the first cancer study to integrate NGS and NGM data, we hypothesise that there exists a novel spectrum of large genomic rearrangements in prostate cancer, that these large genomic rearrangements are likely early events in tumorigenesis, and they have potential to enhance taxonomy.
Keywords: next generation mapping; next generation sequencing; prostate cancer; structural genomic rearrangements
Year: 2017 PMID: 28423598 PMCID: PMC5410329 DOI: 10.18632/oncotarget.15802
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Number of NGS-derived somatic variants in UP2153
| ≤ 50 bp | SNVs/indels | Functional (a) | Oncogenic | Oncogenic Drivers (b) | > 50 bp | SVs | Functional Potential (c) | Oncogenic Potential (d) | ≥ 1 Kb | SVs (MetaSV) | Functional Potential (c) | Oncogenic Potential (d) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNVs | 5981 | 23 | 4 | 4 | | 45 | 7 | 1 | | 26 | 10 | 6 |
| DEL | 62 | 0 | 0 | 0 | | 4 | 0 | 0 | | 7 | 0 | 0 |
| INS | 80 | 1 | 0 | 0 | | 0 | 0 | 0 | | 1 | 0 | 0 |
Abbreviations: bp, base pairs; Kb, kilobases; SNVs, single nucleotide variants; DEL, deletion; INS, insertion; DUP, duplication; INV, inversion.
(a) Using functional prediction tools GEMINI, SIFT and PolyPhen2.
(b) Driver mutation prediction using TransFIC and CanDrA identified a single mutation by both methods within the ATRX gene.
(c) Defined as ‘HIGH’ impact by the GEMINI sequence ontology.
(d) Inspection of GeneCards and PubMed identified the following genes with known oncogenic potential: MAP3K5, NID1, SMAD1, ZFR, ROR2, WWC1 and RAD51B.
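As a consistency check, the counts in the table can be summed and compared with the 6,172 smaller somatic variants quoted in the abstract. The short Python sketch below is illustrative only: the variable names are hypothetical, and it assumes that the abstract's total combines the ≤ 50 bp and > 50 bp columns while leaving out the ≥ 1 Kb MetaSV calls, which the table itself does not state.

```python
# Counts copied from the table above; variable names are illustrative.
small_variants = {"SNVs": 5981, "DEL": 62, "INS": 80}  # <= 50 bp column
svs_over_50bp = [45, 4, 0]                             # > 50 bp column, row order as printed

small_total = sum(small_variants.values())
sv_total = sum(svs_over_50bp)

# Assumption: the abstract's "smaller somatic variants" are these two groups together.
combined = small_total + sv_total

print(small_total, sv_total, combined)  # 6123 49 6172
```

Under this reading the two groups add up exactly to the figure given in the abstract.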
Figure 1. Circos plot depicting the human karyogram with coordinated chained events and somatic copy number alterations (SCNAs) in the UP2153 tumor
Somatic copy number gains (red) and losses (blue) are depicted in the inner ring, while a single coordinated chained event between chromosomes 6 and 16 (blue) and rearrangements not assigned to a chained event (gray) are depicted as lines within the plot.
Verification of NGM-detected somatic SVs > 1 Kb in UP2153 using short-read NGS data
| > 1 Kb SVs | NGM SVs | Affected Genes (a) | NGS verification: NGS coverage (b) | NGS verification: MetaSV Evidence (≥ 4 reads) (c) | NGS verification: Manual read inspection (< 4 reads) | NGS verification: Evidence from NGS assembly | NGS verification: Total SV verified |
|---|---|---|---|---|---|---|---|
| DEL | 26 | 14 | 26 | 6 | 20 | 7 | 26 (100%) |
| INS | 59 | 33 | 58 | 3 | 42 | 35 | 52 (90%) |
Abbreviations: SV, structural variation; DEL, deletion; INS, insertion; DUP, duplication; INV, inversion.
(a) Predicted based on co-location of SV with UCSC known canonical genes.
(b) One insertion showed no or minimal read or scaffold coverage and was removed from the verification set.
(c) MetaSV called somatic variants based on four or more reads and observed using two or more informatic SV callers including Manta, Breakdancer, Lumpy, CNVnator and/or Pindel for SVs > 1 Kb.
(d) All insertion variants were called duplications using NGS except one variant, which was called as a deletion.
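The percentages quoted in the abstract can be retraced from the verification table. The Python sketch below is a worked example under stated assumptions, not the authors' analysis: it takes the first data row to be deletions and the second to be insertions (consistent with the footnote on the one insertion removed for lack of coverage), and it reads "not detectable ab initio" as "not called by MetaSV with four or more reads". The dictionary keys are hypothetical names.

```python
# Counts copied from the verification table. Row labels (DEL/INS) and key
# names are assumptions made for this illustration.
rows = {
    "DEL": {"ngm": 26, "covered": 26, "metasv": 6, "verified": 26},
    "INS": {"ngm": 59, "covered": 58, "metasv": 3, "verified": 52},
}

total_ngm = sum(r["ngm"] for r in rows.values())

# SVs seen by NGM but not called by MetaSV (assumed meaning of "ab initio").
missed = {k: r["ngm"] - r["metasv"] for k, r in rows.items()}
missed_total = sum(missed.values())

pct_missed = 100 * missed_total / total_ngm
pct_missed_ins = 100 * missed["INS"] / missed_total

# Verification rate per SV type, relative to SVs with usable NGS coverage.
pct_verified = {k: 100 * r["verified"] / r["covered"] for k, r in rows.items()}

print(f"{missed_total}/{total_ngm} = {pct_missed:.1f}% not called ab initio")  # 76/85 = 89.4%
print(f"{pct_missed_ins:.1f}% of those are insertions")                        # 73.7%
for sv_type, pct in pct_verified.items():
    print(f"{sv_type}: {pct:.1f}% verified")                                   # DEL 100.0, INS 89.7
```

With these assumptions the calculation gives roughly 89% of large SVs missed by MetaSV, about 73.7% of them insertions, in line with the 89% and 73% stated in the abstract.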
Figure 2. The DUSP11-C2orf78 gene fusion event identified using NGM involves a 14.3 Kb somatic deletion at Chr2: 74.006–74.020 Mb
(Upper Panel) The deletion is embedded within known segmental duplications and self-chains, overlapping both the DUSP11 and C2orf78 genes. (Middle Panel) Rectangular tracks (horizontal bars) represent the in silico genome map for Hg19 (green) and consensus genome maps for the tumor (blue) and matched blood (red). Hg19 genomic coordinates are indicated in dark blue font (M = Megabase). Overlaid on the Hg19 track are gene annotations, represented and distinguished by colored rectangles; gene symbols are indicated above with matching colors. Irys enzymatic labels (nick sites) are shown as vertical grey bars overlaid on the genome map tracks, and alignments of labels are shown as grey connecting lines. NGM-called INS and DEL are highlighted, respectively, as green (4.8 Kb insertion in blood relative to Hg19) and orange trapezoids between aligned genome maps (9.8 Kb and 14.3 Kb deletions in tumor relative to Hg19 and matched blood, respectively). (Bottom Panel) In the NGS (IGV) panels, the tracks are alignments of reads, in grey. Orientation of sequencing reads is indicated by a blunt end for the 5′ end and an arrow end for the 3′ end. Several single NGS reads with discordant alignments to Hg19 provide evidence for the deletion (red) in the tumor sample.
Figure 3. Examples of NGM-derived UP2153 somatic SVs with NGS support
(A) A 1.4 Kb NGM-derived somatic insertion (green trapezoid) within the PRDM16 gene at Chr1: 3.088–3.100 Mb in the tumor consensus map relative to the Hg19 in silico genome map, with alignment of enzymatic labels (nick sites) shown as grey connecting lines. NGS verification included a 223 bp and a 760 bp duplication (represented by green tracks) identified by MetaSV (left inset), and IGV manual inspection of sequencing reads that show evidence of insertion within the region (right inset). (B) A 4.8 Kb NGM-derived somatic deletion (orange trapezoid) within the ZNF438 gene at Chr10: 31.233–31.249 Mb in the tumor consensus map (blue horizontal bar) relative to Hg19 (green horizontal bar) was further verified by IGV manual inspection of sequencing reads (bottom panel of inset) and assembled scaffolds (top panel of inset). Note that two haplotypes (scaffolds) are observed in the tumor genome assembly, corresponding to the normal (likely from stromal contamination) and mutant (from the tumor) haplotypes.
Figure 4. Examples of NGM-derived somatic SVs found in UP2153 with confounding calls by direct tumor-blood comparisons, compared to SV calls relative to Hg19
(A) Comparing the tumor (blue horizontal bar) and blood genome maps (red horizontal bar) directly identified a 7.3 Kb somatic insertion (green trapezoid) at Chr20: 47.131–47.147 Mb. Relative to the Hg19 reference (aqua horizontal bars), the NGM IrysSolve Pipeline identified an 11.5 Kb insertion (green trapezoid) at Chr20: 47.130–47.147 Mb in the tumor and a 6.6 Kb insertion (green trapezoid) at Chr20: 47.068–47.147 Mb in the blood. Several sequencing reads provided NGS support for these insertions (green tracks) in both the blood (top inset) and tumor (bottom inset). (B) Direct comparison between the tumor (blue horizontal bar) and blood genome maps (red horizontal bar) identified a 6.3 Kb somatic insertion (green trapezoid) at Chr19: 29.949–29.952 Mb, within LOC284395 (pink rectangular gene annotation). Relative to the in silico reference genome map Hg19, the NGM IrysSolve Pipeline identified a 2 Kb insertion (green trapezoid) at Chr19: 30.003–30.011 Mb in the tumor and a 4 Kb deletion (orange trapezoid) at Chr19: 29.949–30.003 Mb in the blood, each supported by several sequencing reads corresponding to deletions (red tracks, top insets) and duplications (green tracks, bottom insets), with aligned reads in grey.