Refined detection and phasing of structural aberrations in pediatric acute lymphoblastic leukemia by linked-read whole-genome sequencing.

Jessica Nordlund1, Yanara Marincevic-Zuniga2, Lucia Cavelier3, Amanda Raine2, Tom Martin2, Anders Lundmark2, Jonas Abrahamsson4, Ulrika Norén-Nyström5, Gudmar Lönnerholm6, Ann-Christine Syvänen2.   

Abstract

Structural chromosomal rearrangements that can lead to in-frame gene-fusions are a leading source of information for diagnosis, risk stratification, and prognosis in pediatric acute lymphoblastic leukemia (ALL). Traditional methods such as karyotyping and FISH struggle to accurately identify and phase such large-scale chromosomal aberrations in ALL genomes. We therefore evaluated linked-read WGS for detecting chromosomal rearrangements in primary samples from 12 patients diagnosed with ALL. We assessed the effect of input DNA quality on phased haplotype block size and the detectability of copy number aberrations and structural variants in the ALL genomes. We found that biobanked DNA isolated by standard column-based extraction methods was sufficient to detect chromosomal rearrangements even at low 10x sequencing coverage. Linked-read WGS enabled precise, allele-specific, digital karyotyping at base-pair resolution for a wide range of structural variants including complex rearrangements and aneuploidy assessment. With use of haplotype information from the linked-reads, we also identified previously unknown structural variants, such as a compound heterozygous deletion of ERG in a patient with the DUX4-IGH fusion gene. We conclude that linked-read WGS allows detection of important pathogenic variants in ALL genomes at a resolution beyond that of traditional karyotyping and FISH.

Year:  2020        PMID: 32054878      PMCID: PMC7018692          DOI: 10.1038/s41598-020-59214-w

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Sequencing of complete human genomes has become feasible owing to next generation sequencing (NGS) technologies, but detection of the whole spectrum of somatic single nucleotide variants (SNVs), copy number alterations (CNAs), and structural variations (SVs) in cancer cells remains challenging[1]. The human genome is diploid, and molecular haplotyping of the two alleles across large genomic regions is beyond the resolution of standard short-read NGS technologies[2]. “Linked-read” technology, by which single DNA molecules are massively barcoded in a microfluidic format and subsequently sequenced using short-read NGS technology, allows determination of molecular haplotypes across mega-base regions of the genome[3-5]. An advantage of linked-read whole genome sequencing (WGS) is its enhanced ability to detect the breakpoints of SVs and to provide long-range haplotype information for phasing SNVs and SVs. Linked-read WGS has the potential to provide an ordered view of the structure of all genetic variants in a genome, shown by assignment of complex SVs, chromosomal rearrangements, and CNAs to individual chromosomes in germline and cancer genomes[3,5,6]. Structural chromosomal rearrangements that may lead to aberrant gene-fusions are used for diagnosis, risk stratification and prognosis in pediatric acute lymphoblastic leukemia (ALL)[7]. Several recurrent chromosomal aberrations define genetic subtypes of ALL that are associated with clinical outcome[8,9]. Karyotyping (G-banding) and fluorescent in situ hybridization (FISH) commonly applied in clinical genetics laboratories do not capture the full spectrum of complex aberrations in cancer genomes. Thus, up to 30% of B-cell precursor ALL (BCP-ALL) patients remain cytogenetically unclassified and lack genetic information as support for treatment decisions[10]. 
More recently, the application of WGS and whole-transcriptome sequencing (RNA-sequencing) has enabled discovery of novel mutations and expressed gene-fusions in ALL[11-16], including recurrent fusion genes with biological and clinical implications, such as DUX4, ZNF384, and MEF2D rearrangements[17-19]. However, limited information presently exists on the complex structure of the leukemogenic aberrations present in ALL genomes. Here, we use linked-read WGS technology to obtain haplotype-resolved genomic aberrations in primary DNA samples from 12 well-characterized patients with pediatric ALL. Furthermore, we evaluate whether linked-read WGS can achieve the same or an improved level of detection compared with joint G-banding and FISH.

Results

We subjected diagnostic samples from 12 children with acute lymphoblastic leukemia (ALL) enrolled on the Nordic Society of Pediatric Hematology and Oncology (NOPHO) protocols during 1998–2008 (Table 1)[8,20] to linked-read WGS (Table S1). The DNA used to prepare linked-read sequencing libraries was obtained from biobanked DNA isolated by a standard column-based method or by freshly prepared HMW DNA extraction. The estimated length of the input DNA was directly correlated to the phase block size (Table S2). The proportion of phased SNPs was 81–99% (mean 92%), and the longest phased blocks ranged from 0.9–18 Mb (mean 7 Mb) (Table S3). The DNA extracted using the High Molecular Weight protocol yielded the longest haplotype blocks (18 Mb), but the DNA extracted by the standard column-based method allowed for detection of all known SVs even at low sequencing coverage (10×), despite the shorter phase blocks produced (Fig. S1).
Table 1

Patient characteristics.

Patient ID | Sex | Age at diagnosis | Immuno-phenotype | Subtype at diagnosis | Revised subtype | Karyotype at diagnosis | Revised karyotype after linked-read WGS
ALL_370 | F | 3 | BCP-ALL | HeH | | 55, XX, +X, +4, +6, +10, +14, +17, +18, +21, +21[2]/54, XX, +X, +4, +6, +10, i(14)(q10), +17, +18, +21, +21[cp16]/46, XX[12] | 55, XX, +X, +4, +6, +10, +14, +17, +18, +21, +21
ALL_689 | F | 18 | BCP-ALL | HeH | | 55, XX, +X, dup(1)(q24q32), +4, +6, +10, +14, +17, +18, +21, +21[17]/46, XX[3] | 54, XX, +X, dup(1)(q24q42), +4, +6, +10, +14, +17, +18, +21
ALL_47 | M | 2 | BCP-ALL | Normal karyotype | HeH | 46, XY[2] | 58, XY, +4, +5, +6, +9, +10, +12, +14, +17, +18, -19, +19, +21, +21, +22
ALL_458 | M | 4 | BCP-ALL | ETV6-RUNX1 | | .ish.t(12;21)(p13;q22), del(12)(p13p13), del(21)(q22q22) | 47, XY, +10, del(11)(q22.1q25), t(12;21)(p13.2;q22), del(12)(p12.1p13.2)
ALL_386 | M | 13 | BCP-ALL | ETV6-RUNX1 | | .ish.t(3;21;12), t(3;12;14), t(12;21)(p13;q22) | 46, XY, del(2)(q33.1q37.3), der(3)del(3)(p21.2p21.31)t(3;12)(p21.31;q24)ins(3;3)(q21.2;p21.31p21.31), der(12)t(14;12)(q24.1;p13.2)t(3;12)(q21.3;q24.11), del(12)(p13.2), der(14)t(14;2)(q24.1;q37.3)t(2;21)(q33.1;q22.12), del(19)(q13.32q13.43), der(21)t(12;21)(p13.2;q22.12), dup(21)(q11.2q22.12)
ALL_402 | M | 6 | BCP-ALL | BCR-ABL1 | | 46, XY[12].ish.t(9;22)(q34;q11), del(9)(p21p21) | 46, XY, t(1;5;9;22)(p36.33;q31.2;q34.12;q11.23), del(8)(p11p23), dup(8)(p11.23q24.3), del(9)(p21p21)
ALL_390 | F | 8 | BCP-ALL | Normal karyotype | DUX4-IGH | 46, XX[19] | 46, XX, del(6)(q14.1q27)
ALL_501 | F | 7 | BCP-ALL | Normal karyotype | DUX4-IGH | 46, XX[20] | 46, XX
ALL_604 | M | 11 | BCP-ALL | B-other | TCF3-ZNF384 | 46, XY, del(7)(q22)[8]/46, XY, del(6)(q2?1)[7]/46, XY[17] | 46, XY, del(6)(q16.2q22.33), del(7)(q21.3q36.3), t(12;19)(p13.31;p13.3)
ALL_613 | M | 5 | BCP-ALL | B-other | EP300-ZNF384 | 46, XY, del(16)(q13q24)[5]/47-48, XY, +del(1)(q21), del(16)(q13q24), +mar[cp3]/46, XY[9] | 46, XY, dup(1)(q21q44), t(12;22)(p13.2;q13.2), del(16)(q21q24.3)
ALL_707 | M | 2 | BCP-ALL | B-other | PAX5-ELN | 46, XY, der(7)t(7;9)(q11;p13)del(9)(p21p24), der(9)t(7;9)(q11;p13)[9]/46, XY, idem, del(19)(q13)[15]/46, XY[1] | 46, XY, del(7)(q11), der(9)t(7;9)(q11;p13), del(9)(p13p24)
ALL_559 | M | 6 | T-ALL | T-ALL | | 46, XY, t(7;9)(q3?4;q3?2)[10].ish.del(9)(p21p21)x2, der(11)t(7;11)(q3?4;p1?3)/46, XY[15] | 46, XY, der(7)t(7;9)(q34;q31), t(7;11)(q34;p15), der(9)t(7;9)(q34;q31)del(9)(p21p21), del(9)(p21p21)

aThe parts of the karyotype revised after linked-read WGS are highlighted in bold.

For five of the 12 ALL genomes, detailed karyotype information obtained at diagnosis by G-banding or FISH for the subtype-defining genetic aberrations high hyperdiploidy (HeH), t(12;21) and t(9;22) was available and allowed verification of the results from linked-read WGS. The remaining six patients with either T-ALL or B-other subtype had either complex or incomplete karyotype available from ALL diagnosis. Their subtypes were determined in previous studies by a combination of WGS, RNA-sequencing, and/or arrays (Table S1)[11,19]. In all cases, existing karyotype information, newly generated FISH data (when cells were available), and/or a combination of Infinium arrays for copy number estimates and RNA-sequencing validated the findings from linked-read WGS. The results for each patient and subtype are detailed below, and for each case a revised karyotype after linked-read WGS is given in Table 1.

High Hyperdiploidy (HeH)

Two patients (ALL_370 and ALL_689) had the classical HeH subtype with 55 chromosomes. Using the linked-read WGS data, we binned the average sequencing coverage in 10 Kb bins across the genome and scanned for CNAs across the 22 autosomal chromosomes (Fig. 1a,b). The linked-read WGS estimates of copy number correlated perfectly with those from the karyotypes and array-based CNA analysis for ALL_370 and ALL_689. For a third patient (ALL_47) with suspected HeH subtype[21], we determined the HeH karyotype from the linked-read WGS data to be 58, XY, +4, +5, +6, +9, +10, +12, +14, +17, +18, −19, +19, +21, +21, +22, which was confirmed by array-based CNA analysis (Table 1; Fig. 1c). The copy-neutral loss of chromosome 19 (uniparental disomy) was visible in the linked-read WGS data as an overrepresentation of homozygous SNVs on chromosome 19 (Fig. S2).
Figure 1

Copy number by chromosome for the three ALL patients with the HeH subtype (a–c). The average linked-read WGS coverage calculated in 10 Kb bins is plotted in the top row of each panel. The Log R ratios from Infinium SNP and/or 450k array data are visualized in the lower part of each panel. Red coloring indicates chromosomal gains according to the color key above panel a.
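
The read-depth idea behind these coverage plots can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study (which relied on Long Ranger and CNVnator): the function names, the fixed read length, and the assumption that most chromosomes are disomic (so that the median chromosome depth represents two copies) are all hypothetical simplifications.

```python
from collections import defaultdict
from statistics import median

BIN_SIZE = 10_000  # 10 Kb bins, matching the bin size described in the text


def bin_coverage(read_starts, read_len=100, bin_size=BIN_SIZE):
    """Approximate mean depth per bin from (chrom, start) read positions.

    Simplification: each read is credited entirely to the bin that
    contains its start coordinate.
    """
    bases = defaultdict(int)
    for chrom, start in read_starts:
        bases[(chrom, start // bin_size)] += read_len
    return {key: total / bin_size for key, total in bases.items()}


def chromosome_copy_number(bin_depth, baseline_ploidy=2):
    """Estimate whole-chromosome copy number from binned depth.

    Assumption (hypothetical): the median of the per-chromosome median
    depths corresponds to `baseline_ploidy` copies.
    """
    per_chrom = defaultdict(list)
    for (chrom, _bin_index), depth in bin_depth.items():
        per_chrom[chrom].append(depth)
    chrom_medians = {c: median(v) for c, v in per_chrom.items()}
    baseline = median(chrom_medians.values())
    return {c: round(baseline_ploidy * m / baseline)
            for c, m in chrom_medians.items()}
```

With toy depths of 10 for disomic chromosomes, a chromosome at depth 15 would be called as three copies and one at depth 20 as four copies. Real data would additionally need GC and mappability correction, which this sketch omits.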


Translocations t(12;21) and t(9;22)

The t(12;21)[ETV6-RUNX1] translocation and associated aberrations were determined in two patients (ALL_386 and ALL_458) (Fig. 2a,b). As expected from karyotyping and previous WGS of patient ALL_458[11], a balanced t(12;21) translocation resulting in the expression of both the canonical ETV6-RUNX1 and the reciprocal RUNX1-ETV6 fusion genes was unambiguously detected at base-pair resolution in the linked-read WGS data (Fig. 2c). A deletion spanning a 2.1 Mb region that includes the second allele of ETV6 was observed on the other haplotype, thus affecting both alleles of ETV6 (Fig. S3). Besides gain of chromosome 10 and a heterozygous 38 Mb deletion of chromosome 11q22-q25, no other large structural variants were identified in ALL_458.
Figure 2

Structural aberrations detected by linked-read WGS in t(12;21)[ETV6-RUNX1] genomes. (a,b) Circos plots for patients ALL_386 and ALL_458. The first (outer) track shows the chromosomes and their banding, the second track shows log R ratios from Infinium arrays, the third track shows copy number determined by linked-read WGS in 10 Kb bins, and the fourth (innermost) track shows copy number calls using the CNVnator software. Red indicates gain and blue indicates deletion. Expressed fusion genes are highlighted within each circos plot; solid lines indicate in-frame fusion genes. (c) Heatmap of overlapping linked-reads supporting a balanced inter-chromosomal translocation t(12;21) resulting in the ETV6-RUNX1 fusion gene in ALL_458. (d) Linked-reads mapped to the two haplotypes at the ETV6 locus in patient ALL_386, showing a deletion on haplotype 1 (indicated by the red box); the breakpoint giving rise to the DCAF5-ETV6 and the ETV6-RUNX1 fusion genes is indicated by a dashed line on the second allele (haplotype 2). (e) Schematic representation of the chromosomal rearrangements resulting in derivative chromosomes as determined by linked-read WGS in ALL_386. The resulting fusion transcripts with breakpoints are drawn alongside the chromosomes involved in the translocations.
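
The barcode-overlap heatmaps in panel (c) rest on a simple principle: reads that carry the same barcode usually derive from the same long DNA molecule, so two distant loci that share many barcodes are likely adjacent in the sample genome. The Python sketch below illustrates that principle only. It is not the Long Ranger or Loupe algorithm; the read representation as (barcode, chromosome, position) tuples and both function names are hypothetical.

```python
def barcodes_in_region(reads, chrom, start, end):
    """Return the set of barcodes with at least one read in [start, end)."""
    return {bc for bc, c, pos in reads if c == chrom and start <= pos < end}


def shared_barcode_fraction(reads, region_a, region_b):
    """Fraction of barcodes shared between two regions.

    Each region is a (chrom, start, end) tuple. The denominator is the
    smaller of the two barcode sets, so the value lies between 0 and 1.
    A value near 0 is expected for unlinked loci; a high value hints at
    a junction joining the two regions.
    """
    a = barcodes_in_region(reads, *region_a)
    b = barcodes_in_region(reads, *region_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

In practice a background rate of chance barcode sharing would have to be estimated before calling a rearrangement; the sketch leaves that step out.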

In contrast, the karyotype for patient ALL_386 suggested a complex series of translocations involving ETV6 and RUNX1 and chromosomes 3, 12, 14 and 21. In a previous study, two in-frame fusion genes were identified in this patient (ETV6-RUNX1 and DCAF5-ETV6)[19]. Linked-read data resolved that the DCAF5-ETV6 fusion gene arose from a translocation between 14q24.1 and 12p13.2 and the ETV6-RUNX1 fusion gene arose from a translocation between 12p13.2 and 21q22.12. The phasing information further resolved a heterozygous 0.15 Mb intragenic deletion in ETV6 (haplotype 1) and showed that the ETV6-RUNX1 and DCAF5-ETV6 fusion genes originated from the other allele (haplotype 2) of ETV6, thus disrupting both copies of ETV6 in this patient (Fig. 2d). Linked-read WGS resolved the exact breakpoints on chromosomes 3, 12, 14 and 21, and identified several additional alterations that were missed by genetic analysis at diagnosis. Of these, DCAF5 (chr14) and the reciprocal RUNX1 (chr21) loci were separated by a 44 Mb insertion of a region originating from chromosome 2q33.1-q37.3 on the derivative chromosome 14q24.1 (Fig. 2e). Furthermore, a 650 Kb region from chromosome 3p21.31 was inverted and inserted into the derivative chromosome 3q21.2 arm where the material from chromosome 12q24.13 was translocated (Fig. S4). All of the derived chromosomes determined by linked-read WGS were subsequently validated by FISH (Fig. S5).

In patient ALL_402 with t(9;22)[BCR-ABL1], linked-read WGS revealed an unexpectedly complex rearrangement that involved the BCR (22q11.23), ABL1 (9q34.12), PRRC2B (9q34.13), SIL1 (5q31.2) and LINC01128 (1p36.33) loci (Fig. S6). In addition to the deletion of chromosome 9p21 reported in the karyotype, we detected a 35 Mb deletion (8p11.23-p23.3) and a gain starting at 8p11.23 and continuing through the entire q-arm of chromosome 8 (Fig. 3a). RNA-sequencing verified four junctions: the 5′ end of BCR is fused with the 3′ end of ABL1; the 5′ ends of the reciprocal ABL1 and SIL1 loci form a head-to-head translocation, resulting in two truncated transcripts; the 5′ end of LINC01128 is fused with the 3′ end of SIL1; and the 5′ end of PRRC2B is fused with the reciprocal 3′ end of the BCR gene (Fig. 3b). None of these complex rearrangements were phased in the linked-read WGS data, but phasing information was not required to fully resolve the structure of the breakpoints in this case.
Figure 3

Complex structural rearrangements in the patient ALL_402. (a) A circos plot depicting the genome-wide copy number changes in ALL_402. The first (outer) track shows the chromosomes and their banding, the second track shows log R ratios from Infinium arrays, the third track shows copy number determined by linked-read WGS in 10 Kb bins, and the fourth (innermost) track shows copy number calls using the CNVnator software. Red indicates gain and blue indicates deletion. Expressed fusion genes are highlighted inside the circos plot; solid lines indicate in-frame and dashed lines indicate out-of-frame fusion or truncated genes. (b) The derivative chromosomes as outlined using linked-read WGS. The structures of the expressed fusion genes are shown alongside their derivative chromosomes with the direction of transcription indicated by arrows.


B-other group

DUX4 and ZNF384 rearrangements define newly described subtypes of BCP-ALL that were initially detected in large-scale RNA-sequencing studies[17,18,22]. The DUX4-IGH fusion gene results from an insertion of the DUX4 gene (subtelomeric region of chr4q and 10q) into the enhancer region of the IGH locus (chr14)[23]. With the exception of a 93 Mb deletion on chromosome 6q14.1-q27 in ALL_390, the two patients with DUX4-IGH (ALL_390 and ALL_501) had normal karyotypes typical of this subtype (Fig. S7). Previous short-read WGS of ALL_501 failed to identify the DUX4-IGH rearrangement in this patient[11]. The DUX4-IGH rearrangement was not directly detected in the linked-read data by the Long Ranger software; however, with the aid of the Integrative Genomics Viewer, we were able to identify split linked-reads supporting the insertion of at least one copy of DUX4 into the IGH locus, thus supporting the rearrangement (Fig. S8). Besides the 6q deletion in ALL_390, the linked-read data revealed a compound heterozygous deletion of ERG transcript variant 1 (NCBI Reference Sequence: NM_182918.3). A large 6.5 Mb phase block on chromosome 21q22 enabled detection of a 9.3 Kb focal deletion of exon 1 on haplotype 1 and a separate 57.2 Kb deletion spanning exons 3–10 on haplotype 2 (Fig. 4a).
Figure 4

Structural rearrangements detected in B-other patients by linked-read WGS. (a) Linked-reads mapped to each of the two homologous chromosomes at the ERG locus on chromosome 21 in patient ALL_390. Reads are color-coded by chromosome and deletions are marked by red squares. (b,c) Heatmaps of overlapping linked-reads supporting subtype-defining balanced inter-chromosomal translocations from the 10x Genomics Loupe software. (b) The genomic breakpoint in chromosomes 12 and 19, resulting in the TCF3-ZNF384 fusion gene in patient ALL_604. (c) The genomic breakpoint in chromosomes 12 and 22, resulting in the EP300-ZNF384 fusion gene in patient ALL_613. (d) Ideogram of the structure of the translocation between chromosomes 7 and 9 in the patient ALL_707 resulting in the PAX5-ELN fusion gene, which is shown beside the derivative chromosome 9 with the direction of transcription indicated by an arrow. (e,f) Validation of the chromosome 7q deletion and derivative chromosome 9 by FISH in the patient ALL_707.
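
The reasoning behind calling the ERG deletions in panel (a) compound heterozygous can be written down as a small check: two deletions hit the same gene, lie in the same phase block, and sit on opposite haplotypes. The Python sketch below encodes that logic under stated assumptions. The `PhasedDeletion` record, the phase-block label, and any coordinates passed to it are hypothetical and are not taken from the study's call files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhasedDeletion:
    start: int        # 0-based start coordinate of the deletion
    end: int          # end coordinate (exclusive)
    phase_block: str  # identifier of the phase block containing the call
    haplotype: int    # 1 or 2 within that phase block


def overlaps(deletion, gene_start, gene_end):
    """True if the deletion overlaps the half-open gene interval."""
    return deletion.start < gene_end and deletion.end > gene_start


def is_compound_heterozygous(deletions, gene_start, gene_end):
    """True if the gene is hit on both haplotypes of one phase block.

    Haplotype labels are only comparable inside a single phase block,
    so deletions from different blocks are never combined.
    """
    haplotypes_by_block = {}
    for d in deletions:
        if overlaps(d, gene_start, gene_end):
            haplotypes_by_block.setdefault(d.phase_block, set()).add(d.haplotype)
    return any(haps >= {1, 2} for haps in haplotypes_by_block.values())
```

This is why the large phase block matters: if the two deletions fell into different phase blocks, the same data could not distinguish a compound heterozygous state from two deletions on one allele.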

The most common fusion gene partners of ZNF384 are the TCF3 and EP300 genes. Linked-read WGS determined the chromosomal breakpoints at base-pair resolution for the balanced translocations t(12;19)(p13.31;p13.3)[TCF3-ZNF384] in ALL_604 and t(12;22)(p13.31;q13.2)[EP300-ZNF384] in ALL_613 (Fig. 4b,c). The heterozygous deletions expected from the karyotypes in ALL_604 and ALL_613 were refined by linked-read WGS to 7q21.3-q36.3 and 6q16.2-q22.33 in ALL_604, and 16q21-q24.3 in ALL_613 (Table 1; Fig. S9). Gain of the q arm of chromosome 1, a common genomic aneuploidy in ALL[24], was observed in the linked-read data from patient ALL_689, but not in the diagnostic karyotype.

One patient with a PAX5-ELN fusion gene (ALL_707) detected by RNA-sequencing and short-read WGS was included[11]. The karyotype indicated two derivative chromosomes (chromosomes 7 and 9) as well as a 9p deletion. These aberrations were resolved at a higher resolution with linked-read WGS, which demonstrated a derivative chromosome 9 harboring the PAX5-ELN fusion gene, a truncated chromosome 7, as well as a heterozygous deletion of chromosome 9p13.2 with the breakpoint in the PAX5 locus (Fig. S10). The structure of the resulting derivative chromosomes and their validation by FISH are shown in Fig. 4d–f.

T-ALL

Based on karyotype, a bi-allelic deletion of chromosome 9p21 and two translocations involving chromosomes 7 and 9 and chromosomes 7 and 11 were expected in ALL_559. The homozygous deletion of chromosome 9p21 was clearly resolved in the linked-read WGS data (Fig. S11). Previous short-read WGS and RNA-sequencing data identified two translocations involving the T-cell receptor beta locus (TRBC2 gene) on chromosome 7, namely t(7;11)(q34;p15)[RIC3-TRBC2] and t(7;9)(q34;q31) resulting in the fusion of TRBC2 with an unannotated transcript expressed on chromosome 9 between the TAL2 and TMEM38B genes[11]. The linked-read WGS data clarified that the two alleles of TRBC2 were involved in independent translocation events. First, the t(7;11)(q34;p15) resulting in expression of RIC3-TRBC2 was a consequence of a balanced translocation of chromosome 7 involving one allele of TRBC2 (Fig. 5a). On the other allele of TRBC2, the t(7;9)(q34;q31) was accompanied by a 0.2 Mb deletion flanked by an inversion of chromosome 7q34 (Fig. 5b–d), a re-arrangement that was missed by both karyotyping and previous short-read WGS[11]. FISH verified the derivative chromosomes determined by linked-read WGS (Fig. 5e,f).
Figure 5

Chromosomal aberrations in the patient ALL_559 (T-ALL) determined by linked-read WGS. (a–c) Heatmaps from the 10x Genomics Loupe software of overlapping linked-reads indicating genomic rearrangements. (a) A balanced interchromosomal translocation between chromosomes 7 and 11. (b) A translocation between chromosomes 7 and 9, which is accompanied by a 0.2 Mb deletion flanked by an inversion of chromosome 7q34 on the second allele at the TRBC2 locus. The translocation results in an expressed fusion gene between TRBC2 and an unannotated gene located 500 bp upstream of TMEM38B on chromosome 9. (c) Zoomed in view of the inversion flanking the TRBC2 locus on 7q34. (d) Ideogram of the structure of the translocations observed in ALL_559. The chromosomes are drawn to scale using the CyDAS software. (e) Whole chromosomal paint depicting the translocation of material from chromosome 7 to chromosomes 9 and 11. (f) Whole chromosomal paint of chromosome 9 depicting the balanced translocation involving chromosome 7.


Detection of key diagnostic deletions for ALL

To further demonstrate that linked-read WGS allows detection of aberrations other than large-scale aneuploidies and translocations, we screened the 12 ALL genomes for focal deletions in a set of relevant genes for ALL, including BTG1, CDKN2A/B, EBF1, ETV6, IKZF1, PAX5, RB1 and ERG[25] (Fig. S12). With the exception of RB1, each of the genes analyzed was deleted in at least one patient based on linked-read WGS. All deletions were verified by array-based CNA analysis. Phasing data revealed that both of the t(12;21) cases harbored ETV6 deletions on the allele that was not affected by the translocation, thus resulting in bi-allelic disruption of ETV6. Consistent with previous studies[23,26], recurrent BTG1 and IKZF1 deletions were detected in the t(12;21) and DUX4-IGH patients, respectively (Fig. S13).

Discussion

In our study, the linked-reads enabled highly accurate resolution of the majority of the genomic aberrations defined by cytogenetic methods, and refined or identified new structural rearrangements in 10 of the 12 analyzed ALL genomes. Although the number of ALL subtypes and samples was modest, these results provide proof of principle for linked-read WGS for digital karyotyping in ALL. Studies that have applied linked-read WGS to other cancer types such as triple negative breast cancer[27], metastatic gastric tumors[28], prostate cancer[29], and cell lines[30,31] have reached similar conclusions.

Linked-read WGS requires long input DNA molecules to gain the most benefit from the technology[3]. However, extraction and handling of HMW DNA is not practical in most clinical settings. In our study, DNA from patient samples with an average fragment size < 50 kb, prepared using a standard column-based DNA extraction method, was highly informative for detection of genomic aberrations with linked-read WGS. When we compared HMW DNA to DNA from standard column extractions, and when we compared low-coverage GemCode to Chromium library preparation, the results were concordant. Although HMW DNA may increase the chances of phasing over chromosomal breakpoints, which makes interpretation of the chromosomal structure and organization easier, our data suggest that long DNA molecules and high sequencing depth may not be required for accurate detection of prognostically relevant aberrations present in the major clone of leukemic samples.

Although the genomic structure of most chromosomal rearrangements that are of clinical relevance in ALL was resolved with high precision by linked-read WGS, the recently described DUX4-IGH fusion gene was not precisely resolved by this technology. The DUX4-IGH rearrangement is a particularly challenging aberration to resolve due to the location of DUX4 in the complex tandemly repeated region D4Z4 in the subtelomeric region of chr4q and chr10q[32], and the insertion of DUX4 into the IGH locus. This complexity is likely the reason for the lack of identification of the recurrent DUX4-IGH fusion gene prior to recent RNA-sequencing studies in ALL[17-19]. Nonetheless, a guided analysis based on identifying split linked-reads that map to the DUX4 and IGH loci identified support for the insertion of at least one copy of DUX4 into the IGH locus in the linked-read WGS data.

The present study is limited by the fact that we have not compared the linked-reads to other next generation approaches such as standard paired-end WGS, Hi-C, third-generation single-molecule sequencing, or optical mapping technologies, which when used in a multiplatform approach have been demonstrated to be a powerful method for resolving complex structural rearrangements[33-35]. Future studies will be required for more formal benchmarking of linked-read WGS and other next generation technologies for digital karyotyping specifically in ALL and for other cancer types.

In summary, we focused on detecting large-scale structural aberrations, which are the most relevant type of aberrations for clinical care in ALL[36]. We generated a detailed view of large-scale chromosomal aberrations in cells from pediatric ALL patients, which reaches beyond the resolution of traditional karyotyping data[11,12,37]. Our data suggest that digital karyotyping by linked-read WGS can replace, or at least complement, traditional clinical diagnostic methods such as G-banding and FISH in the future.

Patients and Methods

Patient samples

Primary ALL samples were collected as described previously[38]. The patients were selected from the NOPHO cohort based on presence of cytogenetic aberrations detected at diagnosis or fusion genes detected by previous WGS or RNA-sequencing studies (Table S1)[11,19,21]. DNA and RNA were extracted from 2–10 million cells using the AllPrep DNA/RNA Mini Kit, AllPrep DNA/RNA/miRNA Universal Kit, or the MagAttract HMW DNA kit (Qiagen). The DNA concentrations were measured using the Qubit dsDNA Broad Range assay (Invitrogen). The study was approved by the Regional Ethics Review Board in Uppsala, Sweden and was conducted according to the guidelines of the Declaration of Helsinki. The patients and/or their guardians provided written informed consent.

Molecular diagnosis, karyotyping, and FISH

ALL diagnosis was established by analysis of leukemic cells with respect to morphology, immunophenotype, and cytogenetic aberrations. High hyperdiploidy (HeH) was defined as presence of 51–67 chromosomes per cell[39]. FISH or RT-PCR analyses were used to screen for t(12;21)(p13;q22)[ETV6-RUNX1] and t(9;22)(q34;q11)[BCR-ABL1]. Whole-chromosome paint (Metasystems XCP orange/green XCyting Chromosome Paints) and subtelomeric probes (Vysis Totelvysion probes) followed by analysis using a fluorescence microscope (Carl Zeiss) and the Isis software (MetaSystems) were used to validate translocations identified by linked-read WGS on metaphase spreads from cultured bone marrow cells.
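
The HeH definition above is a simple arithmetic rule, and the chromosome counts in Table 1 can be checked against their listed gains and losses. The following Python sketch does this for simplified karyotype strings. It is an illustration only: the function names are hypothetical, and the parser counts only whole-chromosome gains and losses (tokens such as `+21` or `-19`), ignoring structural terms such as `dup(...)`, `+mar`, or `+del(...)`, as well as clone brackets.

```python
import re

GAIN = re.compile(r"\+(\d+|X|Y)")
LOSS = re.compile(r"[-−](\d+|X|Y)")  # accepts ASCII hyphen or Unicode minus


def modal_chromosome_number(karyotype, base=46):
    """Chromosome count implied by whole-chromosome gains and losses."""
    tokens = [t.strip() for t in karyotype.split(",")]
    gains = sum(1 for t in tokens if GAIN.fullmatch(t))
    losses = sum(1 for t in tokens if LOSS.fullmatch(t))
    return base + gains - losses


def is_high_hyperdiploid(chromosome_count):
    """HeH as defined in the text: 51-67 chromosomes per cell."""
    return 51 <= chromosome_count <= 67


# Revised karyotype of ALL_47 from Table 1: 13 gains and 1 loss give 58.
all_47 = "58, XY, +4, +5, +6, +9, +10, +12, +14, +17, +18, -19, +19, +21, +21, +22"
count = modal_chromosome_number(all_47)
print(count, is_high_hyperdiploid(count))  # prints: 58 True
```

The same count applied to the revised karyotypes of ALL_370 and ALL_689 gives 55 and 54, matching the numbers stated at the start of each string.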

Library construction and sequencing

GemCode and Chromium libraries for linked-read WGS (10x Genomics) were prepared from 1–1.2 ng of genomic DNA following the manufacturer’s protocols for GemCode and Chromium V1 reagents. GemCode libraries (n = 12) were sequenced on an Illumina HiSeq 2500 instrument (read1: 98 bp, i7: 8 bp, i5: 14 bp, read2: 98 bp) to an average depth of 14×. Chromium libraries (n = 5) were sequenced on an Illumina HiSeqX instrument with 150 bp paired-end reads to an average depth of 32×.

Linked-read data analysis

Linked-read WGS data were processed and phased using the Long Ranger pipeline from 10x Genomics (v1.2.0 for GemCode and v2.1.6 for Chromium) with the hg19/GRCh37 reference genome. Data were visualized using the Loupe Genome Browser v2.1.1. SVs called by Long Ranger were manually reviewed against karyotype data, CNA data from Illumina Infinium arrays, and fusion genes detected by RNA-sequencing. Genomic copy number levels were estimated by read-depth analysis and chromosomal segmentation in 10 kb windows using the CNVnator software[40]. B-allele frequencies were calculated from VCF files using the VariantAnnotation package and custom scripts in R[41]. Ideograms of derivative chromosomes were drawn to scale with the CyDAS software[42].
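The two quantities described above, windowed read depth and B-allele frequency, can be illustrated with a short sketch. This is not the CNVnator algorithm or the R scripts used in the study; it is a simplified stand-in written in Python, with hypothetical function names. It assumes that read start positions for one chromosome are already available as integers, that the VCF record carries an `AD` (allelic depth) field for the first sample, and that the median window is representative of two copies, which may not hold in highly aneuploid genomes.

```python
from collections import Counter
from statistics import median

WINDOW = 10_000  # 10 kb windows, matching the window size stated in the text


def window_counts(read_starts, window=WINDOW):
    """Count reads per fixed-size window from 0-based start positions.

    Windows without any reads are simply absent from the result.
    """
    return Counter(pos // window for pos in read_starts)


def copy_number_estimates(counts, ploidy=2):
    """Scale window counts so that the median window maps to `ploidy` copies."""
    baseline = median(counts.values())
    return {w: ploidy * c / baseline for w, c in sorted(counts.items())}


def b_allele_frequency(vcf_line):
    """BAF = alt depth / (ref depth + alt depth), read from the AD field.

    Assumes a single-sample, tab-separated VCF record with one ALT allele.
    Returns None when no reads cover the site.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    depths = sample_values[format_keys.index("AD")].split(",")
    ref_depth, alt_depth = int(depths[0]), int(depths[1])
    total = ref_depth + alt_depth
    return alt_depth / total if total else None
```

In a diploid region, heterozygous sites are expected to have a BAF near 0.5; a gain or loss of one haplotype shifts the BAF away from 0.5 and changes the windowed depth in the same region, which is why the two signals are reviewed together.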

RNA-sequencing

An RNA-sequencing library was constructed from 300 ng of total RNA with the TruSeq stranded total RNA protocol (Illumina) for sample ALL_402. The library was sequenced on a NovaSeq 6000 instrument with 100 bp paired-end reads. Strand-specific RNA-sequencing data were available from previous studies for all of the remaining patient samples except ALL_370, for which RNA was not available (Table S1)[11,19,21]. Fusion genes were called with FusionCatcher v0.99.7d[43] and validated using a previously described approach[19].

Copy Number Analysis

Infinium HumanMethylation450 BeadChip (450k array) data from all samples are available at the Gene Expression Omnibus (GSE49031)[44]. The R package “CopyNumber450kCancer” was used to detect CNAs[45]. Genomic DNA (200 ng) from nine patient samples was genotyped on HumanOmni2.5 Exome-8v1 SNP arrays (Illumina). CNAs were called from the SNP array data using the Tumor Aberration Prediction Suite[46].

Supplementary Figures S1–S13. Supplementary Tables S1–S3.