Ferran Nadeu1,2, Rut Mas-de-Les-Valls3, Alba Navarro3,4, Romina Royo5, Silvia Martín3,4, Neus Villamor3,4,6, Helena Suárez-Cisneros7, Rosó Mares3, Junyan Lu8, Anna Enjuanes3,7, Alfredo Rivas-Delgado3,6, Marta Aymerich3,4,6, Tycho Baumann6, Dolors Colomer3,4,6,9, Julio Delgado3,4,6, Ryan D Morin10,11, Thorsten Zenz12, Xose S Puente4,13, Peter J Campbell14, Sílvia Beà3,4,9, Francesco Maura14,15, Elías Campo3,4,6,9.
Abstract
Immunoglobulin (Ig) gene rearrangements and oncogenic translocations are routinely assessed during the characterization of B cell neoplasms and the stratification of patients with distinct clinical and biological features. This assessment is done using Sanger sequencing, targeted next-generation sequencing, or fluorescence in situ hybridization (FISH). Currently, a complete Ig characterization cannot be extracted from whole-genome sequencing (WGS) data due to the inherent complexity of the Ig loci. Here, we introduce IgCaller, an algorithm designed to fully characterize Ig gene rearrangements and oncogenic translocations from short-read WGS data. Using a cohort of 404 patients comprising different subtypes of B cell neoplasms, we demonstrate that IgCaller identifies both heavy and light chain rearrangements and provides additional information on their functionality, somatic mutational status, class switch recombination, and oncogenic Ig translocations. Our data thus support IgCaller as a reliable alternative to Sanger sequencing and FISH for studying the genetic properties of the Ig loci.
Year: 2020 PMID: 32636395 PMCID: PMC7341758 DOI: 10.1038/s41467-020-17095-7
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Overview of IgCaller and Ig loci at short-read WGS level.
a Bioinformatic steps to fully characterize the rearranged Ig gene from short-read WGS data. IgCaller extracts the rearranged sequences from already aligned reads (BAM file). The output of IgCaller might be used as input to downstream specific programs for a complete Ig annotation. b Schema of the Ig loci at B cell genome level (top) and reference genome level (bottom) for the different loci/rearrangements analyzed by IgCaller. Of note, although the IGK locus is oriented on the negative strand, the IGKV4-1, IGKV5-2, and IGKV genes within the distal cluster are inverted and therefore oriented on the positive strand. Thus, rearrangements involving these IGKV genes are formed through inversions rather than deletions. The WGS reads that cover the rearrangements and are used by IgCaller are depicted in each scenario. White arrows represent the coding strand.
Fig. 2 Benchmarking of IgCaller: characterization of the IGH locus.
a Bar plot showing the percentage of cases with productive IGH rearrangements by IgCaller in each cohort. b Dot plots of the percentage of identity of the rearranged IGHV sequence to the germ line by IgCaller (y axis) and SSeq/NGS (x axis). The 95% confidence interval is depicted by the light blue area. The gray area highlights cases in which the presence of a high density of clustered mutations impairs an accurate identification of the percentage of identity. P values are from t-test. c Comparison of the number of mutations associated with signature 9 (SBS9, left y axis) and the identity of the rearranged sequence both by SSeq and IgCaller (right y axis) in the C1-CLL cohort. Source data are provided as a Source data file.
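The percentage of identity compared in panel b can be illustrated with a minimal sketch. It assumes the rearranged and germ line sequences are already aligned to equal length and that positions with an ambiguous base (N) are excluded from the comparison. The function name `percent_identity` and these conventions are hypothetical choices for illustration, not the procedure used by IgCaller or by the annotation tools it feeds.

```python
def percent_identity(rearranged: str, germline: str) -> float:
    """Percent of compared positions at which two pre-aligned sequences agree.

    Assumptions (illustrative only): both strings have the same length,
    and positions where either sequence carries 'N' are not compared.
    """
    if len(rearranged) != len(germline):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for r, g in zip(rearranged.upper(), germline.upper()):
        if r == "N" or g == "N":
            continue  # ambiguous base: skip this position
        compared += 1
        if r == g:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# One mismatch over ten compared positions
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(f"{identity:.1f}% identity to germ line")
```

A sequence with a high density of clustered mutations (the gray area in panel b) would still yield a number here, but the value depends heavily on how the alignment was produced.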
Fig. 3 Benchmarking of IgCaller: IGLC rearrangements, CSR, and oncogenic Ig translocations.
a Agreement between the IGLC productive rearrangement detected by IgCaller and FC analysis. b TTFT and OS of patients with CLL according to the presence of IGLV3-21 rearrangements. P values for TTFT curves are from Gray test. P values for the multivariate analysis of OS are from Cox regression. c Comparison of the CSR identified by IgCaller and FC. d TTFT of CLL patients according to the presence of CSR. P values are from Gray test. e Circular representation of the oncogenic Ig rearrangements (translocations and deletions) identified by IgCaller genome-wide. Frequencies of recurrent alterations are shown. Source data are provided as a Source data file.
Fig. 4 Sequencing depth and tumor purity requirements for IgCaller.
a Sensitivity of IgCaller to detect a complete and productive IGH or IGK/L rearrangement at different ranges of coverage for CLL and MCL cases. b Downsampling experiment with 29 tumor samples. The sensitivity of IgCaller is shown for each specific mean coverage analyzed. c Identity for IGH (top) and IGK/L (bottom) gene rearrangements for each case at different downsampling conditions. d Sensitivity of IgCaller at distinct tumor cell contents after mixing tumor/normal (T/N) pairs at different ratios. The mean depth was set to 30×. e Similar to d but with a mean depth of 60×. Note that only purities of 50%, 35%, 20%, and 5% were analyzed. f Ig gene identity according to tumor cell content in different T/N mixing conditions. g Sensitivity of IgCaller when tumor samples are mixed with a polyclonal-like population. h Oligoclonal situation created in silico by mixing at different ratios two tumor samples carrying two IGH rearrangements each; one productive (Prod.) and one unproductive (Unprod.). The score of each rearrangement according to IgCaller is shown. A score of 0 is used for illustrative purposes for rearrangements not identified in a specific mixing condition. The score is calculated based on the number of reads supporting each rearrangement (“Methods”). Source data are provided as a Source data file.
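The arithmetic behind an in silico tumor/normal mix such as those in panels d and e can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the original tumor sample is treated as 100% tumor cells, the normal sample as 0%, and the function name `mixing_fractions` and its parameters are hypothetical. The returned values are the fractions of reads to subsample from each BAM file so that the merged file reaches the target mean depth and tumor cell content.

```python
def mixing_fractions(tumor_depth: float, normal_depth: float,
                     target_depth: float, purity: float) -> tuple[float, float]:
    """Read-sampling fractions for a tumor/normal mix.

    Assumptions (illustrative only): the tumor sample is pure, the normal
    sample contains no tumor cells, and depth scales linearly with the
    fraction of reads kept.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")

    # Depth each sample must contribute to the merged file
    tumor_fraction = purity * target_depth / tumor_depth
    normal_fraction = (1.0 - purity) * target_depth / normal_depth

    if tumor_fraction > 1.0 or normal_fraction > 1.0:
        raise ValueError("input samples are too shallow for this mix")
    return tumor_fraction, normal_fraction


# 20% tumor cell content at 30x from a 60x tumor and a 30x normal sample
t_frac, n_frac = mixing_fractions(60.0, 30.0, 30.0, 0.20)
print(f"keep {t_frac:.2f} of tumor reads and {n_frac:.2f} of normal reads")
```

In this example the tumor contributes 6× and the normal 24× of the 30× total, which is why a deep normal sample is needed to reach low purities at a fixed depth.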