
CRIS: complete reconstruction of immunoglobulin V-D-J sequences from RNA-seq data.

Rashedul Islam1,2,3, Misha Bilenky3, Andrew P Weng4,5, Joseph M Connors6, Martin Hirst1,2,3.   

Abstract

MOTIVATION: B cells display remarkable diversity in producing B-cell receptors through recombination of immunoglobulin (Ig) V-D-J genes. Somatic hypermutation (SHM) of immunoglobulin heavy chain variable (IGHV) genes is used as a prognostic marker in B-cell malignancies. Clinically, IGHV mutation status is determined by targeted Sanger sequencing, which is a resource-intensive and low-throughput procedure. Here, we describe a bioinformatic pipeline, CRIS (Complete Reconstruction of Immunoglobulin IGHV-D-J Sequences), that uses RNA sequencing (RNA-seq) datasets to reconstruct IGHV-D-J sequences and determine IGHV SHM status.
RESULTS: CRIS extracts RNA-seq reads aligned to Ig gene loci, performs assembly of Ig transcripts and aligns the resulting contigs to reference Ig sequences to enumerate and classify SHMs in the IGHV gene sequence. CRIS improves on existing tools that infer the B-cell receptor repertoire from RNA-seq data using a portion of the IGHV gene segment by de novo assembly. We show that the SHM status identified by CRIS using the entire IGHV gene segment is highly concordant with clinical classification in three independent chronic lymphocytic leukemia patient cohorts.
AVAILABILITY AND IMPLEMENTATION: The CRIS pipeline is available under the MIT License from https://github.com/Rashedul/CRIS.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
© The Author(s) 2021. Published by Oxford University Press.

Year:  2021        PMID: 34806017      PMCID: PMC8600631          DOI: 10.1093/bioadv/vbab021

Source DB:  PubMed          Journal:  Bioinform Adv        ISSN: 2635-0041


1 Introduction

During development in the bone marrow, B lymphocytes undergo rearrangement of immunoglobulin (Ig) heavy (V, D and J) and light chain (V and J) gene segments through recombination (Fig. 1). Addition or deletion of nucleotides occurs at segment junctions during recombination. In the germinal center, B cells acquire additional somatic hypermutation (SHM) within the Ig variable regions as part of the adaptive immune response to generate a B-cell receptor (BCR) repertoire diversity estimated to be as much as ∼10^18 (Briney ; Janeway ). Following SHM, B cells are positively selected for further differentiation into memory B cells or antibody-secreting plasma cells (Akkaya ).
Fig. 1.

IGHV-D-J recombination and SHM during B-cell development. BCRs are generated by ordered assembly of the Ig heavy chain gene segments (V, D and J) during B-cell development. Addition and deletion of junctional nucleotides (N) contribute to the diversity of BCR repertoires. BCR sequences undergo affinity maturation upon antigen stimulation through SHMs in the variable domain (indicated in black arrows). SHMs of Ig are enriched at the complementarity-determining regions (CDRs)

Profiling of the B-cell Ig repertoire has become an essential component of immune research and is used clinically for malignant B-cell classification (Briney ; Georgiou ). B-cell malignancies arise at different stages of B-cell development and BCR diversification is used as both a prognostic and diagnostic marker (Georgiou ; Monk ). The presence of SHM and specific usage of immunoglobulin heavy chain variable (IGHV) genes are prognostic markers in different B-cell malignancies, including chronic lymphocytic leukemia (CLL), mantle cell lymphoma (MCL) and follicular lymphoma (Berget ; Damle ; Hamblin ; Navarro ). Malignant B cells are classified into two major subtypes based on the SHM status: cells with very low SHM are classified as the ‘unmutated IGHV’ subtype, while cells with evidence of SHM are classified as the ‘mutated IGHV’ subtype. Unmutated IGHV subtypes of CLL and MCL show more aggressive disease compared to the mutated IGHV subtype (Damle ; Hamblin ; Navarro ). IGHV gene usage is also used as a prognostic marker in follicular lymphoma (Berget ). SHM analysis of the IGHV gene is commonly performed using multiplex PCR and Sanger sequencing following the best practice guidelines by the European Research Initiative on CLL (ERIC) (Ghia ). However, the PCR-Sanger method is resource-intensive and technically challenging in both clinical and research applications and suffers from a 9% to 18% failure rate (Stamatopoulos ). 
Massively parallel sequencing of targeted genomic DNA regions or RNA has emerged as an alternative method to reliably sequence V-D-J segments (Boyd and Joshi, 2014; Georgiou ; Menzel ; Yaari and Kleinstein, 2015). RNA sequencing (RNA-seq) has become the gold standard for transcriptome analysis, applied in both clinical and research settings and has been used in limited cases to identify BCR rearrangement repertoire (Blachly ; Iglesia ; Monk ; Mose ). Several bioinformatic pipelines have been developed to infer BCR repertoire from RNA-seq data, including ABRA (Iglesia ), TRUST (Hu ), ImReP (Mandric ), MiXCR (Bolotin ), V’DJer (Mose ) and IgID (Blachly ). Among them, ABRA (Iglesia ) and IgID (Blachly ) were not published with stand-alone code to allow for replication. The remaining IGHV-D-J reconstruction tools (e.g. TRUST, ImReP, MiXCR and V'DJer) were designed to reconstruct only the CDR3 region, representing only a portion of the IGHV gene, while the entire IGHV gene segment is required to determine the SHM status in B-cell malignancies. In addition, these tools have not been validated against gold standard PCR-Sanger datasets for SHM classification. To address these gaps in determining IGHV mutational status in B-cell malignancies, we developed a bioinformatic pipeline, CRIS (Complete Reconstruction of Immunoglobulin IGHV-D-J Sequences), which extracts RNA-seq reads aligned to putative Ig loci, assembles the complete IGHV gene, identifies the most abundant Ig transcript and enumerates SHMs by comparison with germline reference sequences. Classification of IGHV mutational subtypes by CRIS was validated against PCR-Sanger-based clinical classification in three independent cohorts of CLL patients and shown to be comparable.

2 Methods

2.1 CLL samples

In the Centre for Epigenomic Technology (CEMT) cohort, peripheral blood samples were obtained from CLL patients undergoing treatment at BC Cancer (n = 16) and used according to procedures approved by the Research Ethics Board (REB H12-01767) of the University of British Columbia (Supplementary Table S1). RNA was purified from those peripheral blood samples and extraction was performed on CD19+ sorted cells with >90% purity as described (Pellacani ).

2.2 RNA sequencing

The CEMT CLL RNA-seq datasets were generated as described (Pellacani ). RNA extraction, library construction and sequencing were performed following the guidelines formulated by the International Human Epigenome Consortium (http://www.ihec-epigenomes.org). These guidelines as well as the standard operating procedures for RNA-seq library construction and sequencing are available at https://thisisepigenetics.ca/for-scientists/protocols-and-standards and by request. Additional CLL patient RNA-seq datasets with matching IGHV mutation status were collected from published datasets: GSE66228 (Blachly ), EGAD00001004046 (Beekman ) and phs000435.v3 (Wang ).

2.3 Identification of putative Ig loci

We identified five putative Ig loci enriched with reads that were used to reconstruct Ig containing contigs in the 16 CEMT samples (Table 1). The detailed procedure of identifying Ig loci is described in Supplementary Figure S1a.
Table 1.

Genomic coordinates of the putative Ig loci in the GRCh38 reference

Chromosome/contig         Start         End           Length (bp)
Chr14                     105 550 001   106 880 000   1 329 999
Chr15                     21 710 000    22 190 000    480 000
Chr16                     31 950 001    33 970 000    2 019 999
chr14_KI270726v1_random   1             43 739        43 739
chr16_KI270728v1_random   1             1 872 759     1 872 759
Step 1: Read extraction prior to assembly of Ig transcripts: A BAM file (hg38-bam-file) was created by aligning the reads to the GRCh38 reference genome using BWA mem (v0.7.6a; Li and Durbin, 2009). Using sambamba (v0.7.0; Tarasov ), we extracted reads that were aligned to the putative IGHV loci and saved them in fastq format using Picard SamToFastq (v2.20.3; Broad Institute, 2009). The resulting paired-end reads originating from the putative Ig loci were used as input for Trinity (v2.1.1; Grabherr ) for de novo transcriptome assembly.
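As an illustration of how the Table 1 coordinates might feed the read-extraction step, the following minimal sketch writes them out as a BED region file. It is not taken from the CRIS code base: the names `PUTATIVE_IG_LOCI`, `to_bed_lines` and `write_bed` are hypothetical, the UCSC-style `chr` prefixes are assumed to match the BAM header, and the table coordinates are assumed to be 1-based and inclusive.

```python
# Putative Ig loci copied from Table 1 (GRCh38).
# ASSUMPTION: coordinates are 1-based inclusive and contig names use "chr" prefixes.
PUTATIVE_IG_LOCI = [
    ("chr14", 105_550_001, 106_880_000),
    ("chr15", 21_710_000, 22_190_000),
    ("chr16", 31_950_001, 33_970_000),
    ("chr14_KI270726v1_random", 1, 43_739),
    ("chr16_KI270728v1_random", 1, 1_872_759),
]


def to_bed_lines(loci):
    """Convert 1-based inclusive intervals to BED lines (0-based, half-open)."""
    return [f"{chrom}\t{start - 1}\t{end}" for chrom, start, end in loci]


def write_bed(path, loci=PUTATIVE_IG_LOCI):
    """Write the loci to a BED file, one region per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(to_bed_lines(loci)) + "\n")


if __name__ == "__main__":
    print("\n".join(to_bed_lines(PUTATIVE_IG_LOCI)))
```

A region file of this kind could then be handed to a region-aware BAM filter before converting the selected reads to fastq; the exact command-line options depend on the tool and version used.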
Fig. 2.

CRIS workflow. CRIS extracts reads from the putative Ig loci prior to assembly of Ig transcripts and quantifies transcript abundances. The percent of IGHV mutations of Ig transcripts is calculated by comparing to the germline sequences.

Step 2: Identification of Ig transcripts and their abundances: The Trinity assembly performed in the previous step produced around 250 transcripts per sample. To retain the transcripts that have similarity (expectation value ≤20) with the germline IGHV sequences, we used blastn (v2.9.0; Altschul ) with default parameters and a custom database of IGHV sequences downloaded from the international ImMunoGeneTics information system (IMGT) (Giudicelli ). The resulting Ig transcripts were passed to Salmon (v0.8.1; Patro ) to quantify their abundances with a k-mer length of 31 bp. The transcript with the highest TPM (transcripts per million) value was marked as the dominant clone.

Step 3: SHM and clonotype analysis: The Ig-transcript sequences identified in step 2 were queried in IgBLAST (v1.14.0; Ye ) against the germline V, D and J gene database of IMGT. IgBLAST returned the percent identity of the IGHV segment of Ig transcripts compared to the germline alleles and clustered similar Ig transcripts into clonotypes. The productive Ig transcript with the highest TPM value was used to determine the IGHV mutation status of each CLL sample and was compared with the available clinical PCR-Sanger data. Transcripts with TPM values within one log10 of the most highly expressed transcript were also considered when comparing with the PCR-Sanger data, as described previously (Blachly ).
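The selection rules in steps 2 and 3 (dominant clone by highest TPM, plus transcripts within one log10 of it) can be sketched as follows. This is a simplified illustration under stated assumptions, not the CRIS implementation: the function name `select_transcripts`, its `log10_window` parameter and the example TPM values are hypothetical, and the productivity filter applied via IgBLAST is not modeled.

```python
def select_transcripts(tpm_by_transcript, log10_window=1.0):
    """Return (dominant_id, candidate_ids).

    dominant_id is the transcript with the highest TPM; candidate_ids are all
    transcripts whose TPM lies within `log10_window` orders of magnitude of it,
    sorted from highest to lowest TPM.
    """
    if not tpm_by_transcript:
        raise ValueError("no Ig transcripts supplied")
    dominant = max(tpm_by_transcript, key=tpm_by_transcript.get)
    top = tpm_by_transcript[dominant]
    if top <= 0:
        raise ValueError("highest TPM must be positive")
    # Within one log10 of the top means TPM >= top / 10 for the default window.
    threshold = top / (10 ** log10_window)
    candidates = sorted(
        (tid for tid, tpm in tpm_by_transcript.items() if tpm >= threshold),
        key=tpm_by_transcript.get,
        reverse=True,
    )
    return dominant, candidates


if __name__ == "__main__":
    # Hypothetical TPM values for three assembled Ig transcripts.
    example = {"tx1": 5000.0, "tx2": 800.0, "tx3": 12.0}
    print(select_transcripts(example))
```

In this toy input, `tx2` falls within one order of magnitude of `tx1` and is retained as a candidate, whereas `tx3` is not.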

2.5 Analysis of SHM status using V’DJer, TRUST and MiXCR

V’DJer, TRUST (v3.0.3) and MiXCR (v3.0.3) were run on the RNA-seq bam file generated by the STAR (v2.7.5a) aligner (Dobin ) using the GRCh38 genome as reference. During STAR alignment, ‘--outSAMunmapped Within’ was used to include the unmapped reads in the bam file. All three tools were run with default parameters to generate VDJ contigs of IGH. VDJ contigs were analyzed by IgBLAST to generate the percent identity of IGHV sequences compared to the germline database.

3 Results

3.1 De novo assembly-based Ig detection from RNA-seq

De novo assembly using Trinity (Grabherr ) for 16 deeply sequenced (∼300 M read pairs) CLL RNA-seq libraries generated an average of ∼450 000 contigs per sample with 6–29 contigs demonstrating IGHV sequence homology. However, de novo assembly of the complete RNA-seq read sets required significant computational resources (Hölzer and Marz, 2019) and thus we sought to identify the fraction of reads in the RNA-seq libraries corresponding to the Ig loci. Using the resulting assemblies, we found that on average 99.85% of the sequence reads used to reconstruct IGHV-containing contigs originated from five putative Ig loci in the GRCh38 reference (Supplementary Table S2). These putative Ig loci consist of the human Ig locus, Ig pseudogene loci and unlocalized contigs at chromosomes 14, 15 and 16 (Table 1 and Supplementary Fig. S2). This suggests that sequence reads used to reconstruct Ig sequences map not only to the reference Ig locus but also to pseudogene regions, both within the current assembly and in unlocalized contigs. We hypothesized that this novel set of loci could be used as a highly specific filter to reconstruct IGHV-D-J sequences.

3.2 CRIS pipeline development

Given the time required to complete a full assembly from an RNA-seq library, we sought to extract Ig sequences prior to performing assembly. For this, we leveraged the putative Ig loci identified in our pilot set of libraries, retrieved sequences aligned by BWA mem (Li and Durbin, 2009) within these coordinates (∼1% of all reads) and subjected these to de novo assembly using Trinity (Grabherr ). Enriching for Ig sequences from the bulk RNA-seq sequence set reduced the run time for assembly by two orders of magnitude while not significantly impacting the subsequent SHM analysis of Ig transcripts (Supplementary Table S3). We confirmed that our approach also successfully assembled the IGHV-D-J and N-junctional segments (Fig. 3a, Table 2 and Supplementary Table S4).
Fig. 3.

Evaluation of CRIS to reconstruct IGHV-D-J sequences. (a) The most abundant Ig transcript from the US-1422278 sample was aligned to the germline database using IgBLAST; the top-hit germline genes are shown. In the alignment, mismatches are represented as nucleotide bases and matches as dots. The alignment length, number of matches and number of mismatches are 296, 280 and 16, respectively. The total number of matched nucleotides between the query and germline IGHV sequence is used to calculate percent identity, e.g. 100*(280/296) = 94.6%. N-junctional sequences are highlighted in gray boxes. (b) Fraction of the IGHV gene assembled in two CLL RNA-seq datasets with different sequence depths and lengths as indicated. An unpaired two-tailed t-test demonstrated no significant (P = 0.15) difference between the two distributions (NS). (c) Scatter plot comparing the percent of mutation of IGHV as predicted by CRIS and clinical PCR-Sanger-based analysis for 16 CLL patient samples obtained from GSE66228.

Table 2.

Concordance of IGHV gene prediction and percent mutation between PCR-Sanger-based analysis and CRIS

            Sanger                CRIS
Sample ID   IGHV    Mutation (%)  IGHV         IGHV mutation (%)  IGHD         IGHJ      No. of Ig transcripts  No. of clonotypes
US-1422282  V1-69   0.4           IGHV1-69*04  0.3                IGHD6-19*01  IGHJ4*02  7                      4
US-1422366  V1-18   0.34          IGHV1-18*04  0                  IGHD3-3*01   IGHJ6*02  21                     5
US-1422311  V3-11   2             IGHV3-11*01  2                  IGHD4-17*01  IGHJ4*02  5                      4
US-1422278  V3-74   5.4           IGHV3-74*01  5.4                IGHD5-18*01  IGHJ6*02  5                      3
US-1422335  V4-59   10.2          IGHV4-59*02  8.5                IGHD3-10*01  IGHJ4*02  3                      2
US-1422321  V3-66   0.7           IGHV3-66*02  0.7                NA           IGHJ4*02  9                      4
US-1422333  V4-34   0             IGHV4-34*01  0                  IGHD3-3*01   IGHJ6*02  6                      3
US-1422356  V2-70   0.8           IGHV2-70*01  0.3                IGHD3-16*01  IGHJ3*02  15                     8
US-1422368  V3-74   6.1           IGHV3-74*03  8.8                IGHD1-1*01   IGHJ5*02  2                      2
US-1422309  V3-53   8.8           IGHV3-53*01  6.1                IGHD3-10*01  IGHJ6*03  4                      3
US-1422302  V2-70   0.3           IGHV2-70*01  0.3                IGHD2-15*01  IGHJ4*02  20                     4
US-1422351  V1-46   0             IGHV1-46*01  0                  IGHD3-10*01  IGHJ4*02  6                      3
US-1422314  V1-3    0.7           IGHV1-3*01   0                  IGHD6-19*01  IGHJ4*02  5                      3
US-1422342  V3-21   0             IGHV3-21*01  0                  IGHD3-16*01  IGHJ4*02  4                      2
US-1422350  V3-48   2.8           IGHV3-48*03  2.4                IGHD3-22*01  IGHJ4*02  3                      2
US-1422352  V1-46   0             IGHV1-46*01  0                  IGHD3-22*01  IGHJ6*02  17                     4

Notes: CRIS reconstructed V-D-J segments of Ig transcripts and identified multiple transcripts per sample that belong to different clonotypes. NA is used in cases where IGHD genes were absent.

Having established that enrichment of sequences using our Ig feature set significantly reduced compute resources without reducing sensitivity, we next examined the impact of RNA-seq sequencing depth. For this, we leveraged a set of 16 CLL RNA-seq libraries with an average of ∼28 million paired reads from GSE66228 (Blachly ) and compared these to the results obtained from our deeply sequenced CEMT libraries (∼300 million paired reads). We found no significant difference in the fraction of the IGHV gene assembled between deep and shallow RNA-seq libraries (Fig. 3b). 
This appears to be in part due to the high expression level of the dominating clone in the GSE66228 dataset (Blachly ) driving sufficient sequence read coverage (at least 104 reads) for Trinity to assemble the Ig transcript. However, as expected, the overall number of Ig transcripts identified correlated with the sequencing depth. Based on this analysis, we developed a pipeline called CRIS and benchmarked its ability to call IGHV mutation status. In the CRIS pipeline, we automate the process of read extraction from our novel putative Ig coordinates, perform quality trimming of selected sequence reads, assemble transcripts, enumerate transcript abundances and identify somatic mutations using reference germline sequences for SHM classification.

3.3 CRIS is concordant with clinical IGHV mutation status

Having established that CRIS could efficiently assemble IGHV transcripts, we explored its ability to call SHM status in CLL. Unmutated CLL (uCLL) is clinically defined by IGHV sequence alignments of >98% identity to the reference sequence (Georgiou ; Monk ). To benchmark CRIS against gold standard Sanger-based clinical classification, we analyzed a series of published RNA-seq libraries from CLL patients with matched Sanger sequencing classifications. CRIS reported SHM on clonally amplified Ig transcripts and its classification of mutated/unmutated CLL (mCLL/uCLL) showed perfect concordance with Sanger-based clinical calls in the GSE66228 dataset (Blachly ; Table 2). The percent mutations reported by CRIS and the clinical test were also highly correlated (Pearson’s r = 0.95, 95% CI 0.86–0.98; Fig. 3c). The reported IGHV mutational frequency was identical in 8/16 cases, with the remaining cases showing small deviations (mean deviation 0.22%) that did not change the SHM classification. In seven of the eight divergent cases, the percent IGHV mutation reported by the Sanger-based test was higher than that reported by CRIS (Table 2). Closer inspection of the alignments revealed that this is likely an artifact in the Sanger calls due to incomplete IGHV coverage by the PCR product used as the denominator to calculate percent identity (Blachly ). In addition to calling mCLL/uCLL status, CRIS also reported 2–8 dominant clonotypes in the GSE66228 dataset, a feature not detected by clinical Sanger-based classifiers. We further benchmarked CRIS using two independent CLL RNA-seq datasets with matched IGHV mutation status determined by Sanger sequencing. In the phs000435.v3 dataset (Wang ), CRIS calls were identical to the Sanger-based calls in 50/51 cases with 98.3% accuracy, 100% sensitivity and 97.3% specificity (Fig. 4a). A single sample (DFCI-5121) was reported as mCLL (Wang ); however, CRIS classified it as uCLL. 
In the third independent dataset, EGAD00001004046 (Beekman ), CRIS agreed with clinical classification in all cases and determined the identical IGHV gene as the dominant clone (Fig. 4b and Supplementary Fig. S1b).
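The mutation call itself reduces to a percent-identity calculation followed by a threshold. The sketch below illustrates this using the alignment counts given in the Fig. 3a caption (280 matches over 296 aligned bases) and the >98% identity definition of uCLL quoted above. The function names are hypothetical, and treating exactly 98% as mutated is an assumption that follows the strict inequality in the text; clinical guidelines should be consulted for the boundary case.

```python
def percent_identity(matches, alignment_length):
    """Percent identity of the IGHV segment against its germline allele."""
    if alignment_length <= 0 or not 0 <= matches <= alignment_length:
        raise ValueError("invalid alignment counts")
    return 100.0 * matches / alignment_length


def classify_ighv(identity_pct, cutoff=98.0):
    """Return 'unmutated' if identity exceeds the cutoff, else 'mutated'.

    ASSUMPTION: a strict '>' comparison, mirroring the '>98% identity'
    wording in the text.
    """
    return "unmutated" if identity_pct > cutoff else "mutated"


if __name__ == "__main__":
    identity = percent_identity(matches=280, alignment_length=296)
    print(f"{identity:.1f}% identity -> {classify_ighv(identity)}")
```

The percent mutation reported in Table 2 is simply 100 minus this identity value.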
Fig. 4.

Comparison of CRIS with clinical data and existing tools. (a and b) Confusion matrix represents the classification accuracy of CRIS compared to Sanger-PCR data in two independent CLL cohorts. The P-value was calculated by one-sided binomial test. (c) Comparison of CRIS, V’DJer and TRUST to reconstruct the proportion of IGHV sequences in GSE66228 (Blachly ) dataset. The average fraction of IGHV gene length for each tool is represented by dashed horizontal lines

3.4 Comparison of CRIS against existing tools

We next compared CRIS with previously published tools that reconstruct BCR repertoires from short-read RNA-seq data: V’DJer (Mose ), TRUST (Hu ) and MiXCR (Bolotin ). In 16 CLL RNA-seq samples obtained from GSE66228 (Blachly ), V’DJer did not produce full-length IGHV as it is designed to generate contigs of fixed length (360 bp) spanning the CDR3 region. Thus, on average, V’DJer assembled 75.44% of the IGHV gene whereas CRIS reconstructed 99.74% (Fig. 4c). Partial reconstruction of the IGHV gene could lead to misclassification of IGHV mutation status, especially for samples with IGHV sequence identity near the established 98% cutoff. For example, in US-1422368, CRIS reconstructed 295 bp out of 296 bp of the IGHV3-74*03 sequence whereas V’DJer assembled 226 bp (Supplementary Fig. S3a and b). The additional 69 bp reported by CRIS contained two mutations, which resulted in a 1.8% difference in reported percent identity between CRIS (91.2%) and V’DJer (89.4%). TRUST assembled only 41.5% of the IGHV gene on average using the GSE66228 dataset (Fig. 4c). Furthermore, V’DJer and TRUST did not produce a contig for US-1422282, which contained the IGHV1-69 gene, whereas CRIS generated an IGHV1-69-containing contig in agreement with the clinical call. To compare the computational performance of CRIS and V’DJer, both pipelines were configured to use up to 16 threads. In the shallow libraries from the GSE66228 dataset, CRIS had a ∼14% faster total run time (average 3.07 wall-clock minutes) compared to V’DJer (average 3.50 wall-clock minutes). Using deeper RNA-seq datasets (∼300 million reads), V’DJer took about five times longer to run than CRIS (87 versus 16 wall-clock minutes on average). Using 16 threads, TRUST took 36 wall-clock minutes on average using GSE66228, an order of magnitude longer than CRIS. MiXCR (Bolotin ) generated partial CDR3 sequence contigs with <10% of the IGHV gene sequence in the GSE66228 dataset of 75 bp read length. 
MiXCR recommends ≥100 bp read length to extract CDR3 repertoires from RNA-Seq data. Thus, our comparisons suggest that existing BCR reconstruction tools developed to extract just CDR3 regions perform poorly compared to CRIS in the determination of SHM status because they are designed to generate and analyze partial IGHV sequences. Overall, CRIS showed increased sensitivity and specificity and reduced run time over existing RNA-seq-based BCR reconstruction tools.
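The effect of partial reconstruction on percent identity in the US-1422368 example can be checked with a little arithmetic. The sketch below is illustrative only: the aligned lengths (295 bp and 226 bp) come from the text, but the mismatch counts (26 and 24) are back-calculated assumptions, chosen to reproduce the reported identities and to differ by the two mutations found in the additional 69 bp. They are not values reported by the authors.

```python
def identity_pct(aligned_bp, mismatches):
    """Percent identity over the aligned portion of the IGHV segment."""
    if aligned_bp <= 0 or not 0 <= mismatches <= aligned_bp:
        raise ValueError("invalid alignment counts")
    return 100.0 * (aligned_bp - mismatches) / aligned_bp


# Aligned lengths are from the text; mismatch counts are ASSUMED (back-calculated).
full = identity_pct(aligned_bp=295, mismatches=26)     # near full-length contig
partial = identity_pct(aligned_bp=226, mismatches=24)  # fixed-length partial contig

if __name__ == "__main__":
    print(f"full-length: {full:.1f}%  partial: {partial:.1f}%")
```

Because both the numerator and the denominator change when part of the gene is missing, a partial contig can shift the reported identity in either direction, which matters most for samples close to the 98% cutoff.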

4 Discussion

PCR-Sanger-based Ig SHM classification is resource-intensive, subject to PCR bias, and suffers from an ∼9% to 18% failure rate (Ghia ; Stamatopoulos ). In contrast, RNA-seq is now routinely applied in the clinical setting, eliminates the need for targeted amplification of the Ig locus and can be used to identify the BCR rearrangement repertoire (Blachly ). Here, we showed that CRIS can rapidly analyze RNA-seq data to detect IGHV mutation status in CLL at a sensitivity and specificity equivalent to current Sanger-based clinical tests. Furthermore, CRIS was able to reconstruct the entire IGHV sequence, thus increasing the accuracy of SHM classification. This is in contrast to a majority of existing pipelines, which are designed to infer only CDR3-derived sequences (Bolotin ; Hu ; Mose ). A registry of ∼1500 CLL patients showed that 90% of patients were not screened for IGHV mutations (Mato ). In the public domain, there are thousands of RNA-seq datasets available for different B-cell malignancies, but the SHM status of their IGHV genes is either not reported or only partially reported. Furthermore, for a majority of publicly available RNA-seq datasets where SHM status is reported, detailed IGHV mutation reports with gene name, percent identity and clonal frequency are not available, restricting the ability to assess mutational values. To meet this need, we developed CRIS and demonstrated its ability to rapidly classify IGHV mutational status with clinical accuracy. We anticipate that CRIS will prove useful in the mining of available B-cell RNA-seq datasets and that it will provide a framework to incorporate RNA-seq as a diagnostic tool to examine BCR clonal rearrangement and SHM status.
  32 in total

1.  MiXCR: software for comprehensive adaptive immunity profiling.

Authors:  Dmitriy A Bolotin; Stanislav Poslavsky; Igor Mitrophanov; Mikhail Shugay; Ilgar Z Mamedov; Ekaterina V Putintseva; Dmitriy M Chudakov
Journal:  Nat Methods       Date:  2015-05       Impact factor: 28.547

2.  STAR: ultrafast universal RNA-seq aligner.

Authors:  Alexander Dobin; Carrie A Davis; Felix Schlesinger; Jorg Drenkow; Chris Zaleski; Sonali Jha; Philippe Batut; Mark Chaisson; Thomas R Gingeras
Journal:  Bioinformatics       Date:  2012-10-25       Impact factor: 6.937

3.  Targeted deep sequencing reveals clinically relevant subclonal IgHV rearrangements in chronic lymphocytic leukemia.

Authors:  B Stamatopoulos; A Timbs; D Bruce; T Smith; R Clifford; P Robbe; A Burns; D V Vavoulis; L Lopez; P Antoniou; J Mason; H Dreau; A Schuh
Journal:  Leukemia       Date:  2016-10-31       Impact factor: 11.528

4.  Ig V gene mutation status and CD38 expression as novel prognostic indicators in chronic lymphocytic leukemia.

Authors:  R N Damle; T Wasil; F Fais; F Ghiotto; A Valetto; S L Allen; A Buchbinder; D Budman; K Dittmar; J Kolitz; S M Lichtman; P Schulman; V P Vinciguerra; K R Rai; M Ferrarini; N Chiorazzi
Journal:  Blood       Date:  1999-09-15       Impact factor: 22.113

5.  Landscape of B cell immunity and related immune evasion in human cancers.

Authors:  Xihao Hu; Jian Zhang; Jin Wang; Jingxin Fu; Taiwen Li; Xiaoqi Zheng; Binbin Wang; Shengqing Gu; Peng Jiang; Jingyu Fan; Xiaomin Ying; Jing Zhang; Michael C Carroll; Kai W Wucherpfennig; Nir Hacohen; Fan Zhang; Peng Zhang; Jun S Liu; Bo Li; X Shirley Liu
Journal:  Nat Genet       Date:  2019-02-11       Impact factor: 38.330

6.  Prognostic B-cell signatures using mRNA-seq in patients with subtype-specific breast and ovarian cancer.

Authors:  Michael D Iglesia; Benjamin G Vincent; Joel S Parker; Katherine A Hoadley; Lisa A Carey; Charles M Perou; Jonathan S Serody
Journal:  Clin Cancer Res       Date:  2014-06-10       Impact factor: 12.531

7.  Real-world clinical experience in the Connect® chronic lymphocytic leukaemia registry: a prospective cohort study of 1494 patients across 199 US centres.

Authors:  Anthony Mato; Chadi Nabhan; Neil E Kay; Mark A Weiss; Nicole Lamanna; Thomas J Kipps; David L Grinblatt; Ian W Flinn; Mark F Kozloff; Christopher R Flowers; Charles M Farber; Pavel Kiselev; Arlene S Swern; Kristen Sullivan; E Dawn Flick; Jeff P Sharman
Journal:  Br J Haematol       Date:  2016-11-08       Impact factor: 6.998

8.  Antigen receptor repertoire profiling from RNA-seq data.

Authors:  Dmitriy A Bolotin; Stanislav Poslavsky; Alexey N Davydov; Felix E Frenkel; Lorenzo Fanchi; Olga I Zolotareva; Saskia Hemmers; Ekaterina V Putintseva; Anna S Obraztsova; Mikhail Shugay; Ravshan I Ataullakhanov; Alexander Y Rudensky; Ton N Schumacher; Dmitriy M Chudakov
Journal:  Nat Biotechnol       Date:  2017-10-11       Impact factor: 54.908

9.  Comprehensive evaluation and optimization of amplicon library preparation methods for high-throughput antibody sequencing.

Authors:  Ulrike Menzel; Victor Greiff; Tarik A Khan; Ulrike Haessler; Ina Hellmann; Simon Friedensohn; Skylar C Cook; Mark Pogson; Sai T Reddy
Journal:  PLoS One       Date:  2014-05-08       Impact factor: 3.240

10.  Profiling immunoglobulin repertoires across multiple human tissues using RNA sequencing.

Authors:  Igor Mandric; Jeremy Rotman; Harry Taegyun Yang; Nicolas Strauli; Dennis J Montoya; William Van Der Wey; Jiem R Ronas; Benjamin Statz; Douglas Yao; Velislava Petrova; Alex Zelikovsky; Roberto Spreafico; Sagiv Shifman; Noah Zaitlen; Maura Rossetti; K Mark Ansel; Eleazar Eskin; Serghei Mangul
Journal:  Nat Commun       Date:  2020-06-19       Impact factor: 14.919
