Jerome Samir, Simone Rizzetto, Money Gupta, Fabio Luciani.
Abstract
BACKGROUND: Single-cell RNA sequencing provides an unprecedented opportunity to simultaneously explore the transcriptomic and immune receptor diversity of T and B cells. However, few tools are available that can simultaneously analyse large multi-omics datasets integrated with metadata such as patient and clinical information.
Keywords: B cell receptor; Immune cells; Multi-omics; T cell receptor; scRNA-seq
Year: 2020 PMID: 32070336 PMCID: PMC7029546 DOI: 10.1186/s12920-020-0696-z
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Overview of VDJView. Top: VDJView upload page, showing where required inputs (immune receptor sequences and gene expression matrix) and optional inputs (metadata) can be uploaded. Bottom: examples of analysis using scRNA-seq from primary cancer tissues and metastatic lymph node, revealing clonally expanded T and B cells. The table (top left) shows a clonal expansion of IGL chains across primary breast tissue and metastatic lymph node. The Circos plot (bottom left) shows the IgL V and J gene pairings identified. Dimensionality reduction using UMAP (top right) shows a cluster of B cells derived from metastatic lymph node in two patients with ER+ HER2+ breast cancer, while T and B cells from the primary breast cancer tissue had a similar gene signature regardless of molecular subtype. The pseudo-time plot (bottom right) shows the inferred evolutionary trajectory between all immune cells, determined by genes that differentiate primary from metastatic tissues in two subjects with matched samples
List of modules implemented in VDJView with their outputs and integrated packages
| Module | Description | Software packages | Output |
|---|---|---|---|
| Filtering | Selection of cells based on metadata, gene and immune receptor features | dplyr | Venn Diagram, data-table |
| Quality control | Metrics with options for easily filtering cells according to total read counts, number of genes, and percentage of mitochondrial/ribosomal genes | Seurat | Violin plots |
| Random sampling | Selection of small subsets of data, providing the ability to analyse larger datasets | Seurat | |
| Clonotype usage | Pie charts of single- and paired-chain CDR3 contig usage for both T and B cells. Tables detailing single- and paired-chain CDR3 contigs generated across all cells | plotly | Pie charts, data-tables |
| CDR3 length | Distribution of CDR3 lengths for single- and paired chains | tcR | Histograms |
| VDJ gene usage | Distribution of V, D and J gene usage for single chains | tcR | Histograms |
| Gene interactions | Frequencies of inter- and intra-chain VDJ gene pairings, and inter-chain CDR3 pairings | RCircos | Circos plots |
| Shared clonotypes | Table and scatter plot detailing the number of single- and paired-chain CDR3 contigs and VDJ genes that occur in multiple subgroups, and their frequency in each group | tcR | Scatter plot, data-table |
| Dimensionality reduction | PCA plot, t-SNE plot and UMAP plot with customisable parameters. Metadata can be used to control data point shape, size and colour. Data points are selectable and displayed with their metadata in a data-table below each plot | Scater | PCA plots, t-SNE plots, UMAP plots, data-tables |
| Unsupervised clustering | Consensus matrix, gene expression heatmaps and marker-gene heatmaps are calculated by SC3 based on user defined cluster ranges, p-values and AUROC values. Metadata can be displayed above plot. Gene list can be uploaded to generate an expression heatmap. Tabular SC3 clustering information is generated | Scater, SC3 | Consensus matrix, Expression matrix, DE Genes heatmap, Marker genes heatmap, data-table |
| Supervised clustering | Differentially expressed gene heatmap generated by MAST comparing groups of cells based on clusters predetermined by the user, p-value and fold change thresholds. Gene fold change values and a tabular version of the heatmap are generated | MAST | Gene expression matrix, data-tables |
| Pseudo-time | Pseudo-time plot to determine single-cell state trajectories based on genes which are differentially expressed between user defined metadata groups | Monocle | Pseudo-time cell trajectory plot |
| Cell metadata summary | Tabular summary of the cells uploaded, the metadata associated with them, and the number of receptor contigs and expressed genes reported for each cell | | Data-table |
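The quality control module in the table above filters cells by total read counts, number of detected genes, and percentage of mitochondrial genes. The following is a minimal Python sketch of that kind of threshold filter, not VDJView's own implementation (VDJView relies on Seurat for this step). The `CellQC` record, the `filter_cells` helper, and all threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics (hypothetical record, not a VDJView type)."""
    cell_id: str
    total_counts: int   # total read counts for the cell
    n_genes: int        # number of genes detected in the cell
    pct_mito: float     # percentage of counts from mitochondrial genes


def filter_cells(cells, min_counts=1000, min_genes=200, max_pct_mito=10.0):
    """Keep cells that pass all three thresholds.

    The default thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    return [
        c for c in cells
        if c.total_counts >= min_counts
        and c.n_genes >= min_genes
        and c.pct_mito <= max_pct_mito
    ]


# Toy cells, invented for this example.
cells = [
    CellQC("cell_a", 5000, 1200, 3.5),   # passes every threshold
    CellQC("cell_b", 400, 150, 2.0),     # too few reads and genes
    CellQC("cell_c", 8000, 2000, 25.0),  # too much mitochondrial signal
]

kept = filter_cells(cells)
print([c.cell_id for c in kept])
```

In an interactive tool, thresholds like these would typically be exposed as user-adjustable controls, with the violin plots showing the distribution of each metric before and after filtering.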
Fig. 2 Analysis of CD8+ antigen-specific T cells sampled from Donor 1. a Unsupervised clustering with k = 8 clusters, p-value = 0.01, AUROC = 0.8. Epitope species specificity, the four largest TCR clones, surface protein expression levels, and the percentage of mitochondrial genes are annotated. b t-SNE plots coloured by the results of clustering, epitope species, TCR clone and genes of interest (CCR7, CMC1, LEF1), with point size corresponding to the highest tetramer read count of each cell, CD45RO TotalSeq expression, and genes of interest (GZMH, CST7, TCF7), show that clustering is preserved and that clonally expanded T cells dominate the major clusters. Genes of interest reveal further sub-clusters of cells. c Pseudo-time plots reveal a naïve to effector phenotype transition, with cluster preservation at the extremes of each state and a clear trajectory for influenza-specific T cells
Fig. 3 Summary of donor 1 and donor 2 clonal repertoires. The top 16 clones for each donor are displayed in pie charts, and the TRBV gene usage across all TCRs in each donor is detailed in the histograms
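The two summaries in this figure, the largest clones per donor and the V gene usage distribution, both reduce to frequency counts over cells. The sketch below shows one way to compute them in Python, assuming each cell carries a clonotype label and a TRBV gene call. The function names and the toy inputs are hypothetical and are not taken from VDJView or the paper's data.

```python
from collections import Counter


def top_clones(clonotypes, n=16):
    """Return the n most frequent clonotypes as (clonotype, cell count) pairs."""
    return Counter(clonotypes).most_common(n)


def gene_usage(v_genes):
    """Return the fraction of cells using each V gene."""
    counts = Counter(v_genes)
    total = sum(counts.values())
    return {gene: count / total for gene, count in counts.items()}


# Toy inputs, one entry per cell (invented for this example).
clonotypes = ["cloneA", "cloneA", "cloneB", "cloneC", "cloneA", "cloneB"]
v_genes = ["TRBV7-9", "TRBV7-9", "TRBV19", "TRBV20-1"]

print(top_clones(clonotypes, n=2))
print(gene_usage(v_genes))
```

The counts from `top_clones` would feed a pie chart, and the fractions from `gene_usage` would feed a histogram, one per donor.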
TCR clones shared between donor 1 and donor 2, and the species they target with the number of occurrences in each donor
| TCR_CDR3 | Species | d1 | d2 |
|---|---|---|---|
| CAGHTGNQFYF_CASSWGGGSHYGYTF | EBV | 2379 | 37 |
| CAVGDNFNKFYF_CASSLYSATGELFF | EBV | 1511 | 23 |
| CAARVRGFGNVLHC_CASSLYSATGELFF | EBV | 1442 | 19 |
| CAASGYDYKLSF_CSVSASGGDEQYF | CMV, EBV | 214 | 10 |
| CAVFLYGNNRLAF_CSVSASGGDEQYF | CMV, EBV | 199 | 11 |
| CAASETSYDKVIF_CASSFSGNTGELFF | EBV | 38 | 5810 |
| CADSGGGADGLTF_CASSLRDGSEAFF | EBV | 34 | 4428 |
| CAASETSYDKVIF_CASSWGGGSHYGYTF | EBV | 14 | 35 |
| CAGAGSQGNLIF_CASSIRSSYEQYF | Influenza | 16 | 345 |
| CAVTDGGSQGNLIF_CASSIRSSYEQYF | Influenza | 120 | 39 |
| CAGAHGSSNTGKLIF_CASSIRSAYEQYF | Influenza | 71 | 48 |
| CAVSGSQGNLIF_CASSIRSSYEQYF | Influenza | 10 | 79 |
| CAAGGSQGNLIF_CASSIRSSYEQYF | Influenza | 10 | 77 |
| CAGGGSQGNLIF_CASSIRSSYEQYF | Influenza | 469 | 1094 |
| CAGGGSQGNLIF_CASSVRSSYEQYF | Influenza | 119 | 72 |
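A table of this kind can be derived by counting each paired-chain clonotype within every group and keeping those that occur in all groups. The following Python sketch illustrates the idea under stated assumptions: clonotypes are written as `alpha_beta` strings as in the table, `shared_clonotypes` is a hypothetical helper, and the input is toy data rather than the donors' repertoires. VDJView itself lists the tcR package for this module.

```python
from collections import Counter


def shared_clonotypes(groups):
    """Find clonotypes present in every group and count them per group.

    groups maps a group name (e.g. a donor) to a list of clonotype
    strings, one per cell. Returns {clonotype: {group: count}}.
    """
    counts = {name: Counter(clones) for name, clones in groups.items()}
    shared = set.intersection(*(set(c) for c in counts.values()))
    return {
        clone: {name: counts[name][clone] for name in groups}
        for clone in sorted(shared)
    }


# Toy repertoires with placeholder clonotype labels (not real CDR3 sequences).
groups = {
    "d1": ["alpha1_beta1", "alpha1_beta1", "alpha2_beta2", "alpha3_beta3"],
    "d2": ["alpha1_beta1", "alpha3_beta3", "alpha3_beta3", "alpha4_beta4"],
}

for clone, per_group in shared_clonotypes(groups).items():
    print(clone, per_group)
```

Matching on the full `alpha_beta` string treats two clones as shared only when both chains are identical, which is why the table above lists several rows with the same beta chain but different alpha chains as separate clones.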