Lies Van Horebeek, Kelly Hilven, Klara Mallants, Annemarie Van Nieuwenhuijze, Tiina Kelkka, Paula Savola, Satu Mustjoki, Susan M Schlenner, Adrian Liston, Bénédicte Dubois, An Goris.
Abstract
The role of somatic variants in diseases beyond cancer is increasingly being recognized, with potential roles in autoinflammatory and autoimmune diseases. However, as mutation rates and allele fractions are lower, studies in these diseases are substantially less tolerant of false positives, and bioinformatics algorithms require high replication rates. We developed a pipeline combining two variant callers, MuTect2 and VarScan2, with technical filtering and prioritization. Our pipeline detects somatic variants with allele fractions as low as 0.5% and achieves a replication rate of >55%. Validation in an independent data set demonstrates excellent performance (sensitivity > 57%, specificity > 98%, replication rate > 80%). We applied this pipeline to the autoimmune disease multiple sclerosis (MS) as a proof-of-principle. We demonstrate that 60% of MS patients carry 2–10 exonic somatic variants in their peripheral blood T and B cells, with the vast majority (80%) occurring in T cells and variants persisting over time. Synonymous variants significantly co-occur with non-synonymous variants. Systematic characterization indicates that somatic variants are enriched for being novel or very rare in public databases of germline variants and trend towards being more damaging and conserved, as reflected by higher phred-scaled combined annotation-dependent depletion (CADD) and genomic evolutionary rate profiling (GERP) scores. Our pipeline and proof-of-principle now warrant further investigation of common somatic genetic variation on top of inherited genetic variation in the context of autoimmune disease, where it may offer subtle survival advantages to immune cells and contribute to the capacity of these cells to participate in the autoimmune reaction.
Year: 2019 PMID: 30541027 PMCID: PMC6452186 DOI: 10.1093/hmg/ddy425
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Baseline patient characteristics
| Patient | Sex | | | MS type | MSSS | OCB | IgG | Treatment | Comorbidities |
|---|---|---|---|---|---|---|---|---|---|
| MS-1 | F | 24 | 16 | BOMS | 1.04 | pos | 0.65 | Interferon- | — |
| MS-2 | M | 36 | 18 | BOMS | 1.03 | pos | 1.55 | Interferon- | — |
| MS-3 | M | 31 | 30 | BOMS | 1.79 | NA | NA | Never treated | — |
| MS-4 | F | 24 | 44 | BOMS | NA | NA | NA | Interferon- | — |
| MS-5 | M | 22 | 15 | BOMS | 0.71 | pos | 0.92 | Interferon- | — |
| MS-6 | F | 20 | 8 | BOMS | 4.96 | neg | 0.83 | Fingolimod | — |
| MS-7 | M | 40 | 9 | BOMS | 0.24 | NA | 0.56 | Interferon- | — |
| MS-8 | F | 45 | 22 | PPMS | 3.65 | pos | 1.09 | Never treated | Breast + kidney cancer, HT |
| MS-9 | F | 29 | 14 | BOMS | 1.92 | pos | 1.92 | Interferon- | — |
| MS-10 | M | 42 | 6 | BOMS | 2.01 | pos | 1.31 | Interferon- | — |
MSSS: multiple sclerosis severity score; OCB: oligoclonal band status; IgG: immunoglobulin G; F: female, M: male; BOMS: bout onset MS; PPMS: primary progressive MS; pos: positive; neg: negative; HT: Hashimoto thyroiditis; NA: not available.
Figure 1. Pipeline for the detection of somatic variants in autoimmune diseases based on the overlap of variant callers MuTect2 and VarScan2. AAF: alternate allele fraction; GDI: gene damage index; N: total count; *: default/adapted MuTect2 filters (original/final pipeline).
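The overlap step named in the Figure 1 caption can be illustrated with a minimal sketch. It assumes each caller's output has already been parsed into simple records; the `Call` type, the `overlap_calls` function, the `min_aaf` parameter and the third toy record are hypothetical, and the technical filters and prioritization of the actual pipeline are not modelled. Taking the AAF from the MuTect2 record is also an assumption.

```python
from typing import List, NamedTuple


class Call(NamedTuple):
    """One candidate somatic variant (hypothetical record type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    aaf: float  # alternate allele fraction, in percent


def overlap_calls(mutect2: List[Call], varscan2: List[Call],
                  min_aaf: float = 0.5) -> List[Call]:
    """Keep variants reported by both callers with AAF >= min_aaf (percent).

    The AAF is read from the MuTect2 record, which is an assumption.
    """
    def key(call: Call):
        return (call.chrom, call.pos, call.ref, call.alt)

    varscan_keys = {key(c) for c in varscan2}
    return [c for c in mutect2 if key(c) in varscan_keys and c.aaf >= min_aaf]


# Toy input: the first two records reuse coordinates from the table of
# replicated variants; the third is invented to show the AAF cut-off.
mutect2_calls = [
    Call("2", 24952447, "C", "T", 2.00),
    Call("3", 48508943, "C", "G", 2.78),
    Call("1", 1000, "A", "G", 0.20),
]
varscan2_calls = [
    Call("2", 24952447, "C", "T", 1.90),
    Call("1", 1000, "A", "G", 0.30),
]

shared = overlap_calls(mutect2_calls, varscan2_calls)
```

Requiring agreement between two callers trades sensitivity for fewer false positives, which matches the abstract's emphasis on high replication rates at low allele fractions.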
Overview of replicated somatic variants
| * | Chr | Pos | Ref | Alt | Gene | AA | Cell type | AAF screen (%) | AAF repl (%) | p repl | CADD | MSC | GERP | Kaviar | COSMIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | | | | |
| * | 2 | 24952447 | C | T | | P988P | T | 2.00 | 0.57 | 9.56E-207 | 13.15 | 3.31 | 4.98 | 6.50E-6 | − |
| | 3 | 48508943 | C | G | | L352V | T | 2.78 | 2.53 | 1.05E-149 | 23.50 | 5.61 | 1.35 | 0 | − |
| * | 11 | 8734271 | G | A | | R247C | T | 0.75 | 1.12 | 1.42E-272 | 35.00 | 17.32 | 5.28 | 6.50E-6 | − |
| | 19 | 49703983 | G | A | | R611H | T | 1.60 | 0.85 | 2.67E-87 | 30.00 | 0.09 | 4.57 | 5.82E-5 | − |
| | | | | | | | | | | | | | | | |
| * | 1 | 93202076 | T | C | | T54A | T | 0.98 | 1.14 | 8.66E-108 | 0.08 | 3.31 | −0.13 | 1.29E-5 | − |
| * | 2 | 33590430 | A | T | | D1156V | T | 0.70 | 0.80 | 3.71E-56 | 27.20 | 3.31 | 5.67 | 0 | − |
| * | 3 | 46244852 | C | G | | R318T | B | 1.22 | 1.41 | 5.80E-143 | 0.24 | 3.31 | −2.00 | 0 | − |
| * | 11 | 63403722 | T | C | | N294S | T | 2.51 | 2.72 | 0 | 23.10 | 27.30 | 5.55 | 4.53E-5 | − |
| * | 11 | 128680557 | A | G | | K152E | T | 1.16 | 1.45 | 2.24E-291 | 23.90 | 23.30 | 3.89 | 0 | − |
| * | 12 | 123812503 | G | C | | G456G | T | 0.79 | 5.58 | 0 | 3.43 | 3.31 | 2.02 | 0 | − |
| * | 16 | 67116210 | A | T | | E165V | B | 1.85 | 2.79 | 2.10E-163 | 33.00 | 15.45 | 5.54 | 0 | − |
| * | Y | 16734258 | C | T | | R87W | B | 1.07 | 0.77 | 1.33E-06 | 27.10 | 15.26 | 0.54 | 0 | − |
| | | | | | | | | | | | | | | | |
| * | 6 | 74073541 | G | A | | Q204Q | T | 5.63 | 3.77 | 0 | 0.97 | 22.30 | 1.63 | 0 | − |
| * | 16 | 88504410 | T | C | | L3483P | T | 1.25 | 4.28 | 2.38E-06 | 3.60 | 0.00 | 0.28 | 0 | − |
| | | | | | | | | | | | | | | | |
| | 2 | 135888230 | G | A | | R392Q | T | 0.70 | 0.51 | 4.90E-10 | 22.70 | 8.10 | 3.43 | 3.23E-5 | + |
| | 6 | 31631776 | G | C | | S160R | T | 18.33 | 13.88 | 0 | 8.47 | 3.31 | 3.38 | 0 | − |
| | 6 | 149700524 | T | C | | L491L | T | 2.24 | 2.55 | 0 | 0.04 | 0.01 | −6.98 | 0 | − |
| | 7 | 47409021 | G | A | | R408C | T | 0.98 | 0.64 | 1.99E-197 | 23.50 | 3.31 | 4.91 | 1.29E-5 | − |
| | 8 | 135521904 | C | A | | S1088S | B | 0.63 | 0.69 | 1.19E-88 | 20.70 | 3.31 | −11.6 | 0 | − |
| | 16 | 67517193 | C | T | | A37T | B | 2.15 | 1.88 | 2.52E-21 | 22.20 | 3.31 | 4.20 | 0 | − |
| | 21 | 15538710 | G | C | | P236A | T | 15.54 | 16.58 | 0 | 23.80 | 3.31 | 5.48 | 6.50E-6 | − |
| | | | | | | | | | | | | | | | |
| | 1 | 240071069 | C | T | | F106F | T | 7.47 | 3.67 | 0 | 9.42 | 3.31 | 4.69 | 0 | − |
| | 2 | 97427889 | A | T | | M385L | B | 1.20 | 1.68 | 1.16E-245 | 18.11 | 0.00 | 5.19 | 0 | − |
| | 5 | 133481460 | T | C | | D253D | T | 1.12 | 0.81 | 0 | 6.13 | 5.37 | 1.39 | 3.88E-4 | − |
| | 16 | 61891025 | G | A | | T222I | T | 3.77 | 3.36 | 4.32E-252 | 29.30 | 3.31 | 5.88 | 0 | − |
| | X | 29972647 | G | T | | D404Y | T | 1.30 | 0.79 | 4.57E-276 | 30.00 | 32.00 | 5.72 | 0 | + |
| | | | | | | | | | | | | | | | |
| | 2 | 141533745 | C | T | | G1808R | T | 2.72 | 1.97 | 0 | 34.00 | 5.45 | 5.69 | 6.50E-6 | − |
| | 6 | 26508797 | C | A | | R326R | T | 1.51 | 1.58 | 0 | 10.95 | 3.31 | 2.10 | 3.23E-5 | − |
| | 7 | 103048325 | C | T | | P287P | T | 1.19 | 1.51 | 0 | 14.92 | 3.31 | −3.99 | 1.94E-5 | − |
| | 7 | 122635510 | T | G | | Q60P | T | 3.76 | 3.89 | 0 | 23.20 | 3.31 | 3.42 | 0 | − |
| | 9 | 73477936 | C | T | | R117Q | T | 0.52 | 0.64 | 8.75E-119 | 22.80 | 3.31 | 5.95 | 6.50E-6 | + |
| | 11 | 118373702 | A | C | | K2365N | T | 1.60 | 1.72 | 9.15E-33 | 18.97 | 26.10 | 4.03 | 0 | − |
| | 12 | 2622058 | A | T | | D433V | T | 2.98 | 3.83 | 0 | 28.40 | 0.07 | 4.43 | 0 | − |
| | 17 | 65026679 | C | T | | Y181Y | T | 8.23 | 8.23 | 0 | 10.62 | 13.3 | −4.15 | 3.88E-5 | − |
| | 19 | 37618680 | A | C | | N263H | T | 0.62 | 0.61 | 3.24E-36 | 14.07 | 3.31 | 2.91 | 0 | − |
| | X | 17746076 | G | T | | D1086Y | B | 0.92 | 0.48 | 1.52E-05 | 24.60 | 0.00 | 5.49 | 0 | − |
List of replicated somatic variants grouped per patient. *: a second sample, obtained between 8 months and 2 years after the original sample, was used for the replication phase; Chr: chromosome; Pos: position on hg19; Ref: reference allele; Alt: alternate allele; AA: amino acid change in the corresponding gene (RefSeq ID of corresponding transcript in Supplementary Material, Table S5); AAF: alternate allele fraction in the screening (screen) and replication (repl) phases; p repl: VarScan2 P-value in support of a somatic variant in the replication phase; CADD: phred-scaled combined annotation-dependent depletion score (predicted deleteriousness); MSC: mutation significance cut-off (gene-level lower limit of the 99% confidence interval of CADD scores); GERP: genomic evolutionary rate profiling score (score > 3 = conserved); Kaviar: frequency of the alternate allele in the Kaviar public database of germline variants; COSMIC: presence of the variant in the COSMIC70 database of somatic variants.
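The two per-variant criteria defined in this legend (CADD above the gene-level MSC, and GERP > 3 for conservation) can be applied as a simple annotation. The sketch below is illustrative only: the function and key names are hypothetical, and the three example rows are copied from the table above.

```python
GERP_CONSERVED = 3.0  # "score > 3 = conserved", as stated in the legend


def annotate(cadd: float, msc: float, gerp: float) -> dict:
    """Flag a variant as CADD > MSC and/or conserved (GERP > 3)."""
    return {
        "cadd_above_msc": cadd > msc,
        "conserved": gerp > GERP_CONSERVED,
    }


# (CADD, MSC, GERP) for three variants from the table, keyed by amino acid change.
variants = {
    "R247C": (35.00, 17.32, 5.28),
    "T54A": (0.08, 3.31, -0.13),
    "Q204Q": (0.97, 22.30, 1.63),
}

flags = {aa: annotate(*scores) for aa, scores in variants.items()}
```

Because the MSC is gene-specific, the same CADD score can pass for one gene and fail for another, which is why the comparison is made per row and not against a single global cut-off.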
Figure 2. Somatic AAF: replication and evolution over time. For replicated somatic variants, the AAF is shown in the screening phase and in the replication phase, using either a second blood sample obtained at the same time point or a blood sample obtained on average 1 year later. (A) AAF correlates between screening and replication phases for samples from the same time point (N = 24 variants, r² = 0.91, P = 4.54 × 10⁻¹³). (B) Somatic variants persist over time: evolution of AAF over time for longitudinal samples (N = 12 variants, time point 0 = screening phase, time point for replication phase on X-axis); patients indicated by symbols (square: MS-1, circle: MS-3, triangle: MS-4). (C) Clonal expansion rate (α) as change in AAF over time.
Figure 3. Clustering of somatic variants by cell type and by non-synonymous/synonymous effect. Somatic variants are observed in T cells of 60% of MS patients, with 40% of patients additionally carrying somatic variants in B cells. Synonymous variants (grey) co-occur with non-synonymous variants (black) (P = 0.0031). AAF: alternate allele fraction; gene names in italics; patients in which no somatic variant was identified (N = 4) not shown.
Somatic variant characteristics compared to matched germline variants
| Characteristic | Measure | Somatic variants (N = 36) | Matched germline variants (N = 378) | P-value |
|---|---|---|---|---|
| Frequency | Novel | 22 (61.11%) | 54 (14.29%) | 1.96 × 10⁻⁹ |
| | Median (range) | 0 (0–3.88 × 10⁻⁴) | 0.0003752 (0–6.00 × 10⁻⁴) | 9.80 × 10⁻¹³ |
| COSMIC | Present | 3 (8.33%) | 17 (4.50%) | 0.40 |
| Pathogenicity | CADD > MSC | 29 (80.56%) | 272 (72.00%) | 0.33 |
| | Median CADD (range) | 22.45 (0.037–35.00) | 14.40 (0.001–40.00) | 0.057 |
| Conservation | GERP > 3 | 22 (61.11%) | 179 (47.35%) | 0.12 |
| | Median (range) | 3.96 (−11.60–3.96) | 2.66 (−11.90–6.17) | 0.062 |
Frequency based on the Kaviar database. N: total count; CADD: phred-scaled combined annotation-dependent depletion score; MSC: mutation significance cut-off score; GERP++: genomic evolutionary rate profiling score.
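The categorical rows of this table compare counts between two groups, which can be checked with a 2×2 test. The sketch below uses a two-sided Fisher's exact test written with the standard library only. Two assumptions apply: the group sizes (36 somatic, 378 germline) are back-calculated from the reported counts and percentages, and the test used for these rows is not stated in this record, so Fisher's exact test is a stand-in and is not expected to reproduce the tabulated P-values exactly.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that x of the col1 "positive" items fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# "Novel" row: 22 of 36 somatic vs 54 of 378 germline variants.
p_novel = fisher_exact_two_sided(22, 36 - 22, 54, 378 - 54)

# "COSMIC present" row: 3 of 36 somatic vs 17 of 378 germline variants.
p_cosmic = fisher_exact_two_sided(3, 36 - 3, 17, 378 - 17)
```

An exact test is a reasonable choice here because several cells are small (for example, 3 somatic variants present in COSMIC), where large-sample approximations become unreliable.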
Figure 4. Somatic variant (full line) characteristics compared to matched germline variants (dashed line). (A) Somatic variants are enriched for being rare in public databases (Kaviar) (P = 9.80 × 10⁻¹³). (B) Somatic variants show a trend for being more damaging (CADD) (P = 0.057), and (C) the positions of somatic variants show a trend for being more conserved (GERP++) (P = 0.062). Non-parametric statistical tests (Kruskal–Wallis) were performed.
Accurate and sensitive ddPCR quantification of somatic variants: somatic variants are specific to T cell subsets
| Cell population | Variant 1: ALT | Variant 1: REF | Variant 1: ALT AF (%) | Variant 2: ALT | Variant 2: REF | Variant 2: ALT AF (%) |
|---|---|---|---|---|---|---|
| CD3+ T cells | 12.6 | 1000 | 1.26 | — | — | — |
| CD8+ TEMRA cells | 11.3 | 331 | 3.41 | 0 | 233 | 0 |
| Remaining CD8+ T cells | 0 | 247 | 0 | 0 | 184 | 0 |
| CD4+ TEM cells | 0 | 172 | 0 | 8.5 | 129 | 6.59 |
| Remaining CD4+ T cells | 0 | 197 | 0 | 0 | 89 | 0 |
| CD19+ B cells | 0.34 | 688.5 | 0.049ᵃ | — | — | — |
| Other immune cells | 0.22 | 546 | 0.040ᵃ | — | — | — |
| Negative control | 0.16 | 974.5 | 0.016ᵃ | — | — | — |

Results for ddPCR measurements in DNA obtained from all T cells and isolated T-cell subsets, B cells and other (non-B and non-T) immune cells of MS-3. ALT and REF indicate the copies per μl of the alternate and reference allele, respectively, in a 20 μl ddPCR reaction. ALT AF (%) indicates the fraction of the alternate allele. TEMRA: T effector memory cells re-expressing CD45RA; TEM: T effector memory cells. ᵃ Values below the detection threshold (0.1%).
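The ALT AF column and the detection threshold can be recomputed from the ALT and REF concentrations. In the sketch below, the formula `ALT / REF × 100` is an inference from the numbers: it reproduces the tabulated percentages (for example 11.3 / 331 gives 3.41%), but the legend does not state it, and the function names are hypothetical.

```python
DETECTION_THRESHOLD = 0.1  # percent, from the table legend


def alt_af_percent(alt: float, ref: float) -> float:
    """ALT AF (%) as ALT / REF * 100 (inferred from the tabulated values)."""
    return 100.0 * alt / ref


def is_detected(alt: float, ref: float,
                threshold: float = DETECTION_THRESHOLD) -> bool:
    """True when the allele fraction is not below the detection threshold."""
    return alt_af_percent(alt, ref) >= threshold


# (ALT, REF) copies per microlitre for the first variant in the table.
measurements = {
    "CD3+ T cells": (12.6, 1000),
    "CD8+ TEMRA cells": (11.3, 331),
    "CD19+ B cells": (0.34, 688.5),
    "Negative control": (0.16, 974.5),
}

summary = {
    name: (round(alt_af_percent(alt, ref), 3), is_detected(alt, ref))
    for name, (alt, ref) in measurements.items()
}
```

Comparing each population against the same threshold makes the table's main point explicit: the variant is detectable in total T cells and in the CD8+ TEMRA subset, while B cells and the negative control fall below 0.1%.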