Liang Goh, Geng Bo Chen, Ioana Cutcutache, Benjamin Low, Bin Tean Teh, Steve Rozen, Patrick Tan.
Abstract
Next generation sequencing technology has revolutionized the study of cancers. Through matched normal-tumor pairs, it is now possible to identify genome-wide germline and somatic mutations. The generation and analysis of the data require rigorous quality checks and filtering, and the current analytical pipeline is constantly undergoing improvements. We noted, however, that in analyzing matched pairs there is an implicit assumption that the sequenced data are matched, without any quality check such as those implemented in association studies. This assumption has serious implications, as identification of germline and rare somatic variants depends on the normal sample being the matched pair. Using a genetics concept for measuring relatedness between individuals, we demonstrate that the matchedness of tumor pairs can be quantified and should be included as part of a quality protocol in the analysis of sequenced data. Despite the mutational changes in cancer samples, matched tumor-normal pairs are still relatively similar in sequence compared to non-matched pairs. We demonstrate that the approach can be used to assess the mutation landscape between individuals.
Year: 2011 PMID: 21445261 PMCID: PMC3060821 DOI: 10.1371/journal.pone.0017810
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. IBS clustering of SNP6 (green) and NGS (blue) samples.
IBS for each pair of samples is computed, and its mean (x-axis) and variance (y-axis) are plotted in the IBS space. Matched pairs (denoted *) and unmatched pairs (denoted +) cluster separately: matched pairs are positioned towards the bottom-right corner (high mean, low variance), indicating greater relatedness between samples. Five SNP6 samples (red) were also sequenced by NGS. One of the samples, 76629543, clusters further from the bottom-right corner in both datasets, indicating its higher level of mutations.
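The two coordinates plotted for each sample pair can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `ibs_summary` and the example score lists are hypothetical, and the use of the population variance is an assumption, since the caption does not say which variance estimator was used.

```python
from statistics import fmean, pvariance


def ibs_summary(ibs_scores):
    """Return (mean, variance) of per-site IBS scores for one sample pair.

    Each score is expected to be 0, 1, or 2. Population variance is an
    assumption; the source does not specify the estimator.
    """
    if not ibs_scores:
        raise ValueError("need at least one IBS score")
    if any(s not in (0, 1, 2) for s in ibs_scores):
        raise ValueError("IBS scores must be 0, 1, or 2")
    return fmean(ibs_scores), pvariance(ibs_scores)


# Hypothetical per-site scores, invented for illustration (not study data).
matched_like = [2, 2, 2, 2, 2, 2, 2, 1, 2, 2]
unmatched_like = [2, 1, 0, 2, 1, 1, 2, 0, 1, 2]

for label, scores in (("matched-like", matched_like),
                      ("unmatched-like", unmatched_like)):
    mean, var = ibs_summary(scores)
    print(f"{label}: mean={mean:.2f}, variance={var:.2f}")
```

In this toy input the matched-like pair has the higher mean and lower variance, which corresponds to the bottom-right region of the plot described in the caption.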
Figure 2. IBS landscape of samples 76629543 and 2000619 in NGS.
For each chromosome, the different IBS states are shown (green: IBS-2, red: IBS-1, black: IBS-0). The sample ID is indicated at the top of each genome plot. Most alleles do not change state between matched pairs, i.e. they are IBS-2 (green). The most frequent allele change is IBS-1, a heterozygous variant (red). IBS-0, a homozygous variant (black), is uncommon, occurring at a frequency below 2%. Sample 76629543 shows the most varied IBS in chr8, 10, 11, 12, 17, 19, and 22q.
IBS-0 and IBS-1 (somatic and LOH) frequency summary of the NGS samples.
| Sample | AA/BB->AB (Somatic, IBS-1) | AB->AA/BB (LOH, IBS-1) | AA->BB (IBS-0) | Total |
| --- | --- | --- | --- | --- |
| 990172 | 151 (50.84) | 143 (48.15) | 3 (1.01) | 297 |
| 990300 | 311 (57.06) | 224 (41.1) | 10 (1.83) | 545 |
| 990355 | 203 (61.7) | 123 (37.39) | 3 (0.91) | 329 |
| 990475 | 170 (50.6) | 163 (48.51) | 3 (0.89) | 336 |
| 2000619 | 220 (29.69) | 515 (69.5) | 6 (0.81) | 741 |
| 2000778 | 260 (56.03) | 201 (43.32) | 3 (0.65) | 464 |
| 76629543 | 318 (8.76) | 3287 (90.55) | 25 (0.69) | 3630 |
Percentages are indicated in parentheses.
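The categories in the table above can be illustrated with a short sketch that classifies each normal-to-tumor genotype change and reports counts with percentages. The function names `classify_change` and `summarize` are hypothetical, and treating any homozygous-to-opposite-homozygous change as IBS-0 is an assumption based on the column headers. The usage example rebuilds the counts for sample 990172 from the table; the specific genotypes chosen to represent each category are arbitrary.

```python
from collections import Counter

GENOTYPES = {"AA", "AB", "BB"}


def classify_change(normal, tumor):
    """Classify a normal -> tumor genotype change at one biallelic site."""
    n, t = "".join(sorted(normal)), "".join(sorted(tumor))
    if n not in GENOTYPES or t not in GENOTYPES:
        raise ValueError(f"unsupported genotype pair: {normal!r}, {tumor!r}")
    if n == t:
        return "unchanged"  # IBS-2
    if t == "AB":
        return "somatic"    # AA/BB -> AB (IBS-1)
    if n == "AB":
        return "loh"        # AB -> AA/BB (IBS-1)
    return "ibs0"           # AA <-> BB (IBS-0)


def summarize(pairs):
    """Return ({category: (count, percent)}, total) over changed sites only."""
    counts = Counter(classify_change(n, t) for n, t in pairs)
    counts.pop("unchanged", None)
    total = sum(counts.values())
    if total == 0:
        return {}, 0
    summary = {
        key: (counts[key], round(100 * counts[key] / total, 2))
        for key in ("somatic", "loh", "ibs0")
    }
    return summary, total


# Counts taken from the 990172 row; the genotype pairs are placeholders.
pairs = [("AA", "AB")] * 151 + [("AB", "AA")] * 143 + [("AA", "BB")] * 3
summary, total = summarize(pairs)
print(summary, total)
# {'somatic': (151, 50.84), 'loh': (143, 48.15), 'ibs0': (3, 1.01)} 297
```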
IBS scores between samples.
| Sample 1 | Sample 2 | IBS |
| --- | --- | --- |
| AA | AA | 2 |
| AA | AB | 1 |
| AA | BB | 0 |
| AB | AB | 2 |
| AB | BB | 1 |
| BB | BB | 2 |
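The scoring table above can be reproduced with a small function. This is a sketch under the assumption of biallelic sites with genotypes written as AA, AB, or BB; the name `ibs_score` is hypothetical. It relies on the observation that, for these genotypes, the score equals 2 minus the difference in the number of B alleles.

```python
GENOTYPES = {"AA", "AB", "BB"}


def ibs_score(g1, g2):
    """Return the IBS score (0, 1, or 2) for two biallelic genotypes."""
    a, b = "".join(sorted(g1)), "".join(sorted(g2))
    if a not in GENOTYPES or b not in GENOTYPES:
        raise ValueError(f"unsupported genotype pair: {g1!r}, {g2!r}")
    # Count B alleles in each genotype; the score drops by one for each
    # allele by which the two genotypes differ.
    return 2 - abs(a.count("B") - b.count("B"))


# Print every unordered genotype pair with its score.
for g1, g2 in [("AA", "AA"), ("AA", "AB"), ("AA", "BB"),
               ("AB", "AB"), ("AB", "BB"), ("BB", "BB")]:
    print(g1, g2, ibs_score(g1, g2))
```

The score is symmetric, so swapping the two samples gives the same value.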