Sangdi Lin, Chen Wang, Shabnam Zarei, Debra A Bell, Sarah E Kerr, George C Runger, Jean-Pierre A Kocher.
Abstract
BACKGROUND: Copy Number Alterations (CNAs) are defined as somatic gains or losses of DNA regions. CNA profiles may provide a fingerprint specific to a tumor type or tumor grade. Low-coverage sequencing for reporting CNAs has recently gained interest since it was successfully translated into clinical applications. Ovarian serous carcinomas can be classified into two largely mutually exclusive grades, low grade and high grade, based on their histologic features. Grade classification based on genomics may provide valuable clues on how best to manage these patients in the clinic. Using ovarian serous carcinomas as a case study, we explore the methodology of combining CNA reporting from low-coverage sequencing with machine learning techniques to stratify tumor biospecimens of different grades.
Keywords: Classification; Copy number alterations; Data science; Low-coverage whole genome sequencing; Machine learning; Ovarian serous carcinoma; Tumor grade
Year: 2018 PMID: 30482155 PMCID: PMC6258141 DOI: 10.1186/s12864-018-5177-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Photomicrograph of low grade and high grade serous carcinoma cases. a Low grade serous carcinoma, 20X. b High grade serous carcinoma, 20X
Fig. 2 Bag-of-Segments representation workflow
Categorization of CNA segments using the adjusted quantiles of segment width and segment height
| Narrow Amplified (NA) | Medium Amplified (MA) | Wide Amplified (WA) |
| Narrow Normal (NN) | Medium Normal (MN) | Wide Normal (WN) |
| Narrow Deleted (ND) | Medium Deleted (MD) | Wide Deleted (WD) |
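
The nine-class grid above can be read as a lookup against two pairs of cut points, one pair for segment width and one for segment height. The following is a minimal sketch of that idea, not the authors' implementation: the function name, the cut-point values in the usage note, and the convention that a value equal to a cut point falls into the higher bin are all assumptions made for illustration. The source derives its cut points from adjusted quantiles, which are not reproduced here.

```python
from bisect import bisect_right

# Label order follows increasing width and increasing height.
WIDTH_LABELS = ("N", "M", "W")   # Narrow, Medium, Wide
HEIGHT_LABELS = ("D", "N", "A")  # Deleted, Normal, Amplified


def categorize_segment(width, height, width_cuts, height_cuts):
    """Return a two-letter segment class such as 'NA' or 'WD'.

    width_cuts and height_cuts are each a pair of ascending thresholds.
    They are hypothetical inputs here; the source obtains them from
    adjusted quantiles of the observed widths and heights.
    """
    if len(width_cuts) != 2 or len(height_cuts) != 2:
        raise ValueError("exactly two cut points are needed per axis")
    width_bin = bisect_right(width_cuts, width)     # 0, 1 or 2
    height_bin = bisect_right(height_cuts, height)  # 0, 1 or 2
    return WIDTH_LABELS[width_bin] + HEIGHT_LABELS[height_bin]
```

With illustrative cut points of `(10.0, 100.0)` for width and `(-0.2, 0.2)` for height, a segment of width 5.0 and height 0.4 would be labeled `NA` (Narrow Amplified).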
Fig. 3 a Segmentation example for a CNA profile sample (23 chromosomes). b 2D distribution of the segment width and height for the segmentation in a
Fig. 4 Aggregated joint distribution and marginal distributions of segment widths and heights
Results of the two-sample Kolmogorov-Smirnov tests for segment width and segment height
| Variable | D statistic | p-value |
|---|---|---|
| Segment height | 0.154 | 7.75×10⁻⁷ |
| Segment width | 0.702 | <2.2×10⁻¹⁶ |
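
For readers unfamiliar with the D statistic reported above, it is the largest vertical gap between the empirical cumulative distribution functions of two samples. The sketch below computes only that gap, using the standard library; it does not compute a p-value, and the function name and the toy inputs in the usage note are illustrative assumptions rather than data from the study. In practice a library routine such as `scipy.stats.ks_2samp` would typically be used instead.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D statistic.

    D is the maximum absolute difference between the two empirical
    CDFs. Both are step functions, so it is enough to evaluate them
    at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

For example, `ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])` gives 0.5, while two identical samples give 0.0 and two non-overlapping samples give 1.0.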
Bag-of-Segments representation based on the distribution over the CNA segment classes
| Sample | MA | MD | MN | NA | ND | NN | WA | WD | WN | Grade |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.18 | 0.32 | 0.18 | 0.18 | 0.02 | 0.04 | 0.04 | 0.02 | 0.02 | High |
| 2 | 0.04 | 0.04 | 0.07 | 0.04 | 0.07 | 0.00 | 0.04 | 0.04 | 0.67 | Low |
| 3 | 0.00 | 0.14 | 0.07 | 0.00 | 0.07 | 0.04 | 0.14 | 0.11 | 0.43 | Low |
| 4 | 0.00 | 0.00 | 0.04 | 0.00 | 0.04 | 0.00 | 0.13 | 0.00 | 0.79 | Low |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
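
Each row of the table above is a distribution over the nine segment classes, so the entries in a row sum to roughly 1. A minimal sketch of how such a row could be produced from a list of per-segment class labels follows. The function name and the choice to reject unknown labels are assumptions made for illustration, not details taken from the source.

```python
from collections import Counter

# Column order as in the Bag-of-Segments table.
CLASSES = ("MA", "MD", "MN", "NA", "ND", "NN", "WA", "WD", "WN")


def bag_of_segments(segment_classes):
    """Return the fraction of a sample's segments that fall in each class."""
    counts = Counter(segment_classes)
    total = len(segment_classes)
    if total == 0:
        raise ValueError("sample has no segments")
    unknown = set(counts) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown segment classes: {sorted(unknown)}")
    # Counter returns 0 for classes that never occur in this sample.
    return {label: counts[label] / total for label in CLASSES}
```

A sample with three `WN` segments and one `MD` segment would map to 0.75 for `WN`, 0.25 for `MD`, and 0.0 for the remaining seven classes. The resulting fixed-length vector is what makes samples with different numbers of segments comparable as classifier inputs.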
Fig. 5 Sensitivity analysis with various α and Cp values
Fig. 6 RF importance score for Bag-of-Segments features
Fig. 7 Correlation plot for Bag-of-Segments features
Fig. 8 Box plots for the values of Bag-of-Segments features