Norbert Niklas, Julia Hafenscher, Agnes Barna, Karin Wiesinger, Johannes Pröll, Stephan Dreiseitl, Sandra Preuner-Stix, Peter Valent, Thomas Lion, Christian Gabriel.
Abstract
BACKGROUND: Next-generation sequencing makes it possible to determine the genetic composition of a mixed sample. For instance, resistance testing for BCR-ABL1 requires identifying clones and defining compound mutations; together with exact quantification, this can provide additional information for diagnosis and therapy decisions. This applies not only to oncology but also to the characterization of viral, bacterial or fungal infections. Retrieving multiple (more than two) haplotypes and their proportions from sequencing data with conventional software is difficult and cumbersome and requires multiple manual steps.
Year: 2015 PMID: 26346608 PMCID: PMC4562109 DOI: 10.1186/s13104-015-1382-7
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Fig. 1 Workflow for using cFinder. Sequencing data are aligned against a reference sequence by a user-defined tool (blue), and the result is loaded into cFinder in SAM or ACE format (together with optional annotation information), where variant detection is carried out and optional filtering can be performed. After the desired variants have been selected, clones are calculated automatically and presented for further evaluation
Fig. 2 Overlapping multi-amplicon paired-end design. Schematic representation of covering a region larger than the maximum read length; haplotypes are therefore scattered across multiple sequence reads. The reference sequence is displayed in black, amplicons in forward (green) and reverse (red) orientation; pairs (but not covered regions) are denoted with a dashed line
Comparison of alignment methods and cFinder to manual approach
| Sample (no.) | Clone variants | Manual % | Manual hits | cFinder (AVA) % | cFinder (AVA) hits | cFinder (CLCbio) % | cFinder (CLCbio) hits |
|---|---|---|---|---|---|---|---|
| 1 | c.749G>A | 22.0 | 14,956 | 23.3 | 15,611 | 22.43 | 14,745 |
| | c.757T>C | 26.5 | 18,458 | 30.6 | 20,886 | 29.39 | 19,604 |
| 2 | c.749G>A, c.943A>G, c.1497A>G | 15.7 | 4748 | 6.4 | 4362 | 6.21 | 4066 |
| | c.749G>A, c.949T>A, c.1497A>G | 4.1 | 1240 | 2.4 | 1641 | 2.3 | 1522 |
| | c.749G>A, c.1497A>G | 22.7 | 6864 | 7.1* | 4145 | 6.7* | 3729 |
| | c.943A>G, c.1497A>G | 17.1 | 16,237 | 12.6 | 9183 | 12.5 | 8715 |
| | c.949T>A, c.1497A>G | 3.3 | 3128 | 4.5 | 3342 | 4.6 | 3240 |
| | c.1497A>G | 45.2 | 26,522 | 61.2* | 35,867* | 61.3* | 33,804 |
| | c.749G>A, c.949T>A | – | – | 3.4 | 2519 | 3.5 | 2488 |
| | c.749G>A | – | – | 13.9 | 7996 | 14.2 | 8074 |
| | c.749G>A, c.943A>G | – | – | 10.8 | 7838 | 10.9 | 7692 |
| | c.943A>G | – | – | 4.0 | 3483 | 4.2 | 3574 |
| | c.949T>A | – | – | 1.1 | 991 | 1.2 | 1023 |
| 3 | c.1375G>A, c.1423_1424ins35 | 6.2 | 3495 | 5.9 | 3186 | 6.4 | 3019 |
| | c.1375G>A | 90.9 | 51,247 | 90.9 | 51,891 | 87.9 | 42,693* |
| 4 | c.730A>G | 20.3 | 9887 | 27.5 | 17,769 | 27.5 | 17,483* |
| 5 | c.756G>T, c.1086_1270del185 | 5.4 | 1407 | 6.2 | 1406 | 6.8 | 1490 |
| | c.756G>T, c.1423_1424ins35 | 1.0 | 160 | 1.6 | 244 | 0.2** | 30 |
| | c.756G>T | 41.0 | 10,845 | 58.9* | 12,446* | 57.3* | 11,545 |
| | c.1423_1424ins35 | – | – | 1.6 | 244 | 1.6 | 235 |
| | c.1086_1270del185 | 2.7 | 727 | 2.6 | 625 | 2.6 | 611 |
| | c.888_919del32 | 1.6 | 427 | 0.9** | 220 | 0.2** | 40 |
| 6 | c.756G>T, c.1086_1270del185 | 20.4 | 8151 | 14.1 | 7922 | 14.1 | 7702 |
| | c.1086_1270del185 | – | – | 6.3 | 4008 | 4.6 | 2828 |
| | c.756G>T | 64.6 | 32,472 | 65.0 | 31,932 | 63.3 | 30,093 |
| 7 | c.756G>T | 51.2 | 38,883 | 62.3* | 38,726 | 61.5* | 37,336 |
| 8 | c.838_1378del540, c.1423_1424ins35 | 98.4 | 49,425 | – | – | 85.7* | 47,729 |
| | c.1423_1424ins35 | – | – | – | – | 2.1 | 1187 |
| | c.838_1378del540 | – | – | – | – | 10.8 | 6013 |
| 9 | c.825G>A | 2.2 | 229 | 3.1 | 269 | 3.0 | 250 |
| 10 | c.1086_1270del185, c.1423_1424ins35 | 1.3 | 278 | 0.5** | 132 | 0.5** | 116 |
| | c.944C>T, c.1086_1270del185 | 9.1 | 2260 | 7.5 | 2215 | 5.3 | 1548 |
| | c.944C>T | 30.6 | 9778 | 33.7 | 10,069 | 34.3 | 10,061 |
| | c.1086_1270del185 | 12.1 | 3278 | 10.4 | 3026 | 7.4 | 2164 |
| | c.1423_1424ins35 | 2.1 | 460 | 1.9 | 384 | 1.8 | 375 |
Samples and the clone variants found are listed with their percentage of occurrence and their hits (the absolute number of reads carrying that variant combination), comparing manual detection with automated analysis by cFinder on the output of two different alignment software products. Results marked with one star (*) deviate strongly from the manual findings; numbers marked with two stars (**) fell below the threshold of 1 %. A clone that was not detected is marked with a dash (–)
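The relationship between hits and percentages in the table can be illustrated with a minimal Python sketch. It is not cFinder's implementation: the function name `clone_table`, the toy reads, and the choice of denominator (all reads, including those without any selected variant) are assumptions made for illustration only. Only the 1 % threshold is taken from the table note.

```python
from collections import Counter

def clone_table(read_variant_sets, min_percent=1.0):
    """Group reads by their exact variant combination and report
    hits, percentage and a below-threshold flag per clone.

    Hypothetical helper: the percentage is computed over all reads passed in,
    which is an assumption, not necessarily how cFinder normalizes.
    """
    counts = Counter(frozenset(v) for v in read_variant_sets)
    total = sum(counts.values())
    if total == 0:
        return []
    rows = []
    for variants, hits in counts.most_common():
        percent = 100.0 * hits / total
        rows.append({
            "variants": tuple(sorted(variants)),  # empty tuple = no selected variant
            "hits": hits,
            "percent": round(percent, 1),
            "below_threshold": percent < min_percent,
        })
    return rows

# Invented toy input: 200 reads, each described by the set of variants it carries.
reads = (
    [set()] * 150
    + [{"c.756G>T"}] * 49
    + [{"c.756G>T", "c.1423_1424ins35"}]
)
rows = clone_table(reads)
for row in rows:
    flag = "**" if row["below_threshold"] else ""
    print(row["variants"], row["hits"], f'{row["percent"]}{flag}')
```

Grouping on the full variant combination, rather than on single variants, is what distinguishes a compound-mutation clone from two separate clones that each carry one of the variants.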
Fig. 3 Test scenarios for overlapping amplicon design. Two test scenarios were created with simulated reads, consisting of three clones (a) and five clones (b). Each line represents a clone with different variants (numbered squares), scattered over multiple sections of the amplicon design (see Fig. 2)
Test of simulated data with overlapping amplicons
| Test | Clone variants | Default % | Default hits | Infer relationships % | Infer relationships hits |
|---|---|---|---|---|---|
| A | 1 | 23.8 | 5 | – | – |
| | 3 | 23.8 | 5 | – | – |
| | 1, 3* | 14.3 | 3 | 28.6 | 2 |
| | 1, 2 | 4.8 | 1 | – | – |
| | 2 | 4.8 | 1 | – | – |
| | 2, 3 | 4.8 | 1 | – | – |
| | 1, 2, 3* | – | – | 14.3 | 1 |
| B | 3* | 43.8 | 21 | 37.5 | 6 |
| | 2* | 37.5 | 18 | 31.3 | 5 |
| | 2, 3 | 8.3 | 4 | – | – |
| | 3, 4 | 8.3 | 4 | – | – |
| | 2, 4 | 8.3 | 4 | – | – |
| | 4 | 6.3 | 3 | – | – |
| | 1, 4 | 2.1 | 1 | – | – |
| | 1, 3 | 2.1 | 1 | – | – |
| | 1, 2 | 2.1 | 1 | – | – |
| | 2, 3, 4* | – | – | 18.8 | 3 |
| | 1, 2, 3, 4* | – | – | 6.3 | 1 |
Effect of the infer relationships option on detecting haplotypes that are scattered over multiple amplicons. The column "Clone variants" holds a list of variants (numbered according to Fig. 3, the design of the test cases), where “1, 3” means that the clone has variant 1 and variant 3. Default analysis yields only the haplotypes that occur on individual amplicons, whereas ticking the checkbox for inferring relationships identifies the connected variants. A clone that was not detected is marked with a dash (–); true haplotypes (used for simulation) are marked with a star (*)
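One simple way to connect variants that lie on different amplicons is to merge the variant sets of reads that come from the same fragment (for example, the two mates of a read pair). The Python sketch below shows only that idea. It is an assumption about how such linking could work, not a description of the inference that cFinder performs, and the function name `haplotypes_by_fragment`, the fragment identifiers and the toy reads are invented.

```python
from collections import Counter, defaultdict

def haplotypes_by_fragment(read_variants):
    """Union the variants of all reads that share a fragment identifier,
    then count how many fragments support each combined haplotype.

    read_variants: iterable of (fragment_id, set_of_variant_numbers).
    Hypothetical helper for illustration only.
    """
    merged = defaultdict(set)
    for fragment_id, variants in read_variants:
        merged[fragment_id] |= set(variants)
    return Counter(frozenset(v) for v in merged.values())

# Invented toy input: each fragment is seen on two amplicons.
reads = [
    ("frag1", {1}), ("frag1", {3}),
    ("frag2", {1, 2}), ("frag2", {3}),
    ("frag3", {1}), ("frag3", {3}),
]
counts = haplotypes_by_fragment(reads)
for haplotype, hits in counts.most_common():
    print(sorted(haplotype), hits)
```

Analyzed per read, the same input would report partial haplotypes such as "1", "3" and "1, 2" separately. Merging per fragment reports the combined haplotypes instead, which is the kind of difference the Default and Infer relationships columns contrast.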
Fig. 4 Scatterplot of simulated reads. The frequencies of clones were plotted against the frequencies actually detected. Each dot represents one clone; perfect matches lie on the red line