Francesca Lantieri1, Michela Malacarne2, Stefania Gimelli3, Giuseppe Santamaria4, Domenico Coviello5, Isabella Ceccherini6.
Abstract
The presence of false positive and false negative results in Array Comparative Genomic Hybridization (aCGH) designs is poorly addressed in the literature. We took advantage of a recently performed custom aCGH to analyze its design performance, the use of several Agilent aberration detection algorithms, and the presence of false results. Our study confirms that the high density design does not generate more noise than standard designs and might reach a good resolution. We noticed a non-negligible presence of false negative and false positive results in the imbalance calls performed by the Agilent software. The Aberration Detection Method 2 (ADM-2) algorithm with a threshold of 6 performed quite well, and the array design proved to be reliable, provided that some additional filters are applied, such as considering only intervals with average absolute log₂ratio above 0.3. We also propose an additional filter that takes into account the proportion of probes with log₂ratio exceeding suggestive values for gain or loss. In addition, the quality of samples was confirmed to be a crucial parameter. Finally, this work highlights the importance of evaluating the sample profiles by eye and the necessity of validating the imbalances detected.
Keywords: CNV detection filters; agilent aberration call software; high density custom CGH array
Year: 2017 PMID: 28287439 PMCID: PMC5372625 DOI: 10.3390/ijms18030609
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Quality control metrics. The distribution of the sample quality controls is reported as box plots and as statistics. Sample metrics are classified as excellent, good or poor (evaluate), and the number of samples in each category is also reported. Solid circles and asterisks in the box plots represent outliers: solid circles are cases with values more than 1.5 times the interquartile (IQ) range, asterisks are cases with values more than 3 times the IQ range.
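The outlier convention in the caption can be illustrated with a minimal Python sketch. It assumes that the distances are measured from the edges of the box (first and third quartile), as is usual for box plots; the caption does not state this explicitly. The function name `classify_outliers` and the quartile method are illustrative choices, not taken from the study.

```python
from statistics import quantiles

def classify_outliers(values):
    """Label each value as 'normal', 'outlier' (solid circle) or 'extreme' (asterisk).

    Assumption: distances are measured from the nearest box edge (Q1 or Q3).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for v in values:
        # Distance beyond the box; zero for values lying inside the box.
        distance = max(q1 - v, v - q3, 0.0)
        if distance > 3 * iqr:
            labels.append("extreme")   # drawn as an asterisk
        elif distance > 1.5 * iqr:
            labels.append("outlier")   # drawn as a solid circle
        else:
            labels.append("normal")
    return labels

# Hypothetical quality-metric values, for illustration only.
example = [10, 12, 14, 16, 18, 20, 22, 35, 60]
print(list(zip(example, classify_outliers(example))))
```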
Log2ratios correlations.
| Selected Pairs | Groups Comparisons | | All Log2ratios (Mean) | Log2ratios > \|0.3\| (Mean) | |
|---|---|---|---|---|---|
| All samples | replicated | 37 | 0.18 | 0.42 | 1.8 × 10⁻⁹ |
| | random | 37 | 0.07 | 0.14 | 0.0036 |
| | | | 0.004 | 4.8 × 10⁻⁵ | |
| Only pairs with at least one excellent quality sample (DLRS ≤ 0.2) | replicated | 24 | 0.23 * | 0.53 § | 2.01 × 10⁻⁸ |
| | random | 24 | 0.09 ** | −0.17 §§ | 0.0057 |
| | | | 0.0018 | 6.8 × 10⁻⁶ | |
| Pairs with no excellent quality sample (DLRS > 0.2) | replicated | 13 | 0.09 * | 0.21 § | 0.003 |
| | random | 13 | 0.05 ** | 0.09 §§ | 0.1594 |
| | | | 0.2635 | 0.1492 | |
* p = 0.0069; ** p = 0.2320; § p = 0.0009; §§ p = 0.2188.
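As a rough illustration of the kind of comparison summarized in this table, the sketch below computes the correlation between the log2ratios of two samples, first over all probes and then restricted to probes with |log2ratio| > 0.3. Two assumptions are made that the source does not confirm: the correlation is taken to be Pearson's r, and a probe is kept in the restricted set when at least one of the two samples exceeds the threshold. The function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pair_correlations(sample_a, sample_b, threshold=0.3):
    """Return (r over all probes, r over probes with |log2ratio| > threshold).

    Assumption: a probe counts as exceeding the threshold if either sample does.
    The second value is None when fewer than three probes remain.
    """
    r_all = pearson(sample_a, sample_b)
    kept = [(a, b) for a, b in zip(sample_a, sample_b)
            if abs(a) > threshold or abs(b) > threshold]
    if len(kept) < 3:
        return r_all, None
    xs, ys = zip(*kept)
    return r_all, pearson(xs, ys)

# Hypothetical log2ratios for the same probes in a replicated pair.
rep_1 = [0.05, -0.02, 0.60, -0.90, 0.40, 0.01]
rep_2 = [0.01, 0.03, 0.50, -0.80, 0.45, -0.04]
print(pair_correlations(rep_1, rep_2))
```

Averaging such per-pair coefficients over replicated pairs and over randomly matched pairs would give means comparable in kind to those in the table.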
Figure 2. Sample profiles. An example of four samples selected for excellent, good, evaluate and very bad quality. For each, the profile at chromosome 9 is shown, including a region of probes scattered across the genome and two high density regions. The upper high density region inside the blue box in the left panel is enlarged in the central panel (inside the large blue box), and the specific region inside the yellow box is further enlarged in the right panel (inside the large yellow box).
Figure 3. Aberration calls and probes correlations. Correlation between the number of calls detected in each high density region and the number of probes selected in each region (upper) or the probe density (number of probes/size) of the selected region (bottom), considering either any calls, including single probe calls (A), or only multi-probe calls with MAAD > 0.3 (B).
Aberrations detected.
| Sample ID | DLRS | Chromosomal Region (chr:start–end) | CNV Type | # Probes | Detection Algorithm: ADM-2, Threshold 6 | Detection Algorithm: ADM-2, Threshold 8 | Fuzzy Zero | Visual Inspection Classification | Reported on DGV | Reported on Decipher | Validated | Replicate | True Variants † |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HSCR000 | 0.148 | 9:110381888–110401999 | gain | 9 | Y | Y | Y | likely | N | N | Y | confirmed | yes |
| HSCR000 | 0.148 | 10:43435867–60812533 | loss | 849 | Y | Y | Y | known | N | N | known | confirmed | known |
| HSCR000 | 0.148 | 10:43572551–43573368 | gain | 3 | N | N | N | unlikely | N | N | not confirmed | no | |
| HSCR037 | 0.120 | 10:43589687–62786887 | loss | 544 | Y | Y | Y | known | N | N | known | known | |
| HSCR005 | 0.226 | 7:84217007–84225649 | loss | 4 | Y | - | - | likely | Y (freq < 1%) | N | Y | yes | |
| HSCR005 | 0.226 | 10:43679892–43680816 | loss | 5 | Y | - | Y | likely | N | N | N | no | |
| HSCR005 * | 0.226 | 21:9833187–11096086 | loss | 4 | N | - | N | possible | N | N | unknown | ||
| HSCR006 | 0.276 | 10:43679612–43680816 | loss | 6 | N | - | - | likely | N | N | N | no | |
| HSCR006 | 0.276 | 10:43685614–43715348 | gain | 78 | N | N | - | unlikely | N | N | unknown | ||
| HSCR006 | 0.276 | 19:5822193–5832504 | gain | 13 | Y | - | - | unlikely | N | N | unknown | ||
| HSCR009 | 0.176 | 10:43691613–43713132 | gain | 50 | N | N | N | unlikely | N | N | unknown | ||
| HSCR009 | 0.176 | 19:5825458–5831976 | gain | 9 | Y | Y | Y | unlikely | N | N | unknown | ||
| HSCR010 * | 0.211 | 15:20848460–22432687 | gain | 5 | Y | - | - | likely | Y (freq ≥ 5%) | N | not excluded | yes | |
| HSCR014 | 0.221 | 8:32532001–32532545 | gain | 2 | Y | - | Y | unlikely | N | N | unknown | ||
| HSCR014 * | 0.221 | 10:29939955–30822470 | gain | 3 | Y | - | Y | possible | N | N | unknown | ||
| HSCR014 * | 0.221 | 12:80226392–80589429 | gain | 2 | Y | - | Y | possible | N | N | unknown | ||
| HSCR014 | 0.221 | 22:22417683–23228483 | loss | 15 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR016 | 0.117 | 5:69288477–70309855 | gain | 3 | Y | - | - | likely | Y (freq ≥ 5%) | N | not excluded | yes | |
| HSCR016 | 0.117 | 22:25672585–25892401 | gain | 5 | Y | - | Y | likely | Y (freq ≥ 5%) | Y (3 inds.) | not excluded | yes | |
| HSCR018 § | 0.172 | 9:109336464–109348467 | gain | 6 | - | - | - | likely | N | N | Y | yes | |
| HSCR019 * | 0.122 | 1:146638075–147824207 | loss | 4 | Y | Y | Y | likely | N | Y (1q21.1 recurrent microdel) | Y | confirmed with a different size | yes |
| HSCR033 * | 0.229 | 15:21162691–22173977 | loss | 3 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR036 | 0.177 | 22:22781091–23228483 | loss | 8 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR039 | 0.217 | 3:51458492–51665134 | loss | 62 | N | N | - | unlikely | N | N | not confirmed | no | |
| HSCR039 | 0.217 | 6:148651353–150170473 | loss | 52 | N | N | - | unlikely | N | N | not confirmed | no | |
| HSCR039 | 0.217 | 9:110130442–110370427 | loss | 99 | N | N | - | unlikely | N | N | not confirmed | no | |
| HSCR043 § | 0.175 | 9:109273643–109275694 | loss | 2 | - | - | - | likely | N | N | Y | yes | |
| HSCR045 § | 0.271 | 7:84594683–84607065 | loss | 6 | - | - | - | unlikely | N | N | N | no | |
| HSCR045 § | 0.271 | 8:32597644–32598929 | loss | 3 | - | - | - | likely | N | N | Y | yes | |
| HSCR045 | 0.271 | 10:43679612–43680816 | loss | 6 | Y | - | Y | likely | N | N | N | no | |
| HSCR045 | 0.271 | 19:5819037–18310693 | gain | 25 | Y | - | - | unlikely | N | N | unknown | ||
| HSCR058 | 0.243 | 22:18661724–18920001 | gain | 7 | Y | - | Y | unlikely | Y (freq ≥ 5%) | N | not evaluable | yes | |
| HSCR064 * | 0.192 | 15:20848460–22173977 | loss | 4 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR126 | 0.176 | 19:4205366–18310693 | gain | 26 | N | - | - | unlikely | N | N | unknown | ||
| HSCR146 * | 0.122 | 15:58257674–59009890 | gain | 2 | Y | Y | Y | likely | N | N | Y | yes | |
| HSCR146 | 0.122 | 19:30888070–30891329 | gain | 2 | Y | - | Y | likely | N | N | N | no | |
| HSCR160 * | 0.200 | 15:20848460–22173977 | gain | 4 | Y | - | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR162 *,§ | 0.184 | 9:43659247–43659512 | loss | 2 | - | - | - | likely | Y (freq ≥ 5%) | N | confirmed with a different size | yes | |
| HSCR181 * | 0.150 | 15:20848460–22432687 | loss | 5 | N | - | - | possible | Y (freq ≥ 5%) | N | not excluded | yes | |
| HSCR181 | 0.150 | 21:14629063–48080926 | gain | 245 | Y | Y | Y | known | N | N | known | confirmed | known |
| HSCR183 | 0.138 | 22:22781091–23228483 | loss | 8 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR195 | 0.158 | 9:112078131–112089193 | loss | 5 | Y | - | - | likely | N | N | inconclusive | confirmed with a different size | yes |
| HSCR217 | 0.168 | 16:82200334–82202467 | gain | 2 | Y | - | Y | likely | N | N | Y | yes | |
| HSCR228 § | 0.158 | 22:25672585–25892401 | gain | 5 | - | - | - | likely | Y (freq ≥ 5%) | Y (3 inds.) | not excluded | yes | |
| HSCR231 * | 0.164 | 15:21162691–22432687 | gain | 4 | Y | - | Y | unlikely | Y (freq ≥ 5%) | N | yes | ||
| HSCR312 | 0.215 | 3:50161771–50618134 | gain | 143 | N | - | - | unlikely | N | N | unknown | ||
| HSCR312 | 0.215 | 4:41748211–41753993 | gain | 16 | N | - | - | unlikely | N | N | unknown | ||
| HSCR312 | 0.215 | 10:43550696–43621994 | gain | 196 | N | - | - | unlikely | N | N | unknown | ||
| HSCR312 | 0.215 | 10:43684681–43718450 | gain | 86 | N | N | N | unlikely | N | N | unknown | ||
| HSCR312 | 0.215 | 14:36983123–36994136 | gain | 14 | Y | - | - | unlikely | N | N | unknown | ||
| HSCR312 | 0.215 | 19:5821171–5832504 | gain | 15 | N | N | N | unlikely | N | N | unknown | ||
| HSCR323 | 0.253 | 13:78465278–78484576 | gain | 30 | N | - | - | unlikely | N | N | unknown | ||
| HSCR331 | 0.172 | 19:5822193–5832928 | gain | 14 | N | - | - | unlikely | N | N | not excluded | unknown | |
| HSCR335 * | 0.183 | 15:20848460–22173977 | gain | 4 | Y | - | - | possible | Y (freq ≥ 5%) | N | not excluded | yes | |
| HSCR335 | 0.183 | 22:18628019–18807881 | gain | 6 | Y | - | Y | unlikely | N | N | not excluded | unknown | |
| HSCR335 | 0.183 | 22:20345868–20499789 | gain | 4 | Y | - | Y | unlikely | Y (freq ≥ 5%) | N | not excluded | yes | |
| HSCR335 | 0.183 | 22:21494163–21704972 | gain | 5 | Y | - | Y | unlikely | N | N | not excluded | unknown | |
| HSCR349 | 0.220 | 3:51452049–51647312 | loss | 59 | N | N | - | unlikely | N | N | unknown | ||
| HSCR349 * | 0.220 | 7:63449575–75986814 | loss | 25 | N | - | - | unlikely | N | N | unknown | ||
| HSCR349 | 0.220 | 10:43573685–43574005 | gain | 2 | Y | - | Y | unlikely | N | N | N | no | |
| HSCR374 | 0.266 | 10:43473690–43474033 | gain | 4 | Y | - | Y | unlikely | N | N | N | no | |
| HSCR380 | 0.123 | 22:16054691–18807881 | gain | 23 | Y | Y | Y | known | N | N | known | known | |
| HSCR380 | 0.123 | 22:20345868–20659606 | gain | 5 | Y | Y | Y | unlikely | N | N | unknown | ||
| HSCR380 | 0.123 | 22:21494163–21704972 | gain | 5 | Y | Y | Y | unlikely | N | N | unknown | ||
| HSCR382 | 0.235 | 10:43474436–43483543 | loss | 29 | N | - | - | unlikely | N | N | unknown | ||
| HSCR382 | 0.235 | 10:43630181–43636329 | gain | 31 | N | - | - | unlikely | N | N | unknown | ||
| HSCR382 * | 0.235 | 15:20190548–22173977 | gain | 5 | Y | - | - | possible | Y (freq ≥ 5%) | N | yes | ||
| HSCR391 | 0.173 | 21:14629063–48080926 | gain | 245 | Y | Y | Y | known | N | N | known | confirmed with a different size | known |
| HSCR403 §§ | 0.111 | 4:41746863–41751291 | loss | 11 | N | - | - | likely | N | N | Y | yes | |
| HSCR403 *,§§ | 0.111 | 9:43659247–43659512 | gain | 2 | Y | - | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR403 | 0.111 | 22:18661724–18807881 | gain | 5 | Y | - | - | possible | N | N | not excluded | unknown | |
| HSCR403 | 0.111 | 22:21494163–21704972 | gain | 5 | Y | - | Y | unlikely | N | N | confirmed and not excluded | yes | |
| HSCR403 | 0.111 | 22:23056562–23228483 | loss | 3 | Y | - | Y | likely | Y (freq ≥ 5%) | N | confirmed with a different size | yes | |
| HSCR409 * | 0.139 | 15:20848460–22173977 | gain | 4 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | yes | ||
| HSCR412 | 0.204 | 22:20345868–21778882 | loss | 26 | N | - | - | unlikely | N | N | not confirmed | no | |
| HSCR414 * | 0.156 | 15:20848460–22432687 | loss | 5 | N | - | - | possible | Y (freq ≥ 5%) | N | yes | ||
| HSCR415 | 0.195 | 9:113025039–113029430 | loss | 2 | Y | Y | Y | likely | Y (freq ≥ 5%) | Y (1 ind.) ‡ | yes | ||
| HSCR421 * | 0.166 | 9:43659247–43659512 | loss | 2 | Y | Y | Y | likely | Y (freq ≥ 5%) | N | confirmed | yes | |
| HSCR421 | 0.166 | 22:25672585–25892401 | loss | 5 | Y | Y | Y | likely | Y (freq ≥ 5%) | Y (3 inds.) | not excluded | yes | |
| HSCR426 * | 0.111 | 9:43659247–43659512 | loss | 2 | Y | - | Y | likely | Y (freq ≥ 5%) | N | not confirmed and confirmed | yes | |
| HSCR481 * | 0.248 | 5:7656467–8124532 | loss | 2 | Y | - | Y | possible | N | N | not confirmed | no | |
| HSCR481 | 0.248 | 19:31954093–31966036 | loss | 5 | Y | - | - | likely | N | N | Y | not evaluable | yes |
| HSCR481 | 0.248 | 21:14629063–48080926 | gain | 245 | Y | Y | Y | known | N | N | known | confirmed | known |
† True variants: yes = already reported on DGV, validated with a different method, or confirmed in at least one replicate; no = not validated and/or not confirmed in the replicate(s); known = selected controls or known chromosomal rearrangements; unknown = not possible to discriminate between yes and no; * probes not located in the selected high density regions; § aberration not detected by the software call, but identified by visual inspection; Y = percentage of probes with high absolute log2ratio (≥0.5 for gains and ≤−0.8 for losses) above 33.3%; N = percentage ≤ 33.3%; - = not called by the algorithm; ‡ deletion reported as a CNV of unknown pathogenicity in an individual with aganglionic megacolon (another name for HSCR), intellectual disability and short stature; §§ aberrations assumed to be detected because they were identified in two additional replicates.
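The Y/N flag defined in this footnote, together with the interval filter on the average log2ratio mentioned in the abstract, can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `evaluate_interval` and its parameters are hypothetical, and the average filter is assumed here to be the absolute value of the mean log2ratio of the interval being above 0.3.

```python
from statistics import mean

def evaluate_interval(log2ratios, cnv_type,
                      min_abs_mean=0.3,    # interval-level filter from the abstract
                      gain_cutoff=0.5,     # suggestive value for gains
                      loss_cutoff=-0.8,    # suggestive value for losses
                      min_fraction=0.333): # flag Y when the fraction is above 33.3%
    """Apply the two filters to the probes of one called interval."""
    if cnv_type not in ("gain", "loss"):
        raise ValueError("cnv_type must be 'gain' or 'loss'")
    # Assumption: the filter uses the absolute value of the interval mean.
    abs_mean = abs(mean(log2ratios))
    if cnv_type == "gain":
        extreme = sum(1 for r in log2ratios if r >= gain_cutoff)
    else:
        extreme = sum(1 for r in log2ratios if r <= loss_cutoff)
    fraction = extreme / len(log2ratios)
    return {
        "abs_mean": abs_mean,
        "passes_mean_filter": abs_mean > min_abs_mean,
        "fraction": fraction,
        "flag": "Y" if fraction > min_fraction else "N",
    }

# Hypothetical probe values for a called gain and a called loss.
print(evaluate_interval([0.6, 0.55, 0.2, 0.1, 0.7, 0.4], "gain"))
print(evaluate_interval([-0.9, -0.4, -0.35, -0.5, -0.3], "loss"))
```

In this toy example the loss passes the average filter but is flagged N, because only one of its five probes reaches the −0.8 cutoff.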
Detection filters comparison.
| Comparison Groups | True Calls | Not Confirmed | Unknown | Total | ||
|---|---|---|---|---|---|---|
| Likely/possible | 39 | 5 | 4 | 48 | 0.0003 † | |
| Unlikely | 4 | 8 | 23 | 35 | ||
| ADM-2_th6 ≥ 0.333 | 35 | 6 | 12 | 53 | 1.0000 | 0.0033 †† |
| ADM-2_th6 < 0.333 | 3 | 6 | 15 | 24 | ||
| NO ADM-2_th6 (visual only) * | 5 | 1 | 0 | 6 | ||
| ADM-2_th8 ≥ 0.333 | 18 | 0 | 3 | 21 | 0.5346 | 0.0001 †† |
| ADM-2_th8 < 0.333 | 0 | 4 | 5 | 9 | ||
| NO ADM-2_th8 | 25 | 9 | 19 | 53 | ||
| Fuzzy ≥ 0.333 | 28 | 6 | 8 | 42 | 0.5230 | 0.2000 |
| Fuzzy < 0.333 | 0 | 1 | 4 | 5 | ||
| NO Fuzzy | 15 | 6 | 15 | 36 | ||
| Total | 43 | 13 | 27 | 83 |
True calls include controls, aberrations reported on DGV, aberrations confirmed in at least one replicate and aberrations confirmed at validation. Not confirmed calls include aberrations not confirmed at validation and not found in the available replicate. Unknown includes calls not validated and not reported on DGV, for which a replicate sample was not available; these calls were not evaluated in the statistical test. * p-value for true vs. not confirmed calls; † likely/possible calls have a significantly higher chance of being true than unlikely calls; †† the threshold ≥ 0.33 filter has a better chance of discriminating between true and false calls, which is significant for the ADM-2 detection algorithm.
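The comparison of true vs. not confirmed calls between two groups amounts to a test on a 2 × 2 table. The source does not name the test used, so the sketch below assumes a two-sided Fisher's exact test and implements it with the standard library only. Under that assumption, the likely/possible vs. unlikely counts from the table (39 and 5 against 4 and 8) give a p-value of about 3 × 10⁻⁴, in line with the 0.0003 reported.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p = prob(x)
        if p <= p_observed * (1 + 1e-9):  # small tolerance for rounding
            p_value += p
    return min(p_value, 1.0)

# Rows: likely/possible, unlikely; columns: true calls, not confirmed.
print(fisher_exact_two_sided(39, 5, 4, 8))
```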
Regions mapped on the aCGH and probe density.
| Kind of Probes | Candidate Region | Locus | # of Features * | # of Unique Probes * | Average Space (nt) * |
|---|---|---|---|---|---|
| Selected | 10q11.2 | 813 | 8333 | 312 | |
| 9q31 | 9q31 | 1824 | 2501 | ||
| 9p24.1 | 9p24.1 | 142 | 3521 | ||
| 4p13 | 49 | 508 | |||
| 8p12 | 473 | 501 | |||
| 7q21.11 | 468 | 2506 | |||
| rs12707682 | 40 | 500 | |||
| 6q25.1 | 6q25.1 | 714 | 3501 | ||
| 21q22 | 21q22 | 202 | 48,297 | ||
| 3p21 | 3p21 | 1141 | 3503 | ||
| 19q12 | 19q12 | 1085 | 3502 | ||
| 19p13.3 | 18 | 806 | |||
| 16q23.3 | 16q23.3 | 714 | 3501 | ||
| 14q13 | 17 | 812 | |||
| 22q13 | 27 | 823 | |||
| 22q11.2 | 22q11.2 | 162 | 49,383 | ||
| 1p36.1 | 103 | 806 | |||
| 2q22.3 | 165 | 923 | |||
| 13q22 | 112 | 804 | |||
| 5p13.1-p12 | 42 | 810 | |||
| 20q13.2-q13.3 | 44 | 808 | |||
| Genome | 3149 | 3130 | 971,074 | ||
| Replicates | 301 × 5 = 1505 | 301 | |||
| Normalization | 1262 | 1262 | |||
| Agilent controls | 1482 | ||||
| Total | 15,748 | 13,026 |
* Twenty-two probes selected among the high density panel were also included in the normalization set or in the replicates set and are not reported among the # of unique probes selected, but considered for the average coverage. Nineteen probes selected in the rest of the genome had already been selected for the high density regions (10) or already part of the normalization set (9).