Andrew Palmer1, Ekaterina Ovchinnikova1, Mikael Thuné2, Régis Lavigne2, Blandine Guével2, Andrey Dyatlov1, Olga Vitek2, Charles Pineau2, Mats Borén2, Theodore Alexandrov3.
Abstract
MOTIVATION: Imaging mass spectrometry (IMS) is a maturing technique of molecular imaging. Confidence in the reproducible quality of IMS data is essential for its integration into routine use. However, the predominant method for assessing quality is visual examination, a time-consuming, unstandardized and non-scalable approach. So far, the problem of assessing quality has been addressed only marginally, and existing measures do not account for the spatial information of IMS data. Importantly, no approach exists for unbiased evaluation of potential quality measures.
Year: 2015 PMID: 26072506 PMCID: PMC4765867 DOI: 10.1093/bioinformatics/btv266
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. The workflow of our study, which contains two parts: (i) creation of the gold-standard set of ion image pairs, annotated by experts with the relative quality of the images in each pair, and (ii) evaluation of candidate measures of ion image quality. The anonymized survey data and the Matlab source code for data analysis are available in the GitHub project repository.
Fig. 2. A screenshot from the online survey showing a pair of ion images. Raters were asked to indicate the relative quality of these images by moving the slider to the left or to the right, depending on which image they believed to be of higher quality.
Measures calculated per image, where the input is either the raw intensity values (Grey) or the intensities mapped onto a jet colour scale (RGB)
| Input image | Measure | Statistics | Window size |
|---|---|---|---|
| Grey | COV | a | 3, 5, 11, 21, 51 |
| Grey | STD | a | 3, 5, 11, 21, 51 |
| Grey | SNR | a | 3, 5, 11, 21, 51 |
| Grey | SE | a | 3, 5, 11, 21, 51 |
| Grey | SC | b | |
| Grey | histogram | c | |
| RGB | luminescence | a | |
| RGB | histogram (per channel) | c | |
Sets of summary statistics calculated were: a, mean, median, maximum absolute deviation (mad), maximum (max), minimum (min), sum; b, mean; c, skew, kurtosis, entropy, maximum value. Each measure was applied to the whole image and, where indicated in the column ‘Window size’, locally to moving square windows of the specified size in pixels.
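The local variant of a measure can be sketched as follows. This is a minimal illustration, not the study's Matlab code: it assumes that COV denotes the coefficient of variation (standard deviation divided by mean), that only windows lying fully inside the image are used, and it omits the 'mad' statistic because its exact definition is not given here. The function names `local_measure`, `cov` and `summarise` are hypothetical.

```python
import statistics

def local_measure(image, window, measure):
    """Apply `measure` to every square window of side `window` that fits
    entirely inside `image` (a list of equal-length rows of numbers)."""
    rows, cols = len(image), len(image[0])
    values = []
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            patch = [image[r + i][c + j]
                     for i in range(window) for j in range(window)]
            values.append(measure(patch))
    return values

def cov(patch):
    """Assumed definition: population SD divided by the mean (0 if mean is 0)."""
    mean = statistics.fmean(patch)
    return statistics.pstdev(patch) / mean if mean else 0.0

def summarise(values):
    """Part of statistics set 'a': mean, median, max, min and sum."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "max": max(values),
        "min": min(values),
        "sum": sum(values),
    }

# Toy 4x4 "ion image": flat background with a single bright pixel.
image = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 3.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
local_cov = local_measure(image, 3, cov)   # one value per 3x3 window
stats = summarise(local_cov)
```

Each combination of measure, window size and summary statistic yields one scalar per image, which is how a small table of measures can expand into a large set of candidate quality scores.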
Fig. 3. Overview of the raters who completed the survey and their feedback on the task and survey. (a) Raters’ experience and background show that a diverse range of experts was recruited. (b) The raters described the objectives of the survey as important, the mechanics and the layout of the survey clear and easy to use but the task of determining image quality was found to be difficult. (c) Raters’ feedback on the survey duration and number of pairs showed that the survey was comfortable for the participants.
Fig. 4. Assessment of the raters and the ratings they provided. (a) A histogram of the change in inter-rater agreement when removing one rater in turn. (b) A histogram of the change in inter-rater agreement when removing one image pair in turn. (c) The median time spent per pair against the average slider value and SD per pair. (d) Box-whisker plot of tracked time shows an expected learning curve, with less time needed per rating as the survey progresses (plot shows 25/50/75 quantiles with whiskers covering 99.3% of data).
Fig. 5. Agreement (Krippendorff’s alpha αk) calculated after sequentially removing the worst performing image pairs (top) and raters (bottom) then recalculating the agreement on the remaining subset (i.e. removing from left to right in Fig. 4a and b).
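The leave-one-out analysis behind Fig. 4a can be illustrated with a short sketch. It assumes the interval form of Krippendorff's alpha (squared difference as the disagreement metric) and a complete rating matrix with no missing values; the metric actually used in the study is not stated here, and the function names are hypothetical.

```python
from itertools import permutations

def krippendorff_alpha_interval(ratings):
    """Krippendorff's alpha with the interval metric.

    `ratings[u]` is the list of values given to unit (image pair) `u`,
    one per rater, with no missing values. Requires that the ratings
    are not all identical, otherwise expected disagreement is zero."""
    units = [u for u in ratings if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)
    # Observed disagreement: squared differences within each unit.
    d_o = sum(
        sum((a - b) ** 2 for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n
    # Expected disagreement: squared differences over all pairs of values.
    d_e = sum((a - b) ** 2 for a, b in permutations(values, 2)) / (n * (n - 1))
    return 1.0 - d_o / d_e

def leave_one_rater_out(ratings):
    """Change in alpha when each rater (column) is removed in turn."""
    base = krippendorff_alpha_interval(ratings)
    n_raters = len(ratings[0])
    changes = []
    for k in range(n_raters):
        reduced = [[v for j, v in enumerate(u) if j != k] for u in ratings]
        changes.append(krippendorff_alpha_interval(reduced) - base)
    return changes

# Three image pairs rated by three raters; the third rater disagrees.
ratings = [[1, 1, 5], [2, 2, 1], [3, 3, 2]]
changes = leave_one_rater_out(ratings)
```

A large positive change marks a rater whose removal raises agreement. Removing image pairs instead of raters works the same way on rows rather than columns.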
Evaluation of the candidate image-based measures on the three gold-standard datasets
| Measure | Corr* | Sign | Corr* | Sign | Corr* | Sign |
|---|---|---|---|---|---|---|
| | 0.74 | 0.76 | 0.76 | 0.77 | 0.93 | 0.93 |
| | 0.71 | 0.77 | 0.74 | 0.78 | 0.93 | 0.93 |
| SC | 0.58 | 0.74 | 0.59 | 0.75 | 0.72 | 0.85 |
| | 0.54 | 0.70 | 0.56 | 0.71 | 0.67 | 0.82 |
| | 0.48 | 0.67 | 0.49 | 0.67 | 0.61 | 0.76 |
| | 0.48 | 0.67 | 0.49 | 0.68 | 0.61 | 0.75 |
| | 0.44 | 0.62 | 0.45 | 0.62 | 0.57 | 0.69 |
| | 0.40 | 0.63 | 0.41 | 0.63 | 0.54 | 0.72 |
| | 0.40 | 0.63 | 0.41 | 0.63 | 0.53 | 0.71 |
| | 0.40 | 0.62 | 0.41 | 0.61 | 0.53 | 0.71 |
Out of 143 considered measures, 10 with the highest correlation are shown, sorted by their Pearson correlation (Corr) between the differential measures and the human ratings. Sign stands for the sign matching statistic as defined in Equation (7). *All P-values < 0.001.
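The two evaluation statistics can be sketched as below. Equation (7) is not reproduced in this record, so the sign-matching function is an assumed form: the fraction of image pairs for which the differential measure and the human rating have the same sign. The sign convention (positive meaning the first image of the pair is better) and all names and numbers are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def sign_match(diffs, ratings):
    """Assumed form of the sign-matching statistic: share of pairs where
    the differential measure and the rating have the same sign."""
    agree = sum(1 for d, r in zip(diffs, ratings) if d * r > 0)
    return agree / len(diffs)

# Measure values for the first and second image of each pair (made up).
first = [0.9, 0.2, 0.7, 0.4]
second = [0.4, 0.6, 0.5, 0.5]
diffs = [a - b for a, b in zip(first, second)]   # differential measure
ratings = [0.8, -0.6, -0.1, -0.3]                # slider values (made up)

corr = pearson(diffs, ratings)
sign = sign_match(diffs, ratings)
```

Correlation rewards a measure whose differences track the strength of the raters' preference, whereas sign matching only checks that the measure picks the same image of the pair as the raters.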