Bernhard Y Renard, Martin Löwer, Yvonne Kühne, Ulf Reimer, Andrée Rothermel, Ozlem Türeci, John C Castle, Ugur Sahin.
Abstract
BACKGROUND: Peptide microarrays offer enormous potential as a screening tool for peptidomics experiments and have recently seen a widening field of application, ranging from immunological studies to systems biology. By allowing the parallel analysis of thousands of peptides in a single run, they are suitable for high-throughput settings. Since the data characteristics of peptide microarrays differ from those of DNA oligonucleotide microarrays, computational methods need to be tailored to these specifications to allow robust and automated data analysis. While follow-up experiments can ensure the specificity of results, sensitivity cannot be recovered in later steps. Providing sensitivity is thus a primary goal of data analysis procedures. To this end we created rapmad (Robust Alignment of Peptide MicroArray Data), a novel computational tool implemented in R.
Year: 2011 PMID: 21816082 PMCID: PMC3174949 DOI: 10.1186/1471-2105-12-324
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flowchart. Flowchart of the data analysis pipeline for extracting a list of signal-carrying peptides from the measured intensities of a peptide microarray scan.
Experimental Setup
| Array number | Print batch | Spike-in antibody concentration [ng/ml] | Plasma |
|---|---|---|---|
| 1 | 1 | - | - |
| 2 | 1 | 1 | + |
| 3 | 1 | 3 | + |
| 4 | 2 | - | - |
| 5 | 2 | 1 | + |
| 6 | 2 | 3 | + |
Summary of the experimental setup for the six microarrays used in this study.
Linear model fit summary
| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Peptide | 12 | 170916 | 14243.0 | 57461.65 | < 2.2e-16 |
| Subarray | 2 | 24 | 12.2 | 49.18 | < 2.2e-16 |
| Needle | 15 | 45 | 3.0 | 12.08 | < 2.2e-16 |
| Row | 234 | 458 | 2.0 | 7.89 | < 2.2e-16 |
| Column | 75 | 409 | 5.5 | 22.03 | < 2.2e-16 |
| Residuals | 2218 | 550 | 0.2 | | |
Linear model fit summary. The F values and the corresponding probabilities (Pr(>F)) indicate that all explanatory variables are highly significant and that each contributes to reducing the variance present in the arrays. The peptide sequence shows by far the strongest effect: it requires only twelve degrees of freedom (Df) yet has a sum of squares (Sum Sq) of 170,916, far larger than the residual sum of squares of 550. While not as strong as the peptide effect, the remaining explanatory effects reduce the residuals by 65%. Among these, column and row effects have the largest sums of squares, but subarray and needle effects require fewer degrees of freedom, resulting in large mean sum of squares (Mean Sq) values.
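The relation between the columns of the table above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (not the authors' R code) that recomputes Mean Sq as Sum Sq / Df and the F value as the ratio of each term's Mean Sq to the residual Mean Sq. The names `ANOVA_ROWS`, `anova_summary` and `layout_share` are hypothetical, and the inputs are the rounded values printed in the table, so the results only approximately match the reported figures.

```python
# Rows copied from the table above as (term, Df, Sum Sq).
# The sums of squares are rounded as printed, so results are approximate.
ANOVA_ROWS = [
    ("Peptide", 12, 170916.0),
    ("Subarray", 2, 24.0),
    ("Needle", 15, 45.0),
    ("Row", 234, 458.0),
    ("Column", 75, 409.0),
    ("Residuals", 2218, 550.0),
]


def anova_summary(rows):
    """Return {term: (mean_sq, f_value)}; f_value is None for the residual row."""
    resid_df, resid_ss = next(
        (df, ss) for term, df, ss in rows if term == "Residuals"
    )
    resid_ms = resid_ss / resid_df
    summary = {}
    for term, df, ss in rows:
        mean_sq = ss / df
        f_value = None if term == "Residuals" else mean_sq / resid_ms
        summary[term] = (mean_sq, f_value)
    return summary


def layout_share(rows):
    """Share of the non-peptide variation absorbed by the layout terms
    (subarray, needle, row, column) rather than left in the residuals."""
    layout_ss = sum(
        ss for term, _, ss in rows if term not in ("Peptide", "Residuals")
    )
    resid_ss = sum(ss for term, _, ss in rows if term == "Residuals")
    return layout_ss / (layout_ss + resid_ss)


summary = anova_summary(ANOVA_ROWS)
for term, (mean_sq, f_value) in summary.items():
    f_text = "" if f_value is None else f"{f_value:.2f}"
    print(f"{term:10s} Mean Sq = {mean_sq:10.2f}  F = {f_text}")
print(f"Layout share of non-peptide variation: {layout_share(ANOVA_ROWS):.2f}")
```

With the rounded table entries, the layout terms account for roughly 0.63 of the variation not explained by the peptide, which is in the region of the reported 65%; exact agreement is not expected from rounded inputs.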
Figure 2. Unreliable Spot Finding. Scatter plot of intensities from slides of two print batches with the same high antibody concentration on a binary logarithm scale. Due to experimental noise, we see departures from the diagonal line on which we would expect all data points. The quality control algorithm identifies approximately 2.5% of all data points in each print batch as unreliable across all subarrays; these peptides are removed accordingly (colored in magenta, cyan and orange), resulting in an increase of the coefficient of variation of approximately 3%. While not all outlying observations are identified, the removed spots are primarily peptide spots that show large variation between the print batches.
Figure 3. Sensitivity, Specificity and Accuracy in Comparison. Sensitivity, specificity and accuracy for our approach without and with secondary antibody binding removal in comparison to the approaches of [2] and [17]. Both the high (3 ng/ml) and the low (1 ng/ml) spike-in antibody concentration slides were evaluated for both print batches (left and right). 95% bootstrap confidence intervals were computed from 1000 resamplings of the peptide intensities and are shown by dashed lines. For all approaches, the specificity remains roughly constant when the spike-in antibody concentration is reduced, while sensitivity and accuracy generally decline. For the approaches of [2] and [17] the decline in accuracy is rather steep, whereas our approach still shows good accuracy above 0.8 for the low antibody concentration.
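The three measures and their bootstrap intervals can be illustrated with a small Python sketch. It is not the authors' procedure: the caption describes resampling peptide intensities, whereas this simplified version resamples per-peptide (truth, call) pairs and uses percentile intervals. The function names `confusion_metrics` and `bootstrap_ci` and the toy data are assumptions made purely for illustration.

```python
import random


def confusion_metrics(truth, called):
    """Sensitivity, specificity and accuracy from paired boolean lists.
    A measure is None when its denominator is zero."""
    tp = sum(t and c for t, c in zip(truth, called))
    tn = sum((not t) and (not c) for t, c in zip(truth, called))
    fp = sum((not t) and c for t, c in zip(truth, called))
    fn = sum(t and (not c) for t, c in zip(truth, called))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "accuracy": (tp + tn) / len(truth) if truth else None,
    }


def bootstrap_ci(truth, called, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one metric, resampling peptides
    (simplifying assumption: labels and calls are resampled as pairs)."""
    rng = random.Random(seed)
    n = len(truth)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        value = confusion_metrics(
            [truth[i] for i in idx], [called[i] for i in idx]
        )[metric]
        if value is not None:  # skip resamples where the metric is undefined
            stats.append(value)
    if not stats:
        raise ValueError("metric undefined in every bootstrap resample")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Toy data (hypothetical): 10 signal-carrying peptides, 30 without signal.
truth = [True] * 10 + [False] * 30
called = [True] * 8 + [False] * 2 + [True] * 3 + [False] * 27

metrics = confusion_metrics(truth, called)
ci_low, ci_high = bootstrap_ci(truth, called, "accuracy")
print(metrics)
print(f"95% bootstrap CI for accuracy: [{ci_low:.3f}, {ci_high:.3f}]")
```

Fixing the seed makes the interval reproducible; with only a handful of true positives, as in this toy example, the sensitivity interval would be much wider than the accuracy interval.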