Amina Chebira, Yann Barbotin, Charles Jackson, Thomas Merryman, Gowri Srinivasa, Robert F Murphy, Jelena Kovacević.
Abstract
BACKGROUND: Fluorescence microscopy is widely used to determine the subcellular location of proteins. Efforts to determine location on a proteome-wide basis create a need for automated methods to analyze the resulting images. Over the past ten years, the feasibility of using machine learning methods to recognize all major subcellular location patterns has been convincingly demonstrated, using diverse feature sets and classifiers. On a well-studied data set of 2D HeLa single-cell images, the best performance to date, 91.5%, was obtained by including a set of multiresolution features. This demonstrates the value of multiresolution approaches to this important problem.
Year: 2007 PMID: 17578580 PMCID: PMC1933440 DOI: 10.1186/1471-2105-8-210
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Basic multiresolution block. Top: Two-channel analysis filter bank. The filter h is a highpass filter and g is a lowpass filter. Bottom: A 2-level filter bank decomposition of actin. If the original image is of size N × N, the subbands in the middle are of size N/2 × N/2 and those on the right are of size N/4 × N/4. Each branch has either the lowpass filter g or the highpass filter h, followed by downsampling by 2 as in the top figure. Filtering and sampling are performed along the horizontal direction (rows), followed by the same operations along the vertical direction (columns).
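As a rough illustration of the decomposition described in Figure 1, the following Python sketch applies a two-channel filter bank along the rows and then along the columns, and repeats the step for a second level. The Haar filter pair, the function names, and the choice to split every subband again at the second level are assumptions made for this example; the caption does not specify them.

```python
import numpy as np

# Haar filter pair, used here only for illustration (an assumption).
G = np.array([1.0, 1.0]) / np.sqrt(2.0)   # lowpass g
H = np.array([1.0, -1.0]) / np.sqrt(2.0)  # highpass h


def filter_and_downsample(x, taps, axis):
    """Apply a 2-tap filter along `axis` and keep one output per sample pair.

    Computing directly on (even, odd) sample pairs is equivalent to
    2-tap filtering followed by downsampling by 2.
    """
    x = np.moveaxis(x, axis, -1)
    even, odd = x[..., 0::2], x[..., 1::2]
    y = taps[0] * even + taps[1] * odd
    return np.moveaxis(y, -1, axis)


def analyze_2d(image):
    """One level: filter rows (horizontal direction) first, then columns."""
    subbands = {}
    for row_name, row_taps in (("g", G), ("h", H)):
        rows = filter_and_downsample(image, row_taps, axis=1)
        for col_name, col_taps in (("g", G), ("h", H)):
            subbands[row_name + col_name] = filter_and_downsample(
                rows, col_taps, axis=0
            )
    return subbands


def decompose(image, levels=2):
    """Return one dict of subbands per level (every subband is split again)."""
    current = {"": np.asarray(image, dtype=float)}
    out = []
    for _ in range(levels):
        nxt = {}
        for path, band in current.items():
            for name, sub in analyze_2d(band).items():
                nxt[path + name] = sub
        out.append(nxt)
        current = nxt
    return out
```

For an N × N input with N divisible by 4, `decompose(image, levels=2)` returns subbands of size N/2 × N/2 at the first level and N/4 × N/4 at the second, matching the sizes given in the caption.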
Figure 2. Multiresolution (MR) classification system. The generic classification system (GCS) consists of feature extraction followed by classification (inside the dashed box). We add an MR block in front of the GCS and compute features in MR subspaces (subbands). Classification is then performed on each of the subbands, yielding local decisions that are weighted and combined to give a final decision.
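The final combining step in Figure 2 can be sketched as a weighted sum of per-subband decision vectors. This is a minimal illustration, assuming each subband classifier outputs one score per class; the function name `combine_local_decisions` and the array layout are hypothetical and not taken from the source.

```python
import numpy as np


def combine_local_decisions(local_decisions, weights):
    """Combine per-subband class scores into a single decision.

    local_decisions: array of shape (S, C), one row of class scores
                     per subband (assumed layout).
    weights:         array of shape (S,), one weight per subband.
    Returns the index of the winning class and the combined scores.
    """
    local_decisions = np.asarray(local_decisions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    combined = weights @ local_decisions  # shape (C,)
    return int(np.argmax(combined)), combined


# Three subbands voting over two classes.
local = [[0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
label_equal, _ = combine_local_decisions(local, [1.0, 1.0, 1.0])
label_weighted, _ = combine_local_decisions(local, [2.0, 0.5, 2.0])
```

With equal weights the second subband dominates and class 1 wins; down-weighting that subband flips the final decision to class 0, which is the effect the weighting stage is meant to control.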
Table 1. Classification accuracy per class. Z, M and T stand for Zernike, morphological and texture features.
| System | Weight. | Classification accuracy [%] | |||||||
| All | |||||||||
| nMR | NW | 66.12 | 85.49 | 51.20 | 85.76 | 72.48 | 85.06 | 85.04 |
| nMR | NW | 66.12 | 85.76 | 51.20 | 86.64 | 72.48 | 85.78 | 86.24 |
| nMR | NW | 66.12 | | 51.20 | 87.38 | 72.48 | 87.12 | 86.86 |
| MRB | OF | 81.62 | 91.82 | 65.42 | 92.04 | 83.38 | 91.66 | 92.36 |
| MRB | CF | 81.48 | | 65.84 | 92.62 | 83.58 | 92.34 | 92.54 |
| MRF | OF | 84.92 | 94.72 | 65.82 | 94.64 | 86.80 | 94.74 | 94.52 |
| MRF | CF | 85.16 | | 65.24 | | 85.88 | | 95.26 |
T1 are the original Haralick texture features, T2 are the modified Haralick texture features from [16], and T3 are our improved texture features. nMR denotes the base system with no MR, MRB denotes MR basis classification, and MRF denotes MR frame classification. OF denotes the open-form weighting algorithm, while CF denotes the closed-form weighting algorithm. NW denotes no weighting, as there is no MR block in front. Each entry is the mean classification accuracy over a number of trials (different orderings of the images) for a given combination of feature sets. Note that the accuracy of nMR with features M is the same across the rows T1, T2, T3, since texture features are not involved in the classification when morphological features alone are used (similarly with Z, and with M, Z). A subset of these results is shown pictorially in Figure 3. These results should be compared to the best previously obtained result (on the same data set) of 91.5% [15].
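The weighting algorithms themselves are not described in this excerpt. As one plausible reading of a closed-form approach, the sketch below obtains subband weights by ordinary least squares, fitting the weighted sum of local decisions to one-hot class targets. The least-squares formulation, the function name `closed_form_weights`, and the array layout are assumptions for illustration and may differ from the authors' CF algorithm.

```python
import numpy as np


def closed_form_weights(decisions, labels):
    """Least-squares subband weights (illustrative assumption, not the paper's CF).

    decisions: array of shape (N, S, C) holding, for each of N training
               images, the C class scores from each of S subbands.
    labels:    array of shape (N,) with the true class index per image.
    Returns a weight vector of shape (S,).
    """
    decisions = np.asarray(decisions, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n, s, c = decisions.shape
    targets = np.eye(c)[labels]                        # one-hot, shape (N, C)
    a = decisions.transpose(0, 2, 1).reshape(n * c, s)  # one row per (image, class)
    b = targets.reshape(n * c)
    w, *_ = np.linalg.lstsq(a, b, rcond=None)
    return w
```

Under this formulation, a subband whose local decisions always match the true labels receives a weight near 1, while an uninformative subband is pushed toward 0.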
Figure 3. Pictorial representation of classification accuracy results. The diagram shows results from Table 1 for those sets involving T3, namely (T3), (T3, M) and (T3, M, Z). Diamond markers represent the nMR system (no MR block), circles represent the MRB system (MR bases, no redundancy) and squares represent the MRF system (MR frames, redundancy). Filled markers denote the closed-form weighting algorithm (CF), while empty ones denote the open-form weighting algorithm (OF). The following trends are noteworthy: (a) Introducing MR (both MRB and MRF) significantly outperforms nMR, thus demonstrating that classifying in MR subspaces indeed improves classification accuracy. (b) MRF outperforms MRB. (c) Of the two versions of the weighting algorithm, open form and closed form, the closed-form algorithm slightly outperforms the open-form one. (d) The trend in each case is almost flat across the various feature set combinations, indicating that the texture set T3 alone (26 features) is sufficient for high classification accuracy.