Ronggang Zhou, Alan H S Chan.
Abstract
BACKGROUND: To account for the uncertainties inherent in product usability evaluation, Zhou and Chan [1] proposed a comprehensive usability evaluation method that combines the analytic hierarchy process (AHP) with fuzzy evaluation methods to synthesize performance data and subjective response data. The method was designed to provide an integrated framework that combines the inevitably vague judgments arising at multiple stages of the product evaluation process.
OBJECTIVE AND METHODS: To illustrate the effectiveness of the model, this study used a summative usability test case to assess the application and strength of the general fuzzy usability framework. To test the proposed fuzzy usability evaluation framework [1], a standard summative usability test was conducted to benchmark the overall usability of a specific network management software product. Based on the test data, the fuzzy method was applied to incorporate both the usability scores and the uncertainties involved in the multiple components of the evaluation. Then, using Monte Carlo simulation procedures, confidence intervals were used to compare the reliability of the fuzzy approach with that of two typical conventional methods that combine metrics based on percentages.
RESULTS AND ...
Keywords: Usability; analytic hierarchy process (AHP); fuzzy comprehensive evaluation
Year: 2017 PMID: 28035942 PMCID: PMC5302047 DOI: 10.3233/WOR-162473
Source DB: PubMed Journal: Work ISSN: 1051-9815
Fig. 1. A hierarchy structure of the evaluated indexes for the usability measure (SysUse = System Usefulness, InfoQual = Information Quality, IntQual = Interface Quality).
Numerical success ratings with corresponding definitions
| Success | Operational definition |
| 1.0 or 0.9 | Complete the task independently without errors or invalid actions |
| 0.8 or 0.7 | Complete the task independently with a few errors or invalid actions |
| 0.6 or 0.5 | Complete the task with some difficulty and with more errors or invalid actions |
| 0.4, 0.3 or 0.2 | There are more errors or invalid actions. The task can be completed only with the help of documents or hints from the facilitator |
| 0.1 or 0 | The user cannot complete the task or gives up on the task. |
Preliminary statistical results for each original measure
| Participants | Effectiveness: Success | Efficiency: Time (s) | Efficiency: Time* | Satisfaction: InfoQual | Satisfaction: IntQual | Satisfaction: SysUse |
| P1 | 0.955 | 569.333 | 0.644 | 5.143 | 5.667 | 5.875 |
| P2 | 0.969 | 554.000 | 0.681 | 5.800 | 4.333 | 5.875 |
| P3 | 0.988 | 650.667 | 0.451 | 6.286 | 6.333 | 6.750 |
| P4 | 0.983 | 369.667 | 1.120 | 4.000 | 5.667 | 5.625 |
| P5 | 0.962 | 633.667 | 0.491 | 5.286 | 4.667 | 6.125 |
| P6 | 0.962 | 543.000 | 0.707 | 5.600 | 4.667 | 5.625 |
| P7 | 0.943 | 478.000 | 0.862 | 6.143 | 6.333 | 6.250 |
| P8 | 0.954 | 362.333 | 1.137 | 5.714 | 6.667 | 6.375 |
| P9 | 0.937 | 774.000 | 0.157 | 5.000 | 5.333 | 5.375 |
| P10 | 0.933 | 781.000 | 0.140 | 6.429 | 5.667 | 7.000 |
| P11 | 0.969 | 537.000 | 0.721 | 5.000 | 4.667 | 6.125 |
| P12 | 0.940 | 814.500 | 0.061 | 5.167 | 5.333 | 5.875 |
| P13 | 0.960 | 377.500 | 1.101 | 5.143 | 6.000 | 5.625 |
| P14 | 0.970 | 353.000 | 1.160 | 5.714 | 6.333 | 6.000 |
| P15 | 0.935 | 772.000 | 0.162 | 5.714 | 5.333 | 6.000 |
| P16 | 0.988 | 310.500 | 1.261 | 5.571 | 5.333 | 5.500 |
Time* values were converted from the task times using the formula Time* = 2 - (original task time / expected shortest time) proposed in [1]; the expected shortest time was set to 420 seconds in this case. SysUse = System Usefulness, InfoQual = Information Quality, IntQual = Interface Quality.
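The conversion described in this footnote can be sketched in a few lines of Python. The function and constant names are illustrative only and do not come from the paper; the formula and the 420-second reference time are taken from the footnote.

```python
# Illustrative sketch of the footnote's conversion; names are hypothetical.
EXPECTED_SHORTEST_TIME_S = 420.0  # reference time stated in the footnote


def convert_time(task_time_s: float,
                 shortest_s: float = EXPECTED_SHORTEST_TIME_S) -> float:
    """Return Time* = 2 - (original task time / expected shortest time)."""
    return 2.0 - task_time_s / shortest_s


# Mean task times (seconds) for P1, P4, and P12 from the table above.
for participant, seconds in [("P1", 569.333), ("P4", 369.667), ("P12", 814.5)]:
    print(participant, round(convert_time(seconds), 3))
```

A participant who matches the reference time scores 1; faster participants score above 1, and a participant who takes twice the reference time scores 0.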
Fig. 2. The fuzzy membership functions of task success and task time (converted value). v is the measured value for task success or the converted value for task time; μ(v), which ranges from 0 to 1, is the degree of membership of that value in each of the grades very poor, poor, medium, good, and excellent. For v in the interval [0, 1], the threshold parameters v1, v2, v3, v4, v5, and v6 are 0, 0.3, 0.6, 0.8, 0.95, and 1, respectively. c1, c2, c3, c4, and c5 are 0.15, 0.45, 0.7, 0.875, and 0.975, the midpoints of the intervals (v1, v2), (v2, v3), (v3, v4), (v4, v5), and (v5, v6), respectively. For task time, values v < 0 belong solely and completely to very poor, and values 1 < v < 2 belong solely and completely to excellent.
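The figure itself is not reproduced here, so the exact shape of the membership functions is not stated in the text. The sketch below is an assumption: each grade has full membership on its own interval, rises linearly from the midpoint of the interval below, and falls linearly to the midpoint of the interval above. This shape was chosen because it approximately reproduces the tabulated memberships for task success (for example, about 0.24 for "good" at a success score of 0.969). The function name and the clamping used for the task-time rule are illustrative.

```python
# Assumed piecewise-linear membership shape; not confirmed by the source text.
THRESHOLDS = [0.0, 0.3, 0.6, 0.8, 0.95, 1.0]  # v1..v6 from the caption
GRADES = ["very poor", "poor", "medium", "good", "excellent"]


def membership(v, thresholds=THRESHOLDS):
    """Return membership degrees for the five grades, lowest grade first."""
    # Midpoints c1..c5 of consecutive threshold intervals.
    centers = [(lo + hi) / 2 for lo, hi in zip(thresholds, thresholds[1:])]
    # Values outside the range map wholly to the end grades (task-time rule).
    v = min(max(v, thresholds[0]), thresholds[-1])
    degrees = []
    for k in range(len(centers)):
        lo, hi = thresholds[k], thresholds[k + 1]
        if lo <= v <= hi:
            degrees.append(1.0)  # inside the grade's own interval
        elif k > 0 and centers[k - 1] <= v < lo:
            # rising edge, from the lower neighbour's midpoint up to lo
            degrees.append((v - centers[k - 1]) / (lo - centers[k - 1]))
        elif k < len(centers) - 1 and hi < v <= centers[k + 1]:
            # falling edge, from hi down to the upper neighbour's midpoint
            degrees.append((centers[k + 1] - v) / (centers[k + 1] - hi))
        else:
            degrees.append(0.0)
    return degrees


print(dict(zip(GRADES, membership(0.969))))
```

Small differences from the published memberships are expected, because the table reports success scores rounded to three decimals.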
The membership mapping for task success ranking
| Participants | Success | Very poor | Poor | Medium | Good | Excellent |
| P1 | 0.955 | 0 | 0 | 0 | 0.797 | 1 |
| P2 | 0.969 | 0 | 0 | 0 | 0.244 | 1 |
| P3 | 0.988 | 0 | 0 | 0 | 0 | 1 |
| P4 | 0.983 | 0 | 0 | 0 | 0 | 1 |
| P5 | 0.962 | 0 | 0 | 0 | 0.511 | 1 |
| P6 | 0.962 | 0 | 0 | 0 | 0.533 | 1 |
| P7 | 0.943 | 0 | 0 | 0 | 1 | 0.911 |
| P8 | 0.954 | 0 | 0 | 0 | 0.822 | 1 |
| P9 | 0.937 | 0 | 0 | 0 | 1 | 0.822 |
| P10 | 0.933 | 0 | 0 | 0 | 1 | 0.778 |
| P11 | 0.969 | 0 | 0 | 0 | 0.244 | 1 |
| P12 | 0.940 | 0 | 0 | 0 | 1 | 0.867 |
| P13 | 0.960 | 0 | 0 | 0 | 0.600 | 1 |
| P14 | 0.970 | 0 | 0 | 0 | 0.200 | 1 |
| P15 | 0.935 | 0 | 0 | 0 | 1 | 0.800 |
| P16 | 0.988 | 0 | 0 | 0 | 0 | 1 |
| Sum | | 0 | 0 | 0 | 8.952 | 15.178 |
| Normalized | | 0 | 0 | 0 | 0.371 | 0.629 |
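The last two rows of this table appear to be the column sums of the membership degrees and those sums divided by their grand total. The sketch below reproduces that arithmetic from the tabulated "good" and "excellent" columns; the variable and function names are illustrative, and reading the rows this way is an inference from the numbers rather than a statement in the source.

```python
# Membership degrees for P1..P16, copied from the two non-zero table columns.
good = [0.797, 0.244, 0, 0, 0.511, 0.533, 1, 0.822,
        1, 1, 0.244, 1, 0.600, 0.200, 1, 0]
excellent = [1, 1, 1, 1, 1, 1, 0.911, 1,
             0.822, 0.778, 1, 0.867, 1, 1, 0.800, 1]


def normalize(column_sums):
    """Scale the column sums so that they add up to 1."""
    total = sum(column_sums)
    return [s / total for s in column_sums]


# Very poor, poor, and medium are zero for every participant.
column_sums = [0.0, 0.0, 0.0, sum(good), sum(excellent)]
weights = normalize(column_sums)
print([round(s, 3) for s in column_sums])
print([round(w, 3) for w in weights])
```

The summed "good" column differs from the published 8.952 only in the third decimal, which is consistent with rounding of the individual entries.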
Fig. 3. The fuzzy membership function of satisfaction. v is the measured value; μ(v), which ranges from 0 to 1, is the degree of membership of that value in each of the grades very poor, poor, medium, good, and excellent. The threshold parameters v1, v2, v3, v4, v5, and v6 are 1, 2, 3.5, 5.5, 6.5, and 7, respectively. c1, c2, c3, c4, and c5 are 1.5, 2.75, 4.5, 6, and 6.75, respectively.
Data from Table 3 transformed to percentages for the true case
| Participants | Effectiveness (Success) | Efficiency (Time) | Satisfaction: InfoQual | Satisfaction: IntQual | Satisfaction: SysUse | Average | Weighted average |
| P1 | 95.5 | 64.4 | 69.0 | 77.8 | 81.3 | 78.7 | 83.0 |
| P2 | 96.9 | 68.1 | 80.0 | 55.6 | 81.3 | 79.1 | 83.8 |
| P3 | 98.8 | 45.1 | 88.1 | 88.9 | 95.8 | 78.3 | 87.0 |
| P4 | 98.3 | 100.0 | 50.0 | 77.8 | 77.1 | 88.9 | 87.2 |
| P5 | 96.2 | 49.1 | 71.4 | 61.1 | 85.4 | 72.7 | 80.5 |
| P6 | 96.2 | 70.7 | 76.7 | 61.1 | 77.1 | 79.5 | 83.2 |
| P7 | 94.3 | 86.2 | 85.7 | 88.9 | 87.5 | 89.3 | 90.2 |
| P8 | 95.4 | 100 | 78.6 | 94.4 | 89.6 | 94.3 | 93.0 |
| P9 | 93.7 | 15.7 | 66.7 | 72.2 | 72.9 | 60.0 | 71.6 |
| P10 | 93.3 | 14.0 | 90.5 | 77.8 | 100 | 65.6 | 79.6 |
| P11 | 96.9 | 72.1 | 66.7 | 61.1 | 85.4 | 80.0 | 84.1 |
| P12 | 94.0 | 6.1 | 69.4 | 72.2 | 81.3 | 58.1 | 72.0 |
| P13 | 96.0 | 100 | 69.0 | 83.3 | 77.1 | 90.8 | 88.9 |
| P14 | 97.0 | 100 | 78.6 | 88.9 | 83.3 | 93.5 | 92.1 |
| P15 | 93.5 | 16.2 | 78.6 | 72.2 | 83.3 | 62.6 | 75.0 |
| P16 | 98.8 | 100.0 | 76.2 | 72.2 | 75.0 | 91.1 | 89.7 |
| Mean (S.D.) | | | | | | 78.9 (12.18) | 83.8 (6.70) |
So that the lowest possible rating maps to 0%, the original user satisfaction data on a scale from 1 to 7 were shifted to a scale from 0 to 6. Each participant's converted score was then divided by the maximum possible score of 6 to obtain the user satisfaction percentage. SysUse = System Usefulness, InfoQual = Information Quality, IntQual = Interface Quality.
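As a minimal sketch of the rescaling described in this footnote, the function below shifts a 1-to-7 rating to a 0-to-6 scale and divides by 6. The function name and its default arguments are illustrative, not taken from the paper.

```python
def satisfaction_percentage(rating: float,
                            scale_min: float = 1.0,
                            scale_max: float = 7.0) -> float:
    """Shift the rating so the scale starts at 0, then express it as a percentage."""
    return 100.0 * (rating - scale_min) / (scale_max - scale_min)


# P1's mean ratings for InfoQual, IntQual, and SysUse from the earlier table.
for label, rating in [("InfoQual", 5.143), ("IntQual", 5.667), ("SysUse", 5.875)]:
    print(label, round(satisfaction_percentage(rating), 2))
```

Without the shift, the lowest rating of 1 would map to about 14% rather than 0%, so the satisfaction percentages could never cover the full 0 to 100 range that the performance measures can.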
Sample data of confidence interval width with simulation for the three evaluation methods
| Case | Method | Mean | S.D. | N = 1 | N = 2 | N = 3 | ... | N = 16 |
| True case | Fuzzy | 85.0 | 5.26 | 10.32 | 7.30 | 5.96 | ... | 2.58 |
| | Weighted | 83.8 | 6.70 | 13.13 | 9.28 | 7.58 | ... | 3.28 |
| | Averages | 78.9 | 12.18 | 23.87 | 16.88 | 13.78 | ... | 5.97 |
| Simulation 1 | Fuzzy | 87.70 | 3.98 | 7.80 | 5.52 | 4.50 | ... | 1.95 |
| | Weighted | 84.27 | 6.18 | 12.11 | 8.56 | 6.99 | ... | 3.03 |
| | Averages | 79.00 | 12.29 | 24.09 | 17.03 | 13.91 | ... | 6.02 |
| ... | ... | | | | | | | |
| Simulation 100 | Fuzzy | 84.68 | 4.72 | 9.25 | 6.54 | 5.34 | ... | 2.31 |
| | Weighted | 83.52 | 7.52 | 14.74 | 10.42 | 8.51 | ... | 3.68 |
| | Averages | 78.31 | 12.43 | 24.36 | 17.23 | 14.07 | ... | 6.09 |
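The tabulated widths are numerically consistent with the usual normal-approximation margin 1.96 × S.D. / √N, although the text here does not state the formula. The sketch below assumes that form; the function name and the default z value of 1.96 are assumptions.

```python
import math


def ci_width(sd: float, n: int, z: float = 1.96) -> float:
    """Assumed margin of a 95% confidence interval: z * S.D. / sqrt(N)."""
    return z * sd / math.sqrt(n)


# Standard deviations from the "True case" rows of the table above.
for method, sd in [("Fuzzy", 5.26), ("Weighted", 6.70), ("Averages", 12.18)]:
    widths = [round(ci_width(sd, n), 2) for n in (1, 2, 3, 16)]
    print(method, widths)
```

Under this assumption the width shrinks with the square root of the sample size, so a method with a smaller standard deviation reaches a given precision with fewer participants.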
Fig. 4. Confidence interval width as a function of sample size and evaluation method.