Jelle Wouters, Abel Menkveld, Sjaak Brinkkemper, Fabiano Dalpiaz.
Abstract
Crowd-based Requirements Engineering (CrowdRE) promotes the active involvement of a large number of stakeholders in RE activities. A prominent strand of CrowdRE research concerns the creation and use of online platforms for a crowd of stakeholders to formulate ideas, which serve as an additional input for requirements elicitation. Most of the reported case studies are small in size, and they analyze the size of the crowd rather than the quality of the collected ideas. By means of an iterative design that includes three case studies conducted at two organizations, we present the CREUS method for crowd-based elicitation via user stories. Besides reporting the details of these case studies and quantitative results on the number of participants, ideas, votes, etc., a key contribution of this paper is a qualitative analysis of the elicited ideas. To analyze the quality of the user stories, we apply criteria from the Quality User Story framework, we calculate automated text readability metrics, and we check for the presence of vague words. We also study whether the user stories can be linked to software qualities, and the specificity of the ideas. Based on the results, we distill six key findings regarding CREUS and, more generally, for CrowdRE via pull feedback.
Keywords: Case studies; CrowdRE; Elicitation; Pull feedback; User stories
Year: 2022 PMID: 36033205 PMCID: PMC9392511 DOI: 10.1007/s00766-022-00384-6
Source DB: PubMed Journal: Requir Eng ISSN: 0947-3602 Impact factor: 2.275
Comparison of the Tournify, S-Sys, and V-Sys cases with earlier studies
| Measurement | Tournify | S-Sys | V-Sys | Tournify | REfine | GARUSO |
|---|---|---|---|---|---|---|
| Duration in days | 35 | 33 | 56 | 35 | 92 | |
| Participants | ||||||
| Invited | 337 | 478 | 2,393 | unk. | 37 | unk. |
| Accessed | 157 | 135 | 385 | unk. | 19 | 726 |
| Active | 39 | 60 | 130 | 19 | 32 | |
| Ideas | 57 | 32 | 78 | 248 | 21 | 56 |
| Logins | 247 | 240 | 623 | unk. | unk. | unk. |
| Votes | 89 | 513 | 130 | 160 | ||
| Comments | 14 | 28 | 78 | 161 | 37 | unk. |
| Ideas/Accessed | 0.36 | 0.24 | 0.20 | unk. | 1.11 | 0.08 |
| Ideas/Active | 1.46 | 0.53 | 0.60 | 1.84 | 1.11 | 1.75 |
Notes. One column reports the additional ideas obtained from the channel after the case study period ended. For one of the cases, we count only participants who posted ideas; for technical reasons, we could not record participants who voted or commented. Some counts are slightly lower than those reported in previous work [13], as every idea included a vote self-assigned to its author; for consistency, we subtracted those votes in this paper.
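The engagement ratios in the last two rows of the comparison table are plain quotients of the counts above them. A minimal sketch, with the figures for the three current cases copied from the table:

```python
# Engagement ratios from the comparison table.
# Keys are case names; values are (ideas, accessed, active) counts.
cases = {
    "Tournify": (57, 157, 39),
    "S-Sys": (32, 135, 60),
    "V-Sys": (78, 385, 130),
}

for name, (ideas, accessed, active) in cases.items():
    print(f"{name}: ideas/accessed = {ideas / accessed:.2f}, "
          f"ideas/active = {ideas / active:.2f}")
```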
Metrics used for analyzing the ideas in the a-posteriori analysis
| Metric | Description |
|---|---|
| User story quality | |
| Well-formed | A user story includes at least a role and a means |
| Atomic | A user story expresses a requirement for exactly one user-visible feature |
| Conceptually sound | The means expresses a feature and the ends expresses a rationale |
| Problem-oriented | A user story only specifies the problem, not the solution to it |
| Vagueness | |
| Vagueness | Does the user story include one of the weak words from the QUARS++ list? |
| Text readability | |
| Automated readability index | Complexity of a text in terms of the average number of characters per word and the average number of words per sentence |
| Flesch reading-ease test | Text complexity in terms of the average number of words per sentence and the average number of syllables per word |
| Quality requirements (ISO/IEC 25010 standard) | |
| Reliability | Degree to which a system, product or component performs specified functions under specified conditions for a specified period of time |
| Performance (efficiency) | Performance relative to the amount of resources used under stated conditions |
| Security | Degree to which a product or system protects information and data so that persons or other products or systems have the degree of data access appropriate to their types and levels of authorization |
| Compatibility | Degree to which a product, system or component can exchange information with other products, systems or components, and/or perform its required functions, while sharing the same hardware or software environment |
| Usability | Degree to which a product or system can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use |
| Generality versus Specificity | |
| General | The idea refers to the general user of the system, without limitation on certain usage contexts |
| Specific | The idea concerns specific user types or specific usage contexts |
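The two readability metrics in the table are closed-form formulas over simple text counts (ARI: 4.71·chars/word + 0.5·words/sentence − 21.43; Flesch: 206.835 − 1.015·words/sentence − 84.6·syllables/word). The sketch below implements both; the syllable counter is a common naive vowel-group heuristic, not necessarily the tokenization used in the study:

```python
import re

def ari(text: str) -> float:
    """Automated Readability Index: characters per word, words per sentence."""
    words = re.findall(r"[A-Za-z]+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    chars = sum(len(w) for w in words)
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / sentences) - 21.43

def count_syllables(word: str) -> int:
    """Naive heuristic: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch(text: str) -> float:
    """Flesch reading-ease: words per sentence, syllables per word."""
    words = re.findall(r"[A-Za-z]+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

story = "As a referee, I want to record scores on my phone."
print(f"ARI: {ari(story):.1f}, Flesch: {flesch(story):.1f}")
```

Higher Flesch scores indicate easier text; higher ARI scores indicate harder text.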
Fig. 1 Process-Deliverable Diagram representing the CREUS method for crowd-based requirements elicitation. The activities marked with the crowd symbol are executed by the crowd; the others are performed by the core team
Concept table for the CREUS method illustrated in Fig. 1
| Concept | Description |
|---|---|
| CROWDRE GOAL | The main objective of the crowd-based elicitation. |
| FEEDBACK CHANNEL | The platform through which the crowd members express their feedback, for instance, an idea generation portal. |
| CROWD MEMBER | A crowd participant who accesses FEEDBACK CHANNEL and may express FEEDBACK. |
| USER STORY | A concise representation of a requirement that expresses an expected feature, the role that desires it, and (optionally) the rationale behind it. |
| FEEDBACK | Any input provided by the crowd that addresses the CROWDRE QUESTION via the FEEDBACK CHANNEL: either an IDEA, a VOTE, or a COMMENT. |
| IDEA | A suggestion or a request for the system that a crowd member expresses as a USER STORY. |
| VOTE | Expresses positive or negative support for an IDEA. Can be cast by crowd members who are not the authors of the USER STORY. |
| COMMENT | A clarification or explanation of an IDEA provided by a crowd member (either the author or another crowd member). |
| SUMMARY | A recap of a set of FEEDBACK items that is intended to provide an overview of that FEEDBACK. |
| RESPONSE | A response to an IDEA given by the core team, to show that the IDEA is being considered. Preferably, RESPONSEs show whether IDEAs are being included in the product under consideration, and show what the argumentation for ex/inclusion is. |
| TIMELINE | An overview of the temporal horizon of the expected implementation of the FEEDBACK. The level of detail depends on the development method. |
| PRODUCT BACKLOG | The master list of all functionality desired in the product. |
| BACKLOG ITEM | A single unit of work that is placed on the PRODUCT BACKLOG. |
| SPRINT | A short iteration (2-4 weeks, typically) through which a subset of the PRODUCT BACKLOG (thus, a number of BACKLOG ITEMS) is moved onto a sprint backlog, which is implemented in that iteration. |
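The concept table can be sketched as a small data model. Class and field names mirror the concepts above; the concrete structure (e.g., VOTEs as ±1 integers) is an illustrative assumption, not the paper's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class UserStory:
    role: str
    means: str
    ends: str = ""  # optional rationale

    def __str__(self) -> str:
        text = f"As a {self.role}, I want {self.means}"
        return text + (f", so that {self.ends}" if self.ends else "")

@dataclass
class Idea:
    author: str
    story: UserStory
    votes: list = field(default_factory=list)     # VOTEs: +1 up, -1 down
    comments: list = field(default_factory=list)  # COMMENTs (clarifications)
    response: str = ""                            # RESPONSE from the core team

    def score(self) -> int:
        # Net support across all cast VOTEs.
        return sum(self.votes)

idea = Idea("crowd-member-42", UserStory("referee", "to enter scores offline"))
idea.votes += [1, 1, -1]
print(idea.story, "| net votes:", idea.score())
```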
Activity table for the CREUS method illustrated in Fig. 1
| Activity | Sub-activity | Description |
|---|---|---|
| CrowdRE preparation | Create core team | The core team oversees and interacts with the crowd after its deployment, in order to support and retain its engagement. |
| Define goal for the crowd | The core team defines the CROWDRE GOAL for the crowd to address, so as to maximize the relevance of the collected ideas. | |
| Setup feedback channel | A FEEDBACK CHANNEL is selected and configured so that FEEDBACK can be collected from the crowd. | |
| Deploy crowd | The CrowdRE elicitation is kickstarted by opening up the FEEDBACK CHANNEL and by inviting the crowd members to join. | |
| Idea generation | Generate ideas | CROWD MEMBERs formulate IDEAs via the FEEDBACK CHANNEL using the USER STORY format. |
| Vote on ideas | CROWD MEMBERs post up- and down-VOTEs on previous IDEAs to express their support. | |
| Discuss ideas | CROWD MEMBERs post COMMENTs that elaborate on the previously posted IDEAs. | |
| Monitor crowd | The core team oversees the progress of the crowd and sends stimuli when necessary to boost participation. | |
| Refinement | Write summary | The core team puts together a SUMMARY of the FEEDBACK obtained, and shares it with the crowd to give a quick overview. |
| Respond to ideas | The core team adds RESPONSEs to the IDEAs to clearly indicate that they are currently being considered. | |
| Generate ideas | As earlier: the generation of IDEAs continues, also based on the RESPONSEs and the SUMMARY. | |
| Vote on ideas | As earlier: the casting of VOTEs continues. | |
| Discuss ideas | As earlier: more COMMENTs are posted. | |
| Monitor crowd | The overseeing process continues, with a focus on supporting convergence for the current IDEAs. | |
| Response and execution | Respond to remaining ideas | The core team writes RESPONSEs to all the remaining IDEAs. |
| Develop and share timeline | The core team prepares a TIMELINE for the implementation of the system, based on the crowd-based elicitation. This TIMELINE is shared with the CROWD MEMBERs. | |
| Invite to focus group | The most active participants are invited to join a focus group. | |
| Execute timeline in sprints | The TIMELINE is executed iteratively by defining and carrying out SPRINTs, each of which gets assigned a number of BACKLOG ITEMs from the PRODUCT BACKLOG. |
Fig. 2 The wizard template for authoring user stories in the case
Fig. 3 The idea board of the KMar Crowd platform, with data from S-Sys translated to English. On the left, the existing ideas and their voting buttons are visible. On the right, new ideas can be expressed via a simplified user-story format
Fig. 4 Effort estimation for the not-yet-implemented ideas
Activity per user type in the S-Sys case study (N=135)
| Origin | % of total | Ideas per user | Votes per user | Logins per user |
|---|---|---|---|---|
| Operational employee | 58.52% | 0.23 | 2.66 | 1.84 |
| Middle management | 8.15% | 0.82 | 3.18 | 2.55 |
| Non-targeted employee | 33.34% | 0.11 | 0.77 | 1.55 |
Usefulness of the ideas in the S-Sys case study, assessed by the two analysts who conducted the elicitation without CREUS
| Measurement | Value | # Ideas |
|---|---|---|
| KANO model | Must-be | 13 |
| One-dimensional | 10 | |
| Attractive | 7 | |
| Gathered earlier | Completely | 19 |
| Partly | 6 | |
| Not at all | 5 | |
| Complete for dev teams | Yes | 11 |
| No | 19 |
Fig. 5 Usage indicators for the V-Sys case study, plotted over time
V-Sys: usefulness of the ideas, assessed by a pool of analysts
| Measurement | Value | # Ideas | % Ideas |
|---|---|---|---|
| KANO model | Must-be | 40 | 50.6 |
| One-dimensional | 29 | 36.7 | |
| Attractive | 10 | 12.7 | |
| Enough for MVP | 47 | 59.5 | |
| Enough for product | 22 | 27.8 | |
| Granularity | Epic | 32 | 40.5 |
| User story | 43 | 54.4 | |
| Not applicable | 4 | 5.1 | |
Violations of criteria from the Quality User Story (QUS) framework
| Quality violations | S-Sys # | S-Sys % | V-Sys # | V-Sys % | Tournify # | Tournify % |
|---|---|---|---|---|---|---|
| Not well-formed | 7 | 24.1 | 12 | 17.9 | 0 | 0.0 |
| Not atomic | 10 | 34.5 | 18 | 26.9 | 33 | 13.5 |
| Not conceptually sound | 3 | 10.3 | 3 | 4.5 | 11 | 4.5 |
| Not problem-oriented | 7 | 24.1 | 23 | 34.3 | 46 | 18.8 |
| Ideas with: | ||||||
| No violations | 12 | 41.4 | 26 | 38.8 | 164 | 66.9 |
| One violation | 10 | 34.5 | 26 | 38.8 | 71 | 29.0 |
| Two violations | 6 | 20.7 | 12 | 17.9 | 9 | 3.7 |
| Three violations | 1 | 3.4 | 3 | 4.5 | 1 | 0.4 |
| Total ideas | 29 | 67 | 245 | |||
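The "well-formed" criterion (a story contains at least a role and a means) can be operationalized with a template check. The sketch below assumes stories follow the common Connextra format ("As a <role>, I want <means>, so that <ends>"); the paper's actual analysis procedure may differ:

```python
import re

# Matches "As a(n) <role>, I want/need/would like <means>(, so that <ends>)".
# The so-that clause (the "ends") is optional; role and means are required.
TEMPLATE = re.compile(
    r"^As an? (?P<role>.+?), I (?:want|need|would like) (?P<means>.+?)"
    r"(?:,? so that (?P<ends>.+))?[.]?$",
    re.IGNORECASE,
)

def is_well_formed(story: str) -> bool:
    """QUS well-formedness proxy: the story parses into role + means."""
    return TEMPLATE.match(story.strip()) is not None

print(is_well_formed("As a referee, I want to enter scores, so that results update live."))
print(is_well_formed("Please add dark mode"))
```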
Analysis of whether the ideas would pertain to quality requirements
| Property | S-Sys # | S-Sys % | V-Sys # | V-Sys % | Tournify # | Tournify % |
|---|---|---|---|---|---|---|
| Reliability | 0 | 0.0 | 3 | 4.5 | 1 | 0.4 |
| Performance | 0 | 0.0 | 1 | 1.5 | 0 | 0.0 |
| Security | 0 | 0.0 | 1 | 1.5 | 3 | 1.2 |
| Compatibility | 14 | 48.3 | 23 | 34.3 | 17 | 6.9 |
| Usability | 12 | 41.4 | 25 | 37.3 | 55 | 22.4 |
| Ideas with | ||||||
| No properties | 5 | 17.2 | 20 | 29.9 | 169 | 69.0 |
| One property | 22 | 75.9 | 42 | 62.7 | 76 | 31.0 |
| Two or three properties | 2 | 6.9 | 5 | 7.4 | 0 | 0.0 |
| Total ideas | 29 | 67 | 245 | |||
Specific versus general ideas
| Item | S-Sys # | S-Sys % | V-Sys # | V-Sys % | Tournify # | Tournify % |
|---|---|---|---|---|---|---|
| Specific | 4 | 13.8 | 12 | 17.9 | 25 | 10.2 |
| General | 25 | 86.2 | 55 | 82.1 | 220 | 89.8 |
| Total ideas | 29 | 67 | 245 | |||
Average number of votes per category
| Average # of votes | S-Sys | V-Sys | Tournify |
|---|---|---|---|
| Specific | 11.25 | 2.85 | 1.24 |
| General | 5.44 | 4.65 | 2.15 |
Fig. 6 Boxplot of the Flesch scores for S-Sys, V-Sys, and Tournify
Fig. 7 Boxplot of the ARI scores for S-Sys, V-Sys, and Tournify
Distribution of vagueness hits
| # Hits per idea | S-Sys # | S-Sys % | V-Sys # | V-Sys % | Tournify # | Tournify % |
|---|---|---|---|---|---|---|
| None | 8 | 27.6 | 25 | 37.3 | 153 | 62.5 |
| One | 11 | 37.9 | 21 | 31.3 | 72 | 29.4 |
| Two | 6 | 20.7 | 9 | 13.4 | 15 | 6.1 |
| Three | 3 | 10.3 | 5 | 7.5 | 5 | 2.0 |
| Four or more | 1 | 3.5 | 7 | 10.5 | 0 | 0.0 |
| Total ideas | 29 | 67 | 245 | |||
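Counting vagueness hits amounts to scanning each story for weak words. The word list below is a small illustrative subset; the actual QUARS++ list used in the paper is larger:

```python
import re

# Illustrative subset of "weak words"; the real QUARS++ list is longer.
WEAK_WORDS = {"easy", "fast", "some", "several", "user-friendly",
              "appropriate", "etc", "flexible", "clear"}

def vagueness_hits(story: str) -> list:
    """Return the weak words found in a story, in order of appearance."""
    tokens = re.findall(r"[a-z\-]+", story.lower())
    return [t for t in tokens if t in WEAK_WORDS]

story = "As a user, I want a fast and user-friendly overview of some results."
print(vagueness_hits(story))
```

A hit flagged this way is only a candidate: as the next table shows, many hits turn out to be false positives (clarified by the sentence, phrasal expressions, typos, or domain terms).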
Quantitative results of vagueness analysis
| Statistic | S-Sys # | S-Sys % | V-Sys # | V-Sys % | Tournify # | Tournify % |
|---|---|---|---|---|---|---|
| True positive | 10 | 27.0 | 27 | 31.0 | 29 | 24.8 |
| False positive | ||||||
| Clarified by sentence | 4 | 10.8 | 20 | 23.0 | 42 | 35.9 |
| Phrasal expression | 22 | 59.5 | 40 | 46.0 | 40 | 34.2 |
| Typo | 0 | 0.0 | 0 | 0.0 | 1 | 0.8 |
| Domain term | 1 | 2.7 | 0 | 0.0 | 5 | 4.3 |
| Vagueness hits | 37 | 87 | 117 | |||
Inter-rater reliability calculations
| Quality | S-Sys κ | S-Sys % | V-Sys κ | V-Sys % | Tournify κ | Tournify % |
|---|---|---|---|---|---|---|
| (a) Quality User Story (QUS) Framework | ||||||
| Well-formed | 0.79 | 93.1 | 0.83 | 95.5 | n/a | 100.0 |
| Atomic | 1.00 | 100.0 | 0.72 | 89.6 | 0.60 | 91.4 |
| Conceptually sound | 0.33 | 79.3 | 0.57 | 94.0 | 0.37 | 95.1 |
| Problem-oriented | 0.37 | 75.9 | 0.44 | 73.1 | 0.35 | 81.6 |
| (b) ISO/IEC 25010 qualities | ||||||
| Reliability | n/a | n/a | 0.48 | 97.0 | 0.00 | 99.6 |
| Performance | n/a | n/a | 1.00 | 100.0 | n/a | 100.0 |
| Security | n/a | n/a | 1.00 | 100.0 | 0.50 | 99.2 |
| Compatibility | 0.73 | 86.2 | 0.90 | 95.5 | 0.47 | 94.3 |
| Usability | 0.60 | 79.3 | 0.59 | 80.6 | 0.54 | 85.7 |
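The table reports an inter-rater coefficient next to percent agreement; assuming the coefficient is Cohen's kappa for two raters (a common choice, though the paper's exact computation is not reproduced here), a minimal sketch over binary per-idea labels:

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    expected = sum(
        (rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels
    )
    if expected == 1.0:  # both raters constant and identical
        return 1.0
    return (observed - expected) / (1 - expected)

a = [1, 1, 0, 1, 0, 0, 1, 0]  # rater A: does the idea violate the criterion?
b = [1, 1, 0, 0, 0, 0, 1, 1]  # rater B
print(f"kappa = {cohen_kappa(a, b):.2f}, agreement = "
      f"{sum(x == y for x, y in zip(a, b)) / len(a):.0%}")
```

Note that kappa can be low or undefined ("n/a" in the table) even when percent agreement is high, e.g. when one label is very rare or one rater is constant.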