Shrinivas Pundlik, Vilte Baliutaviciute, Mojtaba Moharrer, Alex R Bowers, Gang Luo.
Abstract
Purpose: Evaluating mobility aids in naturalistic conditions across many days is challenging owing to the sheer amount of data and hard-to-control environments. For a wearable video camera-based collision warning device, we present the methodology for acquisition, reduction, review, and coding of video data for quantitative analyses of mobility outcomes in blind and visually impaired participants.
Keywords: mobility aid; naturalistic mobility; wearable video camera
Year: 2020 PMID: 32832221 PMCID: PMC7414611 DOI: 10.1167/tvst.9.7.14
Source DB: PubMed Journal: Transl Vis Sci Technol ISSN: 2164-2591 Impact factor: 3.283
Figure 1. Data recorded by the collision warning device. The chest-mounted video camera captures scene videos, and each video frame is embedded with relevant device data, including whether a collision warning was provided, the direction of the collision warning (left, center, right), the device operating mode, and the real-time motion sensor data. If a collision warning is provided, its location is indicated on the video frame (white box with a dot in the center). This helps in determining the object for which the warning was provided. The text information embedded at the top and bottom of the video frames is extracted by OCR for computerized preprocessing, but it is not visible to study staff during video review.
Figure 2. Flowchart showing the steps in video data processing to obtain quantifiable mobility outcomes.
Figure 3. Reviewing and coding a collision warning event. (Left) An event can unfold in a complex manner and, depending on how it unfolds and on the action taken by the user, could result in contact with the obstacle. Following a complex tree for detailed annotation of an event may not be feasible or possible directly via video review. Success and failure can be defined either from the user's perspective or from the device's perspective. From the user's perspective, avoiding body contact can be considered a success, irrespective of the reason. From the device's perspective, a cane contact may be considered a failure even if there is no body contact, depending on when the cane contact happens. (Right) Conceptually breaking down an event into three categories (device performance, user action, and the final result) can help to simplify the coding of an event while maintaining the thoroughness of the review process.
Definitions of the Annotation Categories Used to Rate Events
| Annotation Category | Options | Meaning |
|---|---|---|
| Valid event | Yes/no | Yes: Camera view was unobstructed, device operation as expected. |
| | | No: Device operation was disrupted in some way, e.g., user hand obstructed camera, light glare created a visual artifact, or the device is not being worn. |
| True hazard | Yes/no | Yes: The warning was valid and associated with a true hazard; a collision would have occurred if the trajectory of motion was maintained. |
| | | No: The warning was a false alarm. |
| Evasion attempt | Yes (cane not involved, cane involved, not sure)/no | Yes (cane not involved): There was an evasion attempt (e.g., step to the side) with no clear use of long cane. |
| | | Yes (cane involved): There was an evasion attempt after cane contact with the obstacle. |
| | | Yes (not sure): There was an evasion attempt, but it is unclear whether the long cane made contact. Use sparingly. |
| | | No: There was no visible evasion attempt. |
| Contact | Cane contact/body contact/not sure/no | Cane contact: Participant made contact with the obstacle with their habitual mobility aid (long cane or the guide dog). In the absence of direct visual evidence, contact could be inferred by a sudden pause, sharp change of walking direction, or jolting/shaking of the camera, together with the relative distance to the object in the scene. NOTE: This option was also considered when a participant used their hand to find an obstacle that they were aware of. |
| | | Body contact: Participant collided with the obstacle directly. Notable by more severe camera jolt and close camera view. If both cane and body contact occur, mark as body. |
| | | Not sure: A contact occurs, but it is ambiguous whether with cane or body. Use sparingly. |
| | | No: There was no contact of any kind. |
| Home/office vs. other environment | Yes/no | Yes: The scene is inside participant's home/work environment. |
| | | No: The scene is outside participant's home/work related environment, such as streets, shopping mall, or transit stations, etc. |
| Nature of the hazard | Pedestrian, furniture, poles, walls, overhanging, trees, other | Pedestrian: The hazard was a person. |
| | | Furniture: Desks, chairs, shelves, racks, etc. |
| | | Poles: Poles, posts, pillars, bollards, columns, other similar standing structures. |
| | | Walls: Walls, doors, building structures, etc. |
| | | Overhanging: Tree branches, flags, banners and similar hanging/head-height objects. |
| | | Trees: Tree trunks, bushes, hedges, etc. |
| | | Other: Anything that doesn't fit the above categories (lights, vehicles, etc.). |
| Moving camera | Yes/no | Yes: The user was in motion (walking, swaying, on an escalator). |
| | | No: The user is still (sitting/standing). |
| Moving object/hazard | Yes/no | Yes: The hazard is moving (e.g., a walking pedestrian). |
| | | No: The hazard is still (e.g., stationary furniture). |
| Left turn | Yes/no | This is only selectable if there is an evasion attempt, and notes the direction of the evasion. |
| Right turn | Yes/no | This is only selectable if there is an evasion attempt, and notes the direction of the evasion. |
The categories of valid event, true hazard, evasion attempt, and contact are critical for assessing mobility outcomes and device performance. Only these categories were relevant for disagreement reconciliation. The other categories provide additional detail, such as what the hazard was, or where the user was at the time of the collision hazard event.
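One way to picture the four critical categories is as a small record type. The sketch below is illustrative only and is not the study's software: the names `Evasion`, `Contact`, `EventAnnotation`, and `critical_disagreements` are hypothetical, and the only rules encoded are ones stated in the table above (the option lists, and that a turn direction is selectable only when there is an evasion attempt).

```python
from dataclasses import dataclass
from enum import Enum


class Evasion(Enum):
    NO = "no"
    CANE_NOT_INVOLVED = "yes (cane not involved)"
    CANE_INVOLVED = "yes (cane involved)"
    NOT_SURE = "yes (not sure)"


class Contact(Enum):
    NO = "no"
    CANE = "cane contact"
    BODY = "body contact"
    NOT_SURE = "not sure"


@dataclass
class EventAnnotation:
    """Hypothetical record for one reviewer's rating of one event."""
    valid_event: bool
    true_hazard: bool
    evasion: Evasion
    contact: Contact
    left_turn: bool = False
    right_turn: bool = False

    def __post_init__(self):
        # Per the table: turn direction is only selectable with an evasion attempt.
        if self.evasion is Evasion.NO and (self.left_turn or self.right_turn):
            raise ValueError("turn direction requires an evasion attempt")


# The four categories described as relevant for disagreement reconciliation.
CRITICAL_FIELDS = ("valid_event", "true_hazard", "evasion", "contact")


def critical_disagreements(a: EventAnnotation, b: EventAnnotation) -> list:
    """Return the critical categories on which two reviewers differ."""
    return [f for f in CRITICAL_FIELDS if getattr(a, f) != getattr(b, f)]
```

For example, two ratings that differ only in cane versus body contact would be flagged on `contact` alone, while a difference in turn direction would not be flagged at all.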
Figure 4. Agreement/disagreement between the 2 masked reviewers when performing manual review of the video data. A total of 2712 events were reviewed independently by each reviewer (rater A and rater B). The four review items shown here were rated hierarchically in the following order: valid event, true hazard, all contacts, and body contacts. If both reviewers rated no for any given item, the event was dropped from consideration for subsequent review items. Therefore, the total number of events was lower for items lower in the hierarchy.
Inter-Rater Reliability Between the 2 Independent Reviewers for Ratings of Valid Event, True Hazard, All Contacts, and Body Contacts Across 2712 Events
| Measure | Agree | Disagree | Agreement Probability | Cohen's Kappa |
|---|---|---|---|---|
| Valid Event | 2592 | 120 | 0.96 | 0.67 |
| True Hazard | 2035 | 539 | 0.79 | 0.57 |
| All Contacts | 902 | 428 | 0.68 | 0.24 |
| Body Contacts | 391 | 200 | 0.66 | 0.05 |
The order in which the items are listed in the table represents the hierarchy followed when scoring these items for a given event. Therefore, the total number of events decreases from the valid event rating to the body contact rating.
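The agreement probabilities in the table can be checked directly from the Agree and Disagree counts. The sketch below does this and also works out how many both-"no" events each item would have had if the drop rule (both reviewers rated no) were the only reason the totals shrink; that is an inference from the table, not a figure reported by the authors. Cohen's kappa cannot be reproduced from these counts alone, because it also depends on how the disagreements split between the two raters, so the kappa function is shown with a hypothetical even split. All function names are illustrative.

```python
# (item, agree, disagree) copied from the table above, in hierarchy order.
TABLE = [
    ("Valid Event", 2592, 120),
    ("True Hazard", 2035, 539),
    ("All Contacts", 902, 428),
    ("Body Contacts", 391, 200),
]


def agreement_probability(agree: int, disagree: int) -> float:
    """Observed agreement: agreements divided by all events rated."""
    return agree / (agree + disagree)


def implied_cells(rows):
    """Infer both-no / both-yes counts, ASSUMING only both-no events are dropped."""
    out = {}
    for (name, agree, disagree), (_, nxt_agree, nxt_disagree) in zip(rows, rows[1:]):
        total = agree + disagree
        both_no = total - (nxt_agree + nxt_disagree)  # events removed before next item
        out[name] = {"total": total, "both_no": both_no, "both_yes": agree - both_no}
    return out


def cohens_kappa(both_yes: int, a_yes_b_no: int, a_no_b_yes: int, both_no: int) -> float:
    """Cohen's kappa for two raters and a binary rating (2x2 table)."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    p_o = (both_yes + both_no) / n            # observed agreement
    a_yes = (both_yes + a_yes_b_no) / n       # rater A's "yes" rate
    b_yes = (both_yes + a_no_b_yes) / n       # rater B's "yes" rate
    p_e = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # chance agreement
    return (p_o - p_e) / (1 - p_e)


for name, agree, disagree in TABLE:
    print(f"{name}: {agreement_probability(agree, disagree):.2f}")

cells = implied_cells(TABLE)["Valid Event"]
# Hypothetical: the 120 disagreements are split evenly (60/60) between raters.
print(cohens_kappa(cells["both_yes"], 60, 60, cells["both_no"]))
```

Kappa corrects observed agreement for the agreement expected by chance, which is why an item can show moderate raw agreement (0.66 for body contacts) and still have a kappa near zero.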
Figure 5. Results for predicting disagreements in the rating of body contacts during event review by the two raters. The machine learning algorithm was trained on each reviewer's ratings for the same 2712 events with 200 known disagreements. The % values in the table are relative to the total events reviewed (2712). Results were computed using five-fold cross-validation for this set of events. For data reviewed by rater A, the algorithm correctly predicted 176 disagreements with rater B while missing 24 (success rate of 88%). For data reviewed by rater B, the algorithm correctly predicted 185 disagreements with rater A while missing 15 (success rate of 93%).
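The success rates quoted in the caption are the fraction of the 200 known disagreements that the algorithm flagged. A minimal arithmetic check, using a hypothetical helper named `detection_rate`, gives 176/200 = 88% for rater A and 185/200 = 92.5% for rater B, which the caption reports as 93%.

```python
def detection_rate(predicted: int, missed: int) -> float:
    """Fraction of known disagreements that were flagged."""
    return predicted / (predicted + missed)


# Counts taken from the Figure 5 caption.
rate_a = detection_rate(176, 24)   # data reviewed by rater A
rate_b = detection_rate(185, 15)   # data reviewed by rater B
print(f"rater A: {rate_a:.1%}, rater B: {rate_b:.1%}")
```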