Leixian Shen, Enya Shen, Zhiwei Tai, Yihao Xu, Jiaxiang Dong, Jianmin Wang.
Abstract
General visualization recommendation systems typically make design decisions for a dataset automatically. However, most of them can only prune meaningless visualizations and fail to recommend targeted results. This paper contributes TaskVis, a task-oriented visualization recommendation system that allows users to select their tasks precisely on the interface. We first summarize a task base of 18 classical analytic tasks through a survey of both academia and industry. On this basis, we maintain a rule base that extends empirical wisdom with our targeted modeling of the analytic tasks. Our rule-based approach then enumerates all candidate visualizations through answer set programming, and the generated charts can be ranked by four ranking schemes. Furthermore, we introduce a task-based combination recommendation strategy, which leverages a set of visualizations to collaboratively give a brief view of the dataset. Finally, we evaluate TaskVis through a series of use cases and a user study.
Keywords: Analytic task; Answer set programming; Visual data analysis; Visualization recommendation
Year: 2022 PMID: 36117680 PMCID: PMC9470074 DOI: 10.1007/s41019-022-00195-3
Source DB: PubMed Journal: Data Sci Eng ISSN: 2364-1541
Fig. 1 Recommendation pipeline. The recommendation engine accepts analytic tasks and data properties, and then automatically generates appropriate visualizations for users
Fig. 6 VisRec results of the Happiness Ranking dataset with the tasks-coverage-based ranking scheme. The Display by task switch is closed
Fig. 2 Example of a Vega-Lite specification. The figure shows a bar chart with an ordinal variable on the x axis, a quantitative variable on the y axis, and a nominal variable on the color channel. In addition, sorting, sum aggregation, and a zero stack transformation are applied
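
The kind of specification described in the Fig. 2 caption can be sketched as a Python dictionary that serializes to Vega-Lite JSON. This is only an illustration, not the specification shown in the paper's figure: the field names (`rank_group`, `score`, `region`), the inline data values, and the v5 schema URL are assumptions made here.

```python
import json

# Illustrative sketch only. The field names and data values below are
# hypothetical; the source does not state which columns Fig. 2 uses.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",  # assumed version
    "data": {
        "values": [
            {"rank_group": "1-10", "score": 7.5, "region": "Europe"},
            {"rank_group": "1-10", "score": 7.2, "region": "Oceania"},
            {"rank_group": "11-20", "score": 6.9, "region": "Europe"},
        ]
    },
    "mark": "bar",
    "encoding": {
        # Ordinal variable on the x axis, with sorting applied.
        "x": {"field": "rank_group", "type": "ordinal", "sort": "ascending"},
        # Quantitative variable on the y axis, summed and stacked from zero.
        "y": {
            "field": "score",
            "type": "quantitative",
            "aggregate": "sum",
            "stack": "zero",
        },
        # Nominal variable on the color channel.
        "color": {"field": "region", "type": "nominal"},
    },
}

# Serialize to JSON text that a Vega-Lite renderer could consume.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Keeping the specification as plain data means a recommendation engine can generate, compare, and rank candidate charts without rendering them first.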
Fig. 3 User interface. When uploading a dataset (A), the data field shows data columns with the corresponding data type (B). Users can customize system settings, including the data columns of interest (B), the maximum number of charts (C), recommendation mode (D), ranking scheme (E), and task list (F). Clicking the Recommendation button will generate visualizations on the right. Selected tasks are tabbed, along with a Display by task switch (G). If the switch is open, the charts will be displayed by task. (H) is the default thumbnail view with a row of charts. Clicking a task tab will show all the recommendations (I). If the switch is closed, all the charts will be deduplicated first and displayed together, as in Fig. 6
Fig. 7 VisRec examples of the COVID-19 dataset in the spatial task with multiple columns. One chart is previewed by clicking on it
Fig. 9 Combination Recommendation results of the Hollywood Stories dataset (without task selection)
Fig. 4 Architecture of TaskVis. TaskVis consists of six modules and two bases (for tasks (a) and rules (e), respectively). (b) Input: accepts the user’s input. (c) Preprocess: extracts data properties. (d) Visualization Generation: enumerates all qualified candidates. (f) Visualization Ranking: ranks all visualizations according to the selected scheme. (g) Output: presents recommendation results to users. (h) Combination VisRec: makes combination recommendations by iterating over tasks
Task base. The Mark column lists appropriate marks in priority order; (*) indicates a combination of marks, e.g., rect(text) means a text layer is superimposed on a rect chart
| Task | Mark | Description | Reference |
|---|---|---|---|
| Change Over Time | line/area | Analyze how the data changes over time | [ |
| Characterize Distribution | bar/point | Characterize the distribution of the data over the set | [ |
| Cluster | bar/point | Find clusters of similar attribute values | [ |
| Comparison | line/point/bar | Give emphasis to comparison on different entities | [ |
| Compute Derived Value | rect(text)/arc/bar | Compute aggregated or binned numeric derived value | [ |
| Correlate | bar/line | Determine useful relationships between the columns | [ |
| Determine Range | tick/boxplot | Find the span of values within the set | [ |
| Deviation | bar(rule)/point(rule) | Compare data with a certain value such as zero or the mean | [ |
| Error Range | errorband/errorbar | Summarize an error range of quantitative values | [ |
| Filter | rect/bar/arc | Find data cases satisfying the given constraints | [ |
| Find Anomalies | bar/point | Identify any anomalies within the dataset | [ |
| Find Extremum | bar/point | Find extreme values of a data column | [ |
| Magnitude | arc/bar | Show relative or absolute size comparisons | [ |
| Part to Whole | arc | Show component elements of a single entity | [ |
| Retrieve Value | rect(text) | Find values of specific columns | [ |
| Sort | bar | Rank data according to some ordinal metric | [ |
| Spatial | geoshape/circle(text) | Show spatial data like latitude and longitude | [ |
| Trend | point(line) | Use regression or loess to show the variation trend | [ |
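
As a minimal sketch of how the task base above could be held in code, the table's Task and Mark columns can be stored as a lookup and each Mark cell parsed into an ordered list of marks. The names `TASK_BASE`, `parse_marks`, and `marks_for` are hypothetical and are not taken from TaskVis; only the task names and mark strings come from the table.

```python
import re
from typing import Dict, List, Optional, Tuple

# Task -> Mark cell, copied from the task base table.
TASK_BASE: Dict[str, str] = {
    "Change Over Time": "line/area",
    "Characterize Distribution": "bar/point",
    "Cluster": "bar/point",
    "Comparison": "line/point/bar",
    "Compute Derived Value": "rect(text)/arc/bar",
    "Correlate": "bar/line",
    "Determine Range": "tick/boxplot",
    "Deviation": "bar(rule)/point(rule)",
    "Error Range": "errorband/errorbar",
    "Filter": "rect/bar/arc",
    "Find Anomalies": "bar/point",
    "Find Extremum": "bar/point",
    "Magnitude": "arc/bar",
    "Part to Whole": "arc",
    "Retrieve Value": "rect(text)",
    "Sort": "bar",
    "Spatial": "geoshape/circle(text)",
    "Trend": "point(line)",
}

Mark = Tuple[str, Optional[str]]  # (base mark, superimposed layer or None)


def parse_marks(cell: str) -> List[Mark]:
    """Split a Mark cell into (base, overlay) pairs, keeping the listed order."""
    marks: List[Mark] = []
    for item in cell.split("/"):
        match = re.fullmatch(r"(\w+)(?:\((\w+)\))?", item)
        if match is None:
            raise ValueError(f"unrecognized mark notation: {item!r}")
        marks.append((match.group(1), match.group(2)))
    return marks


def marks_for(task: str) -> List[Mark]:
    """Return the marks listed for a task, first entry first."""
    return parse_marks(TASK_BASE[task])


print(marks_for("Compute Derived Value"))
print(marks_for("Trend"))
```

Representing a combined mark such as rect(text) as a pair keeps the base mark and its superimposed layer separate, which makes it easier to map each one to its own chart layer later.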
Fig. 5 Example of the cost model. The cost score of the visualization in Fig. 2 can be obtained by summing the costs of all its components
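
The additive cost model described in the Fig. 5 caption can be illustrated with a small sketch. The component labels follow the Fig. 2 caption, but every numeric cost below is a placeholder invented for illustration; the actual costs used by TaskVis are not given in this record, and the names `COMPONENT_COSTS` and `cost_score` are hypothetical.

```python
from typing import Dict, List

# HYPOTHETICAL per-component costs, chosen only to demonstrate the summation.
# They are not the values used by TaskVis.
COMPONENT_COSTS: Dict[str, int] = {
    "mark=bar": 1,
    "x:ordinal": 2,
    "y:quantitative": 1,
    "color:nominal": 3,
    "sort": 1,
    "aggregate=sum": 2,
    "stack=zero": 1,
}

# Components of the chart described in the Fig. 2 caption.
FIG2_COMPONENTS: List[str] = [
    "mark=bar",
    "x:ordinal",
    "y:quantitative",
    "color:nominal",
    "sort",
    "aggregate=sum",
    "stack=zero",
]


def cost_score(components: List[str], costs: Dict[str, int]) -> int:
    """Sum the cost of every component used by a visualization."""
    return sum(costs[component] for component in components)


total = cost_score(FIG2_COMPONENTS, COMPONENT_COSTS)
print(total)
```

Under an additive model like this, candidates can be ordered by ascending total cost, so a chart built from cheaper components ranks ahead of one built from more expensive ones.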
Fig. 8 VisRec examples of the COVID-19 dataset in the change over time task with the region, confirmed, and date columns
Fig. 10 User ratings, on a 5-point Likert scale, of whether the visualization can satisfy the given task