Patricia Acosta-Vargas, Mario González, Sergio Luján-Mora.
Abstract
This article presents the process of building a dataset for evaluating the accessibility of 368 web pages. The pages were chosen starting from the Webometrics rankings, and the WAVE tool was used to evaluate them. The dataset documents the errors that recur most frequently, so that it can alert web developers and support them in creating more inclusive and accessible websites for all people, including users with disabilities. The data show that university websites frequently lack alternative text for images. Some of the university websites included in this dataset were found to violate web accessibility requirements based on the Web Content Accessibility Guidelines 2.0 and 2.1. The data have therefore been shared to allow replication of the experiment and to serve as input to future studies on web accessibility. The dataset is hosted, with public access, in the Mendeley Dataset Repository.
Keywords: Accessibility; Assess; Dataset; Evaluation; Higher education; Web content accessibility guidelines (WCAG) 2.1; Website
Year: 2019 PMID: 31909113 PMCID: PMC6938809 DOI: 10.1016/j.dib.2019.105013
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Description of dataset variables.
| Name | Description | Type |
|---|---|---|
| University | It is the name of the University taken in the case study. | Text |
| URL | It is the website address of the university. | Text |
| Acronym | It is the short name defined for the university. | Text |
| Country | The variable indicates the country name of the educational institution. | Text |
| Latin America Ranking | It is the numeric value assigned by Webometrics according to the institution's position in the ranking of higher education institutions for Latin America. | Numeric |
| World Ranking | It is the numeric value assigned by Webometrics according to the institution's position in the ranking of higher education institutions for the whole world. | Numeric |
| Presence | This variable is the number of web pages of the main web domain of the institution. It includes all subdomains and all file types, including pdf documents. | Numeric |
| Impact | This value represents the external networks (subnets) that create backlinks to the institution's web pages. After normalization, the average value between the two sources is selected. This variable is related to the visibility of the website. | Numeric |
| Opening | This variable is related to the number of citations of the principal authors, according to the Google Scholar citations source. | Numeric |
| Excellence | This variable relates to the number of academic articles published in high-impact international journals in the top 10% of their respective scientific disciplines. The data provider is the SCimago Group. | Numeric |
| Errors | A variable defined by WAVE that counts the accessibility errors it detected. WAVE marks these with red icons, and they need to be corrected. The absence of errors does not mean that a page is accessible. | Numeric |
| Alerts | Indicates the elements that evaluators should inspect because they may represent a problem for the end user. | Numeric |
| Features | Indicates accessibility features: elements that are likely to improve accessibility but that need to be verified. | Numeric |
| Structural Elements | Represents the items that evaluators must review in the structure of the web page. | Numeric |
| HTML5 and ARIA | A variable defined by WAVE that represents the accessibility errors the evaluator must correct in how accessibility information is added to HTML elements using the Accessible Rich Internet Applications (ARIA) specification. | Numeric |
| Contrast Errors | Represents the items that evaluators should review in the Contrast Errors section. | Numeric |
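As a minimal sketch of how the variable table above could be turned into a basic consistency check, the following Python snippet lists the sixteen column names by declared type and reports rows whose values do not match. The column names come from the table; the helper `check_row` and the sample row are hypothetical illustrations, and the sample values are placeholders rather than data from the published spreadsheet.

```python
# Column names follow the variable table; everything else here is an
# illustrative assumption, not part of the published dataset.
TEXT_COLUMNS = ["University", "URL", "Acronym", "Country"]
NUMERIC_COLUMNS = [
    "Latin America Ranking", "World Ranking", "Presence", "Impact",
    "Opening", "Excellence", "Errors", "Alerts", "Features",
    "Structural Elements", "HTML5 and ARIA", "Contrast Errors",
]


def check_row(row):
    """Return a list of problems found in one spreadsheet row (a dict)."""
    problems = []
    for name in TEXT_COLUMNS:
        value = row.get(name)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{name}: expected non-empty text")
    for name in NUMERIC_COLUMNS:
        value = row.get(name)
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number")
    return problems


# Placeholder row: invented values, NOT taken from the dataset.
sample_row = {name: "placeholder" for name in TEXT_COLUMNS}
sample_row.update({name: 0 for name in NUMERIC_COLUMNS})
sample_row["Errors"] = "12"  # a number stored as text is flagged

print(check_row(sample_row))
```

A check of this kind is mainly useful because the report values were copied by hand into the spreadsheet, so a number stored as text is a plausible transcription slip.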
Fig. 1. Data column sizes and types.
Fig. 2. Correlation for numeric variables.
Fig. 3. Right: The top 50 universities in the dataset, ranked. Left: Number of universities in the dataset by country.
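The summaries named in the captions of Figs. 2 and 3 could be reproduced along the following lines. This is only a sketch under stated assumptions: the source does not say which correlation coefficient was used (Pearson is assumed here) or which ranking orders the top 50 (World Ranking is assumed as the default). The function names are hypothetical, and each row is assumed to be a dict keyed by the column names from the variable table.

```python
from collections import Counter
from math import sqrt


def universities_per_country(rows):
    """Count rows per Country value, most frequent first."""
    return Counter(row["Country"] for row in rows).most_common()


def top_ranked(rows, n=50, key="World Ranking"):
    """Return the n rows with the smallest (best) value of the ranking column."""
    return sorted(rows, key=lambda row: row[key])[:n]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)
```

For example, `pearson([r["Errors"] for r in rows], [r["Alerts"] for r in rows])` would give one cell of a correlation matrix like the one in Fig. 2, assuming `rows` has been loaded from the spreadsheet.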
The dataset can support the research community in several applications, such as predicting whether websites are accessible or identifying likely failures when building inclusive website prototypes. It can also be used for clustering analysis, multivariate queries, testing, comparison with similar datasets, and categorization of accessible websites. These data are useful for knowing the accessibility status of educational websites in Latin America. Some, despite a high ranking, according to Webometrics [ On the other hand, these data allow identification of errors repeated with high frequency in the main pages of the 368 websites [ This type of reference data can directly benefit website developers during design with agile and adaptive methodologies, so that all users, including people with disabilities, can navigate and interact easily on the web. These data can be compared with the outcomes of future evaluations to determine whether educational institutions have improved their web accessibility, advanced universal access, and raised their visibility in search engines.
Specifications Table
| Subject | Computer Science and Education |
| Specific subject area | Analysis, Classification Analysis, Web Accessibility |
| Type of data | Table in .xlsx format |
| How data were acquired | Web scraping from Webometrics, automatic evaluation with WAVE (software |
| Data format | Raw, analyzed. The dataset is public and is available in the Mendeley Dataset Repository [ |
| Parameters for data collection | The authors scraped the Webometrics site. Using an Excel macro, we obtained the URL of each site to evaluate. The URL of each home page was loaded into the Google Chrome browser, and the WAVE plug-in was executed. The resulting data were manually recorded in a spreadsheet that is now stored in the Mendeley Dataset Repository. |
| Description of data collection | For the evaluation of the main page of each website, the data were collected as follows. In the first phase, the Latin American universities section of the Webometrics site was scraped. In the second phase, 368 web pages were randomly selected for evaluation. In the third phase, an Excel macro was used to extract each URL and load it in the Google Chrome browser, where the WAVE plug-in, version 1.0.9 (updated November 17, 2017), was executed. WAVE produces a report containing the data and variables involved. Finally, the report data for each web page were manually copied and organized in the spreadsheet. |
| Data source location | Higher education institutions in 26 countries: Antigua and Barbuda, Argentina, Aruba, Bolivia, Brazil, Chile, Colombia, Costa Rica, Cuba, Dominica, Dominican Republic, Ecuador, El Salvador, Guatemala, Haiti, Honduras, Jamaica, Mexico, Nicaragua, Panama, Paraguay, Peru, Puerto Rico, Trinidad and Tobago, Uruguay, and Venezuela. |
| Data accessibility | Mendeley Dataset Repository on |
| Related research article | Acosta-Vargas, P., Acosta, T., & Luján-Mora, S. “Challenges to Assess Accessibility in Higher Education Websites: A Comparative Study of Latin America Universities.” |
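The random selection step described in the specifications table can be illustrated with a short sketch. The source states only that 368 pages were randomly selected from the scraped Webometrics list. It does not say how the randomization was done or whether a seed was fixed, so the seed parameter, the de-duplication step, and the function name `select_pages` are assumptions made here for reproducibility. The example URLs are placeholders.

```python
import random

SAMPLE_SIZE = 368  # number of home pages evaluated, as stated in the source


def select_pages(urls, sample_size=SAMPLE_SIZE, seed=None):
    """Randomly select sample_size distinct URLs without replacement."""
    # De-duplicate, then sort so that a given seed always yields the same sample.
    unique_urls = sorted(set(urls))
    if len(unique_urls) < sample_size:
        raise ValueError("not enough distinct URLs to sample from")
    return random.Random(seed).sample(unique_urls, sample_size)


# Placeholder URLs standing in for the scraped list (not real institutions).
scraped = [f"https://university-{i}.example" for i in range(1000)]
selected = select_pages(scraped, seed=2019)
print(len(selected))
```

Fixing a seed is a design choice for replication studies: it lets a later evaluation revisit exactly the same pages and compare the results with the original ones.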