Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Many accessibility features available on mobile platforms require applications (apps) to provide complete and accurate metadata describing user interface (UI) components. Unfortunately, many apps do not provide sufficient metadata for accessibility features to work as expected. In this paper, we explore inferring accessibility metadata for mobile apps from their pixels, as the visual interfaces often best reflect an app's full functionality. We trained a robust, fast, memory-efficient, on-device model to detect UI elements using a dataset of 77,637 screens (from 4,068 iPhone apps) that we collected and annotated. To further improve UI detections and add semantic information, we introduced heuristics (e.g., UI grouping and ordering) and additional models (e.g., to recognize UI content, state, and interactivity). We built Screen Recognition to generate accessibility metadata to augment iOS VoiceOver. In a study with 9 screen reader users, we validated that our approach improves the accessibility of existing mobile apps, enabling even previously inaccessible apps to be used.
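The abstract mentions heuristics such as UI grouping and ordering applied on top of the pixel-based element detector. The sketch below is a rough illustration only, not the paper's actual implementation: it orders hypothetical detections into a top-to-bottom, left-to-right navigation order. The `Detection` fields, label names, and row tolerance are all assumptions introduced for this example.

```python
# Minimal sketch (not the paper's method) of a grouping/ordering heuristic:
# group detected UI elements into visual rows, then read rows top-to-bottom
# and each row left-to-right. All names and thresholds here are assumed.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str   # e.g. "button", "text" (hypothetical label set)
    x: float     # left edge, normalized 0..1
    y: float     # top edge, normalized 0..1
    w: float
    h: float

def reading_order(dets: list[Detection], row_tol: float = 0.03) -> list[Detection]:
    """Cluster detections into rows by vertical proximity, then sort each row by x."""
    rows: list[list[Detection]] = []
    for d in sorted(dets, key=lambda d: d.y):
        if rows and abs(d.y - rows[-1][0].y) <= row_tol:
            rows[-1].append(d)   # close enough vertically: same visual row
        else:
            rows.append([d])     # start a new row
    ordered: list[Detection] = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda d: d.x))
    return ordered

# Example: a title above two side-by-side buttons.
elements = [
    Detection("button", x=0.55, y=0.40, w=0.30, h=0.08),
    Detection("text",   x=0.10, y=0.10, w=0.80, h=0.06),
    Detection("button", x=0.10, y=0.41, w=0.30, h=0.08),
]
print([(d.label, round(d.x, 2)) for d in reading_order(elements)])
# -> [('text', 0.1), ('button', 0.1), ('button', 0.55)]
```

A full pipeline, as the abstract describes, would also attach recognized content, state, and interactivity to each detected element before exposing it to a screen reader such as VoiceOver.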
DOI: 10.48550/arxiv.2101.04893