3 December 2019 Widget Detection Network: widget detection in mobile screenshot with region-based attention networks
Lin Qi, Tiezhu Wang
Funded by: Key Technology R&D program of Hebei
Abstract

We propose an architecture that automatically detects widgets in mobile screenshots using only visual cues. Although traditional object detection methods perform well on common objects in natural scene images, they cannot handle screenshot images with complex widget layouts. We therefore propose the region-based Widget Detection Network (WDN), which uses regularities in screenshot images as regularization. First, we design a scale-aware attention structure that makes the backbone network sensitive to widget scales, so that the salient features of regions of interest can be captured. Second, we propose a horizontal region generation strategy that exploits the aligned arrangement of widgets by generating all region candidates along a horizontal line at once. Finally, we employ a variant of online hard example mining that explicitly restricts the ratio of foreground to background samples, which alleviates the sample imbalance problem. We conduct experiments on a benchmark dataset that we propose. Quantitative results and qualitative analysis on this dataset show that WDN outperforms common object detection methods on the widget detection task.
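
The abstract does not specify how the ratio-restricted variant of online hard example mining is implemented. As a rough illustration of the general idea only, the sketch below keeps the highest-loss foreground candidates up to a cap and then admits the highest-loss background candidates up to a fixed multiple of the foreground count. The function name `select_hard_examples`, the parameters `max_foreground` and `background_per_foreground`, and their default values are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def select_hard_examples(
    losses: Sequence[float],
    is_foreground: Sequence[bool],
    max_foreground: int = 32,            # hypothetical cap, not from the paper
    background_per_foreground: int = 3,  # hypothetical ratio, not from the paper
) -> Tuple[List[int], List[int]]:
    """Pick hard region candidates while bounding the background:foreground ratio.

    Returns two lists of indices into `losses`: the selected foreground
    candidates and the selected background candidates, hardest first.
    """
    if len(losses) != len(is_foreground):
        raise ValueError("losses and is_foreground must have the same length")

    # Rank every candidate by its current loss, hardest first.
    by_loss = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)

    # Keep the hardest foreground candidates, up to the cap.
    foreground = [i for i in by_loss if is_foreground[i]][:max_foreground]

    # Admit background candidates only up to a fixed multiple of the
    # foreground count, so background cannot dominate the mini-batch.
    # With no foreground at all, fall back to one "unit" of background.
    background_budget = background_per_foreground * max(len(foreground), 1)
    background = [i for i in by_loss if not is_foreground[i]][:background_budget]

    return foreground, background
```

Plain online hard example mining would simply keep the top-loss candidates regardless of class, which in a screenshot with few widgets can yield a batch made almost entirely of background. Bounding the background budget by the foreground count is one simple way to enforce an explicit ratio.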

© 2019 SPIE and IS&T 1017-9909/2019/$28.00
Lin Qi and Tiezhu Wang "Widget Detection Network: widget detection in mobile screenshot with region-based attention networks," Journal of Electronic Imaging 28(6), 063006 (3 December 2019). https://doi.org/10.1117/1.JEI.28.6.063006
Received: 17 July 2019; Accepted: 29 October 2019; Published: 3 December 2019
KEYWORDS
Convolution

Network architectures

Human-machine interfaces

Sensors

Visualization

Mining

Feature extraction
