Presentation + Paper
4 October 2023 Fine-tuned deep convolutional neural network for hand segmentation in egocentric videos
Author Affiliations +
Abstract
Semantic segmentation is a high-level task in computer vision that associates each pixel of an image with a semantic(class) label. Fine-semantic segmentation is a pixel-level task that provides detailed information necessary to easily identify the region of the object of interest. Hands are one of the main channels for communication, enhancing human-object and human-environment interaction, and in egocentric videos, they appear to be ubiquitous and at the center of vision and activities, hence our interest in hand segmentation. Fine-semantic segmentation of hands locates, identifies, and groups together pixels associated with the hands, with a hand semantic label. We performed fine semantic segmentation of hands, by improving the architecture of the state-of-the-art deep convolutional neural network (RefineNet). We achieve a finer and more accurate result by amending the process of obtaining and combining high and low-level features, and the pixel grouping for pixel-level classification. We performed this task on a public egocentric video dataset (EgoHands). We evaluate our model (RefineNet-Pix) performance by adopting the existing pixel-level metric, mean precision (mPrecision). Comparing our result with the baseline reported in Urooj’s work, we obtain accuracy higher than 87.9% of the benchmark. Our finer and more accurate semantic segmentation result guarantees good performance under various lighting conditions and complex backgrounds, making it suitable for use in both indoor and outdoor environments. Fine-hand semantic segmentation can be applied in image analysis, medical systems (with a focus on understanding hand motion for prediction, diagnosis, and monitoring), hand gesture recognition (human-computer interaction and understanding action), and robotics(grasp and manipulation of objects).
Conference Presentation
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Eyitomilayo Yemisi-Babatope, Miguel Ángel Camargo-Rojas, Mireya Saraí García-Vázquez, and Alejandro Álvaro Ramírez-Acosta "Fine-tuned deep convolutional neural network for hand segmentation in egocentric videos", Proc. SPIE 12673, Optics and Photonics for Information Processing XVII, 1267303 (4 October 2023); https://doi.org/10.1117/12.2677242
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Image segmentation

Semantics

Education and training

Video

Performance modeling

Convolution

Data modeling

Back to Top