Environmental sound classification using vision transformer

Haoran Hong; Junfeng Li; Xingxing Li

doi:10.1117/12.3014876

4 March 2024

Proceedings Volume 12981, Ninth International Symposium on Sensors, Mechatronics, and Automation System (ISSMAS 2023); 129811Z (2024) https://doi.org/10.1117/12.3014876
Event: 9th International Symposium on Sensors, Mechatronics, and Automation (ISSMAS 2023), 2023, Nanjing, China

Abstract

Classification of environmental sounds plays a key role in applications such as surveillance systems and crime detection, because sounds recorded in a real environment carry significant information. Deep learning models, such as convolutional neural networks, have proven very useful for environmental sound classification (ESC). Recent work has shown that Vision Transformer (ViT) models can achieve comparable or even superior performance on image classification tasks. In this paper, an environmental sound classification method based on the Vision Transformer is proposed. We represent sound files by their image representations, namely log-Mel spectrogram images, and train a Vision Transformer model on these images. The method obtains an average classification accuracy of 94.6633%. This result indicates that the proposed approach achieves good ESC accuracy.
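The abstract does not give the spectrogram settings, so the following is only a minimal sketch of how a log-Mel spectrogram can be computed from a waveform, not the authors' pipeline. The frame length, hop size, number of Mel bands, sample rate, and the HTK-style Mel formula are all assumptions chosen for illustration, and a naive DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style Mel scale: one common convention, assumed here.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(i * top / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))) ** 2
        for k in range(n // 2 + 1)
    ]


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels log energies in dB."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    spec = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log(0) for empty bands.
        spec.append([10.0 * math.log10(e + 1e-10) for e in energies])
    return spec


# Synthetic input (hypothetical): a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(512)]
spec = log_mel_spectrogram(signal, sample_rate)
print(len(spec), "frames x", len(spec[0]), "Mel bands")
```

The resulting frames-by-bands array is what would be rendered as an image and resized to the input resolution a Vision Transformer expects. In practice an audio library with an FFT-based Mel spectrogram routine would replace the hand-written DFT.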

(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Haoran Hong, Junfeng Li, and Xingxing Li "Environmental sound classification using vision transformer", Proc. SPIE 12981, Ninth International Symposium on Sensors, Mechatronics, and Automation System (ISSMAS 2023), 129811Z (4 March 2024); https://doi.org/10.1117/12.3014876
