24 June 2005
Singing voice detection for karaoke application
Arun Shenoy, Yuansheng Wu, Ye Wang
Proceedings Volume 5960, Visual Communications and Image Processing 2005; 596028 (2005) https://doi.org/10.1117/12.631645
Event: Visual Communications and Image Processing 2005, Beijing, China
Abstract
We present a framework to detect the regions of singing voice in musical audio signals. This work is oriented towards the development of a robust transcriber of lyrics for karaoke applications. The technique leverages a combination of low-level audio features and higher-level musical knowledge of rhythm and tonality. Musical knowledge of the key is used to create a song-specific filterbank that attenuates the pitched musical instruments. This is followed by subband processing of the audio to detect the musical octaves in which the vocals are present. Text processing is employed to approximate the duration of the sung passages using freely available lyrics. This estimate is used to obtain a dynamic threshold for vocal/non-vocal segmentation. Pairing audio and text processing in this way helps create a more accurate system. Experimental evaluation on a small database of popular songs shows the validity of the proposed approach. Holistic and per-component evaluations of the system are conducted, and various improvements are discussed.
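The abstract does not specify how the dynamic threshold is computed. As an illustration only, the following minimal sketch assumes that each audio frame already has a vocal-likelihood score and that text processing of the lyrics has produced an estimate of the fraction of the song that is sung. The names `dynamic_threshold`, `segment`, `scores`, and `vocal_fraction` are hypothetical and do not come from the paper. The threshold is chosen so that roughly the estimated fraction of frames is labeled vocal.

```python
def dynamic_threshold(scores, vocal_fraction):
    """Return a threshold such that about `vocal_fraction` of the
    frames score at or above it (hypothetical helper, not the paper's method)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0.0 < vocal_fraction <= 1.0:
        raise ValueError("vocal_fraction must be in (0, 1]")
    # Rank frames from most to least vocal-like.
    ranked = sorted(scores, reverse=True)
    # Number of frames expected to contain singing, at least one.
    n_vocal = max(1, round(vocal_fraction * len(ranked)))
    # The score of the last expected vocal frame becomes the threshold.
    return ranked[n_vocal - 1]


def segment(scores, vocal_fraction):
    """Label each frame True (vocal) or False (non-vocal)."""
    threshold = dynamic_threshold(scores, vocal_fraction)
    return [s >= threshold for s in scores]


# Toy per-frame scores; in practice these would come from the audio front end.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.3, 0.05, 0.6]
# Assume the lyrics suggest that about half of the song is sung.
labels = segment(scores, vocal_fraction=0.5)
```

Because the threshold is tied to the lyrics-based duration estimate rather than fixed in advance, it adapts per song, which is the general idea the abstract describes; the quantile rule used here is one simple assumption among many possible choices.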
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Arun Shenoy, Yuansheng Wu, and Ye Wang "Singing voice detection for karaoke application", Proc. SPIE 5960, Visual Communications and Image Processing 2005, 596028 (24 June 2005); https://doi.org/10.1117/12.631645
CITATIONS
Cited by 12 scholarly publications and 3 patents.
KEYWORDS
Linear filtering
Electronic filtering
Signal attenuation
Databases
Feature extraction
Acoustics
Error analysis