Density-induced oversampling for highly imbalanced datasets
6 March 2013
Daniel Fecker, Volker Märgner, Tim Fingscheidt
Proceedings Volume 8661, Image Processing: Machine Vision Applications VI; 86610P (2013) https://doi.org/10.1117/12.2003973
Event: IS&T/SPIE Electronic Imaging, 2013, Burlingame, California, United States
Abstract
This paper investigates two-class classification on highly imbalanced datasets in which only sparse data is available for the minority class. A novel synthetic oversampling technique is proposed that uses estimates of the probability density in the feature space. First, a Gaussian mixture model (GMM) is fitted to the data of the well-sampled majority class. Starting from this model, a second GMM is obtained by Bayesian adaptation using the sparse minority class data. Random synthetic samples are then drawn from the adapted GMM, and an additional assignment rule either assigns each sample to the minority class or discards it. The retained synthetic data is combined with the available original data to train a support vector machine classifier. The application examined in this paper is optical on-line process monitoring of laser brazing, in which defects occur only rarely and sporadically. Experiments with different numbers of minority class samples, together with comparisons to other methods, show that this approach performs very well on highly imbalanced datasets.
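The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not specify: diagonal covariances, a plain EM fit, mean-only MAP adaptation with a hypothetical `relevance` factor standing in for "Bayesian adaptation", and a simple likelihood comparison standing in for the assignment rule. The toy data and all function names are invented for illustration.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each row of X under each diagonal Gaussian, shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff**2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def responsibilities(X, weights, means, variances):
    """Posterior probability of each component for each sample, shape (n, k)."""
    log_p = np.log(weights) + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilise the exponent
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def gmm_log_density(X, weights, means, variances):
    """Log density of each sample under the whole mixture (log-sum-exp)."""
    log_p = np.log(weights) + log_gauss_diag(X, means, variances)
    m = log_p.max(axis=1)
    return m + np.log(np.exp(log_p - m[:, None]).sum(axis=1))

def fit_gmm(X, k, n_iter=50, seed=0):
    """Plain EM for a diagonal-covariance GMM (assumed model form)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]
    variances = np.tile(X.var(axis=0), (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        r = responsibilities(X, weights, means, variances)
        nk = r.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (r.T @ X) / nk[:, None]
        variances = (r.T @ X**2) / nk[:, None] - means**2 + 1e-6
    return weights, means, variances

def map_adapt_means(X_min, weights, means, variances, relevance=4.0):
    """Mean-only MAP adaptation: components that explain many minority
    samples move toward them; the others stay at the majority means."""
    r = responsibilities(X_min, weights, means, variances)
    nk = r.sum(axis=0)
    ex = (r.T @ X_min) / np.maximum(nk, 1e-10)[:, None]
    alpha = (nk / (nk + relevance))[:, None]
    return alpha * ex + (1.0 - alpha) * means

def sample_gmm(n, weights, means, variances, rng):
    """Draw n samples: pick a component, then add scaled Gaussian noise."""
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + noise * np.sqrt(variances[comp])

# Toy data (invented): 500 majority samples, only 8 minority samples.
rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(500, 2))
X_min = rng.normal(3.0, 0.3, size=(8, 2))

w, mu, var = fit_gmm(X_maj, k=3)
mu_adapted = map_adapt_means(X_min, w, mu, var)
candidates = sample_gmm(2000, w, mu_adapted, var, rng)

# Assumed assignment rule: keep a candidate as a minority sample only if
# the adapted GMM explains it better than the majority GMM; else discard.
keep = (gmm_log_density(candidates, w, mu_adapted, var)
        > gmm_log_density(candidates, w, mu, var))
X_syn = candidates[keep]
print(len(X_syn), "of", len(candidates), "candidates kept as synthetic minority data")
```

In this sketch, `X_syn` would be stacked with `X_min` as the minority class and passed together with `X_maj` to a support vector machine (for example scikit-learn's `SVC`). The classifier training step is omitted here to keep the example dependent on NumPy only.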
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Daniel Fecker, Volker Märgner, and Tim Fingscheidt "Density-induced oversampling for highly imbalanced datasets", Proc. SPIE 8661, Image Processing: Machine Vision Applications VI, 86610P (6 March 2013); https://doi.org/10.1117/12.2003973
KEYWORDS
Data modeling
Statistical analysis
Image processing
Beam controllers
Cameras
Expectation maximization algorithms
Feature extraction