This paper considers a method for detecting road surface markings using a camera mounted on top of a vehicle. Detection is performed with an orientation-aware detector based on a convolutional neural network. To detect both the orientation and the position of road surface markings, the frontal input image is converted to a bird's-eye-view image using inverse perspective mapping. A synthetic image dataset is constructed with the aid of the MSER (maximally stable extremal regions) algorithm to mitigate the data imbalance problem. The detector is trained to estimate the orientations of detected objects in addition to their class labels and positions. A pretrained DenseNet-based YOLOv2 model is modified to detect rotated rectangles using an additional cost function and a new, efficient IOU (intersection over union) measure. Instead of directly regressing the orientation angle of a road surface marking, the detector performs probabilistic estimation over quantized angular bins. A benchmark dataset is constructed for evaluation, and the experimental results show that the algorithm provides promising results while running in real time.
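As a concrete illustration of the quantized-angle idea, the Python sketch below encodes an orientation angle as a one-hot target over angular bins and decodes a predicted bin distribution back to an angle via a circular expectation. The bin count (12) and the decoding rule are assumptions for illustration; the abstract does not specify either, and the paper's choices may differ.

    import numpy as np

    NUM_BINS = 12  # hypothetical bin count; not stated in the abstract

    def angle_to_bin_target(theta, num_bins=NUM_BINS):
        """Map an orientation angle in [0, pi) to a one-hot bin target."""
        bin_width = np.pi / num_bins
        idx = int(theta // bin_width) % num_bins
        target = np.zeros(num_bins, dtype=np.float32)
        target[idx] = 1.0
        return target

    def bin_probs_to_angle(probs, num_bins=NUM_BINS):
        """Decode a predicted bin distribution to an angle by a circular
        mean over bin centers (argmax decoding is a simpler alternative)."""
        bin_width = np.pi / num_bins
        centers = (np.arange(num_bins) + 0.5) * bin_width
        # double-angle trick handles the pi-periodicity of line orientations
        s = np.sum(probs * np.sin(2 * centers))
        c = np.sum(probs * np.cos(2 * centers))
        return (np.arctan2(s, c) / 2.0) % np.pi

Binning turns angle regression into a classification problem, which is often easier to train and naturally expresses uncertainty over orientation.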
In this paper, we propose an accurate lane-level map building method using low-cost sensors such as cameras, GPS, and in-vehicle sensors. First, we estimate the ego-motion from the stereo camera and the in-vehicle sensors and globally optimize the vehicle positions by fusing them with the GPS data. Next, we perform lane detection on every camera frame. Lastly, we repeatedly accumulate and cluster the detected lanes based on the optimized vehicle positions and apply a polyline fitting algorithm. The polyline fitting follows a variant of the Random Sample Consensus (RANSAC) algorithm that specifically addresses the multi-line fitting problem. By repeatedly driving the same road, the method simultaneously expands the mapped lane area and improves its accuracy. We evaluated the lane-level map building on two types of roads: a proving ground and a real driving environment. The map accuracy verified at the proving ground was a CEP (circular error probable) of 9.9982 cm.
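One common way to realize RANSAC-based multi-line fitting is sequential RANSAC: fit a single line, remove its inliers, and repeat. The Python sketch below follows that pattern with assumed thresholds; the paper's actual variant, and its clustering and polyline steps, may differ.

    import numpy as np

    def fit_line_ransac(points, n_iters=200, inlier_thresh=0.1):
        """Fit one 2D line with RANSAC. `points` is an (N, 2) array.
        Returns (point_on_line, unit_direction, inlier_mask)."""
        rng = np.random.default_rng()
        best_p0, best_d, best_inliers = None, None, np.zeros(len(points), bool)
        for _ in range(n_iters):
            i, j = rng.choice(len(points), size=2, replace=False)
            d = points[j] - points[i]
            norm = np.linalg.norm(d)
            if norm < 1e-9:
                continue  # degenerate sample, skip
            d = d / norm
            # perpendicular distance of every point to the candidate line
            diff = points - points[i]
            dist = np.abs(diff[:, 0] * d[1] - diff[:, 1] * d[0])
            inliers = dist < inlier_thresh
            if inliers.sum() > best_inliers.sum():
                best_p0, best_d, best_inliers = points[i], d, inliers
        return best_p0, best_d, best_inliers

    def multi_line_ransac(points, max_lines=4, min_inliers=30):
        """Sequential RANSAC for multi-line fitting: fit, remove inliers, repeat."""
        lines = []
        remaining = points.copy()
        for _ in range(max_lines):
            if len(remaining) < min_inliers:
                break
            p0, d, mask = fit_line_ransac(remaining)
            if p0 is None or mask.sum() < min_inliers:
                break
            lines.append((p0, d))
            remaining = remaining[~mask]
        return lines

The fit-and-remove loop is what lets a single-model estimator like RANSAC recover several lane boundaries from one accumulated point set.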
We present a vehicle re-identification method for a parking lot toll system. Given a probe image captured by a camera installed at the entrance of a parking lot, re-identification is the task of finding the matching image in a gallery set constructed from different cameras in the exit region. This method is especially useful when license plate recognition fails. Our method is based on a convolutional neural network (CNN), a variant of the multilayer perceptron (MLP). The input image to the CNN model is cropped by a license plate detection (LPD) algorithm to eliminate the background of the original image. To train the vehicle re-identification model, we adopt pretrained models that showed outstanding results in the ImageNet [1] challenge from 2014 to 2015. We then fine-tune one of these models (GoogLeNet [2]) for a car model recognition task using a large-scale car dataset [3]. This fine-tuned model is used as a feature extractor, and cosine similarity is used to compare a probe with each gallery image. To evaluate the performance of our method, we create two datasets: ETRI-VEHICLE-2016-1 and ETRI-VEHICLE-2016-2. The experimental results show that the proposed technique achieves promising results.
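The matching step amounts to ranking gallery features by cosine similarity to the probe feature. A minimal Python sketch follows, assuming the features are vectors extracted by the fine-tuned network; the function names are illustrative, not from the paper.

    import numpy as np

    def cosine_similarity(a, b):
        """Cosine similarity between two feature vectors."""
        return float(np.dot(a, b) /
                     (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def rank_gallery(probe_feat, gallery_feats):
        """Rank gallery features by descending cosine similarity to the probe.
        Returns (ranked_indices, sorted_similarities)."""
        sims = np.array([cosine_similarity(probe_feat, g) for g in gallery_feats])
        order = np.argsort(-sims)
        return order, sims[order]

The top-ranked gallery index then identifies the candidate exit-camera image for the probe vehicle.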
In camera-based engagement level recognition, the face is an important factor because most cues come from the face, which is affected by the distance between the camera and the user. In this paper, we present an automatic engagement level recognition method that shows stable performance regardless of the camera-to-user distance. We describe in detail the process of obtaining a distance-invariant cue and compare the performance with and without it. We also adopt a temporal pyramid structure to extract temporal statistical features, and we present a voting method for engagement level estimation. We report results and analysis on a database acquired in a real environment.
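A temporal pyramid splits a sequence into progressively finer segments and pools statistics from each. The Python sketch below assumes mean/std pooling and three pyramid levels, neither of which the abstract specifies, and pairs the features with a simple majority-vote decision as one plausible voting scheme.

    import numpy as np

    def temporal_pyramid_features(frames, levels=3):
        """Concatenate per-segment statistics over a temporal pyramid.
        `frames` is a (T, D) array of per-frame cue vectors; requires
        T >= 2 ** (levels - 1) so every segment is non-empty."""
        feats = []
        T = len(frames)
        for level in range(levels):
            n_seg = 2 ** level
            bounds = np.linspace(0, T, n_seg + 1, dtype=int)
            for s, e in zip(bounds[:-1], bounds[1:]):
                seg = frames[s:e]
                feats.append(seg.mean(axis=0))  # segment average
                feats.append(seg.std(axis=0))   # segment variability
        return np.concatenate(feats)

    def vote_engagement_level(window_predictions):
        """Majority vote over per-window predictions."""
        values, counts = np.unique(window_predictions, return_counts=True)
        return values[np.argmax(counts)]

Coarse levels capture the overall trend of a cue while fine levels preserve short-term dynamics, which is why pyramid pooling often outperforms a single global average.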
In this paper, we present an affect recognition system that measures the engagement level of children using the Kinect while they perform a multiple intelligence test on a computer. First, we recorded 12 children solving the test and manually created ground-truth engagement levels for each child. For feature extraction, the Kinect for Windows SDK provides user segmentation and skeleton tracking, from which we obtain the 3D joint positions of each child's upper-body skeleton. After analyzing the children's movements, the engagement level of each response is classified into one of two classes: High or Low. We present the classification results using the proposed features and identify the features that are most significant for measuring engagement.
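To illustrate how upper-body joint tracks can be turned into movement features for a High/Low classifier, the Python sketch below computes simple per-joint motion statistics from a (frames x joints x 3) array of Kinect joint positions. The specific features and classifier used in the paper are not stated in the abstract, so both are assumptions here.

    import numpy as np

    def movement_features(joints):
        """Per-sequence movement statistics from Kinect skeleton tracks.
        `joints` is a (T, J, 3) array of 3D positions for J upper-body joints."""
        vel = np.diff(joints, axis=0)        # (T-1, J, 3) frame-to-frame motion
        speed = np.linalg.norm(vel, axis=2)  # (T-1, J) per-joint speed
        return np.concatenate([speed.mean(axis=0),  # average motion per joint
                               speed.std(axis=0),   # motion variability
                               speed.max(axis=0)])  # peak motion

    # The abstract does not name a classifier; as one plausible choice
    # (an assumption), an RBF-kernel SVM could map features to High/Low:
    #   from sklearn.svm import SVC
    #   clf = SVC(kernel="rbf").fit(X_train, y_train)  # rows from movement_features
    #   pred = clf.predict(X_test)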
Recently, many studies have shown that indoor horse riding exercise has a positive effect on promoting health and dieting. However, if a rider has an incorrect posture, it can cause back pain. Despite this, there is little research on analyzing rider posture. The purpose of this study is therefore to estimate a rider's pose from a depth image using the ASUS Xtion sensor in real time. In the experiments, we demonstrate the performance of our pose estimation algorithm by comparing its joint estimates against ground-truth data.