As global populations soar and the climate warms, food supply management is an increasingly critical problem. Precision agriculture, driven by on-site data collected from various sensors, plays a pivotal role in optimizing irrigation and fertilization and in enhancing plant health and crop yield. However, in-field chlorophyll measurement, a key metric for guiding agricultural decisions, remains a cumbersome manual process. This paper explores the transformative potential of multispectral imaging data to automate plant measuring and monitoring tasks, thereby reducing labor and time costs while improving the quality of data available for making informed agricultural decisions. We present a deep-learning model for instance segmentation of plants, trained on the GrowliFlower dataset of RGB and multispectral image cubes of cauliflower plants. The proposed algorithm uses a convolutional neural network (CNN) to leverage both the spectral information and its spatial context to locate individual plants. We introduce a novel band-selection algorithm for determining the most significant multispectral features for use in the convolutional network; this reduces model complexity while ensuring accurate results. Our model's ability to generalize across the varying growth stages, soil conditions, and crop varieties in the training dataset demonstrates its suitability for real-world agricultural applications. This fusion of cutting-edge sensing technology for robotic systems and state-of-the-art deep learning models holds significant promise for advancements in crop yield, resource efficiency, and sustainability practices.
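The band-selection step lends itself to a short illustration. The abstract does not describe the paper's own selection algorithm, so the sketch below ranks the bands of a multispectral cube by a simple per-band variance criterion as a hypothetical stand-in; the function name, the variance score, and the synthetic cube are all assumptions for illustration only.

```python
# Illustrative sketch only: the paper's band-selection algorithm is not specified in
# the abstract, so per-band variance is used here as a stand-in ranking criterion.
import numpy as np

def rank_bands_by_variance(cube: np.ndarray, k: int) -> np.ndarray:
    """Rank the spectral bands of an (H, W, B) image cube and keep the top-k indices."""
    b = cube.shape[-1]
    flat = cube.reshape(-1, b).astype(np.float64)
    scores = flat.var(axis=0)                 # one score per band
    top_k = np.argsort(scores)[::-1][:k]      # indices of the k highest-variance bands
    return np.sort(top_k)

# Example: a synthetic 6-band cube, keeping the 3 most variable bands
cube = np.random.rand(64, 64, 6)
print("Selected band indices:", rank_bands_by_variance(cube, k=3))
```

In practice a criterion such as mutual information with the segmentation labels could replace the variance score; the point is only that selecting k of B bands shrinks the CNN's input channels and hence its first-layer parameter count.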
Eye-tracking holds numerous promises for improving the mixed reality experience. While eye-tracking devices are capable of accurate gaze mapping on 2D surfaces, estimating the depth of gaze points remains a challenging problem. Most gaze-based interaction applications rely on estimation techniques that map gaze data to corresponding targets on a 2D surface. This approach inevitably leads to a biased outcome, as the nearest objects in the line of sight tend to be identified as the target of interest more often. One viable solution is to estimate gaze as a 3D coordinate (x, y, z) rather than the traditional 2D coordinate (x, y). This article first introduces a new comprehensive 3D gaze dataset collected in a realistic scene setting. Data were collected using a head-mounted eye tracker and a depth estimation camera. Next, we present a novel depth estimation model, trained on the new gaze dataset, that accurately predicts gaze depth from calibrated gaze vectors. This method could help develop a mapping between gaze and objects in a 3D scene. The presented model improves the reliability of measuring the depth of visual attention in real scenes as well as the accuracy of depth-based interaction in virtual reality environments. Improving situational awareness using 3D gaze data will benefit several domains, particularly human-vehicle interaction, autonomous driving, and augmented reality.
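As a rough illustration of the regression task described above, the sketch below fits a small multilayer perceptron that maps calibrated gaze vectors to depth. The feature layout (left- and right-eye direction vectors), the synthetic data, and the scikit-learn regressor are assumptions for illustration; they do not reproduce the paper's model or dataset.

```python
# Minimal sketch of depth regression from gaze vectors, assuming each sample holds a
# pair of calibrated 3D gaze directions (left/right eye) and the target is gaze depth
# in metres. Illustrative baseline only, not the model proposed in the paper.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))              # [lx, ly, lz, rx, ry, rz] per sample (synthetic)
depth = 1.0 + np.abs(X[:, 2] + X[:, 5])    # synthetic depth target

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(X[:400], depth[:400])
pred = model.predict(X[400:])
print("Mean absolute error (m):", np.abs(pred - depth[400:]).mean())
```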
Single-image super-resolution (SISR), which maps a low-resolution observation to a high-resolution image, has been extensively utilized in various computer vision applications. With the advent of convolutional neural networks (CNNs), numerous algorithms have emerged that achieve state-of-the-art results. However, the main drawback of CNNs is that they neglect the interrelationship between the RGB color channels. This neglect discards crucial structural color information and yields a suboptimal representation of color images. Furthermore, most of these CNN-based methods contain millions of parameters and many layers, limiting their practical applications. To overcome these drawbacks, an end-to-end trainable single-image super-resolution method, the Quaternion-based Image Super-Resolution network (QSRNet), which takes advantage of quaternion theory, is proposed in this paper. QSRNet aims to maintain the local and global interrelationships between the channels and produces high-resolution images with approximately 4x fewer parameters than standard CNNs. Extensive computer experiments were conducted on publicly available benchmark datasets, including DIV2K, Flickr2K, Set5, Set14, BSD100, Urban100, and UEC100, to demonstrate the effectiveness of the proposed QSRNet compared to traditional CNNs.
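The channel-coupling argument can be made concrete with the quaternion algebra itself. The sketch below embeds an RGB pixel as a pure quaternion and applies the Hamilton product with a single quaternion-valued weight, which is the mechanism by which quaternion layers mix all color channels jointly; it illustrates the algebra only, not the QSRNet architecture.

```python
# Sketch of the core quaternion idea: an RGB pixel is embedded as a pure quaternion
# (0, R, G, B), and the Hamilton product with a quaternion-valued weight mixes the
# channels jointly, preserving inter-channel structure. Illustration of the algebra
# only; the weight value and pixel are arbitrary examples.
import numpy as np

def hamilton_product(q1: np.ndarray, q2: np.ndarray) -> np.ndarray:
    """Hamilton product of two quaternions given as (a, b, c, d) arrays."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return np.array([
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    ])

rgb_pixel = np.array([0.0, 0.8, 0.4, 0.2])   # pure quaternion (0, R, G, B)
weight = np.array([0.5, 0.1, -0.2, 0.3])     # one quaternion-valued weight
print(hamilton_product(weight, rgb_pixel))    # every channel influences every output component
```

Because one quaternion weight (four real numbers) replaces a full 4x4 real mixing block (sixteen real numbers), quaternion layers need roughly a quarter of the parameters of their real-valued counterparts, which is consistent with the approximately 4x reduction claimed above.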
Neural networks have emerged as the most appropriate method for tackling the classification problem for hyperspectral images (HSI). Convolutional neural networks (CNNs), the current state of the art for various classification tasks, have some limitations in the context of HSI. These CNN models are highly susceptible to overfitting because of 1) the limited availability of training samples and 2) the large number of parameters to fine-tune. Furthermore, the learning rates used by CNNs must be small to avoid vanishing gradients, so gradient descent takes small steps to converge and training slows down. To overcome these drawbacks, a novel quaternion-based hyperspectral image classification network (QHIC Net) is proposed in this paper. The QHIC Net can model both the local dependencies between the spectral channels of a single pixel and the global structural relationships describing the edges or shapes formed by a group of pixels, making it suitable for HSI datasets that are small and diverse. Experimental results on three HSI datasets demonstrate that the QHIC Net performs on par with traditional CNN-based methods for HSI classification with far fewer parameters.
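For readers unfamiliar with pixel-wise HSI classification, the sketch below shows how a spectral-spatial input is commonly formed: each labelled pixel is paired with its local spatial neighbourhood across all bands, which is the kind of input over which local spectral dependencies and broader spatial structure can be modelled. The patch size, band count, and padding scheme are illustrative assumptions, not details of QHIC Net.

```python
# Sketch of spectral-spatial input construction for pixel-wise HSI classification:
# each labelled pixel is paired with its S x S spatial neighbourhood across all B
# bands. Sizes here are illustrative, not taken from the paper.
import numpy as np

def extract_patch(cube: np.ndarray, row: int, col: int, size: int = 5) -> np.ndarray:
    """Return the (size, size, B) neighbourhood of one pixel, zero-padded at the borders."""
    pad = size // 2
    padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    return padded[row:row + size, col:col + size, :]

cube = np.random.rand(32, 32, 103)           # synthetic 103-band hyperspectral cube
patch = extract_patch(cube, row=10, col=7)   # classifier input for pixel (10, 7)
print(patch.shape)                           # (5, 5, 103)
```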
Translating environmental knowledge from a bird's-eye-view perspective, such as a map, to a first-person egocentric perspective is notoriously challenging, but critical for effective navigation and environment learning. Pointing error, the angular difference between the perceived location and the actual location, is an important measure for estimating how well the environment has been learned. Traditionally, pointing errors were computed by manually noting the angular difference. With the advent of commercial low-cost mobile eye trackers, it has become possible to couple the advantages of automated image-processing techniques with these spatial learning studies. This paper presents a vision-based analytic approach for calculating pointing error measures in real-world navigation studies that relies only on data from mobile eye-tracking devices. The proposed method involves three steps: panorama generation, probe image localization using feature matching, and navigation pointing error estimation. This first-of-its-kind application has game-changing potential in the field of cognitive research using eye-tracking technology to understand human navigation and environment learning, and has been successfully adopted by cognitive psychologists.
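The probe-image-localization step can be sketched with standard feature matching. The code below matches ORB features from a probe frame against a 360-degree panorama and converts the matched column to a bearing; the file handling, the uniform degrees-per-pixel assumption, and the median-column heuristic are illustrative assumptions rather than the paper's exact pipeline.

```python
# Hedged sketch of probe image localization: ORB features from the probe frame are
# matched against a 360-degree panorama, and the matched column is converted to a
# bearing. Assumes an equirectangular panorama spanning 360 degrees horizontally.
import cv2
import numpy as np

def estimate_bearing(panorama_gray: np.ndarray, probe_gray: np.ndarray) -> float:
    """Approximate bearing (degrees) of the probe view within the panorama."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp_pan, des_pan = orb.detectAndCompute(panorama_gray, None)
    kp_probe, des_probe = orb.detectAndCompute(probe_gray, None)
    if des_pan is None or des_probe is None:
        raise ValueError("no features detected in one of the images")
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_probe, des_pan)
    if not matches:
        raise ValueError("no feature matches found")
    # Median column of the matched panorama keypoints, converted to an angle
    cols = np.array([kp_pan[m.trainIdx].pt[0] for m in matches])
    degrees_per_pixel = 360.0 / panorama_gray.shape[1]
    return float(np.median(cols) * degrees_per_pixel)

# Usage (images assumed loaded with cv2.imread(..., cv2.IMREAD_GRAYSCALE)):
# bearing = estimate_bearing(panorama, probe)
# pointing_error_deg = abs(bearing - bearing_of_actual_target)
```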
Face recognition technologies have been in high demand in the past few decades due to the increase in human-computer interaction. Face recognition is also an essential component in interpreting human emotions, intentions, and facial expressions for smart environments. This non-intrusive biometric authentication approach relies on identifying unique facial features and pairing alike structures for identification and recognition. Application areas of facial recognition systems include homeland and border security, identification for law enforcement, access control to secure networks, authentication for online banking, and video surveillance. While it is easy for humans to recognize faces under varying illumination conditions, it remains a challenging task in computer vision. Non-uniform illumination and uncontrolled operating environments can impair the performance of visual-spectrum-based recognition systems. To address these difficulties, a novel Anisotropic Gradient Facial Recognition (AGFR) system capable of autonomous thermal-infrared-to-visible face recognition is proposed. The main contributions of this paper include a framework for thermal/fused-thermal-visible to visible face recognition and a novel human-visual-system-inspired thermal-visible image fusion technique. Extensive computer simulations using the CARL, IRIS, AT&T, Yale, and Yale-B databases demonstrate the efficiency, accuracy, and robustness of the AGFR system.
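The abstract's human-visual-system-inspired fusion technique is not detailed here, so the sketch below substitutes the simplest possible alternative, a per-pixel weighted blend of registered thermal and visible images, purely to show where fusion sits in such a pipeline; the weighting scheme and synthetic images are assumptions.

```python
# Minimal stand-in for thermal-visible image fusion: a per-pixel weighted average of
# registered grayscale thermal and visible images. This simple blend is NOT the AGFR
# paper's fusion technique; it only marks the fusion stage of the pipeline.
import numpy as np

def fuse_weighted(visible: np.ndarray, thermal: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Fuse two registered grayscale images in [0, 255]; alpha weights the visible band."""
    v = visible.astype(np.float64) / 255.0
    t = thermal.astype(np.float64) / 255.0
    fused = alpha * v + (1.0 - alpha) * t
    return (fused * 255.0).astype(np.uint8)

visible = np.random.randint(0, 256, (128, 128), dtype=np.uint8)   # synthetic stand-ins
thermal = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
print(fuse_weighted(visible, thermal).shape)
```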
Although not as popular as fingerprint biometrics, palm prints have garnered interest in the scientific community for the rich amount of distinctive information available on the palm. In this paper, a novel method for touchless palm print stitching to increase the effective palm area is presented. The method is not only rotation invariant but also robust to many distortions common in touchless systems, such as illumination and pose variations. The proposed method can also handle partial palm prints, which are likely to occur at a crime scene, by stitching them together to produce a larger, potentially full-size palm print for authentication purposes. Experimental results are shown for the IIT-D palmprint database, from which pseudo-partial palm prints were generated by cropping and random rotation. Furthermore, the quality of the stitching algorithm is assessed through extensive computer simulations and visual analysis of the stitched images. Experimental results also show that stitching significantly increases the palm area available for feature point detection and hence provides a way to increase the accuracy and reliability of detection.
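As a rough stand-in for the rotation-invariant stitching method described above, the sketch below combines overlapping partial palm print crops with OpenCV's generic scan stitcher; it is not the paper's algorithm, only an illustration of producing a larger print from partial impressions, and the file names are placeholders.

```python
# Sketch only: OpenCV's generic stitcher as a stand-in for the paper's touchless
# palm print stitching method. SCANS mode targets flat, scan-like content.
import cv2

def stitch_partials(images):
    """Stitch a list of overlapping BGR palm print crops into one larger image."""
    stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)
    status, stitched = stitcher.stitch(images)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return stitched

# Usage (paths are placeholders):
# parts = [cv2.imread(p) for p in ["palm_left.png", "palm_right.png"]]
# full_print = stitch_partials(parts)
```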
One of the most important problems in biometrics is matching partial fingerprints against fingerprint databases. Recently, significant progress has been made in designing fingerprint identification systems that cope with missing fingerprint information. However, dependable reconstruction of fingerprint images remains challenging due to the complexity and ill-posed nature of the problem. In this article, both binary and gray-level fingerprint images are reconstructed. This paper also presents a new similarity score to evaluate the quality of the reconstructed binary image. The proposed fingerprint image identification system can be automated and extended to numerous other security applications, such as postmortem fingerprints, forensic investigations, artificial intelligence, robotics, access control, and financial security, as well as the verification of firearm purchasers and driver license applicants.
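The new similarity score mentioned above is not specified in this abstract, so the sketch below uses the standard Dice overlap between binarized fingerprint images as a generic stand-in for evaluating a reconstructed binary image; the threshold and synthetic images are assumptions.

```python
# Generic stand-in metric: Dice overlap between a reconstructed and a reference
# fingerprint image after binarization. Not the similarity score proposed in the paper.
import numpy as np

def dice_similarity(recon: np.ndarray, reference: np.ndarray, threshold: int = 128) -> float:
    """Dice overlap between two grayscale fingerprint images after thresholding."""
    a = recon >= threshold
    b = reference >= threshold
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum() + 1e-9)

recon = np.random.randint(0, 256, (64, 64))      # synthetic placeholders
reference = np.random.randint(0, 256, (64, 64))
print(round(dice_similarity(recon, reference), 3))   # 1.0 means identical binary images
```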
In this paper, a novel technique to mosaic multiview contactless finger images is presented. The technique makes use of different correlation methods, such as alpha-trimmed correlation, Pearson's correlation [1], Kendall's correlation [2], and Spearman's correlation [2], to combine multiple views of the finger. The key contributions of the algorithm are that it 1) stitches images more accurately, 2) provides better image fusion, 3) produces a better visual effect in the overall image, and 4) is more reliable. Extensive computer simulations show that the proposed method produces stitched images that are better than or comparable to those of several state-of-the-art methods, such as the ones presented by Feng Liu [3], K. Choi [4], H. Choi [5], and G. Parziale [6]. In addition, we compare various correlation techniques with the correlation method used in [3] and analyze the output. In the future, this method can be extended to obtain a 3D model of the finger from multiple views, and to help generate scenic panoramic images and underwater 360-degree panoramas.
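The listed correlation measures can be illustrated directly. The sketch below scores the agreement between two candidate overlap strips with Pearson's, Spearman's, and Kendall's correlation from scipy; the alpha-trimmed variant and the actual mosaicking logic of the paper are not reproduced, and the synthetic strips are placeholders.

```python
# Sketch of scoring a candidate overlap region during mosaicking with the correlation
# measures named in the abstract (alpha-trimmed correlation omitted here).
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

def overlap_scores(strip_a: np.ndarray, strip_b: np.ndarray) -> dict:
    """Correlation scores between two same-sized grayscale overlap strips."""
    a, b = strip_a.ravel().astype(float), strip_b.ravel().astype(float)
    return {
        "pearson": pearsonr(a, b)[0],
        "spearman": spearmanr(a, b)[0],
        "kendall": kendalltau(a, b)[0],
    }

strip = np.random.randint(0, 256, (32, 16))
noisy = np.clip(strip + np.random.normal(0, 10, strip.shape), 0, 255)
print(overlap_scores(strip, noisy))   # higher scores indicate better overlap alignment
```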