Tympanic membrane (TM) diseases are among the most frequent pathologies, affecting the majority of the pediatric population. Video otoscopy is an effective tool for diagnosing TM diseases. However, access to Ear, Nose, and Throat (ENT) physicians is limited in many sparsely populated regions worldwide. Moreover, high inter- and intra-reader variability impairs accurate diagnosis. This study proposes a digital otoscopy video summarization and automated diagnostic label assignment model that benefits from the synergy of deep learning and natural language processing (NLP). Our main motivation is to obtain the key visual features of TM diseases from their short descriptive reports. Our video database consisted of 173 otoscopy records covering three different TM diseases. To generate composite images, we utilized our previously developed semantic segmentation-based stitching framework, SelectStitch. An ENT expert reviewed these composite images and wrote a short report for each ear describing the TM's visual landmarks and the disease. Based on NLP and a bag-of-words (BoW) model, we determined the five most frequent words characterizing each TM diagnostic category. Neighborhood components analysis was then used to predict the diagnostic label of each test instance. The proposed model achieved an overall F1-score of 90.2%. To the best of our knowledge, this is the first study to utilize textual information in computerized ear diagnostics. Our model has the potential to become a telemedicine application that automatically diagnoses TM diseases by analyzing visual descriptions provided by a healthcare provider from a mobile device.
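The report-based pipeline above can be sketched at toy scale with scikit-learn. The example reports, label meanings, and vocabulary size below are illustrative assumptions, not data from the study; a bag-of-words step keeps the most frequent terms and neighborhood components analysis learns a metric for a nearest-neighbor label prediction.

```python
# Hedged sketch of the BoW + NCA pipeline; reports and labels are invented
# placeholders, not the study's clinical data.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.pipeline import make_pipeline

reports = [
    "opaque bulging eardrum with effusion",        # hypothetical report, class 0
    "bulging red eardrum effusion visible",
    "retracted eardrum with perforation margin",   # hypothetical report, class 1
    "central perforation dry eardrum margin",
]
labels = np.array([0, 0, 1, 1])

# Bag-of-words: max_features=5 mirrors the "five most frequent words" idea
# at this toy scale.
bow = CountVectorizer(max_features=5)
X = bow.fit_transform(reports).toarray().astype(float)

# NCA learns a discriminative linear transform; 1-NN then predicts the label.
clf = make_pipeline(
    NeighborhoodComponentsAnalysis(n_components=2, random_state=0),
    KNeighborsClassifier(n_neighbors=1),
)
clf.fit(X, labels)
pred = clf.predict(bow.transform(["bulging eardrum with effusion"]).toarray())
```

In this sketch the query report shares its frequent terms with the first class, so the learned-metric nearest neighbor assigns label 0.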
Ear diseases are frequently occurring conditions affecting the majority of the pediatric population, potentially resulting in hearing loss and communication disabilities. The current standard of care in diagnosing ear diseases is a visual examination of the tympanic membrane (TM) by a medical expert using one of a range of available otoscopes. However, visual examination is subjective and depends on various factors, including the experience of the expert. This work proposes a decision fusion mechanism that combines predictions obtained from digital otoscopy images and biophysical measurements (obtained through tympanometry) for the detection of eardrum abnormalities. Our database consisted of 73 tympanometry records along with digital otoscopy videos. For the tympanometry component, we trained a random forest (RF) classifier on raw tympanometry attributes. Additionally, we mimicked a clinician's decision on tympanometry findings using the normal ranges of tympanogram values provided by a clinical guide. Moreover, we re-trained Inception-ResNet-v2 to classify TM images selected from each otoscopic video. After obtaining predictions from each of the three sources, we applied a majority-voting-based decision fusion technique to reach the final decision. Experimental results show that the proposed decision fusion method improved the classification accuracy, positive predictive value, and negative predictive value compared to the single classifiers. The accuracies were 64.4% for the clinical evaluation of tympanometry, 76.7% for the computerized analysis of tympanometry data, and 74.0% for TM image analysis, while our decision fusion methodology increased the classification accuracy to 84.9%. To the best of our knowledge, this is the first study to fuse data from digital otoscopy and tympanometry.
Preliminary results suggest that fusing information from different sensing modalities may provide complementary information for accurate, computerized diagnosis of TM-related abnormalities.
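The three-source majority vote can be sketched in a few lines. The per-source predictions below are illustrative placeholders (0 = normal, 1 = abnormal), not outputs of the study's classifiers.

```python
# Minimal sketch of majority-voting decision fusion over three binary
# classifiers; the example predictions are invented placeholders.
import numpy as np

def majority_vote(*prediction_sets):
    """Fuse per-instance binary predictions from several classifiers."""
    stacked = np.vstack(prediction_sets)   # shape: (n_sources, n_cases)
    votes = stacked.sum(axis=0)            # number of "abnormal" votes per case
    return (votes > stacked.shape[0] / 2).astype(int)

clinical  = np.array([1, 0, 1, 0])  # rule-based tympanogram reading
rf_tymp   = np.array([1, 1, 0, 0])  # random forest on raw tympanometry
cnn_image = np.array([1, 0, 0, 0])  # CNN on TM images

fused = majority_vote(clinical, rf_tymp, cnn_image)
# fused -> [1, 0, 0, 0]: only the first case gets a majority of abnormal votes
```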
Rosacea is a common cutaneous disorder characterized by facial redness, swelling, and flushing, and it is usually diagnosed by a dermatologist after a visual examination. Qualitative human assessment often results in relatively high intra- and interobserver variability, which can negatively affect patient outcomes. Computer-assisted image analysis may improve visual assessment by human observers because it enables quantitative, consistent, and accurate analysis. Here, we combine classical multidimensional scaling (MDS) with deep convolutional neural networks (CNNs) to create an efficient framework for identifying rosacea lesions. MDS is utilized to determine an appropriate amount of training data, which is used to train Inception-ResNet-v2 to classify facial images into rosacea and non-rosacea regions. Using a leave-one-patient-out cross-validation scheme with 128 × 128 non-overlapping image patches, the method achieved a class-weighted average Dice coefficient (DC) of 82.1% ± 2.4% and accuracy of 85.0% ± 0.6%. While this average performance is almost identical to our previous results (81.7% ± 2.7% and 84.9% ± 0.6% for DC and accuracy, respectively), the new scheme uses approximately 90% less data to train the system. We also report the results of quantitative experiments with overlapping patches at a stride of 50 pixels. In the same experimental setup, speedups of 25.6 times (128 × 128), 23.4 times (192 × 192), and 23.2 times (256 × 256) were observed relative to the baseline of training the network on the entire training set. For 192 × 192 overlapping patches, the class-weighted average DC of the proposed method is 83.9% ± 2.1%, versus 84.4% ± 2.2% when the entire training set is used at each fold. We conclude that the proposed method is an efficient way to train deep neural networks using only a small subset of the training data.
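The subset-selection idea can be sketched with scikit-learn, whose metric (SMACOF-based) MDS stands in here for classical MDS. The cluster-representative selection rule and the synthetic patch descriptors below are illustrative assumptions, not necessarily the study's exact procedure.

```python
# Hedged sketch: embed patch descriptors with MDS, then keep ~10% of the
# data as cluster representatives to train the CNN on. The descriptors are
# random placeholders; the selection rule is an assumption for illustration.
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
patch_features = rng.normal(size=(200, 64))   # stand-in for patch descriptors

# MDS embeds the patches in a low-dimensional space preserving distances.
embedding = MDS(n_components=2, random_state=0).fit_transform(patch_features)

# Keep roughly 10% of the data: the patch nearest each of 20 cluster centers.
km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(embedding)
selected = [int(np.argmin(np.linalg.norm(embedding - c, axis=1)))
            for c in km.cluster_centers_]
subset = patch_features[selected]             # train the network on this subset
```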
Target detection is one of the most important topics for military and civilian applications. To address such detection tasks, hyperspectral imaging sensors provide useful image data containing both spatial and spectral information. Target detection in hyperspectral images involves various challenging scenarios. To overcome these challenges, the covariance descriptor offers several advantages, and the detection capability of the conventional covariance descriptor technique can be further improved by fusion methods. In this paper, hyperspectral bands are clustered according to inter-band correlation. Target detection is then realized by fusing the covariance descriptor results computed on the band clusters. The proposed combination technique is denoted Covariance Descriptor Fusion (CDF). The efficiency of CDF is evaluated by applying it to hyperspectral imagery to detect man-made objects. The results show that CDF performs better than the conventional covariance descriptor.
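A region covariance descriptor and a matching distance can be sketched as follows. The log-Euclidean distance and the small diagonal regularizer are common choices assumed here for illustration; CDF would compute such a descriptor per band cluster and fuse the resulting scores, which this sketch does not reproduce.

```python
# Hedged sketch of a region covariance descriptor for hyperspectral data,
# with a log-Euclidean matching distance (one common SPD-manifold choice).
import numpy as np
from scipy.linalg import logm

def covariance_descriptor(region):
    """region: (rows, cols, bands) cube -> (bands, bands) covariance matrix."""
    pixels = region.reshape(-1, region.shape[-1])   # each row = one spectrum
    # Small regularizer keeps the matrix positive definite.
    return np.cov(pixels, rowvar=False) + 1e-6 * np.eye(region.shape[-1])

def log_euclidean_distance(c1, c2):
    """Distance between two covariance descriptors via matrix logarithms."""
    return np.linalg.norm(logm(c1) - logm(c2), "fro")

rng = np.random.default_rng(0)
target_region = rng.normal(size=(5, 5, 10))   # toy 10-band target window
test_region = rng.normal(size=(5, 5, 10))     # toy candidate window
c_target = covariance_descriptor(target_region)
d = log_euclidean_distance(c_target, covariance_descriptor(test_region))
```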
Food inspection and evaluation have become significant public concerns; therefore, robust, fast, and environmentally safe methods are being studied as alternatives to human visual assessment. Optical sensing is one of the promising methods, being both non-destructive and accurate. As a remote sensing technology, hyperspectral imaging (HSI) is being applied successfully by researchers because it provides both spatial and detailed spectral information about the material under study. In the food industry, HSI can be used for food quality and safety estimation, such as meat quality assessment, quality evaluation of fish, detection of skin tumors on chicken carcasses, and classification of wheat kernels. In this paper, we conducted an experiment to detect the fat ratio in ground meat via Support Vector Data Description (SVDD), an efficient and robust one-class classifier, applied to HSI. The experiments were carried out on two ground meat HSI data sets with different fat percentages. In addition, we applied bagging, an ensemble method commonly used to improve prediction performance. The results show that the proposed methods achieve high detection performance for the fat ratio in ground meat.
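As a rough sketch, scikit-learn's OneClassSVM with an RBF kernel (equivalent to SVDD in the Gaussian-kernel case) can stand in for SVDD, combined with bootstrap bagging of the decision scores. The spectra below are synthetic placeholders, not real ground-meat measurements.

```python
# Hedged sketch: one-class modeling of "normal" lean-meat spectra with
# bagged OneClassSVM models (an SVDD stand-in); the data are synthetic.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
lean_spectra = rng.normal(0.0, 1.0, size=(300, 50))   # in-class training pixels
fatty_spectra = rng.normal(3.0, 1.0, size=(50, 50))   # out-of-class pixels

# Bagging: fit each one-class model on a bootstrap sample of the training set.
models = []
for seed in range(10):
    idx = np.random.default_rng(seed).integers(0, len(lean_spectra), len(lean_spectra))
    models.append(OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(lean_spectra[idx]))

def bagged_score(x):
    """Average the ensemble's decision scores (positive = inside the class)."""
    return np.mean([m.decision_function(x) for m in models], axis=0)

flags = bagged_score(fatty_spectra) < 0   # True = flagged as out-of-class (fat)
```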
A novel hyperspectral target detection technique based on the Fukunaga-Koontz transform (FKT) is presented. FKT offers significant advantages for feature selection and ordering. However, it can only be used to solve multi-pattern classification problems. Target detection may be considered a two-class classification problem, i.e., target versus background clutter. Nevertheless, background clutter typically contains different types of materials, so target detection techniques differ from classification methods in how they model the clutter. To avoid modeling the background clutter, we have developed a one-class FKT (OC-FKT) for target detection. The statistical properties of the target training samples are used to define a tunnel-like boundary of the target class. Non-target samples are then created synthetically so as to lie outside this boundary. Thus, a limited number of target samples is sufficient for training the FKT. The hyperspectral image experiments confirm that the proposed OC-FKT technique provides an effective means for target detection.
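The synthetic non-target generation step might be sketched as follows. The per-band mean ± k·std boundary used here is an illustrative assumption, not the exact tunnel construction of OC-FKT; it only demonstrates the idea of placing synthetic samples outside a statistically defined target boundary.

```python
# Illustrative sketch: generate synthetic non-target spectra outside a
# per-band boundary derived from target statistics (an assumed boundary rule).
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(0.5, 0.05, size=(100, 30))   # toy target training spectra

mu, sigma = target.mean(axis=0), target.std(axis=0)
k = 3.0                                          # boundary half-width in stds

# Shift each band of each synthetic sample beyond the boundary, on a
# randomly chosen side.
offsets = rng.choice([-1.0, 1.0], size=(200, 30))
non_target = mu + offsets * (k * sigma + np.abs(rng.normal(0, sigma, (200, 30))))

# Every synthetic sample lies outside the tunnel [mu - k*sigma, mu + k*sigma].
outside = np.any(np.abs(non_target - mu) > k * sigma, axis=1)
```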
The performance of kernel-based techniques depends on the selection of kernel parameters; hence, suitable parameter selection is an important problem for many kernel-based techniques. This article presents a novel technique for learning the kernel parameters of a kernel Fukunaga-Koontz transform (KFKT) based classifier. The proposed approach determines appropriate kernel parameter values by optimizing an objective function constructed from the discrimination ability of KFKT. For this purpose, we utilize the differential evolution algorithm (DEA). The new technique avoids disadvantages of the traditional cross-validation method, such as its high computation time, and it can be applied to any type of data. Experiments on target detection applications with hyperspectral images verify the effectiveness of the proposed method.
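A hedged sketch of DEA-based kernel parameter learning is below, using SciPy's differential_evolution. A cross-validated RBF-SVM score stands in for the KFKT discrimination criterion, which is not reproduced here; the data are synthetic.

```python
# Hedged sketch: tune an RBF kernel parameter (gamma) with differential
# evolution. The objective (CV accuracy of an SVM) is a stand-in for the
# KFKT discrimination-based objective; the data are synthetic placeholders.
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (60, 8)), rng.normal(2, 1, (60, 8))])
y = np.repeat([0, 1], 60)

def objective(params):
    gamma = 10.0 ** params[0]   # search gamma on a log scale
    score = cross_val_score(SVC(kernel="rbf", gamma=gamma), X, y, cv=3).mean()
    return -score               # differential evolution minimizes

result = differential_evolution(objective, bounds=[(-4, 2)], seed=0,
                                maxiter=10, tol=1e-3)
best_gamma = 10.0 ** result.x[0]
```

Unlike an exhaustive cross-validation grid, the evolutionary search samples the continuous parameter range directly, which is the time-saving aspect the abstract refers to.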
Hyperspectral imagery (HSI) is a special imaging form characterized by high spectral resolution, with up to hundreds of very narrow and contiguous bands ranging from the visible to the infrared region. Since HSI contains more distinctive features than conventional images, its processing cost is very high; hence, dimensionality reduction has become important for classification performance. In this study, dimensionality reduction is achieved via a variable neighborhood search (VNS) based band selection method on hyperspectral images. This method is based on the systematic change of neighborhoods in the search space. To improve band selection performance, we propose a clustering technique based on mutual information (MI) applied before VNS. The combined technique is called MI-VNS. A support vector machine (SVM) is used as the classifier to evaluate the performance of the proposed band selection technique. The experimental results show that the MI-VNS approach increases classification performance and decreases computation time compared to both no band selection and conventional VNS.
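The MI-based band clustering step might be sketched as follows. The hierarchical clustering, the MI-to-distance conversion, and the representative-band choice are illustrative assumptions, and the VNS search itself is not reproduced; the cluster representatives merely stand in for its reduced starting band set.

```python
# Hedged sketch: group hyperspectral bands by pairwise mutual information,
# then keep one representative band per cluster. The toy cube is random data.
import numpy as np
from sklearn.metrics import mutual_info_score
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
cube = rng.normal(size=(40, 40, 20))   # toy hyperspectral cube, 20 bands
bands = cube.reshape(-1, 20)           # each column = one band's pixel values

def mi(a, b, bins=16):
    """MI between two bands after discretizing each into histogram bins."""
    return mutual_info_score(np.digitize(a, np.histogram_bin_edges(a, bins)),
                             np.digitize(b, np.histogram_bin_edges(b, bins)))

# Condensed pairwise distance: high MI (redundant bands) -> small distance.
n = bands.shape[1]
dist = np.array([1.0 / (1.0 + mi(bands[:, i], bands[:, j]))
                 for i in range(n) for j in range(i + 1, n)])

# Cluster the bands and keep the first band of each cluster.
clusters = fcluster(linkage(dist, method="average"), t=5, criterion="maxclust")
selected = [int(np.where(clusters == c)[0][0]) for c in np.unique(clusters)]
reduced = bands[:, selected]           # reduced band set for the SVM / VNS
```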