Within the last decades, a large number of techniques for contrast enhancement have been proposed. Comparisons of such algorithms exist for a few images and figures of merit. However, many of these figures of merit cannot assess the usability of altered image content for specific tasks, such as object recognition. In this work, the effect of contrast enhancement algorithms is evaluated by means of the triangle orientation discrimination (TOD), which is a current method for imager performance assessment. The conventional TOD approach requires observers to recognize equilateral triangles pointing in four different directions, whereas here convolutional neural network models are used for the classification task. These models are trained on artificial images with single triangles. Many methods for contrast enhancement depend strongly on the content of the entire image. Therefore, the triangle images are superimposed over natural backgrounds with varying standard deviations to provide different signal-to-background ratios. Then, these images are degraded by Gaussian blur and noise representing degradational camera effects and sensor noise. Different algorithms, such as the contrast-limited adaptive histogram equalization or local range modification, are applied. Then, the accuracies of the trained models on these images are compared across contrast enhancement algorithms. Accuracy gains are found for low signal-to-background ratios and sufficiently large triangles, whereas impairments are found for high signal-to-background ratios and small triangles. The similar accuracies obtained for several image databases used for backgrounds indicate a high generalization ability of our TOD model. Finally, implications of replacing triangles with real target signatures when using such advanced digital signal processing algorithms are discussed. The results are a step toward the assessment of those algorithms for generic target recognition.
1. Introduction

For remote sensing applications and reconnaissance, the acquisition and operation of cameras in different spectral bands are required, and each band has its own pros and cons. The best possible choice among devices for procurement therefore depends on the imager performance for the desired task, e.g., the detection, recognition, or identification (DRI) of distant targets against a background composed of vegetation, urban structures, and sky. Camera data can be acquired in field trials for the characterization of single devices. However, these measurements are time consuming and expensive. Furthermore, possession of the device is required. Therefore, modeling and image-based simulation of imagers are useful and important for the assessment of imagers. Such tools become even more important when scene-dependent advanced digital signal processing (ADSP) techniques are used in the device because their impact on performance is difficult to predict. In this paper, the effect of contrast enhancement (CE) algorithms, which have so far mainly been evaluated in terms of esthetic perception, is considered. Triangle orientation discrimination (TOD)1 is a well-established image-based approach for the characterization of electro-optical system performance, especially for range performance assessment in remote sensing applications.2 It models the DRI tasks for real targets by a simplified recognition task. The original idea was that an observer has to determine the orientation of an equilateral triangle pointing in one of four directions (up, down, left, and right) shown on a display, which is fed by an imaging system. The ability to clearly discriminate the orientation is reduced depending on different types of degradation, e.g., optical diffraction blur and sensor noise. However, due to the resurgence of machine learning, automatic target recognition3 is becoming increasingly important. The importance of machine vision applications also led to the emergence of scalable compression frameworks4,5 aimed at high lossy compression while simultaneously preserving image quality for machine and human vision. In contrast to human observers, the performance of these methods does not depend on properties of the display but merely on the digital output of the imaging system. A prominent and frequently used approach for machine vision is convolutional neural networks (CNNs). Such CNN models have also been trained on artificial images of triangles to perform the TOD discrimination and have been validated on acquired camera data.6 Therefore, these models can be used for automated camera tests in the lab by means of scene projectors.7 In this paper, CNN models for TOD discrimination are trained and validated on degraded artificial images of single triangles superimposed over natural backgrounds from Open Images V6.8 In addition, the training data are processed by the CE algorithms from Table 1 with equal probability. Then, a trained model is validated on images with varying levels of background variation and Gaussian noise. Parts of this work have already been published elsewhere.32

Table 1. 25 methods for CE.
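As an illustration, one method from Table 1, CLAHE,13 can be applied to single-channel 8-bit data. The following minimal sketch uses OpenCV; the OpenCV implementation and its parameter values are our own illustrative choices, not necessarily those used in this work.

```python
# Illustrative only: applying one CE method from Table 1 (CLAHE) with OpenCV.
import cv2
import numpy as np

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in 8-bit image

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)  # CLAHE operates on single-channel integer data
```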
In Sec. 2, the considered CE algorithms, the model setup, and the training procedure are described. Section 3 shows the accuracies on validation images with varying background and noise levels. Accuracy differences between individual CE algorithms and identity are shown for varying values of the signal-to-background ratio (SBR), the signal-to-noise ratio (SNR), and the triangle circumradius. Finally, the results are discussed in Sec. 4, which concludes the paper.

2. Methods

2.1. Considered Contrast Enhancement Algorithms

Several methods for CE have been proposed within the past decades. Various modifications of conventional global histogram equalization have been proposed to counteract the mean brightness shifts11,12 that lead to annoying artifacts and to allow smooth transitions to identity.25,26 Learning-based methods33–37 as well as methods based on image decomposition38,39 have also been proposed. The decomposition often relies on color information, making such methods inapplicable to single-channel image data. However, the scope of this work is limited to easily implemented algorithms, given in Table 1, that operate on single-channel image data.

2.2. Data Generation

For the training of models for TOD,1 images of triangles are generated with varying contrasts, sizes, and four orientations, i.e., up, down, left, and right. Misalignment angles uniformly distributed in [−15 deg, 15 deg] are added to the orientation angles of the triangles to make the models more robust to misalignment. Exceeding the maximum rotation angle of 15 deg would lead to incorrect labeling because rotations of an equilateral triangle by 30 deg result in other labeled orientations due to its 120 deg rotational symmetry. This rotational robustness is crucial when applying models to real camera data because some misalignment between the field of view of a camera and a target is unavoidable.

2.3. Background Overlay

Background images are extracted from Open Images V68 as random square crops. RGB images are converted to floating-point grayscale images. The mean and standard deviation are calculated within these crops. Then, the gray levels of an image with a single triangle and background overlay are calculated as

$g(x,y) = \mu_t + \Delta\mu \, T(x,y) + \sigma_b \, [B(x,y) - \mu_B]/\tilde{\sigma}_B$.  (1)

$\mu_t$ is a constant gray level over the entire image. $\Delta\mu$ is an offset value only added for pixels related to the triangle, i.e., where the triangle mask $T(x,y)$ equals 1. In Eq. (1), the pixel values $B(x,y)$ of the image crop are normalized by subtracting the mean $\mu_B$ and dividing by the corrected standard deviation

$\tilde{\sigma}_B = \sqrt{\sigma_B^2 + \epsilon^2}$,  (2)

with $\epsilon > 0$. Without this correction, Eq. (1) would not be well defined for uniform image sections because in this case the denominator would be zero. Then, the normalized gray levels related to the cropped natural background are scaled to have a specific standard deviation $\sigma_b$. Therefore, the signal-to-background ratio is expressed as

$\mathrm{SBR} = \Delta\mu/\sigma_b$.  (3)

2.4. Degradations

Several image degradations representing typical camera effects are applied. Temporal noise is applied as uncorrelated additive Gaussian noise. Fixed pattern noise of a sensor is modeled as line- and column-based additive Gaussian noise. Linear motion blur on the triangle is applied to represent moving targets. Stabilization errors due to camera vibration are applied as linear motion blur and Gaussian blur on the triangle with background overlay. Blur due to optical diffraction by circular apertures is applied by filters with radial profiles

$h_s(r) \propto [2 J_1(\pi r/s)/(\pi r/s)]^2$.  (4)

These filters represent Airy disks40 as the diffraction patterns of a circular aperture. $J_1$ is the Bessel function of the first kind and first order, $r$ is the radial distance from the kernel center in pixels, and $s$ is a dimensionless scaling factor introduced in Eq. (5).
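Before the diffraction blur is detailed further, the background overlay of Eqs. (1)–(3) can be summarized in a short NumPy sketch. All names and the value of the regularization constant eps are our own illustrative choices, not the paper's.

```python
# A minimal NumPy sketch of the background overlay of Eqs. (1)-(3).
import numpy as np

def overlay(crop, tri_mask, mu_t, d_mu, sigma_b, eps=1e-6):
    """Superimpose a triangle (offset d_mu on mask pixels) over a background
    crop that is normalized and rescaled to standard deviation sigma_b."""
    sigma_corr = np.sqrt(crop.var() + eps**2)       # Eq. (2): nonzero denominator
    background = (crop - crop.mean()) / sigma_corr  # normalized crop
    return mu_t + d_mu * tri_mask + sigma_b * background  # Eq. (1)

def sbr(d_mu, sigma_b):
    """Signal-to-background ratio of the generated image, Eq. (3)."""
    return d_mu / sigma_b
```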
To provide optical diffraction blur for varying values of the aperture diameter $D$, detector pixel pitch $p$, wavelength $\lambda$, and focal length $f$, the dimensionless scaling factor is introduced as

$s = \lambda f/(D p)$.  (5)

A variety of physical parameters ($D$, $p$, $\lambda$, and $f$) can be realized by random sampling of $s$ from a uniform distribution. The upper bound of $s$ is chosen to limit the proportion of severely degraded images, which aggravate the model training because the possible accuracy gains are small compared with statistical fluctuations. Optical diffraction blur is applied by spatial filtering with random 2D kernels of limited width and height. The kernel size is limited due to the lack of information beyond the borders of images of finite size. The scaling factor is therefore also limited because higher values lead to radial kernel profiles biased by clipping effects due to the limited kernel size. To reduce aliasing due to the oscillatory form of the Airy disk [Eq. (4)], larger kernels are generated with an oversampling factor. These extended kernels are downsampled by average pooling with this oversampling factor to give the final kernels. Normalized filter kernels are then formed as

$H(x,y) = h_s(x,y) \, / \sum_{x',y'} h_s(x',y')$.  (6)

Several aliasing effects may occur due to small detector fill factors (the ratio between the detector dimension and the pitch) or different shapes of detector footprints, e.g., rhombic or circular. These effects can be realized by masking the extended kernels of Airy disks with the detector profile before average pooling. However, this option is not used in this work due to the rare use of nonsquare detectors, the low signal-to-noise ratio for low fill factors,41 and faster image generation.

2.5. Contrast Enhancement

The algorithms in Table 1 are applied to 50% of the degraded images, with the individual methods chosen with equal probability. Due to the complex and divergent control flow of some algorithms, these methods are implemented and calculated by a separate application on the CPU. In Fig. 1, pristine training examples as well as degraded and ADSP-processed ones are shown. These training examples are generated online during training to provide a practically infinite amount of training data immune to overfitting. However, the source of background images and the number of possible crops is large but finite.

2.6. Model Setup

A conventional CNN architecture, shown in Fig. 2, is used for TOD classification on images of dimensions $I \times I$, with $I = 64$ by default. To facilitate the model training, the input image is normalized by linear shifting and scaling of the pixel values to have a mean of 0 and a standard deviation of 1 over the spatial dimensions. Uniform input images with a standard deviation of 0 are not scaled. Then, the normalized image is downsampled by a chain of building blocks until the spatial dimensions are reduced to 2. Each building block consists of two 2D convolutional layers with rectified linear unit (ReLU) activations and a subsequent pooling layer. Hence, downsampling by a factor of 2 is applied per block. The spatial dimensions are reduced, and the number of feature maps, given as $N_k = \lfloor N_0 g^k \rfloor$, increases for each block $k$, where $\lfloor \cdot \rfloor$ is the floor function yielding the largest integer that is smaller than or equal to the argument. Two subsequent dense layers and a final softmax layer provide the probabilities for the four orientations. A default configuration with an initial number of filters $N_0$, growth factor $g$, and first dense layer size is arbitrarily chosen; a minimal sketch of this architecture is given below.

2.7. Model Training

The models are trained and evaluated using Python 3.9/TensorFlow 2.8. For optimization, Adam42 with a fixed learning rate is used.
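As an illustration of the architecture of Sec. 2.6 and this training setup, the following minimal Keras sketch can serve. The parameter values (n0, growth, dense_units), the 3 × 3 kernel size, the max pooling type, and the single hidden dense layer are our assumptions, not the paper's defaults.

```python
# A hedged Keras sketch of the building-block architecture of Sec. 2.6.
import math
import tensorflow as tf

def build_tod_model(size=64, n0=16, growth=1.5, dense_units=64):
    inp = tf.keras.Input(shape=(size, size, 1))   # normalized single-channel input
    x, k = inp, 0
    while x.shape[1] > 2:                         # downsample until spatial dims are 2
        filters = math.floor(n0 * growth**k)      # N_k = floor(N_0 * g^k)
        x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = tf.keras.layers.MaxPooling2D(2)(x)    # factor-2 downsampling per block
        k += 1
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(dense_units, activation="relu")(x)
    out = tf.keras.layers.Dense(4, activation="softmax")(x)  # four orientations
    return tf.keras.Model(inp, out)

model = build_tod_model()
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="categorical_crossentropy")
```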
Weights are initialized by He normal initialization.43 Models are trained for a fixed number of iterations. Despite the slower training, techniques for the acceleration of training, such as batch normalization,44 weight normalization,45 or adaptive gradient clipping,46 were deliberately omitted to achieve smaller models with faster inference that are compatible with running on edge TPUs.47 The loss function is the cross entropy. In each iteration, new sample images of triangles with background overlays are generated on the fly during the training for data augmentation. Triangles have a random size, position, and arbitrary orientation angle in [0 deg, 360 deg]. In addition, these images are impaired by the prescribed degradations. 50% of the training images are enhanced with one of the 25 methods for CE, chosen with equal probability. Corresponding disjoint sets of background images are randomly chosen from the respective partitions of the Open Images V6 database. For the evaluation of model performance over the degradation parameters, i.e., SBR, SNR, and triangle size, images are generated in the same way, with the background images chosen from the test subset of the database. The question arises as to how large the percentage of degraded and ADSP-processed images in the training data should be to obtain acceptable accuracies on validation sets of pristine and degraded images. Models with different percentages of degraded and processed images in the training data were trained and evaluated on different kinds of validation data. The respective accuracies on the validation data are shown in Fig. 3. It can be observed that models trained only with pristine images perform very poorly on degraded and ADSP-processed images with natural backgrounds. A slight increase of the percentage significantly raises the validation accuracies on degraded imagery. On the other hand, models trained with a high percentage of degraded images still perform very well on pristine images. Therefore, to make the best use of the model capacity for the image degradations and ADSP methods, all models mentioned below are trained with 100% degraded images, whereas ADSP is applied with 50% probability.

3. Results

3.1. Dependency of Accuracy on Background Variance

A trained model is validated on images of a fixed target, a centered triangle with a fixed circumradius $r$. The triangle circumradius can be converted to the often-used square root area1 using the Pythagorean theorem: for an equilateral triangle, $\sqrt{A} = \sqrt{3\sqrt{3}/4}\, r \approx 1.14\, r$. 1000 random crops of different background images from Open Images V68 were used. The background standard deviation of the gray levels was varied to obtain different SBR values. White Gaussian noise was added to obtain different SNR values, with a maximum noise standard deviation of about one-sixth of the dynamic range. In Fig. 4, accuracies over the SBR for different SNR values and corresponding example images for the lowest SBR are shown. Accuracies are about 100% for high SBR and SNR. The accuracies drop monotonically with decreasing SBR. A similar behavior can be observed for varying triangle sizes, as shown in Fig. 5. As expected, the accuracies also drop for decreasing triangle sizes. Further variation of the relative triangle position in the subpixel range shows high fluctuations of the accuracies for low triangle circumradii. This finding is consistent with the problem of recognition near the resolution limit mentioned before.1

3.2. Upscaling of Receptive Field

Models according to the CNN architecture shown in Fig. 2 were trained for various resolutions of the receptive field. For the two largest receptive fields, a reduction of the learning rate was required to achieve significant model improvements compared with the initial model states.
Otherwise, model performances stagnated on average at the 25% guessing rate. In Fig. 6, validation accuracies over the SBR are shown for different sizes of the receptive field. There is a general trend of decreasing accuracies for larger receptive fields. This may be due to the fact that larger receptive fields can contain more objects with a high similarity to triangles. Furthermore, the growth factor for the number of feature maps per block may be too small to provide enough model capacity for the increasing range of triangle sizes to handle. A surprising fact is the lower accuracies for higher SNR compared with lower SNR for the larger receptive fields. This might indicate beneficial properties of Gaussian noise, which suppresses structures in the background resembling the triangle target.

3.3. Comparison of Methods for Contrast Enhancement

A trained model was validated on images processed with the 25 ADSP algorithms shown in Table 1. Many algorithms only operate on integer pixel values based on gray level distributions that are widespread for natural images. Hence, to prevent saturation due to clipping, the pixel values are shifted to have a mean of half the dynamic range, upscaled by a constant factor, and converted to 8 bit. After the ADSP processing, this shifting and upscaling is reverted, and the pixel values are converted back to floating-point numbers. In Fig. 7, the differences of the accuracies between single CE algorithms and identity are shown for varying SBR and SNR. For convenience, the algorithms were ranked with respect to the maximum value, and only the top 10 algorithms are shown. It can be observed that there are accuracy gains for low SBR and that the accuracy differences are quite similar for the top 10 algorithms. Because the accuracy is about 100% for high SBR and SNR without ADSP processing according to Fig. 4, no significant improvement by CE can be expected for these cases. By contrast, a severe degradation of model performance occurs for some CE algorithms and high SBR. To validate the CE algorithms on a variety of degradations, the triangle parameters and degradation parameters were varied, uniformly distributed within the boundaries given in Table 2.

Table 2. Boundaries for uniformly distributed triangle parameters and degradation parameters, with the image dimension I = 64 and the dynamic range DR = 255.
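The random validation sampling described next can be sketched as follows; the parameter names and boundary values in this snippet are placeholders, not the actual boundaries of Table 2.

```python
# A hedged sketch of drawing uniformly distributed triangle and degradation
# parameters within boundaries, as in Table 2 (values here are placeholders).
import numpy as np

BOUNDS = {                        # (lower, upper) boundary per parameter
    "circumradius": (2.0, 16.0),
    "sbr": (0.5, 8.0),
    "snr": (1.0, 30.0),
    "diffraction_scale": (0.1, 4.0),
}

def sample_parameters(n, seed=0):
    """Draw n independent parameter sets; varying seeds give new repetitions."""
    rng = np.random.default_rng(seed)
    return [{name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
            for _ in range(n)]

params = sample_parameters(1000)  # one parameter set per validation image
```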
Random samples of triangle and degradation parameters were combined with natural backgrounds as random crops from Open Images V6.8 This procedure was repeated with varying random seeds, resulting in different triangles and background images, and the model accuracies were calculated on the resulting images. The same procedure was repeated with each of the 25 CE algorithms from Table 1 applied as the final step. Compared with a grid variation of individual triangle and degradation parameters, the random sampling of many of these parameters allows for the investigation of individual parameters by arbitrary parameter cuts, whereas the other parameters remain widely distributed. This gives better insights into the possible fluctuations of model performance when those parameters are unknown. As shown in Fig. 8, accuracy differences between each of the 25 CE algorithms and identity were calculated for parameter cuts of the triangle circumradius and the signal-to-background ratio; only accuracies on images with circumradius and SBR values inside the respective low and high cut ranges were selected. The interquartile ranges (IQRs), shown as orange boxes, contain the values between the 25% percentile and the 75% percentile. The IQRs are extended by whiskers of at most 1.5 IQR on each side, but the whiskers are limited by the respective minimal and maximal values in the data. Outliers are shown as circle markers. Obviously, a high SBR and a low triangle circumradius lead to a significant impairment of the model accuracies by most of the 25 CE algorithms. Accuracy differences at a high triangle circumradius and high SBR (right bottom) show small IQRs because the accuracy is often saturated at 100% for high SNR. Hence, accuracies for low SNR are rendered as outliers. Only for a high triangle circumradius and low SBR (left bottom) do some CE algorithms show accuracy gains. Also, monotonic transitions of the accuracy differences for varying circumradius and SBR were observed. The reason for the significant impairment at high SBR could be that a narrow gray level distribution of the background values leads to an excessive enhancement of the background by most CE algorithms, resulting in textures with a low dynamic range, steep edges, and a high similarity to the triangle to be discriminated. On the other hand, a large triangle reduces the number of background pixels and hence their contribution to the gray level distribution of the entire image. Most of the investigated CE algorithms depend on this image gray level distribution.

3.4. Generalization on Background Images of Different Image Databases

To investigate the ability of the TOD model to generalize to a larger variety of background images, the trained model was validated on artificial images with a single centered triangle of fixed circumradius superposed with background images resulting from random crops of images from different image databases. Examples of such image crops are shown in Fig. 9 for the different image databases: Pascal VOC,48 ILSVRC2012,49 FLIR ADAS,50 Open Images V6,8 Stanford dogs,51 Oxford flowers 102,52 Caltech 101,53 and Gaussian noise. In Fig. 10, the model accuracies over varying SBR are shown for images with different triangle orientations and backgrounds from the several image databases. In addition, the generated artificial images are impaired by Gaussian noise with a high noise level (left) and a low noise level (right). It can be observed that the model accuracies are very similar for most of the image databases. In contrast, background images of Gaussian noise yield significantly better accuracies than those of the image databases.
Images of FLIR ADAS50 show accuracies between those of Gaussian noise and those of the image databases, which may be due to the relatively high noise content of the FLIR ADAS images. This indicates that structured backgrounds from the image databases have a stronger degradational effect on the triangle recognition than Gaussian noise of equal standard deviation of the pixel value fluctuations. The qualitative behavior of the model accuracy when applying the methods for CE is similar for the different databases. The same is true for the ranges of SBR values for which CE yields an improvement in accuracy. For convenience, only an example applying CLAHE is shown in Fig. 11.

3.5. Variation of Model Size

The architectures used so far were an arbitrary initial choice. One might ask whether similar or better accuracies could have been achieved by smaller or larger models. To answer this question, further models were trained based on the default configuration (Sec. 2.6) with modifications of single parameters, i.e., the initial number of filters $N_0$, the growth factor $g$, the dense layer size, and the number of dense layers. In Fig. 12, validation accuracies are shown for a varying initial number of filters, growth factor, dense layer size, and number of extra dense layers in addition to the final dense layer with four units. The models are validated on images with varying triangle orientations, SBR values, and background samples. Accuracies denoted with "all degradations" refer to images enhanced by one of the CE algorithms (Table 1) chosen with equal probability. Furthermore, the validation accuracies are compared for different activations, i.e., leaky ReLU,54 exponential linear unit (ELU),55 Gaussian error linear unit (GELU),56 scaled exponential linear unit (SELU),57 adaptive piecewise linear unit (APLU),58 tanh, sigmoid,59 softplus,59 softsign,59 and swish.60 It can be observed that moderate reductions of the dense layer size, the growth factor, and the number of filters result in accuracies comparable to the default configuration. Even for a varying number of extra dense layers in addition to the final dense layer with four units, there are only slight variations in the accuracies. Of the 11 investigated activations, the ReLU59 from the default configuration (Sec. 2.6) performs very well compared with most other activations. APLU and sigmoid did not converge above the guessing rate of 25% at all.

3.6. Model Complexity

We benchmarked our trained TOD model for 64 × 64 pixels on our machine (a Ryzen 9 3900X processor with an NVIDIA GeForce RTX 2080 Ti graphics card and 64 GB RAM). Table 3 gives the average running time as well as the number of parameters and floating-point operations (FLOPs), which equals twice the number of multiply-accumulate computations (MACs).

Table 3. Model properties and benchmark results on an NVIDIA GeForce RTX 2080 Ti.
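A benchmark of this kind can be sketched as follows, reusing the hypothetical build_tod_model from the sketch in Sec. 2.7; the batch size, number of repetitions, and warm-up handling are our choices. FLOPs are typically obtained separately, e.g., from a profiler, as twice the MAC count.

```python
# A hedged sketch of a Table 3-style benchmark: parameter count and average
# single-image inference time of a trained Keras model.
import time
import numpy as np

def benchmark(model, size=64, repeats=100):
    batch = np.random.rand(1, size, size, 1).astype("float32")
    model.predict(batch, verbose=0)              # warm-up run (graph construction)
    t0 = time.perf_counter()
    for _ in range(repeats):
        model.predict(batch, verbose=0)
    mean_t = (time.perf_counter() - t0) / repeats
    return model.count_params(), mean_t

# n_params, seconds = benchmark(build_tod_model())
```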
The trained TOD models are smaller and faster than the current machine vision backbones, which are also shown in Table 3. Faster model inference allows for a stronger focus on several image degradations. Compared with the classification of RGB images in the visible spectrum, the TOD models are applicable to single-channel data, and the four triangle classes are symmetric and balanced. Furthermore, the triangle shape and texture are independent of any spectral band, in contrast to many image databases in the visible band. This is crucial, e.g., for the range performance assessment of imagers in several infrared spectral bands [long-wavelength infrared (LWIR), mid-wavelength infrared (MWIR), and short-wavelength infrared (SWIR)].

3.7. Comparison with Other Image Quality Metrics

Different image quality metrics have been proposed for the assessment of methods for CE, such as the absolute mean brightness error (AMBE),20 discrete entropy,12 the measure of enhancement (EME) and the EME based on entropy (EMEE),65 QRCM,66 UIQ,67 EBCM,68 and CII.29 A more detailed overview of further image quality metrics and methods for CE can be found in another work.69 However, most of these metrics were validated by subjective image quality assessments and may not correlate well with the accuracies of models for TOD recognition or other machine vision tasks. To investigate some current nonreference metrics on images used in the evaluation of the TOD models, these metrics were calculated for images with a centered triangle superposed by backgrounds taken from Open Images V6, scaled to different SBR values, and impaired by Gaussian noise with different SNR values. In addition, these images were enhanced by three CE methods, CLAHE,13 EHS,17 and SUACE,28 which were among the top 10 algorithms in Fig. 7. In Fig. 13, the values of the nonreference image quality metrics EBCM,68 EME, EMEE,65 and entropy12 over the SBR are shown. It can be observed that the metric values are high for low SBR and low SNR, representing high variances of the background and the Gaussian noise, respectively. The metric values decrease monotonically with increasing SBR and SNR. The only exception is EBCM, which we assume is due to the regularization of denominators in the algorithm. Therefore, low metric values represent conditions under which the TOD accuracies are high. However, very similar metric values could also be observed when the triangle was omitted. This indicates that these metrics are mainly determined by the background if the triangle is small. Thus, the metrics are weakly or not at all interrelated with the TOD task performance if the triangle is very small. Enhancement by the three CE methods generally leads to positive shifts of the metric values, which can be interpreted as a predominant enhancement of the background and noise, which aggravates the TOD recognition. Similar results were found for the evaluation of the full-reference metrics AMBE,20 CII,29 QRCM,66 and UIQ.67 In summary, the metrics can be meaningful for the assessment of CE, whereas they cannot provide insights into whether the CE is beneficial for TOD recognition and possibly other classification tasks with small targets.

4. Conclusion

Accuracies of a sequential CNN model performing TOD discrimination were compared with respect to 25 different methods for CE. The background overlay was crucial because the accuracy is significantly impaired by a high background variance and the CE algorithms strongly depend on it. Accuracy gains for low signal-to-background ratios and a sufficiently large triangle were shown.
Model accuracies on images with randomly sampled triangle and degradation parameters revealed a significant impairment by the investigated CE algorithms for a high SBR and a low triangle circumradius. The strong fluctuations of the accuracy differences highlight the difficulty of showing a clear superiority of individual algorithms. Models with an increased resolution of the receptive field have shown decreasing accuracies, which may indicate that the growth of the number of model parameters was insufficient to represent the increasing range of triangle sizes. Another reason may be a higher number of background artifacts mimicking triangles. Larger images have more pixels, so their gray level distributions are statistically more stable. Hence, CE algorithms based on these gray level distributions should produce lower variations in the processed images and the associated accuracies. To test this hypothesis, further investigations on larger receptive fields are required. Variations of the model size parameters, i.e., the number of filters, the growth factor, the number of dense layers, and the activation function, have shown that the used default configuration is close to optimal for the used model architecture and the maximal values of the degradation parameters used for the generation of the training/validation data. Stronger degradations may require larger models for optimal accuracies. The presented method can be used in an analogous way to assess the impact of other scene-based ADSP on military tasks. Moreover, the trained models can be used together with a test bed with an infrared scene projector for hardware-in-the-loop testing of imagers including embedded ADSP. Finally, the methodology may easily be extended to more sophisticated classification tasks with real target signatures. In contrast to the triangle, real target signatures also have textures with spatial variations. Therefore, the gray level distribution and the CE based on it depend more strongly on the variations within the target, especially if the target covers many image pixels. Features related to these textures may require larger models than those investigated in this work.

References

P. Bijl and J. M. Valeton, "Triangle orientation discrimination: the alternative to minimum resolvable temperature difference and minimum resolvable contrast," Opt. Eng. 37(7), 1976–1983 (1998). https://doi.org/10.1117/1.601904
S. Keßler et al., "The European computer model for optronic system performance prediction (ECOMOS)," Proc. SPIE 10433, 282–294 (2017). https://doi.org/10.1117/12.2262590

A. d'Acremont et al., "CNN-based target recognition and identification for infrared imaging in defense systems," Sensors 19(9), 2040 (2019). https://doi.org/10.3390/s19092040

Y. Shi et al., "Scalable compression for machine and human vision tasks via multi-branch shared module," J. Electron. Imaging 31(2), 023014 (2022). https://doi.org/10.1117/1.JEI.31.2.023014

Q. Wang, L. Shen and Y. Shi, "Recognition-driven compressed image generation using semantic-prior information," IEEE Signal Process. Lett. 27, 1150–1154 (2020). https://doi.org/10.1109/LSP.2020.3004967

D. Wegner and E. Repasi, "Imager assessment by classification of geometric primitives," Proc. SPIE 11406, 17–25 (2020). https://doi.org/10.1117/12.2558572

D. Wegner and E. Repasi, "Image based performance analysis of thermal imagers," Proc. SPIE 9820, 982016 (2016). https://doi.org/10.1117/12.2223629

A. Kuznetsova et al., "The Open Images Dataset V4: unified image classification, object detection, and visual relationship detection at scale," Int. J. Comput. Vis. 128, 1956–1981 (2020).

S.-C. Huang, F.-C. Cheng and Y.-S. Chiu, "Efficient contrast enhancement using adaptive gamma correction with weighting distribution," IEEE Trans. Image Process. 22, 1032–1041 (2013). https://doi.org/10.1109/TIP.2012.2226047

Y.-T. Kim, "Contrast enhancement using brightness preserving bi-histogram equalization," IEEE Trans. Consum. Electron. 43, 1–8 (1997). https://doi.org/10.1109/30.580378

H. Ibrahim and N. Kong, "Brightness preserving dynamic histogram equalization for image contrast enhancement," IEEE Trans. Consum. Electron. 53, 1752–1758 (2007). https://doi.org/10.1109/TCE.2007.4429280

C. Wang and Z. Ye, "Brightness preserving histogram equalization with maximum entropy: a variational perspective," IEEE Trans. Consum. Electron. 51, 1326–1334 (2005). https://doi.org/10.1109/TCE.2005.1561863

K. Zuiderveld, "Contrast limited adaptive histogram equalization," in Graphics Gems IV, pp. 474–485, Academic Press Professional, Inc., San Diego, California (1994).

Z.-G. Wang, Z.-H. Liang and C.-L. Liu, "A real-time image processor with combining dynamic contrast ratio enhancement and inverse gamma correction for PDP," Displays 30(3), 133–139 (2009). https://doi.org/10.1016/j.displa.2009.03.006

J. Tang, E. Peli and S. Acton, "Image enhancement using a contrast measure in the compressed domain," IEEE Signal Process. Lett. 10, 289–292 (2003). https://doi.org/10.1109/LSP.2003.817178

Y. Wang, Q. Chen and B. Zhang, "Image enhancement based on equal area dualistic sub-image histogram equalization method," IEEE Trans. Consum. Electron. 45, 68–75 (1999). https://doi.org/10.1109/30.754419

D. Coltuc, P. Bolon and J.-M. Chassery, "Exact histogram specification," IEEE Trans. Image Process. 15, 1143–1152 (2006). https://doi.org/10.1109/TIP.2005.864170

S. F. Tan and N. A. M. Isa, "Exposure based multi-histogram equalization contrast enhancement for non-uniform illumination images," IEEE Access 7, 70842–70861 (2019). https://doi.org/10.1109/ACCESS.2019.2918557

C. Wang, J. Peng and Z. Ye, "Flattest histogram specification with accurate brightness preservation," IET Image Process. 2, 249–262 (2008). https://doi.org/10.1049/iet-ipr:20070198

S.-D. Chen and A. Ramli, "Minimum mean brightness error bi-histogram equalization in contrast enhancement," IEEE Trans. Consum. Electron. 49, 1310–1319 (2003). https://doi.org/10.1109/TCE.2003.1261234

K. Singh and R. Kapoor, "Image enhancement via median-mean based sub-image-clipped histogram equalization," Optik 125(17), 4646–4651 (2014). https://doi.org/10.1016/j.ijleo.2014.04.093

K. Wongsritong et al., "Contrast enhancement using multipeak histogram equalization with brightness preserving," in IEEE Asia-Pacific Conf. Circuits and Syst. (APCCAS 1998), 455–458 (1998). https://doi.org/10.1109/APCCAS.1998.743808

S. Poddar et al., "Non-parametric modified histogram equalisation for contrast enhancement," IET Image Process. 7, 641–652 (2013). https://doi.org/10.1049/iet-ipr.2012.0507

C. H. Ooi and N. A. M. Isa, "Adaptive contrast enhancement methods with brightness preserving," IEEE Trans. Consum. Electron. 56, 2543–2551 (2010). https://doi.org/10.1109/TCE.2010.5681139

S.-D. Chen and A. Ramli, "Contrast enhancement using recursive mean-separate histogram equalization for scalable brightness preservation," IEEE Trans. Consum. Electron. 49, 1301–1309 (2003). https://doi.org/10.1109/TCE.2003.1261233

K. Sim, C. Tso and Y. Tan, "Recursive sub-image histogram equalization applied to gray scale images," Pattern Recognit. Lett. 28(10), 1209–1221 (2007). https://doi.org/10.1016/j.patrec.2007.02.003

M. Kim and M. Chung, "Recursively separated and weighted histogram equalization for brightness preservation and contrast enhancement," IEEE Trans. Consum. Electron. 54, 1389–1397 (2008). https://doi.org/10.1109/TCE.2008.4637632

A. M. R. R. Bandara, K. A. S. H. Kulathilake and P. W. G. R. M. P. B. Giragama, "Super-efficient spatially adaptive contrast enhancement algorithm for superficial vein imaging," in IEEE Int. Conf. Ind. and Inf. Syst. (ICIIS), 1–6 (2017). https://doi.org/10.1109/ICIINFS.2017.8300427

F. Bulut, "Low dynamic range histogram equalization (LDR-HE) via quantized Haar wavelet transform," Visual Comput. 38(6), 2239–2255 (2022). https://doi.org/10.1007/s00371-021-02281-5

J. D. Fahnestock and R. A. Schowengerdt, "Spatially variant contrast enhancement using local range modification," Opt. Eng. 22(3), 223378 (1983). https://doi.org/10.1117/12.7973124

P.-C. Wu, F.-C. Cheng and Y.-K. Chen, "A weighting mean-separated sub-histogram equalization for contrast enhancement," in Int. Conf. Biomed. Eng. and Comput. Sci. (ICBECS), 1–4 (2010). https://doi.org/10.1109/ICBECS.2010.5462511

D. Wegner and S. Keßler, "Comparison of algorithms for contrast enhancement based on TOD assessments by convolutional neural networks," Proc. SPIE 12271, 122710H (2022). https://doi.org/10.1117/12.2638539

J. Park et al., "Distort-and-recover: color enhancement using deep reinforcement learning," in Proc. IEEE Conf. Comput. Vis. and Pattern Recognit. (CVPR) (2018). https://doi.org/10.1109/CVPR.2018.00621

B. Xiao et al., "Histogram learning in image contrast enhancement," in IEEE/CVF Conf. Comput. Vis. and Pattern Recognit. Workshops (CVPRW), 1880–1889 (2019). https://doi.org/10.1109/CVPRW.2019.00239

G. F. C. Campos et al., "Machine learning hyperparameter selection for contrast limited adaptive histogram equalization," EURASIP J. Image Video Process. 2019(1), 59 (2019). https://doi.org/10.1186/s13640-019-0445-4

Y.-G. Shin et al., "Unsupervised deep contrast enhancement with power constraint for OLED displays," IEEE Trans. Image Process. 29, 2834–2844 (2020). https://doi.org/10.1109/TIP.2019.2953352

V. Bychkovsky et al., "Learning photographic global tonal adjustment with a database of input/output image pairs," in Proc. IEEE Conf. Comput. Vis. and Pattern Recognit. (CVPR) (2011). https://doi.org/10.1109/CVPR.2011.5995332

S. Lombardi and K. Nishino, "Reflectance and illumination recovery in the wild," IEEE Trans. Pattern Anal. Mach. Intell. 38(1), 129–141 (2016). https://doi.org/10.1109/TPAMI.2015.2430318

H. Yue et al., "Contrast enhancement based on intrinsic image decomposition," IEEE Trans. Image Process. 26(8), 3981–3994 (2017). https://doi.org/10.1109/TIP.2017.2703078

G. B. Airy, "On the diffraction of an object-glass with circular aperture," Trans. Cambridge Philos. Soc. 5, 283 (1835).

G. C. Holst, Electro-Optical Imaging System Performance, 6th ed., JCD Publishing (2017).

D. P. Kingma and J. Ba, "Adam: a method for stochastic optimization," (2014).

K. He et al., "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification," (2015).

S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proc. 32nd Int. Conf. Mach. Learn., 448–456 (2015).

T. Salimans and D. P. Kingma, "Weight normalization: a simple reparameterization to accelerate training of deep neural networks," (2016).

A. Brock et al., "High-performance large-scale image recognition without normalization," (2021).

A. Yazdanbakhsh et al., "An evaluation of edge TPU accelerators for convolutional neural networks," (2021).

M. Everingham et al., "The Pascal visual object classes (VOC) challenge," Int. J. Comput. Vis. 88(2), 303–338 (2010). https://doi.org/10.1007/s11263-009-0275-4

O. Russakovsky et al., "ImageNet large scale visual recognition challenge," Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y

A. Khosla et al., "Novel dataset for fine-grained image categorization," in First Workshop on Fine-Grained Visual Categorization, IEEE Conf. Comput. Vis. and Pattern Recognit. (2011).

M.-E. Nilsback and A. Zisserman, "Automated flower classification over a large number of classes," in Indian Conf. Comput. Vis., Graph. and Image Process. (2008). https://doi.org/10.1109/ICVGIP.2008.47

F.-F. Li et al., "Caltech 101," (2022).

A. Maas, A. Hannun and A. Ng, "Rectifier nonlinearities improve neural network acoustic models," (2013).

D. Clevert, T. Unterthiner and S. Hochreiter, "Fast and accurate deep network learning by exponential linear units (ELUs)," (2015).

D. Hendrycks and K. Gimpel, "Bridging nonlinearities and stochastic regularizers with Gaussian error linear units," (2016).

G. Klambauer et al., "Self-normalizing neural networks," (2017).

F. Agostinelli et al., "Learning activation functions to improve deep neural networks," (2014).

"TensorFlow API," https://www.tensorflow.org/api_docs/python/tf/keras/activations (2022).

P. Ramachandran, B. Zoph and Q. V. Le, "Searching for activation functions," (2017).

K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," (2014).

K. He et al., "Identity mappings in deep residual networks," (2016).

C. Szegedy et al., "Rethinking the inception architecture for computer vision," (2015).

F. Chollet, "Xception: deep learning with depthwise separable convolutions," (2016).

S. Agaian, B. Silver and K. Panetta, "Transform coefficient histogram-based image enhancement algorithms using contrast entropy," IEEE Trans. Image Process. 16, 741–758 (2007). https://doi.org/10.1109/TIP.2006.888338

T. Celik, "Spatial mutual information and PageRank-based contrast enhancement and quality-aware relative contrast measure," IEEE Trans. Image Process. 25(10), 4719–4728 (2016). https://doi.org/10.1109/TIP.2016.2599103

Z. Wang and A. Bovik, "A universal image quality index," IEEE Signal Process. Lett. 9, 81–84 (2002). https://doi.org/10.1109/97.995823

B. N. Anoop, P. E. Ameenudeen and J. Joseph, "A meta-analysis of contrast measures used for the performance evaluation of histogram equalization based image enhancement techniques," in 9th Int. Conf. Comput., Commun. and Netw. Technol. (ICCCNT), 1–6 (2018). https://doi.org/10.1109/ICCCNT.2018.8494069

S. V. Renuka, D. R. Edla and J. Joseph, "An objective measure for assessing the quality of contrast enhancement on magnetic resonance images," J. King Saud Univ. – Comput. Inf. Sci. 34(10, Part B), 9732–9744 (2021). https://doi.org/10.1016/j.jksuci.2021.12.005
Biography

Daniel Wegner is a research assistant at the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation (IOSB) in Ettlingen, Germany. He received his diploma degree in physics (equivalent to MS degree) from the Karlsruhe Institute of Technology (KIT), Germany, in 2013. Then, he received his PhD in physics from KIT in 2022. His research interests include image quality metrics, methods for contrast enhancement, image-based simulation of atmospheric turbulence, as well as approaches for modeling, image-based simulation, and machine learning for imager performance assessment.

Stefan Keßler is the head of the Sensor Simulation Group at the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation in Ettlingen, Germany. He received his diploma degree in physics from the University of Heidelberg in 2008 and his PhD in physics from the University of Erlangen-Nürnberg in 2014. His research activities comprise sensor modeling, image simulation of infrared and electro-optical imagers, and imager performance assessment.