Open Access
5 November 2015

Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine
Morteza Shahriari Nia, Daisy Zhe Wang, Stephanie Ann Bohlman, Paul D. Gader, Sarah J. Graves, Milenko Petrovic
Abstract
Hyperspectral images can be used to identify savanna tree species at the landscape scale, which is a key step in measuring biomass and carbon and in tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often applied, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of pixel-level classification of tree species from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy for six species classes is about 75%.

1.

Introduction

Mapping tree species by remote sensing techniques has been useful in understanding the role of plant species at the landscape scale. Landscape-scale species distributions can be used to help determine classification of land use/land cover, plant response to climate change, detection of invasive species, patterns of plant competition, and spatial distributions of fire fuel loads, among other applications.1 Mapping individual species has been enabled by advancements in remote sensing technologies, such as hyperspectral imagery and light detection and ranging (LiDAR).

Along with the development of remote sensing devices with the spatial and spectral resolution to map species, there has been a wave of development in statistical and algorithmic approaches to mapping species. We provide a brief review here and highlight that atmospheric correction has been relegated to a mere preprocessing step, although it may have a direct impact on classification accuracy. Colgan et al.2 used a two-stage support vector machine (SVM) at both the pixel level and crown level for tree species classification, in which LiDAR measurements were used for crown segmentation. Féret and Asner3 studied the accuracy of various parametric/nonparametric supervised classification techniques and observed that there is a clear advantage in using regularized discriminant analysis, linear discriminant analysis, and SVM. There have been other tree species classification efforts, such as those of Dalponte et al.,4 Féret and Asner,5 Ghosh et al.,6 Immitzer et al.,7 Naidoo et al.,8 and Ustin et al.,9 that share the same approach with minor variations.

Féret and Asner3 compared the classification performance of different hyper- and multispectral sensors, particularly Carnegie Airborne Observatory’s (CAO) hyperspectral alpha system,10 WorldView-2, and QuickBird. By convolving 72 hyperspectral bands to eight and four multispectral channels available in the WorldView-2 and QuickBird satellite sensors, respectively, they observed that WorldView-2 produced more accurate classification results than either QuickBird or CAO. Clark et al.11 compared leaf, pixel, and crown level measurements to identify important wavelength regions for species discrimination. Although optimal regions of the spectrum for species discrimination varied with scale, near-infrared (NIR) bands were consistently important regions across all scales. Bands in the visible region and shortwave infrared were more important than other bands at pixel and crown scales. Clark and Roberts12 performed their analysis on higher level data products, such as vegetation indexes, signal derivatives, and signal intensities among others for classification.

Baldeck and Asner13 tried to measure beta diversity (turnover of species assemblages between sampling units14) of different regions using distance measures, such as Euclidean distance and K-means clustering in unsupervised models. Use of these clustering techniques provides a quick assessment of beta diversity, thereby avoiding costly and time-consuming field data collection. However, about 50% of pixels could not be identified to species and were classified as "other," signaling the need for improvements in techniques; a similar argument holds in Baldeck et al.'s later work.15 This suggests that there is a certain level of species classification accuracy beyond which state-of-the-art classification techniques cannot discriminate species: some species are either too rare or too spectrally similar to other species to be individually distinguished with spectral data alone.

For all of these studies, preprocessing of hyperspectral imagery was required, which can potentially impact the results. The impact of the atmosphere is variable in space and time and usually requires correction for quantitative remote sensing applications. Several researchers have investigated the effects of different atmospheric correction methods for Landsat, QuickBird, and imaging spectrometers on generic land cover detection applications, as we elaborate below, but there is little work addressing the impact of atmospheric correction on plant species classification. Some approaches to atmospheric correction include scene-derived adjustments, in which in-scene statistics are used, such as the darkest pixel method,16 or purely empirical methods, where ground-recorded spectral data are required, e.g., the empirical line method.17 Some involve radiative transfer models, such as the 6S code18 and moderate spectral resolution atmospheric transmittance (MODTRAN),19 while others, such as atmospheric correction (ATCOR)20 and fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH),21,22 add in situ spectral data to the model.23 If ground data are not available, radiative transfer models provide a cost- and time-effective solution for atmospheric correction; however, when field data are available, ATCOR and FLAASH provide added value by allowing the performance to be monitored and the models to be fine-tuned using the field data, hence our focus on ATCOR and FLAASH. Manakos et al.23 compared the effect of ATCOR and FLAASH atmospheric correction on land cover types on the island of Crete using WorldView-2 satellite data, where FLAASH outperformed ATCOR. In particular, ATCOR produced consistently low corrected reflectance values for all targets and all bands.23 Here, we compare ATCOR versus FLAASH but in the context of tree species classification.
The choice of atmospheric correction may be important in species classification because if atmospheric effects that obscure the land surface signal are not properly removed, the separability of different plant species may be diminished. This is particularly important if there is spatial variation in atmospheric effects across the scene to which the species classification will be applied. Also, because sunlight penetrates plants (and their stacks of leaves) very differently than nonliving materials (asphalt, gravel, etc.), a wider range of the signal spectrum is vulnerable to atmospheric effects; this has not been the focus of previous papers.

The data for this study were provided by the National Ecological Observatory Network (NEON) for Ordway-Swisher Biological Station (OSBS) in north-central Florida, United States. NEON includes 60 local sites in different ecological domains across the United States.24 Starting in 2017, the NEON remote-sensing airborne observation platform (AOP), carrying hyperspectral and LiDAR instruments with meter/submeter resolution, will collect images of each site annually for 30 years, thus generating a large amount of data that can be used to map plant species and track their distributions through time. This paper uses prototype image data from a pilot study conducted at Ordway-Swisher in advance of the full implementation of NEON in 2017.

The contributions of this paper are as follows. We performed tree species classification using SVM and studied the impact of different atmospheric correction techniques for classification accuracy. We focused on the two commonly used atmospheric correction techniques (ATCOR and FLAASH) that NEON applied to the hyperspectral data collected in 2010. We also explore the use of Gaussian filters for denoising reflectance values. Finally, we examine the use of filters to remove pixels with low vegetation [normalized difference vegetation index (NDVI)] or high shade (NIR), which has been used in a number of species classification studies.2,8 With this work, we hope to provide guidance for processing the large set of NEON images that will become available starting in 2017 that can be used for species classification.

2.

Data Collection

OSBS covers 37 km2 in Putnam County in north-central Florida and is managed jointly by the University of Florida and The Nature Conservancy (Fig. 1). OSBS features diverse natural forests and small pine plantations and has a 75-year history of low human impact. The major plant communities in OSBS, as defined by the Florida Natural Areas Inventory, are sandhill, xeric hammock, upland mixed forest, baygalls, basin swamp, basin marsh, marsh lake, clastic upland lake, and sandhill upland lakes.26 The sandhill community is managed using prescribed burning on a scheduled 3-year rotation. The ground sampling for this research focused on a sandhill ecosystem dominated by longleaf pine (Pinus palustris) and turkey oak (Quercus laevis).26,27

Fig. 1

Location of Ordway-Swisher biological station (OSBS).25


The instrumentation slated for deployment on the NEON AOP remote-sensing payloads in 2017 was not yet available at the time of this study, so the airborne spectroscopic and LiDAR measurements were performed using existing systems. It is important to note that the actual AOP sensors for NEON starting in 2017 will have better performance, improved conformance of hyperspectral/LiDAR integration, and better spatial resolution than the sensors used in this study.

The airborne visible/infrared imaging spectrometer (AVIRIS), operated by the Jet Propulsion Laboratory (JPL) onboard a Twin Otter DeHavilland DHC-6-300 aircraft, was used to collect data. Images were collected on two separate days over OSBS: the morning of September 4, 2010 and midday of September 10, 2010 (Fig. 2). Both flights were conducted at an approximate altitude of 4000 m AGL at 90 knots with a zenith angle of 180.0 and an azimuth angle of 0.0 [speed over ground (SOG) 65 to 91 knots; NASA JPL AVIRIS flight details for both days28,29]. The atmospheric conditions were mostly clear, with some haze on September 4 and some puffy clouds on September 10. Depending on the flight line, pixel sizes ranged from 3.3 to 3.6 m. Hyperspectral data were atmospherically corrected using the FLAASH and ATCOR algorithms. Altogether, eight flight lines and 224 bands were recorded, with wavelengths from 365.93 to 2496.24 nm.

Fig. 2

JPL AVIRIS flights over OSBS:27 (a) Flights ground tracks, (b) hyperspectral true-color mosaic, morning September 4, 2010, and (c) hyperspectral true-color mosaic, midday September 10, 2010.


Atmospheric characterization relied on measurements of a CIMEL sun photometer in coordination with the NASA AErosol RObotic NETwork (AERONET).30 Measurements were collected on September 4, 2010, and the derived atmospheric information was used to improve the atmospheric correction of the AVIRIS spectrometer data. Detailed measurements, such as aerosol optical thickness, water vapor, etc., are available online.31 NEON personnel performed the orthorectification and atmospheric correction of the radiance/reflectance values and, after extensive studies and ground measurements, opted for ATCOR and FLAASH (hence our focus on these two algorithms).

2.1.

Field Data

Field identification and mapping of tree species was performed on February 28, 2014. A laptop preloaded with the georeferenced images was used in conjunction with a professional grade GPS. Twenty-nine tree crown polygons covering 1269 pixels were mapped and identified to species or genus in the field. Later, we extracted the pixels of each polygon from the AVIRIS image strip from the morning of September 4, 2010. This flight path was used because it contained the fewest clouds over the trees we identified and mapped in the field. Some species had many pixels (e.g., 334 for turkey oak), while others had fewer (e.g., 81 for laurel oak). This bias in population size affects classification accuracy.32 The unidentified oak or pine categories could be classified to a genus (Quercus or Pinus) but not confidently identified to species, although these crowns are likely one or several of the pine or oak species listed in Table 1.

Table 1

Field data specifications.

Common name | Scientific name | Number of polygons | Number of pixels
Laurel oak | Quercus hemisphaerica | 5 | 81
Longleaf pine | Pinus palustris | 13 | 307
Oak (unknown) | Quercus | 5 | 121
Pine (unknown) | Pinus | 10 | 275
Sand live oak | Quercus geminata | 5 | 151
Turkey oak | Quercus laevis | 14 | 334

3.

Species Classification

For many pixels in the image, we cannot expect a pure signal of a single species due to the large size of each pixel (3.3 to 3.6 m on a side). Instead, many pixels are linear or nonlinear mixtures of the endmembers present in each pixel (e.g., multiple canopy crowns, soil, understory vegetation, shadow, etc.). The main species in this study, turkey oak and longleaf pine, have sparse canopies, which increase the amount of soil and branch material visible in the pixels.

3.1.

Atmospheric Correction

Ground spectral measurements for use in the atmospheric correction were collected concurrently with flight operations. Unlike many other atmospheric correction algorithms that interpolate radiation transfer properties from a precalculated database of modeling results (such as GENLN233 and the DART radiative transfer model34), FLAASH incorporates the MODTRAN radiation transfer code35 together with the specified atmosphere and aerosol types, e.g., HITRAN-96's water line parameters,36 extinction coefficients for continuous and quasicontinuous molecular absorptions, such as the H2O37 and N2 continua, CFC and HNO3 vibrational bands, and electronic transitions of O2 and O3.38 A unique MODTRAN solution is computed for each image. FLAASH (implementation by Exelis VIS Inc., Boulder, Colorado, United States) also includes the following features: correction for pixel mixing due to scattering of surface-reflected radiance, computation of a scene-average visibility (aerosol/haze amount), handling of stressing atmospheric conditions (e.g., clouds), cirrus and opaque cloud mapping, and spectral polishing for artifact suppression. FLAASH starts from a standard equation for the spectral radiance at a sensor pixel, L, that applies to the solar wavelength range (thermal emission is neglected) and flat, Lambertian materials or their equivalents, as follows:21

Eq. (1)

$$L_e = \frac{(A + B)\,\rho_e}{1 - \rho_e S} + L_a,$$
where ρe is an average surface reflectance for the pixel and a surrounding region, S is the spherical albedo of the atmosphere, La is the radiance back scattered by the atmosphere, and A and B are coefficients that depend on atmospheric and geometric conditions but not on the surface. Solving for surface reflectance ρe, we have

Eq. (2)

$$\rho_e = \frac{L_e - L_a}{A + B + S\,(L_e - L_a)}.$$
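The step from Eq. (1) to Eq. (2) is a short rearrangement; for completeness:

```latex
% Starting from Eq. (1), L_e = (A+B)\rho_e / (1 - \rho_e S) + L_a,
% multiply through by the denominator and isolate \rho_e:
\begin{align*}
(L_e - L_a)\,(1 - \rho_e S) &= (A + B)\,\rho_e \\
L_e - L_a &= \rho_e \left[(A + B) + S\,(L_e - L_a)\right] \\
\rho_e &= \frac{L_e - L_a}{A + B + S\,(L_e - L_a)}
\end{align*}
```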

ATCOR (ATCOR 4, implementation by ReSe Applications Schläpfer, University of Zürich, Switzerland) has the following features: the capability to be combined with geometric information on terrain, a lookup table of a wide range of precalculated radiative transfer runs for different weather conditions and sun angles employing MODTRAN, incorporation of spatially varying aerosol conditions, and statistical haze removal that masks haze and cloud regions and removes haze over land areas. It also accounts for deshadowing of cloud/building cast shadow areas, cirrus cloud removal, BRDF correction of irradiance effects, evaluation of atmospheric parameters (aerosol type, visibility, water vapor) by comparing retrieved reflectance with library spectra, and inclusion of a solar reference spectrum. ATCOR performs atmospheric correction for the surface reflectance, ρ, disregarding the adjacency component, as follows:39

Eq. (3)

$$\rho = \frac{\pi\left[d^2\,(c_0 + c_1\,\mathrm{DN}) - L_{\mathrm{path}}\right]}{\tau\,E_g},$$
where τ is the atmospheric (direct or beam) transmittance for a vertical path through the atmosphere, d is the Earth–Sun distance in astronomical units, c0, c1, and DN are the radiometric calibration offset, gain, and digital number, respectively, Lpath is the path radiance, ρ is the surface reflectance, and Eg is the global flux on the ground.
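Equation (3) is a simple per-pixel formula. A minimal sketch, with all numeric inputs as illustrative placeholders rather than calibrated ATCOR values:

```python
import math

def atcor_reflectance(dn, c0, c1, l_path, tau, e_g, d=1.0):
    """Surface reflectance via Eq. (3): rho = pi * (d^2 * (c0 + c1*DN) - L_path) / (tau * E_g).

    dn      -- raw digital number
    c0, c1  -- radiometric calibration offset and gain
    l_path  -- path radiance
    tau     -- atmospheric (direct) transmittance for a vertical path
    e_g     -- global flux on the ground
    d       -- Earth-Sun distance in astronomical units
    """
    at_sensor_radiance = d ** 2 * (c0 + c1 * dn)
    return math.pi * (at_sensor_radiance - l_path) / (tau * e_g)

# Sanity check: with zero path radiance, unit transmittance, and E_g = pi,
# the formula collapses to the calibrated radiance itself.
print(atcor_reflectance(dn=100, c0=0.0, c1=0.01, l_path=0.0, tau=1.0, e_g=math.pi))
```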

3.1.1.

Ground validation capabilities

To ensure cross-site calibration, NEON has funded the development of the NEON Imaging Spectrometer Design Verification Unit at NASA JPL.40 An onboard calibrator, in conjunction with laboratory calibration and ground-based characterization methods, provided the means for verification and comparison of multiple calibration techniques. Ground-truth measurements consisting of surface reflectance and atmospheric properties have been used to predict the at-sensor radiance, which provided an operational calibration of the imaging spectrometer.41 Molecular and aerosol components of the atmosphere attenuate and scatter light with strong spectral dependencies; therefore, a multispectral solar radiometer, as part of AERONET,42 has been used to take solar irradiance and sky radiance measurements before, during, and after sensor acquisition of the test site to derive the spectral aerosol optical depth, Ångström parameter, column water vapor, and many other atmospheric properties. The columnar ozone amount has been determined using the Ozone Monitoring Instrument.

Geometries of the sensor and sun at the time the sensor measured the test site have also been included in the input. The midlatitude summer atmospheric model in MODTRAN has been used, which defines atmospheric profiles for H2O, O3, N2O, CO2, and CH4 that are appropriate for the altitude, pressure, and column ozone provided to the model. The CO2 mixing ratio was set to 365 ppm; the aerosol optical depth at 550 nm and the Ångström parameter are provided to define the aerosol spectral extinction. The model has been set to assume a Lambertian surface with the spectral reflectance of the test site, which was measured close in time to the sensor acquisition. To account for boundary areas, an additional reflectance spectrum representing the area surrounding the test site has been defined as an additional radiative transfer constraint. Reflectance and atmospheric properties were also characterized for one Landsat 5 TM and four AVIRIS overpasses. Brief information on some of the measurements is provided in Table 2, and more information on NEON's 2010 OSBS campaign is available in Refs. 40 and 43.

Table 2

Summary of sensor overpass parameters used as input to radiative transfer code for at-sensor radiance prediction.40 All measurements occurred in 2010, and time is in UTC.

Sensor overpass | Landsat 5 TM | AVIRIS #4 | AVIRIS #5 | AVIRIS #5 | AVIRIS #9
Test site | Asphalt | Vegetated | Vegetated | 48% tarpaulin | Asphalt
Site center | 29.695, −82.261 | 29.689, −81.994 | 29.689, −81.994 | 29.689, −81.994 | 29.695, −82.261
Overpass time | September 2, 15:51 | September 4, 14:01 | September 4, 14:11 | September 4, 14:11 | September 4, 14:46
Ground ref time | September 2, 15:42 to 15:57 | September 4, 13:59 to 14:14 | September 4, 14:19 to 14:32 | September 3, 20:39 to 20:43 | September 3, 20:23 to 20:36
Sensor azimuth | 103.00 | 287.01 | 101.13 | 160.13 | 48.31
Sensor zenith | 2.37 | 15.69 | 8.73 | 4.18 | 3.46
Solar azimuth | 128.93 | 104.19 | 105.98 | 105.96 | 112.28
Solar zenith | 31.62 | 53.28 | 51.04 | 51.07 | 44.22
Altitude (m) | space | 3994.3 | 4039.3 | 4045.2 | 4107.4
Ångström | 1.599 | 1.868 | 1.848 | 1.848 | 1.885
Aerosol optical depth (550 nm) | 0.1193 | 0.2262 | 0.2407 | 0.2407 | 0.2527
Water vapor (cm) | 3.03 | 4.00 | 4.04 | 4.04 | 4.04
Ozone (DU) | 291 | 285 | 285 | 285 | 276

3.2.

Signal Preprocessing

The hyperspectral images were loaded in MATLAB using an in-house upgraded version of enviread, initially developed by Dr. I. Howat at Ohio State University.44 A check for the consistency of calibration and uniformity of pixel sizes indicated a range of 3.3 to 3.6 m in pixel size due to varying flight and measurement conditions. As different flights have different altitudes, and hence different pixel resolutions, this is an essential step. Here, we define a hyperspectral image I with dimensionality (x, y, w, z), where x ∈ X = [167000, 833000] represents the range of UTM easting values, y ∈ Y = [0, 9400000] represents UTM northing values, w ∈ W = {1, …, 224} is the index of the reflectance wavelengths, and z ∈ Z = {1, …, 60} is the UTM zone of the image. Based on our observations, we take the constant ξ = 10000 as a cutoff point to avoid erroneous sensor readings. There are various types of noise in the JPL AVIRIS measurements, such as negative reflectance values (the raw hyperspectral values span the range [−32762, 32724]); one should note that reflectance is the proportion of reflected solar radiance, which should be a positive value, and in normalized form lies between zero and one. The normalization to remove noisy sensor data is as follows:

Eq. (4)

$$I_{xywz} = \begin{cases} 0 & \text{for } I_{xywz} < 0 \\ 1 & \text{for } I_{xywz} > \xi \\ I_{xywz}/\xi & \text{otherwise.} \end{cases}$$

For normalization, the negative reflectance values were set to zero and values greater than 10,000 were set to 10,000. Comparing the generated RGB wavelengths to the RGB image taken at the time of the flight, we noticed that the generated image appears much darker. To adjust the image intensity to a "true color" setting, the square root of the signal returns was used. We excluded wavelengths corresponding to strong water vapor absorption bands in the atmosphere: 1333.2 to 1482.7 nm, 1791.6 to 1967.6 nm, and 2406.9 to 2496.2 nm.
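The normalization of Eq. (4) and the water-band exclusion amount to a clamp-and-rescale plus a wavelength mask. A minimal NumPy sketch (the paper's processing was done in MATLAB; the band-to-wavelength mapping is sensor specific, so this is illustrative only):

```python
import numpy as np

XI = 10000.0  # cutoff constant from Eq. (4)

def normalize_reflectance(cube):
    """Eq. (4): clamp negatives to 0, cap values at XI, and rescale to [0, 1].

    `cube` may be any array of raw reflectance values, e.g., x * y * wavelength.
    """
    cube = np.asarray(cube, dtype=float)
    return np.clip(cube, 0.0, XI) / XI

# Strong water vapor absorption windows excluded from classification (nm).
WATER_BANDS_NM = [(1333.2, 1482.7), (1791.6, 1967.6), (2406.9, 2496.2)]

def keep_band(wavelength_nm):
    """True if a band's center wavelength falls outside all water windows."""
    return not any(lo <= wavelength_nm <= hi for lo, hi in WATER_BANDS_NM)

print(normalize_reflectance([-5, 5000, 20000]))  # negatives -> 0, overflow -> 1
print(keep_band(1000.0), keep_band(1400.0))
```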

3.2.1.

Impact of low-vegetated/shaded pixels

We tested two filters to obtain pixels that contain a greater signal of green vegetation for the canopies. An NIR filter excludes heavily shaded pixels;2 for these data, we set the threshold to 0.33. To obtain pixels with a high signal of green vegetation, an NDVI filter can be used;45 here, we set the threshold to 0.4, using the 665.6-nm band for red and the 734.1-nm band for NIR. However, in our images, we observed that preserving low NDVI/NIR pixels from ground data increased prediction accuracy. Removing low NDVI pixels from tree canopies degraded classification performance by about 10%. This is contrary to the general belief in the literature that removing pixels with a low green vegetation contribution (low NDVI) and high shading (low NIR) improves classification performance. Threshold values for NIR and NDVI are highly data dependent (based on sensor type, atmospheric correction, and signal calibration) and were chosen empirically.
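The two filters reduce to simple per-pixel tests. A sketch, with the band choices and thresholds taken from the text but, as noted above, highly data dependent:

```python
# Thresholds follow the text: 0.4 for NDVI, 0.33 for NIR reflectance.
NDVI_THRESHOLD = 0.4
NIR_THRESHOLD = 0.33

def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel (reflectances in [0, 1])."""
    if red + nir == 0:
        return 0.0
    return (nir - red) / (nir + red)

def is_green_vegetation(red, nir):
    """NDVI filter: keep pixels dominated by green vegetation."""
    return ndvi(red, nir) >= NDVI_THRESHOLD

def is_unshaded(nir):
    """NIR filter: exclude heavily shaded pixels."""
    return nir >= NIR_THRESHOLD

# A sunlit leafy pixel passes both filters; a shaded pixel fails the NIR test.
print(is_green_vegetation(0.05, 0.45), is_unshaded(0.45))  # True True
print(is_unshaded(0.10))                                   # False
```

The paper's finding is that, for training data drawn from large continuous canopies, the pixels these filters would discard still carry species signal and are worth keeping.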

3.2.2.

Gaussian filter

The sensor readings were very noisy (Fig. 3). To reduce noise, we take advantage of the abundance of bands (224 AVIRIS bands) and exploit their local aggregate information by applying a Gaussian filter, which reduces cross-band noise: signal transitions become smoother while useful features of the spectrum are preserved. The impact of this filtering on species classification accuracy is demonstrated in our results section. Taking a Gaussian window w of size N > 0, the coefficients of the window are computed as follows:

Eq. (5)

$$w(n) = e^{-\frac{1}{2}\left(\frac{\alpha n}{N/2}\right)^2},$$
where −(N − 1)/2 ≤ n ≤ (N − 1)/2 and α is inversely proportional to the standard deviation (σ) of a Gaussian random variable (σ = N/(2α)). After the Gaussian parameters are specified, convolution is performed to apply the smoothing. Convolving vectors u ∈ R^m and v ∈ R^n gives the vector w ∈ R^(m+n−1), such that

Eq. (6)

$$w(k) = \sum_j u(j)\,v(k - j + 1).$$
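Equations (5) and (6) translate directly into a few lines of NumPy. In this sketch, the window length of four matches the best model reported in the results, while α = 2.5 is an assumed default (it is MATLAB's gausswin default, not a value stated in the paper):

```python
import numpy as np

def gaussian_window(n_points, alpha=2.5):
    """Gaussian window per Eq. (5): w(n) = exp(-0.5 * (alpha * n / (N/2))**2),
    for n in [-(N-1)/2, (N-1)/2], so that sigma = N / (2 * alpha)."""
    n = np.arange(n_points) - (n_points - 1) / 2.0
    return np.exp(-0.5 * (alpha * n / (n_points / 2.0)) ** 2)

def smooth_spectrum(spectrum, n_points=4, alpha=2.5):
    """Convolve a single pixel's reflectance spectrum with a normalized
    Gaussian window (Eq. 6) to suppress interband noise."""
    w = gaussian_window(n_points, alpha)
    w /= w.sum()  # normalize so smoothing preserves the overall signal level
    return np.convolve(spectrum, w, mode="same")  # same length as the input

# A flat spectrum passes through unchanged away from the edges;
# an oscillating one is damped.
flat = smooth_spectrum(np.ones(8))
noisy = smooth_spectrum(np.array([0.30, 0.55, 0.28, 0.52, 0.31]))
print(flat)
print(noisy)
```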

Fig. 3

Use of Gaussian filter to reduce cross-band sensor noise: water absorption bands are preserved here for displaying purposes.


3.3.

Support Vector Machines

SVM often outperforms other algorithms for species classification.2,15,46 We parameterize the SVM with k-fold cross validation, where k = 5. Classifier nonlinearity comes from using the following nonlinear kernel functions:

  • Polynomial function kernel

  • Radial basis function (RBF) kernel.

For multiclass classification, (c choose 2) different binary classifiers are trained, where c is the number of classes (6 in this case). Hence, we train 15 pairwise binary classifiers at each iteration of the k-fold procedure; each classifier is trained once. Majority voting among the classifiers decides the class assignment. Classifications were done for individual pixels, not for crowns.
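The one-versus-one voting scheme can be sketched with the standard library alone. Here `pairwise_decider` is a placeholder for a trained binary SVM, not the paper's model:

```python
from itertools import combinations
from collections import Counter

def one_vs_one_predict(pixel, classes, pairwise_decider):
    """One-versus-one multiclass voting: apply one binary classifier per
    unordered class pair -- C(c, 2) of them, i.e., 15 for c = 6 -- and
    assign the class that collects the most votes.

    `pairwise_decider(pixel, a, b)` must return either a or b.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decider(pixel, a, b)] += 1
    winner, _ = votes.most_common(1)[0]
    return winner

classes = ["pine (other)", "longleaf pine", "turkey oak",
           "live oak", "oak (other)", "laurel oak"]
assert sum(1 for _ in combinations(classes, 2)) == 15  # C(6, 2) classifiers

# Toy decider: always prefers the class name that sorts first alphabetically,
# so "laurel oak" wins every pairwise contest it is part of.
toy = lambda pixel, a, b: min(a, b)
print(one_vs_one_predict(None, classes, toy))
```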

We empirically evaluated the impact of optimizing classifier parameters with regard to the FLAASH and ATCOR atmospheric corrections. We then investigated the impact of the data preprocessing filters on the performance of the species classification. A one-versus-one combinatorial multiclass k-fold (k = 5) cross-validation setup of SVM using nonlinear kernels (polynomial and radial basis) is used as the model (15 different classifiers were trained), where majority voting determines the species of a pixel. Pixels of a single canopy were used for either the training or the test set, and not both, as pixels of one canopy may contain similar information.47 A cost of C = ∞ is set for misclassification, meaning there is little to no tolerance for improperly classified samples.

Finally, the effect of atmospheric correction on prediction accuracy versus the classification model parameters is considered. C, σ, P, m, and the optimization method are the knobs of the SVM, tuned in a mixture of grid and heuristic search. Here, C stands for the cost or penalty of misclassification against the simplicity of the decision surface; σ defines how far the influence of a single training example reaches in the RBF kernel, with low values meaning "far" and high values meaning "close"; P is the polynomial degree for the polynomial kernel; m is the maximum number of iterations the optimization function may run; and the optimization method defines the selected optimizer. In this work, we set C = +∞ and m = 10,000 and use quadratic programming as the optimization method. In the following paragraphs, we evaluate the impact of P and σ, respectively.
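For reference, the two kernels can be written out in a few lines. Note that σ conventions differ: the text above reads σ inversely (large σ means narrow influence), whereas this sketch uses the common width convention k(x, y) = exp(−‖x − y‖² / (2σ²)), where a larger σ gives the broader, softer bump; the `coef0` offset in the polynomial kernel is an assumed convention, not specified by the paper:

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel, width convention: k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
    Some libraries instead parameterize by gamma = 1 / (2 sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def polynomial_kernel(x, y, degree, coef0=1.0):
    """Polynomial kernel k(x, y) = (<x, y> + coef0)**P, where degree is the knob P."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

x, y = [1.0, 0.0], [0.0, 1.0]
print(rbf_kernel(x, x, sigma=10.0))       # identical points: kernel value 1.0
print(rbf_kernel(x, y, sigma=0.1))        # narrow width: near-zero influence
print(polynomial_kernel(x, y, degree=2))  # (0 + 1)**2 = 1.0
```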

4.

Results and Discussion

Through extensive analysis, we evaluate the impact of atmospheric correction (FLAASH and ATCOR) on species classification accuracy. Accuracy differed depending on the atmospheric correction algorithm used: FLAASH outperformed ATCOR by a margin of about 2% to 4%. First, we evaluate the impact of the Gaussian filter; applying a Gaussian filter reduces signal noise with or without the presence of water absorption bands (Fig. 4). Our point is not to reiterate that removing water absorption bands is a useful step, as this has been the de facto approach in the literature; rather, this shows that one can still improve prediction accuracy with proper Gaussian settings even in the presence of highly unusable water absorption bands.

Fig. 4

Impact of Gaussian window on prediction accuracy: (a) before removing water absorption bands and (b) after removing water absorption bands.


Removing low NDVI-NIR pixels from the data set caused a general degradation of performance compared to using all pixels (Fig. 5). In the FLAASH data set, the accuracy is about 70%, and ATCOR yields 66% accuracy. The species in this area have crowns with low leaf density and complex surfaces, such that individual pixels within a crown may have low "greenness" due to wood or soil reflectance through the canopy or canopy shadows from a complex surface. The filters may be removing an excessive number of pixels in this instance. The degradation using the NDVI-NIR filter can also be due to the fact that pixel size is large (3 m) and canopies are large (each canopy is 6 to 107 pixels). This generous marking of canopies includes areas with little greenness (shadows, branches, gravel, etc.; all with low NDVI and low NIR values). Due to the mixing nature of reflectance values, even low NDVI/NIR pixels of a continuous canopy still contain signals from the underlying species, which may not be as green. Figure 5 shows that removing low NDVI/NIR pixels of a continuous canopy actually degrades the performance of the classification model by about 4%. So, in similar scenarios, where field data consist of large continuous land patches, this observation suggests preserving low NDVI/NIR pixels. It is important to realize that this is only relevant in the context of field data, as the species are already known, and is only useful in training the classifier. For global application of NDVI/NIR filters to entire flight lines, we suggest using the method in the literature, i.e., deletion of low values to remove roads, grass, etc. Here, the benefit of FLAASH over ATCOR can again be observed, on average by 2% to 3%.

Fig. 5

Classification results with the removal of low normalized difference vegetation index and near-infrared pixels.


Changing the polynomial degree P impacted prediction accuracy using a polynomial kernel in the SVM [Fig. 6(a)]. FLAASH atmospherically corrected data yield 73.5% accuracy, whereas ATCOR results in 69.8%. The simpler the polynomial, the better the performance; a more complicated polynomial leads to an overfit (high-variance) classification model, which performs poorly when evaluated on test data. For FLAASH atmospheric correction (P ≥ 3), the optimization function does not converge. On the other hand, ATCOR data perform as predicted: accuracy drops as low as 10.8% and 11.3% with polynomial degrees of 7 and 8 (due to extreme overfitting).

Fig. 6

Parameter tuning for classification algorithms: (a) tuning polynomial order for SVM with polynomial kernel function and (b) tuning σ in SVM with radial basis function kernel function.


The radial basis kernel performs better than the polynomial kernel. As shown in Fig. 6(b), the best results are achieved using FLAASH data, with a peak at 75.3%, while ATCOR comes close at 74.9%. RBF does not perform well at σ values that are either excessively low or high, where σ is the inverse of the width of the RBF kernel (roughly defining the area of influence of a support vector); in other terms, it defines the degree of influence of a single training example. The larger σ is, the closer other examples must be to be affected. Because RBF takes the data to a higher dimensionality, a large σ gives a pointed bump in the higher dimensions, and a small σ gives a softer, broader bump. Thus, neither extreme shows a good fit; a middle point of σ = 10,000 provides the best results. On the negative side, FLAASH begins with an accuracy of 0% and ATCOR at 19.9%, but both quickly attain a stable region close to each other, with FLAASH demonstrating superior performance in most cases. There is a good range of σ values ([10, 10000]) that gives a plateau in prediction accuracy, implying the RBF kernel is more robust than the polynomial kernel.

Table 3 shows the confusion matrix of the best performing classification model (75.2% accuracy), using the FLAASH correction module and SVM with RBF kernel and Gaussian window length of four. The majority of misclassifications are between pine (other) class and longleaf pine. A similar misclassification can be observed between oak (other) and other types of oak. This was expected because the “other” class contains a mixture of different species of the genera (oak or pine). On the other hand, there is minimal misclassification between the oak versus pine categories. The oak category is rarely misclassified as pine, but pines are misclassified as turkey oak and laurel oak at a low level. There is no misclassification among different oak species. Laurel oak, turkey oak, and live oak are well separable from each other, suggesting that broad-leaf species like oaks can, in general, be well separated.

Table 3

Confusion matrix for the radial basis function kernel at its near-peak accuracy of 75.2%, using the FLAASH correction module.

Known class \ Predicted class
Pine (other) | Longleaf pine | Turkey oak | Live oak | Oak (other) | Laurel oak
Pine (other)1010300023
Longleaf pine2036000460
Turkey oak106200063
Live oak000444048
Oak (other)00001910
Laurel oak1000056
32466544518210
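A confusion matrix of this form can be computed directly from paired known/predicted labels. The sketch below uses hypothetical pixel labels, not the field data behind Table 3; rows index the known class and columns the predicted class:

```python
import numpy as np

classes = ["Pine (other)", "Longleaf pine", "Turkey oak",
           "Live oak", "Oak (other)", "Laurel oak"]

def confusion_matrix(known, predicted, n_classes):
    # cm[i, j] counts pixels of known class i predicted as class j
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for k, p in zip(known, predicted):
        cm[k, p] += 1
    return cm

# Hypothetical labels for illustration: one pine (other) pixel is
# confused with longleaf pine, the rest are classified correctly.
known     = [0, 0, 1, 1, 2, 3, 4, 5]
predicted = [0, 1, 1, 1, 2, 3, 4, 5]

cm = confusion_matrix(known, predicted, len(classes))
overall_accuracy = np.trace(cm) / cm.sum()  # diagonal = correct pixels
```

Overall accuracy is the trace of the matrix divided by the total pixel count, which is how the 75.2% figure above is derived from the table.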

In our experiments, we observed a small but consistent advantage of FLAASH atmospherically corrected data over ATCOR for tree species classification. This agrees with recent observations of Manakos et al.,23 who found that FLAASH atmospheric correction outperformed ATCOR in endmember classification of land cover types in Crete.

5.

Conclusions

Identification of species using remote sensing technologies, such as hyperspectral and LiDAR sensors, has critical utility in studying the impacts of global warming, estimating biomass, and identifying invasive species, among other applications. In this paper, we report species classification using SVM applied to AVIRIS hyperspectral data available for OSBS in north-central Florida. Performance of the classifier was improved by using a Gaussian filter to denoise reflectance values. We also discuss how incorporating even low-NDVI and low-NIR pixels can help improve classification accuracy in some landscapes. Due to the mixed-pixel nature of hyperspectral remote sensing data, using such pixels can reduce the bias in the maximum-margin support vectors of the SVM. Images atmospherically corrected using the FLAASH algorithm outperformed those corrected with the ATCOR algorithm by margins of about 2% to 4%. Our classification model is robust among different oak species; however, we observe minor misclassification between pine and oak species.
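The Gaussian denoising step can be sketched as a convolution of each pixel's reflectance spectrum with a normalized Gaussian window across adjacent bands. The window length of four follows the text; the synthetic spectrum and all other values below are illustrative assumptions:

```python
import numpy as np

def gaussian_window(length, sigma=1.0):
    # Discrete Gaussian weights, normalized to sum to 1.
    x = np.arange(length) - (length - 1) / 2.0
    w = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return w / w.sum()

def smooth_spectrum(reflectance, window_length=4):
    # Smooth across neighboring bands to suppress inter-band noise.
    return np.convolve(reflectance, gaussian_window(window_length), mode="same")

# Synthetic "spectrum": a smooth signal plus band-to-band noise.
rng = np.random.default_rng(0)
bands = np.linspace(0.0, 1.0, 200)
clean = np.sin(2.0 * np.pi * bands)
noisy = clean + rng.normal(0.0, 0.1, bands.size)
smoothed = smooth_spectrum(noisy)
```

Because the window is short relative to the smooth spectral features, the filter suppresses band-to-band noise while leaving the underlying reflectance shape largely intact.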

Acknowledgments

The authors thank NEON, Inc. for providing hyperspectral data, and Ms. Leila Kalantari for assistance in collecting field data. The National Ecological Observatory Network is a project sponsored by the National Science Foundation and managed under cooperative agreement by NEON, Inc. The NEON 2010 Pathfinder data set is based on work supported by the National Science Foundation under Grant No. DBI-0752017.

References

1. 

R. Scholes and S. Archer, “Tree-grass interactions in savannas,” Annu. Rev. Ecol. Syst., 28 (1), 517 –544 (1997). http://dx.doi.org/10.1146/annurev.ecolsys.28.1.517 ARECBC 0066-4162 Google Scholar

2. 

M. S. Colgan et al., “Mapping savanna tree species at ecosystem scales using support vector machine classification and BRDF correction on airborne hyperspectral and LIDAR data,” Remote Sens., 4 (11), 3462 –3480 (2012). http://dx.doi.org/10.3390/rs4113462 RSEND3 Google Scholar

3. 

J. Féret and G. P. Asner, “Tree species discrimination in tropical forests using airborne imaging spectroscopy,” IEEE Trans. Geosci. Remote Sens., 51 (1), 73 –84 (2013). http://dx.doi.org/10.1109/TGRS.2012.2199323 Google Scholar

4. 

M. Dalponte et al., "Tree crown delineation and tree species classification in boreal forests using hyperspectral and ALS data," Remote Sens. Environ., 140 306 –317 (2014). http://dx.doi.org/10.1016/j.rse.2013.09.006 RSEEA7 0034-4257 Google Scholar

5. 

J. B. Féret and G. P. Asner, “Semi-supervised methods to identify individual crowns of lowland tropical canopy species using imaging spectroscopy and LIDAR,” Remote Sens., 4 (8), 2457 –2476 (2012). http://dx.doi.org/10.3390/rs4082457 RSEND3 Google Scholar

6. 

A. Ghosh et al., “A framework for mapping tree species combining hyperspectral and LIDAR data: role of selected classifiers and sensor across three spatial scales,” Int. J. Appl. Earth Obs. Geoinf., 26 49 –63 (2014). http://dx.doi.org/10.1016/j.jag.2013.05.017 Google Scholar

7. 

M. Immitzer, C. Atzberger and T. Koukal, “Tree species classification with random forest using very high spatial resolution 8-band WorldView-2 satellite data,” Remote Sens., 4 (9), 2661 –2693 (2012). http://dx.doi.org/10.3390/rs4092661 RSEND3 Google Scholar

8. 

L. Naidoo et al., “Classification of savanna tree species, in the Greater Kruger National Park region, by integrating hyperspectral and LIDAR data in a Random Forest data mining environment,” ISPRS J. Photogramm. Remote Sens., 69 167 –179 (2012). http://dx.doi.org/10.1016/j.isprsjprs.2012.03.005 IRSEE9 0924-2716 Google Scholar

9. 

S. L. Ustin et al., “Retrieval of foliar information about plant pigment systems from high resolution spectroscopy,” Remote Sens. Environ., 113 S67 –S77 (2009). http://dx.doi.org/10.1016/j.rse.2008.10.019 RSEEA7 0034-4257 Google Scholar

10. 

G. P. Asner et al., “Carnegie airborne observatory: in-flight fusion of hyperspectral imaging and waveform light detection and ranging for three-dimensional studies of ecosystems,” J. Appl. Remote Sens., 1 (1), 013536 (2007). http://dx.doi.org/10.1117/1.2794018 Google Scholar

11. 

M. L. Clark, D. A. Roberts and D. B. Clark, “Hyperspectral discrimination of tropical rain forest tree species at leaf to crown scales,” Remote Sens. Environ., 96 (3), 375 –398 (2005). http://dx.doi.org/10.1016/j.rse.2005.03.009 RSEEA7 0034-4257 Google Scholar

12. 

M. L. Clark and D. A. Roberts, “Species-level differences in hyperspectral metrics among tropical rainforest trees as determined by a tree-based classifier,” Remote Sens., 4 (6), 1820 –1855 (2012). http://dx.doi.org/10.3390/rs4061820 RSEND3 Google Scholar

13. 

C. A. Baldeck and G. P. Asner, “Estimating vegetation beta diversity from airborne imaging spectroscopy and unsupervised clustering,” Remote Sens., 5 (5), 2057 –2071 (2013). http://dx.doi.org/10.3390/rs5052057 RSEND3 Google Scholar

14. 

M. Kessler et al., “Alpha and beta diversity of plants and animals along a tropical land-use gradient,” Ecol. Appl., 19 (8), 2142 –2156 (2009). http://dx.doi.org/10.1890/08-1074.1 ECAPE7 1051-0761 Google Scholar

15. 

C. Baldeck et al., “Landscape-scale variation in plant community composition of an African savanna from airborne species mapping,” Ecol. Appl., 24 (1), 84 –93 (2014). http://dx.doi.org/10.1890/13-0307.1 ECAPE7 1051-0761 Google Scholar

16. 

D. G. Hadjimitsis, C. R. Clayton and A. Retalis, “On the darkest pixel atmospheric correction algorithm: a revised procedure applied over satellite remotely sensed images intended for environmental applications,” Proc. SPIE, 5239 464 –471 (2004). http://dx.doi.org/10.1117/12.511520 PSISDG 0277-786X Google Scholar

17. 

E. Karpouzli and T. Malthus, “The empirical line method for the atmospheric correction of IKONOS imagery,” Int. J. Remote Sens., 24 (5), 1143 –1150 (2003). http://dx.doi.org/10.1080/0143116021000026779 IJSEDK 0143-1161 Google Scholar

18. 

E. F. Vermote et al., “Second simulation of the satellite signal in the solar spectrum, 6S: an overview,” IEEE Trans. Geosci. Remote Sens., 35 (3), 675 –686 (1997). http://dx.doi.org/10.1109/36.581987 Google Scholar

19. 

A. Berk, L. S. Bernstein and D. C. Robertson, “MODTRAN: a moderate resolution model for LOWTRAN,” (1987). Google Scholar

20. 

R. Richter and D. Schläpfer, “Atmospheric/topographic correction for satellite imagery,” (2005). Google Scholar

21. 

S. M. Adler-Golden et al., “Atmospheric correction for shortwave spectral imagery based on MODTRAN4,” Proc. SPIE, 3753 61 –69 (1999). http://dx.doi.org/10.1117/12.366315 PSISDG 0277-786X Google Scholar

22. 

S. Adler-Golden et al., “FLAASH, a MODTRAN4 atmospheric correction package for hyperspectral data retrievals and simulations,” in Proc. 7th Annual JPL Airborne Earth Science Workshop, 97 –21 (1998). Google Scholar

23. 

I. Manakos et al., "Comparison between atmospheric correction modules on the basis of WorldView-2 imagery and in situ spectroradiometric measurements," in 7th EARSeL SIG Imaging Spectroscopy Workshop, 11 –13 (2011). Google Scholar

24. 

M. Keller et al., “A continental strategy for the National Ecological Observatory Network,” Front. Ecol. Environ., 6 (5), 282 –284 (2008). http://dx.doi.org/10.1890/1540-9295(2008)6[282:ACSFTN]2.0.CO;2 Google Scholar

25. 

University of Florida, "Ordway-Swisher Biological Station," http://ordway-swisher.ufl.edu Google Scholar

26. 

T. Kampe et al., "The NEON 2010 airborne pathfinder campaign in Florida," (2010). Google Scholar

27. 

K. Krause and M. Kuester, "Airborne observation platform (AOP) pathfinder 2010 data release," (May 2014). http://neoninc.org/pds/files/NEON.AOP.015068.pdf Google Scholar

30. 

NASA Aerosol Robotic Network, “Aerosol robotic network (AERONET),” http://aeronet.gsfc.nasa.gov/ Google Scholar

31. 

D. Giles, “AERONET data synergy tool,” http://aeronet.gsfc.nasa.gov/cgi-bin/bamgomas_interactive Google Scholar

32. 

N. Japkowicz and S. Stephen, “The class imbalance problem: a systematic study,” Intell. Data Anal., 6 (5), 429 –449 (2002). Google Scholar

33. 

D. P. Edwards, “GENLN2: a general line-by-line atmospheric transmittance and radiance model. Version 3.0: description and users guide,” 147 Colorado (1992). Google Scholar

34. 

J. P. Gastellu-Etchegorry et al., “Modeling radiative transfer in heterogeneous 3-D vegetation canopies,” Remote Sens. Environ., 58 (2), 131 –156 (1996). http://dx.doi.org/10.1016/0034-4257(95)00253-7 RSEEA7 0034-4257 Google Scholar

35. 

A. Berk et al., “MODTRAN cloud and multiple scattering upgrades with application to AVIRIS,” Remote Sens. Environ., 65 (3), 367 –375 (1998). http://dx.doi.org/10.1016/S0034-4257(98)00045-5 RSEEA7 0034-4257 Google Scholar

36. 

M. W. Matthew et al., “Status of atmospheric correction using a MODTRAN4-based algorithm,” Proc. SPIE, 4049 199 –207 (2000). http://dx.doi.org/10.1117/12.410341 PSISDG 0277-786X Google Scholar

37. 

S. Clough, F. Kneizys and R. Davies, “Line shape and the water vapor continuum,” Atmos. Res., 23 (3), 229 –241 (1989). http://dx.doi.org/10.1016/0169-8095(89)90020-3 ATREEW 0169-8095 Google Scholar

38. 

G. P. Anderson et al., "UV spectral simulations using LOWTRAN 7," AGARD, Atmospheric Propagation in the UV, Visible, IR, and MM-Wave Region and Related Systems Aspects (1990). Google Scholar

39. 

R. Richter and D. Schläpfer, "Atmospheric/topographic correction for satellite imagery," ATCOR-2/3 User Guide, Version 7.0, DLR German Aerospace Center, Wessling, Germany (2014). Google Scholar

40. 

J. McCorkel et al., “NEON ground validation capabilities for airborne and space-based imagers,” Proc. SPIE, 8153 81530Z (2011). http://dx.doi.org/10.1117/12.894370 PSISDG 0277-786X Google Scholar

41. 

K. Thome, “Absolute radiometric calibration of Landsat 7 ETM+ using the reflectance-based method,” Remote Sens. Environ., 78 (1), 27 –38 (2001). http://dx.doi.org/10.1016/S0034-4257(01)00247-4 RSEEA7 0034-4257 Google Scholar

42. 

B. Holben et al., "AERONET, a federated instrument network and data archive for aerosol characterization," Remote Sens. Environ., 66 (1), 1 –16 (1998). http://dx.doi.org/10.1016/S0034-4257(98)00031-5 RSEEA7 0034-4257 Google Scholar

43. 

K. S. Krause et al., “Early algorithm development efforts for the National Ecological Observatory Network Airborne Observation Platform imaging spectrometer and waveform LIDAR instruments,” Proc. SPIE, 8158 81580D (2011). http://dx.doi.org/10.1117/12.894178 PSISDG 0277-786X Google Scholar

45. 

J. Rouse Jr. et al., "Monitoring vegetation systems in the Great Plains with ERTS," NASA Special Publication, 351 309 (1974). NSSPAW 0565-7075 Google Scholar

46. 

M. A. Cho et al., “Mapping tree species composition in South African savannas using an integrated airborne spectral and LIDAR system,” Remote Sens. Environ., 125 214 –226 (2012). http://dx.doi.org/10.1016/j.rse.2012.07.010 RSEEA7 0034-4257 Google Scholar

47. 

C. A. Baldeck and G. P. Asner, “Improving remote species identification through efficient training data collection,” Remote Sens., 6 (4), 2682 –2698 (2014). http://dx.doi.org/10.3390/rs6042682 RSEND3 Google Scholar

Biography

Morteza Shahriari Nia is a PhD candidate at the University of Florida. He received his BSc from Tarbiat Moallem University and his MSc from Tarbiat Modares University, Tehran, Iran. He has also been a research assistant at the University of Texas at San Antonio. His research interests include probabilistic knowledge bases over web-scale text corpora and applied deep learning.

Daisy Zhe Wang is an assistant professor in the CISE department at the University of Florida. She is the director of the UF Data Science Research Lab. She obtained her PhD from the EECS department at UC Berkeley. She currently pursues research topics such as knowledge bases, knowledge-based inference, and crowd-assisted machine learning. She received the Google Faculty Award in 2014. Her research is also sponsored by NSF and DARPA.

Stephanie Ann Bohlman is an assistant professor in the School of Forest Resources and Conservation at the University of Florida. She graduated from New College in Sarasota, Florida, and received MS and PhD degrees from the University of Washington. Her research interests include remote sensing of landscape-level forest composition, structure, and function.

Paul Gader received his PhD from the University of Florida in 1986. He was a researcher in industry and a faculty member at the University of Wisconsin Oshkosh and the Universities of Missouri and Florida, where he was chair of CISE from 2012 to 2015. His first image processing research was in 1984. He has researched parallel signal processing, fuzzy sets, Bayesian methods, handwriting recognition, landmine detection, and hyperspectral analysis. He is an IEEE fellow and UF Research Foundation professor.

Sarah J. Graves is a PhD student in Forest Resources and Conservation at the University of Florida, where she is using airborne hyperspectral data to study tree growth and canopy traits at sites in the tropics and throughout the eastern United States. She received her BS degree in environmental science from the University of Minnesota, a professional degree in GIS from the University of Wisconsin-Madison, and an MS degree from the University of Florida in 2014.

Milenko Petrovic is a research scientist at the Institute for Human and Machine Cognition in Ocala, Florida. He received his PhD degree in computer engineering from the University of Toronto in 2007. He is interested in the design and development of big data machine learning systems and interactive data analytics systems, which have been applied to a variety of problems including ecology, transportation network optimization, sentiment analysis, and information extraction.

© 2015 Society of Photo-Optical Instrumentation Engineers (SPIE). 1931-3195/2015/$25.00.
Morteza Shahriari Nia, Daisy Zhe Wang, Stephanie Ann Bohlman, Paul D. Gader, Sarah J. Graves, and Milenko Petrovic "Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine," Journal of Applied Remote Sensing 9(1), 095990 (5 November 2015). https://doi.org/10.1117/1.JRS.9.095990
Published: 5 November 2015
KEYWORDS
Atmospheric corrections, Image classification, Hyperspectral imaging, Reflectivity, Image filtering, Sensors, Atmospheric modeling

