Unmanned aerial vehicles (UAVs) can overcome several limitations of satellite and aerial platforms through their frequent revisit capability. However, UAVs typically capture small, simple regions of a larger scene, acquiring high-resolution images from various viewing angles and altitudes. Combining multiple datasets created in different regions and under different conditions can expand the available training data and improve the usability of UAV datasets for deep learning. The combined segmentation network (CSN), which can train on two datasets simultaneously by sharing encoding blocks, was used to segment heterogeneous UAV datasets, namely UAVid and the Semantic Drone dataset. CSN shares encoding blocks to learn general features from both datasets, while its decoding blocks are trained separately on each dataset. In the preprocessing step, the classes of each dataset were adjusted to minimize the differences between the two datasets. Experimental results show that CSN segments specific classes, such as background and vegetation, which have low ratios in a single dataset, more accurately. This study presented the potential of integrating heterogeneous UAV imagery datasets by learning shared layers. Thus, surface inspection could be conducted effectively using UAV datasets.
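The class-adjustment preprocessing described above can be sketched as a simple label remapping. This is an illustrative assumption, not the paper's actual class tables: the label IDs and the mapping below are hypothetical, showing only the mechanism of collapsing dataset-specific classes onto a shared set.

```python
import numpy as np

# Hypothetical mapping (for illustration only): collapse UAVid-style class IDs
# onto a shared set {0: background, 1: building, 2: vegetation, 3: road}.
UAVID_TO_SHARED = {0: 0, 1: 1, 2: 3, 3: 3, 4: 2, 5: 2, 6: 0, 7: 0}

def remap_labels(mask: np.ndarray, mapping: dict) -> np.ndarray:
    """Apply an integer class mapping to a label mask via a lookup table."""
    lut = np.arange(max(mapping) + 1)
    for src, dst in mapping.items():
        lut[src] = dst
    # Fancy indexing applies the lookup table element-wise to the mask.
    return lut[mask]

mask = np.array([[0, 4], [2, 7]])
print(remap_labels(mask, UAVID_TO_SHARED))  # [[0 2] [3 0]]
```

A lookup table keeps the remapping vectorized, which matters when masks are full-resolution UAV frames rather than toy arrays.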
Semantic segmentation of urban areas can provide useful information for analyzing and detecting changes in urban development. Recently, numerous remote sensing image datasets from various platforms have been acquired, and many semantic segmentation studies have been conducted using them. However, these datasets contain relatively few images because of their large data volume and the difficulty of constructing label data. Furthermore, it is difficult to use them simultaneously because each dataset has a different spatial resolution, shooting angle, and set of meaningful objects. In this study, two different UAV image datasets, the UAVid semantic segmentation dataset and the Semantic Drone dataset, were used to train a combined U-net model, enabling heterogeneous remote sensing datasets to be used simultaneously for semantic segmentation tasks. The UAVid dataset has a flight height of 50 m and 300 images with eight classes. In contrast, the Semantic Drone dataset was acquired at an altitude of 5–30 m above the ground and contains 598 images with 20 classes. The combined U-net model is based on the U-net architecture, but it receives input from two different data sources. The experimental results showed that training on both datasets with the combined U-net improved semantic segmentation accuracy compared with training a separate U-net on each dataset. This study confirms the ability to train on two different datasets acquired from different places and platforms simultaneously, thus demonstrating the applicability of heterogeneous remote sensing datasets to semantic segmentation studies.
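The shared-encoder, dataset-specific-decoder design described in both abstracts can be sketched as follows. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the layer widths and two-level depth are placeholders, and only the structural idea (one encoder shared by both datasets, one decoder head per dataset's class set) reflects the text.

```python
import torch
import torch.nn as nn

class CombinedSegNet(nn.Module):
    """Shared encoder with one decoder per dataset (illustrative sketch)."""

    def __init__(self, n_classes_a: int, n_classes_b: int):
        super().__init__()
        # Shared encoding blocks learn features common to both datasets.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )

        # Separate decoding blocks specialize to each dataset's class set.
        def make_decoder(n_classes: int) -> nn.Sequential:
            return nn.Sequential(
                nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                nn.Conv2d(16, n_classes, 1),
            )

        self.decoder_a = make_decoder(n_classes_a)
        self.decoder_b = make_decoder(n_classes_b)

    def forward(self, x: torch.Tensor, source: str) -> torch.Tensor:
        feats = self.encoder(x)  # gradients here flow from both datasets
        return self.decoder_a(feats) if source == "a" else self.decoder_b(feats)

# Eight classes for UAVid-style input, 20 for Semantic-Drone-style input.
model = CombinedSegNet(n_classes_a=8, n_classes_b=20)
out_a = model(torch.randn(1, 3, 64, 64), source="a")
out_b = model(torch.randn(1, 3, 64, 64), source="b")
```

In training, batches from the two datasets would alternate, so the shared encoder receives gradients from both losses while each decoder only updates on its own dataset.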