16 August 2024 Building extraction method in complex scenes of remote sensing based on transformer
Yuqi Zeng, Wenzao Shi, Yuchen Zheng, Jiewei Wu
Proceedings Volume 13230, Third International Conference on Machine Vision, Automatic Identification, and Detection (MVAID 2024); 132300P (2024) https://doi.org/10.1117/12.3036564
Event: Third International Conference on Machine Vision, Automatic Identification and Detection, 2024, Kunming, China
Abstract
With the rapid development of technology, remote sensing image acquisition has been widely applied in many fields. However, accurately extracting building information from massive remote sensing data in complex scenes remains a major challenge. This paper proposes a Transformer-based method for extracting buildings from remote sensing images of complex scenes. Using transfer learning, the ViT weights pre-trained by Google serve as the pre-trained model for the automatic building extraction algorithm in this paper, and a remote sensing dataset is then trained on top of this pre-trained model. Using a model that combines the Transformer with U-Net, the prepared dataset and network structure are adjusted and improved to suit the characteristics of remote sensing image samples. The experimental results show that TransUNet can better segment and preserve detailed shape information, benefiting from both high-dimensional global contextual information and low-dimensional details. This paper uses TransUNet to achieve the first application of the Transformer to building extraction from remote sensing images. The model not only encodes the global and contextual information of the image as a sequence, but also makes effective use of the low-dimensional features of CNNs through a U-shaped hybrid structure design.
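The idea of encoding an image as a sequence can be illustrated with a minimal sketch. The abstract does not state a patch size or any implementation details, so the function names `image_to_patch_sequence` and `sequence_to_grid`, the single-channel input, and the 2 x 2 patch size below are all assumptions made for illustration. The sketch shows only the tokenization step used by ViT-style encoders and the reshaping that would let a convolutional decoder upsample the result; it is not the authors' implementation.

```python
from typing import List

Grid = List[List[float]]  # single-channel image (or feature map) as rows of values


def image_to_patch_sequence(image: Grid, patch: int) -> List[List[float]]:
    """Split an H x W grid into non-overlapping patch x patch tiles and
    flatten each tile into one token, in row-major tile order.

    Hypothetical helper: the patch size is an assumption, not from the paper.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one tile row by row into a single token vector.
            token = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            tokens.append(token)
    return tokens


def sequence_to_grid(tokens: List[List[float]], grid_h: int, grid_w: int):
    """Arrange a token sequence back into a grid_h x grid_w layout of tokens,
    the spatial form a U-shaped decoder would need before upsampling."""
    if len(tokens) != grid_h * grid_w:
        raise ValueError("token count does not match the requested grid")
    return [tokens[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


if __name__ == "__main__":
    # A 4 x 4 toy image whose pixel value equals its row-major index.
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    tokens = image_to_patch_sequence(image, patch=2)
    print(len(tokens), len(tokens[0]))  # number of tokens, token length
    print(sequence_to_grid(tokens, 2, 2)[0][0])  # first token, back in grid form
```

In a full hybrid model, a Transformer would process the token sequence between these two steps, and the decoder would combine the reshaped output with earlier CNN features through skip connections; both parts are omitted here.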
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yuqi Zeng, Wenzao Shi, Yuchen Zheng, and Jiewei Wu "Building extraction method in complex scenes of remote sensing based on transformer", Proc. SPIE 13230, Third International Conference on Machine Vision, Automatic Identification, and Detection (MVAID 2024), 132300P (16 August 2024); https://doi.org/10.1117/12.3036564
KEYWORDS
Image segmentation

Remote sensing

Image processing

Convolutional neural networks

Deep learning
