Change detection (CD) is the quantitative analysis of surface changes in a phenomenon or object between two points in time. Deep-learning-based CD has recently become increasingly powerful, and convolutional neural networks (CNNs) have dominated remote sensing (RS) CD. In particular, networks based on U-Net and skip connections are widely used across many fields of computer vision. However, despite their excellent performance, CNNs do not learn global and long-range semantic interactions well because convolution operations are local. Swin-UNet, a U-Net-like pure transformer recently proposed for medical image segmentation, achieved excellent results, and the Swin transformer has demonstrated strong capabilities in meeting the challenge of segmentation accuracy. In SwinIR, Swin transformer blocks (STBs) are joined by residual connections to enhance training stability. We incorporate these components into our network for RS CD and propose a transformer-based multi-scale feature fusion model (TMFF), comprising an encoder, a decoder, and skip connections, for RS image CD. We modify the original U-Net architecture so that it better aggregates semantic features at all levels. In experiments on three datasets, the proposed TMFF achieves impressive results.
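To make the definition of CD concrete, the following minimal sketch flags changed pixels between two co-registered single-band images by thresholding their absolute difference. This is a classical image-differencing baseline shown only for illustration, not the TMFF model described above; the function names `change_mask` and `changed_fraction`, the threshold value, and the toy images are all hypothetical.

```python
def change_mask(before, after, threshold=0.2):
    """Return a binary mask: 1 where |after - before| exceeds the threshold.

    `before` and `after` are same-shaped 2D lists of pixel intensities,
    assumed here to be co-registered and scaled to [0, 1].
    """
    if len(before) != len(after) or any(
        len(a) != len(b) for a, b in zip(before, after)
    ):
        raise ValueError("images must have the same shape")
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_t1, row_t2)]
        for row_t1, row_t2 in zip(before, after)
    ]


def changed_fraction(mask):
    """Fraction of pixels flagged as changed (0.0 for an empty mask)."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


# Toy bitemporal pair (hypothetical values): two pixels differ between dates.
image_t1 = [[0.1, 0.1, 0.8],
            [0.1, 0.5, 0.8]]
image_t2 = [[0.1, 0.9, 0.8],
            [0.1, 0.5, 0.2]]

mask = change_mask(image_t1, image_t2)
print(mask)
print(changed_fraction(mask))
```

A per-pixel rule like this uses no spatial context at all, which is one reason learned models with multi-scale features and long-range interactions are attractive for RS CD.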
CITATIONS
Cited by 1 scholarly publication.
Transformers
Remote sensing
Feature fusion
Image fusion
Feature extraction
Computer programming
Image processing