To improve the real-time performance of end-to-end stereo matching networks in practical applications, existing methods build the cost volume at low resolution. However, because low-resolution features lack detailed information, accurate disparity estimation is difficult in weak-texture regions. In addition, supervision with the smooth L1 loss reduces accuracy in disparity-discontinuity areas. To solve these problems, we propose an efficient stereo matching network based on multiple attention mechanisms and edge optimization, which achieves high accuracy in a short time. A multi-scale attention module is applied to enhance feature expression in detail regions. For weak-texture areas, we construct a concatenation cost volume and a multi-level patch matching volume, which are combined to improve the network’s attention to weak-texture regions. For edge optimization, we model the disparity distribution of sampled edge points as a bimodal Laplace distribution and optimize the edge regions of the initial disparity map with a likelihood loss to obtain sharp edges. Experimental results show that, on the SceneFlow and KITTI datasets, the proposed network improves accuracy by 32% and 27%, respectively, compared with BGNet+.
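The edge-optimization idea can be illustrated with a minimal sketch of a bimodal Laplace negative log-likelihood for a single edge pixel. This is not the authors' implementation: the function names, the two-component parameterization (mixture weight `pi`, modes `mu1` and `mu2`, scales `b1` and `b2`), and the numeric values are assumptions chosen for illustration only.

```python
import math


def laplace_pdf(d, mu, b):
    """Density of a Laplace distribution with location mu and scale b."""
    return math.exp(-abs(d - mu) / b) / (2.0 * b)


def bimodal_laplace_nll(d_gt, pi, mu1, b1, mu2, b2, eps=1e-12):
    """Negative log-likelihood of d_gt under a two-component Laplace mixture.

    pi is the weight of the first mode; (1 - pi) weights the second.
    eps guards against log(0). All parameter names are hypothetical.
    """
    p = pi * laplace_pdf(d_gt, mu1, b1) + (1.0 - pi) * laplace_pdf(d_gt, mu2, b2)
    return -math.log(p + eps)


# Assumed toy case: an edge pixel where the background disparity is about 10,
# the foreground disparity is about 40, and the ground truth is the foreground.
d_gt = 40.0

# Bimodal prediction: one mode on each side of the edge.
nll_bimodal = bimodal_laplace_nll(d_gt, pi=0.5, mu1=10.0, b1=1.0, mu2=40.0, b2=1.0)

# Unimodal prediction placed at the blurred average of the two surfaces
# (pi=1.0 switches the second component off).
nll_unimodal = bimodal_laplace_nll(d_gt, pi=1.0, mu1=25.0, b1=1.0, mu2=25.0, b2=1.0)

print(f"bimodal NLL:  {nll_bimodal:.3f}")
print(f"unimodal NLL: {nll_unimodal:.3f}")
```

In this toy case the bimodal prediction gets a much lower loss than a single mode at the averaged disparity, which is the intuition behind using a likelihood loss on edge points instead of regressing one smoothed value.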
Education and training
Feature extraction
Convolution
Point clouds
Discontinuities
Visualization
Semantics