Traditional stereo dense image matching (DIM) methods normally compute the matching cost over a fixed, predefined window, so their performance is limited by the window size. A large matching window usually gives robust results in weakly textured regions, but it tends to over-smooth disparity jumps and fine structures. A small window can recover sharp boundaries and fine structures, but its results carry high matching uncertainty in weakly textured regions. To address this issue, we compute matching results with different window sizes and propose a method that adaptively fuses them into a better matching result. The core algorithm uses a Convolutional Neural Network (CNN) to predict, for each pixel, the probabilities of the large and small windows, and then refines these probabilities by imposing a global energy function. The global energy function is solved approximately by breaking the optimization into per-pixel sub-optimizations along one-dimensional (1D) paths. Finally, the matching results of the large and small windows are fused, with the refined probabilities as weights, to obtain more accurate matching. We test our method on aerial image datasets, satellite image datasets, and the Middlebury benchmark with different matching cost metrics. Experiments show that the proposed adaptive fusion of multiple-window matching results transfers well across datasets and outperforms the small windows, the median windows, the large windows, and some state-of-the-art matching window selection methods.
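The final fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_disparities`, the list-of-lists disparity maps, and the assumption that the two window probabilities at each pixel are complementary (summing to 1) are all introduced here for illustration only. The CNN prediction and the energy-based refinement are not modeled; the probabilities are simply taken as given.

```python
def fuse_disparities(disp_small, disp_large, prob_large):
    """Per-pixel weighted fusion of two disparity maps (illustrative only).

    All three arguments are equally sized 2D lists (rows of floats).
    prob_large[y][x] is the assumed refined probability that the large
    window is the better choice at that pixel; the small window receives
    the complementary weight 1 - prob_large[y][x] (an assumption).
    """
    fused = []
    for row_s, row_l, row_p in zip(disp_small, disp_large, prob_large):
        fused_row = []
        for d_s, d_l, p in zip(row_s, row_l, row_p):
            if not 0.0 <= p <= 1.0:
                raise ValueError("probabilities must lie in [0, 1]")
            # Weighted average of the two candidate disparities.
            fused_row.append(p * d_l + (1.0 - p) * d_s)
        fused.append(fused_row)
    return fused


# Hypothetical 2x2 example: the second row contains a disparity jump
# that the large window has smoothed away (20.0 instead of 30.0).
disp_small = [[10.0, 12.0], [30.0, 31.0]]
disp_large = [[11.0, 12.0], [20.0, 31.0]]
prob_large = [[1.0, 0.5], [0.0, 0.25]]

fused = fuse_disparities(disp_small, disp_large, prob_large)
print(fused)
```

Where the probability of the large window is 0, the fused value is exactly the small-window disparity, so a sharp boundary recovered by the small window is preserved; where it is 1, the more robust large-window value is used instead.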
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.