Spectral Imagery Tensor Decomposition for Semantic Segmentation of Remote Sensing Data through Fully Convolutional Networks
Center for Research and Advanced Studies of the National Polytechnic Institute, Telecommunications Group, Av del Bosque 1145, Zapopan 45017, Mexico
University of Guadalajara, Center of Exact Sciences and Engineering, Blvd. Gral. Marcelino García Barragán 1421, Guadalajara 44430, Mexico
University of Natural Resources and Life Science, Institute of Geomatics, Peter Jordan 82, Vienna 1180, Austria
Author to whom correspondence should be addressed.
Remote Sens. 2020, 12(3), 517; https://doi.org/10.3390/rs12030517
Received: 22 November 2019 / Revised: 8 January 2020 / Accepted: 11 January 2020 / Published: 5 February 2020
(This article belongs to the Special Issue Remote Sensing Data Compression)
This work addresses two issues simultaneously: data compression in the input space and semantic segmentation. Semantic segmentation of remotely sensed multi- or hyperspectral images through deep learning (DL) artificial neural networks (ANN) outputs a matrix in which every pixel is classified, and achieves competitive performance metrics. With technological progress, current remote sensing (RS) sensors have more spectral bands and higher spatial resolution than before, which means a greater number of pixels for the same area. However, the more spectral bands and pixels there are, the higher the computational complexity and the longer the processing time. Without dimensionality reduction, the classification task is therefore challenging, particularly if large areas have to be processed. To solve this problem, our approach uses a Tucker decomposition (TKD) to map an RS image, treated as a third-order tensor, into a core tensor that is representative of the input image and has the same spatial domain but a lower number of new tensor bands. This builds a new input space with reduced dimensionality. The core tensor is found with the higher-order orthogonal iteration (HOOI) algorithm. A fully convolutional network (FCN) is then employed to classify each core tensor at the pixel level. The whole framework, called here HOOI-FCN, achieves high performance metrics that are competitive with some state-of-the-art methods for semantic segmentation of RS multispectral images (MSI), while significantly reducing computational complexity and, thereby, processing time.
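As a rough illustration of the compression step, the sketch below runs a plain NumPy version of HOOI on a synthetic third-order tensor and keeps the full spatial ranks while reducing only the spectral mode. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`unfold`, `mode_dot`, `hooi`), the HOSVD initialization, the fixed iteration count, and the synthetic 16 × 16 × 9 image are all illustrative choices that do not come from the paper.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Mode-n product of `tensor` with `matrix` (shape J x I_mode)."""
    moved = np.moveaxis(tensor, mode, -1)   # (..., I_mode)
    result = moved @ matrix.T               # (..., J)
    return np.moveaxis(result, -1, mode)


def hooi(tensor, ranks, n_iter=10):
    """Tucker decomposition via higher-order orthogonal iteration.

    Returns the core tensor and one orthonormal factor matrix per mode.
    The initialization (truncated HOSVD) and `n_iter` are assumptions.
    """
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])

    for _ in range(n_iter):
        for mode in range(tensor.ndim):
            # Project onto the current subspaces of all other modes.
            projected = tensor
            for other in range(tensor.ndim):
                if other != mode:
                    projected = mode_dot(projected, factors[other].T, other)
            u, _, _ = np.linalg.svd(unfold(projected, mode), full_matrices=False)
            factors[mode] = u[:, :ranks[mode]]

    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)
    return core, factors


# Synthetic stand-in for an RS image: 16 x 16 pixels, 9 bands, built from
# 3 spectral signatures so that its spectral rank is 3 (hypothetical data).
rng = np.random.default_rng(0)
abundances = rng.random((16, 16, 3))
signatures = rng.random((3, 9))
image = abundances @ signatures             # shape (16, 16, 9)

# Keep the spatial ranks at full size and reduce 9 bands to 5 tensor bands.
core, factors = hooi(image, ranks=(16, 16, 5))

# Rebuild the image from the core tensor to check the approximation.
reconstruction = core
for mode, u in enumerate(factors):
    reconstruction = mode_dot(reconstruction, u, mode)
relative_error = np.linalg.norm(reconstruction - image) / np.linalg.norm(image)
```

In this toy setting the reconstruction is exact up to floating-point error, because five tensor bands are more than enough to capture a spectral rank of three. Real imagery would generally leave a nonzero residual that shrinks as more tensor bands are kept.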
We used a Sentinel-2 image data set from Central Europe as a case study. The FCN itself, with nine spectral bands, reached an average pixel accuracy (PA) of 90% (computational time ∼90 s). Our framework outperformed other methods, including this FCN, achieving a higher average PA of 91.97% (computational time ∼36.5 s) and an average PA of 91.56% (computational time ∼9.5 s) for seven and five new tensor bands, respectively.
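For readers unfamiliar with the metric, the sketch below computes pixel accuracy in its usual sense, the fraction of pixels whose predicted class matches the reference label, and derives the speed-ups implied by the approximate computational times quoted above. The function name `pixel_accuracy` and the tiny label maps are hypothetical, and how the paper averages PA (over images or over classes) is not stated here, so only the single-image definition is shown.

```python
import numpy as np


def pixel_accuracy(predicted, reference):
    """Fraction of pixels whose predicted label equals the reference label."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    if predicted.shape != reference.shape:
        raise ValueError("label maps must have the same shape")
    return float(np.mean(predicted == reference))


# Hypothetical 2 x 3 label maps with three classes (0, 1, 2).
reference = np.array([[0, 0, 1],
                      [1, 2, 2]])
predicted = np.array([[0, 1, 1],
                      [1, 2, 0]])
pa = pixel_accuracy(predicted, reference)   # 4 of the 6 pixels agree

# Speed-ups implied by the approximate times quoted in the abstract.
time_nine_bands = 90.0    # seconds, nine spectral bands
time_seven_bands = 36.5   # seconds, seven new tensor bands
time_five_bands = 9.5     # seconds, five new tensor bands
speedup_seven = time_nine_bands / time_seven_bands
speedup_five = time_nine_bands / time_five_bands
```

Because the quoted times are approximate, the resulting ratios (roughly 2.5 and 9.5) should be read as indicative only.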