1. Introduction
A consumer RGB camera captures light signals with three types of color sensors. Yet the light is physical radiance—a continuous spectral function of wavelength—which, intuitively, can hardly be described by a 3-dimensional color representation. Indeed, many researchers have found that real-world spectra should be at least 5- to 8-dimensional [1,2,3,4,5]. Consequently, with RGB imaging we can only acquire limited information encoded in the light spectrum.
Using a hyperspectral camera [6,7,8,9,10,11,12,13,14,15,16,17,18,19], we can record scene radiance at high spectral and spatial resolution. This technique has been widely used in machine vision applications such as remote sensing [20,21,22,23,24,25,26,27], medical imaging [28,29,30,31], food processing [32,33,34,35,36,37], and anomaly detection [38,39,40,41,42,43,44], as well as in the spectral characterization domain, including the calibration of color devices (e.g., cameras [45] and scanners [46]), scene relighting [47,48], and art conservation and archiving [49,50,51]. While useful, hyperspectral cameras are usually much more expensive than RGB cameras. Moreover, the extra spectral information is often captured at reduced spatial and/or temporal resolution, which limits their usefulness.
Spectral reconstruction (SR) seeks to recover spectral information from the RGB data of a single camera [52,53,54,55,56,57,58,59,60,61,62,63,64,65,66]. Assuming the recovery error of SR is low enough (as suggested by the results in the literature and in this paper), we can essentially measure spectra using an RGB camera.
Historically, SR is most efficiently solved by Linear Regression (LR) [52], where the map from RGBs to spectra is described by a simple linear transformation. Considering a nonlinear map, the Polynomial Regression (PR) [53] and Root-Polynomial Regression (RPR) [56] methods expand the RGBs into a set of polynomial/root-polynomial terms, which are then mapped to spectra via a linear transform. More recent regression models, including the Radial Basis Function Network (RBFN) [54] and the A+ sparse coding algorithm [55], use clustering techniques to define the local neighborhood (in the color or spectral space) in which each RGB is regressed.
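To give a flavor of these expansions, the sketch below builds degree-2 polynomial and root-polynomial feature vectors from one RGB; the exact term sets used in [53,56] may differ, so the choices here are illustrative assumptions. A key property of root-polynomial terms is that they scale linearly with exposure:

```python
import numpy as np

def poly2_features(rgb):
    """Degree-2 polynomial expansion of one RGB (illustrative term set)."""
    r, g, b = rgb
    return np.array([r, g, b, r * r, g * g, b * b, r * g, g * b, r * b])

def rootpoly2_features(rgb):
    """Degree-2 root-polynomial expansion (illustrative term set).

    Every term has the units of linear intensity, so scaling the
    exposure by k scales all features by exactly k."""
    r, g, b = rgb
    return np.array([r, g, b, np.sqrt(r * g), np.sqrt(g * b), np.sqrt(r * b)])

rgb = np.array([0.2, 0.4, 0.1])
# Root-polynomial features commute with an exposure change; polynomial
# features do not (the squared terms scale by k**2).
assert np.allclose(rootpoly2_features(3.0 * rgb), 3.0 * rootpoly2_features(rgb))
```

This exposure invariance is the practical motivation for root-polynomial terms: a brighter or dimmer exposure of the same scene maps to a proportionally scaled feature vector.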
On the other hand, most recent SR algorithms are based not on simple regressions but on Deep Neural Networks (DNNs) [60,61,62,63,64,65,66], which embrace the idea of regressing RGB image patches as a whole (as opposed to regressing each pixel independently, as regressions do). This approach hypothesizes that object-level descriptions of the RGBs can—though requiring much more computational resources—aid the recovery of spectra. However, DNN-based models do not always perform better than simple regressions [55] and often suffer from instability issues when recovering spectra at different brightness scales [56,57,58]. Furthermore, spectra recovered by DNNs are shown to be less accurate in color [57,59] and do not necessarily provide more accurate cross-illumination or cross-device color reproductions [57].
In this paper, we aim to further optimize the existing regression-based SR methods. First, we noticed that in recent works, DNN-based models are evaluated and ranked by the Mean Relative Absolute Error (MRAE) [63,64]. Most top DNN models are also designed to minimize MRAE directly [60,61,62,63,64]. However, all regressions used in SR are, by convention, optimized for the Root Mean Square Error (RMSE), yet they are evaluated using MRAE (or other similar relative errors) when compared with the DNNs [55,56,57,60,61]. In other words, these regressions are optimized for one metric but evaluated with another.
In Figure 1, we illustrate the standard experimental framework of SR. In training, the parameters of the SR model are tuned such that the “losses”—the differences between the ground-truths and estimations measured by a given loss metric—are statistically minimized. After the SR models are trained, we evaluate them based on a desired evaluation metric. Ideally, the loss and evaluation metrics should match (i.e., be the same or similar in nature). Indeed, a model that is optimized for one metric but evaluated by another will surely lead to sub-optimal results.
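The mismatch is easy to see from the definitions of the two metrics. Below is a minimal sketch (the variable names are ours, and MRAE follows the usual per-value relative form): an error of fixed absolute size contributes equally to RMSE at every intensity, but dominates MRAE wherever the ground truth is small.

```python
import numpy as np

def rmse(gt, est):
    """Root Mean Square Error: penalizes absolute spectral differences."""
    return np.sqrt(np.mean((gt - est) ** 2))

def mrae(gt, est):
    """Mean Relative Absolute Error: differences as a fraction of the
    ground truth, so the same absolute error counts far more at low
    intensities."""
    return np.mean(np.abs(gt - est) / gt)

gt = np.array([0.01, 0.5, 1.0])  # ground-truth spectral intensities
est = gt + 0.01                  # uniform absolute error of 0.01
# RMSE sees a small, uniform error; MRAE is dominated by the dark
# channel, where the 0.01 error is 100% of the ground-truth value.
```

A model optimized for RMSE will happily trade a large relative error in a dark channel for small absolute gains elsewhere, which is exactly the sub-optimality discussed above.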
Based on this insight, we propose two new minimization approaches for simple regressions—the Relative Error Least Squares (RELS) and Relative Error Least Absolute Deviation (RELAD). While the former minimizes an error similar to MRAE and is solved in closed form, the latter explicitly minimizes the MRAE metric but has the disadvantage of requiring an iterative minimization.
The second contribution of this paper is to propose a new way of regularizing regression-based spectral reconstruction. Most regressions are necessarily trained using a regularization constraint [67], both to prevent overfitting and to make the system of equations more “stable” [68] (a system of equations is stable if small perturbations in the training data result in only small perturbations in the solved-for model parameters). However, we observe that hitherto in regression-based spectral reconstruction all spectral channels are regularized together—e.g., in [52,53,54,55,56]. That is, the regularization constraint is effectively applied at the spectrum level. Yet, fundamentally, error metrics such as MRAE measure the errors at all spectral channels independently and then average them to give a spectral error measure. Thus, we propose a “per-channel” regularization methodology which optimizes the regularization for each spectral channel independently.
Combined, we find that training the simple regressions to minimize the same error as used in testing and adopting a per-channel regularization approach lead to a significant uptick in performance. Moreover, as shown in the example hyperspectral image reconstruction results in Figure 2, our new regression methods deliver performance similar to that of one of the best (but much more complex) DNN approaches.
The rest of the paper is organized as follows. In Section 2, we introduce the prerequisites of regression-based spectral reconstruction. Our new methods are introduced in Section 3. Section 4 details the experimental procedures, and Section 5 discusses the experimental results. This paper concludes in Section 6.
3. Rethinking the Optimization of Regression for Spectral Reconstruction
The first part of our contributions concerns the regularization method used in the standard case (Equation (9)), where the regularization takes place at the spectrum level with all spectral channels being regularized together. Here, we will argue—and develop the requisite mathematics—that the regularization should be done per spectral channel.
In our second contribution, we alter the form of the regression. Given that MRAE has recently been used to evaluate and rank new methods, and that this metric is optimized directly in modern DNN solutions but not in regression-based methods, we reformulate the regressions to optimize for fitting errors that are “relative” in nature (spectral differences relative to the ground-truth values). We develop solutions based on both ℓ2 and ℓ1 relative-error minimizations.
3.1. Per-Channel Regularization
Returning to the conventional regression formulation in Equation (5), let us split the regression matrix and the ground-truth spectral data matrix by columns:
Here, each column of the spectral data matrix collects a single spectral channel: since each row of that matrix is one radiance spectrum, the entries of a given column are the intensities of the corresponding channel for all the spectra in the database.
Now, we observe that the original multivariate regression is in fact a collection of 31 independent regressions:
Notice that Equation (13) is equivalent to the original formulation in Equation (5), only that it explicitly shows there is no “inter-channel” dependence. In other words, for all regression-based SR in the literature, it has always been the case that each column of the regression matrix is used only for recovering the corresponding column of the spectral data and is irrelevant to the recoveries for the other spectral channels.
Curiously, as we solve for the regression matrix using the standard LS minimization in Equation (9), the strength of the penalty term is controlled by a single parameter, meaning that all 31 columns of the matrix are regularized using the same parameter despite the fact that each of them works independently of the others. Essentially, by regularizing the matrix as a whole, we are “asserting” such an interdependence among channels.
From a mathematical viewpoint (regarding how the regression is formulated), it makes more sense to regularize each per-channel regression independently. Following Equation (13), we rewrite Equation (9) in a per-channel fashion:
Here, a per-channel regularization parameter (again to be optimized by a cross-validation procedure, see Section 2.2.1) is used specifically for regularizing the regression of the corresponding spectral channel. That is, we select different regularization parameters for different spectral channels.
We would like to remark that, although our per-channel approach matches the assumption made by the regression’s formulation (that there is no inter-spectral-channel dependence), we admit the possibility that there might be better ways to formulate the regression that factor in a “reasonable interdependence” between channels. For example, we may consider imposing a “smoothness” constraint from the literature (see, e.g., [76]) on the recovered spectra, though we note that this assumption would be more important for the recovery of “reflectance” spectra (which are intrinsically smooth) than for the “radiance” spectra we are recovering (because the illumination spectrum is part of the radiance, which can occasionally make the radiance far from smooth, especially for indoor illuminations).
The solution of Equation (14) can be written in closed form [67]:
Here, the matrix scaled by the regularization parameter is the identity matrix: if Linear Regression (Equation (5)) is used, it is 3 × 3; otherwise, if a nonlinear transform is incorporated (Equation (7)), it is s × s, where s is the dimension of the nonlinear feature vectors.
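The per-channel closed-form solution just described can be sketched as below, assuming an RGB (or nonlinear feature) matrix `X` and a ground-truth spectra matrix `S` with one channel per column; the variable names are ours, and in practice the per-channel parameters would be chosen by cross-validation:

```python
import numpy as np

def per_channel_ridge(X, S, gammas):
    """Solve one regularized least-squares regression per spectral channel.

    X: (N, d) matrix of RGBs or their nonlinear feature expansions.
    S: (N, C) ground-truth spectra, one spectral channel per column.
    gammas: length-C array of per-channel regularization parameters.
    Returns M: (d, C) regression matrix, one column per channel.
    """
    N, d = X.shape
    XtX = X.T @ X
    I = np.eye(d)
    M = np.empty((d, S.shape[1]))
    for k in range(S.shape[1]):
        # Closed-form ridge solution for the k-th channel only, with
        # its own regularization strength gammas[k].
        M[:, k] = np.linalg.solve(XtX + gammas[k] * I, X.T @ S[:, k])
    return M
```

With all entries of `gammas` equal, this reduces to the conventional spectrum-level regularization; allowing them to differ per column is exactly the per-channel scheme argued for above.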
3.2. Relative-Error Least Squares Minimization
Now, let us minimize an error that is more similar to MRAE. First, we consider formulating an ℓ2-minimization problem so as to ensure a closed-form solution.
From Equation (5), we can remodel the approximation as the following minimization:
where all the divisions are component-wise. Here, the square of an ℓ2 relative error (referred to as the “relative-RMSE” in some works [55,70]) is minimized. We call this new minimization approach the Relative Error Least Squares (RELS).
Equation (16) can be rewritten in a per-channel fashion:
(these two minimizations are equivalent because the divisions are component-wise). Or, equivalently, we write
where the fitting target is an N-component vector of ones.
We can further define a matrix of RGBs weighted by the reciprocals of the ground-truth channel intensities:
Using this nomenclature, let us again rewrite Equation (18) as
Clearly, Equation (20) shows that RELS is in effect another least-squares problem, one which regresses the reciprocal-weighted RGBs to fit a vector of ones.
Of course, we need to regularize this minimization by solving the following equation instead:
whose closed-form solution is written as [67,77]:
3.3. Relative-Error Least Absolute Deviation Minimization
Finally, let us consider minimizing MRAE directly. Analogous to Equation (16), we now solve the following minimization:
Following the same derivation as in Equations (16)–(21), we reach
In the literature, regressions solved via an ℓ1 minimization are called the Least Absolute Deviation (LAD) [78,79]. As the MRAE we are minimizing here is a relative error, we call this new approach the Relative Error Least Absolute Deviation (RELAD).
Notice that for the regularization penalty term in Equation (24) we also adopted an ℓ1 norm, which corresponds to the LASSO regularization [74] (see Section 2.2.1).
Unlike RELS, RELAD does not have a closed-form solution. For a small amount of data, a globally optimal solution can be found using linear programming [78,80]. However, in our application the amount of data is large, in which case the Iterative Reweighted Least Squares (IRLS) [79] algorithm is more appropriate; it is thus used here.
The IRLS process approaches the RELAD minimization by repeatedly solving Weighted Least Squares (WLS) [81], updating the weights at every iteration based on the losses and mapping functions obtained in the previous iteration, until the solution converges.
The detailed algorithm is given in Algorithm 1. The iteration number is indicated by a superscript. All min, division, and absolute operations shown in Algorithm 1 are component-wise on the vectors, while the median function in Step 8 and the mean functions in Step 11 calculate the median and mean of the components of their vector arguments (the resulting scalar in Step 8 is a preliminary estimate of the standard deviation of the absolute losses, as commonly used in the literature; see, e.g., [79,82]). Furthermore, the min functions used in Steps 9 and 10 clip the reciprocal values so as to prevent overly large numbers. Finally, in Step 11, we set the tolerance and the stopping iteration.
Algorithm 1: Solving the RELAD regression (Equation (24)) by the IRLS algorithm.
1: Initialize the weight matrix to the identity matrix
2: Initialize the absolute losses of all data to infinity
3: Initialize the iteration counter
4: repeat
5: Increment the iteration counter
6: Solve the closed-form WLS problem under the current weights
7: Compute the absolute losses of all data
8: Compute the preliminary estimate of the standard deviation of the absolute losses
9: Update the weights from the clipped reciprocals of the losses (the × operator is the scalar product)
10: Clip the reciprocal values to prevent overly large numbers
11: until the convergence tolerance is met or the stopping iteration is reached
12: Return the converged mapping
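The IRLS loop can be sketched for a single spectral channel as follows; the clipping threshold, tolerance, iteration cap, and the simplification of the ℓ1 penalty to a quadratic term are our own assumptions made for brevity, not the paper's exact settings:

```python
import numpy as np

def relad_irls(X, s_k, gamma, clip=1e6, tol=1e-6, max_iter=100):
    """IRLS sketch for one spectral channel of RELAD.

    Approximately minimizes sum |X @ m / s_k - 1| by repeatedly solving
    weighted least squares, where each datum's weight is the clipped
    reciprocal of its previous absolute loss. NOTE: the L1 (LASSO)
    penalty of Equation (24) is simplified here to a quadratic term.
    """
    W = X / s_k[:, None]            # reciprocal-weighted RGBs
    N, d = W.shape
    ones = np.ones(N)
    w = np.ones(N)                  # per-datum IRLS weights
    prev_mean_loss = np.inf
    m = np.zeros(d)
    for _ in range(max_iter):
        # Closed-form WLS solution with the current weights.
        Ww = W * w[:, None]
        m = np.linalg.solve(W.T @ Ww + gamma * np.eye(d), Ww.T @ ones)
        loss = np.abs(W @ m - ones)           # absolute (relative) losses
        # Clipped reciprocal weights: small losses get large (but
        # bounded) weights, which drives the fit toward the L1 optimum.
        w = np.minimum(1.0 / np.maximum(loss, 1e-12), clip)
        mean_loss = loss.mean()
        if abs(prev_mean_loss - mean_loss) < tol:
            break
        prev_mean_loss = mean_loss
    return m
```

The reweighting step is where the ℓ1 behavior comes from: solving WLS with weights 1/|loss| makes each squared term behave like an absolute one, so successive iterations descend toward the least-absolute-deviation solution.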
6. Conclusions
Spectral reconstruction (SR) recovers high-resolution radiance spectra from RGB images. Many methods are regression-based, with simple formulations and usually closed-form solutions, while the current state-of-the-art spectral recovery is delivered by much more sophisticated Deep Neural Network (DNN) solutions.
Recently, top DNN models have been trained and evaluated based on the Mean Relative Absolute Error (MRAE), a relative error which measures the spectral difference as a percentage of the ground-truth spectral value. In comparison, all regressions are still trained based on the Least Squares (LS) minimization, which does not imply a minimized MRAE. This problem is further compounded by the sub-optimal regularization setting used in conventional regressions, where all spectral channels are jointly regularized by a single penalty parameter.
In this paper, we developed new regression approaches that minimize relative errors and are regularized per spectral channel: the closed-form Relative-Error Least Squares (RELS) and the Relative-Error Least Absolute Deviation (RELAD), which directly minimizes MRAE and is solved by an iterative method.
Our results showed that the new minimization approaches significantly improve the conventional regressions, especially in the darker regions of the images. Consequently, our best improved regression model narrows the performance gap with the leading DNNs to only 8% under the MRAE evaluation.