1. Introduction
A consumer RGB camera captures light signals with three types of color sensors. Yet the light is physical radiance, a continuous spectral function of wavelength, which intuitively can hardly be described by a 3-dimensional color representation. Indeed, many researchers have found that real-world spectra are at least 5- to 8-dimensional [1,2,3,4,5]. Consequently, with RGB imaging we can acquire only limited information encoded in the light spectrum.
Using a hyperspectral camera [6,7,8,9,10,11,12,13,14,15,16,17,18,19], we can record scene radiance at high spectral and spatial resolution. This technique has been widely used in machine vision applications such as remote sensing [20,21,22,23,24,25,26,27], medical imaging [28,29,30,31], food processing [32,33,34,35,36,37], and anomaly detection [38,39,40,41,42,43,44], as well as in the spectral characterization domain, including the calibration of color devices (e.g., cameras [45] and scanners [46]), scene relighting [47,48], and art conservation and archiving [49,50,51]. While useful, hyperspectral cameras are usually much more expensive than RGB cameras. Moreover, the extra spectral information is often captured at reduced spatial and/or temporal resolution, which limits their usefulness.
Spectral reconstruction (SR) seeks to recover spectral information from the RGB data of a single camera [52,53,54,55,56,57,58,59,60,61,62,63,64,65,66]. Assuming the recovery error of SR is low enough (as suggested by the results in the literature and in this paper), we can essentially measure spectra using an RGB camera.
Historically, SR is most efficiently solved by Linear Regression (LR) [52], where the map from RGBs to spectra is described by a simple linear transformation. To model a nonlinear map, the Polynomial Regression (PR) [53] and Root-Polynomial Regression (RPR) [56] methods expand the RGBs into a set of polynomial or root-polynomial terms, which are then mapped to spectra via a linear transform. More recent regression models, including the Radial Basis Function Network (RBFN) [54] and the A+ sparse coding algorithm [55], use clustering techniques to define the local neighborhood (in the color or spectral space) in which each RGB is regressed.
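As a concrete illustration, the polynomial and root-polynomial expansions can be sketched as follows. This is a minimal degree-2 example; the exact term sets used in the cited PR [53] and RPR [56] methods may differ:

```python
import numpy as np

def poly_expand(rgb):
    """Degree-2 polynomial expansion of one RGB triplet.

    Returns the 9-term vector [r, g, b, r^2, g^2, b^2, rg, rb, gb]
    (an illustrative term set; published methods may use more terms).
    """
    r, g, b = rgb
    return np.array([r, g, b, r * r, g * g, b * b, r * g, r * b, g * b])

def rootpoly_expand(rgb):
    """Degree-2 root-polynomial expansion: each degree-2 term is taken
    to the power 1/2, so every feature scales linearly with exposure."""
    r, g, b = rgb
    return np.array([r, g, b,
                     np.sqrt(r * g), np.sqrt(r * b), np.sqrt(g * b)])
```

A useful property of the root-polynomial features is that doubling the RGB doubles every feature, which is one reason RPR behaves more stably across brightness scales than PR.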
On the other hand, most recent SR algorithms are based not on simple regressions but on Deep Neural Networks (DNNs) [60,61,62,63,64,65,66], which embrace the idea of regressing RGB image patches as a whole (as opposed to regressing each pixel independently, as the regressions do). This approach hypothesizes that object-level descriptions of the RGBs can aid the recovery of spectra, though it requires much more computational resources. However, DNN-based models do not always perform better than simple regressions [55] and often suffer from instability issues when recovering spectra at different brightness scales [56,57,58]. Furthermore, spectra recovered by DNNs are shown to be less accurate in color [57,59] and do not necessarily provide more accurate cross-illumination or cross-device color reproductions [57].
In this paper, we aim to further optimize the existing regression-based SR methods. First, we note that in recent works, DNN-based models are evaluated and ranked by the Mean Relative Absolute Error (MRAE) [63,64]. Most top DNN models are also designed to minimize MRAE directly [60,61,62,63,64]. However, all regressions used in SR are by convention optimized for the Root Mean Square Error (RMSE), yet evaluated using MRAE (or other similar relative errors) when compared with the DNNs [55,56,57,60,61]. In other words, these regressions are optimized for one metric but evaluated with another.
In Figure 1, we illustrate the standard experimental framework of SR. In training, the parameters of the SR model are tuned such that the “losses” (the differences between the ground truths and the estimations, measured by a given loss metric) are statistically minimized. After the SR models are trained, we evaluate them based on a desired evaluation metric. Ideally, the loss and evaluation metrics should match (i.e., be the same or similar in nature); a model that is optimized for one metric but evaluated by another will surely lead to suboptimal results.
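To make the metric mismatch concrete, the two errors can be computed as follows. This is a minimal sketch: the function names are ours, and published benchmarks may average over pixels and channels in a specific order:

```python
import numpy as np

def rmse(gt, est):
    """Root Mean Square Error between ground-truth and estimated spectra."""
    return np.sqrt(np.mean((gt - est) ** 2))

def mrae(gt, est):
    """Mean Relative Absolute Error: the absolute error expressed as a
    fraction of the ground-truth value, averaged over all channels."""
    return np.mean(np.abs(gt - est) / gt)
```

Note that a fixed absolute error on a dark (small ground-truth) channel contributes far more to MRAE than the same error on a bright channel, so a model tuned to minimize RMSE can still rank poorly under MRAE.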
Based on this insight, we propose two new minimization approaches for simple regressions—the Relative Error Least Squares (RELS) and Relative Error Least Absolute Deviation (RELAD). While the former minimizes an error similar to MRAE and is solved in closed form, the latter explicitly minimizes the MRAE metric but has the disadvantage of requiring an iterative minimization.
The second contribution of this paper is a new way of regularizing regression-based spectral reconstruction. Most regressions are necessarily trained using a regularization constraint [67], both to prevent overfitting and to make the system of equations more “stable” [68] (a system of equations is stable if small perturbations in the training data result in only small perturbations in the solved-for model parameters). However, we observe that hitherto in regression-based spectral reconstruction all spectral channels have been regularized together, e.g., in [52,53,54,55,56]. That is, the regularization constraint is effectively applied at the spectrum level. Yet, fundamentally, error metrics such as MRAE measure the errors at all spectral channels independently and then average them to give a spectral error measure. Thus, we propose a “per-channel” regularization methodology which optimizes the regularization for each spectral channel independently.
Combined, we find that training the simple regressions to minimize the same error as used in testing and adopting a per-channel regularization approach lead to a significant improvement in performance. Moreover, as shown in the example hyperspectral image reconstruction results in Figure 2, our new regression methods deliver performance similar to that of one of the best (but much more complex) DNN approaches.
The rest of the paper is organized as follows. In Section 2, we introduce the prerequisites of regression-based spectral reconstruction. Our new methods are introduced in Section 3. Section 4 details the experimental procedures, and Section 5 discusses the experimental results. The paper concludes in Section 6.
3. Rethinking the Optimization of Regression for Spectral Reconstruction
The first part of our contributions concerns the regularization method used in the standard case (Equation (9)), where the regularization takes place at the spectrum level, with all spectral channels being regularized together. Here, we will argue, and develop the requisite mathematics, that the regularization should be done per spectral channel.
In our second contribution, we alter the form of the regression. Given that MRAE has recently been used to evaluate and rank new methods, and that this metric is optimized directly in modern DNN solutions but not in regression-based methods, we reformulate the regressions to optimize for fitting errors that are “relative” in nature (spectral differences relative to the ground-truth values). We develop solutions based on both ${\ell}_{2}$ and ${\ell}_{1}$ relative error minimizations.
3.1. Per-Channel Regularization
Returning to the conventional regression formulation in Equation (5), let us split the regression matrix $\mathbf{M}$ and the ground-truth spectral data $\mathbf{R}$ by columns:
Here, ${\underline{\mathsf{\rho}}}_{i}$ denotes the $i\mathrm{th}$ column of $\mathbf{R}$. Remember that each row of $\mathbf{R}$ is a single radiance spectrum; thus, the numbers in ${\underline{\mathsf{\rho}}}_{i}$ are the $i\mathrm{th}$-channel spectral intensities of all the spectra in the database.
Now, we observe that the original multivariate regression is in fact a collection of 31 independent regressions:
Notice that Equation (13) is equivalent to the original formulation in Equation (5), except that it explicitly shows there is no “inter-channel” dependence. In other words, for all regression-based SR in the literature, it has always been the case that each column of $\mathbf{M}$ is used only for recovering the corresponding column of $\mathbf{R}$ and is irrelevant to the recovery of the other spectral channels.
Curiously, as we solve for $\mathbf{M}$ using the standard LS minimization in Equation (9), the strength of the penalty term, $\gamma {\left\|\mathbf{M}\right\|}_{2}^{2}$, is controlled by a single parameter $\gamma$. This means that all columns of $\mathbf{M}$ (that is, the 31 ${\underline{\mathit{m}}}_{i}$’s) are regularized using the same $\gamma$ parameter, despite the fact that each of them works independently of the others. Essentially, by regularizing $\mathbf{M}$ as a whole, we are “asserting” such an inter-dependence among channels.
From a mathematical viewpoint (regarding how the regression is formulated), it makes more sense to regularize each per-channel regression independently. Following Equation (13), we rewrite Equation (9) in a per-channel fashion:
Here, the per-channel regularization parameter ${\gamma}_{i}$ (again to be optimized by a cross-validation procedure; see Section 2.2.1) is used specifically for regularizing the regression of the $i\mathrm{th}$ spectral channel. That is, we select different regularization parameters for different spectral channels.
We would like to remark that, although our per-channel approach matches the assumption made by the regression’s formulation (that there is no inter-spectral-channel dependence), we admit the possibility that there might be better ways to formulate the regression that factor in “reasonable” inter-dependence between channels. For example, we may consider imposing a “smoothness” constraint from the literature (see, e.g., [76]) on the recovered spectra. We note, however, that this assumption would be more important for the recovery of “reflectance” spectra (which are intrinsically smooth) than for the “radiance” spectra we are recovering: because the illumination spectrum is part of the radiance, the radiance can occasionally be far from smooth, especially under indoor illuminations.
The solution of Equation (14) can be written in closed form [67]:
Here, $\mathbf{I}$ is the identity matrix: if Linear Regression (Equation (5)) is used, then $\mathbf{I}$ is $3\times 3$; otherwise, if a nonlinear transform is incorporated (Equation (7)), $\mathbf{I}$ is $s\times s$, where $s$ is the dimension of the nonlinear feature vectors.
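The per-channel closed-form solution can be sketched numerically as follows. This is illustrative Python under our own naming (`F` for the $N\times s$ feature matrix, `R` for the $N\times C$ ground-truth spectra), not the authors' implementation:

```python
import numpy as np

def per_channel_ridge(F, R, gammas):
    """Per-channel regularized least squares (a sketch of Equations (14)/(15)).

    F      : (N, s) matrix of (possibly expanded) RGB feature vectors.
    R      : (N, C) ground-truth spectra, one radiance spectrum per row.
    gammas : length-C array of per-channel regularization parameters
             (in practice chosen by cross-validation).
    Returns M of shape (s, C); column i solves its own ridge regression
    for spectral channel i, independently of all other channels.
    """
    N, s = F.shape
    C = R.shape[1]
    M = np.empty((s, C))
    FtF = F.T @ F  # shared across all channels
    for i in range(C):
        A = FtF + gammas[i] * np.eye(s)
        M[:, i] = np.linalg.solve(A, F.T @ R[:, i])
    return M
```

With all entries of `gammas` equal, this reduces to the conventional single-parameter ridge regression; the per-channel form simply allows each column of $\mathbf{M}$ its own penalty strength.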
3.2. Relative-Error Least Squares Minimization
Now, let us minimize an error that is more similar to MRAE. First, we formulate an ${\ell}_{2}$ minimization problem so as to ensure a closed-form solution.
From Equation (5), we can remodel the approximation as the following minimization:
where all the divisions are component-wise. Here, the square of an ${\ell}_{2}$ relative error (referred to as the “relative RMSE” in some works [55,70]) is minimized. We call this new minimization approach the Relative Error Least Squares (RELS).
Equation (16) can be rewritten in a per-channel fashion:
(these two minimizations are equivalent because the divisions are component-wise). Or, equivalently, we write
where $\underline{\mathbf{1}}$ is an $N$-component vector of ones.
We can further define a matrix of RGBs weighted by the reciprocals of ${\underline{\mathsf{\rho}}}_{i}$:
where ${\underline{\mathsf{\rho}}}_{i}={[{\rho}_{i,1},{\rho}_{i,2},\cdots ,{\rho}_{i,N}]}^{\mathsf{T}}$. Using this nomenclature, let us again rewrite Equation (18) as
Clearly, Equation (20) shows that RELS is in effect another least-squares problem, one which regresses ${\mathbf{H}}_{i}$ to fit the vector $\underline{\mathbf{1}}$.
Of course, we need to regularize this minimization by solving the following equation instead:
whose closed-form solution is written as [67,77]
3.3. Relative-Error Least Absolute Deviation Minimization
Finally, let us consider minimizing MRAE directly. Analogous to Equation (16), we now solve the following minimization:
Following the same derivation as in Equations (16)–(21), we arrive at
In the literature, a regression solved via an ${\ell}_{1}$ minimization is called the Least Absolute Deviation (LAD) [78,79]. As the MRAE we are minimizing here is a relative error, we call this new approach the Relative Error Least Absolute Deviation (RELAD).
Notice that for the regularization penalty term in Equation (24) we also adopt an ${\ell}_{1}$ norm, which corresponds to the LASSO regularization [74] (see Section 2.2.1).
Unlike RELS, RELAD does not have a closed-form solution. For a small amount of data, a globally optimal solution can be found by linear programming [78,80]. However, in our application the amount of data is large, so the Iterative Reweighted Least Squares (IRLS) [79] algorithm is more appropriate and is thus used here.
The IRLS process approaches the RELAD minimization by repeatedly solving Weighted Least Squares (WLS) [81] problems, updating the weights at every iteration based on the losses and mapping functions obtained in the previous iteration, until the solution converges.
The detailed algorithm is given in Algorithm 1. The iteration number is indicated by the superscript ${}^{\left(t\right)}$. All min, division, and absolute-value operations in Algorithm 1 are component-wise on the vectors, while the median function in Step 8 and the mean functions in Step 11 compute the median and mean of the components of ${\underline{\delta}}^{\left(t\right)}$ (the resulting scalar $\widehat{\sigma}$ in Step 8 is a preliminary estimate of the standard deviation of the absolute losses, commonly used in the literature; see, e.g., [79,82]). Furthermore, the min functions in Steps 9 and 10 clip the reciprocal values at ${10}^{6}$ so as to prevent overly large numbers. Finally, in Step 11, we set the tolerance $\epsilon = 0.00005$ and the stopping iteration $T=20$.
Algorithm 1: Solving the RELAD regression (Equation (24)) by the IRLS algorithm.
1: ${\mathbf{W}}^{\left(0\right)}={\tilde{\mathbf{W}}}^{\left(0\right)}=\mathbf{I}$ ▷ Initialization of weights; $\mathbf{I}$ is the $N\times N$ identity matrix
2: ${\underline{\delta}}^{\left(0\right)}=\infty$ ▷ Absolute losses of all data are initialized to infinity
3: $t=0$
4: repeat
5: $t=t+1$
6: ${\underline{\mathit{m}}}_{i}^{\left(t\right)}={\left[{\mathbf{H}}_{i}^{\mathsf{T}}{\mathbf{W}}^{(t-1)}{\mathbf{H}}_{i}+{\gamma}_{i}{\tilde{\mathbf{W}}}^{(t-1)}\right]}^{-1}{\mathbf{H}}_{i}^{\mathsf{T}}{\mathbf{W}}^{(t-1)}\underline{\mathbf{1}}$ ▷ Closed-form WLS solution
7: ${\underline{\delta}}^{\left(t\right)}=\left|{\mathbf{H}}_{i}{\underline{\mathit{m}}}_{i}^{\left(t\right)}-\underline{\mathbf{1}}\right|$ ▷ Absolute losses
8: $\widehat{\sigma}=\frac{\mathrm{median}\left({\underline{\delta}}^{\left(t\right)}\right)}{0.6745}$ ▷ Preliminary estimate of the standard deviation of ${\underline{\delta}}^{\left(t\right)}$
9: ${\mathbf{W}}^{\left(t\right)}=\widehat{\sigma}\times \mathrm{diag}\left(\mathrm{min}\left(\frac{1}{{\underline{\delta}}^{\left(t\right)}},\phantom{\rule{4pt}{0ex}}{10}^{6}\right)\right)$ ▷ The × operator is the scalar product
10: ${\tilde{\mathbf{W}}}^{\left(t\right)}=\mathrm{diag}\left(\mathrm{min}\left(\frac{1}{\left|{\underline{\mathit{m}}}_{i}^{\left(t\right)}\right|},\phantom{\rule{4pt}{0ex}}{10}^{6}\right)\right)$
11: until $\left|\mathrm{mean}\left({\underline{\delta}}^{\left(t\right)}\right)-\mathrm{mean}\left({\underline{\delta}}^{(t-1)}\right)\right|<\epsilon$ or $t\ge T$
12: ${\underline{\mathit{m}}}_{i}={\underline{\mathit{m}}}_{i}^{\left(t\right)}$ ▷ Return the converged ${\underline{\mathit{m}}}_{i}$
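The IRLS procedure of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch following the steps above (weight clipping, the median-based scale estimate, and the mean-loss stopping rule), not the authors' implementation:

```python
import numpy as np

def relad_irls(H, gamma, tol=5e-5, T=20, clip=1e6):
    """IRLS solver for the RELAD regression of one spectral channel
    (a sketch of Algorithm 1).

    H     : (N, s) weighted feature matrix (rows of F divided by rho_i).
    gamma : scalar regularization parameter for this channel.
    """
    N, s = H.shape
    ones = np.ones(N)
    w = np.ones(N)            # data weights, the diagonal of W
    w_reg = np.ones(s)        # regularization weights, the diagonal of W-tilde
    prev_mean = np.inf
    for t in range(T):
        # Step 6: closed-form weighted least-squares solution
        A = H.T @ (w[:, None] * H) + gamma * np.diag(w_reg)
        m = np.linalg.solve(A, H.T @ (w * ones))
        # Step 7: component-wise absolute losses
        delta = np.abs(H @ m - ones)
        # Step 8: preliminary estimate of the loss standard deviation
        sigma = np.median(delta) / 0.6745
        # Steps 9-10: reweight by clipped reciprocals (clip at 1e6)
        w = sigma * np.minimum(1.0 / np.maximum(delta, 1.0 / clip), clip)
        w_reg = np.minimum(1.0 / np.maximum(np.abs(m), 1.0 / clip), clip)
        # Step 11: stop when the mean absolute loss stabilizes
        if abs(np.mean(delta) - prev_mean) < tol:
            break
        prev_mean = np.mean(delta)
    return m
```

Reweighting each equation by the reciprocal of its current absolute loss is what turns the repeated ${\ell}_{2}$ solves into an approximation of the ${\ell}_{1}$ (MRAE-like) objective at convergence.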

6. Conclusions
Spectral reconstruction (SR) recovers high-resolution radiance spectra from RGB images. Many methods are regression-based, with simple formulations and usually closed-form solutions, while the current state-of-the-art spectral recovery is delivered by much more sophisticated Deep Neural Network (DNN) solutions.
Recently, the top DNN models have been trained and evaluated based on the Mean Relative Absolute Error (MRAE), a relative error which measures the spectral difference as a percentage of the ground-truth spectral value. In contrast, all regressions are still trained using the Least Squares (LS) minimization, which does not imply a minimized MRAE result. This problem is further compounded by the suboptimal regularization setting used in conventional regressions, where all spectral channels are jointly regularized by a single penalty parameter.
In this paper, we developed new regression approaches that minimize relative errors and are regularized per spectral channel, including the closed-form Relative-Error Least Squares (RELS) approach and the Relative-Error Least Absolute Deviation (RELAD) approach (which directly minimizes MRAE and is solved by an iterative method).
Our results showed that the new minimization approaches significantly improve on the conventional regressions, especially in the darker regions of the images. Consequently, our best improved regression model narrows the performance gap with the leading DNNs to only 8% under the MRAE evaluation.