Towards ‘Fourth Paradigm’ Spectral Sensing

Reconstruction algorithms are at the forefront of accessible and compact data collection. In this paper, we present a novel reconstruction algorithm, SpecRA, that adapts based on the relative rarity of a signal compared to previous observations. We leverage a data-driven approach to learn optimal encoder-array sensitivities for a novel filter-array spectrometer. By taking advantage of the regularities mined from diverse online repositories, we are able to exploit low-dimensional patterns for improved spectral reconstruction from as few as p=2 channels. Furthermore, the performance of SpecRA is largely independent of signal complexity. Our results illustrate the superiority of our method over conventional approaches and provide a framework towards “fourth paradigm” spectral sensing. We hope that this work can help reduce the size, weight and cost constraints of future spectrometers for specific spectral monitoring tasks in applied contexts such as in remote sensing, healthcare, and quality control.


Introduction
Natural signals are compressible functions that represent changes in the spectrotemporal dynamics of physical phenomena [1]. Common examples of natural signals include light and sound. The information contained within a signal is encoded when it is received by an observer. Observers can be biological, such as the human eye, or mechanical (e.g., a digital camera). The most useful "observers" encode signal information into a format that can be read, copied, and shared with others via a process called quantization. Contemporary scientific discovery is increasingly dependent on encoding hardware as numerous autonomous processes require vast amounts of data. In what many are now calling the "fourth paradigm in scientific discovery" [2], efficient data transcoding is paramount. Herein, we approach the metrological process from the perspective of information science starting with quantization and concluding with reconstruction. We hypothesize that by exploiting regularities in existing datasets, we can develop optimized non-uniform protocols for spectral sensor placement and use adaptive methods to maximize reconstruction efficiency. While we focus on spectral sensing for visible light, the methods discussed are applicable to any set of signals characterized by locality and compositionality.

What Is "Spectral" Sensing?
Spectral sensing is used in a number of applied contexts including healthcare, remote sensing, and quality control [3][4][5]. To meet the requirements of this highly diverse set of applications, several classes of sensing technology have emerged.
While each application of spectral sensing has different requirements for what is considered "spectral" resolution, we can broadly identify three main classes: (1) tristimulus; (2) multispectral; and (3) hyperspectral imaging. Pixel-based red, green, blue (RGB) is perhaps the most ubiquitous "spectral" signal and is used for color rendering and other imaging purposes. While RGB has historically been excluded from spectral sensing technologies, interest in recovering broadband spectral signals from RGB has increased over the years [6][7][8][9][10][11]. So-called "RGB-to-spectrum" approaches are, however, limited in their applicability to broadband (low-complexity) signals and generally fail for signals with low autocorrelation. Multispectral imaging is an exciting emerging field and typically refers to systems capturing between 3 and 10 channels with bandwidths greater than 20 nm [12]. Common applications are in color evaluation for quality control and remote sensing [13]. Advances in optical filters and semiconductor technology have also relaxed the size, weight, power, and cost constraints, making multispectral sensing an attractive middle ground [14]. Hyperspectral imaging is perhaps the most diverse class, reserved for devices measuring more than 10 channels [15]. For imaging applications, hyperspectral cameras can have between 512 and 2048 channels over the visible domain, making them extraordinarily information-rich. While devices in each of these classes all collect "spectral" data, the range in resolution covers three orders of magnitude.

Existing Methods
Many compact spectral sensing methods exist based on the aforementioned technologies. Multispectral filter-array technology, in particular, has attracted considerable recent interest. Methods of multispectral demosaicking have improved low-cost "single shot" imaging [16], as have compressive methods [17]. Alternative approaches look for statistical regularities to exploit for task-specific applications. A plethora of work in natural image statistics has motivated approaches based on scene optimization [18] and RGB images [8,19,20]. Increasingly, however, focus has shifted towards data-driven computational methods to recover as much information as possible from low-fidelity measurements. This includes improving the optimization frameworks for diffractive achromats (DAs) [21] and constructive improvements to the alternating direction method of multipliers (ADMM) optimization used to solve ill-posed reconstruction problems [22].
Contemporary approaches to compressive spectral sensing rely on a diversity of sensor technologies, from liquid crystal phase retarders [23][24][25] to stacked array spectrometers with broadband filters [26]. Emergent technologies such as quantum dot and nanowire spectrometers [27,28] also show great potential to disrupt the spectral sensing space. At the same time, the manufacturing cost of "single shot" compact filter-array spectrometers with 3-20 channels and bandwidths in the 20 nm range has decreased dramatically to a fraction of the cost of conventional scanning spectrometers [29,30]. Regardless of the underlying technology, each measurement device maps data from the "natural" dimension to a hardcoded "measurement dimension" defined by the resolution of the device. Towards this end, we are interested in applying a data-driven approach to investigate the theoretical boundaries constraining reconstruction performance in the future generation of compact sensing hardware.

Beating Nyquist
In 2005, our understanding of the theoretical limits of sampling rate for bandlimited signals was such that, if a signal is sampled at a frequency f, perfect reconstruction is only guaranteed if the bandwidth b < f/2 [32]. This observation is a result of the Shannon-Nyquist sampling theorem, and while the theorem still holds today, two seismic advances in applied math have forced engineers to contextualize it in a new light. First, in 2006, it was demonstrated that sub-Nyquist sampling was possible without violating the theorem by requiring the signals to be sparse (i.e., compressible) in a generic basis [31,33]. This approach is called compressive sensing and has since revolutionized data collection by initializing compression at the point of quantization. Second, the increase in data availability has made it possible to mine prior observations for statistical regularities. These regularities can then be used to exploit symmetries and other structural properties in order to make inferences that further maximize reconstruction performance. In this way, the "datafication" of our world has had huge implications for metrology in general and spectral sensing in particular. In this paper, we will specifically show how advanced domain knowledge can be used to optimize the measurement process.

What Is Reconstruction?
Simply defined, reconstruction refers to a process of recovering a signal from a set of limited measurements [1]. In practice, this can be done several ways depending on the application and context, all of which result in solving the following problem:

$$\min_{\theta} \; K(\hat{s}, s) \quad \text{subject to} \quad \hat{s} = f_\theta(y). \tag{1}$$

Here, y = g φ (s) is a measurement of a high-dimensional signal s = E(λ) by an encoder g φ, reconstructed by a function f θ (c.f., Figure 1). The function K being minimized can be any metric or norm capturing the dissimilarity between the reconstruction and the ground truth. If the signal is sampled at or above the Nyquist rate, reconstruction can be as simple as constructing a linear fit through the sub-sampled points (we call this the "interpolation regime"). Alternatively, if the signal is undersampled but known to exhibit unique statistical regularities, "reconstruction" can also be a process of finding a match among known prior observations. In both examples, we say the reconstruction is "naïve" in the sense that interpolation and pattern matching are inherently trivial tasks.
If the reconstruction is lossy, performance metrics are accompanied by a compression power score to contextualize the trade-off between complexity and descriptivity. The compression power is defined as the ratio between the uncompressed and compressed file size (e.g., defined by the vector length). What can be misleading about the compression power is that, in the limit that the compressed file size reaches information saturation (i.e., is sampled at a lossless rate via Nyquist or CS), the compression ratio becomes meaningless when the uncompressed signal dimension is increased. This is because the features used for lossless reconstruction are fully formed at a given resolution, and artificially increasing the resolution of the uncompressed file will inflate the compression power. Towards this end, we propose a slightly amended performance metric W that encapsulates the amount of "work" performed by the algorithm f θ : R p → R n in order to reach a target error depending on the available information, where ∆ is the target performance threshold, γ is the proportion of signals matching signals in the Kanji, and µ is the proportion of signals recovered at or above the Nyquist rate.
We can think of W as a penalized compression power that is set equal to 0 when the reconstruction algorithm fails to outperform either the matching problem or interpolation. Given that we reconstruct a set of observations S encoded by g φ, from a Kanji K ∈ R m, we can define the proportion of adequately reconstructed signals as the ratio of the cardinalities of Ŝ and S, where

$$\hat{S} = \{\, s \;|\; s \in S \,\wedge\, K(s, \hat{s}) < \Delta,\ \hat{s} = f_\theta(y) \,\}$$

is the subset of reconstructed observations adhering to a dissimilarity score less than ∆, defined by a function K appropriate for the underlying datatype. For spectral sensing applications, this is typically the spectral angle (SA) or another derivative index corresponding to the spectral information divergence [34,35].
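As an illustration of this performance accounting, the following Python sketch (our own illustrative code, not from the paper) computes the spectral angle between reconstructed and ground-truth spectra and the proportion of reconstructions falling below a threshold ∆; the array shapes and threshold value are assumptions chosen for the example.

```python
import numpy as np

def spectral_angle(s, s_hat):
    """Spectral angle (radians) between a ground-truth spectrum and its reconstruction."""
    cos = np.dot(s, s_hat) / (np.linalg.norm(s) * np.linalg.norm(s_hat) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def adequate_fraction(S, S_hat, delta):
    """Fraction of reconstructions whose spectral angle is below the target threshold delta.

    S, S_hat : (num_signals, n) arrays of true and reconstructed spectra.
    """
    angles = np.array([spectral_angle(s, s_hat) for s, s_hat in zip(S, S_hat)])
    return float(np.mean(angles < delta))

# Toy usage with placeholder spectra on a 401-point visible-range grid.
rng = np.random.default_rng(0)
S = rng.random((100, 401))
S_hat = S + 0.01 * rng.standard_normal(S.shape)
print(adequate_fraction(S, S_hat, delta=0.05))
```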

Encoding light
Exploiting the regularities of natural spectra allows us not only to improve how spectral data are measured, compressed, and stored, but also how they are classified. Naturally, the question becomes: what is the intrinsic dimension of visible light?

Problem Statement
Efficient sensing is a two-step process requiring the informed design of an optimized encoder and an adaptive reconstruction algorithm. Thus, the contributions herein are twofold. First, to determine the optimal sensor locations for the reconstruction of broadband (low-complexity) and narrowband (high-complexity) spectra. Second, to develop a data-driven reconstruction algorithm (SpecRA) that balances simplicity and reconstruction fidelity. Together, we show that a combined workflow can account for real-world engineering and fabrication constraints on spectral sensors to determine the maximally informative dimensions of visible spectra in theory and practice.

Figure 1. Spectrometry is the process of encoding spectral data to a measured dimension p from the infinite natural signal dimension. This process is visualized in the above graphic and written mathematically as g φ : R ∞ → R p. The measured data can then be reconstructed to a target high resolution n > p such that f θ : R p → R n. Here, the encoder g φ is a filter-array spectrometer with some transmission functions R(λ, i) for i = (1, . . . , p) and f θ represents a reconstruction algorithm. This structure is analogous to that of an autoencoder where the learned encoder weights would replace the response functions of the physical spectrometer and p is the dimension of the hidden layer.
When we say that the algorithm has to "work", what we mean is that, unlike in pattern matching and interpolation tasks where there is no optimization procedure taking place, reconstruction has to "add" information that is not trivially available. When signals have no corresponding match and are sampled in the sub-Nyquist regime, the reconstruction problem formulated in Equation (1) becomes:

$$\min_{x} \; \lVert x \rVert_1 \quad \text{subject to} \quad s = \sum_{i} x_i k_i, \tag{2}$$

where x is a coefficient vector and k i ∈ K is a prior observation contained in our "Kanji" (a Pareto-optimal library distinct from a learned dictionary). A fundamental property of signal reconstruction is that the more complex a signal is, the more basis modes are needed to accurately approximate it. One of the consequences of deriving K from a big dataset L is that the likelihood that a "new" observation has already been measured and exists in an accessible dataset is high. Consequently, reconstruction in the era of big data is less about adhering to a specific methodology and more about finding the most efficient process required to return the true signal from an encoded measurement. Towards this end, ensuring that K comprises real spectral observations ensures that the "missing" information in an undersampled measurement can be "filled in". That said, when we take a measurement y, the only information we have is that of the encoded signal. This means that we have to "trick" the algorithm by solving:

$$\min_{\hat{x}} \; \lVert \hat{x} \rVert_1 \quad \text{subject to} \quad y = \sum_{i} \hat{x}_i \, g_\phi(k_i), \tag{3}$$

with the assumption that x̂ ≈ x. Given that K is not a generic basis, the guarantees of compressed sensing do not hold in this case. Instead, the efficacy of this assumption is constrained by the ability of signals in K to preserve a unique structure in the measured dimension.
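To make the "trick" of solving Equation (3) concrete, the sketch below approximates the l 1 problem with a LASSO solver applied to the encoded Kanji. The Kanji matrix, the encoder response matrix, and the penalty weight are placeholder assumptions for illustration, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def reconstruct_from_kanji(y, K, R, alpha=1e-3):
    """Find sparse weights x_hat over the encoded Kanji elements that explain the
    measurement y, then lift the estimate back to the full spectral resolution.

    y : (p,) measurement
    K : (m, n) matrix of prior spectra (one Kanji element per row)
    R : (p, n) encoder response matrix, so that g_phi(s) = R @ s
    """
    K_enc = (R @ K.T).T                      # each Kanji element seen through the encoder, (m, p)
    lasso = Lasso(alpha=alpha, positive=True, max_iter=10000)
    lasso.fit(K_enc.T, y)                    # y ~= K_enc.T @ x_hat with an l1 penalty on x_hat
    x_hat = lasso.coef_
    return K.T @ x_hat                       # s_hat = sum_i x_hat_i * k_i

# Toy usage with random placeholders.
rng = np.random.default_rng(1)
K = rng.random((200, 401))                   # 200 prior spectra on a 401-point grid
R = rng.random((8, 401))                     # hypothetical 8-channel encoder
y = R @ K[17]                                # measurement of a signal that is in the Kanji
s_hat = reconstruct_from_kanji(y, K, R)
```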

Towards Data-Driven Bases for Reconstruction
Variation in spectral distributions is created from interactions between light and matter. In order to derive a basis, we first need to compile some observable data. Here, we compile a library of illuminant and reflectance spectra from available open source datasets [40][41][42][43]. A high-level description of our library L is summarized in Figure 3. Since each spectrum was sampled at different frequencies, we normalized all spectra within the visible range from 380 to 780 nm and re-sampled them using standard interpolation methods. Within L, we have 401 spectra which were used as a representative set for color rendition studies in addition to 99 color evaluation samples (CESs) uniformly distributed within the natural color system (NCS) gamut [44,45].
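A minimal sketch of this preprocessing step, assuming linear interpolation onto a 1 nm grid and peak normalization (the paper only states that standard interpolation methods were used, so these specific choices are assumptions):

```python
import numpy as np

VIS = np.arange(380, 781, 1)  # common 1 nm grid over the visible range (nm)

def to_common_grid(wavelengths, values):
    """Resample a spectrum onto the 380-780 nm grid and normalize it to unit peak."""
    resampled = np.interp(VIS, wavelengths, values)
    peak = resampled.max()
    return resampled / peak if peak > 0 else resampled

# Toy usage: a Gaussian-shaped spectrum originally sampled every 5 nm.
wl = np.arange(380, 781, 5)
spd = np.exp(-0.5 * ((wl - 550) / 40.0) ** 2)
s = to_common_grid(wl, spd)
```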

Figure 3. A high-level description of the contents of our spectral libraries for the illuminants and material reflectances. Of the 746 illuminant spectra, 146 are theoretical and 600 are real. The 1021 reflectances correspond to material samples under equal energy illumination.
We can see that despite the variety of illuminants and materials, there are clear regularities in signal space (e.g., very few spectra have high relative power between 380 and 430 nm).
Additions to the illuminant class include mobile and computer screens as well as high-intensity discharge (HID) sources used in street lighting, and daylight spectra for various sun angles above the horizon in urban and rural settings [46]. We acknowledge that this set is not complete; however, we believe that, from a mechanistic perspective, the space of available spectral illuminants is sufficiently sampled and is likely overcomplete for some sub-classes [47]. As in pixel space, the vastness of signal space means that natural signals are inherently rare, with the vast majority of possible signals containing no information [48]. Even if future spectral measurements are naturally sparse in L, there is a lot of redundancy making L computationally heavy. This is where our learned Kanji can be useful. If K comprises the same features as L, then we can use K to derive a low-rank basis (with the default being uniform placement). Towards this end, we investigate the following sparse coding methods:
(1) singular value decomposition (SVD); (2) symmetric non-negative matrix factorization (SymNMF); (3) sparse dictionary learning (SDL); and (4) deep autoencoders (DAE).


Singular Value Decomposition
The simplest data-driven basis can be derived by computing the first r column vectors of the unitary matrix of the singular value decomposition [39]. Given some representative data K, a basis Ψ r can be found by computing

$$K = \Psi \Sigma V^{*}$$

and retaining the first r columns of Ψ as Ψ r. The main advantage of SVD is that it can be quickly and efficiently performed in most computational software packages. The resulting basis vectors (columns of Ψ r) are ordered with respect to the strength of their contribution in representing variance in the original data matrix. When plotted, it is evident that any similarity to real-world spectral power distributions is lost. Instead, we can think of the basis vectors as defining an abstract "feature space". While the advantage of an SVD-derived basis is simplicity, the drawback is that the basis can only be used to find a linear map (i.e., a relatively simple relationship given the abilities of modern deep autoencoders). In fact, we can think of SVD as a special case of a DAE wherein the encoder weights describe a linear relationship.
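A minimal NumPy sketch of deriving such a basis; whether spectra are stored as columns or rows of K is an assumption of this example (columns here):

```python
import numpy as np

def svd_basis(K, r):
    """Return the first r left singular vectors of the data matrix K (spectra as columns)."""
    Psi, _, _ = np.linalg.svd(K, full_matrices=False)
    return Psi[:, :r]

# Toy usage: an 8-mode basis from 200 placeholder spectra on a 401-point grid.
rng = np.random.default_rng(2)
K = rng.random((401, 200))
Psi_r = svd_basis(K, r=8)
```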

Symmetric Non-Negative Matrix Factorization
While the SVD basis works well capturing the features of K, it notoriously lacks interpretability when applied to physical systems where negative values may be meaningless. Within the context of designing a spectral imaging sensor, the response sensitivities of each channel must be positive because there is no physical way to interpret negative sensitivity. Towards this end, we implement symmetric non-negative matrix factorization (SymNMF) [49,50]. SymNMF overcomes the pitfalls of other algorithms insofar as it is capable of capturing nonlinear cluster structures (unlike standard non-negative matrix factorization). Even more interestingly, SymNMF optimization is independent of the eigenspace of the affinity matrix (unlike spectral clustering). Furthermore, the affinity matrix A can be defined with respect to any appropriate distance metric given a priori knowledge of the datatype. The minimization problem for SymNMF is defined as

$$\min_{\Psi_r \geq 0} \; \lVert A - \Psi_r \Psi_r^{\mathsf T} \rVert_F^2,$$

where r is the rank of Ψ. If the number of spectra in L is n, then A is a square n × n matrix where each element in A corresponds to a measure of distance between observations. Formally, we define A elementwise as

$$A_{ij} = d(k_i, k_j),$$

where d(k i, k j) is a similarity measure (e.g., based on the Euclidean distance) between k i, k j ∈ K. Here, Ψ r has r columns corresponding to the learned basis vectors. One of the core benefits of SymNMF is that d can be selected via knowledge of the underlying datatype. In the most abstract applications, d may best be represented by information theoretic measures such as the normalized information and compression distances [51,52].
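A projected-gradient sketch of SymNMF (one of several possible solvers, not necessarily the one used here); the Gaussian affinity kernel and the fixed step size are illustrative assumptions:

```python
import numpy as np

def symnmf(A, r, iters=5000, step=1e-4, seed=0):
    """Symmetric NMF by projected gradient descent: minimize ||A - H H^T||_F^2 with H >= 0.

    A : (m, m) symmetric non-negative affinity matrix
    r : target rank (number of basis vectors)
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], r))
    for _ in range(iters):
        grad = 4.0 * (H @ (H.T @ H) - A @ H)   # gradient of the Frobenius objective
        H = np.maximum(H - step * grad, 0.0)   # gradient step, then project onto H >= 0
    return H

# Toy usage: Gaussian affinity built from pairwise distances between placeholder spectra.
rng = np.random.default_rng(3)
K = rng.random((50, 401))
d = np.linalg.norm(K[:, None, :] - K[None, :, :], axis=-1)
A = np.exp(-d**2 / (2.0 * d.mean()**2))        # one common choice of similarity kernel
Psi_r = symnmf(A, r=5)                         # note: the fixed step may need tuning for other A
```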

Sparse Dictionary Learning
Sparse dictionary learning (SDL) is a sub-domain of sparse representation that spans a number of algorithms, most notably: the method of optimal directions (MOD) [53,54]; k-singular value decomposition (K-SVD) [55]; and online dictionary learning (ODL) [56], which is commonly implemented for its competitive speed [57]. While it may appear that learned dictionaries are naturally superior, it is important to understand their benefits and shortcomings. First, SDL may be unnecessary if the data are naturally sparse in signal space [39]. Second, dictionary learning algorithms tend to be computationally expensive because they require multiple iterations to converge on the optimal solution. Regardless, their applicability in sparse approximation should not be ignored, and while the atoms do not necessarily retain their similarity to real-world spectra, they share more similarities with real spectra than those of any of the other methods and do not produce negative values if the input data are nonnegative. We seek to find a dictionary Ψ r by minimizing the following loss function reported in [58]:

$$\min_{\Psi_r,\, X} \; \frac{1}{2} \lVert K - \Psi_r X \rVert_F^2 + \beta \lVert X \rVert_1.$$

Here, β is a sparsity-promoting coefficient. This loss function is commonly referred to as sparse coding or LASSO regression [59]. LASSO balances sparsity with model complexity in order to promote a dictionary with low cross-validated error and is one proposed approach to solving the l 1 -minimization problem framed in Equation (3).
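A sketch of the online dictionary learning variant using scikit-learn; the number of atoms and the penalty alpha (playing the role of the sparsity-promoting coefficient β) are placeholder values:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Placeholder library: one spectrum per row on a 401-point grid.
rng = np.random.default_rng(4)
K_rows = rng.random((500, 401))

# Online dictionary learning (ODL) with an l1 penalty on the codes; positive_dict
# keeps the atoms non-negative, in line with the physical interpretation above.
odl = MiniBatchDictionaryLearning(n_components=16, alpha=1.0, batch_size=32,
                                  positive_dict=True, random_state=0)
codes = odl.fit_transform(K_rows)   # sparse coefficients, shape (500, 16)
Psi_r = odl.components_             # learned dictionary atoms, shape (16, 401)
```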

Deep Autoencoders
Autoencoders refer to a specific class of artificial neural networks whose aim is to learn the most efficient encoding of some data in a target low-dimensional representation (latent space) [60]. The architecture of an autoencoder is roughly represented by the sketch in Figure 1, where the encoding weights Ψ are learned via the loss function:

$$\mathcal{L} = \frac{1}{N} \sum_{j=1}^{N} \lVert y_j - \hat{y}_j \rVert_2^2 + \lambda\, \Omega_w + \eta\, \Omega_s, \tag{9}$$

where Ω w acts as an l 2 penalty on the encoder weights and Ω s enforces sparsity via the Kullback-Leibler (KL) divergence [61]. While the loss function for the autoencoder requires more unpacking than the others, the key take-away is that the mean squared error (MSE) is minimized between the learned representation ŷ = f θ (y) and y for some latent basis Ψ, given regularization constraints on the weights associated with each basis vector and a sparsity constraint on the reconstruction of the output. In essence, the goals and ambitions are well aligned with the other methods but with an added degree of flexibility. Equation (9) is closely related to the sparse relaxed regularized regression (SR3) method [62] aimed at finding a less restrictive loss function.
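A compact PyTorch sketch in the spirit of Equation (9); the layer sizes, sparsity target, and penalty weights are illustrative assumptions, with the l 2 weight penalty Ω w implemented via weight decay and Ω s via a KL term on the mean latent activations:

```python
import torch
import torch.nn as nn

class SparseAE(nn.Module):
    def __init__(self, n=401, p=8):
        super().__init__()
        self.enc = nn.Linear(n, p)
        self.dec = nn.Linear(p, n)

    def forward(self, s):
        z = torch.sigmoid(self.enc(s))     # latent code constrained to [0, 1]
        return self.dec(z), z

def kl_sparsity(z, rho=0.05, eps=1e-8):
    """KL divergence between a target activation rho and the mean activation of each unit."""
    rho_hat = z.mean(dim=0).clamp(eps, 1 - eps)
    return (rho * torch.log(rho / rho_hat)
            + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()

model = SparseAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)  # weight decay ~ Omega_w
S = torch.rand(256, 401)                    # placeholder spectra
for epoch in range(200):
    s_hat, z = model(S)
    loss = nn.functional.mse_loss(s_hat, S) + 1e-3 * kl_sparsity(z)     # MSE + Omega_s
    opt.zero_grad(); loss.backward(); opt.step()
```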

Implementation via QU Factorization
To summarize our steps up to this point: we amassed a library L ∈ R n of available online datasets without processing it in any way. We then used a subset K ∈ R m as an input to a number of well-known sparse-coding methods to arrive at four candidate low-rank bases Ψ i ∈ R r for n > m > r. We now want to use these bases to design different non-uniform encoders by assigning response functions to the pivot points derived via QU factorization [63]:

$$\Psi_i^{\mathsf T} \mathbf{P}_i = \mathbf{Q}\mathbf{U}. \tag{10}$$

Here, a pivot matrix P is derived for each of the four "data-driven" bases (n.b., the uniform basis is not subscripted). Equation (10) can be solved using preset commands in most computational suites, taking the basis Ψ i as the only input and outputting the pivot points. The resulting pivots (non-zero entries in P) correspond to the peak wavelengths used to construct our encoder. From these points, we can define a response function by fitting a Gaussian distribution consistent with most available filters used in array-type spectrometers [14]. The generic structure for a sensor with approximately Gaussian responsivity over a wavelength range λ ∈ Λ is defined as

$$R(\lambda, i) = \exp\!\left( -4\ln 2\, \frac{\left(\lambda - \Lambda(P_{ij})\right)^2}{\sigma^2} \right), \tag{11}$$

where Λ(P ij) is the peak wavelength corresponding to the non-zero element of the p-th column of P (for a total of p channels), and σ is the full-width at half-maximum (FWHM).
As it is infeasible to manufacture photodiode sensors with single-wavelength sensitivity channels (the diodes themselves are made from semiconductors with limited physical properties), available filters have FWHM values such that σ ∈ [18, 25] nm [14]. In Figure 4, we show example response functions constructed via Equation (11); the peak responsivity corresponds to the pivot points derived via QU factorization.
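A sketch of the pivot-to-encoder step, assuming SciPy's column-pivoted QR as the "preset command" and a standard FWHM-to-standard-deviation conversion for the Gaussian responses; the wavelength grid and basis are placeholders:

```python
import numpy as np
from scipy.linalg import qr

VIS = np.arange(380, 781, 1)                       # wavelength grid (nm)

def encoder_from_basis(Psi_r, fwhm=20.0):
    """Place channels at the pivot wavelengths of a column-pivoted QR factorization
    of the basis, then assign each channel a Gaussian response function.

    Psi_r : (n, r) data-driven basis; returns a response matrix R of shape (r, n).
    """
    _, _, piv = qr(Psi_r.T, pivoting=True)         # pivots index the most informative wavelengths
    peaks = VIS[piv[:Psi_r.shape[1]]]
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # convert FWHM to a Gaussian std
    return np.exp(-0.5 * ((VIS[None, :] - peaks[:, None]) / sigma) ** 2)

# Toy usage with a random placeholder basis.
rng = np.random.default_rng(5)
Psi_r = rng.standard_normal((401, 6))
R = encoder_from_basis(Psi_r, fwhm=20.0)           # 6-channel encoder response matrix
```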

An encoder, g φ , with p channels defines a measurement process g φ : R ∞ → R p. As with all "natural" non-bandlimited signals, E(λ) ∈ R ∞ requires that we make an assumption that E(λ) ∈ R n, where n is finite and p ≪ n. When p < 2K log(n/K) + (7/5)K, where K is the sparsity of the coefficient vector [33], g φ is a lossy compressor, as the rate is below the minimum required for lossless reconstruction via Shannon-Nyquist and compressed sensing. We can further minimize information loss by designing g φ in a way that leverages domain knowledge of the underlying datatype g φ will likely encode. To do this, we need to construct a simple mathematical model for g φ that simulates the measurement process of a real-world physical sensor. This is performed by modeling the output current of each channel as proportional to the sum of the response function multiplied by the unknown spectral distribution:

$$y_i = \sum_{\lambda = \min(\Lambda)}^{\bar{\lambda}} R(\lambda, i)\, E(\lambda) + \epsilon(\lambda, i), \qquad i = 1, \ldots, p,$$

where λ̄ = max(Λ) and ε(λ, i) is the per-wavelength measurement error associated with channel i. If we assume the response functions are roughly Gaussian, the relative differences between measured points are only preserved if the response functions are the same width (which is not always the case in real-world sensors).
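A minimal simulation of this measurement model, assuming additive Gaussian channel noise as a stand-in for the per-wavelength error term:

```python
import numpy as np

def measure(R, s, noise_std=0.0, seed=None):
    """Simulate the encoder output: each channel sums its response function against
    the unknown spectral distribution, plus optional measurement noise.

    R : (p, n) channel response matrix; s : (n,) spectrum on the same wavelength grid.
    """
    rng = np.random.default_rng(seed)
    y = R @ s
    if noise_std > 0:
        y = y + rng.normal(0.0, noise_std, size=y.shape)
    return y

# Toy usage, reusing a response matrix R and a spectrum s from the earlier sketches:
# y = measure(R, s, noise_std=1e-3)
```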

SpecRA: An Adaptive Reconstruction Framework
Reconstruction is broadly characterized by three regimes depending on the accessibility of information: matching, reconstruction, and interpolation. When signals are easily describable in K, there is little risk of overfitting, but when signals exhibit unseen features, adding a greater number of elements to K does not always imply greater reconstruction performance. What we seek to accomplish with SpecRA is to develop an adaptive framework that uses the knowledge of K, together with the measured signal y, to triage observations as they arrive at the sensor in order to apply the reconstruction method with the highest probability of success. The core decision making comes down to the value of a single measure, which we refer to as the "rarity" of the measurement y relative to the encoded elements in the reference set k i ∈ K. The challenge with this approach is that measured signals lose much of their unique features during the encoding process. In the previous section, we outlined a workflow to improve the preservation of feature structure by deriving data-driven non-uniform sampling alternatives. Here, we aim to benefit from this groundwork in order to demonstrate the superiority of the proposed, integrated method outlined in Figure 5.
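The triage logic can be sketched as follows; the specific rarity score used here (minimum normalized distance between the measurement and the encoded Kanji elements) and the regime thresholds are illustrative assumptions, not the definitions used by SpecRA:

```python
import numpy as np

def specra_triage(y, K_enc, p, n, K_sparsity, match_tol=1e-3):
    """Assign an incoming measurement to one of three reconstruction regimes.

    y          : (p,) measurement
    K_enc      : (m, p) encoded Kanji elements g_phi(k_i)
    p, n       : measured and target dimensions
    K_sparsity : assumed sparsity K of the signal in the chosen basis
    """
    d = np.linalg.norm(K_enc - y, axis=1) / (np.linalg.norm(y) + 1e-12)
    rarity = d.min()                      # illustrative rarity score
    if rarity < match_tol:
        return "matching"                 # a near-identical prior observation exists
    if p >= 2 * K_sparsity * np.log(n / K_sparsity) + 1.4 * K_sparsity:
        return "interpolation"            # enough channels for (near-)lossless recovery
    return "reconstruction"               # undersampled and rare: solve the sparse problem
```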

A transcoder can be a physical device or computer program which takes data (in this case spectral light data) from one dimension, encodes it to a lower dimension, and decodes it back to a higher resolution which may or may not be the same as the original input dimension. For simplicity, we will call the encoder Q and the decoder Q^-1, such that Q : R^d -> R^p where p < d, and Q^-1 is approximated by a reconstruction algorithm. As this approximation approaches Q^-1, the reconstruction performance increases, although perfect (i.e., lossless) reconstruction may not be guaranteed. Towards this end, designing a good transcoder means designing both an efficient encoder Q and a corresponding reconstruction algorithm.
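As a concrete illustration of the notation above, the following minimal sketch (Python with NumPy; the variable names and the choice of a plain least-squares decoder are ours, not taken from the original text) treats Q as a linear map from R^d to R^p and approximates Q^-1 with the Moore-Penrose pseudo-inverse, the simplest possible reconstruction algorithm.

import numpy as np

# Minimal transcoder sketch: Q is a linear map R^d -> R^p (p < d), and the
# decoder is approximated by the pseudo-inverse. This only illustrates the
# notation; SpecRA itself reconstructs from a library of priors instead.
d, p = 401, 8                         # e.g., 1 nm sampling over 380-780 nm, 8 channels
rng = np.random.default_rng(0)

Q = rng.standard_normal((p, d))       # encoder: y = Q @ x
x = np.abs(rng.standard_normal(d))    # stand-in "spectrum"

y = Q @ x                             # encode to the lower dimension p
x_hat = np.linalg.pinv(Q) @ y         # approximate inverse: lossy whenever p < d

print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))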

Designing an optimal encoder
Since the data we are encoding are spectral data (i.e., ambient light), the 'encoder' will be a form of spectrometer, specifically a filter-array spectrometer in which the spectrum is measured by m channels placed at different wavelengths. Each channel is sensitive to a particular wavelength, and current state-of-the-art filters are roughly Gaussian with a full width at half maximum (FWHM) of around 20 nm [1,2]. Of course, the narrower and more plentiful the filters, the better the spectrum is measured (for reference, conventional spectrometers have 2048 channels and sample every 0.3 nm). Achieving such results with low-cost and compact spectrometers is currently impossible. For this reason, optimization must be carried out within the constraints we are given: FWHM of approximately 20 nm, and if we can reconstruct the same data with fewer channels without significant loss of information, that is an advantage (i.e., data collection becomes more efficient). Towards this end, we can only change two things: where the channels are placed on the wavelength axis, and the number of channels (between 3 and 20 channels for compact filter-based spectrometers).
Each channel's response is centered at λ(P_i), the peak wavelength corresponding to the non-zero element of its column of P (one column per channel, for a total of p channels), with σ denoting its full-width at half-maximum (FWHM). As it is infeasible to manufacture photodiode sensors with single-wavelength-sensitivity channels (the diodes themselves are made from semiconductors with limited physical properties), available filters have FWHM values such that σ ∈ [18, 25] nm (Crocombe, 2019). An encoder, g_φ, with p channels defines a measurement process.
As with all "natural" non-bandlimited signals, E(λ) ∈ R^∞, which requires that we make the assumption that E(λ) ∈ R^n, where n is finite and p ≪ n. When p < 2K log(n/K) + (7/5)K, where K is the sparsity of the coefficient vector (Candès et al., 2006b), g_φ is a lossy compressor, as the rate is below the minimum required for lossless reconstruction via Shannon-Nyquist and compressed sensing. We can further minimize information loss by designing g_φ in a way that leverages domain knowledge of the underlying datatype g_φ will likely encode.
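To make the channel-count condition concrete, the short check below evaluates the bound quoted above, p < 2K log(n/K) + (7/5)K, for illustrative values of n and K (the numbers are our assumptions, chosen only for this example).

import numpy as np

def lossless_channel_bound(n: int, K: int) -> float:
    # Bound quoted in the text: p below this value implies a lossy compressor.
    return 2 * K * np.log(n / K) + (7 / 5) * K

n, K = 401, 6          # assume a 401-point visible spectrum that is ~6-sparse
bound = lossless_channel_bound(n, K)
for p in (2, 3, 8, 64):
    regime = "lossy" if p < bound else "potentially lossless"
    print(f"p = {p:3d}: bound = {bound:.1f} -> {regime}")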
To do this, we need to construct a simple mathematical model for g_φ that simulates the measurement process of a real-world physical sensor. This is done by modeling the output current as proportional to the sum of the response function multiplied by the unknown spectral distribution:
y_i = Σ_λ R_i(λ) E(λ) + ε(λ, p),  i = 1, …, p,  (13)

where the sum runs over the sampled wavelengths up to λ_max = max(Λ), and ε(λ, p) is the per-wavelength measurement error associated with the channels. If we assume the response functions are roughly Gaussian, the relative differences between measured points are only preserved if the response functions all have the same width (which is not always the case in real-world sensors).
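The sketch below simulates this measurement model under the stated assumptions: roughly Gaussian channel responses parameterized by peak wavelength and FWHM, and channel outputs formed as the response-weighted sum of the spectrum plus an error term. The exact functional form of the original response functions is not reproduced in this extract, so the Gaussian below is our approximation of it.

import numpy as np

def gaussian_responses(peaks_nm, fwhm_nm, wavelengths_nm):
    # Assumed channel sensitivities: Gaussians parameterized directly by FWHM.
    lam = np.asarray(wavelengths_nm)[None, :]          # shape (1, n)
    mu = np.asarray(peaks_nm)[:, None]                 # shape (p, 1)
    return np.exp(-4 * np.log(2) * (lam - mu) ** 2 / fwhm_nm ** 2)

def measure(spectrum, responses, noise_std=0.0, rng=None):
    # y_i = sum_lambda R_i(lambda) E(lambda) + eps, cf. Equation (13).
    rng = rng or np.random.default_rng(0)
    eps = noise_std * rng.standard_normal(responses.shape[0])
    return responses @ spectrum + eps

wavelengths = np.arange(380, 781)                      # 1 nm grid over the visible domain
peaks = np.linspace(400, 760, 8)                       # 8 channels, uniform placement
R = gaussian_responses(peaks, fwhm_nm=20.0, wavelengths_nm=wavelengths)

E = np.exp(-((wavelengths - 550) / 60.0) ** 2)         # toy broadband spectrum
y = measure(E, R)                                      # noiseless, as in the later simulations
print(y.round(2))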

SpecRA: an adaptive reconstruction framework
Reconstruction is broadly characterized by three regimes depending on the accessibility of information: matching, reconstruction, and interpolation. When signals are easily describable in K, there is little risk of overfitting, but when signals exhibit unseen features, adding a greater number of elements to K does not always imply greater reconstruction performance. What we seek to accomplish with SpecRA is to develop an adaptive framework that uses the knowledge of K, together with the measured signal y, to triage observations as they arrive at the sensor in order to apply the reconstruction method with the highest probability of success. The core decision making comes down to the value of a measure we refer to as the relative "rarity", R, of the measurement y relative to the encoded elements k_i ∈ K of the reference set (Equation (14)). The challenge with this approach is that measured signals lose many of their unique features during the encoding process. In the previous section, we outlined a workflow to improve the preservation of feature structure by deriving data-driven non-uniform sampling alternatives.
Here, we aim to benefit from this groundwork in order to demonstrate the superiority of the proposed method.
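The exact rarity measure (Equation (14)) is not reproduced in this extract, so the sketch below uses a plausible stand-in: rarity as one minus the best cosine similarity between the encoded measurement y and the encoded reference spectra. The definition and the function names are our assumptions, intended only to illustrate the triage idea.

import numpy as np

def rarity(y, encoded_refs):
    # Stand-in rarity: 1 - max cosine similarity to the encoded references.
    # encoded_refs has shape (num_refs, p); y has shape (p,). Equation (14)
    # in the original work may differ from this proxy.
    y_n = y / np.linalg.norm(y)
    refs_n = encoded_refs / np.linalg.norm(encoded_refs, axis=1, keepdims=True)
    return 1.0 - float(np.max(refs_n @ y_n))

rng = np.random.default_rng(1)
encoded_refs = np.abs(rng.standard_normal((500, 3)))   # toy encoded library, p = 3
y_seen = encoded_refs[42] + 0.01 * rng.standard_normal(3)
y_odd = np.array([1.0, 0.0, 0.0])

print("familiar signal:", rarity(y_seen, encoded_refs))
print("unusual signal: ", rarity(y_odd, encoded_refs))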
Figure 5. The SpecRA algorithm sorts raw measurements based on their rarity R relative to a reference set K and applies the appropriate minimizing optimization (i.e., M1, M2, or M3). This process helps prevent both underfitting and overfitting, making for an adaptive and more generalizable framework. In general, the greater the rarity R, the more algorithmic "work" is required.
The three minimizing optimization processes are defined as follows:

M1: Return the nearest match k_i such that R = min(R);
M2: Solve min_x ||x - w||_1 with w_(j=i) = 1 and w_(j≠i) = 0, subject to y = x g_φ(K);
M3: Solve min_x ||x||_1 subject to y = x g_φ(K).
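A minimal sketch of the triage logic is given below. The rarity proxy, the assignment of the triage constants α and β (discussed in the next paragraph) to the M1/M2/M3 regimes, and the use of a generic linear-programming solver (SciPy's linprog) in place of the primal-dual algorithm mentioned later are all our assumptions; the original implementation may differ in each of these details.

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(G, y):
    # min ||x||_1 subject to G.T @ x = y, written as a linear program.
    # G: (N, p) encoded library g_phi(K); y: (p,) measurement; returns x: (N,).
    N = G.shape[0]
    c = np.concatenate([np.zeros(N), np.ones(N)])            # variables z = [x, u]
    A_ub = np.block([[np.eye(N), -np.eye(N)],                 # x - u <= 0
                     [-np.eye(N), -np.eye(N)]])               # -x - u <= 0
    A_eq = np.hstack([G.T, np.zeros((G.shape[1], N))])        # G.T @ x = y
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * N), A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * N + [(0, None)] * N, method="highs")
    return res.x[:N]

def specra_triage(y, K, G, alpha, beta):
    # Route a measurement to M1, M2, or M3 using the stand-in rarity from above.
    sims = (G / np.linalg.norm(G, axis=1, keepdims=True)) @ (y / np.linalg.norm(y))
    i = int(np.argmax(sims))
    r = 1.0 - float(sims[i])
    if r < alpha:                                  # M1: return the nearest match
        return K[i]
    if r < beta:                                   # M2: sparse correction around w (w_i = 1)
        w = np.zeros(len(K)); w[i] = 1.0
        x = w + basis_pursuit(G, y - G.T @ w)
        return x @ K
    x = basis_pursuit(G, y)                        # M3: plain l1 reconstruction
    return x @ K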
To determine the triage constants α and β, we incrementally increased each value while repeatedly applying the algorithm to randomized testing and validation partitions. We also realized that β should depend on the number of channels p of the encoder, because lower-dimensional data will more effectively "mask" the rarity of the signal (i.e., leading to greater metamerism). For this reason, we repeated this analysis for p = (1, …, 25) channels. While we experimented with many different relationships, the ansatz that β scales with the inverse square root of p was the most successful. Consequently, we were able to determine the following relationship for the available spectral data: β ∝ L/√p, where L is the target loss (of the reconstruction) determined by the spectral angle mapper [64]. In practice, L_β = 0.05. For α, we found that this relationship also holds, albeit for a smaller target loss, L_α = 0.01.
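Only the scaling with 1/√p and the two target losses (L_α = 0.01, L_β = 0.05) are stated in the text, so the schedule below, including its proportionality constant, is an assumed illustration rather than the fitted relationship.

import numpy as np

def triage_threshold(target_loss: float, p: int, c: float = 1.0) -> float:
    # Assumed schedule: threshold proportional to the target loss and to 1/sqrt(p).
    return c * target_loss / np.sqrt(p)

for p in (2, 3, 8, 16, 25):
    print(f"p = {p:2d}: alpha ~ {triage_threshold(0.01, p):.4f}, "
          f"beta ~ {triage_threshold(0.05, p):.4f}")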

Non-Uniform Performance Dynamics
In this section, we present results from simulated data for which we compare the viability of the learned non-uniform sampling protocols against the uniform reference. We compute the mean errors and plot their distribution as a function of method and rate. Additionally, we test our hypotheses regarding the correlation between loss, signal complexity, and signal rarity.

Comparison to Existing Approaches
To obtain an idea of the differences between SpecRA and other competing approaches, we present results from the undersampled regime (p = 3). Here (Figure 6), we simulated the response that a tristimulus sensor would have with uniform and non-uniform responsivity. As expected, the information being fit is too coarse for the Fourier modes to find a fit in the measured dimension. LASSO fails to find a parsimonious fit and succumbs to overfitting. Because of the adaptability of SpecRA, the fit is more balanced and the ansatz made in the low-rank space is not far from the ground truth. Furthermore, we can see that the performance increases when structure is preserved by finding a more optimal low-rank encoding.

Figure 6. Comparing the performance of the algorithm in the "extreme" undersampled case of p = 3, we see how SpecRA outperforms the other methods, mostly by avoiding overfitting. Furthermore, the performance improves when the signal is encoded with a learned non-uniform protocol (in this case, derived from the weights learned by training a deep autoencoder network).
While reconstruction with Fourier modes may be more appropriate in higher dimensional reconstruction (i.e., p > 10), the advantages are only seen when the measured signal is typologically distant from the reference set of prior observations. SpecRA takes a simple yet effective approach: maximizing the available information and not overfitting.

Loss as a Function of Method and Rate
As we are working with spectral data, we report the reconstruction error (loss) in terms of the spectral angle mapper (SAM), defined as the angle between the reference spectrum x and the reconstructed spectrum x̂, SAM(x, x̂) = arccos(⟨x, x̂⟩ / (||x|| ||x̂||)). In order to compare results, we first split our library L into five random training (T) and validation (V) sets. The training sets comprised n_T = 300 spectra, while the validation sets comprised n_V = 1417 spectra (i.e., the remaining signals in L after removing the 146 theoretical sources). We then constructed our encoder using the training set to first derive K, then Ψ, and finally P, which is used to construct the response functions R for g_φ. Then, we simulated the measurement process of the spectra in the validation set by Equation (13) with ε(λ, p) = 0 (i.e., for comparative analysis and applications where simulated data are used, e.g., rendering) for p = (1, …, 25). We then reconstructed the measured spectra via the SpecRA algorithm and computed the reconstruction loss defined above.
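For reference, a direct implementation of the SAM loss as the angle between a reference spectrum and its reconstruction (the standard definition, which we assume matches the form used in [64]) is:

import numpy as np

def sam_loss(x, x_hat):
    # Spectral angle mapper: angle (in radians) between reference and reconstruction.
    cos = np.dot(x, x_hat) / (np.linalg.norm(x) * np.linalg.norm(x_hat))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

lam = np.arange(380, 781)
x = np.exp(-((lam - 550) / 60.0) ** 2)        # toy ground-truth spectrum
x_hat = x + 0.02 * np.sin(lam / 15.0)         # toy reconstruction with a small error
print(f"SAM loss: {sam_loss(x, x_hat):.4f} rad")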
What we see in Figure 7 is the type of Pareto distribution we expect in such experiments. As the number of channels increases, a plateauing effect is eventually observed wherein adding more channels to the encoder does not yield greater returns in minimizing loss. What is interesting is that the uniform approach is very clearly the exception, resulting in a loss 150% of that of the nearest non-uniform approach for an encoder with two channels. Aside from the mean, we can also plot the individual loss per reconstructed spectrum in the validation set. In Figure 8, we can see how plotting the uniform losses against the non-uniform losses directly evidences the efficacy of the underlying method. While it is clear from Figure 7 that non-uniform methods are competitive for p < 4, we can see how this competitive edge is maintained for larger p. Even after the plateauing of the mean, the distribution of losses exhibits a bias in favor of non-uniform methods (cf. SVD, SNM, and SDL for p = 12). Interestingly, this effect is not observed for responses derived via the autoencoder weights, which are presumed to be a generalization of the SVD modes and otherwise demonstrate competitive results.

Figure 7. Mean loss and standard deviation for each method as a function of the sampling rate. The methods are abbreviated by three-letter codes for visual clarity. The mean SAM loss, L, is displayed as a heat map and as a Pareto plot. The difference between methods is greatest for p < 5 (boxed region).

Figure 8. Uniform loss plotted against non-uniform loss for the four derived sampling protocols at three different encoder rates, for all the spectra in our validation set (n_V = 1417). When the majority of the points fall below the y = x line (shown in red), the non-uniform approach outperforms the uniform method (e.g., SVD, p = 12).

Correlation between Loss, Signal Complexity, and Rarity
In addition to comparing methods, we had two core hypotheses: that loss would be correlated with signal complexity, and that loss would be correlated with signal rarity. To test these hypotheses, we computed the signal complexity by taking the standard deviation of the signal (as a proxy) and the signal rarity by computing the mean similarity (according to Equation (16)) of the 10 most similar signals in the training set. Since SpecRA enforces sparsity, the number of spectra used in the reconstruction generally does not exceed 10 (or else it would suffer from overfitting). Towards this end, we compute these metrics and present the results in Figure 9. Interestingly, there is little to no correlation between the complexity of the signal and the reconstruction loss via SpecRA, while we see a notable trend in the loss when plotted against the rarity of the signal relative to the training set (especially for lower rates). The lack of correlation with signal complexity could be a result of how SpecRA switches between reconstruction methodologies, and may also explain why signal rarity does not exhibit a stronger correlation. If the spectra in the training library or predefined set, K, are similar to the measured spectra, SpecRA will be effective at finding either a direct match or a sparse approximation, even if the rate is small. At the same time, we see detrending as the rate increases, indicating that rarity is less important when more information is available. The results of our preliminary analysis are promising and generally consistent with our hypotheses. One interesting question that remains unanswered here is the degree to which a signal can differ from the reference set, K, and still be effectively reconstructed from a sparse combination of prior observations.
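The two proxies used in this analysis can be computed as below. The standard-deviation proxy for complexity follows the text, while the similarity measure standing in for Equation (16) is, again, an assumption (cosine similarity) on our part.

import numpy as np

def complexity(spectrum):
    # Proxy for signal complexity used in the text: the standard deviation.
    return float(np.std(spectrum))

def rarity_score(spectrum, training_set, k=10):
    # Mean similarity to the k most similar training spectra (cosine assumed);
    # a high score indicates a common signal, a low score a rare one.
    s = spectrum / np.linalg.norm(spectrum)
    T = training_set / np.linalg.norm(training_set, axis=1, keepdims=True)
    return float(np.mean(np.sort(T @ s)[-k:]))

rng = np.random.default_rng(2)
train = np.abs(rng.standard_normal((300, 401)))     # toy training library (n_T = 300)
test = np.abs(rng.standard_normal(401))
print("complexity:", complexity(test), "rarity score:", rarity_score(test, train))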

Discussion and Future Integration
Formally, reconstruction, in the context of signal processing, is any process that recovers a signal from a set of sampled points. Whether this is performed via a codebook or via regression, the end goal is the same: recover information that is not readily available (i.e., via interpolation). Data-driven reconstruction methods apply some inductive bias; in the case of SpecRA, we exploit lex parsimoniae by toggling between pattern matching, l1-minimization via a primal-dual algorithm, and linear interpolation. As more information becomes available via online repositories, tailored bases will likely outperform generic ones such as that of Fourier. As is the case with all data-driven methods, success is still closely tied to the availability and tractability of the underlying data (which can be a challenge for hyperspectral imaging).
In this paper, we simulated response functions for uniform and non-uniform sampling methods based on exploiting regularities in the frequency domain. Applying real-world constraints to idealized mathematical models is a challenge in any discipline. In the case of visible spectral data, there is a clear effect of information saturation, implying that finer resolution is perhaps not needed to capture sufficiently unique information about the spectrum. On the other hand, the relative rarity of a signal does play a role in constraining the possible loss. Towards this end, applying online learning algorithms to sift through repositories in order to construct interpretable low-rank reference libraries, K, is paramount.
Furthermore, systematically selecting filter locations via a combined process of data-driven analysis and matrix factorization can dramatically improve results, especially for low-rate encoding (i.e., p < 4). A clear benefit of our method is its independence from optimization parameters fitted to a particular set of priors, while prior information in the form of a measured signal does improve results. Finally, our approach remains untested on real-world sensing hardware. It remains unclear what effect increasing channel errors will have on the reconstruction process; this will be investigated in future work. In applying this work to other domains, it is important to note that while many of these methods are generalizable, there will always be specific design constraints that need to be considered for any new class of sensors. In conclusion, spectral sensing spans many disciplines, and its relevance in autonomous systems is becoming ever more apparent. The systems and methods outlined in this paper provide a working template for research into the design and implementation of compact spectrometers and rendering software for a diversity of applications. While conventional spectrometry is seen as costly and data-intensive, taking advantage of domain knowledge and sparse optimization can offer a valuable alternative to existing methods. We hope that this work provides a foundation for both theoretical and practical future developments.