Custom Scanning Hyperspectral Imaging System for Biomedical Applications: Modeling, Benchmarking, and Specifications

Prototyping hyperspectral imaging devices in current biomedical optics research requires taking into consideration various issues regarding optics, imaging, and instrumentation. In summary, an ideal imaging system should only be limited by exposure time, but there will be technological limitations (e.g., actuator delay and backlash, network delays, or embedded CPU speed) that should be considered, modeled, and optimized. This can be achieved by constructing a multiparametric model for the imaging system in question. The article describes a rotating-mirror scanning hyperspectral imaging device, its multiparametric model, as well as design and calibration protocols used to achieve its optimal performance. The main objective of the manuscript is to describe the device and review this imaging modality, while showcasing technical caveats, models and benchmarks, in an attempt to simplify and standardize specifications, as well as to incentivize prototyping similar future designs.


Introduction
In the past decade, the advent of embedded computing has produced a new array of imaging systems that, while based upon mature technologies, have and will increase in complexity. Multi-and hyperspectral imaging (MSI/HSI) is an imaging modality capable of obtaining spatially resolved spectral information of a subject of interest, and one of many areas of research that can also exploit these new developments. This technique has been thoroughly employed in remote sensing [1], crop analysis and agricultural and soil science [2][3][4], as well as food quality control [5][6][7]. In the field of Biomedical Optics, MSI/HSI systems are used in various diffuse optical imaging technologies, such as Spatial Frequency Domain Imaging (SFDI) [8], Single Shot Optical Properties (SSOP) [9], and corrected fluorescence imaging (qF-SSOP) [10], among others [11]. While handheld devices are becoming readily available on the market [12], for more specific applications such as SFDI, SSOP or qF-SSOP, having control over the optics and mechatronics of the system (e.g., automatic gain control, exposure control, communications with additional instrumentation, non-proprietary API control, etc.) is an unavoidable requirement. In these fields of research, then, it is common practice to design highly controllable, fully customizable, open-architecture systems with high spectral and spatial resolution. Table 1. Frequent far-field reflectance hyperspectral imaging systems for biomedical applications, with its properties and applications (summary from [13,23]). Most devices trade off spatial, spectral resolution, and speed, depending on the configuration.

Imaging Modality Spatial Channels/Resolution Spectral Channels/Resolution
Snapshot imaging Given by sensor resolution, ∼100 × 100 px/spatial resolution dependent on optics Given by optics, 50-100 ch./∼10 nm The main objective of this article, therefore, is to propose a set of metrics that represent scanning devices such that the key variables that define image quality become adequately established, since overall system performance is not only dependent on the properties of its constituent parts, but also on how efficiently these elements are combined together. A second objective is to describe a custom-built HSI prototype, which presents a structure that may constitute a novel approach, and to use these defined benchmarks to optimize its properties and evaluate where it could be improved. The proposed work pursues the optimization of these efficiency metrics, which could also be extrapolated to other HSI imaging modalities, comparing each other in terms of systems efficiency, evaluating imaging artifacts and acquisition delays, or using them with already existing custom multi-and hyperspectral scientific prototypes for debugging and design optimization.
The document is structured as follows. Section 2 describes a model proposition for a pushbroom/scanning system, as well as a general overview of a custom-built prototype and, finally, the proposed set of performance benchmarks for scanning devices. In Section 3, these benchmarks are tested on the prototype with a series of simple protocols. Finally, some optimization considerations and the complete characterization of the pushbroom scanning system are discussed in the last pages of the article. Figure 1 shows a brief summary of the components that comprise a far-field scanning imaging device. Consider the following list as a generic description of the main components of any scanning system. Their names and properties will be the starting point for the evaluation of the custom device:

Fundamental Parts of a Scanning Imaging System
1.
Optical subsystem. An optical component will take a line in the field of view (FOV) and break it up into spatial and spectral information, which will be measured by a camera. There will likely be focusing optics, a spectrographic element and their corresponding couplings and interfaces. From a systems perspective, optics will constrain our wavelength range of operation, and the spectrograph and lens will limit the overall spectral resolution, i.e., the minimum spectral distance between two monochromatic signals that can be spatially separated.

2.
Imaging device. Light spatially separated by wavelength will reach a camera, which will capture incoming photons across its active pixel sensor. Different wavelengths will, then, be differentiated by their arrival onto different pixels. Consequently, several imaging parameters will depend on sensor characteristics, namely quantum efficiency (and wavelength range of operation), exposure time, camera gain, pixel depth and spectral resolution. While the first parameters will depend on the specific characteristics of the camera, the latter will be also related to the spectrograph's ability to separate wavelengths. In the case of line-scanning hyperspectral devices, the camera will also specify the spatial resolution for one of the two spatial axes. 3.
Actuator subsystem. The actuator subsystem will usually constitute the sole moving part of the imaging device. In pushbroom systems, for example, either the imaging device or the platform where the sample is placed is moved by a large actuator. In the case of rotating mirror scanners, an actuator/motor will drive a mirror fixed to an otherwise free-rotating axle. Depending on the driving mechanism, (e.g., belts and/or gear mechanisms) there will be mechanical imperfections-such as backlash-as well as delays due to serial communications between the main computer and the actuator controller/driver. Consequently, the main variables at play will be the repeatability of the drive system (i.e., its ability to obtain the same image pixel-wise when repeating the same motion) and time delays due to serial communications. If using a stepper motor, its properties (e.g., steps per revolution, stepper resolution, and degree of microstepping) will define the actuator system and its ability to produce high-resolution images.

4.
Communications and storage. Image acquisition and actuation control must be governed by a main computer, which must communicate with both camera and actuator subsystems, as well as manage data storage and image calibration. Storage must also be sufficiently efficient so that storing acquired spectra does not impose a significant hindrance on the overall system. We will assume that the imaging system must be able to communicate with other instrumentation and computers within a laboratory network. Camera-computer communications will impose a restriction on the total time per measurement, and therefore communication delays as well as driver and application program interface (API) delays will play a significant role in our efficiency models.  Figure 1. A generalized description of an external rotating mirror scanning imaging system. (A) A single focused line within the field of view can be captured at a time; (B) each line will consist of a two-dimensional sensor measurement, where the first axis will represent spatial dimension x, and the second one specifies wavelength, λ. (C) A complete hyperspectral image is obtained by moving the rotating mirror and storing more images for each of the mirror's positions (y).

Developed Rotating Mirror Scanning HSI Imaging System
The developed device is a custom-built rotating-mirror scanning HSI device with an embedded computer for storage and control, summarized in Figure 2 and Tables 2-5. As a rotating mirror system, it has the same properties of a pushbroom system, while at the same time minimizing the number of moving parts, and thus actuator delays. It consists of an ImSpector V10E spectrograph (Specim, Spectral Imaging Ltd., Finland) paired with a CS-mount, 10× 5 mm-50 mm f/1.3 varifocal objective lens with manual iris and zoom, originallly prepared for 1/3" cameras, (T10Z0513CS-3, Computar Optics Group, Cary, NC, USA) and a Mako G-223B NIR CMOS camera (Allied Vision Technologies GmbH., Stadtroda, Germany). Coupling the CS-mount objective lens onto the ImSpector and a 2.2 megapixel (2048 × 1088) C-mount camera provides a shorter minimum working distance and angle of view, at the expense of losing a few spectral pixels and half of the spatial pixels of the camera, resulting in a 1200 × 1200 px FOV. This ensemble allows the reading of a single hyperspectral line, which can then be swept across the sample with a rotating metal mirror system powered with a NEMA 17 stepper motor and a GT2 pulley array, achieving 19,200 steps per revolution with minimal backlash and CNC-grade (Computer Numerical Control) repeatability.
Communications are handled by a generic Gigabit Ethernet router, while DHCP services and camera control are handled by an Odroid XU4 (Hardkernel Co., Ltd., GyeongGi, South Korea) running Ubuntu 16.04. Dedicated Python 3 software was prepared to handle pymba, a Python 3 wrapper for the Vimba 2.1.3 SDK. Camera control within the laboratory can be achieved through a simplified Python webserver running Flask. Gain control is in the 0-25 dB range, and camera exposure can be set anywhere in the 10 ms-2 s time range. Up to 1.4 MP (1200 × 1200 px) image resolution can be achieved if necessary, although smaller images can be stored via software. The number of sensor bins in the spectral axis is 1088 channels in total, but the total number of spectral channels will be given by the spectrograph. Camera aperture, focus and zoom can be controlled manually.

Efficiency Benchmarks and Calibration Procedures
We shall obtain a set of expressions that can serve as adequate overall system performance metrics, as well as calibration and correction protocols for spatial and spectral information. While the former will provide insight into this particular imaging modality, the latter are standard procedure in hyperspectral imaging.

Lines per Second (LPS)
Analogous to frames per second (FPS) in conventional systems, it is the inverse of the total time required to measure a single line, namely the exposure time plus additional time delays: In this equation, T line represents the total time needed for acquiring a single line, and it is constituted by a combination of various different time delays and intervals that take place during acquisition: • Exposure time, denoted by T exp , is the time that a sensor will spend capturing photons from a single line of the FOV (shown in Figure 1). Ideally, image acquisition should only consist of integer multiples of a preset exposure time, and image storage and communications should be immediate. It will be shown that this is not the case. • Transfer/receiving time T rx is the relationship between file size and network speed within the device. Transfer time over an Ethernet network will be practically constant and, thus, for a fixed image size, transfer time between the camera and the main computer will be fixed. As a general rule, the total transfer time will be given by where h, w are the native height and width (in pixels) of the camera, b is the bit depth (either 8 or 12 bits), R b is the bitrate (in bits/s), and Ω is the overhead of the protocol. In our practical case, where Vimba uses UDP (with an overhead of 1.9%, Ω = 1.019) on a 2048 × 1088 pixel camera with 8 (or 12) bits per pixel connected to a Gigabit Ethernet router, we obtain 18.16 ms and 27.24 ms of transmission time for 8-and 12-bit images, respectively. This will introduce a limitation of about 55.06 and 27.04 frames per second, respectively, since the camera will be the only device sending packets to the main computer, and switching time will be considered negligible.

•
Camera Application Program Interface (API) delays, T api . Any imaging sensor will be controlled by the computer via an API or driver library. Driver and API delays should be constant delays that an imaging device presents that can be shown to be due to runtime execution of precompiled API dynamic libraries.

•
Memory buffer copying and transfer times T mcpy . They represent buffer copying delays due to internal bus limitations within the main computer, i.e., between its network interface and memory.

•
Actuator/mirror movement delays T act . These comprise delays in serial communications between the main computer and the actuator subsystem, and due to stepper motor movement, which will be negligible for high-resolution images, but noticeable in low-resolution acquisition. • There will be additional, random delays, mostly due to Operating System (OS) interruptions. We will denote them as T rand . Unless we work with a real-time operating system, there shall be random interruptions during acquisition, due to the OS scheduler setting the measurement API to background due to priority issues. This phenomenon has a stochastic nature and cannot be controlled unless a real-time operating system is used.
For our model, T extra = T rx + T api + T mcpy + T act represents all the additional delays in the system that are not random. This value will be estimated as a constant during nonlinear least squares fitting in the following sections.

System Efficiency
The efficiency of a scanning imaging device can be described as the ratio between the specified exposure time per sensor capture and the actual time per line: and this measurement will always be ≤ 1, since

Object Plane Curvature
Using a rotating mirror for scaning imposes a curvature distortion on the FOV of the camera, such that the region of space that remains focused is cilindrical, not planar. This nonlinear transformation can be inferred theoretically by considering Figure 3. Angle θ is the orientation angle of the mirror with respect to the normal of the object plane, and γ represents the angle between the object plane normal and the incoming light ray. By the definition of the tangent function, we can infer that the line position along the object plane, y d , is given by: and the difference between the normal distance to the object plane and the curved object plane, ∆r = r − d, with r = AP and d = AP , is equal to Additionally, the width of objects in a line (x, θ) will be linearly distorted by a scaling factor, which is also a nonlinear function of mirror angle θ: Therefore, every pixel in the image (x, y) will correspond to a spatial position (x d , y d ), where where ∆x is the spatial resolution of the measured line in millimeters (which will be generally a function of lens optical angle of acceptance and r), y π/4 corresponds to the pixel index where the ray path is fully normal to the object plane (also, where θ = π/4), and the angular resolution of the stepper motor ∆θ, in our case, will be the total number of steps per revolution times a stepping factor s: A representation of the complete nonlinear transformation (without defocusing) is illustrated in Figure 4. As we diverge from the linear range of operation, the captured image becomes deformed and downscaled, since we are receiving rays from points that are further away from our object plane normal.

Rotating mirror
Inverse ray path Object plane Linear range Figure 3. Rotating mirror geometry and linear range of operation. An image is acquired by varying rotating mirror angle θ. The optical path is increased by ∆r(θ), which produces distortion and defocusing. These artifacts can be considered negligible for an angle interval Θ (linear range).
For this approach, the tangent function can be approximated for small values of θ as a linear transformation between mirror angle θ and position y d in the field of view within focus. In particular, for values γ ∈ Γ = [−π/8, +π/8] (θ ∈ Θ = [π/4 − π/16, π/4 + π/16]), the approximation can hold for ∆r/r ≤ 8% (smaller errors can be obtained by reducing this range further). If lens aperture is kept within a range such that optical depth of field remains over ∆r, then a clear image can be captured. Any system should be verified in a real context (at least for a particular value of r), so that its linear range can be specified.  Figure 4. Image distortion in a scanning imaging system with a rotating mirror for a system height of r = 225 mm. Left plot: object plane, normal to the imaging system (axes in mm). Blue dots represent the positions of a subset of captured pixels that will form a uniform pixel grid in the measured image (right plot). Aspect ratio is not preserved so that distortions can be better highlighted.

Microstepping Accuracy
Once the main distortions due to using a rotating mirror have been considered, there will be additional accuracy noise caused by the remaining mechanical factors, namely backlash and microstepping. We will assume that any remaining errors will be coming from random sources and are Gaussian-distributed, i.e., n ∼ N (0, σ 2 ), hence simplifying the remaining uncertainties of the model. The standard deviation of these random events will be estimated by histogram approximation of a Gaussian probability density function (PDF), and is expected to be negligible for high-resolution images and quite relevant in low-resolution acquisition, since motor inertia will correct current variations in the windings of the stepper [24].

Spatial Resolution
Spatial resolution is typically estimated using a resolution test chart. The one we choose to use is a 1951 USAF Resolution Test Chart (MIL-STD-150), printed with reference lines with known lengths. Such lines provide a relationship between pixel positions and actual distances in the image plane.

Spectral Calibration
Wavelength separation through a spectrograph is known to be non-linear in nature. Generally, since a given spectrograph has a fixed response over time, spectral characterization only needs to be performed once. There are two steps required in order to fully calibrate a hyperspectral system. First, the polynomic response of the spectrograph as a function of pixel position must be found. The characteristic polynomial of a spectrograph, λ = P(p), is a least-squares fit that attempts to relate a spectral wavelength λ to its pixel position p within the sensor: Having a material with well-known reflectance peaks, a polynomial of degree d can be obtained through least-squares regression with d or more reference peaks, λ 0 , . . . , λ d , which can be then identified at specific pixel locations, p 0 , . . . , p d [25].
After being able to relate pixels with wavelengths, we can then establish the spectral resolution of the device. This was achieved via a CAL-2000 Mercury-Argon light source (Ocean Optics Mikropack, Ostfildern, Germany) oriented towards the rotating mirror, which shall be kept at a 45 • angle. After background substraction, the spectral emission lines of both gasses should be visible. Since their spectral emission lines are well-known, finding the smallest discernible lines will provide a lower bound for spectral resolution.

Dark Image Substraction
The term hot pixels refers to the appearance of bright pixels at long exposure times due to sensor lattice damage; its correction can be achieved via dark image substraction, quite common procedure in astronomical imaging calibration [26]. Hot pixels in the sensor must be corrected by characterizing their behavior as a function of exposure time when they are not exposed to any light. This can be achieved by closing the aperture completely and obtaining several measurements. By obtaining the baseline pixel values for several exposure times, hot pixel background signals can be modeled in each pixel by where indices i, j indicate the pixel's position in the sensor, a ij and b ij are the coefficients that model the linear behavior of the pixel as a function of exposure time T exp , and h bg,ij will be the estimated hot pixel value that should be subtracted to each pixel for every new measurement. This relationship should be occasionally verified, in order to keep track of the number of damaged pixels in the sensor [27].

Light Source Stability and SNR
Without properly characterizing the main source of optical power, we may incur in inaccuracies in power output and, therefore, in reflectance measurements. Ideally, for biomedical applications it is preferable to use stable and power-controlled devices, such as monochromatic lasers and/or supercontinuum light sources. As a preliminary approach, it is sufficient to use a tungsten halogen lamp in the Vis-NIR range, due mainly to its thermal inertia, which provides light source stability in the millisecond-to-second scale. We shall procure two protocols for analyzing how the signal-to-noise ratio (SNR) of the camera is a function of exposure time and light source optical power.
We will use either one or two 1-kW tungsten halogen bulbs for all measurements, depending on the absorption of the sample. For the first experiment, we turn on the light source and take a low resolution snapshot of an illuminated white Spectralon calibration reference. This is repeated several times (20 times in our case), and each snapshot is timestamped and stored. The average Spectralon reflectance value for each timestamp will be calculated, and then plotted as a function of time. If thermal inertia works well under standard temperature conditions, the average value of this Spectralon reference should become constant over time.
The second experiment assumes that the received sensor counts are a function of backscattered optical power. Therefore, the electrical current transformed into a byte in each pixel will be consequence of the sum of two powers: where P rx is the received optical power on that pixel, P s is the optical power coming from the sample, and P n is Additive White Gaussian Noise (AWGN). Under this assumption, the average SNR as a function of wavelength and exposure time can be calculated via the following expected value ratio: SNR(T exp ; λ) = 10 log 10 P s P n = 10 log 10 E[P rx (T exp ; λ)] which for the given assumption is equivalent to calculatinĝ SNR(T exp ; λ) = 10 log 10 μ where here x 1 , . . . , x N (T exp ; λ) are several pixels of a reference material measured at a given exposure time and wavelength,μ is a maximum likelihood (ML) estimate of the average reflectance of such material, andσ 2 the ML estimate of the variance of the light source. These calculations should be performed on reference-calibrated reflectance data, so two Spectralon captures are required for each evaluated exposure time, and square-law losses due to light source directionality can be corrected.

Results and Discussion
Once all constraints and properties of our model are adequately defined, we must characterize the imaging system and verify that, indeed, it behaves as described. These results have been obtained from the device explained in Section 2.2, and attempt to serve as an example of how to measure whether or not each component is being exploited to its maximum potential. Here, we shall focus on the elements of the reviewed model that have not been studied by previous work: (1) timing issues, (2) spatial distortions, (3) microstepping noise, and (4) overall efficiency.

Timing and Delays
Measuring time spent in each of the modeled operations is an adequate first step in characterizing system efficiency. In Figure 5, the total time per line is partitioned between its essential tasks. For this experiment, each measured line is separated by a single motor step (i.e., the device is working at full resolution). This was tested at 8-bit and 12-bit depth, since bit depth will influence the total time spent transmitting data. As expected, with no other processes running in the system, transfer and buffer copying only take a small fraction of the total time, while a fixed time is spent communicating with the camera and the rest of the time is spent acquiring light.  By obtaining the inverse of this time per line, we can represent the rate of acquired lines per second of the scanning device. This calculation is shown in Figure 6, where the dots represent the inverse of the bars in Figure 5, and the continuous line is a nonlinear least squares fit to 1/(T exp + T extra ), being T extra the constant time parameter that we wish to estimate. Additionally, the ideal model, i.e., 1/T exp , is shown as a dotted black line. The different graphs represent the average lines per second when parts of the system are disabled: the red plot represents normal acquisition, while in the case of the blue line the rotating mirror remains static (i.e., no serial commands are sent to the rotating mirror actuator) and, for the green line, only acquisition commands are being executed, and acquired images are discarded (not stored into memory). Driver and API delays are, in theory, constant values that can be shown to be due to runtime execution of precompiled dynamic libraries. Nevertheless, random variations in timing are clearly visible after adjusting for our model, and are related to OS schduler interruptions and multithreading management, since acquisition, actuator control, storage and RAM management are run by different processes. In order to obtain accurate, stable LPS measurements, we must use application-specific circuits, and/or real-time operating systems.  Figure 6. Acquisition speed (in lines per second) as a function of exposure time, disabling different parts of the imaging equipment. Measurements correspond to the colored scatter plots, while a nonlinear least-squares fit is superimposed as a continuous line plot for each scenario. As expected, disabling the scanning system and ignoring buffer copying allows us to reach the theoretical limit of the camera (40-50 FPS).
Additionally, it must be noted how, regardless of exposure time, all measurements are limited by the maximum FPS rate that the camera and/or network protocol can manage (49.5 FPS/LPS for 8 bits is limited by camera performance, 27 FPS/LPS for 12 bits is limited by network speed). All measurements are bounded from above from the value given by Equation (2) due to network bandwidth, overheads in transmission, as well as transmission delays, router switching delays, and DLL runtime delays (which cannot be optimized since the Vimba API is already compiled and no source code is available). These issues produce an additional limit, as well as random acquisition speed variability. Some clinical approaches, like fluorescence quantification, will require 12-bit images, while other methods could benefit of much faster acquisition times rather than a higher pixel resolution; these issues should be considered in advance depending on the application.
A similar approach can be done for the other modalities shown in Table 1, considering their ideal efficiency or throughput (Table 6). For the spatial and spectral resolution considered in Table 1, a rotating mirror scanning system would score favorably with respect to other modalities, considering that rotating a small mirror will take a shorter amount of time and less power consumption than displacing either the sample or the imaging system (i.e., as indicated in the table, T act T move ). Table 6. Ideal throughput/efficiency summary with the proposed device (other entries extracted from [23]). We have shown that ideal efficiencies diverge from real ones by various timing issues that should be overcome. These systems, then, can benefit from the evaluation approaches in this article.

Imaging Modality Ideal Acquisition Time
Snapshot imaging T exp Staring systems

Image Distortions and Noise
Since focusing optics are independent from the actuator subsystem in many cases, the object plane will inevitably suffer nonlinearities. A rotating mirror with a stepper motor will incur three types of measurement errors. First, the object plane will be curved, as the focusing distance will be fixed for any mirror angle; this curvature effect must be limited to a linear range, and its effects must be quantified. Second, there will be backlash in the transmission system that connects the motor to the mirror axle; such backlash can damage repeatability, and must be minimized. Third, and finally, there will be microstepping noise due to spurious currents in both driver and stepper rotor; those must be minimized by using an adequate stepper driver and microstepping factor. Although we can model theoretical distortion due to mirror rotation and avoid exceeding its linear range of operation, overall distortion and noise can (and should) be characterized by imaging an Amsler grid with various lengths and forcing the rotating mirror to capture a half/quarter rotation at maximum resolution. Then, the first finite difference along the y axis can be obtained, which allows for edge detection in the grid. The variation in grid cell width can be approximated by the nonlinear scaling factor of Section 2.3.3, and therefore a nonlinear least-squares fit of Equation (5) can be obtained. This deterministic distortion allows for obtaining r, the total distance to our object, as well as isolating any remaining errors coming from other sources. The result of this calculation is left in Figure 7. To obtain a successful alignment, an Amsler grid with 6 × 6 mm squares was printed at maximum resolution, screwed to an optical table, where a SMS20 Heavy Duty Boom stand/crane (Diagnostic Instruments, Inc., Michigan, US) was situated holding the instrument above the grid. Then, horizontal (x-wise) alignment between grid and camera was achieved by constantly previewing the image and overlaying a series of horizontal lines in a custom-designed graphical user interface (GUI), moving and fixing the stand until centering and alignment of the center dot of the grid was achieved at θ = π/4 (or, alternatively, γ = 0). The relationship between position y and angle θ is linear within the range ±π/8 ≈ ±0.3926 and distortions in the x axis can be corrected by an inverse affine transformation.
The residual in the right subplot shows a cosine-like pattern that is likely due to misalignment errors that cannot be compensated by the GUI and manual adjustment. While this non-random variation cannot be corrected any further without more precise equipment, random variations in y d can still be accounted for by FFT clipping this signal down to its average noise spectral power density. The results of this experiment are shown in Figure 8, where the left plot shows the variations in grid size beyond misalignment, and the right subplot presents a histogram and a Gaussian fit for an estimated mean µ and variance σ 2 . This variability is therefore in the order of σ ≈ 0.2 mm at a distance of 348.73 mm and a FOV size of 120 × 120 mm, which can be considered negligible for far-field measurement.

Spectral Calibration
As specified in the Materials and Methods section, a Wavelength Calibration Standard plate with tabulated absorption peaks was imaged with a tungsten halogen lamp . The result of finding at least 8 tabulated peaks is displayed in Figure 9, where absorption peak distribution before and after calibration can be seen. Spectral resolution was empirically obtained using a calibration light source with known proximal emission peaks within the wavelength range and peak distances within the range tolerated by the spectrograph. The top subfigure in Figure 10 shows such measurement, with superimposed well-known argon and mercury peak wavelengths. Lines closer than 3 nm to each other were not discernible (e.g., 576.96 and 579.07 nm), which means that spectral resolution shall be bounded by 3 nm. This provides an upper bound on the number of spectral channels, namely 218 (i.e., 5 pixels or 3 nm per channel). Taking resolution into consideration, the spectrograph response can be calculated, resulting in λ(p) = 370.79 + 2.68p − 0.0001p 2 , with p being one of the 218 channels. This fairly linear response is expected.

Spatial Resolution
A spatially calibrated image of the aforementioned resolution test chart is displayed in Figure 11. Closeups for 2, 1, and 0.5 millimeter-wide bands are provided on the right side of the figure. Vertical resolution (in red) degrades much faster than horizontal resolution (in green), with horizontal lines (vertical resolution, in red) blurring out at about 0.5 mm, and vertical lines (discerned by horizontal resolution, in green) can be distinguished at up to 0.2-0.3 mm. If spatial resolution is defined as the maximum between horizontal and vertical resolution, then the spatial resolution of the imaging device is 0.5 mm with the camera placed at a 35 cm distance from the test chart. Given the actuator noise shown in Figure 8, we can conclude that vertical resolution is currently limited by microstepping noise. Vert. resol. V2 Horiz. resol. H2 Figure 11. 1951 USAF resolution test chart, average reflectance (left) and cross-section intensities at different band widths (right). Total resolution is limited by vertical (actuator) resolution, which could be improved in future versions by improving mechanical resolution and actuator repeatability. Best viewed in color.

Light Source Calibration and SNR
The outcomes of the experiments described in Section 2.3.8 are shown in Figure 12. The left subplot provides empirical support to explain how the SNR of reflectance signals behaves as a combination of exposure time, reference reflectance, and camera quantum efficiency. The right subplot shows the average received optical power during the first 20 seconds of operation following the activation of the halogen light source. From these results, we can conclude that (1) the response of the camera is linear with respect to exposure time, (2) that the SNR of reflectance data is limited by the spectral response of the optical imaging equipment, and (3) that our light source is stable, on average, within the timescale of operation.

Benchmark Results and Characterized Equipment
Once our image is as clear as possible, spectral information has been calibrated, and timing issues are well characterized, it is time to benchmark overall system performance. While LPS values show the maximum acquisition speed of the system, efficiency values (as described in Section 2) will show how much the maximum LPS rates are throttled by time not spent in acquiring lines. For Figure 13, the same LPS measurements are repeated and efficiency is plotted (scatter plots) and nonlinear least-squares fit to Equation (3) (smooth line plots), for the three scenarios explained in Section 3.1 and for 8-and 12-bit images. The behavior of this graph is expected, as efficiency is a monotonically increasing function with respect to exposure time, and rises quickly as less tasks take control of the main computer and spend time performing non-essential tasks.  Figure 13. Measured system efficiency and nonlinear least squares fit, as defined by Equation (3) for 8and 12-bit depth. As exposure time increases, the fraction of time spent in camera communications becomes negligible and efficiency increases. This is also increased with bit depth, as the data must traverse a bandwidth-limited network. As can be seen, mirror rotation and memory buffering produces a noticeable decrease in efficiency of about 10% for typical exposure times (100-300 ms).
Although efficiency can be seen as an indirect measurement of LPS rates, it can be considered an explicit benchmark in and of itself, since it only indicates the divergence between an ideal, theoretical measuring machine, and the device under test. It completely ignores every other specification of the system, and focuses on how much CPU time is dedicated to measuring only. Better cameras can be substituted in the system and its efficiency can be reevaluated, which facilitates device adaptation and updating to different acquisition environments. For example, higher efficiencies could be more practical in constantly moving environments (such as real-time, in-vivo diagnostics during a surgical procedure), while lower efficiencies can be sufficient for slower scenarios (i.e., ex-vivo imaging). Also, they could be used in industrial manufacturing as a way to quantitatively compare camera efficiency the various tradeoffs between resolution, gain, sensor quantum efficiency, network interface, and spectral range of operation, when selecting a sensor in a commercial device or process.
Another interesting conclusion is related to scanning hyperspectral devices as an imaging modality. Artifacts due to subject motion and/or metabolic changes, such as sample oxidation, changes in oxygenation, degradation, or decay will be more severe as the total time spent capturing is increased. While under these conditions other imaging modalities can prove more useful, a fair tradeoff between speed and resolution can be achieved in these systems, since the total time spent in an image with N lines lines measured will be T img = N lines T line / avg (T exp ), where avg (T exp ) is the average measurement efficiency for a given exposure time. Consequently, the higher the efficiency, the smaller the number of uncorrectable artifacts that could be found in an image with an adequate choice of resolution and exposure time.
Finally, by assuming that the model takes into account all timing and nonlinearity issues, its final characteristics can be summarized, as in Table 7. Understandably, some of its specifications (e.g., gain and exposure time ranges) are extracted from their parts' corresponding tables, while the rest are a consequence of the benchmarks described throughout the article. Resolved distances, nonlinearity corrections, delays and efficiencies may be illuminating, for example, when comparing tradeoffs and similarities with known equipment, commercial or otherwise; both device manufacturers and applied optics scientists can compare, select, replicate and improve their systems, based on how well different parts perform together.
As for now, we can report on the protyping of a high-resolution, highly efficient hyperspectral imaging system, capable of 1.44 Mpx spatial resolution, 200 wavelength channels over the Vis-NIR range, fully modifiable optics, variable gain, exposure time and acquisition mode for HSI, SFDI, and SSOP measurements (as well as any additional experimental methods) which can be controlled via a RESTful Web Service from any computer within the laboratory network. An HSI image of a hand obtained with this device (r = 350 mm, T exp = 200 ms, full resolution), as well as some random spectra within the image, can be seen in Figure 14. Color reconstruction is achieved via CIE 1931 Color Matching Functions (CMFs) applied on the acquired spectra, providing XYZ/RGB images with identical resolution. A near-infrared channel (Figure 14b), namely 803 nm, reveals existing vascular structures in the hand. Some random spectra are also displayed at the bottom plot, with 1% transparency for each spectrum, to stand out repeated features (i.e., skin and background spectra) from unlabeled spectra.

Conclusions
Prototyping hyperspectral scanning equipment is a multiparameter, interdisciplinary task that requires harmonious and efficient behavior amongst components at play. This manuscript has described and summarized the main obstacles faced when constructing a scanning HSI device and has defined a few ideal objectives (and benchmarks to measure them) which system developers can consider when building devices as described in Section 2.
As reviewed throughout the article, an ideal system will always spend the vast majority of its time performing acquisition, while minimizing all preprocessing, communications, storage and API-based delays as much as possible. In real case scenarios, full control of its specifications (gain, exposure time, focusing distance, aperture, zoom) within a local network is also desireable, especially with multi-purpose camera systems that unlock numerous potential biomedical imaging applications.
In order to determine how well a specific custom device behaves (and how much it can be improved), methods like measuring timing constraints and metrics such as system efficiency can provide information on how close to ideal, asymptotic behavior the system under test is. Optics-related parameters, such as nonlinear distortions and microstepping noise, have been characterized as well (at least for the case of rotating mirror scanning systems), and in most cases system redesign has been sufficient to keep them as minimized as possible. Some interesting results include the fact that API communications can be the most significant cause of timing innefficiencies in a scanning system, that spatial distortions can be almost completely eliminated by keeping the FOV of the camera within a linear range, and that vertical resolution is limited by microstepping noise.
Interesting future lines of research include, among others: (1) applying flatter, more stable light sources, such as supercontinuum laser sources, in order to achieve a flatter SNR with respect to wavelength; (2) using curved mirrors for depth-of-field and focus plane correction; (3) approaching other wavelength ranges, such as NIR-SWIR; (4) using Application-Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or real-time operating systems for minimizing all the aforementioned delays; (5) using various spectrograph slit sizes and combinations with other focusing optics, such as macro and/or telecentric lenses; (6) evaluating sensor lattice damage with more complex hot-pixel models, since these exhibit slight nonlinear behavior; (7) studying the exact influence of object motion artifacts in this imaging modality; and (8) using profilometry for source intensity correction, especially when using far-field sources, such as structured light projectors. All these possible improvements stem from the evaluation of the described equipment via the proposed methodology.
A more general conclusion that can be reached from these experiments is that a full characterization is necessary when it comes to understanding how the different components of an imaging system relate to each other, so that the root causes of imperfections and/or slowdowns in image acquisition can be adequately detected, and design can be improved in further versions, especially when building and testing devices with clinical and surgical applications, where timeliness and precision are fundamental.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: