Time-Domain Near-Infrared Spectroscopy and Imaging: A Review

This article reviews the past and current statuses of time-domain near-infrared spectroscopy (TD-NIRS) and imaging. Although time-domain technology is not yet widely employed due to its drawbacks of being cumbersome, bulky, and very expensive compared to commercial continuous wave (CW) and frequency-domain (FD) fNIRS systems, TD-NIRS has great advantages over CW and FD systems because time-resolved data measured by TD systems contain the richest information about optical properties inside measured objects. This article focuses on reviewing the theoretical background, advanced theories and methods, instruments, and studies on clinical applications for TD-NIRS including some clinical studies which used TD-NIRS systems. Major events in the development of TD-NIRS and imaging are identified and summarized in chronological tables and figures. Finally, prospects for TD-NIRS in the near future are briefly described.


Introduction
The time-domain (TD) technique in near-infrared spectroscopy (NIRS) and imaging technology has the greatest potential among the continuous wave (CW), frequency-domain (FD), and time-domain modalities owing to having the richest information contained in the measured TD data. Although the advantages of the TD technique are widely recognized, only two TD-NIRS systems were commercially available for brain and/or muscle NIRS oximeters, while more than ten CW-NIRS systems were commercially available by 2011, according Ferrari and Quaresima [1]. These limited usages of the TD-NIRS technique are due to its drawbacks, to which some articles have referred, as follows.
In 2014, Torricelli et al. [2] described the situation of the TD-NIRS technique as follows (with modification), "TD-NIRS and TD techniques in general had the reputations of being cumbersome, bulky, and very expensive when compared with commercially available continuous wave (CW) functional-NIRS systems. These disadvantages cannot be ignored, and a gap between CW and TD-NIRS technology still exists." In 2018, Papadimitriou et al. [3] also described the drawbacks of TD-NIRS systems more in details (with modification), "TD-NIRS instruments used so far are bulky and expensive, and typically employ sensitive optoelectronics which are susceptible to vibrations. When solid state lasers are used, switching from one wavelength to another is slow about 10 s. When pulsed laser diodes are used, long warm-up time (about 60 min) is required to achieve a stability of pulse timing in the picosecond range. These factors limit the usages of TD-NIRS not only in hospitals but also in laboratories."

Theoretical Background of TD-NIRS
The principles, concepts, and theoretical background of TD-NIRS are described in this section by showing the fundamental equation of light propagation, the radiative transfer equation (RTE), followed by its approximate equations with analytical solutions, numerical solving methods, and quantities featuring TD-NIRS. With an understanding of the theoretical background and the featuring quantities, it will be easier to reasonably interpret the results of calculations of light propagation in in vitro and in vivo experiments and the clinical applications of TD-NIRS. The theoretical background, particularly for TD-DOT, is comprehensively reviewed by Arridge [4].

Radiative Transfer Equation (RTE)
We start from the radiance, or the specific intensity, I(r,ŝ, t), defined as the average radiant power flowing at position r and at time t through the unit area oriented in the direction of the unit vectorŝ and through the unit solid angle alongŝ in a medium as shown in Figure 1. The most fundamental equation describing light propagation in biological tissue, which is accepted in this field, is the radiative transfer equation in time-domain (TD-RTE) (or the Boltzmann transport equation) for radiance [5], ∂ c∂t I(r,ŝ, t) +ŝ·∇I(r,ŝ, t) + [µ a (r) + µ s (r)]I(r,ŝ, t) = µ s (r) 4π p(ŝ,ŝ )I(r,ŝ , t)dΩ + q(r,ŝ, t) where c is the velocity of light in the medium, ∇ is the spatial gradient operator, • is the scalar product operator, µ a (r) and µ s (r) are the absorption and scattering coefficients, respectively, p(ŝ,ŝ ) is the scattering phase (angular) function describing the probability of scattering from directionŝ into directionŝ, dΩ is the solid angle for integration, and q(r,ŝ, t) is the light source. Here we assume that the radiance is for a specific wavelength and that the velocity of light is constant throughout the medium. The RTE is an energy conservation equation, and each term has physical meanings; the total temporal change in the radiance, the energy inflow due to the gradient of the radiance (or the diffusion of the radiance), the energy gain by absorption and scattering, the energy inflow to directionŝ by scattering from directionŝ over the entire solid angle, and the energy gain by light sources. Here we note: if the radiance, I(r,ŝ, t), having a dimension of W/(m 2 sr) is divided by the velocity of light, c, the velocity of light, c, it has a dimension of J/(m 3 sr) and is often called as the photon energy density, ˆ( , , ) ( , , ) / u s t I s t c = r r . Equation (1) is sometimes expressed as ( , , ) u s t r instead of ( , , ) I s t r . The RTE is an integro-differential equation and it is not easy to calculate even with the use of modern computers due to the integral term on the right-hand side. Many studies have been carried out on how to solve the RTE by numerical and analytical methods, and by equivalently statistical methods. Analytical and statistical methods are briefly reviewed in the following, while numerical methods are briefly reviewed in a later section.

Expansion of the RTE by Spherical Harmonics and the PN Approximations
One way to simplify the RTE is to expand (2) where qlm(r, t) and pl are the expansion coefficients which are known. For a particular combination of (l, m) = (L, M), Equation (2) is transformed to an infinite number of coupled partial-differential equations for iLM(r, t) with L ranging from 0 to ∞ and M ranging from −L to +L [7] (pp. 282-288). By retaining the equations for L from 0 to N with M = −L, −L +1, …, L − 1, L, the number of the retained coupled partial-differential equations is [summation of (2L + 1) from L = 0 to N] = (N + 1) 2 , the same as the number of unknown functions, iLM(r, t). The system of these (N + 1) 2 equations for (N + 1) 2 unknowns is the PN approximation. For example, for the P1 approximation, there are four unknowns of iLM(r, t), and for the P3 approximation, there are 16 unknowns of iLM(r, t). Even-order PN approximations are not useful and only odd-order PN approximations are considered.

The P1 Approximation
In the P1 approximation, four unknowns, i00(r, t), i1−1(r, t), i10(r, t), and i11(r, t) are related to the fluence rate, ϕ(r, t), and to the flux vector of the fluence rate, F(r, t), expressed as, and the P1 approximation is expressed by two coupled equations for ϕ(r, t) and F(r, t), including the reduced scattering coefficient, μs′(r) = [1 − g]μs(r), and the anisotropy parameter, g(r), defined as the average cosine of the phase function as, The RTE is an integro-differential equation and it is not easy to calculate even with the use of modern computers due to the integral term on the right-hand side. Many studies have been carried out on how to solve the RTE by numerical and analytical methods, and by equivalently statistical methods. Analytical and statistical methods are briefly reviewed in the following, while numerical methods are briefly reviewed in a later section.

Expansion of the RTE by Spherical Harmonics and the P N Approximations
One way to simplify the RTE is to expand I(r,ŝ, t), q(r,ŝ, t), and p(ŝ,ŝ ) into a series of spherical harmonics, Y l m (ŝ) (l = 0, 1, 2, . . . , m = −l, −l + 1, . . . , l − 1, l), to separate the angular dependences of I(r,ŝ, t) and q(r,ŝ, t) onŝ from the dependences on r and t [5,6] [7] (pp. 282-288). After some mathematical manipulations, the RTE is rewritten in terms of a series of spherical harmonics for the expansion coefficients i lm (r, t) of I(r,ŝ, t), where q lm (r, t) and p l are the expansion coefficients which are known. For a particular combination of (l, m) = (L, M), Equation (2) is transformed to an infinite number of coupled partial-differential equations for i LM (r, t) with L ranging from 0 to ∞ and M ranging from −L to +L [7] (pp. 282-288). By retaining the equations for L from 0 to N with M = −L, −L + 1, . . . , L − 1, L, the number of the retained coupled partial-differential equations is [summation of (2L + 1) from L = 0 to N] = (N + 1) 2 , the same as the number of unknown functions, i LM (r, t). The system of these (N + 1) 2 equations for (N + 1) 2 unknowns is the P N approximation. For example, for the P 1 approximation, there are four unknowns of i LM (r, t), and for the P 3 approximation, there are 16 unknowns of i LM (r, t). Even-order P N approximations are not useful and only odd-order P N approximations are considered.

Diffusion Approximation and Diffusion Equation (DE)
The TE of the P 1 approximation is further simplified to the time-domain diffusion equation (TD-DE) by adding conditions of (i) strong scattering meaning µ s » µ a or Dµ a « 1, (ii) slow temporal changes in the fluence rate and the light source leading to, and (iii) the source emission is isotropic meaning q 1 (r, t) = 0. Then, Equation (5) reduces to the TD-DE, Equation (7) is a parabolic type of partial differential equation describing diffusion phenomena, and the net flux of the fluence rate is given by Fick's law for diffusion phenomena, F(r, t) = −D(r)∇φ(r, t).
The regime map of Figure 2 shows the valid region of the DE where the demarcating curve is drawn as t = 10/(µ s c) ≈ 10(3D/c) under the condition of µ s » µ a . Here, 3D/c is the characteristic time of interaction. For a case of µ s = 1.0 mm −1 typical for biological tissue, the DE fails for light propagation within times shorter than 0.05 ns (50 ps), and the RTE is required in this period of times.   (4) where θ is the angle between the directions ŝ and ŝ′, and the phase function ˆ( , ') p s s is assumed symmetric to the azimuthal angle, ϕ, and dependent on the polar angle, θ, only. Here, the phase function is normalized as (1/4π)∫4πp(ŝ, ŝ′)dΩ = 1.
Equation (5) is the equation for ϕ(r, t) in the P1 approximation having a form of the telegraph equation (TE) which is an elliptic type of partial differential equation including the second derivatives with respect to both time and space indicating a phenomenon of wave propagation in the medium.

Diffusion Approximation and Diffusion Equation (DE)
The TE of the P1 approximation is further simplified to the time-domain diffusion equation (TD-DE) by adding conditions of (i) strong scattering meaning μs′ » μa or Dμa « 1, (ii) slow temporal changes in the fluence rate and the light source leading to, and (iii) the source emission is isotropic meaning q1(r, t) = 0. Then, Equation (5) reduces to the TD-DE, Equation (7) is a parabolic type of partial differential equation describing diffusion phenomena, and the net flux of the fluence rate is given by Fick's law for diffusion phenomena, The regime map of Figure 2 shows the valid region of the DE where the demarcating curve is drawn as t = 10/(μs′c) ≈ 10(3D/c) under the condition of μs′ » μa. Here, 3D/c is the characteristic time of interaction. For a case of μs′ = 1.0 mm −1 typical for biological tissue, the DE fails for light propagation within times shorter than 0.05 ns (50 ps), and the RTE is required in this period of times.  The net fluxes of the fluence rate, F(r, t), can be measured by time-resolved (TR) detectors and the measured fluxes such as TR reflectance and TR transmittance are often expressed as the time-of-flight distribution (TOF-distribution) in this article.

Diffusion Coefficient Independent of the Absorption Coefficient in TD-DE
The diffusion coefficient is given as D(r) = 1/[3(µ s (r) + µ a (r))] in the process of deriving the DE in the framework of the P 1 approximation stated above. However, there was a long controversy about the expression of the diffusion coefficient and whether it depends on µ a or not. Furutsu and Yamada [8] first discussed that in time-domain, D(r) is independent of µ a (r), i.e., D(r) = 1/[3µ s (r)], while in the CW-domain D(r) may depend on µ a (r). For optically homogeneous media, the radiance in the RTE for an impulse source can be written as, where I 0 (r,ŝ, t) is the radiance for non-absorbing medium without absorption, Equation (9), This is easily understood by substituting Equation (8) into Equation (1) neglecting the impulse source term, and Equation (8) is sometimes referred to as expressing the microscopic Beer-Lambert law [9]. The process of the P 1 approximation described above can be applied to Equation (9), then the diffusion coefficient is clearly given as D = 1/(3µ s ) independent of µ a . For inhomogeneous media, the derivation of D(r) = 1/[3µ s (r)] is not straightforward like for homogenous media, but Furutsu and Yamada [8] proved it mathematically.
Then, the controversy about the diffusion coefficient started regarding whether it should be included µ a or not [10][11][12]. But experimental and numerical studies supported D(r) = 1/[3µ s (r)] [13][14][15], and mathematical studies using processes different from the P 1 approximation derived D(r) independent of µ a for the TD-DE and D(r) dependent on µ a for the CW-DE [16,17]. Finally, the following expressions are likely to be accepted in this field, The absorption coefficient, µ a , of biological tissue is much smaller than µ s in the NIR range (typically µ a~0 .01 mm −1 and µ s ~1.0 mm −1 ), and the change in the magnitude of D upon including µ a is very small. Therefore, D(r) = 1/[3µ s (r)] is a good choice for all cases.

Boundary Condition for DE
The DE requires initial and boundary conditions to be solved. A boundary condition at the interface between two different media with mismatched refractive indexes is expressed by the following, where r b is a position on the interface, n indicates the direction outward normal to the interface, and A is a reflection parameter given by A = (1 + R in )/(1 − R in ), with R in denoting the internal diffusive reflectivity estimated by the Fresnel reflection coefficient or other empirical models. In the process of obtaining analytical solutions of the DE, extrapolated boundary conditions are often employed with the mirror image method to satisfy Equation (12). For simplicity, the zero-boundary condition where the fluence rate at the interface is given as zero is sometimes used, and this is the case for A = 0 in Equation (12). Analytical solutions for TD-DE for a point and impulse source are very useful because they  are Green's functions, and they have been obtained for simple geometries such as infinite media, semi-infinite media, slabs, cylinders, and spheres with homogeneous optical properties. The most fundamental analytical solution for TD-DE is for infinite homogeneous media given by Equation (13),

Analytical Solutions for TD-DE
where r is the distance from a point impulse source located at the origin. Patterson et al. [18] first derived analytical solutions for TD-DE for semi-infinite and slab media by use of the mirror image source method under a zero-boundary condition. In the extrapolated boundary condition, the fluence rate is zero at an extrapolated surface with a distance of 2AD from the physical boundary, as shown in Figure 3, and the fluence rate for semi-infinite homogeneous media is given by Equation (14), where r 1 = [(z − z 0 ) 2 + ρ 2 ] 1/2 and r 2 = [(z + z 0 + 2z b ) 2 + ρ 2 ] 1/2 are the distances from the position of interest, P(ρ, z), to the positive and negative point impulse sources, respectively, The reflectance at the surface, P(ρ, 0) with the source-detector (SD) distance of ρ, is given by Fick's law as Equation (15), where r 10 = (z 0 2 + ρ 2 ) 1/2 and r 20 = [(z 0 2 + 2z b ) 2 + ρ 2 ] 1/2 . Equation (15) is the Green's function for semi-infinite homogeneous media and is often used in the derivation of analytical solutions using the perturbation theory.

Analytical Solutions for TD-DE
Analytical solutions for TD-DE for a point and impulse source are very useful because they are Green's functions, and they have been obtained for simple geometries such as infinite media, semiinfinite media, slabs, cylinders, and spheres with homogeneous optical properties. The most fundamental analytical solution for TD-DE is for infinite homogeneous media given by Equation (13), where r is the distance from a point impulse source located at the origin. Patterson et al. [18] first derived analytical solutions for TD-DE for semi-infinite and slab media by use of the mirror image source method under a zero-boundary condition. In the extrapolated boundary condition, the fluence rate is zero at an extrapolated surface with a distance of 2AD from the physical boundary, as shown in Figure 3, and the fluence rate for semi-infinite homogeneous media is given by Equation (14), where r1 = [(z −z0) 2 + ρ 2 ] 1/2 and r2 = [(z + z0 + 2zb) 2 + ρ 2 ] 1/2 are the distances from the position of interest, P(ρ, z), to the positive and negative point impulse sources, respectively, The reflectance at the surface, P(ρ, 0) with the source-detector (SD) distance of ρ, is given by Fick's law as Equation (15), where r10 = (z0 2 + ρ 2 ) 1/2 and r20 = [(z0 2 + 2zb) 2 + ρ 2 ] 1/2 . Equation (15) is the Green's function for semi-infinite homogeneous media and is often used in the derivation of analytical solutions using the perturbation theory. Analytical solutions, or the Green's functions, for TD-DE for other simple geometries are comprehensively summarized in Reference [19].
When the medium scatters light non-isotropically but anisotropically, the diffusion coefficient is dependent on the direction and is expressed by a diffusion tensor. The analytical solution for TD-DE for the anisotropic diffusion in a slab medium is given by Kienle et al. [20] using a diffusion tensor for the diffusion coefficient.
Semi-infinite medium r1 r2 Figure 3. Concept of the extrapolated boundary and mirror image source for a semi-infinite medium.
Analytical solutions, or the Green's functions, for TD-DE for other simple geometries are comprehensively summarized in Reference [19].
When the medium scatters light non-isotropically but anisotropically, the diffusion coefficient is dependent on the direction and is expressed by a diffusion tensor. The analytical solution for TD-DE for the anisotropic diffusion in a slab medium is given by Kienle et al. [20] using a diffusion tensor for the diffusion coefficient.

Monte Carlo Simulations
Instead of directly solving the RTE or the DE with analytical and numerical methods, stochastic methods have been used to obtain statistically equivalent solutions of the RTE or the DE. One of them is random walk theory, which describes light propagation as a sequence of steps on a regular cubic lattice [21]. But this method oversimplifies the phenomenon, and the most common stochastic methods are Monte Carlo (MC) simulations. Wilson and Adam [22] first introduced an MC simulation into this field, and Wang et al. [23] provided free software for an MC simulation to calculate light propagation in multi-layered tissue with their published paper explaining the details. Now free software codes for MC simulations are available from the website of the Oregon Laser Center in Portland [24], and these codes are used by many researchers in this field.
The concept of MC simulations is to track the trajectories of energy packets, which are often called simply photons for brevity, while the trajectories of the photons are determined to statistically satisfy the optical properties of the medium, as shown in the following equations for Figure 4. The injected photon with an energy of E i is scattered with a scattering path length of L to the direction of the polar and azimuthal angles of θ and ϕ, respectively, and the energy is attenuated to E i+1 = E i [µ s /(µ s + µ a )]. L, θ, and ϕ are determined using uniformly distributed random numbers in the range of (0, 1), R 1 , R 2 , and R 3 by the following equations, where the phase function p(θ) is assumed to be independent of ϕ, f 2 (θ) is the accumulated phase function of p(θ), f 2 −1 means the inverse of f 2 (θ). When p(θ) is constant, meaning that g = 0 and the diffusion approximation is applicable, f 2 (θ) becomes as simple as f 2 (θ) = (1 − cosθ)/2, and resultantly θ is simply determined by θ = cos −1 (1 − R 2 ).

Monte Carlo Simulations
Instead of directly solving the RTE or the DE with analytical and numerical methods, stochastic methods have been used to obtain statistically equivalent solutions of the RTE or the DE. One of them is random walk theory, which describes light propagation as a sequence of steps on a regular cubic lattice [21]. But this method oversimplifies the phenomenon, and the most common stochastic methods are Monte Carlo (MC) simulations. Wilson and Adam [22] first introduced an MC simulation into this field, and Wang et al. [23] provided free software for an MC simulation to calculate light propagation in multi-layered tissue with their published paper explaining the details. Now free software codes for MC simulations are available from the website of the Oregon Laser Center in Portland [24], and these codes are used by many researchers in this field.
The concept of MC simulations is to track the trajectories of energy packets, which are often called simply photons for brevity, while the trajectories of the photons are determined to statistically satisfy the optical properties of the medium, as shown in the following equations for Figure 4. The injected photon with an energy of Ei is scattered with a scattering path length of L to the direction of the polar and azimuthal angles of θ and φ, respectively, and the energy is attenuated to L, θ, and φ are determined using uniformly distributed random numbers in the range of (0, 1), R1, R2, and R3 by the following equations, where the phase function p(θ) is assumed to be independent of φ, f2(θ) is the accumulated phase function of p(θ), f2 −1 means the inverse of f2(θ). When p(θ) is constant, meaning that g = 0 and the diffusion approximation is applicable, f2(θ) becomes as simple as f2(θ) = (1 − cosθ)/2, and resultantly θ is simply determined by θ = cos −1 (1 − R2). Monte Carlo simulations can equivalently provide solutions to the RTE even under the conditions for which the DE breaks down, and due to its flexibility to geometry, Boas et al. [25] succeeded in developing an MC code for predicting 3D and TD light propagation in the adult human head, which is a complex heterogeneous medium including a non-scattering and non-absorbing cerebro-spinal fluid (CSF) layer. However, for obtaining accurate solutions, long computation times are needed because the statistical noises are inversely proportional to the square root of the number of injected photons. Some techniques have been studied to overcome this drawback of MC simulations and will be reviewed in a later section. Monte Carlo simulations can equivalently provide solutions to the RTE even under the conditions for which the DE breaks down, and due to its flexibility to geometry, Boas et al. [25] succeeded in developing an MC code for predicting 3D and TD light propagation in the adult human head, which is a complex heterogeneous medium including a non-scattering and non-absorbing cerebro-spinal fluid (CSF) layer. However, for obtaining accurate solutions, long computation times are needed because the statistical noises are inversely proportional to the square root of the number of injected photons. Some Appl. Sci. 2019, 9, 1127 8 of 54 techniques have been studied to overcome this drawback of MC simulations and will be reviewed in a later section.

Time-Domain Sensitivity Functions of Optical Signals
Assume that the TR fluence rate, φ 0 (r, t), is a solution of the TD-DE in a homogeneous medium with the absorption and diffusion coefficients, µ a0 and D 0 , for the impulse point source at the origin and time zero. If µ a0 and D 0 in a small volume of the medium, V p , are perturbed to µ a = µ a0 + δµ a and D = D 0 + δD, the perturbations in the fluence rate due to the perturbations of δµ a and δD, δφ a (r, t), and δφ D (r, t), respectively, are derived by the first order perturbation theory (explained in detail in a later section), and the TD sensitivities (weights or Jacobians) of the fluence rate to the changes in µ a and D, defined as the limits of δµ a → 0 and δD → 0 of the absolute values of δφ a (r, t)/δµ a and δφ D (r, t)/δD, respectively, are given as, where G(r, r , t − t ) is the Green's function which is the solution of φ(r, t) in the TD-DE for the impulse point source at position r and time t . The right-hand sides of Equations (17) and (18) are calculated analytically using the analytical solutions of TD-DE for simple geometries or numerically for complex geometries. Equations (17) and (18) are the TD sensitivities of the fluence rate to the changes in µ a and D, respectively, but the TD sensitivities of any optical signals such as the reflectance and transmittance can be defined. For example, when the reflectance, R(r d , r s , t) (r d and r s are the detector and source positions, respectively) is considered, the TD sensitivity of the reflectance to the change in µ a is expressed approximately as, where G (R) (r d , r , t) is the Green's function of the reflectance measured at r d for the impulse point source at position r and time t , and G(r , r s , t − t ) is the Green's function of the fluence rate measured at position r and time t − t for the impulse point source at position r s and time t . Now the physical meaning of Equation (19) should be considered. By integrating the right-hand side of Equation (19) for infinitesimal volume of V p = δ(r − r) around r, and after some mathematical manipulations, a three-point function of r d , r s , and r can be defined as Equation (20), where α is a proportionality constant and the second equality is derived by the reciprocity between the position of the detector and the position of interest, or the adjoint formulation. Here, G(r, r s , t ) means the fluence rate at position r and at time t generated by an impulse point source at position r s and time 0, and G(r d , r, t − t ) in the first equality means the fluence rate at position r d at time t − t generated by an impulse point source at position r and time t as indicated in Figure 5a. The function H(r d , r s , r, t) indicates the probability of photons that exist at position r after being injected from the source at position r s at time 0 and finally being detected by the detector at position r d at time t, and when illustrated in 2D or 3D images at varying times t, they show the temporal evolution of probabilistic light paths in the medium between the source and detector. In case of reflectance from semi-infinite media, the probabilistic light paths exhibit so-called banana shapes, and the banana shapes grow as time evolves. In case of transmittance through slabs, similar formulations are derived as in the case of reflectance, and the probabilistic light paths exhibit so-called spindle-like shapes.
Appl. Sci. 2019, 9, x FOR PEER REVIEW 9 of 53 The function H(rd, rs, r, t) is called various terminology such as the sensitivity to absorption, photon probability distribution [26], photon hitting density [27], photon measurement density function [28], relative photon visit probability [29], photon visiting probability [30], photon sampling function, photon weight function, Jacobian with respect to absorption, etc., with variation in the proportionality constant, α. The sensitivity to scattering changes can also be derived from Equation (18), but its physical meaning is not as clear as H(rd, rs, r, t). The sensitivities play important roles not only in understanding the propagating path of scattered light but also in the inversion process of reconstructing absorption and scattering images in DOT as explained later.
Sawosz et al. [30] experimentally estimated H(rd, rs, r, t) by time-gated measurements of the fluence rates through a homogeneous semi-infinite medium. The second equality in Equation (20) means that H(rd, rs, r, t) is given by the convolution between the Green's function for the input position at the source, rs, and that for the input position at the detector, rd, as shown in Figure 5b. Sawosz et al. [30] used this characteristic of H(rd, rs, r, t) to acquire the images in Figure 6 showing the banana shapes growing bigger as time evolved. Continuous wave sensitivity is obtained by extending the upper time limit in Equation (20) to infinity.

Time-Resolved (TR) Mean Depth of Light Propagation
From sensitivity to absorption, H(rd, rs, r, t), it is possible to calculate the TR mean depth of light propagation, <z>(rd, rs, t), in a semi-infinite homogeneous medium by, , ' where the integral with respect to position is over the volume, V, with finite values of H(rd, rs, r, t).
(rs, 0) (rd, t) The function H(r d , r s , r, t) is called various terminology such as the sensitivity to absorption, photon probability distribution [26], photon hitting density [27], photon measurement density function [28], relative photon visit probability [29], photon visiting probability [30], photon sampling function, photon weight function, Jacobian with respect to absorption, etc., with variation in the proportionality constant, α. The sensitivity to scattering changes can also be derived from Equation (18), but its physical meaning is not as clear as H(r d , r s , r, t). The sensitivities play important roles not only in understanding the propagating path of scattered light but also in the inversion process of reconstructing absorption and scattering images in DOT as explained later.
Sawosz et al. [30] experimentally estimated H(r d , r s , r, t) by time-gated measurements of the fluence rates through a homogeneous semi-infinite medium. The second equality in Equation (20) means that H(r d , r s , r, t) is given by the convolution between the Green's function for the input position at the source, r s , and that for the input position at the detector, r d , as shown in Figure 5b. Sawosz et al. [30] used this characteristic of H(r d , r s , r, t) to acquire the images in Figure 6 showing the banana shapes growing bigger as time evolved. Continuous wave sensitivity is obtained by extending the upper time limit in Equation (20) to infinity. The function H(rd, rs, r, t) is called various terminology such as the sensitivity to absorption, photon probability distribution [26], photon hitting density [27], photon measurement density function [28], relative photon visit probability [29], photon visiting probability [30], photon sampling function, photon weight function, Jacobian with respect to absorption, etc., with variation in the proportionality constant, α. The sensitivity to scattering changes can also be derived from Equation (18), but its physical meaning is not as clear as H(rd, rs, r, t). The sensitivities play important roles not only in understanding the propagating path of scattered light but also in the inversion process of reconstructing absorption and scattering images in DOT as explained later.
Sawosz et al. [30] experimentally estimated H(rd, rs, r, t) by time-gated measurements of the fluence rates through a homogeneous semi-infinite medium. The second equality in Equation (20) means that H(rd, rs, r, t) is given by the convolution between the Green's function for the input position at the source, rs, and that for the input position at the detector, rd, as shown in Figure 5b. Sawosz et al. [30] used this characteristic of H(rd, rs, r, t) to acquire the images in Figure 6 showing the banana shapes growing bigger as time evolved. Continuous wave sensitivity is obtained by extending the upper time limit in Equation (20) to infinity.

Time-Resolved (TR) Mean Depth of Light Propagation
From sensitivity to absorption, H(rd, rs, r, t), it is possible to calculate the TR mean depth of light propagation, <z>(rd, rs, t), in a semi-infinite homogeneous medium by, , ' where the integral with respect to position is over the volume, V, with finite values of H(rd, rs, r, t).

Time-Resolved (TR) Mean Depth of Light Propagation
From sensitivity to absorption, H(r d , r s , r, t), it is possible to calculate the TR mean depth of light propagation, <z>(r d , r s , t), in a semi-infinite homogeneous medium by, < z > (r d , r s , t) = V zH(r d , r s , r, t)dr V H(r d , r s , r, t)dr where the integral with respect to position is over the volume, V, with finite values of H(r d , r s , r, t).

Time-Resolved (TR) Pathlength
Time-Resolved pathlength, expressed as l(t, ρ), is defined as the pathlength of the trajectory that light takes from the impulse source position at time zero to the position where light exists at time t on the way to a detector with the SD distance of ρ, and is equal to the TR mean TOF, <t>(t, ρ), multiplied by the speed of light, c. The TR mean TOF, <t>(t, ρ), is calculated for the measured TOF distribution, expressed by F(t, ρ), as the following, If the medium is multi-layered, the TR partial pathlength of the i-th layer is defined as, where t i is time when the light is propagating inside the i-th layer. For CW light, the (partial) pathlengths are obtained by evolving the upper limit of time t to infinity, ∞.
In the i-th layer of a multi-layered medium or in a homogeneous medium, the optical properties are assumed to be constant. Then F(t, ρ) can be generally expressed by f (t i , ρ)exp(−µ ai ct) as seen from the TR reflectance of Equation (15), and Equation (23) is modified as follows, This equation holds not only for the DE but also for the RTE [32] (p.37), and is very useful for estimating total and partial pathlengths in various media.

Physiological Information and Optical Properties
Once the optical properties of biological tissues are estimated or determined by TD-NIRS physiological information, particularly, the hemodynamic information can be estimated.
The absorption coefficient of tissue is the sum of the absorption coefficients of the chromophores such as hemoglobin, myoglobin, melanin, water, lipid, cytochrome, etc., and expressed as, where ε i is the extinction coefficient or the molar absorption coefficient of the component i, C i is the molar concentration or the molarity of the component i, and N C is the number of components. Note that ε i is often based on common logarithm while µ a is based on natural logarithm, necessitating correction between the two logarithms. If ε i (λ) are known for all the components, measurements of µ a (λ) at N C (or more than N C ) wavelengths will determine C i of all the components. Oxygenated hemoglobin (oxy-Hb) and deoxygenated hemoglobin (deoxy-Hb) are two dominant chromophores in NIRS with known spectra of ε oxy-Hb (λ) and ε deoxy-Hb (λ). Then C oxy-Hb and C deoxy-Hb are determined and total hemoglobin (total-Hb) concentration, C total-Hb = C oxy-Hb + C deoxy-Hb , and oxygen saturation, S O2 = C oxy-Hb /C total-Hb , can be calculated to show hemodynamic statuses in brain, muscle, breast, skin, etc. In contrast to µ a , the scattering properties, µ s and µ s , indicate the statuses of microstructures and sub-cellular components, because scattering is caused by gradients and discontinuities of refractive indexes in tissue.

Overview of TD-NIRS Instruments
In the mid 1980s, NIRS succeeded in in vivo and non-invasive monitoring of the changes in cerebral hemoglobin concentrations in human heads [33]. However, the monitoring method based on the modified Beer-Lambert law is subject to the problem of quantitativeness due to the unknown optical pathlength which is necessary for applying the modified Beer-Lambert law.
To cope with this problem in the late 1980s, studies were conducted to estimate the optical pathlengths in biological tissues by measuring TOF distributions of reemitted light after being multiply scattered in tissues using picosecond-pulsed lasers and detected by detectors having picosecond temporal resolutions such as streak cameras. Delpy et al. and van der Zee et al. used an ultrafast dye laser excited by a krypton-laser for a light source (a pulse width less than 6 ps and repetition rate of 76 MHz at wavelengths of 761 nm and 783 nm) and a streak camera for a detector to estimate the mean optical pathlengths from the measured TOF distributions [34,35], and found that the absorbance changes of the measured transmittance were related to the absorption changes in the media through the estimated mean optical pathlengths. Chance et al. reported that the logarithmic slopes of TOF distributions in the decaying period were proportional to hemoglobin concentrations in biological tissues by TD experiments using two dye lasers excited by the second harmonics of an Nd:YAG laser for pulsed light sources and time-correlated single-photon counting (TCSPC) technique for a detector which was widely employed for measurement of fluorescence life time [36,37]. Furthermore, Nomura et al. measured TOF distributions through rat heads using a Ti-Sapphire laser for a pulsed light source and a streak camera for a detector and concluded that the Beer-Lambert law held every time in the TOF distributions [38].
These pioneering studies promoted wide use of time-domain near-infrared spectroscopy (TD-NIRS) systems for understanding light propagation and quantitative measurements of hemoglobin concentration in biological tissue. However, the TD-NIRS systems used in these pioneering studies had the disadvantages of being very expensive, large, non-portable, and highly sophisticated, and resultantly, their systems were unacceptable for clinical applications.
To mitigate the disadvantages stated above, Cubeddu et al. developed a compact dual-wavelength multichannel tissue oximeter using two pulsed-laser diodes (pulse width less than 100 ps, repetition rate of 80 MHz, wavelengths of 672 nm and 818 nm) for light sources and multi-anode photomultiplier tubes (PMTs) for detectors [39]. Oda et al. developed a compact three-wavelength one-channel tissue oxygenation monitor using three pulsed-laser diodes (pulse width less than 100 ps, repetition rate of 5 MHz, wavelengths of 759 nm, 797 nm and 833 nm) for light sources and PMTs for detectors based on the TCSPC technique [40]. This tissue oxygenation monitor was marketed but only in Japan. Grosenick et al. also developed a single-wavelength single-channel TD optical mammograph using a pulsed laser diode (pulse width of about 400 ps, repetition rate of 80 MHz, wavelength of 785 nm) for light source and a PMT-TCSPC for detector, and measured TR-transmittance through healthy breasts as well as breasts carrying tumor [41].
Currently, TD-NIRS systems with a few channels employing the TCSPC technique are actively developed for the purpose of applications in clinical usage, and also multi-channel TD-NIRS systems have been developed for TD-DOT.

Principle, Components, Characteristics, and Operation of the TCSPC Technique
The TCSPC technique is a time-resolved spectroscopic measurement method based on a single photon counting, which is a way of light detection by counting photons one by one, combined with a pulsed light source with a narrow pulse width in the order of picoseconds and a stabilized repetition of pulse generation in the order of a mega-hertz, and has the capability of measuring temporal changes in very weak light intensity with a high-temporal resolution in the order of picoseconds. At time, t, after the emission of a single ultra-short light pulse, photons reaching a detector with a probability of less than one photon per one light pulse are detected. This detection of single photon at time t is repeated many times in the order of a mega-hertz, and a histogram of the numbers of detected single photons as a function of time, t, is produced, i.e., a TOF distribution is obtained [37].
As shown in Figure 7, a standard TCSPC instrument consists of a pulsed light source, a single-photon counting detector, such as a PMT, a time pick-off circuit, such as a constant fraction discriminator (CFD), a time-to-amplitude converter (TAC), an A/D converter, a histogram memory, etc. A pulsed light source must emit ultrashort light pulses with a pulse width much less than the characteristic response time of the optical phenomenon in a sample and with a stabilized and fast repetitive pulse generation. The TAC, one of the key components, receives output pulses (start signals) from the detector and reference pulses (stop signals) synchronized to the ultrashort light pulses from the light source, and outputs pulses with the intensities proportional to the time differences between the start and stop signals. The output pulses from the TAC are processed by the A/D converter and the histogram memory.

Principle, Components, Characteristics, and Operation of the TCSPC Technique
The TCSPC technique is a time-resolved spectroscopic measurement method based on a single photon counting, which is a way of light detection by counting photons one by one, combined with a pulsed light source with a narrow pulse width in the order of picoseconds and a stabilized repetition of pulse generation in the order of a mega-hertz, and has the capability of measuring temporal changes in very weak light intensity with a high-temporal resolution in the order of picoseconds. At time, t, after the emission of a single ultra-short light pulse, photons reaching a detector with a probability of less than one photon per one light pulse are detected. This detection of single photon at time t is repeated many times in the order of a mega-hertz, and a histogram of the numbers of detected single photons as a function of time, t, is produced, i.e., a TOF distribution is obtained [37].
As shown in Figure 7, a standard TCSPC instrument consists of a pulsed light source, a singlephoton counting detector, such as a PMT, a time pick-off circuit, such as a constant fraction discriminator (CFD), a time-to-amplitude converter (TAC), an A/D converter, a histogram memory, etc. A pulsed light source must emit ultrashort light pulses with a pulse width much less than the characteristic response time of the optical phenomenon in a sample and with a stabilized and fast repetitive pulse generation. The TAC, one of the key components, receives output pulses (start signals) from the detector and reference pulses (stop signals) synchronized to the ultrashort light pulses from the light source, and outputs pulses with the intensities proportional to the time differences between the start and stop signals. The output pulses from the TAC are processed by the A/D converter and the histogram memory. The important factor in the measurement of the TCSPC technique is that the light intensity detected by the detector is adjusted so that the detection probability in a time-gate for one input pulse is constant at any time-gate. If more than one photon is detected in the time-gate for one input pulse, the TAC starts working at the arrival of the start signal generated by the first detected photon and neglects the start signals generated by the second, third, … detected photons until the next stop signal arrives. Then, measurement pile-up takes place causing a distortion in the TOF distribution. Resultantly, earlier times are enhanced because the detection probability in the time-gate is no longer constant, and a correct TOF distribution cannot be obtained due to a serious distortion. To solve this problem of pile-up distortion, the detection rate of the detector must be sufficiently smaller than the repetition rate of the input light pulses.
When m photons are detected in an average of one input pulse, the probability of detecting n photons for one input pulse is given by a Poisson distribution, The important factor in the measurement of the TCSPC technique is that the light intensity detected by the detector is adjusted so that the detection probability in a time-gate for one input pulse is constant at any time-gate. If more than one photon is detected in the time-gate for one input pulse, the TAC starts working at the arrival of the start signal generated by the first detected photon and neglects the start signals generated by the second, third, . . . detected photons until the next stop signal arrives. Then, measurement pile-up takes place causing a distortion in the TOF distribution. Resultantly, earlier times are enhanced because the detection probability in the time-gate is no longer constant, and a correct TOF distribution cannot be obtained due to a serious distortion. To solve this problem of pile-up distortion, the detection rate of the detector must be sufficiently smaller than the repetition rate of the input light pulses.
When m photons are detected in an average of one input pulse, the probability of detecting n photons for one input pulse is given by a Poisson distribution, As an example, when only one photon for 100 input pulses is detected in an average (m = 0.01), the probability that more than one photon for one input pulse are detected becomes very small as, It is obvious that the appropriate value of m depends on the acceptable distortion of the measured TOF distributions, and it is a common understanding that m less than 5% (0.05) provides negligible distortions of the measured TOF distributions. Actually, m is often kept at small values of about 1% (0.01) or less than 2% (0.02). In principle, the errors will decrease with the decrease in m, but at the same time the measuring time will be prolonged to assure good S/N ratios. Therefore, it is important to choose a light source capable of emitting light pulses stably with a narrow pulse width and a high repetition rate. To shorten measuring times, Ida and Iwata [42] devised a method correcting the pile-up distortion mathematically for the measured data with m larger than 0.1 (10%).
In the TCSPC technique, there also exists a dead time when the detector system is unable to count photons due to processing signals after receiving photons. If the dead time is longer than the time period between two successive input pulses, counting losses take place. In standard TCSPC systems, the dead times are longer than the time period between two successive input pulses. Then the CW intensity, I, given by integration of a TOF distribution with respect to time must be corrected for the counting losses by I true = I meas /(1 − αI meas ) where I true and I meas [s −1 ] are the corrected and integrated intensities, respectively, and α is the dead time [s] when the measuring systems are non-paralyzed models.

Single-and Dual-Channel TD-NIRS Systems Based on the TCSPC Technique
In 1999, Oda et al. developed a single-channel TD-NIRS system based on the TCSPC technique (TRS-10, Hamamatsu Photonics K.K., Japan), and in 2009 it was extended to a dual-channel system (TRS-20, Hamamatsu Photonics K.K., Japan) [43], both for research use. Figure 8 shows a photograph and block diagram of the TRS-20 consisting of three pulsed laser diodes with wavelengths of 760 nm, 800 nm, and 830 nm, a TCSPC unit with a CFD/TAC, A/D converter, histogram memory, etc.
As an example, when only one photon for 100 input pulses is detected in an average (m = 0.01), the probability that more than one photon for one input pulse are detected becomes very small as, It is obvious that the appropriate value of m depends on the acceptable distortion of the measured TOF distributions, and it is a common understanding that m less than 5% (0.05) provides negligible distortions of the measured TOF distributions. Actually, m is often kept at small values of about 1% (0.01) or less than 2% (0.02). In principle, the errors will decrease with the decrease in m, but at the same time the measuring time will be prolonged to assure good S/N ratios. Therefore, it is important to choose a light source capable of emitting light pulses stably with a narrow pulse width and a high repetition rate. To shorten measuring times, Ida and Iwata [42] devised a method correcting the pile-up distortion mathematically for the measured data with m larger than 0.1 (10%).
In the TCSPC technique, there also exists a dead time when the detector system is unable to count photons due to processing signals after receiving photons. If the dead time is longer than the time period between two successive input pulses, counting losses take place. In standard TCSPC systems, the dead times are longer than the time period between two successive input pulses. Then the CW intensity, I, given by integration of a TOF distribution with respect to time must be corrected for the counting losses by Itrue = Imeas/(1 − αImeas) where Itrue and Imeas [s −1 ] are the corrected and integrated intensities, respectively, and α is the dead time [s] when the measuring systems are non-paralyzed models.

Single-and Dual-Channel TD-NIRS Systems Based on the TCSPC Technique
In 1999, Oda et al. developed a single-channel TD-NIRS system based on the TCSPC technique (TRS-10, Hamamatsu Photonics K.K., Japan), and in 2009 it was extended to a dual-channel system (TRS-20, Hamamatsu Photonics K.K., Japan) [43], both for research use. Figure 8 shows a photograph and block diagram of the TRS-20 consisting of three pulsed laser diodes with wavelengths of 760 nm, 800 nm, and 830 nm, a TCSPC unit with a CFD/TAC, A/D converter, histogram memory, etc. From the TR reflectances measured by the TRS-10 and TRS-20, μa and μs′ of the media are estimated under the assumption that the media are homogeneous and semi-infinite. The TR reflectance from a homogeneous semi-infinite medium is given by the analytical solution of the TD-DE as Equation (15). For the zero-boundary condition, Equation (15) is slightly simplified as the following, From the TR reflectances measured by the TRS-10 and TRS-20, µ a and µ s of the media are estimated under the assumption that the media are homogeneous and semi-infinite. The TR reflectance from a homogeneous semi-infinite medium is given by the analytical solution of the TD-DE as Equation (15). For the zero-boundary condition, Equation (15) is slightly simplified as the following, here, the wavelength dependences are not explicitly written. Equation (28), convoluted by the instrumental response function (IRF), is fitted to the measured TD reflectance to estimate µ a and µ s of the medium using a non-linear least-squared technique based on the Levenberg-Marquardt method. Applying Equation (25) to the case of the two components of oxy-Hb and deoxy-Hb with background absorption gives the following equation with the wavelength dependence written explicitly, where µ a,BG is the absorption coefficient of the background medium. Solving the simultaneous equations of Equation (29) for the three wavelengths used in the instrument provides C oxy-Hb and C deoxy-Hb , and C total-Hb , and S O2 = C oxy-Hb /C total-Hb is calculated. For verification of the quantities obtained by the TRS-10, experiments were performed using a blood phantom consisting of an Intralipid suspension and blood with the controlled S O2 from 0% to 100% [44]. The values of C oxy-Hb , C deoxy-Hb , and C total-Hb measured by the TRS-10 agreed well with the values calculated from the measurements of pH, P O2 , and P CO2 in the phantom by conventional methods.

Multi-Channel TD-NIRS Systems Based on TCSPC for DOT
Multi-channel TD-NIRS systems have been developed for the purpose of TD-DOT. In 1999, a Japanese group developed a 64-channel TD-NIRS system and studied the image reconstruction algorithm as well as the performance of the system [45]. Figure 9 shows the photograph and block diagram of the system which had 64 independent detector units, while light sources of three pulsed laser diodes with three wavelengths operated in turn using an optical switch for selecting a wavelength and a mechanical switch for selecting a source position. Results using this system are reviewed in a later section [46][47][48].
Appl. Sci. 2019, 9, x FOR PEER REVIEW 14 of 53 here, the wavelength dependences are not explicitly written. Equation (28), convoluted by the instrumental response function (IRF), is fitted to the measured TD reflectance to estimate μa and μs′ of the medium using a non-linear least-squared technique based on the Levenberg-Marquardt method. Applying Equation (25) to the case of the two components of oxy-Hb and deoxy-Hb with background absorption gives the following equation with the wavelength dependence written explicitly, where μa,BG is the absorption coefficient of the background medium. Solving the simultaneous equations of Equation (29) for the three wavelengths used in the instrument provides Coxy-Hb and Cdeoxy-Hb, and Ctotal-Hb, and SO2 = Coxy-Hb/Ctotal-Hb is calculated. For verification of the quantities obtained by the TRS-10, experiments were performed using a blood phantom consisting of an Intralipid suspension and blood with the controlled SO2 from 0% to 100% [44]. The values of Coxy-Hb, Cdeoxy-Hb, and Ctotal-Hb measured by the TRS-10 agreed well with the values calculated from the measurements of pH, PO2, and PCO2 in the phantom by conventional methods.

Multi-Channel TD-NIRS Systems Based on TCSPC for DOT
Multi-channel TD-NIRS systems have been developed for the purpose of TD-DOT. In 1999, a Japanese group developed a 64-channel TD-NIRS system and studied the image reconstruction algorithm as well as the performance of the system [45]. Figure 9 shows the photograph and block diagram of the system which had 64 independent detector units, while light sources of three pulsed laser diodes with three wavelengths operated in turn using an optical switch for selecting a wavelength and a mechanical switch for selecting a source position. Results using this system are reviewed in a later section [46][47][48]. A 32-channel TD-NIRS system named MONSTIR (Figure 10) was developed by a group from the University of London for reconstructing TD-DOT images of brain functions in premature infants at bedside [49]. The DOT images obtained using these systems are reviewed later in Section 5.3.4. A 32-channel TD-NIRS system named MONSTIR (Figure 10) was developed by a group from the University of London for reconstructing TD-DOT images of brain functions in premature infants at bedside [49]. The DOT images obtained using these systems are reviewed later in Section 5.3.4.
Ueda et al. [50] developed a 16-channel TD-NIRS system based on the TCSPC technique to reconstruct TD-DOT images from measured TR reflectances at an adult human forehead. Results of TD-DOT using this system are reviewed later in Section 5.3.4.

Other New TD-NIRS Systems
Various research institutes have updated and improved TD-NIRS systems by incorporating recently developed and useful apparatuses for the components of TD-NIRS systems. In the following, newly developed TD-NIRS systems and new measurement methods for TD-NIRS systems including ones which do not use the TCSPC technique are briefly reviewed.

MONSTIR II
In 2014, a group from the University of London developed an updated version of MONSTIR named MONSTIR II employing a supercontinuum (SC) laser for the light source with an acousticoptic tunable filter (AOTF), which enables to select four wavelengths arbitrarily in a wavelength range from 650 nm to 950 nm [51].

TD-DOT and TD-NIRS Systems for Optical Mammography
In 2011, Hamamatsu Photonics developed a 48-channel three-wavelength optical mammography as shown in Figure 11a [52]. Breasts were immersed in a hemispherical gantry filled with a matching fluid, and DOT images were reconstructed by an algorithm based on the microscopic Beer-Lambert law [9,50]. Furthermore, a 12-channel TD-NIRS system for optical mammography with a hand-held probe shown in Figure 11b was developed to be used at a bed side [53]. Another TD-NIRS system for optical mammography, as shown in Figure 11c, was developed by a group from Italy to obtain TR transmittance images of compressed breasts [54]. The results of these systems are briefly described later in Sections 5.2.1 and 5.3.5. Figure 11. Photos of (a) the 48-channel TD-DOT for optical mammography [52]; (b) the 12-channel TD-NIRS system for optical mammography with a hand-held probe [53], and (c) the transmittance Ueda et al. [50] developed a 16-channel TD-NIRS system based on the TCSPC technique to reconstruct TD-DOT images from measured TR reflectances at an adult human forehead. Results of TD-DOT using this system are reviewed later in Section 5.3.4.

Other New TD-NIRS Systems
Various research institutes have updated and improved TD-NIRS systems by incorporating recently developed and useful apparatuses for the components of TD-NIRS systems. In the following, newly developed TD-NIRS systems and new measurement methods for TD-NIRS systems including ones which do not use the TCSPC technique are briefly reviewed.

MONSTIR II
In 2014, a group from the University of London developed an updated version of MONSTIR named MONSTIR II employing a supercontinuum (SC) laser for the light source with an acoustic-optic tunable filter (AOTF), which enables to select four wavelengths arbitrarily in a wavelength range from 650 nm to 950 nm [51].

TD-DOT and TD-NIRS Systems for Optical Mammography
In 2011, Hamamatsu Photonics developed a 48-channel three-wavelength optical mammography as shown in Figure 11a [52]. Breasts were immersed in a hemispherical gantry filled with a matching fluid, and DOT images were reconstructed by an algorithm based on the microscopic Beer-Lambert law [9,50]. Furthermore, a 12-channel TD-NIRS system for optical mammography with a hand-held probe shown in Figure 11b was developed to be used at a bed side [53]. Another TD-NIRS system for optical mammography, as shown in Figure 11c, was developed by a group from Italy to obtain TR transmittance images of compressed breasts [54]. The results of these systems are briefly described later in Sections 5.2.1 and 5.3.5.
Ueda et al. [50] developed a 16-channel TD-NIRS system based on the TCSPC technique to reconstruct TD-DOT images from measured TR reflectances at an adult human forehead. Results of TD-DOT using this system are reviewed later in Section 5.3.4.

Other New TD-NIRS Systems
Various research institutes have updated and improved TD-NIRS systems by incorporating recently developed and useful apparatuses for the components of TD-NIRS systems. In the following, newly developed TD-NIRS systems and new measurement methods for TD-NIRS systems including ones which do not use the TCSPC technique are briefly reviewed.

MONSTIR II
In 2014, a group from the University of London developed an updated version of MONSTIR named MONSTIR II employing a supercontinuum (SC) laser for the light source with an acousticoptic tunable filter (AOTF), which enables to select four wavelengths arbitrarily in a wavelength range from 650 nm to 950 nm [51].

TD-DOT and TD-NIRS Systems for Optical Mammography
In 2011, Hamamatsu Photonics developed a 48-channel three-wavelength optical mammography as shown in Figure 11a [52]. Breasts were immersed in a hemispherical gantry filled with a matching fluid, and DOT images were reconstructed by an algorithm based on the microscopic Beer-Lambert law [9,50]. Furthermore, a 12-channel TD-NIRS system for optical mammography with a hand-held probe shown in Figure 11b was developed to be used at a bed side [53]. Another TD-NIRS system for optical mammography, as shown in Figure 11c, was developed by a group from Italy to obtain TR transmittance images of compressed breasts [54]. The results of these systems are briefly described later in Sections 5.2.1 and 5.3.5. Figure 11. Photos of (a) the 48-channel TD-DOT for optical mammography [52]; (b) the 12-channel TD-NIRS system for optical mammography with a hand-held probe [53], and (c) the transmittance  Figure 11. Photos of (a) the 48-channel TD-DOT for optical mammography [52]; (b) the 12-channel TD-NIRS system for optical mammography with a hand-held probe [53], and (c) the transmittance TD-NIRS imaging system for optical mammography with breast compression [54]. (b) Reproduced from Reference [53]. (c) Reproduced from Reference [54].

Compact TD-NIRS Systems Using MPPCs (or SiPMs)
A multi-pixel photon counter (MPPC), or a silicon photomultiplier (SiPM) for another name, is a new type of photon counting semiconductor device consisting of a high-density matrix of avalanche photodiodes, enabling single-photon detection and offering high-photon detection efficiency, excellent timing resolution, low bias voltage operation, ruggedness, resistance to excess light, and immunity to magnetic fields. Using MPPCs (or SiPMs), compact TD-NIRS systems have been developed and commercialized recently.
A dual-channel TD-NIRS system for non-invasive monitoring of oxygenation in the brain (tNIRS-1, Hamamatsu Photonics K.K., Japan) was manufactured and marketed as a medical instrument in 2014 [55]. Figure 12 shows its photograph. It employs semi-conductor devices of cooled multi-pixel photon counters (MPPCs) for light detection and time-to-digital converters (TDCs) for TR measurement with a design concept of compactness and usability, resulting in a size of 292 mm (width) × 291 mm (height) × 207 mm (depth) and a weight of about 7.5 kg.

Compact TD-NIRS Systems Using MPPCs (or SiPMs)
A multi-pixel photon counter (MPPC), or a silicon photomultiplier (SiPM) for another name, is a new type of photon counting semiconductor device consisting of a high-density matrix of avalanche photodiodes, enabling single-photon detection and offering high-photon detection efficiency, excellent timing resolution, low bias voltage operation, ruggedness, resistance to excess light, and immunity to magnetic fields. Using MPPCs (or SiPMs), compact TD-NIRS systems have been developed and commercialized recently.
A dual-channel TD-NIRS system for non-invasive monitoring of oxygenation in the brain (tNIRS-1, Hamamatsu Photonics K.K., Japan) was manufactured and marketed as a medical instrument in 2014 [55]. Figure 12 shows its photograph. It employs semi-conductor devices of cooled multi-pixel photon counters (MPPCs) for light detection and time-to-digital converters (TDCs) for TR measurement with a design concept of compactness and usability, resulting in a size of 292 mm (width) × 291 mm (height) × 207 mm (depth) and a weight of about 7.5 kg. In 2016, the group from Politecnico di Milano, Italy, developed a compact (size of 160 mm (width) × 50 mm (height) × 200 mm (depth)), low power consumption, dual-wavelength TD-NIRS system as shown in Figure 13 [56]. It employs two pulsed laser diodes with wavelengths of 670 nm and 830 nm for the light sources, a silicon photomultiplier (SiPM) with an active area of 1 mm 2 for the detector, and a home-made TDC with a temporal resolution of 10 ps for TR measurements. For TD-NIRS, the same group also developed a compact detector probe integrating a fast SiPM and its electronics, which can be directly put in contact with the skin [57]. Many compact detector probes can be installed into a head-cap without the need for optical fibers for collecting light.   In 2016, the group from Politecnico di Milano, Italy, developed a compact (size of 160 mm (width) × 50 mm (height) × 200 mm (depth)), low power consumption, dual-wavelength TD-NIRS system as shown in Figure 13 [56]. It employs two pulsed laser diodes with wavelengths of 670 nm and 830 nm for the light sources, a silicon photomultiplier (SiPM) with an active area of 1 mm 2 for the detector, and a home-made TDC with a temporal resolution of 10 ps for TR measurements. For TD-NIRS, the same group also developed a compact detector probe integrating a fast SiPM and its electronics, which can be directly put in contact with the skin [57]. Many compact detector probes can be installed into a head-cap without the need for optical fibers for collecting light.

Compact TD-NIRS Systems Using MPPCs (or SiPMs)
A multi-pixel photon counter (MPPC), or a silicon photomultiplier (SiPM) for another name, is a new type of photon counting semiconductor device consisting of a high-density matrix of avalanche photodiodes, enabling single-photon detection and offering high-photon detection efficiency, excellent timing resolution, low bias voltage operation, ruggedness, resistance to excess light, and immunity to magnetic fields. Using MPPCs (or SiPMs), compact TD-NIRS systems have been developed and commercialized recently.
A dual-channel TD-NIRS system for non-invasive monitoring of oxygenation in the brain (tNIRS-1, Hamamatsu Photonics K.K., Japan) was manufactured and marketed as a medical instrument in 2014 [55]. Figure 12 shows its photograph. It employs semi-conductor devices of cooled multi-pixel photon counters (MPPCs) for light detection and time-to-digital converters (TDCs) for TR measurement with a design concept of compactness and usability, resulting in a size of 292 mm (width) × 291 mm (height) × 207 mm (depth) and a weight of about 7.5 kg. In 2016, the group from Politecnico di Milano, Italy, developed a compact (size of 160 mm (width) × 50 mm (height) × 200 mm (depth)), low power consumption, dual-wavelength TD-NIRS system as shown in Figure 13 [56]. It employs two pulsed laser diodes with wavelengths of 670 nm and 830 nm for the light sources, a silicon photomultiplier (SiPM) with an active area of 1 mm 2 for the detector, and a home-made TDC with a temporal resolution of 10 ps for TR measurements. For TD-NIRS, the same group also developed a compact detector probe integrating a fast SiPM and its electronics, which can be directly put in contact with the skin [57]. Many compact detector probes can be installed into a head-cap without the need for optical fibers for collecting light.

TD-NIRS Systems Using SPADs
Puszka et al. [58] proposed and developed a new TD-NIRS system incorporating single-photon avalanche diodes (SPADs) operating in a fast-gating mode into a TCSPC unit for use in TD-DOT. Through phantom experiments, they compared the results with and without time-gating and concluded that the proposed method not only extended the dynamic range of the detector but also improved the depth sensitivity due to the higher sensitivity to late photons. Dalla Mora et al. [59] also developed a system using fast-pulsed vertical cavity surface emitting lasers and fast-gated SPADs, both embedded into probes with SD distances of 5 mm and 30 mm, indicating the feasibility of application to wearable imaging instruments which should be compact, inexpensive, and quantitative. Furthermore, Di Sieno et al. [60] modified the above system using an SC laser with an AOTF for the light source and fast-gated SPADs for the detector, aiming the application at non-contact measurements. They performed phantom experiments to evaluate the performances of the system and discussed the possibility of TD-DOT imaging with non-contact measurements.
Sinha et al. [61] proposed an early photon approach to improve the spatial resolution of TD-DOT images using a developed TD-NIRS system based on the TCSPC technique incorporating SPADs. To enhance the detection efficiency of early photons with less scattering events, measurements were made with a detection rate much higher than that leading to counting losses due to the dead time.
Kalyanov et al. [62] reported a TD-DOT system using a state-of-the-art SPAD camera chip for detection, which is an array of 32 × 32 SPADs including a time-to-digital converter, and a super-continuum laser. The source light can be guided to 24 channels at most, and a total of 1024 TOF distributions at the maximum can be obtained without bulky instruments using conventional TD-NIRS systems.

TD-NIRS Systems Using Pseudo-Random Bit Sequences
Chen and Zhu [63] developed a new TD-DOT system based on the spread spectrum approach using a laser diode modulated with pseudo-random bit sequences, which replaced picosecond or femtosecond ultrashort pulsed lasers usually employed for light sources in the TCSPC technique. Phantom experiments verified the improvements of the spatial resolution and S/N ratio of the 2D scanning images through phantoms with a thickness of 5.5 cm. Mo and Chen [64] developed a fast TD-DOT system using the pseudo-random bit sequences. They reported that the IRF of the system was about 800 ps and that 2D maps of the optical properties could be obtained within a few seconds. Lange et al. [67] developed a broadband multichannel TD-NIRS system using a supercontinuum laser for a light source and an ICCD camera coupled with an imaging spectrometer for a detector. The spectral range was from 600 nm to 900 nm and the IRF was about 660 ps. The performances of the system were demonstrated by in vivo experiments to monitor the hemodynamic responses in adult arms during a cuff occlusion and in adult brains during a cognitive task.

Compact TD-NIRS System Incorporating Devices Used in Telecommunications
Very recently, Konstantinos et al. [3] developed a single-channel TD-NIRS system using a spread spectrum technique for TOF measurements. As a light source, the system incorporates a low-cost commercially available optical transceiver module, widely used in telecommunications applications. The system can generate an IRF under 600 ps, exhibits good accuracy and low-noise properties, requires very short warm-up times in contrast to conventional pulsed laser diodes, and is very compact compared to traditional TD-NIRS systems.

TD-NIRS Systems for Measurement of Water, Lipid, and Collagen Contents
Most TD-NIRS systems employ light sources with a wavelength range from about 700 nm to about 900 nm by focusing attention on hemodynamics. By extending the wavelength range up to 1100 nm, it becomes possible to measure the contents of other tissue components. Taroni et al. [54] developed a TR reflectance imaging system for optical mammography, as shown in Figure 11c, to obtain images of the contents of water, lipid, and collagen in addition to oxy-and deoxy-hemoglobin using seven wavelengths from 635 nm to 1060 nm by employing seven pulsed laser diodes. Ohmae et al. [68] developed a compact TD-NIRS system using six wavelengths in the wavelength range from 760 nm to 980 nm for measuring water and lipid content in addition to hemoglobin oxygenation. The results from using these instruments are briefly reviewed later in Section 5.2.1.

Future Trend of TD-NIRS Instruments
As stated above in Section 3.3, many new techniques have been developed to improve the performances of TD-NIRS instruments toward faster data acquisition within shorter measurement times, with contactless measurements, more compact devices, lower costs, faster image reconstruction algorithms with higher image quality, and more physiological information by extending the wavelength range, etc. These improvements will expand clinical applications of TD-NIRS systems and will lead to the development of 3D imaging with TD-DOT.

Advanced Theories and Methods for TD-NIRS
In this section, advanced theories and methods based on Section 2 are briefly described.

Solving the TD-RTE and TD-DE Numerically
The TD-DE is a parabolic type of partial differential equation which can be easily solved by commercially available software codes even for geometrically complicated media such as the human head and small animal models using the finite element method (FEM). Schweiger et al. [69] generally described the numerical method using FEM to solve the TD-DE as a forward calculation, as well as an inverse calculation for time-domain diffuse optical tomography (TD-DOT).
Solving the TD-RTE is not an easy task even by numerical computation due to the term of the scattering integral in Equation (1). The discrete ordinates method (DOM) is a standard technique to solve the RTE and some good monographs have been published for the method [70]. A review of solving the TD-RTE numerically is beyond the scope of this article, and only a few articles, which were arbitrarily chosen, are reviewed below.
Das et al. [71,72] solved the TD-RTE using the DOM and the calculated TOF distributions were compared with measured TOF distributions by TD experiments using tissue phantoms, rat tissue samples in vitro, and anesthetized rats in vivo.
Fujii et al. [73] solved the TD-RTE using the DOM to understand light propagation in the human neck including the thyroid and trachea for diagnosis of thyroid cancer by DOT. Light propagation inside the trachea cannot be expressed by the TD-DE because the trachea is a perfect void region of air, and reflection and refraction at the tissue-trachea interface should be considered. The results showed an interesting pattern of light propagation inside the trachea.
Some methods have been proposed to calculate the TD-RTE with smaller computation loads. Only two studies are reviewed briefly for your information. One is to employ a cell-vertex finite volume method for discretization of the space for faster computing of the 2D TD-RTE [74]. The other is to employ a new renormalization approach to calculate the scattering integral for accurate and efficient computation of the 3D TD-RTE [75].

Analytical Solutions for the TD-RTE
Paasschens [76] derived an almost exact analytical solution for the TD-RTE for the impulse point source in an infinite medium using the Fourier and its inverse transforms. He succeeded in correctly expressing the ballistic component which is never expressed by solutions for the TD-DE.
Martelli et al. developed an approximate Green's function for the TD-RTE for TD reflectance from a semi-infinite medium by heuristically employing the mirror image method to the analytical solution of the TD-RTE by Paasschens [76,77]. The closed-form Green's function is useful for measurement of optical properties or DOT based on the TD-RTE using inversion analyses.
Liemert and Kienle derived a Green's function of the TD-RTE for the radiance and fluence rate in an infinite medium by expanding the radiance into Legendre polynomials similarly to the P N approximation and by expressing the expansion coefficients analytically [78]. They also derived a Green's function of the TD-RTE for the radiance in a 3D anisotropically-scattering medium with an impulse point unidirectional source [79]. The solution involves a spherical Hankel transform necessitating numerical calculation, but the dependences on all the variables were found analytically.
Simon et al. extended the analytical solution derived by Liemert and Kienle to a semi-infinite medium by employing the mirror image method as Martelli et al. did [77,78,80]. The mirror image method correctly describes the boundary condition for the DE, but only approximately for the RTE. Nevertheless, the solutions agree well with the results of MC simulations and those of Martelli et al. and are applicable for anisotropically-scattering media.

The TD-RTE with Spatially Varying Refractive Index
The TD-RTE of Equation (1) assumes a constant refractive index throughout the medium, but the refractive index may spatially vary inside the medium. Ferwerda formulated the RTE for media with a spatially varying refractive index [81]. Khan and Jiang then derived the DE with a spatially varying refractive index from the TD-RTE formulated by Ferwerda [81,82].
Tualle and Tinet derived the TD-RTE with varying refractive indexes, which satisfy energy conservation while the RTE by Ferwerda does not satisfy energy conservation [81,83]. They also derived the TD-DE with varying refractive index and verified the equation by comparison with the results of MC simulations.

Solutions of the Telegraph Equation (TE)
Kumar et al. [84] numerically solved a 1D case of the TE, Equation (5), using the method of characteristics and finite difference. The solutions of the TE were compared with those of the TD-DE, and the ballistic components were clearly observed in the solutions of the TE while the solutions of the TD-DE violated the causality in the early times.
Analytical solutions of the TE were obtained by Durian and Rudnick, although their TE was a little bit different from that derived by the P 1 approximation, Equation (5) [85]. They obtained analytical solutions for infinite slabs and semi-infinite media as the forms of Laplace transform, and the solutions obtained by inverse Laplace transform cannot be expressed explicitly. Their analytical solutions agreed well with the results of MC simulations, although small differences were observed in very early times.

Perturbation Theory
Perturbation approaches have widely been employed for imaging inclusions with the optical properties different from those of the background homogeneous medium. The perturbation approach is a general concept applicable to any equations describing physical phenomena, and it is natural to apply the perturbation approach to the phenomenon of light propagation, or, photon migration, described by the TD-RTE, the TE, and the TD-DE. Actually, Polonsky and Box [86] reported general formulations of the perturbation approach applied to the RTE. However, the perturbation approach using the TD-RTE has difficulties in obtaining analytical solutions such as the Green's functions, and most studies of the perturbation approach are based on the TD-DE for which the Green's functions are available for simple geometries such as the infinite, semi-infinite, infinite slab, cylindrical, and spherical media [19].

Formulation of the TD-Perturbation Based on the TD-DE
Formulation of the TD-perturbation starts from the TD-DE (Equation (7)) for the unperturbed fluence rate, φ 0 (r, t), under the unperturbed µ a0 (r) and D 0 (r) = 1/3µ s 0 (r), and the TD-DE for the perturbed fluence rate, φ p (r, t) = φ 0 (r, t) + δφ(r, t), under the perturbed coefficients, µ ap (r) = µ a0 (r) + δµ a (r) and D p (r) = D 0 (r) + δD(r). By subtracting the unperturbed equation from the perturbed equation, the TD-DE for δφ(r, t) is obtained as, If the right-hand side of Equation (30) is considered as a virtual source, Q v (r,t) = {−δµ a (r) + ∇•[δD(r)∇]}φ p (r,t), and Equation (30) has the same form as the unperturbed equation. Then, if the Green's function of the unperturbed equation is known as G(r, r , t − t ), the solution of Equation (33) for the virtual source of Q v (r, t) is given by the Green's function method as, where V p indicates the volume occupied by the perturbations. Substitution of Q v (r,t) into Equation (31) and some mathematical operations leads to the following equation, where Now, the key subject is how to calculate the integrals of Equations (33) and (34) including the unknown φ p (r, t) (note: the integral equation composed of Equations (32), (33), and (34) is called the Fredholm equation of the second kind), and various assumptions and approximations have been employed. Usually, the perturbations are assumed constant inside the inclusions, i.e., δµ a (r) = δµ a = const and δD(r) = δD = const, and the simplest approximation is to substitute φ 0 (r, t) for φ p (r, t) by assuming φ p (r, t) ≈ φ 0 (r, t) resulting in the followings, These solutions are called the first order perturbation or the Born approximation, and many TD studies have reported analytical formulations in the first order perturbation using the TD-Green's functions for homogeneous infinite media, semi-infinite media, infinite slabs, multi-layered media [32] (pp.   [28,[87][88][89][90][91]. Solutions of the second order perturbation are derived by utilizing the first order perturbation in the manner of φ p (r, t) ≈ φ 0 (r, t) + δφ a (1) + δφ D (1) resulting in the followings,

δφ
(2) D (r, t) = δφ The third, fourth, and higher order perturbation can be formulated in a similar manner, and the cross talks between δµ a and δD begin to play a role.
The first order perturbation, i.e., the Born approximation, is believed to be valid for inclusions with very small sizes and very small changes in µ a and D from those of the background. For large inclusions with large changes in µ a and D such as in the case of breast cancers, the second or third order perturbation may be necessary. For this purpose, many studies of higher order perturbation theory have been reported [92][93][94][95].

First Order TD-Perturbation Using the TD-DE
Arridge [28] reported the analytical formulations of the first order TD perturbation approach based on the TD-DE for homogeneous infinite media, semi-infinite media and infinite slabs. The paper describes the analytical formula of the kernels of the integrals in Equation (33) and (34), so called the absorption and diffusion kernels, respectively. The kernels in the inversion problems are essentially the Jacobians which have a physical meaning of probability density function, or sensitivity of a measurement by the perturbation, and the images of the TD Jacobians for various geometries are shown. Hebden and Arridge reported experimental results using a breast-like slab phantom to show reconstructed DOT images of inclusions in the phantom by use of the perturbation method in the preceding paper [28,87]. The spatial resolution of scattering inclusions was found to be better than that of absorbing inclusions.
Morin et al. experimentally demonstrated the performance of a TD perturbation approach for detection of cylindrical inclusions embedded in a 20-mm thick scattering slab simulating compressed breasts measuring the TR transmittance in a collinear geometry of the source, inclusion, and detector [88]. They reported that the changes in the transmittance by absorbing perturbations were two-orders of magnitudes stronger than those by scattering perturbations.
Carraresi et al. [89] studied the accuracy of a TD perturbation model to predict the effects of absorbing and scattering inclusions on light propagation. They compared the results of the perturbation model with those of Monte Carlo simulations and phantom experiments for the TR-transmittance through a 40-mm thick scattering slab simulating breasts and verified that the changes in the TR transmittance caused by absorbing perturbations were independent of those by scattering perturbations.
Spinelli et al. [90] employed Padé approximants to extend the application of the first order TD perturbation to large-volume inclusions using a non-linear contrast of the measured TR transmittance as a function of absorbing and scattering perturbations. Their perturbation model and experimental setup were similar to those of Carraresi [89], and the employment of Padé approximants improved the accuracy of the reconstructed images for the large-volume inclusions.
Martelli et al. [91] developed formulations of the TD perturbation model for two-and three-layered media with absorbing perturbations supposing human heads. The eigen-function solutions of the TD-DE for two-and three-layered media were used [96,97], and results of their model agreed well with those of MC simulations, and TD-sensitivity maps were also shown.

Higher Order TD Perturbation Using the TD-DE
Sassaroli et al. proposed a higher order TD perturbation model for absorbing perturbation by introducing a probability distribution function and defining TD mixed-pathlength moments (mixed-moments) [94,95]. The fundamental equation is valid for both the TD-RTE and TD-DE, but analytical solutions for the TD-RTE are not available while those of the TD-DE are available. Also, calculations of the mixed-moments are not easy, but because the contributions of the mixed-moments are negligibly small compared with those of the self-moments, calculations using the self-moments only is found to provide good results. The fourth order perturbation using the Green's functions for the TD-DE and the Padé approximants have shown excellent results for large-volume inclusions with large absorption contrasts to the background.
Wassermann [92] derived general formulations of a higher order TD perturbation model for absorbing perturbations, and the author discussed that higher order perturbation is effective when the characteristic parameter, Λ, is smaller than 3, i.e., Λ = |δµ a |R inc 2 /D 0 < 3 where R inc is the radius of a spherical inclusion. The paper provided calculation results of the second and third perturbations of the TR transmittance from a 60-mm thick slab having an absorbing inclusion with a diameter of 10 mm and a contrast of δµ a /µ a0 about unity, and it summarized that the second perturbation is sufficient for accurate predictions. Grosenick et al. reported explicit formula of higher order perturbations similar to those derived by Wassermann [92,93], not only for absorbing perturbations but also for scattering perturbations up to the third order corrections although calculation results for scattering perturbations were not shown. The authors also refer to the characteristic parameter, Λ, as Wassermann for the criterion for convergence of the Born series for absorbing perturbations [92].
Most of the above reports for higher order perturbations are applicable to absorbing perturbations only, and the development of higher order perturbation models for scattering perturbations is desirable.

TD-Perturbation Using the TD-RTE
For more general treatment of the perturbation theory, Polonsky and Box started from the TD-RTE for the Stokes vector of the radiance, I = {I, Q, U, V}, where I, Q, U, and V are the Stokes parameters describing the polarization state [86]. For simplicity of discussions, they first defined the operator of the RTE, L, for I(t, r, n) at point r in the direction n at time t, and its adjoint operator, L * , for the Stokes vector of the adjoint radiance, I * (t, r, n), and then expressed an optical measurement of E (such as reflectance or transmittance) as the scalar product of a response function (equivalent to the Green's function) of R and the radiance of I, E = <R + , I> (superscript + denotes the transpose of the vector). Finally, they derived two basic equations for the perturbation approach as, where subscript b denotes the background medium and p is a parameter such as µ a and µ s . It is very difficult to explicitly give analytical solutions of the TD-RTE for the response function of R, because Equation (39) is expressed implicitly, but the perturbations of Equation (39) are given by the direct and adjoint solutions of the RTE, I b and I * b , and for the background medium.

Multi-Layered Media
Most biological organs can be categorized in multi-layered media from an optical point of view. Two typical examples are the human head and the tissues above the skeletal muscle, the former consisting of five layers of scalp (skin), skull, cerebrospinal fluid (CSF), gray matter, and white matter layers, the latter consisting of three layers of skin, fat, and muscle layers. The analytical solutions of the equations describing light propagation in multi-layered media will be very useful to recover the optical properties of each layer from measurements of TR reflectance. No analytical solution for the TD-RTE for multi-layered media has been reported so far, but the analytical solutions for the TD-DE have been obtained as follows.
Kienle et al. [98] reported the analytical solution of TR reflectance for two-layered semi-infinite media based on the TD-DE to recover µ a and µ s of the layers. But the analytical solutions derived using the Fourier-transform approach were not explicitly given in TD. Tualle et al. [99] reported another analytical solution of TR reflectance for two-layered semi-infinite media based on the TD-DE by applying the mirror image method for semi-infinite homogeneous media to two-layered media using distributed sources. Further, Tualle et al. extended Kienle et al. study to multi-layered media and obtained explicitly the asymptotic solution of TR reflectance in the very late time when the asymptotic solution is dependent on µ a and µ s of the deepest layer [98,100]. Liemert and Kienle extended the method for two-layered media to general N-layered media although explicit analytical solutions were not given [98,101]. The TD solutions were obtained using a fast Fourier transform from the FD solutions for N-layered media at many (512 for example) frequencies.
The analytical solutions of TR reflectance using the eigenfunction method were reported in series by Martelli et al. [96,97,102,103]. The geometries of the objects were a two-layered semi-infinite medium [102], two-layered parallelepiped [96], two-layered finite cylinder [97], and three-layered finite cylinder [103]. Using the separation of variables for space (r) and time (t) based on the microscopic Beer-Lambert law, the analytical solution of the TD-DE for a three-layered finite cylinder shown in Figure 14 are obtained as [103], where i = 1, 2, 3 denote the first, second, and third layer, respectively, J 0 is the 0th order Bessel function, K l are the roots of J 0 (K l L) = 0, and for K lni , γ lni , and N ln refer to the reference [103]. The TR reflectance, R(ρ, t), is obtained from Equation (40) by use of Fick's law at z = 0, and the TR partial mean pathlength inside each layer, <l i (ρ, t)>, is given analytically by the following equation, Appl. Sci. 2019, 9, x FOR PEER REVIEW 23 of 53 finite cylinder [103]. Using the separation of variables for space (r) and time (t) based on the microscopic Beer-Lambert law, the analytical solution of the TD-DE for a three-layered finite cylinder shown in Figure 14 are obtained as [103], where i = 1, 2, 3 denote the first, second, and third layer, respectively, J0 is the 0th order Bessel function, Kl are the roots of J0(KlL) = 0, and for Klni, γlni, and Nln refer to the reference [103]. The TR reflectance, R(ρ, t), is obtained from Equation (40) by use of Fick's law at z = 0, and the TR partial mean pathlength inside each layer, <li(ρ, t)>, is given analytically by the following equation, Figure 14. Geometry of a three-layered finite cylinder simulating the human head. Modified from Reference [103].
Another analytical solution of the TD-DE for finite cylindrical stratified media was reported by Barnett based on the separation of variables [104]. This method was verified by comparing the results for two-layered media with those of Tualle et al. [99], and reportedly the computation was fast enough for application to DOT.

Advanced Monte Carlo Simulations
Kienle and Patterson proposed a method to produce results of MC simulations for many combinations of μa and μs by correcting the result of one MC simulation for the reference combination of μa = 0 and μs = μs0 [105]. When the TR reflectance from a semi-infinite homogeneous medium with μa = 0 and μs = μs0 is R0(ρ, t), the TR reflectance, R(ρ, t), for a medium with μa and μs is given by, Many results of R(ρ, t) produced from a single MC simulation were used for estimating μa and μs′ from the measured TR reflectance. A similar MC method was employed by Alterstam et al. [106] where they called the method white Monte Carlo. The results are shown for the human head model having three layers consisting of the combined scalp and skull layer, the CSF layer, and the brain. The mean partial pathlength, given by temporal integration of Equation (41), of the third (brain) layer is also shown to vary from 0.3 mm to about 50 mm with ρ varying from 10 mm to 50 mm, respectively. Software for the two-layered finite cylinders is given in the Reference [32] (pp. [109][110][111][112][113][114][115][116][117][118][119][120][121][122][123][124][125].
Another analytical solution of the TD-DE for finite cylindrical stratified media was reported by Barnett based on the separation of variables [104]. This method was verified by comparing the results for two-layered media with those of Tualle et al. [99], and reportedly the computation was fast enough for application to DOT.

Advanced Monte Carlo Simulations
Kienle and Patterson proposed a method to produce results of MC simulations for many combinations of µ a and µ s by correcting the result of one MC simulation for the reference combination of µ a = 0 and µ s = µ s0 [105]. When the TR reflectance from a semi-infinite homogeneous medium with µ a = 0 and µ s = µ s0 is R 0 (ρ, t), the TR reflectance, R(ρ, t), for a medium with µ a and µ s is given by, Many results of R(ρ, t) produced from a single MC simulation were used for estimating µ a and µ s from the measured TR reflectance. A similar MC method was employed by Alterstam et al. [106] where they called the method white Monte Carlo. Sassaroli et al. [107] proposed another correction method to produce many results from a single Monte Carlo simulation, called perturbation Monte Carlo (pMC) method. At the first step, a standard MC simulation is performed for homogeneous and non-absorbing medium with µ a = 0 and µ s = µ s0 , and all the coordinates of scattering positions only for detected photons are stored. The weights of the detected photons are unity at this step. At the second step, another medium with an inhomogeneity is considered with µ ai and µ si for the inhomogeneity and with µ a0 and µ s0 for the surroundings. Then the corrections of the weights of the detected photons are given by the following two scaling relationships, w a for absorption and w s for scattering, where l i and l 0 are the pathlength of the photons inside and outside of the inhomogeneity, respectively, and K i is the number of scattering events inside the inhomogeneity. The two scaling relationships were combined into one equation by Hayakawa et al. [108]. By this method the computation time is significantly reduced. Later, Sassaroli proposed a faster perturbation MC method by storing the seed values of random number series only for the detected photons [109]. This can speed up computation by 1000 times under some conditions. Standard MC methods are inappropriate for inverse processes due to long computation times, but the pMC method can be used in retrieving the optical properties from measured TR reflectance [110], in functional imaging in small animals using time-gated technique [111], and in computing the Jacobians for DOT [112]. However, still faster MC methods are desirable for TD-DOT, and an MC method using a graphics processing unit (GPU) and two parallel random number generators was developed to report 300 times faster calculations than using conventional CPUs [113]. GPU-based MC methods are developed to simulate light propagation in teeth where the optical properties vary in a wide range due to different tissues such as enamel and tubules in dentin [114].
The equivalence between MC simulations and solving the RTE is further studied by Voit et al. [115] to compare the results of MC simulations with those of the Maxwell equation in light propagation with polarization.

Hybrid RTE and DE Models
Hybrid RTE and DE models have been developed to overcome the limitation of the DE which is not applicable to early time regimes and regions close to the boundary, source, and with low-scattering, and to save the computation loads of calculating the RTE for the whole space and time. In the fields of DOT and fluorescence diffuse optical tomography (FT) for small animals and small organs like human fingers and teeth, the early-time regime plays an important role, and simple application of the DE may be inappropriate. Several hybrid models have been proposed for CW [116], FD [117,118], and TD [119][120][121]. In the FD version of the coupled RTE-DE model [117], the whole computation domain is divided into two subdomains of near-field regions using the RTE and far-field regions using the DE, and the continuity of the intensity and its derivatives at the interface between the two subdomains are given for connecting the solutions of the RTE and DE. In the FD version of the overlapped or buffered RTE-DE model [118], the subdomains overlap in an artificially defined buffer zone where smooth transition from the RTE to the DE is achieved. One example of the overlapped RTE-DE model for TD was proposed by Fujii et al. [118]. In their model, the RTE and DE regions are separated spatially and temporally by using a crossover length for spatial separation and a crossover time for temporal separation, succeeding in describing light propagation accurately and reducing computational load by a quarter when compared with full computation of the RTE. Further development of hybrid RTE-DE models for TD is expected in the future.

Anisotropic Light Propagation in TD
Anisotropic light propagation in biological tissues was experimentally observed such as in human skin by Nickell et al. [122]. Heino et al. [123] investigated anisotropic light propagation in the TD-RTE with the scattering phase function dependent on both the polar and azimuthal angles and derived the anisotropic TD-DE with the diffusion tensor instead of the diffusion coefficient. Kienle et al. [20] also studied the anisotropic light propagation caused by structural anisotropic media using MC simulations and the TD-DE with the diffusion tensor, and the analytical expressions of reflectance and transmission from a slab were provided.

Studies toward Clinical Applications of TD-NIRS
In this section, fundamental studies toward clinical applications of TD-NIRS including features of TD-light propagation, measurements of the optical properties, TD-DOT, and fluorescence-DOT are reviewed.

Features of TD-Light Propagation Including Penetration Depth, Optical Pathlength, etc.
Features of TD-light propagation in tissue-like media have extensively been studied, and some of the studies are reviewed below.

Light Propagation Based on the Microscopic Beer-Lambert Law
In 1991, Hasegawa et al. [124] numerically studied the characteristics of TR transmission through homogeneous slabs using MC simulations under the microscopic Beer-Lambert law including the time ranges where the diffusion approximation does not hold. Nomura et al. [125] reported that the microscopic Beer-Lambert law was valid not only for tissue-like phantoms but also for living tissues such as rat brain and thigh muscle using TD measurements. Based on the microscopic Beer-Lambert law, Tsuchiya [9] derived simple equations describing light propagation by use of the photon path distribution which expresses the distribution of the pathlengths inside all voxels in the whole volume.

Mean-TOF, Partial Pathlength, and Sensitivity for the Head Model
Okada et al. [126] investigated light propagation in models of the adult head theoretically using MC simulations as well as solving the DE and experimentally using measurements of TR reflectance, R(t), from phantoms. Four different models of adult brain were constructed to see the effects of the SD distance on the mean-TOF, <t>, and partial pathlength of the i-th layer, <l i >, and the sensitivity distribution for CW on the SD distance, and to see the effect of the CSF layer on <t> and <l i >. More realistic head models were used to extend the study [127].
Firbank et al. [128] theoretically studied light propagation in a realistic adult head model constructed from MRI data using a hybrid DE/radiosity technique and a MC simulation to obtain R(t), <t>, their TR-sensitivities, H R (t) and H <t> (t), and <l i (t)>. They found that H <t> (t) has larger values than H R (t) in deeper regions.
Steinbrink et al. [129] studied R(t) from a multi-layered model to estimate the changes in R(t) due to absorption changes in the layers from < l i (t) > calculated by MC simulations. Their results were verified by experiments using two-and three-layered phantoms and by in vivo experiments.

Use of Null SD Distance
Del Bianco et al. [130] studied <l i (t)> and penetration depth, <z(ρ, t)>, for semi-infinite two-layered media using the analytical solution of the TD-DE. Based on this study, Torricelli et al. [131] proposed the usage of null SD distance for improving contrast and resolution of diffuse optical imaging from TR reflectance measurements. They concluded that the null SD distance for TR reflectance measurement provides four advantages over the conventional SD distances of a few centimeters, i.e., (i) stronger optical signals, (ii) deeper penetration depths, (iii) larger contrast for absorbing inclusions, and (iv) higher localization of the inclusions, but also discussed a problem of too strong signals of early photons. They argued several ways for solving the problem: (i) gating the detector to measure photons arriving at 200 ps and later, (ii) using SPADs which are not damaged by the burst of initial photons, and (iii) the use of non-null but small SD distance, while keeping the advantages of a null SD distance. The method of the null SD distance for TR reflectance measurements was further studied for two-layered media by Spinnelli et al. [132] using MC simulations and the TD-DE. Later, proof-of principle tests of the null SD distance technique were reported using fast-gated SPAD as well as non-contact probes showing good agreements of the depth sensitivity and spatial resolution between the phantom experiments and MC simulations [133]. These results indicated the feasibility and potentiality of the null SD distance technique for applications to non-contact and high-density diffuse TR reflectance measurements.

Measurement of Mean Pathlength
The mean (CW) pathlengths, l, or the DPFs, can be experimentally obtained in vivo by use of TR reflectance measurements. Zhao et al. reported measured DPF maps of the forehead, somatosensory motor, and occipital regions of 11 adults using a TR system with an SD distance of 30 mm at three wavelengths around 800 nm [134]. The measured DPFs varied from about 6 to 9 depending on the regions and wavelength. These DPF data will be useful for quantitative monitoring of the hemodynamic changes occurring in adult heads. Bonnéry et al. studied the changes in the DPF of the foreheads of adult heads with aging and found that the upper layer including the scalp, skull, and CSF was thicker for older subjects than younger ones [135].

Measuring Optical Properties
The optical properties of tissue can be determined by various methods in CW, FD and TD, but TD data provide the richest information as a matter of course. Many papers have reported the optical properties of not only homogeneous but also layered tissues, determined by use of TD data. Once the optical properties of multi-layered tissue such as human heads are determined, it becomes possible to estimate the pathlengths and sensitivities inside the heads which are very important parameters for investigating brain activities using NIRS. In the following, TD determination of the optical properties of homogeneous semi-infinite media is described and reviewed first, then those of multi-layered media are reviewed.

Homogenous Semi-Infinite Media or Infinite Slab
Determination of µ a and µ s in homogeneous semi-infinite media by TD techniques is essentially based on fitting analytical or numerical solutions of the TD-DE to measured TR reflectances. The TR-reflectance, R(ρ, t), is expressed by Equation (15) for the TD-DE under the extrapolated boundary condition, and for simplicity the zero-boundary condition is sometimes employed as Equation (28). For the zero-boundary condition, the following two equations are derived [18], where t max is the time of maximum R(ρ, t). From these equations, it is possible to determine µ a and µ s of semi-infinite homogeneous media from the measured R(ρ, t). Especially, it should be noted that µ a is determined from the slope of lnR(ρ, t) at very late time. This feature of R(ρ, t) is also stated by Jacques [136]. But R(ρ, t) measured at very late times are usually associated with noises, and it is difficult to determine t max due to the IRF. Therefore, using Equation (43) is not appropriate for accurate determination of µ a and µ s , and fittings of the IRF convoluted analytical solutions of Equation (15) or Equation (28) to measured R(ρ, t) are often used to recover µ a and µ s . Note that deconvolution of measured R(ρ, t) by the IRF enhances measurement noises and instead the analytical solutions are convoluted by the IRF.
Determination of µ a and µ s in homogeneous infinite slabs by TR transmittance measurements is conducted using the analytical equation for TR transmittance, which is derived similarly to that for TR reflectance using the mirror image source method as Equation (15) [18,19].
Sassaroli et al. [137] made in vivo measurements of µ a and µ s of a piglet brain using a TD system at three wavelengths (759, 794, 824 nm) by the method stated above. To separate the contributions of different head layers, the measured R(ρ, t) were acquired at the surfaces of skin, skull, dura mater and brain, step by step. The values of ρ were chosen to assure the mean penetration depth within each layer. Measured µ a and µ s were compared with the other in vivo results reported in literatures, and the differences were discussed.
Cornelli et al. [138] estimated µ a and µ s of human foreheads in the wavelength range from 700 nm to 1000 nm. They found that the estimated µ a and µ s were close to those of the superficial (scalp and skull) layers by additional MC simulations for four-layered media simulating the structure of the human head.
Using the optical mammography system shown in Figure 11c, which employed seven wavelengths of 635 nm, 685 nm, 785 nm, 905 nm, 930 nm, 975 nm, and 1060 nm, Taroni et al. conducted a pilot study for optical estimates of tissue components to differentiate malignant from benign breasts from the data of patients with 45 malignant and 39 benign lesions [54]. They measured TR reflectances in vivo through compressed breasts and obtained images of the differences in µ a from the background at the seven wavelengths, ∆µ a (λ), using a time-gated perturbation analysis based on the microscopic Beer-Lambert law under the assumption that the compressed breasts were homogeneous slabs [54,139]. The ∆µ a (λ) images were converted to the changes in the contents of oxy-Hb, deoxy-Hb, water, lipid, and collagen using Equation (25), and from statistical analyses they concluded that the collagen content was the most important parameter for discriminating malignant and benign lesions. Ohmae et al. evaluated the performance of the six-wavelength TD-NIRS system using phantoms varying the contents of water, lipid, and an absorber [68]. The performance was confirmed with the measurements using a magnetic resonance imaging system.
Guggenheim et al. [140] recovered the spectra of µ a (λ) and µ s (λ) of irregularly-shaped homogeneous media from TD-NIRS measurements. The analytical solutions of the TD-DE are usually not applicable to irregularly shaped media, and the finite element method (FEM) was employed to solve the TD-DE. The FEM solutions for non-absorbing medium were multiplied by exp(-µ a ct) to incorporate the attenuation by absorption.

Multi-Layered Media
Kienle and Glanzman [141] used TR reflectance measurements to determine the optical properties of human forearms being assumed as a two-layered media consisting of skin-fat (top: subscript 1) and muscle (bottom: subscript 2) layers. The analytical solutions of the TD-DE for two-layered semi-infinite media were fitted to R(ρ, t) measured in vivo at two values of ρ with the wavelength of 830 nm to determine µ a1 , µ a2 , µ s1 , and µ s2 with known thicknesses of the top layer (denoted by s 1 ) [98]. They concluded that µ a2 can be determined accurately even if s 1 is known only approximately. Gagnon et al. used the same analytical solutions for two-layered media as Kienle and Glanzman to estimate the intra-and extra-cerebral hemoglobin concentrations [141,142]. The upper layer included the scalp, skull, and CSF, and the lower layer was the brain. They observed noticeable inter-subject variations in the hemoglobin concentrations and constant oxygen saturation of the cerebral hemoglobin.
Martelli et al. used analytical solutions for the TD-DE using the eigenfunction method for estimating the optical properties of two-layered media [102,143,144]. The analytical solutions were fitted to the TR reflectances provided by a MC simulation or phantom and in vivo experiments to estimate µ a1 , µ a2 , µ s1 , µ s2 , and s 1 accurately [143,144]. The errors of the estimated µ a1 , µ a2 , and µ s1 were small while those of µ s2 were large.
Sato et al. [145] proposed a method of determining µ a2 in two-layered semi-infinite media based on the microscopic Beer-Lambert law. Extending this method, a simple algorithm was developed to recover µ a1 and µ a2 in a two-layered medium from the TR reflectances measured at two values of ρ and was verified by numerical simulations and in vivo experiments using human foreheads [146,147].
Hoshi et al. [148] reported a method to determine the optical properties of a four-layered human head model using TR reflectance measurements and MC simulations. They used a TRS-10 (Hamamatsu Photonics) to measure the TR reflectances from various head portions of eleven healthy humans with a fixed ρ of 30 mm. For MC simulations, head models based on MRI scans were constructed as four-layered slabs consisting of the scalp, skull, cerebrospinal fluid (CSF), and brain. By a trial-and-error method, they changed the values of µ a and µ s of the four layers to fit the simulated TR reflectances to the measured ones and estimated the partial pathlengths and sensitivities of the four layers. As of the results, they found the followings: (i) the total pathlength ranging from 100 mm to 250 mm was closely related to the thickness of the scalp ranging from 2 mm to 10 mm; (ii) the partial pathlength of the brain ranged from 3% to 13% of the total pathlength; (iii) the brain tissue deeper than 25 mm from the head surface were hardly detected by the near-infrared light; (iv) the scalp tissue at the depth of 4 mm had the highest sensitivity inside the heads; and (v) most of the signals attributed to the brain came from the superficial layer with the thickness of 1 to 2 mm from the brain surface. These findings are important for functional NIRS and could not have been obtained without TD-measurements.
Jäger and Kienle [149] employed a neural network (NN) to estimate µ a of the brain (µ a,brain ) from TR reflectances in the case of five-layered media. Training data of the TR reflectances with a single value of ρ were generated by the analytical solutions of the TD-DE for five-layered media as well as MC simulations which can model the CSF layer more accurately than the TD-DE [101]. One-hundred to 500 training data with different noises were input into the NN, and the NN estimated µ a,brain under the condition that the optical properties of the five layers except µ a,brain and the thicknesses of the upper four layers are known with some uncertainties. Resultantly, µ a,brain was estimated with errors less than 5% even if there are uncertainties of 20% in the other optical properties and the thicknesses.
Using the eigen-function analytical solutions of the TD-DE [97], Martelli et al. employed optimal estimation method of a Bayesian approach to incorporate prior information in the process of recovering the optical properties of the two-layered media [150,151]. The method was verified by recovering of µ a1 , µ a2 , µ s1 , and µ s2 from the TR reflectances provided by MC simulations and phantom experiments with better performances than the standard non-linear least-squares methods. The thickness of the top layer, s 1 , and the time origin of the TR reflectance, t 0 , were also included in the unknowns. The values of µ a2 were very accurately estimated because the TR partial pathlength of the bottom layer, l 2 (t), becomes much longer than that of the top layer, l 1 (t), after 2 ns in the decaying period. Even if the thickness of the top layer, s 1 , is unknown the values of µ a2 are obtained with errors less than 10%, and this method will be well suited for applications to human heads where the effect of the CSF is mainly to decrease µ s2 .
Zucchelli et al. and Re et al. utilized the analytical formulations of the partial pathlength of Equation (24) with the analytical solutions of the TD-DE for two-layered media to estimate perturbations of µ a2 [32] (pp. [109][110][111][112][113][114][115][116][117][118][119][120][121][122][123][124][125] [152,153]. When the spectroscopic TR reflectance, R(t, λ), is divided into many time gates, the ratio of the perturbed and unperturbed R(t, λ) averaged over the g-th time-gate is derived from Equation (24) as, where R 0g (λ) and R g (λ) are the unperturbed and perturbed R(t, λ) averaged over the g-th time-gate, respectively, ∆µ a,i (λ) is the perturbation of µ a,i in the i-th layer, N is the number of layers, and l g,i (λ) is the partial pathlength of the i-th layer averaged over the g-th time-gate, which is given by the analytical solution of the TD-DE in case of two-layered media (N = 2). Then a system of linear equations of Equation (44) with the number of the time-gates is constructed and ∆µ a,i (λ) is obtained by inverting the system of the linear equations. The optical properties of the human heads were estimated by in vivo measurements using multi-wavelength TD spectroscopy system at a group of laboratories [154]. The analytical solution of the TD-DE for two-layered media was fitted to the measured R(t, λ) [103], and five parameters, µ a1 , µ a2 , µ s1 , µ s2 , and s 1 were estimated. Although the inter-laboratory differences are rather large, a definite trend was found after comparisons with the results using R(t, λ) generated by MC simulations.
Liebert et al. estimated the changes in µ a of the brain and superficial layers after the intravenous administration of indocyanine-green from multi-distance TR reflectance measurements and MC simulations for a two-layered model [155]. In the estimation process, the information of the partial pathlengths of the two layers was important for calculating the sensitivities to absorption changes in the layers.

Time-Domain Diffuse Optical Tomography (TD-DOT)
Diffuse optical tomography (DOT) has been investigated since the early 1990s in CW, FD, and TD. Many investigations of DOT have been conducted in CW and FD, but the advantages of TD over CW and FD attracted researchers. First TD-DOT was reported by Benaron and Stevenson [156] in 1993. They demonstrated the advantage of TD-DOT over CW-DOT by showing a TD-DOT image of a mouse. The TD-DOT image using the early photons revealed the internal organs of a mouse while the CW-DOT image was unable to distinguish the internal organs at all. In the late 1990s, several groups developed more sophisticated multi-channel TD-DOT systems for studies of brain, breast, and muscle imaging [45,49,157].
In the following, various methods for TD-DOT are reviewed first, and the applications of TD-DOT for in vivo studies of brain, breast, and muscle imaging are reviewed.

General Concept of TD-DOT
Some studies employed the reconstruction algorithm for X-ray CT to reconstruct the absorption images using the temporally extrapolated absorbance method [158,159] or using the time-gated measurements at early times [160][161][162][163]. These methods aimed to detect the ballistic-like components of transmitted ultra-short pulse light, but their applications were limited to small-sized objects because transmitted light was unmeasurable with high signal-to-noise ratios for objects thicker than a few centimeters.
Standard algorithms of TD-DOT for thick or large objects are categorized as a model-based iterative image reconstruction (MOBIIR) which employs a forward model of light propagation and a non-linear iterative inversion process. Arridge et al. proposed an algorithm of MOBIIR based on the DE [164]. They derived the sensitivities (Jacobians) for the mean time-of-flight, <t>, as the data type, from the perturbation theory, and formulated the inverse process to reconstruct µ a and µ s images. The software developed in this framework was named as "TOAST" and its revised version "TOAST++" was presented as an open-source software [165]. A general algorithm of MOBIIR is summarized below. Figure 15 shows a standard process of MOBIIR mainly consisting of measurements and an inverse process. The measured TD data (TOF distributions), Γ(t), can be used fully or converted to featured TD data. Several types of featured TD data characterizing the TOF distributions are often used as the following [4], Mellin-Laplace transform : Amplitude and phase of frequency-domain data: The time-gated intensity with T a = 0 and T b = ∞ corresponds to the CW intensity, and multiple time-gated intensities with a small time-step of T b − T a = ∆T dividing the whole TOF distributions are equivalent to the use of full TOF distributions. The n-th temporal moment with n = 0 is also identical to the CW intensity, and those with n = 1, 2, 3 are the mean, M 1 = <t>, variance, M 2 , and skew, M 3 , of the TOF distribution, respectively. The Mellin-Laplace transform with n = 0 or p = 0 corresponds to the simple Laplace or Mellin transform, respectively. Equation (48) describes the transformation of the TD data to the FD data at a frequency of ω. Several updating methods are available, and Newton-type non-linear reconstruction is a standard method. The objective function, Ψ, defined below is minimized, where Γcal and Γmes are the vectors of the measured and calculated (featured) TD data with the components of Γcal(rsm, rdm:μa(rn), D(rn)), and Γmes(rsm, rdm), respectively, the index m (=1, …, M) specifies the TD data number with M being the number of the TD data, the index n (=1, …, N) specifies the voxel number with N being the number of voxels in the medium, respectively, x = [μa(rn), D(rn)] T is a vector consisting of the optical properties at the n-th voxel, and λ is the regularization parameter.
Here, the measurement errors are neglected for simplicity. The Levenberg-Marquardt algorithm, which is a standard method in the Newton-type minimization process, leads to the following inversion matrix equation for the updates at the k-th iteration, δx (k) [69,167], where W is the sensitivity (weight or Jacobian) matrix of the featured TD data to the changes in the optical properties, δx. If the time-gated intensity is chosen for the featured TD data, W will be given by equations similar to Equations (17) and (18). Then the optical properties are updated by use of the Tikhonv regularization as follows, Some other various schemes have been employed for image reconstruction, and please refer to References [4,168,169]  Gao et al. proved that using full TOF distributions provides better images over the featured TD data, although the computation cost was as high as two orders of magnitude of that using featured TD data such as E (CW), <t>, <t> + c2, and <t> + c2 + c3 even for 2D reconstruction [171]. Figure 16 illustrates that the μa image reconstructed using the full TOF distributions shows qualities better than Several updating methods are available, and Newton-type non-linear reconstruction is a standard method. The objective function, Ψ, defined below is minimized, where Γ cal and Γ mes are the vectors of the measured and calculated (featured) TD data with the components of Γ cal (r sm , r dm :µ a (r n ), D(r n )), and Γ mes (r sm , r dm ), respectively, the index m (=1, . . . , M) specifies the TD data number with M being the number of the TD data, the index n (=1, . . . , N) specifies the voxel number with N being the number of voxels in the medium, respectively, x = [µ a (r n ), D(r n )] T is a vector consisting of the optical properties at the n-th voxel, and λ is the regularization parameter. Here, the measurement errors are neglected for simplicity. The Levenberg-Marquardt algorithm, which is a standard method in the Newton-type minimization process, leads to the following inversion matrix equation for the updates at the k-th iteration, δx (k) [69,166], where W is the sensitivity (weight or Jacobian) matrix of the featured TD data to the changes in the optical properties, δx. If the time-gated intensity is chosen for the featured TD data, W will be given by equations similar to Equations (17) and (18). Then the optical properties are updated by use of the Tikhonv regularization as follows, Some other various schemes have been employed for image reconstruction, and please refer to References [4,167,168] [169]. Here, c n = E −1 ∞ 0 (t− < t >) n Γ(t)dt. Among these six featured TD data, the combination of c 3 + M ML (0,0.001)/E provided the best reconstruction results.
Gao et al. proved that using full TOF distributions provides better images over the featured TD data, although the computation cost was as high as two orders of magnitude of that using featured TD data such as E (CW), <t>, <t> + c 2 , and <t> + c 2 + c 3 even for 2D reconstruction [170]. Figure 16 illustrates that the µ a image reconstructed using the full TOF distributions shows qualities better than those using the featured TD data of E and <t>.

Modified Generalized Pulse Spectrum Technique for TD-DOT
Gao et al. [172] proposed a modified generalized pulse spectrum technique (GPST) for image reconstruction of TD-DOT. The modified GPST is based on the Laplace transformed RTE and employs a ratio of the Laplace transformed signals at two Laplace parameters (denoted p1 and p2), R = Γ(p2)/Γ(p1), as the featured TD data. Then, with some assumptions, the sensitivities of R to the changes both in μa and in μs′ are expressed using the product of the Green's functions of the Laplace transformed fluence rate and flux as follows, Note that the sensitivity to change in μs′, Ws (R) (rd, rs, t), is easily calculated while the sensitivities of other featured TD data to the change in μs′ are usually expressed by the spatial gradients of both the Green's functions of the fluence rate and flux as Equation (18), which makes computation ineffective and leads to crosstalk between μa and μs′ images. This difference in calculating the sensitivity to the change in μs′ is the biggest advantage of the modified GPST over the other featured TD data approaches. In addition, taking positive and negative Laplace parameters for p1 and p2 is interpreted as weighting the early and late times in the TOF distributions, thus covering the key features of the TOF distributions. With these advantages, the modified GPST makes the simultaneous reconstruction of μa and μs′ images possible with less crosstalk between them. μs′ images provide anatomical information while μa images provide physiological information. The modified GPST was successfully extended from a 2D case to a semi-3D case with numerical simulations and phantom experiments [172][173][174].
A numerical simulation and phantom experimental studies using the modified GPST were performed for imaging brain activation from TR reflectance measurements in a two-layered model [175]. Comparisons of the reconstructed images among various featured data types such as E, <t>, <t> + c2, E + <t>, modified GPST and full TOF distributions were made, and it was found that the full TOF distributions provided the best images at the cost of long computation times [176].

Other Techniques for TD-DOT
Lyubimov et al. [177] applied the concept of the photon average trajectory to image

Modified Generalized Pulse Spectrum Technique for TD-DOT
Gao et al. [171] proposed a modified generalized pulse spectrum technique (GPST) for image reconstruction of TD-DOT. The modified GPST is based on the Laplace transformed RTE and employs a ratio of the Laplace transformed signals at two Laplace parameters (denoted p 1 and p 2 ), R = Γ(p 2 )/Γ(p 1 ), as the featured TD data. Then, with some assumptions, the sensitivities of R to the changes both in µ a and in µ s are expressed using the product of the Green's functions of the Laplace transformed fluence rate and flux as follows, Note that the sensitivity to change in µ s , W s (R) (r d , r s , t), is easily calculated while the sensitivities of other featured TD data to the change in µ s are usually expressed by the spatial gradients of both the Green's functions of the fluence rate and flux as Equation (18), which makes computation ineffective and leads to crosstalk between µ a and µ s images. This difference in calculating the sensitivity to the change in µ s is the biggest advantage of the modified GPST over the other featured TD data approaches. In addition, taking positive and negative Laplace parameters for p 1 and p 2 is interpreted as weighting the early and late times in the TOF distributions, thus covering the key features of the TOF distributions. With these advantages, the modified GPST makes the simultaneous reconstruction of µ a and µ s images possible with less crosstalk between them. µ s images provide anatomical information while µ a images provide physiological information. The modified GPST was successfully extended from a 2D case to a semi-3D case with numerical simulations and phantom experiments [171][172][173].
A numerical simulation and phantom experimental studies using the modified GPST were performed for imaging brain activation from TR reflectance measurements in a two-layered model [174]. Comparisons of the reconstructed images among various featured data types such as E, <t>, <t> + c2, E + <t>, modified GPST and full TOF distributions were made, and it was found that the full TOF distributions provided the best images at the cost of long computation times [175].

Other Techniques for TD-DOT
Lyubimov et al. [176] applied the concept of the photon average trajectory to image reconstruction algorithm of TD-DOT to speed up computation.
Hervé et al., Puszka et al., and Puszka et al. studied the performances of TD-DOT using the Mellin-Laplace transform with parameters of n and p in Equation (47) by numerical simulations and phantom experiments [177][178][179]. TD-DOT images reconstructed from the featured TD data using the Mellin-Laplace transform were found to be more robust to measurement noises than those using <t>, and the larger the value of n, the deeper the sensitivities extended. Also, the feasibility of short SD distances of 10 mm and 15 mm was studied for application of the null SD distance technique using an experimental setup employing a fast-gated SPAD [133,179].
For saving computation time and memory, Naser and Deen [180] developed a recursive equation to calculate the sensitivity (Jacobian) at a specific time step from the calculated fluence rates at all the previous time steps.

Brain Imaging
A TD-DOT system developed by Benaron's group was applied to neonatal heads to reconstruct 2D-DOT images of hemoglobin saturation to reveal the existence of hemorrhages in neonatal brains [181]. This system was also applied to adult heads to reveal the focal areas of neuronal activity in the brains [182]. The DOT images were reconstructed using a unique and heuristic curvilinear back-projection algorithm which was very fast in computation at the cost of a low accuracy. Their TD-DOT system, built in the early 1990s, required about 6 h to acquire one dataset for one image.
Hebden et al. reported 3D-DOT images of hemodynamics in premature infant brains with hemorrhage using MONSTIR and custom-made helmets with optodes [49,183]. They employed a non-linear inversion algorithm for image reconstruction (TOAST) [164]. The reconstructed images of C total-HB and S O2 indicated the existence of hemorrhage with physiologically reasonable values. Hebden et al. also obtained DOT images of ventilated infant brains to show the hemodynamic responses to changes in the oxygen and carbon dioxide partial pressures in the ventilating air [184]. The changes in C total-HB and S O2 revealed by the DOT images were in qualitative agreement with physiologically expected changes.
Ueda et al. used a 16-channel multi-wavelength reflectance TD-DOT system to obtain DOT images of human brain activity using a reconstruction algorithm based on the microscopic Beer-Lambert law [9,50]. The algorithm requires calculation of the photon path distribution only for a non-absorbing medium, and the changes in µ a are easily reconstructed. Two sets of optodes were attached onto the right and left of the forehead of a subject performing a video game task. The DOT images of ∆C oxy-HB , ∆C deoxy-HB , and ∆C total-HB overlaid on the MR images of his brain clearly showed the localized activity of the prefrontal cortex as shown in Figure 17, although the hemodynamic changes appeared in the dura mater due to localization errors.
Appl. Sci. 2019, 9, x FOR PEER REVIEW 32 of 53 and the larger the value of n, the deeper the sensitivities extended. Also, the feasibility of short SD distances of 10 mm and 15 mm was studied for application of the null SD distance technique using an experimental setup employing a fast-gated SPAD [134,180]. For saving computation time and memory, Naser and Deen [181] developed a recursive equation to calculate the sensitivity (Jacobian) at a specific time step from the calculated fluence rates at all the previous time steps.

Brain Imaging
A TD-DOT system developed by Benaron's group was applied to neonatal heads to reconstruct 2D-DOT images of hemoglobin saturation to reveal the existence of hemorrhages in neonatal brains [182]. This system was also applied to adult heads to reveal the focal areas of neuronal activity in the brains [183]. The DOT images were reconstructed using a unique and heuristic curvilinear backprojection algorithm which was very fast in computation at the cost of a low accuracy. Their TD-DOT system, built in the early 1990s, required about 6 h to acquire one dataset for one image.
Hebden et al. reported 3D-DOT images of hemodynamics in premature infant brains with hemorrhage using MONSTIR and custom-made helmets with optodes [49,184]. They employed a non-linear inversion algorithm for image reconstruction (TOAST) [165]. The reconstructed images of Ctotal-HB and SO2 indicated the existence of hemorrhage with physiologically reasonable values. Hebden et al. also obtained DOT images of ventilated infant brains to show the hemodynamic responses to changes in the oxygen and carbon dioxide partial pressures in the ventilating air [185]. The changes in Ctotal-HB and SO2 revealed by the DOT images were in qualitative agreement with physiologically expected changes.
Ueda et al. used a 16-channel multi-wavelength reflectance TD-DOT system to obtain DOT images of human brain activity using a reconstruction algorithm based on the microscopic Beer-Lambert law [9,50]. The algorithm requires calculation of the photon path distribution only for a nonabsorbing medium, and the changes in μa are easily reconstructed. Two sets of optodes were attached onto the right and left of the forehead of a subject performing a video game task. The DOT images of ΔCoxy-HB, ΔCdeoxy-HB, and ΔCtotal-HB overlaid on the MR images of his brain clearly showed the localized activity of the prefrontal cortex as shown in Figure 17, although the hemodynamic changes appeared in the dura mater due to localization errors. Using MONSTIR [49], Gibson et al. obtained 3D-DOT images showing the response of a neonate motor cortex to the task of arm movement using a linear algorithm for image reconstruction [185]. It took about 2 h to acquire one dataset for one image, which is too long to obtain images of fast neuronal responses.
Ueno et al. obtained DOT images of ventilated premature infant brains with the help of reducing the artifacts induced by the relative movements of the optodes against the scalp surface [46,48]. Fukuzawa et al. improved the image quality by introducing the coupling coefficients of the optodes as the unknowns in the reconstruction algorithm [46,47,186]. Although the quality of these DOT images for infant brains still needs to be improved for clinical use, these results demonstrate the feasibility of the whole-head imaging of neonates by taking advantage of their small head size and skull thickness.
Imaging of neuronal activities by NIRS technology have recently been greatly advanced using the high-density reflectance CW-domain DOT having a data acquisition speed of 12 frames/s which is fast enough to visualize the hemodynamic response of neuronal activities [187]. Current TD-DOT systems are too slow for that purpose, and advancements in hardware for TD-DOT are highly desired for further applications in neuronal imaging.

Breast Imaging
Breasts are recognized to have smaller µ a and µ s than other tissues so that transmittance measurements are available. Ntziachristos et al. [188] developed a transmittance TD-DOT system to image the optical properties of breast phantoms and reported the potentiality of TD-DOT for breast tumor diagnosis. The same group used the TD system with an exogenous agent, indocyanine green (ICG), for image enhancement in in vivo measurements on patients [189]. They obtained better localization of target tumors, observed the differences in the ICG distributions among malignant tumors, benign tumors, and healthy tissues, and compared the DOT images with the concurrently acquired gadolinium-enhanced MRI images. However, the use of exogenous agents for frequent measurements may cause complications for patients and may thus be avoided.
Intes et al. [190] reported a four-wavelength (760, 780, 830, and 850 nm) TD-DOT system for breast tumor measurement using a commercially available instrument. Tomographic images reconstructed by a non-linear inversion algorithm showed the distributions of C oxy-HB , C deoxy-HB , water content, and lipid content in compressed breasts from 49 patients, and differences in the blood content and water fraction between malignant and benign tumors were found.
For the purposes of screening of breast cancers and of evaluating the effectiveness of chemotherapy for breast cancers, Ueda et al. [52] used a 48-channel TD-DOT system. In the DOT images of the breasts, cancer tissues were imaged with higher absorption than healthy surrounding tissues. In the case of breast cancers which had positively responded to chemotherapy, µ a of the tumor after chemotherapy drastically decreased by 34% from 0.0076 mm −1 before chemotherapy, indicating the effectiveness of quantitative evaluation of chemotherapy. Yoshimoto et al. [53] used a 12-channel TD-NIRS system with a hand-held probe for use at bed side and indicated the usefulness of the TD-DOT system for evaluation of the effectiveness of chemotherapy to breast cancers.
Enfield et al. [191] investigated the response to hormone therapy of breast cancers using a 3D TD-DOT system. They found that the images of ∆C total-HB strongly correlated with the clinical assessment of response to hormone treatment of breast cancers.
Zhang et al. [192] demonstrated the feasibility of improving the breast tumor diagnosis using combined TD-DOT and TD-FT. They used two wavelengths of 780 nm and 830 nm for the source and measured the TOF-distributions at 32 sites around cylindrical phantoms. The modified GPST was employed to reconstruct the images of (µ a , µ s ) and (y f , τ f ) where y f and τ f are the fluorescence yield and lifetime, respectively. The images of (µ a , µ s ) and (y f , τ f ) were reconstructed independently with a reasonable quantitativeness, but when the images of (y f , τ f ) were used to provide the locations of targets as prior information for reconstructing (µ a , µ s ) images (fluorescence guided DOT), the quantitativeness of the (µ a , µ s ) images was highly improved; for example, the ratios of the reconstructed to correct values of (µ a , µ s ) were (23.5%, 31.0%) for separate reconstruction and (76.6%, 86.2%) for fluorescence guided reconstruction.

Muscle Imaging
For muscle imaging, Hillman et al. reported DOT images of µ a and µ s inside a human adult forearm during hand grip exercises with transmittance measurements using MONSTIR [49,193].
The DOT images at wavelengths of 790 and 820 nm showed the responses of µ a and µ s to the exercises.
Using another multi-channel TD-DOT system [45], Zhao et al. reconstructed DOT images of µ a and µ s in human lower legs and forearms [47,194]. For the case of the forearms, DOT images of µ a , µ s , ∆C oxy-HB , ∆C deoxy-HB , and ∆C total-HB in human forearms during excises were obtained as shown in Figure 18. μs′, ΔCoxy-HB, ΔCdeoxy-HB, and ΔCtotal-HB in human forearms during excises were obtained as shown in Figure 18.

Fundamental Equations for TD-FS and TD-FT
Fluorescence diffuse optical spectroscopy (FS) and tomography (FT) have been investigated as one of the modalities for molecular imaging [196,197], and TD-FT is an extension of TD-DOT to the coupled fluorescence excitation and emission phenomena of light propagation. Fundamental equations of light propagation for TD-FS and TD-FT are usually described by the coupled TD-DEs for excitation and emission light, r r r r , where subscripts x and m refer to excitation and emission light, respectively, N(r), γ(r), and τ(r) are the spatial distributions of the fluorescence properties, i.e., concentration, quantum efficiency, and lifetime of the fluorophore, respectively, and ε is the extinction coefficient of the fluorophore at the excitation wavelength. The goal of FT is to reconstruct the distributions of fluorescence properties, and N(r) and τ(r) are the main targets for molecular imaging. In general, the image reconstruction algorithm for TD-DOT is extended for TD-FT, which uses the emitted fluorescence light intensities measured at the object surface as the input data in addition to the measured reemitted excitation light intensities. The optical and fluorescence properties are assumed first, and Equations (55) and (56) are solved as the forward problem to obtain the calculated excitation and fluorescence light intensities, which are compared with the measured ones. If they do not agree, the optical and fluorescence properties are upgraded by an optimization procedure introduced in Section 5.3.1, and Equations (55) and (56)

Fundamental Equations for TD-FS and TD-FT
Fluorescence diffuse optical spectroscopy (FS) and tomography (FT) have been investigated as one of the modalities for molecular imaging [195,196], and TD-FT is an extension of TD-DOT to the coupled fluorescence excitation and emission phenomena of light propagation. Fundamental equations of light propagation for TD-FS and TD-FT are usually described by the coupled TD-DEs for excitation and emission light, where subscripts x and m refer to excitation and emission light, respectively, N(r), γ(r), and τ(r) are the spatial distributions of the fluorescence properties, i.e., concentration, quantum efficiency, and lifetime of the fluorophore, respectively, and ε is the extinction coefficient of the fluorophore at the excitation wavelength. The goal of FT is to reconstruct the distributions of fluorescence properties, and N(r) and τ(r) are the main targets for molecular imaging. In general, the image reconstruction algorithm for TD-DOT is extended for TD-FT, which uses the emitted fluorescence light intensities measured at the object surface as the input data in addition to the measured reemitted excitation light intensities. The optical and fluorescence properties are assumed first, and Equations (55) and (56) are solved as the forward problem to obtain the calculated excitation and fluorescence light intensities, which are compared with the measured ones. If they do not agree, the optical and fluorescence properties are upgraded by an optimization procedure introduced in Section 5.3.1, and Equations (55) and (56) are solved again. This process is repeated until convergence is reached. Many studies have been conducted on TD-FS and TD-FT, and some studies are reviewed in the following.

Analytical Solutions of the Equations for TD-FS
Patterson and Pogue derived analytical formulations of fluorescence emission light intensities measured at surfaces of homogeneously fluorescing semi-infinite media for TD-FS and FD-FS [197]. The emission light intensities are analytically given using the product of the fluence rate in the medium and the escape function, which is equivalent to the sensitivity of reflectance to absorption changes expressed by Equation (17). Sadoqi et al. provided analytical solutions of Equations (55) and (56) for cuboidal, spherical, and cylindrical media using eigen functions, and TOF distributions of the fluorescence emission were compared with those obtained by phantom experiments [198].

Clinical Applications of TD-FS
Many studies have been done toward clinical applications of TD-FS, but only a few studies are reviewed in this article. For evaluating excised tumor specimens, Butte et al. induced tissue autofluorescence with a pulsed (1.2 ns) nitrogen laser at 337 nm and measured its decay profiles in a wavelength range from 370 nm to 500 nm using a fast oscilloscope and a photomultiplier [199]. Obtaining data at two times in the decay profiles (times at the intensity of 1/e and 10% of the maximum intensity) at six wavelengths, a linear discrimination analysis enabled discrimination of meningioma tissue from normal tissue with a sensitivity larger than 89% and specificity of 100%. The feasibility of using TD-FS for evaluating tumor tissue was discussed.
Milej et al. tried a method to measure the inflow and washout of an optical contrast agent (ICG) for the purpose of providing information on the blood supply to the brain by TR measurements of fluorescence at the head surface [200]. In order to provide information for interpretation of the measured fluorescence signals, they performed MC simulations and in vivo TR measurements to find the effects of the SD distances, positions of the SDs, and doses of injected ICG. Furthermore, Milej et al. made MC simulations of light propagation of the excitation and emission light in a two-layered medium mimicking intra-and extracerebral tissue compartments after injecting ICG [201]. They found that the knowledge of the absorption properties of the medium is essential for interpretation of the TR fluorescence signals measured at the head surface.

TD-FT Using Full TOF-Distributions and Effects of Featured Data Types
For TD-FT, Gao et al. reported FT images of the fluorescence yield (γεN) and lifetime (τ) using full TOF distributions in numerical simulations [202]. Reconstructed FT images using featured TD data can be evaluated by comparing with those using the full TOF distributions. To improve the quality of the images reconstructed from full TOF distributions, Li et al. proposed an overlap time-gating approach which simultaneously achieves high time-resolution and high signal-to-noise ratio for input data extracted from the measured TOF distributions [203]. Riley et al. discussed the effect of featured data types on reconstructions of the depth and lifetime of the fluorophore, and concluded that local data types such as the peak values, peak times, FWHMs (full widths at half maximum), and slopes of decaying period of measured TOF distributions have advantages over global data types such as the CW intensities, mean TOFs, variances, and skews [204].

TD-FT Using Early Arriving Photons
Many studies of TD-FT have been done using early arriving photons. Some of them to name a few are reviewed below.
Wu et al. used a streak camera for TR measurements of early photons, TOF distributions of which were Laplace transformed with an optimum parameter, and the fluorophore targets were accurately localized under the presence of the noise of background fluorescence [205].
Niedre et al. reconstructed TD-FT images of lung carcinoma in mice in vivo from early photons detected by a high-speed gated ICCD with a gate time of 100 ps to obtain 360 • projections [206]. Light propagation was described by a cumulant solution of the TD-RTE to generate three-point Jacobians, and a standard inversion process was employed to reconstruct 3D images of the fluorescence targets.
Patwardhan and Culver developed a TD-DOT system to quantitatively reconstruct µ a and µ s images inside small animals such as mice for the purpose of improvement of TD-FT images [207]. They used high-speed time-gated ICCD with a time gate less than 300 ps to obtain TR transmission data which were converted to FD data for image reconstruction of µ a and µ s using a non-linear iterative inversion algorithm.
Leblond et al. studied early-photon TD-FT to explain improved spatial resolutions of the images reconstructed from time-gated measurements with 200~400 ps time-gates by mathematical formulations and in silico phantoms [208]. They used the TD-DE for the forward problem and singular-value analysis for the inverse problem. Zhu et al. reported that using not only early photons but also late photons improve both the spatial resolution and image contrast in 3D TD-FT on a heterogeneous mouse model [209]. They employed the normalized Born-ratios as the featured data type, derived the Jacobians of the featured data to the changes in the absorption coefficient of fluorophores, and solved the non-linear iterative inversion problem to reconstruct the TD-FT images.
Cheng et al. developed a reconstruction algorithm of TD-FT for small animals using the first order time-derivative (or slope) of the rising portion of the measured TOF distributions in the early time regime [210]. Using the first derivative provided robustness to measurement noises and highspatial resolutions in the reconstructed TD-FT images. As the forward model, they employed the telegraph equation (TE) of Equation (5) having the term, ∂ 2 φ(r, t)/∂t 2 , because the TD-DE was derived by assuming ∂ 2 φ(r, t)/∂t 2 << ∂φ(r, t)/∂t which may breakdown for early photons.

TD-FT Using the GPST Algorithm
Gao et al. reconstructed TD-FT images using the GPST algorithm which was developed in TD-DOT [171,211]. From phantom experiments using a four-channel TCSPC system and GPST algorithm, combined DOT and FT provided images of µ a , µ s , N, and τ, and hemoglobin images were obtained from µ a images at two wavelengths of 780 nm and 830 nm [192].

Total Light Approach in TD-FS and TD-FT
Marjono et al. proposed a concept of the total light to calculate light propagation of fluorescence emission [212]. The total light is defined as φ t = φ x + φ m * /γ, where φ m * is the fluence rate of the emission light when the fluorescence life time is zero, i.e., φ m (r, t) = φ m * (r, t)⊗(1/τ)exp(−t/τ) with ⊗ denoting the convolution operator. With additional assumption of µ ax = µ am = µ a and D x = D m = D, the DE for φ t is given as, which is almost the same form as Equation (55) except the absorption by the fluorophore. So, computation is simplified. The total light approach was applied to TD-FT for reconstructing µ a and εN(r) in a 2D circular medium with a single and double fluorophore targets using <t> as a featured TD-data [213]. Nishimura et al. applied the total light approach to estimate the life-time function, (1/τ)exp(−t/τ), in heterogeneous scattering media with a more general scheme using the TD-RTE [214].

Transformation of TD-FT to FD-FT
Transformation of TD-FT to FD-FT have often been employed because integration of the right-hand side of Equation (56) can be avoided in FD [215][216][217]. For cases of a single fluorophore target, fluorescence signals in FD can be simply written as the product of the fluence rate, the Green's function of the excitation light and life-time function.

Application of MC Method for TD-FT
Monte Carlo simulations were also employed for forward calculations of TD-FT [218,219]. The perturbation MC (pMC) and adjoint MC were found to be appropriate for generating time-gated TR-Jacobians, and the positions and lifetimes of fluorophore targets were reconstructed using data of multiple time gates.

Clinical Applications of Commercially Available TD-NIRS Systems by Japanese Researchers
In this section, results of clinical applications of commercially available TD-NIRS systems by some Japanese clinician groups are briefly reviewed. All of them used TRS-10 or TRS-20 provided by Hamamatsu Photonics as described in Section 3.2.2 under the assumption of homogeneous semi-infinite medium. The applications of these instruments to multi-layered tissues may lead to some unreasonable results with overestimation or underestimation of physiological changes but may provide new findings and insights in the fields of hemodynamics, physiology, and neuroscience. These studies are still at the initial stage of clinical applications for TD-NIRS and may lead to a large leap in the near future.

Group from Kagawa Medical University
After some preliminary studies by Kusaka et al. and Ijichi et al. using hypoxia models of piglets, they measured µ s , DPF, cerebral blood volume (CBV), etc., in 22 neonate heads, and found a significant correlation between the postconceptional age, µ s , and CBV [220][221][222]. Ogawa et al. measured the changes in C oxy-Hb , C deoxy-Hb , C total-Hb , and So 2 of breast tissue during breastfeeding and found that all four parameters decreased during breastfeeding [223]. Koyano et al. studied the effect of blood transfusion on cerebral hemodynamics in preterm infants and found that cerebral So 2 decreased after transfusion while CBV increased [224]. Nakamura et al. tried to use the TRS-10 to evaluate the seriousness of asphyxiated neonates and found CBV and cerebral So 2 were significantly higher for neonates with adverse outcomes of hypothermic therapy [225]. Kusaka et al. reviewed the usefulness of TD-NIRS for cerebral hemodynamic treatments in neonates [226].

Group from Kagoshima University Hospital
A group from the Kagoshima University Hospital collaborating with Hamamatsu Photonics used TD-NIRS (TRS-10) to monitor cerebral hemodynamics (C oxy-Hb , C deoxy-Hb , C total-Hb and S O2 ) of patients during cardiopulmonary bypass surgery [227]. They found that C total-Hb correlated well with hematocrit measured by a blood gas analyzer. Then, Kakihana et al. evaluated the occurrence of postoperative cognitive dysfunction (POCD) after cardiopulmonary bypass surgery with hypothermic treatment by S O2 measured by TRS-10 and internal jugular vein oxygen saturation (S jvO2 ) measured by a conventional method [228]. In the patients with POCD, S jvO2 were found to be significantly larger than S O2 . The same group also tried to monitor hepatic oxygenation by TRS-10 and found that abdominal C total-Hb were significantly higher in the liver area than other area in healthy people and that hepatic oxygenation measured by TRS-10 may work for early detection of intestinal ischemia [229]. Recently, monitoring of post-resuscitation encephalopathy by TRS-10 was experimentally tried for pigs with cardiac arrest induced by electrical stimuli for preliminary study toward clinical applications [230].

Group from Nihon University School of Medicine
Sakatani et al. collaborated with Hamamatsu Photonics in using the 16-channel TCSPC system (exceptionally in this section) to measure the changes in the mean optical pathlengths and cerebral blood oxygenation at the adult prefrontal cortex during verbal fluency and driving simulation tasks [50,231]. During the verbal fluency task, C oxy-Hb and C total-Hb increased while C deoxy-Hb decreased. On the other hand, C oxy-Hb and C total-Hb decreased while C deoxy-Hb increased during the driving simulation task. The mean optical pathlengths did not change by the tasks. Yokose et al used the TRS-10 to detect vasospasm of the patients with subarachnoid hemorrhage by measuring the cortical oxygen saturation [232]. Tanida et al. used the TRS-10 to conclude that the change in C oxy-Hb at the lateral prefrontal cortex during a working memory task correlated with working memory performance [233]. Sakatani et al. examined the effect of the extract of Ginko biloba leaves on the performance of working memory with TRS-10 probes attached on the prefrontal cortex and found that the right laterality score of C oxy-Hb increased by administration of Ginko biloba leaves [234]. Machida et al. showed that cosmetic therapy to elderly women with mild cognitive impairment was effective to improve the activities of the prefrontal cortex with increased C oxy-Hb and C total-Hb measured by the TRS-20 [235]. Tanida et al. also used TRS-20 to evaluate the difference in the women's emotion between pleasure and displeasure induced by the difference in lipsticks and found that the difference in the lipsticks resulted in different activities in the left and right prefrontal cortex [236]. Murayama et al. studied the relationship between cognitive function and cerebral blood oxygenation measured with the TRS-20 in the prefrontal cortexes of 113 elderly people and found strong correlations between working memory function and C oxy-Hb , C total-Hb , and S O2 [237].

Group of Professor Hamaoka (Tokyo Medical University)
In 2000, Hamaoka et al. used TD-NIRS to study the changes in C oxy-Hb , C deoxy-Hb , C total-Hb , and S O2 in radial digitorum extensor muscles with arterial occlusion and found reasonable agreements of S O2 measured by TD-NIRS and a blood gas analyzer [238]. Later, the group used TD-NIRS to examine the activity of brown adipose tissue (BAT), which can be a counter-measure of obesity and obesity-induced metabolic disorders. Brown adipose tissue has more capillary and mitochondria than other tissues, and the densities of capillaries and mitochondria were found to be strongly correlated with µ a and µ s , respectively, which are measurable with TD-NIRS [239]. Using TD-NIRS, Nirengi et al. showed that daily ingestion of capsinoids (thermogenic capsisin analog) for eight weeks increased the BAT density by 46%, resulting in the same results with the conventional method using 18 F-fluoro-deoy-glucose positron emission tomography (PET) combined with X-ray CT [240]. Fuse et al. performed a study with 423 Japanese subjects to confirm the relationship between the BAT density and C total-Hb measured with TD-NIRS [241].

Other Groups in Japan
In 2006, Ohmae et al. made measurements for six healthy human subjects with pharmacologically perturbed cerebral hemodynamics using the TRS-10 and PET simultaneously, and reported a correlation between the hemodynamic changes obtained by the TRS-10 and the PET-parameter changes in gray matter regions [242]. Sato et al. used the TRS-10 with eight adult male heads during carotid endarterectomy and reported that TD-NIRS measurements were useful for monitoring the cerebral blood circulation and oxygenation during neurosurgical operations [243]. Yamazaki et al. used the TRS-20 on the foreheads of pregnant women during caesarean section [244]. In the case of massive bleeding due to placenta previa, S O2 measured by the TRS-20 showed clear decreases from 67% to 54%, while Sp O2 measured by pulse oximetry did not change. The TD-NIRS systems were found useful for monitoring bleeding during caesarean section. Ueda et al. evaluated the usefulness of TD-DOS imaging of primary breast cancers using the TRS-20 [245]. Lesions with C total-Hb 20% higher than normal tissue exhibited more advanced cancer stage, higher mitotic counts, and higher 18 Fluoro-deoxy-glucose uptake in PET, and TD-DOS imaging of C total-Hb was found useful for prediction of patient prognosis and potential response to treatment.

Summary
We reviewed the development of TD-NIRS from the views of its theoretical background, instruments, advanced theories and methods, the literature on future clinical applications, and current clinical applications of commercially available TD-NIRS systems. At the beginning of the summary section, we identified the major/key developments in the instruments as well as the theories, methods, and applications of TD-NIRS and imaging from the previous sections, and they are listed in Tables 1 and 2 in a chronological style. In addition, the major developments in Tables 1 and 2 are schematically and chronologically shown in Figures 19 and 20, where the relationship between the major developments are indicated by vertical arrows to help us view the whole picture of TD-NIRS and imaging technology. From these tables and figures, it can be said that the theories, methods, and applications of TD-NIRS developed steadily in the 1990s and 2000s, while the development of TD-NIRS instruments was rather slow during 1990s and 2000s. However, the development of such instruments began to accelerate during the 2010s, probably owing to the advent of the SiPM (MPPC), as seen from Figure 19. Table 1. Chronology of major events in instruments of TD-NIRS (Abbreviations: "Ch" for channel and "pLD" for pulsed laser diode).

Year Event
Ref.

Year Event
Ref.
In this situation, slow but steady progresses in TD-NIRS systems over the conventional TD-NIRS techniques have recently shown the possibility and feasibility of solving the drawbacks. As described in Section 3.3, many new systems without using the PMTs have been developed such as the compact, low-power consuming and dual-channel or dual-wavelength TD-NIRS systems using the SiPMs (MPPCs) [55,56], the systems incorporating fast-gated SPADs [58,60,61], the systems based on a spread spectrum approach using laser diodes with pseudo-random-bit sequences [63,64], the systems using time-gated ICCD cameras [65,66], the system incorporating a low-cost commercially available optical transceiver module widely used in telecommunications [3], the system using a supercontinuum laser [67], and the system using state-of-the-art SPAD camera chips [62]. In particular, the SiPM (MPPC) that is a new photon counting semiconductor device may replace the conventional PMTs. Resultantly, it is highly expected that TD-NIRS systems may be greatly improved to meet the requirements in clinical applications.
In concert with the recent developments of the instruments, the theories, methods, and applications in TD-NIRS will be further developed in the future. In particular, the calculation method of the TD-RTE and the hybrid TD-RTE and TD-DE techniques will be developed so that the developments in the TD-NIRS instruments may fully exhibit their performances in the picosecond and nanosecond time-resolved measurements.
If the advanced theoretical methods and algorithms are installed even in the existing commercially available TD-NIRS systems, clinicians will be able to obtain more useful information for diagnostics. For example, if the methods to determine the optical properties of two-layered media instead of assuming the homogeneous media are installed into the commercially available TD-NIRS systems (such as TRS-1 and tNIRS-1 sold by Hamamatsu), the provided information will be greatly extended to become more useful for clinicians. When many TD-NIRS or TD-DOT results are accumulated, clinical applications for new and unforeseen diagnostic methods will be discovered.
Combining the improvements of the hardware and software as stated above is also expected to make it possible to transfer the sophisticated technologies of TD-DOT and TD-FT from current bench-top to bed-side instruments toward wide clinical applications in the near future.