Some Notes on the Use of the Windowed Fourier Transform for Spectral Analysis of Discretely Sampled Data

The properties of the Gabor and Morlet transforms are examined with respect to the Fourier analysis of discretely sampled data. Forward and inverse transform pairs based on a fixed window with uniform sampling of the frequency axis can satisfy numerically the energy and reconstruction theorems; however, transform pairs based on a variable window or nonuniform frequency sampling in general do not. Instead of selecting the shape of the window as some function of the central frequency, we propose constructing a single window with unit energy from an arbitrary set of windows which is applied over the entire frequency axis. By virtue of using a fixed window with uniform frequency sampling, such a transform satisfies the energy and reconstruction theorems. The shape of the window can be tailored to meet the requirements of the investigator in terms of time/frequency resolution. The algorithm extends naturally to the case of nonuniform signal sampling without modification beyond identification of the Nyquist interval.


Introduction
The primary criticism leveled at the use of the continuous wavelet transform for the spectral analysis of discretely sampled data is that it fails to give quantitatively meaningful results. The power spectral density produced from the convolution of a wavelet basis and a discrete signal gives a qualitative picture of the temporal variation of its frequency content; however, when one attempts the reconstruction of the signal, the residual is not on the order of the precision of one's computational device. Likewise, the integrated margins of the power spectral density do not precisely equal the energy of the original signal. In other words, the discrete implementation of the continuous wavelet transform and its inverse do not in seconds u t = s so that u E = V 2 s, then accounting for the load impedance in ohms u Z = Ω gives physical units of V 2 s/Ω = (ml 2 /t 2 q) 2 t/(ml 2 /tq 2 ) in terms of mass, length, time, and charge, which is equivalent to joules. The units of frequency are reciprocal to those of time u f = u −1 t , and the term "power" is understood to mean "energy density" and must be qualified by the domain for its distribution. The squared modulus |y(t)| 2 ≡ P (t) can thus be identified as the temporal power of the signal.
Under suitable conditions not elaborated here, the Fourier integral and its inverse, define an integral transform pair satisfying the Plancherel energy theorem ∞ −∞ |y(t)| 2 dt = ∞ −∞ |ŷ(f )| 2 df and the Fourier inversion theoremŷ(t) = y(t). The representation over the frequency axisŷ(f ) carries units of uŷ = u y u t so that the squared modulus |ŷ(f )| 2 ≡ P (f ) gives the spectral power of the signal, and the spectral energy is Eŷ ≡ ŷ |ŷ f = ∞ −∞ŷ * (f )ŷ(f )df . When the signal is of finite duration t ∈ [0, T ], the Fourier transform's response to a component sinusoid y(t) = exp(i2πf t) iŝ expressed in terms of the frequency offset f ∆ ≡ f − f , thus the spectral power of the uniform window function Φ T (t) = T −1/2 normalized to unit energy describes the leakage in the frequency spectrum induced by the finite duration of the signal, with domain f ∆ ∈ [−∞, ∞]. The continuum leakage function is not periodic in f ∆ , as its magnitude decays according to f −2 ∆ . With regard to practical data analysis, let us suppose now that the signal is given in terms of discrete samples in time with uniform duration and spacing. (Those requirements will be relaxed in Section 5.) The time axis can then be described in terms of integers t ∈ [1, T ] with unit u t ≡ ∆ t given by the sample rate, allowing the signal to be written as a vector y ≡ y t = y(t). The signal energy can be expressed in terms of matrix multiplication as E y = y † D t y, where the temporal metric D t = I T ∆ t is proportional to the identity matrix of order T . The effect of frequency aliasing induced by the uniform sampling is often misunderstood. If we evaluate the Fourier transform of the discrete window with unit energy, we find that the leakage function becomes periodic in f ∆ , with period 1 in units of u f = ∆ −1 t . The principle branch is usually chosen as f ∈ [−1/2, 1/2]∆ −1 t , but one should always respect the periodic nature of the spectrum. The normalization of the spectral power likewise is defined over one period of the frequency axis, T t=1 |y(t)| 2 ∆ t =  we compare the continuum and discrete leakage functions for a window with duration T = 5, where we see the expected oscillatory lobes as well as the periodicity induced by the discrete sampling in time. Now let us consider the discretization of the frequency axis into uniform bins over its principle branch f ∈ [−f c , f c ], where f c ≡ (2∆ t ) −1 is the Nyquist critical frequency. To fully specify the frequency metric D f , one must state its order N , equal to the number of positive frequencies ≤ f c , as well as its parity P ∈ {0, 1}, indicating whether an even or odd number of bins N = 2N + P is used to span the domain. When P = 1, the edgemost bins corresponding to frequencies ±f c have a bin width equal to one half that of the others, so that the limits of integration are respected; because of aliasing, the integrand will have the same value at those locations, so that only one full bin's contribution is counted. This convention differs from that usually given for the discrete Fourier transform [20], where the factor of 1/2 is absorbed by the coefficients rather than the metric. The order N specifies the spacing of the frequencies as ∆ f = (2N ∆ t ) −1 , and the metric can be written as D f /∆ f = I N −(| 1 1 |+| N N |)P/2 so that Tr D f = ∆ −1 t . The central frequencies of the bins can be expressed as f n = [n − (N + 1)/2]∆ f for integers n ∈ [1, N ]. Figure 2 compares the even and odd discretizations of order N = 4. While the odd parity discretization is perhaps more familiar, for many of our purposes the even discretization will prove more convenient.
We are finally ready to define the two-sided discrete Fourier transform (of order N and parity P ) of a discretely sampled signal (of duration T ). If we collect our Fourier basis functions into the form of a matrix, Θ ≡ Θ(t, n) = exp(i2πf n t), then we can easily write the discrete transform pair in terms of matrix multiplication,ŷ ≡ Θ † D t y andŷ ≡ ΘD fŷ , as well as the spectral energy Eŷ =ŷ † D fŷ . In component notation, one haŝ where ∆ f is understood to account for the edge bins if P = 1. If the signal y(t) is real, so that y(−f ) =ŷ(f ) * , then one may define the one-sided transform, retaining only the nonnegative portion of the frequency axis with N → N + P bins and f n → [n − (P + 1)/2]∆ f , by renormalizing the basis functions to twice the energy via Θ → √ 2 Θ and lettingŷ(t) → Reŷ(t). Hereafter, we will assume y(t) is real so that we can focus our attention on the one-sided spectrum. In the fully discrete setting, the satisfaction of the energy and reconstruction theorems is dependent upon there being a sufficient number of degrees of freedom (dof) in the basis functions to fully represent the information content of the signal. For real y(t), the number of dof is equal to the signal duration T . For the one-sided transform, whether of even or odd parity, the number of dof is equal to 2N , as each frequency's coefficient has both an amplitude and a phase,ŷ n = A n exp(iω n ), except for the cases f n = 0 or f c which have only an amplitude. The critical frequency f c is identified as the lowest positive frequency whose basis function is entirely real over the discrete time axis, an observation which will prove useful when we consider the case of irregularly sampled data. The minimal order beyond which the energy and reconstruction theorems are satisfied can be evaluated as N min = T /2 for a real signal (and N min = T for a complex one). That condition is realized (for even T ) when Θ † Θ∆ t = I N T , indicating that the basis functions Θ form an orthogonal set for N = T /2. However, orthogonality over the discrete time axis is not a requirement, as y † D t y =ŷ † D fŷ and y =ŷ exactly (meaning to the precision of one's computational device) for any N ≥ N min and either parity P .
Lastly, let us look at what happens when one tries to express the fully discrete, one-sided Fourier transform over an axis of period (or scale) s rather than frequency f . In the continuum, the relation between the axes f = s −1 yields the Jacobian |df /ds| = s −2 , thus the spectral power over scale P (s) ≡ |ŷ(s)| 2 is equal to the spectral power over frequency P (f ) multiplied by the Jacobian, P (s) = f 2 P (f ), otherwise expressed asŷ(s) = fŷ(f ). In the discrete setting, however, one must be explicit with the mapping of the bin boundaries so that n |ŷ(f n )| 2 ∆ f = n |ŷ(s n )| 2 ∆ s . Let ∆ f = b − a be the width of one frequency bin centered on f n = (a + b)/2, which gets mapped to the scale bin ∆ s = a −1 − b −1 such that ∆ f /∆ s = ab and P (s n ) = abP (f n ). One may very well ask, in what sense is s n = f −1 n the center of the scale bin? The value f n is both the mean and the median of the frequency bin with uniform measure p f = ∆ −1 f , but the mean period over that bin is f −1 f = log(b/a)∆ −1 f . The value s n is recognized as the median of the scale bin with measure p s = s −2 ∆ −1 f , such that 2/(a+b) 1/a p s ds = 1/2. The edges for odd parity are handled similarly with respect to the half-width bins. For either parity, the bin whose lower boundary is a = 0 is the one that causes trouble, as ∆ s = ∞ in that case. One can contrive to neglect that bin for odd parity by subtracting the mean of the signal y(t) → y(t) − y(t) t , but that procedure does not repair the problem for even parity.  Table 1. Numerical results from the evaluation of Figure 3. To illustrate the difficulty, in Figure 3 we display the mapping of the spectral power for a signal with duration T = 20, a sum of two sinusoids at frequencies 1/40 and 1/4 normalized to unit energy E y = 1. In panels (a) and (b) we show the mapping for the one-sided transform of order N = 20 and odd parity over domain f ∈ [0, f c ]. While the higher frequency's contribution remains apparent, that of the lower frequency has been washed away by the measure factor. The dof carried by the lowest frequency bin are unrecoverable from the mapping over scale on account of the infinite bin width. In panels (c) and (d) we repeat the procedure but for domain f ∈ [f c , 2f c ]. This time, all the dof are properly mapped, thus the energy and reconstruction theorems are satisfied. In Table 1 we give some numerical results from the evaluation shown in the figure. In short, there is nothing to be gained by working on the scale axis rather than frequency other than a headache in dealing with the effect of aliasing, as one requires the same set of basis functions Θ and minimal order N min to satisfy the fundamental theorems of spectral analysis in either case.

Gabor and Morlet Transforms
Let us turn now to the consideration of how to transform the signal y ≡ y(t) into a time-frequency representationŶ ≡Ŷ (f, t) by modulating the exponential oscillations of the Fourier basis with a discrete window function, Θ(f, t) → Ψ(f, t) ≡ Φ(t)Θ(f, t). The Gabor transform is defined by the use of a Gaussian window with decay parameter σ, which we will notate as Φ(t) ∝ exp −π/2 (t 2 /σ 2 ) ≡ e −πt 2 /2σ 2 . Its normalization to unit energy Φ → Φ/ Φ | Φ 1/2 t depends upon the duration of the window, parametrized by its half-width τ . In the continuum, one can write ∞ −∞ Φ 2 (t)dt ∝ σ and τ −τ Φ 2 (t)dt ∝ σ erf(π 1/2 τ /σ); however, in the discrete setting with index t ∈ [−τ, τ ], the window energy t Φ 2 (t)∆ t is not expressible in closed form. For the one-sided spectrum, Φ → √ 2 Φ so that its energy equals 2. To fully specify which Gabor transform is being used, one must state the values of N , P , σ, and τ . We will return to the question of finding the minimal order of the Gabor transform for a signal of duration T in Section 5; for now, we will assume that N is greater than N min of the previous section so that the fundamental theorems are satisfied.
There are two alternatives for the definition of the phase convention in the transform, which differ in whether the phase is expressed relative to the origin of the signal's time axis or that of the window: where the time axis for the transform coefficients carries an indext ∈ [1 − τ, T + τ ], or elsê The latter is more familiar, but either is equally valid in terms of satisfying the energy and reconstruction theorems. The sign of the argument to the window function is chosen deliberately so that these expressions remain valid for the case of a window which is not symmetric in t . The signal is zero padded for values oft + t outside the original domain of [1, T ], as no other value could be assigned without changing the energy hence information content of the signal. For either phase convention, the temporal spectral power is defined as P (n,t) ≡ |Ŷ (n,t)| t .) For comparison to the signal energy E y , let us evaluate the spectrum's energy as EŶ ≡ n t P (n,t)∆ t ∆ f and the reconstruction's energy as Eŷ ≡ t |ŷ(t)| 2 ∆ t . For this section and the next, let us consider a particular real signal y(t) comprised of 4 sinusoids with unit amplitude and non-stationary frequency of duration T = 200. The variation in the instantaneous frequencies is assigned an amplitude of 5% and a period of T . Phases for the oscillations and variations are selected randomly. In Figure 4 we show the results of the one-sided Gabor transform with N = 200 and P = 0 using window parameters of σ = 2 √ π and τ = 12. Panel (a) displays the contours of the energy density P (n,t) over axes of time and frequency. Also indicated are the instantaneous frequencies . One-sided Gabor transform of a real signal as described in the text. The power spectrum is shown in (a) with the signal's instantaneous frequencies indicated by the dashed lines. The marginal spectral power is shown in (b) as ×, as is the marginal temporal power in (c), and each is compared to the convolution of the window and signal energies indicated by +. The absolute value of the reconstruction residual is displayed in (d).
used to generate the data. Panels (b) and (c) display the marginal energy densities over axes of frequency and time, respectively. The marginal energy densities are evaluated by taking each sum over an axis independently, i.e. P (n) ≡ t P (n,t)∆ t and P (t) ≡ n P (n,t)∆ f . Panel (d) gives the absolute value of the residual r(t) ≡ŷ(t) − y(t), which is on the order of the machine precision ∼ 10 −16 . The ratios EŶ /E y and Eŷ/EŶ are equal to unity to the same precision, thus the fundamental theorems are satisfied. At these parameter values, the spectral resolution is poor while the temporal resolution is sharp.
The marginal densities of the Gabor power spectrum can be compared to the convolution of the window's energy density with that of the signal in either the temporal or spectral representations. Looking first at the temporal representation, one finds the relation where Φ in this case is normalized to unit energy. In the spectral representation, the convolution is taken in the modular (periodic) sense, most easily performed using the two-sided transform. Letŷ(f n ) be the two-sided Fourier transform of the signal with order N and parity P , and letΦ(f m ) be the transform of the window function, again normalized to unit energy, with the same order and odd parity such that M = 2N + 1. Then P (n) in the two-sided sense n ∈ [1, 2N + P ] can be written where mod(a, b) ≡ a mod b and with respect to the half-width of the edge bins, from which the values corresponding to the one-sided transform can be extracted and doubled. The relation above is written as an approximation because there is some subtlety to the evaluation of the RHS. So far, we have imposed no condition on the window duration T = 2τ + 1 other than being odd for integer τ , and indeed, there is none. In the lower limit τ = 0, the basis functions are constant, Ψ(0, n) = √ 2 (or 1 for the two-sided transform), and all frequency resolution is lost, yet the fundamental theorems are still satisfied for any order N ≥ 1. In the opposite extreme, for example τ = T such that T T with σ = ∞, the basis functions become essentially those of the Fourier transform with minimal order N min . Simply put, the duration of the window in the short-time Fourier transform need not be short. When the temporal bandwidth of the window approaches or exceeds that of the signal, the evaluation of the RHS in Equation (12) becomes suspect if the order N is not sufficient to resolve both the signal and the window; the practical solution is to increase N . Furthermore, one can verify that the Gabor transform remains well behaved (if not particularly useful) for the case σ → iσ so that the window grows exponentially rather than decaying. The only requirement is that the window be real valued and normalized explicitly, i.e. discretely, to unit energy, times 2 for the one-sided transform.
We can now introduce the Morlet transform by promoting the window parameters from constants to functions of the central frequency, Φ(t) → Φ n (t). Let σ n ≡ σ/f n , and with respect to the discrete sampling in time, let τ n ≡ τ /f n for constants σ and τ . The parameter τ itself need not be an integer, as τ n is what defines the window duration for bin n. A difficulty with τ n arises immediately for odd parity when f n = 0; one can ameliorate the situation by using τ n from the lowest positive bin in that case. Again, either the phase convention of Equations (7)(8) or (9)(10) can be used. In Figure 5 we show the results of the one-sided Morlet transform of order 200 and even parity with parameters σ = √ π and τ = 6 chosen so that the window of the previous Gabor transform corresponds to the Morlet window at the critical frequency f c . The power spectrum in panel (a) has the marginal densities shown in panels (b) and (c); however, this time the ratios EŶ /E y and Eŷ/EŶ differ from unity on the order of 1%, interestingly by nearly the same value. Likewise, the absolute value of the residual displayed in panel (d) is on the order of 1%, comparable to the Farge method [2] as reported by Torrence and Compo [1]. While the lowest frequency component has been well resolved, the spectral resolution of the upper frequencies remains poor. Furthermore, without a single window in operation, one cannot perform the comparison of the marginal densities with the convolution of the window and signal energy densities corresponding to Equations (11) and (12).
Let us insert here some comments on what is called the admissibility condition. The statement often is made by authors that the wavelet must have a mean of zero so that its Fourier coefficient at f = 0 vanishes. While that remark might apply in the continuum, it is not appropriate for discretely sampled wavelets of finite duration. If one examines in detail the proof of the admissibility condition [21], one finds that it relies crucially on the property of continuity. In full, the admissibility condition states that if the wavelet is continuous in time with continuous Fourier transform, then its mean must be zero. In the context considered here, neither of those conditions is met: in the temporal representation the wavelet is a piece-wise constant function with jump discontinuities at the edges of the temporal bins, and likewise for the frequency representation which, while of arbitrary resolution beyond the minimal order, must nonetheless be evaluated discretely for any practical analysis of data. If one computes the leakage functions for the discrete Morlet basis [22], one finds an artificial discontinuity is induced by the imposition of the zero mean condition. As a final argument against subtracting the mean of the discrete basis functions, note that the Gabor transform works perfectly well for any values of the window parameters without including such procedure.
There are many suggestions found in the literature for improving the performance of the discretely implemented continuous wavelet transform. Let us examine a few of them here, with their absolute residuals displayed in Figure 6 and a numerical summary in Table 2. For panel (a) the mean of the signal is subtracted before entering the spectral analysis. As one can see, not much changes, but for consistency of comparison the remaining panels will also use the mean subtracted signal. Noticing that the ratios EŶ /E y and Eŷ/EŶ are very nearly equal, one can renormalize the spectrum and reconstruction a posteriori by their geometric mean C = (Eŷ/E y ) 1/2 , yielding the residual shown in panel (b). This renormalization prescription works best when there is not much energy in the upper portion of the frequency axis [22]. Perhaps more familiar is the prescription to shift the central frequency 2π → ω 1 . At these window parameter values, a shift of ω 1 /2π = 1.0127 is found to work well [23], giving the residual shown in panel (c). If one then applies the renormalization prescription, one gets the residual displayed in panel (d). While there has been a noticeable improvement of three orders of magnitude, the root-mean-square residual remains far above the machine precision. The primary feature of the Morlet transform is that it uses a different window function Φ n for each bin along the frequency axis. That is also its main difficulty, as each bin corresponds to a particular instance of the Gabor transform, any of which would work fine independently, but when the separate bins are isolated and combined, there is no reason to suppose that their marginal sums will equal the signal energy. The correspondence between the Morlet and Gabor transforms becomes more apparent if one redefines the window decay σ n relative to the window width τ n rather than the central frequency f n , in which case regions along the frequency axis of the Morlet spectrum are drawn from a particular Gabor spectrum. Simply pasting together pieces from different Gabor transforms from the outset appears doomed to failure, and it is surprising that the Morlet transform in the discrete setting works as well as it does. While we have not given up entirely on the Morlet transform, if one wants to make quantitative use of the results of a spectral analysis, the energy and reconstruction theorems must be satisfied precisely.

Layered Window Transform
How, then, are we to devise a multiresolution analysis that satisfies the fundamental theorems of spectral analysis? The answer lies in focusing our attention on the distribution of energy in the window function. Suppose we have one unit of energy, an "atom" so to speak, distributed with temporal power Φ 2 A (t) in the continuum, which we wish to use as the basis for a multiresolution analysis. Let the window at scale τ be defined by Φ τ (t) ∝ Φ A (t/τ ) for positive integer τ , which scales the relative unit of time by integral amounts. One can supplement this definition for τ = 0 by letting Φ 0 (0) = 1 and 0 otherwise. For an exponential decay with scale invariant parameter σ, one can write Φ τ (t) ∝ exp −π/2 (t 2 /τ 2 σ 2 ), where the normalization to unit energy is done explicitly over the integers t ∈ [−τ, τ ]. (Alternately, one could hold the window width fixed for all τ .) The layered window Φ L (t) is then defined to be the sum in terms of energy, not amplitude, of the windows Φ τ (t) over some set of scales τ ∈ [τ 1 , τ L ] with L members, normalized to unit energy, Φ 2 . For a one-sided transform, Φ L → √ 2 Φ L of course. This single window, with duration T = 2τ L + 1, is to be used for all the frequency bins in the windowed Fourier transform.
Since the atomic window Φ A is arbitrary, the distinction between summing the window energies Φ 2 τ versus summing the window amplitudes Φ τ is really just a matter of interpretation. However one defines the sum over scales, the layered window eventually is normalized explicitly to one unit of energy, or two units for a one-sided transform. Using the construction above in terms of energy, the amplitude of the layered window is the root-mean-square of the amplitudes of the scaled windows, while the construction summing the window amplitudes is not so simply normalized. With respect to its physical motivation, energy is the quantity that ultimately gets quantized, not amplitude, which is a property reflected in our favored definition of the layered window. The selection by the investigator of the set of scales {τ } to use in the construction specifies the domain of energy distributions to which the transform will be sensitive. Let us return now to consideration of our four frequency test signal, with its mean restored. For comparison with the previous Gabor and Morlet transforms, let the decay parameter here be σ = √ π/6, and let τ ∈ [12,96]. The results of such analysis are displayed in Figure 7. Compared to the Morlet transform, the spectral resolution is much improved in the upper portion of the frequency axis, as seen in panel (a). Since only one window is in operation, the marginal densities of panels (b) and (c) can be compared to the convolution of the window and signal energy densities in either the temporal or spectral representations. The absolute residual shown in panel (d) is on the order of the machine precision, with maybe a slight accumulation of truncation errors. The ratios EŶ /E y and Eŷ/EŶ equal unity to machine precision. The layered window transform, by virtue of using a single window for all frequencies, satisfies the energy and reconstruction theorems to the accuracy of one's computational device.
As a final comparison, let us put the Fourier power spectrum of the signal alongside the marginal spectral energy densities of the Gabor, Morlet, and layered window transforms, displayed in Figure 8. The Fourier spectrum in panel (a) is the typical mess one gets when analyzing non-stationary signals, but its net energy does equal the signal energy. The Gabor spectrum in panel (b) has lost its frequency resolution on account of the tight temporal bandwidth of the selected window parameters yet retains the correct normalization. The Morlet spectrum in panel (c) resolves the low frequency portion of the spectrum but not the high frequency region and does not have the correct normalization. The layered window spectrum in panel (d) is normalized to the signal energy and resolves both the low Figure 7. One-sided layered window transform of a real signal as described in the text. The power spectrum is shown in (a) with the signal's instantaneous frequencies indicated by the dashed lines. The marginal spectral power is shown in (b) as ×, as is the marginal temporal power in (c), and each is compared to the convolution of the window and signal energies indicated by +. The absolute value of the reconstruction residual is displayed in (d).
and high frequency parts of the spectrum while maintaining sufficient temporal resolution to be useful in identifying non-stationary features in the signal.

Irregular Sampling and Minimal Order
Let us now consider the case of an irregularly sampled signal y(t), with regard to the analyses by Lomb [24] and Scargle [25]. Irregular sampling has long vexed practitioners of wavelet analysis, leading to a variety of suggestions for how to overcome its difficulties. Because of the nature of stellar observations, the astrophysical community often presents data which contains gaps or is otherwise on an irregular time axis. The method by Foster [26] focuses on the normalization of the gapped wavelet basis functions, while the method by Frick et al. [27] focuses on the admissibility condition; taken together, those ideas lead to the edge adapted algorithm proposed by Johnson [28]. More familiar perhaps is the lifting scheme by Sweldens [29], which operates in the context of the dyadic wavelet transform implemented in terms of finite impulse response digital filter banks.
For simplicity, we will assume that the samples each have a uniform duration ∆ t which is not greater than any of the intermeasurement periods, so that the time metric D t remains proportional to the identity matrix; otherwise, a suitably generalized D t must be used. Let the observation times be given by the vector t ≡ t d indexed by d ∈ [1, D], where D is the total number of samples, in some unit u t not necessarily equal to ∆ t such that the values t d are integers; u t is the resolution of the time measuring apparatus, thus ∆ t must also be an integer. Similarly, the measurements can be written as the vector y ≡ y d so that the signal energy remains E y = y † D t y in units of u 2 y u t . Missing values, i.e. measurements at integer times not in t, are effectively treated as zero. Let us look first at how the discretized Fourier transform is modified in this case.
To identify the Nyquist critical frequency, one must find the lowest positive Fourier basis function which is entirely real (technically up to a constant phase) over the given set of observation times [30]. That procedure is easily accomplished by shifting and scaling the time axis t → t such that f c ≡ (2u t ) −1 in the new time units. If one defines t g to be the greatest common divisor over the set of intermeasurement periods, the new time axis can be written t d ≡ (t d − t 1 )/t g + 1 in units of u t ≡ t g u t such that t d ∈ [1, T ] remains integer valued. At the critical frequency, the Fourier basis function is exp iπ (2f c t d ) = ±1, and the Fourier spectrum has period 1 in units u f = u −1 t . Having found the units in which f c = 1/2, let us drop the prime distinguishing the scaled time axis in the following, and for convenience let us suppose ∆ t = 1 in the scaled units; otherwise one must be extra careful with the scaling of the energy units.
The next task is to determine the minimal order for satisfaction of the energy and reconstruction theorems. Let N D ≡ D/2 , and let N T ≡ T /2 . For N < N D there are an insufficient number of dof in the discrete Fourier transform to fully represent the information content of the (real) signal, thus the fundamental theorems are not satisfied in general. Likewise, for N ≥ N T there are more than enough dof to represent the signal; N T gives the minimal order of the analogous regularly sampled signal where the missing entries are assigned the value zero. For N D ≤ N < N T , the Fourier transform might work, depending upon the particulars of the irregular sampling. To satisfy the reconstruction theoremŷ = y Table 3. Comparison of the norm of the residual of the Fourier transform to the distance from the quality matrix Q to the identity I D at various orders N and even parity for an irregularly sampled signal with D = 10 and T = 28 such that N D = 5 and N T = 14. forŷ ≡ Qy in the one-sided case, one must have Q ≡ Re ΘD f Θ † D t = I D to the order of machine precision; the energy theorem also is satisfied when that condition is met. (For a complex signal, the entire matrix, not just its real part, must be utilized, and of course D f must cover one full period of the frequency axis.) The basis functions are evaluated only at the observation times Θ(d, n) = exp i2π (f n t d ), because those are the only locations at which the reconstruction theorem must be satisfied. In Table 3 we compare the norm of the reconstruction residual r ≡ŷ − y for some signal with D = 10 and T = 28 to the distance between the quality matrix Q and the identity I D according to the Frobenius metric at various orders N with P = 0. To produce a smooth picture of the spectral content, one should take N N T . Turning now to consideration of the Gabor transform (and by extension all windowed Fourier transforms with a fixed Φ), the construction of the quality matrix is a bit more complicated, owing to the convolutions along the time axis. It is commonly remarked that the number of dof along the frequency axis is multiplied by the number of times the window duration T = 2τ + 1 fits within the signal duration T , but the situation is more subtle than that, as one must also account for the bandwidth of the window given by the decay σ. Beginning with the case of regular sampling D = T , using the phase convention of Equations (9-10), one can write the composition of the inverse and forward transforms aŝ which yields one row in the quality matrix Q indexed by t ∈ [−2τ, 2τ ] indicating the diagonal where q(t ) appears such that Q has the form of a Toeplitz matrix of rank T . For a symmetric window Φ(−t ) = Φ(t ) written as a diagonal matrix Φ such that Ψ ≡ ΦΘ, the values q(t ) can be extracted from Ψ * D f Ψ † .
For an irregularly sampled signal T > D, one retains only those rows and columns of Q corresponding to actual data entries, Q → Q(t, t). In Table 4 we compare the residual norm of the Gabor transform with τ = 12 and two values of σ to the distance from Q to I D at various N with P = 0 for an irregularly sampled signal of duration T = 100 with D = 30 entries. As N approaches N D from below, the quality of reconstruction improves until the residual is on the order of machine precision. Table 4. Comparison of the norm of the residual of the Gabor transform with width τ = 12 and decay σ indicated to the distance from the quality matrix Q to the identity I D at various orders N and even parity for an irregularly sampled signal with D = 30 and T = 100 such that N D = 15 and N T = 50. 1.0987e-03 7.8620e-05 1.0067e-14 1.2076e-14 1.2964e-14 Q − I D F 6.6716e-04 6.9627e-05 6.0970e-16 7.7092e-16 1.0855e-15 Table 5. Comparison of the norm of the residual of the Morlet transform with width τ = 6 and decay σ = √ π to the distance from the quality matrix Q to the identity I D at various orders N and even parity for a regularly sampled signal with D = T = 30 such that N D = N T = 15. N 6 7 8 9 10 r F 3.3471e+00 2.9397e+00 2.7428e+00 2.2582e+00 1.9475e+00 Q − I D F 2.9527e+00 2.4686e+00 2.0993e+00 1.8244e+00 1.5788e+00 N 11 12 13 14 15 r F 1.7911e+00 1.5917e+00 1.3310e+00 9.3137e-01 5.3677e-01 Q − I D F 1.3451e+00 1.1139e+00 8.7648e-01 6.3175e-01 4.0199e-01 For the Morlet basis functions, the construction of Ψ is more involved, as Ψ(t , n) = Φ(t , n)Θ(t , n). Nonetheless, one can evaluate Q from Ψ * D f Ψ † for any rank T . As a function of order N , one finds that the deviation Q − I T F decreases until N > T /2, after which it bottoms out on the order of a few percent times T . In Table 5 we compare the norm of the residual for the Morlet basis using σ = √ π and τ = 6 for some regularly sampled signal with D = T = 30 to the deviation of Q from I D at various N with P = 0. If one wishes to investigate the feasibility of devising a Morlet basis with perfect reconstruction, the evaluation of Q is where to start.
Let us close this section by looking at the analysis of the signal from the previous section but with the number of measurements reduced by a factor of a third (N D = 134) for the same duration T = 200. The evaluation of the layered window Fourier transform proceeds as before, with the understanding that values of y(t + t ) at times not in t are treated as zero; the time axis for the spectrumŶ (n,t) includes every integert ∈ [1 − τ L , T + τ L ]. Using the same parameters as before, σ = √ π/6 and τ ∈ [12, 96] Figure 9. One-sided layered window transform of an irregularly sampled signal as described in the text. The power spectrum is shown in (a) with the signal's instantaneous frequencies indicated by the dashed lines. The marginal spectral power is shown in (b) as ×, as is the marginal temporal power in (c), and each is compared to the convolution of the window and signal energies indicated by +. The absolute value of the reconstruction residual is displayed in (d).
with N = 200 and P = 0, in Figure 9 we display the power spectrum, its marginal densities, and the reconstruction residual for our irregularly sampled signal. While the loss of information has affected the appearance of the spectrum relative to the regularly sampled case, it nonetheless remains a faithful representation of the information which is available. By virtue of using a single fixed window Φ L , the layered window Fourier transform satisfies the fundamental theorems of spectral analysis even when there are gaps in the observation record.

Window Comparison
Let us conclude the analysis with a comparison of the temporal and spectral bandwidths for several types of window function, as well as their estimates of the power spectral density carried by a test signal with a little more complexity than we had before. The test signal is regularly sampled with T = 200 and four component frequencies, but now only two components are present in the first half of the duration, while the other two are in the second half. An abrupt transition occurs at the midpoint of the duration. The amplitude of the frequency variation is now 10%, with a period of T /2, so that each component covers a broad range of instantaneous frequency. The order for all the transforms in this section will be the minimal Fourier order of the signal N = 100 with even parity P = 0. Figure 10. Comparison of the temporal and spectral energy densities of the various windows arranged according to Table 6. Table 6. Comparison of the temporal and spectral bandwidths for various windows, as well as their rms reconstruction residual for the test signal as described in the text. The parameters for the four types of window considered are displayed in Table 6, comprising of two Gabor windows at either end of the range in scale for the layered window also considered, in addition to a random window with an arbitrary duration. The first moments of time and frequency are indicated by t 1 and f 1 respectively, and the bandwidths (square root of the second moment about the first moment) by t 2 and f 2 . The two Gabor windows are observed to minimize the Fourier uncertainty relation t 2 f 2 ≥ 1/4π, Figure 11. Comparison of the power spectral density of the test signal described in the text using the various windows arranged according to Table 6. The instantaneous frequencies used to generate the signal are indicated by the dashed lines. thus in that sense are optimal, while the layered window has a slightly larger bandwidth product. The random window has a bandwidth product which is considerably larger, yet all these windows produce a valid discrete transform pair that satisfies the fundamental theorems of spectral analysis as indicated by the residual of reconstruction r F . The energy densities in the temporal and spectral representation used to evaluate the bandwidth products are shown in Figure 10, where one can see the trade-off in time/frequency resolution in action.
The power spectral density estimates evaluated from the test signal of this section using the various windows are displayed in Figure 11, arranged from (a) to (d) according to Table 6. The power spectrum given by the layered window in (c) does indeed incorporate features of either Gabor transform shown in (a) and (b); in a sense, it has combined the Gabor transforms over its range of scale in a way that preserves the energy and reconstruction theorems. With some algebraic manipulation of the expressions defining the layered window Fourier transform one might be able to show that explicitly, but for now that statement is intuitive speculation. Interestingly, the spectrum given by the random window is not much different from the others, driving home the point that the shape of the window truly is arbitrary as long as it is normalized appropriately.

Discussion
The primary result of this investigation is that the windowed Fourier transform has far more flexibility and utility than it is usually credited with. To achieve satisfaction of the energy and reconstruction theorems in the discrete setting, the only requirements on the window are that it be real, nonnegative, and normalized explicitly to unit energy, or two units for a one-sided transform. While we have looked only at symmetric windows here, one can easily verify that the expressions for the transform pair, either Equations (7)(8) or (9)(10), remain valid for a window that is not symmetric in t ; with attention to the details of relocating the temporal bins, the coefficients can be assigned to the time corresponding to the peak of the window rather than its midpoint. Similarly, a window with an even length duration T could be used if one is careful with the definition of the phase and the location of the bins. In fact, any nonnegative function can be used for Φ, or even a list of arbitrary numbers, as long as it is suitably normalized.
The flexibility in Φ allows one to define a multiresolution spectral analysis in terms of an atomic unit of energy Φ A (t/τ ) evaluated over a range of scales τ . These multiple scalings of Φ A are combined into a single layered window Φ L which is applied to the entire frequency axis; the phase component of the basis is independent of the window function. Because only a single window is used, the fundamental theorems of spectral analysis are satisfied. With a Gaussian Φ A , the form of Φ L very closely resembles a Lorentzian function, which expresses uniformity over scale when expressed as an angle. The shape of Φ L can be tailored to the needs of the investigation through inspection of its leakage function. Similarly to the wavelet transform, the trade-off between resolution in time or frequency is under the control of the investigator. Any structure seen in the spectral coefficients of the signal is understood to be conditioned on the selection of the window function; the answer one gets depends upon how the question is asked.
The difficulties faced by the continuous wavelet transform exemplified by the Morlet basis, with respect to the fundamental theorems of spectral analysis, can be summarized in the evaluation of its quality matrix Q. Perfect reconstruction in the discrete setting requires Q to equal the identity matrix to the precision of one's computational device. The various adjustments looked at above which improve the reconstruction residual must be bringing Q closer to that form; however, none of them achieve the required precision. For an integral transform pair to have quantitative significance, one must demonstrate that the distance from its Q to I is numerically zero. This investigation has proposed an alternate approach to multiresolution analysis which does satisfy the energy and reconstruction theorems while improving upon the resolution properties of the Gabor transform.
The effects of aliasing and of irregular sampling are easily understood with regard to the periodicity induced in the Fourier spectrum. The Nyquist interval is given simply by the span between frequencies which are indistinguishable over the stated measurement times; assigning a value of unity to that range such that f c = 1/2 is just a matter of choosing the appropriate units for time. The implementation of the windowed Fourier transform is unaffected beyond zero padding the signal at locations of missing measurements, whose justification is that no other procedure leaves the energy, hence information content, unchanged. One should observe that the ends of a regularly sampled signal are treated the same way. On that note, we can remark that there is no cone of influence for the windowed Fourier transform, as every spectral coefficient is important to the satisfaction of the fundamental theorems, even those outside the temporal domain of the data.
The minimal order N min required for a faithful representation of the signal depends upon the temporal bandwidth of the window and the duration of the data. Pinning that number down for the windowed Fourier transform is a bit tricky, and if needed is best evaluated explicitly through inspection of Q as a function of order N . What one can say in general is that 1 ≤ N min ≤ N T , where the bounds are Figure 12. Comparison of the lowest Ψ(t , 1) and highest Ψ(t , N ) frequency basis functions at order N = 50 with P = 0 using the parameters found in the text, with the real part indicated by × and the imaginary part by +. The Fourier basis is in (a) and (b), the Gabor basis is in (c) and (d), the Morlet basis is in (e) and (f), and the layered window basis is in (g) and (h).
given by the minimal orders for the fully temporal and fully spectral representations of the regularly sampled signal, respectively. Irregular sampling complicates matters by introducing an order N D < N T such that N D ≤ N min ≤ N T for the full-length Fourier transform and N min ≤ N D for the windowed Fourier transform. For most practical cases of data analysis, one is interested in producing a smooth plot of the spectral content, thus one would use an order N N min . To verify the marginal density of the windowed power spectrum in comparison to the convolution of the window and signal energy densities in the spectral representation, one requires an order N sufficient to resolve both the signal and the window temporal durations.
Thus far we have heard nary a peep from the actual basis functions used in this analysis, so let us close the discussion by looking at some of the stars of the show. In Figure 12 we display the real and imaginary parts of the basis functions, normalized to unit energy, for each of the transforms considered above at order N = 50 with even parity P = 0 at the lowest and highest entries on the frequency axis, i.e. f 1 = 1/200 and f N = 99/200. The Fourier transform has no intrinsic window, so it is shown for a duration T = 101 which spans T < T . The window parameters chosen are those from the previous sections: σ = 2 √ π and τ = 12 for the Gabor transform, σ = √ π and τ = 6 for the Morlet transform, and σ = √ π/6 and τ ∈ [12,96] for the layered window transform. One can observe that the transforms whose Q equals the identity share the property that these basis functions are phase analogues over the discrete sample times, whereas the Morlet basis functions are scale analogues. By phase analogue we mean if Ψ(t , 1) = a 1 + ib 1 , then Ψ(t , N ) = a 1 (−1) t + ib 1 (−1) t +1 , a feature shared by all positive frequency pairs whose midpoint is f = 1/4; the Morlet basis differs from the others in this regard. Whether that property is required for satisfaction of the fundamental theorems is unclear, but the results of this analysis suggest that it is.

Conclusions
In this article we have compared the Fourier, Gabor, and Morlet transforms in the fully discrete setting applicable to the analysis of sampled data. While the Fourier and Gabor transforms satisfy the fundamental theorems of energy conservation and perfect reconstruction, the Morlet transform does not. The magnitude of the residual can be related to the distance from the quality matrix to the identity in all cases, such that the minimal order for reconstruction can be determined for the Fourier and Gabor transforms. Various methods of improving the response of the Morlet transform are considered; however, none of them achieve the desired precision on par with the truncation error of one's computational device.
An alternate approach to multiresolution analysis is proposed which constructs a single layered window from multiple scalings of some atomic unit of energy. This layered window transform satisfies the fundamental theorems of spectral analysis while providing sufficient temporal resolution to identify non-stationary features in the signal. The trade-off between time and frequency resolution is under the control of the investigator through the selection of the atomic window and the scales over which it is evaluated. The power spectrum of the layered window transform is similar to that of the Morlet transform but provides much better frequency resolution.
The premise behind the wavelet transform is that the low frequency elements of a signal should have a much longer duration than the high frequency elements. There are, however, many examples of real world signals for which the converse is true. Consider, for example, the digital recording of a kick drum and cymbal rhythm such that the low frequency bursts have a relatively short duration compared to the ringing at high frequency. For that type of signal, the wavelet transform is going to provide a poor choice of basis even if it satisfied the fundamental theorems. The flexibility in the choice of window function used in the windowed Fourier transform allows one to tailor its response to the needs of the analysis far better. The result one gets depends upon how one defines the resolution of the window, any of which provide a valid spectral representation of the signal.
To make quantitative use of the spectral analysis of some discretely sampled signal, one must demonstrate that the fundamental theorems are satisfied. The windowed Fourier transform can be shown to satisfy those requirements for any real valued window function that is suitably normalized. The potential for gaps in the data, or some other form of irregular sampling, is found to pose no problem once the correct Nyquist interval is identified. Consequently, the windowed Fourier transform can be applied to data from a wide variety of sources, such as astronomical observations, which are limited physically to an irregular set of observation times, as well as the common case of regular sampling. The implementation of a multiresolution spectral analysis in the discrete setting appears to be possible only when the multiple scalings of the energy distribution are applied evenly across the frequency axis by combining them into a single layered window.