# Quantized Constant-Q Gabor Atoms for Sparse Binary Representations of Cyber-Physical Signatures


*Keywords:* Gabor atoms; wavelet entropy; binary metrics; acoustics; quantum wavelet


Infrasound Laboratory, University of Hawaii, Manoa, HI 96740, USA

RedVox, Inc., Kailua-Kona, HI 96740, USA

Received: 9 July 2020 / Revised: 13 August 2020 / Accepted: 21 August 2020 / Published: 26 August 2020

Increased data acquisition by uncalibrated, heterogeneous digital sensor systems such as smartphones presents new challenges. Binary metrics are proposed for the quantification of cyber-physical signal characteristics and features, and a standardized constant-Q variation of the Gabor atom is developed for use with wavelet transforms. Two different continuous wavelet transform (CWT) reconstruction formulas are presented and tested under different signal-to-noise ratio (SNR) conditions. A sparse superposition of Nth order Gabor atoms worked well against a synthetic blast transient using the wavelet entropy and an entropy-like parametrization of the SNR as the CWT coefficient-weighting functions. The proposed methods should be well suited for sparse feature extraction and dictionary-based machine learning across multiple sensor modalities.

This paper applies the constant-Q standardized Infrasonic Energy, Nth Octave (Inferno) framework [1] to the Gabor wavelet [2] and proposes binary metrics for signature characterization. One of the primary motivations of this work is to facilitate the fusion of multi-modal data streams in sensor systems that collect information at different temporal and spatial granularities. Consider a cyber-physical sensor system that converts observables into digital time series data consisting of signals and noise. Signals of interest can be hypothetically described by sparse representations that define their signature. If the signature characteristics are sufficiently unique and recognizable from those of ambient coherent and incoherent noise, they can be used to identify and classify an object or process.

The transformation of diverse digital measurements into robust, scalable, and transportable representations is a prerequisite for signal detection, source localization, and machine learning applications for signature classification. The challenge at hand is to construct sparse signal representations that contain sufficient information for classification. Unambiguous classification can be elusive; measurement artifacts, unexpected signal variability, and non-stationary noise often conspire to add uncertainty to our classifiers. As will be discussed in this paper, information and uncertainty quantification can be substantially simplified when using standardized wavelets and binary metrics.

Oscillatory processes often exhibit spatial and temporal scalability and self-similarity. Although some physical processes scale linearly, many exhibit recurrent patterns that scale logarithmically and are well represented by power laws. Both linear and logarithmic scales can coexist. For example, overtones in harmonic acoustic systems are often linearly spaced in frequency, yet our sense of tone similarity is close to base 2 logarithmic (binary) octave scales. The term octave comes from the eight major notes in 12-tone musical notation, where every note frequency closely repeats with factors of two. This paper uses the term octave and binary interchangeably to denote the base 2 geometric scaling of frequency and time. The mapping between frequency (or pitch) and time (period) is direct for continuous tones, such as musical notes, or statistically stationary oscillations like the orbits of planets. Discrete Fourier transform methods are exceptionally well suited for the interpretation of steady tonal signals with linearly spaced harmonics. The Fourier transform deconstructs oscillations with distinct recurrent time periods into a spectral representation consisting of a set of discrete frequencies. The spectral transformation can be sparse because it removes time as a variable, facilitating the reconstruction of stable oscillations from a subset of coefficients in the Fourier spectrum.

Stable oscillators can be even more succinctly represented by a fundamental frequency or period (exclusive or, as they are not independent). For many physical systems, a map can be constructed between the fundamental frequency and its harmonics. Signals where the fundamental and its harmonics (when they exist) are statistically stationary and easily discernible above noise can be referred to as the easy continuous wave (CW) problem, or the zeroth (trivial) class of CW problems. The trivial CW problem is well understood and should routinely be used as a speed and performance benchmark for detection and classification algorithms.

The plot thickens when temporal variability is introduced in the signal or the noise. In the first class of CW problems, temporal variability is due to non-stationary broadband or band-limited noise. This is a chronic condition in infrasonic signal processing, where ambient noise can be coherent or incoherent across a dense sensor network [3] or an array aperture [4]. The first class of CW problems is also well understood when noise is predictable (e.g., normally distributed) over a time duration that is much longer or much shorter than the signal period in the detection band. However, this class of problems is not as well characterized when noise is not evenly distributed across the signal detection bandpass and can be particularly inconvenient when noise overwhelms the fundamental frequency band.

In the second class of CW problems, temporal variability is introduced by a change in the temporal, spectral, and/or statistical properties of the signal. These changes can be due to aging, failure, motion, communication, or any other change in state. In a simple two-state problem, one may quantify the properties of the first state, the transition period between states, and the properties of the final state. In a multiple-state problem, such as with communication systems, speech, or music, the Short-Time Fourier Transform (STFT) is often used to characterize spectral variability.

If the transition period between states is faster than the characteristic time scale of the initial state, the STFT does not always provide an accurate representation of this transient. For some signals, the details of the transient are not relevant and only the steady states are important. But a new class of signals emerges when the detection of transient anomalies is prioritized.

The zeroth class of transient problems consists of delta functions with their integrals and derivatives. Such instantaneous spikes do not exist in the natural world but can be readily constructed digitally to evaluate the impulse response of a system or represent a neuromorphic network [5,6]. The first class of transient problems would consist of realistic variants of the delta function that may be observed in the wild when a rapid change of state becomes the signal of interest. Just as a single-tone sinusoid may be regarded as the prototype end member for the trivial CW problem, an explosive detonation could be considered as a prototype transient signal source [7]. A time series corresponding to a blast would vary from ambient noise to a brief blast transient that fades back to a possibly perturbed background noise state. The transition from noise to signal can be devastatingly fast. In general, poorly conditioned STFTs provide inadequate representations of brief, rapidly changing signals because the signatures no longer resemble a CW and are not optimally represented by sinusoids. However, since a STFT is a windowed sinusoid, a well-conditioned STFT window at the peak frequency of a signal turns the waveform in the STFT window into a wavelet that is well-tuned for the main signal bandpass.

The concept of a windowed sinusoid to represent a transient signal was introduced by Gabor [2] in 1946, and later mathematically formalized by others as wavelets. Variants of the Gabor wavelet are presented in the main text and in Appendix A, Appendix B, Appendix C, Appendix D, Appendix E, and Appendix F.

The second class of transient problems overlaps with the second class of CW problems. It corresponds to transients of significant durations which could be addressed with STFTs, wavelets, or their combination. Very often a transient is embedded in a noise field with band-limited harmonic structure, or the transient itself is a sweep, characterized by a substantial change in the fundamental frequency and its harmonic structure.

The primary differences between STFTs and wavelet transform approaches are that the STFT uses a linear period mapping and a constant time window duration, while wavelets use a geometric pseudo-period mapping and time window durations that scale with the pseudo-period. Whereas in the Fourier framework there is a one-to-one mapping between time and frequency, the wavelet mapping between time scale and frequency can be less evident and depends on the selected wavelet.

This paper concentrates on developing highly standardized Gabor atoms [2] for the design and evaluation of transportable, sensor-agnostic transient signal detection, sparse feature extraction, and classification algorithms.

A Cyber-Physical System (CPS) is an algorithm-controlled computer system with physical inputs and outputs. A typical example of a mobile CPS is a smartphone with a microphone input (sound activation) that outputs a response (speech, music, or signal recognition) to a screen. Cyber-physical Measurement and Signature Intelligence (MASINT) is an emerging discipline that concentrates on phenomena transmitted through cyber-physical devices and their interconnected data networks. For smartphones and other multi-sensor mobile platforms connected to wireless networks, this includes digital noise, bit errors, and latencies internal to the device and its communication channels [8,9,10].

Data processed by the cyber part of CPSs are digital and represented as binary digits (bits). Although the precision of the data would be initially defined by their allocated integer word size (16 bit, 24 bit, etc.), the original data may be converted into floating point equivalents when an algorithm acts on them. For example, consider sound recorded by a smartphone at the standard rate of 48,000 samples per second. A typical sound record may have 16-bit resolution, so that its sample values range from $-2^{15}$ to $2^{15}-1$. However, one may only be interested in the lower frequency components of the raw data, so one would implement a lowpass anti-aliasing filter before decimation. Such filters often require floating point arithmetic in double precision (52-bit mantissa per IEEE 754 at the time of this writing) to reduce instability. Therefore, the precision of the resulting lowpass filtered data would exceed the specification of the original 16-bit integer input. However, the theoretical dynamic range of the system would not exceed the specification of the 16-bit integer physical input. Furthermore, data compression can be more efficient on floats than integers, which leads us to the topic of fractional bits as a measure of CPS amplitude, power, and information.

Many of the metrics we use in traditional physical and geophysical systems are inherited from the analog era. The base 10 decibel scale is a measure of power relative to a reference level, and is used extensively in telecommunications, acoustics, and electrical engineering. Let us estimate the hypothetical dynamic range of a 16-bit microphone record of a sinusoid at full scale. The peak rms amplitude would be

$${p}_{rms\text{}signal}=\frac{{2}^{16}}{2\sqrt{2}}\text{}.$$

All systems have quantization and system noise, and the noise can have a positive or negative bias. This is not a noise paper; for the sake of illustration, I model the system noise as oscillating around a mean of zero and alternating between −1 and 1,

$${p}_{rms\text{}noise}=\frac{{2}^{1}}{2\sqrt{2}}\text{}.$$

The theoretical dynamic range of the system in dB for a sinusoid recorded with a 16-bit microphone and sound card combination with a one-bit noise floor could be characterized by the ratio of the power

$$10\log_{10}{\left[\frac{{p}_{rms\,signal}}{{p}_{rms\,noise}}\right]}^{2}=20\log_{10}\left[{2}^{15}\right]\approx 90\ \mathrm{dB}$$

where a digital response is converted to the legacy base 10 logarithmic system. One advantage of the decibel approach is that it can be compared to the response of the human ear and other analog systems. However, analog comparisons are not necessary for many cyber-physical applications. A more natural unit for CPS is the binary logarithm

$$\log_{2}\left[\frac{{p}_{rms\,signal}}{{p}_{rms\,noise}}\right]=\log_{2}\left[{2}^{15}\right]\approx 15.0\ \mathrm{fbits}$$

where the unit fbits corresponds to floating point representation of bits. For example, in 24-bit systems, present-day quantization error is ~3 bits, leading to an effective dynamic range of ~21 fbits. Likewise, a 24-bit integer cast into a 32-bit symbol can have 8 + 3 bits of noise, and may be converted to a float that still has ~21 fbits of dynamic range.
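As a quick arithmetic check of the decibel and binary dynamic range estimates, the following minimal sketch evaluates both for the idealized noise model used here. The function name `dynamic_range` and its arguments are illustrative assumptions, not part of any referenced software.

```python
import math


def dynamic_range(word_bits, noise_bits=1):
    """Return (decibels, fbits) for a full-scale sinusoid over a noise floor.

    Hypothetical helper: the signal and noise rms amplitudes follow the
    idealized 2**bits / (2 * sqrt(2)) model used in the text.
    """
    signal_rms = 2.0 ** word_bits / (2.0 * math.sqrt(2.0))
    noise_rms = 2.0 ** noise_bits / (2.0 * math.sqrt(2.0))
    ratio = signal_rms / noise_rms
    return 20.0 * math.log10(ratio), math.log2(ratio)


# 16-bit record with a one-bit noise floor
db_16, fbits_16 = dynamic_range(16)
# 24-bit record with ~3 bits of quantization noise
db_24, fbits_24 = dynamic_range(24, noise_bits=3)

print(f"16-bit: {db_16:.1f} dB, {fbits_16:.1f} fbits")
print(f"24-bit: {db_24:.1f} dB, {fbits_24:.1f} fbits")
```

The binary measure is simply the difference between the word size and the number of noise bits, which is why it is convenient for digital records.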

Another unit that is often specified is the ½ power point of the frequency response of a filter, which defines the quality factor of that filter. This is often referred to as the −3 dB point, since $10\log_{10}\left(2\right)\approx 3\ \mathrm{dB}$. However, accurate filter bank reproductions require a clear specification of the ½ power point, and conversion from base 10 to base 2 specification can lead to computational errors. Plotting filter responses in floating point bits can be informative as it reveals the precision of the computation. Because the term fbits is awkward and there is already a precedent in information theory for using bits outside of their original definition as a binary digit, from here onwards in this paper the word bits will be used to represent either the floating point equivalent of bits or a metric for information.

Consider the communication channel capacity introduced by Shannon [11], which in its simplest form can be expressed as

$$Ch=W\log_{2}\left(\frac{Sg+Ns}{Ns}\right)$$

where $Ch$ is a measure of the differential entropy of a signal in the presence of noise, $W$ is a measure of the bandwidth, $Sg$ is representative of the power of a signal, and $Ns$ is representative of the noise power. The units of the channel capacity are in shannons per second, or bits per second, and represent the theoretical upper bound of the rate of information transfer in a communication channel. Since it is often impossible to separate noise embedded in a signal but it is often possible to construct a noise model, we can think of the ratio $(Sg+Ns)/Ns$ as a practical measure of the signal to noise ratio (SNR) of an observed signal that has been carried through a cyber-physical system or a medium.

The effective SNR and therefore the detectability of a compressed pulse (such as a wavelet) is the product of the bandwidth, the signal to noise ratio, and the time duration of a signal [12]. When using constant-Q Gabor wavelets with fractional octave (binary) bands $n$ of order $N$ and center frequency ${f}_{n}$ to process a signal in the presence of noise, the next section shows that for

$$SN{R}_{n}=\frac{N{s}_{n}+S{g}_{n}}{N{s}_{n}}=1+\frac{S{g}_{n}}{N{s}_{n}}$$

the signal detectability per band can be represented by

$$bSN{R}_{n}=\frac{1}{2}\log_{2}\left(SN{R}_{n}\right)$$

and the upper limit on the rate of information in bits per second for a band-limited pulse with center frequency ${f}_{n}$ can be estimated from

$$C{h}_{n}=\frac{{f}_{n}}{N}\,bSN{R}_{n}.$$
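The per-band expressions can be sketched numerically as follows. The function names and the example signal and noise powers are hypothetical and chosen only so that the arithmetic is easy to follow; they are not measurements.

```python
import math


def band_snr(signal_power, noise_power):
    """SNR_n = (Ns_n + Sg_n) / Ns_n = 1 + Sg_n / Ns_n."""
    return (noise_power + signal_power) / noise_power


def binary_snr(snr):
    """bSNR_n = (1/2) log2(SNR_n), in bits."""
    return 0.5 * math.log2(snr)


def band_capacity(center_frequency_hz, order, snr):
    """Ch_n = (f_n / N) bSNR_n, in bits per second."""
    return center_frequency_hz / order * binary_snr(snr)


# Hypothetical band: signal power 15, noise power 1, f_n = 60 Hz, order N = 3
snr_n = band_snr(signal_power=15.0, noise_power=1.0)
bsnr_n = binary_snr(snr_n)
ch_n = band_capacity(center_frequency_hz=60.0, order=3, snr=snr_n)

print(f"SNR_n = {snr_n}, bSNR_n = {bsnr_n} bits, Ch_n = {ch_n} bits/s")
```

Note that a band containing only noise has $SNR_n = 1$ and therefore zero bits, so the binary SNR is never negative under this definition.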

Energy and Shannon entropies using the binary log are constructed for both the wavelet coefficients and SNR in Section 2.5.

This is an algorithmic paper providing foundational methods to construct standardized Gabor wavelets within a binary framework. No materials are included or required; all the algorithms required to reproduce the results are presented, with recommendations for specific existing functions in open-source software frameworks.

Although the methods are intended to be sensor-agnostic and transportable across diverse domains, the selection of the Gabor mother wavelet does define the optimal applicability of the algorithms: the methods in this paper will work best with a transient, or a portion of a transient, that can be well represented by a superposition of Gabor wavelets. Fortunately, this covers a fairly wide range of transient signature types. The fundamental principles in this work are expandable to other wavelets as well as to four-dimensional spatiotemporal representations.

A digital time series is constructed by collecting digital measurements at discrete times separated by a nominal sample interval $\Delta {\tau}_{s}$. One may estimate a standard deviation from nominal ${\sigma}_{{\tau}_{s}}$ associated with the sample interval; when this error is a very small percent of the sample interval (e.g., parts per million) it is generally treated as a constant. Some variability in the sample rate should be expected in cyber-physical sensing systems under different conditions (temperature, battery level, power load, data throughput, etc.) even when the systems have the same hardware configurations. This can have an impact when attempting high-accuracy time synchronization. If adequate performance metrics are collected, the sample rate error can be quantified and potentially compensated by an additional time-varying correction to the clock drift.

In many scientific domains, such as astronomy and climatology, the sample interval may be greater than one second. Domains where the phenomena of interest change more rapidly use the equivalent metric of samples per second, referred to as the sample rate and often expressed in units of Hertz. The relationship between the sample interval $\Delta {\tau}_{s}$ and its standard deviation ${\sigma}_{{\tau}_{s}}$ and the sample rate ${f}_{s}$ and its associated error can be expressed as

$$\frac{1}{\Delta {\tau}_{s}+{\sigma}_{{\tau}_{s}}}=\frac{1}{\Delta {\tau}_{s}}{\left(1+\frac{{\sigma}_{{\tau}_{s}}}{\Delta {\tau}_{s}}\right)}^{-1}\approx \text{}{f}_{s}\left(1-\frac{{\sigma}_{{\tau}_{s}}}{\Delta {\tau}_{s}}\right)\text{}\mathrm{if}\text{}\frac{{\sigma}_{{\tau}_{s}}}{\Delta {\tau}_{s}}\ll 1\text{}.$$
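A minimal numerical sketch of this first-order approximation follows, assuming a hypothetical 48 kHz sensor whose sample interval error is ten parts per million. The function name and the example values are illustrative only.

```python
def perturbed_sample_rate(sample_rate_hz, interval_std_s):
    """Return (exact, first-order) sample rate for a perturbed sample interval.

    exact:       1 / (dt + sigma)
    first-order: f_s * (1 - sigma / dt), valid when sigma / dt << 1
    """
    dt = 1.0 / sample_rate_hz
    exact = 1.0 / (dt + interval_std_s)
    first_order = sample_rate_hz * (1.0 - interval_std_s / dt)
    return exact, first_order


# Hypothetical example: 48 kHz nominal rate, interval error of 10 ppm
nominal_hz = 48000.0
sigma_s = 10e-6 / nominal_hz
exact_hz, approx_hz = perturbed_sample_rate(nominal_hz, sigma_s)

print(f"exact = {exact_hz:.6f} Hz, first order = {approx_hz:.6f} Hz")
print(f"difference = {abs(exact_hz - approx_hz):.2e} Hz")
```

For errors at the parts-per-million level the two estimates agree to well below a millihertz, which supports treating the sample rate as a constant unless high-accuracy time synchronization is required.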

Although time is the primary discrete sampling parameter, system requirements are often provided as frequency specifications within the context of Fourier transforms. The nominal sample rate sets the maximum upper edge of the bandpass of the system; there should be negligible energy at the Nyquist frequency, which is half of the sample rate. The actual bandpass of a system is set by the low- and high- frequency cutoffs of a cyber-physical system, which may include the sensor response, hardware specifications, firmware and software modifications (such as anti-aliasing filtering), and data compression.

The mapping between frequency and period is simple for a continuous wave tone; the tone period is the inverse of the tone frequency. It is not so clear for transients. Following [7], a transient with a single spectral peak at a center frequency ${f}_{n}$ may be associated with a pseudo-period ${\tau}_{n}=1/{f}_{n}$. This mapping is important as the scale of wavelet representations is linearly proportional to the pseudo-period, which is also referred to as the scale period. A high-level overview of Appendix A, Appendix B, Appendix C, Appendix D, Appendix E, and Appendix F is provided in this section for ease of reference.

Constant quality factor ($Q$) bands with constant proportional bandwidth are traditionally defined as [1]

$$\frac{\Delta f}{{f}_{n}}=\frac{1}{Q}$$

where $\Delta f$ is the bandwidth centered on ${f}_{n}$. The $Q$ is a measure of the number of cycles needed to reach the ½ power point at the bandwidth edges. Appendix A shows that the bandwidth edges are well defined in fractional octave band representations of order $N$ so that the quality factor can be evaluated precisely as

$${Q}_{N}={\left[{2}^{\frac{1}{2N}}-{2}^{-\frac{1}{2N}}\right]}^{-1}.$$

From [1], and as shown in Appendix B and Appendix C, the characteristic time duration of the Gabor atom can be represented as

$${T}_{n}={M}_{N}\,{\tau}_{n}$$

where ${M}_{N}$ is a measure of the number of oscillations in the characteristic time duration of a wavelet. For efficient computation all physical times are nondimensionalized and converted to equivalent sample points by multiplying by the sample rate. If $t$ is the time in seconds, the nondimensionalized time $m$ is

$$m={f}_{s}\,t.$$

The approach is wavelet-agnostic up to this stage. Direct application of the ½ power points of the spectrum of the Gabor-Morlet wavelet (Appendix B) at the band edges (Appendix C) yields

$${M}_{N}=2\sqrt{\ln 2}\,{Q}_{N}\approx 2\sqrt{2\ln 2}\,N$$

where ${M}_{N}$ controls the duration of the wavelet to match the order’s Q. This last step can be tailored to other wavelet types to produce constant-Q variants. Adherence to the specifications in Equations (10)–(14) yields standardized and well-constrained quantized Gabor atoms.
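The relations between the order $N$, the quality factor ${Q}_{N}$, the cycle count ${M}_{N}$, and the duration ${T}_{n}$ can be sketched as below. The function names are illustrative assumptions; the formulas are those stated in this section, and the printed comparison shows how close the linear approximation $2\sqrt{2\ln 2}\,N$ is to the exact ${M}_{N}$.

```python
import math


def quality_factor(order):
    """Exact Q_N from the half-power band edges of an Nth octave band."""
    half_step = 1.0 / (2.0 * order)
    return 1.0 / (2.0 ** half_step - 2.0 ** (-half_step))


def cycles_in_window(order):
    """M_N = 2 sqrt(ln 2) Q_N."""
    return 2.0 * math.sqrt(math.log(2.0)) * quality_factor(order)


def atom_duration_s(order, center_frequency_hz):
    """T_n = M_N tau_n, with pseudo-period tau_n = 1 / f_n."""
    return cycles_in_window(order) / center_frequency_hz


for order in (1, 3, 6, 12, 24):
    m_exact = cycles_in_window(order)
    m_approx = 2.0 * math.sqrt(2.0 * math.log(2.0)) * order
    print(f"N={order:2d}  Q_N={quality_factor(order):7.3f}  "
          f"M_N={m_exact:7.3f}  approx={m_approx:7.3f}  "
          f"T_n at 1 Hz={atom_duration_s(order, 1.0):7.3f} s")
```

The approximation is exact for octave bands ($N = 1$) and drifts by about two percent at higher orders, so the exact expression is preferable when band edges must be reproduced precisely.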

Gabor [2] extended the Heisenberg principle to define the time-frequency uncertainty principle, and further proposed deconstructing signals into elementary waveforms referred to as time-frequency atoms [2,13]. These atoms provide the optimum compromise between time and frequency resolution and thus maximize information density. The Morlet wavelet [14,15], functional kin to the Gabor atom, was developed for seismic applications and is much beloved by mathematicians. Much has been said and written over the last 75 years about the merits and limitations, e.g., [16], of the Gabor atom in diverse fields of applied science, including quantum mechanics, e.g., [17], neurophysiology, e.g., [18], and radar target recognition, e.g., [19].

Consider the translation and dilation of the familiar Gabor-Morlet mother wavelet

$${\mathsf{\Psi}}_{N}\left(m\right)=\frac{1}{{\pi}^{1/4}}\exp\left(-\frac{{m}^{2}}{2}\right)\exp\left(i{M}_{N}m\right)$$

with dictionary [13]

$${\mathsf{\Psi}}_{n}\left[m-{m}^{\prime}\right]=\frac{1}{\sqrt{𝓈_{n}}}{\mathsf{\Psi}}_{N}\left(\frac{m-{m}^{\prime}}{𝓈_{n}}\right)$$

which can be fully expressed as

$${\mathsf{\Psi}}_{n}\left(m-{m}^{\prime}\right)=\frac{1}{{\pi}^{1/4}}\frac{1}{\sqrt{𝓈_{n}}}\exp\left\{-\frac{1}{2}{\left[\frac{m-{m}^{\prime}}{𝓈_{n}}\right]}^{2}\right\}\exp\left\{i{M}_{N}\left[\frac{m-{m}^{\prime}}{𝓈_{n}}\right]\right\}$$

where the mapping between the nondimensional scale $𝓈_{n}$ and the band period is

$$𝓈_{n}=\frac{{M}_{N}}{2\pi}\,{f}_{s}{\tau}_{n}.$$
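The dictionary element above can be sampled directly. The following is a minimal sketch using only the Python standard library; the function name `gabor_atom`, the truncation parameter `half_width_scales`, and the example order, center frequency, and sample rate are all assumptions made for illustration and are not prescribed by the text.

```python
import cmath
import math


def gabor_atom(order, center_frequency_hz, sample_rate_hz, half_width_scales=4.0):
    """Sample a constant-Q Gabor atom centered at m' = 0.

    Returns (scale, atom), where scale is the nondimensional scale and atom
    is a list of complex samples. The support is truncated at
    +/- half_width_scales * scale points (an illustrative choice).
    """
    half_step = 1.0 / (2.0 * order)
    q_n = 1.0 / (2.0 ** half_step - 2.0 ** (-half_step))
    m_n = 2.0 * math.sqrt(math.log(2.0)) * q_n
    # scale = (M_N / 2 pi) f_s tau_n, with tau_n = 1 / f_n
    scale = m_n / (2.0 * math.pi) * sample_rate_hz / center_frequency_hz
    half = int(math.ceil(half_width_scales * scale))
    norm = 1.0 / (math.pi ** 0.25 * math.sqrt(scale))
    atom = []
    for m in range(-half, half + 1):
        x = m / scale
        atom.append(norm * math.exp(-0.5 * x * x) * cmath.exp(1j * m_n * x))
    return scale, atom


scale, atom = gabor_atom(order=3, center_frequency_hz=1000.0, sample_rate_hz=48000.0)
energy = sum(abs(c) ** 2 for c in atom)
print(f"scale = {scale:.2f} points, {len(atom)} samples, energy = {energy:.6f}")
```

With this normalization the summed squared magnitude of the sampled atom should be close to one whenever the scale spans many sample points, which gives a simple sanity check on an implementation.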

The constant-Q Gabor atoms are constrained to the discrete set of values

$$𝓈_{n}=𝓈_{0}{2}^{\frac{n}{N}}=\frac{{M}_{N}}{2\pi}\,{f}_{s}{\tau}_{0}{2}^{\frac{n}{N}},\quad {M}_{N}=2\sqrt{\ln 2}\,{Q}_{N}$$

with quality factor

$${Q}_{N}={\left[{2}^{\frac{1}{2N}}-{2}^{-\frac{1}{2N}}\right]}^{-1}\approx \sqrt{2}N$$

defined by the ½ power points of the Fourier spectrum and the quantized order $N$. For this functional form, the wavelet admissibility condition can be represented as

$${M}_{N}^{2}\gg 1.$$

By quantizing constant-Q bands and the resulting wavelet scales it is possible to also discretize the uncertainty in time and frequency of the resulting analyses. Gaussian pulses in general [12] and Gabor atoms in particular are well-known to have the lowest time-frequency uncertainty [2,13], making them natural building blocks for uncertainty quantification. The Gabor atom has the minimal value of the Heisenberg-Gabor uncertainty (Appendix D), where the nondimensionalized temporal standard deviation ${\sigma}_{t}$ and angular frequency standard deviation ${\sigma}_{\omega}$ over all time and frequency satisfy

$${\sigma}_{{f}_{s}t}=\frac{1}{\sqrt{2}}𝓈_{n}\Rightarrow {\sigma}_{{t}_{n}}=\frac{1}{\sqrt{2}}\frac{{M}_{N}}{2\pi}{\tau}_{n}$$

$${\sigma}_{\omega /{f}_{s}}=\frac{1}{\sqrt{2}}𝓈_{n}^{-1}$$

$${\sigma}_{t}{\sigma}_{\omega}=\frac{1}{2}$$

which quantify time and frequency uncertainty discretely, minimally, and unambiguously.
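The temporal standard deviation can be checked numerically from the squared magnitude of a sampled atom, which is a Gaussian in the nondimensional time. The sketch below is a partial check under stated assumptions: the temporal spread is estimated from discrete moments, while the frequency spread is taken from the closed-form expression rather than from a transform. The function names, the truncation width, and the example parameters are hypothetical.

```python
import math


def atom_scale(order, center_frequency_hz, sample_rate_hz):
    """Nondimensional scale (M_N / 2 pi) f_s tau_n of a constant-Q Gabor atom."""
    half_step = 1.0 / (2.0 * order)
    q_n = 1.0 / (2.0 ** half_step - 2.0 ** (-half_step))
    m_n = 2.0 * math.sqrt(math.log(2.0)) * q_n
    return m_n / (2.0 * math.pi) * sample_rate_hz / center_frequency_hz


def numerical_time_std(scale, half_width_scales=6.0):
    """Standard deviation of |Psi_n|^2 over nondimensional time, from moments."""
    half = int(math.ceil(half_width_scales * scale))
    points = range(-half, half + 1)
    # |Psi_n|^2 is proportional to exp(-(m / scale)^2); constants cancel below
    weights = [math.exp(-((m / scale) ** 2)) for m in points]
    total = sum(weights)
    second_moment = sum(w * m * m for w, m in zip(weights, points))
    return math.sqrt(second_moment / total)


scale = atom_scale(order=3, center_frequency_hz=1000.0, sample_rate_hz=48000.0)
sigma_time = numerical_time_std(scale)               # numerical, in sample points
sigma_omega = 1.0 / (math.sqrt(2.0) * scale)         # closed form, radians per point
print(f"sigma_time = {sigma_time:.4f}, scale / sqrt(2) = {scale / math.sqrt(2.0):.4f}")
print(f"uncertainty product = {sigma_time * sigma_omega:.6f}")
```

A full verification would also estimate the frequency spread from the spectrum of the sampled atom; that step is omitted here to keep the example short.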

Converting to physical time with $m={f}_{s}t$ yields a more familiar Morlet representation

$${\mathsf{\Psi}}_{n}\left(t-{t}^{\prime}\right)=\frac{1}{{\left(\pi 𝓈_{n}^{2}\right)}^{1/4}}\exp\left\{-\frac{1}{2}{\left[\frac{{f}_{s}\left(t-{t}^{\prime}\right)}{𝓈_{n}}\right]}^{2}\right\}\exp\left\{i\frac{2\pi {f}_{n}}{{f}_{s}}\left[{f}_{s}\left(t-{t}^{\prime}\right)\right]\right\}$$

where the scale $𝓈_{n}$ may be readily recognized as the standard deviation of a Gaussian envelope with integration variable $m={f}_{s}t$. This is very similar to the original form proposed by Gabor [2], and makes intuitive sense as the oscillatory term is clearly exposed. However, the additional factor of ${f}_{s}$ required to nondimensionalize the numerator of the Gaussian envelope for numerical computation has indubitably been an initial source of confusion amongst some physical scientists, author included.

The recommended quanta for the Gabor atoms are nonnegative integer band numbers $n$ and the preferred orders $N$ as in [1]

$$n=0,\,1,\,2\dots ,\quad N=1,\,3,\,6,\,12,\,24\dots $$

though the special orders N = 0.75 and 1.5 are considered. The mother wavelet is uniquely defined (and can be quantized) by the order N, although it is often specified by the more accessible variable ${M}_{N}$. The mother wavelet is scale invariant. Each discrete atom in its dictionary is defined by its order N, its band number n, and a reference scale at n = 0. If the Gabor atoms remain within their quanta, there is only one degree of freedom: the reference scale. The reference scale can be set by the data acquisition system (e.g., the Nyquist frequency) or a standard reference frequency. The scale schema can also be set by a signal tuning frequency; the theoretical peak acoustic frequency for the detonation of one metric ton of TNT is used in Section 3. When integrating multi-sensor time series with different evenly and unevenly sampled data, it would be preferable to either use a standard reference frequency or time scale (e.g., 1 kHz for audio, 1 Hz for infrasound [1]) or a shared target frequency. The resulting frequency bands will be evenly spaced logarithmically to standardize and facilitate multi-sensor cross-correlations and data fusion. It is important to reinforce that the mapping from physical time scale to nondimensional scale depends on the sample rate. Specifying a nominal sample rate ${f}_{s}$ or sample interval $\Delta {\tau}_{s}=1/{f}_{s}$ as in Equation (9) permits conversion to physical time $t$ and scale pseudo-period ${\tau}_{n}$ from the wavelet parameters,

$$t=\frac{m}{{f}_{s}},\quad {\tau}_{n}=\frac{2\pi}{{M}_{N}}\frac{𝓈_{n}}{{f}_{s}},$$

and map to the physical scale center frequencies

$${f}_{n}=\frac{1}{{\tau}_{n}},\quad {\omega}_{n}=2\pi {f}_{n}.$$
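The quantized band schema can be sketched as a small table generator. This is an illustrative example only: the function name `band_table` and the choice of a 1 kHz reference frequency with an 8 kHz sample rate are assumptions, and band number n = 0 is taken as the smallest scale so that center frequencies decrease with increasing n.

```python
import math


def band_table(order, reference_frequency_hz, sample_rate_hz, number_of_bands):
    """List (n, f_n, tau_n, scale_n) for bands n = 0 .. number_of_bands - 1.

    tau_n = tau_0 2^(n/N), f_n = 1 / tau_n, scale_n = (M_N / 2 pi) f_s tau_n.
    """
    half_step = 1.0 / (2.0 * order)
    q_n = 1.0 / (2.0 ** half_step - 2.0 ** (-half_step))
    m_n = 2.0 * math.sqrt(math.log(2.0)) * q_n
    tau_0 = 1.0 / reference_frequency_hz
    bands = []
    for n in range(number_of_bands):
        tau_n = tau_0 * 2.0 ** (n / order)
        scale_n = m_n / (2.0 * math.pi) * sample_rate_hz * tau_n
        bands.append((n, 1.0 / tau_n, tau_n, scale_n))
    return bands


# Hypothetical schema: third-octave bands referenced to 1 kHz, sampled at 8 kHz
bands = band_table(order=3, reference_frequency_hz=1000.0,
                   sample_rate_hz=8000.0, number_of_bands=7)
for n, f_n, tau_n, scale_n in bands:
    print(f"n={n}  f_n={f_n:8.2f} Hz  tau_n={tau_n * 1e3:6.3f} ms  scale={scale_n:7.3f}")
```

For order N the center frequency halves and the scale doubles every N bands, which is the property that makes schemas built on a shared reference frequency directly comparable across sensors.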

It may be useful to think of the binary (base 2) order N as the quantized time and bandwidth stretch factor of the Gabor atom; as the order increases, the wavelet stretches in time and narrows in bandwidth, with each frequency band occupying a constant proportional frequency bandwidth that produces ${Q}_{N}$ oscillations at the band frequency in the time domain. Although in theory it is possible to use any integer band indexes n, for computational implementation it is practical to use only nonnegative integers to represent temporal scales [Equation (26)], with ${\tau}_{0}$ corresponding to the smallest scale and ${f}_{0}$ to the highest center frequency below the Nyquist frequency.

This paper recommends atom quantization using the well-established fixed order $N$ and quality factor ${Q}_{N}$ values of standard geometric binary intervals referred to as fractional octave bands in acoustic and infrasound applications (Table 1).

Appendix A develops a useful approximation for the quality factor ${Q}_{N}$ of order N,

$${Q}_{N}\approx \sqrt{2}N\approx 1.414\,N,\quad {M}_{N}=2\sqrt{\ln 2}\,{Q}_{N}\approx 2\sqrt{2\ln 2}\,N\approx 2.355\,N$$

with exact equivalence for octave bands at N = 1 (Table 2).

These relations are seldom made explicit for constant-Q wavelet representations, which often leads to inadvertently creative interpretations and implementations. In traditional fractional octave bands, $N$ is an integer with preferred numbers 1, 3, 6, 12, 24 and its half-power (−3 dB) band edges and center frequencies are well established so their Q can be readily computed (Table 1 and Table 2). The band spectrum will overlap at the half-power point band edges to reduce (or at least regulate) spectral leakage and improve energy estimation. Dyadic wavelets use order N = 1 and are weakly admissible (${M}_{N}^{2}\approx 5.54$); carefully handled they do lead to very sparse and fast computational implementations (e.g., [13]).

The estimate for ${Q}_{N}$ in terms of the order $N$ is useful for practical application where we wish to specify the number of oscillations ${Q}_{N}$ in a window. If one abandons the bounds of the preferred bands, one can estimate the order for a wavelet that has any number of oscillations in its support window. Once N is estimated, exact values for the center frequencies and band edges can be computed from the expressions in Appendix A. These bespoke constant-Q bands will not meet binary (factor of two) center frequency recursions with ½ power band edge overlap, but may be useful for highly customized tuning. Examples are provided in Table 3.

Consider the curious case of a single oscillation in the window, where

$$N=\frac{3}{4}=0.75,\quad Q_{0.75}=1.04,\quad M_{0.75}=2\sqrt{\ln 2}\,Q_{0.75}\approx 1.74$$

and Q is evaluated more precisely from the order N. Although intuitive and compact, the resulting wavelets are marginally admissible ($M_{0.75}^{2}\approx 3$) and produce oddly spaced, but legitimate, constant-Q frequency bands that grow rapidly and hit only every fourth standard octave every three bands. The window duration will be only 1.74 periods long and the spectral resolution of the Fourier transform will be exceedingly sparse. Adding another oscillation per window (increasing the quality factor to approximately two) would correspond to

$$N=\frac{3}{2}=1.5,\quad Q_{1.5}=2.14,\quad M_{1.5}=2\sqrt{\ln 2}\,Q_{1.5}\approx 3.56.$$

The resulting wavelets are more admissible ($M_{1.5}^{2}\approx 13$) but also produce oddly spaced constant-Q frequency bands that land on every second standard octave every three bands. Third order bands hit exact powers of two every third band and have around four oscillations per window (Appendix C). Although it is possible to force center frequency scales, if best practices for band overlap are ignored one will have a set of wavelet filter banks with substantial spectral leakage or gaps between adjacent bands, and the possibility of excessively overdetermined or underdetermined results. This is what usually happens with default parameters on most continuous or discrete wavelet transform algorithms. This paper standardizes and regulates band spacing by asserting the relationship between order, bandwidth, and duration. Since it is both silly and mathematically inadvisable (even inadmissible) to construct a wavelet with less than one oscillation in its window, it is recommended that $Q\ge 1$. This suggests a minimum order number (quantum) of N = 3/4 for stable Gabor atoms, with N = 1 yielding exact power of two (binary) bands.
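The quality factor and multiplier quoted for these fractional orders can be checked numerically. The following is a minimal sketch that assumes the closed-form expressions for ${Q}_{N}$ and ${M}_{N}$ stated later in this paper; the function names `quality_factor` and `multiplier` are illustrative and not part of any library.

```python
import math


def quality_factor(order: float) -> float:
    """Q_N = 1 / (2^(1/(2N)) - 2^(-1/(2N))) for a constant-Q band of order N."""
    half_step = 1.0 / (2.0 * order)
    return 1.0 / (2.0 ** half_step - 2.0 ** (-half_step))


def multiplier(order: float) -> float:
    """M_N = 2 sqrt(ln 2) Q_N, the nondimensional angular frequency of the atom."""
    return 2.0 * math.sqrt(math.log(2.0)) * quality_factor(order)


# Tabulate the orders discussed in the text.
for order in (0.75, 1.0, 1.5, 3.0):
    q_n, m_n = quality_factor(order), multiplier(order)
    print(f"N = {order:<4}  Q_N = {q_n:.3f}  M_N = {m_n:.3f}  M_N^2 = {m_n ** 2:.2f}")
```

Evaluating the exact expression rather than the large-N approximation ${Q}_{N}\approx \sqrt{2}N$ matters most at the small orders considered here.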

It is possible to estimate the smallest possible universal binary scale from the Planck time, the smallest measurable time scale,

$$\Delta \tau_{Planck}=10^{-43}\,\mathrm{s}\approx 2^{-142}\,\mathrm{s}.$$

Since the Planck time would be the smallest possible sample interval, the smallest oscillation that could be observed would be at the universal Nyquist period

$$\tau_{min}=2\Delta \tau_{Planck}\approx 2^{-141}\,\mathrm{s}.$$

At the other end of the timeline, the age of the universe is estimated to be 13.8 billion years, or

$$\tau_{max}\approx 2^{58}\,\mathrm{s},$$

so that all time scales in the known universe can be encompassed within ~200 temporal octave bands. Computationally speaking, this is a small range of octaves that can be spanned by 200 Gabor atoms. Earth is estimated to be ~4.6 billion years old, covering about 57 of those temporal binary bands. The oldest bones associated with Homo sapiens sapiens are ~200,000 years old and within the last 42 temporal sub-bands since Earth’s inception. The human voice for average individuals ranges between one and two octaves, and five octaves species-wide. The nondimensionalized scale $𝓈_{Nyquist}$ of the binary (N = 1) Gabor atom at the Nyquist frequency is always the same whether one uses the Planck scale or a sample rate of 48 kHz,

$$Q_{1}=\sqrt{2},\quad M_{1}=2\sqrt{2\ln 2}$$

$$𝓈_{Nyquist}=\frac{M_{1}}{2\pi}\frac{f_{s}}{f_{Nyquist}}=\frac{M_{1}}{2\pi}\frac{\tau_{min}}{\Delta \tau_{Planck}}=\frac{2\sqrt{2\ln 2}}{\pi}\approx 0.75.$$

However, it is inadvisable to make observations at the Nyquist limit, and it would be preferable to consider the starting center scale at one quarter of the sample rate, or four times the sample period. It would be possible to construct universal time scales with $\tau_{0}=2^{-140}\,\mathrm{s}$, whereas all timescales would occupy temporal sub-bands. The corresponding sensor-agnostic nondimensionalized scale would be $2\,𝓈_{Nyquist}$.
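The octave counts quoted above follow from binary logarithms of the durations in seconds. The sketch below is illustrative only: it assumes a Julian year of 365.25 days for the conversion and takes the universal Nyquist period of $2^{-141}$ s from the text; all variable names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400.0  # Julian year; an assumption for this estimate
LOG2_TAU_MIN = -141                  # log2 of the universal Nyquist period in seconds (from the text)


def log2_seconds(years: float) -> float:
    """Binary logarithm of a duration given in years, expressed in seconds."""
    return math.log2(years * SECONDS_PER_YEAR)


universe = log2_seconds(13.8e9)   # age of the universe
earth = log2_seconds(4.6e9)       # age of Earth
sapiens = log2_seconds(2.0e5)     # oldest Homo sapiens sapiens remains
octaves_total = universe - LOG2_TAU_MIN

# Nondimensional scale of the binary (N = 1) atom at the Nyquist frequency.
s_nyquist = 2.0 * math.sqrt(2.0 * math.log(2.0)) / math.pi

print(f"universe: 2^{universe:.1f} s, Earth: 2^{earth:.1f} s, sapiens: 2^{sapiens:.1f} s")
print(f"temporal octaves spanned: {octaves_total:.1f}, Nyquist scale: {s_nyquist:.3f}")
```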

A third order representation (N = 3) of all the time scales in the universe can be constructed from 600 temporal Gabor atoms. The beauty of the third order representation is that it is very close to the decimal representation, with every ten 1/3 octaves producing a decade $\left(2^{10/3}\approx 10\right)$, and thus provides a geometrically elegant compromise between ten-digit humans and binary digit machines. In addition to better meeting the admissibility condition, third order bands will contain over 99% of the information within their octave (Appendix D), making them compact temporal carriers. If the third order representation is used as the base order (N = 3), the preferred numbers are binary multiples (N = 3, 6, 12, 24 in Table 1), with a proportional elongation in the wavelet support and increase in spectral resolution.

Many software packages readily produce a Gabor-Morlet wavelet with default parameters (Appendix E). One of the most common values is ${M}_{N}=5$, which is close to order $N=2$ (Table 4). Other common values of the wavelet support correspond to ${M}_{N}=4,\text{}N=1.7$ and the more reasonable ${M}_{N}=8$ which is close to preferred order $N=3$.

Because none of these specifications correspond to standard orders, the resulting wavelets will tend to either overestimate (due to spectral leakage) or underestimate (due to spectral gaps between bands) the energy within adjacent constant-Q bands if binary center frequencies are forced, or will produce non-standard center frequencies.
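To see which order a software default implies, the relation between ${M}_{N}$ and N can be inverted exactly. This is a minimal sketch assuming the expressions ${M}_{N}=2\sqrt{\ln 2}\,{Q}_{N}$ and ${Q}_{N}={[{2}^{1/(2N)}-{2}^{-1/(2N)}]}^{-1}$ given later in this paper; the function name `order_from_multiplier` is hypothetical.

```python
import math


def order_from_multiplier(m_n: float) -> float:
    """Invert M_N = 2 sqrt(ln 2) Q_N with Q_N = 1 / (x - 1/x), x = 2^(1/(2N))."""
    q_n = m_n / (2.0 * math.sqrt(math.log(2.0)))
    # x - 1/x = 1/Q_N is a quadratic in x; keep the positive root.
    x = (1.0 / q_n + math.sqrt(1.0 / q_n ** 2 + 4.0)) / 2.0
    return 1.0 / (2.0 * math.log2(x))


# Common default multipliers mentioned in the text.
for m_n in (4.0, 5.0, 8.0):
    print(f"M_N = {m_n:.0f}  ->  N = {order_from_multiplier(m_n):.2f}")
```

None of the resulting orders is an integer, which is the point made above: forcing binary center frequencies with these defaults leaves leakage or gaps between bands.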

Although it is possible to quantize the constant-Q Gabor atoms using the order N, the quality factor Q, or the multiplier ${M}_{N}$, the order is the most logical way to define the quanta of the wavelet. Describing the proposed wavelet dictionaries of preferred orders as the quantized constant-Q Gabor atoms with binary bases and overlapping ½ power points is rather awkward, and this paper proposes referring to these constructs as quantized wavelets, quantum wavelets of order N, or Nth order Gabor atoms. Although N = 1 provides a sparse clean binary (with power of two steps in frequency) representation with the tightest windows, the admissibility condition coupled with the better reconstruction capability presented in the next section suggests that using N = 3 as the base order is preferable, with the added advantage that all subsequent preferred orders in Table 1 are binary factors of base order 3.

The continuous wavelet transform (CWT) of a function $g\left(x\right)$ is represented in [13] (Equation (1.13)) as

$$\mathcal{W}\left(g,u,𝓈\right)=\left\langle g,\mathsf{\Psi}_{u,n}\right\rangle =\int_{-\infty}^{\infty}g\left(x\right)\frac{1}{\sqrt{𝓈}}\mathsf{\Psi}^{*}\left(\frac{x-u}{𝓈}\right)dx$$

where the asterisk (*) represents the complex conjugate. The equivalent CWT for a discrete sequence of observations (or a synthetic time series) $g\left(m\right)$ is the convolution of $g$ with a scaled and translated version of $\mathsf{\Psi}\left(m\right)$. Consider the nondimensional quantum mother wavelet of order N,

$$\mathsf{\Psi}_{N}\left(m\right)=\frac{1}{\pi^{1/4}}\exp\left(-\frac{m^{2}}{2}\right)\exp\left(iM_{N}m\right),$$

and its scaled version

$$\mathsf{\Psi}_{n}\left[m\right]=\frac{1}{\sqrt{𝓈_{n}}}\mathsf{\Psi}_{N}\left(\frac{m}{𝓈_{n}}\right).$$

The discrete CWT can be expressed as

$$\mathcal{W}_{n}\left[m\right]=\sum_{m^{\prime}=0}^{Mp-1}g\left(m^{\prime}\right)\mathsf{\Psi}_{n}^{*}\left(m^{\prime}-m\right)=g\circledast \mathsf{\Psi}_{n}^{*}\left[m\right]$$

where the symbol $\circledast$ denotes a convolution [13], often computed using the discrete Fourier transform. This is comparable to the expression in [20], although their convolution has no amplitude scaling as it is corrected afterwards. The CWT coefficients ${\mathcal{W}}_{m,n}$ provide a measure of the degree of similarity between the time series and the wavelet of scale index n while translating along the time index m. While exact waveform reconstruction from the CWT is challenging (e.g., [21,22]), reference [20] provides an approximate expression for the wavelet-filtered time series $g\left({m}^{\prime}\right)$. The reconstruction filter from the Nth order Gabor atoms becomes

$$g\left[m\right]\approx \frac{\pi^{1/4}}{N}\frac{1}{C_{\delta}}\sum_{n=0}^{Np-1}\frac{Re\left\{\mathcal{W}_{n}\left[m\right]\right\}}{\sqrt{𝓈_{n}}}$$

where Re{ } denotes the real part of the coefficients and the reconstruction factor ${C}_{\delta}$ is scale independent and constant for a wavelet function with fixed ${M}_{N}$. The reconstruction factor can be estimated by comparing against known test functions. Reference [20] empirically computed a reconstruction coefficient of ${C}_{\delta}=0.776$ with ${M}_{N}=6$, and [23] provides other estimates. Numerical evaluation shows the product $N{C}_{\delta}\approx 2$, and the reconstruction approximation for the analytic (Appendix F) quantum wavelet of arbitrary order is

$$g_{\mathbb{C}}\left[m\right]\approx \frac{\pi^{1/4}}{2}\sum_{n=0}^{Np-1}\frac{\mathcal{W}_{n}\left[m\right]}{\sqrt{𝓈_{n}}}.$$

It is important to note how substantially this expression differs from the inverse discrete Fourier transform, where

$$g_{DFT}\left[m\right]=\frac{1}{\sqrt{Np}}\sum_{n=0}^{Np-1}\widehat{g}_{DFT}\left[n\right]\exp\left(j2\pi mn/Np\right)$$

and ${\widehat{g}}_{DFT}\left[n\right]$ are the Fourier coefficients. Unlike the discrete Fourier transform, the standard wavelet reconstruction does not require multiplication by the mother wavelet. For the special case where the atoms are well matched to the signal of interest, consider the sparse set of coefficients corresponding to the complex time indexes $m_{n\,\mathbb{C}\,max}$ of the maximum energy, entropy, or SNR at each scale,

$$g_{\mathbb{C}}\left[m\right]\approx \frac{\pi^{1/4}}{2}\sum_{n=0}^{Np-1}\frac{\mathcal{W}_{n}\left[m_{n\,\mathbb{C}\,max}\right]}{\sqrt{𝓈_{n}}}Re\left\{\mathsf{\Psi}_{n}\left[m-m_{n\,\mathbb{C}\,max}\right]\right\}$$

where the maximum coefficient indexes can be computed separately for real and imaginary components. This has the form of a sum over the dominant Gabor atoms for each scale. Since one is only considering the maxima in a given record window, this is a very sparse representation consisting of the coefficient and the time offset corresponding to the peak energy or entropy estimate. Numerical evaluation shows that this last expression can be used to estimate the full analytic function representation as long as reconstruction uses the complex coefficients but only the real atom function, since the time shifts in the Hilbert transform already include the $\pi /2$ time shift.
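The discrete CWT and the full (non-sparse) analytic reconstruction can be sketched with NumPy alone. This is an illustrative implementation under stated assumptions, not the code used for the figures: the scale relation $𝓈_{n}={M}_{N}{f}_{s}/(2\pi {f}_{n})$ is taken from the definitions later in this paper, the atom is truncated at four scale lengths on either side, and the test signal is a pure tone at a band center rather than the blast pulse. The names `gabor_atom`, `cwt`, and `reconstruct_analytic` are hypothetical.

```python
import numpy as np


def multiplier(order):
    """M_N = 2 sqrt(ln 2) Q_N with Q_N = 1 / (2^(1/(2N)) - 2^(-1/(2N)))."""
    q_n = 1.0 / (2.0 ** (1.0 / (2.0 * order)) - 2.0 ** (-1.0 / (2.0 * order)))
    return 2.0 * np.sqrt(np.log(2.0)) * q_n


def gabor_atom(scale, m_n, half_width=4.0):
    """Scaled atom Psi_n[m] on a symmetric grid of about +/- half_width * scale samples."""
    k_max = int(np.ceil(half_width * scale))
    k = np.arange(-k_max, k_max + 1) / scale
    return np.pi ** -0.25 * np.exp(-0.5 * k ** 2) * np.exp(1j * m_n * k) / np.sqrt(scale)


def cwt(signal, scales, m_n):
    """W_n[m] for each scale; rows are scales, columns are time samples."""
    # For this atom Psi_n^*(-m) equals Psi_n(m), so the correlation in the CWT
    # definition reduces to a plain convolution with the atom itself.
    return np.array([np.convolve(signal, gabor_atom(s, m_n), mode="same") for s in scales])


def reconstruct_analytic(coeffs, scales):
    """g_C[m] ~ (pi^(1/4) / 2) * sum_n W_n[m] / sqrt(s_n)."""
    return 0.5 * np.pi ** 0.25 * np.sum(coeffs / np.sqrt(scales)[:, None], axis=0)


order, fs, f_ref = 3, 200.0, 6.3
m_n = multiplier(order)
f_n = f_ref * 2.0 ** (np.arange(-6, 7) / order)   # third order bands, two octaves either side
scales = m_n * fs / (2.0 * np.pi * f_n)           # nondimensional scale, in samples
t = np.arange(2000) / fs
g = np.cos(2.0 * np.pi * f_ref * t)               # test tone at a band center
g_c = reconstruct_analytic(cwt(g, scales, m_n), scales)

interior = slice(500, 1500)                       # stay clear of edge effects
print("max interior error (real part):", np.max(np.abs(g_c.real[interior] - g[interior])))
```

For a tone the imaginary part of `g_c` approximates the quadrature (sine) component, which is what one expects from an analytic-signal estimate. The direct time-domain convolution is chosen for clarity; an FFT-based convolution would be the usual choice for long records.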

One advantage of the constant-Q wavelet representation is that it is possible to estimate the information content and detectability of a signal in a band by applying the same set of wavelet transforms to the signal and comparing them to the transform of a noise segment or model. Consider the definition for Shannon’s channel capacity [11], with

$$SNR_{n}=\frac{Ns_{n}+Sg_{n}}{Ns_{n}}=1+\frac{Sg_{n}}{Ns_{n}}$$

$$Ch_{n}=W\log_{2}\left(SNR_{n}\right)$$

where Sg is the wavelet-transformed signal power and Ns is the wavelet-transformed noise power in a band. Consider two possible estimates for the bandwidth W (Shannon [11] left some room for interpretation). The first estimate approximates W by the ½ power point bandwidth

$$\Delta f_{n}=\frac{f_{n}}{Q_{N}}\approx \frac{1}{\sqrt{2}}\frac{f_{n}}{N}\approx 0.7071\,\frac{f_{n}}{N}.$$

The second estimates W using the Gabor box standard deviation for the angular frequency

$$\sigma_{\omega}=\frac{1}{\sqrt{2}}\frac{\omega_{n}}{M_{N}}\approx \frac{1}{4\sqrt{\ln 2}}\frac{\omega_{n}}{N}\approx \frac{\pi}{2\sqrt{\ln 2}}\frac{f_{n}}{N}\approx 1.8867\,\frac{f_{n}}{N}$$

so that

$$\sigma_{f}=\frac{\sigma_{\omega}}{2\pi}=\frac{1}{4\sqrt{\ln 2}}\frac{f_{n}}{N}\approx 0.3003\,\frac{f_{n}}{N}.$$

Taking the average of $\Delta {f}_{n}$ and ${\sigma}_{f}$ provides a compromise between the two possible estimates, and returns a tidy factor of ~0.5,

$$Ch_{n}\approx \frac{1}{2}\frac{f_{n}}{N}\log_{2}\left(SNR_{n}\right).$$
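The averaging step can be verified with a few lines of arithmetic. The sketch below assumes the large-N approximations used in the text; `bandwidth_coefficients` and `channel_capacity` are illustrative names, and the band values in the usage line are arbitrary.

```python
import math


def bandwidth_coefficients():
    """Coefficients of f_n / N for the two bandwidth estimates and their mean."""
    half_power = 1.0 / math.sqrt(2.0)                      # Delta f_n, half-power bandwidth
    gabor_sigma = 1.0 / (4.0 * math.sqrt(math.log(2.0)))   # sigma_f, Gabor box deviation
    return half_power, gabor_sigma, 0.5 * (half_power + gabor_sigma)


def channel_capacity(f_n: float, order: float, snr: float) -> float:
    """Ch_n ~ (1/2) (f_n / N) log2(SNR_n), with SNR_n = 1 + Sg_n / Ns_n."""
    return 0.5 * f_n / order * math.log2(snr)


half_power, gabor_sigma, mean_coeff = bandwidth_coefficients()
print(f"half-power: {half_power:.4f}, Gabor sigma: {gabor_sigma:.4f}, mean: {mean_coeff:.4f}")
print("capacity at 6.3 Hz, N = 3, SNR = 4:", channel_capacity(6.3, 3, 4.0))
```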

The effective $SN{R}_{G}$ and therefore the “detectability” of a bandwidth-limited compressed pulse [12] can be represented by the product of the Gabor time-bandwidth product (Appendix C) and the signal to noise ratio

$$SNR_{G}=\sigma_{t}\,\sigma_{\omega}\times SNR_{n}.$$

Since the time-bandwidth product for the Gaussian wavelet is constant

$$\sigma_{t}\sigma_{\omega}=\frac{1}{2}$$

and the uncertainty of its Gabor box is at the minimum, the likelihood of the detection of a signal of interest in a given band $n$ is only proportional to its SNR.

Shannon’s definition of the channel capacity was intended to represent the highest theoretical transfer rate of information through an analog line. Since SNR is given in power, which is typically the square of the signal amplitude, an unscaled binary log is off by a factor of two from the original data in bits. To reconcile this definition with the original collection of a time series signal in floating point bits (fbits), I define the binary SNR to match the signal rms amplitude as well as Shannon’s units for the information rate per band $C{h}_{N,n}$ of the quantum compressed pulse as

$$bSNR_{n}=\frac{1}{2}\log_{2}\left(SNR_{n}\right)=\log_{2}\left(\sqrt{SNR_{n}}\right),\ \mathrm{fbits}$$

$$Ch_{N,n}=\frac{f_{n}}{N}\times bSNR_{n},\ \mathrm{shannons/s}=\mathrm{fbits/s}.$$

The increase in information delivery rate with increasing frequency is intuitive as more cycles are transferred per second. As the order number increases, the bandwidth narrows and so the potential information rate decreases. Less obvious is the decrease in high-frequency information with increasing distance in a lossy transmission channel. Assuming the noise power remains unchanged, the decrease in SNR with increasing scaled distance $r$ from the source origin on a lossy acoustic channel can be represented as

$$SNR=SNR_{0}\,\frac{\exp\left(-\gamma f^{2}r\right)}{r^{n_{g}}}$$

where ${n}_{g}=2$ for spherical geometric spreading in free space and ${n}_{g}=1$ for cylindrical spreading in a waveguide. The binary SNR can be represented as

$$bSNR=\left[bSNR_{0}-\frac{n_{g}}{2}\log_{2}r\right]-f^{2}r\left(\frac{\gamma}{2}\log_{2}e\right).$$

The term in brackets shows the expected reduction of one bit per doubling of distance for spherical spreading $({n}_{g}=2)$. The last term suggests the frequency dependence of the channel capacity in a lossy acoustic medium may have the general form

$$Ch_{n}\sim \alpha \left(\log_{2}r\right)f-\beta \left(r\right)f^{3}$$

so that with increasing range the optimal information transmission frequency shifts to lower frequencies. One may readily extend the binary SNR definition to the measure of relative power

$$bR=\log_{2}\left(\sqrt{\frac{S}{S_{max}}}\right)=\frac{1}{2}\log_{2}\left(\frac{S}{S_{max}}\right),\ \mathrm{fbits}$$

and the −3 dB half-power point becomes the −1/2 bit power point.
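The binary metrics above are simple to evaluate. The following sketch is illustrative: the attenuation coefficient, reference SNR, and ranges are arbitrary placeholder values, and the function names are hypothetical.

```python
import math


def bsnr(snr: float) -> float:
    """Binary SNR in fbits: half the binary log of the power SNR."""
    return 0.5 * math.log2(snr)


def snr_at_range(snr_0: float, r: float, f: float, gamma: float, n_g: int = 2) -> float:
    """Power SNR at scaled range r with absorption exp(-gamma f^2 r) and spreading r^-n_g."""
    return snr_0 * math.exp(-gamma * f ** 2 * r) / r ** n_g


def binary_ratio(power: float, power_max: float) -> float:
    """Relative power in fbits; the half-power point maps to -1/2."""
    return 0.5 * math.log2(power / power_max)


# Lossless spherical spreading loses one bit per doubling of distance.
loss_per_doubling = bsnr(snr_at_range(1024.0, 1.0, 10.0, 0.0)) - bsnr(snr_at_range(1024.0, 2.0, 10.0, 0.0))
print("bits lost per doubling of range:", loss_per_doubling)
print("half-power point in fbits:", binary_ratio(0.5, 1.0))

# With absorption (placeholder gamma), higher frequencies lose bits faster with range.
for f in (5.0, 10.0, 20.0):
    print(f, "Hz:", round(bsnr(snr_at_range(1024.0, 4.0, f, 1e-3)), 3), "fbits at r = 4")
```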

The entropy of a signal of interest can be estimated by the wavelet coefficients. A practical approach is described in [24]. The information content of each scale n at the time step m can be estimated from the wavelet energy. First estimate the complex wavelet coefficient energy from

$${E}_{m,\text{}n}={\left|Re\left\{\text{}{\mathcal{W}}_{m,n}\right\}\right|}^{2}+j\text{}{\left|Im\left\{\text{}{\mathcal{W}}_{m,n}\right\}\right|}^{2}\text{}.$$

The total energy in a given record can be estimated from

$$E=\text{}{\displaystyle \sum}_{m}{\displaystyle \sum}_{n}\sqrt{{E}_{m,\text{}n}{E}_{m,n}^{*}}\text{}.$$

The complex probability of ${\mathcal{W}}_{m,n}$ in the record is

$$p_{m,n}=\frac{E_{m,n}}{E}$$

where

$$\sum_{m}\sum_{n}\sqrt{p_{m,n}\,p_{m,n}^{*}}=1.$$

The log energy entropy (lee) per coefficient can be defined by the binary logarithm

$$e_{lee}=\log_{2}\left(p_{m,n}^{2}\right)=2\log_{2}\left(p_{m,n}\right)$$

where it should be noted that the factor of two scaling coefficient does not alter the relative weight of each coefficient. The Shannon entropy (se) per CWT coefficient is defined as

$$e_{se}=-p_{m,n}\log_{2}\left(p_{m,n}\right)$$

with corresponding complex versions that separate the real and imaginary components. These entropies can be readily evaluated to construct noise models from the lowest entropy components. If a stable noise model can be constructed from the record or from prior knowledge of the environment and transmission channel, SNR estimates can be computed and the process repeated to evaluate the dimensionless binary log of the SNR

$$bSNR_{m,n}=\frac{1}{2}\log_{2}\left(SNR_{m,n}\right)$$

and the product of the ratio and the binary ratio (RbR), an entropy-like nondimensional metric of the SNR that can be readily evaluated to identify and extract the wavelet coefficients that would be most representative of a signal of interest,

$$RbR_{m,n}=SNR_{m,n}\times bSNR_{m,n}.$$
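A minimal NumPy sketch of these per-coefficient metrics follows. It assumes the complex energy convention above (squared real part plus j times squared imaginary part), evaluates the Shannon entropy separately on the real and imaginary probabilities, and treats zero-probability entries as contributing zero entropy; these handling choices and the function names are assumptions for illustration.

```python
import numpy as np


def wavelet_entropy(coeffs):
    """Return complex probability and per-coefficient Shannon entropies (real, imaginary)."""
    energy = coeffs.real ** 2 + 1j * coeffs.imag ** 2   # E_{m,n}
    total = np.sum(np.abs(energy))                      # E = sum of sqrt(E E*)
    p = energy / total                                  # complex probability p_{m,n}

    def shannon(prob):
        # -p log2(p), with the convention that p = 0 contributes zero.
        out = np.zeros_like(prob)
        nonzero = prob > 0
        out[nonzero] = -prob[nonzero] * np.log2(prob[nonzero])
        return out

    return p, shannon(p.real), shannon(p.imag)


def snr_rbr(snr):
    """RbR = SNR x bSNR, with bSNR = (1/2) log2(SNR); zero when SNR is unity."""
    snr = np.asarray(snr, dtype=float)
    return snr * 0.5 * np.log2(snr)


rng = np.random.default_rng(0)
coeffs = rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))
p, se_real, se_imag = wavelet_entropy(coeffs)
print("sum of |p|:", np.sum(np.abs(p)))
print("RbR for SNR = 1, 4:", snr_rbr([1.0, 4.0]))
```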

The methods presented in this paper are foundational: the intention is to use the Gabor atoms as fundamental building blocks with minimal time-frequency uncertainty and high information density. These methods are illustrated and discussed in the context of a blast pressure pulse. Consider a normalized transient wave function characteristic of an explosion. Suppose one wanted to construct a sparse wavelet representation of a blast pulse with peak energy at 6.3 Hz, corresponding to the detonation of one metric ton of TNT observed at 1 km. It is known [7] that at some distance from the source this center frequency may drop by an octave (factor of two in frequency) or more, as well as become stretched out (dispersed) in time due to propagation effects. A theoretical source pressure function for the detonation of high explosives was developed in some detail in [7] with one kiloton as the case study, and is used here to construct a representative synthetic waveform for a one (metric) ton detonation. Define ${\tau}_{c}$

$$\tau_{c}=4\tau_{p},\quad f_{c}=\frac{1}{\tau_{c}},\quad \omega_{c}=2\pi f_{c}$$

as the pseudo-period of a blast pulse corresponding to the peak spectral energy at the frequency ${f}_{c}$ and angular frequency ${\omega}_{c}$, where ${\tau}_{p}$ is the time duration of the initial positive phase traditionally used in blast physics. The nondimensionalized time scale is

$$\widehat{\tau}=\frac{t}{\tau_{p}}=4\frac{t}{\tau_{c}}.$$

The form of the amplitude-normalized source pressure function for an explosive blast [7] can be represented as

$$g\left(\widehat{\tau}\right)=\left(1-\widehat{\tau}\right),\text{}0\le \widehat{\tau}\le 1$$

$$g\left(\widehat{\tau}\right)=\frac{1}{6}\left(1-\widehat{\tau}\right){\left(1+\sqrt{6}-\widehat{\tau}\right)}^{2},\quad 1<\widehat{\tau}\le 1+\sqrt{6}.$$
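The piecewise source function can be written directly in NumPy. The sketch below assumes the pulse is zero outside $0\le \widehat{\tau}\le 1+\sqrt{6}$ and samples it at 200 Hz for a 6.3 Hz target; the function name `blast_pulse` and the sampling choices are illustrative.

```python
import numpy as np

SQRT6 = np.sqrt(6.0)


def blast_pulse(tau_hat):
    """Amplitude-normalized blast source function of the scaled time tau_hat = t / tau_p."""
    tau_hat = np.asarray(tau_hat, dtype=float)
    positive = (tau_hat >= 0.0) & (tau_hat <= 1.0)            # initial positive phase
    negative = (tau_hat > 1.0) & (tau_hat <= 1.0 + SQRT6)     # negative phase and recovery
    tail = (1.0 - tau_hat) * (1.0 + SQRT6 - tau_hat) ** 2 / 6.0
    return np.where(positive, 1.0 - tau_hat, np.where(negative, tail, 0.0))


f_c = 6.3                       # peak spectral frequency in Hz
tau_p = 1.0 / (4.0 * f_c)       # positive phase duration, tau_c / 4
fs = 200.0                      # sample rate in Hz
t = np.arange(0.0, 0.5, 1.0 / fs)
pulse = blast_pulse(t / tau_p)
print("samples in the positive phase:", int(np.sum((t / tau_p) <= 1.0)))
```

On a fine grid the pulse integrates to zero and its squared integral over $\widehat{\tau}$ is about 0.473, which corresponds to the variance of roughly $0.95\,{\tau}_{c}/8$ once the time scale ${\tau}_{p}={\tau}_{c}/4$ is restored.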

This pulse has an associated analytic function $g_{\mathbb{C}}\left(\widehat{\tau}\right)$ discussed in Appendix F. Since the theoretical Hilbert transform has some unresolved issues, the numerical Hilbert transform [25] is used for comparison.

Note that the amplitude is not used in this exercise because in some cyber-physical systems, such as smartphones, the amplitude response of on-board sensors may not be known. However, sensor dynamic range is usually specified and available (e.g., int16, float32) and can be used for signal scaling relative to the full range or the noise.

The normalized pulse has zero mean (conservation of momentum) and its theoretical variance is

$${\sigma}_{p}^{2}={{\displaystyle \int}}_{-\infty}^{\infty}{g}^{2}\left(\widehat{\tau}\right)dt=0.95\frac{{\tau}_{c}}{8}$$

The complex Fourier transform $\widehat{g}\left(j\widehat{\omega}\right)$ of this pulse is

$$\widehat{g}\left(j\widehat{\omega}\right)=\frac{\pi}{2\omega_{c}}\left[\frac{1-j\widehat{\omega}-e^{-j\widehat{\omega}}}{\widehat{\omega}^{2}}+\frac{e^{-j\widehat{\omega}\left(1+\sqrt{6}\right)}}{3\widehat{\omega}^{4}}\left\{j\widehat{\omega}\sqrt{6}+3+e^{j\widehat{\omega}\sqrt{6}}\left[3\widehat{\omega}^{2}+j\widehat{\omega}2\sqrt{6}-3\right]\right\}\right]$$

where $\widehat{\omega}=\frac{\pi}{2}\frac{\omega}{{\omega}_{c}}=\frac{{\tau}_{c}}{4}\omega ={\tau}_{p}\omega $ and the peak in the spectrum is at $\omega ={\omega}_{c}$. Note there are at least two pseudoperiods of importance evident in the main blast pulse: the main spectral pseudoperiod ${\tau}_{c}$ and the positive phase pseudoperiod of $2{\tau}_{p}$. Near the source the positive phase pseudoperiod will dominate as it has the highest energy and bandwidth. With increasing distance and high-frequency attenuation the main pseudoperiod becomes more prominent and may also be downshifted in frequency [7]. However, additional scales can be introduced by reflection and refraction in the transmission channel that can induce phase shifts often modeled with Hilbert transforms (Appendix F).

The power spectra of real digital signals are usually expressed using only the positive frequencies up to the Nyquist frequency, where the unilateral spectral density $Pg\left(\widehat{\omega}\right)$ is defined as

$$Pg\left(\widehat{\omega}\right)=2{\left|\widehat{g}\left(j\widehat{\omega}\right)\right|}^{2}=2\text{}\widehat{g}\left(j\widehat{\omega}\right){\widehat{g}}^{*}\left(j\widehat{\omega}\right).\text{}$$

Since the target signature corresponds to a one tonne (1000 kg) detonation, the analysis concentrates on a target frequency of 6.3 Hz [7]. The general procedure for constructing target-tuned fractional binary bands of order N is to define a set of base 2 scales around the center or reference frequency

$$f_{c}=6.3\,\mathrm{Hz},\quad f_{j}=f_{c}\,2^{\frac{j}{N}}.$$

The upper limit is set by the Nyquist frequency, which means that the center frequency and its band edges should be below the Nyquist and the ½ power point of the anti-aliasing filter. A conservative estimate is

$$f_{j\,max}=f_{tg}\,2^{\frac{j\,max}{N}}\le \frac{f_{s}}{2}\Rightarrow j\,max\le floor\left(N\log_{2}\left[\frac{f_{s}}{2f_{tg}}\right]\right).$$

The lower limit is set by the largest data window duration $T$

$$f_{j\,min}=f_{tg}\,2^{\frac{j\,min}{N}}\ge \frac{2}{T}\Rightarrow j\,min\ge ceil\left(N\log_{2}\left[\frac{2}{Tf_{tg}}\right]\right)$$

so that the center frequencies are defined by

$$f_{j}=f_{c}\,2^{\frac{j}{N}},\quad j\in \left[j\,min,\ j\,max\right]$$

which will be sufficient information to compute the Morlet scale $𝓈_{n}$. If one must convert to a sorted, monotonically increasing pseudoperiod, let

$$\tau_{j}=\frac{1}{f_{j}},\quad \tau_{0}=min\left(\tau_{j}\right)$$

and restart the counter for the period

$$\tau_{n}=\tau_{0}\,2^{\frac{n}{N}},\quad n\in \left[0,\ j\,max-j\,min=length\left(f_{j}\right)\right].$$
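The band construction and re-indexing can be sketched in a few lines. This example assumes the target and reference frequencies coincide and uses a hypothetical 10 s record window, since the window duration is not specified here; `target_band_frequencies` is an illustrative name.

```python
import math


def target_band_frequencies(f_target: float, order: int, fs: float, duration: float):
    """Center frequencies f_j = f_target 2^(j/N) between 2/T and the Nyquist frequency."""
    j_max = math.floor(order * math.log2(fs / (2.0 * f_target)))
    j_min = math.ceil(order * math.log2(2.0 / (duration * f_target)))
    return [f_target * 2.0 ** (j / order) for j in range(j_min, j_max + 1)]


# 6.3 Hz target, third order bands, 200 Hz sample rate, hypothetical 10 s window.
freqs = target_band_frequencies(6.3, 3, 200.0, 10.0)

# Re-index as monotonically increasing pseudoperiods tau_n = tau_0 2^(n/N).
periods = sorted(1.0 / f for f in freqs)

print(f"{len(freqs)} bands from {min(freqs):.3f} Hz to {max(freqs):.2f} Hz")
print(f"shortest period tau_0 = {periods[0]:.5f} s")
```

Sorting the reciprocals sidesteps the index bookkeeping, which is one way to read the remark that re-indexing is easier done numerically than described.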

This re-indexing is much easier to do numerically than to describe algorithmically. For the purposes of illustration and demonstration, let us choose a signal frequency that exactly matches the target frequency; if this example fails there is no purpose in continuing. A sample rate of 200 Hz will be more than sufficient for this example. Gaussian noise with a standard deviation that is one bit below the signal standard deviation (factor of 1/2) is superposed, and then anti-alias filtered for all frequencies below Nyquist. The analytic function is computed numerically from the real pulse for later comparisons with the wavelet-reconstructed signal.

The CWT scalogram is computed using the complex nondimensional mother quantum wavelet of order N. The complex Gabor-Morlet wavelet in SciPy [25] is represented by the function scipy.signal.morlet2, and has the desired canonical form,

$${\mathsf{\Psi}}_{H}\left(m\right)=\frac{1}{{\pi}^{\raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$4$}\right.}}exp\left(-\frac{{m}^{2}}{2}\right)\text{}\mathrm{exp}\left(i{\mathrm{M}}_{N}m\right)$$

$$\mathsf{\Psi}_{u,n}\left(m\right)=\frac{1}{\sqrt{𝓈_{n}}}\mathsf{\Psi}_{H}\left(\frac{m-u}{𝓈_{n}}\right)$$

$$𝓈_{n}=𝓈_{0}\,2^{\frac{n}{N}}=\left[\frac{M_{N}}{2\pi}f_{s}\tau_{0}\right]2^{\frac{n}{N}}=\frac{M_{N}}{2\pi}\frac{f_{s}}{f_{n}}$$

$${T}_{n}=\left[{M}_{N}{\tau}_{0}\text{}\right]{2}^{\frac{n}{N}}=\frac{{M}_{N}}{{f}_{n}}$$

$${M}_{N}=2\sqrt{ln2}\text{}{Q}_{N}$$

$${Q}_{N}={\left[\text{}{2}^{\frac{1}{2N}}\text{}-\text{}{2}^{-\frac{1}{2N}}\right]}^{-1}\text{}.$$

The only free variables are the order N, the smallest time scale ${\tau}_{0}$, and the sample rate ${f}_{s}$. Although the nondimensionalized scale will change with the sample rate, the final results can always be returned to the physical domain frequencies ${f}_{n}$. The nominal number of points per window can be estimated from ${f}_{s}{T}_{n}$. The complex wavelet coefficients can be readily computed from the real part of the discrete version of the blast source-time function $p\left(m\right)$

$$\mathcal{W}_{n}\left[m\right]=\sum_{m^{\prime}=0}^{Mp-1}p\left(m^{\prime}\right)\mathsf{\Psi}_{n}^{*}\left(m^{\prime}-m\right)=p\circledast \mathsf{\Psi}_{n}^{*}\left[m\right]$$
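The scale and window relations listed above depend only on the order, the band frequency, and the sample rate. The sketch below evaluates them for the 6.3 Hz third order band at 200 Hz; `atom_scales` is a hypothetical helper, not a SciPy function.

```python
import numpy as np


def atom_scales(f_n, order, fs):
    """Return (nondimensional scale, window duration in s, nominal points per window)."""
    q_n = 1.0 / (2.0 ** (1.0 / (2.0 * order)) - 2.0 ** (-1.0 / (2.0 * order)))
    m_n = 2.0 * np.sqrt(np.log(2.0)) * q_n
    f_n = np.asarray(f_n, dtype=float)
    scale = m_n * fs / (2.0 * np.pi * f_n)   # scale s_n, in samples
    window = m_n / f_n                       # window duration T_n, in seconds
    return scale, window, fs * window


scale, window, points = atom_scales(6.3, 3, 200.0)
print(f"scale = {float(scale):.2f} samples, T_n = {float(window):.3f} s, points = {float(points):.0f}")
```

Changing the sample rate rescales the nondimensional scale and the point count but leaves the window duration and the physical band frequency unchanged.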

After minor conditioning, the SciPy CWT function [25] promptly invokes the convolution function. This is computationally expensive: we have turned a time series with Mp points into a complex 2[Mp x Nbands] array of band-passed waveforms. The terms wavelets and wavelet filter banks are often used interchangeably in the context of the CWT.

The wavelet-filtered reconstructed complex analytical signal can be approximated from

$$g_{\mathbb{C}\,ij}\left[m_{k}:m_{l}\right]\approx \frac{\pi^{1/4}}{2}\sum_{n=i}^{j}\frac{\mathcal{W}_{n}\left[m_{k}:m_{l}\right]}{\sqrt{𝓈_{n}}}$$

where the $i,\ j$ indexes indicate that one may choose selected scales for the reconstruction over selected time indexes ${m}_{k}:{m}_{l}$ corresponding to the wavelet coefficients that best represent a signal of interest during the time interval of relevance. The wavelet CWT coefficients for the binary band decomposition are shown in Figure 1; the CWT coefficients are scaled by the reconstruction coefficients. A comparison of the input synthetic analytic record and the analytic signal reconstruction (summed over all scales) for the octave band representation is shown in Figure 2.

The reconstruction process recovers the original dimensionality of the time series but returns its Hilbert transform, so the total dimensionality may be doubled (2Mp sample points). If only the original real signal is desired, then the dimensionality is unchanged.

The next steps estimate entropy and SNR, and consider sparse signal representation. Although binary bands are adequate for characterizing this signal, and are routinely used in the discrete wavelet transform, I take advantage of the flexibility offered by the CWT and use third order bands (N = 3) for the examples that follow. One of the benefits of third order bands is that the admissibility condition is better met and scales are recursive in powers of 2 and 10 ([1]). As presented in Appendix D, third order bands will contain over 99% of the Gabor box variance within an octave and within 80% of the full window ${T}_{n}$, reducing spectral leakage. If, in addition, one wants a factor of two accuracy in explosive yield estimates, 1/3 octave resolution is a minimum requirement. A third order band wavelet reconstruction is shown in Figure 3 and corresponds to the CWT decomposition presented in Figure 4. The wire mesh representation is the equivalent of the scalograms usually represented as color mesh plots, and illustrates the simplicity of the CWT decomposition. The primary difference between Figure 4 and Figure 5 is that the first scales the raw CWT coefficients by the reconstruction scaling, whereas Figure 5 shows the raw coefficients.

The energy probability distribution is constructed from the wavelet coefficients to estimate entropy, as discussed in the previous section. The log energy entropy looks like any other scalogram and does not add much value, but the Shannon entropy plot is interesting and well scaled (Figure 6). The peak entropy is at the scaled blast center frequency of unity, as expected.

Next a noise model is constructed to build the SNR and to establish criteria for standardized and reproducible sparse signal representation. Many are the ways to characterize noise, and few of them accurately characterize non-stationary noise over brief observation windows. An incorrect noise model can penalize the signal passband and degrade the signal SNR. For the white noise model with variance that is one bit below the signal variance, the CWT of the noise (Figure 7) shows how the high-frequency oscillations are adequately sampled whereas the low-frequency oscillations are undersampled. This leads to instability if the noise is only estimated over a brief observation record. In principle, one may build a noise model over a substantial time period to improve statistical significance under the assumption that the noise is statistically stationary. This can be a tenuous assumption in some circumstances. Noise studies are beyond the scope of this paper; the noise spectrum is flattened by using the mean of the noise coefficients to estimate the band-averaged noise level.

As anticipated, the binary SNR appears much like the log energy entropy since they are both scaled by a constant value, with the former over the band-averaged noise and the latter over the total energy. The SNR RbR, as described in the previous section, should also look very much like the entropy, except it would be zero for SNR of unity and positive for SNR > 1. The SNR RbR is shown in Figure 8 and indeed matches the Shannon entropy plot. This is encouraging; the entropy plot requires constructing an energy distribution that scales with the record, whereas the SNR requires constructing a noise model that is mostly independent of the record and should have more stability—as long as the ambient noise is approximately stationary or can at least be adequately modeled. If one is curating data for machine learning training, the entropy would be a good metric for picking and annotating possible signals as well as for refining noise models. If one is trying to trigger or detect signals operationally, the SNR may be a preferable metric since it makes no assumptions about the total energy in a record and only scales relative to a (preferably) stable noise representation.

One may use the CWT coefficient energy, the Shannon entropy, or the SNR RbR to test the feasibility of the sparse Gabor atom superposition. Suppose we use any of these Np scale × Mp time-point matrices to identify the peak contributions over the record, and define the complex time indexes as $m_{\mathbb{C}\,max}$. The quantum wavelet superposition would be expressed as

$$g_{\mathbb{C}\,ij}\left[m_{k}:m_{l}\right]\approx \frac{\pi^{1/4}}{2}\sum_{n=i}^{j}\frac{\mathcal{W}_{n}\left[m_{n\,\mathbb{C}\,max}\right]}{\sqrt{𝓈_{n}}}Re\left\{\mathsf{\Psi}_{n}\left[m_{k}:m_{l}-m_{n\,\mathbb{C}\,max}\right]\right\}$$

where the dimensionality of the representation is reduced to the complex coefficients and time indexes. Since the wavelet function can be reproduced for any time index, the time array need not be stored. In other words, if there are 20 scales, there will be 20 real coefficients and time offsets and 20 imaginary coefficients and time offsets, with total dimensionality of 4 × 20 = 80 parameters. If there is sufficient SNR and the signal is band limited it is possible to further reduce dimensionality by removing any coefficients below a specified threshold that may be fitting to noise (i.e., overfitting). Figure 9 shows the result of reconstruction from the superposition of all the top atoms of the 20 scales, and Figure 10 shows reconstruction from a sparser set of 12 scales with the highest SNR RbR. Similar results were obtained using the Shannon entropy. The Gaussian noise standard deviation for these two runs was one bit below the signal standard deviation.
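The peak-picking step that produces the sparse parameter set can be sketched as follows. This illustration uses the squared real and imaginary parts of the coefficients as the selection metric (the coefficient energy option); the entropy or SNR RbR matrices could be substituted. It covers only the extraction of coefficients and time indexes, not the atom superposition itself, and `peak_atoms` is a hypothetical name.

```python
import numpy as np


def peak_atoms(coeffs):
    """For each scale (row), return peak time indexes and coefficients,
    found separately for the real and imaginary components."""
    rows = np.arange(coeffs.shape[0])
    idx_re = np.argmax(coeffs.real ** 2, axis=1)
    idx_im = np.argmax(coeffs.imag ** 2, axis=1)
    return idx_re, coeffs.real[rows, idx_re], idx_im, coeffs.imag[rows, idx_im]


# Small synthetic coefficient matrix: 2 scales by 50 time samples.
coeffs = np.zeros((2, 50), dtype=complex)
coeffs[0, 10], coeffs[0, 20] = 2.0, -3.0j
coeffs[1, 40], coeffs[1, 5] = -4.0, 1.0j

idx_re, c_re, idx_im, c_im = peak_atoms(coeffs)
n_parameters = idx_re.size + c_re.size + idx_im.size + c_im.size
print("real peaks:", list(zip(idx_re.tolist(), c_re.tolist())))
print("imaginary peaks:", list(zip(idx_im.tolist(), c_im.tolist())))
print("stored parameters:", n_parameters)
```

The stored parameter count is four per scale, consistent with the 80 parameters quoted for 20 scales.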

Increasing the noise standard deviation by a factor of two (one bit) still permitted reconstruction from superposition (Figure 11), and increasing it by another bit also allowed atomic reconstruction (Figure 12).

There is no end to the number of sensitivity studies that can be performed; in addition to other SNR tests, shifting the peak blast frequency away from the nominal target frequency still returned a stable reconstruction. Increasing the order beyond N = 6 only worsened the fit to the target waveform, increasing dimensionality and computational cost while decreasing reconstruction fidelity. This is expected from using a wavelet that does not match the target signature.

This paper proposes a transition to binary metrics for digital data and introduces a standardized, quantized variation of the Gabor atoms with binary bases, optimal time-frequency resolution, and clear spectral energy containment. A binary entropy-like metric for the SNR is proposed and used to extract the peak coefficients to evaluate the performance of the superposition of Gabor atoms against the more traditional CWT reconstruction. Although the immediate application is the analysis of time series data collected with cyber-physical systems such as smartphones, the methods presented in this paper should be transportable to other types of digital records and can be extended to other wavelet families.

I used a synthetic pressure pulse corresponding to the detonation of one metric ton of TNT in Gaussian noise as an example, and did not include the blast amplitude as a key parameter in order to concentrate on the entropy and SNR, which are both dimensionless scaled quantities. Observations collected close to an explosion should have brief durations and a high SNR; for short pulses it is advisable to use Gabor atoms of small order (N = 1 to 6). Due to cube root yield scaling, third order bands will provide a yield resolution, and uncertainty, of a factor of two, and sixth order bands will return a yield resolution of the square root of two. In other words, the uncertainty of yield estimates obtained with the quantum wavelet would be inversely proportional to the cube root of the band order. Acceptable signal reconstructions were obtained from the CWT coefficients as well as the superposition of the peak third order Gabor atoms for the blast signature. At increasing distance from the source, the peak frequency is expected to drop [7] and the pulse disperses over time. This opens up the possibility of stable 6th and 12th order analyses with a corresponding improvement in yield resolution. Future work will concentrate on such dispersed signatures and will also consider other types of CW signatures that would be well matched to higher-order Gabor atoms.

The methods developed have the goal of providing a tunable, standardized framework for signature feature extraction that can be used for signal classification and which should be well suited for dictionary learning [13].

This paper summarizes over five years of applied research and is based upon work supported in part by the Department of Energy National Nuclear Security Administration under Award Numbers DE-NA 0002534 (CVT), DE-AC07-05ID14517 (MINOS), DE-NA0003920 (MTV), DE-NA0003921 (ETI), and the Air Force Research Laboratory under Agreement FA8750-18-2-0113.

The author is grateful to two anonymous reviewers who helped improve the manuscript, and extends warm aloha to A. Christe, B. Williams, S. Takazawa, and J. Tobin for their patience and perseverance in reviewing various drafts of the document. Many thanks to S. Pozzi, D. Chichester, A. Ericson, E. Lam, S. Leung, J. Carlo, A. Smith, A. Rangarajan, and J. Zeineddine for providing invaluable context.

The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

This report was prepared as an account of work sponsored by agencies of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof. The United States Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon.

This work builds on the Infrasonic Energy, Nth Octave (Inferno) framework [1], which has been implemented in infrasound array processing algorithms for nuclear monitoring applications [3,4]. Logarithmic constant-bandwidth bands, also referred to as proportional frequency or constant quality factor (Q) bands, are traditionally defined by their scaled bandwidth

$$\frac{\Delta f}{{f}_{n}}=\frac{{f}_{H}-{f}_{L}}{{f}_{n}}=\frac{1}{Q}$$

where ${f}_{n}$ is the center frequency of band number n and ${f}_{H}$ and ${f}_{L}$ are referred to as the upper and lower band edge frequency, respectively. Defining the center, upper, and lower band edge periods ${\tau}_{n}$, ${\tau}_{H}$, and ${\tau}_{L}$ as

$${\tau}_{n}=\frac{1}{{f}_{n}},\quad {\tau}_{H}=\frac{1}{{f}_{L}},\quad {\tau}_{L}=\frac{1}{{f}_{H}},$$

the scaled bandwidth becomes

$$\frac{\Delta \tau}{{\tau}_{n}}=\frac{{\tau}_{H}-{\tau}_{L}}{{\tau}_{n}}=\frac{\Delta f}{{f}_{n}}=\frac{\Delta \omega}{{\omega}_{n}}=\frac{1}{Q}.$$

In this section we generalize the constant-Q framework to the logarithmic discretization of evaluation intervals relative to a given reference scale and base. For a given reference scale ${\tau}_{0}$, which could be time, frequency, spatial length, wavenumber, or any other useful metric, we define a logarithmic scale base G > 1 and center scale ${\tau}_{n}$ as

$$\frac{{\tau}_{n}}{{\tau}_{0}}={G}^{\frac{n}{N}}$$

where n is the band number and N is the band order, subject to the constraints $n\in \mathbb{Z},\text{}N\ge 1$.

The natural base for both contemporary and quantum computers is base 2, and analysis windows with powers of two are recommended for complex computations at large scales. Many efficient algorithms are based on binary (base two) filter banks. Selecting G = 2 yields

$$\frac{{\tau}_{n}}{{\tau}_{0}}={2}^{\frac{n}{N}},\frac{{\tau}_{H}}{{\tau}_{n}}={2}^{\frac{1}{2N}},\frac{{\tau}_{L}}{{\tau}_{n}}={2}^{-\frac{1}{2N}},\frac{{\tau}_{H}{\tau}_{L}}{{\tau}_{n}^{2}}=1$$

$${Q}_{N}={\left[\text{}{2}^{\frac{1}{2N}}\text{}-\text{}{2}^{-\frac{1}{2N}}\right]}^{-1}.$$

Note that center and band edge scales attached to a given band n change with the order N, reference scale ${\tau}_{0}$, and the reference base G. If the reference scale and base are standardized, all bands are invariant. For example, the concert A pitch standard is fixed at 440 Hz and may be used to tune other instruments anywhere and at any time.

The next step substantially simplifies the estimation of constant-Q bands at the cost of a computational error of about 2%. To the author’s knowledge, this is the first time this expression has been presented (and he would be most grateful to be informed otherwise). Numerical evaluation shows that

$$\underset{N\to \infty}{\mathrm{lim}}\frac{{Q}_{N}}{{Q}_{1}}=\underset{N\to \infty}{\mathrm{lim}}\left(\text{}{G}^{\frac{1}{2}}\text{}-\text{}{G}^{-\frac{1}{2}}\right){\left(\text{}{G}^{\frac{1}{2N}}\text{}-\text{}{G}^{-\frac{1}{2N}}\right)}^{-1}\approx N\frac{G-1}{\sqrt{G}ln\left(G\right)}\approx \left(1.02\text{}\right)N\approx N$$

$${Q}_{N}\approx N{Q}_{1}=N\left[\frac{\sqrt{G}}{G-1}\right]$$
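The size of the error introduced by the linear estimate can be checked numerically. The short sketch below compares the exact band-edge expression for ${Q}_{N}$ with $N{Q}_{1}$ for base 2; the function names `q_exact` and `q_approx` are illustrative choices and do not come from the paper.

```python
import math

def q_exact(order, base=2.0):
    """Quality factor Q_N computed from the band-edge ratio G**(1/(2N))."""
    half_step = base ** (1.0 / (2.0 * order))
    return 1.0 / (half_step - 1.0 / half_step)

def q_approx(order, base=2.0):
    """Linear estimate Q_N ~ N * Q_1 = N * sqrt(G) / (G - 1)."""
    return order * math.sqrt(base) / (base - 1.0)

# Compare the two expressions for the standard band orders.
for order in (1, 3, 6, 12, 24):
    exact = q_exact(order)
    approx = q_approx(order)
    print(f"N={order:2d}  exact={exact:8.4f}  approx={approx:8.4f}  "
          f"ratio={exact / approx:.4f}")
```

For N = 1 the two expressions coincide, and the ratio approaches the factor of about 1.02 quoted above as N grows.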

The center frequencies and band edges, and thus the quality factor, of traditional fractional octave bands are well known and can be readily computed for all the standard bands. The primary value of the expression for ${Q}_{N}$ is that it provides a simple, explicit estimate of the relation between the quality factor and the band order, which in turn permits an estimate of the support window duration for a given wavelet in terms of the band order. Numerical inspection shows that for most practical applications with $G=2\approx {10}^{\frac{3}{10}}$, even when $N$ is not an integer, we can use the expression

$${Q}_{N}=\frac{{f}_{n}}{\Delta {f}_{n}}\approx \sqrt{2}N$$

to estimate the relationship between the band order and the quality factor.

Although the center frequency is traditionally defined as the geometric mean of the band edges, the ½ power spectral points at the band edges are symmetric only around the arithmetic mean of the band edges. The relation between the arithmetic mean ${f}_{na}=\left({f}_{L}+{f}_{H}\right)/2$ and the geometric mean ${f}_{ng}=\sqrt{{f}_{L}{f}_{H}}$ of the center frequency of fractional binary bands is

$$\frac{{f}_{na}}{{f}_{ng}}\cong \sqrt{1+\frac{1}{8{N}^{2}}}\approx 1+\frac{1}{16{N}^{2}}$$

where the approximation uses the binomial expansion. The arithmetic and geometric center frequencies are close to each other, and for fractional octave bands (N > 1) they converge further as N increases. However, the band edge power levels at the half band width $\Delta {f}_{n}/2$ should be considered to be relative to the arithmetic mean rather than the geometric mean. In general practice it is easier to use the arithmetic frequency as ${f}_{n}$, with the understanding that the fractional octave specifications are defined by geometric scaling.
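As a quick numerical illustration of how close the two center frequencies are, the sketch below evaluates the ratio of arithmetic to geometric mean directly from the base-2 band edges and prints it next to the binomial estimate. The helper name `mean_ratio` is hypothetical.

```python
import math

def mean_ratio(order):
    """Arithmetic over geometric center frequency for base-2 bands of a given order."""
    f_high = 2.0 ** (1.0 / (2.0 * order))  # upper edge, in units of the geometric mean
    f_low = 1.0 / f_high                   # lower edge, in units of the geometric mean
    return 0.5 * (f_low + f_high)          # the geometric mean is 1 by construction

for order in (1, 3, 6, 12):
    estimate = 1.0 + 1.0 / (16.0 * order ** 2)
    print(f"N={order:2d}  ratio={mean_ratio(order):.6f}  estimate={estimate:.6f}")
```

Under these assumptions the two means differ by about 6% for octave bands and by well under 1% for third octave bands and above.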

As an extension of the Inferno framework [1], the nominal duration of the Gabor atom window ${T}_{n}$ may be defined as a multiple ${M}_{N}$ of the scale,

$${T}_{n}\left(N,n\right)\stackrel{\scriptscriptstyle\mathrm{def}}{=}{M}_{N}{\tau}_{n}={M}_{N}{\tau}_{0}{G}^{\frac{n}{N}}$$

where the scale multiplier ${M}_{N}$ is set by the half power points of the wavelet. Traditional constant-Q frameworks in acoustics and music applications match the 12-tone equal temperament system ($N$ = 12) for $G=2$ or $G={10}^{\frac{3}{10}}\approx 2$ and are consistent with the Renard series recommended in ISO 3 for $N$ = 1, 3, 6, 12, 24.

Different disciplines call the same things different names; many of the challenges in present-day data science are due to divergent lexicon and the diversity of applications specific to each field. The idea of using a windowed sinusoid as a basis function for signal representation was developed in detail in Gabor’s [2] landmark paper, where he also introduced the time-frequency uncertainty principle. Gabor’s atoms were further developed by Grossmann and Morlet [14] and Goupillaud et al. [15] (amongst others), who formalized and popularized what we now know as wavelet transforms. Mallat [13] presents a lucid overview of the complementary nature of Fourier and wavelet representations in his Wavelet Tour of Signal Processing; the serious student would be wise to consider it required reading.

The Gabor wavelet is a special case of a wavelet-modulated window ([13] Equations (4.60)–(4.62)) and is representative of a bandwidth-limited compressed pulse [12]. For a physical scientist, its most intuitive form is

$$\mathsf{\Psi}\left(x\right)=\frac{1}{{\left[\pi {\sigma}^{2}\right]}^{1/4}}\exp\left(-\frac{{x}^{2}}{2{\sigma}^{2}}\right)\exp\left(i{\eta}_{n}x\right),$$

representing a sinusoid with scaled time $x={f}_{s}t$ and scaled angular frequency ${\eta}_{n}$ (or linear space and wavenumber) modulated by a Gaussian window with standard deviation $\sigma $. Comparison with the canonical expression

$${\mathsf{\Psi}}_{n}\left(x\right)=\frac{1}{{\left(\pi {𝓈}_{n}^{2}\right)}^{1/4}}\exp\left\{-\frac{1}{2}{\left[\frac{x}{{𝓈}_{n}}\right]}^{2}\right\}\exp\left\{i\left[\frac{2\pi {f}_{n}}{{f}_{s}}\right]x\right\}$$

shows that the scaled angular frequency and standard deviation are

$${\eta}_{n}=\frac{2\pi {f}_{n}}{{f}_{s}},\quad \sigma ={𝓈}_{n},\quad \sigma \eta ={M}_{N}\frac{f}{{f}_{n}}$$

The Fourier transform of the mother wavelet is

$$\widehat{\mathsf{\Psi}}\left(\eta \right)={\left[4\pi {\sigma}^{2}\right]}^{1/4}\exp\left\{-\frac{1}{2}{\sigma}^{2}{\left[\eta -{\eta}_{n}\right]}^{2}\right\}={\left[4\pi {𝓈}_{n}^{2}\right]}^{1/4}\exp\left\{-\frac{1}{2}{M}_{N}^{2}{\left[\frac{f}{{f}_{n}}-1\right]}^{2}\right\},$$

the wavelet has unit second moment

$${\displaystyle \int}_{-\infty}^{\infty}\mathsf{\Psi}\left(x\right){\mathsf{\Psi}}^{*}\left(x\right)dx=1,$$

and its first moment vanishes in the limit

$${\displaystyle \int}_{-\infty}^{\infty}\mathsf{\Psi}\left(t\right)dt\to 0\quad \mathrm{for}\quad {\sigma}^{2}{\eta}_{c}^{2}\gg 1.$$

Another important representation of the Gabor wavelet [26,27] is

$$\psi ={\left(4\pi {\sigma}^{2}\right)}^{-\frac{1}{4}}\mathsf{\Psi}$$

$$\psi \left(x\right)=\frac{1}{{\left[2\pi {\sigma}^{2}\right]}^{\frac{1}{2}}}\exp\left\{-\frac{{x}^{2}}{2{\sigma}^{2}}\right\}\exp\left\{i{\eta}_{\mathrm{c}}x\right\}$$

with the advantage that its Fourier transform

$$\widehat{\psi}\left(\eta \right)=\exp\left\{-\frac{1}{2}{\sigma}^{2}{\left[\eta -{\eta}_{\mathrm{c}}\right]}^{2}\right\}=\exp\left\{-\frac{1}{2}{M}_{N}^{2}{\left[\frac{f}{{f}_{n}}-1\right]}^{2}\right\}$$

has a peak amplitude of unity and yields equal-amplitude filter banks.

The Inferno framework was developed with the introduction of multiresolution array processing in the field of infrasound. The time duration of an analysis window at a specific period is represented as

$${T}_{n}={M}_{N}\text{}{\tau}_{n}\text{}.$$

This time window generally sets the temporal resolution of the resulting data products. In the case of the STFT, the fixed-duration analysis window can be referred to as the window of integration. In other words, the integration window ${T}_{n}$ is defined as a multiple ${M}_{N}$ of the pseudo period. This window immediately constrains the lowest frequency ${f}_{min}$ that can be represented and the resolution of a spectral representation,

$${f}_{min}=\frac{1}{{T}_{n}}\text{}.$$

The upper bandwidth of the analysis window can be set by the Nyquist frequency, which is half of the sampling frequency of the digital time series. In practice the upper bandwidth is close to one quarter of Nyquist. Although this representation is simple and tidy, it is not particularly informative. A more useful representation of window duration is the number of wavelet oscillations in the window, which can be estimated from the quality factor ${Q}_{N}$ of the wave function. As presented in Appendix C, the relation between the scale multiplier ${M}_{N}$ and the quality factor can be estimated by the ½ power (−3 dB, or half bit) points on the power spectrum,

$${M}_{N}=2\sqrt{ln2}\text{}{Q}_{N}\text{}.$$
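To make the window definition concrete, the sketch below converts a band order, center frequency, and sample rate into the scale multiplier ${M}_{N}$ and the approximate number of samples in ${T}_{n}={M}_{N}{\tau}_{n}$. It assumes base 2 and the exact band-edge expression for ${Q}_{N}$; the function names and the example values (third order, 1 Hz, 100 Hz sampling) are hypothetical.

```python
import math

def scale_multiplier(order):
    """M_N = 2 sqrt(ln 2) Q_N, with Q_N taken from the base-2 band edges."""
    q_n = 1.0 / (2.0 ** (1.0 / (2.0 * order)) - 2.0 ** (-1.0 / (2.0 * order)))
    return 2.0 * math.sqrt(math.log(2.0)) * q_n

def window_points(order, center_hz, sample_rate_hz):
    """Approximate number of samples spanned by T_n = M_N / f_n."""
    return round(scale_multiplier(order) * sample_rate_hz / center_hz)

# Hypothetical example: third order band centered at 1 Hz, sampled at 100 Hz.
print(scale_multiplier(3), window_points(3, 1.0, 100.0))
```

In this example the window spans a little over seven periods of the center frequency.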

The wavelet admissibility condition for this wavelet is equivalent to the zero mean, or

$${M}_{N}^{2}\gg 1$$

which is essentially met by the standard bands presented in Table 1. Although traditionally the Nth octave frequencies are represented by the geometric mean of the band edge frequencies (Appendix A), in the evaluation of spectral power losses it is important to use the arithmetic mean for ${f}_{n}$, which would be centered in the bandwidth $\Delta {f}_{n}$ in linear frequency space. Since the ratios of the arithmetic and geometric means are constant and set by the band order N, the geometric scaling is still preserved.

The canonical form for computational evaluation is:

$${\mathsf{\Psi}}_{n}\left(x-{x}^{\prime}\right)=\frac{1}{{\pi}^{1/4}}\frac{1}{\sqrt{{𝓈}_{n}}}\exp\left\{-\frac{1}{2}{\left[\frac{x-{x}^{\prime}}{{𝓈}_{n}}\right]}^{2}\right\}\exp\left\{i{M}_{N}\left[\frac{x-{x}^{\prime}}{{𝓈}_{n}}\right]\right\}.$$
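A minimal sketch of how the canonical form might be sampled is given below. It assumes base-2 bands, a scale in samples of ${M}_{N}{f}_{s}/(2\pi {f}_{n})$, a window of about ${T}_{n}$ centered on ${x}^{\prime}=0$, and hypothetical example values; it is not the author's implementation. The discrete energy sum is used as a check on the normalization.

```python
import cmath
import math

def gabor_atom(order, center_hz, sample_rate_hz):
    """Sample the canonical atom over a window of about T_n centered at x' = 0."""
    q_n = 1.0 / (2.0 ** (1.0 / (2.0 * order)) - 2.0 ** (-1.0 / (2.0 * order)))
    m_n = 2.0 * math.sqrt(math.log(2.0)) * q_n                   # scale multiplier M_N
    scale = m_n * sample_rate_hz / (2.0 * math.pi * center_hz)   # scale in samples
    half = int(round(0.5 * m_n * sample_rate_hz / center_hz))    # samples in T_n / 2
    norm = 1.0 / (math.pi ** 0.25 * math.sqrt(scale))
    atom = []
    for x in range(-half, half + 1):
        u = x / scale
        # Gaussian envelope times complex carrier with phase M_N * u.
        atom.append(norm * math.exp(-0.5 * u * u) * cmath.exp(1j * m_n * u))
    return atom

# Hypothetical example: third order atom at 1 Hz, sampled at 100 Hz.
atom = gabor_atom(order=3, center_hz=1.0, sample_rate_hz=100.0)
energy = sum(abs(value) ** 2 for value in atom)
print(len(atom), energy)
```

With this normalization the summed squared magnitude should be close to one, since the window ${T}_{n}$ holds nearly all of the atom's power.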

The second b-type form has a different structure

$${\psi}_{n}\left(x\right)={\mathsf{\Psi}}_{{x}^{\prime},n}\left(x\right){\left(4\pi \right)}^{-\frac{1}{4}}{𝓈}_{n}^{-\frac{1}{2}}$$

$${\psi}_{n}\left(x-{x}^{\prime}\right)={\left(2\pi \right)}^{-\frac{1}{2}}{𝓈}_{n}^{-1}\exp\left\{-\frac{1}{2}{\left[\frac{x-{x}^{\prime}}{{𝓈}_{n}}\right]}^{2}\right\}\exp\left\{i2\pi \frac{{f}_{n}}{{f}_{s}}\left(x-{x}^{\prime}\right)\right\}$$

applying

$${𝓈}_{n}={𝓈}_{0}{2}^{\frac{n}{N}},$$

yields

$${\psi}_{n}\left(x-{x}^{\prime}\right)={\left(2\pi {𝓈}_{0}^{2}\right)}^{-\frac{1}{2}}{\left[{s}_{n}\right]}^{-1}\exp\left\{-\frac{1}{2{𝓈}_{0}^{2}}{\left[\frac{x-{x}^{\prime}}{{s}_{n}}\right]}^{2}\right\}\exp\left\{i\frac{{M}_{N}}{{𝓈}_{0}}\left[\frac{x-{x}^{\prime}}{{s}_{n}}\right]\right\}$$

which has the form

$${\psi}_{N}\left(\mu \right)=\frac{1}{\sqrt{\pi b}}\exp\left\{-\frac{{\mu}^{2}}{b}\right\}\exp\left\{i\frac{{M}_{N}}{{𝓈}_{0}}\mu \right\}$$

$${\psi}_{n}\left(\mu \right)=\frac{1}{{s}_{n}}{\psi}_{N}\left(\frac{\mu -{\mu}^{\prime}}{{s}_{n}}\right)$$

with

$${s}_{n}=\frac{{𝓈}_{n}}{{𝓈}_{0}}={2}^{\frac{n}{N}},\quad n\ge 0,\quad {𝓈}_{0}={M}_{N}\frac{{f}_{s}{\tau}_{0}}{2\pi}$$

$$b=2{𝓈}_{0}^{2}=2{\left[{M}_{N}\frac{{f}_{s}{\tau}_{0}}{2\pi}\right]}^{2}.$$

Note that since

$$b=8\ln 2{\left(\frac{{f}_{s}}{\mathsf{\Delta}{\omega}_{0}}\right)}^{2}$$

the “bandwidth” $b$ is inversely proportional to the square of the actual bandwidth of the highest-frequency band.

The power spectral density of the Gabor wavelet is

$${\widehat{\mathsf{\Psi}}}_{n}^{2}\left(f\right)={\left[4\pi {𝓈}_{n}^{2}\right]}^{1/2}\exp\left\{-{M}_{N}^{2}{\left[\frac{f-{f}_{n}}{{f}_{n}}\right]}^{2}\right\},$$

and its value at the band edges relative to the peak is

$$\frac{{\widehat{\mathsf{\Psi}}}_{u,n}^{2}\left({f}_{n}\pm \Delta {f}_{n}/2\right)}{{\widehat{\mathsf{\Psi}}}_{u,n}^{2}\left({f}_{n}\right)}=\exp\left\{-{M}_{N}^{2}{\left[\frac{\Delta {f}_{n}}{2{f}_{n}}\right]}^{2}\right\}=\exp\left\{-{\left[\frac{{M}_{N}}{2{Q}_{N}}\right]}^{2}\right\}=\frac{1}{Y}$$

where Y is the fractional power loss. There exist various definitions of the quality factor of a system. This paper defines ${Q}_{N}$ by 1/2 of the spectral power relative to the peak spectral power, where $Y=2$. Therefore, for the Gabor wavelet,

$${M}_{N}=2\sqrt{\ln 2}\,{Q}_{N}.$$

Consider the decay of the spectrum with distance $\delta $ from the peak frequency, measured in multiples of the half bandwidth $\Delta {f}_{n}/2$,

$$\frac{{\widehat{\mathsf{\Psi}}}^{2}{}_{u,n}\left({f}_{n}+\text{}\delta \Delta {f}_{n}/2\right)}{{\widehat{\mathsf{\Psi}}}^{2}{}_{u,n}\left({f}_{n}\right)}\text{}=exp\left\{-{\left[\frac{\delta {M}_{N}}{2{Q}_{N}}\right]}^{2}\right\}=exp\left\{-{\left[\delta \sqrt{ln2}\right]}^{2}\right\}={2}^{-{\delta}^{2}}.$$

The loss in dBs and binary bits can be expressed as

$$dB=\text{}10\ast lo{g}_{10}({2}^{-{\delta}^{2}})=\text{}-{\delta}^{2}10\ast {\mathrm{log}}_{10}\left(2\right)\approx -3{\delta}^{2}$$

$$bR=\text{}\frac{1}{2}lo{g}_{2}({2}^{-{\delta}^{2}})=\text{}\frac{-{\delta}^{2}}{2}\text{}.$$

There is a loss of 3 dB, 12 dB, 27 dB, and 48 dB, and a binary power loss of ½, 2, 4.5, and 8 fbits, for integer multiples of the band edge $\delta =1,\text{}2,\text{}3,\text{}4$, respectively.
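The quoted losses follow directly from the ${2}^{-{\delta}^{2}}$ decay. A small sketch, with the hypothetical helper `band_edge_loss`, reproduces them:

```python
import math

def band_edge_loss(delta):
    """Spectral power loss at delta half-bandwidths from the peak, in dB and in bits."""
    ratio = 2.0 ** (-delta ** 2)           # power relative to the spectral peak
    loss_db = 10.0 * math.log10(ratio)     # decibels
    loss_bits = 0.5 * math.log2(ratio)     # half of log2 of the power ratio
    return loss_db, loss_bits

for delta in (1, 2, 3, 4):
    loss_db, loss_bits = band_edge_loss(delta)
    print(f"delta={delta}  {loss_db:7.2f} dB  {loss_bits:5.1f} bits")
```

The signs are negative because both quantities are expressed as levels relative to the peak.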

It is worth considering an alternate definition for the quality factor of an oscillator. Consider the time required for the amplitude to drop to 1/e of its peak value. In the case of the Quantum wavelet this is set by the Gaussian envelope, and this particular definition is best suited for the real part of the wavelet which is symmetric about the origin. By applying this definition,

$$\exp\left\{-\frac{1}{2}{\left[\frac{x}{{𝓈}_{n}}\right]}^{2}\right\}=\exp\left\{-\frac{1}{2}{\left[\frac{{f}_{s}{\tau}_{e}}{{𝓈}_{n}}\right]}^{2}\right\}=\exp\left\{-\frac{1}{2}{\left[\frac{{\omega}_{n}{\tau}_{e}}{{M}_{N}}\right]}^{2}\right\}=\exp\left\{-1\right\}$$

$${\tau}_{e}=\frac{\sqrt{2}}{\pi}\frac{{T}_{n}}{2}\approx 0.45\frac{{T}_{n}}{2}\text{}.$$

Since the wavelet is symmetric, this states that the portion of the wavelet contained within a span of $2{\tau}_{e}$ centered in the window has an amplitude above 1/e of the peak. The quality factor associated with this type of oscillator is

$${Q}_{e}=\frac{{\omega}_{n}{\tau}_{e}}{2}=\frac{{M}_{N}}{\sqrt{2}}$$

and comparison with the half power point quality factor shows

$${Q}_{e}=\sqrt{2\ln 2}\,{Q}_{N}\approx 1.1774\,{Q}_{N}$$

so they are sufficiently close to each other to be equivalent for descriptive purposes. The time duration of the quantum wavelet is defined by

$${T}_{n}={M}_{N}{\tau}_{n}=2\sqrt{\ln 2}\,{Q}_{N}{\tau}_{n}$$

where ${Q}_{N}\approx {Q}_{e}$ can be interpreted as the number of oscillations in a little less than half of the total window ${T}_{n}$ with amplitude above 1/e of the maximum amplitude. The remaining half of the window is useful to allow the wavelet to settle down and meet the desirable condition of a vanishing first moment.

Practical implementations of Gabor wavelets and their variants often have to make some compromises in the application of the wavelet duration ${T}_{n}$, especially if the window is required to be a power of two. Direct integration of the wavelet power over the window ${T}_{n}$ shows that it contains 99.999% of all the power. Integration over $2{\tau}_{e}$ will be insufficient (Appendix D). However, there exists a third quality factor defined by

$$\exp\left\{-\frac{1}{2}{\left[\frac{{\omega}_{n}{\tau}_{\pi}}{{M}_{N}}\right]}^{2}\right\}=\exp\left\{-\pi \right\}$$

$${Q}_{\pi}=\frac{{\omega}_{n}{\tau}_{\pi}}{2}$$

where

$${Q}_{\pi}=\sqrt{\pi}{Q}_{e}\approx 1.7724\,{Q}_{e}$$

$${\tau}_{\pi}=\sqrt{\pi}{\tau}_{e}=\sqrt{\frac{2}{\pi}}\frac{{T}_{n}}{2}\approx 0.7978\frac{{T}_{n}}{2}.$$

In other words, $2{\tau}_{\pi}$ encompasses ~80% of the window, and integration of the wavelet power over $2{\tau}_{\pi}$ returns 99.96% of the total power. Therefore $2{\tau}_{\pi}=0.8{T}_{n}$ may be a reasonable lower bound for the wavelet duration. This is further considered in Appendix D.
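The quoted power fractions can be reproduced with the error function. Because the atom power is proportional to $\exp\{-{(x/𝓈_{n})}^{2}\}$ and the window edge ${T}_{n}/2$ corresponds to $x/𝓈_{n}=\pi$, the fraction of power inside a centered span of $\epsilon {T}_{n}$ is $\mathrm{erf}(\epsilon \pi )$. The sketch below applies this relation; the helper name `power_fraction` is hypothetical.

```python
import math

def power_fraction(window_fraction):
    """Fraction of atom power inside a centered span of window_fraction * T_n."""
    return math.erf(window_fraction * math.pi)

spans = {
    "T_n": 1.0,
    "2 tau_pi": math.sqrt(2.0 / math.pi),  # about 0.80 of T_n
    "2 tau_e": math.sqrt(2.0) / math.pi,   # about 0.45 of T_n
}
for label, fraction in spans.items():
    print(f"{label:9s} {fraction:.4f} T_n  power={power_fraction(fraction):.5f}")
```

The span $2{\tau}_{e}$ retains roughly 95% of the power, which is consistent with the statement that integration over $2{\tau}_{e}$ is insufficient.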

Gabor introduced the time-frequency uncertainty principle in his landmark paper [2]. It is not possible to observe for all time and reach zero frequency. It is also impossible to sample infinitely fast and reach infinite frequency. All observations require a restriction in the observation time and the observation rate, and this places hard limits on the observable bandwidth of a process. The fundamental discretization interval scale invokes the Gabor uncertainty principle, which states the time and period of a signal cannot be known exactly but can be contained inside the box defined by the temporal and frequency variance of the probability distribution of the wave function.

This section follows the generalized mathematical formalism of ([13], Section 2.3.2, Uncertainty Principle). As in [13] and [7], the Fourier transform pair used in this work is

$$\widehat{f}\left(\omega \right)={\displaystyle \int}_{-\infty}^{\infty}f\left(t\right){e}^{-j\omega t}dt$$

$$f\left(t\right)=\frac{1}{2\pi}{\displaystyle \int}_{-\infty}^{\infty}\widehat{f}\left(\omega \right){e}^{j\omega t}d\omega ,$$

where $\widehat{f}\left(\omega \right)$ and $f\left(t\right)$ may be complex, and $t$ and $\omega $ are the nondimensionalized time and angular frequency (Equation (13)). The Parseval-Plancherel identity asserts that

$${\Vert f\Vert}^{2}={\displaystyle \int}_{-\infty}^{\infty}{\left|f\left(t\right)\right|}^{2}dt=\frac{1}{2\pi}{\displaystyle \int}_{-\infty}^{\infty}{\left|\widehat{f}\left(\omega \right)\right|}^{2}d\omega =\frac{1}{2\pi}{\Vert \widehat{f}\Vert}^{2}$$

where

$${\left|f\right|}^{2}=f\cdot {f}^{*}$$

and the asterisk denotes complex conjugation. A related identity is routinely used in Fourier and wavelet analyses and in the application of filter banks:

$${\displaystyle \int}_{-\infty}^{\infty}f\left(t\right){g}^{*}\left(t\right)dt=\frac{1}{2\pi}{\displaystyle \int}_{-\infty}^{\infty}\widehat{f}\left(\omega \right){\widehat{g}}^{*}\left(\omega \right)d\omega .$$

The Gabor uncertainty principle constrains uncertainty to the Gabor box defined by the variance in time and frequency. It is equivalent to the Heisenberg uncertainty principle for position and momentum extended to time and frequency, or space and wavenumber. Let a one-dimensional signal of interest be represented by a wave function $f\left(t\right)$. The probability density that a signal can be localized in time at a given time $t$ is

$$\frac{{\left|f\left(t\right)\right|}^{2}}{{\Vert f\Vert}^{2}}=\frac{2\pi {\left|f\left(t\right)\right|}^{2}}{{\Vert \widehat{f}\Vert}^{2}},$$

and the probability density that its angular frequency is $\omega $ is

$$\frac{{\left|\widehat{f}\left(\omega \right)\right|}^{2}}{{\Vert \widehat{f}\Vert}^{2}}=\frac{{\left|\widehat{f}\left(\omega \right)\right|}^{2}}{2\pi {\Vert f\Vert}^{2}}.$$

The variance in the time localization of the signal is

$${\sigma}_{t}^{2}=\frac{1}{{\Vert f\Vert}^{2}}{\displaystyle \int}_{-\infty}^{\infty}{\left(t-u\right)}^{2}{\left|f\left(t\right)\right|}^{2}dt,$$

and the variance in the frequency localization of the signal is

$${\sigma}_{\omega}^{2}=\frac{1}{{\Vert \widehat{f}\Vert}^{2}}{\displaystyle \int}_{-\infty}^{\infty}{\left(\omega -\xi \right)}^{2}{\left|\widehat{f}\left(\omega \right)\right|}^{2}d\omega ,$$

where $u$ and $\xi $ denote the mean time and mean angular frequency of the respective distributions, as in [13].

Reference [13] uses these expressions to rederive the Heisenberg-Gabor uncertainty principle, which states that the temporal and angular frequency variance satisfy:

$${\sigma}_{t}^{2}{\sigma}_{\omega}^{2}\ge \frac{1}{4}\text{}.$$

In the special case of the Gabor-Morlet wavelet and its Quantum spawn, where the wave function is symmetric and centered around the time-shift $u$ and the spectrum is symmetric relative to the peak frequency ${\omega}_{n}$, the variance for the time and frequency distribution of the signal wave function can be readily evaluated as

$${\sigma}_{t}^{2}=\frac{1}{{\Vert {\psi}_{H}\Vert}^{2}}{\displaystyle \int}_{-\infty}^{\infty}{\left(t-u\right)}^{2}{\left|{\psi}_{H}\left(t-u\right)\right|}^{2}dt=\frac{1}{2}{𝓈}_{n}^{2}$$

$${\sigma}_{{\omega}_{n}}^{2}=\frac{1}{{\Vert {\widehat{\psi}}_{H}\Vert}^{2}}{\displaystyle \int}_{-\infty}^{\infty}{\left(\omega -{\omega}_{n}\right)}^{2}{\left|{\widehat{\psi}}_{H}\left(\omega -{\omega}_{n}\right)\right|}^{2}d\omega =\frac{1}{2}{𝓈}_{n}^{-2}$$

and the Gabor box defined by the variance is minimal,

$${\sigma}_{t}^{2}{\sigma}_{{\omega}_{n}}^{2}=\frac{1}{4},$$

which is another reason for this wavelet’s popularity. Equations (A60) and (A61) are converted to physical time in Equation (22) of the main text.
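The minimal Gabor box can be verified numerically by integrating the two Gaussian densities. The sketch below assumes a time density proportional to $\exp\{-{(t/𝓈_{n})}^{2}\}$ and a frequency density proportional to $\exp\{-{𝓈}_{n}^{2}{(\omega -{\omega}_{n})}^{2}\}$, uses a midpoint rule, and picks an arbitrary, hypothetical scale value.

```python
import math

def second_moment(width, steps=4000, span=8.0):
    """Variance of the density proportional to exp(-(x / width)**2), by midpoint rule."""
    step = 2.0 * span * width / steps
    total = 0.0
    moment = 0.0
    for k in range(steps):
        x = -span * width + (k + 0.5) * step
        weight = math.exp(-(x / width) ** 2)
        total += weight
        moment += x * x * weight
    return moment / total

scale = 114.4                              # hypothetical atom scale, in samples
var_time = second_moment(scale)            # expected scale**2 / 2
var_freq = second_moment(1.0 / scale)      # expected 1 / (2 * scale**2)
product = var_time * var_freq
print(var_time, var_freq, product)
```

The product does not depend on the chosen scale, which is the sense in which the atom attains the lower bound of the uncertainty relation.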

Consider the time variance integrated over the scaled window $\epsilon {T}_{n}$

$${\sigma}_{t}^{2}\left(\epsilon \right)=\frac{1}{{\Vert {\psi}_{H}\Vert}^{2}}{\displaystyle \int}_{u-\frac{\epsilon {T}_{n}}{2}}^{u+\frac{\epsilon {T}_{n}}{2}}{\left(t-u\right)}^{2}{\left|{\psi}_{H}\left(t-u\right)\right|}^{2}dt=\frac{1}{\sqrt{\pi}}{\left[\frac{{M}_{N}}{{\omega}_{n}}\right]}^{2}{\displaystyle \int}_{-\epsilon \pi}^{\epsilon \pi}{x}^{2}{e}^{-{x}^{2}}dx$$

$${\displaystyle \int}_{-a}^{a}{x}^{2}{e}^{-{x}^{2}}dx=\frac{\sqrt{\pi}}{2}\left[\mathrm{erf}\left(a\right)-\frac{2}{\sqrt{\pi}}a{e}^{-{a}^{2}}\right].$$

For $\epsilon \ge \frac{3}{2\pi}$

$${\sigma}_{t}^{2}\left(\epsilon \right)\cong \frac{1}{2}{𝓈}_{n}^{2}\,\mathrm{erf}\left(\epsilon \pi \right).$$

For $\epsilon =\left[1.0,\ 0.8,\ 0.45\right]$

$${\sigma}_{t}^{2}\left(\epsilon \right)\cong \frac{1}{2}{𝓈}_{n}^{2}\left[0.9999,\ 0.9996,\ 0.9544\right],$$

where $\epsilon $ corresponds to integration over ${T}_{n}$, $2{\tau}_{\pi}\approx 0.8{T}_{n}$, and $2{\tau}_{e}\approx 0.45{T}_{n}$, that is, the full window, the decay time associated with ${Q}_{\pi}$, and the e-folding time associated with ${Q}_{e}$, respectively (Appendix C).

Next, consider the frequency variance integrated over the scaled bandwidth $\delta \Delta {\omega}_{n}$

$${\sigma}_{{\omega}_{n}}^{2}\left(\delta \right)=\frac{1}{{\Vert {\widehat{\psi}}_{H}\Vert}^{2}}{\displaystyle \int}_{{\omega}_{n}-\frac{\delta \Delta {\omega}_{n}}{2}}^{{\omega}_{n}+\frac{\delta \Delta {\omega}_{n}}{2}}{\left(\omega -{\omega}_{n}\right)}^{2}{\left|{\widehat{\psi}}_{H}\left(\omega -{\omega}_{n}\right)\right|}^{2}d\omega =\frac{1}{\sqrt{\pi}}{\left[\frac{{M}_{N}}{{\omega}_{n}}\right]}^{-2}{\displaystyle \int}_{-\delta \sqrt{\ln 2}}^{\delta \sqrt{\ln 2}}{x}^{2}{e}^{-{x}^{2}}dx$$

$${\sigma}_{{\omega}_{n}}^{2}\left(\delta \right)\cong \frac{1}{2}{𝓈}_{n}^{-2}\left[\mathrm{erf}\left(\delta \sqrt{\ln 2}\right)-\frac{2}{\sqrt{\pi}}\left(\delta \sqrt{\ln 2}\right){2}^{-{\delta}^{2}}\right].$$

For $\delta =\left[1,2,3,4\right]$

$${\sigma}_{{\omega}_{n}}^{2}\left(\delta \right)\cong \frac{1}{2}{𝓈}_{n}^{-2}\left[0.2912,\ 0.8640,\ 0.9941,\ 0.9999\right]$$

where $\delta $ corresponds to integration over $\Delta {\omega}_{n}$, $2\Delta {\omega}_{n}$, $3\Delta {\omega}_{n}$, and $4\Delta {\omega}_{n}$, respectively. These results show that the Gabor box can be well approximated (>99% of the variance) by a window of duration $2{\tau}_{\pi}=0.8{T}_{n}$ and a bandwidth of $3\Delta {\omega}_{n}$, and over 99.99% of the variance is contained by a Gabor box of dimensions ${T}_{n},\ 4\Delta {\omega}_{n}$. In other words, third octave bands will contain over 99% of the variance within their octave and within 80% of the full window ${T}_{n}$.
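The bracketed frequency fractions follow from the closed form of the truncated Gaussian second moment, evaluated at $a=\delta \sqrt{\ln 2}$. A short sketch, with the hypothetical helper `variance_fraction`, reproduces them:

```python
import math

def variance_fraction(a):
    """Second moment of exp(-x**2) over [-a, a], relative to the whole real line."""
    return math.erf(a) - 2.0 / math.sqrt(math.pi) * a * math.exp(-a * a)

half_band = math.sqrt(math.log(2.0))  # one half bandwidth in units of the Gaussian argument
for delta in (1, 2, 3, 4):
    print(delta, round(variance_fraction(delta * half_band), 4))
```

The same function can be evaluated at $a=\epsilon \pi $ for the time window if the second term of the closed form is to be retained rather than dropped.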

A few variations of the Gabor-Morlet wavelet are available in present-day computing environments. One of the more familiar forms of the mother wavelet used in modern computations [26,27] is

$$\psi \left(\mu \right)=\frac{1}{\sqrt{\pi b}}\text{}exp\left\{-\frac{{\mu}^{2}}{b}\right\}exp\left\{i2\mathsf{\pi}{\overline{f}}_{b}\text{}\mu \right\}$$

$${\mathsf{\psi}}_{{\mu}^{\prime},n}\left(t\right)=\frac{1}{{s}_{n}}\mathsf{\psi}\left(\frac{\mu -{\mu}^{\prime}}{{s}_{n}}\right)\text{}.$$

This form is found in the MATLAB “cmor” function as well as the Python PyWavelets [28] “cmorB-C” function with $C={\overline{f}}_{b}$. The term $b$ is referred to as the “bandwidth parameter” of the wavelet. The Quantum wavelet has the equivalence

$${s}_{n}={2}^{\frac{n}{N}},\quad n\ge 0,$$

$${\tau}_{n}={\tau}_{0}{s}_{n}=\frac{1}{{f}_{0}}{s}_{n}$$

$$b=2{\left[{M}_{N}\frac{{f}_{s}{\tau}_{0}}{2\pi}\right]}^{2}$$

$$C={\overline{f}}_{b}=\frac{{f}_{0}}{{f}_{s}}=\frac{1}{{f}_{s}{\tau}_{0}}$$

where ${f}_{0}$, the highest center frequency, is used as the starting point. The scaled wavelet duration is ${M}_{N}\frac{{f}_{s}}{{f}_{n}}$ and can be rounded to approximate the number of points for each scale.
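The equivalence above can be turned into a small parameter calculator. The sketch below computes $b$, $C$, and the dyadic scales ${s}_{n}$ for a given band order, highest center frequency, and sample rate, assuming base 2 and the exact band-edge expression for ${Q}_{N}$. The function names and example values are hypothetical, and the name string is only formatted in the “cmorB-C” pattern mentioned in the text; no wavelet library is called.

```python
import math

def cmor_parameters(order, top_center_hz, sample_rate_hz):
    """Bandwidth parameter b and center frequency C for a given band order."""
    q_n = 1.0 / (2.0 ** (1.0 / (2.0 * order)) - 2.0 ** (-1.0 / (2.0 * order)))
    m_n = 2.0 * math.sqrt(math.log(2.0)) * q_n               # scale multiplier M_N
    tau_0 = 1.0 / top_center_hz                              # shortest center period
    b = 2.0 * (m_n * sample_rate_hz * tau_0 / (2.0 * math.pi)) ** 2
    c = top_center_hz / sample_rate_hz
    return b, c

def dyadic_scales(order, count):
    """Scales s_n = 2**(n / N) for n = 0 .. count - 1."""
    return [2.0 ** (n / order) for n in range(count)]

# Hypothetical example: third order bands starting at 10 Hz, sampled at 100 Hz.
b, c = cmor_parameters(order=3, top_center_hz=10.0, sample_rate_hz=100.0)
scales = dyadic_scales(order=3, count=7)
wavelet_name = f"cmor{b:.4f}-{c:.4f}"
print(wavelet_name, scales)
```

With these definitions the center frequency of scale ${s}_{n}$ is ${f}_{0}/{s}_{n}$, so every $N$ scales span one octave.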

Foster [29] expresses the abbreviated Morlet wavelet as

$$F\left(z\right)={e}^{iz-c{z}^{2}}=\exp\left\{i{\omega}_{n}t-\frac{1}{2{M}_{N}^{2}}{\omega}_{n}^{2}{t}^{2}\right\}$$

so that $z={\omega}_{n}t$ and now $c=\frac{1}{2{M}_{N}^{2}}$ is inversely proportional to the square of the Q of the wave function. The beauty of Foster’s approach is that it can be used for unevenly sampled data. A modernization of this algorithm can be found at [30].

The reconstruction coefficients of the complex Morlet CWT return the imaginary part of the analytic signal. The complex analytic signal corresponding to the real signal $g\left(\widehat{\tau}\right)$ is

$${g}_{\mathbb{C}}\left(\widehat{\tau}\right)=g\left(\widehat{\tau}\right)+j\,\mathscr{H}\left[g\left(\widehat{\tau}\right)\right]$$

where $\mathscr{H}$ denotes the Hilbert transform, a recurrent topic in wave propagation as reflection introduces phase shifts that are often modeled as Hilbert transforms of the original signal [31]. For example, some of the U-shaped infrasound waveforms associated with thermospheric returns resemble the Hilbert transform of an explosion pulse [3]. The Hilbert transform is also useful for estimating instantaneous frequency and in the computation of the Hilbert-Huang transform [32].

Let $g\left(\widehat{\tau}\right)$ represent the Granström Triangular (GT) pulse [7],

$$g\left(\widehat{\tau}\right)=\left(1-\widehat{\tau}\right),\hspace{1em}0\le \widehat{\tau}\le 1$$

$$g\left(\widehat{\tau}\right)=\frac{1}{6}\left(1-\widehat{\tau}\right){\left(1+\sqrt{6}-\widehat{\tau}\right)}^{2},\hspace{1em}1<\widehat{\tau}\le 1+\sqrt{6}\text{}.$$
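For reference, the two-branch pulse can be written as a short function. The sketch below assumes the pulse is zero outside $0\le \widehat{\tau}\le 1+\sqrt{6}$ and uses a midpoint rule, with hypothetical helper names, to check that the positive and negative phases of the two expressions above balance.

```python
import math

PULSE_END = 1.0 + math.sqrt(6.0)  # end of the negative phase in scaled time

def gt_pulse(tau):
    """Granstrom Triangular pulse in scaled time; zero outside its support (assumed)."""
    if 0.0 <= tau <= 1.0:
        return 1.0 - tau
    if 1.0 < tau <= PULSE_END:
        return (1.0 - tau) * (PULSE_END - tau) ** 2 / 6.0
    return 0.0

def integrate(func, start, stop, steps=20000):
    """Midpoint-rule integral of func over [start, stop]."""
    width = (stop - start) / steps
    return width * sum(func(start + (k + 0.5) * width) for k in range(steps))

positive_phase = integrate(gt_pulse, 0.0, 1.0)
negative_phase = integrate(gt_pulse, 1.0, PULSE_END)
print(positive_phase, negative_phase, positive_phase + negative_phase)
```

The pulse starts at unit amplitude, crosses zero at $\widehat{\tau}=1$, and returns to zero at $\widehat{\tau}=1+\sqrt{6}$, with the two phases integrating to equal and opposite areas.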

The Hilbert transform of the canonical GT blast pulse is rather unwieldy, but can be evaluated from

$${g}_{\mathscr{H}}\left(\widehat{\tau}\right)=\mathscr{H}\left[g\left(\widehat{\tau}\right)\right]=\mathcal{P}\frac{1}{\pi}{\displaystyle \int}_{-\infty}^{\infty}\frac{g\left(x\right)}{\widehat{\tau}-x}dx$$

where the $\mathcal{P}$ in front of the integral denotes the Cauchy principal value. Multiple integration by parts over the interval of the GT pulse yields

$${g}_{\mathscr{H}}\left(\widehat{\tau}\right)=\frac{1}{\pi}\left[1+\left(1-\widehat{\tau}\right)\ln\left(-\widehat{\tau}\right)-\left(1-\widehat{\tau}\right)\ln\left(1-\widehat{\tau}\right)\right],\quad 0\le \widehat{\tau}\le 1$$

$${g}_{\mathscr{H}}\left(\widehat{\tau}\right)=\frac{1}{6\pi}\frac{\left(a-1\right)}{6}\left[a\left(2a+5\right)-1+6{\widehat{\tau}}^{2}-3\widehat{\tau}\left(1+3a\right)\right]+\frac{1}{6\pi}\left[\left(\widehat{\tau}-1\right){\left(a-\widehat{\tau}\right)}^{2}\right]\left[\ln\left(a-\widehat{\tau}\right)-\ln\left(1-\widehat{\tau}\right)\right],\quad 1<\widehat{\tau}\le a=1+\sqrt{6}.$$

Since

$$\underset{x\to 0}{\mathrm{lim}}\,x\ln\left(x\right)=0,\quad \underset{x\to 0}{\mathrm{lim}}\,{x}^{2}\ln\left(x\right)=0$$

the solutions are well behaved near the zero crossings. However, there are some issues with this solution. First, there are two troublesome implicitly complex terms. The first is

$$\ln\left(-\widehat{\tau}\right)=\ln\left(\widehat{\tau}\right)+j\pi ,\quad 0\le \widehat{\tau}\le 1$$

where $\ln\left(\widehat{\tau}\right)$ tends to negative infinity at $\widehat{\tau}=0$. The second tricky term is

$$\ln\left(1-\widehat{\tau}\right)=\ln\left(\widehat{\tau}-1\right)+j\pi ,\quad 1<\widehat{\tau}\le 1+\sqrt{6}.$$

The complex terms are awkward; fortunately, singular logarithms can be readily avoided numerically by adding a small floating point value (float epsilon) to the arguments of the logarithms, so it is possible to evaluate the real part of the solution. Another inconvenience is the discontinuity in ${g}_{\mathscr{H}}$ and its slope as $\widehat{\tau}\to 1$. Rewriting the first term as

$${g}_{\mathscr{H}}{\left(\widehat{\tau}\right)}_{\widehat{\tau}<1}=\frac{1}{\pi}\left[1+\left(1-\widehat{\tau}\right)\ln\left(\widehat{\tau}\right)-\left(1-\widehat{\tau}\right)\ln\left(1-\widehat{\tau}\right)\right]+j\left(1-\widehat{\tau}\right),\quad \widehat{\tau}\to 1\text{ from below}$$

yields the limit

$${g}_{\mathscr{H}}\left(\widehat{\tau}\to 1\right)=\frac{1}{\pi}.$$

Evaluating the second term yields

$${g}_{\mathscr{H}}\left(\widehat{\tau}=1\right)=\frac{1}{\pi}\frac{\sqrt{2}}{\sqrt{3}}=\frac{1}{\pi}\left[1-\frac{\sqrt{3}-\sqrt{2}}{\sqrt{3}}\right],\quad \widehat{\tau}\to 1\text{ from above}.$$
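The two one-sided limits can be reproduced with a short numerical sketch that evaluates only the real part of the closed-form expressions, using the float-epsilon safeguard described above. The function name `gt_hilbert_real` is hypothetical, and taking the real part by replacing each logarithm argument with its absolute value is an assumption that follows from the two complex-logarithm identities given earlier.

```python
import numpy as np

EPS = np.finfo(float).eps          # float epsilon added to logarithm arguments
A = 1.0 + np.sqrt(6.0)             # end of the GT pulse, a = 1 + sqrt(6)

def gt_hilbert_real(tau):
    """Real part of the piecewise closed-form expressions for g_H."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    out = np.zeros_like(tau)
    first = (tau >= 0.0) & (tau <= 1.0)
    second = (tau > 1.0) & (tau <= A)

    t1 = tau[first]
    out[first] = (1.0
                  + (1.0 - t1) * np.log(np.abs(t1) + EPS)
                  - (1.0 - t1) * np.log(np.abs(1.0 - t1) + EPS)) / np.pi

    t2 = tau[second]
    poly = (A - 1.0) / 6.0 * (A * (2.0 * A + 5.0) - 1.0
                              + 6.0 * t2 ** 2 - 3.0 * t2 * (1.0 + 3.0 * A))
    logs = (t2 - 1.0) * (A - t2) ** 2 * (np.log(np.abs(A - t2) + EPS)
                                         - np.log(np.abs(t2 - 1.0) + EPS))
    out[second] = (poly + logs) / (6.0 * np.pi)
    return out

# One-sided values at the zero crossing: from below and just above tau = 1
limits = gt_hilbert_real([1.0, 1.0 + 1e-9])
```

The first entry of `limits` corresponds to the limit from below, $1/\pi$, and the second to the limit from above, $\sqrt{2/3}/\pi$, which makes the jump at $\widehat{\tau}=1$ explicit.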

These deficiencies are inconvenient, though not altogether surprising given that integrability was not designed into the GT pulse [7]. Fortunately, they are rendered computationally irrelevant by the numerical Hilbert transform provided by the SciPy [25] signal.hilbert function, which returns the analytic signal for a real input waveform. The comparison between the synthetic theoretical analytic signals, the CWT reconstruction, and the numerical Hilbert transform is presented in the figures in the main text.
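A minimal sketch of the numerical route, assuming SciPy and a zero-padded scaled time window; the window limits and grid spacing are illustrative assumptions, and the pulse is taken to be zero outside its defined interval.

```python
import numpy as np
from scipy.signal import hilbert

a = 1.0 + np.sqrt(6.0)
tau = np.linspace(-8.0, 12.0, 4001)    # padded window (assumed), step 0.005

# GT pulse, set to zero outside 0 <= tau <= a (assumption)
g = np.where((tau >= 0.0) & (tau <= 1.0), 1.0 - tau,
             np.where((tau > 1.0) & (tau <= a),
                      (1.0 - tau) * (a - tau) ** 2 / 6.0, 0.0))

g_analytic = hilbert(g)                # analytic signal g + j * H[g]
g_hilbert = g_analytic.imag            # numerical Hilbert transform
envelope = np.abs(g_analytic)          # instantaneous amplitude
```

The numerical result stays finite at the zero crossings and at $\widehat{\tau}=0$, where the closed-form expressions require special handling.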

- Garcés, M.A. On Infrasound Standards, Part 1. Time, frequency, and Energy scaling. Inframatics **2013**, 2, 13–35.
- Gabor, D. Theory of Communication, Part 3. Electr. Eng. **1946**, 93, 445–457.
- Vergoz, J.; LePichon, A.; Millet, C. The Antares explosion observed by the USArray: An unprecedented collection of infrasound phases recorded from the same event. In Infrasound Monitoring for Atmospheric Studies, 2nd ed.; Le Pichon, A., Blanc, E., Hauchecorne, A., Eds.; Springer Nature: Cham, Switzerland, 2019; pp. 349–386.
- Mialle, P.; Brown, D.; Arora, N. Advances in operational processing at the international data center. In Infrasound Monitoring for Atmospheric Studies, 2nd ed.; Le Pichon, A., Blanc, E., Hauchecorne, A., Eds.; Springer Nature: Cham, Switzerland, 2019; pp. 209–248.
- Yu, Q.; Yao, Y.; Wang, L.; Tang, H.; Dang, J.; Tan, K.C. Robust Environmental Sound Recognition with Sparse Key-point Encoding and Efficient Multi-spike Learning. arXiv **2019**, arXiv:1902.01094.
- Severa, W.; Vineyard, C.M.; Rellana, R.; Verzl, S.; Almone, J.B. Training Deep Neural Networks for Binary Communication with the Whetstone Networks. Nat. Mach. Intell. **2019**, 1, 86–94.
- Garcés, M.A. Explosion Source Models. In Infrasound Monitoring for Atmospheric Studies, 2nd ed.; Le Pichon, A., Blanc, E., Hauchecorne, A., Eds.; Springer Nature: Cham, Switzerland, 2019; pp. 273–345.
- Pacenak, Z.; Kleissl, J.; Lam, E. Detection of a Surface Detonated Nuclear Weapon using a Photovoltaic Rich Microgrid. In Proceedings of the IEEE Power and Energy Society General Meeting, San Diego, CA, USA, 1 August 2018.
- Cai, Y.; Lam, E.; Howlett, T.; Cai, A. Spatiotemporal Analysis of ‘Jello Effect’ in Drone Videos. In Proceedings of the 10th International Conference on Applied Human Factors and Ergonomics, Washington, DC, USA, 26 July 2019.
- Elliot, D.; Martino, E.; Otero, C.E.; Smith, A.; Peter, A.; Luchterhand, B.; Lam, E.; Leung, S. Cyber-Physical Analytics: Environmental Sound Classification at the Edge. In Proceedings of the IEEE World Forum on the Internet of Things 2020, Edge and Fog Computing Paper 1205, New Orleans, LA, USA, 5–9 April 2020.
- Shannon, C.E. The Mathematical Theory of Communication; University of Illinois Press: Urbana, IL, USA, 1998; [1949 first ed].
- Berg, N.J.; Pellegrino, J.M. Acousto-optic Signal Processing: Theory and Implementation; Marcel Dekker: New York, NY, USA, 1996; ISBN 0-8247-8925-3.
- Mallat, S. A Wavelet Tour of Signal Processing: The Sparse Way, 3rd ed.; Academic Press: Cambridge, MA, USA, 2009; [1998 first ed].
- Grossmann, A.; Morlet, J. Decomposition of Hardy Functions into Square Integrable Wavelets of Constant Shape. Soc. Int. Am. Math. SIAM J. Math. Analys. **1984**, 15, 723–736.
- Goupillaud, P.; Grossmann, A.; Morlet, J. Cycle-octave and Related Transforms in Seismic Signal Analysis. Geoexploration **1984**, 23, 85–102.
- Sheu, Y.-L.; Hsu, L.-Y.; Wu, H.-T.; Li, P.-C.; Chu, S.-I. A New Time-Frequency Method to Reveal Quantum Dynamics of Atomic Hydrogen in Intense Laser Pulses: Synchrosqueezing Transform. AIP Adv. **2014**, 4, 117138.
- Ashmead, J. Morlet Wavelets in Quantum Mechanics. Quanta **2012**, 1, 58–70.
- Canolty, R.T.; Womelsdorf, T. Multiscale Adaptive Gabor Expansion (MAGE): Improved Detection of Transient Oscillatory Burst Amplitude and Phase. bioRxiv **2019**, 369116.
- Shi, Y.; Zhang, X.D. A Gabor Atom Network for Signal Classification with Application in Radar Target Recognition. IEEE Trans. Signal Process. **2001**, 49, 2994–3004.
- Torrence, C.; Compo, G.P. A Practical Guide to Wavelet Analysis. Bull. Am. Meteorol. Soc. **1998**, 79, 61–78.
- Farge, M. Wavelet Transforms and their Applications to Turbulence. Annu. Rev. Fluid Mech. **1992**, 24, 395–458.
- Lebedeva, E.A.; Postnikov, E.B. On Alternative Wavelet Reconstruction Formula: A Case Study of Approximate Wavelets. R. Soc. Open Sci. **2014**, 1, 140124.
- Bishop, M. Continuous Wavelet Transform Reconstruction Factors for Selected Wavelets. Available online: http://mark-bishop.net/signals/CWTReconstructionFactors.pdf (accessed on 29 July 2018).
- Li, T.; Zhou, M. ECG Classification Using Wavelet Packet Entropy and Random Forests. Entropy **2016**, 18, 285.
- Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods **2020**, 17, 261–272.
- Mallat, S. Group Invariant Scattering. Commun. Pure Appl. Math. **2012**, 65, 1331–1398.
- Mallat, S.; Zhang, S.; Rochette, G. Phase Harmonic Correlations and Convolutional Neural Networks. Inf. Inference J. IMA **2019**, 2, 1–26.
- Lee, G.R.; Gommers, R.; Wasilewski, F.; Wohlfahrt, K.; O’Leary, A. PyWavelets: A Python Package for Wavelet Analysis. J. Open Source Softw. **2019**, 4, 1237.
- Foster, G. Wavelets for Period Analysis of Unevenly Sampled Time Series. Astron. J. **1996**, 112, 1709–1729.
- Takazawa, S. Weighted Wavelet Z-Transform Repository. v1.0.0. 2020. Available online: https://bitbucket.org/redvoxhi/libwwz/src/master/ (accessed on 24 August 2020).
- Brekhovskikh, L.M.; Godin, O.A. Acoustics of Layered Media I, 2nd ed.; Sections 5.2 and 5.3; Springer: New York, NY, USA, 1998.
- Huang, N.E.; Wu, Z. A Review on Hilbert-Huang Transform: Method and its Applications to Geophysical Studies. Rev. Geophys. **2008**, 46, RG2006.

N | ${\mathit{Q}}_{\mathit{N}}$ | ${\mathit{M}}_{\mathit{N}}$ |
---|---|---|
1 | 1.4142 | 2.3548 |
3 | 4.3185 | 7.1907 |
6 | 8.6514 | 14.4055 |
12 | 17.3099 | 28.8229 |
24 | 34.6235 | 57.6519 |
48 | 69.2488 | 115.3067 |
96 | 138.4984 | 230.6150 |

N | ${\mathit{Q}}_{\mathit{N}}$ | ${\mathit{Q}}_{\mathit{N}}\approx \sqrt{2}\mathit{N}$ |
---|---|---|
1 | 1.4142 | 1.4142 |
3 | 4.3185 | 4.2426 |
6 | 8.6514 | 8.4853 |
12 | 17.3099 | 16.9706 |
24 | 34.6235 | 33.9411 |
48 | 69.2488 | 67.8823 |
96 | 138.4984 | 135.7645 |
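The two tables above can be regenerated with a short sketch. It assumes the standard fractional-octave quality factor ${Q}_{N}=1/\left({2}^{1/2N}-{2}^{-1/2N}\right)$ and a constant ratio ${M}_{N}/{Q}_{N}=2\sqrt{\ln 2}\approx 1.6651$; both are inferred here from the tabulated values rather than quoted from this section, and the function names are hypothetical.

```python
import numpy as np

def quality_factor(order):
    """Assumed fractional-octave quality factor Q_N for band order N."""
    order = np.asarray(order, dtype=float)
    return 1.0 / (2.0 ** (1.0 / (2.0 * order)) - 2.0 ** (-1.0 / (2.0 * order)))

def multiplier(q_n):
    """Assumed relation M_N = 2 sqrt(ln 2) Q_N, inferred from the tables."""
    return 2.0 * np.sqrt(np.log(2.0)) * np.asarray(q_n, dtype=float)

orders = np.array([1, 3, 6, 12, 24, 48, 96])
q_n = quality_factor(orders)
m_n = multiplier(q_n)
q_approx = np.sqrt(2.0) * orders           # approximation Q_N ~ sqrt(2) N

for n, q, m, qa in zip(orders, q_n, m_n, q_approx):
    print(f"{int(n):3d} | {q:9.4f} | {m:9.4f} | {qa:9.4f}")
```

The approximation ${Q}_{N}\approx \sqrt{2}N$ is exact at $N=1$ and underestimates ${Q}_{N}$ by about 2% at the higher orders listed.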

${\mathit{Q}}_{\mathit{N}}$ | $\mathit{N}\approx {\mathit{Q}}_{\mathit{N}}/\sqrt{2}$ | ${\mathit{M}}_{\mathit{N}}$ |
---|---|---|
1 | 0.7071 | 1.6651 |
2 | 1.4142 | 3.3302 |
4 | 2.8284 | 6.6604 |
8 | 5.6569 | 13.3209 |
16 | 11.3137 | 26.6417 |
32 | 22.6274 | 53.2835 |
64 | 45.2548 | 106.5670 |
128 | 90.5097 | 213.1340 |

${\mathit{M}}_{\mathit{N}}$ | $~{\mathit{Q}}_{\mathit{N}}$ | $\mathit{N}$ |
---|---|---|
1 | 0.600561204 | 0.4246609 |
2 | 1.201122409 | 0.8493218 |
4 | 2.402244818 | 1.698643601 |
5 | 3.002806022 | 2.123304501 |
6 | 3.603367226 | 2.547965401 |
8 | 4.804489635 | 3.397287201 |
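The inverse mappings in the last two tables, from a chosen ${Q}_{N}$ or ${M}_{N}$ back to an approximate band order, can be sketched in the same way. The ratio ${M}_{N}/{Q}_{N}=2\sqrt{\ln 2}$ is again an assumption inferred from the tabulated values, and the function names are hypothetical.

```python
import numpy as np

RATIO = 2.0 * np.sqrt(np.log(2.0))     # assumed M_N / Q_N, about 1.6651

def from_quality_factor(q_n):
    """Return (approximate order N, multiplier M_N) for a chosen Q_N."""
    q_n = np.asarray(q_n, dtype=float)
    return q_n / np.sqrt(2.0), RATIO * q_n

def from_multiplier(m_n):
    """Return (Q_N, approximate order N) for a chosen multiplier M_N."""
    m_n = np.asarray(m_n, dtype=float)
    q_n = m_n / RATIO
    return q_n, q_n / np.sqrt(2.0)

order_from_q, m_from_q = from_quality_factor([1, 2, 4, 8, 16, 32, 64, 128])
q_from_m, order_from_m = from_multiplier([1, 2, 4, 5, 6, 8])
```

Both mappings are linear, so doubling ${Q}_{N}$ or ${M}_{N}$ doubles the approximate order, as the tabulated rows show.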

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).