Antiderivative Antialiasing for Stateful Systems

Nonlinear systems, such as guitar distortion effects, play an important role in musical signal processing. One major problem encountered in digital nonlinear systems is aliasing distortion. Consequently, various aliasing reduction methods have been proposed in the literature. One of these is based on using the antiderivative of the nonlinearity and has proven effective, but is limited to memoryless systems. In this work, it is extended to a class of stateful systems which includes but is not limited to systems with a single one-port nonlinearity. Two examples from the realm of virtual analog modeling show its applicability to and effectiveness for commonly encountered guitar distortion effect circuits.

The method of [10] assumes the continuous-time input signal to be obtained by linear interpolation between consecutive samples; the nonlinearity is applied, and the result is lowpass-filtered by integrating over one sampling interval before sampling again to obtain the digital output signal. Thus, the nonlinear system

y(n) = f(u(n)),   (1)

where f(u) denotes the nonlinear function, mapping input sample u(n) to output sample y(n), is transformed into

y(n) = f_s ∫_{(n−1)/f_s}^{n/f_s} f(u(t)) dt.   (2)

Only rarely will the integral have a closed-form solution, and numerical integration at run time is unattractive considering the computational load. However, by switching the integration variable, we obtain

y(n) = (1 / (u(n) − u(n−1))) ∫_{u(n−1)}^{u(n)} f(u) du,   (3)

where we may apply the fundamental theorem of calculus, and by further special-casing u(n) ≈ u(n − 1) to avoid numerical issues, we finally arrive at

y(n) = f̃(u(n), u(n−1)) = { (F(u(n)) − F(u(n−1))) / (u(n) − u(n−1))   if u(n) ≉ u(n−1)
                          { f(½(u(n) + u(n−1)))                        if u(n) ≈ u(n−1),   (4)

where F(u) = ∫ f(u) du is the antiderivative of f(u). Again, the antiderivative may not have a closed-form solution, but being a function of one variable, it may be pre-computed and stored in a lookup table.
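A direct implementation of the scheme of (4) may be sketched as follows (a minimal illustration; the function names and the tolerance handling are our own, not from the paper):

```python
import numpy as np

def adaa1(u, f, F, tol=1e-9):
    """First-order antiderivative antialiasing: replace f(u(n)) by
    (F(u(n)) - F(u(n-1))) / (u(n) - u(n-1)), falling back to evaluating f
    at the midpoint when consecutive inputs are (nearly) equal."""
    y = np.empty_like(u)
    u1 = u[0]   # u(n-1); initialising with u(0) triggers the fallback first
    F1 = F(u1)  # cache F(u(n-1)) to save one evaluation per sample
    for n in range(len(u)):
        un, Fn = u[n], F(u[n])
        if abs(un - u1) > tol:
            y[n] = (Fn - F1) / (un - u1)
        else:
            y[n] = f(0.5 * (un + u1))
        u1, F1 = un, Fn
    return y
```

For example, f(u) = tanh(u) would be used with F(u) = ln(cosh(u)); for the identity f(u) = u with F(u) = u²/2, the scheme reduces to the averaging filter ½u(n) + ½u(n−1) discussed below.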
In addition to reducing aliasing artifacts, the approach introduces a half-sample delay and attenuates signal components not only above the Nyquist limit, but also at high frequencies below it, i.e., it acts as a low-pass filter in the audio band. This can be readily seen when using the identity function f(u) = u instead of a true nonlinearity. In that case, straightforward calculation yields y(n) = ½u(n) + ½u(n − 1), which is a first-order finite impulse response low-pass filter with a group delay of half a sample. The low-pass effect can be countered by a modest amount of oversampling (e.g., by a factor of two) and the delay usually is of no concern.

The Higher-Dimensional Case
Before turning to stateful systems, we shall briefly discuss how to lift the restriction to systems with one scalar input u(n) and one scalar output y(n) quietly assumed so far. To keep the notation concise, we collect the inputs u_1(n), u_2(n), …, u_{K_u}(n) of a system with K_u scalar inputs into an input vector u(n) = [u_1(n) u_2(n) ⋯ u_{K_u}(n)]^T (using bold letters to indicate vectors here and in the following). Likewise, multiple outputs are collected into an output vector y(n) = [y_1(n) y_2(n) ⋯ y_{K_y}(n)]^T. We may first notice that the extension to vector-valued output is straightforward: the above derivation does not depend at all on the fact that the nonlinear function (or its antiderivative) is scalar-valued.
For vector-valued input, on the other hand, things do not work out so nicely anymore. In particular, we can no longer rewrite (2) as (3). So unless the integral in (2) happens to have a closed-form solution, to avoid numerical solving at run time, we are almost stuck with precomputing and tabulating the solution for all combinations of u(n) and u(n − 1) at a certain density. Thus, if the input is K_u-dimensional, having to use two subsequent input vectors stacked on top of each other to form the lookup (index) vector, we would require a 2K_u-dimensional lookup table. A small improvement is possible by realizing that we always integrate along lines, and a line can be parameterized by 2K_u − 2 parameters in a K_u-dimensional space. Along a line, we can precompute the antiderivative with respect to the scalar coordinate along the line. This antiderivative with a one-dimensional argument and 2K_u − 2 parameters could be stored in a (2K_u − 1)-dimensional lookup table. While this may be reasonable for K_u = 2, it quickly becomes infeasible for larger K_u due to prohibitive memory requirements.
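To put the memory requirements into numbers, a back-of-the-envelope calculation (our own illustration, assuming N = 1024 grid points per dimension and single-precision storage):

```python
def table_entries(K_u, N):
    """Entries of a (2*K_u - 1)-dimensional lookup table with N points per axis."""
    return N ** (2 * K_u - 1)

# With N = 1024 points per dimension and 4-byte entries:
#   K_u = 1 -> 1024 entries (4 KiB): entirely unproblematic
#   K_u = 2 -> 1024**3 entries (4 GiB): already borderline
#   K_u = 3 -> 1024**5 entries (4 PiB): clearly infeasible
for K_u in (1, 2, 3):
    print(K_u, table_entries(K_u, 1024))
```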

Extension to Stateful Systems
The half-sample delay introduced by the method of [10] becomes problematic if the nonlinearity is embedded in the feedback loop of a stateful system. As noted in [10], for the particular case of an integrator following the nonlinearity and using the trapezoidal rule for time-discretization, one can simply replace the numerator of the discretized integrator's transfer function with the filter introduced by the antialiasing. This fusing of antialiased nonlinearity and integrator then has no additional delay compared to the system without antialiasing and hence can be used inside a feedback system without problems.
Here, we consider systems that do not necessarily have an integrator following the nonlinearity. In particular, we shall consider the general discrete nonlinear state-space system

x(n) = f_{x,0}(p_{x,0}(n)) + ⋯ + f_{x,L}(p_{x,L}(n))   (6)
y(n) = f_{y,0}(p_{y,0}(n)) + ⋯ + f_{y,M}(p_{y,M}(n)),   (7)

with

p_{x,l}(n) = C_{p_{x,l}} x(n−1) + D_{p_{x,l}} u(n)   (8)
p_{y,m}(n) = C_{p_{y,m}} x(n−1) + D_{p_{y,m}} u(n),   (9)

where the matrices C_{p_{x,l}} ∈ ℝ^{K_{p_{x,l}}×K_x}, D_{p_{x,l}} ∈ ℝ^{K_{p_{x,l}}×K_u}, C_{p_{y,m}} ∈ ℝ^{K_{p_{y,m}}×K_x}, and D_{p_{y,m}} ∈ ℝ^{K_{p_{y,m}}×K_u} depend on the system. Some remarks are in order:
• We not only allow multiple states, collected in the vector x(n), but similarly multiple inputs in u(n) and multiple outputs in y(n). In many applications, input and output will, however, be scalar.
• While having only one nonlinear function on the right-hand side, i.e., L = M = 0, is perfectly valid, we allow the decomposition into a sum of multiple functions. We do this because, as discussed above, nonlinear functions with scalar or very low-dimensional arguments are much better suited to the proposed method than those with higher-dimensional arguments. Thus, while we theoretically allow vector-valued p_{x,l}(n) and p_{y,m}(n) of arbitrary dimension, from a practical viewpoint, their dimensionalities K_{p_{x,l}} and K_{p_{y,m}} should be low. Therefore, if the system's nonlinearity can be decomposed into a sum of nonlinear functions with low-dimensional arguments, it should be. It has to be noted, though, that unfortunately not all systems allow their nonlinearity to be decomposed in such a way.
• It may often be desirable to differentiate between linear and nonlinear parts of this system. By letting f_{x,0}(p_{x,0}(n)) = p_{x,0}(n) and f_{y,0}(p_{y,0}(n)) = p_{y,0}(n) be the identity function and further writing A = C_{p_{x,0}}, B = D_{p_{x,0}}, C = C_{p_{y,0}}, and D = D_{p_{y,0}}, we therefore rewrite our system as

x(n) = A x(n−1) + B u(n) + Σ_{l=1}^{L} f_{x,l}(p_{x,l}(n))   (10)
y(n) = C x(n−1) + D u(n) + Σ_{m=1}^{M} f_{y,m}(p_{y,m}(n)).   (11)

A visual representation is given as a block diagram in Figure 1.
• If the system is obtained in the context of virtual analog modeling, usually the nonlinear functions will only be given implicitly (as the solution of what is sometimes referred to as a delay-free loop), making solving a nonlinear equation necessary.
However, the state-update and output nonlinearities are typically based on a common set of functions f_m, only applying different weightings to their outputs, i.e., with L = M and p_m(n) = p_{x,m}(n) = p_{y,m}(n). While this redundancy should be kept in mind for optimizing an implementation, we will derive our method for the more general case of possibly independent nonlinear functions for state update and output.
In a first step, we may consider only applying the aliasing suppression to the f_{y,m}(p_{y,m}), as they are not part of any feedback loop. We have to be careful though, and may not just replace f_{y,m} with f̃_{y,m} in (11), as that would lead to a misalignment in time of the linear and antialiased nonlinear terms. Instead, we have to apply the antialiasing also to the linear part f_{y,0}, even though it obviously does not introduce aliasing distortion. Fortunately, the integral has a closed solution no matter the dimensionality, namely

f̃_{y,0}(p_{y,0}(n), p_{y,0}(n−1)) = ½ (p_{y,0}(n) + p_{y,0}(n−1)),

resulting in

y(n) = ½ C (x(n−1) + x(n−2)) + ½ D (u(n) + u(n−1)) + Σ_{m=1}^{M} f̃_{y,m}(p_{y,m}(n), p_{y,m}(n−1)).   (15)

However, any aliasing distortion introduced into x(n) by (10) will not undergo any mitigation (except for the lowpass filtering). Now, if we naively rewrite (10) as we did with (11), we modify our system in an unwanted way, as we introduce additional delay in the feedback. But we do that in a very controlled way: the unit delay in the feedback is replaced by a delay of 1.5 samples. As a delay of 1.5 samples corresponds to a delay of one sample at 2/3 of the sampling rate, we can compensate for the extra half-sample delay by designing our system for this reduced sampling rate, but then operating it at the original sampling rate. We thus arrive at

x(n) = ½ Ã (x(n−1) + x(n−2)) + ½ B̃ (u(n) + u(n−1)) + Σ_{l=1}^{L} f̃_{x,l}(p_{x,l}(n), p_{x,l}(n−1))   (16)
y(n) = ½ C̃ (x(n−1) + x(n−2)) + ½ D̃ (u(n) + u(n−1)) + Σ_{m=1}^{M} f̃_{y,m}(p_{y,m}(n), p_{y,m}(n−1)),   (17)

with

p_{x,l}(n) = C̃_{p_{x,l}} x(n−1) + D̃_{p_{x,l}} u(n)   (18)
p_{y,m}(n) = C̃_{p_{y,m}} x(n−1) + D̃_{p_{y,m}} u(n),   (19)

where all coefficients are calculated for the reduced sampling rate f̃_s = (2/3) f_s. We can only do this because we do not have a delay-free loop. Or rather, the delay-free loop is hidden inside the nonlinear functions: instead of worrying about a nonlinearity within a delay-free loop, we treat the solution of the delay-free loop as the nonlinearity to apply aliasing reduction to. Note that the behavior for frequencies above ½ f̃_s = ⅓ f_s is ill-defined, but with the mild oversampling suggested by [10] anyway, we do not have to worry about this.
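The modified recursion (16)-(19) can be sketched for the simplest case of a scalar state and a single nonlinearity shared between state update and output (all names and the scalar restriction are our own illustration): linear terms are replaced by two-sample averages, nonlinear terms by the antiderivative form of (4), and all coefficients must be designed for f̃_s = (2/3) f_s.

```python
import numpy as np

def f_tilde(f, F, p, p1, tol=1e-9):
    # Eq. (4): antiderivative form with midpoint fallback
    if abs(p - p1) > tol:
        return (F(p) - F(p1)) / (p - p1)
    return f(0.5 * (p + p1))

def run_antialiased(u, a, b, c, d, cp, dp, f, F):
    """Scalar sketch of the antialiased system: every delayed linear term
    gets the two-sample average (its exact 'antialiased' form), the
    nonlinearity gets f_tilde; a, b, c, d, cp, dp are assumed to be
    computed for the reduced sampling rate (2/3) fs."""
    y = np.empty_like(u)
    x1 = x2 = 0.0   # x(n-1), x(n-2)
    p1 = 0.0        # p(n-1)
    u1 = 0.0        # u(n-1)
    for n in range(len(u)):
        p = cp * x1 + dp * u[n]            # p(n)
        fn = f_tilde(f, F, p, p1)          # shared nonlinearity
        x = 0.5 * a * (x1 + x2) + 0.5 * b * (u[n] + u1) + fn
        y[n] = 0.5 * c * (x1 + x2) + 0.5 * d * (u[n] + u1) + fn
        x2, x1, p1, u1 = x1, x, p, u[n]
    return y
```

With the nonlinearity set to zero and a pure feed-through (a = c = 0, b = d = 1), the sketch degenerates to the averaging filter ½(u(n) + u(n−1)), as expected from the linear analysis.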
The increased delay is not the only effect of the modification. There is also the low-pass filtering. To study this in more detail, assume all f_{x,l}(p_{x,l}) and f_{y,m}(p_{y,m}) to be linear, so that we have a linear system, and let H(z) denote the transfer function obtained from (8)-(11). If we instead use (16)-(19) without adjusting the coefficients, it is straightforward to verify that the resulting transfer function fulfills

H̃(z) = ½ (1 + z^{−1}) · H((½ (z^{−1} + z^{−2}))^{−1}).   (20)

We may observe two effects: the well-known low-pass filtering with the factor ½(1 + z^{−1}) on the outside and the substitution z ← (½(z^{−1} + z^{−2}))^{−1} in the argument of H. Evaluating the latter for z = e^{jω}, we note

(½ (e^{−jω} + e^{−2jω}))^{−1} = (1 / cos(ω/2)) · e^{j(3/2)ω}.   (21)

While in the original system H(z) is evaluated on the unit circle e^{jω} to obtain the frequency response, for the modified system, the original H(z) is evaluated on the trajectory of (21). We notice that, in addition to the frequency scaling by 3/2, there is an additional scaling away from the unit circle, increasing with frequency. Importantly, as we only evaluate H(z) for z on or outside the unit circle, we preserve stability, i.e., if H(z) is stable, so is H̃(z). Nevertheless, especially for higher frequencies, this may cause a significant distortion of the frequency response.
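The trajectory of the substituted argument can be checked numerically (our own quick verification): for z = e^{jω}, the angle is scaled by 3/2 and the magnitude is 1/cos(ω/2) ≥ 1, so the trajectory never enters the unit circle.

```python
import numpy as np

# Evaluate the substitution z <- (0.5*(z^-1 + z^-2))^-1 on the unit circle,
# staying below the ill-defined region around one third of the sampling rate.
w = np.linspace(0.01, 0.6 * np.pi, 200)
z = np.exp(1j * w)
zsub = 1.0 / (0.5 * (z**-1 + z**-2))

# Frequency scaling by 3/2 ...
assert np.allclose(np.angle(zsub), 1.5 * w)
# ... and magnitude 1/cos(w/2), on or outside the unit circle (stability preserved)
assert np.allclose(np.abs(zsub), 1.0 / np.cos(w / 2))
assert np.all(np.abs(zsub) >= 1.0)
```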
An extreme example would be an all-pass filter with high Q-factor, where the transformation might result in the zero moving onto the frequency axis, turning a flat frequency response into one with a deep notch.
As the examples will demonstrate, many typical systems are rather well-behaved under the transformation, but one has to be aware of this pitfall.

Computational Cost
In addition to potentially altering the system behavior at high frequencies, the price to pay for the proposed antialiasing approach is computational cost. Obviously, the exact total number of operations depends on the actual system the method is applied to, but we may observe some trends. First, due to the necessary oversampling by a factor of two, the computational cost doubles even for the operations we do not otherwise alter. This is, of course, still better than just using a higher oversampling factor with no other antialiasing at all. If the input and output sampling rates are fixed, the required resampling necessitates interpolation and decimation filters, which incur additional cost, but equally so for the proposed approach and naive oversampling.
To judge the computational cost of the modifications made by the proposed method, let us assume that both the nonlinear functions and their antiderivatives are stored in lookup tables with the same resolution and lookup/interpolation method. Then replacing the nonlinear functions by (4) introduces a division and slightly increases the number of required table lookups. Note that the number of table lookups does not double, as F(p(n)) may be stored to avoid the lookup of F(p(n − 1)) in the next time step. Only when switching between the two cases in (4) is an additional lookup needed. However, the computational cost of the interpolation during table lookup grows exponentially with the table's dimensionality.
So if all nonlinear functions have only scalar arguments, the extra effort per time step is a number of additions, multiplications by ½, and one division per nonlinear function, plus the occasional extra table lookup. This will almost certainly be more efficient than increasing the sampling rate even further. However, when the nonlinear functions have arguments of higher dimensionality, the (almost) doubled dimensionality for the antiderivatives leads to a quickly increasing computational cost for the table lookups, making the method unattractive in that case, in addition to the exploding memory requirements discussed above.

Examples
We shall exemplify the method with two circuits well-suited for it as they result in nonlinear functions with scalar input. The Julia source code implementing the derived models can be found at https://github.com/martinholters/ADAA_Examples.jl.

Diode Clipper
As the first example, we consider the diode clipper of Figure 3. The circuit is simple enough that we briefly repeat the modeling process here. From Kirchhoff's and Ohm's laws and i_C = C ẏ, we immediately obtain

C ẏ(t) = (u(t) − y(t))/R − i_D(t).   (22)

Summing over two subsequent sampling instances, we get

C (ẏ(n) + ẏ(n−1)) = (u(n) − y(n))/R + (u(n−1) − y(n−1))/R − i_D(n) − i_D(n−1).   (23)

We now use the trapezoidal rule to substitute

ẏ(n) + ẏ(n−1) = 2 f_s · (y(n) − y(n−1))   (24)

and obtain

2 f_s C (y(n) − y(n−1)) = (u(n) − y(n))/R + (u(n−1) − y(n−1))/R − i_D(n) − i_D(n−1).   (25)

Collecting all quantities from time step n − 1 into a canonical state

x(n−1) = 2 f_s C y(n−1) + (u(n−1) − y(n−1))/R − i_D(n−1)   (26)

allows simplification to

2 f_s C y(n) + y(n)/R + i_D(n) = u(n)/R + x(n−1).   (27)

Using Shockley's equation for the diodes, we get

i_D(n) = 2 I_S sinh(y(n)/v_T),   (28)

where saturation current and temperature voltage have been chosen as I_S = 1 fA and v_T = 25 mV, respectively. Inserting (28) into (27) and introducing

p_y(n) = u(n)/R + x(n−1)   (29)

then leads to the implicit equation

(2 f_s C + 1/R) y(n) + 2 I_S sinh(y(n)/v_T) = p_y(n)   (30)

for y(n). Note that we do not treat this as a delay-free loop and apply the antialiasing to the sinh function. Instead, we let f_y(p_y(n)) = y(n) denote the solution of the implicit equation. The resulting function is depicted in Figure 4 (obtained using an iterative solver). To obtain the state update equation, we rearrange (27) to

2 f_s C y(n) + (u(n) − y(n))/R − i_D(n) = 4 f_s C y(n) − x(n−1)   (31)

and note by comparing with (26) that the left-hand side equals x(n). Thus introducing

f_x(p_x(n)) = 4 f_s C f_y(p_x(n))   (32)

with p_x(n) = p_y(n), we arrive at

x(n) = f_x(p_x(n)) − x(n−1)   (33)

of the desired form.
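A minimal time-domain simulation of this model can be sketched as follows (the component values R and C are our own assumptions for illustration; the text specifies only I_S = 1 fA and v_T = 25 mV). The implicit equation (2 f_s C + 1/R) y(n) + 2 I_S sinh(y(n)/v_T) = p_y(n) with p_y(n) = u(n)/R + x(n−1) is solved by Newton iteration, and the state is updated as x(n) = 4 f_s C y(n) − x(n−1):

```python
import numpy as np

# Illustrative component values (our assumption):
R, Cap = 2.2e3, 10e-9
I_S, v_T = 1e-15, 25e-3

def f_y(p, fs):
    """Solve (2 fs C + 1/R) y + 2 I_S sinh(y/v_T) = p for y by Newton
    iteration; this solution is the nonlinearity f_y of the model."""
    a = 2 * fs * Cap + 1 / R
    y = 0.0
    for _ in range(200):
        g = a * y + 2 * I_S * np.sinh(y / v_T) - p
        dg = a + (2 * I_S / v_T) * np.cosh(y / v_T)
        step = g / dg
        y -= step
        if abs(step) < 1e-12:
            break
    return y

def diode_clipper(u, fs):
    """Run the (not yet antialiased) diode clipper model."""
    y = np.empty_like(u)
    x = 0.0  # canonical state x(n-1)
    for n in range(len(u)):
        p = u[n] / R + x             # p_y(n) = u(n)/R + x(n-1)
        y[n] = f_y(p, fs)            # y(n) = f_y(p_y(n))
        x = 4 * fs * Cap * y[n] - x  # x(n) = 4 fs C y(n) - x(n-1)
    return y
```

In a real implementation, f_y would of course be tabulated rather than solved per sample; the diodes limit the output to roughly the forward voltage regardless of input amplitude.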
Applying the aliasing mitigation only to the output equation is particularly simple in this case, giving

ỹ(n) = f̃_y(p_y(n), p_y(n−1))   (35)

with f̃_y defined according to (4). The required antiderivative F_y(p_y) of f_y(p_y), depicted in Figure 5, has to be determined numerically. For the results below, it has been precomputed at 1024 uniformly distributed points in the relevant range and stored in a table, using cubic interpolation during lookup. To also apply aliasing mitigation to the state update equation, we have to change it to

x(n) = f̃_x(p_x(n), p_x(n−1)) − ½ (x(n−1) + x(n−2))   (36)

and substitute f̃_s = (2/3) f_s for f_s in (30) and (32). Note that this immediately leads to f̃_x(p_x(n), p_x(n−1)) = 4 f̃_s C · f̃_y(p_x(n), p_x(n−1)), so the tabulated F_y can be reused for the state update. To study the effectiveness of the method, we consider Figure 6, where the output spectra for a sinusoidal excitation are depicted for various model configurations. Figure 6a,b give the baseline, the system without any aliasing mitigation at sampling rates f_s = 44.1 kHz and f_s = 88.2 kHz, respectively. Only applying aliasing mitigation to the output equation according to (35) is of little benefit, as seen when considering Figure 6c,d in comparison. We do note, however, the low-pass effect in Figure 6c, where higher harmonics exhibit an attenuation of up to 10 dB. When also applying the aliasing mitigation to the state update equation according to (36), we observe a significant aliasing reduction in Figure 6e,f. As explained, the aliasing mitigation should be combined with (modest) oversampling. In this particular case, as verified in Figure 6e, the model is still a relatively good fit even without oversampling, which, however, must mainly be attributed to lucky coincidence. More relevant is the case of a sampling rate of f_s = 88.2 kHz, shown in Figure 6f. Comparing to oversampling to f_s = 220.5 kHz without additional aliasing mitigation measures, as shown in Figure 6g, we see that the aliased components at low frequencies, where they are most easily perceived, are at a comparable level.

Tube Screamer-Like Distortion Circuit
As a second example, we consider the distortion circuit of Figure 7, inspired by the Ibanez Tube Screamer TS-808. We shall not go into details of the modeling procedure, for which we have used ACME.jl (https://github.com/HSU-ANT/ACME.jl), but remark that if one allows the three diodes to be different, one can no longer derive a closed-form expression for their combined behavior. Instead, the nonlinear behavior is determined by a system of three equations. Nevertheless, using the dimensionality reduction approach of [13], the input p_x(n) = p_y(n) to the nonlinearity can be reduced to a scalar value, formed by a linear combination of the input and the capacitor states. Hence, the proposed method is applicable. Figure 8 again shows the output spectra for a sinusoidal excitation. As can be seen in Figure 8a, with plain oversampling to f_s = 88.2 kHz, the signal contains strong aliasing components. Applying aliasing mitigation only to the output equation reduces the aliasing distortion to a limited extent, as shown in Figure 8b. In contrast, when also applying aliasing mitigation to the state update equation, the aliasing is significantly reduced, as seen in Figure 8c. Again, the aliasing mitigation is most effective at low frequencies, where it is also perceptually most relevant. As in the diode clipper example, for low frequencies the system with aliasing mitigation at f_s = 88.2 kHz performs at least as well as an unmodified system at f_s = 220.5 kHz, see Figure 8d.

Conclusions and Outlook
The presented approach for aliasing reduction generalizes the approach of [10] to all nonlinear systems that can be cast in a way that the nonlinearity takes scalar or very-low-dimensional input. This includes, but is not limited to, all models of circuits with a single one-port nonlinear element. If the system contains a delay-free loop, it has to be re-cast such that the nonlinearity is defined as the solution of the delay-free loop. Then, the delay introduced by applying the method of [10] to the nonlinearity can be compensated by adjusting the system's coefficients, even if the nonlinearity is part of a feedback loop.
As is to be expected, the achieved aliasing reduction is comparable to that of [10], allowing the required oversampling factor to be significantly reduced, especially for systems which introduce strong distortion, while the additional computational load is modest. Assuming lookup tables are used for f(u) (which in general is implicitly defined) and its antiderivative F(u), the main price to pay is in terms of memory used.
It should be noted that the extensions to higher-order antiderivatives as proposed in [14] or [15] should be straightforward, following the same principle. A more interesting future direction would be to lift the restriction on the nonlinear function to have only scalar or very-low-dimensional input. If the method of [10] (or even the higher-order extensions of [14] or [15]) could be generalized to nonlinear functions with multiple inputs without requiring excessively large lookup tables, the method proposed in the present paper would immediately generalize to all stateful nonlinear systems.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.