Fast Simulation of Laser Heating Processes on Thin Metal Plates with FFT Using CPU/GPU Hardware

: In ﬂexible manufacturing systems, fast feedback from simulation solutions is required for effective tool path planning and parameter optimization. In the particular sub-domain of laser heating/cutting of thin rectangular plates, current state-of-the-art methods include frequency-domain (spectral) analytic solutions that greatly reduce the required computational time in comparison to industry standard ﬁnite element based approaches. However, these spectral solutions have not been presented previously in terms of Fourier methods and Fast Fourier Transform (FFT) implementations. This manuscript presents four different schemes that translate the problem of laser heating of rectangular plates into equivalent FFT problems. The presented schemes make use of the FFT algorithm to reduce the computational time complexity of the problem from O ( M 2 N 2 ) to O ( MN log ( MN )) (with M × N being the discretization size of the plate). The test results show that the implemented schemes outperform previous non-FFT approaches both in CPU and GPU hardware, resulting in 100 × faster runs. Future work addresses thermal/stress analysis, non-rectangular geometries and non-linear interactions (such as material melting/ablation, convection and radiation heat transfer).


Introduction
Based on virtual modelling and simulation of physical phenomena, Industry 4.0 solutions aim to integrate interactive virtual worlds with their equivalent physical part (e.g., using digital twins). These solutions enable the development of decision making tools that can be of great use in the optimization of manufacturing processes.
In this context, engineering solutions use extensively Finite Element Analysis (FEA) for simulation of such physical phenomena (e.g., acoustics, heat transfer, structural analysis, fluid flow, etc.). However, FEA approaches require a great amount of computation resources. In contrast, spectral analysis and spectral methods are competitive alternatives to numerical simulations. These methods provide frequency-domain solutions (infinite sum of trigonometric functions) to the Partial Difference Equations (PDEs) that model such physical phenomena.
In the particular sub-domain of laser heating/cutting simulation, frequency-based algorithms have been developed for heat transfer analysis on rectangular plates. These algorithms are faster than traditional

Laser Heating/Cutting Simulation
Finite Element Analysis (FEA) is one of the most used methods for thermodynamic simulation of laser heating/cutting of metal plates. Using non-linear FEA, Yilbas et al. [4] simulate triangular cuts for residual stress analysis. Similarly, Akthar et al. [5,6] perform the same non-linear FEA analysis for rectangular cuts, while in Reference [7] circular cuts are studied using the same approach. In order to account for laser ablation (material melting and evaporation), different methods such as the enthalpy method [4][5][6][7], element birth and death [8], volume fractions [9], and temperature thresholds [10][11][12] have been presented.
Analytic methods provide significantly faster computations at the cost of some model simplifications. Zimmer [20] presents a uni-dimensional analytic model for laser drilling processes when the laser beam is static. Modest and Abakians [21] present a solution for a moving laser on an infinite 2D plate. Jiang and Dai [22] present a frequency-based solution for rectangular plates when the moving laser follows a straight path. Similarly, Mejia et al. [23,24] present a frequency-based solution for arbitrary laser trajectories. Finally, an extension of the previous frequency-based solutions applied to multiple laser beams simultaneously heating the plate surface is presented in Reference [25].

FFT-Based Laser Heating Simulation
FFT-based methods are relevant in the solution of physical problems by solving the inherent PDE in the frequency domain. As a consequence, these methods have been successfully implemented in the simulation of different physics phenomena. For example, in the context of heat transfer analysis, Ju and Farris [26] present an FFT-based method for the solution of the thermoelastic equation on infinite domains, while Dillenseger and Esneault [27] apply the FFT to the solution of a heat transfer problem that arises in treatments of tissue with cancer. In structural analysis, FFT-based methods have been developed for the solution of different elasticity and plasticity problems [28][29][30][31], and fluid mechanics [32]. Other applications of the FFT include electromagnetism [33], 1D signal processing [34], and 2D image processing [35].
As discussed previously, many authors [22][23][24][25] solve the problem of laser heating simulation in the frequency domain. However, none of these authors cast the problem into the FFT domain.

Conclusions of the Literature Review
Current analytic methods for simulation of the laser heating/cutting problem already provide fast solutions to the problem in the frequency domain. However, such methods perform brute-force evaluation of the Fourier transforms, whose computation complexity for a 2D plate is O(M 2 N 2 ). As a consequence, these applications quickly become computationally expensive as more resolution of the plate is required.
To overcome this problem, this manuscript presents four different schemes that cast the existing brute-force solutions into equivalent DST and DFT problems. Mathematical proof for the validity of each scheme is presented and algorithms that make use of FFT libraries are introduced, reducing the computational complexity of the problem from O(M 2 N 2 ) (squared) to O(MN log(MN)) (logarithmic). These algorithms are implemented both in CPU and GPU architectures. Numerical validation against the brute-force approach results in a measured absolute error that is below 10 −10 K along the 2D plate. The results show significant computation time improvements to such brute-force simulations (i.e., References [22][23][24][25]), reducing the measured computation times from 1 s to 0.01 s (100× faster) for a 1024 × 1024 rectangular plate, and enabling simulations for larger plate discretization sizes (up to 4096 × 4096).
This manuscript extends the work presented in Reference [3]. In this previous work, two of the four presented schemes are briefly introduced. The research presented in this paper presents two additional FFT schemes, and provides further details of the four schemes (with added illustrations), to make easier the understanding of the algorithms. Furthermore, new simulations have been executed and an application case of the algorithms being implemented into an interactive simulator is presented.

Heat Transfer Equation for Laser Heating on Thin Plates
The temperature u(x, y, t) on a 2D rectangular plate for a continuous laser beam source satisfies the following partial differential equation with initial and boundary conditions: where a × b × ∆z are the plate dimensions, ρ is the plate density, c p is the specific heat and κ is the thermal conductivity. q = q(x, y, t) is the heat loss due to convection at the plate surface, h is the convection coefficient and u ∞ is the ambient temperature. Finally, the heat source f = f (x, y, t) is defined as a square-shape moving laser beam: where R is the plate reflectivity, P is the laser power, r is the laser radius and x 0 (t) = [x 0 (t), y 0 (t)] is the location of the laser spot at time t. x 0 is the parametric curve that defines the laser trajectory, discretized as a sequence of piecewise linear trajectories as described in References [23,24]. The function f describes the laser power on the plate according to the distance (infinity norm) Figure 1 presents an scheme of the laser heating problem on thin metal plates.

Discrete Fourier Transform (DFT) and Fast Fourier Transform (FFT)
The Discrete Fourier Transform (DFT) allows to write any sequence of M real numbers as a finite sum of sine and cosine functions, that is, a Fourier series. The (1D) DFT of the sequence of real values G = {g 0 , g 1 , . . . , g M−1 } ⊂ R is defined as: where φ m ∈ C is the m th Fourier coefficient and i = √ −1 is the imaginary unit. The computational complexity for direct evaluation of Equation (7) is O(M 2 ), in which each g k requires M evaluations (one for each Fourier term φ m ).
The Fast Fourier Transform (FFT) [2,34] is an algorithm that performs a factorization of the DFT, reordering the Fourier terms and grouping them (into pairs) in order to avoid redundant computations between different g k terms. Such a grouping is possible due to symmetries of the sine and cosine functions, and the resulting evaluation is performed in recursive form [2,34]. As a consequence, the FFT algorithm reduces the computational complexity of the problem to O(M log M) [2,34].
The above DFT and FFT complexity orders are true for 1D arrays. Therefore, for a 2D discrete plate of size M × N, the computational complexities become O(M 2 N 2 ) and O(MN log(MN)) for the DFT and the FFT, respectively.
The remainder of this section describes how to cast Equation (3) as a DFT problem and therefore, solve it using any FFT algorithm. Such casting effectively improves the computational complexity of the problem with respect to the current state of the art [22][23][24][25].

Scheme 1-Discrete Sine Transform (DST)
The Discrete Sine Transform (DST) [36] is a particular case of the DFT transform in which only the sine terms of the Fourier series are considered. The (1D) DFT of the sequence G = {g 0 , g 1 , . . . , g M−1 } ⊂ R is defined as: Intuitively, this is the easiest of the schemes for casting the problem as Equation (3) only considers the sine terms of a Fourier series. The algorithm of such casting is discussed below. The reader may refer to Appendix A for the mathematical proof of the scheme.
Algorithm 1 presents the method used to retrieve the temperature at any given time t with the DST method (see Equation (A1)). Line 2 applies the fast 2D DST of any FFT library, which presents a computational complexity equivalent to a FFT (i.e., O(N log N) [37]). Line 3 applies the initial and boundary conditions presented in Equation (1) to the computed solution. The complexity of the presented algorithm is O(MN log(MN)).

Scheme 2-FFT Padded with Zeros
In this scheme, the original list of Fourier coefficients is duplicated in size in each direction (2M × 2N). The added coefficients are set to zero and the FFT algorithm is applied in each direction. The final temperature result is obtained from the imaginary (sine) component of the FFT result. The mathematical proof of the scheme is presented in Appendix B.
Algorithm 2 presents the method used to retrieve the temperature at any given time t using the zero padding method (see Equation (A3)). Line 1 initializes the extended matrix of Fourier coefficients with M, N trailing zeros (as per Figure 3). Lines 4 and 8 compute the 1D FFT of the padded arrays for the y and x dimensions, respectively. Lines 5 and 9 extract the complex (imaginary component) of the results. Finally, Line 12 removes the trailing zeros from the solution while Line 13 applies initial and boundary conditions. The complexity of the presented algorithm is O(MN log(MN)).

Scheme 3-Odd-Symmetry 1D FFT
In this scheme, the original list of Fourier coefficients is also duplicated in size in each direction (2M × 2N). The idea is to take advantage from the odd symmetry of the sine function at kπ (with k ∈ N, see Figure 4). Therefore, the added coefficients are set by mirroring the original M and N coefficients (multiplied by −1) in each direction (rows and columns). The final temperature is obtained from the imaginary (sine) component of the 1D FFT result in each direction. The mathematical proof of this scheme is presented in Appendix C. Algorithm 3 presents the method used to retrieve the temperature of Equation (A9) at any given time t using two nested 1D FFTs. Line 1 initializes the extended matrix of Fourier coefficients with M, N trailing zeros. Lines 3-5 and Lines 6-8 add the reversed sequences of Fourier coefficients (with negative sign) in each dimension, respectively (see Figure 5). Lines 10 and 14 compute the 1D FFT of the padded arrays for the y and x dimensions, respectively. Lines 11 and 15 extract the complex (imaginary component) of the result. Finally, Line 12 removes the mirrored part from solution while Line 13 applies initial and boundary conditions. The complexity of the presented algorithm is O(MN log(MN)).

Scheme 4-Odd-Symmetry 2D FFT
Finally, in this scheme the original list of coefficients is duplicated and mirrored in each direction exactly as in Section 3.6. However, this scheme also takes advantage of the even symmetry of the cosine function at 2kπ (k ∈ N, see Figure 6). Similar to the 1D odd-symmetry approach, the duplicated coefficients are mirrored in each direction (rows and columns), and multiplied by −1. The final temperature is retrieved from the the real component of the 2D FFT, which considers the sine components and the cosine components (that become 0 due to the cosine symmetry). The mathematical proof of the scheme is presented in Appendix D. Algorithm 4 presents the method used to retrieve the temperature of Equation (A16) at any given time t using a 2D FFT. Line 1 initializes the extended matrix of Fourier coefficients with M, N trailing zeros. Similar to the 1D odd symmetry method, Lines 3-8 add the reversed sequences of Fourier coefficients (with negative sign) in each direction (see Figure 5). Line 9 computes the 2D FFT of the extended Fourier matrix. Line 11 extracts the real part of the FFT solution and removes the mirrored part. Finally, Line 12 applies the initial and boundary conditions. The complexity of the presented algorithm is O (MN log(MN)).

Algorithm 4
Retrieve temperature using 2D FFTs by applying odd sine symmetry and even cosine symmetry to the original coefficients

Complexity Analysis
This section presents a complexity analysis of the presented algorithms. Table 1, presents the computational complexity of Algorithm 2, line by line. The number of operations in a 1D FFT is of N + N log N [2,34]. As M and N grow large, the dominant term for the total number of computer operations is 2MN(log(2M) + log(2N)). As a consequence, the resulting complexity is O(MN(log M + log N)) or equivalently, O(MN log(MN)).

Line Description Number of Operations Dominant Term Complexity Order
Algorithms 1, 3 and 4, present the same structure as Algorithm 2 does. Therefore, similar complexity analysis apply for Algorithms 1, 2, 3 and 4, resulting in the same complexity order O (MN log(MN)).

Results
This section presents the simulation and performance results of the implemented DST and FFT schemes using different state-of-the-art FFT libraries, for the solution of the laser heating problem on thin metal plates. All the simulations are executed with the parameters presented in Table 2 and the laser trajectory presented in Figure 2a. Section 4.1 presents the numerical validation of the presented schemes with respect to the brute-force algorithms [22][23][24][25]. Finally, Section 4.2 discusses the computational performance of the implemented schemes using available FFT libraries. Section 3 validates the mathematical correctness of the presented schemes. However, a numerical validation is presented in this section with numerical and graphical results for a 0.01 × 0.01 × 0.001 rectangular plate. Laser and material parameters are presented in Table 2 while the laser trajectory used for the tests is the same presented in Figure 2a. As a groundtruth, we choose the method presented in References [23][24][25]. This method already solves the problem presented in Equation (1) using a brute-force approach, which requires O(M 2 N 2 ) operations (as already discussed in Section 3.3. Figure  7 plots the temperature distribution results obtained with this brute-force method.  Figure 2a obtained by the brute-force method [24]. No FFT or Discrete Sine Transform (DST) is used. Figure 8a plots the temperature distribution at the end of the laser trajectory, computed with the DST algorithm for a 1024 × 1024 plate discretization. Figure 8b plots the same result computed with the zero padding FFT algorithm. The absolute error for the DST and the zero paddding FFT result (w.r.t. the brute-force approach) is presented in Figures 8c,d, respectively. The measured absolute error is below 10 −10 (K) in both cases. It is worth pointing out that this error is evenly distributed through the 2D plate, which means that such error is not sensitive to the laser path or any other geometric features (such as the domain boundaries).
Similarly, Figure 9a,b plot the temperature distributions at the end of the laser trajectory for the 1D symmetric FFT and the 2D symmetric FFT algorithms, respectively. Figure 9c,d plot the absolute error for the 1D symmetric and 2D symmetric FFTs, respectively. Again, the error is below 10 −10 , evenly distributed through the 2D plate.

Computational Performance
This section evaluates the performance of the proposed methods under CPU and GPU hardware architectures by making use of highly optimized FFT libraries. The Python programming language includes in its scientific package ecosystem high level wrappers to C/C++ libraries. For this reason, Python has been selected for the rapid prototyping of the proposed schemes in this work.
The FFT algorithm is used in a wide range of performance demanding applications. Therefore, the optimization degree of its implementation is highly relevant. On the one hand, to target the CPU, the FTTPACK, MKL and FFTW libraries have been selected. On the other hand, to target the GPU, the cuFFT library from the NVIDIA CUDA Toolkit has been used. All these libraries make use of multi-core parallelization, vectorization instructions, efficient memory usage, and apply specific FFT algorithms to exploit the underlying hardware to the highest degree. It is worth noticing that the FFTPACK library is the only one (between the aforementioned ones) that provides an implementation of the DST. Table 3 summarizes the selected libraries along the Python wrapper packages and the targeted hardware device during the performance tests. Two test platforms have been used for the performance measurements: i) a desktop PC using Windows 10 with an Intel Core i5-6500 (CPU), 16 GB RAM and NVIDIA GeForce GTX 960 (GPU) and ii) a desktop PC using Manjaro (GNU/Linux) with an Intel Core i7-4700K (CPU), 16GB RAM and NVIDIA GeForce RTX 2060 (GPU). To measure the execution times of each proposed method, each test has been computed 5 times and the minimum time has been registered.
This section is divided into four subsections. Section 4.2.1 presents the computation times using the CPU, while Section 4.2.2 presents the computational times using GPU hardware. Then, Section 4.2.3 compares the performance difference between both devices. Finally, Section 4.2.4 presents the achieved speed-up against the state of the art brute-force solution [22][23][24][25]. Figure 10 shows the computation time of the proposed schemes using the FFTPACK, MKL and FFTW libraries, respectively. These are all implemented to be executed in general CPU hardware.   Figure 10a,b show all the proposed schemes implemented with the FFTPACK library. The FFTPACK is the only library (between the used ones in this manuscripts) that has an implementation of the DST algorithm. This DST implementation is efficient for plate discretization sizes of 512 × 512 and 1024 × 1024. However, its performance is not as consistent as the FFT based methods. Overall, the performance of the FFT-based methods with different input size are more stable, being the 1D odd symmetric FFT scheme the best approach using the FFTPACK library. Figure 10c,d show the execution times of the temperature evaluation making use of the MKL library. In this case, from the FFT-based methods, both the 1D and 2D odd symmetric schemes are the most efficient. Figure 10e,f show the computation times using the FFTW library. Although, quite close to the results obtained with the MKL library, the FFTW results are the best when using the CPU device. In this case, also both the 1D and 2D odd symmetric schemes are the most efficient.

CPU Performance Measurements
The optimization degree achieved for the FFT algorithms with the MKL and FFTW libraries is higher. These libraries make better use of the underlying hardware, obtaining faster results than the FFTPACK library for the FFT-based methods. Results obtained with the FFTW library are slightly better (faster) than the MKL ones. However, this can be due to the usage of wrappers, as the pyfftw (FFTW) wrapper offers more control over the implementation. Nonetheless, the obtained results greatly surpass the state of art, both FFTW and MKL have shown execution times under 1s for plate sizes up to 4096 × 4096. While the zero padding and the 1D odd symmetric implementations produce similar results (in terms of computation time), the 2D odd symmetric scheme is by far the most performant. As the Fourier coefficients can be computed in the GPU before performing the temperature computation, the input for the FFT is already in GPU memory. It is worth to point out that the transfer of these coefficients from host memory (CPU) to device memory (GPU) is not measured. Figure 12a shows an overview of the computation times for the proposed DST (FFTPACK only) and FFT (FFTPACK, MKL, FFTW and cuFFT) methods. The FFTPACK (red) is the slowest and cuFFT (yellow) is the fastest. Execution times for both the MKL (blue) and FFTW (green) libraries are similar, obtaining slightly faster results with FFTW. Overall, the GPU hardware acceleration (with cuFFT) provides a considerable speed-up, making it a good alternative to consider for simulations on plates with large discretization sizes. Figure 12b compares the execution times of the two test platforms considering both CPU and GPU devices for the most performant FFT method: the 2D odd symmetric algorithm. This comparison shows that the GPU hardware effectively accelerates the computation time, between the fastest CPU (i7-4700K) and the slowest GPU (GTX 960), obtaining up to a 2× speed-up for plate sizes larger than 1024 × 1024. The performance difference increases as the input plate increases in size. Using more recent GPU hardware (RTX 2060) results show a bigger difference in the achievable compute time speed-up.  Figure 13 compares the proposed FFT method with the state of the art (SoA) GPU brute-force solution [25]. The presented FFT method is much faster for plate sizes larger than 128 × 128, showing a big difference in computing times with a plate size of 1024 × 1024, where the FFT approach obtains a 124× speed-up (2.255 s against 0.018 s). Figure 13 demonstrates the potential of the presented FFT method to perform the temperature evaluation for high resolution plate sizes (1024 × 1024 and beyond). Furthermore, the current brute-force solution [25] has a limit size of 1024 × 1024 due to GPU shared memory usage, while the proposed FFT approach can compute the temperature for plates of sizes up to 4096 × 4096 under the same GPU hardware, without resorting to out-of-core GPU memory management. For small plate sizes (smaller than 128 × 128), the brute-force approach is faster due to the FFT method requiring extra processing of input coefficients and dispatching of kernels (scheduling time), adding a small computation overhead.

Method
Analytic SoA FFT_2D_oddsym Figure 13. Appraisal of the computation times using an NVIDIA GeForce GTX960 (GPU) for the presented 2D odd symmetric FFT method vs the brute-force method presented in Reference [25]

Interactive Simulator Prototype
This section presents the integration of the presented FFT-based schemes into a 3D interactive simulator for CNC (Computer Numeric Control) laser machining. The prototype integrates a physical module for the temperature computation and a geometry module that computes the plate cutting through time [23]. The physical module implements the GPU-based FFT algorithms presented in this manuscript for the temperature computation at interactive rates while the geometry module performs boolean operations as discussed in References [38,39].
The current prototype provides interactive simulation of the laser heating/cutting process, visualized as a continuous animation. Interactively, the user can inspect the plate and its temperature at any specific timestep. Furthermore, the fast computation speed enables the possibility to run different simulations with different parameters in an interactive manner. Figure 14 shows the virtual simulator for the test case discussed in this manuscript. The development of interactive virtual worlds connected with physical objects (e.g., digital twins), has become a key technology for fast assessment of manufacturing processes [1]. In this context, an interactive CNC machine (as the integrated prototype) provides several tools to the engineer for the design of efficient CNC programs (plate, laser parameters and trajectory), reducing the requirement of real-world tests and consequently, reducing costs in terms of energy consumption, material waste, machining times, and so forth.
The performance of the simulation is very important in the decision making process and particularly, in the optimization of a given CNC program. Given a fixed time to design the CNC program, it is important to test and tune the different program parameters (such as laser parameters and trajectory). The current approach shows a significant decrease in the simulation computing time. This saving allows user-assisted and/or automated optimization programs to evaluate the different scenarios at a reasonable computational cost. The impact of such optimization on the quality of the workpiece will depend on the actual decisions and strategies applied by the engineers, supported by the simulation results.
Although not being part of the present investigation, it is worth noticing that the temperature maps on the plate help to predict and control deformations caused by thermal residual stresses.

Conclusions and Future Work
This manuscript presents four different schemes for the solution of the laser heating problem on thin metal plates using the DST and the FFT. The presented methods reduce the computational complexity of the problem from O(M 2 N 2 ) to O(MN log(MN)) (with M × N being the discretization size of the metal plate). It is worth noting that forced convection and radiative heat transfer are not present in the solution considered in this manuscript. The inclusion of those effects in the simulation implies a significant change in the structure of the mathematical model, which is out of the scope of this research. Overcoming of such a limitation is left as future work.
These schemes are implemented in both CPU and GPU architectures using available optimized FFT libraries. Mathematical and numerical proofs of the correctness of the schemes are presented and the numerical error is measured below 10 −10 K (and independent of the laser trajectory).
The performance evaluation shows that the minimum achievable computation time varies in function of the used library, specially for big input sizes. Furthermore, the obtained results improve the state of the art [25] in both CPU and GPU platforms for all the proposed schemes. Specifically, using GPU hardware, the computation times for the temperature evaluation are reduced from 1 s to 0.01 s (100× faster), measured in an NVIDIA GeForce GTX 960 (GPU).
With more modern GPU hardware (GeForce RTX2060) even faster results can be acquired which shows the potential of the proposed algorithms towards real-time laser heating/cutting simulations and flexible manufacturing scenarios that require fast tool-planning capability and laser parameter optimization to easily adapt to customer order changes.
Future work includes (1) the inclusion of thermal/stress models for structural analysis of the plate after the generated high temperature gradients, (2) analysis of non-rectangular plate geometries, and (3) consideration of non-linear interactions such as temperature-dependent thermal properties, forced convection, radiation heat transfer and phase changes.

Abbreviations
The following abbreviations are used in this manuscript: Total simulation time (s).
Laser spot location at a given time x 0 (t) = (x 0 (t), y 0 (t)). with u kl (t) = u(x k , y l , t) the temperature at the discrete points of the plate and γ m = mπ/M, δ n = nπ/N the discrete versions of α m and β n , respectively. This equation is equivalent to a 2D DST of the temperatures on the discrete plate (as per Equation (8)).

Appendix B. Scheme 2-FFT Padded with Zeros
Consider {x 0 , x 1 , . . . , x M } be a uniform discretization of the interval [0, a]. For M Fourier coefficients the following equation holds: where I[·] corresponds to the complex component of the series and γ m = mπ/M. This corresponds to a 1D DFT ( Equation (7)) with M trailing zeros.
After applying the same procedure to the sequence {y 0 , y 1 , . . . , y N }, Equation (3) becomes: The previous equation is equivalent to 2 nested 1D DFTs ( Equation (7)) after adding 1 zero at the beginning of the Fourier sequence and M, N zeros (x and y components, respectively) at the end of the sequence.
Putting together Equations (A7) and (A8), Equation (3)  The previous equation is equivalent to 2 nested 1D DFTs ( Equation (7)) after padding the M − 2, N − 2 coefficients in reverse order (and negative) at the end of the original Fourier coefficients in each direction (x and y), respectively. The final result is retrieved by taking the complex part (i.e., the sine component) of each 1D DFT.

Appendix D. Scheme 4-Odd-Symmetry 2D FFT
In this scheme, consider the real part D kl (instead of the complex one) of Equation (A9) as follows: in which the first and second terms cancel out, as well as terms three and four, respectively. Therefore: Finally, we add D kl to Equation (A9): u kl (t) = u kl (t) + D kl