Time-Domain Multidimensional Deconvolution: A Physically Reliable and Stable Preconditioned Implementation

: Multidimensional deconvolution constitutes an essential operation in a variety of geophysical scenarios at different scales ranging from reservoir to crustal, as it appears in applications such as surface multiple elimination, target-oriented redatuming, and interferometric body-wave retrieval just to name a few. Depending on the use case, active, microseismic, or teleseismic signals are used to reconstruct the broadband response that would have been recorded between two observation points as if one were a virtual source. Reconstructing such a response relies on the the solution of an ill-conditioned linear inverse problem sensitive to noise and artifacts due to incomplete acquisition, limited sources, and band-limited data. Typically, this inversion is performed in the Fourier domain where the inverse problem is solved per frequency via direct or iterative solvers. While this inversion is in theory meant to remove spurious events from cross-correlation gathers and to correct amplitudes, difﬁculties arise in the estimation of optimal regularization parameters, which are worsened by the fact they must be estimated at each frequency independently. Here we show the beneﬁts of formulating the problem in the time domain and introduce a number of physical constraints that naturally drive the inversion towards a reduced set of stable, meaningful solutions. By exploiting reciprocity, time causality, and frequency-wavenumber locality a set of preconditioners are included at minimal additional cost as a way to alleviate the dependency on an optimal damping parameter to stabilize the inversion. With an interferometric redatuming example, we demonstrate how our time domain implementation successfully reconstructs the overburden-free reﬂection response beneath a complex salt body from noise-contaminated up- and down-going transmission responses at the target level.


Introduction
Seismic recordings carry useful information describing the interaction of waves propagating in an otherwise unknown physical medium. The recorded wavefields are associated with a particular wave state that can be uniquely identified by defining wave nature, media parameters, source extension, and boundary conditions. In theory, depending on the application, there is a certain degree of freedom in setting specific conditions to reproduce the desired survey. In reality, multiple physical, economical, or environmental aspects constraint the circumstances in which seismic exploration is conducted. To overcome such limitations, based on survey measurements, data-driven techniques are typically used to recreate more ideal settings, also known as virtual surveys. Most of the efforts in many pre-processing workflows consist of identifying and separating certain family of events present in the recorded seismic gathers. For instance, one may be interested in removing the source wavelet imprint, ground-roll, or surface-related multiples. An effective strategy to connect the elements intervening in two seismic states consists of defining an interaction quantity and use reciprocity theorems to isolate the response of a sought state [1]. Thereby, it becomes possible to combine wavefield measurements at two independent receivers to reproduce the response that would have been observed at one location as if a source was injected at the other one. Ideally, Green's function should emerge once cross-correlation is performed at each receiver inside a closed energy-radiating boundary [2][3][4]. The set of methods describing such wavefield reconstruction are referred to as seismic interferometry and have been extensively used for sources of different natural events including localized explosions and earthquakes [5], ambient noise [6], and infrasonic transducers [7]. From this perspective, interferometric methods are also used to relocate sources from the acquisition surface to a preset virtual plane in the subsurface, therefore, the technique is often interpreted as a form of data-driven redatuming [8,9]. The need for a closed boundary in a lossless media hinders the accurate wavefield reconstruction by cross-correlation, thereby, only a blurred version of the required Green's function is retrieved. The extension of the method for dissipative media with open boundaries results in an implicit representation of Green's function, where deconvolution is now used to remove the spatio-temporal blurring effects of the correlation gathers. This method is known as seismic interferometry by multidimensional deconvolution (MDD- [4,10]) and has been extended to more general vector fields including elastic [11] and electromagnetic waves [12].
Though MDD plays a crucial role in seismic interferometry, its applicability extends well beyond interferometry, being of importance to many applications-from wavemode separation to imaging. Many applications of MDD take advantage of the fact that, in principle, it does not require knowledge of the structural parameters, hence, one can build connections with well-known surface-related-multiple preprocessing techniques. For instance, in marine seismic operations, image resolution is degraded by ghost reflections. When all parameters of the linear ghost model are known, deghosting is conducted as a deterministic deconvolution [13]. Similarly, a relatively close problem is source wavelet estimation and removal. In effect, source deconvolution is a critical preprocessing step in Marchenko redatuming [14,15] and adaptive surface-related multiple elimination [16,17]. On the other hand, in a simultaneous-source acquisition experiment [18][19][20], where responses to sources ignited at different time delays are recorded, deblending has also been interpreted as a form of seismic interferometry by multidimensional deconvolution [21]. The advantage of this approach is that most of the crosstalk artifacts mapping in the image domain are reduced. To a large extent, many of these operations can be carried out in a single step by considering up/down multidimensional inverse filtering. In offshore seismic surveys, it is well known that the water-air interface and the ocean bottom are strong reflectors. Early methods to remove water-layer reverberations consider a deconvolution filter [22]. Later, Noah's single-channel deconvolution was used to remove surface-related multiple effects [23]. More recently, the advent of multicomponent ocean-bottom seismic data (OBS), enabled data-driven wavefield separation and the method was extended to multidimensional fields [24].
Whilst redatuming by MDD represents a clear step in the direction of obtaining more accurate wavefields than its correlation counterpart, it comes with a set of additional numerical challenges due to the ill-posed nature of the associated inverse problem. This has generally prevented the wide adoption of MDD in the geophysical community; however, recent interest in the topic has led to the development of a variety of strategies to stabilize such an inversion. Historically, MDD is approached in the frequency domain where multiple inverse problems are solved independently for each frequency in the available seismic bandwidth range, and the resulting solutions are then recombined and transformed back into the time domain. Whilst direct inversion of the kernel matrix is generally adopated [4], alternative solutions have been proposed in the literature. For example, Minato et al. [25] apply singular-value decomposition (SVD) to compute the pseudoinverse matrix and truncate small eigenvalues as a way to naturally regularize the inverse process. More recently, Boiero and Bagaini [26] suggest to introduce a regularization term that enforces similarity of the solution between neighbour receivers in the solution of each inverse problem.
Alternatively, MDD can also be solved in the time domain as a single coupled inverse problem: whilst this approach is less known and not widely adopted yet, it has recently been shown to be more robust as solving for the entire frequency spectrum at once acts as a natural regularizer and does not require the identification of damping parameters for each frequency independently. In this direction, pioneering work has been carried out by van der Neut and Herrmann [27], van der Neut et al.
[28], Luiken and van Leeuwen [29], Ravasi and Vasconcelos [30], who also proposed the use of different regularization and preconditioning strategies to further alleviate the ill-posedness of the inverse problem. More specifically, van der Neut and Herrmann [27] investigate the use of sparsifying transforms (i.e., Curvelets) in combination with sparsity-promoting inversion to improve the robustness of the solution to noise in the data and mitigates artefacts arising from limited-aperture source arrays. Van der Neut et al. and Luiken and van Leeuwen [28,29] on the other hand, introduce physics-based preconditioners to guarantee the solution of the inverse problem to satisfy a priori information such as causality and reciprocity. Finally, in the context of acoustic laboratory experiments, time-domain MDD has also been recently shown great promise to remove domain boundary effects [7].
Nevertheless, a systematic comparison between the frequency and time domain implementations of MDD and an in-depth analysis of the effect of different preconditioning strategies is not present to date in the literature. In this paper, we aim to fill such a gap and provide a thorough evaluation of the impact of different preconditioning strategies in the solution of MDD problems. In our analysis, we first revisit the multidimensional deconvolution method and provide an overview of its practical implementation in the frequency domain. Similarly to previous studies, we also argue that difficulties mostly arise when selecting an optimal regularization parameter per frequency to stabilize the problem. We also reformulate the time-domain constrained least-squares inversion to include three types of physical preconditioners, namely time causality, reciprocity, and frequency-wavenumber locality. More importantly, by using full-waveform transmission wavefields from a subsalt synthetic dataset we consider two different scenarios, labeled here as easy and difficult. The first case is concerned with the situation where the input data and the kernel of the MDC modeling operator are noise-free-e.g., directly modeled or recorded data. The second refers to the case where these wavefields are obtained by means of another inverse process, for example Marchenko redatuming [14,31,32] or Full-Wavefield-Migration [33,34], which lead to input wavefields inevitably contaminated by both coherent and incoherent noise. Especially in the latter case, we show the importance of solving the MDD problem in the time domain alongside with the proposed hybrid physical preconditioners to produce superior wavefield prediction in terms of amplitude and phase fidelity, indicating the benefits of this approach in controlling the implicit noise amplification.

Seismic Wavefield Redatuming
Interferometric redatuming by deconvolution dates back to the 1D data-driven transform method proposed by Riley and Claerbout [23]. The underlying idea of this technique is to remove surface-related multiples (SRM) from seismic reflection data recorded at an open boundary on the surface of the Earth by means of single-channel deconvolution of the downgoing wavefield from the upcoming wavefield at each receiver location. More recently, this method has been extended to work directly with multi-dimensional seismic data and it is generally referred to as multi-dimensional deconvolution (MDD) in the context of surface multiple removal for ocean-bottom acquisition systems [24,35,36]. Using one-way reciprocity theorems, Wapenaar et al. [12] propose a generalization of the method for the case of 3D laterally variant media in the presence of dissipation effects. In this section, we briefly review the main theoretical aspects behind interferometric redatuming for acoustic fields, followed by a description of its practical implementation in the time and frequency domain. Even though convolutional models for more general vector fields exist, we restrict our notation to scalar fields only and recall that derivations for electromagnetic [12] and elastic [11,37] waves follow similar arguments along with the ones in this section. As for the actual implementation of the method, the current practice of estimating a virtual reflection response in the space-frequency domain considers the introduction of a suitable regularization parameter to stabilize the inversion [38][39][40][41]. Contrary to that approach, for solutions in the space-time domain, we propose a set of efficient preconditioners designed to conform to the implicit physics of wave phenomena, namely, reciprocity, causality, and locality/compactness in the frequency-wavenumber domain. Additional constraints are also imposed when considering the localized mapping of the cross-correlation function (CCF) in the frequency-wavenumber domain. This implementation substantially reduces the numerical artifacts present in the retrieved reflectivity and positively influences subsequent imaging and inversion studies due to accurate waveform kinematics and amplitude estimation.

Two-Way Multidimensional Deconvolution
Rayleigh's reciprocity theorems describe the interaction of seismic sources and wavefields in two distinct physical states [1,42]. They comprise the theoretical foundation of seismic interferometry and lead to Green's function representations describing wave propagation between two observation points in the media, in terms of data measured at such locations. The specific configuration of choice in the two states combined with a convenient interaction quantity determines the domain of application for which the representation is derived. In particular, consider a heterogeneous volume, D, representing a section of the medium characterized by wave propagation velocity c(x) and density ρ(x), enclosed by a surface ∂D. In the reference state (Figure 1a), the broadband pressure and particle velocity wavefields at receiver location x r due to an impulsive monopole source at location x r are given by Green's functionḠ p (x r , x r ; ω) andḠ v n (x r , x r ; ω), respectively. Note that the subscript n is used here to indicated that the particle velocity measurement must be oriented normal to the enclosing surface. Similarly, in the actual medium (Figure 1b), the band-limited pressure and particle velocity recordings at receiver x r from a source at x s are indicated by p(x r , x s ; ω) = G p (x r , x s ; ω)s(ω) and v n (x r , x s ; ω) = G v n (x r , x s ; ω)s(ω), respectively, where s(ω) represents the possibly unknown source signature. Note that the source at location x s can be of any type (i.e., monopole or dipole) as long as this is consistent for all the wavefields in this state; moreover, we consider a configuration where x s is located outside ∂D, whereas x r lies within the volume D. Starting from Rayleigh's reciprocity theorem of the convolution type, and considering the interaction quantity ∂ i {P A V i,B − V i,A P B } (subscripts A and B refer to the reference and actual media), the convolution representation theorem for Green's functionḠ in the space-frequency domain is derived [4].
where the frequency dependency has been omitted for brevity. Note that this representation is valid for dissipative and lossless media. Deriving Equation (1) involves the Sommerfeld radiation condition, which simplifies the problem by assuming the enclosing surface, ∂D = ∂D r 0 + ∂D r , with ∂D r a plane at depth level z r , and ∂D r 0 a semisphere segment ( Figure 1). In such configuration, ∂D r 0 does not contribute to the integral when its radius is arbitrarily large [4,24]. Additionally, the medium outside D can be different in both states, but it is required that inside the properties remain the same so that all contrast terms in Rayleigh's reciprocity theorems can be dropped. In particular, the effects of a homogeneous reference medium outside of D can be enforced when considering proper boundary conditions for the incoming waves. Finally, we note that the process of multiplying two wavefields and integrating over the receiver boundary ∂D r as performed twice in Equation (1) is generally referred to as multi-dimensional convolution (MDC).  (1), the Green's functionḠ(x r , x r ) in the reference configuration (a), can be obtained by deconvolving the wavefield observed at x r from that in x r (b). Seismic waves due to sources x s (grey stars) distributed along the surface ∂D s are measured by receivers (triangles) at ∂D r . After MDD, the receiver at location x r (blue triangle) is turned into a virtual source and its response is observed at the other receivers (red triangle). Note that G(x r , x s ) and G(x r , x s ) are full transmission responses carrying all orders of internal as well as surface related multiples. (c) Schematic representation of the reference responseḠ for a fixed time. Rows correspond to common shot gathers associated with an specific virtual source, whereas Columns are common receiver gathers; numbers indicate the rows and columns used to display the estimated reflection response in the subsequent figures.
At this point, the problem of findingḠ p (x r , x r ) andḠ v n (x r , x r ) under the integral can be resolved in terms of multidimensional deconvolution by assuming p(x r , x s ), p(x r , x s ), and v n (x r , x s ) to be known. Solving the integral Equation (1) requires a sufficiently large amount of sources x s radiating energy into the volume D such that the implicit ill-posedness of the problem can be reduced. Given that, in practice, only a limited set of sources and receivers are deployed at discrete locations, it is convenient to discretize the representation (1) and recast the integral in a compact matrix-vector form for each frequency [43], Here, matrix D consists of the stack of velocity and pressure recordings where each of them performs a multidimensional convolution operation in the frequency domain. On the other hand, the sought solution,ḡ, is the concatenation of pressure and particle velocity Green's functions in the reference state. Finally, whilst we have considered here the case of a single virtual source x r , Equation (2) can be easily modified to include multiple virtual sources as commonly found in the literature [4]; in such a case, both p andḡ become matrices whose columns are associated to different virtual sources.
Depending on the specific solution, one can constrain the inversion by introducing prior information in the form of regularization. Ravasi et al. [43] proposes a directionality operator based on the one-way wavefield separation filter [44][45][46]. Such a prior steers the inversion in favor of the desired solution.

One-Way Multidimensional Deconvolution
Assuming one is interested in reconstructing the overburden-free properties of a wavefield propagating in the reference configuration only (Figure 1a), two-way deconvolution may not be a suitable implementation in the most general case. Except for situations when directionality priors are considered, the retrieved wavefield includes any scattering effects introduced by the overburden above the datum level z r . Alternatively, by imposing that the reference medium is homogeneous above ∂D r , the reference Green's function is only composed of upcoming waves at the boundary ∂D r , i.e.,Ḡ =Ḡ − . Effectively, this argument implies thatḠ + is discarded once all wavefields in Equation (1) are expressed in terms of their pressure normalized wavefield separated components, i.e., G = G + + G − [4]. Therefore, Equation (1) yields, or alternatively These are two alternative versions of the one-way implicit representation for the broadband up-going responseḠ p at observation point x r due to a localized virtual source at x r radiating energy downwards into the medium D. Once again, MDD is required to estimate the unknown quantityḠ p (orḠ v n ). In essence, one-way MDD enables the reconstruction of a virtual survey beneath the overburden such that the reflection response is deprived from all interactions with the medium above ∂D r . Finally, by invoking a far field approximation [47], Equations (4) and (5) can be further simplified to consider only monopole sources and pressure receivers where the medium parameters scaling factor has been incorporated for simplicity intoP + . For historical reasons, one-way MDD is the most widely used interferometric redatuming technique in seismic exploration, whereas the two-way MDD represents as a more general description, for which in principle data separation is not required.

Frequency Domain MDD-Regularized Least Squares
Irrespective of the integral representation and the specificities of the deconvolution problem one attempts to solve, Equations (3)-(6) can be written for general fields in terms of a Fredholm integral of the first kind where, Q(x r , x s ) is the integral kernel operator, G(x r , x r ) the wavefield in the model space we wish to recover, and U(x r , x s ) is the wavefield in the data space. A summary of particular discrete representations for one and two-way wavefields in terms of the general notation is given in Table 1. Our goal is to solve the linear inverse problem (7) to find the unknown model parameters G that better explain the data U. In contrast to most inverse problems, where the kernel is well defined and encodes the underlying physics, in this case, both U and Q are measurements, in general polluted with noise. Examples of this situation are recorded wavefields as in the virtual source method [3], or wavefields derived from model-driven redatuming techniques such as Full-Wavefield Migration (FWM- [33]) or data-driven redatuming techniques such a Marchenko redatuming [14,31]. This peculiarity negatively impacts the stability properties of the MDD problem making it more challenging than other inverse problems. In spite of such consideration, we can still attempt to solve Equation (7) in terms of a regularized least-squares problem: where the l 2 -norm is used to evaluate the distance from the observed data to the numerical estimation, and the regularization parameter, λ, weights the information in the data against that in the prior. The minimum norm solution to Equation (8) is then written as in the normal equations, Table 1. Summary of acoustic MDD representations in compact matrix-vector notation for one-and two-way wavefields according to the general representation (7). The fifth and sixth column, C, Γ, comprise cross-correlation and point spread functions, respectively.
We note that the first term on the left-hand side is the well known cross-correlation function, C = Q H U, while the point spread function, Γ = Q H Q, appears as a blurring operator acting on the sought solution. Inversion of the normal equations, C = ΓG, in the frequency domain is commonly done for each frequency slice individually. Since the stabilized point spread function is inverted per frequency component, the required memory is such that, operator and data matrices of size (n r × n s ) can be explicitly defined, therefore, this strategy is computationally feasible and in many cases direct or block Krylov methods can be used [48,49]. A major drawback in solving (8) lies in the difficulty of selecting an optimal regularization parameter, in fact, many of the parameter selection methods rely on solving the system multiples times and select λ based on an additional minimization problem [50]. Moreover, no theory exists to define how to choose different λ for each frequency inversion that are consistent with each other. Equally important is the source to receiver ratio and the band-limited character of the sources [29]. Finally, the absence of a clear causality constraint in ω-domain lead to solutions containing acausal events.

Time Domain MDD-Constrained Least Squares
An alternative implementation of deconvolution is derived by transforming the Fredholm integral (7) from the frequency into the time domain. Once again, the data U and unknown response G are related through Multi-Dimensional convolution with a given kernel,Q. Contrary to the element wise multiplication in the frequency domain, here the kernel of the Fredholm integral constitutes a time domain convolution. In the time domain, the linear inverse problem reads, where the circumflex is used to stress the fact that the multi-dimensional convolution is here performed in the time domain and therefore it cannot be decoupled on a frequencyby-frequency basis. A fundamental distinction from the frequency domain version lies in the fact that the time-domain operator Q is too large to be explicitly formed and inverted, an aspect that calls for the use of matrix-free linear algebra and optimization techniques. In other words, the use of direct solvers is prohibited by the size of the matrix representation of the operator, and therefore the solution g is usually approximated using an iterative solver such as LSQR [51]. Such a solver only requires the implementation of the forward and adjoint operations of the modelling and preconditioning operators on a given data vector. From an implementation perspective, the multi-dimensional convolution is more efficiently solved in the Fourier domain, thus, an equivalent time domain representation requires the definition of an operator Q = F −1 QF acting on a given vector and performs a step of batch matrix multiplication as described by Ravasi and Vasconcelos [30]. Here F and F −1 , are forward and inverse space-time Fourier transform operators.
In general the problem (10) remains ill-posed provided that in typical surveys only a few sources are considered, nonetheless, the conditioning of the operator decreases as the sources/receivers density increases for large apertures. In an effort to alleviate these issues, further physical priors can be included in the solution of the inverse problem primarily to impose desired physical constraints on the solution while also stabilizing the inverse problem. The inverse problem is therefore posed as a constrained least-squares problem which enforces the reconstruction of g to belong to a restricted subset of solutions G [29], If we now define a projection operator P that can enforce the solution to be within the set of admissible solutions G, the constrained problem can be equivalently written as a preconditioned unconstrained least-squares problem: where g = Pz. Note that a linear operator is said to be a projection only when the following relation is satisfied, P = P 2 , or in other words when applying the operator twice or more times produces the same result as applying it once. Here, we seek to reconstruct z by considering the associated normal equations, Conceptually, this formulation is no different from the result in Equation (9), except that introducing the preconditioner P is a necessary condition to retrieve a physically reliable and stable solution. Moreover, in connection with the use of iterative methods for solving linear systems, in our experience, preconditioning also helps to improve the convergence by steering the solution towards the desired subset of solutions G. The implementation of our time-domain MDD is based on the PyLops framework for matrix-free linear algebra [52].

Physics-Inspired Preconditioners
We now introduce a set of efficient preconditioners which allow enforcing our prior belief on the solution in terms of physical constraints at each iteration as the optimization scheme progresses. Such constrains should, in principle, capture the physics of wave propagation and enable a successful reconstruction by driving the inversion towards a meaningful set of solutions. First, we seek causal solutions for which any event arriving before the direct wave is rejected, while ensuring that all internal multiples at t ≥ τ d (x r , x r ) are retained. The key elements of causality are introduced through a space-time window which is applied to the sequence of partial solutions of an iterative method [29]. Such preconditioner is typically built with the help of an eikonal solver using approximated velocities at the redatuming level z r , and can be formally defined as where τ d is the direct-wave's travel time. In the frequency domain the implicit Hadamard product Θ G turns into a convolution, limiting any practical implementation of causality in the Fourier domain. Note that since the Hadamard product is self-adjoint Θ = Θ H , this projection is said to be orthogonal. Second, we exploit the symmetry properties of the sought wavefield by introducing a reciprocity constraint that enforces the reflection response to remain the same when sources and receivers are interchanged. Mathematically speaking, source-receiver reciprocity is achieved by ensuring that the reconstructed wavefield is the same when the spatial coordinates are transposed. Formally, the prescribed prior can be written as follows where the operator Y reshapes the model vector z into a three-dimensional tensor with x r , x r , and t along the three directions, transposes the first and second dimensions, and averages the two tensors together before reshaping the resulting tensor back to a one dimensional vector. Enforcing this preconditioner projects the model vector onto the space of symmetric solutions, thereby, this physical prior makes sure that any solution for which reciprocity is not satisfied belongs to an undesired subspace and therefore is discarded. Once again, this projection operator is self-adjoint and therefore orthogonal. A third preconditioner is designed in the frequency-wavenumber and it is aimed at enforcing the spectrum of our solution to be placed within the expected signal cone [53]. The signal cone is here designed based on the cross-correlation function C. In essence, since the CCF is a blurred version of the impulse response G, it captures most of its features whilst being at the same time polluted with artifacts and wrong amplitudes. Despite the presence of unphysical events, we can still use its f -k spectrum to constrain the inversion by rejecting solutions outside a specific frequency range. In consequence, we introduce a 3D f -k filter adhering to the CCF frequency content as Here c min is a design parameter and corresponds to the minimum velocity to be retained. As iteration progress corrupted information leaks into the solution due to small singular values amplifying any noise present in the data, as a result the reconstruction is degraded and many coherent short-scale length features and fringes are observed. Likely that information corresponds to low velocity events mapping in the reject region of the frequency-wavenumber filter (16) and can be removed. Effectively, this form of implicit regularization eliminates short-wavelength that are not expected to be resolvable within the given frequency bandwidth. In this sense, this preconditioning enforces a smoothing constrain suppressing potential spurious model updates and enhance the convergence toward a well-behaved solution lying in the correct subspace.
In general, we can easily prove the benefit of repeatedly applying those preconditioners at each iteration of the inverse process instead of only once directly on the final solution. This comes from the realisation that, whilst our preconditioners acts locally on the solution either in time-space or frequency-wavenumber domain, the MDC operator acts globally on the model and data vectors within the entire time-space domain. In other words, taking the time causality preconditoner as an example, let us consider a single time-space location in the model vector and follow the computation of the gradient of our cost function at the i-th iteration, ∇J i = −Q H (u − Qg i ), which is used to update the current model estimate within the iterative solver. The application of the modelling operator Q involves a time convolution between each trace of the kernel of the operator and the current model vector estimate: as the kernel extends over the entire time axis the presence of signal at acausal times in the model vector not only introduces further noise at earlier times but also in the causal part of the predicted data as shown in Figure 2a. Similarly, the part of the residual originated from the acausal signal in the model vector introduces additional noise in the gradient and therefore in the subsequent model estimate at every time sample (Figure 2b). On the other hand, when a time window is applied during the computation of the gradient in the form of a preconditioner, acausal contributions arising during the application of the forward and adjoint modelling operator are immediately annihilated and prevented to leak into the gradient and subsequent model estimate. Finally, as we will investigate in more details in the numerical section, the described preconditioners can be chained together to better constrain the inversion. Whilst this can be done in principle for any combination and order of the preconditioners, mathematically speaking the product of multiple projections is in general not a projection. More specifically, when two projections commute, their product is always projection; the converse is however false, as the product of two non-commuting projections may still be a projection. The concatenation of operators that do not commute does not guarantees that the resulting vector lies in the intersection of the independent projections. As discussed in more details in [29], as the time causality and reciprocity projections commute, they can be applied together in any order. This interchangeability originates from the fact that they both act in time-space domain in a orthogonal fashion: time causality annihilates values in the acausal part making this part of the model already reciprocal, whilst it does not modify samples in the causal part where the reciprocity operator enforces symmetry. Conversely, f -k filtering applies by masking the model solution in the frequency-wavenumber domain. Whilst windowing in such domain is equivalent to two-dimensional filtering in the time-space domain, this projection has a more global impact on the model vector. For example, when applied after the time causality, it would bring back some of the signal to the acausal window. On the other hand, if time causality is applied after, the acausal part of the signal is ensured to be zero. Whilst this fact has theoretical implications in terms of the constrained and preconditioned problem not being equivalent, in the next section we will see how considering the combined effect of multiple priors via composition generally leads to improved model reconstruction. Thereby, we retrieve an impulse response that exhibits fewer erroneous events, while more accurate amplitudes are predicted. Table 2 outlines different options for chained preconditioners. Table 2. Different preconditioners options explored in this study. The attributes of the transformation rule, P, in the optimization problem (12) are given via composition of preconditioners (14), (15), and (16). Here, * indicates the physical constraints enforced by the chained operator.

Opt
Causality Reciprocity f -k Filtering Preconditioner

Numerical Examples
In order to evaluate the performance of multidimensional deconvolution and compare its frequency and time domain implementations, we design a numerical experiment for oneway source redatuming using decomposed wavefields at the receiver side. The input upand downgoing wavefields are obtained by solving the scattering Marchenko extrapolation equations [53] in a highly complex medium (Figure 3a). The Marchenko method provides access to noise-polluted up and down-going wavefields, P ± mko , emitted by sources at the surface and reconstructed at a set of virtual receivers. Typically, virtual receivers are located at the focusing level beneath the overburden. The medium extends 16.26 km in the horizontal direction and up to 8.0 km in depth, serving as support for a large overburden mimicking the effects of an inhomogeneous (dirty) salt body. Beneath the overburden, this model consists of sharp unconformities, laterally inhomogeneous sediments, and interbed discontinuities (Figure 4a). Interferometric redatuming in such environments is challenging due to the high impedance contrast, which introduces a strong internal-multiple regime that ultimately produces artifacts in the cross-correlation function. Those nonphysical events are attributed to the cross-talk between uncorrelated events in the data. Besides bandlimitation issues along with a virtual survey-limited aperture, source undersampling further contributes to the ill-posedness of the problem.  (Figure 3b,c). Similarly, the virtual survey we seek to reconstruct relative to the macro-model is shown in Figure 3 enclosed by the dashed target box beneath the salt body. Figure 4 shows the target model with corresponding numerically modeled local reflection response used as the benchmark for our reconstruction. Data modeling is carried out with a broadband impulse-source wavelet ranging from 1 to 50 Hz and absorbing boundaries on all sides of the target model. In Figure 4b-d, we select three sources gathered at 6.4, 7.5, and 8.6 km to reveal the structure of the true reflectivity.

Noise-Free Modelled Wavefields
A sensible aspect of MDD is the influence of erroneous or noise-degraded wavefields used in the inversion as data/operator. As a reference for the subsequent experiments, we first conduct a noise-free case study. To recreate such a scenario, we treat the down-going pressure field P + mko (Figure 3b) as an aerial source at ∂D r , and then convolved it with the broadband numerically modeled reflection response to build a noise-free up-going field P − mod (Figure 3d) as dictated by Equation (5). Therefore, any deviation from P − mod is interpreted as noise, just as it is inferred when comparing Figure 3c,d. Our first test use the minimum norm solution per frequency slice to reconstruct the field that would have been recorded at the datum level (magenta-solid line in Figure 3) by an actual seismic experiment as if sources and receivers were located just above it. We start by constructing the spatially band-limited point-spread function needed to correct source smearing and directionality in the correlation gathers. Figure 5a shows a source-gather slice of the time domain blurring operator in the middle of the virtual acquisition axis. Ideally, it should be a band-limited delta function around zero time and space offset. Nonetheless, strong coherent events are visible outside the desired range, whereas long unbalanced tails dress the expected delta pulse. Those events arise from the cross-talk of up and down-going waves, which ultimately leak into the correlation function. As observed in Figure 5b, the imprint of this filter populates the correlation gathers with artificial waves whose dips are rather high, apart from multiple uniform events mapping outside the space-time causal path. Despite the complexity of the up/down transmission wavefields, most physical events are well defined at near offsets and the general structure resembles the gather in Figure 4c. This observation justifies the use of the CCF as a surrogate of the virtual response in many applications. Still, in Figure 5c, trace evaluation uncovers a substantial phase misalignment along with uncompensated amplitudes. Figure 5d,e show the corresponding space-frequency (power spectra) domain counterparts. We note a limited temporal bandwidth content around zero offset with cross-talk noise energy spreading from the center towards larger offsets. Now that the blurring operator (PSF) and correlation function (CCF) are available, we solve the normal equations one frequency at a time. Filtering the correlation response with the regularized inverse of the PSF, [ Q H Q + λI] −1 , in theory, should remove the crosstalk artifacts while at the same time an amplitude corrected and sharper version of the CCF is retrieved. In the frequency domain, sharpening results in the widening of the temporal spectrum. Far from perfect, the choice of an optimal stabilization parameter strongly influences the regularized MDD solution at each frequency sample. This penalty term, to a large extent, controls the quality of the reconstruction, but often it is difficult to select the best possible option without solving the problem multiple times for a small subset of damping parameters. The first panel in Figure 6a (common source gather) and Figure 6b (common receiver gather) illustrate the deconvolved correlation function with a hand-tuned regularized PSF once transformed into the time domain. Opposite to the observations in the CCF, the estimated pressure response in this noise-free MDD reveals well-defined arrivals with higher spectral contributions and attenuated artifacts. In fact, many reflectors have recovered their continuity, and most of the overburden interfering events are suppressed. We also note seismic phases closer to the ground truth relative to those in the correlation gather and a partial virtual source smearing removal despite amplitudes not being accurately predicted (third panel in Figure 6a,b).
Alternatively, we resolve to set up the inversion on the time-space domain and opt for a gradient-based iteration solver (e.g., LSQR). While an explicit definition in computer memory of the discrete wavefield operator P + mko is prohibitive, such solvers only require the resulting forward and adjoint operations on the data vector; therefore, a Pylops linear operator is defined instead [30,52]. Even though no preconditioned is introduced, P = I, the synthesized virtual source-receiver response, second panel in Figure 6a,b, exhibits significantly fewer spurious events, a radiating pattern with improved causality properties, plus a much closer prediction of both phase and amplitudes. That being the case, a detailed trace inspection allows us to infer that a crude time-domain implementation takes care of crucial waveform information otherwise underestimated or neglected, opposite to its frequency domain counterpart (third panel in Figure 6a,b).

Noise-Contaminated Marchenko Wavefields
Continuing with our example, we now conduct the same experiment but using the noisy records P ± mko from scattering Marchenko redatuming. These transmission fields contain multiple spurious events not associated with reflectors in the target area besides high-frequency random noise. Figure 6c,d outline the inherent difficulties MDD faces when noise interferes with clean data. In contrast to the previous case, both time and frequency domain implementations quickly diverge from a satisfactory solution, verified by an overall gather degradation and strong short-wavelength events arriving at high dipping angles, which can not be explained when comparing with the benchmark. For instance, trace comparison indicates an overall amplitude overestimation regarding the noise-free solution for both CSG and CRG. Although we only display a few gathers retrieved by MDD, the behavior displayed in these panels is representative of the overall inversion results. Irrespective of the method used, data and operator fidelity are of utmost importance for the inversion along with effective priors to mitigate noise amplification that ultimately obscures the reconstructed local impulse response. In our next example, we explore the influence of implementing a causality preconditioner (Option 2 in Table 2). Restricting the domain of the sought solution is beneficial since it can effectively reject any acausal event which in turn prevents energy outside the time window Θ from leaking into the causal part of the solution. Clearly, in the gradient-based optimization, as iteration progress, any acausal information in the model vector is translated across the whole time-space domain as artifacts in the predicted data, a consequence of modeling with a noise-contaminates convolutional kernel that extends throughout the entire time axis. Figure 7 depicts a local response estimation that is purely causal (Figure 7c) aside from a large offset slice of the crude reconstruction for P = I (Figure 7b). In spite of the implicit numerical instabilities introduced by the MDD problem, a considerably more consistent reconstruction is achieved when relying on causality as constrain contrary to the crude unconstrained solution. Still, prominent events with quasilinear moveouts are embedded in the reconstructed field. In this case, while a general more accurate fit is predicted along the receiver axis, a significant misalignment is observed in the common receiver gathers. This observation is characteristic of a nonreciprocal response. As expected, the algorithm does not consider such physical priors unless introduced as constrain. Introducing the time window P = Θ as preconditioner derives in a solution depleted of acausal events but does not prevent noise from leaking into R. In (d,h) an overlay of traces indicates a fairly good amplitude prediction in the receiver side (red-dashed line), but a significant phase and amplitude mismatch is appreciated in the source side (blue-dashed line).
As a way to control high-frequency errors propagating across iterations, we now study the influence of f -k filtering to constrain the inversion within a prescribed signal cone. The argument being that physical events with hyperbolic moveout map at low wavenumbers in the f -k domain whereas short-wavelength spurious arrivals populate the spectrum at higher wavenumbers. Consider the CCF f -k spectrum depicted in Figure 8a. We note that the center of the f -k space contains low spatial frequency information associated with the actual local response, controlling the overall gather contrast, brightness, and general shape, whereas its periphery determines high spatial frequency information interpreted as sharp details, edges, and low-velocity dipping events. Hence, the red-dashed line in Figure 8 establishes a limit to discriminate between physical and artificial energy. Indeed, this spurious energy is amplified as iterations progress in an unconstrained inversion as we can see in Figure 8b, where an f -k slice of the inverted virtual response in Figure 7b discloses information with a rather large energy content outside the signal cone. Making sure that such unphysical energy is removed from the solution at each iteration proves to be of substantial benefit for wavefield reconstruction. For this to be achieved, we now run the inversion using P = F −1 WF as a preconditioner (Option 4 in Table 2). Contrary to our previous example, the pressure response from the target area in the f -k space (Figure 8c,f) is now deprived of low-velocity artificial events mapping at high wavenumbers but still retains crucial waveform information otherwise overlapping with linear moveout events and high-frequency uncorrelated noise in the time-space domain. Likewise, the inverted pressure wavefield in the time domain (Figure 9b,f) reveals waveform features representing a substantial improvement in comparison to that of Figure 7b,f and is closer to the benchmark in Figure 9a,e. While in the f -k domain preconditioned inversion many spurious events are controlled, still, the solution does not obey source-receiver reciprocity as one can infer when comparing CSG with CRG. In addition to enforcing a smoothing constrain on the model vector per iteration, one expects the wavefields to obey some form of reciprocity. In Figure 9c,g, we extract a source and receiver gather slice from the virtual response reconstructed with the help of the reciprocity preconditioner P = Υ (Option 3 in Table 2). While some events are underestimated in the unconstrained method, Figure 9d,h show a prominent waveform similarity against the benchmark for all causal events as appreciated in the trace comparisons. The gathers are not only kinematically closer to the ground truth but also all amplitudes appear better predicted. Depending on the wave physics, enforcing the spatial reciprocity preconditioner requires proper parameter scaling. This principle in its simple form holds for two-way wavefields; but it is not the case for one-way wavefields in general, where the separation strategy determines how reciprocity works [54]. When flux-normalization is considered, the standard form (15) holds, however, if the wavefields are pressure normalized, an additional balancing factor modifies the reciprocity condition. It should be noted that such a factor is given in terms of vertical derivatives of the structural parameters which in general are unknown at depth. Only when the medium is homogeneous, flux and field normalized fields are identical, in which case reciprocity, in the form (15), holds for both situations.  A final experiment considers the combined effect of multiple physical priors. In Figure 10 the reconstructed virtual responses using chained preconditioners P = ΘF −1 WF and P = ΘΥF −1 WF are displayed. The fact that acausal arrivals are controlled throughout iterations in the inversion implies that they are not present in the reconstructed gathers and therefore do not interfere with physical events arriving after the direct wave when the multidimensional matrix operator is applied to the model vector. On top of that, the smoothing preconditioner rejects all low-velocity artifacts and prevents uncorrelated high-frequency noise to leak into the solution. Similarly, when the reciprocity prior is enforced, the signals in the common shot and common receiver gathers are identical indicating that the solution is reciprocal. Here we note that using joint preconditioners redound in responses with noticeably broader temporal bandwidth with lower spurious events and traces exhibiting a much better fit than the unconstrained inversion, or more importantly the Tikhonov regularized inversion in the frequency domain.
Having access to multiple forms of preconditioning the multidimensional deconvolution problem calls for a more quantitative form to evaluate the performance of the inversion as a function of the pre-set prior. To compare the quality of distorted wavefield reconstruction relative to the numerically modeled local response, we use two distinct metrics. While the root means square error (RMSE) is a common measure to estimate the reconstruction quality in multiple inversion techniques, in image processing, the structural similarity index measure (SSIM) is often referred to as a more robust approach to evaluate image [55] and seismic gathers [56] attributes. Indeed, whereas the RMSE is a local measure of the square norm between two signals, the SSIM considers mean intensity, contrast, and structure. Figure 11a,b show the SSIM and RMSE for multiple preconditioning strategies as a function of iterations. It is important to note that in both cases one realizes the characteristic semi-converge behavior of different solutions, provided that initially, at low iterations, the SSIM is close to one (ideal reconstruction) but it decays as the number of iterations increases. Notably, depending on the preconditioner option a faster decay in the quality factor is observed for the situation where the inversion is unconstrained, or only disjoint causality and f -k filtering are used. On the other hand, when chained preconditioners are in place, the solution remains more stable as the iteration number increases. In Figure 11c the relative residual errors for several priors used in the previous examples is shown. Here, the error propagation a function of iterations for hybrid preconditioners shows dramatically fast convergence with stable residuals (continuous green and purple lines) around 8 × 10 −3 reached in approximately 10 iterations.

Discussion
Through our work, we show that MDD can be accomplished in a manner that is both reliable and stable-particularly in the presence of noise in both the data and forward operator-by means of physics-based constraints implemented as pre-conditioners in the time domain. Though our examples in this paper are 2D, our approach is general and immediately applicable to 3D seismic data. In a companion paper, Ravasi and Vasconcelos [30] show an HPC-ready setup of the very same PyLops implementation we use here and demonstrate it with 3D MDD in the image domain after redatuming. Moreover, the generality of our approach also extends directly to the cases of MDD in elastic [11,37,39] and attenuative media [12], with the only difference being that, of course, our proposed preconditioning operators must be appropriately re-parameterized for those cases (e.g., see discussion in the next paragraph).
When considering the use of spatial reciprocity operators for preconditioning, we point out that our approach in this work is to be considered both an example and a framework. At a more immediate level, it is in an example in the context of the acoustic wavefields employed in the examples herein. More generally, however, it is a framework given that, depending on the wave physics in consideration, reciprocity requires different parameterizations in terms of both scaling and data-component relations. For example, when dealing with one-way-decomposed wavefields, the appropriate form of spatial reciprocity is dependent upon the choice of wavefield normalization chosen for the decomposition operation [45,46,54]-which in some cases may require additional medium information, such as impedance local to source or receiver locations. Another important example is when multi-component fields are present-such as pressure and particle velocity in acoustics, or displacement and stress in elastic media-in which case spatial reciprocity must comply with the appropriate component combinations, e.g., a vertical-stress source into a horizontal-displacement receiver can only be replaced by a horizontal-displacement source into a vertical-stress receiver with appropriate sign corrections.
In the general context of interferometry and deconvolution, MDD is important, but of course 1D, single-channel deconvolution is just as important if not more prevalent in terms of use. This is so not only in interferometry [57][58][59], but also in key approaches such as estimation of receiver functions [60][61][62]. To this end, it is important to note that our conclusions hold equally in 1D deconvolution-because that is a reduced case of MDD-pre-conditioned, time-domain deconvolution will outperform the majority of frequency-domain, single-channel schemes, including the widely-used "water-level" method. In the case of single-channel deconvolution, pre-conditioners such as imposed causality or wavelet shapes/bandwidth are trivial simplifications of the multi-dimensional causality and f -k restriction operators we propose here. In terms of implementation, the same PyLops-operator setup can be used for efficient 1D deconvolution.
Finally, though herein we restrict ourselves to solving the time-domain deconvolution problem under a least-squares norm by means of the LSQR solver [51], our conclusions and framework could just as easily be extended to other metric and/or solver choices. For example, our pre-conditioned time-domain approach is applicable to one-norm, sparsitypromoting MDD (e.g., [27,63])-where the use of sparsity-promoting transforms such as radon-based [64], curvelets [65,66], shearlets [67], seislets [68], or wavelets [69,70] in 1D. Such metrics can make up of solvers such as FISTA [71] or SPGL1 [72]. These open up possibilities for pre-conditioners in the transformed domains, where one such example would be, for instance, to impose smoothness and/or restrictions on local slopes of events in the output inverted data. These and other more complex operations will be the subject of future research and thus are beyond the scope of this paper.

Conclusions
With this work, we show that the challenging task of performing reliable, stable multi-dimensional deconvolution (MDD) is both achievable and practical by treating the problem in the time-domain, combined with easy-to-implement, regularisation-free, physics-based pre-conditioners. Such an approach consistently outperforms frequencydomain approaches that, while being easier to implement computationally, are the source of the common misconception that deconvoltution can be unreliable due to its reliance on tuneable regularisation parameters. Our conclusions not only apply to MDD but are equally important for single-channel deconvolution practices, e.g., in interferometry or receiver function estimation, where pre-conditioned time-domain deconvolution will consistently outperform the widely-used frequency-domain spectral division.
Here, we present acoustic-wave multidimensional deconvolution in the context of target-oriented redatuming at a given datum in the subsurface. In general, MDD estimation of an overburden-free reflection response deprived of any overburden-related effects builds on the availability of a sufficiently large number of sources over a wide aperture and regularly sampled receivers. While the current practice of using frequency-domain wavefields to solve the normal equations relies on direct solvers, an alternative iterative-based time-domain implementation has proved promising for broadband response estimation in general settings. The first approach is feasible provided the point-spread function is small enough to be explicitly defined in memory, whereas in the former, there is no need for such memory burden since one leverages on iterative solvers that only require the forward and adjoint operations of the convolutional matrix operator on a given data vector to be readily available, therefore an explicitly defined version of the operator in matrix form can be avoided.
We present a reliable, yet stable time-domain implementation of MDD that includes physical priors as a mechanism to improve kernel conditioning by transforming the model vector into a domain where the ill-posed nature of the inverse problem is significantly mitigated. Moreover, selecting a proper hybrid preconditioner turns out to be an efficient way to accelerate convergence. In this respect, the iterative solver quickly favors a more stable solution relative to the simple unpreconditioned implementation with no explicit regularization parameter. This implementation emerges as a numerically attractive MDD option on large data sets and 3D case studies where direct inverse methods are currently prohibited or in situations for which the signal-to-noise ratio can potentially worsen the frequency domain reconstruction as a consequence of inverting the normal equations. Taking advantage of linear operators, typically built in the Python ecosystem for largescale problems, our numerical examples using multiple preconditioners demonstrate the effectiveness of the proposed method and its ability to handle noise-contaminated input wavefields in highly complex geological settings.