(Quasi-)Real-Time Inversion of Airborne Time-Domain Electromagnetic Data via Artificial Neural Network

Obtaining results very soon after, or even during, the collection of electromagnetic data would be valuable not only for quality control, but also for adjusting the location of the planned flight lines during an airborne time-domain acquisition. This kind of readiness could have a large impact on the Value of Information of the measurements still to be acquired. In addition, the importance of fast tools for retrieving resistivity models from airborne time-domain data is demonstrated by the fact that Conductivity-Depth Imaging methodologies are still the standard in mineral exploration: they are extremely computationally efficient and, at the same time, preserve a very high lateral resolution. For these reasons, they are often preferred to inversion strategies, even though the latter are generally more accurate in reconstructing the depth of the targets and in retrieving the true resistivity values of the subsurface. In this research, we discuss a novel approach, based on neural network techniques, capable of retrieving resistivity models with a quality comparable to that of the inversion strategy, but in a fraction of the time. We demonstrate the advantages of the proposed approach on synthetic and field datasets.

Not only the equipment, but also the data processing and inversion strategies have gone through continuous advancements. In this respect, just to mention an example, stacking of the recorded transient curves via moving windows with widths that are time-gate dependent [25,26] can increase the lateral resolution at shallow depths, while enhancing the signal-to-noise ratio at depth (where, in any case, the physics

Methods
ATEM data are usually inverted by minimizing an objective functional that consists of the sum of a data misfit and a regularization term. Hence, the objective functional to be minimized is often formalized as follows:

$$P(m) = \left\| W_d \left( d_{obs} - F(m) \right) \right\|^2 + \lambda\, s(m), \qquad (1)$$

in which (i) $d_{obs}$ is the vector of the observations; (ii) m is the vector of the model parameters; (iii) F is the forward modelling operator mapping the model m into the data space; F takes into account the physics of the process and the characteristics of the acquisition system [46]; (iv) $W_d$ is the data weighting matrix taking into account the uncertainty in the measurements; (v) s(m) is the regularization term incorporating the prior knowledge about the resistivity model to be reconstructed; (vi) the multiplier λ controls the balance between the importance given to the data and that given to the prior information.
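As a purely illustrative sketch, the objective functional described above can be evaluated numerically as follows. The squared-L2 data misfit, the toy linear operator standing in for the ATEM forward modelling F, and all the names used (`objective`, `toy_forward`, `toy_stabilizer`) are assumptions made for the example and are not part of the inversion code used in this work.

```python
import numpy as np

def objective(m, d_obs, forward, W_d, stabilizer, lam):
    """Weighted squared data misfit plus lambda times the regularization term."""
    residual = W_d @ (d_obs - forward(m))
    return float(residual @ residual + lam * stabilizer(m))

# Toy stand-ins (hypothetical): a linear operator replaces the real forward modelling F
G = np.array([[1.0, 0.5], [0.0, 2.0], [1.0, 1.0]])

def toy_forward(m):
    return G @ m

def toy_stabilizer(m):
    # Squared differences between vertically adjacent parameters
    return float(np.sum(np.diff(m) ** 2))

m_true = np.array([1.0, 3.0])
d_obs = toy_forward(m_true)   # noise-free observations
W_d = np.eye(3)               # unit data weights

# At the true model the misfit vanishes, so only the regularization term remains
value_at_truth = objective(m_true, d_obs, toy_forward, W_d, toy_stabilizer, lam=0.1)
value_elsewhere = objective(np.zeros(2), d_obs, toy_forward, W_d, toy_stabilizer, lam=0.1)
```

The example makes the role of λ explicit: even at the model that reproduces the data exactly, the functional is not zero unless the stabilizer is also zero.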
In the deterministic scheme we are investigating here, which serves as a term of comparison to assess the performance of the alternative ANN-based approach, we consider a one-dimensional model parameterization. Hence, m (and, consistently, also F) is based on the assumption that (locally) the subsurface does not vary laterally. Therefore, even when data and model sections are discussed, the response of each individual sounding is computed from its own 1D model, independently of the adjacent ones. More specifically, whereas the forward modelling F is always one-dimensional, a connection between neighboring models is imposed through the regularization term. In this respect, concerning the choice of the stabilizer, we adopt what is probably the most common option: s(m) equal to the minimum gradient stabilizer.
Hence, although the conductivity distribution is considered locally 1D, the stabilizer acts along both the vertical (z) and the horizontal (x) directions, promoting solutions that are laterally coherent (without being truly 2D/3D). This is the essence of the so-called spatially constrained inversion [28,47].
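To make the action of such a stabilizer concrete, the following minimal sketch computes a discrete minimum gradient term on a section stored as a (depth, sounding) array. The finite-difference discretization, the unit cell sizes, and the function name `min_gradient_stabilizer` are assumptions for illustration; the actual discretization used in the spatially constrained inversion may differ.

```python
import numpy as np

def min_gradient_stabilizer(section, dz=1.0, dx=1.0):
    """Sum of squared finite-difference gradients along depth (axis 0) and along the line (axis 1)."""
    grad_z = np.diff(section, axis=0) / dz   # vertical differences within each 1D model
    grad_x = np.diff(section, axis=1) / dx   # lateral differences between adjacent 1D models
    return float(np.sum(grad_z ** 2) + np.sum(grad_x ** 2))

n_layers, n_soundings = 30, 5
uniform = np.full((n_layers, n_soundings), 2.0)
# Varies with depth only, so every sounding sees the same 1D model
layered = np.tile(np.linspace(1.0, 3.0, n_layers)[:, None], (1, n_soundings))
# Same layering, but one sounding is shifted with respect to its neighbors
rough = layered.copy()
rough[:, 2] += 1.0

s_uniform = min_gradient_stabilizer(uniform)
s_layered = min_gradient_stabilizer(layered)
s_rough = min_gradient_stabilizer(rough)
```

The laterally incoherent section is penalized far more than the layered one, which is how the lateral term favors coherent sections even though each forward response is 1D.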
Moreover, in the 1D deterministic inversion scheme we are using, the value of λ is calculated in order to guarantee a chi-squared value approximately equal to 1, i.e., $\chi^2 = \left\| W_d \left( d_{obs} - F(m) \right) \right\|^2 / N_d \approx 1$ (with $N_d$ being the number of time gates) [48,49]. The ANN is built to perform a task similar to the minimization of the objective functional in Equation (1).
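The chi-squared criterion can be sketched as below, under the assumption that the data weighting matrix is diagonal with entries equal to the inverse of the standard deviation of each time gate. The 5% noise level and the synthetic decay curve are hypothetical values chosen only for the example.

```python
import numpy as np

def chi_squared(d_obs, d_pred, sigma):
    """Normalized misfit: mean of the squared, noise-weighted residuals over the time gates."""
    weighted_residual = (d_obs - d_pred) / sigma
    return float(np.sum(weighted_residual ** 2) / d_obs.size)

n_gates = 54
d_pred = np.logspace(0, -4, n_gates)   # hypothetical decaying transient
sigma = 0.05 * d_pred                  # assumed 5% standard deviation per gate

# Observations that differ from the prediction by exactly one standard deviation per gate
signs = np.where(np.arange(n_gates) % 2 == 0, 1.0, -1.0)
d_obs = d_pred + signs * sigma

chi2 = chi_squared(d_obs, d_pred, sigma)
```

A value close to 1 indicates that the residuals are, on average, as large as the assumed data uncertainty: the data are fitted neither too loosely nor beyond the noise level.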
ANNs use continuous and differentiable activation functions at each unit of the network, which makes the network output (m) a continuous and differentiable function of the network input (d); this, in turn, makes it possible to define a continuous and differentiable error function measuring the difference between the network output and the target output. Consequently, the error function can be minimized over a training set using a relatively simple gradient-based procedure. Hence, the problem of building an effective ANN that maps the recorded measurements into a resistivity vector is reduced to the minimization of an error functional:

$$E(w) = \left\| K_w(D) - M \right\|^2,$$

in which D and M consist of the elements of the (data, model) couples ($d_t$, $m_t$) constituting the Training Dataset (TD) [50]. Of course, in this case, the minimization aims at finding the optimal weights w of the connections between the network units. Thus, the ANN K is found via the minimization with respect to w. Once K is built based on the TD, it can be applied to the elements $d_{obs}$ of the observed dataset to infer the corresponding conductivity models m.

In this respect, it is worth noting that the retrieved K (and, therefore, the corresponding final resistivity distribution obtained by applying K to the observed data) depends on the selection of the TD. ML approaches are based on the stationarity assumption: the couples in the TD and in the solution dataset need to be independent and identically distributed (i.i.d.) random variables. In this sense, the TD formalizes the available prior information about the studied system. Consistently, the TD should be selected so as to be representative of the targets (and, therefore, coherent with our expectations about the geology to be reconstructed) [50,51]. Data stationarity and the representativeness of the TD are very well-known issues in ML [52].
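The gradient-based minimization of an error functional over a training set can be illustrated with a deliberately simplified sketch. Here a purely linear map stands in for the network K, the error functional is assumed to be a sum of squared differences, and plain steepest descent is used; the actual network is nonlinear and its training algorithm is not specified here, so every choice below is an assumption.

```python
import numpy as np

def error_functional(W, D, M):
    """Sum of squared differences between the mapped training data and the target models."""
    return float(np.sum((D @ W - M) ** 2))

def gradient(W, D, M):
    """Analytical gradient of the error functional with respect to the weights W."""
    return 2.0 * D.T @ (D @ W - M)

rng = np.random.default_rng(0)
D = rng.normal(size=(200, 6))      # 200 hypothetical training data vectors
W_true = rng.normal(size=(6, 3))
M = D @ W_true                     # corresponding target models

W = np.zeros((6, 3))
e_start = error_functional(W, D, M)

# Step 1/L, where L = 2 * sigma_max(D)^2 is the Lipschitz constant of the gradient
step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)
for _ in range(500):
    W -= step * gradient(W, D, M)

e_end = error_functional(W, D, M)
```

Because the toy problem is linear and noise-free, the weights converge to `W_true`; with a real multilayer network the error surface has local minima, which is why the choice of the starting weights matters.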
In a further attempt to reconcile the ANN approach and the (regularized) deterministic inversion, we could think about the selection of the conductivity models for the development of the TD as some sort of regularization: the solution provided by the ANN cannot be too different from the models (and the associated data) used to train the ANN. Hence, for example, the TD should be based on the prior (geological) knowledge available about the investigated area. This might sound tautological, but it is actually the key point of regularization theory (and, clearly, also of ML approaches).
In the present paper, the ANN consists of a multilayer perceptron with (i) an input layer with 54 (i.e., the number of time gates) units; (ii) three hidden layers with, respectively, 100, 500, 200 units; and (iii) an output layer characterized by 30 (i.e., the number of conductivity model parameters) units.
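A minimal NumPy sketch of a multilayer perceptron with these layer sizes is given below. Only the number of units per layer comes from the text; the tanh activation, the linear output layer, the weight initialization, and the names `init_params` and `mlp_forward` are assumptions made for illustration.

```python
import numpy as np

# 54 time gates -> three hidden layers -> 30 conductivity model parameters
LAYER_SIZES = [54, 100, 500, 200, 30]

def init_params(rng, sizes=LAYER_SIZES):
    """Random weights and zero biases for each pair of consecutive layers."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def mlp_forward(params, d):
    """Map one or more soundings (rows of d) to model vectors."""
    h = np.atleast_2d(d)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)     # assumed smooth, differentiable activation
    W, b = params[-1]
    return h @ W + b               # assumed linear output layer

rng = np.random.default_rng(0)
params = init_params(rng)
soundings = rng.normal(size=(8, 54))   # 8 hypothetical soundings, 54 gates each
models = mlp_forward(params, soundings)
n_weights = sum(W.size + b.size for W, b in params)
```

With these sizes the network has 162,230 trainable parameters, and applying it to a batch of soundings amounts to four matrix products, which is consistent with the very short inversion times discussed later.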
As TD, we took the $d_t$ data generated via the forward modelling F for each of the 1D resistivity models $m_t$ making up a realistic resistivity section (Figure 1). It is important to stress that, despite the apparent lateral coherence of the 1D models, the elements of the TD are, indeed, handled as independent soundings and resistivity models. Plotting the TD data and the models as 2D sections made it easier to assess the representativeness of the (geologically informed) training dataset with respect to the actual measurements to be inverted. It is also important to highlight that the TD used in the present research is based on the technical specifications of the particular system used for the experimental data collection. Thus, the $d_t$'s are calculated from the corresponding $m_t$'s by using, for example, the waveform and time gates provided by the contractor together with the survey measurements.
In total, the utilized TD consisted of around 12,000 ($d_t$, $m_t$) couples (a sample of which is plotted in Figure 1). In the training phase, a multi-start approach was adopted to minimize the effect of local minima of the error functional. Additionally, following a standard procedure [53], the optimal number of epochs was selected by studying the error functional value when applied to validation subsets [54].
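The combination of a multi-start strategy with a validation-based choice of the number of epochs can be sketched as follows. The validation curves are invented numbers, and the selection rule (keep the restart and the epoch with the lowest validation error) is one common convention assumed for the example, not necessarily the exact rule adopted here.

```python
import numpy as np

def best_epoch(val_errors):
    """0-based index of the epoch with the lowest validation error."""
    return int(np.argmin(val_errors))

def select_run(runs):
    """Among several restarts, keep the one whose lowest validation error is smallest."""
    scored = [(min(errors), seed, best_epoch(errors)) for seed, errors in runs.items()]
    best_val, seed, epoch = min(scored)
    return seed, epoch, best_val

# Hypothetical validation errors per epoch for three random initializations
runs = {
    0: [0.90, 0.50, 0.40, 0.42, 0.47],
    1: [0.80, 0.45, 0.30, 0.28, 0.33],
    2: [0.95, 0.70, 0.60, 0.61, 0.65],
}

seed, epoch, val = select_run(runs)
```

In each curve the validation error eventually rises again, which signals the onset of overfitting; stopping at the minimum and comparing restarts guards against both overfitting and poor local minima.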
Unlike the 1D deterministic inversion in Equation (1) (in which the stabilizing term connects adjacent 1D models), the inversion performed through K includes no lateral information, and the individual soundings are inverted separately. The inclusion of this further piece of knowledge (if available) would surely be beneficial and should be considered in future developments.

Synthetic Test
In order to assess the effectiveness of the ANN approach, we applied the neural network (based on the previously discussed TD) to a known verification dataset. Figure 2 shows the true conductivity sections whose 1D models were used to generate the noise-free synthetic data to be inverted. In short, and in neural network terminology, Figure 2 (together with its associated data) is our verification dataset. Figure 3 shows the conductivity sections reconstructed via the proposed ANN. In turn, the inferred conductivities (Figure 3) have been used to calculate their associated electromagnetic response; the misfit between the original synthetic data and the calculated response is shown, model by model, with a red dot (red axis on the right in Figure 3). From this data misfit estimation, it is clear that the response of the conductivity distribution recovered by the neural network generally agrees with the inverted data within 4%.
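A per-sounding data misfit of this kind could be computed along the following lines. The exact misfit definition is not reproduced here, so the mean relative difference used below is an assumed metric; the handling of removed late gates as NaN values and the name `relative_misfit_percent` are likewise hypothetical.

```python
import numpy as np

def relative_misfit_percent(d_ref, d_calc):
    """Mean relative difference (in percent) over the time gates that are actually available."""
    valid = ~np.isnan(d_ref)   # removed (e.g., late, noisy) gates are flagged as NaN
    rel = np.abs(d_calc[valid] - d_ref[valid]) / np.abs(d_ref[valid])
    return 100.0 * float(np.mean(rel))

# Hypothetical sounding with the last gate removed
d_ref = np.array([1.0, 2.0, 4.0, np.nan])
d_calc = np.array([1.02, 1.96, 4.08, 5.0])

misfit = relative_misfit_percent(d_ref, d_calc)
```

Evaluating such a number for every sounding gives one value per model location, which is the kind of quantity plotted as red dots on top of the sections.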
Considering the retrieved conductivity distribution, the ANN reconstruction captures almost all the features present in the original model. In addition, Figure 3 demonstrates that the proposed approach is quite robust, as it retrieves the lateral coherence of the conductivity sections even though the individual models are inverted separately. A quantitative assessment of the agreement between the reconstructed and the original model is provided by Figure 4, which shows, in log-scale, the ratio between the ANN reconstruction and the true model. In general, the values in Figure 4 are around one, demonstrating the overall accuracy of the ANN reconstruction. The areas in Figure 4 characterized by major discrepancies between the ANN solution and the true model are generally localized at depth (where, in any case, because of the physics of the method, the sensitivity of the data to the conductivity values is lower) and on the right side of the conductivity sections. This is not surprising if we look at the electromagnetic responses. In this regard, Figure 5 shows the original data (blue lines) compared to the calculated measurements (red lines) for each of the sections in Figures 2 and 3; it is clear that many of the soundings on the right side of the sections are characterized by a smaller number of time gates (indeed, to simulate more realistic conditions, the late time gates have been removed from several of the original soundings, mimicking what often happens with noisy field observations). Of course, with a reduced number of time gates, the maximum depth at which the conductivity affects the data values is shallower. This is consistent with the larger model misfit on the right side of the panels in Figure 4.

Figure 3. Conductivity sections obtained by applying the ANN trained on the TD in Figure 1 to the data generated by the conductive models in Figure 2. The data misfit between the calculated and the original measurements is shown for each individual 1D model location as a red dot (the corresponding axis is on the right in red).

Figure 4. The ratio between the conductivity models in Figure 3 (the ANN result) and in Figure 2 (the true conductivity distribution).
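The ratio between two conductivity sections, displayed in log-scale, can be computed as in the short sketch below. The conductivity values are invented, and the factor-of-two tolerance used to summarize the agreement is an arbitrary threshold chosen only for the example.

```python
import numpy as np

def log_ratio(sigma_ann, sigma_true):
    """Base-10 logarithm of the cell-by-cell ratio; 0 means perfect agreement."""
    return np.log10(sigma_ann / sigma_true)

# Hypothetical sections in S/m, arranged as (depth, sounding)
sigma_true = np.array([[0.01, 0.01], [0.10, 0.10], [0.001, 0.001]])
sigma_ann = np.array([[0.01, 0.012], [0.10, 0.06], [0.01, 0.001]])

r = log_ratio(sigma_ann, sigma_true)

# Cells in which the two models agree within a factor of two (arbitrary tolerance)
within_factor_two = np.abs(r) <= np.log10(2.0)
fraction_ok = float(np.mean(within_factor_two))
```

Working with the logarithm of the ratio treats overestimates and underestimates symmetrically: a cell that is ten times too conductive and one that is ten times too resistive give +1 and -1, respectively.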

Field Example
The Sakatti deposit is located 15 km north of Sodankylä, Finland; it is rich in Cu-Ni-PGE (Platinum Group Elements) minerals [55] and has been selected as one of the test sites of the EU-funded project INFACT aiming at the development of cutting-edge technologies for mineral exploration [56]. Within the framework of the INFACT project, time-domain electromagnetic data have been acquired by Geotech using a VTEM system [16,17].
In this section, we compare the results obtained with the ANN (already used for the previous synthetic test and trained on the TD in Figure 1) against a more traditional 1D deterministic inversion based on the forward modelling utilized for simulating, for example, the responses in Figure 5. Figure 6 demonstrates that the inversion performed via the developed ANN can infer reasonable 1D models whose responses fit the observations within a 5% threshold (for each model, the data misfit value is represented as a red dot, and the associated red axis is on the right side of the panel).
When the ANN result is compared with the corresponding 1D deterministic inversion in Figure 7, it is possible to see that the "traditional" deterministic inversion with vertical and lateral smoothness constraints is often superior in fitting the data (the data misfit is generally below 2%, as is clearly visible from the red dots representing the data misfit). Figure 8 might be helpful in quantitatively evaluating the differences between the two results, as it shows the ratio between the two solutions; it is worth noting that the larger discrepancies between the two solutions occur where the data misfit of the deterministic inversion is larger (e.g., between 1850 and 2400 m) and/or in areas characterized by high resistivity values. Therefore, the areas in which the deterministic inversion also has difficulties in fitting the observations, and that are characterized by relatively high resistivity values, are those where the differences with the ANN solution are more pronounced. This is in agreement with the fact that, in general, ATEM methods have difficulties in accurately distinguishing between different high resistivity values.

Figure 8. The ratio between the conductivity models in Figure 6 (the ANN inversion result) and in Figure 7 (the deterministic inversion result).

Discussion
Clearly, the indubitable advantages of the deterministic inversion come at a price: the ANN inversion takes approximately 24 s to invert the entire Sakatti dataset (consisting of 14,346 soundings with 54 time gates each) by using a standard laptop (equipped with an Intel Core i5-8250U processor), whereas hours (so an amount of time of the order of magnitude of $10^4$ s) are necessary to perform the same task by using the 1D deterministic approach and a 64-CPU server.
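The figures quoted above translate into the following back-of-the-envelope numbers. The deterministic run time is known only as an order of magnitude, so the resulting speed-up is indicative at best; the variable names are introduced here for the example.

```python
n_soundings = 14_346
ann_seconds = 24.0
deterministic_seconds = 1.0e4   # order of magnitude only ("hours")

# Average ANN inversion time per sounding, in milliseconds
ann_ms_per_sounding = 1e3 * ann_seconds / n_soundings

# Indicative wall-clock speed-up of the ANN with respect to the deterministic inversion
speed_up = deterministic_seconds / ann_seconds
```

This gives roughly 1.7 ms per sounding and a speed-up of a few hundred times in wall-clock terms. The comparison does not normalize for hardware: the ANN ran on a laptop and the deterministic inversion on a 64-CPU server, so the gap in computational cost is larger than the wall-clock ratio suggests.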
To be fair, it is true that the training phase, crucial for the development of the ANN, requires several hours. Despite that, we believe that the proposed workflow has at least three main advantages:

1. It can allow the optimization of the survey design while the ATEM acquisition is ongoing. In fact, the development of an effective training dataset and of the associated ANN can be performed before the survey (or it can even be based on the outcomes from the first flight(s) of the survey, if the area is assumed to be relatively "stationary") and, once the ANN is available, reliable results can be obtained almost instantaneously just after each flight. In turn, this can lead to real-time rearrangements of the original tentative survey plans in order to maximize the VoI (Value of Information) of the measurements still to be collected [57].

2. The ANN speed can be extremely useful for an effective Quality Check (QC) of the data during the survey.

3. A good starting model (derived from the ANN inversion) can be used to speed up the 1D deterministic inversion by reducing the number of iterations.
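The effect of a good starting model on the number of iterations can be illustrated with a toy linear least-squares problem solved by steepest descent. The linear operator, the solver, and the "warm" starting model placed within 1% of the solution are all assumptions for the example; the actual 1D deterministic inversion is nonlinear and uses a different scheme.

```python
import numpy as np

def iterations_to_converge(G, d, m_start, tol=1e-8, max_iter=100_000):
    """Count steepest-descent iterations until the gradient norm of ||G m - d||^2 drops below tol."""
    step = 1.0 / (2.0 * np.linalg.norm(G, 2) ** 2)   # 1/L, L = Lipschitz constant of the gradient
    m = m_start.copy()
    for k in range(max_iter):
        grad = 2.0 * G.T @ (G @ m - d)
        if np.linalg.norm(grad) < tol:
            return k
        m -= step * grad
    return max_iter

rng = np.random.default_rng(0)
G = rng.normal(size=(20, 5))   # toy linear operator standing in for the forward modelling
m_true = rng.normal(size=5)
d = G @ m_true                 # noise-free data

# Cold start from a homogeneous (zero) model versus a start already close to the solution
cold_iters = iterations_to_converge(G, d, np.zeros(5))
warm_iters = iterations_to_converge(G, d, 0.99 * m_true)
```

In this toy setting the warm start always needs fewer iterations, because its initial error is a scaled-down copy of the cold-start error. For a nonlinear inversion the benefit is not guaranteed in the same way, but a starting model close to the solution typically shortens the iterative process.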
Of course, if, for producing the final results, post-processing analyses are necessary (e.g., in 3D environments), the same will be true also when adopting the proposed ANN approach: ANN based on a 1D forward modelling approach cannot guarantee better results compared with the corresponding deterministic inversion; it can only provide solutions of similar quality within a fraction of the time and by using cheaper computational tools.
Moreover, at least in principle, the presented ANN scheme can clearly be extended to also include, for example, induced polarization (IP) effects: it is a matter of incorporating them in the forward modelling algorithm used in the development of the TD. However, as before, we cannot expect the ANN to solve all the issues connected with IP in ATEM data. Reasonably, we can only expect to solve the same problem, but much faster.

Conclusions
We present a novel approach to the inversion of airborne time-domain electromagnetic data based on neural networks.
We demonstrate the effectiveness of the proposed inversion strategy by testing it on both synthetic and field data. Based on these outcomes, we conclude that the proposed neural network approach is capable of retrieving the conductivity distribution of the subsurface from the measurements collected by the airborne geophysical system with an accuracy that is largely comparable with that of the inversion strategies most commonly used in academia and industry, which rely on 1D deterministic approaches. These results are particularly notable as the neural network inversion takes only a fraction of the time required for the deterministic inversion (a few seconds versus hours).
The performances of the neural network discussed in the paper can be potentially enhanced in terms of data fitting via data augmentation techniques expanding the TD and building more accurate models, provided that effective ways to generate artificial data getting closer to the behavior of the test dataset can be found [58,59].
In addition, an aspect that has not been investigated here is the dependence on the training dataset; in future works, studying the robustness of the result as a function of the training dataset would be extremely relevant: after all, the definition of the proper training dataset is a way to include prior (geological) information into the inversion.
Clearly, the neural network strategy discussed in the paper deals with each sounding independently and does not make use of the possible available knowledge concerning the lateral coherence of the targets; it is a pity not to exploit these additional pieces of information in order to get even more effective results. In this perspective, pseudo-2D/3D approaches should be explored as well.
The dramatic speed-up of the inversion achieved by applying the neural network (seconds vs. hours, on a standard laptop) potentially paves the way to on-the-fly inversion, with possible applications to real-time survey design optimization. On the other hand, in the most conservative scenario, the result of the discussed neural network inversion can serve as a starting model for faster deterministic inversions and/or as a QC tool during the data collection phases.