In the present study, a multilayer perceptron (MLP) is selected as the predictive model. Owing to its universal approximation capability, the MLP is able to represent complex nonlinear relationships without requiring an explicit specification of the underlying functional form. This property makes it particularly suitable for the analysis of spatial patterns in geochemical data, especially in cases where prior knowledge of the relationships between variables is limited. At the same time, it is well recognized that the predictive performance of neural network models strongly depends on both the quality of the input data and the structure of the feature space.
Accordingly, the application of the MLP in this work is combined with a preliminary preprocessing of the input spatial coordinates aimed at increasing feature informativeness and partially compensating for the weak spatial correlations inherent in the original dataset. This preprocessing step is intended to improve the learning efficiency and predictive stability of the neural network model.
2.2. Feature Transformation Methodology Based on Anisotropy Modeling and Dual-MLP Training
The feature transformation methodology is based on constructing an informational and mathematical anisotropy model for the prediction points and on preparing the training dataset and prediction set for computation with a multilayer perceptron. Anisotropy in geological information manifests itself as the dependence of medium properties on direction; in geology specifically, it is understood as the dependence of the variability of geological indicators on spatial direction. For example, the variability of the contents of different ore components within an ore body often differs with direction, which is a consequence of the specific stages of its formation. Physically, anisotropy may be associated with fine layering, elongated shapes of soil particles, fracturing, ice content, metamorphism, and the stress–strain state, that is, with lithological, structural, textural, and tectonic properties of rocks, as well as other causes [13,14].
In the present study, the focus is placed on the anisotropy of mineral content, understood as the directional variability of mineral concentrations with respect to the prediction direction. The geological sampling data are summarized in Table 1, while their spatial distribution is illustrated in Figure 1, which shows the locations of the geochemical sampling points. The color intensity of the points reflects the corresponding concentration values, with darker shades indicating higher mineral content. Points displayed with uniform color denote locations for which geochemical assay results are unavailable.
The prediction task is formulated as follows: for the generated points located in the northwestern direction, the gold concentration (Au1) must be forecasted. Accordingly, the analysis is focused on the directional variability of mineral content specifically along the northwest direction.
Convex hulls are constructed for the set of geochemical sampling points and for the set of generated points [15].
Figure 2 illustrates the convex hulls used to visually represent the spatial boundaries of the original data and the area within which the predicted values are generated.
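As an illustration of the hull construction, the following minimal sketch computes a convex hull with Andrew's monotone chain algorithm. The algorithm choice, the function names, and the sample coordinates are assumptions made for this example; the source does not state which hull procedure was used.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of (a - o) and (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it does not make a strict left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical sampled points: four corners and one interior point.
sampled = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.5)]
hull_r = convex_hull(sampled)
```

The interior point is discarded and only the four corner points remain as hull vertices.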
We introduce the following notation: R is the set of points with geochemical sampling results (red), and G is the set of generated points (green). Then C = R∪G is the union of these sets, S = R∩G is their intersection, RG = R\G is the difference of R and G, and GR = G\R is the difference of G and R. Based on this notation, GR is the set of generated points located in the area covered by a thick layer of unconsolidated deposits; SR is the set of generated points belonging to the intersection S = R∩G; and SG is the set of points with known geochemical analysis results belonging to the intersection S = R∩G.
Thus, the problem of forecasting gold content at the points of set G is divided into the following two tasks:
To forecast the gold content at the points of set SR—essentially an interpolation problem, i.e., estimating the unknown gold content values at the points of set SR, which are located between the points of set SG with known geochemical analysis results.
To forecast the gold content at the points of set GR—essentially an extrapolation problem, i.e., estimating the unknown gold content values at the points of set GR, which are located outside the set of points SG with known geochemical analysis results.
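The split into an interpolation set and an extrapolation set can be sketched as a point-in-convex-polygon test. This is a simplified illustration that assumes the generated points are classified against the convex hull of the sampled points, given as counter-clockwise vertices; the function names and coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def inside_convex(hull: List[Point], q: Point) -> bool:
    """True if q lies inside or on a convex polygon given in counter-clockwise order."""
    n = len(hull)
    for i in range(n):
        a, b = hull[i], hull[(i + 1) % n]
        # A negative cross product puts q to the right of edge a->b, i.e. outside.
        if (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0]) < 0:
            return False
    return True


def split_generated(hull_r: List[Point], generated: List[Point]):
    """Split generated points into SR (inside the hull) and GR (outside the hull)."""
    s_r = [q for q in generated if inside_convex(hull_r, q)]
    g_r = [q for q in generated if not inside_convex(hull_r, q)]
    return s_r, g_r


# Hypothetical hull of the sampled points and a few generated points.
hull_r = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
generated = [(1.0, 1.0), (3.5, 2.0), (-1.0, 2.0), (-2.0, 4.0)]
s_r, g_r = split_generated(hull_r, generated)
```

Points in `s_r` would be handled by the interpolation task and points in `g_r` by the extrapolation task.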
The methodology for solving the interpolation problem for the unknown gold content values at the points of set SR.
Since we are interested in the variation of gold content in the northwest direction, we first determine the equations of the straight lines along which the mineral content will be forecasted for the points of sets R and G. To do this, we plot both sets on a single figure, determine the correlation lines for these sets, and display them in Figure 3. As can be seen, the correlation lines of these sets are almost parallel.
We will consider the anisotropy of the mineral content along the correlation lines in the northwest direction, as shown in Figure 3.
The concept of “anisotropy converging to a point” is introduced. For its geometric interpretation, we define the following objects and terms. The prediction line is the straight line perpendicular to the correlation line; in Figure 3 it is shown as the green line perpendicular to the correlation line of set G. The prediction line drawn through a prediction point divides the plane into two half-planes: the southeast half-plane (the right half-plane), which must always contain points with known geochemical sampling results, and the northwest half-plane (the left half-plane), which contains the prediction points, i.e., points without geochemical sampling results.
In particular, in Figure 3, the red points with known geochemical sampling results lie southeast of the prediction line (the green line perpendicular to the correlation line of the generated set), while the green points for which the mineral content must be forecasted lie northwest of it. Note that points with known geochemical sampling results may also lie in the left half-plane. Thus, for forecasting, it is necessary to select a point from the prediction set R such that, when the prediction line is drawn through it, a certain number of geochemical sampling points lie to the right of this line.
The rule for selecting the point for which the anisotropy model must be constructed and/or the mineral content must be forecasted (hereinafter, the prediction point) is as follows: we choose a prediction point p from the prediction set such that, when the prediction line is drawn through it, a subset of geochemical sampling points lies to the right of this line, and the points of this subset closest to the prediction line lie no farther from it than the maximum distance between neighboring points of set G.
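The selection rule can be expressed with a signed distance to the prediction line. The sketch below assumes that the forecast direction is supplied as a unit vector along the correlation line, that coordinates are (longitude, latitude) so that northwest corresponds to the direction (-1, 1), and that a plain Euclidean metric is acceptable; `max_gap` stands for the maximum distance between neighboring points of set G. All names and values are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def unit(v: Point) -> Point:
    """Scale a vector to unit length."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def signed_offset(p: Point, q: Point, forecast_dir: Point) -> float:
    """Signed distance from q to the prediction line through p.

    forecast_dir must be a unit vector. Positive values lie on the forecast
    (left) side, negative values on the right side that holds sampled points.
    """
    return (q[0] - p[0]) * forecast_dir[0] + (q[1] - p[1]) * forecast_dir[1]


def right_half_plane(p: Point, samples: List[Point], forecast_dir: Point) -> List[Point]:
    """Sampled points lying strictly to the right of the prediction line."""
    return [q for q in samples if signed_offset(p, q, forecast_dir) < 0]


def is_admissible(p: Point, samples: List[Point], forecast_dir: Point,
                  max_gap: float, min_count: int = 1) -> bool:
    """Check the selection rule: enough points on the right, the nearest within max_gap."""
    gaps = [-signed_offset(p, q, forecast_dir)
            for q in right_half_plane(p, samples, forecast_dir)]
    return len(gaps) >= max(min_count, 1) and min(gaps) <= max_gap


NORTHWEST = unit((-1.0, 1.0))  # assumed orientation of the forecast direction
p = (0.0, 0.0)
samples = [(1.0, -1.0), (2.0, -2.0), (-1.0, 1.0), (3.0, 0.0)]
right = right_half_plane(p, samples, NORTHWEST)
```

Here the nearest right-hand point lies about 1.41 units from the prediction line, so the point is admissible for `max_gap = 1.5` but not for `max_gap = 1.0`.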
Let p be a prediction point through which the prediction line AB passes (Figure 3). We divide the right half-plane into sectors Si (i ≥ 1), and each sector is further divided into subsectors Pij, where a subsector is the part of a sector bounded by two arcs of different radii rj (j ≥ 1); the first subsector of each sector is bounded on the angular side by the prediction point p itself.
In Figure 4, p is the prediction point; AB is the prediction line; the right half-plane ABΠ is divided into three sectors: S1 = ApD, S2 = DpC, S3 = CpB; each sector is in turn divided into three subsectors according to the radii r1, r2, r3. The number of divisions is chosen for demonstration purposes; in principle, the division may be made into any reasonable number of sectors and subsectors.
The following conditions are imposed on the sector subdivision. Let N be the total number of points in the right half-plane for which geochemical sampling results are known, and let NS be the number of sectors into which the right half-plane ABΠ must be divided. Then the sectors should be constructed such that each sector contains ⌊N/NS⌋ points with known geochemical sampling results, i.e., the integer part of N/NS. For definiteness, the remainder of the integer division of N by NS is added to the last sector. For example, if N = 100 and NS = 3, the remainder is 1, and therefore the first sector should contain 33 points, the second sector 33 points, and the third sector 34 points.
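The equal-count rule can be sketched as follows. The example assumes that points are ordered by their angle measured from an axis pointing into the right half-plane, which avoids the wrap-around of `atan2`; the function names, the `inward` vector, and the sample points are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def sector_sizes(n_points: int, n_sectors: int) -> List[int]:
    """Equal-count sector sizes; the remainder goes to the last sector."""
    base, remainder = divmod(n_points, n_sectors)
    return [base] * (n_sectors - 1) + [base + remainder]


def split_into_sectors(p: Point, points: List[Point], inward: Point,
                       n_sectors: int) -> List[List[Point]]:
    """Order points by angle around p and cut the ordering into equal-count sectors.

    inward is a unit vector pointing from the prediction line into the
    half-plane that holds the sampled points.
    """
    tangent = (-inward[1], inward[0])  # direction along the prediction line

    def angle(q: Point) -> float:
        dx, dy = q[0] - p[0], q[1] - p[1]
        # Angle relative to the inward axis, within (-pi/2, pi/2) for half-plane points.
        return math.atan2(dx * tangent[0] + dy * tangent[1],
                          dx * inward[0] + dy * inward[1])

    ordered = sorted(points, key=angle)
    sectors, start = [], 0
    for size in sector_sizes(len(ordered), n_sectors):
        sectors.append(ordered[start:start + size])
        start += size
    return sectors


sizes = sector_sizes(100, 3)  # the N = 100, NS = 3 case from the text

pts = [(1.0, 2.0), (1.0, -2.0), (1.0, 0.0), (2.0, 1.0),
       (2.0, -1.0), (3.0, 0.1), (1.0, -0.4)]
sectors = split_into_sectors((0.0, 0.0), pts, (1.0, 0.0), 3)
```

For the worked case in the text, `sizes` reproduces the 33, 33, 34 split.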
Now consider the subdivision of sectors into subsectors using sector S1 as an example. In this sector, the first subsector P11 is bounded on one side by point p and on the other by the arc of radius r1; the second subsector P12 is the part of sector S1 bounded by the arcs of radii r1 and r2; the third subsector P13 is the part of sector S1 bounded by the arcs of radii r2 and r3. In principle, a fourth subsector may also be introduced, bounded on one side by radius r3 and practically unbounded within the sector on the other, but in this case, we restrict ourselves to only three subsectors.
In the presented example, the right half-plane is divided into three angular sectors, and each sector is further subdivided into three radial subsectors. As noted earlier, both the number of sectors in the half-plane and the number of subsector divisions can be selected arbitrarily. The choice of these parameters depends on the type of mineral under investigation, the spatial distribution and density of sampling points, and other geological and methodological considerations.
Figure 4 illustrates the geometric construction used to define the proposed information measure: the partition of the right half-plane into angular sectors and radial subsectors forms a structured directional representation of the spatial neighborhood. After completing the required geometric constructions, we proceed to analyze the points with known geochemical assay results that fall within each subsector. Let Mij denote the set of points located in subsector Pij of sector Si, where i ∈ [1, …, NS] and j ∈ [1, …, NP], NS is the number of sectors into which the right half-plane is divided, and NP is the number of subsectors within each sector.
We then compute the average concentration of the mineral Cij for subsector Pij using the following formula:
$$C_{ij} = \frac{1}{|M_{ij}|} \sum_{k=1}^{|M_{ij}|} c_k \qquad (3)$$
where Cij is the average concentration of the mineral within subsector Pij; ck denotes the geochemical assay result (mineral content) at the k-th point located in subsector Pij; and |Mij| is the cardinality of the set Mij, i.e., the number of points contained in that subsector.
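We now compute the coordinates corresponding to the considered subsector using the following formulas:
$$\overline{\mathrm{lon}}_{ij} = \frac{1}{|M_{ij}|} \sum_{k=1}^{|M_{ij}|} \mathrm{lon}_k, \qquad \overline{\mathrm{lat}}_{ij} = \frac{1}{|M_{ij}|} \sum_{k=1}^{|M_{ij}|} \mathrm{lat}_k \qquad (4)$$
where $\overline{\mathrm{lon}}_{ij}$ is the weighted average geographic longitude of all points in the set Mij; $\overline{\mathrm{lat}}_{ij}$ is the weighted average geographic latitude of all points in the set Mij; $\mathrm{lon}_k$ is the geographic longitude of the k-th point in the set Mij; $\mathrm{lat}_k$ is the geographic latitude of the k-th point in the set Mij; and |Mij| is the cardinality of the set Mij, i.e., the number of points in that subset.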
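The per-subsector averages can be sketched as below. The example assumes that a point belongs to subsector j when its Euclidean distance from p falls in the interval (r_{j-1}, r_j], that points beyond the last radius are ignored, and that the coordinate averages use equal weights; these are assumptions made for illustration, as are the function name and the data.

```python
import math
from bisect import bisect_left
from typing import List, Optional, Tuple

Sample = Tuple[float, float, float]  # (longitude, latitude, concentration)


def subsector_stats(p: Tuple[float, float], sector_points: List[Sample],
                    radii: List[float]) -> List[Optional[Tuple[float, float, float]]]:
    """Return (mean concentration, mean longitude, mean latitude) per subsector.

    radii must be sorted in ascending order. An empty subsector yields None.
    """
    buckets: List[List[Sample]] = [[] for _ in radii]
    for lon, lat, c in sector_points:
        d = math.hypot(lon - p[0], lat - p[1])
        j = bisect_left(radii, d)  # index of the first radius that is >= d
        if j < len(radii):         # points beyond the last radius are skipped
            buckets[j].append((lon, lat, c))
    stats = []
    for members in buckets:
        if not members:
            stats.append(None)
            continue
        m = len(members)
        stats.append((sum(s[2] for s in members) / m,
                      sum(s[0] for s in members) / m,
                      sum(s[1] for s in members) / m))
    return stats


# Hypothetical sector with four sampled points and three radii.
p = (0.0, 0.0)
points = [(0.5, 0.0, 0.2), (0.0, 1.0, 0.4), (1.5, 0.0, 0.6), (0.0, 5.0, 0.9)]
stats = subsector_stats(p, points, [1.0, 2.0, 3.0])
```

Returning `None` for an empty subsector leaves the handling of sparse sectors to the caller, since the source does not specify it.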
We now construct an information model of anisotropy converging to the point p. Here, an information model is understood as a model of an object—in this case, a model of anisotropy converging to the point p—represented in the form of information that describes the parameters and variable quantities of the object which are essential for the study and are mutually related.
The essential constituent elements of the information model of anisotropy converging to the prediction point pi are:
The prediction point, which is described by
- its geographic coordinates: longitude loni and latitude lati;
- the concentration ci of the mineral at the point pi ∈ R.
The average concentration of the mineral Clm for each subsector, where l is the number of the sector in the information model of anisotropy converging to the point pi, and m is the number of the subsector of sector l corresponding to the prediction point pi ∈ R;
The weighted average geographic coordinates (longitude and latitude) of each subsector of the prediction point pi ∈ R.
As an example, the information model of anisotropy converging to the point p (Figure 4) is presented in tabular form for the first sector in Table 2, which describes the parameters and variables involved in the anisotropy analysis.
As noted above, an information model describes the parameters and variable quantities of the object under study that are essential and mutually interrelated. Traditionally, anisotropy in geochemical data is characterized using assay values at sampling points through the identification of principal directions of spatial correlation or by means of spatial variogram models [16,17].
In the present study, however, anisotropy is represented using an alternative information model formulated as an ordered sequence of variations in the averaged mineral concentration values and the weighted average coordinates of subsectors within the defined sectors. These quantities are computed according to Formulas (3) and (4) as the subsectors converge toward the prediction point, thereby providing a directional and structured representation of spatial variability.
Based on this ordered sequence of averaged mineral concentration values in the subsectors as they converge toward the prediction point p, it becomes possible to construct, for each i-th sector, a functional relationship of the following general form:
$$\mathrm{Content} = f_i(r) \qquad (5)$$
where Content is the mineral concentration in the considered sector as a function of the radius length r, with r ∈ {r1, …, rNP}, where NP is the number of subsectors in the considered sector; i ∈ [1, …, NS], where NS is the number of sectors for the prediction point p; and fi(r) is the function describing the variation in mineral concentration for the i-th sector, which must be constructed from the data in the following table containing the mineral concentration values and their corresponding radii:
The radius values ri in Table 3 are computed using the following formulas:
Numerous methods exist for approximating tabulated data by various types of functions [18], and any of them can be used to construct a functional dependence of the mineral content on the radius length. All sectors of the prediction point p must be approximated by functions of the same type, and, in addition, all functions describing the relationships in (6) must satisfy the following condition at r = 0:
$$f_i(0) = c, \qquad i = 1, \dots, NS \qquad (7)$$
which means that, for any sector i ∈ [1, …, NS], the value of function (5) at the point r = 0 must be equal to the numerical value of the mineral content c at point p.
Thus, all prediction points of the considered sets must satisfy Equations (5) and (7).
Next, we determine the number of functions Nf of type (5) that must be approximated from tables of the type of Table 3 for the prediction set P:
$$N_f = NS \times |P|$$
where NS is the number of sectors used to construct the anisotropy model converging to the point, and |P| is the cardinality of the set P.
For example, for the set P shown in Figure 1, the cardinality is |P| = 820.
If we construct NS = 3 sectors for each prediction point, then we must build 820 × 3 = 2460 functions approximating 2460 tables of type 3.
As noted earlier, such a task can most effectively be addressed today using multilayer perceptrons. Therefore, instead of constructing functions of type (5), we shall approximate the anisotropy of mineral concentration values converging to the point—represented in the form of a Table of type 3—using MLP.
The information model of anisotropy converging to a point is then transformed into a form suitable for input into a neural network. We represent the data from Table 3 as a single row of a training set corresponding to one sector of the selected prediction point p.
Such a row, when forming the training dataset, will have the structure shown in Table 4 below:
Table 4 is constructed for the case in which each prediction point is represented by NS = 3 sectors, and each sector is divided into NP = 3 subsectors. Under these conditions, the first 11 parameters in Table 4 serve as input data for the MLP, while the 12th parameter represents the output training target of the MLP, which in this case corresponds to the mineral content at the prediction point. During MLP training, the output neuron value for a given set of input parameters must converge to the value specified by the 12th parameter.
Each row must be stored in CSV format (Comma-Separated Values)—a standard text format for tabular data representation.
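A row of this kind can be assembled and serialized with the standard `csv` module. The column order used here (coordinates of the prediction point, then the mean concentration and mean coordinates of each subsector, then the target) is an assumption chosen only to match the count of 3 × NP + 2 inputs plus one output; the actual layout is the one given in Table 4. The numeric values are hypothetical.

```python
import csv
import io
from typing import List, Sequence, Tuple


def build_row(p_lon: float, p_lat: float,
              subsectors: Sequence[Tuple[float, float, float]],
              target: float) -> List[float]:
    """One sector row: 2 point coordinates, 3 values per subsector, 1 target."""
    row = [p_lon, p_lat]
    for c_mean, lon_mean, lat_mean in subsectors:
        row.extend([c_mean, lon_mean, lat_mean])
    row.append(target)
    return row


def rows_to_csv(rows: List[List[float]]) -> str:
    """Serialize rows as comma-separated text without a header line."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical sector with NP = 3 subsectors: (mean concentration, mean lon, mean lat).
subsectors = [(0.31, 71.26, 50.09), (0.27, 71.28, 50.07), (0.22, 71.31, 50.04)]
row = build_row(71.25, 50.10, subsectors, target=0.29)
csv_text = rows_to_csv([row])
```

With NP = 3 the row holds 11 inputs followed by the target, which agrees with the 12 parameters described for Table 4.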
Given the specifics of training and solving prediction problems using MLP, the prediction points p fall into two categories:
- Prediction points used for training the neural network, i.e., points for which the mineral content is known;
- Actual prediction points, i.e., points at which the mineral content must be determined.
In the general case, for a prediction point p whose right half-plane is divided into NS sectors, each with NP subsectors, a single row of the training dataset corresponding to one sector contains (3 × NP + 2) input parameters and one output parameter, the mineral content at point p.
Therefore, the total number of rows corresponding to prediction point p is NS.
When forming the MLP training set for N prediction points, the total number of rows in the training dataset is N × NS. Moreover, for all NS rows corresponding to a particular prediction point pi, the value of the output parameter is identical.
An MLP trained on such data will produce, for each prediction point, NS predicted mineral content values corresponding to its sectors. To obtain the final predicted value of mineral concentration at the prediction point, we refer to condition (7). According to this condition, all values of function (5) at r = 0 must be equal to one another. Therefore, to ensure that the MLP output values for each sector satisfy condition (7), we apply a second multilayer perceptron as follows.
From the NS predicted mineral content values obtained for each sector of the prediction point, we construct a row of the training dataset whose format is presented in the following table:
For validation of the MLP model, the performance metrics defined in Equation (2) were used:
- The first objective function (MSE), defined as the mean squared error between the desired value of the target variable and the value computed by the network over the entire training dataset within a single training epoch;
- The second objective function (MAE), defined as the mean absolute error between the desired value of the target variable and the value computed by the network over the entire training dataset within a single training epoch.
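For reference, the two metrics can be computed as in the following sketch, which uses the standard definitions of mean squared error and mean absolute error; the function names and the sample values are illustrative only.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between target values and network outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between target values and network outputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions.
y_true = [0.0, 1.0, 2.0]
y_pred = [0.0, 2.0, 4.0]
mse_value = mse(y_true, y_pred)
mae_value = mae(y_true, y_pred)
```

Because MSE squares the residuals, it penalizes the single large error in this example more heavily than MAE does.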
The data presented in Table 1 were used to form the training and testing datasets. The dataset was divided into two subsets in a 50%/50% proportion, where samples with even indices were assigned to the training set and samples with odd indices were assigned to the test set.
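The even/odd split can be written with list slicing. The sketch assumes zero-based sample indices; whether the original indexing starts at 0 or 1 is not stated in the text, and with one-based indices the two slices would simply swap roles.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_even_odd(samples: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Even-indexed samples go to training, odd-indexed samples to testing."""
    return list(samples[0::2]), list(samples[1::2])


# Hypothetical sample identifiers.
train, test = split_even_odd(list(range(7)))
```

A deterministic split of this kind is reproducible without a random seed, although it inherits any ordering present in the source table.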
Table 5 presents the results of training the MLP model using the data from Table 4.
After training, the MLP model was evaluated on the test dataset. The testing results demonstrated that the test error did not exceed an MSE value of 0.015.
Columns 1, 2, and 3 contain the mineral concentration values computed by the trained multilayer perceptron from the data given in Table 4 for a single prediction point, while the fourth column contains the true value.
To satisfy condition (7), we form for each prediction point one row in the format shown in Table 6. These rows together constitute the training dataset for a second multilayer perceptron, which must have the following basic parameters:
- Number of input neurons: 3 (equal to the number of sectors NS);
- Number of output neurons: 1.
After training the multilayer perceptron on this dataset, we obtain, for each prediction point, the final neural-network-predicted value of mineral concentration at point p.
If we pass the prediction-set points, prepared using the information model of anisotropy converging to the point, through the trained neural network, the network will compute the predicted mineral concentration values.
Thus, to predict mineral concentration values based on the information model of anisotropy of mineral content converging to the point—described by Equations (5) and (7)—we employ a multilayer perceptron twice.
Table 6. Format of a single training-set row used to obtain a single predicted concentration value at the prediction point.
| C1 | C2 | C3 | c |
|---|---|---|---|
| 1 | 2 | 3 | 4 |
| 0.3251 | 0.2653 | 0.3201 | 0.2894 |
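The rows for the second perceptron can be assembled by grouping the first network's per-sector outputs by prediction point. The sketch assumes that the outputs are stored point by point, with the NS sector values of each point adjacent to one another; the function name is hypothetical, and the numeric values are those of the example row in Table 6.

```python
from typing import List, Sequence


def second_stage_rows(sector_predictions: Sequence[float],
                      targets: Sequence[float],
                      n_sectors: int) -> List[List[float]]:
    """Group NS per-sector predictions per point and append the known target."""
    if len(sector_predictions) != len(targets) * n_sectors:
        raise ValueError("expected n_sectors predictions for every target")
    rows = []
    for i, target in enumerate(targets):
        chunk = sector_predictions[i * n_sectors:(i + 1) * n_sectors]
        rows.append(list(chunk) + [target])
    return rows


# First-stage outputs for one prediction point (C1, C2, C3) and its true value c.
rows = second_stage_rows([0.3251, 0.2653, 0.3201], [0.2894], n_sectors=3)
```

At prediction time the target is unknown, so only the first NS values of each row would be passed to the second network.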
Algorithm for Solving the Prediction Problem Using Preliminary Processing of Geological Input Data and Dual Application of a Multilayer Perceptron
The methodology for solving the interpolation problem of unknown gold concentrations at the points of set SR can be presented in the form of the following algorithm:
1. Consider the sets:
SR—the set of generated points belonging to the intersection of the sets S = R∩G;
SG—the set of points with known geochemical assay results belonging to the intersection S = R∩G;
The information about the points of set SG forms the training sample.
2. Specify the values: NS, NP, r1, r2, …, rNP.
3. For every p∈SG, compute using Formulas (3) and (4) the average concentration and the weighted average coordinates for each subsector.
4. For every p∈SG, construct the information model of anisotropy converging to point p.
The information for each sector is presented as a single row of the training set.
For each prediction point p, NS rows must be formed in CSV format.
5. Define the architecture of the multilayer perceptron (the first neural network):
- Specify the number of input neurons (3 × NP + 2);
- Specify the number of output neurons: 1;
- Specify the number of hidden layers;
- Specify the number of neurons in the hidden layer(s);
- Specify the normalization regime for the input data: each input variable is normalized individually.
6. Specify the number of training epochs.
7. Run the neural network for training.
8. Obtain and process the results.
9. From the outputs corresponding to all NS rows produced by the first neural network, and in accordance with the structure shown in Table 6, construct |SG| rows of the training dataset for the second perceptron, where |SG| is the cardinality of the set SG.
10. Define the architecture of the second multilayer perceptron (the second neural network):
- Number of input neurons: 3;
- Number of output neurons: 1;
- Number of hidden layers;
- Number of neurons in the hidden layer(s);
- Specify the normalization regime for the input data: each input variable is normalized individually.
11. Specify the number of training epochs.
12. Run the neural network for training.
13. Obtain and process the results.
14. We now proceed to solving the interpolation problem, i.e., determining the gold concentration at the points of the set SR. In this case, the prediction set will be prepared based on the set R.
15. Specify the values: NS, NP, r1, r2, …, rNP.
16. For every p∈SR, compute, using Formulas (3) and (4), the average concentration and the weighted average coordinates for each subsector.
17. For every p∈SR, construct the information model of anisotropy converging to point p. The information for each sector is represented as a single row of the training set. Thus, for each point p, NS rows must be generated in CSV format.
18. Run the first neural network to perform prediction.
19. Obtain and process the results.
20. From the values of all NS output neurons, and in accordance with the structure shown in Table 6, form |SR| rows of the training dataset for the second neural network.
21. Run the second neural network to perform prediction.
22. Obtain and process the results.
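Steps 5 and 10 call for each input variable to be normalized individually. One possible realization is column-wise min-max scaling, sketched below; the choice of min-max scaling is an assumption, since the text does not name the normalization scheme, and the function names and data are hypothetical.

```python
from typing import List, Sequence, Tuple

Bounds = List[Tuple[float, float]]


def fit_min_max(rows: Sequence[Sequence[float]]) -> Bounds:
    """Compute the (min, max) pair of every input column."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows: Sequence[Sequence[float]], bounds: Bounds) -> List[List[float]]:
    """Scale every column to [0, 1] using bounds fitted beforehand.

    A constant column (max == min) is mapped to 0.0 to avoid division by zero.
    """
    scaled = []
    for row in rows:
        scaled.append([(v - lo) / (hi - lo) if hi > lo else 0.0
                       for v, (lo, hi) in zip(row, bounds)])
    return scaled


# Hypothetical input rows with two variables on different scales.
train_rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
bounds = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, bounds)
```

The bounds fitted on the training rows (steps 5 to 7) would be reused unchanged when the prediction rows are prepared in steps 15 to 18, so that both datasets share the same scale.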