1. Introduction
Geometallurgy has emerged as one of the most demanding and integrative areas in modern mining practice. At its foundation, this field seeks to develop quantitative methods that allow reliable prediction and optimization across the mineral value chain, spanning from early exploration to processing and final recovery [1]. Traditionally, mine design and planning at different stages have relied mainly on deposit modeling and on understanding the spatial variability of grades associated with mineralization. Geometallurgy extends this approach by linking geological and metallurgical information in a single framework. It brings together the geological setting, the distribution of mineralization and deleterious elements, and the processing response when evaluating design and production plans [2,3]. Through this integration, geometallurgy provides a coherent framework that supports more effective mine design, improves production scheduling, and reduces operational risk [4,5,6].
From an operational perspective, achieving this integration requires spatial models that accurately represent ore grades while preserving their relationships with processing response. The modeling methods must therefore describe the spatial distribution of these variables with sufficient accuracy while still accounting for the complex relationships among them. In this regard, ore control models play a central role. Within the scope of this study, production information is represented through an ore control model developed at the mining scale from blast hole sampling data. The model describes the spatial and temporal distribution of grades, mineralization zones and processing responses as mining progresses. At each stage of production, it provides the most reliable characterization of the material being extracted and is therefore used as the reference basis for the reconciliation and updating of stochastic realizations. It also helps examine the interactions between variables and identify or correct possible errors [7]. In practice, these models are typically built after block extraction, based on manual sampling of run-of-mine (ROM) material followed by chemical assays and geological inspection. In recent years, the use of sensor technologies has improved both the speed and reliability of ROM characterization, which in turn supports the generation of more consistent ore control models. For example, sensors installed along conveyor belts now provide continuous measurements of geological and grade properties, allowing the spatial location of production blocks to be tracked in real time. In closed-loop mining systems, production-scale measurements are continuously integrated with geological and resource/reserve models to reconcile predictions with observed mining outcomes. Benndorf [8] formalized this concept through a feedback-driven framework in which online sensor data and production measurements are assimilated to update ore control and resource/reserve models during operation. This approach allows discrepancies between predicted and extracted material to be identified and progressively corrected, thereby improving short-term decision-making and reducing operational risk. Within this framework, real-time data acquisition, model updating, and reconciliation form an iterative cycle that links production monitoring with spatial modeling. Additional technical details on closed-loop mining systems are discussed in Benndorf [8].
In this context, Benndorf [9] introduced a closed-loop reconciliation framework that enables ore deposit models to be updated during production as new information becomes available. Building on this concept, the Ensemble Kalman Filter (EnKF) was subsequently adapted to the mining value chain as a practical tool for updating geostatistical realizations. The EnKF sequentially assimilates production and monitoring data into simulated deposit models, thereby yielding progressively refined representations of the deposit. Previous studies have shown that the EnKF provides an effective and flexible framework for updating mineral deposit models as additional data are acquired [7,9,10,11,12,13,14].
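The sequential assimilation step described above can be sketched as follows. This is a minimal illustration of a generic stochastic (perturbed-observation) EnKF analysis step with a linear observation operator, not the implementation used in this study; the function name `enkf_update`, the argument names, and the assumption of independent observation errors with a common variance are all hypothetical choices made for the example.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_operator, obs_error_var, rng):
    """One EnKF analysis step (perturbed-observation variant, illustrative only).

    ensemble      : (n_state, n_members) array, one realization per column
    obs           : (n_obs,) array of observed values
    obs_operator  : (n_obs, n_state) linear operator mapping state to observations
    obs_error_var : scalar observation-error variance (assumed common to all obs)
    rng           : numpy random Generator used to perturb the observations
    """
    n_members = ensemble.shape[1]
    # Deviations of each realization from the ensemble mean
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    predicted = obs_operator @ ensemble
    pred_anom = predicted - predicted.mean(axis=1, keepdims=True)
    # Ensemble estimates of the state-observation and observation covariances
    cov_xy = anomalies @ pred_anom.T / (n_members - 1)
    cov_yy = pred_anom @ pred_anom.T / (n_members - 1)
    obs_cov = obs_error_var * np.eye(len(obs))
    gain = cov_xy @ np.linalg.inv(cov_yy + obs_cov)  # Kalman gain
    # Each member assimilates its own noisy copy of the observations
    noise = rng.normal(0.0, np.sqrt(obs_error_var), size=predicted.shape)
    perturbed = obs[:, None] + noise
    return ensemble + gain @ (perturbed - predicted)
```

Because the gain is built from ensemble covariances, the update is only as good as the Gaussian, linear-dependence description of the ensemble, which is the motivation for the transformations discussed next.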
A key limitation of the EnKF is its limited ability to preserve inter-variable relationships during the updating process [15]. This limitation is particularly relevant for geochemical variables (elemental grades), which are compositional in nature, as neglecting their compositional structure may lead to biased or physically inconsistent results. Applying the EnKF directly in the compositional space violates the Gaussian assumptions underlying the method, since compositional variables are constrained and statistically dependent [16]. As a result, the updating process may generate unrealistic outcomes, such as negative elemental grades. The use of ratio or log-ratio transformations is one avenue for transferring compositional data into an unconstrained space [16,17].
In addition, the equations of the EnKF are derived under the assumption that the variables follow a multivariate Gaussian distribution. Departures from this assumption may lead to suboptimal or even biased results [15]. In practice, the multivariate Gaussian assumption is rarely satisfied in geostatistical applications. To address this, several transformation methods have been developed to map variables into a Gaussian scale. The univariate Gaussian anamorphosis (UGA), minimum/maximum autocorrelation factors (MAFs), projection pursuit multivariate transform (PPMT), flow anamorphosis (FA), and rotation-based iterative Gaussianization (RBIG) are among the approaches developed for this purpose [14,18,19,20,21,22,23,24,25,26]. UGA, commonly used in geostatistics, transforms only the marginal distributions of the variables into Gaussian ones, treating each variable independently. Although it is sufficient to model marginal distributions, it does not guarantee a joint Gaussian representation when several correlated variables are treated together. In contrast, PPMT, FA and RBIG treat all the variables jointly and are capable of transforming them into a multivariate Gaussian distribution. In the multivariate EnKF, model updates depend on the cross-covariance terms between variables. In this setting, departures from joint Gaussian behavior can affect the consistency of the updating step. For this reason, the present study adopts a multivariate Gaussianization strategy that targets the joint distribution of the variables to be updated.
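To make the marginal nature of UGA concrete, the sketch below implements a simple rank-based normal score transform, a common empirical form of univariate Gaussian anamorphosis. It is an illustrative assumption rather than the transform used by the authors; the function name `normal_score` and the plotting-position rule `(rank - 0.5) / n` are choices made for this example. The transform acts on one variable at a time, so it cannot alter how two variables depend on each other.

```python
from statistics import NormalDist

import numpy as np

def normal_score(values):
    """Map a sample to standard Gaussian scores through its ranks.

    Illustrative empirical anamorphosis: ties are broken by order of
    appearance, and no declustering weights are applied.
    """
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values)) + 1   # ranks 1..n
    probs = (ranks - 0.5) / values.size          # keep probabilities inside (0, 1)
    inv_cdf = NormalDist().inv_cdf               # standard Gaussian quantile function
    return np.array([inv_cdf(p) for p in probs])
```

Applying `normal_score` separately to two nonlinearly related variables yields two Gaussian histograms, but the scatter plot between the scores may remain curved or heteroscedastic, which is the situation that joint methods such as PPMT, FA and RBIG are designed to address.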
Despite these methodological advances, important challenges remain in the real-time updating of geometallurgical models. In particular, existing approaches often address geological uncertainty, compositional constraints, and multivariate dependencies in isolation, rather than within a unified updating framework. As a consequence, the joint updating of geochemical and processing-related variables remains limited, particularly when compositional data and non-Gaussian distributions are involved. This situation highlights the need for an integrated framework that simultaneously considers geological uncertainty, compositional structure, and multivariate relationships during model updating.
The objective of this study is to develop an updatable geometallurgical framework that integrates geological, geochemical, and processing information within an ore control context (Figure 1). The proposed approach explicitly accounts for uncertainty in geological boundaries and supports the spatial modeling of geometallurgical variables, including geochemical components and processing responses. The geological modeling workflow adopted in this study follows previously published work [13,27], and the resulting stochastic geological models are used as inputs for the simulation of continuous geometallurgical variables using turning bands (TB) simulation. These stochastic realizations are subsequently updated as new production data become available using the EnKF approach.
By integrating the isometric log-ratio (ILR) transformation with flow anamorphosis (FA), the proposed framework enables the updating of continuous geometallurgical variables within a multivariate Gaussian setting. The ILR transformation ensures that the compositional nature of geochemical variables is properly handled, thereby allowing the updating to be performed in an unconstrained space free from closure effects. The FA method is subsequently applied to map the ILR-transformed variables onto a multivariate Gaussian distribution while preserving their dependence structure, which is essential for multivariate spatial modeling [1,22]. Although alternative Gaussianization techniques, such as PPMT and RBIG, are computationally more efficient, their marginal, histogram-based transformations may lack robustness during back-transformation, particularly for skewed or heavy-tailed distributions [19,22]. Given the moderate dimensionality of the variables considered in this study, the ILR–FA combination provides a suitable balance among closure handling, multivariate Gaussianization, and robustness of the inverse transformation.
2. Case Study: Golgohar Iron Deposit
The Golgohar iron mine is located in Kerman Province, about 55 km southwest of Sirjan (at latitude 29° N and longitude 55°15′ E). Geophysical magnetic and gravity surveys led to the discovery of six anomalies within an area of approximately 40 square kilometers. This area is recognized as one of the major iron deposits in Iran, with a total mineral reserve of more than 1135 million tonnes and average grades of 57.2% iron, 0.16% phosphorus, and 1.8% sulfur [28,29]. Anomaly No. 1 of the Golgohar iron deposit in Sirjan, with an estimated reserve of about 250 million tonnes, is examined in this study. This anomaly represents a lenticular-shaped deposit, with its eastern part extending downward into the ground [29]. Previous studies have shown that during the first ten years of mining this deposit, the average sulfur grade remains below the permissible limit. However, from the eleventh year onward, this value increases progressively and becomes a major issue. For this reason, this area and its available data were selected as the case study for implementing the proposed geometallurgical program.
Based on field studies in open-pit mine No. 1, the host rocks of mineralization consist of metamorphic rocks such as quartzite, mica schist, talc schist, chlorite schist, and amphibolite. Quartzite and mica schists exhibit sedimentary structures and are mostly exposed in the upper parts of the pit, whereas amphibolite appears in various sections of the pit, particularly in its lower parts, indicating a certain degree of mineralization.
2.1. Zoning of Mineralization in Golgohar Pit No. 1
Based on the spatial variations in iron and sulfur grades, processing responses, and associated minerals, three mineralization zones can be defined in Golgohar Iron Ore Pit No. 1: the bottom magnetite, the oxide, and the top magnetite zones (Figure 2). The bottom magnetite zone constitutes the main part of the deposit and is located in the lower portion of the orebody. Sulfide minerals associated with this zone mainly include pyrrhotite and chalcopyrite.
The oxide zone occurs as a layered unit overlying the bottom magnetite zone. It represents the oxidized portion of the deposit and is characterized by secondary iron minerals and reduced sulfur contents resulting from weathering processes [30,31,32]. Typical minerals in this zone include goethite, martite, hematite, and maghemite.
In limited areas of Pit No. 1, a top magnetite zone is observed above the studied anomaly. Despite its shallow position, this zone exhibits a relatively low degree of oxidation, which results in a distinct magnetic response. This reduced oxidation is also reflected in lower average sulfur grades compared with the deeper mineralization zones [30].
2.2. Structural Geology of the Study Area
In terms of structural–geological zoning, the study area lies within the Sanandaj–Sirjan metamorphic belt. The tectonic evolution of the Golgohar region has been strongly influenced by the existing strike-slip faults.
Figure 3 shows the faults located within the area of Golgohar open pit mine No. 1 in Sirjan. The main fault zones in the studied complex are of the reverse type. These reverse faults have thrust the upper ore units of the Golgohar complex over the Korsefid complex. These fault zones in the study area are mostly covered by alluvial deposits. Along the reverse fault paths, strike-slip faults with a NW trend can be observed in the Golgohar region. The Golgohar region has undergone several phases of deformation from the Paleozoic era to the present. The Quaternary faults in the area are all normal faults, with displacements ranging from a few centimeters to several meters clearly visible.
2.3. Exploration Drill Hole Dataset
Out of a total of 541 exploration drill holes present in the study area, 4305 samples with varying lengths and a total drilling length of 23,479 m were provided by the Golgohar complex. The samples were analyzed for five key variables, including the elements iron (Fe), sulfur (S), and phosphorus (P), as well as processing variables such as iron oxide content (FeO) and weight recovery amount (MWT). The assays originate from routine mine and laboratory databases that follow standard QA/QC protocols. Although a comprehensive re-evaluation of QA/QC procedures was beyond the scope of this work, available records and basic statistical checks did not indicate the presence of significant systematic biases. The data were therefore considered adequate for the intended geostatistical modeling purposes.
Data cleaning was restricted to basic preprocessing prior to modeling, comprising duplicate removal and cross-checking of exploration and production data. No gap-filling or imputation of missing values was performed. The drill hole samples were composited to a length of 6 m in accordance with the mining bench height to ensure consistent support for spatial modeling. This compositing length induces some smoothing, particularly for high-variance variables such as sulfur. However, exploration data analysis indicates that the main spatial trends and variance structure are preserved. Given the dense blast hole sampling in the study area, the effect of drill hole compositing on local variability is considered acceptable for the scope of this study.
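The compositing step can be illustrated with a short sketch. The source does not state how the composites were computed, so the length-weighted averaging used here, the function name `composite`, and the `(from_m, to_m, grade)` interval layout are assumptions made for illustration only.

```python
def composite(intervals, length=6.0):
    """Length-weighted compositing of down-hole intervals (illustrative).

    intervals : list of (from_m, to_m, grade) tuples, sorted and non-overlapping
    length    : composite length in metres (6 m in this study)
    Returns a list of (from_m, to_m, grade) composites; unsampled gaps are
    ignored rather than filled, consistent with the no-imputation choice above.
    """
    if not intervals:
        return []
    top, end = intervals[0][0], intervals[-1][1]
    composites = []
    while top < end:
        bottom = min(top + length, end)
        weighted_sum, sampled_length = 0.0, 0.0
        for start, stop, grade in intervals:
            # Length of this assay interval falling inside the composite
            overlap = min(stop, bottom) - max(start, top)
            if overlap > 0:
                weighted_sum += overlap * grade
                sampled_length += overlap
        if sampled_length > 0:
            composites.append((top, bottom, weighted_sum / sampled_length))
        top = bottom
    return composites
```

For example, intervals `(0, 2, 60)` and `(2, 6, 30)` combine into a single 6 m composite with grade (2·60 + 4·30)/6 = 40, which shows how averaging over a longer support damps short-scale contrasts such as those of sulfur.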
According to Figure 4, the density of exploration drill holes in the eastern part of the deposit is higher than in the western part. Moreover, sampling among the various ore units has only been conducted within the three defined mineralization zones, namely, the bottom magnetite, oxide, and top magnetite zones. The proportions of these zones among the available samples are as follows: bottom magnetite (code 101) accounts for 63%, oxide (code 201) for 25%, and top magnetite (code 301) for 12%.
Descriptive statistical analysis showed that the ranges of variation for Fe, FeO, P, S, and MWT are 58.57, 32.7, 11.639, 11.958, and 96 percent, respectively (Table 1). Within the investigated region, Fe, FeO, and MWT display considerable variability, as indicated by their elevated variances. Iron, iron oxide, and weight recovery amount exhibit negatively skewed distributions, in contrast to phosphorus and sulfur, which are positively skewed.
2.4. Blast Hole Production Data and Development of the Ore Control Model
As mentioned earlier, one of the main objectives of this paper is to reconcile and update simulated realizations by incorporating production data collected from mining operations. After the deposit is modeled, the mining blocks are extracted at successive time steps according to the mine design and production schedule. Owing to the sequential nature of block extraction, the ore control model evolves accordingly to reflect the temporal and spatial variations in ore characteristics. Throughout this paper, the ore control model serves as the production reference for the assessment of simulated and updated realizations. As mining at the Sirjan iron ore deposit is performed by open-pit methods, the information obtained from blast hole drilling is used to construct and update the ore control model.
Based on the extracted blocks from the studied deposit, a dataset comprising 28,807 blast hole samples was provided by the mine, each associated with its corresponding extraction time step. The available information includes chemical analyses of iron (Fe), phosphorus (P), and sulfur (S) contents, together with metallurgical responses such as iron oxide (FeO) and weight recovery amount (MWT), assayed under standard QA/QC protocols. These data are complemented by mineralogical descriptions and the recorded sampling time.
Table 2 summarizes the statistical parameters of the variables analyzed in the blast hole samples collected from the studied deposit. According to this table, the ranges of variation for Fe, FeO, P, S, and MWT are 63.4%, 30.6%, 2.845%, 8.362%, and 96%, respectively. In addition, the proportions of the mineralization zones—bottom magnetite, oxide zone, top magnetite, and waste—are 0.343, 0.293, 0.079, and 0.286, respectively, as derived from the analytical results.
As a fundamental step in the reconciliation and updating procedure, it is first necessary to construct an ore control model based on the dataset derived from blast hole information. For this purpose, a block model was created considering the spatial configuration of the blast holes and the dimensions of the extracted blocks. This block model is required to remain consistent with the exploration block model. Accordingly, the dimensions of the constructed blocks were set equal to those used in the exploration block model, i.e., 15 × 10 × 10 m³. After this stage, it is necessary to model the spatial distribution of the studied variables (Fe, FeO, P, S, MWT, mineralization zones) within the defined domain using an appropriate interpolation method.
For this purpose and owing to the very dense blast hole sampling, the nearest neighbor approach was applied to both continuous and categorical variables. Interpolation methods such as kriging or inverse distance weighting were not considered in this context, insofar as they would introduce additional smoothing. The objective of this step was to assign production-scale measurements rather than to generate smoothed estimates. Also, the nearest-neighbor approach provides a consistent treatment of categorical variables, such as geological domains, for which other interpolation methods are not conceptually appropriate, as they result in class mixing. Therefore, the nearest-neighbor method was selected as the most transparent and robust option for the construction of the ore control model. Because the estimation of each block must rely solely on the data located within its boundaries, the search ellipsoid radii were constrained to the block dimensions. Accordingly, the search ranges in the major, minor, and vertical directions were set to 10, 10, and 15 m, respectively.
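The block assignment described above can be sketched as follows. This is a simplified illustration, not the software used for the study: the search volume is taken here as an axis-aligned box of given half-sizes around each block centre (an assumption standing in for the search ellipsoid), and the names `nearest_neighbor_blocks` and `half_sizes` are hypothetical.

```python
import numpy as np

def nearest_neighbor_blocks(block_centers, sample_xyz, sample_values, half_sizes):
    """Assign to each block the value of the closest sample inside its search box.

    block_centers : (n_blocks, 3) block centroid coordinates
    sample_xyz    : (n_samples, 3) blast hole sample coordinates
    sample_values : (n_samples,) grades or numeric domain codes
    half_sizes    : (3,) half-extent of the search box along x, y, z
    Blocks without any sample inside the box are left as NaN.
    """
    block_centers = np.asarray(block_centers, dtype=float)
    sample_xyz = np.asarray(sample_xyz, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    half_sizes = np.asarray(half_sizes, dtype=float)
    assigned = np.full(len(block_centers), np.nan)
    for b, center in enumerate(block_centers):
        inside = np.all(np.abs(sample_xyz - center) <= half_sizes, axis=1)
        if inside.any():
            candidates = np.flatnonzero(inside)
            dist2 = np.sum((sample_xyz[candidates] - center) ** 2, axis=1)
            assigned[b] = sample_values[candidates[np.argmin(dist2)]]
    return assigned
```

Because a single sample value is copied to the block, the same routine works unchanged for categorical codes such as mineralization zones, with no averaging across classes.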
Figure 5 illustrates the ore control model of the elements and mineralization zones constructed from the blast hole data.
3. Stochastic Geometallurgical Modeling: Proposed Approach
In this section, the proposed algorithm is implemented using the available exploration and production datasets from the Golgohar Iron Ore Mine No. 1. It should be noted that the proposed methodology is generic and can be applied to other deposits that require real-time integration of geological, geochemical, and processing information.
As a first step, the study area was partitioned into homogeneous subdomains based on the geological model of the deposit. This partitioning was performed using the VPM-DS algorithm with an updating capability informed by the ore control model. The development and detailed implementation of this algorithm for the study area have been documented in previous studies [13,27]. Therefore, only a brief overview is provided here, with emphasis on its role within the present workflow. In these previous studies, the geological model was treated as stochastic rather than deterministic. Geological domains were described through multiple realizations generated with the VPM-DS algorithm. This approach captures uncertainty in domain boundaries and propagates it to downstream modeling steps.
As a result, grade and geometallurgical simulations are not conditioned on a single fixed interpretation. Instead, geological uncertainty is carried through the workflow and reflected in the simulated outputs.
3.1. Contact Analysis and Definition of Geological Domains
The partitioning of the study area relied on a contact analysis to distinguish ‘soft’ and ‘hard’ boundaries between mineralization zones. This analysis evaluated the continuity of the mean grades of geochemical elements across the boundaries, following established procedures. Specifically, the variations in the mean value of each grade variable near the boundaries separating the mineralization zones were assessed with cross-to-direct variogram ratios of grade–indicator pairs [33], which were computed omnidirectionally and along the major, minor, and vertical directions. As shown in Figure 6, these ratios at short lag distances are systematically below 0.3 in the bottom magnetite zone, whereas they lie between 0.7 and 0.85 in the oxide and top magnetite zones, for all three grade variables Fe, P, and S. The quantitative contact analysis therefore suggests continuous grade transitions between oxide and top magnetite zones, whereas abrupt changes in the mean grades (i.e., marked contrasts) are observed with respect to the bottom magnetite zone. Accordingly, the boundary between the oxide and the top magnetite zones is interpreted as soft, while the boundaries with the bottom magnetite zone are interpreted as hard. The same contact analysis was also applied to the other geometallurgical variables considered in this study, namely FeO and MWT, revealing a similar spatial behavior and supporting the boundary classification adopted for the geological and geometallurgical modeling. Based on this diagnosis, the top magnetite and oxide zones were grouped into a single geological domain, whereas the bottom magnetite zone was defined as a second domain. Accordingly, two homogeneous geological domains separated by a hard contact were established.
As a result, a cascaded approach was adopted for the simulation procedure. First, a geological model was constructed, and grade modeling within each geological domain was then performed separately.
3.2. Stochastic Geological Modeling
In the first stage, geological domain boundaries are updated using production-scale ore control information: newly acquired blast hole data are used to constrain the spatial location of uncertain domain boundaries. This procedure operates on categorical geological domains and reconciles stochastic geological realizations constructed by using the VPM-DS algorithm with observed mining outcomes, while maintaining domain proportions and spatial connectivity. This geological updating step is conceptually separate from the EnKF-based updating of continuous geochemical and geometallurgical variables, which is introduced later in the study.
Figure 7 illustrates the final version of the updated geological model to be used in the subsequent analyses. The updated stochastic geological model will serve as the structural framework for multivariate simulation and subsequent EnKF-based updating of geochemical and geometallurgical variables.
3.3. Subdomaining Accounting for Structural Information
In the second stage, after constructing the geological model, it is necessary to account for the role of faults associated with mineralization in the geometallurgical modeling process. Given the abundance of faults with various azimuths in the study area, the main orientations of these structures must first be examined. The rose diagram provides a convenient means for analyzing fault orientations. As illustrated in Figure 8, the strike of the faults in the study area ranges between 105° and 150°, with most faults striking approximately 120°. Previous studies have demonstrated that the spatial distribution of geochemical variables, in particular sulfur (S), is strongly controlled by the faults present in the study area. To identify the main fault orientations controlling mineralization, the variogram map of sulfur was generated (Figure 9). This map shows that the spatial continuity of sulfur is greatest along the 120° direction. The fault oriented in this direction can therefore be interpreted as being associated with the mineralization. Accordingly, the primary block model was partitioned into four subdomains by considering both the geological domains and the major fault identified in the study area (Figure 10). The continuous variables were subsequently modeled for each subdomain through independent simulation runs.
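A rough sketch of how two geological domains and one major fault give four subdomains is shown below. It is purely illustrative: the fault is idealized as a vertical plane through a reference point with a strike of 120° (azimuth measured clockwise from north), and the function name `subdomain_code`, the 0/1 domain coding, and the 0 to 3 subdomain codes are hypothetical and not taken from the study.

```python
import math

def subdomain_code(x, y, domain, fault_point, strike_deg=120.0):
    """Combine a geological domain (0 or 1) with the side of a fault trace.

    x, y        : easting and northing of the block centroid
    domain      : 0 or 1, index of the geological domain
    fault_point : (x0, y0), any point on the idealized fault trace
    strike_deg  : fault strike as an azimuth clockwise from north
    Returns an integer code in {0, 1, 2, 3}.
    """
    dx, dy = x - fault_point[0], y - fault_point[1]
    # Unit vector along strike in (east, north) components
    ux = math.sin(math.radians(strike_deg))
    uy = math.cos(math.radians(strike_deg))
    # The sign of the 2D cross product tells which side of the trace the block is on
    side = 1 if (ux * dy - uy * dx) > 0 else 0
    return 2 * domain + side
```

In practice the fault surface is not a plane, so a wireframe or a signed-distance model would replace the cross-product test, but the coding principle of one label per domain and fault side stays the same.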
3.4. ILR–FA Transformation of Geochemical Variables and Processing Responses
Figure 11 illustrates the histograms and joint distributions of the continuous variables in the study area. For modeling purposes, the dataset under study is classified into two groups. The first group comprises the geochemical components, namely Fe, P, and S, whereas the second group consists of the geometallurgical variables, namely FeO and MWT. As illustrated in Figure 10, the dataset from the study area was partitioned into four subsets corresponding to the defined homogeneous subdomains. The subsequent simulation procedures were then performed for each subdomain based on its respective dataset.
In this study, the ILR–FA approach was employed for the joint modeling of geochemical and geometallurgical data, which are inherently compositional. First, the data were transformed from the simplex space (where components are expressed as relative proportions constrained to a constant sum) into a real Euclidean space using isometric log-ratio (ILR) transformations to remove the closure constraint while preserving the correlation structure among the variables [34].
Given the compositional nature of the continuous variables Fe, P, S, MWT, and FeO, the ILR transformation was implemented in three separate steps to account for their different measurement supports. In the first step, the geochemical variables Fe, P, and S were treated jointly, as they are reported in weight percent and form a closed compositional system. A filler component was introduced for each sample so that the sum of Fe, P, S, and the filler equaled 100%, after which the ILR transformation was applied to this four-part composition.
In the second step, the same procedure was applied to FeO. Although FeO is also expressed in weight percent, it does not belong to the Fe–P–S compositional system and was therefore treated separately. A filler component was defined so that the sum of FeO and its filler equaled 100% prior to applying the ILR transformation.
In the third step, the ILR transformation was applied to MWT. While MWT is a processing-related variable, it is expressed on a percentage scale. A filler component was therefore introduced to enforce closure before transformation. This stepwise implementation ensured that the compositional constraints of each variable group were preserved during the Gaussianization and subsequent updating procedures.
Table 3 provides an overview of the ILR implementation and the definition of the filler components.
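The filler-plus-ILR construction for the Fe–P–S group can be sketched as follows. The authors implemented their routine in R and do not report which ILR basis they used, so this Python example assumes a Helmert-type orthonormal basis, one of several equivalent choices; the function names `with_filler`, `ilr_basis`, `ilr` and `ilr_inverse` are hypothetical. Any zero or negative component, including a non-positive filler, would need treatment before taking logarithms.

```python
import numpy as np

def with_filler(grades, total=100.0):
    """Append a filler so that every row sums to `total` (percent)."""
    grades = np.asarray(grades, dtype=float)
    filler = total - grades.sum(axis=-1, keepdims=True)
    return np.concatenate([grades, filler], axis=-1)

def ilr_basis(n_parts):
    """Orthonormal Helmert-type contrast matrix, shape (n_parts, n_parts - 1)."""
    basis = np.zeros((n_parts, n_parts - 1))
    for i in range(1, n_parts):
        basis[:i, i - 1] = 1.0 / i      # first i parts share equal positive weights
        basis[i, i - 1] = -1.0          # balanced against part i + 1
        basis[:, i - 1] *= np.sqrt(i / (i + 1.0))  # scale the column to unit length
    return basis

def ilr(composition):
    """ILR coordinates of strictly positive compositions (rows)."""
    log_parts = np.log(np.asarray(composition, dtype=float))
    clr = log_parts - log_parts.mean(axis=-1, keepdims=True)  # centred log-ratio
    return clr @ ilr_basis(clr.shape[-1])

def ilr_inverse(coords, total=100.0):
    """Back-transform ILR coordinates to compositions summing to `total`."""
    coords = np.asarray(coords, dtype=float)
    parts = np.exp(coords @ ilr_basis(coords.shape[-1] + 1).T)
    return total * parts / parts.sum(axis=-1, keepdims=True)
```

A four-part composition (Fe, P, S, filler) gives three unconstrained coordinates. Whatever values an update assigns to those coordinates, `ilr_inverse` returns strictly positive parts that sum to 100%, which is why updating in ILR space avoids negative grades.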
Next, the Flow Anamorphosis (FA) procedure was applied in its multivariate form to the ILR-transformed dataset in order to achieve normalization and to minimize residual cross-correlations among variables. This procedure evaluates the joint probability structure of the data and performs a nonlinear mapping into the Gaussian scale, thereby ensuring that the transformed variables fulfill the assumption of multivariate normality [22]. The transformation is defined as a series of monotonic flows. Within this framework, the flow gradually reshapes the empirical distribution of the data toward a Gaussian form. These flows are described by smooth differential equations that control how the target density evolves from the normal model toward the observed empirical one. This gradual deformation keeps the mapping invertible and preserves the geometry of the data space [22,35].
Considering the central role of the ILR and FA procedures, an integrated ILR-FA routine was written in the R programming environment. The code was applied separately to the datasets corresponding to each geological subdomain defined in the modeling stage.
Figure 12 shows the histograms and bivariate distributions of the normalized variables obtained after applying the ILR–FA approach in the different subdomains. The results confirm that the normalization step in this approach was successful, ensuring a consistent scale across all variables. This outcome is related to the partitioning of the dataset into homogeneous subdomains, which reduced spatial complexity and heterogeneity.
Following the ILR–FA transformation, the original continuous variables (Fe, P, S, FeO, and MWT) were mapped onto five uncorrelated Gaussian components, denoted V1 to V5. These components are linear combinations of the original variables in the transformed Gaussian scale and are used as state variables in the subsequent spatial simulation and EnKF updating.
3.5. Spatial Simulation of Geochemical Variables and Processing Responses
After Gaussianization using the ILR–FA transformation, variogram analysis was carried out separately for each uncorrelated Gaussian component within each geological subdomain. Experimental variograms were computed along the major, minor, and vertical directions. Geometric and zonal anisotropies were consistently identified across the variograms and were explicitly accounted for in the subsequent turning bands (TB) simulation.
One hundred realizations of every variable were generated within each subdomain using TB simulation [36,37]. TB was selected for its computational efficiency and its ability to generate a large number of conditional multivariate realizations on dense production grids. In this application, the method offers a practical balance between conditioning accuracy and computational cost, given the repeated simulations required by the EnKF updating framework. Alternative approaches based on covariance matrix decomposition or on kriging-based conditioning are well established. At the spatial scale considered here, however, their computational cost limits their use for real-time or near-real-time updating. In contrast, TB enables fast and accurate reproduction of second-order statistics, making it well suited for large-scale simulations, uncertainty assessment, and flow property estimation. In its conditional form, the TB simulation integrates measured data through kriging-based correction, ensuring exact data reproduction while maintaining spatial variability [38].
Figure 13 shows a three-dimensional view of one realization in the original scale. The figure displays the spatial distribution of Fe, P, and S together with the geometallurgical variables FeO and MWT. As shown in Figure 13, the patterns of the geochemical and geometallurgical variables show a clear difference between the two geological domains. In the reproduced realization, sulfur is lower in the first domain than in the second domain. This pattern reflects a compositional transition across the boundary.
3.6. Performance Assessment
To highlight the role of a geometallurgical program, several approaches were used to simulate the grades of iron, sulfur, and phosphorus in the study area. In the first approach (A1), the simulation was based on the grade-related variables (Fe, P, S) without introducing any additional parameters. The data were first transformed from the simplex space to real space using the ILR transformation. The FA method was then applied to normalize the variables and remove cross-correlations. Finally, each variable was simulated one hundred times using the TB algorithm. In the second approach (A2), the geometallurgical parameters (FeO, MWT) were included in the simulation to extend the previous strategy. The third approach (A3) followed the same framework but also considered the geological boundaries. Based on the simulated geological domains, the input data—including the grade variables, geometallurgical parameters, and block model—were divided into two homogeneous domains, and the simulations were performed separately for each. In the final (proposed) approach (A4), the major fault identified in the study area was incorporated, using the same procedure as in the previous step. To evaluate the results and determine the most suitable approach, the reproduced realizations were compared with the conditioning data. The comparison was performed using univariate statistics, correlation analysis, and variogram modeling. As the first approach only includes three components, the results of these three components are presented for all approaches in the following sections.
Univariate statistical analysis involved examining the histograms of the realizations generated by the different approaches. In this step, the mean and variance of each realization were compared among the approaches. The conditioning data used in the simulations were taken as the reference for comparison.
Figure 14 and
Figure 15 show the comparison of univariate statistical parameters obtained from each approach with those derived from the original data. The mean values were well reproduced in all approaches, and the difference between the reproduced means and those of the conditioning data decreases steadily from the first to the fourth approach. The variance of the reproduced realizations in the first two approaches is lower than that of the conditioning data, although the gap is smaller in the second approach. The gap narrows further in the third approach and is smallest in the fourth.
One of the objectives of this study is to ensure that the cross-correlations between the studied factors are properly reproduced in the realizations.
Figure 16 shows the correlation coefficients between the studied ILR-FA factors obtained from the realizations by the different approaches, compared with those derived from the conditioning data. As shown in
Figure 16, the difference between the reproduced and actual correlation coefficients is small in the first approach. However, all ILR-FA factors show some bias in their correlation values. This bias decreases in the second approach and almost disappears in the third. In the fourth (proposed) approach, the difference between the reproduced and actual correlation coefficients is even smaller, indicating that this approach reproduces the correlation structure most closely.
The variogram was then used to describe the two-point spatial statistics of the data and was calculated in different directions. In each approach, the variogram models of the ILR-FA factors were first fitted to the conditioning data. The experimental variograms were then computed from the realizations generated by the different approaches.
Figure 17 shows the comparison between the experimental variograms and the variogram models of the conditioning data. The fourth approach shows the best agreement with the reference models and reproduces the two-point statistics more accurately than the other approaches.
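The variogram check described above can be sketched as follows. This is a simplified, omnidirectional illustration rather than the directional calculation used in the study. The function names are hypothetical, and the spherical model is an assumed example of a fitted structure, since the model types are not stated in the text.

```python
import numpy as np

def experimental_variogram(coords, values, lag, n_lags):
    """Omnidirectional experimental variogram with lag classes [k*lag, (k+1)*lag)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    bins = np.floor(dist / lag).astype(int)
    mean_dist = np.full(n_lags, np.nan)
    gamma = np.full(n_lags, np.nan)
    for k in range(n_lags):
        mask = bins == k
        if mask.any():                                # empty classes stay NaN
            mean_dist[k] = dist[mask].mean()
            gamma[k] = 0.5 * sqdiff[mask].mean()
    return mean_dist, gamma

def spherical_model(h, nugget, partial_sill, a):
    """Spherical variogram model for h > 0 (assumed model type, range a)."""
    h = np.asarray(h, dtype=float)
    rising = nugget + partial_sill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h < a, rising, nugget + partial_sill)

# Hypothetical check on four collinear points with a linear trend
xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
z = np.array([0.0, 1.0, 2.0, 3.0])
lags, gamma = experimental_variogram(xy, z, lag=1.0, n_lags=4)
```

In a validation like the one reported here, `gamma` would be computed for each realization and plotted against the model fitted to the conditioning data; close agreement across lags indicates that the two-point statistics are reproduced.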
Accordingly, the evaluation and validation results show that the fourth (proposed) approach is the most effective in reproducing all statistical measures compared with the other approaches. The realizations generated by this approach are therefore used in the following steps for the reconciliation and updating processes.
4. Updating Stochastic Models for Continuous Variables
As discussed in the previous section, factors such as geological information, tectonic structures, and processing behavior play a key role in modeling grades and processing responses within a geometallurgical framework. During mining and material extraction, however, discrepancies between the geostatistical models and the mined material may appear, especially in deposits with complex geological settings. Reducing these errors during operation remains one of the main challenges in mine planning and ore control. Grade analysis data and geological information from the extracted blocks of the Gol Gohar Iron Mine No. 1 are available for the period 2005–2019. These datasets provide a solid basis for refining the reproduced stochastic models and reducing potential errors. Building an ore control model using production data is therefore an essential step in this process.
To compare the reproduced realizations with the ore control model, the grade–tonnage curves for each were plotted (
Figure 18). For this comparison, the blocks matching those in the ore control model were first selected from the reproduced realizations based on their spatial locations. The corresponding plots were then analyzed.
Figure 18 shows the distributions of the studied elements in the realizations prior to updating. In the mean grade plots of sulfur and phosphorus, some bias with respect to the ore control model is apparent. This can be explained by the complex behavior of these variables, which are characterized by low background concentrations and localized enrichment. The bias highlights the limitations of static geostatistical modeling and motivates the use of the updating framework, which adjusts the realizations using production data in order to correct the bias and reduce the associated uncertainty.
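A grade–tonnage comparison of this kind can be computed as in the sketch below. The function name `grade_tonnage`, the block grades, tonnages, and cut-off values are all hypothetical and serve only to show the calculation. Applying the same function to a realization and to the ore control model over the same blocks gives the two curves being compared.

```python
import numpy as np

def grade_tonnage(grades, tonnes, cutoffs):
    """Tonnage and tonnage-weighted mean grade of blocks at or above each cut-off."""
    grades = np.asarray(grades, dtype=float)
    tonnes = np.asarray(tonnes, dtype=float)
    tonnage = np.zeros(len(cutoffs))
    mean_grade = np.full(len(cutoffs), np.nan)   # NaN where no block passes the cut-off
    for k, cutoff in enumerate(cutoffs):
        above = grades >= cutoff
        tonnage[k] = tonnes[above].sum()
        if tonnage[k] > 0:
            mean_grade[k] = (grades[above] * tonnes[above]).sum() / tonnage[k]
    return tonnage, mean_grade

# Hypothetical Fe grades (%) and tonnages for four co-located blocks
fe = np.array([50.0, 55.0, 60.0, 40.0])
t = np.array([100.0, 100.0, 100.0, 100.0])
tonnage, mean_grade = grade_tonnage(fe, t, cutoffs=[0.0, 45.0, 58.0, 70.0])
```

A systematic offset between the mean-grade curves of the realizations and of the ore control model, as described for sulfur and phosphorus, is the bias that the updating step is intended to correct.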
As part of the updating strategy, the EnKF method was applied to the realizations of each element. The EnKF is an advanced data assimilation technique that updates the system state using a set of probabilistic model realizations. Instead of propagating the full error covariance matrix through time, as required in the classical Kalman Filter, the EnKF estimates the covariance from the sample statistics of the ensemble. A fixed covariance inflation scheme was applied to the forecast covariance matrix at each update step to mitigate the tendency of ensemble-based filters to underestimate the forecast error covariance. This approach maintains ensemble spread over successive updates and prevents the filter from becoming overconfident as production data are assimilated. Spatial covariance localization was applied to limit spurious long-range correlations arising from the finite ensemble size. A compactly supported taper function based on the Gaspari–Cohn formulation was used to define a distance-based localization matrix, which was applied to the forecast covariance by element-wise (Schur) multiplication prior to the computation of the Kalman gain. The localization distance was selected according to the spatial continuity indicated by variogram ranges within each geological domain, ensuring consistency with the correlation scale of the Gaussian components.
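The inflation and localization steps described above can be sketched as follows. The Gaspari–Cohn taper is the standard fifth-order, compactly supported function, which vanishes beyond twice the length scale `c`. The inflation factor of 1.05, the coordinates, and the function names are hypothetical, since the values used in the study are not reported in the text.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn fifth-order taper; dist is an array of distances, support is 2*c."""
    r = np.abs(np.asarray(dist, dtype=float)) / c
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    taper[inner] = -0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3 - (5.0 / 3.0) * ri**2 + 1.0
    taper[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return taper

def localized_inflated_covariance(ensemble, coords, c, inflation=1.05):
    """Forecast covariance with fixed inflation and Schur-product localization.

    ensemble: (n_cells, n_members); coords: (n_cells, n_dims).
    The inflation factor is a hypothetical value chosen for illustration.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    coords = np.asarray(coords, dtype=float)
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    cov = anomalies @ anomalies.T / (ensemble.shape[1] - 1)   # sample covariance
    cov *= inflation                                          # fixed covariance inflation
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return gaspari_cohn(dist, c) * cov                        # element-wise (Schur) product
```

In this sketch, `c` would be tied to the variogram range of the relevant geological domain, so that covariances between blocks farther apart than `2 * c` are set exactly to zero before the Kalman gain is computed.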
During the forecast step, the multivariate relationships between geochemical and geometallurgical variables are preserved through the joint propagation of all variables within a common ensemble of realizations. These relationships are established in the multivariate Gaussian scale defined by the ILR–FA transformation and are carried forward through the ensemble statistics. In the analysis step, the assimilation of production data updates both the ensemble means and the cross-covariance terms between variables. This allows the relationships between grades and processing responses to be updated consistently, such that corrections applied to one variable are coherently transferred to related variables according to their estimated correlations. This mechanism enables information from updated variables to propagate across the multivariate system while preserving compositional constraints and avoiding artificial decoupling between grades and processing variables.
Measurement uncertainty was incorporated into the EnKF through the observation error covariance matrix. In this study, the measurement error was assumed to be constant over time, reflecting stable analytical precision and consistent sensor performance during production monitoring. Under this assumption, the updating procedure focuses on the effect of newly assimilated information rather than on temporal variations in measurement quality. This approach is efficient for complex and high-dimensional problems, making it suitable for geostatistical and geometallurgical modeling. In mineral extraction applications, the system state represents the vector of spatial variables $\mathbf{x}$ defined on the block model grid. The observations $\mathbf{d}$ correspond to production samples collected during operation. The goal is to update the model using the difference between the predicted and observed values [39]. The main equation used in the analysis step can be written as follows [8]:
$$\mathbf{X}_t^{a} = \mathbf{X}_t + \mathbf{K}_t\left(\mathbf{D}_t - \mathbf{H}\,\mathbf{X}_t\right)$$
In this equation, $\mathbf{X}_t$ denotes the state matrix consisting of one hundred realizations generated by the proposed simulation approach at time step $t$, and $\mathbf{X}_t^{a}$ is its updated counterpart. $\mathbf{D}_t$ is the observation matrix at time step $t$. $\mathbf{H}$ defines the observation operator linking the model space to the observed data, and $\mathbf{K}_t$ represents the Kalman gain matrix that controls the weight applied during model correction.
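The analysis step can be illustrated with a minimal sketch of the stochastic (perturbed-observation) EnKF. The perturbed-observation variant is an assumption, as are the function name `enkf_analysis` and the argument layout. Inflation and localization are omitted here to keep the example short.

```python
import numpy as np

def enkf_analysis(X, d, H, R, rng):
    """One EnKF analysis step with perturbed observations (assumed variant).

    X: (n_state, n_members) forecast ensemble
    d: (n_obs,) observed values
    H: (n_obs, n_state) linear observation operator
    R: (n_obs, n_obs) observation error covariance
    """
    n_members = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)            # ensemble anomalies
    P = A @ A.T / (n_members - 1)                    # sample forecast covariance
    S = H @ P @ H.T + R                              # innovation covariance
    K = np.linalg.solve(S, H @ P).T                  # Kalman gain K = P H^T S^{-1}
    noise = rng.multivariate_normal(np.zeros(len(d)), R, size=n_members).T
    D = d[:, None] + noise                           # one perturbed observation per member
    return X + K @ (D - H @ X)                       # updated (analysis) ensemble

# Hypothetical example: three blocks, only the first one is sampled
rng = np.random.default_rng(42)
X = rng.standard_normal((3, 100))                    # 100 realizations
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.01]])
Xa = enkf_analysis(X, np.array([2.0]), H, R, rng)
```

The gain is obtained with `np.linalg.solve` rather than an explicit matrix inverse, which is the usual numerically safer choice. Because `R` is non-zero, the updated ensemble moves toward the observation without collapsing onto it.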
To account for the spatial correlation among the studied variables, all elements were first transformed into uncorrelated Gaussian variables using the ILR–FA approach. In the next step, the updates were assimilated into the corresponding realizations according to the temporal sequence of the blocks in the ore control model. The updated realizations were then transformed back to the original scale.
Figure 19 shows the updating results for one of the reproduced realizations of the uncorrelated normalized factors at different time steps, in plan view.
The computational cost of the proposed framework is controlled by the dimensionality of the transformed variable space and the ensemble size used in the Ensemble Kalman Filter (EnKF). In this study, the combination of the ILR transformation and flow anamorphosis did not result in prohibitive computational demands, given the moderate number of variables and realizations considered. After defining the initial transformations and variogram models, the forecast and analysis steps of each EnKF update were completed within several seconds to a few minutes, consistent with operational decision-making requirements. This performance supports the applicability of the proposed approach in near real-time mining environments.
To demonstrate the impact of the updating procedure on model uncertainty, ensemble variance was evaluated before and after data assimilation. Both the static (prior) realizations and the EnKF-updated realizations were evaluated against the same dataset using identical ore control information, enabling a direct and consistent comparison of prediction errors and conditional uncertainty.
Figure 20 and
Figure 21 show the spatial distribution of mean ensemble variance, averaged over all reproduced realizations, for variables V1 to V5 in plan-view and vertical cross-section, respectively. The prior ensemble variance reflects the initial uncertainty related to geological complexity and limited information. After the final updating step, the posterior ensemble variance exhibits a clear and spatially coherent reduction in uncertainty, particularly within the ore control domain where production data were assimilated. The plan-view maps illustrate the lateral extent of uncertainty reduction, whereas the cross-section at elevation 3,218,826 shows how this reduction propagates vertically within the updated model. This spatial pattern is consistent with the sequential assimilation mechanism of the Ensemble Kalman Filter. In addition, an uncertainty reduction ratio (URR), defined as the percentage reduction in mean ensemble variance between the forecast and updated states, was computed to quantify the overall decrease in variance after updating. For the final updated model, the URR indicates a consistent decrease in ensemble variance across the studied variables, ranging from approximately 31% to 52%. Higher reductions are observed for Fe and MWT, whereas lower values are obtained for P and S.
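The URR as defined above can be computed directly from the prior and updated ensembles. The sketch below assumes that the variance is taken across realizations for each block and then averaged over blocks. The function name and the synthetic ensembles are hypothetical.

```python
import numpy as np

def uncertainty_reduction_ratio(prior, posterior):
    """Percentage reduction in mean ensemble variance; inputs are (n_blocks, n_members)."""
    var_prior = np.asarray(prior, dtype=float).var(axis=1, ddof=1).mean()
    var_post = np.asarray(posterior, dtype=float).var(axis=1, ddof=1).mean()
    return 100.0 * (var_prior - var_post) / var_prior

# Hypothetical ensembles: the posterior spread is 80% of the prior spread
rng = np.random.default_rng(1)
prior = rng.standard_normal((5, 100))
centre = prior.mean(axis=1, keepdims=True)
posterior = centre + 0.8 * (prior - centre)
urr = uncertainty_reduction_ratio(prior, posterior)
```

Shrinking the spread by a factor of 0.8 scales the variance by 0.64, so this synthetic case gives a URR of 36%, which falls inside the 31% to 52% range reported for the studied variables.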
To assess the proposed approach quantitatively, the Root Mean Square Error (RMSE) was used to measure the temporal difference between the actual values ($z_i$) and the simulated blocks ($\bar{z}_{i,t}$):
$$\mathrm{RMSE}_t = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(z_i - \bar{z}_{i,t}\right)^2}$$
In this equation, $N$ denotes the number of blocks used in the assimilation process, and $\bar{z}_{i,t}$ is the mean of the simulated values for each block. The changes in RMSE reflect whether the model performance improves (positive) or deteriorates (negative) relative to time zero:
$$\Delta\mathrm{RMSE}_t = \mathrm{RMSE}_0 - \mathrm{RMSE}_t$$
As more production data are assimilated into the updating process, the RMSE values are expected to decrease.
Figure 22 shows the computed RMSE values for the different ILR-FA factors (V1 to V5). A close look at the ΔRMSE curves reveals a clear trend: as more data enter the system, the model steadily improves. The variables do not behave in the same way, yet all of them move upward without showing the erratic changes that would suggest numerical problems. Two of the variables, V3 and V1, respond more strongly to the added information, while V4 and V2 follow a slower, more gradual path. Such differences are not surprising because each variable interacts with the conditioning data according to its own spatial structure and the way it is represented in the prior covariance. Toward the later stages, the curves begin to flatten, which suggests that most of the available information has already been incorporated and that further updates make only modest changes. This type of behavior is often observed when an ensemble filter becomes increasingly informed. Overall, the progression of the ΔRMSE curves indicates that the updating scheme performs well, reduces the initial uncertainty by a substantial margin, and leads the system toward a stable and coherent set of predictions.
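The RMSE and ΔRMSE measures discussed above can be sketched as follows. ΔRMSE is taken here as the plain difference from the value at time zero, which is an assumption consistent with the stated sign convention (positive means improvement); the study may use a normalized form. Function names and numbers are hypothetical.

```python
import numpy as np

def rmse(actual, ensemble):
    """RMSE between observed block values and the ensemble mean of each block.

    actual: (n_blocks,); ensemble: (n_blocks, n_members).
    """
    block_mean = np.asarray(ensemble, dtype=float).mean(axis=1)
    return np.sqrt(np.mean((np.asarray(actual, dtype=float) - block_mean) ** 2))

def delta_rmse(rmse_series):
    """Change relative to time zero; positive values indicate improvement (assumed form)."""
    rmse_series = np.asarray(rmse_series, dtype=float)
    return rmse_series[0] - rmse_series

# Hypothetical values: three blocks, two realizations each
actual = np.array([1.0, 2.0, 3.0])
ensemble = np.array([[1.0, 1.2], [2.0, 2.2], [3.0, 3.2]])
error = rmse(actual, ensemble)

# Hypothetical RMSE sequence over three time steps
improvement = delta_rmse([0.11, 0.08, 0.055])
```

With this convention, a ΔRMSE curve that rises and then flattens corresponds to an RMSE that falls quickly during the first updates and then stabilizes.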
Based on the performed calculations, the RMSE at time step zero, before the updating stage, was lower than 0.11 for all normalized and uncorrelated components (V1 to V5). Given that the components are normalized to a unit standard deviation, an RMSE of 0.11 indicates that the modeling framework was able to reflect quite accurately the complex relationships among the continuous variables while incorporating the influence of geological domains and the structural effects of mineralization related to the presence of faults. These considerations resulted in realizations that were consistent with the behavior inferred from the monitored extraction blocks.
As shown in
Figure 22, once the updating process began, the RMSE steadily decreased and reached values below 0.055 in the final time steps. This suggests that the updating method was successful in reducing the overall error by more than 50% when incorporating information from the production (blast hole) data. It should be noted that the level of accuracy generally improves as additional time steps are introduced. Even so, reaching a full 100% reproduction of the production data is not feasible, since EnKF assumes a non-zero measurement error for the production data.
It is also clear that, if the realizations provided to the updating algorithm had been generated with approaches A1, A2, or A3, which produced weaker results, the RMSE at time step zero would have been noticeably higher. In such a situation, the discrepancy between predicted and mined blocks would have been larger, and the updating procedure would have had to remove a larger proportion of the initial RMSE.
This observation underscores a fundamental distinction between the proposed updating framework and conventional spatial modeling approaches. Methods such as cokriging or static multivariate simulation rely on a fixed dataset and produce estimates or realizations that remain unchanged once generated. Consequently, discrepancies between predicted and mined blocks cannot be corrected and persist throughout the production stage, becoming embedded in the model as mining progresses. In contrast, the EnKF-based framework adopted in this study operates directly on the initial set of stochastic realizations and updates them progressively as new production information becomes available. The comparison between the static realizations generated by the selected simulation approach (A4) and their EnKF-updated counterparts shows that the main added value of the EnKF lies in its capacity to dynamically correct prediction errors while simultaneously reducing the associated uncertainty during operation.
Figure 20 and
Figure 21 indicate that, although traditional multivariate simulation reproduces spatial variability and inter-variable relationships, it does not lead to a reduction in uncertainty in areas where new information becomes available. In contrast, the EnKF-based updating corrects the realizations locally and reduces ensemble variance within the ore control domain. This behavior reflects the ability of the EnKF to assimilate production data sequentially, a feature that is not addressed by cokriging or static multivariate simulation alone. Furthermore, the temporal evolution of the RMSE (
Figure 22) illustrates the progressive improvement of the model as additional production data are assimilated, a behavior that is not captured by classical spatial modeling techniques. These results indicate that the added value of the EnKF does not lie in replacing multivariate simulation, but rather in extending it into a dynamic, update-enabled framework that supports geometallurgical decision-making during operation.
From an operational perspective, the reduction in prediction errors and associated uncertainty achieved through the EnKF-based updating framework has direct implications for mine planning and processing control at the Gol Gohar mine. Improved spatial estimates of key geochemical and geometallurgical variables support more reliable material classification and stockpile assignment, thereby reducing misclassification errors commonly associated with static ore control models. The proposed approach progressively corrects the stochastic realizations as new production data become available, which contributes to stabilizing the quality of material delivered to the processing plant. Reduced uncertainty in variables such as Fe, P, S, and processing-related indices results in a more consistent feed composition. This consistency is essential for maintaining stable plant performance and reducing the risk of operational disruptions. These benefits are particularly relevant for deposits with complex geological settings, such as the Gol Gohar deposit. In such settings, rapid spatial variability in grade and mineralogical composition can lead to fluctuations in feed quality. The proposed framework enables the integration of real-time ore control information into geometallurgical decision-making. This supports more robust short-term planning and processing operations.
5. Conclusions
This study presents an updatable geometallurgical framework that integrates stochastic geological modeling, multivariate spatial simulation, and real-time data assimilation using the Ensemble Kalman Filter (EnKF). By combining the isometric log-ratio (ILR) transformation with flow anamorphosis (FA), the proposed approach allows compositional geochemical variables and processing responses to be treated consistently within a multivariate Gaussian setting while preserving their dependence structure.
The application of the framework to the Gol Gohar Iron Ore Deposit No. 1 demonstrates its ability to reproduce key spatial patterns and to reveal discrepancies between static model realizations and production data. Although the case study focuses on an iron ore operation, the proposed framework is not deposit-specific and can be transferred to other mining contexts with similar data availability. Variables such as sulfur and phosphorus, which are affected by compositional constraints and localized enrichment, show clear improvements after updating with production data, with a progressive reduction in modeling error as new data are assimilated.
From an operational perspective, the proposed framework provides a practical tool for integrating production information into geometallurgical models during mining. The updating process supports ore control and production planning by reducing uncertainty and correcting biases in grade and processing-response predictions within a time scale compatible with operational decision-making.
From a quantitative point of view, the sequential assimilation produces a gradual and consistent rise in ΔRMSE, meaning that roughly 30% to more than 50% of the initial error is removed when the production data are assimilated. Most of this improvement appears early in the procedure; after this initial adjustment, the filter begins to settle, and the curves slowly stabilize as more updates are added. The proposed approach relies on assumptions related to measurement uncertainty, ensemble configuration, and data availability. While these assumptions were appropriate for the present case study, future work could focus on adaptive treatment of measurement error, alternative ensemble strategies, and broader application of the framework to other deposit types and operational settings.