Physics-Constrained Machine Learning Modeling for Geotechnical Data Prediction: Case Study on Site Soil Type and Bedrock Depth Datasets

Zhang, Yunfeng; Darilmaz, Ahmet

doi:10.3390/geotechnics6010020

by Yunfeng Zhang * and Ahmet Darilmaz

Department of Civil and Environmental Engineering, University of Maryland, College Park, MD 20742, USA

* Author to whom correspondence should be addressed.

Geotechnics 2026, 6(1), 20; https://doi.org/10.3390/geotechnics6010020

Submission received: 3 October 2025 / Revised: 30 January 2026 / Accepted: 5 February 2026 / Published: 10 February 2026

Abstract

This study investigates how incorporating physical constraints can enhance the performance of machine learning models by ensuring that geotechnical drilling data predictions align with known physical conditions at the site. Machine learning-predicted soil property point cloud data has significant value for geotechnical project planning. The base model was trained on extensive borehole datasets of soil properties collected from an area of 32,133 square km covering five distinct physiographical regions. To incorporate physics-based constraints, a custom loss function was defined to penalize the model training loss whenever it violates known physical principles. Two distinct types of machine learning models—classification and regression models—are considered in this study for categorical and numerical geotechnical drilling datasets, respectively. Feature variables play a critical role in determining the accuracy of machine learning models and feature variables including location, geology, surface elevation, soil parent material, physiographical information (codes) and soil layer depth are adopted for training the machine learning models after parametric study of various feature variable combinations. Two case studies were conducted to demonstrate the effectiveness of incorporating physical constraints into machine learning models for categorical and regression datasets respectively. The study results demonstrate strong potential for applying physics-constrained machine learning models to generate reasonable estimated values across large regions, while also providing a better understanding of the historical data within the geotechnical drilling inventory.

Keywords: machine learning; bedrock; soil type; geotechnical; borehole; artificial neural network; point clouds; subsurface exploration; physical constraint

1. Introduction

Geotechnical site characterization involves determining the profiles of subsurface soil properties and stratigraphic distributions, usually through site investigation tests. Geotechnical drilling and in situ testing are usually performed to obtain borehole logs and subsurface soil property profiles, but borehole logs and in situ testing data might not be available for a construction project site. Artificial Intelligence (AI)-based estimation holds significant potential to advance virtual geotechnical site characterization through machine learning model predictions, which is very valuable for project planning and strategic selection of limited boring locations at the project site. Every year millions of dollars are invested into field testing of geomaterials through geotechnical subsurface investigation. There is strong potential for labor reduction by utilizing historic field test data with machine learning models to generate predicted data as well as developing a better understanding of the relationships in the geotechnical drilling data inventory [1]. In the US, state and local government agencies have accumulated massive amounts of engineering datasets as well as other data such as roadway and construction data over a long time period, which creates an excellent opportunity for establishing deep learning models to enable reliable prediction of engineering characteristics and other desired features from the massive datasets [2]. Advanced machine learning models are becoming available to address the challenging needs identified by engineers in design and construction, field investigation data, asset management and operation optimization. Recently, machine learning has begun to be implemented in geotechnical engineering problems [2,3,4,5,6]. 
Moreover, data-centric geotechnics has been increasingly emphasized in recent research, underscoring the crucial relationship between model performance and data quality, feature selection, and domain-informed preprocessing [7,8]. The model developed considering these principles can be utilized as part of the soil boring request process to deliver preliminary subsurface data based on machine learning analysis with historical boring hole data.

Although machine learning models can be used by design engineers to provide historic data-based predictions, traditional machine learning models are plagued by occasional hallucinations, producing physically inconsistent predictions at certain locations. This can be explained by the fact that machine learning model training is primarily driven by statistical patterns in the training data and may not align with well-established physical conditions at the site. This limitation can lead to inaccurate prediction results if calibration and corresponding adjustment is not done properly for the site, especially in complex geotechnical environments. The growing integration of machine learning with the physical sciences has led to the development of Physics-Informed Neural Networks (PINNs). These hybrid models merge traditional data-driven approaches with physical principles, including differential equations, boundary constraints, and empirical rules. A defining feature of PINNs is their ability to incorporate physical constraints directly into the training process, ensuring that predictions remain consistent with established scientific knowledge [9,10]. While physically consistent models should enforce constraints to within machine precision, data-driven algorithms often fail to satisfy well-known constraints that are not explicitly enforced. In particular, neural networks, powerful regression tools for nonlinear systems, may severely violate constraints on individual samples while optimizing overall performance [11]. In addition to the black-box neural networks, recent advancements in white-box algorithms, such as random tree forests [12], expression trees (ETs) [13], Gene Expression Programming (GEP) algorithms [14], and K-dimensional Tree-Graph Convolutional Neural Process (KDTree-GCNP) [15], allow for the integration of explicit physical relationships and domain knowledge while improving transparency and confidence in subsurface predictions. 
However, for the statewide subsurface prediction tasks detailed in this study, the use of Artificial Neural Networks (ANNs) remains a more efficient option due to their excellent nonlinear mapping capabilities, which enable the model to better capture the underlying complex relationships in large tabular geospatial datasets. Furthermore, ANNs coupled with physical constraints mitigate the limitations of conventional black-box models, sustaining high predictive performance without compromising physical principles.

This study investigates how incorporating physical constraints into the training process of machine learning models can improve their performance and reliability for predicting geotechnical drilling data. By penalizing the model when it violates known physical conditions, its predictions are not only made physically plausible but also consistent with the real-world conditions at a project site. Although the studies that explore the use of physical constraints in geotechnical datasets have demonstrated their utility, further work on large-scale applications of these methods remains valuable.

This study focuses on developing physically constrained machine learning models for geotechnical datasets including but not limited to soil type and bedrock depth. Since soil type represents categorical data and bedrock depth represents continuous numerical data, different forms of physical constraint conditions and loss functions have to be applied accordingly. There are two ways to apply physical constraints in machine learning: soft constraints, which are enforced by adding extra penalties to the loss function; and hard constraints, which refer to conditions that must be satisfied when generating the model. While hard-constrained machine learning models have some advantages over soft-constrained ones, such as more robust and accurate predictions, the former are usually difficult to optimize due to their strict observation of the constraints [16]. This study features the simultaneous treatment of classification (soil type) and regression (bedrock depth) tasks within a unified physics-constrained machine learning framework, demonstrated by two case studies. The results of the machine learning prediction are generated in different formats such as tables, raster images and multiple shapefiles containing the predicted geotechnical drilling data such as soil types and bedrock depths. Feature variable selection and parametric analysis are critical for achieving optimal model performance, particularly for datasets with complex geotechnical characteristics. The feature variable combinations adopted for the soil type and bedrock depth models were determined through a thorough parametric study and engineering experience. Through regional raster map visualizations over an area exceeding 32,000 square kilometers, it is shown that the available datasets for soil type and bedrock depth are sufficient to produce large area predictions that appear reasonable overall and comply with specified physical constraints.
In summary, this study makes the following key contributions: (1) machine learning model development using a large and heterogeneous dataset encompassing more than 11,000 boreholes and over 260,000 samples across multiple physiographic regions; (2) the generation of large-area, high-resolution subsurface prediction maps that are directly relevant to engineering practice and infrastructure planning; and (3) clear demonstrations that the incorporation of physical constraints materially reduces geologically implausible predictions when compared with unconstrained models. This work may inspire others who are interested in applying machine learning but remain hesitant due to uncertainty about the level of detail and accuracy achievable with their available data. The results presented here offer a valuable baseline that can serve as a starting point for developing and applying machine learning models to geotechnical datasets in other regions with similar feature variable data available, such as other U.S. states.

2. Soil Type and Bedrock Depth Data

The subsurface site investigation dataset used in this study comprises two distinct types of tabular data: soil type in categorical variable format and bedrock depth in numeric format. Tabular data, as used here, is stored in ASCII file format. Accordingly, two types of machine learning models were developed: a classification model for soil type and a regression model for bedrock depth. The foundation of this study is a large geotechnical drilling dataset spanning the State of Maryland's transportation network. The data was compiled from numerous historical borehole records spanning five distinct physiographical regions. A preliminary feature engineering study was conducted to identify the most influential feature variables for the machine learning models.

2.1. Soil Type Dataset

Exploratory data analysis was applied to the training data as the first step. The distribution of the soil type data used for machine learning model training in this study is presented as a histogram in Figure 1b, which shows that ML and CL are the two major soil categories, each with more than 48,000 data samples. OH has the fewest data samples in this dataset, with fewer than 30 records. This unbalanced distribution across categories can cause class imbalance problems in machine learning model training and requires special attention. The geospatial distribution of these borehole data in the state of Maryland is visualized in Figure 1a. As shown in this figure, data in the western region of Maryland are concentrated at shallower depths than those in the eastern region, where bedrock is deep.

This study utilized an extensive borehole dataset containing soil properties from five different physiographical regions. Machine learning models were trained with 260,751 data samples collected from a total of 11,448 boreholes in Maryland. To obtain optimal performance from supervised machine learning algorithms, candidate features must first be identified based on engineering judgment and refined through trial-and-error parametric studies or feature-sensitivity analyses, such as SHAP analysis. The machine learning model was trained using both geospatial and physical feature variables to predict the primary constituent of soil type at any given drilling depth. We adopted a range of feature variables identified through a parametric study to be critical for model accuracy, including: Geospatial Data, latitude and longitude coordinates to capture spatial dependencies; Geology, local geological formations and rock type; Surface Elevation, a digital elevation model (DEM) dataset was used to provide ground surface elevation at each borehole location; Physiographical Region, a categorical code PHYSIO_COD representing the broader physiographical region (this code was parsed into three sub-codes, SECTION, REGION, DISTRICT; this decoupling separates the distinct information types embedded in the original code, enabling the machine-learning model to interpret each component more precisely); Soil Layer Depth, the vertical depth of a specific soil layer where the sample is collected; and Soil Parent Material, the substance in which a soil develops, was utilized as a categorical feature variable for the machine learning model. Soil Parent Material information on the geological origin of the soil can be found from the United States Department of Agriculture (USDA) Soil Survey Geographic Database (SSURGO) [17]. 
Parent material can be unconsolidated, chemically weathered mineral or organic matter which will significantly influence the resulting soil profile and properties and works as a useful indication of underlying soil type [18]. To extract the parent material value for each data sample, a spatial join tool was employed in ArcGIS Pro (Version 3.4.3) [19] for the SSURGO shapefile at the boring hole location. The original dataset included the SSURGO Map Unit Key (MUKEY), a numeric identifier that links soil map unit polygons to comprehensive soil attributes in SSURGO, including texture and component composition, as a feature variable. This was significant because it directly linked soil properties across multiple attribute tables to spatial soil data. Eventually, the soil parent material color ID code was used to replace the parent material MUKEY feature. Because MUKEYs are only unique within each county, not statewide, it is essential to ensure that grainsize training using soil parent material is not keyed to MUKEY alone. The color code dataset was also developed by the USDA and detailed information can be found at the USDA NRCS website [20]. Since soil color often exhibits a strong correlation with soil type, this dataset provides meaningful complementary information. The USDA dataset provides soil color information at eight depth intervals (5, 10, 15, 25, 50, 75, 100, and 125 cm). In this study, five soil color attributes corresponding to the 25 cm, 50 cm, 75 cm, 100 cm, and 125 cm depths were added to the grainsize training data as new feature variables. It is anticipated that integrating these color ID code data with parent material information can enhance the relationships derived from boring logs and improve the predictive capability of the model.

Once the list of feature variables appropriate for machine learning model training had been determined from a parametric study of their influence on model performance, a data pre-processing procedure was applied to the training dataset to cleanse and re-organize the data. The processing method employed in this study includes removal of samples with missing values and format conversion. A total of 10,853 samples with a null value in at least one feature variable were discarded from the original dataset. Data type and format checks were then applied to the remaining samples in the dataset. For numerical features, all data were converted to floating-point format. Global Positioning System (GPS) coordinates were rounded to seven decimal places in degrees. This ensures that the positional error of subsurface geological data samples remains below 0.011 m. Surface elevation and drilling depth were rounded to two decimal places in meters. The input feature of parent material and the target feature variable of primary constituent of soil type are both categorical variables, which were converted to object string format in the data pre-processing step. Because some raw string feature data contained typos and inconsistent capitalization, string features were cleaned before each unique string was assigned a unique number. Parent material contains 150 unique categories, and the target feature variable, primary soil type, contains twelve unique values in the established Unified Soil Classification System (USCS), describing their behavior in construction and corresponding color ID codes, as shown in Table 1. The USCS categorizes soils based on particle size and plasticity by assigning a two-letter symbol, and groups them into coarse-grained (gravels and sands), fine-grained (silts and clays), and highly organic (peat) categories.
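The pre-processing steps described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used in the study: the field names (`lat`, `lon`, `elev_m`, `depth_m`, `parent_material`, `soil_type`) are hypothetical, and the string cleaning shown here only normalizes whitespace and capitalization, whereas correcting genuine typos would require a manually curated mapping. The conversion of about 111,320 m per degree of latitude is also an assumption used only to check the order of magnitude of the rounding error.

```python
def clean_label(text):
    """Normalize a categorical string: trim, collapse whitespace, upper-case."""
    return " ".join(str(text).split()).upper()


def preprocess(records):
    """Drop samples with any missing value, round numerics, encode categories.

    `records` is a list of dicts with hypothetical field names.
    Returns the cleaned samples and the string-to-integer code table.
    """
    cleaned = []
    codes = {}  # cleaned parent-material string -> unique integer ID
    for rec in records:
        # Discard samples with a null value in at least one field.
        if any(v is None or v == "" for v in rec.values()):
            continue
        label = clean_label(rec["parent_material"])
        code = codes.setdefault(label, len(codes))
        cleaned.append({
            "lat": round(float(rec["lat"]), 7),        # degrees, 7 decimals
            "lon": round(float(rec["lon"]), 7),        # degrees, 7 decimals
            "elev_m": round(float(rec["elev_m"]), 2),  # meters, 2 decimals
            "depth_m": round(float(rec["depth_m"]), 2),
            "parent_material_id": code,
            "soil_type": clean_label(rec["soil_type"]),
        })
    return cleaned, codes


# Worst-case rounding error of a 7-decimal latitude, in meters
# (assumption: roughly 111,320 m per degree of latitude).
max_rounding_error_m = 0.5e-7 * 111_320
```

Under this assumption the worst-case rounding error is about half of the 0.011 m bound quoted above, since rounding moves a coordinate by at most half of the last retained decimal place.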

The original dataset has an imbalanced distribution in the target variable; therefore, an oversampling strategy was adopted to address this issue. This approach achieved a more favorable distribution by replicating samples from each class by a predetermined factor without overemphasizing extremely rare classes, preserving the underlying distribution of soil types across the state. The specific resampling factors used are as follows: ML: 2, CL: 3, SM: 3, SP: 3, IGM: 5, MH: 5, SC: 5, SW: 5, GP: 25, CH: 25, GM: 25, OL: 25, GW: 25, PT: 25, GC: 25, OH: 25, and ROCK: 2. This list includes two types in addition to the standard fifteen USCS classes: IGM and ROCK. IGM (intermediate geomaterial) describes materials, such as weathered or decomposed rock, that are in transition between soil and bedrock. ROCK denotes unweathered bedrock or solid rock layers that may be found within the soil profile. Once the training dataset is finalized, a machine learning classification model can be trained to predict the soil type at the selected site location and specified underground depth.
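Replication-based oversampling with the factors listed above can be sketched as follows. The interpretation of each factor as the number of copies of a sample in the resampled dataset is an assumption, as are the function name `oversample` and the `soil_type` field name.

```python
# Resampling factors as listed in the text (17 classes).
RESAMPLING_FACTORS = {
    "ML": 2, "CL": 3, "SM": 3, "SP": 3, "IGM": 5, "MH": 5, "SC": 5, "SW": 5,
    "GP": 25, "CH": 25, "GM": 25, "OL": 25, "GW": 25, "PT": 25, "GC": 25,
    "OH": 25, "ROCK": 2,
}


def oversample(samples, label_key="soil_type", factors=RESAMPLING_FACTORS):
    """Replicate each sample by the factor assigned to its class.

    Assumption: a factor of k means the sample appears k times in the output.
    Classes without a listed factor are kept once.
    """
    resampled = []
    for sample in samples:
        copies = factors.get(sample[label_key], 1)
        resampled.extend(dict(sample) for _ in range(copies))
    return resampled
```

Because every class is replicated by a fixed factor rather than resampled to a common size, the relative ordering of class frequencies in the original data is largely retained.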

2.2. Bedrock Depth Data

Bedrock depth is a critical parameter for geotechnical engineers, as it directly influences foundation design, slope stability, and the assessment of environmental hazards. For example, the depth to bedrock determines whether shallow foundations are sufficient to transfer structural loads to the underlying soils or whether deep foundations, such as piles, are needed. Lack of accurate bedrock depth information can lead to improper foundation design, which may cause settlement or differential movement in the future. Detection of bedrock depth is one of the critical site investigation procedures for seismic hazard analysis and underground developments that may encounter varying rock formations. The most common practice to detect bedrock is to directly drill boreholes, which is often expensive and provides limited information from discrete boreholes [22].

Regression machine learning models have been developed in this study to predict the bedrock depth values at a given site location. The bedrock depth dataset used in this study includes a total of 5821 samples available for machine learning model training from 3138 boreholes, all located in Maryland. Feature variables for predicting the bedrock depth values using machine learning models include GPS coordinates, surface elevation, physiographical province, and major rock types of the site where the data were collected. Similar to the pre-processing work performed for soil type data before training, samples with missing values in the input or target feature variables were discarded from the dataset before training. Latitude and longitude of GPS coordinates were converted to seven-decimal floating-point data, while surface elevation measurements were converted to two-decimal floating-point data. Typos in the categorical features were corrected, and the features were subsequently converted to object string format.

Figure 2b shows the statistical distribution of the bedrock depth dataset used for machine learning model training. The geospatial distribution of bedrock depth data in Maryland is also visualized in Figure 2a. The middle and western regions of Maryland have most data in the [0, 30 m] range. The objective of this task is to train neural network models for the bedrock depth data, and then use the validated models for predicting bedrock depth at any given site in Maryland, with a special interest in the transition area. In Maryland, the transition zone between bedrock and overlying surface materials plays a critical role in the state’s geology, particularly in the central region where the Piedmont and Coastal Plain provinces converge. This area, known as the Fall Zone, marks the geological boundary between the hard, crystalline rocks of the Piedmont and the softer, unconsolidated sediments of the Coastal Plain [23,24]. Seismic and drilling surveys in these zones have revealed highly variable bedrock depths within short distances. Bedrock depth ranges from just a few meters near the Fall Line to 2000 m closer to the Atlantic Ocean coast. Such abrupt changes in bedrock elevation over short distances make reliable predictions difficult through interpolation from the limited number of borehole data in the area. These challenges require either dense site-specific investigations or AI-based prediction.

In this study, a parametric study was carried out to determine the influencing feature variables and optimize the hyper-parameter values for training a bedrock depth data-based neural network model. In this process, regression models were trained with different combinations of feature variables and hyper-parameters, and the results were assessed for accuracy. The feature variables tested in the regression model included GPS coordinates (latitude and longitude), depth, surface elevation, major rock type, state physiographical code, and county. Major rock type was extracted from the State Geologic Map Compilation (SGMC) provided by the United States Geological Survey (USGS), while the surface elevation was derived from the state Light Detection and Ranging (LiDAR) Digital Elevation Model (DEM) dataset. The state physiographical code was obtained from the physiographic map provided by the Maryland Geological Survey. Each digit in this six-digit code represents a hierarchical unit of Maryland’s physiographic subdivisions. The first digit corresponds to the physiographical province. The GPS coordinates and bedrock depth from borehole measurements were matched with the remaining feature variables in ArcGIS. The dependent variable was the bedrock depth values.
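Extracting the physiographical province from the six-digit code can be sketched as follows. The function name `province_digit` is hypothetical, and the code value used in the usage line is a made-up placeholder rather than an actual Maryland Geological Survey code; whether codes are stored as integers or strings in the source dataset is also an assumption.

```python
def province_digit(physio_code):
    """Return the first digit of a six-digit physiographical code.

    Per the text, the first digit identifies the physiographical province.
    Accepts either an integer or a string representation of the code.
    """
    text = str(physio_code).strip()
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"expected a six-digit code, got {physio_code!r}")
    return int(text[0])


# Hypothetical placeholder code, for illustration only.
example_province = province_digit("312450")
```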

Based on the parametric study results, it was found that including the physiographical province as an input feature variable can improve the model prediction accuracy. However, county, major rock type and the rest of the physiographical code units did not help improve the accuracy of the model predictions.

3. Machine Learning Models with Physical Constraints

Deep learning is a machine learning technique that can be used to represent high-dimensional data by capturing the complex relationships inherent in the datasets. The prediction accuracy derives from the neural network models trained with large numbers of data samples to represent the relationships hidden in the selected datasets. Neural network training consists of identifying the optimal parameter configuration that yields the lowest value of the cost function. This is achieved through an iterative process called stochastic gradient descent with momentum, in which the gradients of the cost function with respect to the model parameters are computed by back-propagation. A feedforward neural network model for tabular data was adopted for training the machine learning models in this study. The adopted feedforward neural network model architecture generally had four or five hidden layers with 400 neurons in each layer, depending on the complexity and number of data samples available for training. Each hidden layer combines a ReLU (rectified linear unit) activation function, batch normalization, and dropout, which is the state of the art for the selected feedforward neural network model. Batch normalization, proposed by [25], normalizes the output of a previous activation layer by subtracting the batch mean and dividing by the batch standard deviation. This ensures that the gradients are more predictive and thus allows for the use of a larger range of learning rates and faster network convergence.
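The forward pass of one such hidden layer can be sketched with NumPy. This is an illustrative simplification, not the model used in the study: the ordering of the operations (linear map, batch normalization, ReLU, dropout) is an assumption, the learnable scale and shift parameters of batch normalization are omitted, and the six input features and the dropout probability are placeholders.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    """Normalize each feature column by its batch mean and standard deviation."""
    mean = x.mean(axis=0, keepdims=True)
    std = np.sqrt(x.var(axis=0, keepdims=True) + eps)
    return (x - mean) / std


def hidden_layer(x, weights, bias, drop_prob=0.0, rng=None):
    """One hidden layer: linear map -> batch norm -> ReLU -> inverted dropout."""
    h = batch_norm(x @ weights + bias)
    h = np.maximum(h, 0.0)  # ReLU activation
    if drop_prob > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(h.shape) >= drop_prob
        # Rescale the kept units so the expected activation is unchanged.
        h = h * keep / (1.0 - drop_prob)
    return h


rng = np.random.default_rng(0)
x = rng.normal(size=(32, 6))            # batch of 32 samples, 6 placeholder features
w = rng.normal(size=(6, 400)) * 0.1     # 400 neurons, as in the text
b = np.zeros(400)
activations = hidden_layer(x, w, b, drop_prob=0.5, rng=rng)
```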

To add physical constraints to the machine learning model, a custom loss function is defined. This custom loss function combines the traditional data-driven loss components with an additional physics-based component that penalizes predictions when they violate the prescribed constraints. In this study, a soft constraint approach was adopted after some tests of both hard and soft constraints, which rewards the model for adhering to predefined constraints while still allowing some flexibility, as the constraints are not enforced strictly. The flowchart in Figure 3 illustrates the workflow of the physics-constrained neural networks implemented in this study.

3.1. Soil Type Classification Neural Network Model with Physical Constraints

In this study, a classification model is adopted, and the training target is to classify the data samples into the corresponding discrete soil type categories. For the soil type machine learning model, an embedding layer was applied at the input of the neural network to transform parent material categories into embedding vectors, which were combined with four other numeric features as input to the first layer of the neural network model. Five hidden layers were used. For each input, the output layer returns confidence scores for each of the seventeen soil type categories, and the category with the highest score is taken as the predicted soil type.

In the parametric study, different combinations of feature variables including site location, depth, surface elevation, and parent material type were considered. For mixed soil types, the primary constituent in the dataset was selected as the target soil type for training purposes. The dependent variable is the primary constituent of soil type. A total of 90% of this dataset was randomly allocated for training and the remaining 10% of data was used for validation during the training process. An independent test of prediction accuracy was also done for the original soil type dataset by comparing the predicted value with the true value from field investigation. A training loss curve from the model training output was used to ensure that overfitting did not happen during the training. The accuracy (correct predictions/all samples) of the best machine learning model trained with resampled data was 0.86 for the considered dataset, and the precision, recall and F1 scores for each category are shown in Table 2. It was found that the category with the highest prediction accuracy was OH, which had an F1 score of 1, while the lowest accuracy category was SC with an F1 score of 0.78. The true values and the machine learning model-predicted values for soil type data are compared in the confusion matrix shown in Figure 4.
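The accuracy, precision, recall, and F1 score reported above all follow from a confusion matrix. The sketch below shows the standard definitions; the function name `per_class_metrics` and the row/column convention (rows are true classes, columns are predicted classes) are assumptions, and no values from Table 2 or Figure 4 are reproduced.

```python
import numpy as np


def per_class_metrics(confusion):
    """Compute overall accuracy and per-class precision, recall, and F1.

    confusion[i, j] is the number of samples whose true class is i and
    whose predicted class is j (assumed convention).
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    predicted = confusion.sum(axis=0)  # column sums: samples predicted per class
    actual = confusion.sum(axis=1)     # row sums: true samples per class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / confusion.sum()
    return accuracy, precision, recall, f1
```

Reporting per-class scores alongside overall accuracy matters here because the class distribution is strongly imbalanced, so overall accuracy alone is dominated by the largest categories.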

By training predictive machine learning models on extensive historical borehole datasets, we can generate a high-resolution representation of the subsurface geotechnical space with adaptable point cloud density. However, purely data-driven models are susceptible to generating physically unreasonable or highly improbable predictions due to gaps in the feature space or noise in the training data. For example, a model might predict a soft clay layer beneath a shallow bedrock formation, or a negative bedrock depth, which would place the bedrock above the ground surface. Initial results from a data-driven model indicated a high overall accuracy but included rare instances of physically impossible soil profiles, such as a discontinuous layer of hard rock floating within a soft clay stratum. To overcome these limitations, this study introduces a physics-constrained approach by defining a custom loss function that penalizes the prediction of physically impossible instances. The physics-constrained machine learning model, trained with the custom loss function, is capable of significantly reducing these inconsistencies. This study aims to demonstrate the effectiveness of this methodology through two distinct case studies, one for categorical soil data and the other for numerical regression bedrock depth data.

Custom loss functions are first created to integrate the physical constraints into the machine learning models. In standard machine learning, the loss function measures the difference between the model’s predictions and the actual values. Our custom loss function included an additional term that penalized the loss for violating specific physical principles to comply with established geotechnical knowledge. By incorporating this penalty, the model learns to prioritize physically realistic predictions during training. This loss function combines the standard cross-entropy loss with an additional constraint penalty based on the predefined physics-informed rules. Whenever the model makes predictions that violate these constraints, the penalty is applied. Detailed mathematical expressions defining the custom loss function are provided in Equation (1).

$\lambda$ is a hyper-parameter that controls the weight of the physics-based penalty in this formulation. A higher $\lambda$ forces the model to more strictly adhere to the physical constraints. $L_{CE}$ is the cross-entropy loss, and $P_{phys}$ is the penalty term quantifying the total violation of all physical constraints. The term $\hat{p}_{i,c}$ is the predicted probability for class $c$ in sample $i$ and is calculated through the softmax function in Equation (2).

$$L_{total} = L_{CE} + \lambda \times P_{phys} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} \log \hat{p}_{i,c} + \lambda \times \frac{1}{N} \sum_{i \in \mathcal{C}} \mathbb{1}\left[\hat{y}_{i} \in \text{Forbidden Classes}\right] \qquad (1)$$

$$\hat{p}_{i,c} = \frac{\exp(z_{i,c})}{\sum_{k=1}^{C} \exp(z_{i,k})} \qquad (2)$$

where

$y_{i,c}$ = Indicator of the true class label for sample $i$ (1 if class $c$ is the true class, 0 otherwise)
$\hat{y}_{i}$ = Predicted class for sample $i$
$\lambda$ = Penalty weight
$N$ = Total number of samples
$K$ = Total number of constraints
$C$ = Total number of classes
$\mathcal{C}$ = Training dataset
$g_{k}(x_{i}, \hat{y}_{i})$ = Constraint function of the $k$th constraint for the $i$th sample
$z_{i}$ = The vector of raw output logits for sample $i$
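Equations (1) and (2) can be evaluated numerically as in the following sketch. It only computes the loss value for a batch of logits; the indicator term has zero gradient with respect to the network outputs, and how the penalty was made to influence gradient-based training is not described in the text, so that step is deliberately left out. The names `total_loss` and `forbidden_mask` are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, Equation (2), with max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def total_loss(logits, y_onehot, forbidden_mask, lam):
    """Evaluate Equation (1): cross-entropy plus lam times the violation ratio.

    logits:         (N, C) raw network outputs z_i
    y_onehot:       (N, C) one-hot true labels y_{i,c}
    forbidden_mask: (N, C) True where class c is forbidden for sample i
    lam:            penalty weight (lambda)
    Returns (L_total, L_CE, P_phys).
    """
    probs = softmax(logits)
    n = logits.shape[0]
    l_ce = -np.sum(y_onehot * np.log(probs + 1e-12)) / n
    y_hat = probs.argmax(axis=1)                       # predicted class per sample
    p_phys = forbidden_mask[np.arange(n), y_hat].mean()  # fraction of violations
    return l_ce + lam * p_phys, l_ce, p_phys
```

With $\lambda = 0$ the expression reduces to the ordinary cross-entropy loss, which corresponds to the unconstrained baseline.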

In this study, geological findings validated with real borehole data suggest that bedrock lies deeper than 300 m everywhere in the Eastern Shore region of Maryland. Consequently, soil types such as IGM or ROCK are not expected to be encountered at depths shallower than 300 m. The penalty weight $λ$, used in conjunction with these penalty conditions, was selected through a parametric study, which was crucial for balancing predictive accuracy with adherence to the constraints. The penalty term $P_{phys}$ is defined as the fraction of samples that violate the physical constraints, and it ranges from 0 to 1. Accordingly, increasing the penalty weight increases the influence of prohibited predictions on training relative to the cross-entropy loss. A coarse range of penalty weights $λ \in$ {0, 0.5, 1, 2, 5, 10, 25, 50, 75, 100} was initially tested by training the model for each $λ$ value and generating statewide predictions. Using the number of constraint violations in the predicted results and standard classification metrics (i.e., accuracy, F1 score) as the main assessment criteria, a finer sweep was then performed with $λ \in$ {2.5, 3, 3.5, 4, 4.5, 5, 5.5}. A penalty weight of $λ$ = 4.5 was selected as offering the best trade-off between prediction accuracy and the number of violations: it substantially reduced the number of violations while maintaining good overall classification performance.
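The two-stage sweep described above can be sketched as a small selection helper. The grids are those listed in the text; the record layout and the selection rule (stay within a tolerance of the best accuracy, then minimize violations) are assumptions made for illustration, since the exact criterion used in the parametric study is not stated.

```python
# Penalty-weight grids reported in the text.
COARSE_GRID = [0, 0.5, 1, 2, 5, 10, 25, 50, 75, 100]
FINE_GRID = [2.5, 3, 3.5, 4, 4.5, 5, 5.5]

def select_penalty_weight(records, accuracy_tolerance=0.01):
    """Pick a penalty weight from sweep results.

    records : list of dicts with keys "lam", "accuracy", "violations",
              one per trained model (hypothetical record layout).

    Assumed rule, not the paper's stated criterion: keep candidates whose
    accuracy is within accuracy_tolerance of the best accuracy, then choose
    the one with the fewest violations; ties go to the smaller weight.
    """
    best_acc = max(r["accuracy"] for r in records)
    eligible = [r for r in records if best_acc - r["accuracy"] <= accuracy_tolerance]
    return min(eligible, key=lambda r: (r["violations"], r["lam"]))["lam"]
```

In use, one record would be produced for each value in `COARSE_GRID`, and the process repeated over `FINE_GRID` around the most promising coarse value.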

3.2. Regression Neural Network Model for Bedrock Depth Prediction

A regression model was developed to predict bedrock depth values. A purely data-driven model, while achieving a low mean squared error (MSE) on the training dataset, may produce predictions with high variance and occasional negative values that do not align with reasonable bedrock depths for the region considered. The physics-constrained machine learning model, by including a penalty for bedrock depth values outside an acceptable range, produced more credible output, and its predictions showed a tighter distribution within the physically plausible range.

In this study, networks with different numbers of layers were tested, and a four-layer neural network was selected as optimal based on a parametric study. For model training, 90% of the dataset was randomly allocated for training and the remaining 10% was used for validation; parametric studies of various split ratios showed that a high training ratio consistently improved generalizability. The best prediction accuracy achieved was an R² value of 0.9, as shown in Figure 5, which compares the true values from field investigation with the model-predicted bedrock depths for the original dataset. The figure demonstrates that the model performs well across the full range of measurements taken throughout the state. This performance is attributed to the strong correlation between bedrock depth and the feature variables included in the datasets, reinforced by the stabilizing effect of the physical constraints. Hence, geologically plausible predictions are consistently achieved, and the clustering of points around the 1:1 line shows the model’s ability to represent both shallow and deep bedrock conditions.
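For readers who want to reproduce the evaluation setup, a minimal sketch of a random 90/10 split and of the R² score follows. It assumes R² is the usual coefficient of determination; the function names and the random seed are hypothetical.

```python
import numpy as np

def train_val_split(n_samples, train_fraction=0.9, seed=0):
    """Return shuffled index arrays for the training and validation subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

train_idx, val_idx = train_val_split(100)
```

A perfect prediction gives R² = 1, while always predicting the mean of the observed values gives R² = 0, which is why points clustered on the 1:1 line of a scatter plot correspond to values close to 1.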

For the bedrock depth regression model, a custom loss function was created that combines the standard mean squared error (MSE) loss with spatially conditioned penalties. The penalty term is applied only to samples located west of a user-defined geographic boundary, a single line defined by two sets of GPS coordinates approximately along the Fall Line. If a predicted value falls outside a designated range with predefined minimum and maximum values, the penalty increases in proportion to its deviation from the permitted bounds. This sets up a soft constraint that discourages, rather than strictly forbids, out-of-range predictions and avoids excessive adjustments to the model weights. Equation (3) formulates the custom loss function used in the bedrock depth model.

$λ$ is the penalty weight of the physics-based constraints defined for bedrock depth. $L_{MSE}$ is the loss component calculated through the MSE formulation, and $P_{phys}$ is the penalty term quantifying the total violation of all physical constraints.

$$L_{total} = L_{MSE} + λ \times P_{phys} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_{i} - y_{i})^{2} + λ \times \frac{1}{N} \sum_{i \in \mathcal{C}} \sum_{k=1}^{K} \max(0, g_{k}(x_{i}, \hat{y}_{i})) \tag{3}$$

where

$\hat{y}_{i}$ = Predicted value
$y_{i}$ = Observed value
$λ$ = Penalty weight
$N$ = Total number of samples
$K$ = Total number of constraints
$\mathcal{C}$ = Training dataset
$g_{k}(x_{i}, \hat{y}_{i})$ = Constraint function of the kth constraint for the ith sample
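Equation (3) can be sketched in NumPy as follows. This is an illustrative reading rather than the authors' code: it assumes K = 2 constraint functions, g_1 = lower − ŷ and g_2 = ŷ − upper, with the 0 to 36 m range quoted in Section 4.2 and the penalty weight of 75 reported for the bedrock depth model. The boundary endpoints are hypothetical placeholders, and longitude and latitude are treated as planar coordinates for the side-of-line test.

```python
import numpy as np

def west_of_line(lon, lat, p1, p2):
    """True where a point lies on the western side of the line through p1 and p2.

    p1, p2 are (lon, lat) pairs; they are ordered south to north so that
    "left of the direction of travel" corresponds to west.
    """
    (x1, y1), (x2, y2) = sorted([p1, p2], key=lambda p: p[1])
    cross = (x2 - x1) * (lat - y1) - (y2 - y1) * (lon - x1)
    return cross > 0

def constrained_mse_loss(y_pred, y_true, west_mask, lam=75.0, lower=0.0, upper=36.0):
    """Evaluate Equation (3) with two hinge constraints on masked samples."""
    n = y_true.shape[0]
    mse = np.mean((y_pred - y_true) ** 2)            # L_MSE
    below = np.maximum(0.0, lower - y_pred)          # max(0, g_1)
    above = np.maximum(0.0, y_pred - upper)          # max(0, g_2)
    p_phys = np.sum((below + above) * west_mask) / n
    return mse + lam * p_phys, mse, p_phys

# Hypothetical boundary endpoints and three sample locations.
p1, p2 = (-77.0, 39.0), (-76.0, 39.7)
lon = np.array([-77.0, -76.9, -76.0])
lat = np.array([39.5, 39.6, 39.0])
mask = west_of_line(lon, lat, p1, p2)

y_true = np.array([10.0, 20.0, 400.0])
y_pred = np.array([-2.0, 40.0, 410.0])
total, mse, p_phys = constrained_mse_loss(y_pred, y_true, mask)
```

Here the first two samples lie west of the line and are out of range by 2 m and 4 m, so the penalty term is (2 + 4) / 3 = 2; the third sample is east of the line and contributes nothing to the penalty even though its prediction exceeds 36 m.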

For bedrock depth predictions in this study, an optimal penalty weight of 75 was incorporated, selected through a parametric study.

4. Results and Discussion

To demonstrate the practical benefits of using physically constrained machine learning models to predict soil type and bedrock depth at a given site, two case studies were performed. By combining site investigation data with prior knowledge of local geology, machine learning models can generate three-dimensional subsurface point cloud data, as shown in the soil type case study, which provides a better understanding of subsurface geological conditions. The empirical evidence from the case studies below demonstrates the effectiveness of incorporating physics constraints through a custom loss function for enhancing the predictions of machine learning models in a geotechnical context. The resulting models not only provide accurate predictions but also generate outputs that are more consistent with known physical and geological conditions.

4.1. Soil Type Prediction in the Eastern Shore Area

The study area is in the Eastern Shore area of Maryland, USA, extending approximately from latitude 39.2423° to 39.3317°. A 200 m × 200 m grid was generated from the resampled 2 m resolution LiDAR DEM data for Maryland, and the grid points were used as location features for the machine learning prediction. The ArcGIS Pro toolbox (ESRI 2023) was used to extract and join the GPS coordinates, surface elevation, and soil parent material features to the grid network. The machine learning model then predicted the target variable (i.e., soil type) at each 200 m grid point. The predicted results were attached to the grid point table and rasterized to a 200 m × 200 m soil type raster map, as shown in Figure 6 for the entire state and Figure 7 for the study area in the Eastern Shore. As shown in Figure 7, the model predicts CL (clay) and SP (sand) as the primary soil types in most coastal areas at a drilling depth of 3 m. As shown in Figure 7a, when no physical constraints are applied, the machine learning model incorrectly predicts the presence of ROCK (highlighted in red) in the selected study area at a 3 m drilling depth. However, it is well established that on Maryland’s Eastern Shore peninsula, the true bedrock depth lies below 300 m. This inconsistency demonstrates that the unconstrained model produces results that conflict with field test data. To address this issue, physical constraints were integrated into the training process to ensure predictions remain consistent with known geological conditions. Figure 7b shows the soil type map predicted by the model retrained with these constraints applied to the Eastern Shore area, in which the erroneous ROCK predictions no longer appear.
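The violation count used to compare maps such as Figure 7a and Figure 7b can be sketched as below. The forbidden classes and the 300 m threshold follow Section 3.1; the array layout and the `in_region` mask (marking grid points inside the Eastern Shore) are hypothetical choices for this example.

```python
import numpy as np

FORBIDDEN = ("IGM", "ROCK")   # classes not expected above the bedrock surface
MIN_ROCK_DEPTH_M = 300.0      # threshold stated in Section 3.1

def count_violations(pred_labels, depth_m, in_region):
    """Count grid points where a forbidden class is predicted too shallow.

    pred_labels : predicted USCS class per grid point
    depth_m     : prediction depth below the surface, in metres
    in_region   : True for grid points inside the constrained region
                  (hypothetical mask)
    """
    pred_labels = np.asarray(pred_labels)
    depth_m = np.asarray(depth_m, dtype=float)
    in_region = np.asarray(in_region, dtype=bool)
    is_forbidden = np.isin(pred_labels, FORBIDDEN)
    too_shallow = depth_m < MIN_ROCK_DEPTH_M
    return int(np.sum(is_forbidden & in_region & too_shallow))

n_bad = count_violations(
    ["CL", "ROCK", "IGM", "ROCK"],   # predictions at four grid points
    [3.0, 3.0, 3.0, 3.0],            # all at 3 m depth
    [True, True, False, True],       # third point lies outside the region
)
```

Only the two ROCK predictions inside the region are counted; the IGM prediction outside the region is not treated as a violation.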

4.2. Bedrock Depth Prediction

It is difficult to predict bedrock depth in the aforementioned transition zone because its complex geology is accompanied by abrupt changes in bedrock depth. Such spatial variability, coupled with the challenges of mapping concealed and weathered bedrock surfaces, makes it particularly difficult for geologists and engineers to accurately estimate bedrock depth values in Maryland’s transition zone. Geological knowledge and historic drilling data (e.g., Figure 2a) suggest that west of the Fall Zone fault line, bedrock depth typically ranges from 0 to 36 m. To reflect this information in the predictions, a soft constraint was introduced to guide the machine learning model training, encouraging predicted values to fall within this range for the specified area while still allowing flexibility. Since the bedrock depth model is a regression-based neural network, it can produce negative predictions at certain locations, which are physically unrealistic. After applying physical constraints, the number of negative bedrock depth predictions was reduced substantially, from over 115,000 points to only about 1200 points within a 200 m × 200 m grid network consisting of nearly 1.6 million grid points across Maryland, as shown in Figure 8.

Figure 9 shows the predicted bedrock depth raster map for the Cecilton Quadrangle study area in Maryland. At the location of borehole CE Ee 29 (GPS coordinates: 75.872° W, 39.401° N), the machine learning model estimated the bedrock depth to be approximately 429 m. This prediction aligns well with the published reference value of 436.7 m for the basement rock at the same site, demonstrating the model’s ability to capture bedrock depth with reasonable accuracy. The close agreement between the predicted and reference depths demonstrates the potential of integrating data-driven methods with geological knowledge to improve deep geotechnical characterization in areas where drilling information is sparse. Figure 10 shows a predicted bedrock depth contour map for an area covering the BWI airport region within the transition zone near Baltimore, Maryland. A borehole located near the lower right corner of the figure, at GPS coordinates (latitude: 39.152054, longitude: −76.651634), reports a bedrock depth of 190.5 m [26]. This value aligns closely with the model’s predicted depth of 189.0 m, a difference of 1.5 m (about 0.8%).
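The agreement at the two boreholes can be checked with a short calculation of the relative error, using the predicted and reference depths quoted above. The helper name is hypothetical.

```python
def relative_error_percent(predicted, reference):
    """Absolute difference as a percentage of the reference value."""
    return abs(predicted - reference) / abs(reference) * 100.0

# Cecilton borehole CE Ee 29: 429 m predicted, 436.7 m reference.
cecilton = relative_error_percent(429.0, 436.7)

# BWI area borehole: 189.0 m predicted, 190.5 m reported.
bwi = relative_error_percent(189.0, 190.5)
```

The differences are 7.7 m and 1.5 m, which correspond to relative errors of roughly 1.8% and 0.8%, respectively.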

The findings from both case studies confirm that incorporating physics constraints significantly enhances the performance of machine learning models in geotechnical applications. By leveraging historical geotechnical drilling data and physically constrained models, the machine learning models provide more reliable estimated values for locations where certain physical conditions are known and must be complied with. The predictive models also offer a continuous, 3D digital representation of the subsurface data, providing a better understanding of the geotechnical site conditions.

5. Conclusions

In this study, well-established physically constrained machine learning algorithms were used to model the inherently complex relationships in drilling data from field tests and to generate soil type and bedrock depth values. The prediction performance of the machine learning methods was validated for automated generation of bedrock depth and soil type data by comparison with borehole data from site investigations in a study area located in Maryland. A feedforward neural network classification model was adopted for the soil type datasets, while a regression neural network model was used for the bedrock depth datasets. The soft constraint approach was applied to both the classification and regression models. Based on the findings of this research, the applicability of the physics-constrained machine learning models under investigation is summarized below.

Deep neural network models were trained using various geotechnical drilling datasets, with locations, surface elevation, slope, soil parent material, USDA soil ID codes, and layer depth as input variables. The models were tuned through a parametric study of hyper-parameter values and of different combinations of feature variables, and an oversampling strategy was adopted to address the data imbalance in the original dataset. The predicted results from the trained models were compared with true values using confusion matrices and a scatter plot, and with historical engineering geology map data.

The empirical evidence from the case studies involving regional scale datasets demonstrates the effectiveness of incorporating physics constraints through custom loss functions for enhancing the predictions of machine learning models in a geotechnical context. The resulting models not only provide accurate predictions but also generate outputs that are more consistent with known physical and geological principles.

The methodology adopted in this study builds upon established machine learning algorithms and incorporates physical constraints through modified loss functions. Integrating physical constraints into the loss functions steers model training toward statistically accurate and consistent solutions informed by known geological principles and empirical field evidence. It is worth noting that the contribution of this study is primarily application- and validation-driven, rather than focused on fundamental algorithm development. Using a statewide geotechnical dataset, large-scale applications of physics-constrained machine learning models are demonstrated with carefully selected feature variables and systematic validation of regional predictions based on geotechnical principles and expert interpretation. The study emphasizes the utility of integrating domain knowledge with existing machine learning approaches to accelerate engineering decision-making through statewide subsurface characterization.

Machine learning models can support design engineers by providing predictions based on historic borehole data in the form of 3D subsurface point clouds. The machine learning computing and visualization in this study were performed on desktop computers and can thus be conveniently handled by geotechnical engineers. These predictions can be used to produce more accurate preliminary designs and to better inform engineers in advance of formal field investigation.

Author Contributions

Y.Z.: Writing—original draft, Review & Editing, Supervision, Project administration, Funding acquisition, Conceptualization, Methodology, Investigation, Validation. A.D.: Writing—original draft, Visualization, Methodology, Investigation, Data analysis. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Maryland State Highway Administration (MDOT SHA) with award nos. SHA/UM/5-23 and SHA/UM7-01.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy restrictions.

Acknowledgments

Technical input from MDOT SHA staff and engineers is recognized as a critical factor in the success of this research, particularly in identifying research opportunities within highway datasets, retrieving and converting some key raw data from database servers into tabular formats, and providing professional guidance on desired output formats and target prediction results for model calibration. However, the opinions and conclusions expressed in this paper are solely those of the writers and do not necessarily reflect the views of the sponsor.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhang, Y.; Xu, J.; Moore, N. Effectively Implementing Machine Learning with Highway Datasets; Report of Phase II Machine Learning Research Project; Maryland DOT State Highway Administration: Baltimore, MD, USA, 2023. Available online: https://roads.maryland.gov/OPR_Research/MD-21-SHAUM5-23_Machine-Learning_Report.pdf (accessed on 3 February 2026).
2. Zhang, Y.; Xu, J.; Cutts, R. Implementing Machine Learning with Highway Datasets; Report of Research project supported by Maryland DOT State Highway Administration (High Value Research Award); National Transportation Library: Washington, DC, USA, 2021. Available online: https://rosap.ntl.bts.gov/view/dot/56625/dot_56625_DS1.pdf (accessed on 12 February 2024).
3. Kim, H.J.; Dinoy, P.R.T.; Choi, H.-S.; Lee, K.-B.; Mission, J.L.C. Spatial interpolation of SPT data and prediction of consolidation of clay by ANN method. Coupled Syst. Mech. 2019, 8, 523–535.
4. Wang, H.; Wang, X.; Liang, R.; Merklin, C.; Taliaferro, S. When AI Meets DIGGS. ASCE GeoStrata Mag. 2021, 25, 38–45.
5. Yousefpour, N.; Liu, Z.; Zhao, C. Machine Learning Methods for Geotechnical Site Characterization and Scour Assessment. Transp. Res. Rec. 2024, 2679, 632–655.
6. Xu, J.; Zhang, Y. AI-Powered Digital Twin Technology for Highway System Slope Stability Risk Monitoring. Geotechnics 2025, 5, 19.
7. Nguyen, T.T.; Nguyen, K.L.; Huynh, T.Q.; Tran, D.Q. Influence of feature selection on machine learning prediction of pile foundation—The role of soil-pile interaction knowledge and application to base resistance. Geod. AI 2025, 3, 100019.
8. Phoon, K.-K.; Tang, C. (Eds.) Databases for Data-Centric Geotechnics: Site Characterization; CRC Press: Boca Raton, FL, USA, 2024.
9. Min, Y.; Azizan, N. HardNet: Hard-Constrained Neural Networks with Universal Approximation Guarantees. arXiv 2025, arXiv:2410.10807.
10. Mukherjee, A.; Zavala, V.M. Physics-Constrained Machine Learning for Chemical Engineering. arXiv 2025, arXiv:2508.20649.
11. Beucler, T.; Pritchard, M.; Rasp, S.; Ott, J.; Baldi, P.; Gentine, P. Enforcing Analytic Constraints in Neural Networks Emulating Physical Systems. Phys. Rev. Lett. 2021, 126, 098302.
12. Ebrahimi, P.; Matano, F.; Amato, V.; Mattera, R.; Scepi, G. A field-based thickness measurement dataset of fallout pyroclastic deposits in the peri-volcanic areas of Campania (Italy): Statistical combination of different predictions for spatial estimation of thickness. Earth Syst. Sci. Data 2024, 16, 4161–4188.
13. Zhou, J.; Zhang, R.; Qiu, Y.; Khandelwal, M. A true triaxial strength criterion for rocks by Gene Expression Programming. J. Rock Mech. Geotech. Eng. 2023, 15, 2508–2520.
14. Li, D.; Faradonbeh, R.S.; Lv, A.; Wang, X.; Roshan, H. A data-driven field-scale approach to estimate the permeability of fractured rocks. Int. J. Min. Reclam. Environ. 2022, 36, 671–687.
15. Wang, L.; Gao, Y.; Pan, Q.; Wang, S.; Phoon, K.-K. Coupled geological modeling using multi-source data: A K-dimensional tree-graph convolutional neural process approach. Comput. Geotech. 2025, 187, 107509.
16. Pan, J. Can machines learn with hard constraints? Nat. Comput. Sci. 2021, 1, 244.
17. USDA. USDA Soil Survey Geographic Database (SSURGO). 2025. Available online: https://www.nrcs.usda.gov/resources/data-and-reports/soil-survey-geographic-database-ssurgo (accessed on 20 January 2025).
18. Schoonover, J.E.; Crim, J.F. An Introduction to Soil Concepts and the Role of Soils in Watershed Management. J. Contemp. Water Res. Educ. 2015, 154, 21–34.
19. ESRI. ArcGIS Pro. 2024. Available online: https://pro.ArcGIS.com/en/pro-app/latest/get-started/download-ArcGIS-pro.htm (accessed on 3 February 2026).
20. USDA NRCS. Soil Colors of the United States. 2025. Available online: https://www.nrcs.usda.gov/resources/education-and-teaching-materials/soil-colors-of-the-united-states (accessed on 6 November 2025).
21. ASTM D2487-17; Standard Practice for Classification of Soils for Engineering Purposes (Unified Soil Classification System). ASTM International: West Conshohocken, PA, USA, 2017. Available online: https://www.astm.org/d2487-17.html (accessed on 3 February 2026).
22. Moon, S.-W.; Subramaniam, P.; Zhang, Y.; Vinoth, G.; Ku, T. Bedrock depth evaluation using microtremor measurement: Empirical guidelines at weathered granite formation in Singapore. J. Appl. Geophys. 2019, 171, 103866.
23. Hansen, H.J.; Edwards, J., Jr. The Lithology and Distribution of Pre-Cretaceous Basement Rocks Beneath the Maryland Coastal Plain; Department of Natural Resources, Maryland Geological Survey: Baltimore, MD, USA, 1986.
24. Pyzik, L.; Caddick, J.; Marx, P. Chesapeake Bay: Introduction to an Ecosystem; Chesapeake Bay Program, Annapolis, Maryland; US Environmental Protection Agency: Washington, DC, USA, 2004.
25. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015; Available online: https://arxiv.org/abs/1502.03167 (accessed on 3 February 2026).
26. Junkin, W.D. Geologic Map of the Relay Quadrangle, Anne Arundel, Baltimore, and Howard Counties and Baltimore City, Maryland; MD DNR Publication No: DNR12-032822-308; Department of Natural Resources, Maryland Geological Survey: Baltimore, MD, USA, 2021.

Figure 1. Soil type (grain size) data used for machine learning model training: (a) spatial distribution (top view); (b) data distribution histogram.

Figure 2. Bedrock depth data used for machine learning model training: (a) spatial distribution (top view); (b) data distribution (histogram of bedrock depth values, unit = m).

Figure 3. Workflow of Artificial Neural Networks with physical constraints.

Figure 4. Confusion matrix of machine learning model-predicted soil type categories (y axis) with true values from field investigation (x axis).

Figure 5. Scatter plot (x axis: real value, y axis: predicted value) for accuracy assessment of physically constrained machine learning model trained with the considered bedrock depth dataset.

Figure 6. Physically constrained machine learning-predicted soil type map at 3 m depth from surface in Maryland based on 200 m × 200 m grid network.

Figure 7. Machine learning-predicted soil type map at 3 m depth from surface for Eastern Shore in Maryland based on 200 m × 200 m grid: (a) without constraints; (b) with constraints.

Figure 8. Physically constrained machine learning-predicted bedrock depth map in Maryland based on 200 m × 200 m grid.

Figure 9. Machine learning model-predicted bedrock depth map for a study area at Cecilton.

Figure 10. Machine learning model-predicted bedrock depth (from surface) for study area near BWI airport in Maryland based on 20 m × 20 m grid: contour plot of predicted values (unit: m).

Table 1. USCS soil type classification and color code (adapted from ASTM D2487-17 [21]).

USCS Soil Symbol	Description	Color Code (in RGB)
SP	Poorly Graded Sand	(0, 90, 180)
SW	Well Graded Sand	(50, 120, 190)
SM	Silty Sand	(100, 140, 180)
SC	Clayey Sand	(120, 130, 190)
ML	Low Plasticity Silt	(220, 180, 80)
MH	High Plasticity Silt	(200, 140, 50)
CL	Low Plasticity Clay	(180, 120, 200)
CH	High Plasticity Clay	(160, 50, 160)
GP	Poorly Graded Gravel	(120, 40, 150)
GW	Well Graded Gravel	(140, 80, 160)
GM	Silty Gravel	(180, 120, 120)
GC	Clayey Gravel	(180, 90, 140)
OL	Organic Silt	(180, 180, 180)
OH	Organic Clay	(120, 120, 120)
PT	Peat	(60, 60, 60)
IGM	Decomposed Rock	(160, 40, 40)
Rock	Stronger rock	(120, 30, 30)

Table 2. Accuracy report of physically constrained machine learning model trained with soil type dataset (“#” indicates “number”).

USCS Class	Precision	Recall	F1 Score	Support	# of Samples (Resampled)
CH	0.91	0.99	0.95	549	13,725
CL	0.86	0.88	0.87	46,651	139,953
GC	0.85	1	0.92	36	900
GM	0.83	0.99	0.91	141	3525
GP	0.85	0.99	0.91	1344	33,600
GW	0.81	0.96	0.88	123	3075
IGM	0.87	0.95	0.91	16,421	82,105
MH	0.83	0.88	0.86	10,219	51,095
ML	0.87	0.78	0.82	63,816	127,632
OH	0.99	1	1	10	250
OL	0.97	1	0.98	113	2825
PT	0.95	1	0.98	62	1550
ROCK	1	0.97	0.99	35,477	70,954
SC	0.77	0.8	0.78	8066	40,330
SM	0.82	0.76	0.79	31,148	93,444
SP	0.83	0.8	0.82	27,293	81,879
SW	0.78	0.81	0.79	8429	42,145
Accuracy			0.86	235,119	788,987
Macro Avg	0.87	0.92	0.89	235,119	788,987
Weighted Avg	0.88	0.85	0.85	235,119	788,987
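The summary rows of Table 2 can be cross-checked from the per-class values. The sketch below recomputes the macro averages (unweighted means over the 17 classes) from the rounded entries in the table, in row order from CH to SW, and shows the F1 formula for one class; the variable and function names are hypothetical.

```python
# Per-class values copied from Table 2, in row order (CH ... SW).
precision = [0.91, 0.86, 0.85, 0.83, 0.85, 0.81, 0.87, 0.83, 0.87,
             0.99, 0.97, 0.95, 1.00, 0.77, 0.82, 0.83, 0.78]
recall = [0.99, 0.88, 1.00, 0.99, 0.99, 0.96, 0.95, 0.88, 0.78,
          1.00, 1.00, 1.00, 0.97, 0.80, 0.76, 0.80, 0.81]
f1 = [0.95, 0.87, 0.92, 0.91, 0.91, 0.88, 0.91, 0.86, 0.82,
      1.00, 0.98, 0.98, 0.99, 0.78, 0.79, 0.82, 0.79]

def macro_average(values):
    """Unweighted mean over classes, so rare classes count as much as common ones."""
    return sum(values) / len(values)

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2.0 * p * r / (p + r)

macro_precision = round(macro_average(precision), 2)
macro_recall = round(macro_average(recall), 2)
macro_f1 = round(macro_average(f1), 2)
ch_f1 = round(f1_score(0.91, 0.99), 2)   # class CH
```

Because the macro average weights all classes equally, the high recall of the small, heavily oversampled classes (e.g., OH, OL, PT) raises the macro recall above the weighted recall reported in the last row.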

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, Y.; Darilmaz, A. Physics-Constrained Machine Learning Modeling for Geotechnical Data Prediction: Case Study on Site Soil Type and Bedrock Depth Datasets. Geotechnics 2026, 6, 20. https://doi.org/10.3390/geotechnics6010020

