Soil Condition Classification Based on Natural Water Content Using Computer Vision Technique

Miller, Mark; Fang, Yong; Wang, Yubo; Kharitonov, Sergey; Akulich, Vladimir

doi:10.3390/infrastructures10060138


by Mark Miller ¹,*, Yong Fang ¹, Yubo Wang ¹, Sergey Kharitonov ² and Vladimir Akulich ³

¹ Key Laboratory of Transportation Tunnel Engineering of Ministry of Education, School of Civil Engineering, Southwest Jiaotong University, Chengdu 610031, China

² Department of Bridges and Tunnels, School of Railway Track, Structure and Construction, Russian University of Transport (RUT MIIT), Moscow 127055, Russia

³ Department of Theoretical Mechanics, School of Railway Track, Structure and Construction, Russian University of Transport (RUT MIIT), Moscow 127055, Russia

* Author to whom correspondence should be addressed.

Infrastructures 2025, 10(6), 138; https://doi.org/10.3390/infrastructures10060138

Submission received: 5 March 2025 / Revised: 6 May 2025 / Accepted: 30 May 2025 / Published: 3 June 2025

(This article belongs to the Section Smart Infrastructures)


Abstract

Natural water content affects many geotechnical parameters and geological properties of soils; changes in water content can reduce cohesion and friction, leading to potential failures in structures such as foundations, retaining walls, and slopes. Identifying the water content helps in designing effective drainage and water management systems to prevent flooding and erosion. In tunnel engineering, soil water content plays an important role because the stability of the tunnel face depends on it. This research addresses the problem of classifying soil images according to natural water content using computer vision. First, laboratory soil tests were carried out, and the relationship between the torque on the screw conveyor and the moisture content of the soil was established; photographs of the soil in different conditions were taken at each step of the experiment. Second, after preprocessing, the resulting dataset was used to train convolutional neural networks; transfer learning was used to obtain better results. As a result, seven models were obtained that classify the soil images and can later be used to optimize the tunnel construction process. The best classification ability is demonstrated by the model based on the DenseNet architecture (accuracy 0.9302 and loss 0.1980). The proposed model surpasses traditional approaches in automation and processing speed. Laboratory tests need to be carried out only once per soil type to determine the water content boundaries for class labeling, after which the only equipment required is an inexpensive camera to transmit new images for processing by the model.

Keywords: TBM; computer vision; deep learning; CNN; water content; soil condition classification

1. Introduction

Artificial intelligence (AI) is widely used in various industries, and tunnel engineering is no exception. AI has the potential to significantly improve tunnel design, construction, and maintenance by providing advanced data analysis, predictive modeling, and automation capabilities. With the growing demand for infrastructure development and the need for more efficient and sustainable tunneling solutions, artificial intelligence plays a crucial role in driving innovation in tunneling.

The use of AI in tunneling largely comes down to two main tasks: regression and classification. Yagiz et al. [1] used statistical regression models with an artificial neural network (ANN) and the particle swarm optimization (PSO) approach [2] to forecast the penetration rates of a TBM at the Queens Tunnel. Armaghani et al. [3] applied the gene expression programming (GEP) method to data from a Malaysian water tunnel project to predict the penetration rate, obtaining a coefficient of determination of 0.829 on the testing set. Kilic et al. [4] collected operational data from a micro-slurry TBM and used machine learning algorithms to predict TBM jack speed and torque values. They obtained R² = 0.96 for torque performance evaluation and R² = 0.83 for jack speed decision-making. Fu et al. [5] predicted the advance rate in mixed ground conditions with a backpropagation neural network optimized by a genetic algorithm, which demonstrated R² = 0.920. Shan et al. [6] used an LSTM (long short-term memory) network model to evaluate TBM performance in Chinese cities, finding that the RNN (recurrent neural network) outperformed the LSTM. Shang et al. [7] used random forests and double-input deep learning with a slime mold algorithm (SMA) to predict the longitudinal surface settlement caused by double shield tunneling; their model achieved R² = 0.930. Alzubaidi et al. [8] introduced a CNN (convolutional neural network) that filters out irrelevant sections to identify rock cores and assess the rock quality designation; the model has errors of 2.58% and 3.17% on samples of sandstone and limestone, respectively. Cui et al. [9] presented a combined CNN that works with the rock matrix and pore types in rock images to forecast elastic characteristics; they obtained an R² score of 0.84. Chen et al. [10] installed a visual assessment system on a TBM and collected a large in situ rock mass image dataset. They solved a fine-grained classification task using a self-convolutional-based attention fusion network (SAFN), which allowed them to classify rock masses automatically. Liu et al. [11] proposed a rock fragment classification method based on a CNN, achieving an accuracy of 91.88%. Xue et al. [12] presented a rock segmentation visual system based on semantic segmentation. Their system can effectively detect large rock particles in the images and perform a statistical analysis. Chen et al. [13] proposed a framework for classifying multiple rock structures from geological images of a tunnel face using CNNs.

Intricate geological conditions are a primary contributor to the engineering challenges encountered in tunnel construction. Issues such as significant deformation, water influx, rock bursts, and zonal disintegration can readily arise during tunneling. Notably, the detrimental impact of water is often regarded as one of the key factors leading to substantial tunnel deformation. Softening by groundwater can greatly diminish the load-bearing capacity of the surrounding rock and create a coupled hydration and mechanical effect, including rock swelling due to the interaction between water and clay materials, which can result in large deformation [14]. Additionally, high soil moisture content can cause mud accumulation on the cutterhead and within the soil chamber, resulting in increased thrust and torque on the shield machine and reduced tunneling efficiency [15].

Determining the water saturation of soil is a relevant issue that is being studied by many researchers. Gravimetric, radiometric, and nuclear magnetic resonance methods are the fastest approaches for water content determination [16,17,18]. The gravimetric method (or drying method) is the most widely accepted international standard due to its simplicity and convenient implementation, but its high time consumption and laboratory dependence make it challenging for real-time measurements [19]. To facilitate quick assessments of soil water content, tensiometers that rely on the water suction of soil have been used [20,21]; however, significant measurement errors occur because of the complex non-linear dependence between the measured soil water content and the water suction of the soil. Neutron attenuation techniques have been employed for swift soil water content measurement [22]; however, they can be unsafe because of the radiation hazard, and the cost of operating and maintaining the equipment is extremely high. In recent years, spectroscopic technologies have developed rapidly, and new identification methods based on spectroscopy and electromagnetic waves have been proposed [23,24,25,26,27,28]; however, the use of these methods on real construction sites may be limited.

Research on using AI directly to predict soil water content is actively developing. Tsai and Huang [29] determined soil water content from thermal images using an artificial neural network. Although they reached impressive results (R² = 0.847 on the testing set), their dataset was a table generated from photos taken by an unmanned aerial vehicle and from laboratory tests of soil samples. A similar idea was used by Usta [30], who developed an artificial neural network to predict soil water content by solving a regression problem. The input variables were obtained in tabular form from soil sampling, remote sensing imaging, and a digital elevation model (DEM), which is usually unavailable in tunneling conditions. Kim et al. [31] developed a deep-learning-based model that predicts soil water content using features extracted from in situ soil surface images. Although their model demonstrated excellent forecasting ability (R² = 0.95) and proved that soil pictures can be used to predict soil water content, its applicability is still very limited. The equipment used by the authors is expensive and has low mobility (photos are taken by a high-resolution camera in a darkroom with special lighting equipment inside); the feature extraction process is complex and requires a high level of data science skill from the staff.

Building on the idea that soil water saturation can be determined from images, this study offers a new approach that solves a classification problem using a dataset obtained from a new laboratory experiment.

2. Experiment for Obtaining the Dataset

2.1. Problem Statement

Earth pressure balance tunnel boring machines (EPB TBMs) use screw conveyors to discharge the muck generated by the cutterhead during tunneling. Muck transportation must be timely and predictable; otherwise, many technical problems may arise, such as an unbalanced tunnel face and excessively fast wear of the cutting tools [32,33,34].

Although the increased moisture content usually makes it easier to cut the soil with a TBM’s cutterhead, this may be an indirect indication of the danger of increased water pressure and potential water inflow into the tunnel.

Muck removal performance is determined by the advance speed and rotational speed of the TBM. It is also clearly influenced by the geological condition of the soil, which determines the shape and size of the particles entering the screw conveyor [35]. If the incoming soil is too fluid, spewing failure can occur during muck transportation. Spewing obstructs pressure regulation in the chamber, which may cause a catastrophic collapse, such as a sudden inrush of large quantities of groundwater through the chamber and screw conveyor.

This section describes an experiment that was conducted to determine the relationship between natural moisture content and the torque of a screw conveyor during muck transportation.

2.2. Construction Site Information

Please refer to Appendix A.

2.3. Experiment

The test soil was collected from the dump area of the construction site where the excavated muck was transported to. After delivery to the laboratory, the soil was kept in a drying oven for 72 h at a temperature of over 100 degrees Celsius. Then, a 3 kg soil mass was picked out. The change in geological conditions was caused by the addition of water and the subsequent passage of soil through the test screw conveyor (Figure 1). The amount of water varied from 0 to 1200 g in increments of 150 g and was recorded as a percentage according to the natural moisture content formula:

$$w = \frac{m_{w}}{m_{s}} \times 100\%, \tag{1}$$

where $m_{w}$ is the mass of water, and $m_{s}$ is the mass of solids in the given soil sample.
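As a quick check of Equation (1), the following minimal Python sketch computes the water content levels produced by the dosing scheme described above (3 kg of oven-dried soil, water added from 0 to 1200 g in 150 g increments). The function and variable names are illustrative assumptions, not part of the original experiment.

```python
def water_content_percent(mass_water_g: float, mass_solids_g: float) -> float:
    """Natural water content w = m_w / m_s * 100 %, as in Equation (1)."""
    if mass_solids_g <= 0:
        raise ValueError("mass of solids must be positive")
    return 100.0 * mass_water_g / mass_solids_g


DRY_SOIL_G = 3000.0                  # 3 kg oven-dried sample, from the text
water_steps_g = range(0, 1201, 150)  # 0 to 1200 g in 150 g increments

levels = [water_content_percent(m, DRY_SOIL_G) for m in water_steps_g]
print(levels)  # nine levels from 0 % to 40 % in 5 % steps
```

Under these values, each 150 g addition raises the water content by 5 percentage points, so the test steps span 0% to 40%.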

The measurements were carried out at a constant rotation speed of 30 rpm. When the water content changes from 0 to 10 percent, the soil behavior shows no notable deviations; only its color changes slightly. As w increases from 10 to 20 percent, the torque increases significantly, because at these values the soil begins to clump during mixing, and the grinding of particles loads the setup more heavily (Figure 2). After the soil moisture reaches the plastic limit, $w = w_{PL} = 21.9\%$, the soil lumps begin to fall apart, and the soil gradually turns fluid up to the liquid limit, $w = w_{LL} = 36.7\%$, after which moisture ceases to be absorbed into the soil paste. Over this range, the torque drops sharply, and the soil sticks to and clogs the setup (Figure 3).

Before and after each test step, the soil conditions were photographed for further processing, dataset creation, and deep learning application.

2.4. Interpretation of Intermediate Results

To control tunnel face stability, EPB TBMs apply a counterbalancing pressure to the face to balance the ground pressure (effective and water pressure), using the excavated soil retained in the chamber. The pressure can be adjusted by regulating the TBM thrust force, by regulating the air pressure at the top of the chamber, and by changing the rotation speed of the screw conveyor, which affects the amount of excavated soil discharged.

Although the problem of tunnel face balance has been studied by many researchers [36,37,38,39] and the equilibrium state depends on numerous parameters in a complex way, this study singles out the conditional state of the muck at a water content of w = 20%, which marks the change in the working mode of the EPB TBM. At water content values from 0 to 20% (Figure 2), the EPB operates in a mode in which the soil pressure in the chamber balances the external pressure on the tunnel face without any problems. This situation is quite idealized and rarely found in practice, so it is not considered in the application of the computer vision technique.

After cutting and excavating middle-weathered mudstone by the TBM, the soil entering the chamber is mostly a mixed sand–clay mass. The shear strength of such soil is described by the Mohr–Coulomb law:

$$\tau = \sigma \tan \varphi + c, \tag{2}$$

where $\tau$ is the shear strength (kPa), $\sigma$ is the total normal stress (kPa), $\varphi$ is the internal friction angle (°), and $c$ is cohesion (kPa).
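Equation (2) can be evaluated directly, as in the minimal sketch below. The input values are hypothetical and chosen only to show the arithmetic (including the degree-to-radian conversion); they are not measurements from this study.

```python
import math


def shear_strength_kpa(sigma_kpa: float, phi_deg: float, cohesion_kpa: float) -> float:
    """Mohr-Coulomb shear strength: tau = sigma * tan(phi) + c (Equation (2))."""
    # math.tan expects radians, so the friction angle is converted from degrees.
    return sigma_kpa * math.tan(math.radians(phi_deg)) + cohesion_kpa


# Hypothetical illustration values (not taken from the paper):
tau = shear_strength_kpa(sigma_kpa=100.0, phi_deg=45.0, cohesion_kpa=20.0)
print(round(tau, 3))  # tan(45 deg) is 1 up to rounding, so tau is about 120 kPa
```

With the normal stress and friction angle held fixed, any loss of cohesion lowers the shear strength by the same amount, which is why the cohesion trend with water content matters for the class boundaries discussed next.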

This dependence is widely used and well studied. Many researchers [40,41,42,43] have experimentally confirmed that the shear strength of soil and its bearing capacity decrease as the moisture content increases [44]. The angle of internal friction generally shows the same trend, while cohesion behaves differently. Chao Ye et al. [45] found that cohesion increased with water content at relatively low values and reached a maximum at w = 15.4% and w = 11.4% for shale and clay, respectively. Shunqing Liu et al. [46] found that cohesion in soil–rock mixtures under different conditions increases steadily, reaches a maximum at around w = 20%, and then drops significantly. Tonnizam et al. [47] obtained similar results for alluvium: shear strength and friction angle decreased steadily as the moisture content grew, while cohesion decreased after reaching a maximum at around w = 20%. From the dependence of torque on water content obtained in the experiment and the above studies, it can be assumed that cohesion for the present soil type reaches its highest value at w = 21.9% and drops sharply after w = 30%, which is sufficient to select boundary states for dividing the soil into classes.

Thus, the soil conditions are divided into three classes in which the muck is considered relatively “good”, “average”, or “bad”: class I at $w \leq w_{PL} = 21.9\%$, class II at $w_{PL} = 21.9\% < w < 30\%$, and class III at $30\% \leq w$.
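The class boundaries above translate into a simple labeling rule. The sketch below is one possible way to assign labels to photographs from the known water content of each test step; the function name and the string labels are assumptions made for illustration.

```python
W_PLASTIC_LIMIT = 21.9   # w_PL in percent, from the laboratory tests
W_CLASS_III_MIN = 30.0   # lower bound of class III in percent


def soil_class(w_percent: float) -> str:
    """Map water content (percent) to the class defined in the text."""
    if w_percent < 0:
        raise ValueError("water content cannot be negative")
    if w_percent <= W_PLASTIC_LIMIT:
        return "I"    # relatively "good" muck
    if w_percent < W_CLASS_III_MIN:
        return "II"   # "average" muck
    return "III"      # "bad" muck


for w in (20.0, 21.9, 25.0, 30.0, 36.7):
    print(w, soil_class(w))
```

Note that the boundary values themselves fall into class I (w = 21.9%) and class III (w = 30%), following the inequalities as stated.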

Additionally, tests were carried out for the condition of the soil at the lower and upper limits of soil plasticity (21.9% and 36.7%). Also, to expand the dataset, water weighing from 300 to 600 g with a reduced step was added to a 1.5 kg dry soil sample to obtain more states of intermediate water saturation values (Figure 4).

3. Deep Learning with CNN for Solving Computer Vision Problem of Classification

3.1. Main Concepts

Computer vision is a field of study that focuses on enabling computers to interpret and understand the visual world. It involves the development of algorithms and techniques that allow machines to process, analyze, and extract information from visual data such as images and videos. Deep learning, on the other hand, is a subfield of machine learning that uses artificial neural networks with multiple layers to learn from data.

Convolutional neural networks (CNNs) are a type of deep learning model that has revolutionized the field of computer vision. CNNs are specifically designed to process visual data and have been incredibly successful in tasks such as image classification, object detection, and image segmentation.

The architecture of a CNN is inspired by the organization of the visual cortex in animals. It consists of multiple layers, including convolutional layers, pooling layers, and fully connected layers (Figure 5). In a convolutional layer, a set of learnable filters (kernels) is applied to the input data using a mathematical operation (convolution). Each filter detects different features or patterns within the input data, such as edges, textures, or shapes. The output of the convolutional layer is a set of feature maps, each representing the presence of a specific feature in the input data.
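To make the convolution operation concrete, here is a minimal pure-Python sketch of a single filter sliding over a tiny image. The image and the kernel are made-up toy values; in a real CNN the kernel weights are learned, and, as in most deep learning libraries, the operation shown is cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 4 x 4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the column where the edge lies, which is the sense in which a feature map represents the presence of a specific feature.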

Pooling layers are commonly used in CNNs to reduce the spatial dimensions of the input volume, which makes the network more robust to variations in the input data. Pooling also decreases the computational complexity and the number of parameters in the network and helps to control overfitting. There are two main types of pooling layers: max pooling and average pooling. In max pooling, the maximum value within a certain window (typically 2 × 2) is selected and retained, while in average pooling, the average value within the window is calculated and retained.
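The difference between the two pooling types can be seen on a small example. The sketch below applies non-overlapping 2 × 2 windows to a made-up 4 × 4 feature map; the function name and the values are illustrative assumptions.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window (stride = size)."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 1, 7, 8],
]
print(pool2d(fmap, mode="max"))      # [[4, 2], [1, 8]]
print(pool2d(fmap, mode="average"))  # [[2.5, 1.0], [0.5, 6.5]]
```

In both cases the 4 × 4 map shrinks to 2 × 2, so later layers process a quarter of the values.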

Fully connected layers are used to make predictions based on the extracted features.

Training a CNN involves feeding it with labeled images and adjusting its parameters (weights and biases) using an optimization algorithm such as gradient descent. The goal is to minimize a loss function that measures the difference between the predicted output and the true label.

A more detailed description of operation principles of CNNs is presented in the research [48].

One of the key advantages of CNNs is their ability to automatically learn hierarchical representations of visual data. This means that lower layers in the network learn simple features like edges and textures, while higher layers learn more complex features like object parts or entire objects.

3.2. Transfer Learning

Deep CNNs with good classifying ability can have many layers and millions of trainable parameters (weights and biases). Training such networks requires huge amounts of computing power, so it is impractical to repeat the training from scratch for each new task. Instead, transfer learning is used: the main trained parameters remain the same, and only the output layer is changed according to the number of image classes (which determines the number of neurons in the output layer).
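The replaced output layer can be pictured as a small classifier placed on top of features produced by the frozen, pre-trained part of the network. The following pure-Python sketch assumes a softmax output with three neurons (one per soil class); the feature vector and the weights are toy values chosen for illustration, not parameters of any model in this study.

```python
import math


def softmax(logits):
    """Numerically stable softmax: turns scores into class probabilities."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(features, weights, biases):
    """New output layer: one neuron (row of weights) per image class."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


CLASS_NAMES = ["I", "II", "III"]

# Hypothetical feature vector from the frozen pre-trained layers.
frozen_features = [1.0, 2.0]
# Hypothetical trainable parameters of the new three-neuron output layer.
head_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
head_biases = [0.0, 0.0, 0.0]

probs = classification_head(frozen_features, head_weights, head_biases)
predicted = CLASS_NAMES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

During transfer learning, only `head_weights` and `head_biases` would be updated by the optimizer, while the layers that produce `frozen_features` keep their pre-trained values.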

Below, the pre-trained models that were used to solve the problem of soil image classification are introduced.

3.2.1. ResNet

The Residual Network (ResNet) was introduced by He et al. (2016) [49]. Unlike classical neural networks, ResNet has shortcut connections between non-contiguous layers of the network, which allow better transfer of information and reduce the vanishing of the gradient during backpropagation.

3.2.2. MobileNet

Introduced by a Google research team in 2017, MobileNet [50] is a computer vision model designed for mobile devices. It uses depth-wise separable convolutions to significantly reduce the number of trainable parameters, which decreases the computational load.

3.2.3. Xception

Based on Google’s Inception model, Xception [51] is a linear stack of depth-wise separable convolution layers with residual connections. With its reduced complexity, the Xception architecture uses model parameters efficiently and replaces Inception modules with depth-wise separable convolutions.

3.2.4. DenseNet

The Densely Connected Convolutional Network (DenseNet) [52] is an efficient CNN structure characterized by short connections between layers close to the input and layers close to the output. DenseNet improves the flow of information and gradients throughout the network, with accuracy increasing steadily as the number of parameters grows. Its main feature is dense connectivity, in which each layer receives the feature maps of all preceding layers as input. This connectivity alleviates the vanishing-gradient problem and improves feature learning, increasing the generalization ability of the network.

3.2.5. NasNet

The Neural Architecture Search Network (NasNet) [53] is a deep learning architecture obtained by neural architecture search, a procedure designed to automatically discover a well-performing network architecture for a given task. The process involves training and evaluating a large number of candidate architectures, each with different combinations of layers, connections, and hyperparameters. A controller trained with reinforcement learning guides the search through a vast search space toward architectures that maximize performance. By automating the design process, this approach reduces the need for manual trial and error, which can be time-consuming and prone to human biases. It also allows for the exploration of unconventional architectures that may not have been considered by human designers.

3.2.6. EfficientNet

EfficientNet was proposed by Tan and Le [54]; this model uniformly scales the depth, width, and input resolution of the network with a compound coefficient. In the original publication, the architecture matched or exceeded the accuracy of earlier networks on classical datasets (for example, ImageNet) while being smaller and faster.

3.2.7. ConvNeXt

ConvNeXt [55] is a purely convolutional deep learning architecture that modernizes the ResNet design with choices borrowed from vision transformers. The main idea is to keep the simplicity and efficiency of a standard CNN while adopting elements such as a “patchify” stem, large-kernel (7 × 7) depth-wise convolutions, inverted bottleneck blocks, layer normalization, and GELU activations.

In ConvNeXt, each stage consists of a sequence of such blocks; a global average pooling layer then aggregates the features, which are passed through a fully connected layer for final classification.

These design choices allow ConvNeXt to capture both fine-grained details and high-level context and to achieve a good balance between representational capacity and computational efficiency.

3.3. Estimation and Optimization of the Models

The metric “accuracy” is used to evaluate the predictive capability of the models:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}, \tag{3}$$

where TP, TN, FP, and FN are true positive, true negative, false positive, and false negative predictions, respectively; (TP + TN) is the number of correct predictions, and (TP + FP + FN + TN) is the total number of predictions.
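For a three-class problem, Equation (3) reduces to the share of images whose predicted class matches the true class. A minimal sketch, with made-up labels used only for illustration:

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions: correct / total, as in Equation (3)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for five images (not results from this study).
y_true = ["I", "II", "III", "II", "I"]
y_pred = ["I", "II", "II", "II", "I"]
print(accuracy(y_true, y_pred))  # 4 correct out of 5 -> 0.8
```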

A loss function is used for model optimization. It measures the discrepancy between the value predicted by the model and the true value. The most common loss function for deep learning classification models is “cross-entropy”, which is used in this paper and defined as follows:

$$\mathrm{Cross\text{-}entropy} = - \sum_{i = 1}^{n} \sum_{j = 1}^{m} y_{i,j} \log (p_{i,j}), \tag{4}$$

where $n$ is the number of samples, $m$ is the number of classes, $y_{i,j}$ is the true value, equal to 1 if sample $i$ belongs to class $j$ and 0 otherwise, and $p_{i,j}$ is the probability predicted by the model that sample $i$ belongs to class $j$.
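Equation (4) can be computed in a few lines. The sketch below uses one-hot true labels and made-up predicted probabilities for two samples and three classes; the small `eps` clamp is an added safeguard against log(0) and is an assumption of this sketch, not part of the definition.

```python
import math


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Summed cross-entropy over samples i and classes j (Equation (4))."""
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p in zip(y_row, p_row):
            if y:  # terms with y = 0 contribute nothing
                total -= y * math.log(max(p, eps))
    return total


# Hypothetical example: two samples, three classes.
y_true = [[1, 0, 0], [0, 1, 0]]
p_pred = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]

loss = cross_entropy(y_true, p_pred)
print(round(loss, 4))  # -(ln 0.8 + ln 0.7), roughly 0.58
```

Deep learning frameworks usually report this quantity averaged over the samples in a batch rather than summed, so reported loss values are typically on a per-sample scale.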

3.4. Dataset

The experiment described in Section 2 produced a set of photographs of soil in different states. To study the fluid state of the muck, the soil conditions were divided into three classes as mentioned before; the available high-resolution photos were then fragmented into images with a resolution of 200 × 200 pixels. Data augmentation was also used: the resulting patches were randomly rotated, reflected, and changed in brightness to increase the diversity of the dataset. In total, the dataset contains 930 images with an assigned class label and is divided into training (80%) and test (20%) sets. The division was made at the image preprocessing stage, taking into account the balance of the data distribution; that is, the data for each iteration of the experiment are divided in these proportions. Then, a validation set (10%) is selected from the training set.
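One way to realize such a balanced split is a stratified split, in which each class is divided in the same proportion. The sketch below is a minimal pure-Python version; the equal class sizes (310 images each) and the reading of “10%” as 10% of the training set are assumptions, since the per-class counts are not stated here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Assumed class distribution: 930 images, 310 per class (hypothetical).
labels = ["I"] * 310 + ["II"] * 310 + ["III"] * 310
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# Validation set: 10 % of the training indices (assumed interpretation).
random.Random(1).shuffle(train_idx)
n_val = round(0.1 * len(train_idx))
val_idx, fit_idx = train_idx[:n_val], train_idx[n_val:]

print(len(fit_idx), len(val_idx), len(test_idx))  # 670 74 186
```

Under these assumptions, 930 images yield 744 training and 186 test images, and 74 of the training images are held out for validation.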

3.5. Results and Discussion

Using the transfer learning method (Section 3.2), one fully connected layer of neurons was added to each model described above, together with a dropout layer (which randomly switches off part of the neurons during training) to prevent overfitting. The CNN architectures had been pre-trained on the popular ImageNet database [56] and were imported with their trained parameters already set. Thus, during training, only the parameters of the added layers were tuned, which saved time and computing power and allowed fewer epochs to be used. All models use the same hyperparameters so that these do not influence the comparison: learning_rate = 0.001, batch_size = 32, and the Adam optimization algorithm, which demonstrated the best classifying ability during the tuning procedure.

The results are presented in Figures 6–13. The best architecture, with the highest accuracy and the lowest loss, is DenseNet (accuracy 0.9302 and loss 0.1980); the Xception model (accuracy 0.8760 and loss 0.4172) performs the worst. It should be noted that the dataset used is very limited in diversity since only one type of soil is used for the class separation. In the future, it is planned to expand the existing dataset by conducting additional experiments and using soils with different properties.

The DenseNet-based model demonstrated the best results due to its structure, in which each layer receives the concatenated feature maps of all previous layers (not just the immediately preceding one). This provides a richer set of features because each subsequent layer has access to all previous activations, so the model captures the complex textures and structures of the soil better. Short links between layers help to keep gradients informative even in deep layers, where vanishing gradients are often a problem. In the ResNet model, for example, intermediate activations can be “forgotten” as the network deepens, while in DenseNet, each layer adds only new features rather than overwriting the old ones. This is especially useful for soil analysis, where multilevel textures are important (from large particles to microscopic patterns). DenseNet requires fewer parameters than ResNet or EfficientNet due to feature reuse (the same features do not need to be re-extracted) and bottleneck layers with 1 × 1 convolutions before the 3 × 3 convolutions, which reduce computational complexity.

DenseNet turned out to be the best choice because its architecture preserves and reuses features at all levels as much as possible, which is crucial for soil analysis, where both local textures and global patterns are important. This makes the model more accurate with less computational effort compared to other models.

In addition to the mentioned accuracy, the DenseNet-based model performed with an F1-score = 0.903, recall = 0.920, and precision = 0.846, which indicates the high classification ability of the model.
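For a multi-class model, precision, recall, and F1-score are usually computed per class from the confusion matrix and then averaged. The sketch below shows one common scheme (macro averaging) on a made-up 3 × 3 matrix; the counts are hypothetical and are not the values from Table 1, and the averaging scheme used in this study is not stated.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) per class; cm[true][pred] holds counts."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted other
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true class other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append((precision, recall, f1))
    return metrics


def macro_average(metrics):
    """Unweighted mean of each metric over the classes."""
    return tuple(sum(m[i] for m in metrics) / len(metrics) for i in range(3))


# Hypothetical confusion matrix (rows: true class I, II, III).
cm = [
    [8, 2, 0],
    [1, 8, 1],
    [0, 2, 8],
]
macro_p, macro_r, macro_f1 = macro_average(per_class_metrics(cm))
print(round(macro_p, 3), round(macro_r, 3), round(macro_f1, 3))
```

A macro-averaged F1-score is the mean of the per-class F1-scores, so in general it differs from the harmonic mean of the macro-averaged precision and recall.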

The confusion matrix is presented in Table 1.

Figure 14 below shows the results of applying Grad-CAM (Gradient-weighted Class Activation Mapping) to dataset images from class I during the neural network fitting. This method visualizes the effect of different areas of an image on the prediction of a model.

The blue and white zones on the heat map of the image indicate that these areas have no influence on the prediction results. These include areas with an indistinct structure of the soil and the walls of the container in which the sample is located.

The red zones show the areas that contain the most important features for the forecasting. From the figure, it can be seen that the red zone is directly at a location of the surface with a clearly distinguishable soil structure.
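For reference, the heat maps discussed here follow the standard Grad-CAM formulation, which is not restated in this article. In the notation assumed below, $A^{k}$ is the $k$-th feature map of the chosen convolutional layer, $y^{c}$ is the score of class $c$ before the softmax, and $Z$ is the number of spatial positions in a feature map:

```latex
\alpha_{k}^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A_{ij}^{k}},
\qquad
L_{\text{Grad-CAM}}^{c} = \mathrm{ReLU}\left( \sum_{k} \alpha_{k}^{c} A^{k} \right)
```

The weights $\alpha_{k}^{c}$ measure how strongly each feature map influences the class score, and the ReLU keeps only the regions with a positive influence, which are the ones rendered in red.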

Figure 15 shows an image of the soil from class III of the training set and its heat map. It can be seen that during training, the neural network does not take into account the area in the center of the image, which can be interpreted ambiguously because the foreground and background merge and the shading is unclear. At the same time, the main zone of influence on the classification is the border areas, where the surface of the soil is distinct and clearly indicates increased water content.

Traditional methods of classifying soil according to the degree of water saturation (laboratory tests, granulometric analysis, and rheological methods) have a number of limitations that are overcome with the help of the proposed model. The main advantages of the model are increased automation and processing speed since image analysis is performed in real time, while standard methods require sampling, drying, weighing, and other time-consuming procedures. The process itself takes seconds, as opposed to the many hours or days needed in a laboratory setting. High accuracy and reproducibility are also distinguishing features of the model. Artificial intelligence recognizes complex patterns (texture, color, and structure) that can be overlooked by humans or simple algorithms. Unlike visual assessment by an expert, the neural network provides stable results for the same input data. Traditional methods require expensive devices, while methods based on computer vision can work even with simple and cheap cameras. The ability to perform the analysis in the field using mobile devices (for example, through an application with the pre-trained model) is also an advantage of the proposed approach. The neural network can be retrained for new types of soil or photographing conditions without changing the entire methodology; the approach itself can be easily integrated into automated monitoring systems. The economic benefits are also clear: the approach significantly reduces the cost of laboratory tests, equipment, and specialists and minimizes errors due to the human factor (incorrect sampling and inaccurate weighing).

At present, the main limitation is that only one type of soil was used to train the models, which has several consequences. First, the model learns features specific to this type of soil (texture, color, and structure), while other soils will have different visual characteristics. Second, the model can interpret only a limited range of the features that arise from the different behavior of soils in contact with water. For example, sand quickly lets water through and changes its color, while clay forms lumps or cracks. A model trained on such limited data may miss these features or mistake them for noise or error. Finally, since the models are trained on a single type of soil, they may overfit to its artifacts (specific glare or granularity). To overcome these difficulties, it is first of all necessary to expand the dataset by adding images of new soils at different values of water content. It is also possible to use multimodal data, in which images are supplemented with sensor data (humidity, density, and electrical conductivity) and spectral characteristics (infrared images).

4. Conclusions

This article presents a study of the possibility of predicting soil conditions depending on natural water content by solving an image classification problem with computer vision technology. During the experiment, a relationship was established between the torque on the screw conveyor and the moisture content of the soil. At each step of the experiment, photographs of different soil conditions were taken to create a dataset for further application of the computer vision technique. Then, soil images were divided into three classes: class I, when the water content is between 20% and the plastic limit (w_{PL} = 21.9%); class II, when the water content is between the plastic limit and 30%; and class III, when the water content is between 30% and 40%. The resulting models, after refinement, can later be used to classify soil in real time during tunneling if a camera is installed in the excavation chamber with the ability to send new images to a local computer for classification.
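The labeling rule described above can be sketched in a few lines. The example below assumes that water content is the ratio of water mass to solid mass (consistent with the 1.5 kg solid mass and the 300–600 g water masses listed in the caption of Figure 4, which give 20–40%). The function names and the treatment of values lying exactly on a class boundary (assigned here to the lower class) are assumptions made for illustration, not details stated in the article.

```python
PLASTIC_LIMIT = 21.9  # %, plastic limit w_PL given in the text


def water_content_percent(water_g, solid_g):
    """Water content as water mass over solid mass, in percent (assumed definition)."""
    return 100.0 * water_g / solid_g


def soil_class(w):
    """Map water content w (%) to class "I", "II" or "III".

    Returns None outside the 20-40% range covered by the three classes.
    Boundary values go to the lower class (an assumption).
    """
    if w < 20.0 or w > 40.0:
        return None
    if w <= PLASTIC_LIMIT:
        return "I"
    if w <= 30.0:
        return "II"
    return "III"


# Masses from the caption of Figure 4: 1.5 kg of solids, varying water mass.
SOLID_MASS_G = 1500.0
WATER_MASSES_G = [300, 330, 350, 380, 420, 450, 480, 510, 530, 550, 580, 600]

labels = {
    m: soil_class(water_content_percent(m, SOLID_MASS_G))
    for m in WATER_MASSES_G
}
for mass, label in labels.items():
    w = water_content_percent(mass, SOLID_MASS_G)
    print(f"{mass} g water -> w = {w:.1f}% -> class {label}")
```

For a different soil, only `PLASTIC_LIMIT` and the fixed bounds would need to be replaced by values obtained from the corresponding laboratory tests.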

Author Contributions

Conceptualization, M.M. and Y.F.; methodology, M.M., Y.F. and Y.W.; software, M.M. and S.K.; validation, M.M., S.K. and V.A.; formal analysis, S.K.; investigation, M.M., Y.W. and V.A.; resources, Y.F.; data curation, Y.F.; writing—original draft preparation, M.M.; writing—review and editing, M.M. and V.A.; visualization, M.M.; supervision, Y.F.; project administration, Y.F.; funding acquisition, Y.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Science Fund for Distinguished Young Scholars of China (52425807), the Science Foundation of Sichuan Province, China (2024NSFTD0013), and the Sichuan “Top Youth” Special Program for Outstanding Young Science and Technology Talent (DQ202403).

Data Availability Statement

The data presented in this study are available on reasonable request from the corresponding author.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

The Ziyang line project connects the airport in Chengdu with the railway station in Ziyang City. The total length of the line is 39.37 km, of which the underground section is 9.77 km. The strata through which the tunnel passes are mainly moderately weathered mudstone and sandstone and, locally, strongly weathered mudstone and sandstone. The outer diameter of the shield tunnel lining is 7900 mm, the inner diameter is 7100 mm, and the thickness of the tunnel segment is 400 mm.

The ZTSE8200 (Figure A1) slurry–soil pressure double-mode shield machine produced by China Railway Construction Heavy Industry Co., Ltd. (CRCHI), Changsha, China, is used. The machine is equipped with a double-mode muck improvement system and has the characteristics of efficient tunneling and precise settlement control. The shield machine can switch different tunneling modes in real time based on different geological conditions, the distribution of surface structures, and settlement monitoring data during construction to ensure safe, stable, and orderly construction.

Figure A1. “ZTSE8200” double-mode shield and scheme of the TBM.

The construction section in question descends through the earth ditch after originating from Baotai Avenue and then continues with a curve radius of 570 m; thereafter, it turns into Binjiang Avenue with a curve radius of 720 m through Fuli Garden and reaches Changhong Square Station (Figure A2).

Figure A2. Plan and longitudinal profile of shield construction interval Baotai Avenue Station–Changhong Square Station.

The longest boring mileage of a single line is 2448.74 m. The interval passes under the Tuojiang River; the length of this under-crossing section is about 450 m, and the minimum distance between the top of the tunnel and the bottom of the river is 15.2 m. The interval also passes alongside the piles of the Tuojiang First Bridge, which are about 4 m from the side of the tunnel.


Figure 1. Photo and scheme of the screw conveyor test setup.

Figure 2. Dependence of the applied torque on the water content 0–20%.

Figure 3. Dependence of the applied torque on the water content 20–40%.

Figure 4. Photos of soil samples with constant solid mass 1.5 kg and varying water mass (gram): (a) 300; (b) 330; (c) 350; (d) 380; (e) 420; (f) 450; (g) 480; (h) 510; (i) 530; (j) 550; (k) 580; (l) 600.

Figure 5. Structure of a convolutional neural network.

Figure 6. ResNet model results.

Figure 7. MobileNet model results.

Figure 8. Xception model results.

Figure 9. DenseNet model results.

Figure 10. NasNet model results.

Figure 11. EfficientNet model results.

Figure 12. ConvNeXt model results.

Figure 13. Models’ performance.

Figure 14. Image from training set of class I (left) and its Grad-CAM heat map (right).

Figure 15. Image from training set of class III (left) and its Grad-CAM heat map (right).

Table 1. Confusion matrix of DenseNet-based model.

	Class I	Class II	Class III
Class I	22	1	1
Class II	3	22	1
Class III	3	3	75
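As a reading aid for Table 1, the short sketch below computes summary ratios from the matrix. It assumes that the three rows correspond, in order, to classes I, II, and III. Because the table as reproduced here does not state which axis holds the true labels and which the predicted ones, only the share of images on the diagonal is reported as a single figure; the row-wise and column-wise ratios correspond to recall and precision, or the reverse, depending on that orientation. Variable names are illustrative.

```python
CLASSES = ["I", "II", "III"]

# Values from Table 1 (rows assumed to be classes I, II, III in order).
MATRIX = [
    [22, 1, 1],
    [3, 22, 1],
    [3, 3, 75],
]

total = sum(sum(row) for row in MATRIX)
diagonal = sum(MATRIX[i][i] for i in range(len(CLASSES)))
diagonal_share = diagonal / total  # independent of axis orientation

# Diagonal entry divided by its row sum and by its column sum.
row_share = {
    c: MATRIX[i][i] / sum(MATRIX[i]) for i, c in enumerate(CLASSES)
}
col_share = {
    c: MATRIX[i][i] / sum(row[i] for row in MATRIX)
    for i, c in enumerate(CLASSES)
}

print(f"images: {total}, on diagonal: {diagonal}, share: {diagonal_share:.4f}")
for c in CLASSES:
    print(f"class {c}: row-wise {row_share[c]:.3f}, column-wise {col_share[c]:.3f}")
```

The diagonal share obtained this way need not coincide with the accuracy quoted for the DenseNet-based model elsewhere in the article, which may have been computed on a different data split.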

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
