Article

A One-Step Methodology for Identifying Concrete Pathologies Using Neural Networks—Using YOLO v8 and Dataset Review

by Joel de Conceição Nogueira Diniz 1, Anselmo Cardoso de Paiva 1, Geraldo Braz Junior 1, João Dallyson Sousa de Almeida 1, Aristófanes Corrêa Silva 1, António Manuel Trigueiros da Silva Cunha 2,3 and Sandra Cristina Alves Pereira da Silva Cunha 2,4,*
1 UFMA/Computer Science Department, Universidade Federal do Maranhão, Campus do Bacanga, São Luís 65085-580, Brazil
2 UTAD/Engineering Department, Universidade de Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal
3 ALGORITMI Research Centre, University of Minho, 4800-058 Guimarães, Portugal
4 CMADE - Centre of Materials and Building Technologies, UTAD, 5000-801 Vila Real, Portugal
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(10), 4332; https://doi.org/10.3390/app14104332
Submission received: 2 May 2024 / Revised: 15 May 2024 / Accepted: 17 May 2024 / Published: 20 May 2024

Abstract

Pathologies in concrete structures can be visually evidenced on the concrete surface, for example by fissures or cracks, fragmentation of part of the concrete, efflorescence, corrosion stains, or exposed steel bars, the latter two occurring in reinforced concrete. These pathologies can therefore be analyzed from images of concrete structures. This article proposes a methodology for visually inspecting concrete structures using deep neural networks. The method speeds up the detection task and increases its effectiveness: it saves time in preparing the identifications to be analyzed and eliminates or reduces errors, such as human errors caused by tedious, repetitive analysis tasks. The methodology was tested to assess its accuracy. The neural network architecture used for detection was YOLO, in versions 4 and 8, tested to measure the gain from migrating to a more recent version. The classification network was trained with YOLO version 8 on the Ozgnel dataset, and the detection dataset was CODEBRIM. Using a dedicated classification dataset yields a network better trained for this function and allows false positives from the detection stage to be eliminated. The classification achieved 99.65% accuracy.

1. Introduction

The development of urban centers is directly related to the advancement of civil construction. It is relevant to the economy, generating investments and jobs and providing one of its main physical structures [1]. This physical structure needs adequate maintenance to ensure people’s safety. Lack of maintenance or inadequate maintenance can compromise the structure and result in its loss or, worse, severe or fatal accidents.
The second most used material in the world is reinforced concrete, which can generate versatile, durable, accessible, functional, and very attractive constructions [2]. Concrete has high compressive strength, which is one of the main stresses found in constructions, although it has significantly lower tensile strength, which is resolved by its association with steel [1].
One of concrete’s important characteristics is the easy possibility of adapting itself, even in the place where it will be used. This versatility makes meeting the requirements of architectural projects and the modern forms required for them easier. Even in aggressive environmental conditions, it is durable if the proper mixture is constructed and maintained correctly. It stands out for its resistance to significant loading, the effects of aggressive actions, corrosion of embedded metals, ice overload, concrete resistance to changes in volume, abrasion/erosion, and chemical actions whenever due maintenance is observed.
The existence of several factors or agents that can act on concrete results in the possibility of several pathologies in this material, which makes its maintenance essential. Maintenance, therefore, should be paid special attention, given the material’s wide use and the impacts of a possible collapse of concrete structures. It is a considerable risk if the structure loses, or does not present, the resistance for which it was designed when calculating the project. Concrete structures must, therefore, for the reasons mentioned, be monitored throughout their useful life.
The pathologies that usually occur in concrete can be identified and monitored by various techniques, such as instrumentation, and it is very common to inspect its surface, either in the original location or using images, since the main pathologies result in visible marks on the surface of the concrete [3]. Among the main concrete pathologies are cracks or fissures, fragmentation of part of the concrete, efflorescence, corrosion stains, or exposed steel bars visible on the concrete surface.
Improving inspection methods for concrete structures to detect and analyze pathologies is necessary. This analysis is generally carried out through the inspection of structures. Inspections are documented with image capture, performed by a professional in the field. The pathology diagnosis must then be deepened through visual analysis of the images by a specialist. Images often need to be sent to professionals away from the construction site, as these specialized professionals may only be available in a different region. Receiving the images remotely reduces travel expenses and can even overcome the limited availability of professionals able to make the diagnosis.
Regular inspections of concrete structures are fundamental, so they need to be carried out properly to avoid unsafe conditions that put users of the structure at risk. These inspections can be time-consuming, rely on subjective analysis, and consume resources, all of which can result in analysis errors. Automating inspections can prevent these factors from causing analysis problems that compromise the safety of concrete structures [4]. The inspection routine is reproducible since algorithms can model it. Several studies have addressed this possibility, automating the task and substantially optimizing the inspection of these structures.
Since abundant images can be generated in a building inspection, analyzing all these images by a professional can incur human error due to fatigue caused by such a tedious, repetitive task. Computer vision can, therefore, help with these monotonous and repetitive activities, resulting in time savings and reduced errors. This image analysis process for diagnosing pathologies in concrete structures can be modernized by automation using deep-learning methods.
Deep learning has opened up significant possibilities for image processing in the construction industry. Although the available databases are still scarce in some areas, essential work on identifying pathologies in concrete is already available in the literature. The literature presents relevant work on pathologies in concrete structures and artificial intelligence.
Zhinan Gao et al. [5] researched concrete exposed to fire. Cracks and fissures in concrete can especially occur in tunnels, where the temperature increase is significant. The result is that the safety of a tunnel is significantly compromised. The analysis of the parameters involved is complex since their analysis results in non-linear models that cannot easily be solved analytically. This article proposes the use of neural networks to solve this complex problem. The results are compared with experimental solutions to evaluate the efficiency of the proposed model. The study provides an important resource for studying this use case of concrete structures.
Guzmán-Torres et al. [6] emphasize the fact that one of the main activities of infrastructure maintenance is to assess damage to reinforced concrete elements. Corrosion is one of the pathologies that deserves the most attention, and maintenance activity requires major government investment in this area. Visual inspection is one possibility, but it is subjective and requires considerable time and resources. This research uses YOLO, version 3, to detect corrosion in reinforced concrete structures. Transfer learning and high-resolution images are adopted to improve the model’s accuracy. The results show the satisfactory capacity of this artificial intelligence approach for civil engineering.
Ojeda et al. [7] emphasize the importance of identifying and classifying cracks in concrete to avoid structural damage. They used neural networks to determine the behavior and types of failure in concrete cylinders. The methodology cataloged 2650 images of flaw types in concrete cylinders tested in compression in a Materials Testing and Strength Laboratory. The MobileNet, DenseNet121, ResNet50, and VGG16 algorithms achieved accuracies of 96%, 91%, 86%, and 90%, respectively, making MobileNet the best predictor. The work presents an algorithm that can help assess the health of concrete and can also be coupled with the use of drones.
Beskopylny et al. [8] highlight the importance of using computer vision for monitoring structures and that intelligent technologies are increasingly present in all phases of civil construction. A U-Net convolutional neural network (CNN) is used for the segmentation of hardened cement paste violations. The accuracy was 60%, but another model showed slightly better results. The work shows how this technology allows the efficient monitoring of hardened cement paste in relation to possible pathologies.
Vijay et al. [9] use neural networks to predict the compressive strength of microbial concrete. Sixty sets of data were obtained by observing the strength of curing concrete over 7, 28, and 56 days. The network was trained with three input parameters to produce the compressive strength output. The experiment shows how accumulating bacteria and calcium lactate as a source of nutrients can increase compressive strength and heal cracks. The neural network experiment allows for predicting the resistance of microbial concrete with different concentrations of the bacteria Bacillus subtilis and calcium lactate.
Zhang et al. [10] emphasize how challenging it is to characterize defects due to unknown topology, geometry, material properties, and non-linear deformation. They present a way of identifying unknown geometric parameters and materials with neural networks. They use a meshless method, parameterizing the geometry of the material using a differentiable and trainable method that can identify multiple structural features. The defined framework can be applied in cases involving unknown material properties and highly deformable geometries, aimed at material characterization, quality assurance, and structural design.
Jin et al. [11] review articles related to machine learning in material studies. Studying the mechanics of materials using these new techniques provides a better understanding of the materials used in the construction industry.
Li et al. [12] show the use of a lightweight convolutional neural network, WearNet, which can automatically detect scratches for components in contact sliding, such as those in metal forming. The authors claim that the result was excellent, with a classification accuracy of 94.16%, a much smaller model size, and faster detection speed.
The methodology presented in this work has also been developed from [4], with the use of neural networks to detect and classify pathologies in concrete structures. In addition to the theory, new practical experiments were also presented to validate the proposed methodology. The present work seeks to differentiate the identification of pathologies using neural networks in relation to the researched works. This work proposes a method for visually inspecting concrete structures using neural networks, understanding the importance of improving the classification of pathologies identified by detection. The proposal uses the classification capacity of detection to carry out a specific classification in the sequence under particular conditions. This article puts forth the idea of carrying out both processes with the same technology to allow the sequence, detection, and classification to be in the same scope.

2. Materials and Methods

This section describes the method and the resources needed to apply it, such as the required datasets. The method proposed in this paper is summarized in Figure 1 and consists of the following steps:
1. Image acquisition.
2. Pathology detection.
3. Pathology classification.
4. Evaluation.
The method feeds acquired images of the structure to a convolutional neural network, using computer vision for both detection and classification. The acquired images may need quality enhancement before detection techniques are applied. A deep-learning network carries out detection, so it must have been trained on an appropriate dataset. If detection produces many false positives, which can happen when the detection training dataset is the smaller of the two, a network trained specifically for classification on a second suitable dataset is then used to provide a better classification.
The following sections describe the stages of the method in detail.

2.1. Image Acquisition

The initial stage is acquiring the image dataset to feed the model, which is responsible for the detection and classification task.
Two types of image acquisition will be analyzed. One set of images is used to train and test the deep neural network that will identify concrete pathologies. The second set consists of the images for which the system is ultimately proposed, that is, images captured in a real pathology analysis scenario.
To train and test the deep neural network, an adequately annotated dataset is required for the task at hand—classification, detection, or segmentation. Good learning is directly related to good annotation and an adequate number of images; the greater the number of images, the better the learning. The annotation format is different depending on the task being performed; for this reason, how it is done is different for each task. There is a different degree of difficulty for each type of annotation. Classification is more straightforward, as it is more a matter of simply noting which type or class an image belongs to. Detection is more complex than classification because each detected artifact must be located in the primitive image and demarcated with a bounding box. The most complex task is segmentation, which requires classifying a given class’s pixels. Different complexities result in a greater or lesser availability of datasets for each type of task.
Correctly detecting and classifying concrete pathologies using images is directly related to the quality of the images used. Poor quality results in pathologies being challenging to identify or even being incorrectly identified. Poor quality can be due to noise, which can be processed and eliminated in numerous instances.
Care in obtaining images is fundamental, although some conditions are unfavorable, so image analysis and enhancement should be carried out after acquisition. Illumination correction, sharpening, noise removal, and correction of certain imperfections are some applications that improve the quality of acquired images.
Post-processing improves image quality, but care during image acquisition is equally essential. Lighting conditions, time of acquisition, proper equipment use, environmental conditions, and even the structure to be analyzed are important factors that need to be observed and well planned. Filters are one resource that can improve image quality in many instances. Filters that act on intensity transitions can smooth the image. Enhancement filters increase sharpness, highlight edges and details, and improve contrast. Image intensity adjustment is another way of improving brightness and contrast and making the image sharper and more detailed.
Filters can also be used to remove or reduce specific instances of noise. The image can be tested with specific filters to evaluate the result and improve quality.
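As an illustration of the filters discussed above, the following minimal sketch shows an intensity adjustment, a smoothing filter, and a sharpening filter on a grayscale image stored as a list of pixel rows. The function names and the choice of a mean filter and unsharp masking are assumptions made here for illustration; they are not the specific filters used in this work, and in practice an image-processing library would normally be used.

```python
def stretch_contrast(img):
    """Intensity adjustment: rescale a 2D grayscale image linearly to [0, 1]."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in img]
    return [[(v - lo) / (hi - lo) for v in row] for row in img]


def mean_filter(img, radius=1):
    """Smoothing: average over a (2r+1)x(2r+1) window, which attenuates noise.

    Pixels outside the image are replaced by the nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    window = (2 * radius + 1) ** 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / window)
        out.append(row)
    return out


def unsharp_mask(img, radius=1, amount=1.0):
    """Sharpening: add back the detail removed by smoothing (image minus blur)."""
    blurred = mean_filter(img, radius)
    return [[v + amount * (v - b) for v, b in zip(row, brow)]
            for row, brow in zip(img, blurred)]
```

Each filter can be tried on a sample image and its result inspected before deciding whether it should be part of the acquisition pipeline.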
Correcting the distribution of colors in the image is another technique that can improve image quality. This technique makes it possible to create better-quality images with more realistic and vibrant colors.
It is important to remember that each image produced is unique, as they all depend on many variables, such as how the image was produced, the equipment used, environmental conditions, and the moment it was taken. For this reason, each case must be carefully analyzed so that the most appropriate techniques can be applied.

2.2. Pathology Detection

Concrete pathologies can be identified using acquired images of the structure to be analyzed, concerning pathologies that can be visually identified on the concrete surface. This article proposes using a neural network architecture for this detection, which is a combination of computer vision and artificial intelligence that allows good performance. Identification results in the demarcation of the area with pathology via bounding boxes.
Detection and classification tasks are correlated, as the possible pathology identified with detection is classified. The training of the deep-learning network to perform detection is carried out with a database of images and the appropriate annotations of existing pathologies, from which the learning weights of the neural network are determined. Once training is complete, the network can identify pathologies in a new set of images different from those used in training and is able to classify and locate existing pathologies that can then be identified. In the case of a set of images already cropped for a specific pathology, it is interesting to carry out direct classification with a dedicated neural network trained for this function.

2.3. Pathology Classification

The proposed method can classify the specific pathology after detection. Still, it is essential to highlight the fact that the direct classification option is equally possible for cases where the image contains only one pathology. Another possibility for using classification is to associate its use with detection, as described below.
The task of detection, locating an artifact in a more general image, is more complex than simple classification. Classification is usually less complex as it simply consists of classifying the entire image to be processed. Detection, however, has this same activity but on a section of the image that needs to be located, thus requiring more processing. Classification and detection annotations are also different.
The classification should be less complex as it is enough to inform the classification class to which an image belongs; this is usually done by simply separating the images into different folders. In the case of detection, it is necessary to locate the artifact to be classified within the image to produce the marking, and the number of artifacts in a single image can be numerous. The marking of artifacts in the annotation for detection can be subjective, so a qualified professional must carry it out. It can often be difficult, depending on how the artifact appears in the image. These latter difficulties can result in errors or inaccuracies in the annotations.
The greater difficulty in annotations for detection makes this task more time-consuming and expensive, which can lead to fewer datasets being available. The lack of annotated images can also result in lower-quality datasets, affecting the training quality for learning a neural network and the detection capacity of the trained network.
Detecting artifacts in images generates parts identified by a bounding box, which can be cropped to generate new images that, once detected, can be classified. This classification will be as precise as the accuracy of the neural network trained and used.
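The cropping step can be sketched as follows. This is a minimal illustration, assuming the image is a list of pixel rows and each bounding box is given as (x_min, y_min, x_max, y_max) in pixels with exclusive maxima; the function name `crop_detections` is hypothetical.

```python
def crop_detections(image, boxes):
    """Return one cropped sub-image per bounding box.

    Boxes are clipped to the image limits; boxes that end up empty are skipped.
    """
    height, width = len(image), len(image[0])
    crops = []
    for x_min, y_min, x_max, y_max in boxes:
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 <= x0 or y1 <= y0:
            continue  # degenerate or fully out-of-image box
        crops.append([row[x0:x1] for row in image[y0:y1]])
    return crops
```

Each crop can then be passed to a classifier exactly as if it were an independently acquired image.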
As detection is not 100% accurate, some of the resulting crops are incorrectly classified. These errors can be identified with a new classification from a neural network that is better trained for this task. This is possible, for example, if the classification network is trained with a more extensive dataset. As annotation for classification is easier than for detection, it is not uncommon for the classification dataset to contain much more data.
If the second dataset contains significantly more data than the one used for detection, using a specific classification dataset to verify the detection results can reduce the number of false positives and consequently improve the neural network’s metrics.
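The verification of detection results by a dedicated classifier can be sketched as below. This is only an outline under stated assumptions: `classify` stands for any trained classification network wrapped as a function that returns a (label, confidence) pair for a crop, and the label name "crack" and the 0.5 confidence threshold are illustrative values, not parameters reported in this work.

```python
def filter_detections(crops, classify, positive_label="crack", min_confidence=0.5):
    """Split detection crops into confirmed detections and likely false positives.

    crops:    list of cropped detection regions
    classify: callable returning (label, confidence) for one crop (assumed interface)
    Returns two lists of crop indices: (kept, rejected).
    """
    kept, rejected = [], []
    for index, crop in enumerate(crops):
        label, confidence = classify(crop)
        if label == positive_label and confidence >= min_confidence:
            kept.append(index)      # the classifier agrees with the detector
        else:
            rejected.append(index)  # candidate false positive
    return kept, rejected
```

Rejected crops need not be discarded automatically; they can also be flagged for review by a specialist.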

2.4. Improving Accuracy

This article also presents actions that can be taken to improve the accuracy of the neural networks used to detect pathologies in concrete. This improvement presupposes that previous work has been carried out and evaluated with suitable metrics that indicate the model’s efficiency.
Technology is constantly advancing. Methods are revised, and new ones emerge. Evolution allows for more precise results, improved performance, greater possibilities, and diversity, among many other benefits. It is essential to keep abreast of the evolution of the methodologies adopted in research, as this way it is possible to acquire the better performance expected. Research must, therefore, include a constant literature review so that the results are based on the best available knowledge.
An important point for constant updating is that a literature review can avoid unproductive research. Unproductive research consists of producing something that has already been created in similar work. However, achieving results that have already been presented can be a good use of research time if the focus is precisely on proving or questioning previous research.
Another fundamental point made possible by reviewing other work is the continuity of evolution. Starting from the point where a result has already been produced, it is possible to evolve what already exists. This collaboration is the essence of the continuous study of science: evolution and the advancement of knowledge.
The indications proposed for improving the metrics can be summarized in two basic points: assess the state-of-the-art of the neural network technology used; review the dataset annotations used to train the neural network.
The dataset is fundamental for training an artificial intelligence neural network; the increased use of artificial intelligence results in a greater availability of datasets for training. It is important to analyze the origin of a given dataset, as well as its permitted use, before making use of this resource. The use of the dataset will be possible according to the annotation carried out on this database. The annotation describes what the stored data or part of the data represents. It then allows for potential uses, such as classification, detection, or segmentation, in the case of computer vision.
The quality of the annotation is directly related to the quality of the neural network training. The neural network will learn according to the annotations made, so poor annotations result in learning that leads to wrong or inaccurate inferences. In practical terms, if a piece of data represents a semantic result, the annotation must correspond to the classification and, especially in the case of detection and segmentation, the location of the data must be as precise as possible.

3. Experiments and Results

Certain experiments can be conducted to check whether the proposed methodology is valid. This article also takes advantage of a previous experiment by the same authors, according to article [4], to compare technology improvements and revise the dataset. Furthermore, it reinforces the proposal to reduce false positives using two datasets, this time using the same technology.
Two main actions are applied in the experiment: The first consists of updating the technology used, from version 4 of YOLO, initially used in the first article [4], to version 8, which consists of the evolution of the previous technology. The second consists of reviewing the dataset used, CODEBRIM [13], in the first article to identify possible inconsistencies and correct them to allow the neural network to learn better. The classification dataset was also trained with YOLO version 8 instead of the customized network used in the first article. Regarding this last point, the idea of the first article is maintained, making it possible to use a specific dataset containing more images to identify false positives from detection.

3.1. YOLO—You Only Look Once

One of the capabilities of computer vision is object detection, which consists of locating a particular object, a region of interest, in a larger image. This task is more advanced and more comprehensive than classification because it also involves classification, adding the localization of the classified object in the image. YOLO, short for “You Only Look Once”, is a neural network architecture suited to real-time detection because of its speed and accuracy. The architecture was introduced in 2015 [14]. YOLO has evolved through several versions and was on version 8 at the time of writing.
YOLO proposes an end-to-end network that predicts bounding boxes and their class probabilities simultaneously, differing from previous algorithms that repurposed classifiers for detection. The algorithm uses a single, fully connected layer for predictions. In contrast, others, such as Faster RCNN, first detect candidate regions of interest using a region proposal network and then recognize these regions separately. YOLO needs a single iteration over the image to propose the regions of interest, while other methods perform several iterations for the same image.
The algorithm uses an image as input and then a simple CNN to detect objects in the image. Figure 2 [14] shows the architecture of the CNN model used by YOLO.
Transfer learning is used to shorten training: the first twenty convolutional layers of the network are pretrained on ImageNet, followed for that purpose by an average-pooling layer and a fully connected layer. The pretrained model is then converted to detection by adding further convolutional and fully connected layers, which improves performance. Finally, a fully connected layer predicts both the class probabilities and the bounding box coordinates.
Initially, the image is divided into an S × S grid. The cell located in the object’s center will detect the object. Each cell will predict the bounding box and confidence scores, which define the model’s confidence that the bounding box contains the object and with what accuracy. YOLO uses the concept of intersection over union (IoU) to define the most representative bounding box.
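The two ideas in this paragraph, the grid cell responsible for an object and the intersection over union, can be sketched as follows. Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples; the function names are hypothetical, and the default S = 7 is only an illustrative grid size.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def responsible_cell(box, image_w, image_h, s=7):
    """Return (row, col) of the S x S grid cell that contains the box center."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    col = min(int(cx / image_w * s), s - 1)  # clamp centers on the far edge
    row = min(int(cy / image_h * s), s - 1)
    return row, col
```

An IoU of 1 means two boxes coincide and an IoU of 0 means they do not overlap, which is why the measure is convenient for comparing a predicted box with an annotated one.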
YOLO uses non-maximum suppression (NMS). This step is applied in post-processing to improve accuracy and efficiency in detecting the target object. Since several boxes can be defined for the same object, they overlap with small changes in their location and size. This step then eliminates incorrect or redundant boxes for each object.
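A common greedy formulation of non-maximum suppression is sketched below for a single class. It is a generic illustration rather than the exact post-processing code of any YOLO version; the function names and the 0.5 IoU threshold are assumptions.

```python
def _iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept, from highest to lowest score.

    A box is kept only if it does not overlap an already kept,
    higher-scoring box by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Lowering the threshold removes more overlapping boxes, at the risk of merging two distinct pathologies that lie close together.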

3.2. Detection of Pathologies in Concrete

The paper referred to as the basis for this current paper presented training for detection using YOLO version 4. This paper presents the same training for detecting pathologies in concrete but uses a more recent version of YOLO, in this case, YOLO version 8.
The same training dataset is used so that the gains from the new version of the neural network architecture can be compared and evaluated.
The training times evaluated are the same for both experiments, making it possible to evaluate the final accuracy and how the training evolved. The two training sessions were measured up to the same epoch, but more epochs can be trained to improve the final accuracy.
Learning is directly related to the quality of the dataset annotation. An annotation with errors results in poor learning. The neural network’s ability to detect a given object will depend on what it has been taught; if we teach it the wrong object, the neural network will reflect this. For example, if we present the pathology of corrosion instead of cracking, corrosion will be detected instead of cracking.
The detection experiment carried out with the CODEBRIM [13] database enables the identification of specific pathologies in more comprehensive images. The image was not cropped specifically for the pathology to be classified, as is necessary in the case of classification.
The dataset [13] has 1052 images. The set of images presents defects for the following classes: 2507 cracks, 1898 spallation, 833 efflorescence, 1507 exposed bars, and 1559 corrosion stains [16]. Figure 3 shows the cases of fissures or cracks in A, exposed steel bars in B, spallation in C, efflorescence in D, and corrosion stains in E.
A review of the CODEBRIM dataset [13] was conducted to detect possible inconsistencies in the annotations for the concrete crack pathology. Some annotations were completely wrong in the positioning of the bounding box; these cases were possibly due to the rotation of the image after annotation. Another improvement was the annotation of crack pathologies that were present in the images but had not been annotated. Figure 4 shows a case of an incorrect annotation due to the image’s rotation: the marked points that define the bounding boxes do not lie on the pathologies.
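To illustrate why rotating an image after annotation misplaces the boxes, the sketch below maps a box into the frame of the same image rotated 90 degrees clockwise. The 90 degree clockwise case and the function name are assumptions chosen for illustration; the rotations actually present in the dataset are not specified here. The origin is taken at the top-left corner with y pointing down.

```python
def rotate_box_90_cw(box, image_w, image_h):
    """Map (x_min, y_min, x_max, y_max) into the image rotated 90 degrees clockwise.

    A point (x, y) moves to (image_h - y, x); the rotated image is image_h
    pixels wide and image_w pixels tall. image_w is kept in the signature
    only to document the original image size.
    """
    x_min, y_min, x_max, y_max = box
    return (image_h - y_max, x_min, image_h - y_min, x_max)
```

If the annotation file keeps the original coordinates while the image is stored rotated, the box no longer covers the pathology, and it may even fall partly outside the image.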
The review of concrete crack annotations was performed in an environment created in Anaconda. The resource used was LabelImg, which was installed in the created environment via a prompt command. LabelImg allows one to view the annotations created, correct existing annotations, and even create new ones. The most incorrect annotations were deleted, and new ones were created. Annotations with minor inaccuracies were adjusted. New annotations were created where there was pathology without an annotation.
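LabelImg can store annotations as corner coordinates in pixels (Pascal VOC) or as the normalized center, width, and height commonly used to train YOLO. The conversion between the two is sketched below; which format was used in this review is not stated, so the example is purely illustrative, and the function name and class index 0 are assumptions.

```python
def voc_to_yolo(box, image_w, image_h, class_id=0):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    The result is "class x_center y_center width height", with all four
    values normalized by the image size so that they lie in [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_w
    y_center = (y_min + y_max) / 2 / image_h
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

For example, a box from (100, 50) to (300, 150) in a 400 x 200 image is centered in the image and covers half of each dimension.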
Additional training was carried out with the revised base so that the results could be compared with the original base to measure the gain from the revised annotation. This training with the revised base was performed using YOLO version 8 since the training with the original base was also performed using this version.
The use of the same version of YOLO under the same conditions, except for the fact that the original base and the revised base are used, allows us to understand the impact of the dataset revision. It is important to note that the number of wrong annotations was small in relation to the total universe of images, so the second training session was not exposed to such significant variations in learning quality.
The sequence of results with YOLO for detection was then 0.12 mAP, YOLO version 4, with the original dataset; 0.13 mAP, YOLO version 4, with the revised dataset; and 0.14 mAP, YOLO version 8, with the revised dataset. The results of identifying cracks in the concrete were very consistent but, in some cases, were incorrect. These cases can be solved by using the network dedicated to classification.
Figure 5A shows a case of concrete staining that was identified as concrete cracking. There are cases where the shape of these stains can make identification difficult. Figure 5B shows cases where exposed steel was mistaken for cracks in the concrete. Although exposed steel may be accompanied by cracks, in this case the most important pathology to identify would be exposed steel. Figure 5C shows a joint being mistaken for a crack. This case is somewhat complex, as there are unwanted cracks in joints, but it could also be a case of a specific joint for expansion, for example. A specialist would need to assess this case.
The detection tests with YOLO, version 8, were performed on 103 images and targeted cracks in the concrete, since the dataset was revised for this pathology. The detection generated 140 artifacts, that is, regions with cracks in the concrete. Only eight of the images analyzed showed false positives. Figure 6 shows a successful detection on an image from the test base of the CODEBRIM dataset [13]. Figure 7 shows a detection on an image taken by the authors in a real situation. Both cases show that the trained neural network can identify the pathology in the concrete for which it was trained.

3.3. Classification of Pathologies in Concrete

YOLO already classifies objects as part of detection, but we tested its classification ability in isolation to show that it can be used to improve detection or to work with images focused on just one pathology.
This time, the learning process for classifying images with and without concrete pathologies was carried out using YOLO version 8. The previous paper [4] trained a customized neural network for classification.
The dataset used was the same as in the previous experiment, that of Ozgenel [17]. This dataset has a significant number of images, 40,000, balanced and annotated for cases with and without cracks in the concrete.
The dataset was divided into 70% for training, 20% for validation, and 10% for testing. This division supports adequate training of the neural network, but the ultimate goal is to have the trained network classify the images cropped during detection. This final step on the cropped images allows the false positives resulting from detection to be identified.
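As an illustration of the 70/20/10 division, the sketch below shuffles a list of items and cuts it into three parts. The function name `split_dataset` and the fixed random seed are assumptions made for the example; how the split was actually drawn is not described here.

```python
import random


def split_dataset(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * fractions[0])
    n_val = round(len(items) * fractions[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# 40,000 image indices stand in for the image files.
train, val, test = split_dataset(range(40000))
print(len(train), len(val), len(test))  # 28000 8000 4000
```

For a balanced two-class dataset, applying the same split separately to each class would keep the balance in every subset; the sketch above does not do this.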
Figure 8 shows the loss curve during training. The training converges quickly to low loss values but keeps oscillating around them; with the early stop feature, the lowest value is reached at epoch 306, corresponding to an accuracy of 99.65%.
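The idea behind early stopping on an oscillating loss curve can be sketched as follows. This is a generic patience-based rule written for illustration; the `patience` value and the function name are assumptions, and the sketch is not the stopping criterion implemented inside YOLO version 8.

```python
def best_epoch_with_early_stop(losses, patience):
    """Return (best_epoch, best_loss, stop_epoch) with 1-indexed epochs.

    Training stops once `patience` epochs pass without a new lowest loss.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            return best_epoch, best_loss, epoch
    return best_epoch, best_loss, len(losses)


# Toy loss curve: fast convergence followed by small oscillations.
curve = [1.0, 0.5, 0.3, 0.35, 0.32, 0.31, 0.4]
print(best_epoch_with_early_stop(curve, patience=3))  # (3, 0.3, 6)
```

The weights kept are those from the best epoch, not from the epoch at which training stops.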
Figure 9 shows the Confusion Matrix applied to the test set. For classification, the trained neural network could fully identify the true positive cases, with only part of the negative cases not being adequately identified.
Table 1 shows the accuracy achieved in this article and in the previous article [4]. The current work was carried out with YOLO, version 8, and achieved an accuracy of 99.65%, an improvement on the previous work, which used a customized neural network and achieved 99.43% [4]. The results are also compared with the accuracy reported by other works found in the literature. Recall was 99.925% and the F1-score was 99.787%.
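The metrics quoted above follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts in the usage line are hypothetical and are not the values of Figure 9. The helper `implied_precision` simply solves the F1 formula for precision and is included as a convenience.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive,
    otherwise the divisions below are undefined.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for the precision P."""
    return f1 * recall / (2 * recall - f1)


# Hypothetical counts, for illustration only.
print(classification_metrics(tp=90, fp=10, fn=10, tn=90))
```

With the recall and F1-score reported above, `implied_precision(0.99925, 0.99787)` comes out at roughly 0.9965.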
The trained neural network, YOLO version 8, allows pathologies to be detected. Each identification is expressed as a bounding box drawn around the pathology identified.
It is possible to crop out the pathology identified in the detection in the bounding box. This cropping then generates a dataset that can be used for classification. The cropped images result from detection, so they are annotated with the type of pathology they represent.
The test dataset was used to detect and crop concrete cracks for the classification task. The thickness of the bounding box and the identification label were minimized to obtain a cleaner image for classification.
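The cropping step can be sketched as a conversion from a normalized box to pixel coordinates followed by a slice of the image. The example assumes boxes in the YOLO convention (center x, center y, width, height, all in [0, 1]) and represents the image as a plain list of pixel rows; both choices, and the function names, are assumptions for illustration. Because it slices the raw pixel grid, no drawn box or label appears in the crop, which is a simplification and not necessarily the procedure described above.

```python
def yolo_box_to_pixels(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to clamped (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    x1 = max(0, round((cx - w / 2) * image_width))
    y1 = max(0, round((cy - h / 2) * image_height))
    x2 = min(image_width, round((cx + w / 2) * image_width))
    y2 = min(image_height, round((cy + h / 2) * image_height))
    return x1, y1, x2, y2


def crop(image_rows, pixel_box):
    """Cut the region (x1, y1, x2, y2) out of an image stored as rows."""
    x1, y1, x2, y2 = pixel_box
    return [row[x1:x2] for row in image_rows[y1:y2]]


# Toy 8x4 "image" whose pixels record their own (x, y) position.
image = [[(x, y) for x in range(8)] for y in range(4)]
region = yolo_box_to_pixels((0.5, 0.5, 0.5, 0.5), 8, 4)
patch = crop(image, region)
print(region, len(patch), len(patch[0]))  # (2, 1, 6, 3) 2 4
```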
Detecting cracks in the concrete generated a set of images classified with this pathology. For the reasons already explained, these images contain false positives, images classified as a concrete cracking pathology but which do not have this pathology.
Since another YOLO version 8 neural network has been trained for classification with a dataset containing many images, it can be used to identify false positives. The neural network for classification was then used to reclassify the images generated by detecting concrete crack pathologies (Figure 10). This pathology was used because the dataset with a significant number of images covered it. However, the same procedure can be used for other pathologies if a suitable dataset is available.
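The reclassification step amounts to passing every crop produced by detection through the classifier and keeping only those it confirms. In the sketch below, `classify` stands for any callable that returns the probability that a crop shows a crack; that callable, the 0.5 threshold, and the dictionary-based crops in the usage example are hypothetical stand-ins for the trained network and its inputs.

```python
def filter_false_positives(crops, classify, threshold=0.5):
    """Split detection crops into confirmed cracks and rejected detections.

    classify: callable mapping a crop to a crack probability in [0, 1].
    """
    confirmed, rejected = [], []
    for item in crops:
        if classify(item) >= threshold:
            confirmed.append(item)
        else:
            rejected.append(item)  # treated as a false positive of detection
    return confirmed, rejected


# Stub classifier: each fake crop carries a precomputed score.
crops = [
    {"name": "crack_01", "score": 0.98},
    {"name": "stain_02", "score": 0.10},
    {"name": "joint_03", "score": 0.40},
]
confirmed, rejected = filter_false_positives(crops, lambda c: c["score"])
print([c["name"] for c in confirmed], [c["name"] for c in rejected])
```

Keeping the classifier behind a plain callable means the same filter can be reused for another pathology once a classifier for it has been trained.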

4. Discussion

Experiments were carried out to verify that the proposed methodology is valid. The proposal described in this article builds on a previous experiment and applies the proposed method to assess whether the objective of reducing false positives can be achieved.
Two actions are applied in the experiment to improve the results. The first consists of updating the technology used, from version 4 of YOLO, used in the first experiment, to version 8, an evolution of that technology. The second consists of reviewing the dataset used in the first experiment, CODEBRIM, to identify possible inconsistencies and correct them so that the neural network can learn better. The classification network was also trained with YOLO version 8 instead of the customized network used in the first experiment. On this last point, the idea of the first experiment is maintained: a specific dataset containing more images is used to identify false positives from detection.
The same training database is used so that the gains from the new version of the neural network architecture can be compared and evaluated. Training is assessed over the same number of epochs in both experiments, making it possible to compare the final accuracy and how the training evolved. The two training sessions were measured up to the same epoch, but more epochs can be trained to improve the final accuracy.
A complete dataset review was conducted to detect inconsistencies in the annotations for the concrete cracking pathology. Some annotations were completely wrong in the positioning of the bounding box. These cases were possibly due to the rotation of the image after annotation. Another improvement was annotating cracks that appeared in the images but had not been annotated.
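The effect of rotating an image after annotation can be made concrete. If a box is stored in normalized YOLO form (center x, center y, width, height) and the image is then rotated by 90 degrees clockwise, the box must be transformed as well, otherwise it points at the wrong region. The sketch below shows that transformation; the storage format and the function names are assumptions for illustration, and whether this is exactly what happened in the dataset is only the hypothesis stated above.

```python
def rotate_box_90_clockwise(box):
    """Map a normalized (cx, cy, w, h) box to the image rotated 90 degrees clockwise.

    A point (x, y) moves to (1 - y, x), and width and height swap.
    """
    cx, cy, w, h = box
    return (1.0 - cy, cx, h, w)


def rotate_box(box, quarter_turns):
    """Apply the clockwise quarter-turn rotation the given number of times."""
    for _ in range(quarter_turns % 4):
        box = rotate_box_90_clockwise(box)
    return box


original = (0.25, 0.5, 0.1, 0.4)  # tall, narrow box on the left side
print(rotate_box_90_clockwise(original))  # (0.5, 0.25, 0.4, 0.1)
```

An annotation left as `original` on the rotated image would describe a tall box on the left, while the crack now lies in a wide box near the top, which is the kind of mismatch a review can catch.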
The learning of a neural network is directly related to the quality of the dataset annotation—faulty annotation results in poor learning. The neural network’s ability to detect a particular object will depend on what it has been taught; if we teach the wrong object, the neural network will reflect this. For example, if we present the pathology corrosion instead of cracking, corrosion will be detected instead of cracking. The corrections made, therefore, represent important actions to improve learning.
The task of detection, locating an artifact in a more general image, is more complex than simple classification. Classification is usually less complex, as it simply consists of classifying the entire image to be processed; detection, on the other hand, has this same activity but on a section of the image that needs to be located, thus requiring more processing. Classification and detection annotations are also different.
Annotation for classification is less complex, as it is enough to state the class to which an image belongs. In the case of detection, it is necessary to locate the artifact to be classified within the image to produce the marking, and the number of artifacts in the same image can be large. The marking of artifacts in the annotation for detection can be subjective, so a qualified professional must carry it out. This can be difficult in many cases, depending on how the artifact appears in the image. These latter difficulties can result in errors or inaccuracies in the annotations.
The greater degree of difficulty in annotations for detection makes this task more time-consuming and expensive, which can lead to fewer datasets being available. This lack of annotated images can also result in lower-quality datasets, affecting the training quality for learning a neural network and the detection capacity of the trained network.
As detection is not entirely accurate, its use results in some images being classified incorrectly. False positives can be identified with a new classification from a neural network that is better trained for this task. For example, this is possible if the classification network is trained with a larger dataset or better annotations. Datasets with more annotations for classification are common due to the greater ease of annotation than annotation for detection.
This article evaluated the performance gain of neural networks for identifying concrete pathologies. The authors presented a possible use for neural networks in a previous article, and in this article improvements have been proposed and experimented with to achieve better metrics. Regarding detection, using YOLO in version 4 resulted in 0.12 mAP.
Reviewing the dataset allowed us to identify some inconsistencies, basically incorrect annotations due to the rotated image, which can be seen in Figure 4. In addition to corrections, the annotations were put to better use with additional annotations where possible. This dataset was then trained with the same version of YOLO, version 4, to observe the learning gain. The result was 0.13 mAP, which was only a small gain because the number of inconsistencies was insignificant. However, revising the dataset is fundamental for the neural network to learn correctly.
The last detection experiment on the revised dataset is performed with YOLO, version 8. This experiment shows whether or not we will gain from updating the technology used with the neural network. The result is 0.14 mAP, representing the same gain as the revised dataset. A good literature review is also a fundamental step towards understanding what is new in the research area and adopting the new standard to achieve the best possible performance.
The classification experiment in this article uses the same technology as the detection experiment, which demonstrates that this technology allows both tasks to be carried out efficiently and even in combination. Previously, the accuracy was 99.43% using a customized neural network, which is already excellent. The experiment in this article used YOLO, version 8, for classification and obtained an accuracy of 99.65%, a slight gain. Reclassification by a dedicated classifier allows, as mentioned, the verification of false positives produced by detection. YOLO has shown itself to be an extremely versatile technology, supporting three computer vision tasks: classification, detection, and segmentation.

5. Conclusions

The methodology presented in this article makes it possible to detect and classify pathologies in concrete structures using deep neural networks for computer vision. The proposal produces robust results with deep learning and shows the gains that computer science can bring to civil construction.
A fundamental requirement for using the methodology proposed in this article is the existence of annotated datasets for identifying specific pathologies. The advancement of artificial intelligence has favored the emergence of many databases for this area, as has occurred in several other areas. These datasets enable work similar to that presented in this article. The techniques used in this area have also advanced significantly, resulting in greater precision, agility, and ease of use for identifying pathologies. Studies of this kind encourage new work and the advancement of the resources used for this fundamental task of civil construction.
The classification accuracy was 99.65%, which allows adequate performance for the classification task. The detection task on the test base adequately identified pathologies in images of concrete structures. This performance reinforces the idea that using neural networks provides an essential gain for analyzing pathologies, carried out as a measure to maintain concrete structures.
There are specific cases in which the identification of pathologies is more difficult, even with a usual approach, and cases in which the pathology is not adequately highlighted in the image used. For example, “corrosion spots” and/or “efflorescence” may overlap or even, due to the quality of the image, be very similar, making it challenging to define a reasonable classification in this case. Another case, as an example, is when a crack is confused with a joint of certain parts of the structure; this is because the crack may have a straight geometric shape or it could be a case in which the junction of two parts suffered an unwanted separation.
Although neural networks can be highly accurate, they cannot guarantee complete freedom from error, and some detections and the consequent classifications are false positives, that is, indications of a pathology where none exists in the concrete. These isolated cases can be addressed with a second neural network dedicated to classification and trained under better conditions, which provides more reliable predictions and increases the chances of eliminating these inconsistencies.
The neural network for detection and the neural network for classification may be trained differently depending on the dataset used. It is easier to annotate for classification, which may explain why classification datasets with more images and better annotations are more readily available. In the practical examples in this article, the dataset used for classification has 40,000 images, while the one used for detection has only 1052 images. As a result, the second network, trained for classification, will have a greater ability to avoid false positives. This is the advantage of using two datasets when they are available and when the second dataset, specific to classification, is of better quality.
Not only is the quantity of images relevant, but so is the quality of the annotations. Errors in the annotations compromise the learning of the neural network. In the case of annotations for detection, annotations are more challenging. They can depend on subjectivity, even if carried out by a qualified professional, although professional skill lends greater credibility to the annotations. This article benefited from a review of the dataset used for detection, albeit for only one of the pathologies, since the work was focused on concrete crack pathology. A review is something that should be part of the stage of preparing the dataset to be used.
Computer vision techniques have increasingly become part of human daily life in the most diverse areas that can be analyzed. The resources in this area are present in people’s daily lives, although they often go unnoticed, so expanding these techniques to different areas results in tangible gains for humanity’s development.

Author Contributions

Conceptualization, J.d.C.N.D., A.C.d.P. and S.C.A.P.d.S.C.; methodology, J.d.C.N.D. and G.B.J.; validation, A.C.d.P., G.B.J. and S.C.A.P.d.S.C.; formal analysis, A.C.d.P.; investigation, G.B.J.; resources, J.d.C.N.D. and G.B.J.; writing—original draft preparation, J.d.C.N.D.; writing—review and editing, G.B.J. and S.C.A.P.d.S.C.; visualization, G.B.J., A.M.T.d.S.C., J.D.S.d.A. and A.C.S.; supervision, A.C.d.P. and S.C.A.P.d.S.C.; project administration, A.C.d.P. and S.C.A.P.d.S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work is financed by the FCT (Portuguese Foundation for Science and Technology) through the project UIDB/04082/2020 (CMADE).

Data Availability Statement

Ozgenel [17] and CODEBRIM [13] datasets have been used.

Acknowledgments

This work was supported by Fundação para a Ciência e Tecnologia, IP (FCT) within the R&D Units Project Scope: UIDB/00319/2020 (ALGORITMI). The authors acknowledge the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brazil—Finance Code 001, Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil, and Fundação de Amparo à Pesquisa Desenvolvimento Científico e Tecnológico do Maranhão (FAPEMA), Brazil.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MDPI: Multidisciplinary Digital Publishing Institute
CNN: Convolutional Neural Network
DB: Database
mAP: mean Average Precision
YOLO: You Only Look Once

References

  1. James, K.W. Reinforced Concrete. In Reinforced Concrete Mechanics and Design; Pearson Education Limited: London, UK, 2016; pp. 22–30.
  2. Taheri, S. A review on five key sensors for monitoring of concrete structures. Constr. Build. Mater. 2019, 204, 492–509.
  3. Safiuddin, M.; Kaish, A.B.M.A.; Woon, C.-O.; Raman, S.N. Early-Age Cracking in Concrete: Causes, Consequences, Remedial Measures, and Recommendations. Appl. Sci. 2018, 8, 1730.
  4. Nogueira Diniz, J.d.C.; de Paiva, A.C.; Junior, G.B.; de Almeida, J.D.S.; Silva, A.C.; Cunha, A.M.T.d.S.; Cunha, S.C.A.P.d.S. A Method for Detecting Pathologies in Concrete Structures Using Deep Neural Networks. Appl. Sci. 2023, 13, 5763.
  5. Gao, Z.; Fu, Z.; Wen, M.; Guo, Y.; Zhang, Y. Physical informed neural network for thermo-hydral analysis of fire-loaded concrete. Eng. Anal. Bound. Elem. 2024, 158, 252–261.
  6. Guzmán-Torres, A.; Domínguez-Mota, F.J.; Martínez-Molina, W.; Naser, M.Z.; Tinoco-Guerrero, G.; Tinoco-Ruíz, J.G. Damage detection on steel-reinforced concrete produced by corrosion via YOLOv3: A detailed guide. Front. Built Environ. 2023, 9, 1144606.
  7. Ojeda, J.M.P.; Cayatopa-Calderón, B.A.; Huatangari, L.Q.; Tineo, J.L.P.; Pino, M.E.M.; Pintado, W.R. Convolutional Neural Network for Predicting Failure Type in Concrete Cylinders During Compression Testing. Civ. Eng. J. 2023, 9, 2105–2119.
  8. Beskopylny, A.N.; Shcherban’, E.M.; Stel’makh, S.A.; Mailyan, L.R.; Meskhi, B.; Razveeva, I.; Kozhakin, A.; Beskopylny, N.; El’shaeva, D.; Artamonov, S. Method for Concrete Structure Analysis by Microscopy of Hardened Cement Paste and Crack Segmentation Using a Convolutional Neural Network. J. Compos. Sci. 2023, 7, 327.
  9. Vijay, K.; Murmu, M. Application of artificial neural networks for prediction of microbial concrete compressive strength. J. Build. Pathol. Rehabil. 2022, 7, 1.
  10. Zhang, E.; Dao, M.; Karniadakis, G.E.; Suresh, S. Analyses of internal structures and defects in materials using physics-informed neural networks. Sci. Adv. 2022, 8, eabk0644.
  11. Jin, H.; Zhang, E.; Espinosa, H.D. Recent advances and applications of machine learning in experimental solid mechanics: A review. Appl. Mech. Rev. 2023, 75, 061001.
  12. Li, W.; Zhang, L.; Wu, C.; Cui, Z.; Niu, C. A new lightweight deep neural network for surface scratch detection. Int. J. Adv. Manuf. Technol. 2022, 123, 1999–2015.
  13. Concrete Defect Bridge Image Dataset. Available online: https://zenodo.org/record/2620293#.YgLkC9_MKMo (accessed on 1 March 2022).
  14. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. arXiv 2016, arXiv:1506.02640.
  15. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
  16. Mundt, M.; Majumder, S.; Murali, S.; Panetsos, P.; Ramesh, V. Meta-learning Convolutional Neural Architectures for Multi-target Concrete Defect Classification with the COncrete DEfect BRidge IMage Dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
  17. Concrete Crack Images for Classification. Available online: https://data.mendeley.com/datasets/5y9wdsg2zt/2 (accessed on 15 January 2022).
  18. Alipour, M.; Harris, D.K. Increasing the robustness of material-specific deep-learning models for crack detection across different materials. Eng. Struct. 2020, 206, 110157.
  19. Bai, Y.; Zha, B.; Sezen, H.; Yilmaz, A. Deep Cascaded Neural Networks for Automatic Detection of Structural Damage and Cracks from Images. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, V-2-2020, 411–417.
  20. Jitendra, M.S.N.V.; Srinivasu, P.N.; Srinivas, S.A.; Nithya, A.; Kandulapati, S.K. Crack Detection on Concrete Images Using Classification Techniques in Machine Learning. J. Crit. Rev. 2020, 7, 1236–1241.
  21. Kim, J.J.; Kim, A.-R.; Lee, S.-W. Artificial Neural Network-Based Automated Crack Detection and Analysis for the Inspection of Concrete Structures. Appl. Sci. 2020, 10, 8105.
  22. Pal, M.; Palevicius, P.; Landauskas, M.; Orinaite, U.; Timofejeva, I.; Ragulskis, M. An Overview of Challenges Associated with Automatic Detection of Concrete Cracks in the Presence of Shadows. Appl. Sci. 2021, 11, 11396.
  23. Orinaite, U.; Palevicius, P.; Pal, M.; Ragulskis, M. A deep learning-based approach for automatic detection of concrete cracks below the waterline. Vibroeng. Procedia 2022, 44, 142–148.
Figure 1. The method proposed in this paper.
Figure 2. YOLO architecture detector [15].
Figure 3. Pathologies in concrete [13]: (A) Crack, (B) Exposed Bar, (C) Spallation, (D) Efflorescence and (E) Corrosion Stain.
Figure 4. Wrong annotation due to image rotation [13].
Figure 5. False positive [13]: (A) Corrosion Stain, (B) Exposed Bar and (C) Joint.
Figure 6. Positive detection [13].
Figure 7. Crack by author.
Figure 8. Loss with YOLO, version 8.
Figure 9. Confusion Matrix with YOLO, version 8.
Figure 10. YOLO version 8 crack classification [13]: (A) Positive for crack and (B) Negative for crack.
Table 1. YOLO classification.
Author | Accuracy (%)
Previous paper, customized CNN [4] | 99.43
This paper, YOLO v8 | 99.65
Alipour [18] | 98.6
Bai [19] | 95.92
Jitendra [20] | 98.6
Kim [21] | 99.98
Pal [22] | 98
Orinaite [23] | 99
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Diniz, J.d.C.N.; de Paiva, A.C.; Junior, G.B.; de Almeida, J.D.S.; Silva, A.C.; Cunha, A.M.T.d.S.; Cunha, S.C.A.P.d.S. A One-Step Methodology for Identifying Concrete Pathologies Using Neural Networks—Using YOLO v8 and Dataset Review. Appl. Sci. 2024, 14, 4332. https://doi.org/10.3390/app14104332
