Article

Investigating and Identifying the Surface Damage of Traditional Ancient Town Residence Roofs in Western Zhejiang Based on YOLOv8 Technology

by Shuai Yang 1, Yile Chen 2,3,*, Liang Zheng 2,3,*, Junming Chen 2, Yuhao Huang 4, Yue Huang 5, Ning Wang 1 and Yuxuan Hu 1
1 Center for Liangzhu Civilization Studies and Center for Cultural Heritage Studies, Hangzhou City University, No. 51 Huzhou Street, Gongshu District, Hangzhou 310015, China
2 Faculty of Humanities and Arts, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
3 Heritage Conservation Laboratory, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
4 Faculty of Innovation and Design, City University of Macau, Avenida Padre Tomás Pereira, Taipa, Macau 999078, China
5 Hangzhou Animation & Game College, Hangzhou Vocational & Technical College, No. 68 Xueyuan Street, Xiasha Higher Education Park, Qiantang District, Hangzhou 314423, China
* Authors to whom correspondence should be addressed.
Coatings 2025, 15(2), 205; https://doi.org/10.3390/coatings15020205
Submission received: 30 December 2024 / Revised: 5 February 2025 / Accepted: 5 February 2025 / Published: 8 February 2025

Abstract
The environment continues to erode the roofs of ancient buildings in Longmen Ancient Town, posing a threat to the safety of villagers. Scientific detection and diagnosis are important steps in the repair and protection of historical buildings. To protect this cultural heritage effectively, this study uses the YOLOv8 deep learning model to automatically detect damage in images of traditional residential roofs. The researchers constructed image data sets for four categories (green vegetation, dry vegetation, missing tiles, and repaired tiles) and then performed model training. The results show that the model is generally accurate for the tile categories (0.94 for missing tiles and 0.93 for repaired tiles), with a low false detection rate and a low missed detection rate. It makes some mistakes on green and dry vegetation in complex backgrounds, but the overall detection coverage and F1 score remain good. The practical application shows that the model can accurately mark most target areas, especially for high-contrast damage types. This study provides efficient and accurate technical support for the diagnosis of traditional roof structures and the protection of cultural heritage.

1. Introduction

Longmen Ancient Town in China attracts many tourists with its distinctive architectural style and rich historical and cultural background. It is located at the foot of Longmen Mountain in Fuyang District, Hangzhou City, Zhejiang Province. The town covers about 18 km², and the core area of the ancient town covers about 2 km². It is a relatively well-preserved mountain town among the ancient buildings of the Ming and Qing Dynasties in the Jiangnan region [1,2]. However, as time passes, numerous roofs have sustained damage, which not only diminishes the ancient town’s beauty but also presents a significant challenge to the preservation of cultural heritage [3,4]. Identifying this damage efficiently and accurately and taking corresponding protective measures has therefore become an urgent problem. In recent years, with the rapid development of artificial intelligence, object detection technology has made remarkable achievements in the field of image recognition, especially in rural agricultural monitoring and heritage damage monitoring [5,6]. The primary objective of this study is to employ machine learning technology to assess the damage to the roofs of ancient buildings in Longmen Ancient Town, Hangzhou. Traditional roofs are an important part of the town’s architectural heritage and carry profound historical and cultural value [7,8]. However, as materials age and environmental pressure increases, roof damage is becoming increasingly serious, directly threatening the structural safety and historical authenticity of the buildings. Traditional manual inspection methods are inefficient, susceptible to subjective influences, and struggle to meet the requirements of heritage protection. Longmen Ancient Town is a typical representative of traditional residential buildings in Zhejiang Province [9,10].
Its roof form reflects the common characteristics of architecture in the Jiangnan region, including the use of materials (terracotta tiles, wooden beams), structural forms (double-slope roofs), and decorative elements (beam and column carvings). The damage problems faced by the roofs (such as tile slippage, vegetation coverage, and material weathering) are widely present in traditional buildings in the Yangtze River Delta and other southern humid climate regions. Therefore, the research results of Longmen Ancient Town are not only applicable to local protection practices but can also provide a reference for the protection of traditional roofs in Zhejiang and other similar heritage areas.
Additionally, we aim to investigate methods for identifying and protecting these roofs using the YOLO model, thereby offering a novel approach to the preservation of ancient architectural heritage. In practice, we first took aerial photos of the roofs of Longmen Ancient Town to obtain a large number of roof pictures and positioning data. Then, we used professional annotation tools to annotate these images and mark the location and type of roof damage. Next, we input the annotated data set into the YOLOv8 model for training. By iteratively optimizing the model parameters, we obtained a model that can accurately identify damage to the roofs of ancient buildings. In terms of future protection measures, corresponding repair plans need to be formulated for the different types of roof damage, and the roof tiles of the town’s ancient buildings should be protected and managed digitally, with digital restoration as the direction, to improve sustainability. This practice not only reflects the combination of technology and culture but also demonstrates the wide application and great potential of AI technology in the protection of traditional ancient towns.

2. Literature Review

Longmen Ancient Town covers a number of natural villages. It boasts a majestic and simple appearance, a solemn shape, and is primarily designed for practicality [11,12]. It has the representative characteristics of Chinese Jiangnan residential architecture. Scholars in China have conducted extensive research on rural areas in the past, primarily focusing on the economic issues associated with industrial crops in these areas [13,14,15,16]. Particularly with the implementation of the rural revitalization strategy, this type of research has yielded significant results. However, the special protection of residential heritage in rural areas is still worthy of attention. As an important part of the building, the roof of the ancient building not only carries the practical function of sheltering from wind and rain but also contains profound historical and cultural value [17,18]. However, over time, factors like natural environmental erosion and human destruction have damaged the tiles on the roofs of many ancient buildings [19]. This phenomenon has attracted widespread attention in the field of protecting ancient architectural sites. Many scholars have devoted themselves to related research and tried to find effective solutions [20].
Some scholars have noted that damaged roof tiles are one of the common problems of ancient building roofs. By analyzing damaged tiles, they identified a variety of causes, including natural aging, wind and rain erosion, and the manufacturing process [21]. Others have examined the issue of damaged roofs in ancient buildings from a broader perspective, drawing attention to the aging of internal facilities and the safety hazards of electrical lines, which they argue also deserve attention [22].
It is worth mentioning that the case study method has played an important role in the study of damaged roofs on ancient buildings. Some scholars have selected representative ancient buildings as research objects, systematically collected data and information, and conducted in-depth research. Through case analysis, they have revealed the conditions and problems associated with damaged roofs of ancient buildings in their actual living environments, thereby providing a valuable reference for future research and practice [23]. With the rise of AI technology, artificial intelligence has penetrated the study of architectural cultural relics. Experts in the field have proposed a variety of application solutions and possibilities, which not only broadened the vision of ancient building protection but also provided strong support for the formulation of relevant policies [24,25]. In the study of ancient buildings, some scholars have proposed a classification method based on semi-supervised machine learning of architectural image data. They use graphic recognition technology and machine clustering schemes as references to establish a structural classification model that corresponds to Chinese vernacular architecture, encompassing 9 main categories and 23 subcategories [26]. At the application level, some scholars have proposed that the application of computer science and technology—using artificial intelligence (AI), deep learning (DL), and computer vision digital image data—can help monitor and preserve cultural heritage sites. For instance, the identification of weathering, mortar removal, joint damage, discoloration, erosion, surface cracks, vegetation, seepage, damage, and other defects on the building’s surface is crucial [25]. 
In addition, some scholars have used convolutional neural networks (CNNs) modeling technology to conduct four types of detection on masonry historical buildings in the city of “Al-Salt” in Jordan: erosion, material loss, color change of the stone, and sabotage issues [26]. The problem of damage detection of wooden heritage and glazed tiles in architectural heritage has also attracted the attention of scholars [27,28]. Damage to the roofs of ancient buildings is a complex problem that involves multiple factors [20]. This is also related to the diversity of architectural heritage. General deep learning models have difficulty meeting current needs. Through unremitting efforts and in-depth research, many scholars have provided rich theoretical knowledge and practical experience in understanding and solving this problem. The relevant research results not only help protect and repair the roofs of ancient buildings but also make positive contributions to the inheritance and promotion of excellent traditional culture.

3. Materials and Research Process

3.1. Study Area: Longmen Ancient Town in Zhejiang

Longmen Ancient Town is located in the southern part of the Yangtze River Delta in China, in Fuyang District, Hangzhou City, Zhejiang Province [29]. It is a historical settlement building complex spanning over 1500 years from the Three Kingdoms period to the present, as depicted in Figure 1a,b. It is an ancient pastoral town and a typical landscape that embodies the traditional Chinese clan inheritance system. As a well-preserved historical and cultural site, Longmen Ancient Town shows the typical characteristics of traditional residential buildings in the Jiangnan region. The roofs of its buildings primarily consist of wooden structures and terracotta tiles, reflecting not only regional construction techniques but also a close connection with, and adaptability to, the natural environment. Longmen Ancient Town is divided into Longmen No. 1 Village, Longmen No. 3 Village, Longmen No. 5 Village, and Longmen No. 7 Village. It has been rated as “China’s Historical and Cultural Town”, “China’s Historical and Cultural Village”, “Two Rivers and One Lake’s National Scenic Spot”, and “Zhejiang Provincial Historical and Cultural Reserve”. This ancient town serves as an ideal research site for studying damage to traditional residential roofs using image recognition technology.
Currently, Longmen Ancient Town boasts a complete collection of traditional building roof samples. The ancient town concentrates a variety of traditional roof structures, typical features of which include moderately sloped terracotta roofs, delicately carved wooden beams and columns, and cantilevered eaves. Environmental factors such as high humidity, frequent rainfall, and biological erosion have long affected these roofs, making them vulnerable to damage such as tile dislocation, surface cracking, and moss growth. The study area (Figure 1c) is mainly the core historical protection area of the old town (Figure 2). There are a lot of different types of traditional roofs in this area, which makes it a good place to find and study damage to roof surfaces using machine learning technology.

3.2. Image Collection Source: Selection of Buildings for Field Investigations

Due to the complex terrain and densely packed buildings of Longmen Ancient Town, the researchers chose a consumer-grade drone with stable performance and flexible control, the DJI Mini 4 Pro. The drone has a high-definition camera and an advanced gimbal system, enabling stable shooting during flight. Its 1/1.3-inch CMOS sensor can capture photos of up to 48 megapixels, and the drone can transmit full HD images over distances of up to 20 km.
In terms of parameter settings, we referred to the suggestions of many aerial photography experts and adjusted them in combination with the default settings of the drone. We set the ISO value of the camera to automatic mode to achieve the optimal exposure under various lighting conditions, and we adjusted the aperture and shutter dynamically according to the day’s weather conditions. When choosing the shooting time, we considered factors such as light, wind speed, and pedestrian traffic. We ultimately decided to shoot on a non-working day, 13 June 2024, at 2 P.M., maintaining a flight altitude of about 15 m and keeping the drone as close as possible to a distance of 3–5 m from the roof. The pre-set aerial photography route allowed the drone to capture the roofs of all buildings in the core area of the ancient town. During the shooting, the drone’s high-definition camera and real-time image transmission system allowed us to clearly capture details such as roof tile damage, drainage system blockage, and roof keel corrosion. We continuously adjusted the shooting settings and refined the route. Approximately 800 pictures were taken during the entire aerial photography process, which took 3.5 h and required 4 batteries.

3.3. Research Methods and Process

This study aims to use YOLOv8 object detection technology to automatically identify and classify images of roof damage of traditional residential buildings in Longmen Ancient Town, Hangzhou, Zhejiang Province, and to propose an efficient and accurate solution for roof damage analysis. As shown in Figure 3, the research process is divided into six steps.
(1) Data collection: In order to obtain comprehensive roof image data, this study used a DJI Mini 4 Pro drone and a high-precision 1/1.3-inch CMOS sensor to take aerial photos of the building roofs in the core protection area of Longmen Ancient Town, collecting a total of about 800 high-definition images. The collected images showcase various types of roof damage, including tile slippage, plant coverage, and missing tiles (refer to Section 3.2 above).
(2) Data processing: After a series of optimization processes, the research team used the collected original images for subsequent model training. The research team first accurately cropped all images to eliminate non-building roof content that may interfere with model training and retain only the complete building roof area. After cropping, the images were uniformly resized, and the resolution was adjusted to 512 × 512 pixels to strictly match the input standard of the YOLOv8 model, thereby ensuring that the model can effectively extract image features. In terms of image quality optimization, the research team used professional image processing software such as Photoshop to perform noise reduction, color correction, and perspective adjustment operations. Through these optimization measures, each image can more clearly present the detailed features of roof damage, which helps improve the recognition ability and classification accuracy of the model. Furthermore, to enhance the model’s training efficiency and generalization capabilities, we conducted a preliminary screening, classified the damaged areas, and extracted representative sample data. Among them, the dry vegetation category contains 43 images, the green vegetation category contains 19 images, the missing tile category contains 49 images, and the repaired tile category contains 69 images. We used a total of 180 high-quality sample images for model training, which provided reliable basic data support for subsequent model construction and optimization.
(3) Data labeling: The research team used the professional labeling tool LabelImg to accurately label the pre-processed images and clearly record the types and spatial distribution characteristics of roof damage. The labeling process mainly targeted four typical damage types, including dry vegetation, green vegetation, missing tile, and repaired tile. We labeled 180 damage samples through strict classification and positioning operations, clearly defining the specific scope and characteristics of each damage in the form of a bounding box. To ensure the generalization ability of the model and the reasonable distribution of the data set, the team randomly allocated the labeled samples into training sets, validation sets, and test sets. The team used 145 samples for model training, 17 samples for validation, and 18 samples for testing. Simultaneously, we combined the training set and validation set, totaling 162 samples, for cross-validation to further verify the model’s performance.
(4) Model training: We input the labeled data set into the YOLOv8 model for training. YOLOv8 uses the C2f feature fusion module (a Cross Stage Partial bottleneck with two convolutions) and combines the neck and head modules to accurately identify the damage type. During the training phase, the research team optimized the training settings through multiple rounds of iterations, including learning rate, batch size, and anchor box design, to maximize the accuracy of the model. During training, the model uses a convolutional neural network to extract multi-scale features from roof images and uses classification heads (Head0, Head1, Head2) to classify the different damage types.
(5) Model testing: After training, the model performance is comprehensively evaluated using the test set. The evaluation indicators are mean average precision (mAP), F1 score, recall, precision, and log-average miss rate (LAMR). Among them, mAP is a key indicator for evaluating the overall performance of an object detection model. It measures the detection accuracy of the model at different thresholds and reflects its ability to identify multiple categories of targets. A higher mAP value indicates that the model performs well in target localization and classification. The F1 score is the harmonic mean of precision and recall, which balances the model’s correct prediction ability against its target coverage ability. In particular, when the data categories are unbalanced, the F1 score can provide a fairer performance evaluation. Recall indicates the model’s ability to detect all targets, especially those that are difficult to detect. A higher recall indicates that the model captures the different types of damage more comprehensively. Precision measures the credibility of the model’s predictions, that is, whether the model can accurately distinguish between target and non-target areas. A higher precision indicates that the model has a strong ability to control false detections in practical applications. LAMR is used to evaluate the robustness of the object detection model, especially its performance at the low recall stage. It reflects the model’s ability to reduce undetected targets by analyzing the model’s missed detections at different detection thresholds. A lower LAMR value indicates that the model missed fewer targets and could capture them more comprehensively.
(6) Model application: The research team applied the results of the model test to the protection practices of Longmen Ancient Town. The model’s damage distribution map clarifies the spatial distribution of the various roof problems, offering a scientific foundation for the repair and digital management of ancient buildings. For instance, we recommend partial tile replacement as the repair measure for the identified missing tile areas and propose a protection plan of regular cleaning and surface reinforcement for the vegetation areas. Future research aims to integrate the detection data into a digital protection platform, enabling dynamic monitoring and intelligent management of the roofs of ancient buildings.
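The class counts in step (2) and the random allocation in step (3) can be checked with a short sketch. The Python snippet below is only an illustration under stated assumptions: the file names, the `split_dataset` helper, and the seed are hypothetical and do not come from the study, and a plain random shuffle is assumed because the text only states that the samples were allocated randomly.

```python
import random

# Per-class image counts as reported in step (2).
CLASS_COUNTS = {
    "dry_vegetation": 43,
    "green_vegetation": 19,
    "missing_tile": 49,
    "repaired_tile": 69,
}


def split_dataset(items, n_train, n_val, n_test, seed=0):
    """Shuffle items and cut them into train/val/test lists of the given sizes."""
    if n_train + n_val + n_test != len(items):
        raise ValueError("split sizes must sum to the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 180 labeled images.
images = [
    f"{name}_{i:03d}.jpg"
    for name, count in CLASS_COUNTS.items()
    for i in range(count)
]

# Split sizes as reported in step (3): 145 / 17 / 18.
train, val, test = split_dataset(images, 145, 17, 18, seed=42)

# Training and validation samples pooled for cross-validation (162 in total).
cross_val_pool = train + val
```

A purely random split does not guarantee that the smallest class (green vegetation, 19 images) is represented in every subset; a stratified split would be one way to enforce that, although the source does not say whether one was used.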
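The evaluation indicators in step (5) reduce to a few short formulas. The sketch below, in plain Python, shows intersection over union (IoU) for two boxes, precision, recall, and F1 from detection counts, and an all-point interpolated average precision (AP) for one class. The function names are hypothetical, and the interpolation scheme is a common convention rather than a setting reported by the authors. mAP is taken here as the mean of the per-class AP values, and LAMR is not covered.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 (their harmonic mean) from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def average_precision(scores, is_tp, n_gt):
    """All-point interpolated AP for one class.

    scores: confidence of each detection
    is_tp:  whether each detection matched a ground-truth box
    n_gt:   number of ground-truth boxes of this class
    """
    if n_gt == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_gt)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

For example, 8 true positives, 2 false positives, and 2 false negatives give precision, recall, and F1 of 0.8 each, which illustrates why F1 only stays high when neither false detections nor missed detections dominate.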

4. Field Survey Results: Traditional Houses in Longmen Ancient Town

4.1. Analysis of Hangzhou’s Climate Characteristics

Fuyang District, Hangzhou City, Zhejiang Province, where Longmen Ancient Town is located, belongs to the subtropical monsoon climate zone [30,31]. This type of climate brings mild and humid weather conditions, four distinct seasons, sufficient sunlight [32], and abundant rainfall (Figure 4 and Figure 5). As shown in Figure 4, Fuyang District of Hangzhou has a typical subtropical monsoon climate, with high temperatures, high humidity, cloudy and rainy weather in the summer, and weak wind speed in the winter. The solar radiation fluctuates significantly between sunny and rainy conditions. The wind speed is at a medium–low level most of the year, but it will rise briefly and significantly when typhoons or severe convective weather occur. The overall cloud cover level is high, especially in the rainy period in summer; the relative humidity is high throughout the year, with an obvious correspondence to high temperature and humidity in summer and the plum rain season; and the humidity drops slightly in winter but remains at a high level. The dry bulb temperature changes a lot with the seasons, with clear patterns of high temperatures in the summer and low temperatures in the winter. The direct normal radiation will be highest when there is enough sunlight and few clouds, and it will be close to zero at night or when it is cloudy. The horizontal diffuse radiation is more evenly spread than the direct radiation, and it will rise when there are more clouds or rain, which shows that Fuyang District has more clouds all year and that diffuse radiation makes up a big part of it. Figure 5 shows that the wind field in Fuyang District changes strongly with the seasons, with a north wind in the winter and an east wind in the summer. This is in line with how winds usually blow in a subtropical monsoon climate. 
From January to March, the wind is mainly northwest or north, and the wind speed is relatively mild; in April, there are obvious strong wind characteristics (maximum wind speed up to 22 m/s), and the wind is still mainly northwest or west; in May, with the strengthening of the subtropical high pressure and the warm and humid air flow in the south (maximum wind speed up to 25 m/s), the main wind direction turns to the south or southwest; from June to August, under the prevalence of the summer monsoon, the wind direction is mostly south and the overall wind speed is relatively stable and maintains a high humidity and heat feature; and from September to December, the wind direction gradually turns to northwest or north, and the wind speed is relatively mild. However, it is precisely these climate characteristics that have accelerated the damage process to the roofs of ancient buildings to a certain extent. This obvious change between seasons causes the roof materials of ancient buildings to expand and contract continuously in the alternation of cold and heat, which can easily lead to loose tiles, cracking, and even falling off in the long run. Especially during the long-term, continuous plum rain season, which is unique to the Jiangnan region of China, the soaking of rainwater further erodes the roof materials, reducing their durability. The marine climate greatly affects Hangzhou. The salt and moisture brought by the sea breeze have a corrosive effect on building materials, especially on wooden components and stone. Over time, this erosion will cause the roof structure to gradually age and lose its original strength and stability. Natural factors such as high temperatures, heavy rains, four distinct seasons, high humidity, and the influence of the marine climate can cause damage to the roofs of buildings in Longmen Ancient Town.

4.2. Roof Tiles Construction and Feature Analysis

The traditional roof construction techniques and residential characteristics of Longmen Ancient Town complement each other, showing the adaptability of traditional buildings in the Jiangnan region to the natural environment and also reflecting the unique regional culture and artistic aesthetics. The construction technology of its tiles is delicate and scientific, combined with the roof shape and functional design, providing valuable reference value for contemporary architectural heritage protection and regional architectural research.

4.2.1. Construction and Characterization of Roofs

(1) Traditional roof structure system
The traditional roof construction technology of Longmen Ancient Town fully reflects the technical wisdom of Jiangnan architecture. Its core lies in the scientific structural system and practical construction technology. The roof frame is built from durable wood such as fir or pine and joined with traditional mortise and tenon technology, eliminating the need for iron nails. This not only ensures the stability of the structure but also facilitates later disassembly and repair. The purlin is the main load-bearing component. Rafters are laid evenly on the purlins to support the tiles and distribute the roof weight evenly. The slope of the roof is usually between 30° and 45°. This design is conducive to rapid drainage while keeping the slope moderate enough that tiles do not slide off. In addition, the elastic structure of the wooden roof frame can effectively absorb the vibration caused by external impacts (such as wind, rain, or earthquakes), enhancing the seismic performance of the roof. A scientific roof structure design provides a reliable guarantee for the durability and functionality of the roof.
(2) Tile production and the laying process
Traditional roofs primarily consist of tiles, whose production and laying process showcase the craftsmanship of skilled artisans. The roofing tiles are made from local high-viscosity clay, which is carefully screened, hammered, and shaped and then fired in high-temperature kilns to form a durable and waterproof building material. Tiles are divided into two types: bottom tiles (plate tiles) and cover tiles (tube tiles). When laying, the bottom tiles are first laid on the rafters to form a base layer for draining rainwater; the cover tiles are then laid over the bottom tiles, and the curved structure achieves more efficient waterproof performance while enhancing the structural stability of the roof. Starting from the eaves, the craftsmen lay the tiles layer by layer up to the ridge, ensuring a tight overlap and overall flatness between the tiles. During the laying process, the craftsmen need to accurately calculate the overlap length of the tiles to meet the drainage requirements and ensure the aesthetic effect of the roof (Figure 6).
(3) Roof node treatment and decoration technology
The traditional roofs of Longmen Ancient Town show a high degree of integration of functionality and artistry in node treatment. The ridge, composed of ridge tiles and decorative tiles, is an important node of the roof. Its main function is to connect the roof slopes on both sides and to provide drainage and protection. Auspicious patterns such as dragons, phoenixes, and auspicious clouds engraved on the ridge tiles not only enhance the roof’s visual appeal but also carry rich cultural implications. The design of the eaves not only shields the wall from rain erosion but also enhances the dynamic and layered sense of the building. The tile ends, engraved with geometric patterns or auspicious symbols, are an important decorative component of the eaves; they also fix the tiles in place and guide rainwater to the drains. Additionally, drip tiles are installed at the lower edge of the eaves to prevent rainwater from directly eroding the wall and foundation. The fine treatment of each roof node not only improves functionality but also gives the building a strong cultural atmosphere.
(4) Waterproof and ventilation designs
Given the humid and rainy climate characteristics of the Jiangnan region, Longmen Ancient Town has cleverly designed its traditional roofs to ensure waterproofing and ventilation. The overlapping structure of the tiles forms a natural waterproof barrier, which can effectively guide rainwater to the eaves and prevent it from leaking into the house. Simultaneously, some roofs incorporate layers of straw or plaster beneath the tiles to enhance their waterproof performance (Figure 7). The open design of the ridge and eaves provides a natural ventilation channel, effectively removes moisture from the roof, and protects the wooden structure from corrosion. Jiangnan architecture maintains its unique sense of openness and spatial permeability by precisely calculating the length of the eaves to prevent rainwater from splashing into the room.
(5) Cultural and aesthetic characteristics
The traditional roofs of Longmen Ancient Town not only have practical functions but also contain rich cultural connotations and aesthetic values. Roof ridge decorations, tiles, and animal heads are not only decorative embellishments of architectural details but also reflect the auspicious meanings in traditional culture, such as fortune, longevity, and exorcism of evil spirits and disasters. The design of flying eaves and upturned corners shows the agility and lightness of the roof with graceful arcs, becoming an important symbol of traditional Jiangnan architecture. The arrangement of tiles and the line design of the roof form a simple and rhythmic aesthetic effect. Additionally, the principle of “using local materials” and the use of sustainable construction methods for roof materials further demonstrate the respect and adaptability of traditional architecture to the natural environment. These cultural and aesthetic elements make the traditional roofs of Longmen Ancient Town an important symbol of Jiangnan architectural style.

4.2.2. Characteristics of Traditional Residential Buildings

(1) Spatial layout and settlement structure
The overall layout of residential buildings in Longmen Ancient Town is compact, showing the typical characteristics of Jiangnan water town settlements. The river guides the construction of buildings, and water systems weave through the streets and alleys. Most residential buildings adopt the traditional courtyard structure. The design of the enclosed courtyard not only protects the privacy of the residents but also forms good ventilation and lighting conditions. Narrow alleys connect the buildings. The street space is organic and continuous, adding a unique spatial charm to the settlement as a whole. The function-oriented architectural layout, together with the water system, bridges, and streets and alleys, form a traditional living environment that integrates function and culture.
(2) Architectural form and function
Most residential buildings in Longmen Ancient Town are two- or three-story wooden structures. Usually, the bottom floor sits on a brick and stone base to prevent moisture, while the upper floors are built mainly of wood and adobe walls. The double-slope roof, in conjunction with the overhanging eaves, not only drains water efficiently but also makes the building appear lighter. The main room of each house is usually the core space of family life. Guests are received and family activities take place in the spacious and well-lit hall. The wing rooms and utility rooms, distributed on both sides, serve both storage and living purposes. The overall design allocates living and production space rationally, reflecting the emphasis that traditional houses place on function and efficiency.
(3) Material selection and ecological wisdom
The principle of “adapting to local conditions” guides the selection of building materials for houses in Longmen Ancient Town, which primarily use local wood, clay, and stone. Ceramic tiles cover the roof, while the walls are mostly rammed earth or adobe brick, demonstrating the efficient use of environmental resources. This “local material” construction method not only reduces construction costs but also enhances the integration of buildings with the natural environment. In addition, the architectural design focuses on ventilation, moisture-proofing, and lighting. The buildings adapt well to the humid climate south of the Yangtze River through raised foundations, waterproof eaves, and transparent north–south windows.
(4) Overall presentation of local characteristics
The overall style of the residential buildings in Longmen Ancient Town shows the distinctive characteristics of Jiangnan water town architecture. The integration of architecture with the natural environment and water system makes it present a unique regional aesthetic. Through the comprehensive expression of function, material, decoration, and cultural connotation, the residential buildings in Longmen Ancient Town have achieved a harmonious unity between practicality and culture, becoming a model of traditional Jiangnan settlement architecture.

4.3. Analysis of Damage Types and Factors in Roof Tiles

In this survey, the researchers summarized four types of roof tile damage: (1) plant damage with green new growth (plants or microorganisms); (2) plant damage with brown, dried features; (3) improper repair (appearing as an orange dotted distribution); and (4) missing tiles (appearing as black block-like holes). Long-term exposure to the natural environment causes biological degradation. A biofilm composed of bacteria, algae, or other microorganisms may form on the tile surface, and soil may accumulate in individual gaps, allowing green plants to grow. Over time, some of these plants die and dry up. Discoloration manifests as variation in tile color depth. Part of this variation originates in manufacture, since the kiln temperature and the mineral composition of the clay body influence tile color, and part originates in repair, since lost tiles were replaced at different times with tiles of different styles and colors. From the standpoint of preserving the architectural style, such restoration is, to some extent, improper. In addition, roof tiles gradually age and lose their original strength and stability. Longmen Ancient Town has hundreds of years of history, and its tiles dry out or decay under long-term exposure to wind and sun; they may also break due to weathering. Large-scale tile loss also occurs when the timber roof frame collapses because of wood decay, termite damage, and other problems. This, too, appears as block-like loss.

5. Automatically Identifying Model Construction Results

5.1. Model Setup

Based on the YOLOv8 (You Only Look Once, Version 8) object detection model, this study built a detection framework specifically for identifying damage to traditional roofs in Longmen Ancient Town. YOLOv8 builds on the speed and accuracy of the earlier YOLO series by adding more advanced feature fusion strategies and module optimizations, which improve both its detection performance and its generalization ability. The core modules of YOLOv8 are the backbone (backbone network), neck (feature fusion layer), and head (classification and regression layer) (Figure 8). The machine learning environment was set up as follows: the operating system is Windows 11 (x64), the CUDA version is 11.5, the deep learning framework is PyTorch (1.13.0), the graphics card is a GeForce RTX 3070 (16 GB), and the processor is an AMD Ryzen 9 5900HX (3.30 GHz).
(1) The backbone module is the core part of the model and extracts multi-scale features from the image. In this study, the input image size is 512 × 512 × 3 pixels. The backbone generates five feature maps (P1 to P5) in sequence, each corresponding to a different scale and number of channels. Among them, P1 is 256 × 256 × 80 (half the input resolution) and P5 is 16 × 16 × 640 (1/32 of the input resolution). The resolution of these feature maps decreases as the number of channels increases, so that roof damage features are extracted at every level, from fine-grained spatial detail to coarse-grained semantic content. The model uses the C2f module (a Cross Stage Partial bottleneck with two convolutions) for feature extraction. The C2f module reduces redundant computation through partial residual connections, which improves model efficiency while maintaining feature integrity. In addition, through multi-layer convolution (Conv) and pooling (Down Sample) operations, the backbone module captures the spatially and semantically significant features of different damage types (such as missing tiles and plant coverage).
(2) The main task of the neck module is to fuse the features extracted by the backbone module into a more expressive feature representation. The neck module adopts a combination of a Feature Pyramid Network (FPN) and a Path Aggregation Network (PANet), which strengthens its ability to integrate contextual information. In the neck, the model upsamples (Up Sample) and downsamples (Down Sample) feature maps of different scales and further refines the fused features through the C2f module. In this study, the neck module combines the global features of roof damage with local details through multiple feature concatenation operations, for example, the large-area distribution of plant coverage and the detailed texture of tile slippage.
(3) The head module maps the fused features to specific detection results, including target categories (the damage types) and bounding box coordinates. The head adopts a multi-branch output structure in which each branch handles a different feature map scale (such as cv2[0], cv2[1], and cv2[2]). Its key components are convolution operations (Conv) and activation functions (Sigmoid), which further process the input features and generate the class probability distributions and bounding box parameters for each target. The model uses an improved loss function, Complete IoU (CIoU), to optimize the accuracy of the predicted bounding boxes.
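The CIoU term mentioned for the head can be illustrated with a short sketch. The code below is not the authors' implementation; it is a minimal pure-Python version of the standard CIoU definition (IoU minus a normalized center-distance penalty minus an aspect-ratio penalty). The function names `ciou` and `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the small `eps` constant are assumptions made for illustration only.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2). Illustrative only."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of both boxes
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain intersection over union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight
    v = (4.0 / math.pi ** 2) * (math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred_box, target_box):
    """Loss form: 0 for a perfect match, larger for worse predictions."""
    return 1.0 - ciou(pred_box, target_box)


# A predicted box shifted sideways from its target (hypothetical coordinates)
print(ciou_loss((0, 0, 10, 10), (5, 0, 15, 10)))
```

Unlike plain IoU, this loss still changes when two boxes do not overlap at all, because the center-distance penalty remains non-zero. That property is the usual reason given for preferring CIoU in box regression.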

5.2. Model Training Results

Figure 9 illustrates the model training process. The researchers combined hyperparameter configuration with a two-stage (freeze and unfreeze) optimization strategy to improve the model’s detection performance step by step. The training parameters, the changes in loss value, and the convergence behavior are described below.
(1) In terms of parameter setting for model training, the size of the input image is fixed to 512 × 512 to balance the model’s computational efficiency and feature extraction capabilities. The training phase is divided into two parts: in the frozen training phase (0 to 50 epochs), only the classification and regression heads are optimized, the backbone network weights remain fixed, and the batch size is four; in the unfrozen training phase (51 to 300 epochs), the backbone network weights are unfrozen, all layers participate in the optimization, and the batch size is reduced to two. The initial learning rate is set to 0.01, the minimum learning rate is 0.0001, and a cosine annealing learning rate decay strategy (Cosine Decay) is used to ensure stable convergence of the model in the later stages of training. The optimizer is stochastic gradient descent (SGD) with a momentum factor of 0.937 to enhance gradient stability during training. Data loading uses eight worker threads to accelerate training.
(2) In terms of loss value changes, both the training loss and the validation loss declined rapidly and then gradually stabilized (for the specific values over the 300 epochs, please refer to Appendix A). In the frozen training stage, the training loss dropped rapidly from the initial 283.01 to below 20 at the 50th epoch; after entering the unfrozen stage, the loss continued to drop and reached its lowest value of 1.79 at the 285th epoch. The validation loss dropped from the initial 229.13 to 4.87 at the 16th epoch and then stabilized. This trend shows that the model gradually learns the main features of tile damage and generalizes well from the training set to the validation set.
(3) The mean average precision (mAP), an important indicator for evaluating model performance, gradually increased from an initial value of 0 to a maximum value of 0.60 at the 85th epoch and remained stable in subsequent training. The frozen phase mainly established the model’s basic detection ability, while the unfrozen phase improved the backbone network’s feature extraction, which made it easier for the model to recognize different kinds of tile surface damage.
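The schedule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' training script: it assumes a plain cosine curve from 0.01 down to 0.0001 over 300 epochs with no warm-up (the text does not say whether warm-up was used), and the names `cosine_lr` and `stage_config` are hypothetical.

```python
import math

FREEZE_EPOCHS = 50            # backbone frozen for epochs 0-50 (from the text)
TOTAL_EPOCHS = 300            # total training length (from the text)
LR_MAX, LR_MIN = 0.01, 0.0001  # initial and minimum learning rates (from the text)


def cosine_lr(epoch, total=TOTAL_EPOCHS, lr_max=LR_MAX, lr_min=LR_MIN):
    """Cosine-annealed learning rate; assumes no warm-up phase."""
    progress = min(max(epoch / total, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def stage_config(epoch):
    """Return the stage settings described in the text for a given epoch."""
    if epoch <= FREEZE_EPOCHS:
        # Frozen stage: only the detection heads are trained
        return {"backbone_trainable": False, "batch_size": 4}
    # Unfrozen stage: all layers are trained with a smaller batch
    return {"backbone_trainable": True, "batch_size": 2}


for epoch in (0, 50, 51, 150, 300):
    print(epoch, round(cosine_lr(epoch), 5), stage_config(epoch))
```

In a PyTorch setup, the same idea would typically be expressed by toggling `requires_grad` on the backbone parameters at the stage boundary and by passing the learning rate to an SGD optimizer with momentum 0.937.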

5.3. Model Indicator Analysis Results

During model training, the best-performing checkpoints (epochs 16, 85, 285, and 300) were evaluated comprehensively on the test set. The test indicators included mAP, log-average miss rate (LAMR), F1, recall, and precision. Table 1 and Figure 10 present the test results for each damage category.
(1) In the dry vegetation category, the AP value stayed between 0.52 and 0.54 across training rounds, indicating fairly stable performance. The LAMR value barely changed, moving from 0.66 (16th epoch) to 0.67 (300th epoch), so the model’s missed detection rate for dry vegetation remained essentially flat. The F1 score gradually increased from 0.48 in the 16th epoch to 0.58 in the 300th epoch, indicating that the overall performance of the model improved as training progressed. In addition, the recall value increased from 0.35 to 0.5, while the precision value remained at a high level of around 0.7, indicating that the model’s detection coverage of this category of damage has improved.
(2) The AP value for the green vegetation category varied slightly across rounds, peaking at 0.55 in the 85th epoch and falling to 0.53 in the 285th and 300th epochs. Meanwhile, the LAMR value consistently remained high, ranging from 0.8 to 0.83, which points to a pronounced missed detection problem in this category. The F1 score was low at the 16th epoch (0.16) and gradually increased to 0.49 (300th epoch) as training progressed, but the recall value was always low (0.08–0.39) while the precision value was high (0.73–1). In other words, the detections the model does make for green vegetation are mostly correct, but it misses a large share of the green vegetation damage.
(3) The missing tile category performs significantly better than the other categories. It consistently maintains a high AP value (0.9–0.94) and a low LAMR value (0.19–0.2), signifying a very low missed detection rate. The F1 score and recall value both reached high levels (0.84–0.89 and 0.8, respectively), and the precision value reached 1 in multiple epochs, indicating that the model can accurately detect and classify missing tile damage, which is closely related to the visual salience of this damage type.
(4) The AP value of the repaired tile category fluctuates considerably across rounds, increasing from 0.7 in the 16th epoch to 0.93 in the 85th epoch and then decreasing to 0.78 in the 285th and 300th epochs. The LAMR value reaches its lowest point (0.03) in the 85th epoch but rebounds in subsequent training. The F1 score increases from 0.62 (16th epoch) to 0.74 (300th epoch), the recall value remains at 0.81, and the precision value fluctuates, moving from 0.8 to between 0.65 and 0.68. Recall for this category is therefore stable, even though AP and precision vary.
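The relationship between precision, recall, and F1 used in the category analysis above can be checked with a small sketch. F1 is the harmonic mean of precision and recall. The function names and the detection counts in the last example are hypothetical; the precision and recall pairs are taken from the text (the dry vegetation precision is only given as "around 0.7").

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


# Missing tile: precision 1.0 and recall 0.80, as reported in the text
print(round(f1_score(1.0, 0.80), 2))

# Dry vegetation at the 300th epoch: precision of about 0.7 and recall 0.5
print(round(f1_score(0.70, 0.50), 2))

# Hypothetical counts: 8 correct detections, 0 false detections, 2 missed targets
print(precision_recall_f1(tp=8, fp=0, fn=2))
```

With these inputs, the first two calls give 0.89 and 0.58, matching the F1 values reported for missing tile and for dry vegetation at the 300th epoch. The same formula explains why green vegetation has a low F1 despite its high precision: a recall between 0.08 and 0.39 pulls the harmonic mean down sharply.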
The performance trends across various training rounds fall into the following categories:
(1) In the early stage (16th epoch), the model was already good at finding missing tiles and repair traces (AP values of 0.94 and 0.7, respectively) but less successful with the vegetation-related categories (AP values of 0.52 for dry vegetation and 0.54 for green vegetation). This suggests that the model learns the features of high-contrast damage more quickly in the early training stage.
(2) In the middle stage (the 85th epoch), the performance of most categories improved, especially the AP value of the repair trace category, which reached a peak of 0.93, while its LAMR value dropped significantly to 0.03; its F1 score and recall value were also good (0.79 and 0.81). At this stage, the model showed strong learning ability and had significant optimization effects on some damage categories.
(3) In the later period (the 285th and 300th epochs), the F1 scores of the vegetation-related categories continued to rise, although their AP values stayed roughly level (0.52–0.55), while the missing tile category continued to exhibit the best performance. Conversely, the AP value of the repair trace category decreased. This may be due to overfitting or to category distribution deviation during further optimization of the model.
The confusion matrix (Figure 11, Figure 12, Figure 13 and Figure 14) shows the classification results of the model for different categories of roof tile damage and reveals the classification accuracy and misclassification issues.
(1) The confusion matrix of the 16th epoch model (Figure 11) shows that the model performs best in the missing tile category, with a classification accuracy of 90% and only 10% of the samples misclassified as background. However, the model performs poorly in the green and dry vegetation categories, with accuracy rates of 37% and 50%, respectively, and 23% of the samples misclassified as background. Furthermore, the repaired tile category has an accuracy of 69%, with 15% of the samples misclassified as background. The misclassification of the background category is more severe: 50% of the background is misclassified as dry vegetation and 63% as green vegetation. This shows that the model has a weak ability to distinguish some categories (background and vegetation categories), which may be due to feature overlap or sample imbalance.
(2) The confusion matrix of the 85th epoch model (Figure 12) shows that the model performed best in the repaired tile category, with a classification accuracy of 94% and 21% of samples misclassified as background. The performance of the missing tile category declined slightly, to an accuracy of 80%, but it still maintained a high level of detection. In contrast, the performance of the dry vegetation and green vegetation categories improved, with accuracy rates of 64% and 48%, respectively, but the proportions misdetected as background were still high at 25% and 39%, respectively. The background category’s misclassification issue persisted, with 36% of the background misclassified as dry vegetation and 52% as green vegetation.
(3) The confusion matrix of the 285th epoch model (Figure 13) shows that the missing tile category maintains a high accuracy (80%), and the model misclassifies only 8% of the missing tile samples as background. The accuracy of the repaired tile category has gone down to 81%, and the percentage misclassified as background has risen to 27%. This shows that the model’s ability to discriminate repair traces has changed over the course of training. The classification accuracy of dry vegetation is 57%, which is lower than before. At the same time, the model misclassifies 19% of the samples as background, suggesting ongoing deficiencies in feature extraction for this category. The classification accuracy of the green vegetation category remains at 48%, but the proportion misclassified as background has increased to 46%, showing obvious false detection problems. The misclassification of the background category is still significant, with 43% of the background misclassified as dry vegetation, 52% as green vegetation, and 20% as missing tile. This shows that the model still has difficulty distinguishing between background and vegetation categories.
(4) The confusion matrix of the 300th epoch model reveals (Figure 14) a stable performance of the missing tile category, with a classification accuracy of 80%, and only 8% of the samples were misclassified as background. This suggests that the model’s performance in this category is relatively reliable. The accuracy of repaired tile remained at 81%, but the proportion of samples misclassified as background increased to 29%, indicating that there is still room for improvement in the classification performance of this category. The dry vegetation and green vegetation categories exhibited relatively weak classification performance, with accuracy rates of 57% and 48%, respectively, and a misclassification of 21% and 42% of the samples as background. This shows that the model still has some difficulty in distinguishing vegetation-related categories from background features. The misclassification rate of the background category is particularly significant, as it incorrectly classifies 43% of the background as dry vegetation, 52% as green vegetation, and 20% as missing tiles. This suggests that further optimization is necessary to enhance the model’s ability to extract and distinguish features between background and vegetation categories.
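To make the percentages above easier to interpret, the following sketch shows how a detection confusion matrix is commonly normalized. It assumes that rows are predicted classes, columns are true classes, and each column is divided by its total. This is a frequent convention for YOLO-style detectors, but the normalization used in Figures 11 to 14 is not stated in the text, so it is an assumption here. The counts are invented for illustration and are not taken from the study.

```python
def normalize_columns(matrix):
    """Divide every entry by its column total so that each true-class column sums to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_sums = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [matrix[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


labels = ["missing tile", "repaired tile", "background"]

# Hypothetical raw counts: rows = predicted class, columns = true class
counts = [
    [16, 0, 1],   # predicted as missing tile
    [0, 13, 3],   # predicted as repaired tile
    [4, 3, 0],    # predicted as background (targets that were missed)
]

norm = normalize_columns(counts)
for c, name in enumerate(labels[:-1]):
    detected = norm[c][c]    # diagonal entry: share of true targets found
    missed = norm[-1][c]     # background row: share of true targets missed
    print(f"{name}: detected {detected:.2f}, missed {missed:.2f}")
```

Under this convention, the diagonal entry and the background-row entry of a true-class column add up to 1 whenever there is no confusion between damage classes, while the background column describes how false detections on background are distributed across the predicted classes.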
In summary, from the analysis of indicators from the 16th to the 300th epoch models, missing tile has always been the most stable and accurate category (80%–90%), while repaired tile reached its peak performance (94%) in the mid-term (85th epoch) and fluctuated in the later period but still maintained its performance at around 81%. In contrast, the classification performance of dry vegetation and green vegetation categories has been improved to a limited extent, with accuracy rates of around 57% and 48%, respectively, and confusion with the background has always existed, with the proportion of background categories misclassified as vegetation categories being as high as 43%–52%. Overall, the model has a strong ability to recognize significant damage categories (such as missing tiles).

5.4. Comparison of Model Detection Results

By randomly selecting one test image from each of the four categories, we visually compare the detection effects of the 16th, 85th, 285th, and 300th epoch models (Figure 15). We can intuitively observe the differences in the detection performance of each model for each category.
(1) In the dry vegetation category, the 16th epoch model (Figure 15(A2)) showed some detection ability but missed many targets, and its selected areas were scattered. The 85th epoch model (Figure 15(A3)) significantly improved the detection coverage, although a small number of background false detections remained (dry vegetation on the right side of Figure 15(A3)). The 285th epoch (Figure 15(A4)) and 300th epoch (Figure 15(A5)) models still missed detections in some low-significance areas (green vegetation at the top of Figure 15(A4,A5)).
(2) The green vegetation category performed poorly in the 16th epoch model (Figure 15(B2)), with serious missed detections and only a small number of areas captured. The 85th epoch model (Figure 15(B3)) significantly increased the box selection area, but some of the box selections were not accurate enough. The 285th epoch model (Figure 15(B4)) and the 300th epoch model (Figure 15(B5)) enhanced the detection effect, yet they did not fully resolve the issue of missed detection.
(3) The missing tile category performed well in all models, maintaining a high detection accuracy from the 16th epoch (Figure 15(C2)) to the 300th epoch (Figure 15(C5)), although some shortcomings remain: there is a false detection on the right side of C2, and C3, C4, and C5 each miss one missing tile instance in the image. Overall, the model’s feature extraction ability for missing tile is stable.
(4) The performance of the repaired tile category in the 16th epoch model (Figure 15(D2)) was relatively preliminary, and some areas were not correctly selected. The 85th epoch model (Figure 15(D3)) achieved a significant improvement, with a larger and more accurate selected area. The 285th epoch (Figure 15(D4)) and 300th epoch (Figure 15(D5)) models maintained a high level of detection completeness and accuracy.
Combining the previous numerical analysis with the detection results, the 85th epoch model is the best-performing model in this study. In the numerical analysis, the AP of the model on the missing tile category reached 0.90, the F1 score and recall rate were 0.89 and 0.80, respectively, and the precision rate reached 1.00, indicating that its detection accuracy and coverage are at their best. The repaired tile category performs exceptionally well, with an AP value of 0.93 and an F1 score and recall rate of 0.79 and 0.81, respectively, making this the most stable generation among all models for detecting this category. According to the confusion matrix, the 85th epoch model performs best at distinguishing the background, missing tile, and repaired tile categories, and its misclassification rate is significantly lower than that of other generations. In the picture test, the 85th epoch model produces the most comprehensive and accurate box selection for missing tiles and repair traces, with minimal missed or false detections. It is worth noting that the 85th epoch model has not completely solved the confusion between the background and vegetation categories; however, compared with the 16th, 285th, and 300th epoch models, it is more stable in the vegetation categories. This study will use the 85th epoch model for further research.

6. Discussion

6.1. Analysis of Model Feature Layers

During the object detection process, the model generates multi-scale feature maps through the feature extraction network. These feature maps can capture the spatial and semantic information of the object in the image. Figure 16 shows the feature extraction, feature fusion, and final object detection results of the model on the test image (resolution 512 × 512 pixels).
In the cv2 layers (cv2[0], cv2[1], cv2[2]), the feature maps mainly respond to shallow features of the image, such as local texture, edge, and shape information. In cv2[0], the model clearly captures the local area of green vegetation. As the layer goes deeper (cv2[1], cv2[2]), the focus of the feature map gradually expands to cover a wider range of vegetation areas.
In the cv3 layers (cv3[0], cv3[1], cv3[2]), the feature maps further extract deep semantic features. Compared with the cv2 layers, the focus area of the cv3 feature maps is more concentrated, especially the response to green vegetation in cv3[0] and cv3[1]. By weighting the activation intensity of different regions, these feature maps filter background noise better and highlight the detection target.
The cv2 and cv3 layers integrate the information of multi-level feature maps through feature fusion to generate a more comprehensive and detailed feature representation. The feature fusion in Figure 16 shows the fusion result. The model successfully integrates features of different scales to accurately mark the vegetation-covered areas in the image. The fused feature map is input into the classification and regression module to generate the final target detection result (right side of Figure 16). The detection results show that the model can accurately select the green vegetation area and assign the correct category label to each target. This process verifies the key role played by the feature extraction and fusion modules in detection.

6.2. Model Application Effect

To verify the application effect of the model in actual scenarios, the research team took 200 new roof pictures on site and randomly selected eight of them (Samples A–H) for detection. The results are shown in Figure 17. In the eight samples, the model showed good detection capabilities for the roof damage categories (dry vegetation, green vegetation, missing tile, and repaired tile) and was able to accurately mark different types of damaged areas. In Figure 17, the “Feature” panels show the response strength of the model to various targets, and the “Output” panels show the model’s target selection and classification results.
(1) The detection results of the dry vegetation category are shown in Sample A, Sample E, and Sample H. The model detects dry vegetation more accurately and can identify sparsely distributed target areas. In more complex cases such as Sample E and Sample H, the difference between the dry vegetation and green vegetation categories can be distinguished, but there are still a small number of false detections.
(2) The model performs well in detecting green vegetation (for example, in Sample E, Sample F, and Sample H) and can effectively mark densely distributed vegetation areas. It also accurately separates the background in the edge area of the roof surface (Sample F).
(3) In Sample G, the model performs stably in detecting missing tile, and all obvious damaged areas are correctly framed without obvious false detection. This shows that the model is very reliable at extracting features of missing tiles on the roof surface.
(4) The detection of the repaired tile category performs well in Sample B, Sample C, and Sample D and can accurately identify large repaired areas. In particular, in Sample B, the frame selection of the slender strip area is very complete, but in Sample D, there are a small number of redundant frames, which may be false detections caused by feature similarity.
From the above analysis, it can be seen that the model has a good practical application effect in roof damage detection, especially for missing tiles and improper repairs on the roof surface, where it achieves high-precision target recognition and classification. Although the model showed high accuracy and robustness in the above actual scene tests, several limitations need further attention. First, the model is highly dependent on the distribution of training data. Applying it in environments with significantly different lighting, weather conditions, or surface coverage may reduce the recognition effect. Second, the model still makes mistakes in scenes with many overlapping features or many different types of vegetation, especially when green and dry vegetation interfere with each other. In addition, as the damage types continue to expand, the similarity between features and the diversity of background noise will increase, thereby increasing the risk of overfitting and the probability of misclassification in multi-category recognition. Finally, this study has not fully included other potential damage forms, such as roof structure degradation, drainage system blockage, and material corrosion, and further exploration is needed to identify such hidden or complex damage. To address these issues, future research can improve multi-dimensional damage recognition by collecting data in a wider range of environments and by enlarging and refining the data set. This will provide more solid technical support for the scientific and thorough detection and protection of the roofs of traditional residential buildings.

6.3. Roof Inspection Design Combined with UAV

Additionally, the researchers propose a design scheme that integrates drones with roof damage monitoring. Unmanned aerial vehicles (UAVs) are used in agricultural irrigation, forest fire prevention, power and oil pipeline inspections, post-disaster assessments, and building appearance inspections because of their low operating costs, high flight safety, and excellent adaptability to the natural environment. However, drones are not yet widely used in China for monitoring roof damage in ancient buildings. Because needs differ between fields, drone system designs differ as well. Therefore, this study selected a drone system suitable for roof damage detection on ancient Chinese buildings and made targeted improvements (Figure 18). Compared with traditional drones, this drone is equipped with an APM flight controller with data transmission, which can obtain the vertical distance between the optical center of the camera and the surface of the building when each aerial photo is taken and can collect in batches the images required for recognition by the YOLOv8 model. The ultrasonic ranging data are transmitted through the data link to a PC fitted with a receiver. Through route planning, the drone can scan the entire area autonomously, and an integrated remote control is added to receive real-time image transmission and data transmission signals. If a hazard arises, manual intervention mode can be switched on in time. This design demonstrates the potential of combining drone roof image acquisition with YOLOv8 model recognition in the protection and restoration of architectural heritage roofs.

7. Conclusions

7.1. Research Discovery

This study systematically implements automatic detection and classification of roof damage on traditional buildings in Longmen Ancient Town, Hangzhou City, Zhejiang Province, using the YOLOv8 object detection model, and it explores the application potential of artificial intelligence technology in the protection of historical buildings. Through a multi-stage training strategy, a comprehensive data processing and analysis process, and real-world testing, we verified the model’s detection efficiency and accuracy. The research results show the following: (1) The model detects the missing tile and repaired tile categories significantly better than the other categories, with AP reaching more than 90% for both. In the test set and in actual application scenarios, the model accurately marks missing tiles and repaired tiles with an extremely low missed detection rate, showing a strong ability to extract and classify target features. (2) A layer-by-layer analysis of the test results and feature maps from multiple training rounds shows that the model gradually learns the main feature distribution of dry vegetation and green vegetation, but detections in these two categories still include some false and missed detections in complex backgrounds. (3) In actual application, this study further tested the practicality and robustness of the model by randomly selecting 200 newly taken on-site pictures for verification. The test results show that the model accurately marks and classifies the target area in most samples, especially in the high-contrast damage categories. (4) Combining the performance tests and confusion matrix analysis of the models from multiple epochs confirms that the 85th epoch model achieved the best comprehensive performance, with high detection accuracy, recall, and F1 score, especially in complex scenes with a dense distribution of multiple categories, where it effectively reduces missed and false detections.
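As a rough illustration of how the evaluated epochs can be compared, the sketch below averages the per-class AP values listed in Table 1 for each epoch. The unweighted mean is an assumption made here for illustration; it is not necessarily the criterion the authors used to select the 85th epoch model.

```python
# Per-class AP at each evaluated epoch, copied from Table 1.
# Class order: dry vegetation, green vegetation, missing tile, repaired tile.
ap_by_epoch = {
    16: [0.52, 0.54, 0.94, 0.70],
    85: [0.52, 0.55, 0.90, 0.93],
    285: [0.54, 0.53, 0.90, 0.78],
    300: [0.54, 0.53, 0.90, 0.78],
}


def mean_ap(values):
    """Unweighted mean of the per-class AP values."""
    return sum(values) / len(values)


mean_ap_by_epoch = {epoch: mean_ap(values) for epoch, values in ap_by_epoch.items()}
best_epoch = max(mean_ap_by_epoch, key=mean_ap_by_epoch.get)

for epoch, value in mean_ap_by_epoch.items():
    print(f"epoch {epoch}: mean AP = {value:.3f}")
print("highest mean AP at epoch", best_epoch)
```

With the tabulated values, the unweighted mean AP is highest at epoch 85, which is consistent with finding (4). Other criteria, such as the mean F1 score, can rank the epochs differently.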
The main contribution of this study is that it combines artificial intelligence object detection technology to provide an efficient and accurate solution for the digital detection of traditional building roof damage. Compared with traditional manual detection methods, this method significantly improves detection efficiency, reduces errors caused by subjective judgment, and provides reliable data support for repair decisions for common roof surface damage, such as tile slippage and vegetation coverage. This study combines artificial intelligence models with historical building protection and verifies its application potential in actual scenarios through drone data collection and multi-stage model training, opening up a new technical path for the protection, restoration, and digital management of traditional buildings. In addition, the research results also provide an important reference for similar Chinese traditional building roof protection practices, laying the foundation for the deep integration of cultural heritage protection and artificial intelligence technology.

7.2. Limitations and Future Work

Although this study has achieved practical results, some limitations remain. (1) For the green vegetation and dry vegetation categories, the model distinguishes targets from complex backgrounds relatively poorly, especially in areas with blurred boundaries or low contrast, where missed and false detections are likely. This may be related to the relatively small number of training samples for these categories and the high complexity of their features. (2) The detection categories cover only four common damage types; other potential damage types, such as roof structure damage and drainage system blockage, are not yet covered, so the model’s scope of application still needs to be expanded. (3) Confusion between the background and the target categories persists, especially when damage features resemble background features, and the model’s classification performance needs further optimization. (4) The study site, Longmen Ancient Town, represents the traditional residential water town of western Zhejiang. China’s territory is vast, however, and this specialized model may not be applicable to the whole country. More regional labels therefore need to be included.
Future research will further improve the performance of the model, especially its ability to detect low-saliency targets and to suppress background interference. The planned work includes the following: (1) introducing an attention mechanism to optimize the feature extraction module and strengthen the model’s ability to distinguish complex features; (2) using data augmentation to diversify the training samples, increase the number of samples with complex backgrounds and low contrast, and improve the robustness of the model; (3) adopting a multi-task learning framework to extend detection to other potential damage types and achieve a more comprehensive classification and assessment of roof damage; and (4) in practical applications, using real-time drone data collection and detection to build a cloud-based dynamic monitoring and intelligent protection system that provides real-time monitoring, data analysis, and repair suggestions for roof damage in historical buildings. In addition, the technical solution will be extended to historical buildings in other regions and cultural contexts, with the model customized to accommodate a broader range of cultural heritage protection practices.
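The data augmentation mentioned in item (2) can be illustrated with a horizontal flip, one of the simplest augmentations for object detection. The sketch below is a minimal example and not the authors’ pipeline. It assumes labels in the normalized YOLO format (class, x_center, y_center, width, height), and the toy image and class id are hypothetical.

```python
def hflip_image(rows):
    """Mirror an image, given as a list of pixel rows, from left to right."""
    return [list(reversed(row)) for row in rows]


def hflip_yolo_boxes(boxes):
    """Mirror YOLO-format boxes (class_id, x_center, y_center, width, height).

    Coordinates are normalized to [0, 1], so only x_center changes.
    """
    return [(cls, 1.0 - x, y, w, h) for (cls, x, y, w, h) in boxes]


# Toy 2x3 "image" and a single box; class id 2 is a hypothetical label.
image = [[1, 2, 3],
         [4, 5, 6]]
boxes = [(2, 0.25, 0.40, 0.10, 0.20)]

flipped_image = hflip_image(image)
flipped_boxes = hflip_yolo_boxes(boxes)
print(flipped_image)
print(flipped_boxes)
```

Flipping the image and its labels together keeps every box aligned with its target, so each annotated photo yields a second valid training sample.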

Author Contributions

Conceptualization, S.Y. and Y.H. (Yue Huang); methodology, Y.C. and L.Z.; software, Y.C. and L.Z.; validation, Y.C., L.Z. and Y.H. (Yuhao Huang); formal analysis, S.Y., Y.C., L.Z. and Y.H. (Yue Huang); investigation, S.Y. and Y.H. (Yue Huang); resources, S.Y. and Y.H. (Yue Huang); data curation, S.Y. and Y.H. (Yue Huang); writing—original draft preparation, S.Y., Y.C., L.Z., J.C. and Y.H. (Yue Huang); writing—review and editing, S.Y., Y.C., L.Z., J.C. and Y.H. (Yue Huang); visualization, Y.H. (Yuhao Huang) and Y.H. (Yuxuan Hu); supervision, S.Y., Y.H. (Yue Huang) and Y.H. (Yuxuan Hu); project administration, S.Y., Y.H. (Yue Huang) and N.W.; funding acquisition, S.Y., Y.H. (Yue Huang) and N.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Humanities and Social Sciences Youth Foundation, Ministry of Education of the People’s Republic of China (grant number: 22YJCZH161); Ministry of Education of the People’s Republic of China Industry-University-Research Project (grant number: 231105342131517); and Zhejiang Provincial Philosophy and Social Sciences Planning Project (grant number: 24NDQN150YBM). The funders had no role in the study conceptualization, data curation, formal analysis, methodology, software, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data sets used and analyzed during the current study are available from Shuai Yang (samyang@zju.edu.cn) on reasonable request.

Acknowledgments

We would like to express our sincere gratitude to the students and the staff who assisted us during the field survey.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Loss Metrics During Training

In this study, the specific values of loss metrics during training are as follows:
Table A1. Loss metrics during training in this study.
Epoch | Training Loss | Validation Loss
1 | 283.01293987698022 | 29.12842559814453
2 | 67.2012005382114 | 22.664851665496826
3 | 8.420244759983486 | 7.727818489074707
4 | 6.160944766468472 | 6.748100638389587
5 | 5.574808398882548 | 5.550254583358765
6 | 5.198796987533569 | 5.513710379600525
7 | 4.938507848315769 | 5.818173170089722
8 | 4.971309019459619 | 5.794464349746704
9 | 4.778027448389265 | 5.844989657402039
10 | 4.608070804013146 | 5.4295032024383545
11 | 4.716831955644819 | 5.315472841262817
12 | 4.327268408404456 | 5.343663334846497
13 | 4.53793144888348 | 5.5462048053741455
14 | 4.285111818048689 | 5.63400411605835
15 | 4.221791168053945 | 5.27153217792511
16 | 4.228212303585476 | 4.872853636741638
17 | 3.918550279405382 | 5.203720569610596
18 | 4.214978754520416 | 5.330354690551758
19 | 4.009541279739803 | 4.9289796352386475
20 | 4.150451011127895 | 5.157945871353149
21 | 3.8749791185061135 | 4.9402230978012085
22 | 3.862519211239285 | 5.279082179069519
23 | 3.861892408794827 | 5.058476090431213
24 | 3.86075939072503 | 5.192415833473206
25 | 3.7578945755958557 | 5.378412246704102
26 | 3.781205475330353 | 5.234185218811035
27 | 3.767778820461697 | 5.313507080078125
28 | 3.735191520717409 | 5.229167222976685
29 | 3.6592085593276553 | 5.2667622566223145
30 | 3.4994125101301403 | 5.30493426322937
31 | 3.5600778957208 | 5.173294425010681
32 | 3.5913063950008817 | 5.463017463684082
33 | 3.5031644304593406 | 5.296589255332947
34 | 3.6312655011812844 | 5.523662686347961
35 | 3.4952896568510265 | 5.5393065214157104
36 | 3.7317368189493814 | 5.393651962280273
37 | 3.6186265879207187 | 5.394462943077087
38 | 3.612474564048979 | 5.1864224672317505
39 | 3.398664904965295 | 5.2568230628967285
40 | 3.3271890812449985 | 5.133337616920471
41 | 3.5462195608350964 | 5.06995153427124
42 | 3.2855459683471255 | 5.363473892211914
43 | 3.267279568645689 | 5.568755030632019
44 | 3.340987049871021 | 5.319752931594849
45 | 3.2251196437411838 | 5.314714968204498
46 | 3.1982809437645807 | 5.31688928604126
47 | 3.187089290883806 | 4.936183869838715
48 | 3.2412554853492312 | 5.059817433357239
49 | 3.2303241756227283 | 5.305909514427185
50 | 3.154611739847395 | 5.1718891859054565
51 | 3.584580095277892 | 5.505593240261078
52 | 3.4547313816017575 | 5.171465456485748
53 | 3.559832505053944 | 5.515733897686005
54 | 3.4870126429531307 | 5.3259724378585815
55 | 3.2359326812956066 | 5.35360461473465
56 | 3.513458283411132 | 5.024634152650833
57 | 3.3248641507493124 | 5.368274301290512
58 | 3.349816976322068 | 5.146198779344559
59 | 3.330754596326086 | 4.989331871271133
60 | 3.5073449893130197 | 5.202360600233078
61 | 3.1061452676852546 | 4.983969300985336
62 | 3.1743450976080365 | 5.180098056793213
63 | 3.198612850573328 | 5.503257215023041
64 | 3.1510631243387857 | 5.252945721149445
65 | 3.19650037586689 | 5.434637248516083
66 | 3.4061564604441323 | 5.46635890007019
67 | 3.1013659040133157 | 5.289748340845108
68 | 3.0938793586360083 | 5.196868181228638
69 | 2.980929841597875 | 5.103890061378479
70 | 2.951315298676491 | 5.681272864341736
71 | 3.050207205944591 | 5.177073389291763
72 | 2.902099781566196 | 5.543020009994507
73 | 2.992589705520206 | 5.346562504768372
74 | 3.080715792046653 | 5.435064196586609
75 | 2.9088822454214096 | 5.191833674907684
76 | 2.8166196230385037 | 5.294306814670563
77 | 2.978860023948881 | 5.241958290338516
78 | 2.9150404963228436 | 5.199945449829102
79 | 2.8735267139143414 | 5.024118721485138
80 | 2.9784079409307904 | 5.354002803564072
81 | 2.7925623936785593 | 5.309221297502518
82 | 2.806783773832851 | 5.542476832866669
83 | 2.7677726993958154 | 4.971868932247162
84 | 2.826278653409746 | 5.309619963169098
85 | 2.8739786711004047 | 5.120590656995773
86 | 2.730262448390325 | 5.073243468999863
87 | 2.657159691055616 | 5.286428660154343
88 | 2.7713488241036734 | 5.286125957965851
89 | 2.84584089451366 | 5.197303265333176
90 | 2.687037413318952 | 5.172180324792862
91 | 2.666307571861479 | 5.318373918533325
92 | 2.6631207515796027 | 5.352686941623688
93 | 2.655189108517435 | 5.036076128482819
94 | 2.6534820712274976 | 4.993526726961136
95 | 2.708136737346649 | 5.2824074029922485
96 | 2.566056733330091 | 5.271271824836731
97 | 2.5324335826767816 | 5.2538445591926575
98 | 2.625269889831543 | 5.15626934170723
99 | 2.6709372649590173 | 5.172775328159332
100 | 2.5315451787577734 | 5.345183610916138
101 | 2.5450819945997663 | 5.668607294559479
102 | 2.616419619984097 | 5.260363161563873
103 | 2.627372639046775 | 5.297725200653076
104 | 2.5311700072553425 | 5.622186541557312
105 | 2.5044936173492007 | 5.300058215856552
106 | 2.6376362625095577 | 5.34476637840271
107 | 2.500218073527018 | 5.242717951536179
108 | 2.572122034099367 | 5.498638451099396
109 | 2.491795316338539 | 5.721703469753265
110 | 2.6170143948660956 | 5.4027329087257385
111 | 2.465303169356452 | 5.519768953323364
112 | 2.547493858469857 | 5.501642107963562
113 | 2.4135620627138348 | 5.5688910484313965
114 | 2.427775871422556 | 5.147663712501526
115 | 2.4116256303257413 | 5.398656934499741
116 | 2.5721070998244815 | 5.2121356427669525
117 | 2.422635644674301 | 5.534132659435272
118 | 2.4358879642354117 | 5.599622040987015
119 | 2.374197421802415 | 5.2122564017772675
120 | 2.402096819546488 | 5.5332518219947815
121 | 2.3023203114668527 | 5.388486266136169
122 | 2.47865944272942 | 5.163467764854431
123 | 2.393476413355933 | 5.4307175278663635
124 | 2.3048571381303997 | 5.404813647270203
125 | 2.298880991008547 | 5.722573637962341
126 | 2.4449495871861777 | 5.1718266904354095
127 | 2.493014184965028 | 5.215699702501297
128 | 2.493705532617039 | 5.566073775291443
129 | 2.4263504644234977 | 5.568940043449402
130 | 2.226100135180685 | 5.385970294475555
131 | 2.2486704058117337 | 5.100738674402237
132 | 2.3502914077705808 | 5.629933804273605
133 | 2.4025543332099915 | 5.448204308748245
134 | 2.3252821316321692 | 5.676319509744644
135 | 2.2528299258814917 | 5.40545454621315
136 | 2.2302881429592767 | 5.3017958998680115
137 | 2.3074829048580594 | 5.0277367532253265
138 | 2.3080870832006135 | 5.413030445575714
139 | 2.283845285574595 | 5.2137598395347595
140 | 2.271813760201136 | 5.608694493770599
141 | 2.3555119832356772 | 5.51924455165863
142 | 2.182805033193694 | 5.427482694387436
143 | 2.1534333560201855 | 5.404742002487183
144 | 2.1373558367292085 | 5.429654121398926
145 | 2.248709132273992 | 5.536401331424713
146 | 2.1661658916208477 | 5.5345683097839355
147 | 2.1394441955619388 | 5.5860046446323395
148 | 2.188218298885557 | 5.372467249631882
149 | 2.1817458420991898 | 5.615984857082367
150 | 2.2122450686163373 | 5.768014490604401
151 | 2.081319343712595 | 5.275599360466003
152 | 2.323824683825175 | 5.770521104335785
153 | 2.304749271935887 | 5.586254775524139
154 | 2.176998994416661 | 5.5822674036026
155 | 2.14797732896275 | 5.44855460524559
156 | 2.1189483172363706 | 5.528890162706375
157 | 2.126767369608084 | 5.577036440372467
158 | 2.146919728981124 | 5.589718580245972
159 | 2.138370297021336 | 5.587451308965683
160 | 2.23401031560368 | 5.425671637058258
161 | 2.09815353486273 | 5.4958988428115845
162 | 2.3772876262664795 | 5.538624882698059
163 | 2.1433046948578625 | 5.573834717273712
164 | 2.042723074555397 | 5.7587050795555115
165 | 2.078016165230009 | 5.499907553195953
166 | 2.125714917977651 | 5.740845024585724
167 | 2.124575760629442 | 5.614028364419937
168 | 2.1141402249534926 | 5.324392914772034
169 | 2.1307870522141457 | 5.3101116716861725
170 | 2.156794246700075 | 5.747155427932739
171 | 2.334467980596754 | 5.416957765817642
172 | 2.104036268260744 | 5.3214499950408936
173 | 2.277099064654774 | 5.505950510501862
174 | 2.1557234774033227 | 5.238791525363922
175 | 2.0573882833123207 | 5.863373875617981
176 | 2.0882063772943287 | 5.563097566366196
177 | 2.1010186970233917 | 5.534978687763214
178 | 2.086615809963809 | 5.306969612836838
179 | 1.9860455989837646 | 5.3219895362854
180 | 1.9960254240367148 | 5.2943615317344666
181 | 2.041875455942419 | 5.736985504627228
182 | 1.9281132097045581 | 5.666010469198227
183 | 2.030211174653636 | 5.652487009763718
184 | 1.965513461165958 | 5.551377713680267
185 | 1.9989105802443292 | 5.502628147602081
186 | 2.1379174954361386 | 5.314296126365662
187 | 2.0217067955268755 | 5.37531441450119
188 | 2.017810138563315 | 5.235474348068237
189 | 1.949061844083998 | 5.7264708280563354
190 | 2.001815780169434 | 5.293230086565018
191 | 1.984600853588846 | 5.295426607131958
192 | 2.040566786295838 | 5.26984316110611
193 | 2.0475267842411995 | 5.642051935195923
194 | 1.902182707356082 | 5.484188437461853
195 | 1.9601324846347172 | 5.583483040332794
196 | 1.9510527476668358 | 5.587881833314896
197 | 2.0324245683021016 | 5.612217128276825
198 | 2.002437409427431 | 5.485441744327545
199 | 1.8713133931159973 | 5.059922158718109
200 | 1.9344353179136913 | 5.748556613922119
201 | 1.8723894192112818 | 5.3129458129405975
202 | 1.8684178441762924 | 5.702697306871414
203 | 1.896610579556889 | 5.349358141422272
204 | 1.9758149170213275 | 5.310623109340668
205 | 1.87707043853071 | 5.673441410064697
206 | 1.9116988745000627 | 5.528092622756958
207 | 1.8861799960335095 | 5.744377672672272
208 | 1.9469520159893565 | 5.498331010341644
209 | 1.8415550597839885 | 5.6178083419799805
210 | 1.8636799347069528 | 5.4937189519405365
211 | 2.229925056298574 | 5.190759718418121
212 | 2.195399956570731 | 5.663421154022217
213 | 2.1584746059444218 | 5.612324953079224
214 | 2.2733247064881854 | 5.487780392169952
215 | 2.2362286928627224 | 5.577988564968109
216 | 2.1951885008149676 | 5.520171582698822
217 | 2.223835246430503 | 5.89059180021286
218 | 2.1591323018074036 | 5.5243706703186035
219 | 2.1118398192856045 | 5.4200805723667145
220 | 2.057705586155256 | 5.79927796125412
221 | 2.1656739579306707 | 5.2475133538246155
222 | 2.1778013987673654 | 5.348675012588501
223 | 2.2316525247361927 | 5.3924709260463715
224 | 2.1283539169364505 | 5.180470556020737
225 | 2.15771294467979 | 5.400013774633408
226 | 2.023483862479528 | 5.476243674755096
227 | 2.138834532764223 | 5.500479459762573
228 | 2.097237644924058 | 5.60048371553421
229 | 2.0508973283900156 | 5.570494174957275
230 | 2.1126293622785144 | 5.46452921628952
231 | 2.0725809269481235 | 5.859246492385864
232 | 1.910566379626592 | 5.513609528541565
233 | 1.9797820796569188 | 5.290630608797073
234 | 2.1103124022483826 | 5.648497104644775
235 | 2.011244422859616 | 5.024340361356735
236 | 2.0699587480889425 | 5.252139300107956
237 | 1.9895793149868648 | 5.44338721036911
238 | 2.00763049059444 | 5.8632171750068665
239 | 2.119255686799685 | 5.37006601691246
240 | 2.0138414237234326 | 5.1958867609500885
241 | 2.0260189788209066 | 5.5862244963645935
242 | 2.0245653490225473 | 5.822095692157745
243 | 1.9359335352977116 | 5.714192450046539
244 | 2.0435017910268574 | 5.854075133800507
245 | 1.96499968568484 | 5.29244065284729
246 | 2.1043519609504275 | 5.633374869823456
247 | 2.056264961759249 | 5.421928584575653
248 | 1.9048909445603688 | 5.764021098613739
249 | 1.8917853848801718 | 5.637931019067764
250 | 1.976785918076833 | 5.6620747447013855
251 | 1.9197794033421411 | 5.82159811258316
252 | 1.9395425816377003 | 5.407146453857422
253 | 1.9072443511750963 | 5.897126853466034
254 | 1.933508612215519 | 5.6025354862213135
255 | 1.9655080454217062 | 5.775189995765686
256 | 2.0934759734405413 | 5.490861505270004
257 | 2.030698503057162 | 5.279992252588272
258 | 1.9462572584549587 | 5.458898693323135
259 | 1.9233598460753758 | 5.809788763523102
260 | 1.9199749810828104 | 6.003212928771973
261 | 1.9057390458054013 | 5.463661849498749
262 | 2.00239189962546 | 5.33598592877388
263 | 1.9646298769447539 | 5.531261324882507
264 | 1.9009876598914464 | 5.565259754657745
265 | 2.000475282470385 | 5.344159513711929
266 | 1.975909885432985 | 5.479030340909958
267 | 1.967804178595543 | 5.573125243186951
268 | 1.9579971830050151 | 5.364033550024033
269 | 1.9501966204908159 | 5.349084913730621
270 | 1.9769625547859404 | 5.856136173009872
271 | 2.0199760960208044 | 5.541984140872955
272 | 1.8799696531560686 | 5.722230076789856
273 | 1.9623222880893283 | 5.557120978832245
274 | 1.887283722559611 | 5.5592955350875854
275 | 1.8637285696135626 | 5.533895641565323
276 | 1.9791217827134662 | 5.633853435516357
277 | 1.9094776891999774 | 5.542587697505951
278 | 1.9168005420102014 | 5.509026616811752
279 | 1.8373626619577408 | 5.72770881652832
280 | 1.981377888056967 | 5.540918827056885
281 | 1.9033642792039447 | 5.760140746831894
282 | 2.000859792033831 | 5.495365649461746
283 | 2.024806969695621 | 5.72679203748703
284 | 1.924793799718221 | 5.376008868217468
285 | 1.794595542881224 | 5.498400866985321
286 | 1.8750189393758774 | 5.541150867938995
287 | 1.8789015312989552 | 5.862847089767456
288 | 1.8709138052331076 | 5.6896087527275085
289 | 1.9003082604871855 | 5.707343876361847
290 | 1.8928394963343937 | 5.7659898698329926
291 | 1.8731231341759365 | 5.418803095817566
292 | 2.0513547791375055 | 5.7842923402786255
293 | 1.9083751108911302 | 5.358145207166672
294 | 1.9772303567992315 | 5.511350721120834
295 | 2.022613181008233 | 5.7540766298770905
296 | 1.941311013367441 | 5.526892185211182
297 | 1.9033129976855383 | 5.686949133872986
298 | 1.8303241911861632 | 5.482560098171234
299 | 1.8468556271659002 | 5.716579079627991
300 | 1.9164526296986475 | 5.652753949165344
Source: The authors’ statistics are based on the machine learning results.
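One way to read Table A1 is to compare training and validation loss at the four epochs whose models are evaluated in Table 1. The sketch below uses values rounded from Table A1. Interpreting a widening gap as a sign of overfitting is a general rule of thumb adopted here for illustration, not a conclusion stated by the authors.

```python
# (epoch, training loss, validation loss), rounded from Table A1.
losses = [
    (16, 4.2282, 4.8729),
    (85, 2.8740, 5.1206),
    (285, 1.7946, 5.4984),
    (300, 1.9165, 5.6528),
]

# Gap between validation and training loss at each epoch.
gaps = {epoch: val - train for epoch, train, val in losses}

# Epoch with the lowest validation loss among the four.
lowest_val_epoch = min(losses, key=lambda row: row[2])[0]

for epoch, gap in gaps.items():
    print(f"epoch {epoch}: validation - training = {gap:.4f}")
print("lowest validation loss at epoch", lowest_val_epoch)
```

Among these four epochs, the training loss keeps falling while the validation loss rises slightly, so the gap grows from about 0.64 at epoch 16 to about 3.74 at epoch 300.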

References

  1. Hong, Z.; Li, C.X. Comparative study on the construction idea and scale of courtyard space in Southern courtyard buildings: A case study of the folk houses in Jiangnan and Tunpu in Anshun. Famous China City 2019, 3, 75–80. [Google Scholar]
  2. Gong, W.Q.; Zong, X.X. Protection and renewal of traditional architecture under the background of rural revitalization strategy: Taking Longmen Ancient Town as an example. Archit. Cult. 2023, 1, 140–143. [Google Scholar]
  3. Jigyasu, R.; Murthy, M.; Boccardi, G.; Marrion, C.E.; Douglas, D.; King, J.; O’Brien, G.; Dolcemascolo, G.; Kim, Y.; Albrito, P.; et al. Heritage and Resilience: Issues and Opportunities for Reducing disaster risks. Working Paper. 2013, Global Platform for Disaster Risk Reduction, Geneva, 58p. Available online: http://openarchive.icomos.org/id/eprint/3359/ (accessed on 15 November 2024).
  4. Feilden, B. Conservation of Historic Buildings; Routledge: London, UK, 2007. [Google Scholar]
  5. Zou, Z.; Zhao, X.; Zhao, P.; Qi, F.; Wang, N. CNN-based statistics and location estimation of missing components in routine inspection of historic buildings. J. Cult. Herit. 2019, 38, 221–230. [Google Scholar] [CrossRef]
  6. Zheng, Y.; Zhang, G.; Tan, S.; Feng, L. Research on Progress of Forest Fire Monitoring with Satellite Remote Sensing. Agric. Rural Stud. 2023, 1. [Google Scholar] [CrossRef]
  7. Zhu, Y.; Luo, W. Comparative study on the roofs of traditional revival architecture in modern China. IOP Conf. Ser. Earth Environ. Sci. 2021, 783, 012116. [Google Scholar] [CrossRef]
  8. Shen, Y.; Zhang, E.; Feng, Y.; Liu, S.; Wang, J. Parameterizing the Curvilinear Roofs of Traditional Chinese Architecture. Nexus Netw. J. 2021, 23, 475–492. [Google Scholar] [CrossRef]
  9. Lin, Q.; Tan, H. Spatial Deconstruction and Optimized Strategies of Historical and Cultural Villages and Towns. In Proceedings of the 2023 6th International Conference on Humanities Education and Social Sciences (ICHESS 2023), Xi’an, China, 13–15 October 2023; Volume 179, p. 01031. [Google Scholar] [CrossRef]
  10. Yihui, S.; Tian, C.; Meng, Z. Sustainable Tourism Development Management of Local Cultural Landscapes. Chin. J. Popul. Resour. Environ. 2008, 6, 74–79. [Google Scholar] [CrossRef]
  11. Mao, J.M. Study on Spatial Form and Architectural Characteristics of Longmen Ancient Town in Fuyang; Zhejiang Sci-Tech University: Hangzhou, China, 2023. [Google Scholar]
  12. Xu, Y.W. The unique Longmen ancient town. Zhejiang For. 2021, 1, 38–39. [Google Scholar]
  13. Rowe, P.; Chung, Y. Metabolic Aspects of Rural Life and Settlement Coexistence in China. Agric. Rural Stud. 2024, 2, 0007. [Google Scholar] [CrossRef]
  14. Chen, N.; Jin, M.; Wang, S.; Zhang, X.; Sun, H.; Cao, F. The Impact of Forestry Industry Integration on the Forest Farmers’ Income in China: A Theoretical and Empirical Study. Agric. Rural Stud. 2024, 2, 0004. [Google Scholar] [CrossRef]
  15. Qin, C.; Zhu, Y. Medical Insurance Benefits and Labor Decisions of Middle-Aged and Elderly People: Evidence from Rural China. Agric. Rural Stud. 2024, 2. [Google Scholar] [CrossRef]
  16. Ren, W. The Impact of Typhoons on Agricultural Productivity—Evidence from Coastal Regions of China. Agric. Rural Stud. 2024, 2, 0024. [Google Scholar] [CrossRef]
  17. Xue, B.Y. Introduction to Ancient Chinese Architecture; China Building Industry Press: Beijing, China, 2015. [Google Scholar]
  18. Liu, J. Research on Building Style Recognition Based on Deep Learning; Xi’an University of Architecture and Technology: Xi’an, China, 2023. [Google Scholar]
  19. Zhang, Q. Feudal Hierarchy Seen in Ancient Building Tiles; Charm China: Zhengzhou, China, 2019. [Google Scholar]
  20. Keshmiry, A.; Hassani, S.; Dackermann, U.; Li, J. Assessment, repair, and retrofitting of masonry structures: A comprehensive review. Constr. Build. Mater. 2024, 442, 137380. [Google Scholar] [CrossRef]
  21. Huang, Y.R. Analysis and treatment of roof leakage in Zhongshan Memorial Hall. Guangzhou Archit. J. 2008, 8, 14–18. [Google Scholar]
  22. Qian, W.; Liu, S.W.; Li, X.F. Analysis of landscape changes of ancient water towns in northern Zhejiang from the perspective of cultural ecology: A case study of Longmen Ancient Town. Agric. Technol. 2024, 44, 113–115. [Google Scholar]
  23. Robert, A. Conservation Technology of Historical Buildings; The Publishing House of Electronics Industry: Beijing, China, 2012. [Google Scholar]
  24. Chapinal-Heras, D.; Díaz-Sánchez, C. A review of AI applications in Human Sciences research. Digit. Appl. Archaeol. Cult. Herit. 2023, 30, e00288. [Google Scholar] [CrossRef]
  25. Mishra, M.; Lourenço, P.B. Artificial intelligence-assisted visual inspection for cultural heritage: State-of-the-art review. J. Cult. Herit. 2024, 66, 536–550. [Google Scholar] [CrossRef]
  26. Samhouri, M.; Al-Arabiat, L.; Al-Atrash, F. Prediction and measurement of damage to architectural heritages facades using convolutional neural networks. Neural Comput. Appl. 2022, 34, 18125–18141. [Google Scholar] [CrossRef]
  27. Lee, J.; Yu, J.M. Automatic Surface Damage Classification Developed Based on Deep Learning for Wooden Architectural Heritage. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, 151–157. [Google Scholar] [CrossRef]
  28. Wang, N.; Zhao, X.; Zou, Z.; Zhao, P.; Qi, F. Autonomous damage segmentation and measurement of glazed tiles in historic buildings via deep learning. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 277–291. [Google Scholar] [CrossRef]
  29. Bao, S.H.; Zhuo, X.L.; Tao, J. Using semi-supervised machine learning to assist classification and recognition of Chinese vernacular architecture. J. Build. Eng. 2024, 98, 111327. [Google Scholar] [CrossRef]
  30. Liu, S.; Liu, Y.; Li, S. Building the branding and attractive small town based on the CI theory—Take the longmen ancient town for example. In Proceedings of the 2011 International Conference on Multimedia Technology, Hangzhou, China, 26–28 July 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 4348–4351. [Google Scholar] [CrossRef]
  31. Peterek, M.; Hebbo, M.S.Y.; Rico, S.R.; Klement, D.I.M. City Profile Hangzhou. 2021. Available online: https://fhffm.bsz-bw.de/frontdoor/deliver/index/docId/6528/file/221128_WoP4_City_Profile_Hangzhou_final.pdf (accessed on 17 November 2024).
  32. Jin, K.; Qin, P.; Liu, C.; Zong, Q.; Wang, S. Impact of Urbanization on Sunshine Duration from 1987 to 2016 in Hangzhou City, China. Atmosphere 2021, 12, 211. [Google Scholar] [CrossRef]
Figure 1. Study area: Longmen Ancient Town in Zhejiang (image source: drawn by the author).
Figure 2. Aerial photography of Longmen Ancient Town (image source: photographed by the author).
Figure 3. Research methods and processes (image source: drawn by the author).
Figure 4. Climate analysis of Hangzhou City (image source: drawn by the author via Ladybug).
Figure 5. Analysis of the annual wind frequency rise in Hangzhou City (image source: drawn by the author via Ladybug).
Figure 6. The main structure of the roof tiles in this study (image source: drawn by the author).
Figure 7. Ingenious drainage design of the roof (image source: drawn by the author).
Figure 8. The YOLOv8 architecture used in this study (image source: drawn by the author).
Figure 9. Changes in loss value during model training (image source: drawn by the author).
Figure 10. Performance statistics of the models at different epochs (the asterisk * in the figure indicates the median). In the figure, F1* indicates score threshold = 0.5; Recall* indicates score threshold = 0.5; Precision* indicates score threshold = 0.5 (image source: drawn by the author).
Figure 11. The confusion matrix of the 16th epoch model (image source: drawn by the author).
Figure 12. The confusion matrix of the 85th epoch model (image source: drawn by the author).
Figure 13. The confusion matrix of the 285th epoch model (image source: drawn by the author).
Figure 14. The confusion matrix of the 300th epoch model (image source: drawn by the author).
Figure 15. (A–D) Detection results of different epoch models (image source: drawn by the author).
Figure 16. Feature map analysis of the model during plant detection on tile surfaces (image source: drawn by the author).
Figure 17. Results of applying the model to newly collected images from the field (image source: drawn by the author).
Figure 18. Roof inspection design combined with UAV (image source: drawn by the author).
Table 1. Model performance statistics for different epochs.
Classification | Average Precision | Log-Average Miss Rate | F1* | Recall* | Precision* | Epoch
Dry Vegetation | 0.52 | 0.66 | 0.48 | 0.35 | 0.71 | 16
Green Vegetation | 0.54 | 0.83 | 0.16 | 0.08 | 1 | 16
Missing Tile | 0.94 | 0.19 | 0.84 | 0.8 | 0.88 | 16
Repaired Tile | 0.7 | 0.64 | 0.62 | 0.5 | 0.8 | 16
Dry Vegetation | 0.52 | 0.69 | 0.52 | 0.42 | 0.66 | 85
Green Vegetation | 0.55 | 0.82 | 0.42 | 0.28 | 0.81 | 85
Missing Tile | 0.9 | 0.19 | 0.89 | 0.8 | 1 | 85
Repaired Tile | 0.93 | 0.03 | 0.79 | 0.81 | 0.76 | 85
Dry Vegetation | 0.54 | 0.67 | 0.58 | 0.5 | 0.7 | 285
Green Vegetation | 0.53 | 0.8 | 0.49 | 0.39 | 0.73 | 285
Missing Tile | 0.9 | 0.2 | 0.89 | 0.8 | 1 | 285
Repaired Tile | 0.78 | 0.51 | 0.72 | 0.81 | 0.65 | 285
Dry Vegetation | 0.54 | 0.67 | 0.58 | 0.5 | 0.7 | 300
Green Vegetation | 0.53 | 0.8 | 0.49 | 0.36 | 0.73 | 300
Missing Tile | 0.9 | 0.2 | 0.89 | 0.8 | 1 | 300
Repaired Tile | 0.78 | 0.51 | 0.74 | 0.81 | 0.68 | 300
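The F1* column can be related to the Recall* and Precision* columns through the usual definition of F1 as the harmonic mean of precision and recall. The sketch below recomputes F1 for four rows of Table 1. The choice of rows is arbitrary, and other rows may differ in the last digit because the tabulated precision and recall are themselves rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (class, epoch, precision, recall, F1 as listed), copied from Table 1.
rows = [
    ("Missing Tile", 85, 1.00, 0.80, 0.89),
    ("Green Vegetation", 85, 0.81, 0.28, 0.42),
    ("Dry Vegetation", 285, 0.70, 0.50, 0.58),
    ("Repaired Tile", 300, 0.68, 0.81, 0.74),
]

recomputed = [round(f1_score(p, r), 2) for _, _, p, r, _ in rows]
for (name, epoch, _, _, listed), value in zip(rows, recomputed):
    print(f"{name} (epoch {epoch}): recomputed F1 = {value}, listed F1 = {listed}")
```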
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
