1. Introduction
Structural health monitoring (SHM) has evolved from a research curiosity to an engineering necessity as building infrastructure ages, safety expectations rise, and data-driven maintenance becomes economically imperative [1]. Modern building management increasingly relies on quantitative condition assessment to optimize repair schedules, extend service life, and prevent catastrophic failures [2,3]. Among measurable response parameters, strain is particularly diagnostic: it directly indicates localized material deformation, enables stress state assessment, and provides an early indication of damage initiation, from microcracking in concrete to yielding in steel [4,5,6]. Despite decades of sensor development, comprehensive strain monitoring in buildings remains rare. Conventional technologies such as electrical resistance strain gauges, fiber Bragg grating (FBG) sensors, vibrating wire gauges, and linear variable differential transformers (LVDTs) offer established accuracy but impose prohibitive barriers to scaled deployment [7,8,9]. FBG interrogators cost USD 10,000–50,000, and individual wired sensors require expensive installation labor [10,11]. Dense wiring for power and signal transmission may be incompatible with existing buildings and operationally disruptive [9,12]. Wired sensors are vulnerable to electromagnetic interference and physical damage, and replacement requires access to embedded installations [13,14,15,16].
Vision-based SHM has attracted substantial research interest as a spatially distributed alternative [17,18,19,20]. Established techniques such as digital image correlation (DIC), template matching, and feature tracking achieve subpixel displacement resolution under ideal conditions [21,22,23,24]. However, the two dominant system architectures have seen limited practical deployment in buildings. High-end vision systems centralize processing but depend on expensive acquisition hardware. Fully embedded edge-processing systems instead perform all computation at the sensing node using single-board computers such as Raspberry Pi and NVIDIA Jetson with high-speed industrial cameras. This approach introduces its own challenges: per-node costs of USD 100–500, power consumption of 5–20 W requiring dedicated electrical infrastructure, thermal management complications, and reduced flexibility for algorithm updates or system reconfiguration [25,26,27,28]. The current literature lacks practical architectures that achieve measurement accuracy comparable to conventional sensors using minimal-cost hardware while avoiding the computational constraints and inflexibility of fully embedded processing. Specifically, there is a need for systems that (i) reduce the cost per sensing point to a level comparable with a basic wired sensor, (ii) minimize power consumption so that nodes can run from USB power already integrated into building infrastructure, (iii) retain computational flexibility so that algorithms can evolve without field hardware modification, and (iv) integrate naturally with existing Wi-Fi networks. This study addresses the identified gap through architectural decoupling, that is, separating image acquisition from computational processing. The specific innovation is the use of ESP32-CAM modules, USD 5 wireless camera systems designed for IoT applications, as edge nodes that perform only image capture and transmission, with all analysis executed on centralized servers. The proposed approach differs from both high-end vision systems, which centralize processing but require expensive acquisition hardware, and edge-processing systems, which distribute computation to sophisticated and expensive local hardware.
This study develops and rigorously validates a vision-based strain monitoring system with centralized processing specifically designed for building-scale deployment. The specific contributions are (i) the development and laboratory validation of a vision-based strain monitoring system that achieves micrometer-level displacement resolution and microstrain-level strain accuracy using a USD 5 ESP32-CAM module as the edge device with centralized processing, and (ii) the demonstration of a scalable architecture with identified technical pathways for addressing the environmental robustness, cybersecurity, and network resilience requirements of field deployment in smart building applications.
The remainder of this paper is organized as follows:
Section 2 details the system architecture, hardware composition, and algorithms with explicit uncertainty analysis;
Section 3 presents laboratory validation experiments and performance metrics;
Section 4 discusses the results;
Section 5 discusses the limitations of the proposed method and the path to field deployment;
Section 6 outlines future work directions; and
Section 7 concludes the paper.
2. Materials and Methods
2.1. System Architecture
The proposed monitoring system, illustrated in Figure 1, adopts a distributed three-tier architecture that separates functionality across edge, communication, and centralized processing layers. The system consists of (i) an edge camera node, (ii) a wireless communication layer, and (iii) a centralized processing unit.
The edge node, based on the ESP32-CAM module, is responsible solely for image acquisition and wireless transmission. All computationally intensive operations, including displacement tracking, scale calibration, and strain estimation, are performed on a centralized PC. This design reduces hardware complexity at the sensing node and enables scalable deployment across multiple monitoring locations. The attributes of the edge node and the centralized system are summarized in Table 1.
However, while the centralized architecture simplifies system design, it introduces potential scalability and reliability limitations. As the number of deployed nodes increases, data transmission and processing demands may lead to network congestion and computational bottlenecks. Additionally, the centralized processing unit represents a single point of failure, where interruptions may affect monitoring continuity. In this study, the system is evaluated under limited-scale deployment, and future work will explore distributed or edge-assisted processing strategies to improve scalability and fault tolerance.
Image data are transmitted over a 2.4 GHz Wi-Fi network and processed sequentially at the central unit.
1. Edge Camera Node: The MISS-Building sensor is built on the ESP32-CAM module mounted on the monitored structural component, with only two functions: grayscale image capture and wireless image transmission.
2. Wireless Communication Layer: The system uses standard IP-based networking to transmit image frames to a centralized server over a 2.4 GHz Wi-Fi network. A store-and-forward buffering mechanism using onboard microSD storage ensures measurement continuity during intermittent network outages. Captured images are timestamped and stored locally when connectivity is unavailable, and transmission resumes automatically upon reconnection.
3. Centralized PC/BMS Analysis Layer: All computational operations are performed on a centralized PC/BMS platform. These include image decoding, preprocessing, displacement tracking using normalized cross-correlation (NCC), subpixel localization via quadratic interpolation, checkerboard-based scale calibration, and strain computation. The centralized layer also supports data logging, visualization, and alert generation.
Network and Security Considerations: While the current implementation uses standard WPA2-secured Wi-Fi, we explicitly identify cybersecurity, data encryption, and network congestion resilience as critical requirements for field deployment. The architecture supports these enhancements at the centralized layer without edge hardware modification.
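The store-and-forward behavior described for the wireless communication layer can be sketched as follows. This is a minimal illustration in Python rather than the node firmware: the class name, the `send` callback, and the in-memory queue standing in for microSD storage are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Frame = Tuple[float, bytes]  # (capture timestamp in seconds, encoded image bytes)


class StoreAndForwardBuffer:
    """Queue frames locally while the link is down and flush them in order."""

    def __init__(self, send: Callable[[Frame], bool]) -> None:
        self._send = send                        # hypothetical uplink; True on success
        self._pending: Deque[Frame] = deque()    # stands in for microSD storage

    def submit(self, timestamp: float, image: bytes) -> int:
        """Store the new frame, then try to drain the backlog oldest-first.

        Returns the number of frames still waiting after this attempt.
        """
        self._pending.append((timestamp, image))
        while self._pending:
            if not self._send(self._pending[0]):
                break                            # link still down: keep frames for later
            self._pending.popleft()
        return len(self._pending)
```

Sending the oldest frame first preserves the timestamp order at the server, so the displacement history remains continuous once the backlog has been transmitted.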
2.2. Hardware Design
2.2.1. Core Hardware Components
The edge camera node is a compact unit with four core components.
ESP32-CAM AI Thinker Module: The system uses an ESP32-CAM module equipped with an OV5640 CMOS image sensor. Although the sensor supports a native resolution of 3264 × 2448 pixels, images are captured in grayscale at 1280 × 720 to balance data efficiency and measurement performance. The sampling rate is set to 1 Hz based on system design constraints and the default acquisition configuration. This rate is sufficient for capturing quasi-static structural behavior while minimizing power consumption and communication load. However, it limits the system’s ability to capture transient or high-frequency structural responses, so the system is most suitable for long-term monitoring of gradual deformation. The total cost of the hardware components and their specifications are summarized in Table 2.
MicroSD Card: A 4 GB Class 10 microSD card provides local buffering capability. This enables continuous image storage during network outages, ensuring measurement continuity over extended durations.
Image Sensor Configuration: Images are captured in grayscale to reduce computational complexity and eliminate color-channel noise, improving template matching consistency. The camera operates using the default exposure settings provided by the ESP32-CAM module. While this simplifies configuration, variations in illumination may affect image quality and tracking performance under non-uniform lighting conditions.
Mounting Shelf: A custom stainless-steel mounting platform supports the ESP32-CAM and optical targets, including a checkerboard reference and a bullseye tracking target. The configuration ensures stable relative positioning during laboratory validation.
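As a rough sizing aid for the microSD buffer described above, the sketch below estimates how long a card can absorb a network outage. The 4 GB capacity and 1 Hz sampling rate come from the text; the 100 kB average JPEG frame size is a hypothetical placeholder, not a value measured in this study, and the function name is illustrative.

```python
def buffer_duration_hours(card_bytes: int, frame_bytes: int, rate_hz: float) -> float:
    """Hours of outage a card can absorb if every captured frame is stored."""
    if frame_bytes <= 0 or rate_hz <= 0:
        raise ValueError("frame_bytes and rate_hz must be positive")
    frames = card_bytes // frame_bytes       # whole frames that fit on the card
    return frames / rate_hz / 3600.0         # seconds of capture -> hours


# 4 GB card and 1 Hz sampling are from the text; 100 kB per frame is an ASSUMED size.
hours = buffer_duration_hours(4 * 10**9, 100 * 10**3, 1.0)
```

Under that assumption the card holds 40,000 frames, or roughly 11 h of capture at 1 Hz. File-system overhead and larger frames would reduce this figure.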
2.2.2. Optical Targets
Bullseye Target: The bullseye target, shown in Figure 2, is attached to the moving structural component. Its radial symmetry supports robust detection and provides partial invariance to rotational misalignment.
Checkerboard Reference Pattern (1 mm square size): The reference checkerboard pattern is fixed to a stationary structural surface. It is used for pixel-to-physical scale calibration and periodic recalibration to compensate for drift.
2.2.3. Usage and Energy Efficiency
The main design criteria are functionality, low cost, low power consumption, and ease of deployment, all of which are necessary for the system to be suitable for smart building SHM.
Power Consumption: The system draws a nominal 1 W during operation. This low power demand enables 66 h of continuous operation from a single 10,000 mAh portable battery, or permanent operation from the smart building’s existing 5 V DC USB power supply. This eliminates the need for dedicated high-voltage or high-current power infrastructure.
Deployment Workflow: Sensor deployment follows a simple two-step workflow. First, the strain transfer mechanism and the shelf are attached to the structural attachment point using an M4 bolt. Second, the ESP32-CAM is connected to a 5 V power supply within the smart building network, after which image transmission to the centralized PC begins immediately.
2.3. Algorithms and Uncertainty
2.3.1. Algorithm Overview
The monitoring framework follows a sequential image-based processing pipeline consisting of image acquisition, transmission, preprocessing, displacement tracking, and strain computation. Image frames captured at the edge node are transmitted to a centralized processing unit for analysis. The algorithm is designed for one-dimensional horizontal displacement tracking and incorporates a region of interest (ROI), confidence-based tracking validation, and periodic scale recalibration to improve robustness. The overall workflow is illustrated in Figure 3.
2.3.2. Image Acquisition and Compression Effects
Image frames are captured at fixed sampling intervals and encoded in JPEG format prior to transmission. JPEG encoding reduces data size and improves transmission efficiency; however, the compression ratio depends on the scene content and default camera configuration.
JPEG compression is inherently lossy and may introduce artifacts that affect pixel-level intensity distributions. These artifacts are particularly relevant for subpixel localization, which relies on accurate intensity gradients around correlation peaks. In this study, the compression configuration was not explicitly optimized, and its impact on displacement accuracy was not systematically quantified.
Although the experimental results show agreement with reference measurements under controlled conditions, the influence of compression artifacts under varying imaging conditions such as low contrast, noise, or illumination variability remains an important area for further investigation.
2.3.3. Initialization and Scale Calibration
At the start of monitoring, an initialization procedure is performed on the first valid image frame, as shown in Figure 4. The bullseye target is detected using the circular Hough transform, and a template is extracted for subsequent tracking. The checkerboard pattern is detected using Harris corner detection, and geometric consistency is verified using RANSAC-based homography estimation.
The pixel-to-physical scale is determined using the known checkerboard square size (1 mm). The median inter-corner pixel distance is used to define the conversion factor:
$$\mathrm{px\_per\_mm} = \frac{\operatorname{median}_i\,(d_i)}{s}$$
where px_per_mm denotes the number of pixels per millimeter, $d_i$ is the pixel distance between the $i$-th pair of adjacent checkerboard corners, and $s = 1$ mm is the checkerboard square size.
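The scale computation can be illustrated with a short Python sketch. It assumes that corner detection has already produced a row-major grid of corner coordinates in pixels; that input layout, the function name, and the use of both horizontal and vertical neighbours are assumptions for illustration, not details confirmed by the study.

```python
import math
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def px_per_mm(corners: List[List[Point]], square_mm: float = 1.0) -> float:
    """Median distance between adjacent grid corners, divided by the square size.

    `corners` is a row-major grid of detected corner coordinates in pixels
    (an ASSUMED input layout; corner detection itself is not shown).
    """
    distances = []
    for r, row in enumerate(corners):
        for c, p in enumerate(row):
            if c + 1 < len(row):                  # horizontal neighbour
                distances.append(math.dist(p, row[c + 1]))
            if r + 1 < len(corners):              # vertical neighbour
                distances.append(math.dist(p, corners[r + 1][c]))
    if not distances:
        raise ValueError("need at least two adjacent corners")
    return median(distances) / square_mm
```

Using the median rather than the mean keeps the scale estimate stable when a small number of corners are mislocated.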
2.3.4. Displacement Tracking
For each frame, the horizontal position of the bullseye target is updated using a template-matching approach constrained to one-dimensional motion. Unlike feature-based methods such as optical flow, normalized cross-correlation (NCC) does not require distinctive keypoints and is well suited for the symmetric, repetitive structure of the bullseye target.
The NCC is computed between a stored reference template and a local search region centered around the previous target position. This localized search reduces the computational cost and improves tracking stability. NCC is selected for its deterministic behavior, robustness to uniform illumination variations, and effectiveness in tracking repetitive patterns.
Tracking confidence is evaluated based on correlation peak magnitude and inter-frame displacement consistency. When the confidence falls below a predefined threshold, a reinitialization procedure is triggered to re-detect the target and update the template.
While NCC performs well for translational motion under controlled conditions, it remains sensitive to geometric transformations such as rotation, deformation, and partial occlusion. The radial symmetry of the bullseye target provides partial robustness to rotation; however, tracking accuracy may degrade under more complex structural motion.
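The localized search and confidence check can be sketched as below. For brevity the example operates on a one-dimensional intensity profile rather than a two-dimensional template, uses the zero-mean form of NCC, and tests only the peak magnitude; the 0.8 threshold and all function names are hypothetical. The bounded search window stands in for the inter-frame consistency check.

```python
import math
from typing import Sequence, Tuple


def ncc(a: Sequence[float], b: Sequence[float]) -> float:
    """Zero-mean normalized cross-correlation of two equal-length profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0                      # flat window: no usable correlation
    return sum(x * y for x, y in zip(da, db)) / denom


def track(profile: Sequence[float], template: Sequence[float], prev_x: int,
          half_width: int, min_peak: float = 0.8) -> Tuple[int, float, bool]:
    """Search within half_width of prev_x; return (best_x, peak, is_confident)."""
    n = len(template)
    lo = max(0, prev_x - half_width)
    hi = min(len(profile) - n, prev_x + half_width)
    best_x, best = prev_x, -1.0
    for x in range(lo, hi + 1):
        score = ncc(profile[x:x + n], template)
        if score > best:
            best_x, best = x, score
    # A low peak signals that re-detection of the target should be triggered.
    return best_x, best, best >= min_peak
```

Because the score is normalized by the local mean and variance, a uniform gain or offset in brightness leaves the peak value unchanged, which is the robustness property noted above.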
2.3.5. Subpixel Localization
To improve accuracy beyond pixel resolution, subpixel localization is applied to refine the detected target position. Quadratic interpolation is performed around the peak of the correlation response to estimate the subpixel offset, as shown in Figure 5. Let $R(x)$ denote the correlation response, where $x$ is the pixel coordinate along the horizontal search axis, with a discrete maximum at pixel location $x_0$. The subpixel offset $\delta x$ is computed as
$$\delta x = \frac{R(x_0 - 1) - R(x_0 + 1)}{2\left[R(x_0 - 1) - 2R(x_0) + R(x_0 + 1)\right]}$$
The refined bullseye position is expressed as
$$\hat{x} = x_0 + \delta x$$
This approach enables subpixel displacement estimation while maintaining computational efficiency. However, the accuracy of the refinement depends on the quality of the correlation response and may be affected by noise, compression artifacts, and reduced contrast.
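A minimal Python sketch of three-point quadratic peak refinement is given below. It fits a parabola through the discrete maximum of the correlation response and its two neighbours and returns the vertex location; the function name and the fallback to the integer peak for edge or degenerate cases are assumptions made for the example.

```python
from typing import Sequence


def refine_peak(response: Sequence[float]) -> float:
    """Return the subpixel peak location of a 1-D correlation response."""
    x0 = max(range(len(response)), key=response.__getitem__)
    if x0 == 0 or x0 == len(response) - 1:
        return float(x0)                       # peak on the border: no refinement
    left, peak, right = response[x0 - 1], response[x0], response[x0 + 1]
    curvature = left - 2.0 * peak + right      # negative at a true maximum
    if curvature >= 0.0:
        return float(x0)                       # flat or degenerate peak
    return x0 + 0.5 * (left - right) / curvature
```

For a response that is exactly parabolic the vertex is recovered without error; for real correlation surfaces the residual depends on noise and on how closely the peak resembles a parabola.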
2.3.6. Scale Recalibration
Although the checkerboard reference is assumed to remain stationary, slight movement due to thermal expansion, vibration, or mounting imperfections may introduce scale drift over long monitoring durations. To mitigate this, periodic checkerboard detection is performed to update the scale factor and reference position. However, this approach assumes stability of the reference region between recalibration intervals, and long-term performance under real-world conditions requires further validation.
2.3.7. Displacement and Strain Computation
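The horizontal displacement of the bullseye relative to its initial position was computed as follows:
$$\Delta x(t) = \frac{\hat{x}(t) - \hat{x}(t_0)}{\mathrm{px\_per\_mm}}$$
where $\hat{x}(t)$ is the refined bullseye position in pixels in the frame captured at time $t$, $\hat{x}(t_0)$ is its position in the initialization frame, and $\Delta x(t)$ is expressed in millimeters.
The axial strain, expressed in microstrain, was calculated as
$$\varepsilon(t) = \frac{\Delta x(t)}{L_0} \times 10^{6}$$
where $L_0$ is the gauge length in millimeters over which the displacement is measured.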
This formulation enables strain estimation suitable for structural components, where conventional wired sensors may be impractical.
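The conversion from tracked pixel position to microstrain can be sketched in a few lines of Python. The gauge length is not stated in this section, so the 200 mm value used in the example call is purely hypothetical, as are the function names and the sample pixel positions.

```python
def displacement_mm(x_px: float, x_init_px: float, px_per_mm: float) -> float:
    """Horizontal displacement relative to the initial position, in millimeters."""
    if px_per_mm <= 0:
        raise ValueError("px_per_mm must be positive")
    return (x_px - x_init_px) / px_per_mm


def strain_microstrain(disp_mm: float, gauge_length_mm: float) -> float:
    """Engineering strain (displacement over gauge length) in microstrain."""
    if gauge_length_mm <= 0:
        raise ValueError("gauge_length_mm must be positive")
    return disp_mm / gauge_length_mm * 1e6


# Illustrative numbers only: a 0.5 px shift at 10 px/mm over an ASSUMED 200 mm gauge length.
d = displacement_mm(100.5, 100.0, 10.0)
eps = strain_microstrain(d, 200.0)
```

With these illustrative inputs the displacement is 0.05 mm and the strain is 250 µε, which shows how the pixel scale and the gauge length together set the strain resolution.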
3. Laboratory Performance Evaluation
3.1. Experimental Details
Laboratory experiments were conducted to evaluate the performance of the proposed vision-based strain monitoring system under quasi-static and static load conditions. The experiments conducted assessed static accuracy, repeatability, stability, and sensing range.
The experimental setup, shown schematically in Figure 6, consisted of a fixed platform and a platform driven by a precision stepper motor (positioning accuracy: 0.03 mm, phase voltage: 3.6 V, holding torque: 0.28 N·m, maximum speed: 100 mm/s², and drive voltage: 24 V) to generate controlled relative displacements. Both platforms were aligned along the same horizontal plane. The MISS-Building sensing unit was rigidly mounted on the optical platform. A DM8-02-5V LVDT, manufactured by Guangzhou Ceheng Technology Co., Ltd., Guangzhou, China, was used as the benchmark displacement sensor. It features a pen-type design with a spring return mechanism and a stainless-steel housing of 8 mm diameter. Its key technical parameters include a measurement range of 0–20 mm, a precision of 1 µm, an output voltage of 0–5 V, a linearity of 0.036%, and a sensitivity of 2.5029 V/mm. The LVDT was mounted parallel to the direction of motion on the movable platform and connected to a dedicated data-acquisition unit.
The sensing node was powered by a 5 V supply, and wireless communication was provided via a portable modem. All image processing and strain computations were performed on a PC-based processing unit.
3.2. Zero-Drift Test
The long-term stability of the proposed system was evaluated through a 24 h zero-drift test under controlled laboratory conditions. During this period, no external load was applied and the relative position of the measurement targets remained fixed.
The displacement measurements remained bounded within ±2 µm over the entire monitoring duration, corresponding to strain fluctuations within approximately ±6 µε. The results, shown in Figure 7, indicate stable baseline behavior with no significant drift, demonstrating the temporal stability of both the sensing hardware and the processing pipeline.
The minimal drift observed over the 24 h period indicates stable system performance under controlled laboratory conditions, supporting its effectiveness for monitoring quasi-static structural behavior.
3.3. Sensing Range
The measurable displacement and strain ranges of the system were evaluated through incremental displacement testing. A step size of 100 µm was applied incrementally until the observable limit of the imaging system was reached.
As shown in Figure 8, the system achieved an effective strain measurement range of approximately ±35,000 µε. This range is primarily governed by optical configuration parameters, including the field of view, target size, and image resolution, and can be adjusted based on application requirements.
3.4. Static and Quasi-Static Performance
Static step-loading experiments were performed to evaluate the accuracy, repeatability, and stability of the proposed sensing system under controlled displacement inputs. Three displacement step sizes, 50 µm, 25 µm, and 15 µm, were selected to represent typical incremental deformation levels encountered in static or slowly varying structural responses. For each displacement step, corresponding measurements were recorded using the LVDT and the MISS-Building sensor. The strain responses obtained for the different loading cases are shown in Figures 9–13.
Measurement errors were computed as the difference between the strain values obtained from the proposed system and those measured by the LVDT, as shown in Table 3. For the 50 µm loading case, the maximum error (MAXE), mean absolute error (MAE), and maximum percentage error were 4.93 µε, 2.63 µε, and 3.41%, respectively. For the 25 µm loading case, the corresponding values were 2.47 µε, 1.31 µε, and 3.54%. For the 15 µm loading case, they were 3.64 µε, 1.24 µε, and 8.60%.
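The three error metrics can be computed from paired readings as sketched below. The definition of percentage error as the absolute error divided by the magnitude of the reference value is an assumption, since the text does not state the normalization used, and the function name is illustrative.

```python
from typing import Dict, Sequence


def error_metrics(measured: Sequence[float], reference: Sequence[float]) -> Dict[str, float]:
    """Maximum error, mean absolute error, and maximum percentage error.

    Percentage error is taken relative to the reference reading (ASSUMED
    definition); reference values of zero are skipped for that metric.
    """
    if len(measured) != len(reference) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(m - r) for m, r in zip(measured, reference)]
    pct = [100.0 * e / abs(r) for e, r in zip(abs_err, reference) if r != 0.0]
    return {
        "maxe": max(abs_err),
        "mae": sum(abs_err) / len(abs_err),
        "max_pct": max(pct) if pct else float("nan"),
    }
```

Because the percentage metric divides by the reference magnitude, a similar absolute error produces a larger percentage at smaller step sizes, consistent with the higher maximum percentage error reported for the 15 µm case.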
Repeated loading sequences produced consistent strain responses, indicating stable and repeatable sensor behavior under static loading conditions. These results demonstrate that the proposed system is capable of resolving small static strains relevant to building and civil engineering applications.
The observed maximum absolute error of 1.8 με for the quasi-static performance reflects the combined influence of mechanical tolerances in the strain transfer mechanism, image noise, and subpixel estimation uncertainty, as shown in Table 4.
4. Discussion
The experimental results demonstrate that the proposed vision-based monitoring system is capable of measuring static and quasi-static strain responses using a low-cost imaging platform. Across all laboratory evaluations, including zero-drift testing, sensing range assessment, and step-loading experiments, the system exhibits stable performance and consistent agreement with reference LVDT measurements under controlled conditions.
The 24 h zero-drift test indicates stability of the sensing framework, with displacement variations bounded within ±2 μm, corresponding to strain fluctuations of approximately ±6 με. This suggests that the system maintains a stable baseline under controlled laboratory conditions, which is essential for long-term monitoring applications where distinguishing the true structural response from measurement noise is critical. However, this evaluation is limited to controlled conditions, and long-term stability under varying environmental influences requires further investigation.
The sensing range results highlight that measurement capability is governed not only by algorithmic performance but also by optical configuration parameters such as field of view, target size, and image resolution. These factors define the achievable measurement range and provide flexibility in adapting the system to different structural monitoring scenarios.
Static step-loading experiments further demonstrate the accuracy and repeatability of the proposed system. The close agreement with LVDT measurements across all displacement levels confirms that the system can resolve small service-level strains relevant to building applications. As expected, measurement uncertainty increases at smaller displacement magnitudes; however, the errors remain within acceptable bounds for practical SHM deployment.
A key contribution of this work lies in the system-level architecture rather than in the development of a novel image-processing algorithm. By decoupling image acquisition from strain computation, the proposed framework enables the use of low-cost sensing nodes while leveraging centralized processing for high-precision analysis. This approach addresses a critical scalability challenge in vision-based SHM, where cost, power consumption, and computational limitations at the sensing node often restrict deployment.
The proposed architecture aligns naturally with smart building infrastructure, where wireless networks and centralized processing systems are already available. This enables distributed multi-point monitoring without complex wiring or high-end sensing hardware, making the system well suited for deployment in building environments. Compared to closely related vision-based strain sensing systems, such as the MISS-Dym sensor system proposed by Bai et al. [28], which employs higher-cost embedded processing hardware, the proposed system achieves comparable sensing functionality at significantly reduced cost.
However, the current system is primarily designed for static and quasi-static applications. The use of a 1 Hz sampling rate and reliance on centralized processing limit its suitability for dynamic structural monitoring scenarios. Additionally, factors such as illumination variability, compression artifacts, and geometric transformations (e.g., rotation or occlusion) may affect tracking robustness under real-world conditions. These limitations highlight the need for further development to ensure reliable performance in long-term field deployment.
5. Limitations
While laboratory validation establishes the foundational performance of the proposed system, several critical limitations remain that constrain immediate real-world deployment. These limitations arise from differences between controlled experimental conditions and complex field environments, as well as from current system design assumptions and implementation choices.
A primary limitation relates to environmental robustness, particularly illumination variability. To reduce the influence of lighting fluctuations during validation, a constant LED panel illumination source was included in the sensor setup to provide stable and uniform lighting conditions. This controlled setup minimized variations in pixel intensity and ensured consistent feature contrast, thereby improving the reliability of template matching and subpixel localization. However, this approach does not fully represent real-world conditions, especially in outdoor or semi-exposed environments, where factors such as direct sunlight, shadows, reflections, and weather-induced lighting changes can introduce significant variability. These effects may degrade image quality, reduce contrast, and introduce noise, ultimately affecting displacement estimation accuracy.
More broadly, the system was validated under controlled laboratory conditions with a stable temperature and minimal disturbance. In real-world environments, additional factors such as airborne particulates, humidity, and structural vibrations may further degrade image quality and affect feature detection and subpixel localization. These influences may introduce noise or bias in displacement estimation, particularly under low-contrast or partially occluded conditions.
A second limitation is the restriction of the current implementation to one-dimensional horizontal displacement measurement under quasi-static conditions. In practice, structural systems experience multi-axis deformation, including vertical displacement, rotation, and dynamic loading across a wide frequency range. Furthermore, the current sampling rate of 1 Hz limits the ability of the system to capture transient or high-frequency responses, such as those induced by wind, traffic, or seismic activity. Consequently, the present system is primarily suited for monitoring slow or quasi-static structural behavior.
The mechanical strain transfer mechanism introduces a non-negligible source of uncertainty, estimated at approximately ±6 µm due to backlash, alignment imperfections, and component tolerances. This uncertainty is comparable to the resolution of the vision-based measurement, indicating that the mechanical subsystem currently constrains the overall system accuracy. Additionally, long-term deployment may be affected by thermal expansion, material creep, or the slight movement of the reference checkerboard surface, which may introduce drift in strain estimation over extended periods.
Another important limitation relates to the use of JPEG compression for image transmission. While compression reduces data size and communication load, it introduces lossy artifacts that may affect pixel-level intensity distributions, particularly under low-contrast or noisy imaging conditions. These artifacts can influence template matching accuracy and subpixel localization, potentially introducing small but systematic errors in displacement estimation. The present study does not explicitly quantify the impact of different compression levels on measurement accuracy, which remains an important area for further investigation.
From a system architecture perspective, the current implementation relies on the centralized processing of transmitted image data. While effective for small-scale deployments, this approach introduces potential scalability challenges when extended to large sensor networks. Increasing the number of nodes may lead to network congestion, higher latency, and processing bottlenecks. Additionally, centralized architectures introduce a single point of failure, where loss of the processing node may interrupt monitoring. The current system also does not incorporate redundancy or distributed fallback mechanisms.
Network reliability and cybersecurity also present important considerations. The system currently relies on standard WPA2-secured Wi-Fi without additional application-level encryption or authentication mechanisms, and its performance under network congestion, packet loss, or extended outages has not been characterized. These factors may affect data integrity, latency, and overall system reliability, particularly in shared building networks.
Finally, the displacement estimation algorithm, based on normalized cross-correlation, provides computational efficiency but remains sensitive to illumination variation, partial occlusion, and geometric transformations such as rotation and deformation. While effective for controlled translational motion, its robustness may degrade under complex field conditions.
Collectively, these limitations define the current boundary between laboratory validation and scalable field deployment, highlighting the need for further development to achieve robust long-term operation under realistic environmental and structural conditions.
6. Future Work
Future research will focus on extending the proposed system toward robust field deployment and large-scale structural health monitoring applications.
A key direction is the development of multi-axis displacement and strain measurement capabilities using stereo vision or multi-camera configurations, enabling the capture of vertical, rotational, and complex structural deformations. Additionally, increasing the sampling rate will allow the system to capture dynamic structural responses, including transient and high-frequency events.
To address environmental variability, future work will explore advanced illumination handling strategies, including adaptive exposure control, high-dynamic-range (HDR) imaging, and illumination-invariant preprocessing techniques. Furthermore, learning-based approaches such as DeepLab- and EfficientNet-based models will be investigated to improve robustness against lighting changes, occlusion, and geometric transformations [29,30].
The impact of JPEG compression on subpixel localization and displacement accuracy will be systematically evaluated across different compression levels and imaging conditions to better understand its influence on measurement reliability.
From a system design perspective, future developments will focus on improving scalability and resilience through the integration of edge-based processing and distributed computation, reducing reliance on centralized architectures and minimizing network load. Redundant communication strategies and fault-tolerant system designs will also be explored to ensure continuous operation in the presence of network failures.
Long-term field validation will be conducted to assess system performance over extended periods, including the effects of thermal expansion, environmental exposure, and structural aging on measurement stability and drift.
Finally, efforts will be directed toward simplifying the mechanical strain transfer mechanism, including the use of direct surface-mounted optical targets, in order to reduce system complexity, cost, and mechanically induced uncertainty.
7. Conclusions
This study presented a low-cost, vision-based strain monitoring system designed for static and quasi-static building monitoring applications. The proposed approach adopts a distributed architecture in which compact camera nodes are used exclusively for image acquisition and wireless transmission, while displacement tracking and strain computation are performed using centralized PC-based processing.
Laboratory experiments confirmed stable measurement behavior across all evaluation scenarios. The system achieved zero-drift within ±2 µm (±6 µε) over a 24 h period, a sensing range of ±35,000 µε, and mean absolute errors of 1.24–2.63 µε across the static loading cases, demonstrating consistent agreement with the reference LVDT sensor. Static step-loading tests with displacement increments of 50 µm, 25 µm, and 15 µm confirmed the system’s ability to resolve small strains relevant to service-level structural assessment. The achievable sensing range depends primarily on optical configuration and measurement geometry, allowing flexibility in adapting the system to different monitoring requirements.
Rather than introducing a novel image-processing technique, the primary contribution of this work lies in the system-level design and experimental validation of a viable monitoring architecture. By separating sensing from computation, the proposed framework reduces hardware complexity at the sensing node and supports deployment at scale using low-cost components, making it well-suited for building-scale monitoring scenarios where installation simplicity, cost efficiency, and multi-point sensing capability are essential.
Future work will focus on extending the proposed framework beyond the controlled laboratory conditions considered in this study. One important direction is the evaluation of system performance under varying environmental conditions, including changes in lighting, temperature, and camera alignment, which are commonly encountered in real building environments. Additional investigations will consider longer monitoring durations and field deployments to assess long-term robustness and data continuity. While the present study focuses on static and quasi-static loading, future research may also explore adaptations of the framework for low-frequency dynamic response, provided that suitable sampling strategies and synchronization methods are implemented. Further improvements may include the integration of automated quality assessment metrics, adaptive image acquisition strategies, and streamlined data management for large-scale deployments.