A Precision Measurement Method for Rooftop Photovoltaic Capacity Using Drone and Publicly Available Imagery

Hu, Yue; Liu, Yuce; Zhang, Yu; Dong, Hongwei; Li, Chongzheng; Mao, Hongzhi; Wang, Fusong; Wang, Meng

doi:10.3390/buildings15183377

by Yue Hu ¹, Yuce Liu ¹, Yu Zhang ²,³, Hongwei Dong ²,³, Chongzheng Li ⁴, Hongzhi Mao ⁴, Fusong Wang ⁵,⁶,* and Meng Wang

¹ China Three Gorges Corporation, Wuhan 420010, China
² China Yangtze Power Co., Ltd., Beijing 100038, China
³ Three Gorges Electric Energy Co., Ltd., Wuhan 430000, China
⁴ School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
⁵ School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
⁶ School of Safety Science and Emergency Management, Wuhan University of Technology, Wuhan 430070, China
* Author to whom correspondence should be addressed.

Buildings 2025, 15(18), 3377; https://doi.org/10.3390/buildings15183377

Submission received: 12 August 2025 / Revised: 4 September 2025 / Accepted: 13 September 2025 / Published: 17 September 2025

(This article belongs to the Special Issue Research on Solar Energy System and Storage for Sustainable Buildings)

Abstract

Against the global backdrop of energy transition, precise assessment of urban rooftop photovoltaic (PV) system capacity is crucial for optimizing the energy structure and improving the sustainable use of spatial resources. Publicly available aerial imagery is typically not orthorectified; using it directly leads to geometric distortion of rooftop PV arrays and errors in capacity prediction. To address this, this study proposes a dual-optimization framework that integrates monocular vision-based 3D reconstruction with a lightweight linear model. Leveraging the orthogonal characteristics of building structures, camera self-calibration and 3D reconstruction are achieved through geometric constraints imposed by vanishing points. Scale distortion is suppressed by an error control strategy based on multi-dimensional geometric constraints. Concurrently, a linear capacity-area model is constructed, avoiding the complexity of traditional multi-parameter fitting. Using drone oblique photography and Google Earth public imagery, 3D reconstruction was performed for 20 PV-equipped buildings in Wuhan City. Two buildings with high-precision field survey data were selected as typical experimental subjects for validation. The results show that the 3D reconstruction method reduced the mean absolute percentage error (MAPE), used here as an estimator of measurement uncertainty, of PV area identification from 10.58% (achieved by the 2D method) to 3.47%, while the coefficient of determination (R²) for the capacity model reached 0.9548. These results suggest that the methodology can provide effective technical support for low-cost, high-precision urban rooftop PV resource surveys. It has the potential to significantly enhance the reliability of energy planning data, thereby contributing to the efficient development of urban spatial resources and the achievement of sustainable energy transition goals.

Keywords: non-orthorectified imagery; urban rooftop photovoltaics; 3D reconstruction; photovoltaic capacity estimation; sustainable energy planning

1. Introduction

Against the macro background of globally accelerating the low-carbon transition of the energy structure and actively implementing decarbonization strategic goals, the construction of sustainable energy systems has become a core issue in urban development [1]. Distributed photovoltaic (PV) systems, particularly urban rooftop PV systems, are increasingly emerging as a crucial vehicle for cities to achieve a cleaner and low-carbon energy supply, alongside the intensive utilization of space resources [2]. This prominence stems from their significant advantages: a highly flexible deployment mode compatible with built environments, considerable potential for scaled application, and an almost negligible occupation of valuable land resources [3,4]. Within this context, the high-precision and high-efficiency assessment of the actual installed capacity of existing rooftop PV installations carries significance that extends far beyond simple resource accounting. It not only constitutes fundamental data essential for scientifically planning regional energy structures, optimizing power dispatch and consumption, and supporting the sustainable transformation of urban energy systems, but also, as it directly quantifies the energy output efficiency of cities’ finite three-dimensional spatial resources (especially rooftop space), represents a core technical challenge. Addressing it is essential for enhancing the sustainable development and utilization efficiency of urban spatial resources [5].

This study focuses on the high-precision estimation of the installed capacity of existing urban rooftop photovoltaic (PV) systems [6]. Similar to research emphasizing PV resource potential prediction [3,4,7,8], reliance is placed on publicly available imagery (e.g., satellite images and aerial drone photography) for the image recognition of PV components. However, the widely utilized public data sources prevalent in current mainstream works (e.g., Malof et al. [9,10], Mayer et al. [11]) still face two critical bottlenecks hindering accuracy improvement. Firstly, non-orthorectified projection effects, primarily caused by the image acquisition perspective (such as satellite side-looking imaging or drone oblique photography), lead to significant geometric distortions in the imagery. Buildings and PV arrays are distorted due to perspective relationships, manifesting as facade tilting and roof compression or stretching. Secondly, there exists high uncertainty in the imaging methodology. These images are often derived from complex sources with unknown acquisition parameters. Specifically, critical metadata—including shooting angle, flight altitude, camera orientation, and focal length parameters—is not publicly available. Furthermore, the imagery is obtained from diverse times, devices, and platforms. This irregularity makes it difficult to establish a universal geometric correction model, thereby increasing the uncontrollability of systematic errors. If the recognition results from such non-orthorectified imagery are directly utilized for estimating urban rooftop PV capacity, significant systematic errors will inevitably be introduced. The root cause of this error lies in the geometric distortion induced by non-orthorectified projection: roof areas may be stretched or compressed within the imagery due to building tilt and height differentials, resulting in severe distortion of scale information along the direction perpendicular to the imaging line-of-sight. 
Empirical results show that the extent of facade projection deformation (expressed as the ratio of projected facade area to actual area) is strongly and positively correlated with the PV area estimation errors of 2D methods, with Pearson correlation coefficients of 0.870, 0.909, and 0.843 for the north, south, and east facades, respectively (see Section 3.1). This core issue severely constrains the accuracy and credibility of large-scale urban rooftop PV surveys based on low-cost, accessible imagery and, consequently, impedes the accurate assessment of urban sustainable energy potential.

To address non-orthorectification errors, various approaches have been explored. For instance, Liang et al. used orthorectified imagery [12], and Li et al. applied LiDAR technology [13]. These methods provide high-precision geometric information. However, their high acquisition costs and complex processing workflows pose significant obstacles to city-wide application, which limits their scalability and hinders their use in sustainable large-scale census operations. In parallel, progress has been made in identifying urban rooftop PV systems through deep learning methods [5,9,10,14,15]. However, the two-dimensional outlines identified by these approaches still fail to resolve the geometric distortions introduced by non-orthorectified projection. As a result, the error in installed capacity estimation remains substantial.

A recent study by So et al. [6] shares our objective of utilizing publicly available imagery for low-cost assessment. Nonetheless, their method estimates capacity by fitting a model based on the visible (2D) surface area and color information of PV arrays, deliberately ignoring the 3D tilt angle. This simplification, while innovative, introduces a high dependency on image acquisition conditions (e.g., solar-PV relative position) and PV module types, making it potentially less robust and generalizable across diverse datasets characterized by complex imagery and varied PV installations.

To address these challenges, this study leverages image visual information [16,17] and introduces a monocular vision-based 3D reconstruction technique based on vanishing points (a technique extensively explored and preliminarily applied in the autonomous driving and robotics domains [18,19,20,21]). This approach is combined with a lightweight linear capacity prediction model to establish a dual-optimization framework [6]. The framework is designed to achieve low-cost, high-precision extraction of existing photovoltaic (PV) areas on urban rooftops, thereby enabling accurate capacity prediction. The core innovations of this framework are as follows:

(1): A single publicly available aerial/satellite image is employed, significantly reducing surveying costs compared to technologies such as LiDAR.
(2): Based on the Hankou University case study, the correlation between facade projection and PV area estimation error is quantified for the first time, revealing the systematic error source inherent in 2D methods.
(3): To rectify the non-orthorectified issues prevalent in public imagery, a novel camera calibration and 3D reconstruction method is developed. This method leverages prior knowledge of building orthogonal structures and vanishing point geometric constraints to eliminate perspective distortion.
(4): A closed-loop error optimization process incorporating multi-dimensional geometric constraints (with a 5% residual threshold) is designed. This process mitigates scale distortion induced by image quality deficiencies and annotation errors, thereby ensuring PV reconstruction accuracy.
(5): A manufacturer-data-driven linear capacity-area model (trained on 215 samples) is constructed, circumventing the parameter complexity associated with traditional physical modeling.

To verify the reliability of the framework, this study utilized typical buildings in Wuhan City as empirical subjects. Combined with UAV oblique photography data, the technical advantages of the proposed method in PV area identification and capacity prediction were validated. This framework is demonstrated to provide a cost-accuracy balanced solution for urban rooftop photovoltaic power capacity assessment [1,9]. It is thereby considered to effectively support urban energy planning and the implementation of the “dual carbon” goals. Furthermore, a replicable and scalable methodological paradigm is established for large-scale, sustainable urban rooftop PV resource surveys based on publicly available satellite remote sensing imagery, significantly enhancing the feasibility of decarbonization strategy implementation.

The discussion in this paper follows the logical framework of “problem-driven, methodological innovation, and empirical verification”. Section 1 systematically explains the research objective of achieving cost-accuracy balanced extraction of existing rooftop PV areas and capacity prediction, which is conducted using publicly available aerial imagery. Subsequently, to address the non-orthorectified issues inherent in public imagery (where the correlation between facade projection and PV area estimation error was quantified based on a typical case in Section 3.1), a dual optimization framework integrating monocular 3D reconstruction and a lightweight linear model is proposed. Section 2 provides a detailed exposition of the technical workflow. Section 3 presents the results and discussion.

2. Methodology

2.1. Dataset

2.1.1. Photovoltaic Array Image Dataset

Images of 20 valid PV-equipped buildings in Wuhan were collected in this study. The image data comprise publicly available Google Earth [22] imagery and self-collected high-resolution drone images. The drone data were acquired with a DJI Mavic 3 Classic drone (SZ DJI Technology Co., Ltd., Shenzhen, China), which is equipped with a 4/3-inch CMOS image sensor with 20 million effective pixels, an 84° field of view, and an equivalent focal length of 24 mm.

The publicly available satellite imagery was sourced from Google Earth Pro (version 7.3.6.10201, Google LLC, Mountain View, CA, USA). The images were accessed between August 2022 and April 2025, and the capture dates for the individual building images vary within this period. The specific imagery used for each building is the most recent cloud-free version available on the platform at the time of our study. The use of this imagery for academic research is in accordance with Google’s Terms of Service, permitted under “Fair Use” for non-commercial, educational purposes.

This study focuses on the assessment of photovoltaic (PV) systems on building rooftops. The inherent orthogonal structural features of buildings (such as mutually perpendicular walls and roof edges) are employed as geometric constraints for camera calibration; these features simultaneously constitute the key criteria for building screening. The accuracy of 3D reconstruction is benchmarked against the requirements for PV area estimation. An LOD1 (Level of Detail 1) basic block model, where roofs are simplified and represented as rectangles, is demonstrated to effectively support the geometric error optimization process. This approach ensures the required accuracy, while computational efficiency is significantly enhanced. The LOD specifications for the 3D building models are illustrated in Figure 1.

During the experiment, the Comprehensive Building and No. 3 Teaching Building of Hankou University in Wuhan were selected as typical experimental subjects. High-definition images of these two buildings are shown in Figure 2. Their low occlusion rate and prominent structural features provide an ideal geometric reference for 3D reconstruction and effectively support subsequent accuracy verification. The remaining 18 buildings were used for qualitative assessment and method development.

Key geometric parameters of the buildings and PV components were obtained in detail through field surveys (e.g., roof dimensions, building height, PV tilt angle, and orientation). Among these, roof dimensions, PV dimensions, and building height were used as benchmark data for model validation, while the tilt angle and orientation were employed as known parameters to support the 3D modeling process.

The photographic altitude was set at 250 m relative to the roof plane. This altitude was determined to balance the requirements for image resolution and three-dimensional feature capture, resulting in a ground resolution of 2.7 cm for the captured photovoltaic (PV) rooftops. High-definition images with varying degrees of non-orthogonality were acquired through multi-angle non-orthographic photography. Ultimately, a standardized dataset was constructed, containing images of the PV arrays along with their precise dimensions and the corresponding building dimensions.

The UAV flights were conducted in compliance with local regulations for low-altitude, lightweight UAV operations, which were permitted for this academic research project.

2.1.2. Capacity-Area Manufacturer Dataset

The capacity model was constructed based on product technical parameter manuals from 19 mainstream PV manufacturers, which encompass technical specifications for over 200 different PV module models. Key indicators were defined as power capacity and dimensional specifications. During data processing, power units (W) and area units (m²) were standardized. The Interquartile Range (IQR) method was employed to identify and remove outliers, resulting in 215 valid samples. This standardized database provides reliable support for modeling and analyzing the capacity–area relationship.
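The outlier-removal step can be illustrated with a minimal sketch. The text does not state which variable the IQR rule was applied to or which fence multiplier was used, so the sketch assumes, purely for illustration, that the rule is applied to the power density (W/m²) with the conventional multiplier of 1.5; the function names `iqr_mask` and `filter_modules` are hypothetical.

```python
import numpy as np

def iqr_mask(values, k=1.5):
    """Return a boolean mask that is True for values inside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)

def filter_modules(power_w, area_m2, k=1.5):
    """Drop modules whose power density (W/m^2) is an IQR outlier.

    Assumption: the filter acts on power density; the paper does not
    specify the filtered variable or the multiplier k.
    """
    power_w = np.asarray(power_w, dtype=float)
    area_m2 = np.asarray(area_m2, dtype=float)
    density = power_w / area_m2
    keep = iqr_mask(density, k)
    return power_w[keep], area_m2[keep]
```

Filtering on power density rather than on power and area separately keeps large but otherwise ordinary modules in the sample, which may or may not match the choice made for the 215-sample dataset.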

2.2. Study Framework

Our technical framework addresses the core challenges of assessing urban rooftop PV capacity from a single non-orthorectified aerial image. As illustrated in Figure 3, the framework primarily comprises four core modules: camera parameter calibration, 3D reconstruction with error control, precise PV area extraction, and lightweight capacity estimation. The detailed workflow is described as follows:

Input Data Preparation: A single non-orthorectified aerial image (e.g., publicly available satellite imagery from Google Earth or UAV oblique photography imagery) is input, along with a limited number of easily obtainable key building geometric parameters. These parameters include building footprint dimensions (L × W), height (H), and PV module installation tilt angle (θ) and orientation. They are typically acquired through simple on-site surveys or publicly available sources. The source of these priors is as follows: the footprint dimensions (L × W) were obtained by measuring the polygon outlines of buildings on publicly available web-mapping services (e.g., Baidu Maps, which provides a built-in distance measurement tool). The building height (H) was acquired from open urban 3D model databases. In the absence of specific installation data, the PV tilt angle (θ) can be set based on regional common practice or default values, and the orientation is typically assumed to be south-facing for sites in the Northern Hemisphere. These priors represent a pragmatic approach to data acquisition that leverages commonly accessible sources.

Camera Self-Calibration: Leveraging the ubiquitous orthogonal structural features inherent to buildings (such as mutually perpendicular walls and roof edges), groups of parallel lines corresponding to three orthogonal directions are selected within the input image. Their corresponding vanishing points (VPs) are computed. Utilizing the geometric orthogonality constraints between the vanishing points, combined with a known dimensional constraint of one building side (e.g., the building base length), the camera’s intrinsic matrix (K) and extrinsic matrices (rotation matrix R and translation vector T) are solved, thereby completing camera self-calibration (Section 2.3).

3D Reconstruction and Error Control: Using the calibrated camera parameters (K, R, T), preliminary 3D reconstruction is performed via the collinearity equations (Equation (1)). To mitigate calibration error accumulation and ensure the geometric accuracy of the reconstructed model, an error-closed-loop iterative strategy is introduced, incorporating multi-dimensional geometric constraints: The reconstructed model’s roof dimensions (L, W) and building height (H) are projected back onto the 2D image plane and aligned with the corresponding features in the input image. Simultaneously, the reconstructed key geometric parameters are compared against the input measured/known parameters (L × W, H), and the residual is calculated (Equation (18)). If the residual exceeds the preset threshold (ε = 5%), the vanishing point selection is revised or the dimensional constraints are optimized, initiating iterative optimization until the accuracy requirement is met (Section 2.4).

PV Array 3D Reconstruction and Area Extraction: Based on the optimized camera parameters and the reconstructed building roof plane, leveraging the known PV module tilt angle (θ) and orientation parameters (typically south-facing), the PV array model is precisely reconstructed in 3D space. Model parameters are finely adjusted to ensure its projected contours are precisely aligned with the edges of the PV modules in the input imagery (Section 2.5). Finally, the surface area (α) of the aligned PV array is extracted directly within the 3D modeling environment.

Lightweight Capacity Estimation: The accurately extracted PV surface area (α) is input into a pre-constructed lightweight linear capacity model (Equation (19)). This model, established based on statistical analysis of large-scale PV manufacturer data, utilizes surface area as the sole independent variable and directly outputs the estimated installed capacity (c) of the PV system (Section 2.6).

This workflow achieves inversion from 2D imagery to 3D information through monocular vision geometric constraints, effectively correcting projection distortion in non-orthorectified imagery. Combined with the error control strategy to ensure accuracy, it ultimately enables robust capacity estimation via a concise linear model. Consequently, an efficient and low-cost technical pathway is provided for city-scale PV resource surveys.

2.3. Vanishing-Point Constrained Camera Calibration

Three-dimensional spatial information is inferred from two-dimensional images, which constitutes the core objective of camera calibration [23,24,25,26,27]. During the monocular vision imaging process, points in 3D space are mapped onto a two-dimensional pixel plane via the collinearity condition equations. The mathematical expression is provided by:

s \begin{bmatrix} i \\ j \\ 1 \end{bmatrix} = K [R \mid T] \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}

(1)

In the equation, s represents the scale factor (dimensionless), K denotes the intrinsic matrix, R and T are the rotation matrix and translation vector (extrinsic parameters), respectively, (i, j) represents the pixel coordinates, and (x, y, z) denotes the world coordinates.
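As an illustration of Equation (1), the following NumPy sketch projects a world point to pixel coordinates. The function name `project` and the numerical values in the usage note are hypothetical and are not taken from the study.

```python
import numpy as np

def project(K, R, T, xyz):
    """Apply s * [i, j, 1]^T = K [R | T] [x, y, z, 1]^T.

    Returns the pixel coordinates (i, j) and the scale factor s.
    """
    xyz_h = np.append(np.asarray(xyz, dtype=float), 1.0)   # homogeneous world point
    RT = np.hstack([np.asarray(R, dtype=float),
                    np.reshape(np.asarray(T, dtype=float), (3, 1))])
    s_ij = np.asarray(K, dtype=float) @ RT @ xyz_h          # equals s * [i, j, 1]
    s = s_ij[2]
    return s_ij[:2] / s, s
```

For example, with a focal length of 1000 pixels, a principal point at (500, 400), an identity rotation, and T = (0, 0, 10), the world point (1, 2, 0) maps to pixel (600, 600) with s = 10.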

The problem involves describing and computing the transformations between the world, camera, and image plane coordinate systems. Define the world coordinate system as R_O (O, x, y, z), the camera coordinate system as R_C (C, i, j, k), and the image plane coordinate system as R_S (S, i, j). The world coordinate system R_O is transformed into the camera coordinate system R_C via the extrinsic matrix [R | T]. Subsequently, the camera coordinate system R_C is transformed into the image plane coordinate system R_S via the intrinsic matrix K.

In this study, the open-source tool fSpy (version 1.0.3; stuffmatic, Stockholm, Sweden) [28] is used for calibration. First, the orthogonal structural features of building facades are used to select three groups of parallel lines along mutually orthogonal directions in the monocular image, and the corresponding three vanishing points are calculated. Second, known dimensional constraints of the building (specifically, the actual length of one side) are combined to solve the camera’s intrinsic parameters (intrinsic matrix K) and extrinsic parameters (including rotation matrix R and translation vector T). A schematic diagram of the camera calibration process based on vanishing points is provided in Figure 4.

2.3.1. Calculating the Intrinsic Matrix

As shown in Figure 4, the projection center is denoted as C, and its projection onto the image plane is point P. The image contains three mutually orthogonal sets of parallel lines, whose corresponding vanishing points are v₁, v₂, and v₃. These vanishing points are represented in homogeneous coordinates as (v_x, v_y, 1). Corresponding to the three orthogonal directions (x/y/z axes), these vanishing points satisfy the following constraint in the camera coordinate system:

v_{i/R_C}^{T} \cdot v_{j/R_C} = 0 \quad (i \neq j \text{ and } i, j \in \{1, 2, 3\})

(2)

And:

v_{/ R_{C}} = K^{- 1} \cdot v_{/ R_{S}}

(3)

Therefore:

{v_{i / R_{S}}}^{T} K^{- T} K^{- 1} v_{j / R_{S}} = 0

(4)

where K is the intrinsic matrix:

K = [\begin{matrix} f & 0 & u_{0} \\ 0 & f & v_{0} \\ 0 & 0 & 1 \end{matrix}]

(5)

We derive the equation:

(v_{ix} - u_{0}) (v_{jx} - u_{0}) + (v_{iy} - v_{0}) (v_{jy} - v_{0}) + f^{2} = 0 \quad (i \neq j)

(6)

For the combination of three orthogonal vanishing points, we can establish three equations to solve for the parameters f, u₀, and v₀, ultimately determining the intrinsic matrix K.
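One way to solve the three instances of Equation (6) is sketched below. Subtracting the equations pairwise cancels f² and the quadratic terms in u₀ and v₀, which leaves two linear equations for the principal point; f then follows from any one of the original equations. This is an illustrative solution strategy, not necessarily the one implemented in the calibration tool, and the function name `intrinsics_from_vps` is hypothetical.

```python
import numpy as np

def intrinsics_from_vps(v1, v2, v3):
    """Solve Equation (6) for f, u0, v0 given three orthogonal vanishing points (pixels)."""
    v1, v2, v3 = (np.asarray(v, dtype=float) for v in (v1, v2, v3))
    # Eq.(6)[1,2] - Eq.(6)[1,3]:  (v1 - p) . (v2 - v3) = 0
    # Eq.(6)[1,2] - Eq.(6)[2,3]:  (v2 - p) . (v1 - v3) = 0,  with p = (u0, v0)
    A = np.array([v2 - v3, v1 - v3])
    b = np.array([v1 @ (v2 - v3), v2 @ (v1 - v3)])
    u0, v0 = np.linalg.solve(A, b)
    p = np.array([u0, v0])
    # Back-substitute into Equation (6) for the pair (1, 2) to obtain f^2.
    f_sq = -np.dot(v1 - p, v2 - p)
    if f_sq <= 0:
        raise ValueError("vanishing points are inconsistent with three orthogonal directions")
    f = np.sqrt(f_sq)
    return np.array([[f, 0.0, u0], [0.0, f, v0], [0.0, 0.0, 1.0]])
```

The linear system becomes ill-conditioned when a vanishing point lies very far from the image, for example when one building direction is nearly parallel to the image plane, which is one reason careful line selection matters in practice.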

2.3.2. Calculating the Rotation Matrix

Taking the vanishing point v₁ corresponding to the x-axis of the world coordinate system R_O as an example, it represents a point at infinity in the real world projected onto the image plane. This point at infinity is represented in homogeneous coordinates as X_{∞/R_O} = (1, 0, 0, 0)^T. Its collinearity equation is expressed as:

s v_{1/R_S} = K [R \mid T] X_{\infty/R_O} = K [\begin{matrix} r_{1} & r_{2} & r_{3} \end{matrix} \mid T] \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}

(7)

Therefore:

s v_{1 / R_{S}} = K r_{1}

(8)

Normalizing r_1 gives:

r_{1} = \frac{K^{-1} v_{1/R_S}}{\| K^{-1} v_{1/R_S} \|} = \frac{1}{\sqrt{(v_{1x} - u_{0})^{2} + (v_{1y} - v_{0})^{2} + f^{2}}} \begin{bmatrix} v_{1x} - u_{0} \\ v_{1y} - v_{0} \\ f \end{bmatrix}

(9)

Similarly, the rotation vectors r_2 and r_3 corresponding to the y-axis and z-axis, respectively, can be calculated using the vanishing points v₂ and v₃:

r_{2} = \frac{K^{-1} v_{2/R_S}}{\| K^{-1} v_{2/R_S} \|}, \qquad r_{3} = \frac{K^{-1} v_{3/R_S}}{\| K^{-1} v_{3/R_S} \|}

(10)

Ultimately, the rotation matrix is given by:

R = [\begin{matrix} r_{1} & r_{2} & r_{3} \end{matrix}]

(11)
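Equations (9) to (11) can be sketched in a few lines of NumPy. The function name `rotation_from_vps` is hypothetical. Note that normalizing K⁻¹v always yields a positive third component, so each column is recovered only up to sign; in practice the signs may need to be flipped so that the axes point in the intended directions and det(R) = +1.

```python
import numpy as np

def rotation_from_vps(K, v1, v2, v3):
    """Build R = [r1 r2 r3] from three orthogonal vanishing points (pixel coordinates)."""
    K_inv = np.linalg.inv(np.asarray(K, dtype=float))
    cols = []
    for v in (v1, v2, v3):
        d = K_inv @ np.array([v[0], v[1], 1.0])   # back-projected direction, Equation (3)
        cols.append(d / np.linalg.norm(d))         # unit vector, Equations (9) and (10)
    return np.column_stack(cols)                   # Equation (11), columns up to sign
```

With noise-free vanishing points the columns are exactly orthonormal; with hand-picked lines they are only approximately so, and a check of how far RᵀR deviates from the identity gives a quick indication of calibration quality.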

2.3.3. Calculating the Translation Vector

As shown in Figure 5, let point A′ denote the perspective projection of point A, and vector \vec{A'P'} denote the perspective projection of vector \vec{AP}. Vector \vec{AP} is parallel to the x-axis of the world coordinate system R_O and originates from its origin O. To determine the translation vector T, we assume the length of \vec{AP} is known; otherwise, the translation vector can only be determined up to an unknown scale factor.

We can write:

\vec{AP}_{/R_O} = l \cdot u; \qquad \vec{AP}_{/R_C} = R \cdot \vec{AP}_{/R_O}

(12)

Let P″ denote the intersection point of line (OP) and line D passing through A′ with direction vector \vec{AP}. Then:

P''_{/R_C} = (OP)_{/R_C} \cap D = (OP')_{/R_C} \cap D

(13)

Since triangles OA′P″ and OAP are similar, we obtain:

\frac{\| \vec{A'P''}_{/R_C} \|}{\| \vec{AP}_{/R_C} \|} = \frac{\| \vec{OA'}_{/R_C} \|}{\| \vec{OA}_{/R_C} \|}

(14)

or

\| \vec{OA}_{/R_C} \| = \frac{\| \vec{OA'}_{/R_C} \| \cdot \| \vec{AP}_{/R_C} \|}{\| \vec{A'P''}_{/R_C} \|}

(15)

Therefore:

\vec{OA}_{/R_C} = \| \vec{OA}_{/R_C} \| \cdot \frac{\vec{OA'}_{/R_C}}{\| \vec{OA'}_{/R_C} \|}

(16)

The translation vector is given by:

T = R^{-1} \cdot \vec{OA}_{/R_C}

(17)

Similarly, if the length of a line segment parallel to any other coordinate axis (y or z) of R_O is known, the translation vector T can be solved using the same principle.
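The scale recovery behind Equations (12) to (16) can also be expressed as a small ray-intersection problem, sketched below. This is an equivalent formulation chosen for brevity rather than a literal transcription of the similar-triangles construction: the viewing rays through A′ and P′ are scaled so that their endpoints differ by the known vector \vec{AP}. The function name `origin_in_camera_frame` is hypothetical, and the sketch returns \vec{OA} in the camera frame, i.e., the quantity that Equation (17) converts into T.

```python
import numpy as np

def origin_in_camera_frame(K, R, a_px, p_px, length):
    """Camera-frame vector from the projection centre to A, given a segment AP of known length.

    a_px, p_px: pixel coordinates of A' and P' (projections of A and P).
    length:     known real-world length l of AP, which is parallel to the world x-axis.
    """
    K_inv = np.linalg.inv(np.asarray(K, dtype=float))
    d_a = K_inv @ np.array([a_px[0], a_px[1], 1.0])   # viewing ray through A'
    d_p = K_inv @ np.array([p_px[0], p_px[1], 1.0])   # viewing ray through P'
    ap_cam = length * np.asarray(R, dtype=float)[:, 0]  # AP in the camera frame, Equation (12)
    # Find depths so that lam_p * d_p - lam_a * d_a = AP (least squares over 3 equations).
    M = np.column_stack([d_p, -d_a])
    (lam_p, lam_a), *_ = np.linalg.lstsq(M, ap_cam, rcond=None)
    return lam_a * d_a
```

If the first column of R has the wrong sign, the recovered depths come out negative, which offers a simple consistency check on the axis directions chosen during calibration.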

2.4. Error Control Strategy

The initial parameters obtained via the vanishing point calibration method (intrinsic matrix K and extrinsic matrix [R | T]) enable the conversion from the 2D image plane coordinate system to the 3D world coordinate system through the collinearity equations. Because point selection accuracy is critical for vanishing-point-based calibration, and because errors can accumulate progressively during parameter estimation and ultimately degrade the geometric precision of the reconstructed 3D model [9], this study proposes an error control strategy based on multi-dimensional geometric constraints. The core idea is to use redundant, publicly available geometric information about urban buildings, such as footprint dimensions and height data, to compute residuals and iteratively optimize the initially calibrated parameters. This approach suppresses the propagation of single-point errors and ensures the overall scale accuracy of the reconstructed model [20]. The specific steps are outlined as follows.

Based on the initial calibration parameters, the building’s 3D model was reconstructed from the monocular image using the open-source 3D modeling software Blender (version 2.83.19, Blender Foundation, Amsterdam, The Netherlands) [29]. By interactively adjusting the parameters of the reconstructed model—specifically, the rooftop dimensions (length L and width W) and building height (H)—the projected contours of the model in the two-dimensional image plane (including rooftop edge lines and building footprint outlines) were precisely aligned with the corresponding features in the input image. This step effectively leveraged the identifiable geometric outlines of buildings in the image as strong constraints, with potential errors in camera calibration parameters being mapped and corrected into the observable and adjustable parameter space of the 3D model [25].

The key geometric parameters of the reconstructed model are compared with publicly available measured data. The outline size residual Δ(L × W) and height residual ΔH are calculated:

\Delta (L \times W) = \frac{|L_{pre} \cdot W_{pre} - L_{real} \cdot W_{real}|}{L_{real} \cdot W_{real}}, \qquad \Delta H = \frac{|H_{pre} - H_{real}|}{H_{real}}

(18)

The residual threshold is set as ε = 5%, taking into account typical measurement errors in building dimensions, acceptable accuracy levels for engineering applications, and the efficiency of model optimization. The current set of camera calibration parameters is deemed valid only when both the residual error in the building’s footprint dimensions Δ(L × W) and the height residual ΔH are strictly below this threshold. If either error exceeds the threshold, the process returns to the camera calibration stage, where vanishing point selection is adjusted or geometric constraints on building dimensions are refined.
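The acceptance test defined by Equation (18) and the threshold ε can be written as a short sketch. The function names and the building dimensions in the usage note are hypothetical.

```python
def residuals(L_pre, W_pre, H_pre, L_real, W_real, H_real):
    """Relative footprint-area and height residuals, as in Equation (18)."""
    d_lw = abs(L_pre * W_pre - L_real * W_real) / (L_real * W_real)
    d_h = abs(H_pre - H_real) / H_real
    return d_lw, d_h

def calibration_is_valid(d_lw, d_h, eps=0.05):
    """Accept the calibration only if both residuals are strictly below eps (5% in the text)."""
    return d_lw < eps and d_h < eps
```

For a reconstructed model of 51 m × 20 m × 15.3 m compared with reference values of 50 m × 20 m × 15 m, both residuals are 2% and the calibration would be accepted; a reconstructed length of 55 m would give a 10% footprint residual and send the process back to the calibration stage.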

The residual threshold of ε = 5% was selected based on a trade-off between reconstruction accuracy and operational feasibility. This value was chosen to be stricter than typical uncertainties in the readily available prior data to ensure it meaningfully improved the model, while being loose enough to be achievable without excessive manual iteration for the majority of buildings in our dataset.

A formal quantitative sensitivity analysis of this threshold (e.g., testing ε = 3% or 8%) would require fully automating the vanishing point selection and model adjustment steps to avoid introducing human variability. While such an ablation study is a valuable direction for future work with an automated pipeline, the chosen value of 5% proved to be robust and effective in practice, as evidenced by the low final MAPE achieved (3.47%).

2.5. PV Array 3D Reconstruction and Area Extraction

Based on the validated or optimized camera calibration parameters, the 3D reconstruction process incorporates key field-surveyed information, including the tilt angle θ and orientation of the photovoltaic (PV) modules, which are typically tilted southward [30,31]. These empirically obtained parameters are treated as known constraints and are directly input into the open-source 3D modeling software Blender, which is used to reconstruct the rooftop PV arrays from monocular images. Specifically, the tilt angle θ defines the angle between the PV module surface and the horizontal rooftop plane, while the orientation parameter (commonly south-facing) determines the azimuth of the PV modules on the roof plane, ensuring that the reconstructed PV array accurately reflects the spatial posture of the real-world installation.

Using the reconstructed rooftop plane as the spatial reference baseline, the lowest edge of the photovoltaic (PV) array model—typically the edge adjacent to the rooftop mounting base—is strictly constrained to align with this reference plane, thereby simulating the actual installation condition. Subsequently, the edge length parameters of the PV panels are finely adjusted to ensure that their projected boundaries in the two-dimensional image coordinate system precisely match the visually identifiable edges of the PV modules in the original image [23]. This sequence of spatial geometric constraints constitutes a critical step in mitigating scale distortions of PV components caused by perspective projection and oblique viewing angles in monocular imagery.

The surface area of the aligned PV array is then extracted in the 3D modeling software. Figure 6 illustrates the geometric alignment process and the reconstructed building and PV array models. By applying these spatial geometric constraints, the method mitigates scale distortions caused by non-orthogonal projection in monocular imagery and provides a high-precision surface area input for capacity prediction.
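To illustrate why the area must be taken from the tilted 3D surface rather than from a plan-view footprint, the sketch below computes the area of a planar polygon from its 3D vertices. It is an independent illustration, not the Blender workflow used in the study; the panel dimensions and the 25° tilt are hypothetical.

```python
import numpy as np


def planar_polygon_area(vertices):
    """Area of a planar polygon from its ordered 3D vertices (vector form of the shoelace formula)."""
    v = np.asarray(vertices, dtype=float)
    cross_sum = np.cross(v, np.roll(v, -1, axis=0)).sum(axis=0)
    return 0.5 * np.linalg.norm(cross_sum)


# Hypothetical array: 10 m wide (x axis), 4 m along the slope, tilted 25 degrees
theta = np.radians(25.0)
w, d = 10.0, 4.0
panel = [
    (0.0, 0.0, 0.0),
    (w, 0.0, 0.0),
    (w, d * np.cos(theta), d * np.sin(theta)),
    (0.0, d * np.cos(theta), d * np.sin(theta)),
]
footprint = [(x, y, 0.0) for x, y, _ in panel]  # vertical projection onto the roof plane

true_area = planar_polygon_area(panel)            # surface area of the tilted array
projected_area = planar_polygon_area(footprint)   # area seen in a plan view
print(true_area, projected_area)
```

For a rectangular array the plan-view area is smaller than the surface area by a factor of cos θ, which is one reason the tilt angle is entered as a known constraint.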

2.6. Capacity Fitting Model

A Pearson correlation analysis was performed on the manufacturer capacity–area dataset, and a remarkably high correlation coefficient of 0.9599 (p = 2.1590 × 10⁻¹¹⁷) between module area and power generation capacity was revealed. This provides strong statistical support for constructing a linear regression model and validates the feasibility of utilizing the “power capacity per unit area” parameter [5,6].

Based on this analysis, a lightweight linear prediction model is established:

c = γ α + γ_{0}    (19)

where c represents the power capacity of the PV module (W), γ denotes the power capacity per unit area parameter (W/m²), α is the surface area of the PV module (m²), and γ_{0} is the model intercept (W). Although theoretically the intercept approaches zero, retaining this parameter enhances model flexibility [5].
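The correlation check and the linear fit of Equation (19) can be sketched with NumPy as follows. The six area and capacity pairs are hypothetical placeholders, since the manufacturer dataset is not public; only the model form is taken from the text.

```python
import numpy as np

# Hypothetical module areas (m^2) and rated capacities (W); not the paper's dataset
area = np.array([1.63, 1.94, 2.17, 2.58, 2.81, 3.11])
capacity = np.array([350.0, 430.0, 480.0, 580.0, 630.0, 700.0])

r = np.corrcoef(area, capacity)[0, 1]           # Pearson correlation coefficient
gamma, gamma0 = np.polyfit(area, capacity, 1)   # slope (W/m^2) and intercept (W)

predicted = gamma * area + gamma0
print(f"r = {r:.4f}, gamma = {gamma:.2f} W/m^2, gamma0 = {gamma0:.2f} W")
```

`np.polyfit` with degree 1 performs an ordinary least-squares fit, so the residuals of the fitted line sum to zero.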

3. Results and Discussion

3.1. Impact of Facade Distortion on PV Area Prediction

To quantitatively assess how building facade distortion under non-orthorectified projection contributes to errors in rooftop photovoltaic (PV) area prediction, two quantities were extracted for the north, south, and east facades of Teaching Building No. 3 at Hankou University: the facade area ratio (i.e., the ratio of projected facade area to its actual area) and the relative error of PV area prediction (i.e., the relative difference between the PV area estimated using a purely 2D projection method and the actual PV area). The correlation between these two variables was then calculated to evaluate their relationship.

The Pearson correlation coefficients calculated for all samples were 0.870 for the north facade, 0.909 for the south facade, and 0.843 for the east facade, indicating that as the degree of facade projection distortion increases, the systematic error in photovoltaic area estimation using 2D methods also significantly increases. This strong correlation suggests that geometric distortions caused by the tilt or height differences of various building facades are a major source of error in conventional 2D extraction methods.

Taking the south facade as an example, an ordinary least squares (OLS) linear regression was performed on the data presented in Figure 7.

∆_{PV} = 0.5126 ∆_{facade} + 0.0063    (20)

The analysis yielded a highly significant relationship (p < 0.001 for β) with the following parameters: slope β = 0.5126 (95% CI: [0.3572, 0.6297]; SE = 0.0591), and intercept α = 0.0063. The coefficient of determination R² was 0.8971.

A Breusch–Pagan test was conducted to assess heteroskedasticity (LM statistic = 1.2438, p = 0.2647), indicating that the null hypothesis of homoscedasticity cannot be rejected at the 5% significance level. This supports the use of standard OLS assumptions for this relationship.
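The reported p-value can be cross-checked from the LM statistic alone. The sketch below assumes the test has one degree of freedom (a single regressor, the facade area ratio), which the text does not state explicitly; under that assumption the chi-square survival function reduces to a complementary error function.

```python
import math

lm_statistic = 1.2438  # Breusch-Pagan LM statistic reported in the text


def chi2_sf_df1(x):
    """Survival function of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_sf_df1(lm_statistic)
print(f"p = {p_value:.4f}")
```

Under this assumption the computed value agrees with the reported p ≈ 0.2647 to within rounding, so the null hypothesis of homoscedasticity is not rejected at the 5% level.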

This result indicates that for every 0.1 increase in the facade area ratio, the relative error of the 2D estimation method increases by approximately 5.1 percentage points on average. This further confirms that building tilt and height differences cause stretching or compression of rooftop areas in non-orthorectified imagery. Such geometric effects make it difficult for 2D methods that use aerial images directly, while ignoring non-orthorectification errors, to recover the true dimensions of rooftop planes, especially where scale information perpendicular to the imaging line of sight is severely distorted. In contrast, the 3D reconstruction approach proposed in this study mitigates the interference of facade projection distortion on rooftop dimension estimation by restoring the three-dimensional spatial geometry.

3.2. PV Area Recognition Accuracy and Uncertainty Analysis

To quantitatively evaluate the effectiveness of the 3D reconstruction method, the Mean Absolute Percentage Error (MAPE) is employed as the core metric to estimate the uncertainty in area prediction. The performance differences between the proposed 3D reconstruction method and a conventional 2D method (which directly utilizes aerial imagery while ignoring non-orthographic errors) are compared when applied to the PV module image dataset.

To establish a baseline for comparison, the performance of the proposed 3D reconstruction method was evaluated against a conventional 2D method. This 2D baseline involved the manual polygonal delineation of PV array boundaries directly on the original, non-orthorectified input imagery (i.e., without applying any geometric correction). Crucially, to enable a direct and fair comparison, the scale for this 2D method was derived in an analogous way to the 3D method: it was calibrated using the same known dimensions of a building segment (e.g., the length of a roof edge) that served as the control for the 3D reconstruction. The area in pixels was converted to physical units (m²) by first establishing a pixel-to-meter ratio from this known reference length. This approach represents a common yet geometrically naive practice that directly utilizes imagery while ignoring perspective distortion, thereby highlighting the specific advantage of our 3D framework in correcting these geometric errors.
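The 2D baseline described above amounts to a polygon area in pixels multiplied by the square of a single pixel-to-meter ratio. A minimal sketch follows; the outline coordinates and the reference edge (50 m spanning 400 px) are hypothetical.

```python
def shoelace_area(points):
    """Area of a simple 2D polygon from ordered (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def pixel_area_to_m2(area_px, ref_length_px, ref_length_m):
    """Convert a pixel area to m^2 using one reference length (single global scale)."""
    metres_per_pixel = ref_length_m / ref_length_px
    return area_px * metres_per_pixel ** 2


# Hypothetical delineation: PV outline in pixel coordinates
pv_outline_px = [(100, 100), (260, 100), (260, 180), (100, 180)]
area_px = shoelace_area(pv_outline_px)  # 160 px by 80 px rectangle
area_m2 = pixel_area_to_m2(area_px, ref_length_px=400.0, ref_length_m=50.0)
print(area_px, area_m2)
```

Because one scale factor is applied to the whole image, any perspective foreshortening of the roof plane passes straight into the area estimate, which is the weakness the 3D method is meant to address.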

As presented in Figure 8, the mean MAPE of the 2D method was 10.58%, indicating its higher uncertainty, while the proposed 3D method with the 5% error control strategy achieved a substantially lower mean MAPE of 3.47%. The error frequency distribution in Figure 8 further shows that the 2D method produced a markedly higher proportion of high-error samples, whereas the results of the proposed method were concentrated in the low-error range. These findings indicate that 3D reconstruction substantially enhances the stability and accuracy of area identification by correcting geometric distortions induced by non-orthogonal projection. Furthermore, the error reduction is primarily attributed to the error control strategy integrated into the 3D reconstruction process. This strategy, leveraging vanishing point constraints and multi-dimensional geometric constraints, effectively suppressed scale biases caused by non-orthorectified projection in monocular imagery.

This significant reduction in estimation uncertainty—from 10.58% to 3.47%—strongly demonstrates that the major geometric distortions caused by non-orthorectified projection, particularly the rooftop plane’s perspective foreshortening and scaling biases induced by building facade tilt and height variations, are successfully corrected by the 3D reconstruction process. The high consistency between the 3D method’s predicted areas and the actual areas is visually illustrated in Figure 9, with data points closely clustered around the reference line y = x. In contrast, obvious outliers are exhibited by the 2D method’s points.
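For reference, the MAPE metric used in this comparison can be computed as in the sketch below, using the standard definition. The area values are hypothetical and are chosen only to make the arithmetic easy to follow; they are not the paper's measurements.

```python
import numpy as np


def mape(predicted, actual):
    """Mean absolute percentage error, in percent."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / actual) * 100.0)


# Hypothetical areas in m^2 (not the paper's measurements)
actual = [100.0, 200.0, 50.0, 80.0]
pred_2d = [112.0, 180.0, 55.0, 88.0]   # 12%, 10%, 10%, 10% errors
pred_3d = [103.0, 194.0, 51.0, 78.0]   # 3%, 3%, 2%, 2.5% errors

mape_2d = mape(pred_2d, actual)
mape_3d = mape(pred_3d, actual)
print(mape_2d, mape_3d)
```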

3.3. Capacity Model Performance

A total of 215 data points comprising photovoltaic (PV) module areas and their corresponding power capacities were collected in this study. To mitigate the influence of outliers on the regression model, a multi-stage outlier removal procedure was designed: first, univariate extreme values in area and capacity were removed using the interquartile range (IQR) rule; second, multivariate extreme samples were further filtered by using the Z-score method; third, samples with residuals exceeding ±2 standard deviations from an initial linear regression model were excluded; finally, observations with excessive influence were identified and removed based on Cook’s Distance from ordinary least squares (OLS) regression, with a threshold of 4/n being used.

After this procedure, 202 samples were retained, corresponding to a removal rate of 6.0%. Subsequent examination of the excluded samples revealed that all corresponded to atypical modules with power-to-area ratios deviating significantly from mainstream products, including colored photovoltaic glass, photovoltaic tiles, and obsolete models. Therefore, these were considered non-representative extreme values and were justifiably excluded.
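The four-stage cleaning procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 1.5 × IQR fence, the |z| < 3 cutoff, and the per-variable form of the Z-score filter are assumed, since the text does not give them, and the data are synthetic. Only the ±2 standard deviation residual rule and the Cook's distance threshold of 4/n come from the text.

```python
import numpy as np


def iqr_mask(x, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR] (k = 1.5 is an assumed fence)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x >= q1 - k * iqr) & (x <= q3 + k * iqr)


def ols_diagnostics(x, y):
    """Residuals and Cook's distance for a simple linear fit y = b1*x + b0."""
    n, p = len(x), 2
    b1, b0 = np.polyfit(x, y, 1)
    resid = y - (b1 * x + b0)
    leverage = 1.0 / n + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)
    s2 = np.sum(resid ** 2) / (n - p)
    cooks = resid ** 2 / (p * s2) * leverage / (1.0 - leverage) ** 2
    return resid, cooks


def clean(area, cap, z_max=3.0):
    # Stage 1: IQR rule on each variable
    keep = iqr_mask(area) & iqr_mask(cap)
    area, cap = area[keep], cap[keep]
    # Stage 2: Z-score filter on both variables (z_max is an assumed cutoff)
    z = np.abs(np.column_stack([(area - area.mean()) / area.std(),
                                (cap - cap.mean()) / cap.std()]))
    keep = (z < z_max).all(axis=1)
    area, cap = area[keep], cap[keep]
    # Stage 3: drop residuals beyond +/-2 standard deviations of an initial fit
    resid, _ = ols_diagnostics(area, cap)
    keep = np.abs(resid) <= 2.0 * resid.std()
    area, cap = area[keep], cap[keep]
    # Stage 4: drop points with Cook's distance above 4/n
    _, cooks = ols_diagnostics(area, cap)
    keep = cooks <= 4.0 / len(area)
    return area[keep], cap[keep]


rng = np.random.default_rng(0)
area = rng.uniform(1.6, 3.1, 200)                 # synthetic module areas (m^2)
cap = 225.0 * area + rng.normal(0.0, 20.0, 200)   # synthetic capacities (W)
cap[:5] *= 0.4                                    # inject low power-density outliers
area_clean, cap_clean = clean(area, cap)
print(len(area), len(area_clean))
```

Each stage is applied to the output of the previous one, so the residual and influence diagnostics are computed on data that have already had gross extremes removed.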

Figure 10 presents a comparison of the dataset before and after outlier removal, demonstrating that data cleaning substantially improved the model’s stability and predictive accuracy.

Based on the cleaned dataset, a linear regression model was constructed relating photovoltaic (PV) module surface area (α) to power generation capacity (c):

c = γ α + γ_{0}

where γ represents the capacity per unit area parameter (W/m²), and γ_{0} denotes the model intercept (W). Although the intercept is theoretically expected to approach zero, retaining this parameter is found to enhance the model’s flexibility in practical applications.

As shown in Figure 11, the model fits the data closely. A coefficient of determination (R²) of 0.9548 was achieved, with a mean squared error (MSE) of 426.79 W² and a mean absolute error (MAE) of 18.86 W, indicating strong predictive capability. The specific regression equation derived is:

c = 226.95 α - 8.7663    (21)

Although the model intercept (γ₀ = −8.7663 W) is non-zero, it is statistically insignificant (p-value = 0.2316 > 0.05; 95% CI: [−28.5453, 6.9494]). This suggests that forcing the regression through the origin (i.e., setting γ₀ = 0) would not materially change the model’s predictions or conclusions, while retaining the intercept enhances model flexibility.
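As a worked example of Equation (21), the sketch below evaluates the fitted model for a hypothetical 2.0 m² module, with and without the intercept. Dropping the intercept while keeping the same slope is a simplification made here for illustration; a true through-origin fit would re-estimate γ.

```python
GAMMA = 226.95     # W/m^2, slope of Equation (21)
GAMMA_0 = -8.7663  # W, intercept of Equation (21)


def module_capacity_w(area_m2, gamma=GAMMA, gamma0=GAMMA_0):
    """Predicted module capacity (W) from its surface area (m^2)."""
    return gamma * area_m2 + gamma0


area = 2.0  # hypothetical module area in m^2
with_intercept = module_capacity_w(area)
through_origin = module_capacity_w(area, gamma0=0.0)
print(with_intercept, through_origin)
```

For this area the two predictions differ by the intercept alone, about 8.8 W, or roughly 2% of the predicted capacity.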

To validate the model’s generalizability, five-fold cross-validation was further adopted. Table 1 shows an average cross-validated R² of 0.9503 ± 0.0108, indicating robust generalization to unseen data.
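A five-fold cross-validation of a linear capacity-area fit can be sketched as below. The data are synthetic stand-ins (the slope of 227 W/m² and the 20 W noise level are assumptions chosen to resemble the reported fit), and the random seed and splitting scheme are illustrative rather than those used in the study.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def k_fold_r2(x, y, k=5, seed=0):
    """R^2 on each held-out fold for a linear fit trained on the other folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        slope, intercept = np.polyfit(x[train_idx], y[train_idx], 1)
        scores.append(r2_score(y[test_idx], slope * x[test_idx] + intercept))
    return np.array(scores)


# Synthetic stand-in data; the manufacturer dataset is not public
rng = np.random.default_rng(1)
area = rng.uniform(1.6, 3.1, 202)
capacity = 227.0 * area + rng.normal(0.0, 20.0, 202)

scores = k_fold_r2(area, capacity)
print(f"R^2 = {scores.mean():.4f} +/- {scores.std():.4f}")
```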

It is important to note that this linear capacity-area model is derived from and is therefore applicable to the types of standard photovoltaic modules represented in the technical specifications of the 19 mainstream manufacturers included in our dataset. Its application to atypical module designs (e.g., colored PV glass, solar tiles, or flexible thin-film modules with significantly different power-to-area ratios) may require additional validation or model adjustment.

An important consideration for the long-term application of the proposed capacity-area model is its sensitivity to the continuous evolution of photovoltaic technology. The linear relationship (Equation (19)) and its parameter γ (W/m²) are empirically derived from the prevailing market technologies at the time of the study. The emergence of new, high-efficiency module designs (e.g., perovskite-silicon tandems, heterojunction cells) with significantly higher power densities could indeed alter this relationship, potentially reducing the model’s accuracy if not updated.

However, the framework itself is designed to be adaptable, not static. The simplicity of the linear model is a key advantage here; it can be rapidly recalibrated with a new set of manufacturer datasheets. Future work could automate this process by periodically scraping the latest technical specifications from manufacturer websites. Thus, while the specific numerical value of γ may change over time, the understanding that a strong, linear capacity-area relationship exists and the methodology for extracting the area α remain the core, enduring contributions of this work. The proposed method provides a powerful tool for generating snapshots of PV capacity at a city scale, with the understanding that its predictive core (the linear coefficient) requires periodic updates to reflect technological progress, much like any other data-driven model in a rapidly advancing field.

3.4. Analysis of Measurement Uncertainty

In accordance with the Guide to the Expression of Uncertainty in Measurement (GUM), an analysis of the measurement uncertainty was conducted to evaluate the reliability of the proposed method. The overall measurement process is complex and non-linear, involving the 3D reconstruction of the PV array and subsequent area extraction. The key quantity of interest is the surface area α (in m²) of the photovoltaic array.

Measurement Model:

The measured area α is a function of multiple input quantities: the input image data (I), the camera parameters (K, R, T) estimated through calibration, the prior geometric constraints (M_prior, e.g., building height H, footprint dimensions L × W), and the manual intervention points (M_man, e.g., vanishing point selection, PV boundary delineation).

Identification of Major Uncertainty Sources:

The main contributors to the combined standard uncertainty of the area measurement u_c(α) include:

(1): Uncertainty in camera calibration (u_cal): Arises from the selection of vanishing points and linear features in the image. This component also encompasses residual optical distortions not fully corrected by the calibration model.
(2): Uncertainty in prior dimensions (u_prior): Associated with the accuracy of the easily obtainable building geometric parameters (e.g., H, L, W). This is a Type B uncertainty estimate.
(3): Uncertainty in manual delineation (u_man): Related to the manual selection of PV array boundaries in the image for both the 2D baseline and the final alignment in the 3D model.
(4): Uncertainty from the linear capacity model (u_model): Although not directly affecting the area measurement α, this contributes to the final capacity uncertainty. It is quantified by the standard error of the regression fit to the manufacturer’s data.

Combined Uncertainty Estimate:

Due to the non-linearity of the overall measurement function, a Monte Carlo simulation would be required for a rigorous propagation of uncertainties. However, for the practical purpose of this study, the validation process against high-accuracy field measurements provides an empirical estimate of the total uncertainty. The Mean Absolute Percentage Uncertainty (MAPU)—previously reported as MAPE—of 3.47% for the 3D method is therefore put forward as a pragmatic estimate of the relative combined standard uncertainty u_c,rel(α) for the PV area measurement:

u_c,rel(α) ≈ MAPU = 3.47%

This value encapsulates the net effect of all the uncertainty sources listed above and provides a direct measure of the precision achievable with the proposed method under the conditions of this experiment.
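As a rough illustration of how the area uncertainty and the model uncertainty might combine into a capacity uncertainty, the sketch below applies first-order propagation to Equation (21). Several assumptions are made here that are not in the text: the two contributions are treated as independent, the uncertainty of γ itself is ignored, the RMSE derived from the reported MSE (426.79 W²) stands in for u_model, and the 3.47% relative area uncertainty is applied to a single hypothetical 2.0 m² module.

```python
import math

GAMMA = 226.95               # W/m^2, slope of Equation (21)
GAMMA_0 = -8.7663            # W, intercept of Equation (21)
U_REL_AREA = 0.0347          # relative standard uncertainty of area (MAPU, 3.47%)
U_MODEL = math.sqrt(426.79)  # W, RMSE from the reported MSE, assumed proxy for u_model


def capacity_with_uncertainty(area_m2):
    """Capacity (W) and combined standard uncertainty (W), first-order propagation."""
    capacity = GAMMA * area_m2 + GAMMA_0
    u_from_area = GAMMA * U_REL_AREA * area_m2     # |dc/d(alpha)| * u(alpha)
    u_combined = math.hypot(u_from_area, U_MODEL)  # independent terms add in quadrature
    return capacity, u_combined


capacity, u_c = capacity_with_uncertainty(2.0)  # hypothetical 2.0 m^2 module
print(f"c = {capacity:.1f} W, u_c = {u_c:.1f} W")
```

A Monte Carlo propagation, as noted above, would be the more rigorous route; this linearized version only indicates the order of magnitude of each contribution.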

4. Conclusions

The core challenge of geometric distortion and systematic errors caused by prevalent non-orthorectified projection when publicly available aerial imagery is utilized for urban rooftop photovoltaic (PV) capacity assessment is addressed in this study. A dual-optimization framework integrating monocular vision-based 3D reconstruction with a lightweight linear capacity model is proposed and validated. The core contributions and advantages of this framework are summarized as follows:

High-Precision, Low-Cost 3D Reconstruction: By leveraging orthogonal structural priors of urban buildings, camera self-calibration and building 3D reconstruction are achieved through vanishing point geometric constraints from a single non-orthorectified image. Combined with a multi-dimensional error control strategy that uses redundant measured geometric data (footprint dimensions, height) at a 5% residual threshold, this approach effectively corrects geometric distortions. It reduces the mean absolute percentage error (MAPE), our measure of measurement uncertainty, to 3.47%, compared with 10.58% for the 2D method. Requiring only easily accessible public imagery and minimal building geometric priors, the method substantially lowers the cost barrier for high-precision 3D data acquisition.

Efficient, Robust Capacity Prediction: Based on statistical analysis of large-scale manufacturer data, a lightweight linear capacity prediction model using PV surface area as the sole independent variable is constructed. While high predictive accuracy (R² = 0.9548) is ensured, complex physical modeling is completely circumvented. This model exhibits high computational efficiency and facilitates straightforward engineering deployment. Its exceptional generalization capability and robustness are confirmed by rigorous five-fold cross-validation (mean R² = 0.9503 ± 0.0108).

In summary, our findings indicate that this framework may provide a viable technical solution for city-scale rooftop PV resource surveys, potentially integrating high precision, low cost, and strong engineering applicability. It appears to be particularly well-suited for efficiently utilizing massive public aerial/satellite imagery (e.g., Google Earth) to conduct rapid, large-scale resource potential assessments, thereby furnishing reliable data support for urban energy planning and the achievement of “dual carbon” goals.

Future work will focus on: (1) developing more robust automatic vanishing point detection algorithms to reduce manual intervention; and (2) automating the multi-dimensional geometric constraint error control process (e.g., introducing optimization algorithms for automatic parameter iteration) to further enhance efficiency and accuracy.

Author Contributions

Formal analysis, Y.H., H.D., C.L., H.M., and F.W.; Investigation, Y.L., Y.Z., H.D., and M.W.; Methodology, Y.H., Y.Z., and C.L.; Resources, Y.L., Y.Z., and F.W.; Software, Y.H., Y.L., H.M., F.W., and M.W.; Validation, Y.H., H.D., and H.M.; Writing—original draft, Y.H. and C.L.; Writing—review and editing, M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study is funded by China Yangtze Power Co., Ltd. under the contract Z342302008.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy concerns.

Conflicts of Interest

Authors Yue Hu, Yuce Liu and Meng Wang were affiliated to the company China Three Gorges Corporation. Authors Yu Zhang and Hongwei Dong were affiliated to the company China Yangtze Power Co., Ltd. and the company Three Gorges Electric Energy Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. National Energy Administration. 2024 Photovoltaic Power Generation Construction Status. Available online: https://www.nea.gov.cn/20250221/f04452701c914d51a89d0c0ea6f4acd1/c.html (accessed on 24 July 2025).
2. Liu, W.; Mao, H.; Tian, Z.; Luo, Y.; Ma, L.; Chen, X.; Hu, M.; Li, J.; Fan, J. Challenges of Large-Scale Facade PV Systems in Dense Urban Environments in China. Innov. Energy 2025, 2, 100091.
3. Li, Y.; Ding, D.; Liu, C.; Wang, C. A Pixel-Based Approach to Estimation of Solar Energy Potential on Building Roofs. Energy Build. 2016, 129, 563–573.
4. Hu, M.; Liu, Z.; Huang, Y.; Wei, M.; Yuan, B. Estimation of Rooftop Solar Photovoltaic Potential Based on High-Resolution Images and Digital Surface Models. Buildings 2023, 13, 2686.
5. Hu, W.; Bradbury, K.; Malof, J.M.; Li, B.; Huang, B.; Streltsov, A.; Fujita, K.S.; Hoen, B. Mapping Solar Array Location, Size, and Capacity Using Deep Learning and Overhead Imagery. arXiv 2019.
6. So, B.; Nezin, C.; Kaimal, V.; Keene, S.; Collins, L.; Bradbury, K.; Malof, J.M. Estimating the Electricity Generation Capacity of Solar Photovoltaic Arrays Using Only Color Aerial Imagery. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017.
7. Ranjgar, B.; Niccolai, A. Large-Scale Rooftop Solar Photovoltaic Power Production Potential Assessment: A Case Study for Tehran Metropolitan Area, Iran. Energies 2023, 16, 7111.
8. Martín-Jiménez, J.; Del Pozo, S.; Sánchez-Aparicio, M.; Lagüela, S. Multi-Scale Roof Characterization from LiDAR Data and Aerial Orthoimagery: Automatic Computation of Building Photovoltaic Capacity. Autom. Constr. 2020, 109, 102965.
9. Malof, J.M.; Rui, H.; Collins, L.M.; Bradbury, K.; Newell, R. Automatic Solar Photovoltaic Panel Detection in Satellite Imagery. In Proceedings of the 2015 International Conference on Renewable Energy Research and Applications (ICRERA), Palermo, Italy, 22–25 November 2015.
10. Malof, J.M.; Bradbury, K.; Collins, L.M.; Newell, R.G. Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial Imagery. Appl. Energy 2016, 183, 229–240.
11. Mayer, K.; Wang, Z.; Arlt, M.-L.; Neumann, D.; Rajagopal, R. DeepSolar for Germany: A Deep Learning Framework for PV System Mapping from Aerial Imagery. In Proceedings of the 2020 International Conference on Smart Energy Systems and Technologies (SEST), Istanbul, Turkey, 7–9 September 2020.
12. Liang, S. Research on Photovoltaic Target Recognition and Extraction Method Based on Satellite and Aerial Orthophoto Images. Master’s Thesis, Zhejiang University, Hangzhou, China, 2021.
13. Li, Z.; Ji, S.; Fan, D.; Yan, Z.; Wang, F.; Wang, R. Reconstruction of 3D Information of Buildings from Single-View Images Based on Shadow Information. ISPRS Int. J. Geo Inf. 2024, 13, 62.
14. Mayer, K.; Rausch, B.; Arlt, M.-L.; Gust, G.; Wang, Z.; Neumann, D.; Rajagopal, R. 3D-PV-Locator: Large-Scale Detection of Rooftop-Mounted Photovoltaic Systems in 3D. Appl. Energy 2022, 310, 118469.
15. He, K.; Zhang, L. Automatic Detection and Mapping of Solar Photovoltaic Arrays with Deep Convolutional Neural Networks in High Resolution Satellite Images. In Proceedings of the 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (EI2), Wuhan, China, 30 October–1 November 2020; pp. 3068–3073.
16. Anbarasu, B. Vision-Based Heading Estimation for Navigation of a Micro-Aerial Vehicle in GNSS-Denied Staircase Environment Using Vanishing Point. Aerosp. Syst. 2024, 7, 395–418.
17. Malik, Z.; Mirani, A.; Gopi, T.; Alapati, M. A Review on Vision-Based Deep Learning Techniques for Damage Detection in Bolted Joints. Asian J. Civ. Eng. 2024, 25, 5697–5707.
18. Kansal, S.; Mukherjee, S. Automatic Single-View Monocular Camera Calibration-Based Object Manipulation Using Novel Dexterous Multi-Fingered Delta Robot. Neural Comput. Appl. 2019, 31, 2661–2678.
19. Arredondo-Soto, M.; García-Murillo, M.A.; Cervantes-Sánchez, J.J.; Torres, F.J.; Moreno-Avalos, H.A. Identification of Geometric Parameters of a Parallel Robot by Using a Camera Calibration Technique. J. Mech. Sci. Technol. 2021, 35, 729–737.
20. Li, X.; Kim, H.; Kakani, V.; Kim, H. Multilayer Perceptron-Based Error Compensation for Automatic On-the-Fly Camera Orientation Estimation Using a Single Vanishing Point from Road Lane. Sensors 2024, 24, 1039.
21. Lee, S.J.; Hwang, S.S. Fast and Accurate Self-Calibration Using Vanishing Point Detection in Manmade Environments. Int. J. Control Autom. Syst. 2020, 18, 2609–2620.
22. Google Earth Pro, Version 7.3.6.10201, Google LLC: Mountain View, CA, USA, 2023. Available online: https://www.google.com/earth/versions/ (accessed on 25 January 2025).
23. Ge, Y.; Guo, B.; Zha, P.; Jiang, S.; Jiang, Z.; Li, D. 3D Reconstruction of Ancient Buildings Using UAV Images and Neural Radiation Field with Depth Supervision. Remote Sens. 2024, 16, 473.
24. Brazil, G.; Liu, X. M3D-RPN: Monocular 3D Region Proposal Network for Object Detection. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
25. Liu, Y.; Chen, X.; Gu, T.; Zhang, Y.; Xing, G. Real-Time Camera Pose Estimation via Line Tracking. Vis. Comput. 2018, 34, 899–909.
26. Shoukat, M.A.; Sargano, A.B.; Malyshev, A.; You, L.; Habib, Z. SS3DNet-AF: A Single-Stage, Single-View 3D Reconstruction Network with Attention-Based Fusion. Appl. Sci. 2024, 14, 11424.
27. Zhou, Y.; He, Y.; Zhu, H.; Wang, C.; Li, H.; Jiang, Q. Monocular 3D Object Detection: An Extrinsic Parameter Free Approach. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 7552–7562.
28. fSpy, Version 1.0.3, Stuffmatic: Stockholm, Sweden, 2020. Available online: https://fspy.io (accessed on 25 January 2025).
29. Blender Development Team. Blender, Stichting Blender Foundation: Amsterdam, The Netherlands, 2023. Available online: http://www.blender.org (accessed on 25 January 2025).
30. Haghdadi, N.; Copper, J.; Bruce, A.; MacGill, I. A Method to Estimate the Location and Orientation of Distributed Photovoltaic Systems from Their Generation Output Data. Renew. Energy 2017, 108, 390–400.
31. Meng, B.; Loonen, R.C.G.M.; Hensen, J.L.M. Data-Driven Inference of Unknown Tilt and Azimuth of Distributed PV Systems. Sol. Energy 2020, 211, 418–432.

Figure 1. The LOD specifications for the 3D building models.

Figure 2. Image of Hankou University’s Comprehensive Building and No. 3 Teaching Building. (30.5476° N, 114.2652° E).

Figure 3. Study Framework.

Figure 4. Schematic diagram of vanishing point calibration.

Figure 5. Calculating the translation vector.

Figure 6. Geometric alignment and reconstructed building and PV model.

Figure 7. Correlation between the south facade area ratio and the photovoltaic area prediction error ratio.

Figure 8. Comparative histogram of the frequency distribution of MAPE for the two methods.

Figure 9. Scatter plot comparing predicted values and true values for the two methods.

Figure 10. Scatter plot of data before and after outlier removal.

Figure 11. Linear fitting model for PV area and capacity.

Table 1. Results of five-fold cross-validation.

Fold	R²	MSE (W²)	MAE (W)
1	0.9589	391.75	18.18
2	0.9584	454.46	19.13
3	0.9326	516.55	20.29
4	0.9427	458.44	19.86
5	0.9590	431.49	19.33
Mean	0.9503 ± 0.0108	450.54 ± 40.63	19.36 ± 0.71

Note: The dataset was randomly partitioned into five folds for cross-validation.
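The summary row of Table 1 can be reproduced from the per-fold values. The sketch below suggests that the reported ± values correspond to the population standard deviation (ddof = 0), which is an inference from the arithmetic rather than something stated in the text.

```python
import numpy as np

# Per-fold values copied from Table 1
r2 = np.array([0.9589, 0.9584, 0.9326, 0.9427, 0.9590])
mse = np.array([391.75, 454.46, 516.55, 458.44, 431.49])

# np.std defaults to ddof=0 (population standard deviation)
r2_mean, r2_std = r2.mean(), r2.std()
mse_mean, mse_std = mse.mean(), mse.std()
print(f"R^2: {r2_mean:.4f} +/- {r2_std:.4f}")
print(f"MSE: {mse_mean:.2f} +/- {mse_std:.2f}")
```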

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
