Automatic Parametrization and Shadow Analysis of Roofs in Urban Areas from ALS Point Clouds with Solar Energy Purposes

A basic feature of modern and smart cities is their energetic sustainability, using clean and renewable energies and, therefore, reducing the carbon emissions, especially in large cities. Solar energy is one of the most important renewable energy sources, being more significant in sunny climate areas such as the South of Europe. However, the installation of solar panels should be carried out carefully, being necessary to collect information about building roofs, regarding its surface and orientation. This paper proposes a methodology aiming to automatically parametrize building roofs employing point cloud data from an Aerial Laser Scanner (ALS) source. This parametrization consists of extracting not only the area and orientation of the roofs in an urban environment, but also of studying the shading of the roofs, given a date and time of the day. This methodology has been validated using 3D point cloud data of the city of Santiago de Compostela (Spain), achieving roof area measurement errors in the range of ±3%, showing that even low-density ALS data can be useful in order to carry out further analysis with energetic perspective.


Introduction
Nowadays, buildings are responsible of the 36% of CO 2 emissions and 40% of the energy consumption in the European Union (EU), according to the European Commission.To improve the efficiency and sustainability of the energy consumed within buildings is important not only for reducing the carbon footprint but also for generating economic and social benefits related with the wellbeing of the building inhabitants and reducing the energy poverty.That is why the Energy Performance of Buildings Directive [1] is aiming at nearly zero-energy standards, requiring all public buildings to satisfy this energetic efficiency by 2018 and all buildings by the end of 2020.Specifically, photovoltaic solar energy is widely used in urban environments, as it is a clean and silent source of energy, and it accounted for a 11.6% of the total quantity of electricity generated from renewable energy sources in the EU-that is, 28 countries in 2016 [2].Generally, there are three steps that are taken to estimate the solar potential: (1) Collection of input data (cartography, Light Detection and Ranging (LiDAR), or photogrammetry among others), (2) Development of a solar radiation model, and (3) Definition of an interface for the interaction with the end user [3].This work is entirely focused on the first step and its connection with the second, as identifying which areas are suitable for the use of solar energy is essential for the determination of the solar potential [4], meaning that it is necessary to measure position, size, inclination and azimuth of the areas of installation of solar panels, which, in an urban environment, are typically the roofs of the buildings.
As it has been mentioned, there are different sources for the input data that can be used for energy applications.For instance, Nex and Ramondino [5] generate DSM models from aerial images in order to reconstruct roof outlines.Similarly, Ahmadi et al. [6] extract building boundaries also using imagery, being their research based on a model of active contours.The literature, however, has been more focused on employing data from ALS sources, which allow to collect accurate and dense 3D representations of the environment.Laser scanner data has been widely used in the last decade for a huge variety of applications.On one side, Terrestrial Laser Scanners (TLS) are typically employed for the detection and classification of objects at street level [7][8][9], and for road and railway infrastructure analysis [10][11][12].Although TLS data is much denser than ALS and therefore the potential of this data source to capture small features with high resolution is higher, a terrestrial scan cannot collect geometric information about the building roofs as they will be always occluded.Therefore, Aerial data has to be employed, whose densities typically vary between 1-30 points per m 2 to higher densities such as the ALS dataset presented in [13], which averages 200 points per m 2 by maximizing data coverage on building facades, flying at a low altitude and orientating flight paths at 45 • with the major axes of the city streets, making possible a precise segmentation of building facades and roofs [14,15].Other applications of ALS data are the extraction of the road network centerlines [16,17] or terrain recognition [18,19].Regarding the extraction of building roofs with ALS data, there also exist several related works that should be remarked.Yan et al. [20] present a roof segmentation method based on a global plane fitting approach that achieves great accuracy results for point densities between 1.5 and 4 points per m 2 .Vosselman et al. [21] propose a point cloud classification framework that integrates a set of segmentation approaches based on segments and context, selecting features based on local analysis for the classification.With an energetic perspective, Lukač et al. [22] present a photovoltaic estimation of building roofs, considering all the necessary parameters of a photovoltaic module and, on data collected with aerial LiDAR data.Finally, Lingfors et al. [23] compare the performance of low-resolution and high-resolution airborne LiDAR data in order to automatically create a 2.5D building model of a neighborhood, while categorizing the buildings to perform a solar resource assessment.
The main contribution of this work is twofold: First, it presents a fully automatic methodology that segments ALS point clouds in order to extract building roofs and accurately measure several of their geometric features, all of them related with the determination of the solar potential.Furthermore, a shading analysis is proposed where the usable area of those roofs with the most suitable orientation for the installation of solar panels can be computed given any date and time of the day.The novelty of the work is also twofold: On one side, the methodological approach as a whole (although employing already existing techniques in some of its stages, such as triangulation).On the other side, this approach was developed with a focus on the case of the Spanish National Plan of Aerial Ortophotography, which provides aerial point clouds of the Spanish territory with known specifications.
This paper is structured as follows.The proposed methodology is depicted in Section 2. The case study data employed for the validation of the methodology is shown in Section 3.Then, Section 4 shows and discuss the results that have been obtained after the application of the methodology on the case study data.Finally, Section 5 outlines the conclusions of this work.

Methodology
The presented methodological approach consists of a number of sequential processing blocks as depicted in Figure 1.It inputs urban point cloud data acquired from an aerial laser scanner, and aims for the automatic characterization of the roofs with an energetic perspective, extracting information regarding the orientation, surface and shadows on the extracted roofs.These processing blocks have

Data Preprocessing
Let P = (x, y, z) be a 3D point cloud that consists of an unorganized set of 3D coordinates representing the aerial scan of an urban area.As roofs are the only elements that have to be extracted, a preliminary filtering is performed in order to reduce the amount of data to process, hence saving resources in terms of memory and execution time.A visual analysis of an aerial point cloud from an urban area shows that there are three large groups of points: Ground, facades and roofs.The data processing step consists of filtering out one of these groups, specifically the facades.For that purpose, a saliency analysis is applied to the point cloud P.This analysis is based on the algorithm by Wang et al. [24], which aims to classify the road boundaries in an urban point cloud.Here, the aim is to define whether or not a point belongs to an approximately horizontal plane.Such point will be subsequently defined as a non-salient point, while the rest of the points in the cloud will be defined as salient points.First, this algorithm computes the normal vector of each point within P, and separates all the normals in five clusters using k-means.A dominant normal vector is selected as the cluster whose normals are closer to the z-axis, and the distance of every other normal with respect that dominant normal vector is computed.The distances are mapped to a hyperbolic tangent space and classified in two classes: One class will contain non-salient points (roofs and ground), while the second one will contain salient points (facades, noise and other objects such as poles) (Figure 2).This way, a non-salient point cloud can be defined.Let  = (, ) be a function that selects a subset  of points (being  a list of indices) within a point cloud , as defined originally in [15].Thus, given the indices of non-salient points   , the point cloud that will be processed in the following steps will be   = (,   ).

Data Preprocessing
Let P = (x, y, z) be a 3D point cloud that consists of an unorganized set of 3D coordinates representing the aerial scan of an urban area.As roofs are the only elements that have to be extracted, a preliminary filtering is performed in order to reduce the amount of data to process, hence saving resources in terms of memory and execution time.A visual analysis of an aerial point cloud from an urban area shows that there are three large groups of points: Ground, facades and roofs.The data processing step consists of filtering out one of these groups, specifically the facades.For that purpose, a saliency analysis is applied to the point cloud P.This analysis is based on the algorithm by Wang et al. [24], which aims to classify the road boundaries in an urban point cloud.Here, the aim is to define whether or not a point belongs to an approximately horizontal plane.Such point will be subsequently defined as a non-salient point, while the rest of the points in the cloud will be defined as salient points.First, this algorithm computes the normal vector of each point within P, and separates all the normals in five clusters using k-means.A dominant normal vector is selected as the cluster whose normals are closer to the z-axis, and the distance of every other normal with respect that dominant normal vector is computed.The distances are mapped to a hyperbolic tangent space and classified in two classes: One class will contain non-salient points (roofs and ground), while the second one will contain salient points (facades, noise and other objects such as poles) (Figure 2).This way, a non-salient point cloud can be defined.Let S = (P, i) be a function that selects a subset i of points (being i a list of indices) within a point cloud P, as defined originally in [15].Thus, given the indices of non-salient points i ns , the point cloud that will be processed in the following steps will be P ns = S(P, i ns ).

Data Preprocessing
Let P = (x, y, z) be a 3D point cloud that consists of an unorganized set of 3D coordinates representing the aerial scan of an urban area.As roofs are the only elements that have to be extracted, a preliminary filtering is performed in order to reduce the amount of data to process, hence saving resources in terms of memory and execution time.A visual analysis of an aerial point cloud from an urban area shows that there are three large groups of points: Ground, facades and roofs.The data processing step consists of filtering out one of these groups, specifically the facades.For that purpose, a saliency analysis is applied to the point cloud P.This analysis is based on the algorithm by Wang et al. [24], which aims to classify the road boundaries in an urban point cloud.Here, the aim is to define whether or not a point belongs to an approximately horizontal plane.Such point will be subsequently defined as a non-salient point, while the rest of the points in the cloud will be defined as salient points.First, this algorithm computes the normal vector of each point within P, and separates all the normals in five clusters using k-means.A dominant normal vector is selected as the cluster whose normals are closer to the z-axis, and the distance of every other normal with respect that dominant normal vector is computed.The distances are mapped to a hyperbolic tangent space and classified in two classes: One class will contain non-salient points (roofs and ground), while the second one will contain salient points (facades, noise and other objects such as poles) (Figure 2).This way, a non-salient point cloud can be defined.Let  = (, ) be a function that selects a subset  of points (being  a list of indices) within a point cloud , as defined originally in [15].Thus, given the indices of non-salient points   , the point cloud that will be processed in the following steps will be   = (,   ).

Point Cloud Classification
Once the point cloud has been preprocessed and facade points have been filtered out, there are two principal elements on the remaining point cloud P ns , namely ground points and roof points.This step implements a classification process that aims for the definition of one class of points for each of the aforementioned elements.Due to the low density of the point cloud (details are described in Section 3) this process can be carried out using triangulation methods that would be unfeasible in a denser point cloud (e.g., a TLS or MLS acquired point cloud).Here, a Delaunay Triangulation [25] T Pns is computed on P ns (Figure 3a).Generalizing, a Delaunay Triangulation T P outputs a N-by-3 array (being N the number of points in P) where each row contains the point indices of a triangle such that its circumcircle does not contain any other point in its interior.
Considering that facade points have been removed in a previous step, a number of triangles in T Pns will represent connections between roofs and ground.Detecting those triangles is the first step of the classification.For that purpose, the normal vector to the plane defined by each triangle is computed in first place.The normal vector of a triangle that connects roof and ground should have an inclination α with respect to the z-axis close to 90 degrees, so a soft threshold β = 55 degrees is defined to select a group of potential triangles where α > β.Another parameter that allows the identification of connections is the height of the triangle, so a threshold h = 3 m is defined such that only triangles higher than h are considered.Then, for each triangle that complies with both thresholds, the point with highest z is labeled as a roof point, and the point with lowest z as ground point, leaving unlabeled the third point of the triangle (Figure 3b).
These labeled points will be used as seeds for a region growing algorithm aiming to assign labels to the whole point cloud.First, the triangles that were identified as connections are removed from T Pns .Then, in order to avoid spurious triangles due to the presence of noisy, isolated points or in the limits of the point cloud, an area filter is applied such that triangles whose area is more than 1.5 m 2 are also removed (let T f be the triangulation resulting after the removal).This ensures the correct performance of the aforementioned region growing, which consists of the following two steps: (1) Define regions of points which are connected by triangles in T f by searching the point indices of the neighboring triangles in the region and iterating this operation as long as new points are added to it.(2) For each region of points, check if there are labeled points within the region and assign that label to the whole region.As the triangles that connect ground and roofs had been removed, the case of a region containing both labels should not be possible (Figure 3c).
Finally, a label refinement process is carried out.Three considerations are made at this point: There may exist points without label (point regions with no seeds remain unlabeled after the region growing), and there may be mislabeled points both in the ground segment (due to small objects such as the upper part of vehicles or vegetation) and in the roof segment (which are unlikely but may appear in large roofs with different heights).Assuming that the vast majority of the points have been correctly classified, a nearest neighbor algorithm is applied, selecting as neighborhood of each point a sphere of r = 8 m.Regarding unlabeled points, they are assigned the most common label within their neighborhood.Subsequently, an iterative process is defined for detecting or correcting points erroneously labeled as roofs.In the first iteration, the neighborhood of each roof point is checked, and the label is corrected if there are more ground points than roof points within it.This process repeats, gradually reducing the radius r to avoid the contact with actual roofs, until there are no more mislabeled points found.This iterative process was found necessary to correct relatively large objects that should be considered within the ground segment in the context of this work (e.g., upper part of a large truck).Finally, points erroneously labeled as ground are corrected following the same process, but performing only a single iteration, which was found enough for ensuring a correct performance (Figure 3d).
After this process, each point in P ns has a label indicating whether it belongs to the ground (point indices i g ) or a roof (point indices i r ), so the point cloud can be segmented in roofs, P r = S(P ns , i r ) and ground P g = S P ns , i g .

Roof Parametrization
Once the points of the roofs have been isolated, a number of parameters can be computed.From a point of view based on energy efficiency and solar panel installation, there most relevant parameters are the orientation of the roof and its usable area.
First, the orientation of each point within   is computed.For that purpose, the normal vector of each point   = (  ,   ,   )  is obtained using a local neighborhood of 10 points.These normal vectors describe the orientation of each point, defining the azimuth and the elevation of the vector as: where 2 is the four-quadrant inverse tangent, and  the two-quadrant (angles within [−90,90] range) inverse sine.Hence, the azimuth represents the angle between the horizontal projection of the normal vector and the North (which is the y-axis for the study case data) in the XY plane, and the elevation represents the angle between the normal and the East (which is the x-axis) in the XZ plane.Here, an orientation index is assigned to each point according to its azimuth, as represented in Figure 4a, defining the orientation as North-East, North-West, South-West or South-East.This way, the point cloud   can be visualized based on the orientation index of each point, Figure 4b.Although the orientation of the roof points is already known, the point cloud   is still a set of unorganized 3D coordinates, with no contextual relationship among them.In order to organize this information, a Euclidean clustering algorithm driven by the orientation index is applied.A cluster of points   = {, , }  will contain a number of points such that its orientation index is the same, the closest Euclidean distance between any pair of points of the cluster is less than a predefined threshold,

Roof Parametrization
Once the points of the roofs have been isolated, a number of parameters can be computed.From a point of view based on energy efficiency and solar panel installation, there most relevant parameters are the orientation of the roof and its usable area.
First, the orientation of each point within P r is computed.For that purpose, the normal vector of each point n i = n x , n y , n z y is obtained using a local neighborhood of 10 points.These normal vectors describe the orientation of each point, defining the azimuth and the elevation of the vector as: where arctan2 is the four-quadrant inverse tangent, and arcsin the two-quadrant (angles within [−90,90] range) inverse sine.Hence, the azimuth represents the angle between the horizontal projection of the normal vector and the North (which is the y-axis for the study case data) in the XY plane, and the elevation represents the angle between the normal and the East (which is the x-axis) in the XZ plane.Here, an orientation index is assigned to each point according to its azimuth, as represented in Figure 4a, defining the orientation as North-East, North-West, South-West or South-East.This way, the point cloud P r can be visualized based on the orientation index of each point, Figure 4b.
Although the orientation of the roof points is already known, the point cloud P r is still a set of unorganized 3D coordinates, with no contextual relationship among them.In order to organize this information, a Euclidean clustering algorithm driven by the orientation index is applied.A cluster of points C i = {x, y, z} i will contain a number of points such that its orientation index is the same, the closest Euclidean distance between any pair of points of the cluster is less than a predefined threshold, and the closest Euclidean distance with respect to any point of a different cluster is more than that threshold (which, for this work, has been empirically defined as 2 m).This iterative process is carried out independently for each orientation index, obtaining a set of clusters C roo f s = {C NE , C NW , C SW , C SE }.This way, the points of a gabled roof will be represented by two different clusters, indicating its orientation.
Finally, the surface of each cluster is computed.There are different possible approaches for obtaining this parameter.One solution consists on the rasterization of each point cluster (that is, projecting the points on a squared grid on the XY plane and assigning the same index to the points within each cell of the grid), but it was found that the size of the grid using for building the raster has a considerable impact on the accuracy of the measured surface.Hence, the chosen approach consists of a triangulation of the points in each cluster and the addition of the surfaces of each triangle.
This stage of the process outputs, for each roof in the point cloud P r , the set of clusters C roo f s that contains the points of each roof, having into account its orientation; as well as the area A roo f s = {A NE , A NW , A SW , A SE } of every group of points in C roo f s .
ISPRS Int.J. Geo-Inf.2018, 7, x FOR PEER REVIEW 6 of 14 and the closest Euclidean distance with respect to any point of a different cluster is more than that threshold (which, for this work, has been empirically defined as 2 m).This iterative process is carried out independently for each orientation index, obtaining a set of clusters   = {  ,   ,   ,   }.This way, the points of a gabled roof will be represented by two different clusters, indicating its orientation.
Finally, the surface of each cluster is computed.There are different possible approaches for obtaining this parameter.One solution consists on the rasterization of each point cluster (that is, projecting the points on a squared grid on the XY plane and assigning the same index to the points within each cell of the grid), but it was found that the size of the grid using for building the raster has a considerable impact on the accuracy of the measured surface.Hence, the chosen approach consists of a triangulation of the points in each cluster and the addition of the surfaces of each triangle.
This stage of the process outputs, for each roof in the point cloud   , the set of clusters   that contains the points of each roof, having into account its orientation; as well as the area   = {  ,   ,   ,   } of every group of points in   .

Shading Analysis
Shading is a relevant factor that influences the energy production of solar panels installed on the roof of a building.Using only the 3D data from the point cloud   , it is possible to establish the shading of a surface given the date and time of the day.Note that, as this work is focused on the geometric rather than on the energetic dimension, the effect of the meteorology will not be considered, assuming totally clear days for the computation of the shading.
In order to know the shading of a roof surface at a given time, it is necessary to obtain the azimuth (  ) and elevation (  ) of the Sun.Given the latitude () and the longitude (), as well as the day of the year (d, which stands for the number of days since the beginning of the year) and time (H, hour of the day expressed as a decimal value), these parameters are computed as:

Shading Analysis
Shading is a relevant factor that influences the energy production of solar panels installed on the roof of a building.Using only the 3D data from the point cloud P r , it is possible to establish the shading of a surface given the date and time of the day.Note that, as this work is focused on the geometric rather than on the energetic dimension, the effect of the meteorology will not be considered, assuming totally clear days for the computation of the shading.
In order to know the shading of a roof surface at a given time, it is necessary to obtain the azimuth (A sun ) and elevation (E sun ) of the Sun.Given the latitude (ϕ) and the longitude (λ), as well as the day of the year (d, which stands for the number of days since the beginning of the year) and time (H, hour of the day expressed as a decimal value), these parameters are computed as: A sun = arcsin(sin(ϕ) sin(δ) + cos(ϕ) cos(δ) cos(ω)) (3) where δ is the declination angle, computed as shown in Equation ( 5), being the angle b obtained as shown in Equation ( 6): and ω is the hour angle (Equation ( 7)), which depends on the Time-Corrected Equation of Time (EoT TC ) (Equation ( 8)): Once the azimuth and elevation of the Sun are known for a specific date and time, the shading of each roof in C roo f s can be computed.As solar panels are installed oriented to the South in the North Hemisphere, only the cluster subset C s = {C SW , C SE } is for this analysis.
The process that has been followed can be thought of as a visibility analysis where, for each point p i = (x i , y i , z i ) ∈ C s , an occlusion search is performed in the direction of a vector with azimuth and elevation (A sun , E sun ), which can be represented in Cartesian coordinates as a unit vector v S = v sx , v sy , v sz .First, the parametric equations of a line that goes through p i with the direction of v S are defined: Then, a sliding sphere of radius 0.5 m is defined and slides through the line by setting a number of equally spaced positions for its center varying the parameter t from Equation (9).For each position of the sphere, the presence of points of P r within it is checked.Whenever any point is found, it will be considered that there is an occlusion, which implies the point is shaded.
Finally, for each cluster of points, the surface free of shading is computed following the approach of Section 2.2.If an array with different times of the day is defined, the daily evolution of the shading and the usable surface of a roof can be visualized (Figure 5).
All the parameters that have been computed for each roof can be put together as an object of a class specifically defined to store the results of the process.Therefore, each roof in C roo f s is represented by its point cloud, orientation index, elevation, total surface, and usable surface at a time (or array of times) of a given day.

Case Study
The methodologies defined in this work have been applied to an aerial laser scan of the city of Santiago de Compostela, in the northwest of Spain.The point cloud data employed comes from the Spanish National Plan of Aerial Ortophotography (PNOA) (http://pnoa.ign.es/presentacion), which has acquired aerial point cloud data from the Spanish territory.The aerial system employs the geodetic system ETRS89, with a positioning system that comprises GPS and an inertial system (IMU) for the processing of the trajectory.The LiDAR system has a density of 0.5 points/m 2 , with an average distance between points of 1.4 m.It records up to four returns for each pulse, and the altimetric precision of the point cloud has a mean squared error   ≤ 0.2 m.
Results in Section 4 will employ three different sections of the case study data in order to offer a quantitative measurement of the methodology performance.Each section is easily distinguishable from the others in terms of urban features (Figure 6): (1) The historic center, which dates from the 9th century.The buildings have small fronts and large depths, and the streets have a variable width, generally with North-South direction.( 2) The first expansion of the city, from the 20th century.The population growth was fast at that time and, even though the new buildings had a previous planning, the building blocks have irregular shapes, and the heights are not uniform.(3) Second expansion of the city, from the first decade of 21st century.The urban growth is planned, with wide streets and building blocks with regular shapes and height.Table 1 contains information about the number of points and area of the case study data.

Case Study
The methodologies defined in this work have been applied to an aerial laser scan of the city of Santiago de Compostela, in the northwest of Spain.The point cloud data employed comes from the Spanish National Plan of Aerial Ortophotography (PNOA) (http://pnoa.ign.es/presentacion), which has acquired aerial point cloud data from the Spanish territory.The aerial system employs the geodetic system ETRS89, with a positioning system that comprises GPS and an inertial system (IMU) for the processing of the trajectory.The LiDAR system has a density of 0.5 points/m 2 , with an average distance between points of 1.4 m.It records up to four returns for each pulse, and the altimetric precision of the point cloud has a mean squared error RMSE Z ≤ 0.2 m.
Results in Section 4 will employ three different sections of the case study data in order to offer a quantitative measurement of the methodology performance.Each section is easily distinguishable from the others in terms of urban features (Figure 6): (1) The historic center, which dates from the 9th century.The buildings have small fronts and large depths, and the streets have a variable width, generally with North-South direction.( 2) The first expansion of the city, from the 20th century.The population growth was fast at that time and, even though the new buildings had a previous planning, the building blocks have irregular shapes, and the heights are not uniform.(3) Second expansion of the city, from the first decade of 21st century.The urban growth is planned, with wide streets and building blocks with regular shapes and height.Table 1 contains information about the number of points and area of the case study data.

Results and Discussion
The methodology proposed in Section 2 has been validated in the case study data showed in Section 3. First, quantitative results are given for the point cloud samples from the historic center, and the first and second expansions of the city regarding roof surface calculation.Then, qualitative results are given, showing the results after the complete case study data is analyzed, regarding roof classification and shading analysis.

Roof Parametrization
In order to quantify the performance of the methodology, the point cloud samples were processed, and the extracted roof areas were compared with a manual ground truth that was gathered using data from Google Maps.Note that the outputs of the methodology consider each orientation of a roof as a single element and, also, under-segmentation may be possible if several roofs with the same orientation have not geometric separation.Therefore, the manual reference had to be collected considering both issues (Figure 7).Results are shown separately from each point cloud sample in order to extract meaningful conclusions from the data.Furthermore, to offer reliable and compact results, only the 10 largest roofs of each sample are shown in this section.

Results and Discussion
The methodology proposed in Section 2 has been validated in the case study data showed in Section 3. First, quantitative results are given for the point cloud samples from the historic center, and the first and second expansions of the city regarding roof surface calculation.Then, qualitative results are given, showing the results after the complete case study data is analyzed, regarding roof classification and shading analysis.

Roof Parametrization
In order to quantify the performance of the methodology, the point cloud samples were processed, and the extracted roof areas were compared with a manual ground truth that was gathered using data from Google Maps.Note that the outputs of the methodology consider each orientation of a roof as a single element and, also, under-segmentation may be possible if several roofs with the same orientation have not geometric separation.Therefore, the manual reference had to be collected considering both issues (Figure 7).Results are shown separately from each point cloud sample in order to extract meaningful conclusions from the data.Furthermore, to offer reliable and compact results, only the 10 largest roofs of each sample are shown in this section.(a) A cluster of points containing a roof is highlighted in red.Note that it contains only points which are oriented to the South-East, and as there is not physical separation between roofs of different buildings with the same orientation, the cluster presents under-segmentation.(b) The manual reference was collected taking into account this nature of the processing algorithms in order to correctly quantify the performance of the methodology.show the results of the comparison between the areas returned by the application of the proposed methodology and the manually measured areas.Several conclusions can be drawn from these results.First, it can be seen that the overall performance of the algorithm is not highly affected by the type of urbanization.The three data samples employed show different classes of urban blocks, and the measurement error varies on the ±3% range.These errors have three different causes: (1) The resolution of the point cloud, as stated in Section 3, is such that the average distance between points is 1.4 m.Therefore, it is likely that the boundaries of the roofs are not precisely measured and a small amount of surface is lost.(2) The presence of balconies close to the roofs (on the top floor of the building) may be interpreted as part of the roof itself, increasing the measured area.This building typology appears specially in the First Expansion of the city (Table 3), which explains the fact that the measured area is larger than the ground truth area and some of the errors are over +10%.(3) Undersegmentation is likely to happen in areas with a high density of buildings with no separation and same height.As long as there is not a physical separation between roofs of different buildings, geometry properties will not be enough to segment them properly, and under-segmentation only implies that measured areas may stand for a group of buildings instead of single buildings, but it should not affect to the precision of the area measurement itself, and therefore it is preferred over over-segmentation.However, it is important to note that the ground truth of over-segmented roofs may be less reliable as the manual reference to be extracted is more complex.It may also include the surface of small chimneys, which can be a relevant source of error in small roofs where these objects represent a considerable proportion of the total surface.This explains the fact that the largest error is presented in the Historic Centre, where the roofs are smaller and there is not separation between many of the buildings.Tables 2-4 show the results of the comparison between the areas returned by the application of the proposed methodology and the manually measured areas.Several conclusions can be drawn from these results.First, it can be seen that the overall performance of the algorithm is not highly affected by the type of urbanization.The three data samples employed show different classes of urban blocks, and the measurement error varies on the ±3% range.These errors have three different causes: (1) The resolution of the point cloud, as stated in Section 3, is such that the average distance between points is 1.4 m.Therefore, it is likely that the boundaries of the roofs are not precisely measured and a small amount of surface is lost.(2) The presence of balconies close to the roofs (on the top floor of the building) may be interpreted as part of the roof itself, increasing the measured area.This building typology appears specially in the First Expansion of the city (Table 3), which explains the fact that the measured area is larger than the ground truth area and some of the errors are over +10%.
(3) Under-segmentation is likely to happen in areas with a high density of buildings with no separation and same height.As long as there is not a physical separation between roofs of different buildings, geometry properties will not be enough to segment them properly, and under-segmentation only implies that measured areas may stand for a group of buildings instead of single buildings, but it should not affect to the precision of the area measurement itself, and therefore it is preferred over over-segmentation.However, it is important to note that the ground truth of over-segmented roofs may be less reliable as the manual reference to be extracted is more complex.It may also include the surface of small chimneys, which can be a relevant source of error in small roofs where these objects represent a considerable proportion of the total surface.This explains the fact that the largest error is presented in the Historic Centre, where the roofs are smaller and there is not separation between many of the buildings.

Roof Classification and Shading Analysis
The proposed methodology has been applied to the whole case study data, which, as stated in Section 3, comprises approximately 4.5 million points of the city of Santiago de Compostela.In order to process these data, the point cloud is divided in a number of point cells, and each of them is processed individually.Figure 8 shows the results obtained after this processing.Qualitatively, it is proven that the algorithms work consistently across different types of buildings and urban distributions.Table 5 shows some relevant results such as the total area covered by roofs and the area covered on each orientation, as well as the non-shaded area on 16th May at 3 p.m.Note that the non-shaded area corresponds only to those roofs that are oriented to the South.
to process these data, the point cloud is divided in a number of point cells, and each of them is processed individually.Figure 8 shows the results obtained after this processing.Qualitatively, it is proven that the algorithms work consistently across different types of buildings and urban distributions.Table 5 shows some relevant results such as the total area covered by roofs and the area covered on each orientation, as well as the non-shaded area on 16th May at 3 p.m.Note that the nonshaded area corresponds only to those roofs that are oriented to the South.

Conclusions
In this paper, a methodology for the automatic classification and parametrization of building roofs using 3D point cloud data from an aerial laser scanner is presented, which aims to compute the area of the roofs in urban environments having into account its orientation so further analysis regarding the installation of solar panels can be made.The methodology relies only on the geometry of the point cloud (that is, it only uses the 3D coordinates xyz), and consists of a number of processing steps applied sequentially to the input data.First, a classification of the ground, based on a triangulation of the 3D cloud, filters out unnecessary points.Then, the points that belong to roofs are organized and parametrized, obtaining a number of clusters that represent individual roofs with a set of properties such as area and orientation.Finally, a shading analysis is carried out in order to analyze the usable surface of a roof given a certain date and time, which is useful to know the adequacy of a roof to have solar panels installed.
This methodology was developed in a data-driven fashion, being focused on the case of the Spanish National Plan of Aerial Ortophotography, which provides aerial point clouds of the Spanish territory.The aerial scan of the city of Santiago de Compostela (Spain) was employed as study case

Conclusions
In this paper, a methodology for the automatic classification and parametrization of building roofs using 3D point cloud data from an aerial laser scanner is presented, which aims to compute the area of the roofs in urban environments having into account its orientation so further analysis regarding the installation of solar panels can be made.The methodology relies only on the geometry of the point cloud (that is, it only uses the 3D coordinates xyz), and consists of a number of processing steps applied sequentially to the input data.First, a classification of the ground, based on a triangulation of the 3D cloud, filters out unnecessary points.Then, the points that belong to roofs are organized and parametrized, obtaining a number of clusters that represent individual roofs with a set of properties such as area and orientation.Finally, a shading analysis is carried out in order to analyze the usable surface of a roof given a certain date and time, which is useful to know the adequacy of a roof to have solar panels installed.
This methodology was developed in a data-driven fashion, being focused on the case of the Spanish National Plan of Aerial Ortophotography, which provides aerial point clouds of the Spanish territory.The aerial scan of the city of Santiago de Compostela (Spain) was employed as study case for the validation of the methodology.Three different samples, representing well differentiated zones of the city, were extracted in order to quantify the performance of the roof area measurement.It was found that the error is on the ±3% range, which is an interesting result having into account the small point density of the point cloud data.The main conclusion that can be extracted from these results is that aerial point cloud data is totally suitable for carrying out further analysis which can be focused on the adequacy of the buildings to have solar panels installed, not only because the area orientated to the South (considering the analysis on the North Hemisphere) is known, but also because the shading of the roof can be calculated, and therefore the energy loss due to the shading can also be known.

Figure 2 .
Figure 2. Data preprocessing.A saliency analysis is performed on the raw point cloud , removing points from the facades (coloured in red), leaving mainly roofs and the ground.

Figure 2 .
Figure 2. Data preprocessing.A saliency analysis is performed on the raw point cloud , removing points from the facades (coloured in red), leaving mainly roofs and the ground.

Figure 2 .
Figure 2. Data preprocessing.A saliency analysis is performed on the raw point cloud P, removing points from the facades (coloured in red), leaving mainly roofs and the ground.

Figure 3 .
Figure 3. Point cloud classification.(a) Delaunay Triangulation of the point cloud,   .(b) The geometric features of the triangulation are employed to find the connections between roofs (red points) and ground (black points), which can be used to perform the segmentation using a region growing algorithm.(c) Result of the region growing algorithm.It can be seen that some points are still not classified or misclassified, being a refinement needed at this point.(d) Results after the labelling refinement.

Figure 3 .
Figure 3. Point cloud classification.(a) Delaunay Triangulation of the point cloud, T Pns .(b) The geometric features of the triangulation are employed to find the connections between roofs (red points) and ground (black points), which can be used to perform the segmentation using a region growing algorithm.(c) Result of the region growing algorithm.It can be seen that some points are still not classified or misclassified, being a refinement needed at this point.(d) Results after the labelling refinement.

Figure 4 .
Figure 4. Roof parametrization.(a) Four orientation indices are defined for each point: North West-North East, North East-South East, South East-South West and South West-North West.(b) Visualization of the orientation of each point in the cloud   .(c) Result of the orientation-driven Euclidean Clustering process.Each cluster is represented in a random color.

Figure 4 .
Figure 4. Roof parametrization.(a) Four orientation indices are defined for each point: North West-North East, North East-South East, South East-South West and South West-North West.(b) Visualization of the orientation of each point in the cloud P r .(c) Result of the orientation-driven Euclidean Clustering process.Each cluster is represented in a random color.

Figure 5 .
Figure 5. Shading analysis.The shading of the roof at three different times of the day (morning, noon, and afternoon) is represented in black color.

Figure 5 .
Figure 5. Shading analysis.The shading of the roof at three different times of the day (morning, noon, and afternoon) is represented in black color.

14 Figure 6 .
Figure 6.Case study data.(a) The city of Santiago de Compostela (Spain) was analyzed in order to get a qualitative measure of the performance of the methodology.Smaller samples of the point cloud were manually extracted to get quantitative measures from three easily differentiated areas within the city, the historic center (light blue), the first city expansion (light green) and the second city expansion (light red).(b) Point cloud data of the city.

Figure 6 .
Figure 6.Case study data.(a) The city of Santiago de Compostela (Spain) was analyzed in order to get a qualitative measure of the performance of the methodology.Smaller samples of the point cloud were manually extracted to get quantitative measures from three easily differentiated areas within the city, the historic center (light blue), the first city expansion (light green) and the second city expansion (light red).(b) Point cloud data of the city.

Figure 7 .
Figure 7.Comparison of measured areas with ground truth areas.(a)A cluster of points containing a roof is highlighted in red.Note that it contains only points which are oriented to the South-East, and as there is not physical separation between roofs of different buildings with the same orientation, the cluster presents under-segmentation.(b) The manual reference was collected taking into account this nature of the processing algorithms in order to correctly quantify the performance of the methodology.

Figure 7 .
Figure 7.Comparison of measured areas with ground truth areas.(a)A cluster of points containing a roof is highlighted in red.Note that it contains only points which are oriented to the South-East, and as there is not physical separation between roofs of different buildings with the same orientation, the cluster presents under-segmentation.(b) The manual reference was collected taking into account this nature of the processing algorithms in order to correctly quantify the performance of the methodology.

Figure 8 .
Figure 8. Roof classification, city of Santiago de Compostela.(a) All the points classified as roofs are highlighted in red.(b) The roofs are colored according to their orientation.

Figure 8 .
Figure 8. Roof classification, city of Santiago de Compostela.(a) All the points classified as roofs are highlighted in red.(b) The roofs are colored according to their orientation.

Table 1 .
Case study data.

Table 1 .
Case study data.

Table 2 .
Results for the Historic Centre data sample.

Table 2 .
Results for the Historic Centre data sample.

Table 3 .
Results for the First Expansion data sample.

Table 4 .
Results for the Second Expansion data sample.

Table 5 .
Measured areas of the roofs according to their orientation, and not-shaded areas (on 7th July, 3 p.m.) only measured on roofs which are oriented to the South east-South west. )