Automatic Object-Detection of School Building Elements in Visual Data: A Gray-Level Histogram Statistical Feature-Based Method

Automatic object-detection techniques can improve the efficiency of building data collection for semi-empirical methods that assess the seismic vulnerability of buildings at a regional scale. However, current structural element detection methods rely on the color, texture, and/or shape information of the object to be detected and are less flexible and reliable when detecting columns or walls with unknown surface materials or deformed shapes in images. To overcome these limitations, this paper presents an innovative gray-level histogram (GLH) statistical feature-based object-detection method for automatically identifying structural elements, including columns and walls, in an image. The method starts by converting an RGB image (i.e., an image whose colors are a mix of red, green, and blue light) into a grayscale image, followed by detecting vertical boundary lines using the Prewitt operator and the Hough transform. The detected lines divide the image into several sub-regions. Then, three GLH statistical parameters (variance, skewness, and kurtosis) of each sub-region are calculated. Finally, a column or a wall in a sub-region is recognized if these features satisfy predefined criteria. The method was validated by testing the detection precision and recall on column and wall images. The results indicate the high accuracy of the proposed method in detecting structural elements with various surface treatments or deflected shapes. The proposed method can be extended to detect more structural characteristics and retrieve structural deficiencies from digital images in the future, promoting automation in building data collection.


Introduction
Regional seismic vulnerability assessment of school buildings can provide civil-protection agencies with information crucial to optimal emergency planning and mitigation, including projections of the scale and distribution of building damage. Semi-empirical methods, which evaluate the seismic vulnerability of buildings based on a small set of building attributes selected by experts coupled with data-mining techniques, have proved their efficiency in rapidly assessing the seismic vulnerability of buildings at a large scale [1,2]. These building attributes include the number and sizes of structural elements such as columns and walls, building height, and structural deficiencies.

Background
Ideally, a holistic urban seismic risk assessment would investigate the behavior of every single building during seismic events via computational modeling. However, such modeling is impractical at a regional scale because it requires detailed information on each building, and building models and executing structural analyses for hundreds of thousands of buildings takes massive time. For vulnerability assessment to achieve adequate accuracy while avoiding exhaustive data collection and computational analysis, semi-empirical approaches have been developed: a few structural characteristics of buildings (e.g., the number and sizes of columns and walls, building height, and structural deficiencies) assessed by expert judgment are coupled with data-mining techniques to predict buildings' seismic vulnerabilities [1,14,15]. Therefore, collecting structural information is the first step in successfully applying semi-empirical approaches to building seismic vulnerability assessment at a regional scale. Recognizing the deficiencies of traditional paper-based field surveys, researchers have applied vision-based technologies to structural information collection.
Prior studies on structural element detection can be generally classified into color/texture-based methods and shape-based methods: the former detect a structural element by judging whether the color or texture of the object's material matches predefined material databases, and the latter by judging whether the shape of the object matches predefined element shape templates. However, structural elements in images may fail to match these predefined databases or templates because of their diversity in color, texture, and shape. Therefore, grayscale statistical feature-based object-detection methods, which can detect objects without predefined databases or templates, are introduced in this section.

Color/Texture-Based Object Detection Methods
The color and texture of materials are commonly used in detecting structural elements since those elements are usually composed of a single material (e.g., concrete) [11-13,16]. Based on the finding that most color values of construction materials fall within certain ranges, Abeid Neto et al. [17] identified a structural element by detecting boundary pixels whose color values satisfied predefined color ranges. Instead of considering color features only, Brilakis and Soibelman [18] combined both color and texture features to label regions in an image. A region was considered a structural element if the Euclidean distance between the region label and a predefined material label was smaller than a predefined threshold. Recently, machine learning techniques have been widely used to detect concrete structural elements by recognizing their materials [11,19-22]. The input data are quantified color or texture characteristics, such as RGB values (the color depth of red, green, and blue light, ranging from 0 to 255) [23], luminance values [24], and filter bank responses [25].
However, color/texture-based methods cannot cope with structural elements covered in unknown paint, tiles, or other surface treatments, which are commonly seen on school building columns and walls. Moreover, color/texture-based methods require a predefined material library, which is not only time-consuming to build but also unable to cover all characteristics of materials, because even the same material can exhibit different characteristics under different scenarios. For example, the color of a material can differ under various illumination intensities. Characteristics missing from the library can lead to detection failure.

Shape-Based Object Detection Methods
In addition to color and texture features, edge information, which indicates the boundary of an object and appears where image pixel intensity changes sharply, is another feature used for detecting structural elements. Shape-based object detection methods identify structural elements in images by detecting their edge lines. Jung and Schramm [26] detected rectangular objects representing structural elements in images by searching for their two pairs of parallel edge lines. Although this method performs well in detecting rectangular structural elements of different sizes and length-width ratios, the sole reliance on shape information narrows its application to slanted or vague elements in images [27]. To overcome such drawbacks by coupling shape information with color and texture features, Zhu and Brilakis [28] proposed a concrete column-detection method based on the assumptions that the shape of a single concrete column is bounded by a pair of long vertical lines, and that the color and texture patterns on the surface of a concrete column are uniform and can be matched to a predefined color-pattern database for particular materials. Zhang et al. [29] used long edge lines as well as color features to locate a wall between two windows. Hamledari et al. [30] utilized both shape and color features to detect vertical indoor components of structures from 2D digital images during construction phases. Although such a combination mitigates the drawbacks of shape-based methods, it inherits the aforementioned disadvantages of color/texture-based methods. In addition, current shape-based methods fail when the vertical edge lines of structural elements deviate from the vertical direction because of camera tilting or perspective illusion.

Grayscale Statistical Feature-Based Object Detection Methods
Grayscale statistical feature-based methods utilize a set of statistical parameters extracted from grayscale pixel intensity of images to recognize objects. Generally, grayscale statistical parameters are classified into two groups: gray-level co-occurrence matrix (GLCM) statistical parameters and gray-level histogram (GLH) statistical parameters.
Haralick and Shanmugam [31] first introduced the GLCM, which describes the frequency with which one gray level appears in a specified spatial linear relationship with another gray level within the area under investigation, and originally proposed 14 GLCM statistical parameters to characterize the GLCM. Since then, these GLCM statistical parameters have been widely used in image recognition, such as detecting sea ice changes in images [32-34], classifying landscape images [35], and distinguishing computed tomography (CT) images of normal and abnormal tissues [36,37]. However, it is difficult to understand the essential meanings of GLCM statistical parameters and to visibly check how they affect detection results, since they represent abstract characteristics of an object. Accordingly, GLH statistical parameters (mean, variance, integrated intensity, skewness, and kurtosis), representing the distribution of the pixel intensity of an image, have been developed and used in image recognition [38-40]. Zhang and Wang [41] used GLH and GLCM statistical parameters respectively to classify normal and abnormal human brain CT images, and the results showed that the classification accuracy with GLH statistical parameters was higher than with GLCM statistical parameters, demonstrating GLH's superiority in image classification. This result was also supported by Fallahi et al. [42]. Beyond medical use, An et al. [43] classified traffic signs with GLH statistical parameters and achieved a high classification accuracy.
The proposed method innovatively combines GLH statistical parameters with the shape features of structural elements to recognize columns and walls in images, given that columns and walls have two distinguishing visual characteristics: (1) the gray levels of their surfaces are relatively homogeneous compared with neighboring areas that contain various objects, and (2) their boundary edges are long vertical lines. Additionally, as some GLH statistical parameters contribute little to representing specific objects, it is unnecessary to use all GLH parameters to detect objects. Thus, this study selects proper GLH parameters by considering both their statistical and visual meanings, decreasing the computational cost of structural element detection.

Methodology
As shown in Figure 1, the method starts by transforming an RGB image (i.e., an image whose colors are a mix of red, green, and blue light) into a grayscale image. Then the Prewitt operator, coupled with the Hough transform, is used to detect long vertical lines in the grayscale image. Quadrilateral sub-regions are then identified between pairs of adjacent detected lines. Next, three GLH statistical parameters (variance, skewness, and kurtosis) of each sub-region are calculated. A structural element is recognized if the shape and the GLH statistical parameters of a sub-region satisfy the predefined criteria. Finally, the type of the element (a column or a wall) is determined according to the ratio of the sub-region's length to its width.

Figure 1. The process map for the automatic detection of columns and walls.

Image Preprocessing
In order to extract the GLH statistical parameters in later steps, the RGB image is first converted into a grayscale image using Equation (1):

f(x, y) = a1 R(x, y) + a2 G(x, y) + a3 B(x, y), (1)

where (x, y) is the location of a pixel in the pixel matrix of an image; f(x, y) is the grayscale value, a pixel intensity level between 0 and 255; R(x, y), G(x, y), and B(x, y) represent the red, green, and blue values of the pixel located at (x, y), respectively; and a1, a2, and a3 are weighting factors that are automatically adjusted according to the R, G, B values of a given image. An edge is defined as an abrupt change of gray levels in a grayscale image [44], providing an indication of the physical extent of an object. In a continuous domain edge segment F(x, y), the continuous gradient G(x, y) is calculated along a certain direction. An edge is detected if the gradient G(x, y) is above a given threshold value. The gradient G(x, y) can be computed by Equation (2), where ϕ is the angle between F(x, y) and the horizontal axis.
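The conversion in Equation (1) can be sketched as follows. This is an illustrative Python stand-in, not the paper's MATLAB implementation, and it assumes the common fixed ITU-R BT.601 luminance weights instead of the paper's automatically adjusted a1, a2, a3:

```python
# Sketch of the RGB-to-grayscale conversion in Equation (1).
# The paper adjusts the weights per image; fixed luminance weights
# are used here as an assumed stand-in.
def rgb_to_gray(r, g, b, a1=0.299, a2=0.587, a3=0.114):
    """Map an (R, G, B) pixel to a gray level in [0, 255]."""
    return round(a1 * r + a2 * g + a3 * b)

# A pure-red pixel maps to a dark gray level; pure white stays white.
print(rgb_to_gray(255, 0, 0))      # 76
print(rgb_to_gray(255, 255, 255))  # 255
```

Applying this mapping to every pixel of the M × N matrix yields the grayscale image used for edge detection.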
In a discrete grayscale image, which is an M × N pixel matrix with M rows and N columns of pixels, a row gradient G_R(j, k) and a column gradient G_C(j, k) can be calculated using Equation (3), where (j, k) is the coordinate of a pixel in the matrix, with the first pixel (in the top-left corner) at (0, 0). The row and column gradients G_R(j, k) and G_C(j, k) can alternatively be computed using impulse response matrices, Equation (4), where H_R(j, k) and H_C(j, k) are the row- and column-impulse response matrices, respectively, which can be defined by given edge-detection operators: the Prewitt operator [45], the Sobel operator [46], the Roberts operator [47], and the Canny operator [48]. Among these operators, the Prewitt operator is distinguished by its capability of detecting horizontal and vertical edges [49], defined by Equations (5) and (6), respectively. However, an edge-detection operator alone only yields a discrete edge map composed of isolated boundary points, without knowing which points lie on the same lines. Therefore, coupled with edge-detection operators, line-detection techniques should be used to extract lines from the discrete edge map.
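The Prewitt step can be illustrated with a minimal sketch (Python rather than the paper's MATLAB; the kernel signs, gradient-magnitude threshold, and border handling are illustrative assumptions):

```python
# Minimal Prewitt edge detection on a grayscale pixel matrix
# (Equations (4)-(6)); threshold and normalization are illustrative.
PREWITT_ROW = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]  # responds to vertical edges
PREWITT_COL = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]  # responds to horizontal edges

def convolve3x3(img, kernel, j, k):
    """Correlate a 3x3 kernel with img centered at pixel (j, k)."""
    return sum(kernel[dj + 1][dk + 1] * img[j + dj][k + dk]
               for dj in (-1, 0, 1) for dk in (-1, 0, 1))

def prewitt_edge_map(img, threshold=100):
    """Binary edge map: 1 where sqrt(G_R^2 + G_C^2) exceeds threshold."""
    m, n = len(img), len(img[0])
    edges = [[0] * n for _ in range(m)]
    for j in range(1, m - 1):
        for k in range(1, n - 1):
            gr = convolve3x3(img, PREWITT_ROW, j, k)
            gc = convolve3x3(img, PREWITT_COL, j, k)
            if (gr * gr + gc * gc) ** 0.5 > threshold:
                edges[j][k] = 1
    return edges

# A dark-to-bright vertical step yields a column of interior edge pixels.
img = [[0, 0, 200, 200] for _ in range(4)]
print(prewitt_edge_map(img))  # [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```

The resulting edge map is exactly the kind of binary image that the Hough transform consumes in the next step.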
The Hough transform [50] is widely used for retrieving line information from edge maps, due to its ability to deal with pixels that are missing or spurious as a result of noise produced by image defects or edge-detector imperfections. In a discrete binary image, such as an edge map produced by the Prewitt operator, each non-zero point (x, y) in the Cartesian coordinate space is transformed into a line in the polar coordinate space (ρ, θ) using Equation (7):

ρ = x cos θ + y sin θ, (7)

where ρ is the perpendicular distance from the image's origin to the line, restricted to [−D, D], with D being the half-diagonal size of the image; and θ is the angle between the normal of the line and the x-axis of the image, restricted to [−90°, 90°]. After transformation, lines are quantized into accumulator cells A(ρ, θ) whose initial values are 0. Straight lines that are transformed from points lying on the same line in Cartesian coordinate space intersect at the same point in polar coordinate space. The number of straight lines intersecting at the point (ρ_i, θ_i) is recorded in A(ρ_i, θ_i), which represents the number of edge points lying on the same straight line. After all detected edge points are processed, the A(ρ, θ) cells are examined. Large counts in cells A(ρ, θ) correspond to collinear edge points that can be fitted by a straight line with the (ρ, θ) parameters. Small counts usually represent isolated or noise points and can be deleted. Through the Hough transform, straight lines longer than one-quarter of the height of the image are detected, while shorter lines are discarded as noise (a preliminary study showed that the one-quarter threshold covers most of the vertical boundary lines of structural elements in images).
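The voting scheme of Equation (7) can be sketched as follows; the 1° angle bins and integer ρ bins are illustrative quantization choices, not the paper's settings:

```python
import math

# Hough voting: each edge pixel votes for every (rho, theta) line
# through it; cells with large counts mark collinear points.
def hough_accumulate(edge_points, thetas_deg=range(-90, 90)):
    """Return a dict mapping (rho, theta_deg) -> vote count."""
    acc = {}
    for x, y in edge_points:
        for t in thetas_deg:
            theta = math.radians(t)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc

# Five edge points on the vertical line x = 3: the cell (rho=3, theta=0)
# collects one vote from every point.
acc = hough_accumulate([(3, y) for y in range(5)])
print(acc[(3, 0)])  # 5
```

In practice the accumulator is thresholded so that only cells whose counts correspond to lines longer than a quarter of the image height survive.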
Traditional shape-based methods assume that the boundary lines of structural elements in an image are precisely vertical, which leads to failed detection of slanted structural elements resulting from camera tilting or perspective illusion. The proposed method improves on shape-based methods by allowing boundary lines an inclination of up to ±5° from the vertical direction, detecting structural elements more flexibly. After the Hough transform, an image is divided into several quadrilateral sub-regions, each formed by two adjacent detected lines.

Columns and Walls Recognition using GLH Statistical Parameters
The GLH is innovatively used in this study to recognize structural elements among the detected sub-regions of an image. The meanings of the five GLH statistical parameters are as follows: the mean is the average gray value of a GLH; the integrated density is the sum of the gray values of a GLH; the variance indicates the degree of spread from the mean of a GLH; the skewness represents the degree of asymmetry of the distribution of a GLH; and the kurtosis reflects the tailedness of the distribution of a GLH.
By investigating the GLHs of detected sub-regions, the authors found that, compared with the GLHs of adjacent sub-regions that contain various objects, the GLH of a structural element sub-region shows a more concentrated, symmetric distribution with fat tailedness due to the relatively uniform gray levels on the surface of the element (also discussed in Section 4.2.1), while the differences in mean gray values between structural element regions and non-structural element regions are inconspicuous. Therefore, only the variance, skewness, and kurtosis are used to differentiate the GLHs of structural element sub-regions from those of non-structural element sub-regions. These three statistical parameters of a GLH can be calculated by Equations (8)-(13), which take the standard moment form:

H(i) = n_i / (m × n)
μ = Σ_i i · H(i)
variance: σ² = Σ_i (i − μ)² · H(i)
skewness = Σ_i ((i − μ)/σ)³ · H(i)
kurtosis = Σ_i ((i − μ)/σ)⁴ · H(i)

with sums over i = 0, ..., L − 1, where i is the gray level of a pixel ranging from 0 to 255; H(i) is the proportion of pixels at gray level i; n_i is the number of pixels at gray level i in a sub-region; m and n are the numbers of rows and columns of pixels in the sub-region, respectively; μ is the average gray level of the pixels; σ is the standard deviation of the pixels' gray levels; and L is the number of gray levels, equal to 256. Accordingly, the three GLH statistical parameters of every sub-region are calculated. Based on the preliminary study, the authors conclude that (1) the variance and skewness values of a sub-region that is a structural element are smaller than the respective values of adjacent sub-regions that contain a complex surrounding environment; and (2) the kurtosis value of a sub-region that is a structural element is larger than the kurtosis values of adjacent sub-regions that contain a complex surrounding environment.
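The three parameters can be computed directly from a sub-region's gray levels. The sketch below uses the standard moment definitions (a Python illustration under that assumption, not the paper's MATLAB code):

```python
# GLH statistics of a sub-region: histogram proportions H(i), then
# variance, skewness, and kurtosis (standard moment definitions).
def glh_stats(pixels, levels=256):
    """pixels: flat list of gray levels in [0, levels-1]; assumes sigma > 0."""
    n = len(pixels)
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    H = [c / n for c in counts]                       # proportion per gray level
    mu = sum(i * H[i] for i in range(levels))         # mean gray level
    var = sum((i - mu) ** 2 * H[i] for i in range(levels))
    sigma = var ** 0.5
    skew = sum(((i - mu) / sigma) ** 3 * H[i] for i in range(levels))
    kurt = sum(((i - mu) / sigma) ** 4 * H[i] for i in range(levels))
    return var, skew, kurt

# A near-uniform "column-like" patch: tiny variance, near-zero skewness,
# large kurtosis, matching the paper's observations.
col = [100] * 96 + [99, 101] * 2
print(glh_stats(col))
```

For this concentrated distribution the variance is small (0.04), the skewness is essentially zero, and the kurtosis is large (25), exactly the signature the detection criteria look for.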
The first conclusion can be explained by the concentrated distribution of the gray levels of structural element sub-regions within narrow ranges, a result of the relatively uniform color of column and wall surfaces, while the gray levels of sub-regions containing a complex surrounding environment are distributed over wide ranges, leading to large variance values. Meanwhile, the centralization of the gray level distribution also results in low skewness values for structural element sub-regions, because a concentrated distribution is relatively symmetric compared with the random distribution of non-structural element sub-regions' gray levels. Finally, based on Equation (13), for a concentrated distribution, a few pixels with extreme values (i.e., values that deviate considerably from the mean) can lead to an extremely large kurtosis value for a structural element sub-region. Accordingly, this study considers a sub-region a column or a wall if it satisfies the following criteria: (1) the intersection angle of the sub-region's two adjacent boundary lines is less than 5°; (2) the variance and skewness values of the sub-region's GLH are local minima while the kurtosis value is a local maximum among adjacent sub-regions; and (3) the ratio of length to width of the sub-region is no less than two for a column and less than two for a wall. The length of a sub-region is defined by the length of its longer boundary line, and the width is defined as the distance between its two boundary lines [51].
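The criteria above can be combined into a simple decision rule. The sketch below (with hypothetical statistic and ratio values) compares each sub-region only with its two immediate neighbors, as the local-extremum criterion requires:

```python
# Decision rule sketch: a sub-region is a structural element when its
# variance and skewness are local minima and its kurtosis a local
# maximum relative to its two neighbors; length/width then separates
# columns (>= 2) from walls (< 2). Input values are hypothetical.
def classify_subregions(stats, ratios):
    """stats: list of (variance, skewness, kurtosis) per sub-region;
    ratios: length/width ratio per sub-region. Returns a label list."""
    labels = ["other"] * len(stats)
    for i in range(1, len(stats) - 1):
        v, s, k = stats[i]
        v_min = v < stats[i - 1][0] and v < stats[i + 1][0]
        s_min = s < stats[i - 1][1] and s < stats[i + 1][1]
        k_max = k > stats[i - 1][2] and k > stats[i + 1][2]
        if v_min and s_min and k_max:
            labels[i] = "column" if ratios[i] >= 2 else "wall"
    return labels

# The middle sub-region is homogeneous (low variance/skewness, high
# kurtosis) and tall, so it is labeled a column.
stats = [(900.0, 1.2, 2.0), (40.0, 0.1, 25.0), (1100.0, 0.8, 3.1)]
print(classify_subregions(stats, [1.0, 3.5, 1.0]))  # ['other', 'column', 'other']
```

With a squat ratio (e.g., 1.5) the same middle sub-region would be labeled a wall instead.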

Validation
The database of Taiwan's National Center for Research on Earthquake Engineering (NCREE) contains thousands of school building images of different sizes and resolutions taken in Taiwan with various devices. The authors selected images taken indoors and outdoors, from the front view and the side view of buildings, from the NCREE database and processed them in MATLAB R2017a (The MathWorks, Natick, MA, USA) to validate the performance of the proposed method. The detection precision and recall were then used to assess the detection accuracy and detection efficiency of the proposed method, respectively. Part of the column and wall detection results are shown in Tables 1 and 2, respectively. Precision and recall are calculated by Equations (14) and (15):

precision = TP / (TP + FP), (14)
recall = TP / (TP + FN), (15)

where TP (true positive) is the number of target objects (i.e., columns or walls in this study) that are correctly detected; FP (false positive) is the number of detected objects that are not target objects; and FN (false negative) is the number of target objects that are not detected.
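Equations (14) and (15) amount to the standard precision/recall computation; for example, with hypothetical counts:

```python
# Precision: how many detections are real columns/walls.
# Recall: how many real columns/walls were found.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# E.g., 4 of 5 columns in an image detected, with no false alarms:
print(precision_recall(tp=4, fp=0, fn=1))  # (1.0, 0.8)
```

This matches the reported pattern of 100% precision with recall below 100% when some elements are missed.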

Column Detection Results and Discussion
The detection precision for all 20 column images in Table 1 is 100%, while the detection recall ranges from 50% to 100% with an average of 78.4%. The No. 1 column image, in which all five columns were correctly detected, is taken as an example to explain the whole procedure of the proposed method. First, using Equation (1), the original RGB image (Figure 2a) was converted into a grayscale image (Figure 2b), which served as the basis for edge detection. Then, discrete edge points were detected using the Prewitt operator, generating the edge map. As shown in Figure 2c, the edge map is a binary image in which white points are edge points and black points are the background. Next, long vertical lines, potential boundaries of structural elements, were retrieved from the edge map by the Hough transform. In this study, the authors defined a long vertical line as a vertical line longer than a quarter of the height of the image. Moreover, slightly inclined lines could also be retrieved by allowing an inclination of ±5° from the vertical direction. Consequently, 20 vertical boundary lines were detected, dividing the image into 19 sub-regions, as shown in Figure 2d. Accordingly, the variance, skewness, and kurtosis of the GLH of every sub-region were calculated, and sub-regions with locally minimal variance and skewness values and locally maximal kurtosis values were identified as structural elements. Finally, these detected structural elements were regarded as columns because their ratios of length to width were no less than two. As shown in Figure 3a-c, sub-regions 3, 7, 12, 16, and 19 are columns. It can be observed that non-column sub-regions contain various objects at different gray levels (windows at high gray levels and retaining walls at low gray levels), while column sub-regions contain only columns at similar gray levels.
To further illustrate the differences in variance, skewness, and kurtosis between a column sub-region and its two adjacent non-column sub-regions, the GLHs of column sub-region 7 and the two adjacent non-column sub-regions 6 and 8 in Figure 2d are shown in Figure 3d-f, respectively. Three peaks exist in Figure 3d,f but only one peak in Figure 3e, which means that in non-column sub-regions 6 and 8 the gray levels of pixels are dispersed over wide ranges, while in column sub-region 7 the gray levels of pixels are concentrated in a narrow range. Therefore, compared with the GLHs of adjacent non-column sub-regions, the GLH of a column sub-region shows a more concentrated and symmetric distribution, with fat tailedness. The column detection results prove that columns with tiled or painted surfaces can be detected, overcoming the shortcoming of color/texture-based methods. For example, in Figure 4a, both column 1 and column 2 were successfully detected although they are covered with colorful tiles: the top and bottom parts are tiled with dark tiles while the middle parts are tiled with light tiles. Moreover, columns with a small tilt, which commonly occurs due to camera tilting or perspective illusion, can be detected by this method. As shown in Figure 4b, three tilted columns (columns 2, 3, and 4) in the image, with tilt angles between their boundary lines and the vertical direction ranging from 1° to 5°, were detected as a result of the maximum allowed tilt angle of 5° set in this study. On the other hand, taking images from the side view of structural elements can aggravate the tilting of distant objects in images, leading to the failed detection of distant columns: the most distant column (column 1) in Figure 4b failed to be detected because the aggravated inclination of its boundary lines exceeds 5°. To avoid this deficiency, images should be taken from the front view of structural elements.
Figure 3. GLH statistical parameters of the No. 1 image in Table 1 (sub-regions 3, 7, 12, 16, and 19 are columns): (a) variance of gray-level histograms (GLHs); (b) skewness of GLHs; (c) kurtosis of GLHs; (d) the GLH of sub-region 6 (non-structural element); (e) the GLH of sub-region 7 (a column); (f) the GLH of sub-region 8 (non-structural element).

Wall Detection Results and Discussion
As shown in Table 2, similar to the column detection results, the detection precision for all 20 wall images is 100% and the detection recall ranges from 50% to 100% with an average of 73.2%, proving the outstanding capability of the proposed method in correctly detecting columns and walls. Looking into these images, the authors found that the proposed method was able to detect structural elements in complex scenarios. Take the No. 1 image in Table 2 as an example. Compared with the similar GLHs among the non-column sub-regions in Figure 3, the GLHs of the non-wall sub-regions in Figure 5a differ markedly from each other, leading to large differences in the variance, skewness, and kurtosis values among these sub-regions. The variance value of sub-region 2 is much smaller than that of sub-region 6 (Figure 5b,c), while the skewness value of sub-region 2 is much bigger than that of sub-region 6 (Figure 5d). However, most sub-regions with complex scenarios in wall images have little influence on the detection results, because only the sub-regions adjacent to structural element sub-regions determine the detection of those elements. As shown in Figure 5, whether the wall in sub-region 4 could be successfully detected is affected only by the adjacent sub-regions 3 and 5, in spite of the other sub-regions with diverse GLHs. Moreover, the proposed method is able to detect slightly distorted structural elements by allowing an intersection angle (up to 5°) between an object's two vertical boundary lines, in contrast to existing structural element detection methods, which assume that the vertical boundaries of a structural element are a pair of parallel lines [28]. For instance, the wall (sub-region 4) in Figure 5a was successfully detected even though its two boundary lines are not parallel, with an intersection angle of 2° caused by perspective illusion.
On the other hand, perspective illusion can lead to the failed detection of distant structural elements whose boundary lines are too blurry to be detected. For instance, wall 1 in Figure 6a and wall 1 in Figure 6b were not successfully detected because the edge detection method was unable to retrieve these walls' left boundary lines, which are invisible due to the large distance between the walls and the camera (perspective illusion makes distant objects small and blurry). Therefore, images taken from the side view of the structural elements to be detected result in low detection recall.

Figure 6. Walls that failed to be detected: (a) an image in Table 2; (b) the No. 7 image in Table 2.
Another advantage of the presented method is that different illumination intensities in images do not hinder structural elements from being successfully detected. A change in illumination intensity only translates a GLH: the peaks of the GLH move toward the high gray levels under strong illumination and toward the low gray levels under weak illumination, without changing its concentration, symmetry, or tailedness. As shown in Figure 7, the gray levels of pixels in the light sub-region 2 mainly range from 200 to 250, while the gray levels of pixels in the dark sub-region 6 are mainly around 150; however, the shapes of these two GLHs show few differences. The insensitivity of this method to illumination intensity allows images to be collected throughout the daytime, overcoming the limitation of traditional image recognition methods that require images to be collected during the same time period of a day (Han et al. 2017). On the other hand, non-uniform illumination on the surface of an element can result in detection failure due to the asymmetric distribution of the element's GLH. For instance, wall 4 in Figure 7a, with non-uniform illumination on its surface, failed to be detected because of the local maximum of the skewness in sub-region 4 (Figure 7c). Thus, non-uniform illumination on the surfaces of structural elements should be avoided when taking photos.
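The illumination argument can be checked numerically: modeling a uniform illumination change as a constant offset added to every gray level (an assumption of this sketch) translates the histogram while leaving variance, skewness, and kurtosis unchanged:

```python
# Shift invariance of the GLH shape statistics: adding a constant to
# all gray levels moves the histogram but not its central moments.
def moments(pixels):
    n = len(pixels)
    mu = sum(pixels) / n
    var = sum((p - mu) ** 2 for p in pixels) / n
    sigma = var ** 0.5
    skew = sum(((p - mu) / sigma) ** 3 for p in pixels) / n
    kurt = sum(((p - mu) / sigma) ** 4 for p in pixels) / n
    return var, skew, kurt

dark = [140, 150, 150, 150, 160, 145, 155, 150]
bright = [p + 70 for p in dark]          # same surface, stronger light
print(moments(dark) == moments(bright))  # True
```

Non-uniform illumination, by contrast, changes pixel values by different amounts across the surface, which is why it distorts the skewness and can break detection.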

Figure 7. The GLHs of sub-regions in an image from Table 2 (sub-regions 2, 4, and 6 contain walls, while only the walls in sub-regions 2 and 6 are detected): (a) the preprocessed image; (b) variance of GLHs; (c) skewness of GLHs; (d) kurtosis of GLHs; (e) the GLH in sub-region 2 (a wall); (f) the GLH in sub-region 6 (a wall).

Conclusions
Realizing the benefits of collecting building information from visual data, researchers have begun to focus on the automatic extraction of structural characteristics from images to quickly assess the seismic vulnerability of buildings at a regional scale. However, existing structural element detection methods have certain limitations when detecting the columns and walls of school buildings. To improve existing automatic structural element detection techniques, this paper combines the GLH statistical feature-based object-detection method with an improved shape-based method: the former can identify structural elements in complex scenes, and the latter has proved superior in detecting slanted objects in an image. Focusing on columns and walls, the proposed method was validated by testing the detection precision and recall on column and wall images selected from the NCREE database. The main advantages of the proposed column and wall detection method are as follows. First, it can detect columns and walls with various surface treatments or slightly tilted shapes, which frequently occur in images. Second, it accepts images of different sizes and resolutions taken by either cameras or mobile phones, broadening the sources and decreasing the cost of image collection. Last, it is insensitive to illumination intensity, so images taken in different periods of a day can be used. Because the three GLH statistical parameters used in this method depend only on the shapes of the GLHs, their values are unaffected by uniformly high or low illumination intensities. On the other hand, the main reason for unsuccessful detection of columns and walls is that images were not taken from the front view of the structural elements.
Camera tilting and perspective illusion can aggravate the inclination of distant elements, leading to failures in detecting their boundary lines. Therefore, taking images from the front view of the target structural elements is recommended.
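The boundary-line step summarized above (a Prewitt gradient followed by a Hough-style search for vertical lines) can be sketched as follows. Here the Hough transform is simplified to a column-wise vote restricted to perfectly vertical lines, and both thresholds are illustrative assumptions, not values from the paper.

```python
import numpy as np

def vertical_boundary_columns(gray, edge_thresh=60, vote_frac=0.5):
    """Locate candidate vertical boundary lines in a grayscale image.
    The Prewitt horizontal-gradient kernel [[-1, 0, 1]] * 3 responds
    to vertical edges; a column-wise vote then stands in for a Hough
    transform restricted to theta = 0 (vertical lines)."""
    g = gray.astype(float)
    # Prewitt horizontal gradient: difference of columns two apart,
    # summed over three adjacent rows (equivalent to the 3x3 kernel).
    gx = g[:, 2:] - g[:, :-2]
    gx = gx[:-2] + gx[1:-1] + gx[2:]
    edges = np.abs(gx) > edge_thresh
    # A column in which at least vote_frac of the rows are edge
    # pixels is treated as a vertical boundary line.
    votes = edges.sum(axis=0)
    cols = np.where(votes >= vote_frac * edges.shape[0])[0] + 1
    return cols.tolist()

# A synthetic scene: a bright "column" (gray level 200) on a darker
# background (gray level 50); both sides of each edge respond.
scene = np.full((50, 60), 50, dtype=np.uint8)
scene[:, 20:40] = 200
boundaries = vertical_boundary_columns(scene)
# boundaries -> [19, 20, 39, 40]
```

The detected columns would then delimit the sub-regions whose GLH features are evaluated. Note how this degenerate vote also explains the failure mode above: a tilted or blurred boundary spreads its edge pixels across several columns, so no single column accumulates enough votes.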
Although the proposed method was applied here only to detecting and counting columns and walls, it can be extended to detect other structural characteristics, such as the sizes and surface defects of structural elements, or structural deficiencies, such as short columns and soft stories, enabling broader applications that automatically extract structural information from digital images. Furthermore, as glass is widely used for facades in modern buildings and the failure of such glass elements can cause serious casualties in earthquakes, the automatic detection of glass building elements is a critical and challenging issue for future work [52]. The proposed GLH statistical feature-based object-detection method has potential for this task because the gray levels of a glass element's surface are homogeneous compared with those of the surrounding environment, which contains various objects, whereas color/texture-based methods may fail to detect glass with various colors and surface treatments.
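The homogeneity argument for glass can be made concrete with a minimal sketch: a glass panel's concentrated GLH yields a low gray-level variance, while a cluttered background containing several distinct objects yields a multi-modal GLH with high variance. The criterion and the threshold below are hypothetical illustrations, not results from the paper.

```python
import numpy as np

def is_glass_candidate(region, var_thresh=100.0):
    """Hypothetical homogeneity test: flag a sub-region as a candidate
    glass element if its gray-level variance falls below a threshold,
    reflecting the narrow, concentrated GLH of a glass surface."""
    return float(np.var(region.astype(float))) < var_thresh

rng = np.random.default_rng(1)
# A nearly uniform "glass" region versus a cluttered background
# region mixing three distinct gray levels (multi-modal GLH).
glass = rng.normal(180, 3, size=(80, 80)).clip(0, 255)
clutter = rng.choice([40, 120, 210], size=(80, 80)) + rng.normal(0, 5, size=(80, 80))
# is_glass_candidate(glass) -> True; is_glass_candidate(clutter) -> False
```

In practice such a variance test would be combined with the skewness and kurtosis criteria and with boundary detection, since other homogeneous surfaces (e.g., plain plaster walls) would also pass a variance-only check.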