Historical Single Image-Based Modeling: The Case of Gobierna Tower, Zamora (Spain)

Historical perspective images have proven very useful for the dimensional analysis of building façades and even for generating a pseudo-3D reconstruction of the whole structure based on rectified images. In this paper, the case of Gobierna Tower (Zamora, Spain) is analyzed from a historical single image-based modeling approach. In particular, a bottom-up approach, which takes advantage of the perspective of the image, the existence of the three vanishing points, and the usual geometric constraints (i.e., planarity, orthogonality, and parallelism), is applied to the dimensional analysis of a destroyed historical building. The results were compared with ground truth measurements from a historical topographical survey, yielding deviations of about 1%.


Introduction
Despite the proliferation of three-dimensional scanning systems, which have revolutionized data acquisition, photogrammetry and computer vision techniques have evolved to offer powerful and user-friendly tools that can render virtual worlds that meet the most demanding expectations: high geometric accuracy on the points, high radiometric quality on the surfaces of the object, and an geometry and vanishing points; Section 4 describes the method applied to the "Gobierna Tower", with special emphasis on the vanishing point computation method; a final section presents the concluding remarks.

Historical Background
One of the most notable cases of lost buildings that belonged to the architectural heritage of the city of Zamora (Spain) [16] is the demolition of the towers that characterized the Stone Bridge over the river Duero.
Over its long life, the Stone Bridge has undergone continuous transformations, which were necessary to reduce the devastating effects of the recurrent floods that knocked down the city slums.
At the end of the 19th century, its state was so worrisome that it was closed to traffic and replaced upstream by another viaduct. Once the local authorities had dealt with the construction of a new metallic bridge, they also decided to restore the battered Stone Bridge. Between 1905 and 1908, Luis de Justo, Civil Engineer, designed and executed eleven projects that, in addition to repairing it, radically modified the appearance of the medieval bridge of Zamora. Until then, the Stone Bridge of Zamora had displayed a configuration similar to the original one. Several documents show this, such as the View of Zamora by Anton van Wyngaerde (1570) (Figure 1), the plan by Blas de Vega (1820) [17], the photograph by J. Laurent (c. 1870) (Figure 2), and, finally, the 1905 previous-state plans by Luis de Justo. Sixteen pointed arches (one more than today) laid out a 280-meter-long bridge, which rested on fifteen foundations, finished off by cutwaters and lightened by spillways. Two towers controlled the bridge crossing. The deck, straight in plan and sloping down at both ends, angled sharply towards the east beyond the exit tower to hinder enemy assault. The deck was finished off by 0.40 m wide parapets. At that time, it was crowned by more than 300 battlements.
Over the cutwater of the tympanum preceding the bend rose the tower of La Gobierna, popularly known for the weather vane that topped it. Moreover, as seen in Wyngaerde's View and in Blas de Vega's elevation, an initial door prevented parking on the deck during the night. At the other end, over the northern pier, rose an arch that opened the bridge to the city. Nothing remains today. After Luis de Justo's rehabilitation, the slopes were modified and the spillways were extended, and even new ones were added. The works undertaken by the Department of Public Works between 1905 and 1907 [18] consisted of the demolition of towers, parapets, and tympanums to facilitate the later integral repair of the bridge. Thus, vaults were reviewed, tympanums were reconstructed, and new asphalt paved the existing surface. In the northern part, the arch was replaced with a roundabout, and, in the southern part, in the angled stretch, the arch preceding the "Gobierna Tower" was totally recovered. Fortunately, before disappearing, some of these structures were documented by building surveys, and others by photography. This is the case of the Stone Bridge, which appears with its towers and doors in several documents.

Theoretical Basis: Single View Photogrammetry
The vanishing points (VPx, VPy, VPz; see Figure 3) of the perspective image of a box-like shape (such as a building) contain information about both the camera and the image pose. Thus, determining these vanishing points leads, firstly, to the interior and exterior orientation of the image and, secondly, to the computation of the metric properties of any element on the faces of such a solid.
As can be seen from Figure 3, the lines drawn from the point of view (S) along the three directions (XYZ) of the box intersect the image plane at the three vanishing points, VPx, VPy, and VPz, forming a so-called perspective pyramid. The location of these points depends on: the focal length SP, where P is the principal point (the intersection of the image axis with the image plane); the horizontal angle QSVPx (the horizontal angle between the camera axis and the main façade); and the vertical angle PSQ (the angle between the camera axis and the vertical direction). As will be seen later, the angle that describes the rotation of the camera around its own axis (swing angle) can be derived from the angle between the horizon line and the edge of the image.
It can be seen that the principal point, P, is the orthocenter of the triangle VPx, VPy, VPz, and that the position of the point of view can be derived easily once the previous parameters have been computed and a constraint from the object is provided (e.g., a known distance). In any case, the first step is always to determine the position of the three vanishing points of a given building, and this relies heavily both on the robustness of the pose configuration and on the ability to extract straight lines from the image that intersect at each of the vanishing points. The quality of the process, as will be discussed later, thus depends on the definition of the image lines and on the angle that each bundle of vanishing lines spans. It follows that when the perspective angles are poor, the vanishing points lie far from the center of the image, which decreases the reliability of the process.
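The geometric relations of this section can be checked numerically. The following Python sketch is illustrative only and is not the authors' implementation: it computes the principal point as the orthocenter of the vanishing-point triangle and, assuming square pixels and no lens distortion, recovers the focal length from the standard orthogonality relation f² = -(VPi - P)·(VPj - P), which holds for any two of the three vanishing points. The function names and the sign convention of the horizon angle are assumptions.

```python
import math


def orthocenter(a, b, c):
    """Orthocenter of the triangle a, b, c (2D points given as (x, y))."""
    # Altitude from a is perpendicular to side bc: (h - a) . (c - b) = 0
    # Altitude from b is perpendicular to side ac: (h - b) . (c - a) = 0
    a11, a12 = c[0] - b[0], c[1] - b[1]
    a21, a22 = c[0] - a[0], c[1] - a[1]
    r1 = a[0] * a11 + a[1] * a12
    r2 = b[0] * a21 + b[1] * a22
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("degenerate triangle: vanishing points are collinear")
    # Cramer's rule for the 2x2 linear system
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det


def focal_length(p, v1, v2):
    """Focal length (same units as the image coordinates) from the principal
    point p and two vanishing points of orthogonal directions.
    Assumes square pixels and no lens distortion."""
    dot = (v1[0] - p[0]) * (v2[0] - p[0]) + (v1[1] - p[1]) * (v2[1] - p[1])
    if dot >= 0:
        raise ValueError("vanishing points are not consistent with orthogonal directions")
    return math.sqrt(-dot)


def horizon_angle_deg(vp_x, vp_y):
    """Angle of the horizon line (through the two horizontal vanishing points)
    with the image x axis, folded into (-90, 90]. The sign is an assumed convention."""
    angle = math.degrees(math.atan2(vp_y[1] - vp_x[1], vp_y[0] - vp_x[0]))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle
```

As a synthetic check (values chosen for illustration, not taken from the paper), a camera with P = (320, 240) and f = 1000 that sees three orthogonal directions has vanishing points (820, 1240), (-680, -260), and (2320, -1760); `orthocenter` applied to them returns (320, 240), and `focal_length` returns 1000 for any pair.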

A Case Study: The Historical and Demolished "Gobierna Tower"
The "Gobierna Tower", together with its bridge, was documented through several drawings, historical photographs, and even a topographical survey performed by the engineer Luis de Justo in 1905. In particular, the most relevant documents are those of Wyngaerde [19], who made some perspective drawings of the bridge with its tower, and the historical photographs of Laurent from 1870. In our case, different documents and historical perspective images have been analyzed in order to test the historical single image-based modeling approach (Figure 4). In addition, the topographical survey, which contains a dimensional analysis of the tower, has been taken as "ground truth" to assess the accuracy of the process.
Figure 4 outlines the workflow developed.

Data Processing
There are two ways of retrieving the metric information of the object from a single image: automatic and manual. The first is always preferable when the image exhibits high quality (high resolution and well-defined vanishing lines), when the ratio between correct observations (automatically extracted line segments) and mistaken observations (blunders derived from shadows, scars on the image, reflections, etc.) is above approximately 4, and when the perspective geometry is strong enough. The second is the alternative for weak cases in which these drawbacks are combined: a low number of vanishing lines, a poor-quality image, a high number of blunders, and poor perspective geometry.
The automatic approach is structured in three steps: (a) Extracting edge pixels by means of the Canny filter [20]. (b) Clustering pixels into raster segments according to neighboring criteria and with length restrictions, in a fashion very similar to the Burns method [21]. (c) Determining vector lines (first and last points) from raster segments according to a plane collinearity condition.
The output of these processes is the input to the next one: the determination of the vanishing points. Several methods for this have been implemented [15]: minimization of the area of the triangle, modified Gaussian sphere, Thales' theorem, modified Hough Transform, etc. In all of them, a central role is played by the RANSAC (RANdom SAmple Consensus) robust estimator [22] and its ability to detect and remove blunders.
A whole set of possibilities has been applied to automatically process the target image, but none of them has been successful, for the reasons outlined above. Thus, the manual procedure was finally applied and, even though the set of lines is very small, an acceptable result has been reached. This has been possible through the application of the modified Hough Transform method, which is briefly described in the following lines. As is well known, the Hough Transform [23] deals with the relation between the space representation of some geometric feature (points, straight lines, circles, etc.) and the representation of its geometric parameters under the same Cartesian principles. In what follows, we focus on the problem at hand: determining lines from pixels extracted from the image and determining vanishing points from these lines. The strength of the Hough Transform relies on the complete symmetry between points and lines. According to the conventional expression of a 2D straight line, y = ax + b, the parameters that describe a point (x,y) are interchangeable with the parameters that describe a line (a,b). Thus, a straight line in the image space is transformed to a point in the parameter space and, vice versa, a point in the image space is transformed to a straight line in the parameter space (Figure 5).
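The point-line symmetry described above can be made concrete with a few lines of Python. This is a minimal sketch, not the modified Hough method of the paper: an image point (x, y) on the line y = ax + b satisfies b = -x·a + y, so it maps to a straight line in the (a, b) parameter space, and the dual lines of collinear image points all meet at the parameters of their common line. The helper names `dual_line` and `intersect` are hypothetical.

```python
def dual_line(point):
    """Map an image point (x, y) to its parameter-space line b = m*a + c,
    returned as (m, c) = (-x, y)."""
    x, y = point
    return (-x, y)


def intersect(line1, line2):
    """Intersection of two lines given as (slope, intercept) pairs."""
    m1, c1 = line1
    m2, c2 = line2
    if m1 == m2:
        raise ValueError("parallel lines do not intersect")
    u = (c2 - c1) / (m1 - m2)
    return u, m1 * u + c1


# Three image points on the line y = 2x + 1
points = [(0, 1), (1, 3), (4, 9)]
duals = [dual_line(p) for p in points]

# Every pair of dual lines meets at (a, b) = (2, 1), the parameters of the image line
print(intersect(duals[0], duals[1]))
print(intersect(duals[1], duals[2]))
```

The same duality applied in the opposite direction underlies the vanishing point computation: image lines that pass through a common point (x_0, y_0) have parameters (a_i, b_i) that are collinear in the parameter space.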
This leads (at least) to the series of consequences listed in Table 1. The Hough procedure works by quantizing the image space, then extracting all the information for every discrete cell, translating this information to the equivalent parameter space, and, finally, proceeding to a voting scrutiny to find the relevant feature that meets the target criteria. (The drawback related to the singularity of the parameter a when lines are close to vertical is overcome by switching from the Cartesian (a,b) representation of the line to its polar (r,α) expression.)
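The voting step and the polar representation can be sketched as follows. This is a generic, textbook-style accumulator and not the authors' modified method; the bin sizes (`n_alpha`, `r_step`) and the function name are assumptions. Each point votes for every cell (α, r) with r = x·cos α + y·sin α, so a vertical line, for which the slope a is undefined, is handled without any singularity.

```python
import math
from collections import Counter


def hough_peak(points, n_alpha=180, r_step=1.0):
    """Return (alpha in degrees, r, votes) of the most voted cell in a
    polar Hough accumulator, with r = x*cos(alpha) + y*sin(alpha)."""
    votes = Counter()
    for x, y in points:
        for k in range(n_alpha):
            alpha = math.pi * k / n_alpha      # alpha sampled in [0, pi)
            r = x * math.cos(alpha) + y * math.sin(alpha)
            votes[(k, round(r / r_step))] += 1  # quantize r into cells
    (k, r_bin), count = votes.most_common(1)[0]
    return math.degrees(math.pi * k / n_alpha), r_bin * r_step, count


# Four points on the vertical line x = 5 (slope undefined in the (a, b) form)
print(hough_peak([(5, 0), (5, 20), (5, 40), (5, 60)]))
```

For these four points the most voted cell is α = 0°, r = 5, with four votes, which is the polar description of the line x = 5.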
To determine a vanishing point, the procedure is as follows: (a) For the start and end point of every line segment produced by the automatic or manual extraction, the corresponding line in the parameter space is computed and represented. Every cell that lies on the line receives one vote. (b) A voting procedure is undertaken so that the most visited cells give the lines that form families of lines passing through each of the vanishing points. (c) For all these lines, the corresponding parameters $(a,b)_i$ are computed. (d) The best coordinates of each vanishing point are computed by applying a least squares criterion to the equation $y_0 = a_i x_0 + b_i$, in which $(x_0, y_0)$ are the coordinates of a vanishing point. (e) In order to avoid residual outliers, a weighting procedure is applied to the above task, so that a robust M-estimator, the modified Danish estimator [24], can be implemented; thus, blunders may be expelled from the computation and the reliability improved.
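Steps (d) and (e) can be sketched in Python as an iteratively reweighted least squares adjustment of $y_0 = a_i x_0 + b_i$. The exact form of the modified Danish estimator of [24] is not given in the text, so the weight function below (weight 1 inside a tolerance, exponential decay outside it), the tolerance factor `c`, and the robust scale based on the median absolute residual are all assumptions made for illustration.

```python
import math


def solve_vp(lines, weights):
    """Weighted least squares solution (x0, y0) of y0 - a_i*x0 = b_i."""
    n11 = n12 = n22 = t1 = t2 = 0.0
    for (a, b), w in zip(lines, weights):
        # Design row is [-a, 1], observation is b
        n11 += w * a * a
        n12 += w * (-a)
        n22 += w
        t1 += w * (-a) * b
        t2 += w * b
    det = n11 * n22 - n12 * n12
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel: vanishing point undefined")
    return (t1 * n22 - n12 * t2) / det, (n11 * t2 - n12 * t1) / det


def robust_vp(lines, c=2.0, iterations=10):
    """Vanishing point from lines (a_i, b_i) with Danish-style reweighting
    (assumed form: w = 1 if |v| <= c*sigma, else exp(-(|v| / (c*sigma))**2))."""
    weights = [1.0] * len(lines)
    for _ in range(iterations):
        x0, y0 = solve_vp(lines, weights)
        residuals = [y0 - a * x0 - b for a, b in lines]
        ordered = sorted(abs(v) for v in residuals)
        # Robust scale from the median absolute residual (assumed choice)
        sigma = max(1.4826 * ordered[len(ordered) // 2], 1e-9)
        weights = [
            1.0 if abs(v) <= c * sigma else math.exp(-(abs(v) / (c * sigma)) ** 2)
            for v in residuals
        ]
    return solve_vp(lines, weights)
```

With six synthetic lines through (100, 50) plus one blunder whose intercept is shifted by 40 units, the reweighting drives the weight of the blunder towards zero and the estimate returns to (100, 50), whereas the unweighted solution is displaced by several units.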
Once the coordinates of the three vanishing points are computed, the interior and exterior orientation parameters are derived from the perspective pyramid built from these points plus the point of view (Figure 6c): (a) The orthocenter of the triangle formed by the three vanishing points (VPx, VPy, VPz) yields the principal point (P) (Figure 6c). (b) The rotation angles (θ,ν) and the focal length (f) can be derived from the following relations (Figure 6).
On the horizontal triangle (Figure 6a) formed by the point of view, S, and the horizontal vanishing points, VPx and VPy, the relations of Equation (1) hold. Finally, the swing angle (Figure 6c) can be computed from Equation (3). (c) Once these parameters are known, the coordinates of the point of view, S, can be easily derived by applying a certain constraint to the object (in addition to the point of view itself) and then solving the collinearity equations (Figure 7). An example is measuring a horizontal distance on the object and setting the origin of the Datum at one of its end points. We can thus write six equations for five unknowns: $(X,Y,Z)_S$ and the two scale factors of the collinearity conditions [25] (Equation (4)).
where $x_a$, $y_a$, $x_b$, $y_b$ are the image coordinates of the ground points A and B, respectively, which define the Datum and the known horizontal distance $D_{AB}$; $x_p$, $y_p$ are the coordinates of the principal point, P; R is the rotation matrix; and $X_S$, $Y_S$, $Z_S$ and $\lambda_{aA}$, $\lambda_{bB}$ are the unknowns corresponding to the point of view, S, and the two scale factors, respectively. Finally, once the interior and exterior orientations are solved, the dimensional analysis and the pseudo-3D modeling can be carried out (Figure 8). By pseudo-3D modeling we mean that the object façades can be rectified, that is, digitally transformed to eliminate any perspective effect. Furthermore, if two façades are related by an orthogonality condition, as is usually the case, a double rectification step can be applied; that is, from the two 2D documents linked to each other by an orthogonality relation, a 3D document can be obtained. If more photographs containing another pair of (orthogonal) façades are available, this process can be extended to complete the four (orthogonal and parallel) faces of the object. Both the dimensional analysis and the pseudo-3D modeling procedures are based on the collinearity equations, and both require the definition of a geometric constraint (i.e., a working plane) on the object. For example, we can work with the XZ plane, for which the constraint Y = 0 (planarity, verticality, and parallelism with the XZ plane) is applied. Likewise, we can work with the YZ plane, for which the constraint X = 0 (planarity, verticality, and parallelism with the YZ plane) is applied. Obviously, the two constraints together imply an orthogonality relation between the planes. If more photographs depicting other façades were available, similar constraints could be applied to complete the whole building. Note that the scale factor could be propagated from the first image to the others, although it would be highly advisable to measure more distances on the façades.
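The computation of the point of view from a Datum of two points can be sketched as follows. The sketch assumes one common form of the collinearity condition, X = S + λ·R·(x - x_p, y - y_p, -f), which may differ in sign or axis conventions from the one used in Equation (4); the function names are hypothetical. Subtracting the two conditions for A(0,0,0) and B(D_AB,0,0) eliminates S and leaves three equations for the two scale factors, which are solved by least squares; S then follows from either point.

```python
def ray(R, f, pp, xy):
    """Direction of the image ray in the object frame: R * (x - xp, y - yp, -f).
    Assumed convention; R is a 3x3 rotation matrix as nested lists."""
    u = (xy[0] - pp[0], xy[1] - pp[1], -f)
    return tuple(sum(R[i][j] * u[j] for j in range(3)) for i in range(3))


def camera_position(R, f, pp, img_a, img_b, dist_ab):
    """Point of view S and scale factors from two imaged points a, b whose
    object coordinates define the Datum: A = (0, 0, 0), B = (dist_ab, 0, 0)."""
    da, db = ray(R, f, pp, img_a), ray(R, f, pp, img_b)
    rhs = (dist_ab, 0.0, 0.0)
    # B - A = lam_b*db - lam_a*da: three equations, unknowns (lam_a, lam_b)
    n11 = sum(x * x for x in da)
    n12 = -sum(x * y for x, y in zip(da, db))
    n22 = sum(y * y for y in db)
    t1 = -sum(x * r for x, r in zip(da, rhs))
    t2 = sum(y * r for y, r in zip(db, rhs))
    det = n11 * n22 - n12 * n12
    if abs(det) < 1e-12:
        raise ValueError("the two image rays are parallel")
    lam_a = (t1 * n22 - n12 * t2) / det
    lam_b = (n11 * t2 - n12 * t1) / det
    # S from each point; with noisy data the two differ, so they are averaged
    s_a = tuple(-lam_a * x for x in da)
    s_b = tuple(r - lam_b * y for r, y in zip(rhs, db))
    return tuple((p + q) / 2 for p, q in zip(s_a, s_b)), (lam_a, lam_b)
```

With error-free image coordinates, the two estimates of S coincide; with real measurements, their difference gives a rough indication of how consistent the orientation and the measured distance are.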
For any object point T (with $Y_T = 0$), which is imaged on the photograph as t, we have Equation (5). Dividing the first and third equations by the second one and rearranging, we get Equation (6), where all the terms on the right side of the equation are known. This can be applied to discrete points or, in a scanning fashion, to all the pixels that lie on the face related to the XZ plane, thereby obtaining the pseudo-3D model of the façade.
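The rectification step amounts to intersecting each image ray with the working plane. The sketch below is a hedged illustration under the assumed collinearity convention X = S + λ·R·(x - x_p, y - y_p, -f), which may differ in signs from Equations (5) and (6): the constraint Y_T = 0 fixes the scale factor, and the remaining two equations give X_T and Z_T. The function name is hypothetical.

```python
def point_on_facade(R, f, pp, S, xy):
    """Object coordinates (X, Z) of the image point xy on the plane Y = 0.
    R: 3x3 rotation matrix (nested lists), f: focal length,
    pp: principal point, S: point of view (X, Y, Z)."""
    u = (xy[0] - pp[0], xy[1] - pp[1], -f)
    d = tuple(sum(R[i][j] * u[j] for j in range(3)) for i in range(3))
    if abs(d[1]) < 1e-12:
        raise ValueError("image ray is parallel to the plane Y = 0")
    lam = -S[1] / d[1]          # scale factor from S_Y + lam * d_Y = 0
    return S[0] + lam * d[0], S[2] + lam * d[2]
```

Applying the function to two image points and taking the Euclidean distance between the results gives a measured distance on the façade; applying it to every pixel of the façade yields the rectified image.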

Results
After analyzing more than ten images, the only historical photograph that worked properly measures 7.7 × 12.18 cm and was scanned at a resolution of 150 dpi, providing an image of 455 × 719 pixels (Figure 9). The key to success lies in the distance to the object (very close, around 70 m), as well as in the well-defined perspective of the photograph along the three main orthogonal directions. Furthermore, the use of the selected photograph rests on the following hypotheses, as far as the building's geometry is concerned: (i) façades are planar geometric structures; (ii) lens distortion is not considered; (iii) façade edges are straight and constitute the input data of the described method; and (iv) constraints (parallelism, perpendicularity, and coplanarity) exist among the building's edges and façades.
According to the proposed approach, the photograph is manually vectorized with lines clustered along the three main object directions (X,Y,Z). As a result, the three main vanishing points (VPx, VPy, VPz) are computed based on the robust Hough approach described above. The coordinates of these points, together with their RMSE (Root Mean Square Error) in pixel units, are given in Table 2. Note the subpixel precision obtained.
Once the main structural components of the process have been computed, the internal geometric parameters of the camera, i.e., focal length and principal point, are estimated from the perspective pyramid construction (Figure 10), taking the three vanishing points as vertices. The principal point of the image (P) is the orthocenter of this triangle, whereas the height of the pyramid corresponds to the focal length (f). Once the internal geometry of the camera is solved, the camera pose (i.e., orientation and position) is computed. The orientation of the camera (θ,ν,χ) is computed from the geometric relations developed in Equations (1)-(3), which establish a relationship between the orthogonal directions (X,Y,Z) and the corresponding vanishing points (VPx, VPy, VPz), assuming that the intrinsic geometric parameters of the camera are known. For the spatial position $(X,Y,Z)_S$, the user must introduce some known measurement of the building together with some geometric constraint in order to overcome the indetermination problem, i.e., it is not possible to compute the camera position indirectly from a single image alone. In our case, a known
Finally, a pseudo-3D model was generated from the rectified façades computed geometrically from the vanishing points, that is, using the collinearity equations supported by a geometric constraint (i.e., a working plane) on the object (Equations (5) and (6)). The result is shown in Figure 11b. Furthermore, in order to visualize this result integrated with the current state of the bridge, a virtual recreation of the Stone Bridge with the "Gobierna Tower" has been generated (Figure 11c).
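The scan dimensions quoted at the start of this section follow directly from the print size and the scanning resolution. The short Python check below only reproduces that arithmetic (1 inch = 2.54 cm); the function name is hypothetical.

```python
def scan_size_px(width_cm, height_cm, dpi):
    """Pixel dimensions of a print scanned at the given resolution (dots per inch)."""
    cm_per_inch = 2.54
    return round(width_cm / cm_per_inch * dpi), round(height_cm / cm_per_inch * dpi)


# 7.7 x 12.18 cm print scanned at 150 dpi
print(scan_size_px(7.7, 12.18, 150))  # (455, 719)
```

At 150 dpi, one pixel covers about 0.17 mm of the print (25.4 mm / 150), which bounds how finely the façade edges can be located on this photograph.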

Conclusions
When the lack of information is due to the non-existence of the object of interest, as with demolished historical buildings, classical but solid perspective geometry can be of great utility, instead of advanced image processing techniques, especially in those cases in which only individual images exist. The main goal of this study was to provide a dimensional analysis and even a pseudo-3D reconstruction of the demolished historical building "Gobierna Tower" using single historical photographs. To this end, a single image-based modeling method has been developed and adapted to this specific case. The accuracy assessment confirms that from a single view we can measure distances and areas and even provide a simple 3D model of sufficient quality. The results obtained could be useful for the authorities of Zamora's Council, as they have been considering reconstructing the "Gobierna Tower". The monument would play an important touristic role but, above all, would meet a popular demand supported by social and cultural reasons directly connected with the identity of Zamora's society.
With regard to the workflow developed and the results obtained, the main conclusions are the following: (a) Manual processing achieves better results than automatic processing. This is due to the weaknesses of the image: a low number of vanishing lines, poor image quality, a high number of blunders, and poor perspective geometry. (b) Although robust estimators (especially RANSAC) have largely proven their efficiency in filtering gross errors, that is not so in this case. As just stated, when the image is poor in both geometry and radiometry, the automatic approach leads to an excessive number of blunders, so the manual identification of vanishing lines is better. (c) An original vanishing point method based on the Hough Transform, which guarantees efficiency and quality in the results even in unfavorable cases (a three-point perspective close to a two-point perspective), has been successfully applied. Other methods to compute the vanishing points, such as triangle area minimization or the Gaussian sphere, have not provided good results. (d) A relative error of 1% has been obtained in the accuracy assessment of the results. This value can be considered very good, since the single image-based modeling approach involves many steps and thus the corresponding error propagation. (e) Finally, it should be remarked that the method is only applicable to scenes with strong geometric content (i.e., the presence of structural planes and lines). In addition, the image must have perspective along the three main directions (X,Y,Z) in order to compute the corresponding three vanishing points (VPx, VPy, VPz). Obviously, the better defined these vanishing points are, the more precision and reliability the single image-based modeling approach can reach.

Figure 1. Detail of the Stone Bridge of the city of Zamora drawn by Antón van den Wyngaerden in 1570.

Figure 2. Bridge over the Duero river in Zamora, photograph taken by J. Laurent in 1870. Photograph of the southern half (arches 7-3).

Figure 4. Workflow developed for the historical single image-based modeling applied to the case study of the "Gobierna Tower".

Figure 5. Interpretation of several cases of the Hough Transform applied to straight lines.For each of the four cases, the Image Space is represented at the left and the Parameter Space is represented at the right.In the Image space, there can be seen: (1) A straight line; (2) A point; (3) A set of collinear points; (4) A vanishing point.
Equation (1) (in the horizontal triangle). On the vertical triangle (Figure 6b) formed by the point of view, S, the vertical vanishing point, VPz, and the intersection of the horizon and maximum slope lines, Q, the following relations hold: Equation (2) (in the vertical triangle).

Figure 6. (a) Horizontal triangle formed by the point of view, S, and the horizontal vanishing points, VPx and VPy, with the azimuth angle (θ); (b) Vertical triangle formed by the point of view, S, the vertical vanishing point, VPz, and the intersection of the horizon and maximum slope lines, Q, with the tilt angle (ν); (c) Perspective triangle formed by the three vanishing points (VPx, VPy, VPz), which contains the principal point (P) as the intersection of the altitudes of the triangle, and image showing the swing angle (χ) around the camera axis: the horizon line is not parallel to the width edges of the image (likewise, the maximum slope line is not parallel to the height edges of the image).

Figure 7. The coordinates of the point of view, S, can be computed once the principal point (P), focal length (f), and rotation angles (θ,ν,χ) are known, by defining a Datum in which two imaged points (a and b) are assigned the object coordinates A(0,0,0) and B($D_{AB}$,0,0), and by means of the collinearity equations. In this case, the X coordinate of point B, which lies on the X axis, is the measured distance between A and B.

Figure 8. Dimensional analysis on a plane of the object (in this case Y = 0) and pseudo-3D modeling based on the rectification of the whole plane. The coordinates of any object point T can be computed from its image t and the constraint $Y_T = 0$ by means of the collinearity equations.

Figure 9. Historical photograph (1900) used for the single image-based modeling approach.

Figure 11. (a) Dimensional analysis based on distances for the accuracy assessment; (b) Pseudo-3D model of the "Gobierna Tower" obtained through the single image-based modeling approach; (c) Virtual 3D reconstruction that integrates the "Gobierna Tower" in the Zamora Stone Bridge.

Table 1. Different cases between image and parameter spaces for the Hough transform.
(set of lines that intersect on a point) A set of collinear points (the straight line to which they belong represents the vanishing point)

Table 2. Vanishing point coordinates and their errors.

Table 4. Accuracy assessment: dimensional analysis of distances.