Multi-Scale and Multi-Sensor 3D Documentation of Heritage Complexes in Urban Areas

The 3D documentation of heritage complexes or quarters often requires more than one scale because of their extent. While the documentation of individual buildings requires a technique with finer resolution, that of the complex itself may not need the same degree of detail. This has led to the use of a multi-scale approach in such situations, which in itself implies the integration of multi-sensor techniques. The challenges and constraints of the multi-sensor approach are compounded when working in urban areas, as some sensors may be suitable only for certain conditions. This paper describes the integration of heterogeneous sensors as a logical solution to this problem. The royal palace complex of Kasepuhan Cirebon, Indonesia, was taken as a case study. The site dates to the 13th Century and has survived to this day as a cultural heritage site, preserving a prime example of vernacular Cirebonese architecture. This type of architecture is influenced by the tropical climate, with distinct features designed to adapt to the hot and humid year-round weather. In terms of 3D documentation, this presents specific challenges that need to be addressed during both the acquisition and processing stages. Terrestrial laser scanners, DSLR cameras, and UAVs were used to record the site. The implemented workflow, some geometrical analysis of the results, and some derivative products are discussed in this paper. Results show that although the proposed multi-scale and multi-sensor workflow was successfully employed, it needs to be adapted to each site and the related challenges addressed on a case-by-case basis.


Introduction
Heritage documentation is an important aspect of any conservation effort. Apart from the traditional 2D drawings and photographs, digital 3D documentation of historical sites is now a useful tool in the analysis and interpretation of historic buildings, as well as their eventual reconstruction in the event of damage [1]. This is all the more important in the presence of serious threats, both natural [2,3] and anthropogenic [4]. Digital documentation uses various sensors to capture reality. This may be done by either image-based or range-based techniques [5]. Each technique has its own advantages and disadvantages; as such, it is not rare to see both techniques combined in a thorough documentation project [6,7]. Indeed, the integration of heterogeneous data is an important research issue [8].
The Terrestrial Laser Scanner (TLS) is a range-based system that enables the acquisition of many points in a short period of time [9] and has seen much use in the field of heritage documentation [10,11]. As regards image-based techniques, photogrammetry has made important leaps in recent times, even more so with the advent of Unmanned Aerial Vehicles (UAVs) or drones [12,13]. Both range-based and image-based techniques generate dense point clouds from which derivative geospatial products may be generated.
In the context of the documentation of heritage complexes, the extent of the site makes it logical to use a multi-scalar approach. Larger areas do not need fine resolution data, as opposed to smaller buildings or even architectural elements. In this approach, the site is digitized in several scale steps, according to the required resolution for each level [14]. The use of the multi-scalar approach also means that more than one sensor may be employed in order to cover each scale step. Furthermore, in urban areas, constraints such as the geography and urban density mean that a single sensor may not be sufficient. It is in this regard that multi-sensor and multi-scalar documentation becomes a logical solution to the problem of documenting historical complexes.
Even though the use of 3D documentation methods has risen significantly in the past few years, vernacular architecture usually presents different challenges depending on its architectural style. These challenges may also be influenced by the climate and geography of the site. The Indonesian archipelago has many examples of traditional vernacular architecture, characterized by distinctive building design and layout that take into account its tropical climate (e.g., open-air pavilions, rooftops, etc.). Some of these sites also possess historical value, and therefore merit 3D digitization and conservation.
Within the context of the Franco-Indonesian PHC NUSANTARA research program, a digital documentation mission for a representative Indonesian vernacular and historical site was made possible in May 2018. The Kasepuhan Palace, located in the city of Cirebon, West Java, was chosen as a prime example of such a site. The mission was conducted by a multidisciplinary team of surveyors and architects. The Kasepuhan Palace is an urban palatial complex located at the center of Cirebon city, on the northern coast of the island of Java. It was the center of the Kasepuhan Sultanate, one of the main successors to the old Sultanate of Cirebon. Currently, the palace no longer commands any political influence, but still plays a role as a center of Cirebonese culture. The site is also interesting because it had previously been drawn by architects using manual measurements, enabling a comparison with, and eventually an update of, the existing drawings by means of the 3D data acquisition.
The oldest parts of the palace date from the 13th Century [15], with many additions and embellishments throughout the centuries. Many of its buildings demonstrate the local architecture suited for the tropical climate, with influences from Islamic, Chinese, and European architecture. Following these constraints and influences, the palace proper does not consist of a single contiguous building, but rather a complex of smaller buildings located within the same premises. While some enclosed buildings do exist, including the main building, many parts of the palace complex consist of small pavilions, often with few walls in order to increase air flow [16]. The whole palatial complex was modeled at a lower resolution in order to get a general idea of the site and its surroundings within the context of the historical Cirebon city center. One area of particular interest within the complex was identified to be digitized at a higher resolution, namely the Siti Inggil ("Elevated Ground") area near the entrance (see Figure 1). The Siti Inggil area dates from the 15th Century [17] and served as a viewing platform for the Sultan, as it directly faces the city's main square ("alun-alun").
The objective of this paper is to address the challenges in implementing a multi-scalar and multi-sensor 3D documentation workflow for tropical vernacular cases. The paper reports these challenges and how we addressed them during the project. It also describes some geometrical analysis and quality control, in keeping with the geometric precision and accuracy required of digital heritage documentation. Both image-based and range-based techniques were employed in this regard. Several sensors were also used, including a UAV, which provides both a bird's eye view of the palace's surroundings and close-range aerial photogrammetry, complementing the classical terrestrial photogrammetry and laser scanning.

Related Work
The 3D documentation of heritage buildings and architectural elements has been addressed many times in the literature. The use of photogrammetry, laser scanning, and indeed the combination of both is seen as a winning solution for digital documentation [1]. Laser scanning has the advantage of being a fast system, able to produce millions of points in a very short amount of time [18]. It is therefore a practical solution for most digitizing projects. Nevertheless, different types of laser scanners are required for different ranges and accuracies; one of its disadvantages is therefore its cost. This is all the more so when the site to be digitized requires different scales.
Photogrammetry has seen many improvements and renewed interest in the last decade, partly due to significant developments in sensor manufacturing [19]. This has also been helped by breakthroughs in the field of dense matching [20], as well as the democratization of UAVs [21]. As regards heritage documentation, photogrammetry is often employed in its close-range configuration. Using this technique, a flexible ground sampling distance (GSD) can be designed according to the requirements. It is therefore capable of capturing very intricate details. Combined with the possibility to recover real-life texture from the images, this gives photogrammetry an advantage over other methods. UAVs give another edge to photogrammetry by adding aerial points of view. However, photogrammetric data processing requires extensive time, experience, and resources. Historically, photogrammetry required well-calibrated or even metric sensors. Today, however, faster computing capabilities and algorithms have widened the scope of sensors that may be processed according to photogrammetric principles. The DSLR camera is typically used in a standard heritage documentation project due to its relative stability and the possibility to choose acquisition parameters [5,22]. However, other lower-end sensors have also seen a rise in quality and therefore usability in recording heritage objects, e.g., smartphone cameras [23,24] or spherical panoramic cameras [25,26].
Laser scanning, on the other hand, has been around since the 1980s and was a revolutionary technology in the domain of 3D mapping. Contrary to traditional total stations, the TLS technology enables the recording of the environment in a fast and relatively accurate manner while being fairly easy to use. The resulting abundance of data, however, may also be a disadvantage as it generates large point cloud files. This is especially true when the object in question is a complex building, as is often the case in heritage documentation [1]. Occlusions may also occur, which render the final point cloud incomplete [27]. The TLS also has limitations when it comes to point cloud colors and textures, even with the addition of cameras attached to the device [28]. Meanwhile, registration between TLS stations has been studied extensively in the literature, mainly based on a coarse 3D conformal transformation followed by a fine closest neighbor-based registration [7,8,29].
Due to the advantages and disadvantages of each recording method, both can be complementary, and their integration is often performed [30,31]. Various integration workflows have been proposed, which mainly depend on the particular case. Early attempts include the possibility to integrate the 3D data by means of a Geographic Information System (GIS) [32]. Another approach performs independent georeferencing on each dataset in the same coordinate system, with direct integration at the end of each georeferencing [33,34]. In terms of photogrammetric workflow, several protocols stress the need for a good camera network for the bundle adjustment part and enough overlaps for the dense matching parts [35,36]. UAV acquisition strategies can also be used in order to extract as much information as possible from aerial flights, for example by performing oblique image acquisitions [37].
Architectural documentation has been performed using these methods in many research works [38]. It is often used in extracting orthophotos [39], vector models of facades [40], 3D models [41], as well as inputs for heritage building information models (HBIM) [42]. Both photogrammetry and laser scanning have been employed in the documentation of vernacular buildings [43,44]. They have also been employed in the case of sites located in a tropical climate [32], as well as for the distinctive Asian architecture [45,46].
However, few research works have been conducted on the documentation of vernacular Indonesian architecture. One research paper discussed the digitizing of the Borobudur temple [47] using a similar multi-sensor and multi-scale approach. However, the site in question is of a very different nature from the Kasepuhan Palace; therefore, different problems and challenges arise. Indeed, the Kasepuhan Palace presents a different layout and architecture from monolithic monuments in that it is spread out over a larger area. Another research work that was more similar to this project was conducted in a traditional village in Sumatra; however, the paper reported only the use of close-range photogrammetry [48].

Proposed Workflow
The proposed workflow involves a multi-sensor and multi-scale digitization of the Kasepuhan Palace site, with the intention to create a hybrid photogrammetric-laser scanning 3D model. This hybrid model can then be used to generate various derivative products that support the work of architects and archaeologists, while also providing a medium for the dissemination of information to the public. Photogrammetry was mainly used to capture close-range objects with intricate details, which require higher resolution, as well as an aerial point of view of the site. Laser scanning, on the other hand, was used to capture the buildings in general and thus acts as a bridge between the resolution generated by aerial photogrammetry and that by close-range photogrammetry. In terms of the sensors used, the main types of sensors employed include the following:

• Aerial photogrammetry: Aerial photogrammetry was performed in order to obtain a global view of the site and its surroundings. The DJI Phantom 4 (Normal) UAV was used in this regard, equipped with a 12-megapixel camera and a small 3.6-mm lens. The flight was conducted at an average altitude of 80 m.
• Close-range photogrammetry: The DSLR camera Canon EOS 5DSR was used to capture close-range images. This camera generates images with a 50-megapixel resolution. Two types of lenses were used during the acquisition: a 24-mm one was used to capture larger architectural elements and buildings (e.g., columns, ceilings, walls, etc.), while a 40-mm lens was used for elements requiring finer details (e.g., carvings, fine woodwork, etc.). Additional close-range images were acquired using the UAV at flying heights varying from 5 to 20 m.
• Laser scanning: A TLS was used to obtain the general environment and 3D model of the buildings within the palace complex. A Faro Focus 3D was used to this end. The Faro Focus 3D is a phase-based TLS, which works well within a medium range of distance.
• Topographical surveying: In parallel with the 3D data acquisition, a topographical survey was conducted to measure several aerial premarks, as well as the 3D coordinates of some control points. These control points were then integrated into the aerial and close-range photogrammetric projects, as well as the laser scanning project, in order to georeference them to the same system. The total station Topcon GTS-212 was used for the tacheometric measurements, which were in turn tied to the Indonesian national UTM system by means of GNSS measurements.
The multi-scale aspect of the project refers to the different scale steps that were acquired (see Table 1). UAV flights were conducted to take images of the Kasepuhan Palace and its surroundings; this constitutes Step 1 of the assigned scale steps. Step 2 involves a digitizing process at a larger scale, this time encompassing a particular area of the complex: the Siti Inggil area near the entrance. This area was prioritized during the documentation mission due to its historical value. Close-range UAV data were combined with a middle-range (10-25 m) TLS point cloud in order to achieve this level. The TLS acquisition was aided by artificial spheres, which facilitate the registration process. The third scale step comprised monuments, pavilions, or other single buildings in general. Close-range photogrammetry using the 24-mm lens, as well as the TLS at a closer distance (5-10 m), were employed in this case. The final step consisted of fine details such as carvings and intricate decorative woodwork on some of the architectural elements. The 40-mm lens was used in this case.
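As an illustration of how an acquisition distance translates into a resolution for each scale step, the sketch below computes the theoretical GSD from the standard pinhole relation (object distance multiplied by pixel size, divided by focal length). The 80 m altitude and 3.6-mm focal length come from the text; the sensor width (6.17 mm) and image width (4000 pixels) are assumptions made for this example and should be replaced by the actual camera specifications.

```python
def ground_sampling_distance(distance_m, focal_mm, sensor_width_mm, image_width_px):
    """Theoretical GSD in metres per pixel for an image taken facing the object."""
    # Physical size of one pixel on the sensor
    pixel_size_mm = sensor_width_mm / image_width_px
    # Similar triangles: GSD / distance = pixel size / focal length
    return distance_m * pixel_size_mm / focal_mm


# Assumed sensor geometry (not stated in the text): 6.17 mm wide, 4000 px across
SENSOR_WIDTH_MM = 6.17
IMAGE_WIDTH_PX = 4000

# Step 1: aerial flight at an average altitude of 80 m with the 3.6-mm lens
gsd_step1 = ground_sampling_distance(80.0, 3.6, SENSOR_WIDTH_MM, IMAGE_WIDTH_PX)
# Upper bound of the close-range UAV flights (20 m)
gsd_close = ground_sampling_distance(20.0, 3.6, SENSOR_WIDTH_MM, IMAGE_WIDTH_PX)

print(f"Step 1 GSD: {gsd_step1 * 100:.1f} cm/px")
print(f"Close-range UAV GSD at 20 m: {gsd_close * 100:.2f} cm/px")
```

Under these assumptions, the same function can be reused for the DSLR acquisitions by substituting the corresponding focal length, sensor width, and object distance.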
The method of 3D data integration follows the existing workflow as described in [33]. Instead of a block integration of all the data, which would have taken an immense amount of time and resources, integration was performed by means of independent georeferencing for each dataset. This means that photogrammetric projects were georeferenced using the absolute orientation method, while laser scanning projects used the measured coordinates of some of the artificial spheres to perform a rigid-body transformation. All control coordinates for Steps 1 and 2 were measured via total station and GNSS survey and attached to the same projection system (UTM 49S). Since both TLS and photogrammetry data were georeferenced to the same system, the two datasets were automatically merged at the end of the georeferencing process. The fact that all data were georeferenced to the same absolute system also means that future missions can be superposed easily. The control points were each measured twice from two stations, in order to enable spatial intersection. Note that for Steps 3 and 4, georeferencing by means of surveyed control points would require a large number of control points. In order to economize the available time and resources, manual control points were instead measured from the georeferenced TLS point cloud. The downside of this method, however, is the propagation of error from the TLS georeferencing process.
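The rigid-body georeferencing of the TLS data can be illustrated with a minimal sketch. It assumes the least-squares SVD solution (often called the Kabsch method), which is one common way to estimate a rotation and translation from matched points and is not necessarily what Faro Scene implements. The function name and all sphere coordinates are hypothetical.

```python
import numpy as np


def fit_rigid_transform(source, target):
    """Least-squares rotation R and translation t such that R @ p + t ~ q."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets
    H = (source - src_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1) rather than a reflection
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t


# Hypothetical sphere centres in the scanner frame (metres)
spheres_local = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.5],
                          [0.0, 8.0, 1.0], [6.0, 7.0, 2.0]])
# Simulated surveyed coordinates: 30 degree rotation about Z plus a shift
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([100.0, 200.0, 5.0])
spheres_surveyed = spheres_local @ R_true.T + t_true

R, t = fit_rigid_transform(spheres_local, spheres_surveyed)
# Residuals after applying the estimated transformation to the local spheres
residuals = spheres_local @ R.T + t - spheres_surveyed
rms = np.sqrt((residuals ** 2).sum(axis=1).mean())
print("RMS residual (m):", rms)
```

No scale factor is estimated here, which reflects the fact that a TLS measures metric distances directly; a photogrammetric absolute orientation would add a scale parameter to the same formulation.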
Architects have also previously created drawings and plans of the Kasepuhan Palace, mostly using measuring tapes and distance meters (Figure 2). These preliminary data were useful for the mission planning and may also serve as a comparison to the obtained 3D data. These drawings, however, are somewhat simplified, with measurements given for only one facade. The 3D documentation therefore also permits more detailed measurement of the dimensions of various objects, which may be very useful for architects and archaeologists alike.

Challenges and Constraints
The challenges specific to the Kasepuhan site were mostly related to the local climate and culture. As has been previously mentioned, the Kasepuhan Palace compound consists of a number of separate pavilions. These pavilions serve both as shelters against the elements and as a cultural statement of the palace as a historic center of Cirebonese culture. Due to the tropical and coastal climate of Cirebon, the weather is usually hot and humid. These conditions dictated the design of the vernacular architecture present in Kasepuhan. Walls are scarce on the majority of buildings, which provides a free flow of air and natural cooling [16]. Walls are limited to some boundaries between the different parts of the compound. This lack of walls presents a particular challenge to TLS data acquisition, since it means that there are more hidden angles to cover and more masking problems. This translates into more stations and sometimes higher resolution scans in order to cover the hidden parts. This problem was addressed by taking additional photogrammetric images of some of the parts where the TLS is most difficult to station. The resulting point cloud can then be integrated into the TLS one to cover some of the most problematic parts.
The climate of the site also poses certain challenges with regard to the acquisition mission. Cirebon possesses a tropical and coastal climate, being located on the northern coast of the island of Java. The temperature during the mission averaged between 30 °C and 34 °C, with very high levels of humidity. These conditions play a role in the performance of some surveying tools, which rely on infrared and/or lasers. The total station encountered small problems in measuring distances due to this factor, and at some points, the measurement had to be performed several times. This may also be due to the strong solar exposure. In this regard, the use of the spatial intersection method for the computation of control points is useful, since it may be based only on angle measurements, which are less influenced by the temperature and humidity. This method has the added benefit of giving a more robust result due to the fact that each point is measured from at least two stations. Solar intensity was also an important influencing factor, as it renders some passive sensors problematic due to overexposure. Problems with overexposure in photos may be rectified in manual mode; however, the panoramic images taken by the TLS were more difficult to compensate. In some processing steps where it was required to select points, intensity rather than RGB images were therefore used (Figure 3). The dense vegetation in and around the Kasepuhan complex was also a challenge, particularly for UAV acquisition. Ground control points (GCPs) were to be placed around the complex following standard photogrammetric convention, but this was hindered by the amount of tree canopy. Premark placement for GCPs was therefore limited to small open areas, which do not necessarily correspond to the ideal photogrammetric ground control network. It also posed a problem for GNSS measurements, necessitating a longer measurement time in order to reach the required centimetric precision. As regards TLS and close-range terrestrial photogrammetry, the dense vegetation also generated noise, which must then be cleaned from the resulting point cloud.
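The angle-only spatial intersection mentioned above can be sketched as follows for the planimetric case. The sketch assumes that the horizontal angles observed at each station have already been converted to grid azimuths (clockwise from north) and that the two station coordinates are known; the function name and the coordinates are hypothetical, and heights (which would come from vertical angles) are left out.

```python
import math


def intersect_from_azimuths(station_a, azimuth_a_deg, station_b, azimuth_b_deg):
    """Planimetric position (E, N) of a target sighted from two known stations."""
    ea, na = station_a
    eb, nb = station_b
    # Unit direction of each line of sight; azimuths are clockwise from grid north
    dax = math.sin(math.radians(azimuth_a_deg))
    day = math.cos(math.radians(azimuth_a_deg))
    dbx = math.sin(math.radians(azimuth_b_deg))
    dby = math.cos(math.radians(azimuth_b_deg))
    # 2D cross product of the two directions; zero means parallel sight lines
    denom = dax * dby - day * dbx
    if abs(denom) < 1e-12:
        raise ValueError("lines of sight are parallel; no unique intersection")
    # Distance along the line of sight from station A to the intersection
    s = ((eb - ea) * dby - (nb - na) * dbx) / denom
    return ea + s * dax, na + s * day


# Hypothetical local coordinates (metres): the target is sighted at 45 degrees
# from station A and at 315 degrees from station B
east, north = intersect_from_azimuths((0.0, 0.0), 45.0, (100.0, 0.0), 315.0)
print(f"E = {east:.3f} m, N = {north:.3f} m")
```

Because no distance measurement enters the computation, the result does not depend on the electronic distance meter, which is the component reported above as being affected by the heat and humidity.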
Another problem with the site is that residential housing is mixed with palatial buildings in some areas, sometimes with no walls or fences to delineate them. This means that residents were free to enter the palace compounds, rendering the site non-sterile. Particular care in the handling of surveying tools and TLS spheres was therefore of the utmost importance. Members of the team had to stand by at several spots with artificial TLS spheres in order to prevent them from being moved by passers-by. Several control points were therefore also measured on immobile detail points (e.g., roof edges, brick intersections, etc.) to mitigate problems that may arise from moved targets. Tourists also came in large groups due to the palace's guided tour system. Complete authorization for the sterilization of the site was difficult to obtain due to the number of tourists who visited Kasepuhan, even during normal work days. This added complexity to the acquisition and noise to the point cloud.

Results and Discussions
Overall, 310 aerial images, over 430 close-range UAV images, and 1060 terrestrial close-range images were taken during the mission. In terms of TLS stations, 23 stations were acquired throughout the palace compounds. Photogrammetric processing was performed using Agisoft PhotoScan, while TLS registration and georeferencing were conducted using the Faro Scene software.

Preliminary Processing
The Step 1 images oriented in PhotoScan are shown in Figure 4. Overall, eight control points spread over a 23-hectare area were measured in the field and used during the image orientation process. Five points were used as GCPs, yielding an overall RMS value of 3.5 cm. The remaining three points were used as check-points, giving an RMS value of 9.6 cm. Since the main objective of the Step 1 processing was only to give a general overview of the site, and the main area of interest for the project was the Siti Inggil area, only limited time was allotted to measuring control points for the whole area. While far from ideal, this configuration was sufficient for the Step 1 requirements. This first step is useful in assessing the Kasepuhan site and its environment and may be used to generate maps and orthophotos with centimetric resolution.
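The distinction between the GCP and check-point RMS values can be made concrete with a short sketch. It assumes that the RMS is taken over the 3D distances between the coordinates estimated after image orientation and the surveyed coordinates; the exact definition used by PhotoScan is not stated here, and the coordinates below are hypothetical.

```python
import math


def rms_error(estimated, surveyed):
    """Root-mean-square of the 3D distances between matched points (metres)."""
    squared = [math.dist(p, q) ** 2 for p, q in zip(estimated, surveyed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points: coordinates from the oriented block vs. the survey
estimated = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
surveyed = [(0.0, 0.0, 0.03), (1.0, 1.04, 1.0)]

check_rms = rms_error(estimated, surveyed)
print(f"Check-point RMS: {check_rms * 100:.1f} cm")
```

GCPs take part in the adjustment, so their residuals mainly show how well the block fits its own constraints, whereas check points are withheld and give an independent estimate of accuracy. This is consistent with the check-point RMS reported above being larger than the GCP RMS.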
Meanwhile, a complete registration of all 23 acquired TLS stations was performed using Faro Scene, with a 4.5-mm average distance between tie points ("tensions"). This value lies within the tolerance when considering the theoretical GSD as described in Table 1.
The TLS point cloud was georeferenced using the coordinates of six artificial spheres measured with a total station during the topographical survey; this was a separate step from the GNSS measurement used for the aerial control points. As regards the other scale steps, the georeferenced TLS point cloud was used as a reference for the absolute orientation of the photogrammetric projects. Manual control points were selected on the TLS point cloud on parts where the photogrammetric result overlapped. Even though this process was performed on the full resolution of the TLS data, further errors in terms of accuracy were still possible due to the error propagation. The TLS data were then subsampled into 5-mm, 1-cm, and 2-cm point clouds with the aim of reducing the file size.
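The effect of subsampling at several spacings can be sketched with a simple voxel-grid filter, in which all points falling in the same cubic cell are replaced by their average. This is a swapped-in illustrative technique: the subsampling method actually used for the TLS data is not specified in the text, and the points below are hypothetical.

```python
from collections import defaultdict


def voxel_subsample(points, voxel_size):
    """Return one averaged point per occupied voxel of edge length voxel_size."""
    cells = defaultdict(list)
    for x, y, z in points:
        # Integer index of the voxel containing the point
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        cells[key].append((x, y, z))
    # Average the coordinates of the points in each voxel
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in cells.values()]


# Hypothetical points (metres): two lie within 5 mm of each other, one is apart
cloud = [(0.001, 0.001, 0.001), (0.003, 0.003, 0.003), (0.012, 0.0, 0.0)]

cloud_5mm = voxel_subsample(cloud, 0.005)
cloud_2cm = voxel_subsample(cloud, 0.02)
print(len(cloud), "->", len(cloud_5mm), "points at 5 mm,",
      len(cloud_2cm), "points at 2 cm")
```

Coarser voxels reduce the file size further at the cost of detail, which is why several subsampled versions are convenient for the different scale steps.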
Figure 5 shows some results with regard to the multi-scale approach used in this project. Aerial photogrammetry was used in the creation of the Step 1 3D model; this scale step can be used as an overview of the Kasepuhan site and its surroundings. In total, 370 images were taken for this Step 1. Orthophotos can also be generated from this 3D model, from which an update of existing maps can also be derived. Some interesting preliminary remarks based on this result include the fact that some of the areas of the palace are not aligned to the cardinal directions. The Siti Inggil area is quite notable as it demonstrates a parallelogram shape, in addition to the non-alignment to the cardinal directions.
Step 2 shows results from the TLS dataset, from which a higher resolution can be observed in the case of certain areas (Siti Inggil in Figure 5). Advancing to even higher resolution, Step 3 shows the Royal Pavilion within Siti Inggil, which was modeled using close-range photogrammetry images. In this resulting point cloud, characteristic architectural elements can already be identified individually. A total of 640 images were taken for various details of the palace, averaging 150 images for each object of interest. Step 4 shows the finest resolution, that of decorative elements such as carvings. In Figure 5, a decorative plinth of a column generated from 30 images is showcased.

Quality Assessment
Assessment of the quality of the results was performed on several aspects, namely comparison against the pre-existing architectural drawings, analysis of the quality of the close-range photogrammetry results in Steps 2 and 3, and an evaluation of the TLS dataset in the scope of Step 4. This is intended to give both an overall and a more detailed view of the quality of the 3D documentation project.

Comparison against Architectural Drawings
One of the first analyses conducted on the result was a comparison against the pre-existing architectural drawings. Table 2 shows a comparison of the dimensions of the elevated platform sides and the pillar heights of the Central Pavilion of Siti Inggil. For this purpose, measurements were taken from the Step 3 model. One of the disadvantages of the existing drawings is the fact that they assume that all facades of the object are symmetrical, and therefore only give measurements on one side. Similarly, pillars were categorized into taller interior and lower exterior columns, with singular measurements for each. 3D documentation uses the reality-based approach, and therefore captures the object as it is. Step 3 of the 3D model gives a centimetric resolution, and may therefore detect variations of this order.
In Table 2, the platform sides show differences (∆) ranging from 0.4 to 2.7 cm. The average difference was 1.4 cm, which is acceptable considering that the measurements on the drawings were made using measuring tapes. An interesting observation can however be made regarding the ∆ for the pillar height. A seemingly systematic error of around 2 cm can be observed in this case, which may also be due to errors that occurred during the manual measurement. The objective of this part of the assessment is to confirm the manual measurements that were made for the architectural drawings. Since the drawings serve as a reference for architects, archaeologists, and conservators working on the site, it was deemed important to compare them to the 3D documentation method. The comparison has also shown that the 3D methods were able to give more precise measurements in three dimensions, instead of simplifying the dimensions into the constraints of a 2D drawing.

Close-Range Photogrammetry Assessment (Steps 2 and 3)
In the case of Steps 2 and 3, the point cloud resulting from the photogrammetry method was georeferenced using common point coordinates obtained from the TLS point cloud. In this regard, the TLS provides a reference to the close-range photogrammetry result. This assumption was made from the fact that the TLS data were georeferenced and did not require a scaling factor, making it more reliable for the georeferencing of the close-range photogrammetry results. In terms of acquisition time, this method reduces much of the field time as it means that only TLS and aerial photogrammetry control points are required to be measured during the topographical survey. However, this may mean that a propagation of error might occur in the results.
Figure 6a shows the result of the combination of point clouds generated by TLS and UAV photogrammetry for the Siti Inggil area in Step 2. As can be seen, the UAV complements the TLS data by filling some holes otherwise impossible to acquire from a terrestrial point of view. This includes building roofs, but also tree crowns. Although visually the composite point cloud seems to have been combined correctly, a further quantitative analysis was performed in order to investigate the quality of the data. In this regard, the cloud-to-cloud distance analysis was performed using the CloudCompare software. This analysis computes the absolute distance between nearest points belonging to two point clouds. Owing to the absence of a precise mesh model that may act as an absolute reference in this step, local surface modeling was conducted using the TLS data as a reference. The results are shown in Figure 6b, which gave a mean distance between the TLS and UAV photogrammetry point clouds of 2.3 cm with a standard deviation of 1.8 cm. Despite the millimetric theoretical GSD for the TLS data, this centimetric result is explained by the UAV data's GSD, which was indeed in the order of 1 cm (see Table 1). Furthermore, in Figure 6b, it can also be observed that the largest distances are located either in the periphery of the Siti Inggil or on parts that are considered movable (e.g., tree crowns and grasses). However, on the paved pathway, as well as on most of the rooftops of the pavilions, the absolute distance is very small, in the order of 0-7 mm. A similar analysis was performed for the comparison between TLS and terrestrial photogrammetry data, with similar results. In terms of Step 3, the Central Pavilion was chosen as a sample to be analyzed. This building was chosen due to its central position with open space around it, which permitted the generation of a relatively complete point cloud. Figure 7a shows the composite point cloud from combined TLS and close-range photogrammetry results. Pure TLS data did not manage to capture the entirety of the building, as it did not manage to scan the building roofs and difficult angles in the interior. Photogrammetry was useful in adding details of the interior ceilings, while additional UAV images were used in completing the roofs. However, the terrestrial photogrammetry encountered a problem in this case when trying to reconstruct the pavilion's floors. The floors of the pavilion are made of ceramic tiles, which when photographed from a low angle (as is the case with the terrestrial close-range acquisitions) became reflective surfaces and made it difficult for the dense matching algorithm to generate points. Here, the complementarity of TLS and photogrammetry is showcased, as both techniques completed each other and generated a complete model of the pavilion. This complementarity, however, requires a good registration between the two datasets. In this regard, another cloud-to-cloud distance analysis was performed in order to quantify the quality of the registration.
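The principle of the cloud-to-cloud distance can be sketched with a brute-force nearest-neighbor search. This is only a simplified illustration under stated assumptions: it omits the local surface modeling described above, it is not CloudCompare's implementation, and the coordinates are hypothetical. For real point clouds, a spatial index such as an octree or k-d tree would be needed instead of the exhaustive search.

```python
import math
import statistics


def cloud_to_cloud_distances(compared, reference):
    """Distance from each compared point to its nearest reference point."""
    # Exhaustive search: acceptable only for very small clouds
    return [min(math.dist(p, q) for q in reference) for p in compared]


# Hypothetical reference (e.g., TLS) and compared (e.g., photogrammetry) points
reference_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
compared_cloud = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.03)]

distances = cloud_to_cloud_distances(compared_cloud, reference_cloud)
mean_distance = statistics.fmean(distances)
std_distance = statistics.pstdev(distances)
print(f"mean = {mean_distance * 1000:.1f} mm, std = {std_distance * 1000:.1f} mm")
```

Because these distances are unsigned, their mean is always non-negative and mixes registration error, noise, and differences in point density; it should therefore be read together with the spatial distribution of the distances, as done in the figures.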
Figure 7b shows the result of this analysis in CloudCompare, which gave a mean distance between the TLS and photogrammetry point clouds of 7 mm with a standard deviation of 5 mm. These values were again higher than the theoretical GSD given in Table 1. In Figure 7b, a slight systematic error can be seen on the right part of the model. This may be due to an imperfect iterative closest point (ICP) registration, caused by the noise in both datasets. One main problem for the ICP process was the presence of a tree behind the pavilion, which rendered some parts of the roofs very noisy in a different way for each dataset. This may have contributed to the slight rotation between the two datasets, as evidenced in Figure 7. Nevertheless, the resulting average distance and its standard deviation were still within the centimetric level, which is sufficient in this case.
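To make the registration step more concrete, the following sketch runs a point-to-point ICP on a toy 2D example. It is a simplified illustration and not the implementation used in the project: the 2D setting, the function names, and the sample points are all assumptions, and practical ICP implementations work in 3D and usually include outlier rejection.

```python
import math


def nearest(p, cloud):
    """Return the point of `cloud` closest to `p` (brute force)."""
    return min(cloud, key=lambda q: math.dist(p, q))


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    csx = sum(x for x, _ in src) / n
    csy = sum(y for _, y in src) / n
    cdx = sum(x for x, _ in dst) / n
    cdy = sum(y for _, y in dst) / n
    s_cos = s_sin = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy  # centered source point
        bx, by = dx - cdx, dy - cdy  # centered target point
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)


def transform(points, theta, tx, ty):
    """Apply a 2D rotation followed by a translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iterations=10):
    """Alternate nearest-neighbor matching and rigid fitting."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current


def rms_residual(points, target):
    """Root mean square of the nearest-neighbor distances."""
    squared = [math.dist(p, nearest(p, target)) ** 2 for p in points]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical target outline and a slightly rotated, shifted copy of it.
target = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 5.0)]
source = transform(target, math.radians(5.0), 0.10, -0.05)

before = rms_residual(source, target)
after = rms_residual(icp_2d(source, target), target)
print(f"RMS before: {before:.4f}, RMS after: {after:.2e}")
```

Every nearest-neighbor match is correct in this noise-free toy case, so the alignment converges at once. Points that are matched wrongly, as can happen with the noisy roof points near the tree, can bias the estimated rotation.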

TLS Assessment (Step 4)
Georeferencing of the Step 4 results used common points identified on the TLS data. However, in terms of point cloud resolution, the close-range photogrammetry data provided much finer details. The close-range data also enabled the creation of a detailed mesh, which could then serve as a reference surface. Another analysis based on the cloud-to-mesh signed Euclidean distance was therefore performed in order to assess the quality of the TLS data in recording these details. It should be noted that the Faro Focus 3D was designed as a medium-range TLS.
Figure 8 shows the 3D reconstruction of a plinth located within the Royal Pavilion. The close-range photogrammetry yielded a very detailed point cloud due to the acquisition setup. This point cloud was meshed using the Poisson method in order to serve as a reference for the TLS. Figure 8d shows the signed distance between the TLS point cloud and the photogrammetric mesh. A mean error of 1.5 mm was obtained, with a standard deviation of 2 mm. The mean error was quite high for this level of resolution and indicates the presence of a systematic error. This was probably caused by errors during the registration of the two point clouds. The standard deviation, however, roughly corresponds to the theoretical resolution of the TLS acquisition.
A similar analysis was performed on the woodwork on the Central Pavilion's ceilings (Figure 9). A mean error of 0.4 mm and a standard deviation of 5.6 mm were obtained, as shown in Figure 9d. The mean error was near zero, which suggests that the registration between the two datasets was correct, with only a negligible systematic error. That being said, a systematic tilt can be observed in Figure 9d at the upper right and lower left corners of the ceilings. This tilt would have gone unnoticed had the errors not been plotted as in Figure 9d, since deviations of opposite sign may have compensated each other in the mean error. Moreover, the standard deviation was higher in this case, almost thrice that of the plinth. This might be caused by the sensor-to-object distance, as both the photogrammetry and TLS data were acquired from a slightly farther distance than for the plinth, which reduces the resolution of both datasets.
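The way a tilt can hide behind a near-zero mean error can be sketched as follows. The example assumes a flat reference plane in place of the photogrammetric mesh and a hypothetical tilt of 1 cm per meter; none of these values come from the project data.

```python
import statistics


def signed_distance_to_plane(point, normal, offset):
    """Signed distance from `point` to the plane n . p = offset (n of unit length)."""
    return sum(n * c for n, c in zip(normal, point)) - offset


# Hypothetical setup: reference surface is the plane z = 0, and the measured
# points lie on a plane tilted around the y-axis by 1 cm per meter.
SLOPE = 0.01
points = [
    (i / 10, j / 10, SLOPE * (i / 10))
    for i in range(-10, 11)
    for j in range(-10, 11)
]

signed = [signed_distance_to_plane(p, (0.0, 0.0, 1.0), 0.0) for p in points]
mean_error = statistics.fmean(signed)
std_error = statistics.pstdev(signed)

print(f"mean = {mean_error * 1000:.2f} mm, std = {std_error * 1000:.2f} mm")
```

The positive and negative deviations cancel in the mean, while the standard deviation still reveals the misfit. This is why a plot of the signed distances, or a dispersion statistic, is needed alongside the mean.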

Derived Products
Once the preliminary processing was finished, several derivative products were generated from the resulting hybrid photogrammetric and TLS point cloud. This section briefly describes some of the products created in this project, mainly in order to improve the management and dissemination of the Kasepuhan site, and of Siti Inggil in particular. These include an orthophoto map, a photorealistic 3D model, a BIM model, and a virtual reality environment.

Orthophotography of Kasepuhan and Its Surroundings
The orthophoto map (Figure 10) was one of the first derived products to be generated, as it requires only the completion of the Step 1 scale level. The map was generated from aerial images taken at a flying height of around 80 m above the ground, yielding a pixel resolution of 3 cm. This map was not by any means meant to be a true topographic map, but rather an overview of the Kasepuhan Palace and its surroundings. In this sense, it followed the (semi-)automatic workflow in PhotoScan, and the DTM (digital terrain model) on which the orthographic projection was based did not undergo a thorough and rigorous manual clean-up. Indeed, the purpose of the map was to serve as a support during the acquisition mission itself, by showing the general layout of the site and enabling a more detailed planning of the acquisitions for the next scale steps. Although a CAD drawing of the palace layout was available, the high-resolution orthophoto texture helped in performing a qualitative interpretation of some ground features. Processing speed was therefore of the essence, and a first version of the orthophoto map was already available at the end of the first day of the mission. Regardless, the map still provides valuable information to other stakeholders such as architects and palace managers. The up-to-date nature of the data from which it was created made it a useful tool in revising the existing maps of the palace complex.
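The link between flying height and pixel resolution follows the usual pinhole approximation for nadir images, GSD = H * p / f, with H the flying height, p the pixel pitch, and f the focal length. The sketch below uses placeholder camera values chosen only so that the result matches the 3 cm reported above; they are not the specifications of the UAV camera used in the project.

```python
def ground_sampling_distance(height_m, pixel_pitch_m, focal_length_m):
    """Ground footprint of one pixel for a nadir image (pinhole approximation)."""
    return height_m * pixel_pitch_m / focal_length_m


# Placeholder values for a hypothetical camera, not the one used in the project.
FLYING_HEIGHT_M = 80.0
PIXEL_PITCH_M = 1.5e-6
FOCAL_LENGTH_M = 4.0e-3

gsd_m = ground_sampling_distance(FLYING_HEIGHT_M, PIXEL_PITCH_M, FOCAL_LENGTH_M)
print(f"GSD: {gsd_m * 100:.1f} cm")
```

Since the GSD scales linearly with the flying height, halving the height with the same camera would halve the pixel footprint, at the cost of more images to cover the same area.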

Photo-Textured 3D Meshed Model
A surface reconstruction of Siti Inggil was made possible by meshing the point cloud into 3D meshed models. 3D meshes have smaller file sizes and can carry textures, which makes them both easier to manage and better suited to visualization. Mesh triangles can be textured using an interpolation of point colors, but this results in a blurred texture, which is not photorealistic. Since photogrammetric data were also available, the superimposition of a photorealistic texture was also feasible. The proposed workflow for the creation of this photorealistic textured 3D model is elaborated in this section. The results from several main steps are shown in Figure 11. Starting from the registered and combined TLS and photogrammetry point cloud (Figure 11a), a 3D mesh was generated using the Poisson method (Figure 11b). Since both the photogrammetric and TLS point clouds were already georeferenced to the same absolute coordinate system, the integration of these two data sources was immediate. The meshing was performed using the CloudCompare software. The combination of TLS and photogrammetry was important, since using only the photogrammetric point cloud results in some missing parts, and vice versa with the TLS point cloud. In the photogrammetric point cloud, this problem is exacerbated by the ceramic tiles used on the pavilions' flooring, which often leave the floors missing. For the TLS point cloud, texturing is important since textures from the scanner's own panoramic images are not good enough and are often hindered by occlusions from the station's point of view.
The resulting mesh was then re-imported into PhotoScan, bearing in mind that the coordinate systems of the PhotoScan project and the mesh were identical. This is crucial in order to keep the texture mapping properties already set in the PhotoScan project and avoid shifted textures. This would bring the mesh into the PhotoScan project's network of oriented images. Photorealistic textures can then be superimposed on this mesh using PhotoScan's texturing function (Figure 11c). The resulting 3D model has a reduced file size compared to the original point cloud, while possessing a photorealistic texture. This procedure was performed for all Step 3 objects, including all the pavilions and the 15th Century brick wall surrounding Siti Inggil. The photorealistic 3D models provide a better visualization compared to the point cloud or simple meshes. These models can then be integrated into the virtual visit environment, which was also developed in this project.

BIM 3D Model
In terms of site management, the 3D mesh is not sufficient, as it cannot store semantic information. One way to address this problem is the creation of a building information model (BIM), which permits the storing of semantic data within the 3D model. The BIM 3D model was created using the Archicad software, based on a wireframe CAD model, which was itself created by photogrammetry using the PhotoModeler software (Eos Systems). The 3D model was drawn as a test, and in this regard, a repeating pattern texture image was used instead of the photo textures. The resulting model of the Central Pavilion can be seen in Figure 12. Note that the BIM model still consists solely of geometric data, with semantics and ontology to be added in a later phase. The representation of complex geometries in a BIM environment is an interesting subject and may be addressed in several ways, e.g., representation as a mesh rather than as geometric primitives.

Virtual Visit Environment
In order to create an immersive medium for visualizing the results, a virtual visit environment was developed. Both the point cloud and the photorealistic 3D meshed models were used in this regard. A combination of texturing methods was adopted, with the Step 3 objects (i.e., pavilions and brick walls) displayed in photorealistic textures, while the less interesting parts of Siti Inggil (e.g., lamp posts, ground, trees) only used interpolated colors originating from the point cloud. This was done in order to reduce processing time and focus more on the objects of interest. Two products were generated from this virtual visit environment, displayed here in Figure 13. The first one is a virtual visit video of Siti Inggil, which showcases the photorealistic models of the main objects of interest. The second product is a Virtual Reality (VR) system, which was developed using the HTC VIVE VR goggles. These derivative products are very useful in helping the dissemination of information about the site to the public. The immersive environment enables people to visit the site remotely and better appreciate the Kasepuhan Palace in its empty state. It is also an interesting tool for architects in order to examine various details of the area without having to go to the site.
Figure 13. The virtual visit environment developed for the Siti Inggil area. To the left, the combined 3D mesh models from various scale steps. Notice that the 3D model of the pavilion is at a higher resolution than the ground or tree beside it. To the right, a virtual visit of the same area using VR gear.

Conclusions and Outlook
This paper described the workflow used in the digitizing of a tropical vernacular architecture in the context of cultural heritage documentation. The workflow included a multi-scale approach, in which the varying components of 3D data were linked together using measured 3D coordinates to create a georeferenced hybrid 3D point cloud. The multi-scale aspect divided the project into four scale steps, going from Step 1 (aerial photogrammetry for the site and its surroundings), Step 2 (TLS and UAV close-range images for smaller specific areas of interest), Step 3 (terrestrial close-range photos for architectural objects), up to Step 4 (terrestrial close-range images for fine decorative elements). The results showed that the approach is suitable for this type of site, where the area is spread out in a large complex with smaller buildings within. Furthermore, some challenges during the acquisition have been discussed in this paper. Some adaptive measures to answer these challenges were also briefly mentioned. Indeed, the type of architecture encountered in Kasepuhan is a special case adapted to the surrounding climate, and therefore must be addressed differently. Challenges regarding the conditions during the acquisition were also described in the paper.
The analysis conducted in this paper showed that for Step 1, UAV photogrammetry is a very suitable solution. It provides a fast result in the form of the orthophoto map and a rough DTM, which permits a more detailed planning within the period of the acquisition mission. TLS still proves to be the most practical technique to document a site at a Step 2 scale level such as Siti Inggil. It is much faster in terms of both acquisition and processing and provides a rather complete overview of the site. However, as has been seen in this paper, some missing data are still possible due to difficult angles and occlusions (e.g., roofs and pillars). It is in this regard that photogrammetry can be a complementary technique in providing the missing data. Naturally, this can also be performed by other types of laser scanner (e.g., a handheld laser scanner), but the cost constraints may give photogrammetry the advantage in this particular case.
In terms of Step 3, the TLS is still a pertinent solution. However, within this scale step, the problem for TLS lies in its texturing options, which are very limited. Again, photogrammetry may complement this technique in providing not only missing parts, but also photorealistic textures. However, for Step 4, photogrammetry is by far the best option for the recording process. The TLS used in the project simply is not designed for such close-range application. Close-range photogrammetry can be employed in this case rather quickly and simply, while also providing photorealistic textures. Again, this may be substituted with a more precise laser scanner such as triangulation-based scanners (e.g., FARO FreeStyle, Konica Minolta Range 7, etc.). However, the downside of this solution would be the acquisition of multiple laser scanners for such a project where the multi-scale aspect is important. The analysis in this paper also showed that certain metrics and statistics (e.g., standard deviation with regard to a reference) are sometimes required in order to better represent the quality of the results (case in point as seen in Figure 9).
This type of documentation project is very important for many stakeholders. Architects and conservators may find it useful in conservation efforts, while policy makers may use it to decide on necessary measures to preserve the site. The 3D data may also be used for other, more sophisticated purposes. For example, the creation of a heritage building information model (HBIM) will be very useful in the management of the heritage site. Virtual reality solutions also enable a democratization of 3D technology and the diffusion of historical information to the public. Finally, with the existing 3D documentation acting as digital archives, a physical reconstruction in the case of damage can be performed via, for example, 3D printing technology. Further work will focus on several points, including formalization of a standard procedure for heritage mapping for tropical vernacular cases, elaboration of VR systems to include semantic information, and the automation of some of the time-consuming steps during the data processing.

Figure 1. The area of Siti Inggil within the palace compound, which was documented in higher detail. To the left, an aerial view of the area with several main structures overlaid. To the right, the Royal Pavilion of the Siti Inggil compound. Note the characteristic tropical architecture in the form of double-tiered roofs and the absence of walls.

Figure 2. An excerpt of the architectural drawings for the Central Pavilion of Siti Inggil commissioned by the Tourism and Cultural Service of the Province of West Java. Note the two-dimensional nature of the drawings, which show only measurements for one facade, while assuming that all of the object's sides are symmetrical.

Figure 3. Example of a section of a panoramic image captured by the TLS. The figure to the left shows the RGB image, with strong overexposure due to sunray intensity. The figure to the right shows the intensity image, which was preferred in the tie point identification process.

Figure 4. Images of the Step 1 scale level oriented in PhotoScan.

Figure 5. Example of the result of the multi-scale approach. In Step 1, the whole Kasepuhan complex and its surroundings are modeled by aerial photogrammetry. Step 2 shows the Siti Inggil area within the palace compounds; here is shown the registered TLS point cloud. Step 3 shows a building, the Royal Pavilion, within Siti Inggil, which was modeled using close-range photogrammetry. Finally, Step 4 shows an architectural detail, in this case a column's plinth, also modeled using close-range photogrammetry.

Figure 6. Results for the Siti Inggil area in Step 2. (a) shows the TLS point cloud. (b) shows the absolute distance between points from the TLS and UAV point clouds, with an average distance of 2.3 cm and a standard deviation value of 1.8 cm. (c) shows a similar analysis between TLS and terrestrial photogrammetry, yielding an average distance of 2.0 cm and a standard deviation of 1.6 cm.

Figure 7. Results for the Central Pavilion in Step 3; (a) shows a composite of TLS and close-range photogrammetry point clouds. In (b), the absolute distance between points from each point cloud is shown with an average distance of 7 mm and a standard deviation value of 5 mm.

Figure 8. Results for the eastern upper plinth of the Royal Pavilion; the 3D mesh (a) and textured (b) models generated by photogrammetry were used as a reference against the TLS point cloud (c). In (d), the cloud-to-mesh distance analysis yielded a mean error of 1.5 mm and a standard deviation value of 2 mm.

Figure 9. Results for the woodwork on the Central Pavilion's ceilings; the 3D mesh (a) and textured (b) models were used as a reference against the TLS point cloud (c). The cloud-to-mesh analysis (d) yielded a mean error of 0.4 mm with a standard deviation of 5.6 mm.

Figure 10. Orthophoto map of Kasepuhan and its surroundings, produced from aerial UAV images. This type of product is useful to assess the overall condition of the site.

Figure 11. 3D models of the Royal Pavilion; (a) the combined point cloud from TLS and photogrammetry, (b) 3D meshed model of the combined point cloud, and (c) photorealistic textured 3D mesh model. Note the homogenization of textures in (c).

Figure 12. 3D models of the Central Pavilion in the BIM environment in (a) wireframe and (b) solid models.

Table 1. The different scale steps in this project, with information on the methods used, sensors employed, and approximate GSD (for photogrammetry) and resolution (for TLS) for each step.

Table 2. Comparison of some dimensions of the Central Pavilion of Siti Inggil between the resulting 3D model and the architectural drawings. Shown here are comparisons for the pavilion's platforms' sides and pillar heights.