Search Results (7)

Search Parameters:
Keywords = augmented orthophoto

27 pages, 12866 KiB  
Article
Multimodal Augmented Reality System for Real-Time Roof Type Recognition and Visualization on Mobile Devices
by Bartosz Kubicki, Artur Janowski and Adam Inglot
Appl. Sci. 2025, 15(3), 1330; https://doi.org/10.3390/app15031330 - 27 Jan 2025
Cited by 1 | Viewed by 1254
Abstract
The utilization of augmented reality (AR) is becoming increasingly prevalent in the integration of virtual reality (VR) elements into the tangible reality of the physical world. It facilitates a more straightforward comprehension of the interconnections, interdependencies, and spatial context of data, and it eases the presentation of analyses and the combination of spatial data with annotated data. This is particularly evident in the context of mobile applications, where the combination of real-world and virtual imagery enhances visualization. This paper presents a proposal for the development of a multimodal system capable of identifying roof types in real time and visualizing them in AR on mobile devices. The current approach to roof identification is based on data made available by public administrations in an open-source format, including orthophotos and building contours. Existing computer processing technologies have been employed to generate objects representing the shapes of building masses, and in particular the shapes of roofs, in three-dimensional (3D) space. The system integrates real-time data obtained from multiple sources and is based on a mobile application that enables precise positioning and detection of the recipient’s viewing direction (pose estimation) in real time. The data were integrated and processed in a Docker container system, which ensured the scalability and security of the solution. The multimodality of the system is designed to enhance the user’s perception of the space and facilitate a more nuanced interpretation of its intricacies. In its present iteration, the system facilitates the extraction and classification/generalization of two categories of roof types (gable and other) from aerial imagery through deep learning methodologies. The outcomes achieved suggest considerable promise for the advancement and deployment of the system in domains pertaining to architecture, urban planning, and civil engineering.
(This article belongs to the Special Issue Applications of Data Science and Artificial Intelligence)
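As a rough sketch of the classification step described above, the snippet below shows how a two-class roof classifier (gable vs. other) could be applied to building crops cut from an orthophoto. The backbone, preprocessing, and function names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: two-class roof-type classifier on orthophoto crops.
# The pretrained backbone and normalization constants are assumptions.
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

CLASSES = ["gable", "other"]  # the two categories the system distinguishes

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))  # 2-class head
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def classify_roof(crop_path: str) -> str:
    """Classify one building crop cut from an orthophoto tile."""
    x = preprocess(Image.open(crop_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    return CLASSES[int(logits.argmax(dim=1))]
```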

27 pages, 31771 KiB  
Article
Enhancing Building Archaeology: Drawing, UAV Photogrammetry and Scan-to-BIM-to-VR Process of Ancient Roman Ruins
by Chiara Stanga, Fabrizio Banfi and Stefano Roascio
Drones 2023, 7(8), 521; https://doi.org/10.3390/drones7080521 - 9 Aug 2023
Cited by 30 | Viewed by 4548
Abstract
This research investigates the utilisation of the scan-to-HBIM-to-XR process and unmanned aerial vehicle (UAV) photogrammetry to improve the depiction of archaeological ruins, specifically focusing on the Claudius Anio Novus aqueduct in Tor Fiscale Park, Rome. UAV photogrammetry is vital in capturing detailed aerial imagery of the aqueduct and its surroundings. Drones with high-resolution cameras acquire precise and accurate data from multiple perspectives. Subsequently, the acquired data are processed to generate orthophotos, drawings and historic building information modelling (HBIM) of the aqueduct, contributing to the future development of a digital twin. Virtual and augmented reality (VR-AR) technology is then employed to create an immersive experience for users. By leveraging XR, individuals can virtually explore and interact with the aqueduct, providing realistic and captivating visualisation of the archaeological site. The successful application of the scan-to-HBIM-to-XR process and UAV photogrammetry demonstrates their potential to enhance the representation of building archaeology. This approach contributes to the conservation of cultural heritage, enables educational and tourism opportunities and fosters novel research avenues for the comprehension and experience of ancient structures.
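One generic scan-to-model step in such a pipeline, meshing a photogrammetric point cloud so it can feed an HBIM or XR environment, can be sketched as below. This assumes Open3D and placeholder file names; the authors' actual toolchain is not specified at this level of detail.

```python
# Minimal sketch: point cloud -> mesh, a typical scan-to-model step.
# File names are placeholders, not the project's data.
import open3d as o3d

pcd = o3d.io.read_point_cloud("aqueduct_section.ply")  # dense UAV-photogrammetry cloud
pcd.estimate_normals()  # Poisson reconstruction needs oriented normals

# depth trades geometric detail against noise sensitivity
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)
o3d.io.write_triangle_mesh("aqueduct_section_mesh.obj", mesh)
```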

21 pages, 4605 KiB  
Article
Photogrammetric Co-Processing of Thermal Infrared Images and RGB Images
by Adam Dlesk, Karel Vach and Karel Pavelka
Sensors 2022, 22(4), 1655; https://doi.org/10.3390/s22041655 - 20 Feb 2022
Cited by 26 | Viewed by 4403
Abstract
In some applications of thermography, spatial orientation of the thermal infrared information can be desirable. Through photogrammetric processing of thermal infrared (TIR) images, it is possible to create 2D and 3D results augmented by thermal infrared information, on which thermal occurrences can be located in a coordinate system and their scale, length, area or volume determined. However, photogrammetric processing of TIR images is complicated by factors inherent to the character of TIR imagery, among them the lower resolution of TIR images compared to RGB images and the lack of distinct visible features in TIR images. To mitigate these factors, two methods of photogrammetric co-processing of TIR and RGB images were designed. Both methods require a fixed rig of TIR and RGB cameras, and for each TIR image a corresponding RGB image must be captured. The first method, termed sharpening, mainly yields an augmented orthophoto and an augmented texture of the 3D model. The second method, termed reprojection, yields a point cloud augmented by thermal infrared information. The details of the designed methods, as well as the experiments related to them, are presented in this article.
(This article belongs to the Section Intelligent Sensors)
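The reprojection method lends itself to a compact illustration: with a fixed, calibrated TIR/RGB rig, 3D points reconstructed from the RGB images are projected into the TIR frame and the sampled thermal values are attached to each point. The sketch below is a simplified reading of that idea; all calibration values are placeholders and occlusion handling is omitted.

```python
# Hedged sketch of the reprojection idea: project RGB-derived 3D points
# into the TIR image and sample per-point thermal values.
# All calibration parameters below are placeholders.
import numpy as np
import cv2

K_tir = np.array([[520.0, 0.0, 320.0],   # TIR intrinsics (placeholder)
                  [0.0, 520.0, 256.0],
                  [0.0, 0.0, 1.0]])
dist_tir = np.zeros(5)                   # assume negligible distortion
rvec = np.zeros(3)                       # TIR-vs-RGB rig rotation (placeholder)
tvec = np.array([0.05, 0.0, 0.0])        # 5 cm rig baseline (placeholder)

def augment_points(points_xyz: np.ndarray, tir_image: np.ndarray) -> np.ndarray:
    """Return per-point TIR values for points that fall inside the TIR frame."""
    px, _ = cv2.projectPoints(points_xyz.astype(np.float64),
                              rvec, tvec, K_tir, dist_tir)
    px = px.reshape(-1, 2).round().astype(int)
    h, w = tir_image.shape[:2]
    inside = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
    values = np.full(len(points_xyz), np.nan)  # NaN = not visible in TIR frame
    values[inside] = tir_image[px[inside, 1], px[inside, 0]]
    return values
```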

25 pages, 134662 KiB  
Article
Detection of Windthrown Tree Stems on UAV-Orthomosaics Using U-Net Convolutional Networks
by Stefan Reder, Jan-Peter Mund, Nicole Albert, Lilli Waßermann and Luis Miranda
Remote Sens. 2022, 14(1), 75; https://doi.org/10.3390/rs14010075 - 24 Dec 2021
Cited by 12 | Viewed by 3747
Abstract
The increasing number of severe storm events is threatening European forests. Besides the primary damage directly caused by storms, there is secondary damage such as bark beetle outbreaks and tertiary damage due to negative effects on the market. This subsequent damage can be minimized if a detailed overview of the affected area and the amount of damaged wood can be obtained quickly and included in the planning of clearance measures. The present work utilizes UAV orthophotos and an adaptation of the U-Net architecture for the semantic segmentation and localization of windthrown stems. The network was pre-trained with generic datasets, randomly combining stems and background samples in a copy–paste augmentation, and afterwards trained with a specific dataset of a particular windthrow. The models pre-trained with generic datasets containing 10, 50 and 100 augmentations per annotated windthrown stem achieved F1-scores of 73.9% (S1Mod10), 74.3% (S1Mod50) and 75.6% (S1Mod100), outperforming the baseline model (F1-score 72.6%), which was not pre-trained. These results emphasize the applicability of the method to correctly identify windthrown trees and suggest collecting training samples from other tree species and windthrow areas to improve the ability to generalize. Further enhancements of the network architecture are considered to improve the classification performance and to reduce the computational cost.
(This article belongs to the Special Issue Pattern Analysis in Remote Sensing)
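The copy–paste pre-training augmentation reads naturally as a pasting routine: masked stem crops are placed at random positions on background tiles, producing synthetic images and matching segmentation labels. The sketch below assumes in-memory NumPy arrays; shapes and sampling are illustrative, not the paper's code.

```python
# Illustrative copy-paste augmentation: paste a masked stem crop onto a
# background tile and derive the corresponding segmentation label.
import numpy as np

rng = np.random.default_rng(42)

def paste_stem(background: np.ndarray, stem_rgb: np.ndarray,
               stem_mask: np.ndarray):
    """Return one augmented image and its binary stem label."""
    tile = background.copy()
    label = np.zeros(tile.shape[:2], dtype=np.uint8)
    h, w = stem_mask.shape
    y = rng.integers(0, tile.shape[0] - h)   # random paste position
    x = rng.integers(0, tile.shape[1] - w)
    region = tile[y:y + h, x:x + w]
    region[stem_mask > 0] = stem_rgb[stem_mask > 0]   # copy masked pixels only
    label[y:y + h, x:x + w] = (stem_mask > 0).astype(np.uint8)
    return tile, label
```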

42 pages, 132720 KiB  
Article
Large-Scale Reality Modeling of a University Campus Using Combined UAV and Terrestrial Photogrammetry for Historical Preservation and Practical Use
by Bryce E. Berrett, Cory A. Vernon, Haley Beckstrand, Madi Pollei, Kaleb Markert, Kevin W. Franke and John D. Hedengren
Drones 2021, 5(4), 136; https://doi.org/10.3390/drones5040136 - 17 Nov 2021
Cited by 24 | Viewed by 9240
Abstract
Unmanned aerial vehicles (UAV) enable detailed historical preservation of large-scale infrastructure and contribute to cultural heritage preservation, improved maintenance, public relations, and development planning. Aerial and terrestrial photo data coupled with high-accuracy GPS create hyper-realistic mesh and texture models, high-resolution point clouds, orthophotos, and digital elevation models (DEMs) that preserve a snapshot of history. A case study is presented of the development of a hyper-realistic 3D model that spans the complex 1.7 km² area of the Brigham Young University campus in Provo, Utah, USA and includes over 75 significant structures. The model leverages photos obtained during a rare, mandatory campus closure amid the historic COVID-19 pandemic, and details a large-scale modeling workflow with best-practice data acquisition and processing techniques. The model utilizes 80,384 images and high-accuracy GPS survey points to create a 1.65 trillion-pixel textured structure-from-motion (SfM) model with an average ground sampling distance (GSD) near structures of 0.5 cm and a maximum of 4 cm. Thirty-one separate model segments, taken from data gathered between April and August 2020, are combined into one cohesive final model with an average absolute error of 3.3 cm and a full-model absolute error of <1 cm (relative accuracies from 0.25 cm to 1.03 cm). Optimized and automated UAV techniques complement the data acquisition of the large-scale model, and opportunities are explored to archive as-is building and campus information to enable historical building preservation, facility maintenance, campus planning, public outreach, 3D-printed miniatures, and the possibility of education through virtual reality (VR) and augmented reality (AR) tours.
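The ground sampling distances quoted above follow from the standard relation GSD = (sensor width × height above ground) / (focal length × image width). A quick sketch with illustrative camera parameters (not the survey's actual hardware):

```python
# Back-of-the-envelope GSD check; the camera numbers are illustrative.
def gsd_cm(sensor_width_mm: float, focal_length_mm: float,
           flight_height_m: float, image_width_px: int) -> float:
    """Ground sampling distance in cm/pixel."""
    return (sensor_width_mm * flight_height_m * 100.0) / (
        focal_length_mm * image_width_px)

# e.g. a 13.2 mm wide sensor, 8.8 mm lens, 5456 px wide image at 25 m:
print(round(gsd_cm(13.2, 8.8, 25.0, 5456), 2), "cm/px")  # ~0.69 cm/px
```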

36 pages, 22784 KiB  
Article
Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery
by Stefan Bachhofner, Ana-Maria Loghin, Johannes Otepka, Norbert Pfeifer, Michael Hornacek, Andrea Siposova, Niklas Schmidinger, Kurt Hornik, Nikolaus Schiller, Olaf Kähler and Ronald Hochreiter
Remote Sens. 2020, 12(8), 1289; https://doi.org/10.3390/rs12081289 - 18 Apr 2020
Cited by 12 | Viewed by 6768
Abstract
We studied the applicability of point clouds derived from tri-stereo satellite imagery for semantic segmentation with generalized sparse convolutional neural networks, using an Austrian study area as an example. We examined, in particular, whether the distorted geometric information, in addition to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. To this end, we trained a fully convolutional neural network that uses generalized sparse convolution once solely on 3D geometric information (i.e., a 3D point cloud derived by dense image matching), and twice on 3D geometric as well as color information; in the first of those two experiments we did not use class weights, whereas in the second we did. We compared the results with a fully convolutional neural network trained on a 2D orthophoto, and a decision tree trained once on hand-crafted 3D geometric features and once on hand-crafted 3D geometric as well as color features. The decision tree using hand-crafted features has been successfully applied to aerial laser scanning data in the literature. Hence, we compared our main interest of study, a representation learning technique, with another representation learning technique and a non-representation learning technique. Our study area is located in Waldviertel, a region in Lower Austria; the territory is hilly and covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily unbalanced; however, we did not use any data augmentation techniques to counter overfitting. For our study area, we report that adding color to the geometric information only improves the performance of the Generalized Sparse Convolutional Neural Network (GSCNN) on the dominant class, which leads to a higher overall performance in our case. We also found that training the network with median class weighting partially reverts the effects of adding color, and the network also started to learn the classes with lower occurrences. The fully convolutional neural network trained on the 2D orthophoto generally outperforms the other two, with a kappa score of over 90% and an average per-class accuracy of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2% higher accuracy for roads.
(This article belongs to the Special Issue Machine and Deep Learning for Earth Observation Data Analysis)
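Median class weighting, mentioned above, weights each class by the median class frequency divided by that class's frequency, so under-represented classes such as vehicles receive weights well above 1. The per-class point counts in the sketch below are invented for illustration.

```python
# Median-frequency class weighting; the per-class counts are invented.
import numpy as np

classes = ["clutter", "roads", "buildings", "trees", "vehicles"]
point_counts = np.array([1_200_000, 300_000, 450_000, 2_600_000, 40_000])

freq = point_counts / point_counts.sum()   # per-class frequency
weights = np.median(freq) / freq           # rare classes get larger weights
for name, w in zip(classes, weights):
    print(f"{name:10s} weight = {w:.2f}")
```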

18 pages, 3156 KiB  
Article
Estimating Maize-Leaf Coverage in Field Conditions by Applying a Machine Learning Algorithm to UAV Remote Sensing Images
by Chengquan Zhou, Hongbao Ye, Zhifu Xu, Jun Hu, Xiaoyan Shi, Shan Hua, Jibo Yue and Guijun Yang
Appl. Sci. 2019, 9(11), 2389; https://doi.org/10.3390/app9112389 - 11 Jun 2019
Cited by 21 | Viewed by 4180
Abstract
Leaf coverage is an indicator of plant growth rate and predicted yield, and thus it is crucial to plant-breeding research. Robust image segmentation of leaf coverage from remote-sensing images acquired by unmanned aerial vehicles (UAVs) in varying environments can be directly used for large-scale coverage estimation, and is a key component of high-throughput field phenotyping. We thus propose an image-segmentation method based on machine learning to extract relatively accurate coverage information from the orthophoto generated after preprocessing. The image analysis pipeline, including dataset augmentation, background removal, classifier training and noise reduction, generates a set of binary masks to obtain leaf coverage from the image. We compare the proposed method with three conventional methods (Hue-Saturation-Value thresholding, an edge-detection-based algorithm, and random forest) and a state-of-the-art deep-learning method, DeepLabv3+. The proposed method improves indicators such as Qseg, Sr, Es and mIOU by 15% to 30%. The experimental results show that this approach is less limited by radiation conditions, and that the protocol can easily be implemented for extensive sampling at low cost. As a result, with the proposed method, we recommend using red-green-blue (RGB)-based technology in addition to conventional equipment for acquiring the leaf coverage of agricultural crops.
(This article belongs to the Section Optics and Lasers)
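For a feel of what the simpler color-index baselines in such comparisons look like, the sketch below thresholds an excess-green index (a common RGB vegetation index, used here as a stand-in rather than the paper's exact HSV baseline) and reports coverage as the foreground fraction after light morphological noise reduction. The threshold and file name are assumptions.

```python
# Simple color-index baseline (excess green), not the paper's method:
# threshold a vegetation index, denoise, and report leaf coverage.
import numpy as np
import cv2

img = cv2.imread("maize_orthophoto_tile.png").astype(np.float32)
b, g, r = cv2.split(img)                 # OpenCV loads images as BGR
exg = 2 * g - r - b                      # excess-green vegetation index
mask = (exg > 20.0).astype(np.uint8)     # crude global threshold (assumption)

kernel = np.ones((3, 3), np.uint8)       # light morphological noise reduction
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

coverage = mask.mean()                   # fraction of leaf pixels
print(f"leaf coverage: {coverage:.1%}")
```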
