
Search Results (698)

Search Parameters:
Keywords = building point extraction

21 pages, 1405 KB  
Article
Bionic Corner Detection Based on Cooperative Processing of Simple Cells and End-Stopped Cells
by Shuo Sun and Haiyang Yu
Algorithms 2026, 19(5), 343; https://doi.org/10.3390/a19050343 - 30 Apr 2026
Abstract
Corner detection is a fundamental task in computer vision that plays a critical role in applications such as image registration, 3D reconstruction, and object tracking. In biological visual systems, simple cells in the primary visual cortex exhibit high selectivity to edge stimuli of specific orientations, while end-stopped cells can detect geometric singular structures such as line segment endpoints and corners. Existing corner detection methods based on visual neural computation typically employ a strategy of densely distributed end-stopped cells for corner localization, which suffers from significant localization deviation under small-angle conditions due to mutual interference between responses of adjacent neurons. To address this problem, this paper proposes a bionic corner detection method based on cooperative processing of simple cells and end-stopped cells. The method constructs a three-stage cooperative processing framework: the edge filtering stage employs a Gabor filter bank to simulate the orientation selectivity of simple cells, extracting edge positions and orientation information; the dynamic construction stage builds unilateral end-stopped cells only at filtered edge positions based on local orientation information, fundamentally avoiding the computational redundancy and response interference caused by global dense distribution; and the corner localization stage determines precise corner coordinates through hierarchical clustering and dual-cluster centroid fusion strategies. Experimental results demonstrate that, in the 15° acute-angle regime where dense end-stopped schemes are most severely affected by response interference, the proposed method reduces the mean localization error from 8.76 to 2.34 pixels, a 73.3% improvement; averaged across the eight tested angle levels from 15° to 165°, the improvement is approximately 40.9%, and all per-angle differences are statistically significant (paired t-test, p < 0.01 or lower, N = 10 independent runs). On standard test images, the method attains the lowest mean localization error among the eight compared detectors (1.58 pixels, versus 1.68–3.42 pixels for Harris, FAST, COSFIRE, KAZE, SuperPoint, Deep Corner, and Wei et al.), while maintaining a competitive detection rate, false-alarm rate, and runtime. Physiological plausibility validation experiments show that the correlation coefficient between the detection deviation of this method and human perceptual deviation reaches 0.923, indicating that the framework's output aligns with previously reported human perceptual bias patterns and supporting its plausibility as a biologically inspired, rather than mechanistic, model of corner perception. The source code, dataset, and experimental results are publicly available (see Data Availability Statement).
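The improvement figures quoted in this abstract are plain relative error reductions; a minimal sketch of the arithmetic (the function name is ours, not the paper's):

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative error reduction, expressed as a percentage of the baseline."""
    return 100.0 * (baseline - improved) / baseline

# Mean localization error at the 15-degree acute angle (pixels),
# using the values quoted in the abstract above.
print(round(relative_improvement(8.76, 2.34), 1))  # 73.3
```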

34 pages, 21194 KB  
Article
Deep Learning-Based Semantic Segmentation of Airborne LiDAR Point Clouds Using a Transformer-Enhanced PointNet++ Architecture
by Hacer Kubra Sevinc and Ismail Rakip Karas
Geomatics 2026, 6(3), 43; https://doi.org/10.3390/geomatics6030043 - 29 Apr 2026
Abstract
Airborne LiDAR (Light Detection and Ranging) data are widely used in urban modelling and three-dimensional spatial analysis. However, the irregular structure of LiDAR point clouds, varying point densities, and class imbalances observed in the datasets make semantic segmentation problematic. This study addresses the four-class semantic segmentation problem (unclassified, vegetation, ground, and building) on aerial LiDAR point clouds, with a particular focus on multi-class segmentation. The Oregon LiDAR Program dataset was obtained through the OpenTopography platform for use in this study. The point cloud data were resampled to 4096 points to ensure a fixed input size; for each point, the X, Y, and Z coordinates, along with the RGB and intensity features, were utilized. Experimental studies compared the proposed method with both baseline models (PointNet, PointNet++ MSG, and VoxelNet Lite) and recent state-of-the-art architectures, including Point Transformer, KPConv, and RandLA-Net. Additionally, the PointNet2 MSG Transformer model was developed based on the PointNet++ MSG architecture and includes a transformer-based feature fusion module. Different loss functions and training configurations were evaluated, and the effects of ensemble learning and test-time augmentation strategies on model performance were analyzed. The experimental results show that the proposed approach achieved a mean Intersection over Union (IoU) of 51.74% and an accuracy of 61.50% on the test dataset. These results demonstrate that combining multi-scale feature extraction with transformer-based feature fusion is an effective approach for semantic segmentation of LiDAR point clouds and multi-class segmentation tasks.
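The mean IoU metric reported in this abstract can be computed from per-point labels with simple confusion counts; a minimal sketch with a toy four-class example (labels and data are illustrative, not from the paper's dataset):

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean Intersection over Union from per-point class labels.

    IoU_c = TP_c / (TP_c + FP_c + FN_c); classes absent from both
    the prediction and the ground truth are skipped.
    """
    ious = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)

# Toy example: 0=unclassified, 1=vegetation, 2=ground, 3=building
truth = [0, 1, 2, 3, 3, 2]
pred  = [0, 1, 2, 3, 2, 2]
print(round(mean_iou(truth, pred, 4), 3))  # 0.792
```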

25 pages, 3306 KB  
Article
Unsupervised Driving Behavior Primitive Inference via Hierarchical Segmentation and Context-Aware Clustering
by Lu Zhang, Tao Li, Xuelian Zheng, Wenyu Kang and Yuhan Fu
Sensors 2026, 26(9), 2744; https://doi.org/10.3390/s26092744 - 29 Apr 2026
Abstract
Driving behavior primitives serve as fundamental building blocks for modeling and semantically interpreting time-series driving behavior. Extracting behavior primitives is challenging due to the high dimensionality and complex interdependencies among behavioral variables, as well as the rich temporal dynamics of real-world driving maneuvers. This paper proposes an unsupervised two-stage framework that optimizes time-series segmentation and segment clustering to yield interpretable and context-aware behavior primitives. First, a Hierarchical Bayesian Model-based Agglomerative Sequence Segmentation (H-BMASS) method is introduced that decouples longitudinal and lateral driving behaviors and performs hierarchical segmentation. This design mitigates under-segmentation by ensuring that change points reflect genuine behavioral transitions. Second, to cluster driving segments of varying durations into a finite set of primitive types, an Integrating Numerical and Trend Discretization Latent Dirichlet Allocation (INT-LDA) model is developed. The model combines variables’ temporal trend discretization with numerical discretization to create symbolic representations of driving data, thereby preserving the essential time dependency of driving behavior and improving segment clustering accuracy. Evaluated on naturalistic driving data collected from a high-fidelity simulator, the proposed framework identifies five distinct behavior primitives with clear physical interpretations. The resulting primitives provide a compact, semantically rich representation of driving behavior, facilitating driver modeling, decision prediction, and scenario-based testing for autonomous vehicles.
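The general idea of turning a numeric driving signal into symbols by combining trend discretization with numerical (level) discretization can be sketched roughly as follows; the thresholds, bin edges, and function names are our illustrative assumptions, not the paper's actual INT-LDA scheme:

```python
def trend_symbol(prev, curr, eps=0.05):
    """Direction of change: 'U'p, 'D'own, or 'S'teady within eps."""
    d = curr - prev
    return "U" if d > eps else "D" if d < -eps else "S"

def level_symbol(x, edges=(0.0, 10.0, 20.0)):
    """Numerical bin index: the number of bin edges the value exceeds."""
    return sum(x > e for e in edges)

def symbolize(series):
    """Pair each step with a (level-bin, trend) word, e.g. '2U'."""
    return [f"{level_symbol(c)}{trend_symbol(p, c)}"
            for p, c in zip(series, series[1:])]

# Toy speed trace (m/s): accelerate, hold, decelerate
speed = [12.0, 13.5, 13.5, 11.0]
print(symbolize(speed))  # ['2U', '2S', '2D']
```

Symbol sequences of this shape are the kind of discrete "document" a topic model such as LDA can then cluster into primitive types.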
(This article belongs to the Section Vehicular Sensing)

28 pages, 33079 KB  
Article
Pedestrian Localization Using Smartphone LiDAR in Indoor Environments
by Kwangjae Sung and Jaehun Kim
Electronics 2026, 15(9), 1810; https://doi.org/10.3390/electronics15091810 - 24 Apr 2026
Abstract
Many place recognition approaches, which identify previously visited places or locations by matching current sensory data, such as 2D RGB images and 3D point clouds, have been proposed to achieve accurate and robust localization and loop closure detection in global positioning system (GPS)-denied environments. Since visual place recognition (VPR) methods that rely on images captured by camera sensors are highly sensitive to variations in appearance, including changes in lighting, surface color, and shadows, they can lead to poor place recognition accuracy. In contrast, light detection and ranging (LiDAR)-based place recognition (LPR) approaches based on 3D point cloud data that captures the shape and geometric structure of the environment are robust to changes in place appearance and can therefore provide more reliable place recognition results than VPR methods. This work presents an indoor LPR method called PointNetVLAD-based indoor pedestrian localization (PIPL). PIPL is a deep network model that uses PointNetVLAD to learn to extract global descriptors from 3D LiDAR point cloud data. PIPL can recognize places previously visited by a pedestrian using point clouds captured by a low-cost LiDAR sensor on a smartphone in small-scale indoor environments, whereas PointNetVLAD performs place recognition for vehicles using high-cost LiDAR, GPS, and inertial measurement unit (IMU) sensors in large-scale outdoor areas. For place recognition on 3D point cloud reference maps generated from LiDAR scans, PointNetVLAD exploits the Universal Transverse Mercator (UTM) coordinate system based on GPS and IMU measurements, whereas PIPL uses a virtual coordinate system designed in this study due to the unavailability of GPS indoors. In experiments conducted in campus buildings, PIPL shows significant advantages over NetVLAD, a convolutional neural network (CNN)-based VPR method. Particularly in indoor environments with repetitive scenes where geometric structures are preserved and image-based appearance features are sparse or unclear, PIPL achieved 39% higher top-1 accuracy and 10% higher top-3 accuracy compared to NetVLAD. Furthermore, PIPL achieved place recognition accuracy comparable to NetVLAD even with a small number of points in a 3D point cloud and outperformed NetVLAD even with a smaller model training dataset. The experimental results also indicate that PIPL requires over 76% less place retrieval time than NetVLAD while maintaining robust place classification performance.
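Top-k place recognition accuracy of the kind reported in this abstract reduces to nearest-neighbor search over global descriptors; a minimal sketch, in which toy 2-D descriptors stand in for PointNetVLAD's high-dimensional outputs:

```python
def top_k_hit(query_desc, db_descs, db_places, true_place, k):
    """True if the correct place appears among the k nearest
    database descriptors (squared Euclidean distance)."""
    dist2 = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    ranked = sorted(range(len(db_descs)),
                    key=lambda i: dist2(query_desc, db_descs[i]))
    return true_place in {db_places[i] for i in ranked[:k]}

# Toy database of descriptors and their place labels
db = [(0.0, 1.0), (1.0, 0.0), (0.9, 0.1)]
places = ["lobby", "corridor", "corridor"]
print(top_k_hit((0.8, 0.2), db, places, "corridor", 1))  # True
```

Top-1 and top-3 accuracy are then just the fraction of queries for which `top_k_hit` is true at k = 1 and k = 3.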
(This article belongs to the Special Issue Advanced Indoor Localization Technologies: From Theory to Application)

18 pages, 1019 KB  
Article
Pose-Driven Cow Behavior Recognition in Complex Barn Environments: A Method Combining Knowledge Distillation and Deployment Optimization
by Jie Hu, Xuan Li, Ruyue Ren, Shujie Wang, Mingkai Yang, Jianing Zhao, Juan Liu and Fuzhong Li
Animals 2026, 16(9), 1301; https://doi.org/10.3390/ani16091301 - 23 Apr 2026
Abstract
Cattle behavior constitutes important phenotypic information reflecting animals’ health status, activity level, and welfare condition, and is therefore of considerable significance for automated monitoring and precision management in smart livestock farming. However, under complex barn conditions, cattle behavior recognition is easily affected by factors such as illumination variation, partial occlusion, background interference, and individual differences, thereby reducing recognition stability and generalization capability. To address these challenges, this study proposes a pose-driven method for cattle behavior recognition in complex barn environments. First, a 16-keypoint annotation scheme suitable for describing bovine posture, termed cow16, was constructed. Based on this scheme, OpenPose was employed to extract heatmaps (HMs) and part affinity fields (PAFs), which were then used to build an intermediate HM/PAF posture representation. Subsequently, this representation was taken as the input to a lightweight convolutional neural network for classifying three behavioral categories: stand, walk, and lie. On this basis, class-imbalance correction during training and a multi-random-seed logits ensemble strategy during inference were further introduced. In addition, knowledge distillation was adopted to transfer knowledge from a high-performance teacher model to a lightweight student model. Experimental results demonstrate that training-stage class-imbalance correction and inference-stage multi-random-seed logits ensembling exhibit strong complementarity; when combined, the AB configuration improves the test-set Macro-F1 by 3.83 percentage points. Moreover, the distilled student model still achieves competitive recognition performance while maintaining 1× inference cost, indicating a favorable trade-off between accuracy and efficiency. This study provides a useful reference for deployment-oriented cattle behavior recognition in smart farming scenarios and offers a lightweight technical basis for subsequent practical applications.
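Macro-F1, the metric behind the 3.83-point gain reported above, is the unweighted mean of per-class F1 scores; a minimal sketch with toy behavior labels:

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels for the three behavior classes
truth = ["stand", "walk", "lie", "lie", "walk"]
pred  = ["stand", "walk", "lie", "walk", "walk"]
print(round(macro_f1(truth, pred, ["stand", "walk", "lie"]), 3))  # 0.822
```

Because each class contributes equally regardless of its frequency, Macro-F1 is the natural metric when class-imbalance correction is the intervention being evaluated.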
(This article belongs to the Section Cattle)
14 pages, 3380 KB  
Proceeding Paper
A Rapid Stress Retrieval Approach for Long-Fiber Angle-Ply Laminates Using the RBF Kansa Method
by Andrea Chiappa and Corrado Groth
Eng. Proc. 2026, 131(1), 34; https://doi.org/10.3390/engproc2026131034 - 22 Apr 2026
Abstract
Building on a previous work presented by the authors, this study extends a fast stress retrieval method to long-fiber angle-ply laminates subjected to constant bending and torque moments. The fiber/matrix interface stress state is efficiently estimated using global deformation data obtained from a finite element analysis performed on a coarse model, potentially employing a homogenized material. Radial basis functions (RBFs) are utilized to bridge the macroscale and microscale, enabling the extraction of appropriate boundary conditions at the representative volume element (RVE) level. A collocation-based Kansa method, also leveraging RBFs, is then applied to a carefully selected set of points to determine the local stress distribution. The accuracy of the proposed approach is assessed by comparing its results with high-fidelity FEM sub-modeling.

22 pages, 12161 KB  
Article
SV-LIO: A Probabilistic Adaptive Semantic Voxel Map for LiDAR–Inertial Odometry
by Lixiao Yang and Youbing Feng
Electronics 2026, 15(8), 1744; https://doi.org/10.3390/electronics15081744 - 20 Apr 2026
Abstract
Accurate and real-time localization is a fundamental prerequisite for the autonomous navigation of mobile robots. LiDAR–Inertial Odometry (LIO) achieves high-precision state estimation and scene reconstruction in unknown environments by effectively fusing data from LiDAR and Inertial Measurement Units (IMU). However, conventional LIO methods typically rely solely on geometric features during point cloud registration. In complex scenarios, such as outdoor unstructured or dynamic environments, these methods are often susceptible to reduced localization accuracy due to geometric degeneration or mismatches. To address these challenges, we propose SV-LIO, a Probabilistic Adaptive Semantic Voxel Map for LiDAR–Inertial Odometry, which leverages point-wise semantic information from semantic segmentation to enhance registration accuracy and system robustness. Specifically, we construct a probabilistic adaptive semantic voxel map that extracts multi-scale spatial planes with attached semantic information. Building on this representation, we employ a semantic-guided strategy for nearest-neighbor plane association between LiDAR scans and the local map, and construct semantic-weighted point-to-plane residuals to constrain pose estimation. By jointly optimizing the IMU-propagated pose prior and semantic-guided LiDAR observation constraints, SV-LIO realizes high-precision real-time state estimation and semantic scene reconstruction. Extensive experiments on the KITTI dataset demonstrate that SV-LIO achieves significant improvements in localization accuracy compared to state-of-the-art (SOTA) LIO methods, while also constructing semantic maps capable of providing rich environmental information.
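A semantic-weighted point-to-plane residual of the kind described above can be sketched as a weighted signed distance; the scalar weighting used here is an illustrative assumption, not SV-LIO's actual formulation:

```python
def point_to_plane_residual(point, plane_point, normal, weight=1.0):
    """Signed distance of a point to a plane (defined by a point on it
    and its unit normal), scaled by a semantic consistency weight
    (hypothetical weighting scheme)."""
    dot = sum(n * (p - q) for p, q, n in zip(point, plane_point, normal))
    return weight * dot

# Unit-normal plane z = 0; a point 0.2 above it, down-weighted to 0.5
# because its semantic label only partially matches the plane's.
r = point_to_plane_residual((1.0, 2.0, 0.2), (0.0, 0.0, 0.0),
                            (0.0, 0.0, 1.0), weight=0.5)
print(round(r, 3))  # 0.1
```

Summing squared residuals of this form over all scan-to-map associations yields the LiDAR observation term that is jointly optimized with the IMU-propagated pose prior.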
(This article belongs to the Section Electrical and Autonomous Vehicles)

27 pages, 6579 KB  
Article
EF-YOLO: Detecting Small Targets in Early-Stage Agricultural Fires via UAV-Based Remote Sensing
by Jun Tao, Zhihan Wang, Jianqiu Wu, Yunqin Li, Tomohiro Fukuda and Jiaxin Zhang
Remote Sens. 2026, 18(8), 1119; https://doi.org/10.3390/rs18081119 - 9 Apr 2026
Abstract
Early detection of agricultural fires with Unmanned Aerial Vehicles (UAVs) is important for environmental safety, yet it remains difficult because ignition cues are extremely small, smoke patterns vary widely, and farmland scenes often contain strong background interference such as specular reflections. Model development is further constrained by the scarcity of data from the early ignition stage. To address these challenges, we propose a joint data and model optimization framework. We first build a hybrid dataset through an ROI-guided synthesis pipeline, in which latent diffusion models are used to insert high-fidelity, carefully screened fire samples into real farmland backgrounds. We then introduce EF-YOLO, a detector designed for high sensitivity to small targets. The network uses SPD-Conv to reduce feature loss during spatial downsampling and includes a high-resolution P2 head to improve the detection of minute objects. To reduce background clutter, a Dual-Path Frequency–Spatial Enhancement (DP-FSE) module serves as a lightweight statistical surrogate that extracts global contextual cues and local salient features in parallel, thereby suppressing high-frequency noise. Experimental results show that EF-YOLO achieves an APS of 40.2% on sub-pixel targets, exceeding the YOLOv8s baseline by 15.4 percentage points. With a recall of 88.7% and a real-time inference speed of 78 FPS, the proposed framework offers a strong balance between detection performance and efficiency, making it well suited for edge-deployed agricultural fire early-warning systems.
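SPD-Conv's defining step is a lossless space-to-depth rearrangement before convolution: spatial resolution is halved but every value survives as an extra channel. A minimal sketch of that rearrangement on a single-channel grid (the subsequent convolution is omitted):

```python
def space_to_depth(img, block=2):
    """Rearrange an (H, W) grid into block*block channels of shape
    (H/block, W/block), losing no values -- the core idea behind
    SPD-Conv's lossless downsampling."""
    h, w = len(img), len(img[0])
    return [[[img[r + dr][c + dc]
              for c in range(0, w, block)]
             for r in range(0, h, block)]
            for dr in range(block) for dc in range(block)]

grid = [[1, 2],
        [3, 4]]
print(space_to_depth(grid))  # [[[1]], [[2]], [[3]], [[4]]]
```

For small-target detection this matters because strided convolution and pooling discard exactly the fine-grained pixels that carry an ignition cue, whereas space-to-depth preserves them for later layers.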

32 pages, 1293 KB  
Article
Early Detection of Re-Identification Risk in Multi-Turn Dialogues via Entity-Aware Evidence Accumulation
by Yeongseop Lee, Seungun Park and Yunsik Son
Appl. Sci. 2026, 16(8), 3680; https://doi.org/10.3390/app16083680 - 9 Apr 2026
Abstract
In multi-turn conversational AI, individually innocuous personally identifiable information (PII) fragments disclosed across successive turns can accumulate into a re-identification risk that no single utterance reveals on its own. Existing PII detectors operate on isolated utterances and therefore cannot track this cross-turn evidence build-up. We propose a stateful middleware guardrail whose core design principle is speaker-attributed entity isolation: every extracted PII fragment is attributed to its originating conversational participant, and evidence is accumulated in entity-isolated subgraphs that prevent cross-entity contamination. The system signals re-identification onset t_pred at the earliest turn where combination-based rules grounded in the uniqueness literature are satisfied. On a 184-record template-synthetic evaluation corpus, the gated NER configuration leads on primary timeliness (OW@5 = 73.4%, MAE = 1.357 turns); the full system achieves OW@5 = 70.7% with MAE = 2.442 turns as an alternative operating mode for ambiguity-sensitive disclosure patterns. We further evaluate behavior on a 300-record mutation stress set, test RULE_B on the ABCD external corpus, and supplement RULE_A evaluation with both a proxy-labeled transfer analysis on PersonaChat and a manual annotation study on 151 Switchboard dialogues. The reported results should be interpreted as an initial empirical reference point rather than a sufficient endpoint for autonomous runtime enforcement.
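A combination-based uniqueness rule of the general kind described above can be illustrated by counting how many individuals in a reference population match the quasi-identifier fragments accumulated so far; this toy check is our illustration, not the paper's RULE_A or RULE_B:

```python
def combination_unique(fragments, population):
    """True once the accumulated quasi-identifier fragments match at
    most one individual in a reference population (an illustrative
    stand-in for a combination-based uniqueness rule)."""
    matches = [person for person in population
               if all(person.get(k) == v for k, v in fragments.items())]
    return len(matches) <= 1

# Hypothetical reference population
population = [
    {"zip": "10001", "age": 34, "job": "nurse"},
    {"zip": "10001", "age": 34, "job": "teacher"},
    {"zip": "10002", "age": 51, "job": "nurse"},
]

# Turn 1 discloses only a ZIP code: still ambiguous.
print(combination_unique({"zip": "10001"}, population))                  # False
# Turn 2 adds an occupation: the combination now pins one individual.
print(combination_unique({"zip": "10001", "job": "nurse"}, population))  # True
```

The guardrail's t_pred corresponds to the first turn at which such a check flips to true for some conversational participant.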
(This article belongs to the Special Issue Advances in Intelligent Systems—2nd edition)

40 pages, 3738 KB  
Article
Knowledge Evolution in the Mobile Industry via Embedding-Based Topic Growth and Typology Analysis
by Sungjin Jeon, Woojun Jung and Keuntae Cho
Systems 2026, 14(4), 415; https://doi.org/10.3390/systems14040415 - 9 Apr 2026
Abstract
The mobile industry has experienced long-run changes in its knowledge structure, including identifiable transition points observable through embedding-based semantic analysis. Using abstracts from 86,674 mobile industry publications published between 2005 and 2024, we embed documents with SPECTER2, build year-specific embedding distributions, and derive knowledge regimes by combining change-point detection with inter-year distribution distances. We then extract regime-specific topics via clustering and reconstruct topic lineages by aligning topic similarities to classify inheritance, differentiation, convergence, and disappearance. The analysis delineates three regimes spanning 2005 to 2012, 2013 to 2019, and 2020 to 2024, with pronounced transitions around 2012 to 2013 and 2019 to 2020. Regime 1 centers on foundational technologies such as wireless communication, power, sensors, and reliability. Regime 2 expands toward platforms, apps, and data analytics alongside cross-domain convergence. Regime 3 is characterized by strengthened 5G operations and data-driven services, together with the independent rise of policy, governance, and regulation topics. Transitions reflect recombination built on inherited knowledge rather than abrupt replacement, and post-transition topics display distinct growth typologies by network position and growth pattern. By integrating embedding-based change-point detection with topic lineage reconstruction, we provide a reproducible account of regime transitions and quantitative evidence to inform the timing of corporate R&D, standard and platform strategies, and policy and regulatory design.

25 pages, 6283 KB  
Article
Surface Defect Detection in Liquid Crystal Display Polariser Coating Manufacturing Based on an Enhanced YOLOv10-N Approach
by Jiayue Zhang, Shanhui Liu, Minghui Chen, Kezhan Zhang, Yinfeng Li, Ming Peng and Yeting Teng
Coatings 2026, 16(4), 451; https://doi.org/10.3390/coatings16040451 - 8 Apr 2026
Abstract
To address the issues of uneven grayscale distribution, weak defect features, and small target scales on the coating surface of LCD polarizers during manufacturing, an improved YOLOv10-N-based method is proposed for surface defect detection. First, a polarizer coating defect dataset is constructed based on the LCD polarizer coating process and the characteristics of coating defects. Adaptive median filtering is then employed for image denoising, while a particle-swarm-optimization-based improved histogram equalization method is adopted for image enhancement. Next, the Scale-aware Pyramid Pooling (SCPP) module is introduced into the C2f module of the backbone network to construct the C2f_SCPP feature extraction module, thereby improving the model’s ability to detect coating defects with different morphologies through multi-scale semantic feature fusion. In addition, rotation-equivariant convolution PreCM is incorporated into the SPPF module of the backbone network to build the SPPF_PreCM module, which effectively suppresses feature redundancy and scale conflicts while strengthening the representation of tiny defects. Finally, while retaining the original Distribution Focal Loss (DFL) branch of YOLOv10, WIoU is used to replace CIoU as the IoU loss term in bounding box regression, thereby improving localization accuracy and accelerating model convergence during training. Experimental results show that, compared with YOLOv10-N, the proposed method improves mAP@0.5 and mAP@0.5:0.95 by 1.8 and 2.8 percentage points, respectively, demonstrating its effectiveness for polarizer coating defect detection. However, its generalization capability under diverse production environments, varying illumination conditions, and complex noise scenarios still requires further investigation.
(This article belongs to the Section High-Energy Beam Surface Engineering and Coatings)

21 pages, 28338 KB  
Article
An Enhanced YOLOv8n-Based Approach for Pig Behavior Recognition
by Jianjun Guo, Yudian Xu, Lijun Lin, Beibei Zhang, Piao Zhou, Shangwen Luo, Yuhan Zhuo, Jingyu Ji, Zhijie Luo and Guangming Cheng
Computers 2026, 15(4), 230; https://doi.org/10.3390/computers15040230 - 8 Apr 2026
Abstract
Pig behavior statistics can reflect their health status. Conventional approaches depend on manual observation to derive behavioral information from video recordings, a process that demands substantial time and human effort. To overcome these limitations in indoor intensive farming environments, this study introduces an effective approach for recognizing pig behaviors, employing an enhanced YOLOv8n architecture. The approach utilizes advanced object detection algorithms to automatically identify pig behaviors, including stand, lie, eat, fight, and tail-bite, from overhead video footage of the enclosure. First, images of daily pig behaviors are collected using cameras to build a pig behavior dataset. To boost detection accuracy, the SE attention mechanism is embedded within the feature extraction backbone of the YOLOv8n network to enhance its representational capacity, strengthening the model’s ability to capture global contextual information and improving the expressiveness of extracted features. The GIoU loss function is employed during training to reduce computational cost and accelerate model convergence. Moreover, integrating Ghost convolution into the backbone significantly reduces both computational complexity and the total number of parameters. The experimental findings reveal that the optimized YOLOv8n model contains just 1.71 million parameters, marking a 42.93% reduction relative to the baseline model. Its floating-point operations total 5.0 billion, indicating a 38.27% decrease, while the mean average precision (mAP@50) reaches 96.8%, surpassing the original by 2.6 percentage points. Compared with other widely used YOLO-based object detection frameworks, the proposed approach achieves notably higher accuracy while requiring significantly lower computational resources and model complexity.
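GIoU, the training loss mentioned above, augments plain IoU with a penalty based on the smallest enclosing box, so non-overlapping boxes still receive a useful gradient; a minimal sketch for axis-aligned boxes:

```python
def giou(a, b):
    """Generalized IoU for axis-aligned boxes (x1, y1, x2, y2):
    IoU minus the fraction of the smallest enclosing box that is
    covered by neither input box."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    hull = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (hull - union) / hull

# Two partially overlapping unit-ish boxes
print(round(giou((0, 0, 2, 2), (1, 1, 3, 3)), 3))  # -0.079
```

The corresponding regression loss is typically `1 - giou(pred, target)`, which ranges from 0 (perfect overlap) to 2 (disjoint and far apart).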
(This article belongs to the Section AI-Driven Innovations)

20 pages, 12712 KB  
Article
Large-Scale Airborne LiDAR Point Cloud Building Extraction Based on Improved Voxelized Deep Learning Network
by Bai Xue, Yanru Song, Pi Ai, Hongzhou Li, Shuhan Liu and Li Guo
Buildings 2026, 16(7), 1450; https://doi.org/10.3390/buildings16071450 - 7 Apr 2026
Abstract
High-precision 3D building data are pivotal for smart city development, urban planning, and disaster management. However, large-scale building extraction from airborne LiDAR point clouds remains challenging due to semantic ambiguity, uneven point density, and complex architectural structures. To address these limitations, we propose a novel framework integrating geometric topology perception with cross-dimensional attention mechanisms within a Sparse Voxel Convolutional Neural Network (SPVCNN). The key contributions include: (1) an enhanced LaserMix++ multi-scale hybrid augmentation strategy featuring cross-scene block replacement, ground normal–constrained rotation, and non-uniform scaling; (2) a dual-branch SPVCNN architecture embedding a collaborative module of Geometric Self-Attention (GSA) and Cross-Space Residual Attention (CSRA) to preserve topological consistency and enable cross-dimensional feature interaction; and (3) a Boundary Enhancement Module (BEM) specifically designed to resolve boundary ambiguity and overlapping predictions. Evaluated on a 177 km² dataset covering Washington, D.C., our method significantly outperforms the baseline SPVCNN, improving accuracy by 12.04 percentage points (0.8212 to 0.9416) and Intersection over Union (IoU) by 9.96 percentage points (0.866 to 0.9656). Furthermore, it surpasses mainstream networks such as Cylinder3D and MinkResNet by over 50% in absolute accuracy gain. These results demonstrate the effectiveness of synergistically combining geometric perception with adaptive attention for robust building extraction from large-scale LiDAR data.
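Sparse voxel networks such as SPVCNN start by grouping points into occupied voxels keyed by integer grid indices, so computation touches only occupied cells; a minimal sketch of that indexing step (the voxel size is an arbitrary choice here):

```python
import math

def voxelize(points, voxel_size=1.0):
    """Group 3D points into sparse voxels keyed by integer grid
    indices -- the indexing step behind sparse voxel CNNs."""
    voxels = {}
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        voxels.setdefault(key, []).append(p)
    return voxels

pts = [(0.2, 0.1, 0.9), (0.8, 0.4, 0.3), (2.5, 0.0, 0.0)]
v = voxelize(pts, voxel_size=1.0)
print(sorted(v))  # [(0, 0, 0), (2, 0, 0)]
```

Only the two occupied keys are stored; empty space between them costs nothing, which is what makes convolving over a 177 km² point cloud tractable.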
(This article belongs to the Section Construction Management, and Computers & Digitization)
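The accuracy and IoU figures quoted in the abstract above are standard per-point segmentation metrics. As a minimal illustrative sketch (not the authors' code; array names and the toy data are assumptions), both can be computed from binary building/non-building labels as follows:

```python
import numpy as np

def accuracy_and_iou(pred, gt):
    """Overall accuracy and building-class IoU for binary point labels.

    pred, gt: 1-D integer arrays, 1 = building, 0 = non-building.
    """
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    acc = np.mean(pred == gt)                # fraction of correctly labelled points
    inter = np.sum((pred == 1) & (gt == 1))  # true positives
    union = np.sum((pred == 1) | (gt == 1))  # TP + FP + FN
    iou = inter / union if union else 1.0
    return acc, iou

# Toy example: 8 points, two of them mislabelled.
pred = np.array([1, 1, 0, 0, 1, 0, 1, 0])
gt   = np.array([1, 1, 0, 1, 1, 0, 0, 0])
acc, iou = accuracy_and_iou(pred, gt)  # acc = 0.75, iou = 0.6
```

The reported gains (e.g. 0.866 to 0.9656 IoU) are differences of exactly this per-class IoU, expressed in percentage points.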
23 pages, 6677 KB  
Article
Fine-Grained 3D Building Reconstruction and Floor Height Estimation from Ultra-High-Resolution TomoSAR Data Using Geometric Constraints
by Haoyuan Chen, Wenkang Liu, Quan Chen, Lei Cui and Mengdao Xing
Remote Sens. 2026, 18(7), 1073; https://doi.org/10.3390/rs18071073 - 2 Apr 2026
Abstract
The automatic generation of semantic Level of Detail (LOD) 2 models from TomoSAR point clouds is frequently compromised by elevation side-lobes, data sparsity, and inherent geometric distortions. In particular, the energy dispersion caused by side-lobes blurs vertical structures, making the extraction of floor details and accurate floor height estimation significantly challenging. To overcome these limitations, we present a refined reconstruction framework that tightly couples tomographic imaging mechanisms with building geometric priors. For fine-grained vertical reconstruction, we employ a geometry-constrained inverse projection strategy that concentrates scattered energy back onto the building façade to mitigate side-lobe interference. This is complemented by a Global Coherent Integration method, utilizing spectral analysis to robustly recover periodic floor patterns and estimate average floor heights. In the horizontal domain, we address the conflict between noise suppression and feature preservation through a separation-of-axes morphological strategy. Unlike traditional isotropic filtering, this approach processes orthogonal directions independently to bridge data gaps while strictly maintaining sharp building corners and recovering fine substructures. Validated on airborne Ku-band datasets, the proposed method demonstrates the capability to produce topologically complete and semantically rich urban models from sparse radar observations. Full article
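The spectral floor-height estimation described above rests on a simple idea: floor slabs produce a near-periodic vertical density of scatterers, so the dominant peak of its spectrum gives the repetition period. The paper's Global Coherent Integration is more elaborate; the following is only a minimal numeric sketch under assumed parameters (bin width, synthetic 3 m floor spacing):

```python
import numpy as np

def estimate_floor_height(z, z_max, bin_m=0.1):
    """Estimate the dominant vertical repetition period (m) of facade scatterers.

    z: 1-D array of scatterer heights in metres; bin_m: histogram bin width.
    """
    n = int(round(z_max / bin_m))
    bins = np.linspace(0.0, z_max, n + 1)
    density, _ = np.histogram(z, bins=bins)      # vertical scatterer density
    density = density - density.mean()           # remove the DC component
    spectrum = np.abs(np.fft.rfft(density))
    freqs = np.fft.rfftfreq(n, d=bin_m)          # cycles per metre
    k = np.argmax(spectrum[1:]) + 1              # dominant non-DC peak
    return 1.0 / freqs[k]                        # period in metres

# Synthetic facade: scatterers clustered every 3 m (10 floors, 30 m building).
rng = np.random.default_rng(0)
z = np.concatenate([3.0 * f + 1.5 + rng.normal(0, 0.05, 200) for f in range(10)])
height = estimate_floor_height(z, z_max=30.0)   # close to 3.0 m
```

Real TomoSAR point clouds add side-lobe energy and gaps, which is precisely what the paper's geometry-constrained inverse projection is meant to suppress before this kind of spectral step becomes reliable.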
20 pages, 4887 KB  
Article
Geo-RVF: A Multi-Task Lightweight Perception System Based on Radar–Vision Fusion for USVs
by Jianhong Zhou, Zhen Huang, Yifan Liu, Gang Zhang, Yilan Yu and Zhen Tian
J. Mar. Sci. Eng. 2026, 14(7), 664; https://doi.org/10.3390/jmse14070664 - 31 Mar 2026
Abstract
Visual perception in Unmanned Surface Vehicles (USVs) suffers from drastic lighting changes and missing texture features. These factors lead to depth scale drift and motion estimation bias. Moreover, existing multi-modal fusion models are computationally complex and unfit for resource-limited edge devices. To address these problems, a lightweight Radar–Vision Fusion (Geo-RVF) algorithm is proposed. To supplement spatial information, point clouds are projected to build sparse depth maps. A probability consistency-based depth correction module is designed to suppress water noise. This helps extract accurate geometric anchors to guide visual depth propagation. Subsequently, a Recurrent Autoregressive Network (RAN) fuses radar and image features in the temporal dimension. This resolves dynamic positional deviations caused by texture degradation and distant small targets. After real-time optimization, Geo-RVF achieves 23 FPS on the Jetson Orin NX. On a collected dataset, the method attains a mean average precision (mAP) 50–90 of 44.2% and a mean intersection over union (mIoU) of 99%, outperforming HybridNets and Achelous. Full article
(This article belongs to the Section Ocean Engineering)
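The sparse depth maps mentioned in the abstract above come from projecting radar point clouds into the camera frame. A minimal pinhole-projection sketch (the intrinsic matrix and points here are assumptions for illustration, not values from the paper) that keeps the nearest return per pixel:

```python
import numpy as np

def points_to_sparse_depth(points, K, h, w):
    """Project 3-D points (camera frame, metres) into a sparse depth map.

    points: (N, 3) array with z = forward depth; K: 3x3 intrinsic matrix.
    Keeps the nearest depth when several points hit the same pixel.
    """
    depth = np.zeros((h, w), dtype=np.float64)   # 0 = no measurement
    z = points[:, 2]
    valid = z > 0                                # only points in front of the camera
    uvw = K @ points[valid].T                    # homogeneous pixel coordinates
    u = np.round(uvw[0] / uvw[2]).astype(int)
    v = np.round(uvw[1] / uvw[2]).astype(int)
    zf = z[valid]
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for ui, vi, zi in zip(u[inside], v[inside], zf[inside]):
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi                   # keep the closest return
    return depth

K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.0, 5.0],     # projects to the principal point
                [0.0, 0.0, 2.0],     # same pixel, closer return wins
                [0.32, 0.0, 1.0]])   # lands at u = 64, outside a 64-wide image
d = points_to_sparse_depth(pts, K, h=48, w=64)
```

The probability consistency-based correction in Geo-RVF would then filter water-surface noise out of such a map before it guides visual depth propagation.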