Search Results (934)

Search Parameters:
Keywords = automatic annotation

22 pages, 34398 KB  
Article
Quantifying Bilberry Counts and Densities: A Comparative Assessment of Segmentation and Object Detection Models from Drone and Camera Imagery
by Susanna Hyyppä, Josef Taher, Harri Kaartinen, Teemu Hakala, Kirsi Karila, Leena Matikainen, Marjut Turtiainen, Antero Kukko and Juha Hyyppä
Forests 2026, 17(2), 253; https://doi.org/10.3390/f17020253 (registering DOI) - 13 Feb 2026
Abstract
Nordic forest management is increasingly emphasizing multi-functional goals, expanding beyond timber production towards non-wood forest products such as wild berries. Wild berry yield maps are based on sample plot data combined with meteorological, remote sensing, and geoinformation data. Automating sample plot data processing is crucial, as manual collection is labor-intensive, time-consuming, and complicated by short berry seasons and fluctuating yields. This study compares two methods for automatic bilberry detection and counting: a deep learning detector (YOLO) and a machine learning approach combining the Segment Anything Model (SAM) with random forest classification (SAM-RF). Both system camera and drone imagery were evaluated as input data. YOLOv8 clearly outperformed SAM-RF in berry detection, achieving an R² of 0.98 and an RMSE of 3.8 berries when evaluated against annotated system camera images, compared to an R² of 0.80 for SAM-RF. System camera imagery consistently produced higher accuracy than drone imagery due to higher image clarity and more favorable viewing angles, with YOLOv8 achieving an R² of 0.95 against field counts, compared to 0.81 for drone images. The results also indicate that the primary error source in berry counting is that many berries are not visible in the captured images. The results from the data analysis support the use of the developed technologies in yield modeling and even in implementing future ‘follow-me’ drone berry assistants.
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
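A minimal sketch of the detection-and-counting evaluation described above: run a YOLO detector over sample-plot images, count detections per image, and compare the counts with manual annotations via R² and RMSE. The weights file, image names, and reference counts are hypothetical placeholders, not data from the study.

```python
# Hedged sketch: berry counting with a YOLO detector, evaluated against manual counts.
import numpy as np
from ultralytics import YOLO
from sklearn.metrics import mean_squared_error, r2_score

model = YOLO("bilberry_yolov8.pt")                       # assumed fine-tuned weights
image_paths = ["plot_01.jpg", "plot_02.jpg", "plot_03.jpg", "plot_04.jpg"]  # assumed
annotated_counts = np.array([42, 17, 63, 8])             # manual reference counts (assumed)

predicted_counts = []
for path in image_paths:
    result = model(path, conf=0.25, verbose=False)[0]    # single-image inference
    predicted_counts.append(len(result.boxes))           # one box per detected berry
predicted_counts = np.array(predicted_counts)

r2 = r2_score(annotated_counts, predicted_counts)
rmse = float(np.sqrt(mean_squared_error(annotated_counts, predicted_counts)))
print(f"R^2 = {r2:.2f}, RMSE = {rmse:.1f} berries")
```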

30 pages, 13782 KB  
Article
Geometry-Aware Human Noise Removal from TLS Point Clouds via 2D Segmentation Projection
by Fuga Komura, Daisuke Yoshida and Ryosei Ueda
Sensors 2026, 26(4), 1237; https://doi.org/10.3390/s26041237 - 13 Feb 2026
Abstract
Large-scale terrestrial laser scanning (TLS) point clouds are increasingly used for applications such as digital twins and cultural heritage documentation; however, removing unwanted human points captured during acquisition remains a largely manual and time-consuming process. This study proposes a geometry-aware framework for automatically removing human noise from TLS point clouds by projecting 2D instance segmentation masks (obtained using You Only Look Once (YOLO) v8 with an instance segmentation head) into 3D space and validating candidates through multi-stage geometric filtering. To suppress false positives induced by reprojection misalignment and planar background structures (e.g., walls and ground), we introduce projection-followed geometric validation (or “geometric gating”) using Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and principal component analysis (PCA)-based planarity analysis, followed by cluster-level plausibility checks. Experiments were conducted on two real-world outdoor TLS datasets—(i) Osaka Metropolitan University Sugimoto Campus (OMU) (82 scenes) and (ii) Jinaimachi historic district in Tondabayashi (JM) (68 scenes). The results demonstrate that the proposed method achieves high noise removal accuracy, obtaining precision/recall/intersection over union (IoU) of 0.9502/0.9014/0.8607 on OMU and 0.8912/0.9028/0.8132 on JM. Additional experiments on mobile mapping system (MMS) data from the Waymo Open Dataset demonstrate stable performance without parameter recalibration. Furthermore, quantitative and qualitative comparisons with representative time-series geometric dynamic object removal methods, including DUFOMap and BeautyMap, show that the proposed approach maintains competitive recall under a human-only ground-truth definition while reducing over-removal of static structures in TLS scenes, particularly when humans are observed in only one or a few scans due to limited revisit frequency. The end-to-end processing time with YOLOv8 was 935.62 s for 82 scenes (11.4 s/scene) on OMU and 571.58 s for 68 scenes (8.4 s/scene) on JM, supporting practical efficiency on high-resolution TLS imagery. Ablation studies further clarify the role of each stage and indicate stable performance under the observed reprojection errors. The annotated human point cloud dataset used in this study has been publicly released to facilitate reproducibility and further research on human noise removal in large-scale TLS scenes.
(This article belongs to the Section Sensing and Imaging)
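One way to illustrate the "geometric gating" idea (DBSCAN clustering plus PCA-based planarity analysis) is sketched below: candidate human points are clustered, and clusters that are nearly planar, as walls and ground typically are, get discarded. The thresholds and the random input array are illustrative assumptions, not the paper's parameters.

```python
# Hedged sketch: reject near-planar clusters among candidate human points.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

def filter_human_candidates(points, eps=0.3, min_samples=30, planarity_thresh=0.02):
    """points: (N, 3) candidate points back-projected from 2D person masks (assumed)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    kept = []
    for label in set(labels) - {-1}:                 # -1 marks DBSCAN noise
        cluster = points[labels == label]
        if len(cluster) < 3:
            continue
        var = PCA(n_components=3).fit(cluster).explained_variance_ratio_
        # Near-planar clusters have almost no variance along the third principal axis.
        if var[2] > planarity_thresh:
            kept.append(cluster)
    return kept

candidates = np.random.rand(1000, 3)                 # stand-in for projected points
human_clusters = filter_human_candidates(candidates)
```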

33 pages, 1379 KB  
Article
Breaking the Ceiling: Mitigating Extreme Response Bias in Surveys Using an Open-Ended Adaptive-Testing System and LLM-Based Response Analysis
by Moshe Gish, Amit Nowominski and Rotem Dror
AI 2026, 7(2), 73; https://doi.org/10.3390/ai7020073 - 13 Feb 2026
Abstract
Assessments of extreme psychological constructs often face a persistent challenge: the ceiling effect, in which a significant proportion of respondents select the highest score on a scale, thus obscuring meaningful variation within the population. This effect may have profound consequences in studies of extreme psychological constructs. To address this limitation, we present a novel framework that integrates Multistage Testing (MST) with open-ended questions that are automatically analyzed by large language models (LLMs). This hybrid approach adapts the survey questions to the respondent while leveraging LLMs to efficiently and reliably interpret free-text answers from large-scale online surveys. Using a case study on aversion toward cockroaches, we show how our method can effectively eliminate extreme ceiling effects, revealing hidden data distributions that are often obscured by extreme responses to conventional Likert-type survey questions. We also validate our method by comparing LLM performance to expert human annotations, demonstrating the consistency and reliability of LLMs in evaluating free-text answers. This framework offers a generalizable methodology that enables more precise and sensitive quantitative measurement of extreme psychological constructs, allowing researchers to study topics that until now were inaccessible due to significant, inherent ceiling effects.
(This article belongs to the Section AI Systems: Theory and Applications)
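Validating LLM scores against expert annotations, as the abstract mentions, can be done with standard agreement statistics; the sketch below uses Spearman correlation and quadratic-weighted Cohen's kappa on invented toy ratings, and the paper's actual validation procedure may differ.

```python
# Hedged sketch: agreement between expert and LLM ratings of free-text answers.
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import cohen_kappa_score

expert_scores = np.array([5, 7, 7, 3, 6, 2, 7, 4])   # expert ratings (toy values)
llm_scores    = np.array([5, 6, 7, 3, 6, 3, 7, 4])   # LLM ratings of the same answers

rho, p_value = spearmanr(expert_scores, llm_scores)
kappa = cohen_kappa_score(expert_scores, llm_scores, weights="quadratic")
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f}), weighted kappa = {kappa:.2f}")
```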

22 pages, 494 KB  
Article
LinguoNER: A Language-Agnostic Framework for Named Entity Recognition in Low-Resource Languages with a Focus on Yambeta
by Philippe Tamla, Stephane Donna, Tobias Bigala, Dilan Nde, Maxime Yves Julien Manifi Abouh and Florian Freund
Informatics 2026, 13(2), 31; https://doi.org/10.3390/informatics13020031 - 11 Feb 2026
Viewed by 112
Abstract
This paper presents LinguoNER, a practical and extensible framework for bootstrapping Named Entity Recognition (NER) in extremely low-resource languages, demonstrated on Yambeta, a Bantu language spoken by a minority community in Cameroon. Due to scarce digital resources and the absence of annotated corpora, Yambeta has remained largely underrepresented in Natural Language Processing (NLP). LinguoNER addresses this gap by providing a methodologically transparent end-to-end workflow that integrates corpus acquisition, gazetteer-driven automatic annotation, tokenizer training, transformer fine-tuning, and multi-level evaluation in settings where large-scale manual annotation is infeasible. Using a Bible-derived corpus as a linguistically stable starting point, we release the first publicly available Yambeta NER dataset (≈25,000 tokens) annotated with the CoNLL BIO scheme and a restricted entity schema (PER/LOC/ORG). Because labels are generated via dictionary-based annotation, the corpus is best characterized as silver-standard; credibility is strengthened through recorded dictionaries, transparency logs, expert-in-the-loop validation on sampled subsets, and complementary qualitative error analysis. We additionally train a dedicated Yambeta WordPiece tokenizer that preserves tone markers and diacritics, and fine-tune a bert-base-cased transformer for token classification. On a held-out test split, LinguoNER achieves strong token-level performance (Precision = 0.989, Recall = 0.981, F1 = 0.985), substantially outperforming a dictionary-only gazetteer baseline (ΔF1 ≈ 0.36). Per-entity-type evaluation further indicates improvements beyond surface-form matching, while remaining errors are linguistically motivated and primarily involve multi-word entity boundaries, agglutinative constructions, and tone-/diacritic-sensitive tokenization. We emphasize that results are restricted to the Bible domain and a limited label space, and should be interpreted as proof-of-concept evidence rather than claims of broad out-of-domain generalization. Overall, LinguoNER provides a reproducible blueprint for bootstrapping NER resources in underrepresented languages and supports future work on broader corpus sources (e.g., news, OPUS, JW300), additional African languages (e.g., Yoruba, Igbo, Bassa), and the iterative creation of expert-refined datasets and gold-standard subsets.
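A minimal sketch of gazetteer-driven BIO annotation of the kind used to produce a silver-standard corpus: longest-match dictionary lookup over a token sequence, emitting CoNLL-style B-/I-/O tags. The toy gazetteer and sentence are invented placeholders, not Yambeta data.

```python
# Hedged sketch: dictionary-based (gazetteer) BIO tagging for silver-standard NER data.
def bio_annotate(tokens, gazetteer):
    """gazetteer: dict mapping entity type -> list of multi-token surface forms."""
    tags = ["O"] * len(tokens)
    for ent_type, phrases in gazetteer.items():
        for phrase in sorted(phrases, key=len, reverse=True):   # longest match first
            n = len(phrase)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == phrase and all(t == "O" for t in tags[i:i + n]):
                    tags[i] = f"B-{ent_type}"
                    for j in range(i + 1, i + n):
                        tags[j] = f"I-{ent_type}"
    return tags

gazetteer = {"PER": [["Maria"]], "LOC": [["Yerusalem"]]}        # toy entries (assumed)
tokens = ["Maria", "travelled", "to", "Yerusalem"]
print(list(zip(tokens, bio_annotate(tokens, gazetteer))))
```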

15 pages, 4617 KB  
Article
Artificial Intelligence-Based Proximal Bone Shape Asymmetry Analysis and Clinical Correlation with Cartilage Relaxation Times and Functional Activity
by Rafeek Thahakoya, Rupsa Bhattacharjee, Misung Han, Felix Gerhard Gassert, Johanna Luitjens, Valentina Pedoia, Richard B. Souza and Sharmila Majumdar
Bioengineering 2026, 13(2), 184; https://doi.org/10.3390/bioengineering13020184 - 5 Feb 2026
Viewed by 535
Abstract
The current study investigated proximal femur bone shape asymmetry and its associations with cartilage composition and functional performance in individuals with hip osteoarthritis (OA). Forty-seven participants with hip OA (mean age: 53.77 ± 12.47 years; 22 females; BMI: 24.49 ± 4.0 kg/m²) were included in this study. Bilateral hip MRI was performed using a 3.0 T MR scanner with 3D proton density fat-saturated CUBE and MAPSS sequences. Automatic segmentation of the proximal femur was achieved using a U-Net framework refined through a human-in-the-loop annotation strategy, followed by three-dimensional bone shape analysis to quantify asymmetry. Cartilage relaxation times were assessed using atlas-based segmentation and quantification, while functional activity was evaluated according to OARSI-recommended criteria. The proposed proximal femur bone segmentation showed a DSC of 96.48% (95%-CI: 96.33–96.64) and a Hausdorff Distance of 4.66 mm (95%-CI: 3.80–5.51). Increased bone shape asymmetry in the posterior–lateral–superior region of the proximal femur was associated with functional activity in the chair stand test (rho = −0.41; p = 0.006), and the anterior–lateral–inferior region demonstrated a comparatively stronger significant positive correlation (rho = 0.37; p = 0.006) with the T1rho values of the acetabular cartilage region. Overall, the findings indicate that region-specific proximal femoral bone shape asymmetry in hip OA is associated with cartilage characteristics and functional impairment, highlighting the potential value of bone shape features as imaging biomarkers relevant to clinical function.
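The segmentation quality figure above is a Dice similarity coefficient (DSC); a minimal sketch of how DSC is computed over binary masks follows, with small toy volumes standing in for predicted and manual femur segmentations.

```python
# Hedged sketch: Dice similarity coefficient between two binary segmentation volumes.
import numpy as np

def dice_coefficient(pred, truth, eps=1e-8):
    """pred, truth: boolean volumes of identical shape."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

pred = np.zeros((4, 4, 4), dtype=bool); pred[1:3, 1:3, 1:3] = True    # toy prediction
truth = np.zeros((4, 4, 4), dtype=bool); truth[1:3, 1:3, 2:4] = True  # toy reference
print(f"DSC = {dice_coefficient(pred, truth):.2%}")
```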

27 pages, 11971 KB  
Article
An Application Study on Digital Image Classification and Recognition of Yunnan Jiama Based on a YOLO-GAM Deep Learning Framework
by Nan Ji, Fei Ju and Qiang Wang
Appl. Sci. 2026, 16(3), 1551; https://doi.org/10.3390/app16031551 - 3 Feb 2026
Viewed by 154
Abstract
Yunnan Jiama (paper horse prints), a representative form of intangible cultural heritage in southwest China, is characterized by subtle inter-class differences, complex woodblock textures, and heterogeneous preservation conditions, which collectively pose significant challenges for digital preservation and automatic image classification. To address these challenges and improve the computational analysis of Jiama images, this study proposes an enhanced object detection framework based on YOLOv8 integrated with a Global Attention Mechanism (GAM), referred to as YOLOv8-GAM. In the proposed framework, the GAM module is embedded into the high-level semantic feature extraction and multi-scale feature fusion stages of YOLOv8, thereby strengthening global channel–spatial interactions and improving the representation of discriminative cultural visual features. In addition, image augmentation strategies, including brightness adjustment, salt-and-pepper noise, and Gaussian noise, are employed to simulate real-world image acquisition and degradation conditions, which enhances the robustness of the model. Experiments conducted on a manually annotated Yunnan Jiama image dataset demonstrate that the proposed model achieves a mean average precision (mAP) of 96.5% at an IoU threshold of 0.5 and 82.13% under the mAP@0.5:0.95 metric, with an F1-score of 94.0%, outperforming the baseline YOLOv8 model. These results indicate that incorporating global attention mechanisms into object detection networks can effectively enhance fine-grained classification performance for traditional folk print images, thereby providing a practical and scalable technical solution for the digital preservation and computational analysis of intangible cultural heritage.
(This article belongs to the Section Computing and Artificial Intelligence)
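A minimal sketch of the three augmentation strategies named in the abstract (brightness adjustment, salt-and-pepper noise, Gaussian noise) applied to a uint8 image with NumPy; the parameter values and the placeholder image are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch: brightness shift, salt-and-pepper noise, and Gaussian noise.
import numpy as np

def adjust_brightness(img, delta=30):
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)

def salt_and_pepper(img, amount=0.02, rng=None):
    rng = rng or np.random.default_rng(0)
    noisy = img.copy()
    mask = rng.random(img.shape[:2])          # one random value per pixel
    noisy[mask < amount / 2] = 0              # pepper
    noisy[mask > 1 - amount / 2] = 255        # salt
    return noisy

def gaussian_noise(img, sigma=10.0, rng=None):
    rng = rng or np.random.default_rng(0)
    noise = rng.normal(0.0, sigma, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

image = np.full((64, 64, 3), 128, dtype=np.uint8)   # stand-in for a Jiama image
augmented = gaussian_noise(salt_and_pepper(adjust_brightness(image)))
```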

35 pages, 752 KB  
Review
Ontology Learning in Educational Systems
by Tatyana Ivanova and Valentina Terzieva
Information 2026, 17(2), 147; https://doi.org/10.3390/info17020147 - 2 Feb 2026
Viewed by 193
Abstract
E-learning content and participants in the learning process are usually annotated with metadata. Complex metadata models are necessary for organizing personalized learning, so an ontological metadata representation is used. Since ontologies represent static knowledge, changes in e-learning systems and the associated descriptive metadata require frequent changes to the corresponding ontologies. Few professionals in the educational domain have expertise in ontology development, so the greatest possible automation is important for developing and maintaining the knowledge models needed for intelligent e-learning environments. Ontology learning is an approach for automatic ontology development and evolution, significantly affected by recent advances in Artificial Intelligence and Language Models. The main objective of this study is to explore and analyze ontology learning approaches and techniques and the specifics of their use in an intelligent e-learning environment. It examines and summarizes recent scientific research to reveal how mature ontology learning is and the extent to which it is applied to support personalized tutoring. The paper outlines trends and challenges of ontology learning from textual e-learning content, comprehensively discusses ontology learning and its applications in intelligent e-learning, and describes a use case concerning the implementation and practical usage of ontology learning.
(This article belongs to the Special Issue Semantic Web and Language Models)

19 pages, 2387 KB  
Article
High-Precision Marine Radar Object Detection Using Tiled Training and SAHI Enhanced YOLOv11-OBB
by Sercan Külcü
Sensors 2026, 26(3), 942; https://doi.org/10.3390/s26030942 - 2 Feb 2026
Viewed by 316
Abstract
Reliable object detection in marine radar imagery is critical for maritime situational awareness, collision avoidance, and autonomous navigation. However, it remains challenging due to sea clutter, small targets, and interference from fixed navigational aids. This study proposes a high-precision detection pipeline that integrates tiled training, Slicing Aided Hyper Inference (SAHI), and an oriented bounding box (OBB) variant of the lightweight YOLOv11 architecture. The proposed approach effectively addresses scale variability in Plan Position Indicator (PPI) radar images. Experiments were conducted on the real-world DAAN dataset provided by the German Aerospace Center (DLR). The dataset consists of 760 full-resolution radar frames containing multiple moving vessels, dynamic own-ship motion, and clutter sources. A semi-automatic contour-based annotation pipeline was developed to generate multi-format labels, including axis-aligned bounding boxes, oriented bounding boxes (OBBs), and instance segmentation masks, directly from radar echo characteristics. The results demonstrate that the tiled YOLOv11n-OBB model with SAHI achieves an mAP@0.5 exceeding 0.95, with a mean center localization error below 10 pixels. The proposed method shows better performance on small targets compared to standard full-image baselines and other YOLOv11 variants. Moreover, the lightweight models enable near real-time inference at 4–6 FPS on edge hardware. These findings indicate that OBBs and scale-aware strategies enhance detection precision in complex marine radar environments, providing practical advantages for tracking and navigation tasks.
(This article belongs to the Section Radar Sensors)
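The slicing idea behind this kind of pipeline can be sketched as follows: split the full-resolution frame into overlapping tiles, run the detector on each tile, and shift detections back into full-image coordinates. The SAHI library automates this, including merging duplicate detections in overlap regions, which the sketch omits; the tile size, overlap, and weights file are assumptions, and plain axis-aligned boxes are used here rather than the paper's oriented boxes.

```python
# Hedged sketch: sliced (tiled) inference with a YOLO detector on a large frame.
import numpy as np
from ultralytics import YOLO

def sliced_detect(model, image, tile=640, overlap=128, conf=0.25):
    """Run the detector over overlapping tiles; return xyxy boxes in full-image coords."""
    h, w = image.shape[:2]
    step = tile - overlap
    detections = []
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            crop = image[y:y + tile, x:x + tile]
            result = model(crop, conf=conf, verbose=False)[0]
            for x1, y1, x2, y2 in result.boxes.xyxy.cpu().numpy():
                detections.append((x1 + x, y1 + y, x2 + x, y2 + y))
    return detections   # duplicates in overlap regions would still need NMS merging

model = YOLO("yolo11n.pt")                          # stand-in weights, not the paper's OBB model
frame = np.zeros((2048, 2048, 3), dtype=np.uint8)   # stand-in for a PPI radar frame
boxes = sliced_detect(model, frame)
```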

22 pages, 300 KB  
Article
The Ten Minutes That Shocked the World—Teaching Generative AI to Analyze the Trump–Zelensky Multimodal Debate
by Isabella Poggi, Tommaso Scaramella, Sissy Violini, Simona Careri, Maria Désirée Epure and Daniele Dragoni
Information 2026, 17(2), 136; https://doi.org/10.3390/info17020136 - 1 Feb 2026
Viewed by 205
Abstract
Today, foundation models simulate human skills in translation, literature review, fact checking, fake-news detection, and novel and poetry writing. However, generative AI can also be applied to discourse analysis. This study instructed the Gemini 2.5 model to analyze multimodal political discourse. We selected fragments from the Trump–Zelensky debate held at the White House on 28 February 2025 and annotated each sentence, gesture, intonation, gaze, and facial expression in terms of LEP (Logos, Ethos, Pathos) analysis, assessing when speakers, in words or body communication, rely on rational argumentation, stress their own merits or their opponents’ demerits, or express and try to induce emotions in the audience. Through detailed prompts, we asked the Gemini 2.5 model to run the LEP analysis on the same fragments. Then, considering the human and model annotations in parallel, we proposed a metric to compare their respective analyses and measure discrepancies, finally tuning an optimized prompt for the model’s best performance, which in some cases outperformed the human analysis. This is a valuable application, since LEP analysis highlights deep aspects of multimodal discourse but is highly time-consuming, whereas its automatic version allows large chunks of speech to be interpreted quickly yet reliably.

37 pages, 2905 KB  
Article
A Slide Annotation System with Multimodal Analysis for Video Presentation Review
by Amma Liesvarastranta Haz, Komang Candra Brata, Nobuo Funabiki, Htoo Htoo Sandi Kyaw, Evianita Dewi Fajrianti and Sritrusta Sukaridhoto
Algorithms 2026, 19(2), 110; https://doi.org/10.3390/a19020110 - 1 Feb 2026
Viewed by 260
Abstract
With the rapid growth of online presentations, there has been an increasing need for efficient review of recorded materials. In typical presentations, speakers verbally elaborate on each slide, providing details not captured in the slides themselves. Automatically extracting and embedding these verbal explanations at their corresponding slide locations can greatly enhance the review process for audiences. This paper presents a Slide Annotation System that employs a robust hybrid two-stage detector to identify slide boundaries, extracts slide text through Optical Character Recognition (OCR), transcribes narration, and employs a multimodal Large Language Model (LLM) to generate concise, context-aware annotations that are added to their corresponding slide locations. For evaluations, the technical performance was validated on five recorded presentations, while the user experience was assessed by 37 participants. The results showed that the system achieved a macro-average F1 score of 0.879 (SD = 0.024, 95% CI [0.849, 0.909]) for slide segmentation and 90.0% accuracy (95% CI [74.4%, 96.5%]) for annotation alignment. Subjective evaluations revealed high annotation validity and usefulness as rated by presenters, and a high System Usability Scale (SUS) score of 80.5 (SD = 6.7, 95% CI [78.3, 82.7]). Qualitative feedback further confirmed that the system effectively streamlined the review process, enabling users to locate key information more efficiently than standard video playback. These findings demonstrate the strong potential of the proposed system as an effective automated annotation tool.
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
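A rough sketch of one possible slide-boundary step: sample video frames, flag large inter-frame differences as slide changes, and OCR the new slide with pytesseract. The threshold, sampling interval, and file name are assumptions; the paper's hybrid two-stage detector and multimodal LLM annotation are considerably more elaborate.

```python
# Hedged sketch: naive slide-change detection by frame differencing, plus OCR.
import cv2
import pytesseract

def slide_change_frames(video_path, every_n=30, diff_thresh=12.0):
    cap = cv2.VideoCapture(video_path)
    prev_gray, idx, changes = None, 0, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:                       # sample every Nth frame
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if prev_gray is not None and cv2.absdiff(gray, prev_gray).mean() > diff_thresh:
                text = pytesseract.image_to_string(gray)   # OCR the new slide
                changes.append((idx, text.strip()))
            prev_gray = gray
        idx += 1
    cap.release()
    return changes

boundaries = slide_change_frames("talk_recording.mp4")    # hypothetical file
```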

32 pages, 5551 KB  
Article
BanglaOCT2025: A Population-Specific Fovea-Centric OCT Dataset with Self-Supervised Volumetric Restoration Using Flip-Flop Swin Transformers
by Chinmay Bepery, G. M. Atiqur Rahaman, Rameswar Debnath, Sajib Saha, Md. Shafiqul Islam, Md. Emranul Islam Abir and Sanjay Kumar Sarker
Diagnostics 2026, 16(3), 420; https://doi.org/10.3390/diagnostics16030420 - 1 Feb 2026
Viewed by 168
Abstract
Background: Age-related macular degeneration (AMD) is a major cause of vision loss, yet publicly available Optical Coherence Tomography (OCT) datasets lack demographic diversity, particularly from South Asian populations. Existing datasets largely represent Western cohorts, limiting AI generalizability. Moreover, raw OCT volumes contain redundant spatial information and speckle noise, hindering efficient analysis. Methods: We introduce BanglaOCT2025, a retrospective dataset collected from the National Institute of Ophthalmology and Hospital (NIOH), Bangladesh, using Nidek RS-330 Duo 2 and RS-3000 Advance systems. We propose a novel preprocessing pipeline comprising two stages: (1) A constraint-based centroid minimization algorithm automatically localizes the foveal center and extracts a fixed 33-slice macular sub-volume, robust to retinal tilt and acquisition variability; and (2) A self-supervised volumetric denoising module based on a Flip-Flop Swin Transformer (FFSwin) backbone suppresses speckle noise without requiring paired clean reference data. Results: The dataset comprises 1585 OCT volumes (202,880 B-scans), including 857 expert-annotated cases (54 DryAMD, 61 WetAMD, and 742 NonAMD). Denoising quality was evaluated using reference-free volumetric metrics, paired statistical analysis, and blinded clinical review by a retinal specialist, confirming preservation of pathological biomarkers and absence of hallucination. Under a controlled paired evaluation using the same classifier with frozen weights, downstream AMD classification accuracy improved from 69.08% to 99.88%, interpreted as an upper-bound estimate of diagnostic signal recoverability rather than independent generalization. Conclusions: BanglaOCT2025 is the first clinically validated OCT dataset representing the Bengali population and establishes a reproducible fovea-centric volumetric preprocessing and restoration framework for AMD analysis, with future validation across independent and multi-centre test cohorts.
(This article belongs to the Special Issue 3rd Edition: AI/ML-Based Medical Image Processing and Analysis)
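Extracting a fixed 33-slice macular sub-volume around a chosen foveal B-scan can be sketched as follows; the foveal index here comes from a throwaway heuristic, not the paper's constraint-based centroid minimization, and the random volume is a placeholder.

```python
# Hedged sketch: crop a fixed-size sub-volume centered on an estimated foveal B-scan.
import numpy as np

def extract_subvolume(volume, center_idx, n_slices=33):
    """volume: (num_bscans, H, W) OCT volume; returns n_slices around center_idx."""
    half = n_slices // 2
    start = int(np.clip(center_idx - half, 0, volume.shape[0] - n_slices))
    return volume[start:start + n_slices]

volume = np.random.rand(128, 496, 512).astype(np.float32)   # stand-in OCT volume
foveal_idx = int(np.argmin(volume.mean(axis=(1, 2))))       # placeholder heuristic only
macular_block = extract_subvolume(volume, foveal_idx)
print(macular_block.shape)   # (33, 496, 512)
```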

21 pages, 6584 KB  
Article
Diffusion-Based Anonymization and Foundation Model-Powered Semi-Automatic Image Annotation for Privacy-Protective Intelligent Connected Vehicle Traffic Data
by Tong Wang, Hui Xie, Feng Gao, Zian Meng, Pengcheng Zhang and Guohao Duan
World Electr. Veh. J. 2026, 17(2), 70; https://doi.org/10.3390/wevj17020070 - 31 Jan 2026
Viewed by 330
Abstract
Large-scale collection and annotation of sensitive facial data in real-world traffic scenarios face significant hurdles regarding privacy protection, temporal consistency, and high costs. To address these issues, this work proposes an integrated method specifically designed for sensitive information anonymization and semi-automatic image annotation (AIA). First, the Nullface anonymization model is applied to remove identity information from facial data while preserving non-identity attributes, including pose, expression, and background, that are relevant to downstream vision tasks. Second, the Qwen3-VL multimodal foundation model is combined with the Grounding DINO detection model to build an end-to-end annotation platform using the Dify workflow, covering data cleaning and automated labeling. A traffic-sensitive information dataset with diverse and complex backgrounds is then constructed. Subsequently, systematic experiments on the WIDER FACE subset show that Nullface significantly outperforms baseline methods, including FAMS and Ciagan, in head pose preservation and image quality. Finally, evaluation on object detection further confirms the effectiveness of the proposed approach. The accuracy achieved by the proposed method reaches 91.05%, outperforming AWS, and is almost identical to the accuracy of manual annotation. This demonstrates that the anonymization process maintains critical semantic details required for effective object detection.
(This article belongs to the Special Issue Recent Advances in Intelligent Vehicle)

28 pages, 2204 KB  
Article
An Intelligent Generation Method for Building Fire Protection Maintenance Work Orders Based on Large Language Models
by Chu Han, Jia Wang, Wei Zhou and Xiaoping Zhou
Fire 2026, 9(2), 65; https://doi.org/10.3390/fire9020065 - 30 Jan 2026
Viewed by 417
Abstract
Maintenance of building fire protection facilities is crucial for preventing fires and safeguarding lives and property; the standardization and timeliness of these activities directly determine operational reliability. However, as fire-safety requirements escalate, manually drafting maintenance work orders remains inefficient and prone to omissions. Furthermore, regulatory documents in this domain are inherently complex, and annotated resources are scarce, hampering the digitalization of fire-safety management. To address these challenges, this paper presents an LLM-based method for automatically generating maintenance work orders for building fire protection facilities. The proposed approach integrates a domain-specific knowledge base and incorporates the FS-RAG (Fire Services–Retrieval-Augmented Generation) framework to enhance both the accuracy and practical usability of generated work orders. First, we construct a lightweight domain knowledge base, FSKB (Fire Services Knowledge Base), derived from extensive maintenance regulations, capturing key elements such as equipment types, components, maintenance actions, and frequencies. Second, we design an FS-RAG framework that leverages retrieval-augmented generation to extract critical information from regulations and fuse it with the knowledge base, ensuring high accuracy and operational feasibility. Multi-round evaluations across stages B0–B4 validate the effectiveness of our method. Results indicate significant improvements over traditional approaches: the line-level compliance rate reaches 97.3% (an increase of 5.7% over B1 and 30.4% over B0), and the F1 score achieves 90.42% (an increase of 12.62% over B1 and 29.87% over B0).
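A very small sketch of the retrieval step behind a retrieval-augmented work-order generator: rank regulation snippets by TF-IDF similarity to the request and assemble them into a prompt. The snippets, query, and the final generation call are placeholders; the paper's FS-RAG framework and FSKB knowledge base are substantially richer.

```python
# Hedged sketch: TF-IDF retrieval of regulation snippets followed by prompt assembly.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

regulations = [                                   # toy regulation snippets (assumed)
    "Smoke detectors shall be functionally tested every six months.",
    "Fire pump flow tests shall be performed annually.",
    "Extinguishers shall be visually inspected monthly.",
]
query = "generate a maintenance work order for smoke detectors"

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(regulations + [query])
scores = cosine_similarity(doc_vecs[-1], doc_vecs[:-1]).ravel()
top_snippets = [regulations[i] for i in scores.argsort()[::-1][:2]]

prompt = (
    "Using the regulations below, draft a maintenance work order.\n"
    + "\n".join(f"- {s}" for s in top_snippets)
    + f"\nRequest: {query}"
)
# work_order = llm.generate(prompt)   # hypothetical LLM call, not a real API
```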

21 pages, 622 KB  
Article
Truth Is Better Generated than Annotated: Hierarchical Prompt Engineering and Adaptive Evaluation for Reliable Synthetic Knowledge Dialogues
by Hyeongju Ju, EunKyeong Lee, Junyoung Kang, JaKyoung Kim and Dongsuk Oh
Appl. Sci. 2026, 16(3), 1387; https://doi.org/10.3390/app16031387 - 29 Jan 2026
Viewed by 138
Abstract
Large Language Models (LLMs) have demonstrated exceptional performance in knowledge-based dialogue generation and text evaluation. Synthetic data serves as a cost-effective alternative for generating high-quality datasets; however, it is often plagued by hallucinations, inconsistencies, and self-anthropomorphized responses. Concurrently, manual construction of knowledge-based dialogue datasets remains bottlenecked by prohibitive costs and inherent human subjectivity. To address these multifaceted challenges, we propose ACE (Automatic Construction of Knowledge-Grounded and Engaging Human–AI Conversation Dataset), a hybrid method using hierarchical prompt engineering. This approach mitigates hallucinations and self-personalization while maintaining response consistency. Furthermore, existing human and automated evaluation methods struggle to assess critical factors like factual accuracy and coherence. To overcome this, we introduce the Truthful Answer Score (TAS), a novel metric specifically designed for knowledge-based dialogue evaluation. Our experimental results demonstrate that the ACE dataset achieves higher quality than existing benchmarks, such as Wizard of Wikipedia (WoW) and FaithDial. Additionally, TAS aligns more closely with human judgment, offering a more reliable and scalable evaluation framework. Our findings demonstrate that leveraging LLMs through systematic prompting can substantially reduce reliance on human annotation while simultaneously elevating the quality and reliability of synthetic datasets.
(This article belongs to the Section Computing and Artificial Intelligence)

32 pages, 33186 KB  
Article
Satellite Mapping of 30 m Time-Series Forest Distribution in Hunan, China, Based on a 25-Year Multispectral Imagery and Environmental Features
by Rong Liu, Gui Zhang, Aibin Chen and Jizheng Yi
Remote Sens. 2026, 18(3), 426; https://doi.org/10.3390/rs18030426 - 28 Jan 2026
Viewed by 295
Abstract
Forests play a critical role in Earth’s ecosystem, yet monitoring their long-term, large-scale spatiotemporal dynamics remains a significant challenge. This study addresses this gap by developing an integrated framework to map annual forest distribution in Hunan, China, from 1999 to 2023 at a high resolution of 30 m. Our methodology combines multi-temporal satellite imagery (Landsat 5/7/8/9) with key environmental variables, including digital elevation models, temperature, and precipitation data. To efficiently reconstruct historical maps, training samples were automatically derived from a reliable 2023 forest product using a transferable logic, drastically reducing manual annotation effort. Comprehensive evaluations demonstrate the robustness of our approach: (1) Qualitative analyses reveal superior spatial detail and temporal consistency compared to existing global forest maps. (2) Rigorous quantitative validation based on ∼9000 reference samples confirms high and stable accuracy (∼92.4%) and recall (∼91.9%) over the 24-year period. (3) Furthermore, comparisons with government forestry statistics show strong agreement, validating the practical utility of the data. This work provides a valuable, accurate long-term dataset that forms a scientific basis for critical downstream applications such as ecological conservation planning, carbon stock assessment, and climate change research, thereby highlighting the transformative potential of multi-source data fusion and automated methods in advancing geospatial monitoring.
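The quantitative validation described above reduces to standard per-sample accuracy and recall against reference labels; a minimal sketch with toy forest/non-forest labels (stand-ins for the ~9000 reference samples) follows.

```python
# Hedged sketch: accuracy and recall of predicted forest labels against reference samples.
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

reference = np.array([1, 1, 0, 1, 0, 1, 0, 0, 1, 1])   # 1 = forest in reference data (toy)
predicted = np.array([1, 1, 0, 1, 0, 0, 0, 0, 1, 1])   # labels from the annual map (toy)

print(f"accuracy = {accuracy_score(reference, predicted):.1%}")
print(f"recall   = {recall_score(reference, predicted):.1%}")
```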
