Search Results (25,775)

Search Parameters:
Keywords = information extraction

21 pages, 9092 KB  
Article
Prediction of Rice Brown Spot Disease Using Spectral Indices Derived from UAVs and Machine Learning Models in Lambayeque and Cajamarca, Peru
by Juan Valdiviezo, Jaime Aguilar-Lome, María Jaramillo-Carrión, Luis Ángel Ruiz and Lia Ramos-Fernández
Drones 2026, 10(7), 495; https://doi.org/10.3390/drones10070495 (registering DOI) - 29 Jun 2026
Abstract
Rice brown spot, caused by Bipolaris oryzae, is an important constraint for rice production and requires timely field-scale monitoring. This study evaluated the use of multispectral bands acquired with a UAV-mounted sensor, together with vegetation indices, combined with machine-learning models to estimate rice brown spot severity under field conditions in Lambayeque and Cajamarca, Peru. A total of 37 sampling observations were collected across the vegetative, flowering, and milk-ripening stages. Spectral variables were extracted from UAV orthomosaics and related to field-based disease severity assessments. The strongest correlations with severity were observed for NDRE (r = −0.83) and NPCI (r = 0.77). Three regression models were evaluated using leave-one-out cross-validation (LOOCV): support vector regression with radial basis function kernel (SVR-rbf), support vector regression with linear kernel (SVR-linear), and Random Forest (RF). The SVR-linear model showed the lowest prediction error using NDRE, GREEN, and BLUE as predictors (R2_CV = 0.76; RMSE_CV = 1.31), although its performance was very similar to that of SVR-rbf and RF. These results indicate that UAV-derived multispectral information can support plot-level estimation of rice brown spot severity. However, model performance should be interpreted cautiously because of the small dataset, heterogeneous disease conditions, and moderate prediction accuracy. Further studies with larger and independent datasets are needed to improve robustness and transferability. Full article
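The leave-one-out evaluation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it swaps in a one-predictor ordinary least squares line for the SVR and Random Forest models, and the reflectance and severity values are made up. The NDRE formula is the standard (NIR − red edge) / (NIR + red edge) ratio.

```python
import math

def ndre(nir, red_edge):
    """Normalized Difference Red Edge index from two reflectance values."""
    return (nir - red_edge) / (nir + red_edge)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def loocv_rmse(xs, ys):
    """Leave-one-out CV: refit without sample i, then predict sample i."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical (NIR, red-edge) reflectances and severity scores, illustration only.
bands = [(0.52, 0.30), (0.48, 0.31), (0.45, 0.33), (0.40, 0.34), (0.38, 0.35)]
severity = [1.0, 2.0, 3.5, 5.0, 6.0]
index_values = [ndre(nir, red_edge) for nir, red_edge in bands]
print(round(loocv_rmse(index_values, severity), 3))
```

With only 37 observations, as in the study, each fold holds out a single plot, which is why LOOCV is a common choice for small field datasets.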

32 pages, 270887 KB  
Article
DCFP-YOLO: A Dual-Backbone Feature Fusion Network for Multi-Pose Chili Flower Recognition and Edge Deployment
by Minqiu Kuang, Xiaojian Li, Fangping Xie, Shang Chen, Dawei Liu, Yang Xiang, Bei Wu, Feng Liu, Yuxuan Zhang and Xu Li
Agriculture 2026, 16(13), 1422; https://doi.org/10.3390/agriculture16131422 (registering DOI) - 29 Jun 2026
Abstract
To address the challenges of difficult feature extraction and insufficient recognition accuracy caused by the small size of chili flowers, occlusion by branches and leaves, and illumination variations in complex field environments, a dual-backbone-based chili flower pose estimation algorithm, termed DCFP-YOLO, is proposed. Built upon the YOLO11n framework, the proposed method performs classification and recognition of five typical upward-oriented chili flower poses. To alleviate the loss of local detail features of small chili flowers under complex backgrounds, a dual-backbone feature extraction network composed of StarNet and ShuffleNetV2 is constructed. Specifically, the StarNet backbone enhances the extraction of fine-grained local features from key floral regions, while the ShuffleNetV2 backbone improves the perception of global spatial structural information. The complementary fusion of dual-backbone features strengthens the representation capability of chili flower pose features in complex environments. To mitigate the attenuation of shallow detail information during multi-scale feature transmission, a Bidirectional Multi-branch Auxiliary Feature Pyramid Network (BiMAFPN) is designed to enhance feature propagation through cross-scale feature interaction, thereby improving pose recognition performance under occlusion and overlapping conditions. Furthermore, a Programmable Gradient Information (PGI)-assisted training mechanism is introduced to optimize gradient propagation paths and alleviate information bottlenecks in deep networks, thereby enhancing the robustness of multi-pose feature extraction under occlusion, blur, and complex illumination conditions. Experimental results demonstrate that DCFP-YOLO achieves recall, mAP50, and mAP50-95 values of 87.4%, 92.0%, and 66.9%, respectively, representing improvements of 1.7, 1.3, and 3.5 percentage points over the baseline model. Overall performance surpasses that of current mainstream object detection algorithms.
After deployment on the NVIDIA Jetson AGX Orin platform, the model achieves an inference speed of 20.9 frames/s, which can basically satisfy the real-time perception requirements of chili flower pose recognition in complex agricultural environments. The proposed method provides an effective visual perception framework for chili flower pose recognition in complex agricultural environments. Rather than constituting a complete robotic pollination solution, the developed model serves as a potential perception component for future intelligent pollination robotic systems, providing reliable flower pose information for subsequent research on target localization, end-effector alignment, and robotic pollination in unstructured greenhouse environments. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

17 pages, 1031 KB  
Article
Augmented Disentanglement and Aggregation for Nested Named Entity Recognition
by Jinjin Zhang, Kun Zhang, Chengliang Zhong and Ruhan A
Electronics 2026, 15(13), 2840; https://doi.org/10.3390/electronics15132840 (registering DOI) - 29 Jun 2026
Abstract
Nested named entity recognition (Nested NER) aims to identify and classify all possible span entities within a text. Existing approaches primarily rely on enumeration techniques and span-based methods to address the challenge of overlapping entities. However, these methods often overlook the structural distribution and inherent semantics of entities, making them susceptible to issues such as ambiguous start-end tokens, blurred entity boundaries, and a high degree of token overlap. In this paper, we propose a novel strategy we name Augmented Disentanglement and Aggregation for Nested Named Entity Recognition (ada-NER), which employs a series of augmentation strategies to extract nested entities from text. Specifically, we first reformulate the nested NER task as a problem of disentangling and aggregating the relationships between span recognition and type classification. This formulation enables the model to capture fine-grained and comprehensive contextual interactions within sentences. Furthermore, span entities are recognized through the joint modeling of hard boundary encoding and soft edge encoding, while type classification is enhanced by incorporating both intra and inter distribution relationships as well as dependency information. Finally, we introduce a well-designed fusion mechanism to obtain entity representation within a shared space. Extensive experiments on two public datasets demonstrate the effectiveness of our proposed model, which consistently outperforms competitive baseline methods. Full article
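For readers unfamiliar with the enumeration baseline that this abstract contrasts itself with, the sketch below lists every candidate span up to a maximum length and checks whether one span is nested inside another. It does not implement ada-NER; the function names, the `max_len` parameter, and the example tokens are hypothetical.

```python
def enumerate_spans(tokens, max_len):
    """Return every (start, end) span, end exclusive, of length <= max_len."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans

def is_nested(inner, outer):
    """True when `inner` lies fully inside a different span `outer`."""
    return inner != outer and outer[0] <= inner[0] and inner[1] <= outer[1]

tokens = ["University", "of", "California", "campus"]
spans = enumerate_spans(tokens, max_len=3)

# "California" (2, 3) sits inside "University of California" (0, 3),
# which is the overlapping-entity case that nested NER has to handle.
print(len(spans), is_nested((2, 3), (0, 3)))
```

The number of candidates grows roughly as sentence length times `max_len`, and each candidate is scored independently, which is one way to read the abstract's point about blurred boundaries and heavy token overlap.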

43 pages, 2827 KB  
Article
MS-SENet: A Multi-Scale Squeeze–Excitation Network for Deep-Learning-Based Automatic Modulation Classification in Cognitive Radio Systems
by Evelio Astaiza Hoyos, Héctor Fabio Bermúdez-Orozco and Nasly Cristina Rodriguez-Idrobo
Future Internet 2026, 18(7), 343; https://doi.org/10.3390/fi18070343 (registering DOI) - 29 Jun 2026
Abstract
Automatic modulation classification (AMC) is a critical enabler of cognitive radio (CR) systems, allowing secondary users to identify primary user modulation schemes and adapt transmission parameters in real time. Traditional AMC approaches, based on likelihood functions or hand-crafted features, suffer from degraded performance under low signal-to-noise ratio (SNR) conditions and realistic channel impairments. In this paper, we propose MS-SENet (Multi-Scale Squeeze–Excitation Network), a novel deep-learning architecture that integrates multi-scale convolutional feature extraction, squeeze-and-excitation channel attention, residual learning, bidirectional long short-term memory (BiLSTM) temporal modelling, and global attention pooling into a unified framework for robust AMC. The multi-scale convolution module employs parallel branches with kernel sizes of 3, 5, and 7 to capture both fine-grained phase transitions and coarse envelope patterns from raw in-phase/quadrature (I/Q) signal samples. Squeeze–excitation residual blocks perform channel-wise feature recalibration, enabling the network to emphasize informative feature maps while suppressing less relevant ones. A bidirectional LSTM layer models temporal dependencies across the signal sequence, and a global attention pooling mechanism performs weighted temporal aggregation prior to classification. We present a comprehensive taxonomy of deep-learning architectures for AMC organised along five axes—input representation, feature extraction, temporal modelling, regularization strategy, and architectural complexity—and conduct a rigorous comparative evaluation against ten baseline architectures on a RadioML-style synthetic dataset (110,000 samples, 11 modulation classes, and 20 SNR levels from −20 to +18 dB). 
The experimental results demonstrate that MS-SENet achieves a mean classification accuracy of 87.9% at SNR ≥ 0 dB (the average of the medium and high SNR regime averages: 86.06% for 0 ≤ SNR < 10 dB and 89.68% for SNR ≥ 10 dB) while maintaining a compact footprint of approximately 406 K parameters, making it suitable for deployment on resource-constrained edge devices. We further analyze the robustness of the proposed architecture to multipath fading, carrier frequency offset, and sample rate offset, confirming its resilience under practical operating conditions. MS-SENet is an architecture designed for automatic modulation classification of I/Q signals and is not related to the homonymous architecture for speech emotion recognition. Full article
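The headline accuracy and dataset size quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch assumes the dataset is balanced across classes and SNR levels and that the SNR grid is uniformly spaced; neither is stated in the abstract, although 20 levels from −20 to +18 dB are consistent with a 2 dB step.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

medium_snr_acc = 86.06  # 0 <= SNR < 10 dB, as quoted in the abstract
high_snr_acc = 89.68    # SNR >= 10 dB, as quoted in the abstract
overall = mean([medium_snr_acc, high_snr_acc])

samples, classes, snr_levels = 110_000, 11, 20
# Assumption: samples are spread evenly over every (class, SNR) pair.
per_cell = samples // (classes * snr_levels)

# Assumption: uniform spacing, which gives a 2 dB step from -20 to +18 dB.
snr_grid = list(range(-20, 20, 2))

print(round(overall, 2), per_cell, len(snr_grid))
```

The mean of the two regime averages is 87.87%, which rounds to the 87.9% reported, and a balanced split would give 500 examples per class and SNR level.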
22 pages, 1967 KB  
Article
SAR Scatter-Wave Jamming Multiplexing Communication: Based on Small Phase Modulation
by Bochang Yu, Qidong Zhang, Guidao Lin, Jiaqi Chen, Ziyu Huang, Gaogao Liu, Hongfu Guo and Wen Wu
Remote Sens. 2026, 18(13), 2104; https://doi.org/10.3390/rs18132104 (registering DOI) - 29 Jun 2026
Abstract
This paper proposes a novel jamming multiplexing communication method for synthetic aperture radar (SAR) imaging in scatter-wave jamming (SWJ) scenarios. This method multiplexes communication information by modulating a weak phase in the slow time dimension of the SWJ signal. After receiving the signal, the ground communication station (CS) extracts the information according to the designed algorithm, thus achieving communication transmission. Specifically, a detailed signal model is established to describe the modulation and demodulation mechanism of the communication information. The impact of communication modulation on the SAR image is theoretically analyzed, and the bit error rate (BER) performance of the communication transmission is evaluated. Due to the use of high-gain matched filtering technology in the communication demodulation process, the proposed method achieves better communication BER performance in low signal-to-noise ratio (SNR) environments without significantly reducing jamming performance. Simulation results show that the proposed method achieves reliable information transmission while maintaining effective SWJ capability. Full article

18 pages, 615 KB  
Article
Early-Cycle Lifetime Prediction of Lithium-Ion Batteries with Ultra-Short Cycle Life Using Transferable Statistical Features
by Yuxiang Kuang, Dongxu Guo and Yuejiu Zheng
Batteries 2026, 12(7), 236; https://doi.org/10.3390/batteries12070236 (registering DOI) - 29 Jun 2026
Abstract
Early-cycle lifetime prediction of lithium-ion batteries is important for rapid cell screening, battery development, and manufacturing quality control. However, accurate prediction at the early stage remains difficult because capacity fade is usually very limited during the initial cycles, and the available degradation signals are weak. In this study, an early degradation voltage morphology (EDVM)-based framework is proposed for early cycle-life prediction. Two statistical features and one degradation mode voltage signature (DMVS) feature are extracted from the discharge capacity-difference profiles between the 10th and 3rd cycles and combined with an extreme gradient boosting (XGBoost) model. Validation on 138 commercial NCM811 cylindrical cells shows that the proposed framework achieves a mean absolute percentage error (MAPE) of 12.29% using only the first 10 cycles of data. In addition, the DMVS feature identifies three groups of early degradation behavior and provides physically interpretable information on degradation heterogeneity. These results indicate that the proposed method is an efficient and interpretable approach for early cycle-life prediction and has practical potential for battery evaluation and screening. Full article
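Two quantities named in this abstract have simple definitions that a short sketch makes concrete: the capacity-difference profile between the 10th and 3rd cycles, and the mean absolute percentage error used to report accuracy. The function names and the sample numbers are hypothetical, and the sketch assumes both cycles are sampled on the same voltage grid.

```python
def capacity_difference(q_cycle_10, q_cycle_3):
    """Point-wise discharge-capacity difference on a shared voltage grid."""
    if len(q_cycle_10) != len(q_cycle_3):
        raise ValueError("both cycles must be sampled on the same grid")
    return [late - early for late, early in zip(q_cycle_10, q_cycle_3)]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical cycle lives (in cycles) for three cells, illustration only.
true_life = [100, 200, 400]
predicted_life = [110, 180, 400]
print(round(mape(true_life, predicted_life), 2))
```

Because MAPE divides by the true cycle life, the same absolute error weighs more heavily on short-lived cells, which matters for the ultra-short-life cells this study targets.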

26 pages, 796 KB  
Article
Age-Aware Collaborative Scheduling for Ensuring Data Freshness in WBAN-Based Health Monitoring Systems
by Beom-Su Kim
Mathematics 2026, 14(13), 2303; https://doi.org/10.3390/math14132303 (registering DOI) - 29 Jun 2026
Abstract
Wireless body area networks (WBANs) for healthcare monitoring require age-of-information (AoI)-aware resource allocation under heterogeneous periodic and aperiodic traffic. Existing AoI-aware resource allocation methods can be broadly divided into centralized, decentralized, and hybrid approaches, but each has a structural limitation: centralized scheduling may allocate time slots to sources without newly generated samples, decentralized access may suffer from collision-induced delay under heavy contention, and fixed hybrid access may fail to adapt the scheduled and random access regions to the current traffic composition. To jointly address these limitations, this paper formulates a sample-wise weighted AoI minimization problem that accounts for source-specific sampling periods, transmission lengths, and priority weights, and proposes an online collaborative hybrid scheduler. The proposed method extracts traffic features at runtime, classifies sources as periodic or aperiodic, schedules periodic samples through contention-free access close to their sampling start times, and supports aperiodic samples through random access without pre-reserving slots. It further adapts the contention-free and random access regions according to the detected traffic composition. Simulation results show that the proposed scheduler reduces sample-wise weighted AoI compared with centralized and decentralized AoI schedulers by mitigating incorrect scheduling, reducing collision-induced delay, and improving superframe utilization. Full article
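The age-of-information quantity that this scheduler minimizes can be illustrated with a slot-based sketch. The abstract does not give the exact sample-wise weighted AoI formula, so the code below uses the common time-average definition as a stand-in: the age at slot t is the time since the freshest delivered sample was generated, averaged over a horizon and weighted per source. All names and numbers are hypothetical.

```python
def age_at(t, deliveries):
    """Age at slot t: time since generation of the freshest delivered sample.

    `deliveries` is a list of (generated_slot, delivered_slot) pairs.
    """
    freshest = None
    for generated, delivered in deliveries:
        if delivered <= t and (freshest is None or generated > freshest):
            freshest = generated
    if freshest is None:
        return t  # assumption: before any delivery, age is counted from slot 0
    return t - freshest

def weighted_mean_aoi(sources, horizon):
    """Sum over sources of weight * time-average age across `horizon` slots."""
    total = 0.0
    for weight, deliveries in sources:
        ages = [age_at(t, deliveries) for t in range(horizon)]
        total += weight * sum(ages) / horizon
    return total

sources = [
    (2.0, [(0, 1), (4, 5)]),  # higher-priority source with two deliveries
    (1.0, [(2, 3)]),          # lower-priority source with one delivery
]
print(weighted_mean_aoi(sources, horizon=8))
```

A slot granted to a source with no new sample leaves its age unchanged, which is the wasted allocation the abstract attributes to purely centralized scheduling.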

22 pages, 40458 KB  
Article
Mapping and Yield Estimation of Cultivated Alfalfa Using Cutting-Induced NDVI Peak–Trough Features from Sentinel-2 Time Series
by Jie Liu, Qisheng Feng, Shuai Fu, Tiangang Liang, Jinlong Gao and Wei Sun
Agronomy 2026, 16(13), 1255; https://doi.org/10.3390/agronomy16131255 (registering DOI) - 29 Jun 2026
Abstract
Alfalfa (Medicago sativa) is an important forage source for grassland agricultural development, so accurate and efficient remote sensing methods for alfalfa identification and yield estimation are of considerable interest. However, traditional methods for large-area crop identification and yield estimation are constrained by the limited spatial resolution of remote sensing data and a strong dependence on training data. In this study, using Sentinel-2 high-resolution imagery and the Google Earth Engine (GEE) platform, we constructed a cloud-free normalized difference vegetation index (NDVI) time-series dataset and proposed an effective method for alfalfa feature extraction and yield estimation. The results show that: (1) the producer’s accuracy, user’s accuracy, overall accuracy, and Kappa coefficient of alfalfa identification using the trough recognition algorithm were 98.51%, 91.67%, 94.26%, and 0.88, respectively. The total area of cultivated alfalfa identified in the study area in 2020 was estimated at 46,793.21 hm², and was mainly distributed in the northern region of the Qilian Mountains. (2) NDVI showed a highly significant correlation with alfalfa hay yield, and the power function regression model performed best, with an R² greater than 0.65. (3) The annual unit hay yield of four alfalfa cuttings was estimated at 17,497.55–32,962.10 kg/hm², with a total hay yield of 4.838 × 10⁸ kg and an average hay yield of 4464.95 kg/hm². The proposed method has significant application potential for automated and rapid remote sensing-based identification and yield estimation of large-scale alfalfa cultivation.
(This article belongs to the Section Precision and Digital Agriculture)
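The two ideas in this abstract, cutting-induced NDVI troughs and a power-function yield model, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' algorithm: the `min_drop` threshold, the sample NDVI series, and the log-log least squares fit (a simple stand-in for whatever regression procedure the paper uses) are all hypothetical. NDVI itself is the standard (NIR − red) / (NIR + red) ratio.

```python
import math

def ndvi(nir, red):
    """Normalized difference vegetation index from two reflectance values."""
    return (nir - red) / (nir + red)

def find_troughs(series, min_drop):
    """Indices of local minima at least `min_drop` below the preceding peak."""
    troughs = []
    last_peak = series[0]
    for i in range(1, len(series) - 1):
        last_peak = max(last_peak, series[i - 1])
        is_local_min = series[i] < series[i - 1] and series[i] <= series[i + 1]
        if is_local_min and last_peak - series[i] >= min_drop:
            troughs.append(i)
            last_peak = series[i]  # restart peak tracking after a cutting
    return troughs

def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y); returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx
    return math.exp(mean_y - b * mean_x), b

# Hypothetical NDVI time series with two sharp drops, illustration only.
season = [0.3, 0.6, 0.8, 0.35, 0.5, 0.75, 0.3, 0.55, 0.7, 0.72, 0.4]
print(find_troughs(season, min_drop=0.3))
```

Repeated peak-then-trough cycles within one season are what separate a mown forage crop from crops harvested once, so the trough count can act as the identifying feature without a large training set.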

20 pages, 4510 KB  
Review
Karst Rocky Desertification in Southwest China: Remote Sensing Progress, Critical Challenges, and Future Pathways
by Youyan Huang, Zhongfa Zhou, Qunyan Tang, Denghong Huang, Bo Li and Ying Luo
Appl. Sci. 2026, 16(13), 6459; https://doi.org/10.3390/app16136459 (registering DOI) - 29 Jun 2026
Abstract
Karst desertification is a major ecological and environmental issue that threatens regional ecological security and sustainable development; its dynamic monitoring is of great significance for evaluating the effectiveness of ecological restoration and promoting regional sustainable development. Based on the Web of Science database, this paper offers a bibliometric-informed narrative review of the evolution of remote sensing monitoring and information extraction technologies for karst desertification from 1987 to 2025. It focuses on analyzing research progress in methods such as multi-source remote sensing data fusion, deep learning models, and integrated GIS analysis, with regard to improving the accuracy of information extraction and the ability to identify spatiotemporal dynamics of karst desertification. This paper also compares the advantages and limitations of different technologies in terms of high-resolution identification and long-term dynamic monitoring. Building on this foundation and drawing on relevant domestic and international research findings, this study examines the development trends and major challenges facing karst desertification monitoring technologies. It further outlines the direction for establishing a long-term, standardized dynamic monitoring system, with the aim of providing a scientific basis for ecological governance and sustainable development in the karst regions of Southwest China. Full article

8 pages, 675 KB  
Perspective
Sovereign Large Language Models for Structured Data Extraction from Pathology Reports: A Perspective for the Clinical Laboratory
by Ravi Shankar
Laboratories 2026, 3(3), 9; https://doi.org/10.3390/laboratories3030009 (registering DOI) - 29 Jun 2026
Abstract
The surgical pathology report remains one of the richest yet least computable artefacts in the clinical record. Diagnostic, prognostic, and treatment-relevant information is recorded predominantly as a free-text narrative that resists aggregation for research, quality monitoring, and cancer registration, while manual abstraction is slow, costly, and difficult to scale. Large language models (LLMs) have rapidly emerged as a means of converting unstructured pathology narrative into structured, analysis-ready data. This perspective examines the current state of the evidence, with particular reference to breast pathology, and foregrounds the distinction between proprietary cloud-hosted models and locally deployed open-weight models. Recent comparative studies indicate that open-weight models can approach the accuracy of proprietary systems for structured extraction, offering a privacy-preserving and cost-controlled alternative that keeps protected health information inside the institutional firewall—a decisive advantage under data-protection regimes such as Singapore’s Personal Data Protection Act (PDPA) and Human Biomedical Research Act (HBRA). We argue that hybrid architectures—pairing deterministic rule-based extraction for unambiguous fields with local LLMs for narrative reasoning—currently offer the most defensible route to laboratory deployment. We also highlight the “reality gap” between synthetic benchmark performance and real-world clinical accuracy, and the need to align studies with emerging reporting and appraisal frameworks (TRIPOD-LLM, PROBAST + AI). Structured extraction is compatible with the quality and traceability expectations of accredited laboratories only when it is verified before use, monitored over time, and kept under human oversight. Full article

18 pages, 1957 KB  
Article
A Survivor-Based Multilayer Perceptron for Short-Term PV Power Forecasting
by Arif Yelği, Vedat Esen, Taner Dindar and Ali Samet Sarkın
Appl. Sci. 2026, 16(13), 6448; https://doi.org/10.3390/app16136448 (registering DOI) - 29 Jun 2026
Abstract
Accurate short-term power forecasting is essential for enhancing the efficiency and reliability of energy systems. However, conventional forecasting techniques struggle to capture nonlinear patterns in power time series, which makes it difficult to keep predictions both stable and accurate. This research presents a prediction framework that integrates a Multilayer Perceptron (MLP) with a survivor-based evolutionary selection strategy. The proposed network has three hidden layers of 32, 16, and 8 neurons, so that features are extracted progressively while essential information is preserved. A Survivor selection process retains the best-performing trained models for subsequent training phases, which improves both predictive accuracy and training efficiency. The combination of Survivor-based selection with MLP architectures for short-term power generation forecasting has received little attention in the existing literature, despite its promise. The proposed model is therefore evaluated against established baselines, including Linear Regression (LR), Random Forest (RF), and Support Vector Regression (SVR). Across 30 distinct trials, the proposed MLP (32-16-8) combined with the Survivor approach yields the lowest prediction errors, with a mean absolute error (MAE) of 5.3588 and a root mean square error (RMSE) of 10.0216. Furthermore, the Wilcoxon signed-rank test and paired t-test indicate that the proposed method significantly outperforms SVR and RF, while performing comparably to LR. The findings indicate that including a Survivor-based selection mechanism in the MLP training process is an effective and reliable method for forecasting short-term generation power.

33 pages, 1237 KB  
Article
Hypothesis-Informed Feature Stability Scoring for High-Dimensional ETL Pipelines
by Konstantin Piryankov, Iveta Grigorova, Aleksandar Karamfilov and Aleksandar Efremov
Appl. Sci. 2026, 16(13), 6445; https://doi.org/10.3390/app16136445 (registering DOI) - 28 Jun 2026
Abstract
High-dimensional financial Extract–Transform–Load (ETL) pipelines often contain heterogeneous variables whose statistical properties may change between recurring data deliveries, affecting feature reliability before downstream machine learning models are trained. This study extends a previously proposed Canberra-based data drift monitoring framework by introducing a hypothesis-informed feature stability component for automated feature assessment and prioritization. Unlike the prior descriptive framework, which relied on univariate and bivariate exploratory metrics, the proposed extension adds an inferential layer and evaluates how this layer changes feature ranking relative to the original score and alternative marginal drift measures. The method combines univariate deviations in summary statistics, bivariate deviations in dependency-related metrics, and hypothesis-based evidence from Anderson–Darling, Mann–Whitney U, and Levene tests. The resulting p-values are aggregated using a Landau-calibrated harmonic mean p-value formulation and transformed into a bounded hypothesis score, which is integrated into a composite variable-level stability ranking. The framework operates on precomputed exploratory data analysis (EDA) outputs, enabling scalable comparison between a validated reference dataset and a current ETL delivery. The proposed extension provides an interpretable and computationally efficient mechanism for identifying unstable features and supporting feature review, exclusion, or prioritization in automated machine learning pipelines. Full article
(This article belongs to the Special Issue Machine Learning-Based Feature Extraction and Selection: 2nd Edition)
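Two building blocks named in this abstract have standard textbook forms, sketched below: the Canberra distance between two vectors of summary statistics, and the harmonic mean p-value that combines several test results. The sketch stops there. The Landau calibration, the bounded hypothesis score, and the composite ranking are not reproduced, and the function names and sample values are hypothetical.

```python
def canberra(u, v):
    """Canberra distance; terms where both entries are zero contribute 0."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

def harmonic_mean_p(p_values, weights=None):
    """Weighted harmonic mean of strictly positive p-values (uncalibrated)."""
    n = len(p_values)
    if weights is None:
        weights = [1.0 / n] * n
    if any(p <= 0 for p in p_values):
        raise ValueError("p-values must be strictly positive")
    return sum(weights) / sum(w / p for w, p in zip(weights, p_values))

# Hypothetical summary statistics (mean, std, skew) for one feature.
reference_stats = [1.0, 2.0, 0.0]
current_stats = [1.0, 4.0, 0.0]
print(round(canberra(reference_stats, current_stats), 4))

# Hypothetical p-values from three two-sample tests on the same feature.
print(round(harmonic_mean_p([0.01, 0.5, 0.5]), 4))
```

Each Canberra term is scaled by the magnitudes being compared, so statistics on very different scales contribute comparably, and the harmonic mean is dominated by the smallest p-values, so one strongly significant test is not averaged away by two unremarkable ones.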
45 pages, 823 KB  
Article
An Information-Geometric Justification for Composite Coherence in Event-Based Narrative Extraction
by Brian Keith-Norambuena
Entropy 2026, 28(7), 732; https://doi.org/10.3390/e28070732 (registering DOI) - 28 Jun 2026
Abstract
Graph-based narrative extraction relies on a coherence function to score transitions between events, but the coherence metrics in current use are defined operationally and lack an information-theoretic foundation. We study the composite metric $C=\sqrt{A\cdot T}$, where $A$ is the angular similarity of document embeddings and $T=1-d_{JS}$ is the topic proximity through the Jensen–Shannon distance of soft cluster memberships, and we provide an information-geometric reading of this metric together with an axiomatic characterization of the geometric-mean combinator. On the product manifold $S^{d-1}\times\Delta_{+}^{K-1}$, the negative log-coherence decomposes additively into an angular and a topic cost. Because the Riemannian metric tensor induced by the Jensen–Shannon distance on the simplex is proportional to the Fisher information matrix, the topic component is locally consistent with the Fisher–Rao metric singled out by Chentsov’s theorem. Within a parametric family of combinators (the compensability spectrum), the geometric mean is the unique combinator consistent with four natural axioms (a boundary/veto condition, symmetry, log-additivity, normalization), and the construction also motivates a proper product metric $d_{\times}$ that we use as a reference distance. Experiments on four corpora spanning news and academic domains (40 to 6000 documents), three general-purpose embedding families (GPT-4/ada-002, MPNet, MiniLM-L6) plus citation-aware SPECTER2, and three alternative topic models (LDA, soft k-means, GMM) are consistent with the framework: the Fisher identity holds with R0.99, the geometric mean tracks $d_{\times}$ closely ($\rho=0.999$), and a downstream LLM-as-judge consistency check shows that the geometric mean is not empirically dominated by any alternative combinator or single-channel baseline.
Sweeping the compensability spectrum, the bottleneck-coherence gap between extracted storylines and random sequences splits into a symmetric component—maximized at the geometric mean on the four corpora above and a fifth, human-navigation corpus—and a displacement term; a cross-modal case study on a human-curated image narrative reproduces the same effect in a second modality. Together, these results provide an information-geometric justification for the composite coherence metric and articulate the conditions under which the geometric mean is the natural choice. Full article
(This article belongs to the Special Issue Information Theory in Artificial Intelligence)
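The composite coherence described in the abstract above can be sketched in a few lines. This sketch reads the metric as the geometric mean of A and T, which is an assumption based on the abstract's "geometric-mean combinator". Two further assumptions: angular similarity is taken as 1 − θ/π, and the Jensen–Shannon distance uses base-2 logarithms so that it is bounded by 1. The function names are illustrative, not the author's code.

```python
import math


def angular_similarity(u, v):
    """Assumed form: 1 - theta/pi, where theta is the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return 1.0 - math.acos(cos) / math.pi


def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logs, so the value lies in [0, 1]."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the divergence.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def coherence(u, v, p, q):
    """Geometric mean of angular similarity A and topic proximity T = 1 - d_JS."""
    a = angular_similarity(u, v)
    t = max(0.0, 1.0 - js_distance(p, q))
    return math.sqrt(a * t)


if __name__ == "__main__":
    # Illustrative embeddings and soft cluster memberships for two events.
    print(coherence([0.2, 0.9], [0.3, 0.8], [0.7, 0.3], [0.6, 0.4]))
```

Because the two channels are multiplied, a zero in either one forces the coherence to zero, which matches the boundary/veto condition listed among the axioms.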
26 pages, 744 KB  
Review
Cybersecurity Warnings as Safety-Relevant Learning Mechanisms: A Scoping Review of Behavioral Adaptation, Trust Calibration, and Risk Regulation
by Eleonora Chiantera, Lorenzo Arciulo and Francesco Di Nocera
Safety 2026, 12(4), 87; https://doi.org/10.3390/safety12040087 (registering DOI) - 28 Jun 2026
Abstract
Cybersecurity relies heavily on warning systems to regulate user behavior under uncertainty. These warnings (ranging from browser security dialogs to phishing alerts and enterprise security notifications) do more than convey information: they may alter the conditions under which users select, avoid, verify, report, or override security-related actions. When combined with feedback, they may also contribute to calibrated reliance and safer behavior over time. However, existing research remains fragmented across usable security, human–computer interaction, and safety-related decision-making, and is largely focused on short-term outcomes. As a result, limited attention has been given to how cybersecurity warnings function as risk-control and learning mechanisms within safety-relevant socio-technical systems. This scoping review maps how empirical studies have examined cybersecurity warning systems in relation to behavioral adaptation, trust calibration, and risk regulation, and whether they assess persistence, transfer, or learning over time, identifying recurring design patterns, critical trade-offs, and structural gaps. Following PRISMA-ScR 2018 guidelines, we searched major multidisciplinary and domain-specific databases, with no time frame limits, for empirical studies that examined cybersecurity warnings in relation to learning-relevant behavioral, cognitive, or performance outcomes. Seventeen studies met the inclusion criteria; this number reflects the review’s focused conceptual scope rather than the size of the cybersecurity-warning literature as a whole. Eligible studies included experimental, quasi-experimental, field, and mixed-method designs, but no included study assessed persistence or transfer over time. Data extraction focused on warning characteristics, learning and trust mechanisms, user behavior, and security outcomes. 
Across the included studies, research primarily evaluates immediate responses, such as clicks, choices, response time, and classification accuracy, whereas comprehension and corrective feedback are infrequently assessed, trust calibration is never formally measured, and persistence or transfer over time is assessed in none of the included studies. On this basis, the review proposes a learning-oriented framework for evaluating cybersecurity warnings beyond short-term compliance, emphasizing feedback, calibrated reliance, risk-regulation responses, and direct tests of maintenance and transfer over time. Full article
30 pages, 16160 KB  
Article
Structural Prior-Guided Adaptive Wavelet Denoising for Single-Channel Dolphin Whistles
by Ru Wu, Xiang Zhou, Wen Chen, Peibin Zhu and Xiaomei Xu
J. Mar. Sci. Eng. 2026, 14(13), 1185; https://doi.org/10.3390/jmse14131185 (registering DOI) - 28 Jun 2026
Abstract
The continuous, narrowband time-frequency structure of dolphin whistles is an important information carrier for target detection, behavioral analysis, and ecological monitoring in passive acoustic monitoring. However, ocean noise can easily obscure whistle time-frequency contours, blur their boundaries, and cause local discontinuities, thereby reducing the reliability of subsequent acoustic analysis. Existing denoising methods based on transform-domain thresholding and spectral-domain statistical modeling can attenuate background interference to some extent. However, without explicit structural constraints, these methods still have difficulty achieving a satisfactory balance between noise suppression and preservation of the whistle time–frequency structure. To address this problem, this study proposes a Structural Prior-Guided Adaptive Wavelet Denoising (SPG-AWD) method for single-channel unsupervised scenarios. The proposed method introduces structural priors at two levels: adaptive subband selection and terminal node denoising. At the first level, subband nodes are adaptively split, retained, or suppressed based on stationary wavelet packet recursive decomposition and the distribution of candidate structures. At the second level, a structural mask satisfying local grouped-energy and two-dimensional time–frequency connectivity constraints is extracted, and a continuous whistle-presence probability is obtained through a signed distance transform. This probability is then used to jointly guide local noise power spectral density estimation and protective Wiener gain fusion. Simulation results show that, under real recorded background noise and ship noise conditions, SPG-AWD achieves favorable overall denoising performance when the input SNR is higher than −16 dB, while maintaining a more stable balance between target region energy preservation and non-target region noise suppression. 
Experiments on real recordings further demonstrate that the proposed method can effectively suppress in-band noise components within the whistle-bearing frequency range, better preserve continuous main frequency contours, and improve the overall perceptibility of whistle contours, confirming its applicability to single-channel dolphin whistle denoising in complex underwater noise environments. Full article
(This article belongs to the Section Ocean Engineering)
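The abstract above says that a whistle-presence probability guides a protective Wiener gain, without giving the fusion rule. The sketch below is therefore only an illustration under stated assumptions: the Wiener gain is the textbook ξ/(1+ξ) with ξ estimated by power subtraction, and `protected_gain` is a hypothetical convex blend toward unity gain, not the SPG-AWD formulation.

```python
def wiener_gain(noisy_power, noise_power):
    """Textbook Wiener gain xi / (1 + xi) for one time-frequency bin."""
    if noise_power <= 0:
        raise ValueError("noise_power must be positive")
    # Assumption: the a priori SNR xi is estimated by simple power subtraction.
    xi = max(noisy_power / noise_power - 1.0, 0.0)
    return xi / (1.0 + xi)


def protected_gain(gain, presence_prob):
    """Hypothetical fusion: blend toward unity gain where a whistle is likely."""
    p = min(max(presence_prob, 0.0), 1.0)  # clamp the probability to [0, 1]
    return p * 1.0 + (1.0 - p) * gain


if __name__ == "__main__":
    # Illustrative bins: (noisy power, estimated noise power, presence probability).
    bins = [(4.0, 1.0, 0.9), (1.2, 1.0, 0.1), (0.8, 1.0, 0.0)]
    for noisy, noise, prob in bins:
        print(protected_gain(wiener_gain(noisy, noise), prob))
```

With this blend, bins judged likely to contain a whistle are attenuated less than the plain Wiener gain would attenuate them, while bins with low presence probability keep the full noise suppression.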