MDPI - Publisher of Open Access Journals

17 pages, 8857 KB

Open Access Article

An Interpretable Deep Learning System for Fine-Grained Classification and Longitudinal Tracking of Neonatal Auricular Deformities

by Yihui Feng, Xujun Hu, Xiwen Zhang, Xiaobao Ma, Jialin Xie, Jianyong Chen and Yangyang Yuan

Biology 2026, 15(13), 985; https://doi.org/10.3390/biology15130985 (registering DOI) - 23 Jun 2026

Viewed by 180

Abstract

Early non-invasive correction of neonatal auricular deformities is highly dependent on timely and precise diagnosis. However, clinical practice is often compromised by the subjectivity of visual assessments and the lack of objective tracking metrics, which frequently leads to missed optimal treatment windows. To address these challenges, we developed an interpretable deep learning-based diagnostic system for the automated screening and fine-grained classification of these deformities. Methodologically, a large-scale, multi-source dataset (n = 4644) was curated to support model training. The system pairs an automated object detector (YOLOv11) for background-reduced region-of-interest isolation with a cascaded classification pipeline optimized via ConvNeXt-Tiny. Crucially, we introduced a supervised contrastive learning module to project high-dimensional morphological features into a continuous severity score, enabling quantitative longitudinal tracking of therapeutic efficacy. To evaluate generalization and robustness, the framework underwent rigorous evaluation across three independent real-world cohorts and one controlled synthetic stress test. The system achieved 88.2% accuracy (Area Under the Curve (AUC): 0.949) in binary screening and 87.4% accuracy (macro-AUC: 0.976) in multi-class subtyping on the internal baseline. To enhance interpretability and build clinical trust, Gradient-weighted Class Activation Mapping (Grad-CAM) was utilized to explore the spatial distribution of the model’s attention, which frequently aligned with key anatomical landmarks. Furthermore, the learned severity scores robustly quantified post-intervention improvements (p = 0.0004), effectively capturing subtle anatomical normalization. 
While validation for rare subtypes remains underpowered, and the severity score currently functions mainly as a learned morphological similarity index requiring future clinical calibration, this study ultimately provides an objective and standardized web-based tool to facilitate the early intervention and precision management of neonatal auricular anomalies.

(This article belongs to the Special Issue AI Deep Learning Approach to Study Biological Questions (3rd Edition))

21 pages, 3029 KB

Open Access Article

ParaChromo: Scalable and Seam-Coherent Inference for 3D Genome Diffusion

by Xialin Su, Mingxiang Zhu, Wei Shang and Zhixin Ou

Electronics 2026, 15(13), 2750; https://doi.org/10.3390/electronics15132750 - 23 Jun 2026

Viewed by 77

Abstract

Diffusion models for 3D genome structures make inference an ensemble-generation and tiling problem. In the released ChromoGen workflow, millions of independent denoising trajectories are executed through a single-GPU path, while overlapping genomic windows are sampled without enforcing consistency of their shared physical interval. We introduce ParaChromo, a parallel inference framework for conditioned, tiled 3D genome diffusion workloads built around the trained diffusion U-Net and distance-map interface. ParaChromo organizes the workload into three inference-layer modules: a workload-dispatch module schedules region, guidance, and sample chunks across worker groups; an encoder-aware sharded-conditioning module scales and shards the EPCOT front end with FSDP while keeping the inner-loop U-Net replicated; and a seam-coherent tiled-synchronization module projects the shared 12-bead overlap of adjacent reverse chains in distance-map space. On eight A6000 GPUs, the combined reduced-step and task-parallel systems path raises throughput from $2.356 \pm 0.003$ to $235.71 \pm 1.120$ samples/s, a $100.04 \pm 0.486$-fold gain over the released single-GPU baseline. The reduced-step setting is supported by a sweep from 50 to 1000 DDIM steps, where distance-distribution and Hi-C-based metrics remain stable across four chromosomes. For the synchronization module, the chr22 seam discrepancy falls from 150.9 pm to 7.9 pm, while matched internal and Hi-C-based quality metrics are preserved. The synchronized chr22 run also gives a chromosome-scale coordinate rendering over 32 paper-aligned tiles. Together, these results show that conditioned, tiled 3D genome diffusion can be executed as a scalable workload when throughput parallelism, sampler length, encoder placement, and spatial consistency are treated as separate but compatible constraints.
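The fold gain quoted above can be cross-checked from the two throughput figures. The sketch below assumes the two measurements are uncorrelated and uses first-order error propagation; the abstract does not say how the authors derived their spread, so the helper `fold_gain` is only an illustration.

```python
import math

# Throughput figures quoted in the abstract as (mean, spread), in samples/s.
baseline = (2.356, 0.003)
parallel = (235.71, 1.120)

def fold_gain(new, old):
    """Ratio new/old with first-order error propagation (assumes no correlation)."""
    ratio = new[0] / old[0]
    rel = math.sqrt((new[1] / new[0]) ** 2 + (old[1] / old[0]) ** 2)
    return ratio, ratio * rel

gain, gain_err = fold_gain(parallel, baseline)
print(f"{gain:.2f} +/- {gain_err:.2f}")
```

Under these assumptions the ratio and its spread land close to the reported $100.04 \pm 0.486$; the small difference would be expected if the published value was averaged over paired runs instead.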

(This article belongs to the Special Issue Advances in 3D Computer Vision and 3D Data Processing)

26 pages, 8518 KB

Open Access Article

CVA-Net: Multi-View 3D Reconstruction for Fringe Projection Profilometry via Cross-View Attention and Sim2Real Learning

by Zuqiong Chen, Xiaopin Zhong and Yibin Tian

Photonics 2026, 13(6), 601; https://doi.org/10.3390/photonics13060601 (registering DOI) - 21 Jun 2026

Viewed by 221

Abstract

Fringe projection profilometry (FPP) is widely used for 3D reconstruction, but conventional single-view FPP systems suffer from inherent occlusions and shadow regions, leading to incomplete surface recovery. In this study, we propose CVA-Net, an end-to-end deep learning framework with cross-view attention (CVA) that directly reconstructs dense depth maps from multi-view fringe patterns. CVA-Net simultaneously processes four fringe images acquired from orthogonal projection directions and leverages a CVA module to explicitly model inter-view dependencies, enabling adaptive fusion of complementary information. A 3D U-Net backbone with attention gates, atrous spatial pyramid pooling (ASPP), and an auxiliary parameter estimation branch further enhances reconstruction accuracy and structural consistency via multitask learning. To support Sim2Real network training, we build a Blender-based digital twin of a multi-view FPP system and generate a large-scale synthetic dataset with perfect ground truth. Extensive experiments on both synthetic and real-world objects demonstrate that CVA-Net significantly outperforms state-of-the-art single-view methods. With a symmetric four-view configuration and fringe period of 8, CVA-Net achieves an MAE of 0.0359 mm, an MSE of 0.0379 mm² and an RMSE of 0.1947 mm, reducing the MAE, MSE, and RMSE by 32.8%, 54.1%, and 32.2%, respectively, compared to the best single-view competitor. Ablation studies validate the contribution of each architectural component, while real-system experiments demonstrate the feasibility of transferring a network trained purely on synthetic data to practical FPP measurements without domain adaptation. Although further improvements are required to enhance reconstruction accuracy under real imaging conditions, the proposed framework provides an effective initial step toward bridging the gap between digital-twin-based training and real-world multi-view FPP applications. 
CVA-Net provides a robust, occlusion-aware solution for multi-view FPP reconstruction.
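For readers comparing the three error figures, a minimal sketch of how MAE, MSE, and RMSE relate on a depth map follows. The function name `depth_errors` and the toy depths are hypothetical and are not taken from the paper; the final check only confirms that the reported RMSE is the square root of the reported MSE.

```python
import math

def depth_errors(pred, truth):
    """Return (MAE, MSE, RMSE) for two equally sized flat lists of depths in mm."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    diffs = [p - t for p, t in zip(pred, truth)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, math.sqrt(mse)

# Hypothetical toy depths (mm), not data from the paper.
pred = [10.0, 10.2, 9.9, 10.5]
truth = [10.0, 10.0, 10.0, 10.0]
mae, mse, rmse = depth_errors(pred, truth)

# Consistency check on the reported figures: RMSE should equal sqrt(MSE).
reported_mse, reported_rmse = 0.0379, 0.1947
assert abs(math.sqrt(reported_mse) - reported_rmse) < 1e-3
```

Because RMSE squares each error before averaging, a few large outliers raise it well above the MAE, which is consistent with the gap between 0.0359 mm and 0.1947 mm in the abstract.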

(This article belongs to the Special Issue Optical Imaging for 3D Surface and Phase Recovery: Techniques and Applications)

22 pages, 338 KB

Open Access Article

Quillen–Suslin Theorem for Connected Cochain DG Algebras

by Xuefeng Mao and Biyan Zhu

Mathematics 2026, 14(12), 2215; https://doi.org/10.3390/math14122215 - 20 Jun 2026

Viewed by 94

Abstract

Let $A$ be a connected cochain differential graded algebra and $P$ a finitely generated differential graded $A$-module. We show that $P$ is semi-free if it is semi-projective, and that it is categorically free if it is categorically projective. This can be considered a generalization of the well-known Quillen–Suslin Theorem to the differential graded context. As an application, we show that the ghost length and the cone length of a compact differential graded module coincide.

23 pages, 5651 KB

Open Access Article

Rotation-Equivariant Feature Learning on Polar BEV for Robust LiDAR Place Recognition

by Zhenhuan Yuan, Youchun Xu, Zhichao Zhang, Yuan Zhu, Jianshi Li, Feng Lu, Le Wang, Jinsheng Chen and Wei Lei

Appl. Sci. 2026, 16(12), 6155; https://doi.org/10.3390/app16126155 - 17 Jun 2026

Viewed by 203

Abstract

LiDAR-based place recognition is critical for long-term autonomous navigation in Global Navigation Satellite System (GNSS)-denied environments, yet existing methods struggle to balance accuracy and efficiency under substantial yaw rotations. This paper proposes a robust framework based on a multi-channel polar bird’s-eye-view (BEV) representation. Under yaw-dominated revisits, the polar BEV image transforms yaw rotation into cyclic column shifts, providing a useful structural prior for rotation-equivariant feature extraction. Raw point clouds are projected onto polar BEV grids encoding density, height, and intensity. A rotation-equivariant feature extractor comprising a Radial Compression Module and a rotation-equivariant Transformer module captures long-range azimuthal dependencies via Conditional Positional Encoding and Circular Relative-Position Bias. The equivariant features are aggregated by NetVLAD into a compact global descriptor, trained end-to-end with a hard-example mining triplet loss. Extensive experiments on the public KITTI and NCLT datasets, as well as our self-constructed LiDAR Place Recognition Revisit (LPRR) dataset, demonstrate competitive performance on KITTI and superior performance on NCLT and LPRR among the compared methods. The proposed framework achieves a favorable trade-off between performance and computational cost, and shows promising cross-dataset generalization on the evaluated NCLT and LPRR datasets without fine-tuning.
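The claim that a yaw rotation becomes a cyclic column shift in a polar BEV image can be illustrated with a small density-only grid. This is a sketch under simplifying assumptions: the grid size, the helper names, and the toy points are hypothetical, only point counts are encoded (no height or intensity channels), and the yaw is an exact multiple of the sector width.

```python
import math

def polar_bev(points, n_rings=4, n_sectors=8, max_range=20.0):
    """Count points per cell; rows are range bins, columns are azimuth bins."""
    grid = [[0] * n_sectors for _ in range(n_rings)]
    for x, y in points:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # drop points outside the grid
        theta = math.atan2(y, x) % (2 * math.pi)
        ring = min(int(r / max_range * n_rings), n_rings - 1)
        sector = int(theta / (2 * math.pi) * n_sectors) % n_sectors
        grid[ring][sector] += 1
    return grid

def rotate(points, yaw):
    """Rotate 2D points counter-clockwise by yaw radians about the sensor origin."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def roll_columns(grid, k):
    """Cyclically shift every row right by k columns."""
    n = len(grid[0])
    return [[row[(j - k) % n] for j in range(n)] for row in grid]

# Toy points placed at cell centres so binning is unambiguous (hypothetical data).
step = 2 * math.pi / 8
polar = [(2.5, 0.5 * step), (7.5, 2.5 * step), (7.5, 2.5 * step), (17.5, 6.5 * step)]
points = [(r * math.cos(a), r * math.sin(a)) for r, a in polar]

original = polar_bev(points)
rotated = polar_bev(rotate(points, 3 * step))  # yaw of exactly three sectors
assert rotated == roll_columns(original, 3)
```

For a yaw that is not a multiple of the sector width, points near cell boundaries can fall into neighbouring columns, so the shift relation holds only approximately.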

(This article belongs to the Section Robotics and Automation)

25 pages, 1601 KB

Open Access Article

A Centralized AI Lakehouse Framework for Brain Tumor MRI Classification and Segmentation, University KPI Forecasting, and Water Potability Prediction

by Ronish Shrestha, Md Masud Rana, Bo Sun, Frank Sun, Helen Lou and Alek Hutson

Sensors 2026, 26(12), 3804; https://doi.org/10.3390/s26123804 - 15 Jun 2026

Viewed by 217

Abstract

In many university and healthcare projects, models are built for very different data types such as tables, institutional time series, and medical images, but they are deployed as separate applications. In this work, that separation made testing and maintenance difficult because each module had its own pipeline and runtime requirements. This paper presents an integrated AI lakehouse-style implementation that runs three model pipelines inside one containerized backend. For medical imaging, we used MRI datasets from IEEE DataPort: a four-class classification set with 7012 images (5708 train/1304 test) and a segmentation set with 3063 image–mask pairs. The classification model (ResNet50 transfer learning) is evaluated using a proper train–validation–test protocol across multiple splits (80/10/10, 70/10/20, 60/10/30, and 10/30/60), achieving a test accuracy of 99.00% under the standard 80/10/10 split. Additionally, a patient-level evaluation is conducted using an external glioma dataset to provide a more realistic assessment without data leakage. The segmentation model (DeepLabV3-ResNet50) achieved 83.09% validation mIoU and 88.79% Dice score. For university KPI forecasting, we used annual IPEDS and NSF HERD data from 2010 to 2023 for three universities (BSU, EOU, and UAB). To examine the effect of preprocessing on forecasting performance, two case studies are conducted. In the first case, linear interpolation is applied to generate semester-level data. In the second case, the original annual data is used directly without interpolation. Random Forest regression and ARIMA models are evaluated using MAE, RMSE, MAPE, and $R^{2}$. The results showed that interpolation improved apparent forecasting performance due to smoothing, while evaluation on the original annual data provided a more realistic assessment of model behavior. To further validate the framework on a larger dataset, an additional case study is conducted using a student dropout dataset. For water potability, we trained and compared multiple tabular classifiers on a large dataset (1,048,575 samples). A Random Forest model (100 trees, max depth 10) achieved 85.86% test accuracy and high recall for unsafe samples (0.8447). All modules are served via FastAPI and deployed together using Docker, with workflow automation routing requests to the correct endpoint. System-level benchmarking indicates that the backend maintains stable throughput and latency under concurrent requests.
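The interpolation step in the KPI case study can be pictured with a short sketch. It assumes the simplest scheme, one midpoint inserted between consecutive annual values; the function name `annual_to_semester` and the KPI numbers are hypothetical, and the paper may use a different interpolation grid.

```python
def annual_to_semester(values):
    """Insert the midpoint between consecutive annual values (linear interpolation)."""
    if not values:
        return []
    out = [values[0]]
    for prev, nxt in zip(values, values[1:]):
        out.append((prev + nxt) / 2)  # synthetic mid-year point
        out.append(nxt)
    return out

annual = [100.0, 110.0, 130.0]  # hypothetical KPI values, one per year
semester = annual_to_semester(annual)
# semester == [100.0, 105.0, 110.0, 120.0, 130.0]
```

Every inserted point lies exactly on the line between two observed values, so a model scored on the interpolated series is partly rewarded for reproducing straight-line segments. That is one plausible reading of why the abstract treats the interpolated results as optimistic.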

(This article belongs to the Special Issue AI-Empowered Internet of Things)

27 pages, 10092 KB

Open Access Article

Online Digital Tools for Expert Assisted Self-Evaluation of Environmental Impact: Benchmarking, Synthetic Data Generation and Advanced Analytics Based on Use Case Life Cycle Assessment

by David F. Nettleton, David Fernández Gutiérrez, Hasler Iglesias Yañez, Daniele Spinelli, Matteo Maccanti, Poojan Timilsina, Isay González, Paulina Guajardo and Emad Yaghmaei

Appl. Sci. 2026, 16(12), 6047; https://doi.org/10.3390/app16126047 - 15 Jun 2026

Viewed by 242

Abstract

Background: This paper presents the development of digital tools created within the BIORADAR European project to improve user access to Life Cycle Assessment (LCA) results from the project’s use cases and to enable users to upload, benchmark and analyze their own data. The work addresses common challenges in circularity and environmental impact assessment, particularly data availability and expert-assisted self-assessment for users such as small- and medium-sized enterprises. Methods: The LCA data for the project use cases is calculated using the Environmental Footprint methodology. Benchmarking compares bio-approach use cases with traditional approaches across three key sectors selected by the BIORADAR project: fertilizer, textile and packaging. These sectors are recognized by the European Commission as three of the most important sectors in terms of environmental impact. Case impact factor data are normalized using a reference statistic, and a weighting is assigned to each key performance indicator to calculate the global score. Individual impact factor values can also be used for benchmarking. Synthetic data are generated through an advanced statistical decomposition algorithm. Advanced data analytics are provided with clustering and a decision tree algorithm using supervised machine learning. Results: Two examples of decision-oriented case studies are used to illustrate how the platform can support the interpretation and use of already computed LCA results in realistic settings. The web-based expert-assisted self-assessment tool, developed in JavaScript, allows users to input their data, benchmark them against project results and perform multidimensional data analysis. 
The resulting digital tools provide access to LCA data for each use case, generate realistic synthetic datasets preserving key statistical properties, support benchmarking of both project and user-uploaded cases, and perform data analytics, which complement the benchmarking module with a structural and exploratory interpretation of the data. Conclusions: Overall, the tools integrate use case benchmarking, data processing, advanced analytics and user interfaces to facilitate environmental self-assessment and comparison within the BIORADAR framework.
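The benchmarking step described in the Methods (normalize each impact factor by a reference statistic, then weight the indicators into a global score) can be sketched as follows. Everything specific here is an assumption: the abstract does not name the reference statistic or the weights, so the indicator names, values, and the function `global_score` are purely illustrative. The tool itself is reported to be written in JavaScript; Python is used here only for brevity.

```python
def global_score(impacts, reference, weights):
    """Weighted mean of impact values, each divided by its reference value."""
    if set(impacts) != set(reference) or set(impacts) != set(weights):
        raise ValueError("all three mappings need the same indicator keys")
    total_w = sum(weights.values())
    return sum(weights[k] * impacts[k] / reference[k] for k in impacts) / total_w

# Hypothetical indicators and numbers, not values from the BIORADAR project.
impacts = {"climate_change": 8.0, "water_use": 3.0}
reference = {"climate_change": 10.0, "water_use": 2.0}
weights = {"climate_change": 0.6, "water_use": 0.4}

score = global_score(impacts, reference, weights)
```

With these toy inputs the normalized values are 0.8 and 1.5, so the score is above 1 even though one indicator beats its reference. Keeping the per-indicator values available alongside the global score, as the abstract describes, avoids hiding that kind of trade-off.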

(This article belongs to the Section Green Sustainable Science and Technology)

34 pages, 7055 KB

Open Access Article

Extending a Vision–Language Model with Audio Understanding: Introducing Qolda-AVL for the Kazakh Language

by Batyr Arystanbekov, Akylbek Maxutov, Aspandiyar Nurimanov and Huseyin Atakan Varol

Big Data Cogn. Comput. 2026, 10(6), 192; https://doi.org/10.3390/bdcc10060192 - 15 Jun 2026

Viewed by 257

Abstract

Recent advances in multi-modal large language models have enabled systems to jointly process text, images, and audio. However, these developments have primarily benefited high-resource languages, leaving many low-resource communities underserved. In response, we introduce Qolda-AVL, a compact five-billion-parameter audio–vision–language model tailored for Kazakh. Qolda-AVL extends our previous Qolda vision–language model by adding a dedicated audio perception branch while maintaining strong visual and linguistic performance. Built on the Qwen3-VL-Thinking backbone, we incorporate Audio DeepStack, which transfers features from three intermediate Whisper encoder layers into the first three layers of the language model using dedicated projections and residual connections. The model is trained through a four-stage pipeline: adapting the Whisper encoder and language model to Kazakh, aligning the new audio branch to the language backbone, and jointly fine-tuning all modules on chain-of-thought reasoning tasks across audio, image, and text. All audio, vision, and language capabilities are evaluated using the model’s native reasoning mode, and a chain-of-thought trace is generated before each final answer during the performance assessment. To facilitate further research, we open-source the model along with the adapted Kazakh versions of four audio benchmarks, covering spoken attribute reasoning, spoken mathematical question answering, and audio captioning with question answering.

21 pages, 4058 KB

Open Access Article

Intermember Simulation Uncertainty in North Pacific Tropical Cyclone Genesis Frequency Under the Influence of the Interdecadal Pacific Oscillation at Decadal-Scale

by Jianing Li, Zhen Wang, Jiuwei Zhao, Leying Zhang and Yue Li

Atmosphere 2026, 17(6), 604; https://doi.org/10.3390/atmos17060604 - 12 Jun 2026

Viewed by 178

Abstract

Substantial uncertainties remain in climate model simulations of tropical cyclones (TCs), particularly those associated with internal climate variability. While the influence of the El Niño–Southern Oscillation (ENSO) on interannual TC variability is well established, the contribution of the Interdecadal Pacific Oscillation (IPO) to decadal-scale uncertainty is less well constrained. Although models generally reproduce IPO-related variations in tropical cyclone genesis frequency (TCGF) over the eastern North Pacific, large discrepancies persist across the broader North Pacific basin. Clarifying the role of the IPO in modulating TCGF uncertainty is therefore essential for improving decadal TC projections. In this study, we analyze a large ensemble of historical simulations from the MRI-AGCM within the d4PDF (Database for Policy Decision Making for Future Climate Change) framework. Empirical orthogonal function (EOF) analysis is applied to IPO-composited fields to identify the leading modes of intermember simulation uncertainty (100 members × 60 years, 6000 samples) on the decadal scale. The results reveal that state-of-the-art models exhibit robust and spatially coherent uncertainty structures in TCGF under different IPO phases. Two leading modes are identified: (1) a South China Sea mode, closely associated with systematic precipitation biases, and (2) a zonal dipole mode between the eastern and western North Pacific, linked to the equatorward propagation of Arctic Oscillation (AO)-related variability. Misrepresentation of AO variability is found to contribute substantially to biases in simulated TCGF patterns. Comparisons with observational datasets further support the proposed mechanisms. These findings highlight the importance of improving the representation of precipitation processes and extratropical–tropical teleconnections in climate models, which is critical for enhancing the reliability of decadal predictions of North Pacific TC activity.
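As a rough illustration of what extracting a leading EOF from an ensemble involves, the sketch below finds the dominant spatial pattern of member-to-member anomalies by power iteration on the covariance matrix. It is a toy under stated assumptions: the function `leading_eof`, the three-point "grid", and the four "members" are hypothetical, no area weighting or IPO compositing is done, and production analyses would normally use an SVD routine from a numerical library.

```python
import math

def leading_eof(samples, iters=200):
    """Leading EOF of samples[n][p] (n members, p grid points) by power iteration."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    anom = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Covariance between grid points across members, shape p x p.
    cov = [[sum(a[i] * a[j] for a in anom) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0 + 0.1 * j for j in range(p)]  # arbitrary non-symmetric start
    for _ in range(iters):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to all variance")
        vec = [v / norm for v in nxt]
    eigval = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p))
    explained = eigval / sum(cov[i][i] for i in range(p))
    return vec, explained

# Hypothetical members: grid points 0 and 1 vary in opposition (a dipole).
members = [[1.0, -1.0, 0.0], [-1.0, 1.0, 0.0], [2.0, -2.0, 0.1], [-2.0, 2.0, -0.1]]
pattern, explained = leading_eof(members)
```

In this toy case the leading pattern has opposite signs at the first two grid points and explains almost all of the intermember variance, which is the kind of structure the abstract refers to as a dipole mode.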

(This article belongs to the Section Climatology)

27 pages, 7054 KB

Open Access Article

Building an Intelligent QA System for Smart City Planning: Integrating LLMs and Knowledge Graphs

by Chenjing Zhou and Minjing Lao

Appl. Sci. 2026, 16(12), 5927; https://doi.org/10.3390/app16125927 - 11 Jun 2026

Viewed by 130

Abstract

Smart city planning involves a wide range of knowledge domains. However, general intelligent Question Answering systems often fall short when applied to this domain, and relevant research remains limited. To this end, this paper constructs an intelligent QA system that combines a large language model with a domain-specific knowledge graph. Capable of understanding questions accurately and generating professional answers, this system is designed to provide efficient knowledge services for smart city planning by following four steps. First, based on four authoritative planning guidelines, a domain-specific knowledge graph with a four-layer framework is constructed using Neo4j Community Edition 5.26.24. The framework includes top-level goals, knowledge modules, standard terminology and community scenarios. Subsequently, natural language questions are classified and matched with the templates before being converted into structured queries. Finally, the system executes the resulting Cypher queries and invokes ChatGLM4 to generate professional answers. The knowledge graph contains 100 entity nodes and 44 relations, and its ontology layer defines 28 entity types and 12 relation types. As a result, the domain knowledge is structured and visualized, and planning professionals can intuitively retrieve diverse planning elements. In addition to its intelligent knowledge query function, this system assists planning professionals in preparing planning schemes and verifying compliance, reducing the time spent on reviewing regulations and comparing clauses, improving the efficiency of scheme preparation, and facilitating the refined implementation of urban renewal projects. It has high application value in smart city planning practices. Its construction approach can also serve as a reference for intelligent knowledge services in other fields.
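The template-matching step that turns a question into a structured query can be sketched as follows. The templates, node labels (`Module`, `Scenario`, `Term`), relation name, and the function `question_to_cypher` are all hypothetical; the abstract does not publish the actual templates or schema. The sketch only builds a parameterized Cypher string and does not connect to Neo4j or call ChatGLM4.

```python
import re

# Hypothetical question templates; labels and relation names are illustrative only.
TEMPLATES = [
    (re.compile(r"^which scenarios belong to (?P<name>.+?)\??$", re.IGNORECASE),
     "MATCH (m:Module {name: $name})-[:HAS_SCENARIO]->(s:Scenario) RETURN s.name"),
    (re.compile(r"^what does (?P<name>.+?) mean\??$", re.IGNORECASE),
     "MATCH (t:Term {name: $name}) RETURN t.definition"),
]

def question_to_cypher(question):
    """Return (cypher, params) for the first matching template, or None."""
    text = question.strip()
    for pattern, cypher in TEMPLATES:
        match = pattern.match(text)
        if match:
            return cypher, {"name": match.group("name")}
    return None

result = question_to_cypher("Which scenarios belong to smart community?")
```

Passing the entity name as a query parameter (`$name`) instead of splicing it into the query text keeps user input out of the Cypher string itself. Questions that match no template return `None`, which a full system would presumably route to a fallback.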

35 pages, 22696 KB

Open Access Article

Missing Tooth Height Map Prediction via CBAM-Enhanced Conditional Pix2Pix with Sobel Edge Loss

by Lining Wang, Changying Wang, Peiyao Qu, Jiayi Xu, Qingxue Zhang and Mingsen Li

Appl. Sci. 2026, 16(12), 5905; https://doi.org/10.3390/app16125905 - 11 Jun 2026

Viewed by 136

Abstract

Personalized reconstruction of missing-tooth morphology is a key problem in digital prosthodontics. The main challenge is to generate results that are consistent with the patient’s local dentition, the morphology of the contralateral teeth, and anatomically plausible occlusal details. Although several deep learning-based methods have been proposed for dental restoration, existing approaches still have limitations, including insufficient use of patient-specific contextual information, oversmoothed boundary structures in the generated results, and relatively high model complexity. To address these limitations, this study proposes a CBAM-Sobel conditional Pix2Pix framework, termed CS-cPix2Pix, for predicting the height map of a missing tooth from height projection maps of the contralateral teeth and adjacent teeth. The framework uses height projection maps of a three-tooth contralateral region and an adjacent-tooth region as conditional inputs. A U-Net generator is adopted to learn the mapping from the input conditions to the target missing-tooth height map, and a convolutional block attention module is introduced in the encoder to enhance feature representation in key morphological regions. Furthermore, a Sobel edge loss is incorporated in addition to the adversarial loss and L1 reconstruction loss to constrain the local gradient structure of the generated height map and reduce oversmoothing of occlusal edges, grooves, and ridges. Experimental results show that CS-cPix2Pix achieves better overall quantitative performance than the baseline Pix2Pix model and multiple ablation models, especially in terms of PSNR, FSIM, IoU, and Sobel-L1. Under the current experimental setting, the proposed method generates missing-tooth height maps with clearer boundaries and more continuous structures, and it supports relatively stable reconstruction of three-dimensional occlusal surface meshes from the predicted height maps. 
However, the present model development still mainly relies on a single public orthodontic dental dataset and focuses primarily on teeth numbered 4, 5, and 6. Therefore, the generalization of the proposed method to other tooth positions, other scanners, different populations, and different acquisition conditions still requires further verification.
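To make the role of the Sobel edge loss more concrete, here is a minimal sketch of one common formulation: the mean absolute difference between the Sobel x and y responses of the predicted and target height maps. The abstract does not give the exact definition used in CS-cPix2Pix, so this form, the function names, and the toy maps are assumptions.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel(img, kernel):
    """Valid-mode 3x3 correlation of a 2D list with a Sobel kernel."""
    h, w = len(img), len(img[0])
    return [[sum(kernel[a][b] * img[i + a][j + b] for a in range(3) for b in range(3))
             for j in range(w - 2)] for i in range(h - 2)]

def sobel_l1(pred, target):
    """Mean absolute difference between the Sobel x/y responses of two height maps."""
    total, count = 0.0, 0
    for kernel in (SOBEL_X, SOBEL_Y):
        grad_p, grad_t = sobel(pred, kernel), sobel(target, kernel)
        for row_p, row_t in zip(grad_p, grad_t):
            for a, b in zip(row_p, row_t):
                total += abs(a - b)
                count += 1
    return total / count

# Hypothetical 4x4 height maps: a sharp step edge and a smoothed version of it.
target = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
blurred = [[0.0, 0.25, 0.75, 1.0] for _ in range(4)]
edge_loss = sobel_l1(blurred, target)
```

The smoothed map differs from the target by at most 0.25 in height, yet its horizontal gradient is visibly weaker, so the edge term penalizes exactly the oversmoothing that a plain L1 reconstruction loss tends to tolerate.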

43 pages, 4604 KB

Open Access Article

AI-Assisted Script Generation for Bulk PDF Retrieval and Renaming from Open Access Journal Archives: A Feasibility Case Study

by Dimitris Rousidis, Paraskevas Koukaras and Christos Tjortjis

Appl. Sci. 2026, 16(12), 5903; https://doi.org/10.3390/app16125903 - 11 Jun 2026

Viewed by 176

Abstract

The volume of academic and scientific publications grows rapidly, increasing the need for efficient mechanisms for accessing, obtaining and managing large collections of Open Access (OA) journal articles. For the purposes of an ongoing project requiring the analysis of thousands of OA Journal articles, a fast and reliable way to automatically download and rename PDF files was essential. To address this need, ChatGPT was employed to generate Python scripts from scratch, with the task deliberately assigned to a user with no Python programming experience, relying partially on his familiarity with HTML and CSS structures. Excluding one manually processed journal, which was used as a descriptive baseline, the study achieved a workflow-level success rate of 90.32% across the 31 AI-assisted journal workflows that were evaluated. Of these, 25 workflows were completed through fully functional downloader/renamer scripts, while three additional journals were processed through successful renaming workflows after automated downloading proved unsuccessful. Four MDPI journals were handled through a shared semi-automated workflow. The paper also presents descriptive observations from the documented workflow, indicating a gradual reduction in development time, prompts, and debugging iterations across later stages of the project, as the interaction process became more refined. Furthermore, within this feasibility case, the observed average operational time corresponded to approximately 15.8 s per file for the fully manual procedure, 13.8 s for the complete automated workflow corpus, and 10.8 s after excluding one highly time-consuming outlier case. Statistical analyses of the generated scripts, including imported modules, libraries, functions, constants, control structures, and total lines of code, are also presented. 
Overall, the study demonstrates the feasibility of AI-assisted scripting in one documented case involving a user without Python programming experience to accomplish tasks that were previously associated with programming expertise.

(This article belongs to the Special Issue Advanced Technologies Applied in Digital Media Era)

20 pages, 30488 KB

Open Access Article

Hierarchical Scale-Adaptive Diffusion Priors for Efficient Remote Sensing Dehazing

by Wei Ju, Zheng Liang, Huan Chen and Jie Shen

Remote Sens. 2026, 18(12), 1907; https://doi.org/10.3390/rs18121907 - 9 Jun 2026

Viewed by 238

Abstract

Remote sensing image dehazing remains a formidable challenge due to complex atmospheric scattering and large-scale spatially varying degradation, which severely compromise fine-grained surface details. While recent diffusion-based restoration frameworks, such as DiffIR, have achieved remarkable efficiency by injecting compact diffusion priors into deterministic networks, they typically rely on a monolithic global Image Prior Representation (IPR). However, such a global design is suboptimal for the dehazed results of remote sensing imagery, where haze distribution exhibits strong spatial heterogeneity and scale dependency. To address this limitation, this paper presents the Hierarchical and Scale-Adaptive Diffusion Prior (HS-DiffIR) framework. Specifically, Hierarchical Image Prior Representation decomposes the holistic diffusion latent into multi-scale priors aligned with the hierarchical stages of the restoration network. Such a design facilitates fine-grained, scale-aware guidance by projecting the compact global latent into layer-specific representations, thereby bypassing the computational burden of high-dimensional generative modeling. Complementing this, the Scale-Adaptive Injection mechanism utilizes lightweight learnable coefficients to dynamically modulate the influence of diffusion priors across different feature scales, allowing the network to adaptively balance global semantic consistency and local detail recovery under dense-haze conditions. Evaluations on remote sensing benchmarks confirm that HS-DiffIR generally outperforms the DiffIR baseline. The method yields superior quantitative metrics (particularly PSNR) at a marginal computational cost while demonstrating robust detail restoration in regions subject to severe, spatially variant haze.

(This article belongs to the Special Issue Hyperspectral Remote Sensing Image Analysis via Advanced Deep Learning and Computer Vision)

25 pages, 1115 KB

Open Access Article

Controllable Symbolic Music Generation via Stage-Aware Style Routing and Differentiable Melody Regularization

by Xuanfei Zhou, Yinxuan Huang, Sining Han, Jiangyao Bai, Qianzhen Zhang, Lailong Luo and Chen Wang

Information 2026, 17(6), 568; https://doi.org/10.3390/info17060568 - 8 Jun 2026

Viewed by 172

Abstract

Controllable symbolic music generation must preserve a reference melody while remaining responsive to style prompts. Existing hierarchical diffusion systems typically reuse a shared condition vector across harmony, rhythm, and timbre stages, which can entangle stylistic factors and weaken melody preservation. We present HCDMG++, a hierarchical diffusion framework that addresses these two limitations through stage-aware style routing and differentiable melody regularization. The routing module uses a residual multi-layer perceptron (MLP) with zero-initialized scalar gates to project text-derived style embeddings into harmony-, rhythm-, and timbre-specific subspaces, whereas the regularization branch aligns soft pitch histograms and contour trajectories with the conditioning melody during training without breaking the differentiable computation graph. We evaluate the integrated system on a 384-sample benchmark covering four melodies, eight styles, four random seeds, and three denoising budgets, supplemented by a matched legacy-compatible reference and inference-time component ablation that contrasts legacy behavior, silenced gates, an automated uniform gamma routing sweep, and the full forward pass. HCDMG++ produces valid four-track outputs in all 384 runs, reaches a peak pitch histogram similarity score of 0.508 under a 64-step budget, and improves pitch histogram alignment over Legacy-HCDMG by roughly two orders of magnitude on the matched slice, while attaining a positive Fisher-style style separability score where the legacy benchmark is too sparse to support one. These results indicate that stage-specific conditioning and differentiable structural guidance jointly improve controllability in symbolic music diffusion, while also exposing the remaining limitations in long-form generalization and perceptual validation, which motivate the future work outlined at the end of this paper.

(This article belongs to the Section Information Applications)

16 pages, 11781 KB

Open Access Article

Data-Driven Warehouse Management for Power Materials: Integrating UWB Positioning with Demand Forecasting

by Hui Yang, Guobin Chen and Zhengfan Liu

Electronics 2026, 15(12), 2525; https://doi.org/10.3390/electronics15122525 - 8 Jun 2026

Viewed by 187

Abstract

This study addresses two critical issues in power material warehouse management: insufficient positioning accuracy leading to inefficient inventory auditing and uncontrolled material movement, and procurement-demand imbalances caused by subjective forecasting methods. We present an integrated warehouse management system that synergizes Ultra-Wideband (UWB) centimeter-level real-time positioning with data-driven demand forecasting. The UWB subsystem, built on STM32F1 microcontrollers (STMicroelectronics, Geneva, Switzerland) and DW1000 RF modules (Decawave Ltd., Dublin, Ireland), achieves high-precision location tracking by employing the Double-Sided Two-Way Ranging (DS-TWR) method combined with trilateration and triangular centroid algorithms. The data-driven procurement subsystem utilizes a vast historical dataset (4.86 million records from 36,988 grid projects, 2020–2024) to train demand prediction models. A comparative evaluation of six algorithms identified the Random Forest (RF) model as optimal, demonstrating superior performance with 89.2% accuracy, a Mean Absolute Error (MAE) of 5.48, and a Mean Absolute Percentage Error (MAPE) of 4.89%. The RF model effectively incorporates key factors like failure rates and seasonal cycles. Experimental validation confirmed the UWB subsystem’s robustness, with an average positioning error of 12.05 cm. The integrated system enables precise material tracking, 3D trajectory reconstruction, and generates data-informed procurement signals—including replenishment warnings, optimized order quantities, and adaptive resupply cycles. This approach significantly reduces surplus inventory while maintaining high material availability, offering a scientific, data-driven solution for enhancing efficiency in power material management.
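
For readers unfamiliar with the Double-Sided Two-Way Ranging (DS-TWR) method named in the abstract above, the following minimal Python sketch shows how a distance can be estimated from the four timing intervals of a DS-TWR exchange. It assumes the commonly used asymmetric DS-TWR formulation; the abstract does not say which variant the authors implemented, and the function and parameter names here are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: assumes the common asymmetric DS-TWR formula.
# All names below are hypothetical and not taken from the article.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def ds_twr_time_of_flight(t_round1, t_reply1, t_round2, t_reply2):
    """Estimate the one-way time of flight in seconds.

    t_round1: initiator's interval from sending poll to receiving response
    t_reply1: responder's interval from receiving poll to sending response
    t_round2: responder's interval from sending response to receiving final
    t_reply2: initiator's interval from receiving response to sending final
    """
    numerator = t_round1 * t_round2 - t_reply1 * t_reply2
    denominator = t_round1 + t_round2 + t_reply1 + t_reply2
    return numerator / denominator


def ds_twr_distance(t_round1, t_reply1, t_round2, t_reply2):
    """Convert the estimated time of flight into a distance in metres."""
    tof = ds_twr_time_of_flight(t_round1, t_reply1, t_round2, t_reply2)
    return SPEED_OF_LIGHT_M_PER_S * tof


# Usage: simulate an exchange with a true time of flight of 10 ns and
# unequal reply delays, then recover the distance from the intervals.
true_tof = 10e-9
reply1, reply2 = 200e-6, 300e-6
round1 = 2 * true_tof + reply1
round2 = 2 * true_tof + reply2
estimated_distance = ds_twr_distance(round1, reply1, round2, reply2)
```

With ideal clocks the estimate reproduces the true time of flight even when the two reply delays differ, which is the usual motivation for the double-sided exchange over single-sided ranging.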

Search Results (468)
