Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction

A special issue of Machine Learning and Knowledge Extraction (ISSN 2504-4990). This special issue belongs to the section "Learning".

Deadline for manuscript submissions: 15 October 2026

Special Issue Editors


Guest Editor
Biosignal Processing and Artificial Intelligence in Medicine and Healthcare, Department of Electronics and Communications Engineering, Arab Academy for Science, Technology and Maritime Transport (AASTMT), Alexandria P.O. Box 1029, Egypt
Interests: artificial intelligence; signal processing; image processing; biomedical signal/image processing; computer vision; pattern recognition; biomedical engineering; machine/deep learning; data mining; feature selection; wearable sensors; brain–computer interface; neuroinformatics; medical/health informatics; precision agriculture

Guest Editor
School of Information Technology and Computer Science, Nile University, Cairo 12677, Egypt
Interests: artificial intelligence for medical imaging and biomedical signal processing; brain–computer interfaces and neuroengineering; machine/deep learning, multimodal learning and data fusion; large language models and foundation models; explainable and privacy-preserving AI; federated learning

Guest Editor
Nanoelectronics Integrated Systems Center, Nile University, Giza 12588, Egypt
Interests: artificial intelligence; smart systems; IoT; circuits and systems; data security and encryption; nonlinear dynamics; machine intelligence; image processing; precision agriculture

Special Issue Information

Dear Colleagues,

Recent advances in AI and computational modeling are transforming signal and image processing by enabling more accurate, adaptive and scalable analyses across diverse domains. In parallel, modern sensing technologies including imaging systems, physiological sensors, wearable devices, audio platforms and IoT infrastructures are generating large-scale and heterogeneous multimodal data streams.

The AI-driven processing of such complex data raises important methodological challenges related to representation learning, multimodal integration, temporal modeling, cross-modal reasoning and reliable interpretation.

Submissions may present novel algorithms, intelligent model architectures, training paradigms, evaluation strategies or comparative studies that clarify the capabilities and limitations of emerging AI techniques.

Topics of interest include, but are not limited to, the following:

  • AI-based feature representation and extraction
  • Intelligent segmentation and enhancement methods
  • Learning-based noise reduction and signal reconstruction
  • Multimodal data fusion using deep and hybrid AI models
  • Cross-modal and multimodal learning
  • Generative AI for signal and image analysis
  • Interpretable and explainable AI
  • Federated learning frameworks
  • Distributed and edge AI for real-time sensing systems
  • Vision–language integration and language-guided analysis
  • Large language models (LLMs) for signal and image understanding
  • Multimodal foundation models combining signals, images and language
  • Efficient and lightweight AI models for real-time deployment
  • Temporal AI models for dynamic sensor data

Submissions addressing domain-specific challenges are especially welcome when they introduce clear AI-driven methodological innovation, particularly in the following:

  • Medical imaging and biosignal processing;
  • Smart agriculture, energy and water systems;
  • Communication systems;
  • Intelligent monitoring infrastructures.

We look forward to your contributions to this Special Issue.

Prof. Dr. Omneya A. Attallah
Dr. Sahar Selim
Dr. Lobna A. Said
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Machine Learning and Knowledge Extraction is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • signal processing
  • image processing
  • multimodal data
  • deep learning
  • multimodal fusion
  • foundation models
  • generative models
  • explainable methods
  • federated learning
  • cross-modal learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.


Published Papers (4 papers)


Research

32 pages, 7875 KB  
Article
Preserving Spatial and Frequency Information in CNNs: Hilbert Curve Flattening and Wavelet Pooling for Explainable Medical Image Analysis
by Jesús Jaime Moreno Escobar
Mach. Learn. Knowl. Extr. 2026, 8(6), 152; https://doi.org/10.3390/make8060152 - 1 Jun 2026
Abstract
Conventional CNN architectures often struggle with information loss during feature extraction, particularly in pooling and flattening layers, where spatial coherence and high-frequency details critical for tasks such as medical diagnostics are compromised. To address this, we introduce a novel integration of Hilbert curve flattening and multiscale frequency-selective wavelet pooling, which preserves diagnostically relevant features while optimizing computational efficiency. Multifrequency selective wavelet pooling improves the performance and adaptability of convolutional neural networks by preserving spatial adjacency structures and eliminating duplicate information. Here, raster flattening was replaced with a conventional Hilbert curve that organized data more efficiently, and wavelet pooling performed feature selection across frequency bands better than average pooling or max-pooling. On standard architectures (Inception, VGG16, ResNet, EfficientNet), our approach consistently produced an improved precision of 1.42% over earlier methods across all datasets and classes, including diagnosis of autism via structural MRI in a proof-of-concept dataset (38 subjects, 4 in the test set), with high precision, at 99%. Hence, validation on larger independent cohorts will be part of the future work. The synergy of Hilbert curve flattening and multiscale frequency-selective wavelet pooling mitigates signal decomposition losses and maintains spatial frequency relationships, advancing CNNs for high-stakes applications like medical imaging and remote sensing. These new strategies enhance spatial coherence and global efficiency, ensuring robustness in applications ranging from medical imaging to time-series forecasting.
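
As an illustration of the flattening idea described in this abstract, the following is a minimal Python sketch that reads a square 2D feature map along a Hilbert curve instead of in raster order. It is not the author's implementation: the function names `hilbert_d2xy` and `hilbert_flatten` are hypothetical, the index-to-coordinate mapping is the standard textbook construction, and the grid side is assumed to be a power of two.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_flatten(grid):
    """Flatten a square grid (list of rows) in Hilbert-curve order.

    Assumption for this sketch: the side length is a power of two.
    """
    n = len(grid)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in grid):
        raise ValueError("expected a square grid whose side is a power of two")
    order = n.bit_length() - 1
    coords = (hilbert_d2xy(order, d) for d in range(n * n))
    return [grid[y][x] for x, y in coords]
```

Unlike raster flattening, consecutive entries of the flattened sequence always come from neighbouring cells of the grid, which is the spatial-adjacency property the abstract refers to. For example, `hilbert_flatten([[1, 2], [3, 4]])` visits the cells in the order `[1, 3, 4, 2]`.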

19 pages, 968 KB  
Article
Document Image Binarization Using Various Machine Learning Models and Ensembles Trained on Classic Local and Global Binarization Algorithms and Image Statistics
by Nicolae Tarbă, Costin-Anton Boiangiu and Mihai-Lucian Voncilă
Mach. Learn. Knowl. Extr. 2026, 8(6), 149; https://doi.org/10.3390/make8060149 - 1 Jun 2026
Abstract
Image binarization is a preprocessing technique that maps an image’s pixel values to either black or white, and it is crucial in many fields of computer vision, such as document digitization and medical imaging. Thresholding is a popular image binarization technique for grayscale images because it splits pixel values into greater than or lower than a specific threshold. Global thresholding is fast because it computes only one threshold for the entire image, but it cannot handle many types of noise specific to document images. Local thresholding has greater computational complexity because it adjusts the thresholds for each pixel based on the surrounding pixels, but it can handle such types of noise, although it risks introducing noise in uniform areas of the image. Mixed global–local approaches can mitigate this risk while still being able to handle most types of noise. This paper proposes a mixed global–local thresholding method that harnesses two popular automatic machine learning frameworks to train machine learning models using the results of several thresholding algorithms and other image statistics. Cross-validation was performed to ensure that the selected models are robust and perform well on new data. We obtained results comparable with other state-of-the-art methods on popular document image binarization datasets.
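
The contrast between global and local thresholding drawn in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the method proposed in the paper: the mean-based thresholds are simple stand-ins chosen here, and the names `global_binarize`, `local_binarize`, `radius` and `offset` are hypothetical. Images are assumed to be lists of rows of grayscale values, with 0 for ink and 255 for background in the output.

```python
def global_binarize(img, threshold=None):
    """Binarize with a single threshold for the whole image.

    Assumption: if no threshold is given, the image mean is used as a
    simple stand-in for a classic global method.
    """
    if threshold is None:
        pixels = [p for row in img for p in row]
        threshold = sum(pixels) / len(pixels)
    return [[255 if p > threshold else 0 for p in row] for row in img]


def local_binarize(img, radius=1, offset=5):
    """Binarize each pixel against the mean of its neighbourhood.

    The threshold is (window mean - offset); the offset keeps pixels in
    uniform regions from flipping to ink because of tiny fluctuations.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the (2*radius+1)^2 window, clipped at the image border.
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            t = sum(window) / len(window) - offset
            row.append(255 if img[y][x] > t else 0)
        out.append(row)
    return out
```

On a strip with uneven illumination such as `[[50, 20, 50, 200, 150, 200]]`, the global version marks the whole dark half as ink and misses the stroke at value 150, while the local version recovers both strokes at the cost of one spurious ink pixel next to the illumination edge. This mirrors the trade-off that motivates mixed global–local approaches.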

42 pages, 5367 KB  
Article
Wavelet-Guided Mamba-Attention Network for Boundary-Aware Colorectal Polyp Segmentation
by Xin Liu, Nor Ashidi Mat Isa, Chao Chen, Hanxu Liu, Chao Wang and Fajin Lv
Mach. Learn. Knowl. Extr. 2026, 8(6), 142; https://doi.org/10.3390/make8060142 - 23 May 2026
Abstract
Colorectal cancer is the third most commonly diagnosed cancer worldwide, and early detection of polyps via colonoscopy is essential for improving patient survival. However, automatic polyp segmentation faces three key challenges: balancing global context with local detail, delineating ambiguous boundaries under low contrast, and handling large variations in polyp size and morphology. To address these challenges, we propose WMA-Net, a Wavelet-Guided Mamba-Attention Network that uses wavelet-domain semantic–boundary separation as the organizing design principle. Rather than introducing a new individual operator, the contribution lies in how existing components (wavelet decomposition, Mamba state space modeling, multi-directional pixel difference convolution, and uncertainty-aware reverse attention) are combined and coordinated within one boundary-aware framework. The architecture integrates pixel difference convolution for multi-directional edge detection, frequency-selective cross-scale fusion with dual-stream wavelet-domain processing, Mamba-based multi-scale aggregation with linear complexity, and uncertainty-aware progressive boundary refinement. Extensive experiments on five public polyp benchmarks demonstrate state-of-the-art performance on four out of five datasets. On the seen datasets, WMA-Net achieves mean Dice scores of 94.4% on CVC-ClinicDB and 93.6% on Kvasir-SEG. On the unseen datasets, WMA-Net attains 91.7% on CVC-300, 82.3% on CVC-ColonDB, and 83.8% on ETIS-LaribPolypDB, demonstrating robust cross-dataset generalization. Comprehensive ablation studies validate the effectiveness and synergy of each proposed module.
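
The Dice scores quoted in this abstract measure the overlap between a predicted mask and a reference mask, Dice = 2|A ∩ B| / (|A| + |B|). A minimal Python sketch of that metric for binary masks follows. The function name `dice_score` is hypothetical, returning 1.0 when both masks are empty is a convention assumed here, and the paper's own evaluation protocol (for example, how scores are averaged over images) is not specified in the abstract.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as lists of rows.

    Any truthy value counts as foreground. Assumption: two empty masks
    are treated as a perfect match (score 1.0).
    """
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

For a prediction `[[1, 1], [0, 0]]` against a reference `[[1, 0], [0, 0]]`, one pixel overlaps out of two predicted and one reference pixel, giving 2·1 / (2 + 1) ≈ 0.667.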

25 pages, 33743 KB  
Article
CTCF: A Three-Level Coarse-to-Fine Cascade for Unsupervised Deformable Medical Image Registration
by Daniil Pasenko and Roman Davydov
Mach. Learn. Knowl. Extr. 2026, 8(5), 122; https://doi.org/10.3390/make8050122 - 2 May 2026
Abstract
Deformable medical image registration aims to spatially align anatomical structures across volumetric scans. Recent transformer-based methods achieve high overlap accuracy but often produce deformation fields with topological violations. We propose CTCF, a Cascade Transformer for Coarse-to-Fine registration that wraps a lightweight coarse-and-refined envelope around a core registration module. Level 1 provides a coarse displacement estimate at quarter resolution, Level 2 performs the main registration via a Swin Transformer encoder with deformable cross-attention and a learned super-resolution decoder, and Level 3 applies error-driven flow refinement at half resolution. The two outer levels add only 3.0% parameter overhead yet improve registration accuracy while maintaining competitive deformation regularity relative to external baselines. The model is trained end-to-end with a composite unsupervised loss combining local normalized cross-correlation, diffusion regularization, inverse-consistency, and Jacobian-based topology preservation. On the OASIS brain MRI benchmark, CTCF achieves the highest Dice score of 0.8208 among the compared unsupervised methods while maintaining competitive SDlogJ, with all Dice improvements statistically significant at p<0.001 by the Wilcoxon signed-rank test. On IXI, CTCF also achieves the best Dice, HD95, SDlogJ, and fold percentage among the compared methods. A five-round ablation study validates each component: cascade decomposition isolates each level’s contribution, and resolution scaling experiments confirm the framework’s scalability, yielding further accuracy gains with zero parameter overhead.
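
The topological violations and fold percentage mentioned in this abstract are commonly assessed through the Jacobian determinant of the deformation φ(x) = x + u(x): where the determinant is not positive, the mapping folds over itself. The Python sketch below illustrates this for a 2D displacement field. It is a simplification under stated assumptions, not the paper's code: the paper works on volumetric scans, the forward-difference scheme is a choice made here, and the names `jacobian_determinants` and `fold_percentage` are hypothetical.

```python
def jacobian_determinants(ux, uy):
    """Jacobian determinant of phi(x) = x + u(x) on a 2D grid.

    ux[y][x] and uy[y][x] are displacement components in pixels.
    Forward differences are used, so the result covers the
    (h - 1) x (w - 1) cells that have a right and a lower neighbour.
    """
    h, w = len(ux), len(ux[0])
    dets = []
    for y in range(h - 1):
        row = []
        for x in range(w - 1):
            dux_dx = ux[y][x + 1] - ux[y][x]
            dux_dy = ux[y + 1][x] - ux[y][x]
            duy_dx = uy[y][x + 1] - uy[y][x]
            duy_dy = uy[y + 1][x] - uy[y][x]
            # det of [[1 + dux/dx, dux/dy], [duy/dx, 1 + duy/dy]]
            row.append((1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx)
        dets.append(row)
    return dets


def fold_percentage(ux, uy):
    """Percentage of cells whose Jacobian determinant is not positive."""
    dets = [d for row in jacobian_determinants(ux, uy) for d in row]
    return 100.0 * sum(d <= 0 for d in dets) / len(dets)
```

A zero displacement field gives a determinant of 1 everywhere and a fold percentage of 0, whereas a field such as u_x = −2x, which mirrors the image along x, gives a determinant of −1 everywhere and a fold percentage of 100.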