
Search Results (540)

Search Parameters:
Keywords = color–texture features

22 pages, 24173 KiB  
Article
ScaleViM-PDD: Multi-Scale EfficientViM with Physical Decoupling and Dual-Domain Fusion for Remote Sensing Image Dehazing
by Hao Zhou, Yalun Wang, Wanting Peng, Xin Guan and Tao Tao
Remote Sens. 2025, 17(15), 2664; https://doi.org/10.3390/rs17152664 - 1 Aug 2025
Viewed by 40
Abstract
Remote sensing images are often degraded by atmospheric haze, which not only reduces image quality but also complicates information extraction, particularly in high-level visual analysis tasks such as object detection and scene classification. State-space models (SSMs) have recently emerged as a powerful paradigm for vision tasks, showing great promise due to their computational efficiency and robust capacity to model global dependencies. However, most existing learning-based dehazing methods lack physical interpretability, leading to weak generalization. Furthermore, they typically rely on spatial features while neglecting crucial frequency domain information, resulting in incomplete feature representation. To address these challenges, we propose ScaleViM-PDD, a novel network that enhances an SSM backbone with two key innovations: a Multi-scale EfficientViM with Physical Decoupling (ScaleViM-P) module and a Dual-Domain Fusion (DD Fusion) module. The ScaleViM-P module synergistically integrates a Physical Decoupling block within a Multi-scale EfficientViM architecture. This design enables the network to mitigate haze interference in a physically grounded manner at each representational scale while simultaneously capturing global contextual information to adaptively handle complex haze distributions. To further address detail loss, the DD Fusion module replaces conventional skip connections by incorporating a novel Frequency Domain Module (FDM) alongside channel and position attention. This allows for a more effective fusion of spatial and frequency features, significantly improving the recovery of fine-grained details, including color and texture information. Extensive experiments on nine publicly available remote sensing datasets demonstrate that ScaleViM-PDD consistently surpasses state-of-the-art baselines in both qualitative and quantitative evaluations, highlighting its strong generalization ability. Full article
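To make the dual-domain idea concrete, here is a minimal PyTorch sketch (our illustration, not the paper's DD Fusion module) of fusing spatial features with FFT-derived frequency features; all module and variable names are hypothetical:

```python
# Minimal sketch: combine spatial features with frequency-domain (FFT
# amplitude) features, one way to realize a dual-domain fusion block.
import torch
import torch.nn as nn

class DualDomainFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convs to mix frequency features and fuse both domains
        self.freq_conv = nn.Conv2d(channels, channels, kernel_size=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frequency branch: FFT amplitude carries global image statistics
        amp = torch.abs(torch.fft.fft2(x, norm="ortho"))
        freq_feat = self.freq_conv(amp)
        # Concatenate spatial and frequency features, then fuse
        return self.fuse(torch.cat([x, freq_feat], dim=1))

feats = torch.randn(1, 32, 64, 64)
fused = DualDomainFusion(32)(feats)  # -> (1, 32, 64, 64)
```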

30 pages, 7223 KiB  
Article
Smart Wildlife Monitoring: Real-Time Hybrid Tracking Using Kalman Filter and Local Binary Similarity Matching on Edge Network
by Md. Auhidur Rahman, Stefano Giordano and Michele Pagano
Computers 2025, 14(8), 307; https://doi.org/10.3390/computers14080307 - 30 Jul 2025
Viewed by 106
Abstract
Real-time wildlife monitoring on edge devices poses significant challenges due to limited power, constrained bandwidth, and unreliable connectivity, especially in remote natural habitats. Conventional object detection systems often transmit redundant data of the same animals detected across multiple consecutive frames as part of a single event, resulting in increased power consumption and inefficient bandwidth usage. Furthermore, maintaining consistent animal identities in the wild is difficult due to occlusions, variable lighting, and complex environments. In this study, we propose a lightweight hybrid tracking framework built on the YOLOv8m deep neural network, combining motion-based Kalman filtering with Local Binary Pattern (LBP) similarity for appearance-based re-identification using texture and color features. To handle ambiguous cases, we further incorporate Hue-Saturation-Value (HSV) color space similarity. This approach enhances identity consistency across frames while reducing redundant transmissions. The framework is optimized for real-time deployment on edge platforms such as the NVIDIA Jetson Orin Nano and Raspberry Pi 5. We evaluate our method against state-of-the-art trackers using event-based metrics such as MOTA, HOTA, and IDF1, with a focus on occlusion handling, trajectory analysis, and counting of detected animals during both day and night. Our approach significantly enhances tracking robustness, reduces ID switches, and provides more accurate detection and counting compared to existing methods. When transmitting time-series data and detected frames, it achieves up to 99.87% bandwidth savings and 99.67% power reduction, making it highly suitable for edge-based wildlife monitoring in resource-constrained environments. Full article
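The appearance-matching step can be illustrated with a short sketch of LBP histogram similarity (our simplification of the paper's re-identification cue; function names and the threshold are assumptions):

```python
# Sketch: appearance re-identification via LBP histogram similarity.
# Requires scikit-image and NumPy.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_patch: np.ndarray, p: int = 8, r: int = 1) -> np.ndarray:
    """Uniform LBP histogram of a grayscale crop, L1-normalized."""
    lbp = local_binary_pattern(gray_patch, P=p, R=r, method="uniform")
    hist, _ = np.histogram(lbp, bins=p + 2, range=(0, p + 2))
    return hist / max(hist.sum(), 1)

def lbp_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Histogram intersection in [0, 1]; higher = more similar texture."""
    return float(np.minimum(lbp_histogram(a), lbp_histogram(b)).sum())

# A hybrid tracker could accept a Kalman-gated match only if, e.g.,
# lbp_similarity(prev_crop, new_crop) exceeds a threshold such as 0.7.
```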
(This article belongs to the Special Issue Intelligent Edge: When AI Meets Edge Computing)

17 pages, 6870 KiB  
Article
Edge- and Color–Texture-Aware Bag-of-Local-Features Model for Accurate and Interpretable Skin Lesion Diagnosis
by Dichao Liu and Kenji Suzuki
Diagnostics 2025, 15(15), 1883; https://doi.org/10.3390/diagnostics15151883 - 27 Jul 2025
Viewed by 358
Abstract
Background/Objectives: Deep models have achieved remarkable progress in the diagnosis of skin lesions but face two significant drawbacks. First, they cannot effectively explain the basis of their predictions. Although attention visualization tools like Grad-CAM can create heatmaps using deep features, these features often have large receptive fields, resulting in poor spatial alignment with the input image. Second, the design of most deep models neglects interpretable traditional visual features inspired by clinical experience, such as color–texture and edge features. This study aims to propose a novel approach integrating deep learning with traditional visual features to handle these limitations. Methods: We introduce the edge- and color–texture-aware bag-of-local-features model (ECT-BoFM), which limits the receptive field of deep features to a small size and incorporates edge and color–texture information from traditional features. A non-rigid reconstruction strategy ensures that traditional features enhance rather than constrain the model’s performance. Results: Experiments on the ISIC 2018 and 2019 datasets demonstrated that ECT-BoFM yields precise heatmaps and achieves high diagnostic performance, outperforming state-of-the-art methods. Furthermore, training models using only a small number of the most predictive patches identified by ECT-BoFM achieved diagnostic performance comparable to that obtained using full images, demonstrating its efficiency in exploring key clues. Conclusions: ECT-BoFM successfully combines deep learning and traditional visual features, addressing the interpretability and diagnostic accuracy challenges of existing methods. ECT-BoFM provides an interpretable and accurate framework for skin lesion diagnosis, advancing the integration of AI in dermatological research and clinical applications. Full article
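A bag-of-local-features classifier of the kind the paper builds on can be sketched as follows (a BagNet-style toy, assuming small 3x3 kernels keep the receptive field local; layer sizes are illustrative):

```python
# Sketch of the bag-of-local-features idea: keep receptive fields small so
# each spatial logit corresponds to a small patch, then average patch
# logits into an image-level prediction.
import torch
import torch.nn as nn

class TinyBoF(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        # Three 3x3 convs -> receptive field stays 7x7 pixels
        self.local = nn.Sequential(
            nn.Conv2d(3, 32, 3), nn.ReLU(),
            nn.Conv2d(32, 64, 3), nn.ReLU(),
            nn.Conv2d(64, 64, 3), nn.ReLU(),
        )
        self.patch_logits = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        logits_map = self.patch_logits(self.local(x))  # per-patch evidence
        return logits_map.mean(dim=(2, 3))             # average over patches

probs = TinyBoF()(torch.randn(1, 3, 224, 224)).softmax(-1)
```

Because every logit is tied to a small patch, the evidence map itself aligns spatially with the input, which is what makes the heatmaps interpretable.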

28 pages, 3794 KiB  
Article
A Robust System for Super-Resolution Imaging in Remote Sensing via Attention-Based Residual Learning
by Rogelio Reyes-Reyes, Yeredith G. Mora-Martinez, Beatriz P. Garcia-Salgado, Volodymyr Ponomaryov, Jose A. Almaraz-Damian, Clara Cruz-Ramos and Sergiy Sadovnychiy
Mathematics 2025, 13(15), 2400; https://doi.org/10.3390/math13152400 - 25 Jul 2025
Viewed by 192
Abstract
Deep learning-based super-resolution (SR) frameworks are widely used in remote sensing applications. However, existing SR models still face limitations, particularly in recovering contours, fine features, and textures, as well as in effectively integrating channel information. To address these challenges, this study introduces a novel residual model named OARN (Optimized Attention Residual Network) specifically designed to enhance the visual quality of low-resolution images. The network operates on the Y channel of the YCbCr color space and integrates LKA (Large Kernel Attention) and OCM (Optimized Convolutional Module) blocks. These components can restore large-scale spatial relationships and refine textures and contours, improving feature reconstruction without significantly increasing computational complexity. The performance of OARN was evaluated using satellite images from WorldView-2, GaoFen-2, and Microsoft Virtual Earth. Evaluation was conducted using objective quality metrics, such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Edge Preservation Index (EPI), and Learned Perceptual Image Patch Similarity (LPIPS), demonstrating superior results compared to state-of-the-art methods in both objective measurements and subjective visual perception. Moreover, OARN achieves this performance while maintaining computational efficiency, offering a balanced trade-off between processing time and reconstruction quality. Full article
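The Y-channel workflow the abstract describes is a common SR pattern; a hedged OpenCV sketch (with bicubic upscaling standing in for OARN itself) might look like this:

```python
# Sketch of the common Y-channel SR workflow: enhance luminance only,
# upsample chroma bicubically, then convert back. Uses OpenCV; the SR
# network is a placeholder here.
import cv2
import numpy as np

def sr_on_y_channel(bgr: np.ndarray, sr_model) -> np.ndarray:
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    y, cr, cb = cv2.split(ycrcb)
    y_sr = sr_model(y)  # the network sees structure/texture, not color
    size = (y_sr.shape[1], y_sr.shape[0])
    cr_up = cv2.resize(cr, size, interpolation=cv2.INTER_CUBIC)
    cb_up = cv2.resize(cb, size, interpolation=cv2.INTER_CUBIC)
    return cv2.cvtColor(cv2.merge([y_sr, cr_up, cb_up]), cv2.COLOR_YCrCb2BGR)

# Placeholder "model": bicubic 2x upscaling stands in for OARN.
upscale2 = lambda y: cv2.resize(y, None, fx=2, fy=2,
                                interpolation=cv2.INTER_CUBIC)
```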

27 pages, 8957 KiB  
Article
DFAN: Single Image Super-Resolution Using Stationary Wavelet-Based Dual Frequency Adaptation Network
by Gyu-Il Kim and Jaesung Lee
Symmetry 2025, 17(8), 1175; https://doi.org/10.3390/sym17081175 - 23 Jul 2025
Viewed by 285
Abstract
Single image super-resolution is the inverse problem of reconstructing a high-resolution image from its low-resolution counterpart. Although recent Transformer-based architectures leverage global context integration to improve reconstruction quality, they often overlook frequency-specific characteristics, resulting in the loss of high-frequency information. To address this limitation, we propose the Dual Frequency Adaptive Network (DFAN). DFAN first decomposes the input into low- and high-frequency components via Stationary Wavelet Transform. In the low-frequency branch, Swin Transformer layers restore global structures and color consistency. In contrast, the high-frequency branch features a dedicated module that combines Directional Convolution with Residual Dense Blocks, precisely reinforcing edges and textures. A frequency fusion module then adaptively merges these complementary features using depthwise and pointwise convolutions, achieving a balanced reconstruction. During training, we introduce a frequency-aware multi-term loss alongside the standard pixel-wise loss to explicitly encourage high-frequency preservation. Extensive experiments on the Set5, Set14, BSD100, Urban100, and Manga109 benchmarks show that DFAN achieves up to +0.64 dB peak signal-to-noise ratio, +0.01 structural similarity index measure, and −0.01 learned perceptual image patch similarity over the strongest frequency-domain baselines, while also delivering visibly sharper textures and cleaner edges. By unifying spatial and frequency-domain advantages, DFAN effectively mitigates high-frequency degradation and enhances SISR performance. Full article
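The frequency split DFAN starts from can be reproduced with PyWavelets; this minimal sketch (one level, Haar wavelet, both choices ours) shows the low/high-frequency branching:

```python
# Sketch: a one-level Stationary Wavelet Transform separates a low-frequency
# approximation (global structure, color) from high-frequency detail bands
# (edges, textures). Uses PyWavelets.
import numpy as np
import pywt

img = np.random.rand(128, 128)  # grayscale image, float
(approx, (horiz, vert, diag)), = pywt.swt2(img, wavelet="haar", level=1)

# approx -> low-frequency branch (e.g., Transformer layers)
# horiz/vert/diag -> high-frequency branch (directional conv + dense blocks)
# SWT is undecimated, so all bands keep the 128x128 input resolution,
# which makes branch-wise processing and later fusion straightforward.
```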
(This article belongs to the Section Computer)

17 pages, 3856 KiB  
Article
Wavelet Fusion with Sobel-Based Weighting for Enhanced Clarity in Underwater Hydraulic Infrastructure Inspection
by Minghui Zhang, Jingkui Zhang, Jugang Luo, Jiakun Hu, Xiaoping Zhang and Juncai Xu
Appl. Sci. 2025, 15(14), 8037; https://doi.org/10.3390/app15148037 - 18 Jul 2025
Viewed by 297
Abstract
Underwater inspection images of hydraulic structures often suffer from haze, severe color distortion, low contrast, and blurred textures, impairing the accuracy of automated crack, spalling, and corrosion detection. However, many existing enhancement methods fail to preserve structural details and suppress noise in turbid environments. To address these limitations, we propose a compact image enhancement framework called Wavelet Fusion with Sobel-based Weighting (WWSF). This method first corrects global color and luminance distributions using multiscale Retinex and gamma mapping, followed by local contrast enhancement via CLAHE in the L channel of the CIELAB color space. Two preliminarily corrected images are decomposed using discrete wavelet transform (DWT); low-frequency bands are fused based on maximum energy, while high-frequency bands are adaptively weighted by Sobel edge energy to highlight structural features and suppress background noise. The enhanced image is reconstructed via inverse DWT. Experiments on real-world sluice gate datasets demonstrate that WWSF outperforms six state-of-the-art methods, achieving the highest scores on UIQM and AG while remaining competitive on entropy (EN). Moreover, the method retains strong robustness under high turbidity conditions (T ≥ 35 NTU), producing sharper edges, more faithful color representation, and improved texture clarity. These results indicate that WWSF is an effective preprocessing tool for downstream tasks such as segmentation, defect classification, and condition assessment of hydraulic infrastructure in complex underwater environments. Full article
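The stated fusion rules (maximum energy for low-frequency bands, Sobel edge-energy weighting for high-frequency bands) can be sketched as follows; single-level DWT, grayscale inputs, and the db2 wavelet are our assumptions:

```python
# Sketch of the WWSF-style fusion rule: low-frequency bands fused by maximum
# energy, high-frequency bands weighted by Sobel edge energy.
import numpy as np
import pywt
from scipy import ndimage

def sobel_energy(band: np.ndarray) -> np.ndarray:
    gx, gy = ndimage.sobel(band, axis=1), ndimage.sobel(band, axis=0)
    return gx**2 + gy**2

def wwsf_fuse(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    (la, ha), (lb, hb) = pywt.dwt2(img_a, "db2"), pywt.dwt2(img_b, "db2")
    # Low frequency: keep the coefficient with the larger energy
    low = np.where(la**2 >= lb**2, la, lb)
    # High frequency: Sobel-energy-weighted average per sub-band
    high = []
    for ba, bb in zip(ha, hb):
        wa, wb = sobel_energy(ba), sobel_energy(bb)
        high.append((wa * ba + wb * bb) / (wa + wb + 1e-8))
    return pywt.idwt2((low, tuple(high)), "db2")
```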

29 pages, 10358 KiB  
Article
Smartphone-Based Sensing System for Identifying Artificially Marbled Beef Using Texture and Color Analysis to Enhance Food Safety
by Hong-Dar Lin, Yi-Ting Hsieh and Chou-Hsien Lin
Sensors 2025, 25(14), 4440; https://doi.org/10.3390/s25144440 - 16 Jul 2025
Viewed by 284
Abstract
Beef fat injection technology, used to enhance the perceived quality of lower-grade meat, often results in artificially marbled beef that mimics the visual traits of Wagyu, characterized by dense fat distribution. This practice, driven by the high cost of Wagyu and the affordability of fat-injected beef, has led to the proliferation of mislabeled “Wagyu-grade” products sold at premium prices, posing potential food safety risks such as allergen exposure or consumption of unverified additives, which can adversely affect consumer health. Addressing this, this study introduces a smart sensing system integrated with handheld mobile devices, enabling consumers to capture beef images during purchase for real-time health-focused assessment. The system analyzes surface texture and color, transmitting data to a server for classification to determine if the beef is artificially marbled, thus supporting informed dietary choices and reducing health risks. Images are processed by applying a region of interest (ROI) mask to remove background noise, followed by partitioning into grid blocks. Local binary pattern (LBP) texture features and RGB color features are extracted from these blocks to characterize surface properties of three beef types (Wagyu, regular, and fat-injected). A support vector machine (SVM) model classifies the blocks, with the final image classification determined via majority voting. Experimental results reveal that the system achieves a recall rate of 95.00% for fat-injected beef, a misjudgment rate of 1.67% for non-fat-injected beef, a correct classification rate (CR) of 93.89%, and an F1-score of 95.80%, demonstrating its potential as a human-centered healthcare tool for ensuring food safety and transparency. Full article
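The block-wise LBP + RGB + SVM pipeline can be illustrated with a short sketch (feature dimensions and grid handling are our simplifications, not the paper's exact settings):

```python
# Sketch: LBP texture + mean-RGB features per grid block, an SVM per block,
# majority vote per image. Classes assumed encoded as integers
# (e.g., 0 = Wagyu, 1 = regular, 2 = fat-injected).
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def block_features(rgb_block: np.ndarray) -> np.ndarray:
    gray = rgb_block.mean(axis=2)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    mean_rgb = rgb_block.reshape(-1, 3).mean(axis=0) / 255.0
    return np.concatenate([hist, mean_rgb])  # 13-D block descriptor

def classify_image(blocks, svm: SVC) -> int:
    votes = svm.predict(np.stack([block_features(b) for b in blocks]))
    return int(np.bincount(votes.astype(int)).argmax())  # majority vote
```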
(This article belongs to the Section Physical Sensors)

21 pages, 5735 KiB  
Article
Estimation of Tomato Quality During Storage by Means of Image Analysis, Instrumental Analytical Methods, and Statistical Approaches
by Paris Christodoulou, Eftichia Kritsi, Georgia Ladika, Panagiota Tsafou, Kostantinos Tsiantas, Thalia Tsiaka, Panagiotis Zoumpoulakis, Dionisis Cavouras and Vassilia J. Sinanoglou
Appl. Sci. 2025, 15(14), 7936; https://doi.org/10.3390/app15147936 - 16 Jul 2025
Viewed by 297
Abstract
The quality and freshness of fruits and vegetables are critical factors in consumer acceptance and are significantly affected during transport and storage. This study aimed to evaluate the quality of greenhouse-grown tomatoes stored for 24 days by combining non-destructive image analysis, spectrophotometric assays (including total phenolic content and antioxidant and antiradical activity assessments), and attenuated total reflectance–Fourier transform infrared (ATR-FTIR) spectroscopy. Additionally, water activity, moisture content, total soluble solids, texture, and color were evaluated. Most physicochemical changes occurred between days 14 and 17, without major impact on overall fruit quality. A progressive transition in peel hue from orange to dark orange and increased surface irregularity in the textural images were noted. Moreover, the combined use of instrumental and image analysis results via multivariate analysis allowed the clear discrimination of tomatoes according to storage days. In this sense, tomato samples were effectively classified by ATR-FTIR spectral bands linked to carotenoids, phenolics, and polysaccharides. Machine learning (ML) models, including Random Forest and Gradient Boosting, were trained on image-derived features and accurately predicted shelf life and quality traits, achieving R2 values exceeding 0.9. The findings demonstrate the effectiveness of combining imaging, spectroscopy, and ML for non-invasive tomato quality monitoring and support the development of predictive tools to improve postharvest handling and reduce food waste. Full article
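The shelf-life regression step can be sketched with scikit-learn tree ensembles; the feature matrix below is a synthetic placeholder for the paper's image-derived features:

```python
# Sketch: tree ensembles regressing storage day from image-derived features,
# scored by R^2. Data are random placeholders; with real features the paper
# reports R^2 > 0.9.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X = np.random.rand(120, 8)    # e.g., hue, texture irregularity, ... (assumed)
y = np.random.rand(120) * 24  # storage day (0-24) as the regression target

for model in (RandomForestRegressor(n_estimators=400),
              GradientBoostingRegressor()):
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(type(model).__name__, round(r2, 3))
```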
(This article belongs to the Section Food Science and Technology)

19 pages, 1318 KiB  
Article
Decoding Plant-Based Beverages: An Integrated Study Combining ATR-FTIR Spectroscopy and Microscopic Image Analysis with Chemometrics
by Paris Christodoulou, Stratoniki Athanasopoulou, Georgia Ladika, Spyros J. Konteles, Dionisis Cavouras, Vassilia J. Sinanoglou and Eftichia Kritsi
AppliedChem 2025, 5(3), 16; https://doi.org/10.3390/appliedchem5030016 - 16 Jul 2025
Viewed by 872
Abstract
As demand for plant-based beverages grows, analytical tools are needed to classify and understand their structural and compositional diversity. This study applied a multi-analytical approach to characterize 41 commercial almond-, oat-, rice- and soy-based beverages, evaluating attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy, protein secondary structure proportions, colorimetry, and microscopic image texture analysis. A total of 26 variables, derived from ATR-FTIR and protein secondary structure assessment, were employed in multivariate models, using partial least squares discriminant analysis (PLS-DA) and orthogonal PLS-DA (OPLS-DA) to evaluate classification performance. The results indicated clear group separation, with soy and rice beverages forming distinct clusters while almond and oat samples showed partial overlap. Variable importance in projection (VIP) scores revealed that β-turn and α-helix protein structures, along with carbohydrate-associated spectral bands, were the key features for beverage classification. Textural features derived from microscopy images correlated with sugar and carbohydrate content, and together with color parameters they were used to describe differences among beverages in sugar content and visual homogeneity. These findings demonstrate that combining ATR-FTIR spectral data with protein secondary structure data enables the effective classification of plant-based beverages, while microscopic image texture and color parameters offer extended product characterization. Full article
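PLS-DA has no native scikit-learn implementation; a common sketch regresses one-hot class labels with PLS and takes the argmax, mirrored here with the study's 41 samples x 26 variables (data are placeholders):

```python
# Sketch of PLS-DA via PLS regression against one-hot labels; the predicted
# class is the argmax of the regression output.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

X = np.random.rand(41, 26)                 # 41 beverages x 26 variables
labels = np.random.randint(0, 4, size=41)  # almond/oat/rice/soy (assumed codes)
Y = np.eye(4)[labels]                      # one-hot class matrix

pls = PLSRegression(n_components=5).fit(X, Y)
pred = pls.predict(X).argmax(axis=1)       # PLS-DA class assignment
```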

26 pages, 6371 KiB  
Article
Growth Stages Discrimination of Multi-Cultivar Navel Oranges Using the Fusion of Near-Infrared Hyperspectral Imaging and Machine Vision with Deep Learning
by Chunyan Zhao, Zhong Ren, Yue Li, Jia Zhang and Weinan Shi
Agriculture 2025, 15(14), 1530; https://doi.org/10.3390/agriculture15141530 - 15 Jul 2025
Viewed by 255
Abstract
To noninvasively and precisely discriminate among the growth stages of multiple cultivars of navel oranges simultaneously, the fusion of near-infrared (NIR) hyperspectral imaging (HSI), machine vision (MV), and deep learning is employed. NIR reflectance spectra and hyperspectral and RGB images for 740 Gannan navel oranges of five cultivars are collected. Based on preprocessed spectra, optimally selected hyperspectral images, and registered RGB images, a dual-branch multi-modal feature fusion convolutional neural network (CNN) model is established. In this model, a spectral branch is designed to extract spectral features reflecting internal compositional variations, while the image branch is utilized to extract external color and texture features from the integration of hyperspectral and RGB images. Finally, growth stages are determined via the fusion of features. To validate the effectiveness of the proposed method, various machine-learning and deep-learning models are compared for single-modal and multi-modal data. The results demonstrate that multi-modal feature fusion of HSI and MV combined with the constructed dual-branch CNN deep-learning model yields excellent growth stage discrimination in navel oranges, achieving an accuracy, recall rate, precision, F1 score, and kappa coefficient on the testing set of 95.95%, 96.66%, 96.76%, 96.69%, and 0.9481, respectively, providing a prominent way to precisely monitor the growth stages of fruits. Full article
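The dual-branch fusion can be sketched in PyTorch (layer sizes and the simple concatenation head are our assumptions, not the paper's exact architecture):

```python
# Sketch: a 1-D branch for NIR spectra and a 2-D branch for images,
# concatenated before classification.
import torch
import torch.nn as nn

class DualBranchNet(nn.Module):
    def __init__(self, n_bands: int, n_classes: int = 5):
        super().__init__()
        self.spec = nn.Sequential(nn.Linear(n_bands, 128), nn.ReLU())
        self.img = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(128 + 16, n_classes)

    def forward(self, spectrum, image):
        return self.head(torch.cat([self.spec(spectrum), self.img(image)], 1))

logits = DualBranchNet(256)(torch.randn(2, 256), torch.randn(2, 3, 64, 64))
```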

23 pages, 10392 KiB  
Article
Dual-Branch Luminance–Chrominance Attention Network for Hydraulic Concrete Image Enhancement
by Zhangjun Peng, Li Li, Chuanhao Chang, Rong Tang, Guoqiang Zheng, Mingfei Wan, Juanping Jiang, Shuai Zhou, Zhenggang Tian and Zhigui Liu
Appl. Sci. 2025, 15(14), 7762; https://doi.org/10.3390/app15147762 - 10 Jul 2025
Viewed by 253
Abstract
Hydraulic concrete is a critical infrastructure material, with its surface condition playing a vital role in quality assessments for water conservancy and hydropower projects. However, images taken in complex hydraulic environments often suffer from degraded quality due to low lighting, shadows, and noise, making it difficult to distinguish defects from the background and thereby hindering accurate defect detection and damage evaluation. In this study, following systematic analyses of hydraulic concrete color space characteristics, we propose a Dual-Branch Luminance–Chrominance Attention Network (DBLCANet-HCIE) specifically designed for low-light hydraulic concrete image enhancement. Inspired by human visual perception, the network simultaneously improves global contrast and preserves fine-grained defect textures, which are essential for structural analysis. The proposed architecture consists of a Luminance Adjustment Branch (LAB) and a Chroma Restoration Branch (CRB). The LAB incorporates a Luminance-Aware Hybrid Attention Block (LAHAB) to capture both the global luminance distribution and local texture details, enabling adaptive illumination correction through comprehensive scene understanding. The CRB integrates a Channel Denoiser Block (CDB) for channel-specific noise suppression and a Frequency-Domain Detail Enhancement Block (FDDEB) to refine chrominance information and enhance subtle defect textures. A feature fusion block is designed to fuse and learn the features of the outputs from the two branches, resulting in images with enhanced luminance, reduced noise, and preserved surface anomalies. To validate the proposed approach, we construct a dedicated low-light hydraulic concrete image dataset (LLHCID). Extensive experiments conducted on both LOLv1 and LLHCID benchmarks demonstrate that the proposed method significantly enhances the visual interpretability of hydraulic concrete surfaces while effectively addressing low-light degradation challenges. Full article
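The luminance/chrominance split that motivates the dual-branch design can be illustrated in OpenCV, with simple stand-ins replacing the paper's LAHAB, CDB, and FDDEB modules:

```python
# Sketch: enhance L for illumination, denoise a/b for color, then merge.
# CLAHE and median filtering are our stand-ins for the learned branches.
import cv2
import numpy as np

def enhance_low_light(bgr: np.ndarray) -> np.ndarray:
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    # Luminance branch stand-in: adaptive histogram equalization
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    # Chroma branch stand-in: channel-wise noise suppression
    a, b = cv2.medianBlur(a, 3), cv2.medianBlur(b, 3)
    return cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)
```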

14 pages, 6074 KiB  
Article
Cross-Modal Data Fusion via Vision-Language Model for Crop Disease Recognition
by Wenjie Liu, Guoqing Wu, Han Wang and Fuji Ren
Sensors 2025, 25(13), 4096; https://doi.org/10.3390/s25134096 - 30 Jun 2025
Viewed by 353
Abstract
Crop diseases pose a significant threat to agricultural productivity and global food security. Timely and accurate disease identification is crucial for improving crop yield and quality. While most existing deep learning-based methods focus primarily on image datasets for disease recognition, they often overlook the complementary role of textual features in enhancing visual understanding. To address this problem, we propose cross-modal data fusion via a vision-language model for crop disease recognition. Our approach leverages the Zhipu.ai multimodal model to generate comprehensive textual descriptions of crop leaf diseases, including a global description, a local lesion description, and a color-texture description. These descriptions are encoded into feature vectors, while an image encoder extracts image features. A cross-attention mechanism then iteratively fuses multimodal features across multiple layers, and a classification prediction module generates classification probabilities. Extensive experiments on the Soybean Disease, AI Challenge 2018, and PlantVillage datasets demonstrate that our method outperforms state-of-the-art image-only approaches with higher accuracy and fewer parameters. Specifically, with only 1.14M model parameters, our model achieves recognition accuracies of 98.74%, 87.64%, and 99.08% on the three datasets, respectively. The results highlight the effectiveness of cross-modal learning in leveraging both visual and textual cues for precise and efficient disease recognition, offering a scalable solution for crop disease recognition. Full article
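The fusion mechanism can be sketched with a single cross-attention layer (the paper iterates this across multiple layers; dimensions here are illustrative):

```python
# Sketch: image tokens attend to encoded text descriptions, so textual cues
# re-weight the visual features before classification.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)

img_tokens = torch.randn(1, 49, 256)  # image patch features (query)
txt_tokens = torch.randn(1, 32, 256)  # encoded disease descriptions (key/value)

fused, _ = attn(query=img_tokens, key=txt_tokens, value=txt_tokens)
# 'fused' carries image tokens re-weighted by relevant textual cues and
# would feed the classification head.
```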
(This article belongs to the Section Smart Agriculture)

14 pages, 1120 KiB  
Article
Impact of Different Dehydration Methods on Drying Efficiency, Nutritional and Physico-Chemical Quality of Strawberries Slices (Fragaria ananassa)
by Patrícia Antunes, Sara Dias, Diogo Gonçalves, Telma Orvalho, Marta B. Evangelista, Enrique Pino-Hernández and Marco Alves
Processes 2025, 13(7), 2065; https://doi.org/10.3390/pr13072065 - 30 Jun 2025
Viewed by 394
Abstract
This study aimed to evaluate the drying kinetics, microstructural features, moisture content, color, pH, water activity (aw), texture, acidity, rehydration capacity, and sensorial attributes of strawberry slices processed by different drying methodologies. Strawberry samples were processed by hot air-drying (HA, 60 °C, 0.5 m/s), freeze-drying (FD, 0.055 mbar), and pulsed electric field (PEF)-assisted freeze-drying (PEFFD, 1 kV/cm and 3.2 kJ/kg). PEF pre-treatment significantly increased cell membrane permeability by forming micropores, which led to a significant reduction in moisture content of up to 8.87% and improved drying efficiency. Nonetheless, this pre-treatment did not significantly alter the drying rate due to the inherent constraints of the freeze-drying process. PEFFD samples better retained their shape, volume, and visual quality, and exhibited a maximum rehydration capacity of 64.90%. Ascorbic acid retention was found to be higher in the FD and PEFFD samples than in HA. FD and PEFFD samples showed an increase in both red and yellow hue. PEF shows promise as a pre-treatment technique, improving both drying efficiency and strawberry quality. Further studies are needed to assess PEFFD's industrial scalability and economic feasibility. Full article

17 pages, 12088 KiB  
Article
Edge-Guided DETR Model for Intelligent Sensing of Tomato Ripeness Under Complex Environments
by Jiamin Yao, Jianxuan Zhou, Yangang Nie, Jun Xue, Kai Lin and Liwen Tan
Mathematics 2025, 13(13), 2095; https://doi.org/10.3390/math13132095 - 26 Jun 2025
Viewed by 466
Abstract
Tomato ripeness detection in open-field environments is challenged by dense planting, heavy occlusion, and complex lighting conditions. Existing methods mainly rely on color and texture cues, limiting boundary perception and causing redundant predictions in crowded scenes. To address these issues, we propose an improved detection framework called Edge-Guided DETR (EG-DETR), based on the DEtection TRansformer (DETR). EG-DETR introduces edge prior information by extracting multi-scale edge features through an edge backbone network. These features are fused in the transformer decoder to guide queries toward foreground regions, which improves detection under occlusion. We further design a redundant box suppression strategy to reduce duplicate predictions caused by clustered fruits. We evaluated our method on a multimodal tomato dataset that included varied lighting conditions such as natural light, artificial light, low light, and sodium yellow light. Our experimental results show that EG-DETR achieves an AP of 83.7% under challenging lighting and occlusion, outperforming existing models. This work provides a reliable intelligent sensing solution for automated harvesting in smart agriculture. Full article
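A redundant-box suppression step of the kind described can be sketched with IoU-based non-maximum suppression (our greedy stand-in, not the authors' exact strategy):

```python
# Sketch: suppress duplicate predictions on clustered fruit by keeping the
# highest-scoring box among heavily overlapping ones.
import torch
from torchvision.ops import nms

boxes = torch.tensor([[10., 10., 50., 50.],
                      [12., 11., 52., 49.],    # near-duplicate of box 0
                      [80., 80., 120., 120.]])
scores = torch.tensor([0.92, 0.88, 0.75])

keep = nms(boxes, scores, iou_threshold=0.5)   # -> tensor([0, 2])
```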

21 pages, 6399 KiB  
Article
An Upscaling-Based Strategy to Improve the Ephemeral Gully Mapping Accuracy
by Solmaz Fathololoumi, Daniel D. Saurette, Harnoordeep Singh Mann, Naoya Kadota, Hiteshkumar B. Vasava, Mojtaba Naeimi, Prasad Daggupati and Asim Biswas
Land 2025, 14(7), 1344; https://doi.org/10.3390/land14071344 - 24 Jun 2025
Viewed by 387
Abstract
Understanding and mapping ephemeral gullies (EGs) are vital for enhancing agricultural productivity and achieving food security. This study proposes an upscaling-based strategy to refine the predictive mapping of EGs, utilizing high-resolution Pléiades Neo (0.6 m) and medium-resolution Sentinel-2 (10 m) satellite imagery, alongside ground-truth EGs mapping in Niagara Region, Canada. The research involved generating spectral feature maps using Blue, Green, Red, and Near-infrared spectral bands, complemented by indices indicative of surface wetness, vegetation, color, and soil texture. Employing the Random Forest (RF) algorithm, this study executed three distinct strategies for EGs identification. The first strategy involved direct calibration using Sentinel-2 spectral features for 10 m resolution mapping. The second strategy utilized high-resolution Pléiades Neo data for model calibration, enabling EGs mapping at resolutions of 0.6, 2, 4, 6, and 8 m. The third, or upscaling strategy, applied the high-resolution calibrated model to medium-resolution Sentinel-2 imagery, producing 10 m resolution EGs maps. The accuracy of these maps was evaluated against actual data and compared across strategies. The findings highlight the Variable Importance Measure (VIM) of different spectral features in EGs identification, with normalized near-infrared (Norm NIR) and normalized red reflectance (Norm Red) exhibiting the highest and lowest VIM, respectively. Vegetation-related indices demonstrated a higher VIM compared to surface wetness indices. The overall classification error of the upscaling strategy at spatial resolutions of 0.6, 2, 4, 6, 8, and 10 m (Upscaled), as well as that of the direct Sentinel-2 model, were 7.9%, 8.2%, 9.1%, 10.3%, 11.2%, 12.5%, and 14.5%, respectively. The errors for EGs maps at various resolutions revealed an increase in identification error with higher spatial resolution. However, the upscaling strategy significantly improved the accuracy of EGs identification in medium spatial resolution scenarios. This study not only advances the methodology for EGs mapping but also contributes to the broader field of precision agriculture and environmental management. By providing a scalable and accessible approach to EGs mapping, this research supports enhanced soil conservation practices and sustainable land management, addressing key challenges in agricultural sustainability and environmental stewardship. Full article
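The upscaling strategy reduces to: fit on fine-resolution features, predict on coarse-resolution features. A sketch with synthetic data (band normalization mirrors the Norm NIR/Norm Red features mentioned above):

```python
# Sketch: calibrate a Random Forest on features from high-resolution imagery,
# then apply it to the same features computed from medium-resolution imagery.
# Arrays are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def norm_bands(bands: np.ndarray) -> np.ndarray:
    """bands: (n_pixels, 4) = Blue, Green, Red, NIR -> per-pixel normalized
    reflectance (e.g., Norm NIR, Norm Red)."""
    return bands / bands.sum(axis=1, keepdims=True)

X_fine = norm_bands(np.random.rand(5000, 4))    # 0.6 m Pléiades Neo pixels
y_fine = np.random.randint(0, 2, 5000)          # gully / non-gully labels
rf = RandomForestClassifier(n_estimators=300).fit(X_fine, y_fine)

X_coarse = norm_bands(np.random.rand(1000, 4))  # 10 m Sentinel-2 pixels
eg_map = rf.predict(X_coarse)                   # upscaled 10 m EG map
```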
