MDPI - Publisher of Open Access Journals

14 pages, 3118 KB

Open Access Article

Reconstruction Modeling and Validation of Brown Croaker (Miichthys miiuy) Vocalizations Using Wavelet-Based Inversion and Deep Learning

by Sunhyo Kim, Jongwook Choi, Bum-Kyu Kim, Hansoo Kim, Donhyug Kang, Jee Woong Choi, Young Geul Yoon and Sungho Cho

Sensors 2025, 25(19), 6178; https://doi.org/10.3390/s25196178 (registering DOI) - 6 Oct 2025

Abstract

Fish species’ biological vocalizations serve as essential acoustic signatures for passive acoustic monitoring (PAM) and ecological assessments. However, limited availability of high-quality acoustic recordings, particularly for region-specific species like the brown croaker (Miichthys miiuy), hampers data-driven bioacoustic methodology development. In this study, we present a framework for reconstructing brown croaker vocalizations by integrating fk14 wavelet synthesis, PSO-based parameter optimization (with an objective combining correlation and normalized MSE), and deep learning-based validation. Sensitivity analysis using a normalized Bartlett processor identified delay and scale (length) as the most critical parameters, defining valid ranges that maintained waveform similarity above 98%. The reconstructed signals matched measured calls in both time and frequency domains, replicating single-pulse morphology, inter-pulse interval (IPI) distributions, and energy spectral density. Validation with a ResNet-18-based Siamese network produced near-unity cosine similarity (~0.9996) between measured and reconstructed signals. Statistical analyses (95% confidence intervals; residual errors) confirmed faithful preservation of SPL values and minor, biologically plausible IPI variations. Under noisy conditions, similarity decreased as SNR dropped, indicating that environmental noise affects reconstruction fidelity. These results demonstrate that the proposed framework can reliably generate acoustically realistic and morphologically consistent fish vocalizations, even under data-limited scenarios. The methodology holds promise for dataset augmentation, PAM applications, and species-specific call simulation. Future work will extend this framework by using reconstructed signals to train generative models (e.g., GANs, WaveNet), enabling scalable synthesis and supporting real-time adaptive modeling in field monitoring.
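The abstract mentions a PSO objective that combines correlation with normalized MSE, and a cosine-similarity check on embeddings. A minimal sketch of those two quantities follows. The function names, the equal weighting, and the form `weight * (1 - r) + (1 - weight) * NMSE` are assumptions made for illustration; the abstract does not state the exact objective used.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def normalized_mse(measured, candidate):
    """Squared error normalized by the energy of the measured signal."""
    error = sum((m - c) ** 2 for m, c in zip(measured, candidate))
    energy = sum(m ** 2 for m in measured)
    return error / energy


def reconstruction_cost(measured, candidate, weight=0.5):
    """Hypothetical cost to minimize: low when correlation is high and NMSE is low.

    The weighting scheme is an assumption, not the paper's stated objective.
    """
    r = pearson_correlation(measured, candidate)
    return weight * (1.0 - r) + (1.0 - weight) * normalized_mse(measured, candidate)


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

A particle swarm optimizer would evaluate `reconstruction_cost` for each candidate parameter set (for example delay and scale) and keep the lowest-cost candidate; a perfect reconstruction scores 0 under this sketch.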

(This article belongs to the Topic Advances in Underwater Signal Processing and Communication: Challenges, Innovations, and Applications)

12 pages, 4847 KB

Open Access Article

Surformer v1: Transformer-Based Surface Classification Using Tactile and Vision Features

by Manish Kansana, Elias Hossain, Shahram Rahimi and Noorbakhsh Amiri Golilarz

Information 2025, 16(10), 839; https://doi.org/10.3390/info16100839 - 27 Sep 2025

Abstract

Surface material recognition is a key component in robotic perception and physical interaction, particularly when leveraging both tactile and visual sensory inputs. In this work, we propose Surformer v1, a transformer-based architecture designed for surface classification using structured tactile features and Principal Component Analysis (PCA)-reduced visual embeddings extracted via ResNet 50. The model integrates modality-specific encoders with cross-modal attention layers, enabling rich interactions between vision and touch. Because state-of-the-art deep learning models already achieve remarkable performance on vision tasks, our first set of experiments focused exclusively on tactile-only surface classification. Using feature engineering, we trained and evaluated multiple machine learning models, assessing their accuracy and inference time. We then implemented an encoder-only Transformer model tailored for tactile features. This model not only achieved the highest accuracy but also demonstrated significantly faster inference than the other evaluated models, highlighting its potential for real-time applications. To extend this investigation, we introduced a multimodal fusion setup by combining vision and tactile inputs. We trained both Surformer v1 (using structured features) and a Multimodal CNN (using raw images) to examine the impact of feature-based versus image-based multimodal learning on classification accuracy and computational efficiency. The results showed that Surformer v1 achieved 99.4% accuracy with an inference time of 0.7271 ms, while the Multimodal CNN achieved slightly higher accuracy but required significantly more inference time. These findings suggest that Surformer v1 offers a compelling balance between accuracy, efficiency, and computational cost for surface material recognition. The results also underscore the effectiveness of integrating feature learning, cross-modal attention and transformer-based fusion in capturing the complementary strengths of tactile and visual modalities.
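To make the idea of cross-modal attention concrete, here is a minimal single-head sketch in which tactile feature vectors act as queries and visual embeddings act as keys and values. It is an illustration of generic scaled dot-product attention only: it has no learned projections, and the function and argument names are hypothetical, not taken from Surformer v1.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all key/value pairs.

    queries: tokens from one modality (e.g. tactile features)
    keys, values: tokens from the other modality (e.g. visual embeddings)
    Returns one fused vector per query.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        fused.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return fused
```

When all keys are identical, the attention weights are uniform and each output is simply the mean of the value vectors, which is a convenient sanity check for an implementation like this.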

(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)

24 pages, 4963 KB

Open Access Article

A Hybrid Deep Learning and Optical Flow Framework for Monocular Capsule Endoscopy Localization

by İrem Yakar, Ramazan Alper Kuçak, Serdar Bilgi, Onur Ferhanoglu and Tahir Cetin Akinci

Electronics 2025, 14(18), 3722; https://doi.org/10.3390/electronics14183722 - 19 Sep 2025

Viewed by 310

Abstract

Pose estimation and localization within the gastrointestinal tract, particularly the small bowel, are crucial for invasive medical procedures. However, the task is challenging due to the complex anatomy, homogeneous textures, and limited distinguishable features. This study proposes a hybrid deep learning (DL) method combining Convolutional Neural Network (CNN)-based pose estimation and optical flow to address these challenges in a simulated small bowel environment. Initial pose estimation was used to assess the performance of simultaneous localization and mapping (SLAM) in such complex settings, using a custom endoscope prototype with a laser, micromotor, and miniaturized camera. The results showed limited feature detection and unreliable matches due to repetitive textures. To address this issue, a hybrid CNN-based approach enhanced with Farneback optical flow was applied. Using consecutive images, three models were compared: Hybrid ResNet-50 with Farneback optical flow, ResNet-50, and NASNetLarge pretrained on ImageNet. The analysis showed that the hybrid model outperformed both ResNet-50 (0.39 cm) and NASNetLarge (1.46 cm), achieving the lowest RMSE of 0.03 cm, while feature-based SLAM failed to provide reliable results. The hybrid model also achieved a competitive inference time of 241.84 ms per frame, outperforming ResNet-50 (316.57 ms) and NASNetLarge (529.66 ms). To assess the impact of the optical flow choice, Lucas–Kanade was also implemented within the same framework and compared with the Farneback-based results. These results demonstrate that combining optical flow with ResNet-50 enhances pose estimation accuracy and stability, especially in textureless environments where traditional methods struggle. The proposed method offers a robust, real-time alternative to SLAM, with potential applications in clinical capsule endoscopy. The results are positioned as a proof-of-concept that highlights the feasibility and clinical potential of the proposed framework. Future work will extend the framework to real patient data and optimize for real-time hardware.

21 pages, 13741 KB

Open Access Article

Individual Tree Species Classification Using Pseudo Tree Crown (PTC) on Coniferous Forests

by Kongwen (Frank) Zhang, Tianning Zhang and Jane Liu

Remote Sens. 2025, 17(17), 3102; https://doi.org/10.3390/rs17173102 - 5 Sep 2025

Viewed by 776

Abstract

Coniferous forests in Canada play a vital role in carbon sequestration, wildlife conservation, climate change mitigation, and long-term sustainability. Traditional methods for classifying and segmenting coniferous trees have primarily relied on the direct use of spectral or LiDAR-based data. In 2024, we introduced a novel data representation method, pseudo tree crown (PTC), which provides a pseudo-3D pixel-value view that enhances the informational richness of images and significantly improves classification performance. While our original implementation was successfully tested on urban and deciduous trees, this study extends the application of PTC to Canadian conifer species, including jack pine, Douglas fir, spruce, and aspen. We address key challenges such as snow-covered backgrounds and evaluate the impact of training dataset size on classification results. Classification was performed using Random Forest, PyTorch (ResNet50), and YOLO versions v10, v11, and v12. The results demonstrate that PTC can substantially improve individual tree classification accuracy by up to 13%, reaching the high 90% range.

(This article belongs to the Special Issue Advanced Artificial Intelligence and Deep Learning for Remote Sensing (3rd Edition))

34 pages, 4661 KB

Open Access Article

An AHP-Based Multicriteria Framework for Evaluating Renewable Energy Service Proposals in Public Healthcare Infrastructure: A Case Study of an Italian Hospital

by Cristina Ventura, Ferdinando Chiacchio, Diego D’Urso, Giuseppe Marco Tina, Gabino Jiménez Castillo and Ludovica Maria Oliveri

Energies 2025, 18(17), 4680; https://doi.org/10.3390/en18174680 - 3 Sep 2025

Viewed by 743

Abstract

Public healthcare infrastructure is among the most energy-intensive of public facilities; therefore, it needs to become more environmentally and economically sustainable by increasing energy efficiency and improving service reliability. Achieving these goals requires modernizing hospital energy systems with renewable energy sources (RESs). This process often involves Energy Service Companies (ESCOs), which propose integrated RES technologies with tailored contractual schemes. However, comparing ESCO offers is challenging due to their heterogeneous technologies, contractual structures, and long-term performance commitments, which make simple cost-based assessments inadequate. This study develops a structured Multi-Criteria Decision-Making (MCDM) methodology to evaluate energy projects in public healthcare facilities. The framework, based on the Analytic Hierarchy Process (AHP), combines quantitative criteria (net present value, stochastic simulations of energy cost savings, and CO₂ emission reductions) with qualitative assessments (redundancy, flexibility, elasticity, and stakeholder image). It addresses the lack of standardized tools for ranking real-world ESCO proposals in public procurement. The approach is applied to a case study involving three ESCO proposals for a large hospital in Southern Italy. The results show that integrating photovoltaic generation with trigeneration achieves the highest overall score. The proposed framework provides a transparent, replicable tool to support evidence-based energy investment decisions, extendable to other public-sector infrastructures.
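As a rough illustration of how AHP turns pairwise judgments into criterion weights, the sketch below derives a priority vector with the row geometric-mean approximation and computes Saaty's consistency ratio. The example criteria and the comparison values are invented for illustration and are not taken from the study; the random-index values are the commonly tabulated Saaty constants.

```python
import math

# Saaty's random consistency index for matrices of size 3 to 5.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate the AHP priority vector with normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below about 0.1 are usually accepted."""
    n = len(matrix)
    weighted = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments: economic value vs. emissions vs. reliability.
# Entry [i][j] says how much more important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
ratio = consistency_ratio(pairwise, weights)
```

Each proposal's overall score would then be the weighted sum of its per-criterion scores. The geometric-mean method is a common stand-in for the principal-eigenvector calculation and gives the same weights when the judgments are perfectly consistent, as in this toy matrix.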

(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)

27 pages, 7274 KB

Open Access Article

Intelligent Identification of Internal Leakage of Spring Full-Lift Safety Valve Based on Improved Convolutional Neural Network

by Shuxun Li, Kang Yuan, Jianjun Hou and Xiaoqi Meng

Sensors 2025, 25(17), 5451; https://doi.org/10.3390/s25175451 - 3 Sep 2025

Viewed by 619

Abstract

In modern industry, the spring full-lift safety valve is a key device for safe pressure relief of pressure-bearing systems. Its valve seat sealing surface is easily damaged after long-term use, causing internal leakage, resulting in safety hazards and economic losses. Therefore, it is of great significance to quickly and accurately diagnose its internal leakage state. Among the current methods for identifying fluid machinery faults, model-based methods have difficulties in parameter determination. Although the data-driven convolutional neural network (CNN) has great potential in the field of fault diagnosis, it has problems such as hyperparameter selection relying on experience, insufficient capture of time series and multi-scale features, and lack of research on valve internal leakage type identification. To this end, this study proposes a safety valve internal leakage identification method based on high-frequency FPGA data acquisition and improved CNN. The acoustic emission signals of different internal leakage states are obtained through the high-frequency FPGA acquisition system, and the two-dimensional time–frequency diagram is obtained by short-time Fourier transform and input into the improved model. The model uses the leaky rectified linear unit (LReLU) activation function to enhance nonlinear expression, introduces random pooling to prevent overfitting, optimizes hyperparameters with the help of horned lizard optimization algorithm (HLOA), and integrates the bidirectional gated recurrent unit (BiGRU) and selective kernel attention module (SKAM) to enhance temporal feature extraction and multi-scale feature capture. Experiments show that the average recognition accuracy of the model for the internal leakage state of the safety valve is 99.7%, outperforming comparison models such as ResNet-18. This method provides an effective solution for the diagnosis of internal leakage of safety valves, and the signal conversion method can be extended to the fault diagnosis of other mechanical equipment. In the future, we will explore the fusion of lightweight networks and multi-source data to improve real-time performance and robustness.

(This article belongs to the Section Intelligent Sensors)

21 pages, 28885 KB

Open Access Article

Assessment of Yellow Rust (Puccinia striiformis) Infestations in Wheat Using UAV-Based RGB Imaging and Deep Learning

by Atanas Z. Atanasov, Boris I. Evstatiev, Asparuh I. Atanasov and Plamena D. Nikolova

Appl. Sci. 2025, 15(15), 8512; https://doi.org/10.3390/app15158512 - 31 Jul 2025

Cited by 2 | Viewed by 550

Abstract

Yellow rust (Puccinia striiformis) is a common wheat disease that significantly reduces yields, particularly in seasons with cooler temperatures and frequent rainfall. Early detection is essential for effective control, especially in key wheat-producing regions such as Southern Dobrudja, Bulgaria. This study presents a UAV-based approach for detecting yellow rust using only RGB imagery and deep learning for pixel-based classification. The methodology involves data acquisition, preprocessing through histogram equalization, model training, and evaluation. Among the tested models, a UnetClassifier with ResNet34 backbone achieved the highest accuracy and reliability, enabling clear differentiation between healthy and infected wheat zones. Field experiments confirmed the approach’s potential for identifying infection patterns suitable for precision fungicide application. The model also showed signs of detecting early-stage infections, although further validation is needed due to limited ground-truth data. The proposed solution offers a low-cost, accessible tool for small and medium-sized farms, reducing pesticide use while improving disease monitoring. Future work will aim to refine detection accuracy in low-infection areas and extend the model’s application to other cereal diseases.

(This article belongs to the Special Issue Advanced Computational Techniques for Plant Disease Detection)

22 pages, 1359 KB

Open Access Article

Fall Detection Using Federated Lightweight CNN Models: A Comparison of Decentralized vs. Centralized Learning

by Qasim Mahdi Haref, Jun Long and Zhan Yang

Appl. Sci. 2025, 15(15), 8315; https://doi.org/10.3390/app15158315 - 25 Jul 2025

Cited by 1 | Viewed by 620

Abstract

Fall detection is a critical task in healthcare monitoring systems, especially for elderly populations, for whom timely intervention can significantly reduce morbidity and mortality. This study proposes a privacy-preserving and scalable fall-detection framework that integrates federated learning (FL) with transfer learning (TL) to train deep learning models across decentralized data sources without compromising user privacy. The pipeline begins with data acquisition, in which annotated video-based fall-detection datasets formatted in YOLO are used to extract image crops of human subjects. These images are then preprocessed, resized, normalized, and relabeled into binary classes (fall vs. non-fall). A stratified 80/10/10 split ensures balanced training, validation, and testing. To simulate real-world federated environments, the training data is partitioned across multiple clients, each performing local training using pretrained CNN models including MobileNetV2, VGG16, EfficientNetB0, and ResNet50. Two FL topologies are implemented: a centralized server-coordinated scheme and a ring-based decentralized topology. During each round, only model weights are shared, and federated averaging (FedAvg) is applied for global aggregation. The models were trained using three random seeds to ensure result robustness and stability across varying data partitions. Among all configurations, decentralized MobileNetV2 achieved the best results, with a mean test accuracy of 0.9927, F1-score of 0.9917, and average training time of 111.17 s per round. These findings highlight the model’s strong generalization, low computational burden, and suitability for edge deployment. Future work will extend evaluation to external datasets and address issues such as client drift and adversarial robustness in federated environments.
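The aggregation step named in the abstract, federated averaging, can be sketched in a few lines. This version treats each client's model as a flat list of floats and weights clients by their number of local samples, which is the usual FedAvg formulation; whether the study weighted clients this way is an assumption, and the function name is illustrative.

```python
def federated_average(client_weights, client_sizes):
    """Combine client models into a global model by sample-weighted averaging.

    client_weights: one flat list of parameters per client (all the same length)
    client_sizes: number of local training samples held by each client
    """
    total_samples = sum(client_sizes)
    num_params = len(client_weights[0])
    return [
        sum(weights[k] * size for weights, size in zip(client_weights, client_sizes))
        / total_samples
        for k in range(num_params)
    ]


# Two hypothetical clients; the second holds three times as much data.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only parameters and sample counts cross the network in a scheme like this, so raw images stay on each client. In a server-coordinated topology one node runs this function each round; in a ring topology the same weighted sum can be accumulated as the model is passed from client to client.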

(This article belongs to the Section Computing and Artificial Intelligence)

20 pages, 2285 KB

Open Access Article

WormNet: A Multi-View Network for Silkworm Re-Identification

by Hongkang Shi, Minghui Zhu, Linbo Li, Yong Ma, Jianmei Wu, Jianfei Zhang and Junfeng Gao

Animals 2025, 15(14), 2011; https://doi.org/10.3390/ani15142011 - 8 Jul 2025

Viewed by 325

Abstract

Re-identification (ReID) has been widely applied in person and vehicle recognition tasks. This study extends its application to a novel domain: insect (silkworm) recognition. However, unlike person or vehicle ReID, silkworm ReID presents unique challenges, such as the high similarity between individuals, arbitrary poses, and significant background noise. To address these challenges, we propose a multi-view network for silkworm ReID, called WormNet, which is built upon an innovative strategy termed extraction-purification-extraction-interaction. Specifically, we introduce a multi-order feature extraction module that captures a wide range of fine-grained features by utilizing convolutional kernels of varying sizes and parallel cardinality, effectively mitigating issues of high individual similarity and diverse poses. Next, a feature mask module (FMM) is employed to purify the features in the spatial domain, thereby reducing the impact of background interference. To further enhance the data representation capabilities of the network, we propose a channel interaction module (CIM), which combines an efficient channel attention network with global response normalization (GRN) in parallel to recalibrate features, enabling the network to learn crucial information at both the local and global scales. Additionally, we introduce a new silkworm ReID dataset for network training and evaluation. The experimental results demonstrate that WormNet achieves an mAP value of 54.8% and a rank-1 value of 91.4% on the dataset, surpassing both state-of-the-art and related networks. This study offers a valuable reference for ReID in insects and other organisms.
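The two reported metrics, rank-1 and mAP, measure different things, which helps explain how a rank-1 of 91.4% can sit alongside an mAP of 54.8%. The sketch below computes both from ranked gallery results, where each query is represented as a list of booleans marking whether the gallery item at that rank shares the query's identity. The data layout and function names are assumptions for illustration.

```python
def average_precision(ranked_matches):
    """Average precision for one query, given match flags in ranked order."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def evaluate_reid(all_ranked_matches):
    """Return (rank-1 accuracy, mean average precision) over all queries."""
    num_queries = len(all_ranked_matches)
    rank1 = sum(1 for flags in all_ranked_matches if flags and flags[0]) / num_queries
    mean_ap = sum(average_precision(flags) for flags in all_ranked_matches) / num_queries
    return rank1, mean_ap


# Two hypothetical queries: the first finds a match at rank 1 and rank 3,
# the second finds its only match at rank 2.
rank1, mean_ap = evaluate_reid([[True, False, True], [False, True]])
```

Rank-1 only asks whether the top result is correct, while mAP penalizes every correct gallery image that is ranked low, so a model can retrieve one good match reliably and still score a modest mAP.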

(This article belongs to the Section Animal System and Management)

43 pages, 6844 KB

Open Access Article

CORE-ReID V2: Advancing the Domain Adaptation for Object Re-Identification with Optimized Training and Ensemble Fusion

by Trinh Quoc Nguyen, Oky Dicky Ardiansyah Prima, Syahid Al Irfan, Hindriyanto Dwi Purnomo and Radius Tanone

AI Sens. 2025, 1(1), 4; https://doi.org/10.3390/aisens1010004 - 4 Jul 2025

Viewed by 1098

Abstract

This study presents CORE-ReID V2, an enhanced framework built upon CORE-ReID V1. The new framework extends its predecessor by addressing unsupervised domain adaptation (UDA) challenges in person ReID and vehicle ReID, with further applicability to object ReID. During pre-training, CycleGAN is employed to synthesize diverse data, bridging image characteristic gaps across different domains. During fine-tuning, an advanced ensemble fusion mechanism, consisting of the Efficient Channel Attention Block (ECAB) and the Simplified Efficient Channel Attention Block (SECAB), enhances both local and global feature representations while reducing ambiguity in pseudo-labels for target samples. Experimental results on widely used UDA person ReID and vehicle ReID datasets demonstrate that the proposed framework outperforms state-of-the-art methods, achieving top performance in mean average precision (mAP) and Rank-k Accuracy (Top-1, Top-5, Top-10). Moreover, the framework supports lightweight backbones such as ResNet18 and ResNet34, ensuring both scalability and efficiency. Our work not only pushes the boundaries of UDA-based object ReID but also provides a solid foundation for further research and advancements in this domain.

36 pages, 4389 KB

Open Access Article

EffRes-DrowsyNet: A Novel Hybrid Deep Learning Model Combining EfficientNetB0 and ResNet50 for Driver Drowsiness Detection

by Sama Hussein Al-Gburi, Kanar Alaa Al-Sammak, Ion Marghescu, Claudia Cristina Oprea, Ana-Maria Claudia Drăgulinescu, Nayef A. M. Alduais, Khattab M. Ali Alheeti and Nawar Alaa Hussein Al-Sammak

Sensors 2025, 25(12), 3711; https://doi.org/10.3390/s25123711 - 13 Jun 2025

Viewed by 1387

Abstract

Driver drowsiness is a major contributor to road accidents, often resulting from delayed reaction times and impaired cognitive performance. This study introduces EffRes-DrowsyNet, a hybrid deep learning model that integrates the architectural efficiencies of EfficientNetB0 with the deep representational capabilities of ResNet50. The model is designed to detect early signs of driver fatigue through advanced video-based analytics by leveraging both computational scalability and deep feature learning. Extensive experiments were conducted on three benchmark datasets (SUST-DDD, YawDD, and NTHU-DDD) to validate the model’s performance across a range of environmental and demographic variations. EffRes-DrowsyNet achieved 97.71% accuracy, 98.07% precision, and 97.33% recall on the SUST-DDD dataset. On the YawDD dataset, it sustained a high accuracy of 92.73%, while on the NTHU-DDD dataset, it reached 95.14% accuracy, 94.09% precision, and 95.39% recall. These results affirm the model’s superior generalization and classification performance in both controlled and real-world-like settings. The findings underscore the effectiveness of hybrid deep learning models in real-time, safety-critical applications, particularly for automotive driver monitoring systems. Furthermore, EffRes-DrowsyNet’s architecture provides a scalable and adaptable solution that could extend to other attention-critical domains such as industrial machinery operation, aviation, and public safety systems.

(This article belongs to the Section Sensing and Imaging)

7 pages, 398 KB

Open Access Proceeding Paper

Enhancing Real Estate Listings Through Image Classification and Enhancement: A Comparative Study

by Eyüp Tolunay Küp, Melih Sözdinler, Ali Hakan Işık, Yalçın Doksanbir and Gökhan Akpınar

Eng. Proc. 2025, 92(1), 80; https://doi.org/10.3390/engproc2025092080 - 22 May 2025

Viewed by 1023

Abstract

We extended real estate property listings on an online prop-tech platform. On the platform, the images were classified into the specified classes according to quality criteria. The necessary interventions were made by measuring the platform’s appropriateness level and increasing the advertisements’ visual appeal. A dataset of 3000 labeled images was utilized to compare different image classification models, including convolutional neural networks (CNNs), VGG16, residual networks (ResNets), and the LLaVA large language model (LLM). Each model’s performance and benchmark results were measured to identify the most effective method. In addition, the classification pipeline was expanded using image enhancement with contrastive unsupervised representation learning (CURL). This method assessed the impact of improved image quality on classification accuracy and the overall attractiveness of property listings. For each classification model, the performance was evaluated under two conditions, with and without the application of CURL. The results showed that applying image enhancement with CURL enhances image quality and improves classification performance, particularly in models such as CNN and ResNet. The study results enable a better visual representation of real estate properties, resulting in higher-quality, more engaging listings for users. They also underscore the importance of combining advanced image processing techniques with classification models to optimize image presentation and categorization in the real estate industry. The extended platform offers information on the role of machine learning models and image enhancement methods in technology for the real estate industry. This study also proposes an alternative solution that can be integrated into intelligent listing systems to improve user experience and information accuracy. The platform demonstrates that artificial intelligence and machine learning can be integrated into cloud-distributed services, paving the way for future innovations in the real estate sector and intelligent marketplace platforms.

(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)

24 pages, 3848 KB

Open Access Article

Efficient Deep Learning Model Compression for Sensor-Based Vision Systems via Outlier-Aware Quantization

by Joonhyuk Yoo and Guenwoo Ban

Sensors 2025, 25(9), 2918; https://doi.org/10.3390/s25092918 - 5 May 2025

Cited by 2 | Viewed by 1401

Abstract

With the rapid growth of sensor technology and computer vision, efficient deep learning models are essential for real-time image feature extraction in resource-constrained environments. However, most existing quantized deep neural networks (DNNs) are highly sensitive to outliers, leading to severe performance degradation in low-precision settings. Our study reveals that outliers extending beyond the nominal weight distribution significantly increase the dynamic range, thereby reducing quantization resolution and affecting sensor-based image analysis tasks. To address this, we propose an outlier-aware quantization (OAQ) method that effectively reshapes weight distributions to enhance quantization accuracy. By analyzing previous outlier-handling techniques using structural similarity (SSIM) measurement results, we demonstrated that OAQ significantly reduced the negative impact of outliers while maintaining computational efficiency. Notably, OAQ was orthogonal to existing quantization schemes, making it compatible with various quantization methods without additional computational overhead. Experimental results on multiple CNN architectures and quantization approaches showed that OAQ effectively mitigated quantization errors. In post-training quantization (PTQ), our 4-bit OAQ ResNet20 model achieved improved accuracy compared with full-precision counterparts, while in quantization-aware training (QAT), OAQ enhanced 2-bit quantization performance by 43.55% over baseline methods. These results confirmed the potential of OAQ for optimizing deep learning models in sensor-based vision applications.
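The claim that a single outlier widens the dynamic range and coarsens quantization can be checked with a small uniform symmetric quantizer. The sketch below is not the OAQ method: the plain clipping helper is only a simple stand-in to show the effect of removing the outlier, and all names and numbers are illustrative assumptions.

```python
def quantization_step(weights, bits):
    """Step size of a uniform symmetric quantizer covering the full weight range."""
    levels = 2 ** (bits - 1) - 1  # e.g. 7 positive levels for 4 bits
    return max(abs(w) for w in weights) / levels


def quantize(weights, bits):
    """Round every weight to the nearest multiple of the step size."""
    step = quantization_step(weights, bits)
    return [round(w / step) * step for w in weights]


def clip_weights(weights, limit):
    """Clamp weights to [-limit, limit]; a naive stand-in for outlier handling."""
    return [max(-limit, min(limit, w)) for w in weights]


nominal = [-0.5, -0.25, 0.1, 0.3, 0.5]
with_outlier = nominal + [4.0]

step_nominal = quantization_step(nominal, 4)
step_outlier = quantization_step(with_outlier, 4)
quantized = quantize(with_outlier, 4)
```

With the outlier present, the 4-bit step grows from 0.5/7 to 4/7, so the weights -0.25 and 0.1 both collapse to zero. Clipping to the nominal range restores the finer step at the cost of distorting the outlier, which is the trade-off that outlier-aware schemes aim to manage more carefully.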

(This article belongs to the Special Issue Image Feature Extraction for Computer Vision Tasks in Sensor Systems and Applications)

21 pages, 1875 KB

Open AccessArticle

Direction-Aware Lightweight Framework for Traditional Mongolian Document Layout Analysis

by Chenyang Zhou, Monghjaya Ha and Licheng Wu

Appl. Sci. 2025, 15(8), 4594; https://doi.org/10.3390/app15084594 - 21 Apr 2025

Viewed by 655

Abstract

Traditional Mongolian document layout analysis faces unique challenges due to its vertical writing system and complex structural arrangements. Existing methods often struggle with the directional nature of traditional Mongolian text and require substantial computational resources. In this paper, we propose a direction-aware lightweight framework that effectively addresses these challenges. Our framework introduces three key innovations: a modified MobileNetV3 backbone with asymmetric convolutions for efficient vertical feature extraction, a dynamic feature enhancement module with channel attention for adaptive multi-scale information fusion, and a direction-aware detection head with (sin θ, cos θ) vector representation for accurate orientation modeling. We evaluate our method on TMDLAD, a newly constructed traditional Mongolian document layout analysis dataset, comparing it with both heavy ResNet-50-based models and lightweight alternatives. The experimental results demonstrate that our approach achieves state-of-the-art performance, with 0.715 mAP and 92.3% direction accuracy with a mean absolute error of only 2.5°, while maintaining high efficiency at 28.6 FPS using only 8.3 M parameters. Our model outperforms the best ResNet-50-based model by 3.6% in mAP and the best lightweight model by 4.3% in mAP, while uniquely providing direction prediction capability that other lightweight models lack. The proposed framework significantly outperforms existing methods in both accuracy and efficiency, providing a practical solution for traditional Mongolian document layout analysis that can be extended to other vertical writing systems.
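
A (sin θ, cos θ) representation is commonly used because it avoids the discontinuity of a raw angle at 0°/360°. The sketch below shows the general idea only; the function names are hypothetical, and it makes no claim about how the authors' detection head encodes, decodes, or trains on orientation.

```python
import math

def encode_angle(theta_deg):
    """Map an angle in degrees to a (sin, cos) pair on the unit circle."""
    theta = math.radians(theta_deg)
    return math.sin(theta), math.cos(theta)

def decode_angle(sin_v, cos_v):
    """Recover the angle in degrees, wrapped into [0, 360), from a (sin, cos) pair."""
    return math.degrees(math.atan2(sin_v, cos_v)) % 360.0

def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

# 359 degrees and 1 degree are numerically far apart as raw angles,
# but their (sin, cos) vectors are close, matching the 2-degree true gap.
v_a = encode_angle(359.0)
v_b = encode_angle(1.0)
vector_gap = math.dist(v_a, v_b)

print(f"raw difference: {abs(359.0 - 1.0):.1f} degrees")
print(f"angular error:  {angular_error(359.0, 1.0):.1f} degrees")
print(f"vector gap:     {vector_gap:.4f}")
print(f"round trip 90:  {decode_angle(*encode_angle(90.0)):.1f} degrees")
```

Decoding with `atan2` rather than `asin` or `acos` alone keeps the full 360° range, since both components are needed to resolve the quadrant.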

20 pages, 2484 KB

Open AccessReview

The Role of Multilevel Inverters in Mitigating Harmonics and Improving Power Quality in Renewable-Powered Smart Grids: A Comprehensive Review

by Shanikumar Vaidya, Krishnamachar Prasad and Jeff Kilby

Energies 2025, 18(8), 2065; https://doi.org/10.3390/en18082065 - 17 Apr 2025

Cited by 2 | Viewed by 1905

Abstract

The world is increasingly turning to renewable energy sources (RES) to address climate change issues and achieve net-zero carbon emissions. Integrating RES into existing power grids is necessary for sustainability, but the unpredictability and irregularity of RES can affect grid stability and cause power quality issues, leading to equipment damage and higher operational costs. As a result, the importance of RES is severely compromised. To tackle these challenges, traditional power systems (TPS) will have to become more innovative. Smart grids use advanced technology such as two-way communication between consumers and service providers, automated control, and real-time monitoring to manage power flow effectively. Inverters are effective tools for solving power quality problems in renewable-powered smart grids. However, their effectiveness depends on topology, control method, and design. This review paper focuses on the role of multilevel inverters (MLIs) in mitigating power quality issues such as voltage sag, swell, and total harmonic distortion (THD). The results presented are from simulation studies using DC sources but can be extended to RES-integrated smart grids. The comprehensive review also examines the drawbacks of TPS to understand the importance and necessity of developing a smart power system. Finally, the paper discusses future trends in MLI control technology, addressing power quality problems in smart grid environments.
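
For readers unfamiliar with THD, the standard definition is the ratio of the RMS of all harmonic components above the fundamental to the fundamental itself. The sketch below applies that definition to a list of harmonic amplitudes; the function name and the example amplitudes are hypothetical and are not taken from the review's simulation results.

```python
import math

def total_harmonic_distortion(amplitudes):
    """THD as a ratio, given amplitudes [V1, V2, V3, ...] with V1 the fundamental."""
    fundamental = amplitudes[0]
    if fundamental == 0:
        raise ValueError("fundamental amplitude must be non-zero")
    harmonic_power = sum(v ** 2 for v in amplitudes[1:])
    return math.sqrt(harmonic_power) / fundamental

# Hypothetical waveform: fundamental of 1.0 with two harmonics of 0.3 and 0.4.
example_thd = total_harmonic_distortion([1.0, 0.3, 0.4])
print(f"THD = {example_thd * 100:.1f}%")
```

A pure sinusoid has no harmonics above the fundamental and therefore a THD of zero; output waveforms that approximate a sinusoid more closely, as MLIs with more levels generally do, yield lower values.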

(This article belongs to the Section F3: Power Electronics)

Search Results (112)
