Search Results (112)

Search Parameters:
Keywords = extended ResNet-50

14 pages, 3118 KB  
Article
Reconstruction Modeling and Validation of Brown Croaker (Miichthys miiuy) Vocalizations Using Wavelet-Based Inversion and Deep Learning
by Sunhyo Kim, Jongwook Choi, Bum-Kyu Kim, Hansoo Kim, Donhyug Kang, Jee Woong Choi, Young Geul Yoon and Sungho Cho
Sensors 2025, 25(19), 6178; https://doi.org/10.3390/s25196178 (registering DOI) - 6 Oct 2025
Abstract
Fish species’ biological vocalizations serve as essential acoustic signatures for passive acoustic monitoring (PAM) and ecological assessments. However, limited availability of high-quality acoustic recordings, particularly for region-specific species like the brown croaker (Miichthys miiuy), hampers data-driven bioacoustic methodology development. In this study, we present a framework for reconstructing brown croaker vocalizations by integrating fk14 wavelet synthesis, PSO-based parameter optimization (with an objective combining correlation and normalized MSE), and deep learning-based validation. Sensitivity analysis using a normalized Bartlett processor identified delay and scale (length) as the most critical parameters, defining valid ranges that maintained waveform similarity above 98%. The reconstructed signals matched measured calls in both time and frequency domains, replicating single-pulse morphology, inter-pulse interval (IPI) distributions, and energy spectral density. Validation with a ResNet-18-based Siamese network produced near-unity cosine similarity (~0.9996) between measured and reconstructed signals. Statistical analyses (95% confidence intervals; residual errors) confirmed faithful preservation of SPL values and minor, biologically plausible IPI variations. Under noisy conditions, similarity decreased as SNR dropped, indicating that environmental noise affects reconstruction fidelity. These results demonstrate that the proposed framework can reliably generate acoustically realistic and morphologically consistent fish vocalizations, even under data-limited scenarios. The methodology holds promise for dataset augmentation, PAM applications, and species-specific call simulation. Future work will extend this framework by using reconstructed signals to train generative models (e.g., GANs, WaveNet), enabling scalable synthesis and supporting real-time adaptive modeling in field monitoring. Full article
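
The validation step in this abstract reports a cosine similarity between embeddings of measured and reconstructed calls. As a rough illustration only, the metric itself can be sketched as follows; the embedding values are hypothetical placeholders and are not outputs of the authors' ResNet-18-based Siamese network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings standing in for the two network branches' outputs.
measured_embedding = [0.12, 0.80, 0.35, 0.41]
reconstructed_embedding = [0.11, 0.82, 0.33, 0.42]

similarity = cosine_similarity(measured_embedding, reconstructed_embedding)
print(f"cosine similarity: {similarity:.4f}")
```

Values close to 1 indicate that the two embeddings point in nearly the same direction, which is how the reported figure of about 0.9996 should be read.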

12 pages, 4847 KB  
Article
Surformer v1: Transformer-Based Surface Classification Using Tactile and Vision Features
by Manish Kansana, Elias Hossain, Shahram Rahimi and Noorbakhsh Amiri Golilarz
Information 2025, 16(10), 839; https://doi.org/10.3390/info16100839 - 27 Sep 2025
Abstract
Surface material recognition is a key component in robotic perception and physical interaction, particularly when leveraging both tactile and visual sensory inputs. In this work, we propose Surformer v1, a transformer-based architecture designed for surface classification using structured tactile features and Principal Component Analysis (PCA)-reduced visual embeddings extracted via ResNet 50. The model integrates modality-specific encoders with cross-modal attention layers, enabling rich interactions between vision and touch. Currently, state-of-the-art deep learning models for vision tasks have achieved remarkable performance. With this in mind, our first set of experiments focused exclusively on tactile-only surface classification. Using feature engineering, we trained and evaluated multiple machine learning models, assessing their accuracy and inference time. We then implemented an encoder-only Transformer model tailored for tactile features. This model not only achieved the highest accuracy but also demonstrated significantly faster inference than the other evaluated models, highlighting its potential for real-time applications. To extend this investigation, we introduced a multimodal fusion setup by combining vision and tactile inputs. We trained both Surformer v1 (using structured features) and a Multimodal CNN (using raw images) to examine the impact of feature-based versus image-based multimodal learning on classification accuracy and computational efficiency. The results showed that Surformer v1 achieved 99.4% accuracy with an inference time of 0.7271 ms, while the Multimodal CNN achieved slightly higher accuracy but required significantly more inference time. These findings suggest that Surformer v1 offers a compelling balance between accuracy, efficiency, and computational cost for surface material recognition. The results also underscore the effectiveness of integrating feature learning, cross-modal attention and transformer-based fusion in capturing the complementary strengths of tactile and visual modalities.
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)

24 pages, 4963 KB  
Article
A Hybrid Deep Learning and Optical Flow Framework for Monocular Capsule Endoscopy Localization
by İrem Yakar, Ramazan Alper Kuçak, Serdar Bilgi, Onur Ferhanoglu and Tahir Cetin Akinci
Electronics 2025, 14(18), 3722; https://doi.org/10.3390/electronics14183722 - 19 Sep 2025
Viewed by 310
Abstract
Pose estimation and localization within the gastrointestinal tract, particularly the small bowel, are crucial for invasive medical procedures. However, the task is challenging due to the complex anatomy, homogeneous textures, and limited distinguishable features. This study proposes a hybrid deep learning (DL) method combining Convolutional Neural Network (CNN)-based pose estimation and optical flow to address these challenges in a simulated small bowel environment. Initial pose estimation was used to assess the performance of simultaneous localization and mapping (SLAM) in such complex settings, using a custom endoscope prototype with a laser, micromotor, and miniaturized camera. The results showed limited feature detection and unreliable matches due to repetitive textures. To address this issue, a hybrid CNN-based approach enhanced with Farneback optical flow was applied. Using consecutive images, three models were compared: Hybrid ResNet-50 with Farneback optical flow, ResNet-50, and NASNetLarge pretrained on ImageNet. The analysis showed that the hybrid model outperformed both ResNet-50 (0.39 cm) and NASNetLarge (1.46 cm), achieving the lowest RMSE of 0.03 cm, with feature-based SLAM failing to provide reliable results. The hybrid model also achieved a competitive inference speed of 241.84 ms per frame, outperforming ResNet-50 (316.57 ms) and NASNetLarge (529.66 ms). To assess the impact of the optical flow choice, Lucas–Kanade was also implemented within the same framework and compared with the Farneback-based results. These results demonstrate that combining optical flow with ResNet-50 enhances pose estimation accuracy and stability, especially in textureless environments where traditional methods struggle. The proposed method offers a robust, real-time alternative to SLAM, with potential applications in clinical capsule endoscopy. The results are positioned as a proof-of-concept that highlights the feasibility and clinical potential of the proposed framework. Future work will extend the framework to real patient data and optimize for real-time hardware.

21 pages, 13741 KB  
Article
Individual Tree Species Classification Using Pseudo Tree Crown (PTC) on Coniferous Forests
by Kongwen (Frank) Zhang, Tianning Zhang and Jane Liu
Remote Sens. 2025, 17(17), 3102; https://doi.org/10.3390/rs17173102 - 5 Sep 2025
Viewed by 776
Abstract
Coniferous forests in Canada play a vital role in carbon sequestration, wildlife conservation, climate change mitigation, and long-term sustainability. Traditional methods for classifying and segmenting coniferous trees have primarily relied on the direct use of spectral or LiDAR-based data. In 2024, we introduced a novel data representation method, pseudo tree crown (PTC), which provides a pseudo-3D pixel-value view that enhances the informational richness of images and significantly improves classification performance. While our original implementation was successfully tested on urban and deciduous trees, this study extends the application of PTC to Canadian conifer species, including jack pine, Douglas fir, spruce, and aspen. We address key challenges such as snow-covered backgrounds and evaluate the impact of training dataset size on classification results. Classification was performed using Random Forest, PyTorch (ResNet50), and YOLO versions v10, v11, and v12. The results demonstrate that PTC can substantially improve individual tree classification accuracy by up to 13%, reaching the high 90% range. Full article

34 pages, 4661 KB  
Article
An AHP-Based Multicriteria Framework for Evaluating Renewable Energy Service Proposals in Public Healthcare Infrastructure: A Case Study of an Italian Hospital
by Cristina Ventura, Ferdinando Chiacchio, Diego D’Urso, Giuseppe Marco Tina, Gabino Jiménez Castillo and Ludovica Maria Oliveri
Energies 2025, 18(17), 4680; https://doi.org/10.3390/en18174680 - 3 Sep 2025
Viewed by 743
Abstract
Public healthcare infrastructure is among the most energy-intensive of public facilities; therefore, it needs to become more environmentally and economically sustainable by increasing energy efficiency and improving service reliability. Achieving these goals requires modernizing hospital energy systems with renewable energy sources (RESs). This process often involves Energy Service Companies (ESCOs), which propose integrated RES technologies with tailored contractual schemes. However, comparing ESCO offers is challenging due to their heterogeneous technologies, contractual structures, and long-term performance commitments, which make simple cost-based assessments inadequate. This study develops a structured Multi-Criteria Decision-Making (MCDM) methodology to evaluate energy projects in public healthcare facilities. The framework, based on the Analytic Hierarchy Process (AHP), combines quantitative indicators (net present value, stochastic simulations of energy cost savings, and CO2 emission reductions) with qualitative assessments (redundancy, flexibility, elasticity, and stakeholder image). It addresses the lack of standardized tools for ranking real-world ESCO proposals in public procurement. The approach is applied to a case study involving three ESCO proposals for a large hospital in Southern Italy. The results show that integrating photovoltaic generation with trigeneration achieves the highest overall score. The proposed framework provides a transparent, replicable tool to support evidence-based energy investment decisions, extendable to other public-sector infrastructures.
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)
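
The abstract above builds on the Analytic Hierarchy Process, in which criteria weights are derived from a pairwise comparison matrix. The sketch below illustrates that general idea using the row geometric-mean approximation and Saaty's consistency ratio; the three criteria and the comparison values are hypothetical, and the paper may use a different weighting procedure.

```python
import math

# Saaty's random consistency index for small comparison matrices.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below 0.1 are usually accepted."""
    n = len(matrix)
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical pairwise comparisons: cost savings vs. emissions vs. redundancy.
comparisons = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

weights = ahp_weights(comparisons)
ratio = consistency_ratio(comparisons, weights)
print("weights:", [round(w, 3) for w in weights])
print("consistency ratio:", round(ratio, 4))
```

Each proposal's overall score would then be a weighted sum of its per-criterion scores, which is the step that allows heterogeneous offers to be ranked on one scale.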

27 pages, 7274 KB  
Article
Intelligent Identification of Internal Leakage of Spring Full-Lift Safety Valve Based on Improved Convolutional Neural Network
by Shuxun Li, Kang Yuan, Jianjun Hou and Xiaoqi Meng
Sensors 2025, 25(17), 5451; https://doi.org/10.3390/s25175451 - 3 Sep 2025
Viewed by 619
Abstract
In modern industry, the spring full-lift safety valve is a key device for safe pressure relief of pressure-bearing systems. Its valve seat sealing surface is easily damaged after long-term use, causing internal leakage, resulting in safety hazards and economic losses. Therefore, it is of great significance to quickly and accurately diagnose its internal leakage state. Among the current methods for identifying fluid machinery faults, model-based methods have difficulties in parameter determination. Although the data-driven convolutional neural network (CNN) has great potential in the field of fault diagnosis, it has problems such as hyperparameter selection relying on experience, insufficient capture of time series and multi-scale features, and lack of research on valve internal leakage type identification. To this end, this study proposes a safety valve internal leakage identification method based on high-frequency FPGA data acquisition and improved CNN. The acoustic emission signals of different internal leakage states are obtained through the high-frequency FPGA acquisition system, and the two-dimensional time–frequency diagram is obtained by short-time Fourier transform and input into the improved model. The model uses the leaky rectified linear unit (LReLU) activation function to enhance nonlinear expression, introduces random pooling to prevent overfitting, optimizes hyperparameters with the help of horned lizard optimization algorithm (HLOA), and integrates the bidirectional gated recurrent unit (BiGRU) and selective kernel attention module (SKAM) to enhance temporal feature extraction and multi-scale feature capture. Experiments show that the average recognition accuracy of the model for the internal leakage state of the safety valve is 99.7%, which is better than the comparison model such as ResNet-18. 
This method provides an effective solution for the diagnosis of internal leakage of safety valves, and the signal conversion method can be extended to the fault diagnosis of other mechanical equipment. In the future, we will explore the fusion of lightweight networks and multi-source data to improve real-time performance and robustness.
(This article belongs to the Section Intelligent Sensors)

21 pages, 28885 KB  
Article
Assessment of Yellow Rust (Puccinia striiformis) Infestations in Wheat Using UAV-Based RGB Imaging and Deep Learning
by Atanas Z. Atanasov, Boris I. Evstatiev, Asparuh I. Atanasov and Plamena D. Nikolova
Appl. Sci. 2025, 15(15), 8512; https://doi.org/10.3390/app15158512 - 31 Jul 2025
Cited by 2 | Viewed by 550
Abstract
Yellow rust (Puccinia striiformis) is a common wheat disease that significantly reduces yields, particularly in seasons with cooler temperatures and frequent rainfall. Early detection is essential for effective control, especially in key wheat-producing regions such as Southern Dobrudja, Bulgaria. This study presents a UAV-based approach for detecting yellow rust using only RGB imagery and deep learning for pixel-based classification. The methodology involves data acquisition, preprocessing through histogram equalization, model training, and evaluation. Among the tested models, a UnetClassifier with ResNet34 backbone achieved the highest accuracy and reliability, enabling clear differentiation between healthy and infected wheat zones. Field experiments confirmed the approach’s potential for identifying infection patterns suitable for precision fungicide application. The model also showed signs of detecting early-stage infections, although further validation is needed due to limited ground-truth data. The proposed solution offers a low-cost, accessible tool for small and medium-sized farms, reducing pesticide use while improving disease monitoring. Future work will aim to refine detection accuracy in low-infection areas and extend the model’s application to other cereal diseases. Full article
(This article belongs to the Special Issue Advanced Computational Techniques for Plant Disease Detection)

22 pages, 1359 KB  
Article
Fall Detection Using Federated Lightweight CNN Models: A Comparison of Decentralized vs. Centralized Learning
by Qasim Mahdi Haref, Jun Long and Zhan Yang
Appl. Sci. 2025, 15(15), 8315; https://doi.org/10.3390/app15158315 - 25 Jul 2025
Cited by 1 | Viewed by 620
Abstract
Fall detection is a critical task in healthcare monitoring systems, especially for elderly populations, for whom timely intervention can significantly reduce morbidity and mortality. This study proposes a privacy-preserving and scalable fall-detection framework that integrates federated learning (FL) with transfer learning (TL) to train deep learning models across decentralized data sources without compromising user privacy. The pipeline begins with data acquisition, in which annotated video-based fall-detection datasets formatted in YOLO are used to extract image crops of human subjects. These images are then preprocessed, resized, normalized, and relabeled into binary classes (fall vs. non-fall). A stratified 80/10/10 split ensures balanced training, validation, and testing. To simulate real-world federated environments, the training data is partitioned across multiple clients, each performing local training using pretrained CNN models including MobileNetV2, VGG16, EfficientNetB0, and ResNet50. Two FL topologies are implemented: a centralized server-coordinated scheme and a ring-based decentralized topology. During each round, only model weights are shared, and federated averaging (FedAvg) is applied for global aggregation. The models were trained using three random seeds to ensure result robustness and stability across varying data partitions. Among all configurations, decentralized MobileNetV2 achieved the best results, with a mean test accuracy of 0.9927, F1-score of 0.9917, and average training time of 111.17 s per round. These findings highlight the model’s strong generalization, low computational burden, and suitability for edge deployment. Future work will extend evaluation to external datasets and address issues such as client drift and adversarial robustness in federated environments. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
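
The abstract above states that only model weights are shared each round and that federated averaging (FedAvg) is used for global aggregation. A minimal sketch of that aggregation step, assuming each client's model is flattened into a plain list of floats and weighted by its local sample count, might look like this; the client values are hypothetical and the sketch does not cover the ring-based topology.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must share the same parameter layout")

    total_samples = sum(client_sizes)
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        share = size / total_samples  # this client's contribution to the average
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Hypothetical round: two clients holding 100 and 300 local samples.
global_weights = federated_average(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[100, 300],
)
print(global_weights)
```

Because only the parameter vectors and sample counts reach the aggregator, the raw images stay on each client, which is the privacy property the abstract emphasizes.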

20 pages, 2285 KB  
Article
WormNet: A Multi-View Network for Silkworm Re-Identification
by Hongkang Shi, Minghui Zhu, Linbo Li, Yong Ma, Jianmei Wu, Jianfei Zhang and Junfeng Gao
Animals 2025, 15(14), 2011; https://doi.org/10.3390/ani15142011 - 8 Jul 2025
Viewed by 325
Abstract
Re-identification (ReID) has been widely applied in person and vehicle recognition tasks. This study extends its application to a novel domain: insect (silkworm) recognition. However, unlike person or vehicle ReID, silkworm ReID presents unique challenges, such as the high similarity between individuals, arbitrary poses, and significant background noise. To address these challenges, we propose a multi-view network for silkworm ReID, called WormNet, which is built upon an innovative strategy termed extraction purification extraction interaction. Specifically, we introduce a multi-order feature extraction module that captures a wide range of fine-grained features by utilizing convolutional kernels of varying sizes and parallel cardinality, effectively mitigating issues of high individual similarity and diverse poses. Next, a feature mask module (FMM) is employed to purify the features in the spatial domain, thereby reducing the impact of background interference. To further enhance the data representation capabilities of the network, we propose a channel interaction module (CIM), which combines an efficient channel attention network with global response normalization (GRN) in parallel to recalibrate features, enabling the network to learn crucial information at both the local and global scales. Additionally, we introduce a new silkworm ReID dataset for network training and evaluation. The experimental results demonstrate that WormNet achieves an mAP value of 54.8% and a rank-1 value of 91.4% on the dataset, surpassing both state-of-the-art and related networks. This study offers a valuable reference for ReID in insects and other organisms. Full article
(This article belongs to the Section Animal System and Management)

43 pages, 6844 KB  
Article
CORE-ReID V2: Advancing the Domain Adaptation for Object Re-Identification with Optimized Training and Ensemble Fusion
by Trinh Quoc Nguyen, Oky Dicky Ardiansyah Prima, Syahid Al Irfan, Hindriyanto Dwi Purnomo and Radius Tanone
AI Sens. 2025, 1(1), 4; https://doi.org/10.3390/aisens1010004 - 4 Jul 2025
Viewed by 1098
Abstract
This study presents CORE-ReID V2, an enhanced framework built upon CORE-ReID V1. The new framework extends its predecessor by addressing unsupervised domain adaptation (UDA) challenges in person ReID and vehicle ReID, with further applicability to object ReID. During pre-training, CycleGAN is employed to synthesize diverse data, bridging image characteristic gaps across different domains. In the fine-tuning, an advanced ensemble fusion mechanism, consisting of the Efficient Channel Attention Block (ECAB) and the Simplified Efficient Channel Attention Block (SECAB), enhances both local and global feature representations while reducing ambiguity in pseudo-labels for target samples. Experimental results on widely used UDA person ReID and vehicle ReID datasets demonstrate that the proposed framework outperforms state-of-the-art methods, achieving top performance in mean average precision (mAP) and Rank-k Accuracy (Top-1, Top-5, Top-10). Moreover, the framework supports lightweight backbones such as ResNet18 and ResNet34, ensuring both scalability and efficiency. Our work not only pushes the boundaries of UDA-based object ReID but also provides a solid foundation for further research and advancements in this domain. Full article

36 pages, 4389 KB  
Article
EffRes-DrowsyNet: A Novel Hybrid Deep Learning Model Combining EfficientNetB0 and ResNet50 for Driver Drowsiness Detection
by Sama Hussein Al-Gburi, Kanar Alaa Al-Sammak, Ion Marghescu, Claudia Cristina Oprea, Ana-Maria Claudia Drăgulinescu, Nayef A. M. Alduais, Khattab M. Ali Alheeti and Nawar Alaa Hussein Al-Sammak
Sensors 2025, 25(12), 3711; https://doi.org/10.3390/s25123711 - 13 Jun 2025
Viewed by 1387
Abstract
Driver drowsiness is a major contributor to road accidents, often resulting from delayed reaction times and impaired cognitive performance. This study introduces EffRes-DrowsyNet, a hybrid deep learning model that integrates the architectural efficiencies of EfficientNetB0 with the deep representational capabilities of ResNet50. The model is designed to detect early signs of driver fatigue through advanced video-based analytics by leveraging both computational scalability and deep feature learning. Extensive experiments were conducted on three benchmark datasets—SUST-DDD, YawDD, and NTHU-DDD—to validate the model’s performance across a range of environmental and demographic variations. EffRes-DrowsyNet achieved 97.71% accuracy, 98.07% precision, and 97.33% recall on the SUST-DDD dataset. On the YawDD dataset, it sustained a high accuracy of 92.73%, while on the NTHU-DDD dataset, it reached 95.14% accuracy, 94.09% precision, and 95.39% recall. These results affirm the model’s superior generalization and classification performance in both controlled and real-world-like settings. The findings underscore the effectiveness of hybrid deep learning models in real-time, safety-critical applications, particularly for automotive driver monitoring systems. Furthermore, EffRes-DrowsyNet’s architecture provides a scalable and adaptable solution that could extend to other attention-critical domains such as industrial machinery operation, aviation, and public safety systems. Full article
(This article belongs to the Section Sensing and Imaging)

7 pages, 398 KB  
Proceeding Paper
Enhancing Real Estate Listings Through Image Classification and Enhancement: A Comparative Study
by Eyüp Tolunay Küp, Melih Sözdinler, Ali Hakan Işık, Yalçın Doksanbir and Gökhan Akpınar
Eng. Proc. 2025, 92(1), 80; https://doi.org/10.3390/engproc2025092080 - 22 May 2025
Viewed by 1023
Abstract
We extended real estate property listings on the online prop-tech platform. On the platform, the images were classified into the specified classes according to quality criteria. The necessary interventions were made by measuring the platform’s appropriateness level and increasing the advertisements’ visual appeal. A dataset of 3000 labeled images was utilized to compare different image classification models, including convolutional neural networks (CNNs), VGG16, residual networks (ResNets), and the LLaVA large language model (LLM). Each model’s performance and benchmark results were measured to identify the most effective method. In addition, the classification pipeline was expanded using image enhancement with contrastive unsupervised representation learning (CURL). This method assessed the impact of improved image quality on classification accuracy and the overall attractiveness of property listings. For each classification model, the performance was evaluated in binary conditions, with and without the application of CURL. The results showed that applying image enhancement with CURL enhances image quality and improves classification performance, particularly in models such as CNN and ResNet. The study results enable a better visual representation of real estate properties, resulting in higher-quality and engaging user listings. They also underscore the importance of combining advanced image processing techniques with classification models to optimize image presentation and categorization in the real estate industry. The extended platform offers information on the role of machine learning models and image enhancement methods in technology for the real estate industry. Also, an alternative solution that can be integrated into intelligent listing systems is proposed in this study to improve user experience and information accuracy. 
The platform proves that artificial intelligence and machine learning can be integrated for cloud-distributed services, paving the way for future innovations in the real estate sector and intelligent marketplace platforms.
(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)

24 pages, 3848 KB  
Article
Efficient Deep Learning Model Compression for Sensor-Based Vision Systems via Outlier-Aware Quantization
by Joonhyuk Yoo and Guenwoo Ban
Sensors 2025, 25(9), 2918; https://doi.org/10.3390/s25092918 - 5 May 2025
Cited by 2 | Viewed by 1401
Abstract
With the rapid growth of sensor technology and computer vision, efficient deep learning models are essential for real-time image feature extraction in resource-constrained environments. However, most existing quantized deep neural networks (DNNs) are highly sensitive to outliers, leading to severe performance degradation in low-precision settings. Our study reveals that outliers extending beyond the nominal weight distribution significantly increase the dynamic range, thereby reducing quantization resolution and affecting sensor-based image analysis tasks. To address this, we propose an outlier-aware quantization (OAQ) method that effectively reshapes weight distributions to enhance quantization accuracy. By analyzing previous outlier-handling techniques using structural similarity (SSIM) measurement results, we demonstrated that OAQ significantly reduced the negative impact of outliers while maintaining computational efficiency. Notably, OAQ was orthogonal to existing quantization schemes, making it compatible with various quantization methods without additional computational overhead. Experimental results on multiple CNN architectures and quantization approaches showed that OAQ effectively mitigated quantization errors. In post-training quantization (PTQ), our 4-bit OAQ ResNet20 model achieved improved accuracy compared with full-precision counterparts, while in quantization-aware training (QAT), OAQ enhanced 2-bit quantization performance by 43.55% over baseline methods. These results confirmed the potential of OAQ for optimizing deep learning models in sensor-based vision applications. Full article
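
The abstract above argues that outliers widen the dynamic range and thereby reduce quantization resolution. The toy example below illustrates only that effect for a uniform quantizer; the weight values are hypothetical, and the simple clipping used here is a generic stand-in, not the OAQ method proposed by the authors.

```python
def quantization_step(weights, bits):
    """Step size of a uniform quantizer that spans the full weight range."""
    levels = 2 ** bits - 1
    return (max(weights) - min(weights)) / levels


def clip(weights, limit):
    """Clamp every weight into [-limit, limit]."""
    return [max(-limit, min(limit, w)) for w in weights]


# Hypothetical layer weights; 4.0 is a single outlier.
weights = [-0.5, -0.2, 0.0, 0.1, 0.3, 0.5, 4.0]

step_raw = quantization_step(weights, bits=4)
step_clipped = quantization_step(clip(weights, 0.5), bits=4)

print(f"step with outlier:    {step_raw:.4f}")
print(f"step after clipping:  {step_clipped:.4f}")
```

With the outlier present, most of the 16 levels are spent on an almost empty part of the range, so the densely populated region near zero is represented much more coarsely.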

21 pages, 1875 KB  
Article
Direction-Aware Lightweight Framework for Traditional Mongolian Document Layout Analysis
by Chenyang Zhou, Monghjaya Ha and Licheng Wu
Appl. Sci. 2025, 15(8), 4594; https://doi.org/10.3390/app15084594 - 21 Apr 2025
Viewed by 655
Abstract
Traditional Mongolian document layout analysis faces unique challenges due to its vertical writing system and complex structural arrangements. Existing methods often struggle with the directional nature of traditional Mongolian text and require substantial computational resources. In this paper, we propose a direction-aware lightweight framework that effectively addresses these challenges. Our framework introduces three key innovations: a modified MobileNetV3 backbone with asymmetric convolutions for efficient vertical feature extraction, a dynamic feature enhancement module with channel attention for adaptive multi-scale information fusion, and a direction-aware detection head with a (sin θ, cos θ) vector representation for accurate orientation modeling. We evaluate our method on TMDLAD, a newly constructed traditional Mongolian document layout analysis dataset, comparing it with both heavy ResNet-50-based models and lightweight alternatives. The experimental results demonstrate that our approach achieves state-of-the-art performance, with 0.715 mAP and 92.3% direction accuracy with a mean absolute error of only 2.5°, while maintaining high efficiency at 28.6 FPS using only 8.3 M parameters. Our model outperforms the best ResNet-50-based model by 3.6% in mAP and the best lightweight model by 4.3% in mAP, while uniquely providing direction prediction capability that other lightweight models lack. The proposed framework significantly outperforms existing methods in both accuracy and efficiency, providing a practical solution for traditional Mongolian document layout analysis that can be extended to other vertical writing systems.
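A (sin θ, cos θ) representation is commonly used for orientation because it avoids the discontinuity where an angle wraps from 359° back to 0°. The sketch below shows one way such a pair could be encoded, decoded, and scored. It assumes nothing about the paper's actual detection head beyond the vector representation named in the abstract; the helper names are hypothetical.

```python
import math

def encode_angle(theta_deg):
    """Map an angle in degrees to a (sin, cos) pair."""
    t = math.radians(theta_deg)
    return math.sin(t), math.cos(t)

def decode_angle(sin_v, cos_v):
    """Recover an angle in [0, 360) from a (sin, cos) pair.

    atan2 depends only on the direction of the vector, so the pair
    does not need to be normalized to unit length first.
    """
    return math.degrees(math.atan2(sin_v, cos_v)) % 360.0

def angular_error_deg(pred_deg, target_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(pred_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)

# Round trip for a vertical direction.
s, c = encode_angle(270.0)
recovered = decode_angle(s, c)

# An unnormalized prediction decodes to the same angle.
recovered_scaled = decode_angle(0.5 * s, 0.5 * c)

# Angles on either side of the wrap-around are close, not 358 degrees apart.
wrap_error = angular_error_deg(359.0, 1.0)
```

A mean absolute error such as the one reported in the abstract would typically be computed with a wrap-aware difference like `angular_error_deg`, although the abstract does not state the exact metric definition.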

20 pages, 2484 KB  
Review
The Role of Multilevel Inverters in Mitigating Harmonics and Improving Power Quality in Renewable-Powered Smart Grids: A Comprehensive Review
by Shanikumar Vaidya, Krishnamachar Prasad and Jeff Kilby
Energies 2025, 18(8), 2065; https://doi.org/10.3390/en18082065 - 17 Apr 2025
Cited by 2 | Viewed by 1905
Abstract
The world is increasingly turning to renewable energy sources (RES) to address climate change issues and achieve net-zero carbon emissions. Integrating RES into existing power grids is necessary for sustainability, but the unpredictability and irregularity of RES can affect grid stability and generate power quality issues, leading to equipment damage and increased operational costs. As a result, the value of RES is severely compromised. To tackle these challenges, traditional power systems (TPS) will have to become more innovative. Smart grids use advanced technology such as two-way communication between consumers and service providers, automated control, and real-time monitoring to manage power flow effectively. Inverters are effective tools for solving power quality problems in renewable-powered smart grids. However, their effectiveness depends on topology, control method and design. This review paper focuses on the role of multilevel inverters (MLIs) in mitigating power quality issues such as voltage sag, swell and total harmonic distortion (THD). The results shown here are from simulation studies using DC sources but can be extended to RES-integrated smart grids. The comprehensive review also examines the drawbacks of TPS to understand the importance and necessity of developing a smart power system. Finally, the paper discusses future trends in MLI control technology, addressing power quality problems in smart grid environments.
(This article belongs to the Section F3: Power Electronics)
