Search Results (372)

Search Parameters:
Keywords = open recall

39 pages, 8108 KiB  
Article
PSMP: Category Prototype-Guided Streaming Multi-Level Perturbation for Online Open-World Object Detection
by Shibo Gu, Meng Sun, Zhihao Zhang, Yuhao Bai and Ziliang Chen
Symmetry 2025, 17(8), 1237; https://doi.org/10.3390/sym17081237 - 5 Aug 2025
Abstract
Inspired by the human ability to learn continuously and adapt to changing environments, researchers have proposed Online Open-World Object Detection (OLOWOD). This emerging paradigm faces the challenges of detecting known categories, discovering unknown ones, continuously learning new categories, and mitigating catastrophic forgetting. To address these challenges, we propose Category Prototype-guided Streaming Multi-Level Perturbation, PSMP, a plug-and-play method for OLOWOD. PSMP, comprising semantic-level, enhanced data-level, and enhanced feature-level perturbations jointly guided by category prototypes, operates at different representational levels to collaboratively extract latent knowledge across tasks and improve adaptability. In addition, PSMP constructs the “contrastive tension” based on the relationships among category prototypes. This mechanism inherently leverages the symmetric structure formed by class prototypes in the latent space, where prototypes of semantically similar categories tend to align symmetrically or equidistantly. By guiding perturbations along these symmetric axes, the model can achieve more balanced generalization between known and unknown categories. PSMP requires no additional annotations, is lightweight in design, and can be seamlessly integrated into existing OWOD methods. Extensive experiments show that PSMP achieves an improvement of approximately 1.5% to 3% in mAP for known categories compared to conventional online training methods while significantly increasing the Unknown Recall (UR) by around 4.6%. Full article
(This article belongs to the Special Issue Symmetry and Asymmetry in Computer Vision and Graphics)

19 pages, 7432 KiB  
Article
Image-Level Anti-Personnel Landmine Detection Using Deep Learning in Long-Wave Infrared Images
by Jun-Hyung Kim and Goo-Rak Kwon
Appl. Sci. 2025, 15(15), 8613; https://doi.org/10.3390/app15158613 (registering DOI) - 4 Aug 2025
Viewed by 49
Abstract
This study proposes a simple deep learning-based framework for image-level anti-personnel landmine detection in long-wave infrared imagery. To address challenges posed by the limited size of the available dataset and the small spatial size of anti-personnel landmines within images, we integrate two key techniques: transfer learning using pre-trained vision foundation models, and attention-based multiple instance learning to derive discriminative image features. We evaluate five pre-trained models, including ResNet, ConvNeXt, ViT, OpenCLIP, and InfMAE, in combination with attention-based multiple instance learning. Furthermore, to mitigate the reliance of trained models on irrelevant features such as artificial or natural structures in the background, we introduce an inpainting-based image augmentation method. Experimental results, conducted on a publicly available “legbreaker” anti-personnel landmine infrared dataset, demonstrate that the proposed framework achieves high precision and recall, validating its effectiveness for landmine detection in infrared imagery. Additional experiments are also performed on an aerial image dataset designed for detecting small-sized ship targets to further validate the effectiveness of the proposed approach. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
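
Editor's note: the attention-based multiple instance learning named in this abstract is commonly implemented as attention pooling over patch embeddings. The minimal PyTorch sketch below illustrates that standard formulation only; the feature dimension, patch count, and single-logit classifier head are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention-based MIL pooling: score each instance (image patch) embedding,
    softmax the scores into weights, and sum into one bag-level feature."""
    def __init__(self, feat_dim=768, hidden_dim=128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, 1)  # image-level "landmine present" logit

    def forward(self, instance_feats):            # (n_patches, feat_dim) from a frozen backbone
        weights = torch.softmax(self.attention(instance_feats), dim=0)  # (n_patches, 1)
        bag_feat = (weights * instance_feats).sum(dim=0)                # (feat_dim,)
        return self.classifier(bag_feat), weights

# toy usage: 49 patch embeddings from a pre-trained backbone -> one image-level prediction
logit, attn = AttentionMIL()(torch.randn(49, 768))
```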

21 pages, 4252 KiB  
Article
AnimalAI: An Open-Source Web Platform for Automated Animal Activity Index Calculation Using Interactive Deep Learning Segmentation
by Mahtab Saeidifar, Guoming Li, Lakshmish Macheeri Ramaswamy, Chongxiao Chen and Ehsan Asali
Animals 2025, 15(15), 2269; https://doi.org/10.3390/ani15152269 - 3 Aug 2025
Viewed by 175
Abstract
Monitoring the activity index of animals is crucial for assessing their welfare and behavior patterns. However, traditional methods for calculating the activity index, such as pixel intensity differencing of entire frames, are found to suffer from significant interference and noise, leading to inaccurate results. These classical approaches also do not support group or individual tracking in a user-friendly way, and no open-access platform exists for non-technical researchers. This study introduces an open-source web-based platform that allows researchers to calculate the activity index from top-view videos by selecting individual or group animals. It integrates Segment Anything Model2 (SAM2), a promptable deep learning segmentation model, to track animals without additional training or annotation. The platform accurately tracked Cobb 500 male broilers from weeks 1 to 7 with a 100% success rate, IoU of 92.21% ± 0.012, precision of 93.87% ± 0.019, recall of 98.15% ± 0.011, and F1 score of 95.94% ± 0.006, based on 1157 chickens. Statistical analysis showed that tracking 80% of birds in week 1, 60% in week 4, and 40% in week 7 was sufficient (r ≥ 0.90; p ≤ 0.048) to represent the group activity in respective ages. This platform offers a practical, accessible solution for activity tracking, supporting animal behavior analytics with minimal effort. Full article
(This article belongs to the Section Animal Welfare)
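
Editor's note: an activity index of this kind is typically a masked frame-differencing measure. The NumPy sketch below shows that idea, assuming grayscale frames and boolean animal masks (e.g., from a SAM-style segmenter); the threshold and normalization are illustrative choices, not the platform's exact formula.

```python
import numpy as np

def activity_index(prev_frame, curr_frame, prev_mask, curr_mask, thresh=15):
    """Count pixels that changed between consecutive grayscale frames, but only
    inside the union of the animal masks, then normalize by the ROI size."""
    roi = prev_mask | curr_mask                                    # boolean union of masks
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed = (diff > thresh) & roi
    return changed.sum() / max(int(roi.sum()), 1)                  # fraction of ROI pixels that moved

# a per-video score would then average the per-frame-pair indices over the clip
```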

22 pages, 728 KiB  
Article
Design and Performance Evaluation of LLM-Based RAG Pipelines for Chatbot Services in International Student Admissions
by Maksuda Khasanova Zafar kizi and Youngjung Suh
Electronics 2025, 14(15), 3095; https://doi.org/10.3390/electronics14153095 - 2 Aug 2025
Viewed by 256
Abstract
Recent advancements in large language models (LLMs) have significantly enhanced the effectiveness of Retrieval-Augmented Generation (RAG) systems. This study focuses on the development and evaluation of a domain-specific AI chatbot designed to support international student admissions by leveraging LLM-based RAG pipelines. We implement and compare multiple pipeline configurations, combining retrieval methods (e.g., Dense, MMR, Hybrid), chunking strategies (e.g., Semantic, Recursive), and both open-source and commercial LLMs. Dual evaluation datasets of LLM-generated and human-tagged QA sets are used to measure answer relevancy, faithfulness, context precision, and recall, alongside heuristic NLP metrics. Furthermore, latency analysis across different RAG stages is conducted to assess deployment feasibility in real-world educational environments. Results show that well-optimized open-source RAG pipelines can offer comparable performance to GPT-4o while maintaining scalability and cost-efficiency. These findings suggest that the proposed chatbot system can provide a practical and technically sound solution for international student services in resource-constrained academic institutions. Full article
(This article belongs to the Special Issue AI-Driven Data Analytics and Mining)
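
Editor's note: of the retrieval methods compared in this abstract, MMR (Maximal Marginal Relevance) is easy to illustrate over precomputed chunk embeddings. The sketch below is a generic implementation, not the authors' pipeline; the embedding source, k, and the λ trade-off value are assumptions.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def mmr_select(query_vec, chunk_vecs, k=4, lam=0.7):
    """Maximal Marginal Relevance: greedily pick chunks that are relevant to the
    query while penalizing similarity to chunks already selected."""
    selected, candidates = [], list(range(len(chunk_vecs)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = cosine(query_vec, chunk_vecs[i])
            redundancy = max((cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected  # indices of chunks to place in the LLM prompt context
```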

28 pages, 4950 KiB  
Article
A Method for Auto Generating a Remote Sensing Building Detection Sample Dataset Based on OpenStreetMap and Bing Maps
by Jiawei Gu, Chen Ji, Houlin Chen, Xiangtian Zheng, Liangbao Jiao and Liang Cheng
Remote Sens. 2025, 17(14), 2534; https://doi.org/10.3390/rs17142534 - 21 Jul 2025
Viewed by 347
Abstract
In remote sensing building detection tasks, data acquisition remains a critical bottleneck that limits both model performance and large-scale deployment. Due to the high cost of manual annotation, limited geographic coverage, and constraints of image acquisition conditions, obtaining large-scale, high-quality labeled datasets remains a significant challenge. To address this issue, this study proposes an automatic semantic labeling framework for remote sensing imagery. The framework leverages geospatial vector data provided by OpenStreetMap, precisely aligns it with high-resolution satellite imagery from Bing Maps through projection transformation, and incorporates a quality-aware sample filtering strategy to automatically generate accurate annotations for building detection. The resulting dataset comprises 36,647 samples, covering buildings in both urban and suburban areas across multiple cities. To evaluate its effectiveness, we selected three publicly available datasets—WHU, INRIA, and DZU—and conducted three types of experiments using the following four representative object detection models: SSD, Faster R-CNN, DETR, and YOLOv11s. The experiments include benchmark performance evaluation, input perturbation robustness testing, and cross-dataset generalization analysis. Results show that our dataset achieved a mAP at 0.5 intersection over union of up to 93.2%, with a precision of 89.4% and a recall of 90.6%, outperforming the open-source benchmarks across all four models. Furthermore, when simulating real-world noise in satellite image acquisition—such as motion blur and brightness variation—our dataset maintained a mean average precision of 90.4% under the most severe perturbation, indicating strong robustness. In addition, it demonstrated superior cross-dataset stability compared to the benchmarks. Finally, comparative experiments conducted on public test areas further validated the effectiveness and reliability of the proposed annotation framework. Full article
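
Editor's note: aligning OSM vector geometry with Bing Maps imagery, as described above, normally goes through the Web Mercator tile system. The sketch below shows the standard latitude/longitude-to-pixel conversion as an assumption about that projection step; the paper's exact alignment code is not reproduced here.

```python
import math

def latlon_to_pixel(lat, lon, zoom):
    """Standard Web Mercator (Bing tile system) conversion from WGS84 degrees to
    global pixel coordinates at a zoom level, so OSM footprints overlay the tiles."""
    lat = max(min(lat, 85.05112878), -85.05112878)   # clamp to Mercator's valid range
    map_size = 256 * (2 ** zoom)                     # world width/height in pixels at this zoom
    x = (lon + 180.0) / 360.0 * map_size
    sin_lat = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * map_size
    return x, y

# a building footprint then becomes pixel vertices: [latlon_to_pixel(la, lo, 18) for la, lo in ring]
```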

15 pages, 4874 KiB  
Article
A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification
by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam
J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025
Viewed by 466
Abstract
Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. It is also observed that the integration of P-scSE3D increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies. Full article
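
Editor's note: the (2+1)D convolution mentioned above factorizes a k×k×k 3D kernel into a spatial 1×k×k convolution followed by a temporal k×1×1 convolution, which is where the FLOP savings come from. The PyTorch sketch below shows this standard factorization; channel sizes and normalization are illustrative, not the paper's exact block.

```python
import torch
import torch.nn as nn

class Conv2Plus1D(nn.Module):
    """(2+1)D convolution: a spatial 1xkxk conv followed by a temporal kx1x1 conv,
    approximating a full kxkxk 3D conv with fewer parameters and FLOPs."""
    def __init__(self, in_ch, out_ch, k=3, mid_ch=None):
        super().__init__()
        mid_ch = mid_ch or out_ch
        self.spatial = nn.Conv3d(in_ch, mid_ch, kernel_size=(1, k, k),
                                 padding=(0, k // 2, k // 2), bias=False)
        self.bn = nn.BatchNorm3d(mid_ch)
        self.act = nn.ReLU(inplace=True)
        self.temporal = nn.Conv3d(mid_ch, out_ch, kernel_size=(k, 1, 1),
                                  padding=(k // 2, 0, 0), bias=False)

    def forward(self, x):  # x: (batch, channels, frames, height, width)
        return self.temporal(self.act(self.bn(self.spatial(x))))

# e.g., 16 RGB frames at 224x224: Conv2Plus1D(3, 64)(torch.randn(1, 3, 16, 224, 224))
```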

22 pages, 1906 KiB  
Article
Explainable and Optuna-Optimized Machine Learning for Battery Thermal Runaway Prediction Under Class Imbalance Conditions
by Abir El Abed, Ghalia Nassreddine, Obada Al-Khatib, Mohamad Nassereddine and Ali Hellany
Thermo 2025, 5(3), 23; https://doi.org/10.3390/thermo5030023 - 15 Jul 2025
Viewed by 379
Abstract
Modern energy storage systems for both power and transportation rely heavily on lithium-ion batteries (LIBs). However, their safety is threatened by a potentially hazardous failure mode known as thermal runaway (TR). Predicting and classifying TR causes can greatly enhance the safety of power and transportation systems. This paper presents an advanced machine learning method for forecasting and classifying the causes of TR. A generative model for synthetic data generation was used to handle class imbalance in the dataset. Hyperparameter optimization was conducted using Optuna for four classifiers: Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), tabular network (TabNet), and Extreme Gradient Boosting (XGBoost). A three-fold cross-validation approach was used to guarantee a robust evaluation. An open-source database of LIB failure events was used for model training and testing. The XGBoost model outperforms the other models across all TR categories, achieving 100% accuracy and a high recall (1.00). Model results were interpreted using SHapley Additive exPlanations (SHAP) analysis to identify the most significant TR predictors. The findings show that important TR indicators include energy adjusted for heat and weight loss, heater power, average cell temperature upon activation, and heater duration. These findings guide the design of safer battery systems and preventive monitoring systems for real applications. They can help experts develop more efficient battery management systems, thereby improving the performance and longevity of battery-operated devices. By enhancing the predictive knowledge of temperature-driven failure mechanisms in LIBs, the study directly advances the thermal analysis and energy storage safety domains. Full article
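
Editor's note: the Optuna-driven hyperparameter search with three-fold cross-validation described above can be sketched as below. The search space, the macro-recall scoring choice, and the synthetic placeholder data (standing in for the LIB failure-event features and TR-cause labels) are assumptions, not the study's exact configuration.

```python
import optuna
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# placeholder data standing in for the LIB failure-event features and TR-cause labels
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 600),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.6, 1.0),
    }
    clf = xgb.XGBClassifier(**params, eval_metric="mlogloss")
    # three-fold cross-validation, scored on macro recall to mirror the paper's emphasis on recall
    return cross_val_score(clf, X, y, cv=3, scoring="recall_macro").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```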

21 pages, 3826 KiB  
Article
UAV-OVD: Open-Vocabulary Object Detection in UAV Imagery via Multi-Level Text-Guided Decoding
by Lijie Tao, Guoting Wei, Zhuo Wang, Zhaoshuai Qi, Ying Li and Haokui Zhang
Drones 2025, 9(7), 495; https://doi.org/10.3390/drones9070495 - 14 Jul 2025
Viewed by 513
Abstract
Object detection in drone-captured imagery has attracted significant attention due to its wide range of real-world applications, including surveillance, disaster response, and environmental monitoring. The majority of existing methods are developed under closed-set assumptions, and although some recent studies have begun to explore open-vocabulary or open-world detection, their application to UAV imagery remains limited and underexplored. In this paper, we address this limitation by exploring the relationship between images and textual semantics to extend object detection in UAV imagery to an open-vocabulary setting. We propose a novel and efficient detector named Unmanned Aerial Vehicle Open-Vocabulary Detector (UAV-OVD), specifically designed for drone-captured scenes. To facilitate open-vocabulary object detection, we propose improvements from three complementary perspectives. First, at the training level, we design a region–text contrastive loss to replace the conventional classification loss, allowing the model to align visual regions with textual descriptions beyond fixed category sets. Structurally, building on this, we introduce a multi-level text-guided fusion decoder that integrates visual features across multiple spatial scales under language guidance, thereby improving overall detection performance and enhancing the representation and perception of small objects. Finally, from the data perspective, we enrich the original dataset with synonym-augmented category labels, enabling more flexible and semantically expressive supervision. Experiments conducted on two widely used benchmark datasets demonstrate that our approach achieves significant improvements in both mAP and Recall. For instance, for Zero-Shot Detection on xView, UAV-OVD achieves 9.9 mAP and 67.3 Recall, 1.1 and 25.6 points higher, respectively, than YOLO-World. In terms of speed, UAV-OVD achieves 53.8 FPS, nearly twice as fast as YOLO-World and five times faster than DetrReg, demonstrating its strong potential for real-time open-vocabulary detection in UAV imagery. Full article
(This article belongs to the Special Issue Applications of UVs in Digital Photogrammetry and Image Processing)
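
Editor's note: a region–text contrastive loss of the kind named above is often written as an InfoNCE-style objective over normalized region and text embeddings. The sketch below is a generic illustration of that family of losses, not necessarily UAV-OVD's exact formulation; the temperature value and the use of class-name text embeddings are assumptions.

```python
import torch
import torch.nn.functional as F

def region_text_contrastive_loss(region_feats, text_feats, labels, tau=0.07):
    """InfoNCE-style alignment: each region embedding is pulled toward the text
    embedding of its ground-truth category and pushed away from the others."""
    region_feats = F.normalize(region_feats, dim=-1)   # (n_regions, d)
    text_feats = F.normalize(text_feats, dim=-1)       # (n_classes, d), e.g. encoded class names
    logits = region_feats @ text_feats.t() / tau       # cosine similarities as logits
    return F.cross_entropy(logits, labels)             # labels: (n_regions,) category indices
```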

19 pages, 1957 KiB  
Article
Resource-Efficient Cotton Network: A Lightweight Deep Learning Framework for Cotton Disease and Pest Classification
by Zhengle Wang, Heng-Wei Zhang, Ying-Qiang Dai, Kangning Cui, Haihua Wang, Peng W. Chee and Rui-Feng Wang
Plants 2025, 14(13), 2082; https://doi.org/10.3390/plants14132082 - 7 Jul 2025
Cited by 2 | Viewed by 421
Abstract
Cotton is the most widely cultivated natural fiber crop worldwide, yet it is highly susceptible to various diseases and pests that significantly compromise both yield and quality. To enable rapid and accurate diagnosis of cotton diseases and pests—thus supporting the development of effective control strategies and facilitating genetic breeding research—we propose a lightweight model, the Resource-efficient Cotton Network (RF-Cott-Net), alongside an open-source image dataset, CCDPHD-11, encompassing 11 disease categories. Built upon the MobileViTv2 backbone, RF-Cott-Net integrates an early exit mechanism and quantization-aware training (QAT) to enhance deployment efficiency without sacrificing accuracy. Experimental results on CCDPHD-11 demonstrate that RF-Cott-Net achieves an accuracy of 98.4%, an F1-score of 98.4%, a precision of 98.5%, and a recall of 98.3%. With only 4.9 M parameters, 310 M FLOPs, an inference time of 3.8 ms, and a storage footprint of just 4.8 MB, RF-Cott-Net delivers outstanding accuracy and real-time performance, making it highly suitable for deployment on agricultural edge devices and providing robust support for in-field automated detection of cotton diseases and pests. Full article
(This article belongs to the Special Issue Precision Agriculture in Crop Production)

21 pages, 5444 KiB  
Article
Diagnosis of Schizophrenia Using Feature Extraction from EEG Signals Based on Markov Transition Fields and Deep Learning
by Alka Jalan, Deepti Mishra, Marisha and Manjari Gupta
Biomimetics 2025, 10(7), 449; https://doi.org/10.3390/biomimetics10070449 - 7 Jul 2025
Viewed by 613
Abstract
Diagnosing schizophrenia using Electroencephalograph (EEG) signals is a challenging task due to the subtle and overlapping differences between patients and healthy individuals. To overcome this difficulty, deep learning has shown strong potential, especially given its success in image recognition tasks. In many studies, one-dimensional EEG signals are transformed into two-dimensional representations to allow for image-based analysis. In this work, we have used the Markov Transition Field for converting EEG signals into two-dimensional images, capturing both the temporal patterns and statistical dynamics of the data. EEG signals are continuous time-series recordings from the brain, where the current state is often influenced by the immediately preceding state. This characteristic makes MTF particularly suitable for representing such data. After the transformation, a pre-trained VGG-16 model is employed to extract meaningful features from the images. The extracted features are then passed through two separate classification pipelines. The first uses a traditional machine learning model, Support Vector Machine, while the second follows a deep learning approach involving an autoencoder for feature selection and a neural network for final classification. The experiments were conducted using EEG data from the open-access Schizophrenia EEG database provided by MV Lomonosov Moscow State University. The proposed method achieved a highest classification accuracy of 98.51 percent and a recall of 100 percent across all folds using the deep learning pipeline. The Support Vector Machine pipeline also showed strong performance with a best accuracy of 96.28 percent and a recall of 97.89 percent. The proposed deep learning model represents a biomimetic approach to pattern recognition and decision-making. Full article
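
Editor's note: the Markov Transition Field (MTF) conversion described above quantile-bins the series, estimates the first-order bin-transition probabilities, and spreads them over every pair of time steps to form a 2D image. The NumPy sketch below shows that construction in minimal form; the number of bins and the single-channel usage are assumptions.

```python
import numpy as np

def markov_transition_field(signal, n_bins=8):
    """Build M[i, j] = W[bin(x_i), bin(x_j)] from the row-normalized first-order
    transition matrix W of the quantile-binned series."""
    edges = np.quantile(signal, np.linspace(0, 1, n_bins + 1)[1:-1])
    states = np.digitize(signal, edges)                          # bin index per sample
    W = np.zeros((n_bins, n_bins))
    for s, t in zip(states[:-1], states[1:]):
        W[s, t] += 1
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1)             # row-normalized transitions
    return W[states[:, None], states[None, :]]                   # (T, T) image for the CNN

# one EEG channel -> one (T, T) image; resize/stack channels before feeding a VGG-16-style backbone
mtf_image = markov_transition_field(np.random.randn(512))
```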

16 pages, 1535 KiB  
Article
Clinical Text Classification for Tuberculosis Diagnosis Using Natural Language Processing and Deep Learning Model with Statistical Feature Selection Technique
by Shaik Fayaz Ahamed, Sundarakumar Karuppasamy and Ponnuraja Chinnaiyan
Informatics 2025, 12(3), 64; https://doi.org/10.3390/informatics12030064 - 7 Jul 2025
Viewed by 510
Abstract
Background: In the medical field, various deep learning (DL) algorithms have been effectively used to extract valuable information from unstructured clinical text data, potentially leading to more effective outcomes. This study utilized clinical text data to classify clinical case reports into tuberculosis (TB) and non-tuberculosis (non-TB) groups using natural language processing (NLP), a pre-processing technique, and DL models. Methods: This study used 1743 open-source respiratory disease clinical text data, labeled via fuzzy matching with ICD-10 codes to create a labeled dataset. Two tokenization methods preprocessed the clinical text data, and three models were evaluated: the existing Text-CNN, the proposed Text-CNN with t-test, and Bio_ClinicalBERT. Performance was assessed using multiple metrics and validated on 228 baseline screening clinical case text data collected from ICMR–NIRT to demonstrate effective TB classification. Results: The proposed model achieved the best results in both the test and validation datasets. On the test dataset, it attained a precision of 88.19%, a recall of 90.71%, an F1-score of 89.44%, and an AUC of 0.91. Similarly, on the validation dataset, it achieved 100% precision, 98.85% recall, 99.42% F1-score, and an AUC of 0.982, demonstrating its effectiveness in TB classification. Conclusions: This study highlights the effectiveness of DL models in classifying TB cases from clinical notes. The proposed model outperformed the other two models. The TF-IDF and t-test showed statistically significant feature selection and enhanced model interpretability and efficiency, demonstrating the potential of NLP and DL in automating TB diagnosis in clinical decision settings. Full article
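
Editor's note: t-test-based feature selection over TF-IDF terms, in the spirit of the proposed Text-CNN with t-test, can be sketched as below. The vocabulary size, significance level, and binary TB/non-TB label encoding are assumptions rather than the study's exact settings.

```python
import numpy as np
from scipy import stats
from sklearn.feature_extraction.text import TfidfVectorizer

def ttest_select(texts, labels, alpha=0.05, max_features=5000):
    """Score every TF-IDF term with a two-sample t-test between TB (label 1) and
    non-TB (label 0) documents and keep only the statistically significant terms."""
    vec = TfidfVectorizer(max_features=max_features)
    X = vec.fit_transform(texts).toarray()
    labels = np.asarray(labels)
    tb, non_tb = X[labels == 1], X[labels == 0]
    _, pvals = stats.ttest_ind(tb, non_tb, axis=0, equal_var=False)
    pvals = np.nan_to_num(pvals, nan=1.0)             # constant columns -> not significant
    keep = np.where(pvals < alpha)[0]
    return X[:, keep], vec.get_feature_names_out()[keep]
```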

31 pages, 9156 KiB  
Article
A Comparative Analysis of Deep Learning-Based Segmentation Techniques for Terrain Classification in Aerial Imagery
by Martina Formichini and Carlo Alberto Avizzano
AI 2025, 6(7), 145; https://doi.org/10.3390/ai6070145 - 3 Jul 2025
Viewed by 564
Abstract
Background: Deep convolutional neural networks (CNNs) have become widely popular for many imaging applications, and they have also been applied in various studies for monitoring and mapping areas of land. Nevertheless, most of these networks were designed to perform in different scenarios, such as autonomous driving and medical imaging. Methods: In this work, we focused on the usage of existing semantic networks applied to terrain segmentation. Even though several existing networks have been used to study land segmentation using transfer learning methodologies, a comparative analysis of how the underlying network architectures perform has not yet been conducted. Since this scenario is different from the one in which these networks were developed, featuring irregular shapes and an absence of models, not all of them can be correctly transferred to this domain. Results: Fifteen state-of-the-art neural networks were compared, and we found that, in addition to slight differences in performance, there were relevant differences in the numbers and types of outliers that were worth highlighting. Our results show that the best-performing models achieved a pixel-level class accuracy of 99.06%, with an F1-score of 72.94%, 71.5% Jaccard loss, and 88.43% recall. When investigating the outliers, we found that PSPNet, FCN, and ICNet were the most effective models. Conclusions: While most of this work was performed on an existing terrain dataset collected using aerial imagery, this approach remains valid for investigation of other datasets with more classes or richer geographical extensions. For example, a dataset composed of Copernicus images opens up new opportunities for large-scale terrain analysis. Full article

19 pages, 2917 KiB  
Article
An Approach to Trustworthy Article Ranking by NLP and Multi-Layered Analysis and Optimization
by Chenhao Li, Jiyin Zhang, Weilin Chen and Xiaogang Ma
Algorithms 2025, 18(7), 408; https://doi.org/10.3390/a18070408 - 3 Jul 2025
Viewed by 282
Abstract
The rapid growth of scientific publications, coupled with rising retraction rates, has intensified the challenge of identifying trustworthy academic articles. To address this issue, we propose a three-layer ranking system that integrates natural language processing and machine learning techniques for relevance and trust assessment. First, we apply BERT-based embeddings to semantically match user queries with article content. Second, a Random Forest classifier is used to eliminate potentially problematic articles, leveraging features such as citation count, Altmetric score, and journal impact factor. Third, a custom ranking function combines relevance and trust indicators to score and sort the remaining articles. Evaluation using 16,052 articles from Retraction Watch and Web of Science datasets shows that our classifier achieves 90% accuracy and 97% recall for retracted articles. Citations emerged as the most influential trust signal (53.26%), followed by Altmetric and impact factors. This multi-layered approach offers a transparent and efficient alternative to conventional ranking algorithms, which can help researchers discover not only relevant but also reliable literature. Our system is adaptable to various domains and represents a promising tool for improving literature search and evaluation in the open science environment. Full article
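
Editor's note: the third-layer ranking function that combines relevance and trust can be sketched as a weighted score. The field names, min-max normalizations, and weights below (loosely echoing the reported importance of citations over Altmetric and impact factor) are hypothetical, not the authors' implementation.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def rank_articles(articles, query_vec, w_rel=0.6, w_trust=0.4):
    """Blend semantic relevance (query vs. article embedding) with a trust score
    built from min-max-normalized citation, Altmetric, and impact-factor values."""
    def trust(a):
        # hypothetical weights, with citations dominant as in the reported importances
        return (0.53 * a["citations_norm"]
                + 0.27 * a["altmetric_norm"]
                + 0.20 * a["impact_factor_norm"])
    scored = [(w_rel * cosine(query_vec, a["embedding"]) + w_trust * trust(a), a)
              for a in articles]
    return [a for _, a in sorted(scored, key=lambda s: s[0], reverse=True)]
```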

30 pages, 2494 KiB  
Article
A Novel Framework for Mental Illness Detection Leveraging TOPSIS-ModCHI-Based Feature-Driven Randomized Neural Networks
by Santosh Kumar Behera and Rajashree Dash
Math. Comput. Appl. 2025, 30(4), 67; https://doi.org/10.3390/mca30040067 - 30 Jun 2025
Viewed by 396
Abstract
Mental illness has emerged as a significant global health crisis, inflicting immense suffering and causing a notable decrease in productivity. Identifying mental health disorders at an early stage allows healthcare professionals to implement more targeted and impactful interventions, leading to a significant improvement in the overall well-being of the patient. Recent advances in Artificial Intelligence (AI) have opened new avenues for analyzing medical records and behavioral data of patients to assist mental health professionals in their decision-making processes. In this study, the performance of four Randomized Neural Networks (RandNNs), namely the Broad Learning System (BLS), Random Vector Functional Link Network (RVFLN), Kernelized RVFLN (KRVFLN), and Extreme Learning Machine (ELM), is explored for detecting the type of mental illness a user may have by analyzing free-form text the user has posted on social media. To improve the performance of the RandNNs when handling text documents with unbalanced class distributions, a hybrid feature selection (FS) technique named TOPSIS-ModCHI is proposed for the preprocessing stage of the classification framework. The effectiveness of the proposed FS technique with all four randomized networks is assessed on the publicly available Reddit Mental Health Dataset after experimenting on two benchmark multiclass unbalanced datasets. From the experimental results, it is inferred that detecting mental illness using BLS with TOPSIS-ModCHI produces the highest precision (0.92), recall (0.66), and F-measure (0.77), with a Hamming loss of 0.06, as compared to ELM, RVFLN, and KRVFLN, with a minimum feature size of 900. Overall, utilizing BLS for mental health analysis can offer a promising avenue toward improved interventions and a better understanding of mental health issues, aiding in decision-making processes. Full article

14 pages, 1810 KiB  
Article
Assessing the Accuracy of Diagnostic Capabilities of Large Language Models
by Andrada Elena Urda-Cîmpean, Daniel-Corneliu Leucuța, Cristina Drugan, Alina-Gabriela Duțu, Tudor Călinici and Tudor Drugan
Diagnostics 2025, 15(13), 1657; https://doi.org/10.3390/diagnostics15131657 - 29 Jun 2025
Viewed by 625
Abstract
Background: In recent years, numerous artificial intelligence applications, especially generative large language models, have evolved in the medical field. This study conducted a structured comparative analysis of four leading generative large language models (LLMs)—ChatGPT-4o (OpenAI), Grok-3 (xAI), Gemini-2.0 Flash (Google), and DeepSeek-V3 (DeepSeek)—to evaluate their diagnostic performance in clinical case scenarios. Methods: We assessed medical knowledge recall and clinical reasoning capabilities through staged, progressively complex cases, with responses graded by expert raters using a 0–5 scale. Results: All models performed better on knowledge-based questions than on reasoning tasks, highlighting the ongoing limitations in contextual diagnostic synthesis. Overall, DeepSeek outperformed the other models, achieving significantly higher scores across all evaluation dimensions (p < 0.05), particularly in regards to medical reasoning tasks. Conclusions: While these findings support the feasibility of using LLMs for medical training and decision support, the study emphasizes the need for improved interpretability, prompt optimization, and rigorous benchmarking to ensure clinical reliability. This structured, comparative approach contributes to ongoing efforts to establish standardized evaluation frameworks for integrating LLMs into diagnostic workflows. Full article
(This article belongs to the Special Issue A New Era in Diagnosis: From Biomarkers to Artificial Intelligence)