Visual Attributes in Computer Vision Applications

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: 31 December 2025 | Viewed by 9624

Special Issue Editor

Dr. Md Baharul Islam

Guest Editor

Department of Computing & Software Engineering, Florida Gulf Coast University, Fort Myers, FL 33965, USA
Interests: image processing; computer vision; human-centered computing

Special Issue Information

Dear Colleagues,

Recent advancements in computer vision algorithms have unlocked the potential of visual attributes across a wide spectrum of emerging research areas and innovative real-world applications. These include, but are not limited to, healthcare, autonomous vehicles, activity recognition, facial and gesture analysis, biomedical imaging, vision-based rehabilitation, augmented reality (AR), virtual reality (VR), mixed reality (MR), and other intelligent systems. Both static and dynamic visual attributes are key to shaping these applications. Optimizing the use of individual or combined visual attributes has the potential to significantly enhance the performance, accuracy, and impact of these systems for end users and industry. This Special Issue aims to accelerate progress in this rapidly evolving field by fostering interdisciplinary collaboration and engagement within the visual computing and intelligent computer vision communities.

We welcome submissions on a range of topics, including but not limited to visual quality computing, visual cues for activity and gesture analysis, healthcare applications, image/video segmentation and inpainting, real-world or in-the-wild applications, low-level intelligent vision systems, visual field augmentation, transformation, and gamification.

Dr. Md Baharul Islam
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

visual quality enhancement and restoration
image and video inpainting and super-resolutions
gesture recognition for human–computer interactions
sign language recognition systems
computer vision for medical imaging
visual attributes in diagnostic healthcare systems
vision-based rehabilitation and assistive technologies
visual cues for remote healthcare monitoring
vision for autonomous driving systems
object detection and tracking in dynamic environments
scene understanding and semantic segmentation for autonomous systems
visual data for edge and cloud computing
low-level vision for intelligent systems
AR/VR/MR applications in training and simulation
gamification in vision-based applications
immersive environments for learning and training
object completion and inpainting in vision systems
vision-based surveillance and security systems
visual perception under adverse conditions (e.g., low light, weather, etc.)
computer vision for environmental monitoring (e.g., agriculture, forestry, etc.)

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)

Research

16 pages, 5544 KB

Open Access Article

Visual Feature Domain Audio Coding for Anomaly Sound Detection Application

by Subin Byun and Jeongil Seo

Algorithms 2025, 18(10), 646; https://doi.org/10.3390/a18100646 - 15 Oct 2025

Viewed by 444

Abstract

Conventional audio and video codecs are designed for human perception, often discarding subtle spectral cues that are essential for machine-based analysis. To overcome this limitation, we propose a machine-oriented compression framework that reinterprets spectrograms as visual objects and applies Feature Coding for Machines (FCM) to anomalous sound detection (ASD). In our approach, audio signals are transformed into log-mel spectrograms, from which intermediate feature maps are extracted, compressed, and reconstructed through the FCM pipeline. For comparison, we implement AAC-LC (Advanced Audio Coding Low Complexity) as a representative perceptual audio codec and VVC (Versatile Video Coding) as a spectrogram-based video codec. Experiments were conducted on the DCASE (Detection and Classification of Acoustic Scenes and Events) 2023 Task 2 dataset, covering four machine types (fan, valve, toycar, slider), with anomaly detection performed using the official Autoencoder baseline model released in DCASE 2024. Detection scores were computed from reconstruction error and Mahalanobis distance. The results show that the proposed FCM-based ACoM (Audio Coding for Machines) achieves comparable or superior performance to AAC at less than half the bitrate, reliably preserving critical features even under ultra-low bitrate conditions (1.3–6.3 kbps). While VVC retains competitive performance only at high bitrates, it degrades sharply at low bitrates. These findings demonstrate that feature-based compression offers a promising direction for next-generation ACoM standardization, enabling efficient and robust ASD in bandwidth-constrained industrial environments. Full article

(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)

33 pages, 5041 KB

Open Access Article

Multimodal Video Summarization Using Machine Learning: A Comprehensive Benchmark of Feature Selection and Classifier Performance

by Elmin Marevac, Esad Kadušić, Nataša Živić, Nevzudin Buzađija, Edin Tabak and Safet Velić

Algorithms 2025, 18(9), 572; https://doi.org/10.3390/a18090572 - 10 Sep 2025

Cited by 1 | Viewed by 1938

Abstract

The exponential growth of user-generated video content necessitates efficient summarization systems for improved accessibility, retrieval, and analysis. This study presents and benchmarks a multimodal video summarization framework that classifies segments as informative or non-informative using audio, visual, and fused features. Sixty hours of annotated video across ten diverse categories were analyzed. Audio features were extracted with pyAudioAnalysis, while visual features (colour histograms, optical flow, object detection, facial recognition) were derived using OpenCV. Six supervised classifiers—Naive Bayes, K-Nearest Neighbors, Logistic Regression, Decision Tree, Random Forest, and XGBoost—were evaluated, with hyperparameters optimized via grid search. Temporal coherence was enhanced using median filtering. Random Forest achieved the best performance, with 74% AUC on fused features and a 3% F1-score gain after post-processing. Spectral flux, grayscale histograms, and optical flow emerged as key discriminative features. The best model was deployed as a practical web service using TensorFlow and Flask, integrating informative segment detection with subtitle generation via beam search to ensure coherence and coverage. System-level evaluation demonstrated low latency and efficient resource utilization under load. Overall, the results confirm the strength of multimodal fusion and ensemble learning for video summarization and highlight their potential for real-world applications in surveillance, digital archiving, and online education. Full article
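As an illustration of the median-filtering step mentioned in the abstract above, the following is a minimal sketch of how per-segment informative/non-informative predictions might be smoothed over time. The function name `median_smooth`, the window size, and the edge-padding choice are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
from statistics import median


def median_smooth(labels, window=3):
    """Smooth a sequence of 0/1 segment labels with a sliding median.

    Hypothetical helper: the window size and the edge handling
    (repeating the first and last label) are illustrative assumptions.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if not labels:
        return []
    half = window // 2
    # Pad both ends so every position sees a full, odd-length window.
    padded = [labels[0]] * half + list(labels) + [labels[-1]] * half
    # For an odd-length window of integers, the median is one of the labels.
    return [median(padded[i:i + window]) for i in range(len(labels))]


# Isolated flips are removed, while longer runs are preserved.
raw = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
print(median_smooth(raw, window=3))
```

A larger window removes longer spurious runs at the cost of blurring genuine short segments, so the window size would need tuning against annotated data.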

(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)

19 pages, 2306 KB

Open Access Article

Optimized Adaptive Multi-Scale Architecture for Surface Defect Recognition

by Xueli Chang, Yue Wang, Heping Zhang, Bogdan Adamyk and Lingyu Yan

Algorithms 2025, 18(8), 529; https://doi.org/10.3390/a18080529 - 20 Aug 2025

Cited by 1 | Viewed by 837

Abstract

Detection of defects on steel surfaces is crucial for industrial quality control. To address the issues of structural complexity, high parameter volume, and poor real-time performance in current detection models, this study proposes a lightweight model based on an improved YOLOv11. The model first reconstructs the backbone network by introducing a Reversible Connected Multi-Column Network (RevCol) to effectively preserve multi-level feature information. Second, the lightweight FasterNet is embedded into the C3k2 module, utilizing Partial Convolution (PConv) to reduce computational overhead. Additionally, a Group Convolution-driven EfficientDetect head is designed to maintain high-performance feature extraction while minimizing consumption of computational resources. Finally, a novel WISEPIoU loss function is developed by integrating WISE-IoU and POWERFUL-IoU to accelerate model convergence and optimize the accuracy of bounding box regression. Experiments on the NEU-DET dataset demonstrate that the improved model reduces parameters by 39.1% and computational complexity by 49.2% relative to the baseline, with an mAP@0.5 of 0.758 and real-time performance of 91 FPS. On the DeepPCB dataset, the model reduces parameters and computations by 39.1% and 49.2%, respectively, with an mAP@0.5 of 0.985 and real-time performance of 64 FPS. The study validates that the proposed lightweight framework effectively balances accuracy and efficiency, and proves to be a practical solution for real-time defect detection in resource-constrained environments. Full article
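For readers unfamiliar with the quantity behind both the mAP@0.5 metric and IoU-based regression losses, the following is a minimal sketch of plain intersection over union (IoU) for two axis-aligned boxes. It is not the WISEPIoU loss described in the abstract; the function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Plain intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2; this corner
    format is an assumption of the sketch, not taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Under mAP@0.5, a prediction can match a ground-truth box of the same
# class only if their IoU is at least 0.5.
predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(iou(predicted, ground_truth) >= 0.5)
```

Regression losses in the IoU family typically start from `1 - iou(...)` and add penalty or weighting terms; the specific terms used by WISEPIoU are given in the full article rather than in this sketch.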

(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)

16 pages, 12755 KB

Open Access Article

Improved Algorithm to Detect Clandestine Airstrips in Amazon RainForest

by Gabriel R. Pardini, Paulo M. Tasinaffo, Elcio H. Shiguemori, Tahisa N. Kuck, Marcos R. O. A. Maximo and William R. Gyotoku

Algorithms 2025, 18(2), 102; https://doi.org/10.3390/a18020102 - 13 Feb 2025

Viewed by 1667

Abstract

The Amazon biome is frequently targeted by illegal activities, with clandestine mining being one of the most prominent. Due to the dense forest cover, criminals often rely on covert aviation as a logistical tool to supply remote locations and sustain these activities. This work presents an enhancement to a previously developed landing strip detection algorithm tailored for the Amazon biome. The initial algorithm utilized satellite images combined with the use of Convolutional Neural Networks (CNNs) to find the targets’ spatial locations (latitude and longitude). By addressing the limitations identified in the initial approach, this refined algorithm aims to improve detection accuracy and operational efficiency in complex rainforest environments. Tests in a selected area of the Amazon showed that the modified algorithm resulted in a recall drop of approximately 1% while reducing false positives by 26.6%. The recall drop means there was a decrease in the detection of true positives, which is balanced by the reduction in false positives. When applied across the entire biome, the recall decreased by 1.7%, but the total predictions dropped by 17.88%. These results suggest that, despite a slight reduction in recall, the modifications significantly improved the original algorithm by minimizing its limitations. Additionally, the improved solution demonstrates a 25.55% faster inference time, contributing to more rapid target identification. This advancement represents a meaningful step toward more effective detection of clandestine airstrips, supporting ongoing efforts to combat illegal activities in the region. Full article

(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)

Review

47 pages, 944 KB

Open Access Review

Algorithms for Plant Monitoring Applications: A Comprehensive Review

by Giovanni Paolo Colucci, Paola Battilani, Marco Camardo Leggieri and Daniele Trinchero

Algorithms 2025, 18(2), 84; https://doi.org/10.3390/a18020084 - 5 Feb 2025

Cited by 2 | Viewed by 3755

Abstract

Many sciences exploit algorithms in a large variety of applications. In agronomy, large amounts of agricultural data are handled by adopting procedures for optimization, clustering, or automatic learning. In this particular field, the number of scientific papers has significantly increased in recent years, triggered by scientists using artificial intelligence, comprising deep learning and machine learning methods or bots, to process field, crop, plant, or leaf images. Moreover, many other examples can be found, with different algorithms applied to plant diseases and phenology. This paper reviews the publications which have appeared in the past three years, analyzing the algorithms used and classifying the agronomic aims and the crops to which the methods are applied. Starting from a broad selection of 6060 papers, we subsequently refined the search, reducing the number to 358 research articles and 30 comprehensive reviews. By summarizing the advantages of applying algorithms to agronomic analyses, we propose a guide to farming practitioners, agronomists, researchers, and policymakers regarding best practices, challenges, and visions to counteract the effects of climate change, promoting a transition towards more sustainable, productive, and cost-effective farming and encouraging the introduction of smart technologies. Full article

(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)
