Visual Attributes in Computer Vision Applications

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: 31 December 2025 | Viewed by 6802

Special Issue Editor


Dr. Md Baharul Islam
Guest Editor
Department of Computing & Software Engineering, Florida Gulf Coast University, Fort Myers, FL 33965, USA
Interests: image processing; computer vision; human-centered computing

Special Issue Information

Dear Colleagues,

Recent advancements in computer vision algorithms have unlocked the potential of visual attributes across a wide spectrum of emerging research areas and innovative real-world applications. These include, but are not limited to, healthcare, autonomous vehicles, activity recognition, facial and gesture analysis, biomedical imaging, vision-based rehabilitation, augmented reality (AR), virtual reality (VR), mixed reality (MR), and other intelligent systems. Both static and dynamic visual attributes are key to shaping these applications. Optimizing the use of individual or combined visual attributes has the potential to significantly enhance the performance, accuracy, and impact of these systems for end users and industry. This Special Issue aims to accelerate progress in this rapidly evolving field by fostering interdisciplinary collaboration and engagement within the visual computing and intelligent computer vision communities.

We welcome submissions on a range of topics, including but not limited to visual quality computing, visual cues for activity and gesture analysis, healthcare applications, image/video segmentation and inpainting, real-world or in-the-wild applications, low-level intelligent vision systems, visual field augmentation, transformation, and gamification.

Dr. Md Baharul Islam
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • visual quality enhancement and restoration
  • image and video inpainting and super-resolution
  • gesture recognition for human–computer interactions
  • sign language recognition systems
  • computer vision for medical imaging
  • visual attributes in diagnostic healthcare systems
  • vision-based rehabilitation and assistive technologies
  • visual cues for remote healthcare monitoring
  • vision for autonomous driving systems
  • object detection and tracking in dynamic environments
  • scene understanding and semantic segmentation for autonomous systems
  • visual data for edge and cloud computing
  • low-level vision for intelligent systems
  • AR/VR/MR applications in training and simulation
  • gamification in vision-based applications
  • immersive environments for learning and training
  • object completion and inpainting in vision systems
  • vision-based surveillance and security systems
  • visual perception under adverse conditions (e.g., low light, weather)
  • computer vision for environmental monitoring (e.g., agriculture, forestry)

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)


Research


16 pages, 5544 KB  
Article
Visual Feature Domain Audio Coding for Anomaly Sound Detection Application
by Subin Byun and Jeongil Seo
Algorithms 2025, 18(10), 646; https://doi.org/10.3390/a18100646 (registering DOI) - 15 Oct 2025
Abstract
Conventional audio and video codecs are designed for human perception, often discarding subtle spectral cues that are essential for machine-based analysis. To overcome this limitation, we propose a machine-oriented compression framework that reinterprets spectrograms as visual objects and applies Feature Coding for Machines (FCM) to anomalous sound detection (ASD). In our approach, audio signals are transformed into log-mel spectrograms, from which intermediate feature maps are extracted, compressed, and reconstructed through the FCM pipeline. For comparison, we implement AAC-LC (Advanced Audio Coding Low Complexity) as a representative perceptual audio codec and VVC (Versatile Video Coding) as a spectrogram-based video codec. Experiments were conducted on the DCASE (Detection and Classification of Acoustic Scenes and Events) 2023 Task 2 dataset, covering four machine types (fan, valve, toycar, slider), with anomaly detection performed using the official Autoencoder baseline model released in DCASE 2024. Detection scores were computed from reconstruction error and Mahalanobis distance. The results show that the proposed FCM-based ACoM (Audio Coding for Machines) achieves comparable or superior performance to AAC at less than half the bitrate, reliably preserving critical features even under ultra-low bitrate conditions (1.3–6.3 kbps). VVC retains competitive performance only at high bitrates and degrades sharply at low bitrates. These findings demonstrate that feature-based compression offers a promising direction for next-generation ACoM standardization, enabling efficient and robust ASD in bandwidth-constrained industrial environments. Full article
(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)
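The abstract above mentions detection scores based on Mahalanobis distance. As a rough illustration only, and not the authors' implementation, the sketch below scores a feature vector by its Mahalanobis distance from the mean of a set of normal-condition vectors. The function names and the toy feature values are hypothetical, and the covariance matrix is assumed to be non-singular.

```python
from math import sqrt


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mean):
    """Unbiased sample covariance matrix (divides by n - 1)."""
    n, dim = len(rows), len(mean)
    cov = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j]
    return [[value / (n - 1) for value in r] for r in cov]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular; a singular matrix raises
    ZeroDivisionError.
    """
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def mahalanobis(vec, mean, cov):
    """sqrt((v - mean)^T cov^{-1} (v - mean)), without forming the inverse."""
    diff = [v - m for v, m in zip(vec, mean)]
    return sqrt(sum(d * y for d, y in zip(diff, solve(cov, diff))))


# Hypothetical 2-D features from normal machine sounds (illustrative values).
normal_features = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
mu = mean_vector(normal_features)
sigma = covariance_matrix(normal_features, mu)
near_score = mahalanobis([2.5, 3.0], mu, sigma)
far_score = mahalanobis([9.0, -3.0], mu, sigma)
print(near_score < far_score)  # a larger score suggests a more anomalous clip
```

In practice a threshold on the score would separate normal from anomalous clips; how the paper combines this distance with reconstruction error is not specified in the abstract.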

33 pages, 5041 KB  
Article
Multimodal Video Summarization Using Machine Learning: A Comprehensive Benchmark of Feature Selection and Classifier Performance
by Elmin Marevac, Esad Kadušić, Nataša Živić, Nevzudin Buzađija, Edin Tabak and Safet Velić
Algorithms 2025, 18(9), 572; https://doi.org/10.3390/a18090572 - 10 Sep 2025
Viewed by 682
Abstract
The exponential growth of user-generated video content necessitates efficient summarization systems for improved accessibility, retrieval, and analysis. This study presents and benchmarks a multimodal video summarization framework that classifies segments as informative or non-informative using audio, visual, and fused features. Sixty hours of annotated video across ten diverse categories were analyzed. Audio features were extracted with pyAudioAnalysis, while visual features (colour histograms, optical flow, object detection, facial recognition) were derived using OpenCV. Six supervised classifiers—Naive Bayes, K-Nearest Neighbors, Logistic Regression, Decision Tree, Random Forest, and XGBoost—were evaluated, with hyperparameters optimized via grid search. Temporal coherence was enhanced using median filtering. Random Forest achieved the best performance, with 74% AUC on fused features and a 3% F1-score gain after post-processing. Spectral flux, grayscale histograms, and optical flow emerged as key discriminative features. The best model was deployed as a practical web service using TensorFlow and Flask, integrating informative segment detection with subtitle generation via beam search to ensure coherence and coverage. System-level evaluation demonstrated low latency and efficient resource utilization under load. Overall, the results confirm the strength of multimodal fusion and ensemble learning for video summarization and highlight their potential for real-world applications in surveillance, digital archiving, and online education. Full article
(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)
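The abstract above notes that temporal coherence was enhanced using median filtering. A minimal sketch of that idea, assuming per-segment binary labels (1 = informative, 0 = non-informative) and a hypothetical window of three segments, might look as follows. The function name, window size, and edge-padding choice are assumptions for illustration and are not taken from the paper.

```python
from statistics import median


def median_filter(labels, window=3):
    """Smooth a sequence of binary labels with a sliding median.

    The sequence is padded by repeating its first and last values so that
    every window has an odd length and the median is always 0 or 1.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    if not labels:
        return []
    half = window // 2
    padded = [labels[0]] * half + list(labels) + [labels[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(labels))]


# Isolated flips (index 1 and index 6) are removed by the filter.
raw = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
print(median_filter(raw))  # [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

A wider window removes longer spurious runs at the cost of blurring genuine segment boundaries.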

19 pages, 2306 KB  
Article
Optimized Adaptive Multi-Scale Architecture for Surface Defect Recognition
by Xueli Chang, Yue Wang, Heping Zhang, Bogdan Adamyk and Lingyu Yan
Algorithms 2025, 18(8), 529; https://doi.org/10.3390/a18080529 - 20 Aug 2025
Cited by 1 | Viewed by 667
Abstract
Detection of defects on steel surfaces is crucial for industrial quality control. To address the issues of structural complexity, high parameter volume, and poor real-time performance in current detection models, this study proposes a lightweight model based on an improved YOLOv11. The model first reconstructs the backbone network by introducing a Reversible Connected Multi-Column Network (RevCol) to effectively preserve multi-level feature information. Second, the lightweight FasterNet is embedded into the C3k2 module, utilizing Partial Convolution (PConv) to reduce computational overhead. Additionally, a Group Convolution-driven EfficientDetect head is designed to maintain high-performance feature extraction while minimizing consumption of computational resources. Finally, a novel WISEPIoU loss function is developed by integrating WISE-IoU and POWERFUL-IoU to accelerate model convergence and optimize the accuracy of bounding box regression. Experiments on the NEU-DET dataset demonstrate that the improved model achieves a 39.1% reduction in parameters and a 49.2% reduction in computational complexity relative to the baseline, with an mAP@0.5 of 0.758 and real-time performance of 91 FPS. On the DeepPCB dataset, the model reduces parameters and computations by 39.1% and 49.2%, respectively, with an mAP@0.5 of 0.985 and real-time performance of 64 FPS. The study validates that the proposed lightweight framework effectively balances accuracy and efficiency and is a practical solution for real-time defect detection in resource-constrained environments. Full article
(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)
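The mAP@0.5 figures in the abstract above rest on intersection over union (IoU): under the usual convention, a predicted box counts as a match when its IoU with a ground-truth box is at least 0.5. The sketch below computes plain IoU for axis-aligned boxes given as (x1, y1, x2, y2). It is not the WISEPIoU loss proposed in the paper, and the box coordinates are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth defect boxes.
predicted = (10.0, 10.0, 50.0, 50.0)
ground_truth = (20.0, 20.0, 60.0, 60.0)
is_match_at_0_5 = iou(predicted, ground_truth) >= 0.5
print(is_match_at_0_5)
```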

16 pages, 12755 KB  
Article
Improved Algorithm to Detect Clandestine Airstrips in Amazon RainForest
by Gabriel R. Pardini, Paulo M. Tasinaffo, Elcio H. Shiguemori, Tahisa N. Kuck, Marcos R. O. A. Maximo and William R. Gyotoku
Algorithms 2025, 18(2), 102; https://doi.org/10.3390/a18020102 - 13 Feb 2025
Viewed by 1455
Abstract
The Amazon biome is frequently targeted by illegal activities, with clandestine mining being one of the most prominent. Due to the dense forest cover, criminals often rely on covert aviation as a logistical tool to supply remote locations and sustain these activities. This work presents an enhancement to a previously developed landing strip detection algorithm tailored for the Amazon biome. The initial algorithm utilized satellite images combined with the use of Convolutional Neural Networks (CNNs) to find the targets’ spatial locations (latitude and longitude). By addressing the limitations identified in the initial approach, this refined algorithm aims to improve detection accuracy and operational efficiency in complex rainforest environments. Tests in a selected area of the Amazon showed that the modified algorithm resulted in a recall drop of approximately 1% while reducing false positives by 26.6%. The recall drop means there was a decrease in the detection of true positives, which is balanced by the reduction in false positives. When applied across the entire biome, the recall decreased by 1.7%, but the total predictions dropped by 17.88%. These results suggest that, despite a slight reduction in recall, the modifications significantly improved the original algorithm by minimizing its limitations. Additionally, the improved solution demonstrates a 25.55% faster inference time, contributing to more rapid target identification. This advancement represents a meaningful step toward more effective detection of clandestine airstrips, supporting ongoing efforts to combat illegal activities in the region. Full article
(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)
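The trade-off described in the abstract above, a small loss in recall in exchange for fewer false positives, can be made concrete with the standard definitions of recall and precision. The counts below are hypothetical and chosen only for illustration; they are not the paper's data.

```python
def recall(true_pos, false_neg):
    """Fraction of real targets that were detected."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos, false_pos):
    """Fraction of detections that are real targets."""
    return true_pos / (true_pos + false_pos)


# Hypothetical confusion counts before and after a detector modification.
before = {"tp": 90, "fp": 50, "fn": 10}
after = {"tp": 89, "fp": 37, "fn": 11}

for name, c in (("before", before), ("after", after)):
    r = recall(c["tp"], c["fn"])
    p = precision(c["tp"], c["fp"])
    print(f"{name}: recall={r:.3f} precision={p:.3f}")
```

With these illustrative counts, recall falls by one point while precision rises, which is the kind of exchange the abstract reports.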

Review


47 pages, 944 KB  
Review
Algorithms for Plant Monitoring Applications: A Comprehensive Review
by Giovanni Paolo Colucci, Paola Battilani, Marco Camardo Leggieri and Daniele Trinchero
Algorithms 2025, 18(2), 84; https://doi.org/10.3390/a18020084 - 5 Feb 2025
Cited by 1 | Viewed by 3180
Abstract
Many sciences exploit algorithms in a large variety of applications. In agronomy, large amounts of agricultural data are handled by adopting procedures for optimization, clustering, or automatic learning. In this particular field, the number of scientific papers has significantly increased in recent years, triggered by scientists using artificial intelligence, comprising deep learning and machine learning methods or bots, to process field, crop, plant, or leaf images. Moreover, many other examples can be found, with different algorithms applied to plant diseases and phenology. This paper reviews the publications which have appeared in the past three years, analyzing the algorithms used and classifying the agronomic aims and the crops to which the methods are applied. Starting from a broad selection of 6060 papers, we subsequently refined the search, reducing the number to 358 research articles and 30 comprehensive reviews. By summarizing the advantages of applying algorithms to agronomic analyses, we propose a guide to farming practitioners, agronomists, researchers, and policymakers regarding best practices, challenges, and visions to counteract the effects of climate change, promoting a transition towards more sustainable, productive, and cost-effective farming and encouraging the introduction of smart technologies. Full article
(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)
