Sensors
  • Article
  • Open Access

20 November 2025

Blind Image Quality Assessment Using Convolutional Neural Networks

Department of Data Science and Engineering, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
* Author to whom correspondence should be addressed.
This article belongs to the Section Sensing and Imaging

Abstract

In the domain of image and multimedia processing, image quality is a critical factor, as it directly influences the performance of subsequent tasks such as compression, transmission, and content analysis. Reliable assessment of image quality is therefore essential not only for benchmarking algorithms but also for ensuring user satisfaction in real-world multimedia applications. The most advanced blind image quality assessment (BIQA) methods are typically built upon deep learning models and rely on complex architectures that, while effective, require substantial computational resources and large-scale training datasets. This complexity can limit their scalability and practical deployment, particularly in resource-constrained environments. In this paper, we revisit a model inspired by one of the early applications of convolutional neural networks (CNNs) in BIQA and demonstrate that by leveraging recent advancements in machine learning, such as Bayesian hyperparameter optimization and widely used stochastic optimization methods (e.g., Adam), it is possible to achieve competitive performance using a simpler, more scalable, and lightweight architecture. To evaluate the proposed approach, we conducted extensive experiments on widely used benchmark datasets, including TID2013 and KADID-10k. The results show that the proposed model achieves competitive performance while maintaining a substantially more efficient design. These findings suggest that lightweight CNN-based models, when combined with modern optimization strategies, can serve as a viable alternative to more elaborate frameworks, offering an improved balance between accuracy, efficiency, and scalability.

1. Introduction

The development of the Internet and significant progress in imaging device technology (cameras, smartphones, etc.) have led to the acquisition and processing of a vast number of digital images. The quality of these images is critical for the performance of vision systems. Therefore, there is a need for metrics and algorithms for image quality assessment (IQA) capable of replacing the most reliable evaluation procedures, namely subjective assessments performed by human observers, which reflect the properties of the human visual system (HVS). These assessments are typically expressed as the mean opinion score (MOS), defined as the average of individual subjective ratings. IQA methods are commonly classified into three categories: Full-Reference IQA (FR-IQA), Reduced-Reference IQA (RR-IQA), and No-Reference IQA (NR-IQA) [1]. The latter category is also referred to as blind image quality assessment (BIQA). A reference image is understood as an undistorted image representing the original, high-quality content. FR-IQA and RR-IQA methods, which utilize reference information, generally provide strong performance [2,3]. Unfortunately, in many real-world applications, reference images are unavailable. BIQA methods, which do not require access to reference images, are therefore finding a growing number of applications. Rapid progress has been made in BIQA research, from methods based on natural scene statistics (NSS) to deep learning (DL) models [4,5].
In the age of artificial intelligence, deep neural networks—particularly convolutional neural networks (CNNs)—have outperformed traditional approaches by enabling joint end-to-end learning of features and regression directly from raw input data [6]. In recent years, CNN architectures inspired by models such as ResNet and VGG have become less dominant in BIQA, increasingly being complemented or replaced by vision transformers, which allow models to achieve higher correlations with MOS. However, CNNs continue to play an important role in BIQA systems designed for mobile, embedded, and other resource-constrained environments, where high computational efficiency and low power consumption are required. In such scenarios, lightweight CNN architectures (typically containing 1–10 million parameters) offer fast image quality prediction. The BIQA model proposed in this paper adopts this lightweight design, providing computational efficiency while maintaining competitive performance and suitability for the aforementioned applications.
The paper is organized as follows. After a brief introduction that outlines the IQA topic, Section 2 discusses the applications of CNNs in the field of BIQA, referencing both classic works from about a decade ago and several recent review articles. Section 3 describes the details of the proposed network architecture as well as the programming tools and computer hardware used to develop the CNN. Section 4 presents the experimental results obtained on the TID2013 and KADID-10k image databases. Finally, conclusions are presented in Section 5.

3. Materials and Methods

The model we propose is based on the architecture introduced by Kang et al. [7], widely regarded as the first application of CNNs to the BIQA problem. Although current methods report higher accuracy, this architecture stands out for its significantly lower computational cost and greater explainability.

3.1. Data Preprocessing

Data preprocessing closely follows the implementation of Kang et al. [7]: images are normalized and partitioned into non-overlapping patches. Further performance improvements were achieved by integrating state-of-the-art techniques that were not widely available or commonly used when the original algorithm was developed: the Tree-structured Parzen Estimator was utilized for hyperparameter tuning, and the Adam optimizer was employed for model training. The distorted images from TID2013 and KADID-10k were divided into training, validation, and test sets in proportions of 60%, 20%, and 20%, respectively. Before being fed into the model, the input data were normalized. Normalization is performed locally, meaning that the mean and standard deviation are computed over a small neighborhood of each pixel (a 3 × 3 window). Formally, the normalization of the input image I can be described as follows:
$$\hat{I}(x, y) = \frac{I(x, y) - \mu(x, y)}{\sigma(x, y) + C},$$
where
$$\mu(x, y) = \sum_{i=-k}^{k} \sum_{j=-k}^{k} \omega(i, j)\, I(x + i, y + j)$$
is the local mean, $(i, j)$ is the position of a pixel within the kernel $\omega$, a $3 \times 3$ region,
$$\sigma(x, y) = \sqrt{\sum_{i=-k}^{k} \sum_{j=-k}^{k} \omega(i, j)\, \big(I(x + i, y + j) - \mu(x, y)\big)^{2}}$$
is the local standard deviation, and $C$ is a non-zero constant that prevents division by zero.
The normalized images are subsequently divided into non-overlapping 32 × 32 patches. Each patch is assigned the ground-truth score of the corresponding original image. This approach is justified by the homogeneity of the distortions considered, as the distortion level is consistent across all regions of the image. Furthermore, this procedure substantially enlarges the training dataset and thus benefits the training process.
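The preprocessing pipeline can be sketched with the OpenCV and NumPy packages listed in Section 3.3. This is a minimal sketch, not the exact implementation: the uniform 3 × 3 averaging kernel, the grayscale input, and the value C = 1 are illustrative assumptions.

```python
import cv2
import numpy as np

def local_normalize(img, ksize=3, C=1.0):
    """Subtract the local mean and divide by the local standard deviation,
    both computed in a ksize x ksize window around every pixel."""
    img = img.astype(np.float32)
    mu = cv2.blur(img, (ksize, ksize))                 # local mean (uniform kernel)
    mu_sq = cv2.blur(img * img, (ksize, ksize))        # local mean of squares
    sigma = np.sqrt(np.maximum(mu_sq - mu * mu, 0.0))  # local standard deviation
    return (img - mu) / (sigma + C)

def extract_patches(img, patch=32):
    """Split a normalized image into non-overlapping patch x patch blocks."""
    h, w = img.shape[:2]
    blocks = [img[y:y + patch, x:x + patch]
              for y in range(0, h - patch + 1, patch)
              for x in range(0, w - patch + 1, patch)]
    return np.stack(blocks)

# Usage: every patch inherits the MOS/DMOS of its source image.
# img = cv2.imread("distorted.png", cv2.IMREAD_GRAYSCALE)
# patches = extract_patches(local_normalize(img))
```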

3.2. Proposed Architecture

Figure 1 shows the architecture of the proposed neural network. The model takes as input a normalized 32 × 32 image patch. The input is processed by four convolutional blocks, each containing a convolutional layer, a batch normalization layer, and a max pooling layer. The details are outlined in Table 1. The proposed network has approximately 0.9 M parameters, which qualifies it as a lightweight neural model designed for BIQA. The computational cost is approximately 0.031 GFLOPs, which indicates that the model is computationally efficient.
Figure 1. Architecture of the proposed neural network.
Table 1. Architecture details of the CNN model.
ReLU was selected as the activation function for the convolutional layers and the dense (fully connected) layer. The formula for ReLU is defined as follows:
$$f(x) = \max(0, x),$$
where $f: \mathbb{R} \rightarrow \mathbb{R}$ is the ReLU activation function and $x$ is the linear output of the previous layer. Mean Absolute Error (MAE) was used as the loss function to optimize model parameters. MAE is defined by the following formula:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right|,$$
where $\hat{y}_i$ denotes the $i$-th ground-truth score, $y_i$ is the $i$-th predicted score, and $N$ is the number of training examples.
Following the last convolutional block, the output shape is 2 × 2 × 256. As the fully connected layer accepts a one-dimensional vector as input, the data needs to be reshaped. This reshape operation is known as flattening and converts a three-dimensional tensor into a one-dimensional vector by concatenating all values into a single array. The flatten layer performs this operation, resulting in a vector of size 1024, which can then be fed into the dense layer. To increase generalizability and mitigate the risk of overfitting, a dropout layer is placed immediately before the final output layer. Dropout is a regularization technique that temporarily deactivates a random subset of neurons during training. During a forward pass, each neuron can be dropped with a probability determined by the dropout rate, a hyperparameter. In practice, deactivation involves setting the neuron's activation to zero.
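The complete architecture can be assembled in a few lines of TensorFlow/Keras. The sketch below is a reconstruction under stated assumptions rather than the exact configuration of Table 1: the filter counts of the first three blocks, the 3 × 3 kernel size, and the single-channel input are assumptions (only the final 256 filters follow from the 2 × 2 × 256 output shape), and the default hyperparameter values are placeholders for the tuned values discussed next.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(n_neurons=512, dropout_rate=0.25, learning_rate=1e-4):
    # Filter counts of the first three blocks are assumed; only the final
    # 256 filters are implied by the 2x2x256 output reported in the text.
    filters = [32, 64, 128, 256]
    model = models.Sequential([layers.InputLayer(input_shape=(32, 32, 1))])
    for f in filters:
        model.add(layers.Conv2D(f, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D(2))        # spatial size: 32 -> 16 -> 8 -> 4 -> 2
    model.add(layers.Flatten())                  # 2 * 2 * 256 = 1024 values
    model.add(layers.Dense(n_neurons, activation="relu"))
    model.add(layers.Dropout(dropout_rate))      # regularization before the output
    model.add(layers.Dense(1))                   # estimated quality score
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mae")
    return model

# model = build_model()
# model.summary()   # roughly 0.9 M parameters with these assumed filter counts
```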
The Adam optimizer was selected to update the model weights. Adam is widely used in training neural networks. It combines the best properties of the classical optimizers such as Momentum and RMSProp [15]. The output of the model is an estimated quality score, whose range and values vary depending on the characteristics of the input dataset. Given the number of trainable components—including four convolutional layers and a dense layer—manual selection of optimal hyperparameters becomes increasingly complex and impractical. A widely adopted state-of-the-art solution to this issue is the use of automated hyperparameter optimization.
Consequently, the Optuna framework (version 4.0.0) was used to perform this task. Specifically, the number of neurons in the dense layer, the learning rate, and the dropout rate were optimized. The hyperparameter search was guided by the Tree-structured Parzen Estimator (TPE) algorithm [16]. TPE models the distributions of promising and less promising hyperparameter values with Parzen (kernel density) estimators defined over a tree-structured search space, and uses these models to propose hyperparameter combinations with a high probability of yielding better results. The hyperparameter search space used during tuning is presented in the table below.
Table 2 presents the hyperparameters included in the automated Bayesian tuning process. The first entry, n_neurons, corresponds to the number of neurons in the dense layer, followed by the learning rate and the dropout rate, respectively. The column “Value bounds” presents the range of values considered for each parameter during tuning. Additional details of the CNN model proposed for BIQA can be found in [17].
Table 2. Hyperparameter value bounds.
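A condensed sketch of the tuning loop is shown below. The value bounds, the number of trials, and the training settings inside the objective are illustrative assumptions (the actual bounds are those of Table 2); build_model refers to the architecture sketch above, and x_train/y_train and x_val/y_val denote the preprocessed patch tensors with their scores.

```python
import optuna

def objective(trial):
    # Hypothetical bounds; the real search space is given in Table 2.
    n_neurons = trial.suggest_int("n_neurons", 128, 1024)
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    dropout_rate = trial.suggest_float("dropout_rate", 0.1, 0.5)

    model = build_model(n_neurons, dropout_rate, learning_rate)
    history = model.fit(x_train, y_train,
                        validation_data=(x_val, y_val),
                        epochs=20, batch_size=128, verbose=0)
    return min(history.history["val_loss"])     # validation MAE to be minimized

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=50)
print(study.best_params)
```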

3.3. Development Tools and System Specifications

Our DNN model was developed in Python (version 3.10.14) using the following libraries (version numbers are provided in brackets):
  • TensorFlow (v.2.10.1) for building and training models;
  • NumPy (v.1.26.4) for numerical computations;
  • Pandas (v.2.2.2) for data manipulation;
  • Optuna (v.4.0.0) for hyperparameter tuning;
  • Scipy (v.1.14.1) for scientific computing tasks;
  • Matplotlib (v.3.9.2) for visualizations;
  • OpenCV (v.4.10.0.84) for image processing.
We selected TensorFlow because of its comprehensive support for complex deep learning tasks. Another viable alternative is PyTorch (v.2.5.0), which is known for its dynamic computation graph and ease of use.
TensorFlow automatically handles GPU computations. However, a compatible version of the CUDA framework is required. The supported CUDA version, along with its associated components, is listed below.
  • cuda-nvcc (v.12.4.131), a CUDA compiler;
  • cudatoolkit (v.11.2.2), the CUDA framework;
  • cudnn (v.8.1.0.77), a CUDA library providing highly optimized implementations of common deep learning procedures such as convolution, pooling, and normalization.
Each of the packages mentioned above was installed using Miniconda (v.24.7.1), a lightweight distribution of the Conda package manager.
Deep neural networks (DNNs) require powerful computing units. The computations were therefore performed on a GPU using CUDA technology, which enables parallel processing, on a machine equipped with the following components: two Intel Xeon E5-2680 v2 processors (Intel, Santa Clara, CA, USA) operating at a frequency of 2.80 GHz, 48 GB of RAM, and an NVIDIA GeForce RTX 4070 graphics card (NVIDIA, Santa Clara, CA, USA) with 12 GB of memory.
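Before launching lengthy training runs on such a machine, it is worth verifying that TensorFlow actually sees the CUDA-capable GPU. A minimal check (not part of the reported training code):

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)

if gpus:
    # Optional: allocate GPU memory on demand instead of reserving it all upfront.
    tf.config.experimental.set_memory_growth(gpus[0], True)
```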

4. Results

The evaluation of the proposed model requires testing on image databases. For this purpose, two publicly available and relatively recent databases were selected, each containing images with various distortions and their corresponding MOS values. The databases used are TID2013 [18] and KADID-10k [19]. Table 3 summarizes the key characteristics of these databases.
Table 3. Summary of image quality assessment databases.
The TID2013 database consists of 25 reference images (Figure 2) that were subjected to 24 types of distortion at five different levels of severity. Notable examples among the 24 distortions include additive Gaussian noise, masked noise, JPEG compression, contrast variations, color quantization with dithering, and chromatic aberrations. Consequently, a total of 3000 distorted images, each with a subjective quality score (MOS), were generated. These quality scores were collected from human observers in a controlled laboratory environment. The KADID-10k database was created from 81 reference images (Figure 3), each of which was subjected to 25 types of distortion, with characteristics similar to those in TID2013, at five levels of severity. As a result, the database contains 10,125 distorted images. The database uses a differential mean opinion score (DMOS), and the DMOS values for these images were obtained via crowdsourcing.
Figure 2. TID2013 (Tampere Image Database): reference images [18].
Figure 3. KADID-10k (Konstanz Artificially Distorted Image quality Database): reference images [19].
The databases use different scoring methods: DMOS ranges from 0 to 100, where 100 indicates the worst quality. In contrast, MOS ranges from 0 to 6, where 0 indicates the worst quality. To address these differences and ensure consistency, a logistic regression mapping was applied to align the quality scores across the datasets. This step is essential when training a model on one dataset and evaluating it on another. An example of this mapping is presented in Figure 4.
Figure 4. Scatter plots of subjective MOS versus IQA metrics obtained from the TID2013 database.
Figure 4 shows a scatter plot with MOS on the x-axis and DMOS on the y-axis. The orange points represent the actual DMOS values from the training data used to fit the logistic regression model, while the blue points indicate the DMOS values predicted by the model. Although there is some divergence for low-quality images, the logistic regression adequately captures the nonlinear mapping between MOS and DMOS, enabling accurate transformation from one scale to the other.
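Such a mapping can be fitted with SciPy. The sketch below is an assumption-laden illustration: it presumes paired (MOS, DMOS) training points as in Figure 4 and uses a generic four-parameter logistic, which may differ from the exact parameterization used in our experiments.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, a, b, c, d):
    # Monotonic mapping from the MOS scale (0-6) to the DMOS scale (0-100).
    return a + (b - a) / (1.0 + np.exp(-(x - c) / d))

# mos, dmos: paired training scores (NumPy arrays) used to fit the mapping.
# params, _ = curve_fit(logistic, mos, dmos, p0=[100.0, 0.0, 3.0, 1.0], maxfev=10000)
# dmos_pred = logistic(mos, *params)
```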
In the field of BIQA, image quality metrics are typically compared against subjective perceptual ratings. For this purpose, Pearson’s linear correlation coefficient (PLCC), Spearman’s rank correlation coefficient (SROCC), and Kendall’s rank correlation coefficient (KROCC) are used. According to the recommendations outlined by Sheikh et al. [20], the calculation of PLCC should be preceded by a nonlinear regression based on a five-parameter logistic function:
$$p(x, \beta) = \beta_1 \left( \frac{1}{2} - \frac{1}{1 + \exp\!\big(\beta_2 (x - \beta_3)\big)} \right) + \beta_4 x + \beta_5,$$
where $\beta_i$, $i = 1, 2, \ldots, 5$, are the fitted parameters and $x$ is the raw quality index.
The formulas for calculating the PLCC and SROCC correlation coefficients are as follows:
$$\mathrm{PLCC} = \frac{\sum_{i=1}^{N} (p_i - \bar{p})(s_i - \bar{s})}{\sqrt{\sum_{i=1}^{N} (p_i - \bar{p})^2 \sum_{i=1}^{N} (s_i - \bar{s})^2}},$$
where $p_i$ and $s_i$ represent the raw values of the subjective and objective measures, respectively, while $\bar{p}$ and $\bar{s}$ denote their mean values, and
$$\mathrm{SROCC} = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N (N^2 - 1)},$$
where $d_i$ represents the difference between the ranks of the two measures for the $i$-th observation, and $N$ is the total number of observations. Higher correlation values indicate better BIQA performance. These two criteria evaluate prediction accuracy (PLCC) and prediction monotonicity (SROCC).
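This evaluation protocol can be reproduced with SciPy. A minimal sketch, in which the initial parameter guess for the logistic fit is an assumption:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr, kendalltau

def logistic5(x, b1, b2, b3, b4, b5):
    # Five-parameter logistic of Sheikh et al. [20], applied before PLCC.
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def evaluate(pred, mos):
    """pred: raw objective quality scores; mos: subjective ratings (1-D arrays)."""
    p0 = [np.max(mos), 1.0, np.mean(pred), 0.0, np.mean(mos)]   # assumed initial guess
    beta, _ = curve_fit(logistic5, pred, mos, p0=p0, maxfev=100000)
    plcc = pearsonr(logistic5(pred, *beta), mos)[0]  # prediction accuracy
    srocc = spearmanr(pred, mos)[0]                  # prediction monotonicity
    krocc = kendalltau(pred, mos)[0]
    return plcc, srocc, krocc
```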
In the experiments described below, which compare different neural networks used in BIQA, we only report the values of the PLCC and SROCC indices, as they are the most commonly used metrics in BIQA research. Our comparison includes a set of well-known DNN-based solutions as well as newer methods that have recently gained significant attention in the research community due to their strong performance. These approaches represent a variety of architectural designs, including models based on transfer learning, dual-branch structures, multi-output strategies, and transformer-based mechanisms. The results for the compared networks were taken from [21].
Table 4 presents the results of our network trained and tested on the TID2013 database. The proposed model demonstrates strong performance, achieving a PLCC of 0.871 and an SROCC of 0.846, outperforming the DB-CNN and MEON models and closely approaching the state-of-the-art TReS network. Importantly, unlike many existing methods, our network does not rely on transfer learning from large pre-trained models or datasets, such as models trained on ImageNet (used by DB-CNN) or ResNet-50 (used by TReS). A comparison of the proposed solution with another lightweight model, NIMA [22], which is based on the MobileNet architecture, favors our model in terms of both correlation metrics and the number of parameters.
Table 4. Performance comparison of IQA methods trained and tested on the TID2013 database.
Furthermore, the architectures of these competing methods incorporate additional complex design elements, such as dual-output heads (MEON), dual-branch structures (DB-CNN), and transformers (TReS, DEIQT). In contrast, our model employs a significantly simpler architecture with fewer parameters and can be trained from scratch using standard, well-established CNN techniques, making it both efficient and accessible.
For further comparison, two additional models published in 2024 and 2025 (ExIQA [27] and CoDI-IQA [28]) were also included. The ExIQA model addresses the BIQA task from the perspective of distortion identification, aiming to determine both the types and the strengths of distortions in an image by leveraging a Vision–Language Model (VLM). The identified distortions are subsequently provided as input to a regressor in order to predict the image quality score. ExIQA was trained on a large-scale dataset containing over one hundred thousand multi-distorted images based on the KADID-10k database.
The CoDI-IQA model (content-distortion interaction for image quality assessment), similar to ExIQA, estimates image quality indirectly by analyzing distortion types and strengths. It leverages deep networks such as ResNet-50 (CNN) and Swin Transformer (ViT) as encoders to disentangle content-dependent and distortion-related features. Cross-dataset experiments demonstrate its strong generalization capability. Although both new models achieved very high PLCC and SROCC values across various test image databases, their training requires adjusting tens of millions of parameters.
Table 5 presents the results of the proposed method trained and evaluated on the KADID-10k database. The proposed model achieves a PLCC of 0.789 and an SROCC of 0.798, representing a notable drop in performance relative to the TID2013 results. Although the model outperforms MEON, it falls short of the other competing methods on this dataset. Our model also performs worse than another lightweight model, GreenBIQA [29], but it uses only about half as many parameters.
Table 5. Performance comparison of IQA methods tested on KADID-10k.
Figure 5 and Figure 6 show the achieved PLCC and SROCC coefficients for all compared models as a function of the number of their trainable parameters. The proposed architecture stands out by having the lowest number of trainable parameters while maintaining satisfactory correlation performance.
Figure 5. Correlation coefficients vs. number of parameters (TID2013).
Figure 6. Correlation coefficients vs. number of parameters (KADID-10k).
The performance deterioration on KADID-10k is notable compared to TID2013. This discrepancy can be attributed to the observation that hyperparameter tuning was conducted primarily on the TID2013 database, while extensive tuning on KADID-10k was not feasible due to computational limitations. Although such optimization could potentially improve the results, the training process requires several hours per configuration, making comprehensive hyperparameter exploration impractical given the available computational resources.
While the results obtained for our CNN model were not directly compared with those of Kang’s CNN, it is reasonable to assume that our model would outperform the Kang approach. Kang’s models were evaluated exclusively on the TID2008 database, which contains images subjected to a narrower and less diverse range of distortions compared to the TID2013 database. Consequently, our approach can be considered not only effective, but also efficient and accessible for practical applications.
Table 6 presents the durations of training and evaluation. The reported number of epochs refers exclusively to those actually used during training; epochs omitted due to early stopping are not included. The evaluation time corresponds to the total duration of image quality prediction on the left-out test set. As shown, training on KADID-10k required substantially more time and a greater number of epochs compared to TID2013. This outcome is expected, as KADID-10k contains a larger number of samples and therefore requires more iterations for the model to converge.
Table 6. Training and evaluation times for each database.
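The early stopping mentioned above corresponds to the standard Keras callback. A minimal sketch, in which the monitored metric, patience value, and training settings are assumptions rather than the exact configuration used here:

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop when the validation loss no longer improves and restore the best weights.
early_stop = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=200, batch_size=128, callbacks=[early_stop])
```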
The low average prediction times achieved by the proposed BIQA model indicate not only a lightweight network architecture and efficient implementation but also demonstrate its potential for real-time operation and energy-efficient deployment—an aspect of particular importance for mobile devices and embedded systems. These prediction times should be considered in conjunction with the accuracy of the model, i.e., the reliability of quality assessments.

5. Conclusions

DNN-based solutions have become the standard in BIQA, with recent models increasingly relying on complex architectures, transfer learning, and attention mechanisms to enhance performance. Although these advances have yielded notable improvements, they often introduce drawbacks such as increased model size, higher training complexity, and a strong dependence on pre-trained networks. Our proposed method, inspired by the first CNN-based approaches in BIQA, revisits the principle of architectural simplicity without compromising effectiveness. We integrate modern machine learning techniques, including Bayesian hyperparameter optimization with Optuna, to strengthen performance while maintaining this classical foundation. Consequently, our model achieves performance comparable to current state-of-the-art solutions while remaining relatively lightweight, interpretable, and easy to train from scratch. This balance between simplicity and effectiveness makes it particularly suitable for future scenarios in which scalability, optimization, or deployment efficiency are essential.

Author Contributions

Conceptualization, M.F., H.P. and W.T.; methodology, W.T.; software, W.T.; validation, M.F. and H.P.; formal analysis, M.F.; investigation, W.T.; resources, M.F.; data curation, W.T.; writing—original draft preparation, W.T.; writing—review and editing, M.F. and H.P.; visualization, M.F.; supervision, H.P.; project administration, M.F. and H.P. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by the Silesian University of Technology, Gliwice, Poland.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

This work was supported by the Polish Ministry of Science and Higher Education under internal grant 02/070/BK_25/0066 for the Department of Data Science and Engineering at the Silesian University of Technology, Gliwice, Poland.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BIQA: Blind Image Quality Assessment
CNN: Convolutional Neural Network
CUDA: Compute Unified Device Architecture
DL: Deep Learning
DMOS: Differential Mean Opinion Score
DNN: Deep Neural Network
GAN: Generative Adversarial Network
GPU: Graphics Processing Unit
HVS: Human Visual System
IQA: Image Quality Assessment
KADID: Konstanz Artificially Distorted Image quality Database
KROCC: Kendall Rank Order Correlation Coefficient
MAE: Mean Absolute Error
MOS: Mean Opinion Score
NSS: Natural Scene Statistics
PLCC: Pearson Linear Correlation Coefficient
ReLU: Rectified Linear Unit
SROCC: Spearman Rank Order Correlation Coefficient
TID: Tampere Image Database
TPE: Tree-structured Parzen Estimator

References

  1. Ding, Y. Visual Quality Assessment for Natural and Medical Image; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
  2. Varga, D. An optimization-based family of predictive, fusion-based models for full-reference image quality assessment. J. Imaging 2023, 9, 116. [Google Scholar] [CrossRef] [PubMed]
  3. Frackiewicz, M.; Machalica, Ł.; Palus, H. New combined metric for full-reference image quality assessment. Symmetry 2024, 16, 1622. [Google Scholar] [CrossRef]
  4. Yang, P.; Sturtz, J.; Qingge, L. Progress in blind image quality assessment: A brief review. Mathematics 2023, 11, 2766. [Google Scholar] [CrossRef]
  5. Mao, Q.; Liu, S.; Li, Q.; Jeon, G.; Kim, H.; Camacho, D. No-reference image quality assessment: Past, present, and future. Expert Syst. 2025, 42, e13842. [Google Scholar] [CrossRef]
  6. Yang, X.; Li, F.; Liu, H. A survey of DNN methods for blind image quality assessment. IEEE Access 2019, 7, 123788–123806. [Google Scholar] [CrossRef]
  7. Kang, L.; Ye, P.; Li, Y.; Doermann, D. Convolutional neural networks for no-reference image quality assessment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 1733–1740. [Google Scholar]
  8. Kang, L.; Ye, P.; Li, Y.; Doermann, D. Simultaneous estimation of image quality and distortion via multi-task convolutional neural networks. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 2791–2795. [Google Scholar]
  9. Ma, C.; Shi, Z.; Lu, Z.; Xie, S.; Chao, F.; Sui, Y. A survey on image quality assessment: Insights, analysis, and future outlook. arXiv 2025, arXiv:2502.08540. [Google Scholar] [CrossRef]
  10. Jia, S.; Zhang, Y. Saliency-based deep convolutional neural network for no-reference image quality assessment. Multimed. Tools Appl. 2018, 77, 14859–14872. [Google Scholar] [CrossRef]
  11. You, J.; Korhonen, J. Transformer for image quality assessment. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1389–1393. [Google Scholar]
  12. Zhang, P.; Shao, X.; Li, Z. Cycleiqa: Blind image quality assessment via cycle-consistent adversarial networks. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6. [Google Scholar]
  13. Liu, H.I.; Galindo, M.; Xie, H.; Wong, L.K.; Shuai, H.H.; Li, Y.H.; Cheng, W.H. Lightweight deep learning for resource-constrained environments: A survey. ACM Comput. Surv. 2024, 56, 267. [Google Scholar] [CrossRef]
  14. Mei, Z.; Wang, Y.C.; Kuo, C.C.J. Lightweight high-performance blind image quality assessment. APSIPA Trans. Signal Inf. Process. 2024, 13, e7. [Google Scholar] [CrossRef]
  15. Zhu, Z.; Sun, H.; Zhang, C. Effectiveness of optimization algorithms in deep image classification. arXiv 2021, arXiv:2110.01598. [Google Scholar] [CrossRef]
  16. Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for hyper-parameter optimization. In Proceedings of the NIPS’11: 25th International Conference on Neural Information Processing Systems, Granada, Spain, 12–15 December 2011. [Google Scholar]
  17. Trojanowski, W. Blind Image Quality Assessment Using Deep Neural Networks. Master’s Thesis, Silesian University of Technology, Gliwice, Poland, 2024. [Google Scholar]
  18. Ponomarenko, N.; Jin, L.; Ieremeiev, O.; Lukin, V.; Egiazarian, K.; Astola, J.; Vozel, B.; Chehdi, K.; Carli, M.; Battisti, F.; et al. Image database TID2013: Peculiarities, results and perspectives. Signal Process. Image Commun. 2015, 30, 57–77. [Google Scholar] [CrossRef]
  19. Lin, H.; Hosu, V.; Saupe, D. KADID-10k: A large-scale artificially distorted IQA database. In Proceedings of the 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), Berlin, Germany, 5–7 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–3. [Google Scholar]
  20. Sheikh, H.R.; Sabir, M.F.; Bovik, A.C. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process. 2006, 15, 3440–3451. [Google Scholar] [CrossRef] [PubMed]
  21. Ma, J.; Chen, Y.; Chen, L.; Tang, Z. Dual-attention pyramid transformer network for no-reference image quality assessment. Expert Syst. Appl. 2024, 257, 125008. [Google Scholar] [CrossRef]
  22. Talebi, H.; Milanfar, P. NIMA: Neural image assessment. IEEE Trans. Image Process. 2018, 27, 3998–4011. [Google Scholar] [CrossRef] [PubMed]
  23. Zhang, W.; Ma, K.; Jia, Y.; Deng, D.; Zhou, W. Blind image quality assessment using a deep bilinear convolutional neural network. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 36–47. [Google Scholar] [CrossRef]
  24. Ma, K.; Liu, W.; Zhang, K.; Duanmu, Z.; Wang, Z.; Zuo, W. End-to-end blind image quality assessment using deep neural networks. IEEE Trans. Image Process. 2017, 27, 1202–1213. [Google Scholar] [CrossRef] [PubMed]
  25. Golestaneh, S.A.; Dadsetan, S.; Kitani, K.M. No-reference image quality assessment via transformers, relative ranking, and self-consistency. In Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 1220–1230. [Google Scholar]
  26. Qin, G.; Hu, R.; Liu, Y.; Zheng, X.; Liu, H.; Li, X.; Zhang, Y. Data-efficient image quality assessment with attention-panel decoder. In Proceedings of the 2023 AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 2091–2100. [Google Scholar]
  27. Ranjbar, S.K.; Fatemizadeh, E. Exiqa: Explainable image quality assessment using distortion attributes. arXiv 2024, arXiv:2409.06853. [Google Scholar] [CrossRef]
  28. Liu, S.; Mao, Q.; Li, C.; Chen, J.; Meng, F.; Tian, Y.; Liang, Y. Content-distortion high-order interaction for Blind Image Quality Assessment. arXiv 2025, arXiv:2504.05076. [Google Scholar]
  29. Mei, Z.; Wang, Y.C.; He, X.; Kuo, C.C.J. GreenBIQA: A lightweight blind image quality assessment method. In Proceedings of the 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), Shanghai, China, 26–28 September 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
