Digital Image Quality Prediction System

Abstract: “A picture is worth a thousand words.” As this well-known adage suggests, images are important in our society, and increasingly so. The Internet is currently the main channel for socialization and marketing, where we seek to communicate as efficiently as possible. People receive a large amount of information every day, which creates the need to attract attention with quality content and good presentation. Social networks, for example, are becoming more visual every day. On Facebook alone, the success of a publication increases by up to 180% if it is accompanied by an image. It is therefore not surprising that platforms such as Pinterest and Instagram have grown so much and have positioned themselves thanks to their power to communicate with images. In a world where more and more relationships and transactions take place through computer applications, many decisions are based on the quality, aesthetic value or impact of digital images. In the present work, a quality prediction system for digital images was developed, trained on the quality perception of a group of humans.


Introduction
In recent years, significant effort has gone into developing models and algorithms that can automatically and accurately predict the perceptual quality of two-dimensional (2D) and three-dimensional (3D) digital images and videos. This kind of estimation draws on studies with at least a century of history, or more if we take into account the work of Plato and Aristotle, usually carried out in Humanities departments: Psychology, Sociology, Philosophy, Fine Arts, etc. [1]. Different research groups have sought to create computer systems capable of learning the aesthetic and quality perception of a group of humans as part of a generative system, for uses such as the selection and arrangement of images within a set, even though it is complex to translate this into computational problems. Visual quality refers to the quantification of the perceptual degradation of a visual stimulus due to the presence or absence of distortions. Most of the applications developed so far were designed to handle synthetically distorted images [2]. In this work, unlike other image quality assessment algorithms that use synthetically distorted images [3,4], it was decided to use images free of distortion [5,6]. Although the data collected contained both quality and aesthetic ratings, only the quality data were used on this occasion, as they constituted more objective results [7].

Materials and Methods
After analyzing the degree of generalization of some datasets used in automatic image prediction, it was concluded that they generalize too poorly to serve as a reference for training automatic image prediction and classification systems. Taking this into account, a new set of images from the web portal DPChallenge.com was developed in search of greater statistical consistency [7].
The proposed dataset was built following the steps outlined in previous works [8]: obtaining the images on the web portal, filtering those images, organizing them according to their evaluation on the portal and selecting sets with an equal number of images. Subsequently, the quality of the images was evaluated by a group of humans through the Amazon Mechanical Turk platform. This group of humans was made up of 525 inhabitants of the USA (39% men and 61% women), aged between 18 and 70. A representation of the images from this dataset is shown in Figure 1. With this data, a system was created to predict the quality of digital images with the search engine Correlation by Genetic Search (CGS) [9,10].
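The "organize by evaluation, then select sets with an equal number of images" steps can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `balanced_sample`, the bin width of one score point, the image identifiers and the toy scores are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def balanced_sample(scored_images, bin_width=1.0, seed=0):
    """Group (image_id, score) pairs into score bins and draw the same
    number of images from every non-empty bin (hypothetical helper)."""
    bins = defaultdict(list)
    for image_id, score in scored_images:
        # Images whose scores fall in the same interval share a bin.
        bins[int(score // bin_width)].append(image_id)
    # The smallest bin limits how many images each bin can contribute.
    per_bin = min(len(ids) for ids in bins.values())
    rng = random.Random(seed)
    return {b: sorted(rng.sample(ids, per_bin)) for b, ids in sorted(bins.items())}

# Toy portal scores on a 1-10 scale; identifiers and values are made up.
portal_scores = [("img_a", 2.7), ("img_b", 2.1), ("img_c", 5.7),
                 ("img_d", 5.2), ("img_e", 5.9), ("img_f", 9.28)]
selection = balanced_sample(portal_scores)
```

Each score range then contributes the same number of images, so no range dominates the set that is later shown to human evaluators. The sketch assumes a non-empty input list.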

Results
The results obtained during the experimental phase correspond to 50 runs of 5-fold cross-validation, in which 80% of the set is dedicated to training and the remaining 20% to testing. As input data, 1024 features of VGG19 were used. The average number of features used across the 50 runs is 114, with an average Pearson correlation of 0.77 and an average error of 0.15. Figure 2 shows the distribution of features, Pearson correlation and error over the 50 runs. The absence of a large number of outliers stands out, which lends consistency and validity to the data obtained. In all three cases, the data points identified as outliers belong to the same run. The error shows the greatest variability, with a maximum of 0.16 and a minimum of 0.09. The number of features and the Pearson correlation show a much more uniform and concentrated distribution, with very small variability, which suggests that the model produces coherent results across the 50 runs.
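The evaluation protocol described above (repeated 5-fold cross-validation, reporting Pearson correlation and error per run) can be sketched as follows. This is a hedged illustration rather than the CGS system: a one-variable least-squares line stands in for the regression model, the error is assumed to be the mean absolute error (the text does not say which error measure was used), and the data are synthetic.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in model, not CGS)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def repeated_kfold(xs, ys, k=5, runs=50, seed=0):
    """Return one (mean correlation, mean absolute error) pair per run."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        idx = list(range(len(xs)))
        rng.shuffle(idx)                       # new random partition each run
        folds = [idx[i::k] for i in range(k)]  # k disjoint test folds
        corrs, errs = [], []
        for test in folds:
            held_out = set(test)
            train = [i for i in idx if i not in held_out]  # remaining 80%
            a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
            preds = [a * xs[i] + b for i in test]
            truth = [ys[i] for i in test]
            corrs.append(pearson(preds, truth))
            errs.append(sum(abs(p - t) for p, t in zip(preds, truth)) / len(test))
        results.append((sum(corrs) / k, sum(errs) / k))
    return results

# Synthetic data: one feature linearly related to a noisy score.
data_rng = random.Random(1)
xs = [data_rng.random() for _ in range(100)]
ys = [0.6 * x + 0.2 + data_rng.gauss(0, 0.05) for x in xs]
runs = repeated_kfold(xs, ys)
mean_corr = sum(c for c, _ in runs) / len(runs)
mean_err = sum(e for _, e in runs) / len(runs)
```

Keeping one pair of values per run, rather than a single global average, is what allows the per-run distributions to be plotted and inspected for outliers.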

Conclusions
This paper focuses on the creation of a digital image quality prediction system from a set of human-evaluated images. The task was tested with the CGS method, a hybrid method for creating multiple regression models based on the maximization of the correlation. An average Pearson correlation of 0.77 over 50 runs with 5-fold cross-validation was achieved, with a consistent distribution and low variability, which improves on other state-of-the-art works such as Nadal et al. [11] or Marin and Leder [12].

Figure 1. Images of different scoring ranges belonging to the dataset used in this work, evaluated by humans according to their quality. (a) Image with an average score of 2.7 out of a maximum of 10. (b) Image with an average score of 5.7 out of a maximum of 10. (c) Image with an average score of 9.28 out of a maximum of 10.

Figure 2. Results obtained in the experiments carried out with CGS. (a) Features used in each of the 50 runs. (b) Pearson correlation of the validation set in each run. (c) Average error obtained in each run.