Tabular Data Models for Predicting Art Auction Results

Mauer, Patryk; Paszkiel, Szczepan

doi:10.3390/app142311006

by Patryk Mauer and Szczepan Paszkiel *

Faculty of Electrical Engineering, Automatic Control and Informatics, Opole University of Technology, 45-758 Opole, Poland

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(23), 11006; https://doi.org/10.3390/app142311006

Submission received: 14 September 2024 / Revised: 16 November 2024 / Accepted: 25 November 2024 / Published: 26 November 2024

(This article belongs to the Special Issue Randomized Neural Networks and Deep Learning: Research Frontiers and Cutting-Edge Applications)

Featured Application

This work provides a comprehensive analysis of various prediction architectures tailored for tabular data, specifically applied to predicting art auction results. By leveraging machine learning models such as LinearModel, K-Nearest Neighbors, DecisionTree, RandomForest, XGBoost, CatBoost, LightGBM, MLP, VIME, ModelTree, DeepGBM, DeepFM and SAINT, the study offers insights into the relative strengths and weaknesses of each approach when forecasting auction outcomes based on historical data, artwork characteristics, and market trends. These findings can help auction houses, art investors, and market analysts understand which methods may offer practical, although currently limited, predictive value, and highlight areas for further improvement in prediction accuracy and strategy optimization.

Abstract

Predicting art auction results presents a unique challenge due to the complexity and variability of factors influencing artwork prices. This study explores a range of machine learning architectures designed to forecast auction outcomes using tabular data, including historical auction records, artwork characteristics, artist profiles, and market indicators. We evaluate models ranging from traditional approaches, such as LinearModel, K-Nearest Neighbors, and DecisionTree, through tree ensembles (RandomForest, XGBoost, CatBoost, LightGBM) to neural and hybrid architectures (MLP, VIME, ModelTree, DeepGBM, DeepFM, and SAINT). By comparing the performance of these models on a dataset comprising extensive auction results, we provide insights into their relative effectiveness across different scenarios. Additionally, we address the interpretability of models, which is crucial for understanding the influence of various features on predictions. The results suggest that while some models perform better than others, no single approach offers consistently high accuracy across all cases. This study provides guidance for auction houses, art investors, and market analysts in refining predictive approaches, identifying key challenges, and understanding where further improvements are needed for more accurate data-driven decisions in the art market.

Keywords:

tabular data predictions; machine learning architectures; neural networks; decision trees

1. Introduction

The prediction of art auction results is a complex and multifaceted challenge that has gained significant attention in recent years due to the growing importance of data-driven decision-making in the art market. As the art industry becomes increasingly globalized and competitive, auction houses, investors, and collectors seek more reliable methods to forecast the outcomes of auctions, which often involve substantial financial stakes. Accurate predictions of auction results can inform bidding strategies, guide investment decisions, and enhance market transparency [1], making this area of study highly relevant.

The art market is characterized by its unique and volatile nature, where prices are influenced by a combination of tangible factors, such as the medium, size, and condition of the artwork, as well as intangible elements, including the reputation of the artist, provenance, and current market trends [2]. These complexities have driven researchers to explore advanced predictive models that can effectively handle the intricacies of tabular data commonly found in auction datasets [3].

Traditional methods, such as Linear Models, K-Nearest Neighbors (KNN), and Support Vector Machines (SVM), have long been used to model auction outcomes, but they often fall short in capturing non-linear relationships and interactions among variables [4]. Recent advancements in machine learning have introduced more sophisticated approaches, such as Decision Trees, Random Forests, Gradient Boosting Machines (including XGBoost [5] and CatBoost [6]), and Neural Networks (such as Multi-Layer Perceptron (MLP) and DeepGBM [7]), which offer improved predictive accuracy and the ability to handle large, complex datasets. Additionally, modern approaches like VIME [8], RLN [9], ModelTree [10], DeepFM [11], and SAINT [12] have been explored for their potential in tackling the nuanced challenges of auction data prediction. However, the art market’s inherent unpredictability and the need for model interpretability present ongoing challenges.

Several publications have explored the application of machine learning in the art market, highlighting both the potential and limitations of current methodologies [13,14,15]. Some studies have focused on the predictive power of specific features [16], while others have investigated the use of ensemble models to improve accuracy [17]. Despite these advances, there is still debate over the most effective modelling strategies.

This study aims to provide a comprehensive evaluation of various machine learning architectures for predicting art auction results using tabular data. By systematically comparing traditional models with more advanced techniques, we seek to identify the strengths and weaknesses of each approach. Our analysis includes an exploration of model interpretability, a crucial aspect for stakeholders who need to understand the factors driving predictions. The principal conclusion of this work is that while advanced models generally offer better predictive performance, their complexity necessitates careful consideration of interpretability, especially in a market as nuanced as art.

In summary, this research contributes to the ongoing discourse on the application of machine learning in the art market by offering insights into the most effective prediction architectures and by providing a collected dataset of real-world European online auction results. Our findings are intended to guide auction houses, investors, and market analysts in selecting appropriate models for their specific needs, ultimately improving the precision and transparency of auction result predictions.

2. Materials and Methods

This study utilizes a dataset of observed art auction results for prints and multiples, which includes information on artwork characteristics, historical auction data, artist names, and relevant market indicators. The dataset has been carefully curated to ensure accuracy and relevance, providing a robust foundation for the predictive models discussed in this work.

2.1. Datasets

From the original dataset, four distinct datasets were generated by further extracting image-related features of the artworks. The core features common to all datasets include the following:

ARTIST: Name of the artist.
TECHNIQUE: Medium or method used (e.g., lithograph, etching).
SIGNATURE: Whether the artwork is hand signed, plate signed or unsigned.
CONDITION: Physical state of the artwork.
TOTAL DIMENSIONS: Area of the artwork.
YEAR: Year of creation.
PRICE: Final auction price.

Key Differences across the datasets are as follows:

AuctionResultsNoImg (NoImg) Dataset contains only the core features without any image-related features.
AuctionResultsColor (Color) Dataset includes an additional Colorfulness Score [18], a measure of color intensity and variety of the image of the artwork.
AuctionResultsSVD (SVD) Dataset adds SVD Entropy [19] of the image of the artwork to the core features, excluding the Colorfulness Score.
AuctionResultsColorSVD (ColorSVD) Dataset adds both Colorfulness Score and SVD Entropy, which quantifies the complexity of the artwork’s visual representation.

The Singular Value Decomposition (SVD) entropy of an image measures the information content or complexity of the image based on the singular values obtained from the SVD of the image matrix. It can be computed using the following formula:

H_{SVD} = - \sum_{i = 1}^{r} s_{i} \log (s_{i})

(1)

where r represents the rank (the number of non-zero singular values) of the matrix and s_{i} is the i-th normalized singular value.
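
As an illustration of Equation (1), the following is a minimal sketch of how SVD entropy could be computed for a grayscale image stored as a 2-D array. It is not taken from the study's preprocessing code. The function name `svd_entropy`, the tolerance `eps`, and the choice to normalize the singular values by their sum (so that they add up to 1) are assumptions made for this example.

```python
import numpy as np


def svd_entropy(image, eps=1e-12):
    """SVD entropy of a 2-D array, following Equation (1).

    Assumption: "normalized" means each singular value is divided by the
    sum of all non-zero singular values.
    """
    matrix = np.asarray(image, dtype=float)
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    if singular_values.size == 0 or singular_values.max() == 0.0:
        return 0.0  # an all-zero image carries no information
    # Keep only the r numerically non-zero singular values (r = rank).
    keep = singular_values > eps * singular_values.max()
    s = singular_values[keep] / singular_values[keep].sum()
    return float(-np.sum(s * np.log(s)))


flat_image = np.ones((4, 4))   # rank 1: a single non-zero singular value
diagonal_image = np.eye(4)     # rank 4: four equal singular values
print(svd_entropy(flat_image), svd_entropy(diagonal_image))
```

A uniform image has one dominant singular value and therefore an entropy of zero, while an image whose energy is spread evenly over four singular values reaches the maximum for rank 4, log(4).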

Additionally, the California Housing (Housing) dataset [20] was used as a baseline to compare the results against a well-known dataset within the research community, providing a point of reference for evaluating the models and methodologies applied.

2.2. Data Preprocessing

Data preprocessing was conducted to clean and standardize the dataset. This process involved enforcing the data structure described in Appendix A.1, filtering the data as described in Appendix A.2, and encoding categorical variables as described in Appendix A.3. Well-established methods, e.g., standard scaling for numerical variables, were employed to ensure that the data were in a suitable format for machine learning algorithms.

2.3. Dataset Description

The cardinality of the data plays a crucial role in the analysis of the model. Table 1 offers an overview of the number of unique values for the core features, with the exception of CONDITION and SIGNATURE, which have only three distinct values each.

The four datasets under consideration share identical characteristics for the features presented in Table 1, differing only in image-related attributes. The high cardinality of the ARTIST feature is particularly significant, as it presents challenges that the models must address. In contrast, the TECHNIQUE feature exhibits relatively low cardinality. The TOTAL DIMENSIONS feature demonstrates a substantial number of unique values, as it is derived from the product of two dimensions of the artworks. The YEAR feature reflects a temporal span of 123 years, indicating the breadth of the data. Lastly, the unique value distribution of the PRICE feature suggests considerable variation between individual price points, highlighting the feature’s complexity.

The distribution of the core features is shown in Table 2.

The distribution of the features reveals notable disparities among artists, with a small subset of 58 out of 395 artists accounting for 25% of the total artworks sold. Over 75% of the artworks were created using a single predominant technique. The dimensions of the majority of artworks are relatively modest, with 75% of them being smaller than 3400 cm², while only 25% fall within the larger range of 3401 to 10,000 cm². The most pronounced skew is observed in the sales prices, where 75% of the artworks were sold for less than 200 units (EUR), highlighting a concentration of lower-priced sales within the dataset.

2.4. Model Selection and Training

The analysis involved a comprehensive comparison of various machine learning models: Linear Regression, KNN, Decision Trees, Random Forests, Gradient Boosting Machines (XGBoost, CatBoost, and LightGBM), Model Trees, and neural or hybrid models (MLP, VIME, DeepGBM, DeepFM, and SAINT).

Each model was trained using 5-fold cross-validation to obtain performance estimates that are robust across different data subsets and less dependent on a single train/test split. Hyperparameter tuning was conducted using the Optuna library [21], with a maximum of 5 trials per model. Neural network models were trained for up to 1000 epochs, with early stopping triggered after 20 epochs without improvement. All models were trained to minimize the mean squared error.
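
The cross-validation and early-stopping settings described above can be illustrated with a minimal, library-free sketch. It does not reproduce the Optuna search or the code in the study's repository. The helper names `k_fold_indices` and `early_stopping_summary` are hypothetical, and the shuffling seed is an assumption of this example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: folds are shuffled
    # Fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def early_stopping_summary(val_losses, max_epochs=1000, patience=20):
    """Return (best_epoch, best_loss, epochs_run) for a list of validation losses.

    Training stops once `patience` consecutive epochs bring no improvement
    or when `max_epochs` is reached, whichever comes first.
    """
    best_loss, best_epoch = float("inf"), -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        epochs_run = epoch + 1
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss, epochs_run


folds = list(k_fold_indices(23, k=5))
# Ten improving epochs followed by a long plateau.
losses = [1.0 / (epoch + 1) for epoch in range(10)] + [1.0] * 50
summary = early_stopping_summary(losses)
```

With the plateau above, the best epoch is the tenth one (index 9) and training halts 20 epochs later, well before the 1000-epoch limit.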

2.5. Reproducibility

All code associated with model training and evaluation is available in the GitHub repository [https://github.com/PatrykMauer/TabSurvey, accessed on 24 November 2024], which builds upon a framework for comparing models for tabular data [22]. Code associated with the data preprocessing, which includes data cleaning, handling missing values, removing outliers, and standardizing artist name conventions, is also available in a GitHub repository [https://github.com/PatrykMauer/art-data-prep-pipeline, accessed on 24 November 2024]. By providing the complete codebase, we aim to facilitate the replication of our results and encourage further research in this domain. The lists of hyperparameters found for each model are provided in Appendix B, in Table A1 and Table A2.

3. Results

This section evaluates the performance of the trained models across different datasets, using Symmetric Mean Absolute Percentage Error (sMAPE) as the primary metric for comparison. The sMAPE formula is given by

sMAPE = \frac{1}{n} \sum_{i = 1}^{n} \frac{2 |y_{pred, i} - y_{true, i}|}{|y_{true, i}| + |y_{pred, i}|} \times 100

(2)

This metric is well suited to the art auction prediction problem because it is bounded and therefore limits the impact of large overpredictions, which matters for the presented datasets, whose prices are skewed towards lower values. The sections following the presented results explore the underlying factors that contribute to each model’s performance.
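
To make Equation (2) concrete, here is a minimal sketch of the metric in plain Python. The function name `smape` and the convention that a pair with both values equal to zero contributes no error are assumptions of this example, not details confirmed by the study.

```python
def smape(y_true, y_pred):
    """Symmetric Mean Absolute Percentage Error in percent, following Equation (2)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for true_value, predicted in zip(y_true, y_pred):
        denominator = abs(true_value) + abs(predicted)
        # Assumption: when both values are zero the term is counted as zero error.
        if denominator > 0:
            total += 2.0 * abs(predicted - true_value) / denominator
    return 100.0 * total / len(y_true)


over = smape([100.0], [150.0])   # over-prediction by 50 units
under = smape([100.0], [50.0])   # under-prediction by 50 units
worst = smape([100.0], [0.0])    # upper bound of the metric
```

For a lot sold at 100 units, predicting 150 gives an sMAPE of 40%, predicting 50 gives about 66.7%, and predicting 0 gives the maximum of 200%. The score is therefore bounded, and for the same absolute error it is larger for an under-prediction than for an over-prediction.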

3.1. sMAPE Score Analysis Across Datasets

The analysis of sMAPE scores across the different datasets provides valuable insights into the predictive capabilities of the compared machine learning models. sMAPE is particularly useful in this context because it measures the difference between predicted and actual values as a percentage of their average magnitude, so errors on inexpensive and expensive lots are expressed on the same scale. Both over- and under-predictions are penalized, which matters for art auction price predictions, where both types of errors can have a significant impact.

Table 3 provides a comparative evaluation of the models based on their sMAPE scores across the datasets. In the NoImg dataset, XGBoost demonstrates the best performance, achieving a sMAPE score of 55.11%, with LightGBM following at 57.53%. The LinearModel, however, exhibits significantly inferior performance, with sMAPE over 100% for each dataset, indicating its limitations in handling the complexity of auction data. Similarly, DeepGBM with a sMAPE of 76.92% and VIME with 75.29% struggle to generalize effectively, while other models present moderate performance, ranging from 60.57% for SAINT to 68.50% for ModelTree.

In the Color dataset, RandomForest emerges as the top performer with a sMAPE of 55.95%, outperforming XGBoost, which records 57.50%. The other models show error ranges similar to those on the NoImg dataset; the only notable improvement is for VIME, whose error decreases by almost 10 percentage points.

In the SVD dataset, XGBoost once again leads with a sMAPE score of 54.83%, followed closely by RandomForest at 56.94%. Contrary to the Color dataset, providing SVD Entropy resulted in the worst noted error for VIME (86.44%) and DeepGBM (87.96%).

For the ColorSVD dataset, RandomForest delivers the best performance, achieving a sMAPE of 55.51%, with CatBoost in second place at 58.25%. VIME (74.09%) and DeepGBM (76.18%) still exhibit higher error rates than other moderately performing models, suggesting that complex neural network-based models are less effective compared to ensemble methods when Colorfulness Score and SVD Entropy enhance the auction data.

The Housing dataset is simpler compared to the auction datasets, as it displays higher linearity between features. In this setting, LightGBM achieves the best performance with a sMAPE of 14.71%, slightly outperforming XGBoost (14.84%). However, DeepGBM performs poorly in this benchmark, with a sMAPE of 35.13%, further demonstrating DeepGBM may not be suitable for this kind of tabular dataset, while this time, VIME is placed among the moderate performers.

Overall, XGBoost and RandomForest consistently exhibit superior performance across the auction-related datasets; however, the prediction errors still exceed the threshold for what would be considered satisfactory results. The LinearModel consistently performs poorly, with sMAPE scores exceeding 100% in most auction datasets, underscoring its unsuitability for these tasks. Models such as DeepGBM and VIME also exhibit subpar performance, particularly in the auction datasets where higher-dimensional transformations (e.g., SVD) are applied. In contrast, for the CaliforniaHousing dataset, more models demonstrate good generalization abilities, with LightGBM and XGBoost achieving the best results.

Tree-based ensemble methods, particularly XGBoost (gradient boosting) and RandomForest (bagging), consistently deliver the best performance across various datasets, especially in handling the complexity of auction data with nonlinearity and high-dimensional features. While these models achieve relatively lower sMAPE scores compared to others, indicating their robustness and adaptability to tabular data, the overall prediction errors remain above what would be considered optimal, underscoring the challenge of achieving precise predictions in this domain.

3.2. LinearModel (Linear Regression)

The LinearModel consistently performed the worst across all auction datasets, with sMAPE scores exceeding 100%, which highlights its inability to capture the complexity of auction price data. Linear regression assumes a linear relationship between the target variable (auction price) and input features, modeled as

\hat{y} = β_{0} + \sum_{i = 1}^{p} β_{i} x_{i}

(3)

However, auction prices are driven by nonlinear interactions between factors such as TECHNIQUE and TOTAL DIMENSIONS and high-cardinality features such as ARTIST. The inability to model these complex relationships leads to high bias, as the linear model systematically underfits the data. The lack of flexibility in handling nonlinearity and feature interactions, compared to models like XGBoost or RandomForest, results in poorer generalization and high error rates.

3.3. K-Nearest Neighbors

KNN performed significantly better than LinearModel, but its sMAPE scores, ranging between 60.38% and 62.68% (presented in Figure 1), indicate that it still struggled with the complexity of the auction datasets.

KNN is a distance-based algorithm that predicts the target variable based on the average of the nearest neighbors’ target values. It is defined as

\hat{y} = \frac{1}{k} \sum_{i \in N_{k} (x)} y_{i}

(4)

where N_k(x) represents the k nearest neighbors of x.

We observe that KNN performed relatively consistently across the auction datasets, showing minor improvements when image-derived features were introduced. This indicates that dimensionality alone was not the key factor in KNN’s underperformance compared to Random Forest. Auction datasets, characterized by volatility due to external market factors and the subjective nature of art pricing, often contain outliers, such as unusually high or low auction prices. KNN’s reliance on simple averaging of the nearest neighbors makes it highly sensitive to these outliers, which leads to less accurate predictions.
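
The averaging in Equation (4) and its sensitivity to outliers can be shown with a small sketch. The function name `knn_predict`, the Euclidean distance, and the toy prices below are assumptions chosen for illustration; they are not drawn from the auction datasets.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query` (Equation (4))."""
    # Assumption: Euclidean distance on already-scaled numeric features.
    by_distance = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )
    neighbours = by_distance[:k]
    return sum(target for _, target in neighbours) / len(neighbours)


features = [[1.0], [2.0], [3.0], [10.0]]
typical_prices = [100.0, 110.0, 120.0, 130.0]
prices_with_outlier = [100.0, 110.0, 5000.0, 130.0]  # one unusually high result

clean_prediction = knn_predict(features, typical_prices, [2.0], k=3)
skewed_prediction = knn_predict(features, prices_with_outlier, [2.0], k=3)
```

With typical prices the prediction is 110, but a single extreme sale among the three neighbours raises it to roughly 1737, which illustrates why plain averaging is fragile on volatile auction data.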

3.4. DecisionTree

The DecisionTree model demonstrated moderate performance across the auction datasets, with its best result achieved when SVD Entropy was included (56.70% ± 0.78). The comparison is presented in Figure 2.

The model’s ability to capture nonlinearity is a key strength, but its susceptibility to overfitting noisy data likely limited its performance. DecisionTrees recursively partition the feature space, minimizing the variance (MSE for regression) in the target variable at each node. This characteristic explains their better performance compared to LinearModel and KNN. At the same time, DecisionTrees tend to overfit noisy auction datasets, memorizing both signal and noise. In the ColorSVD dataset (59.21%), the model likely overfitted to complex feature combinations, leading to reduced generalization.

3.5. RandomForest

RandomForest is an ensemble learning method that constructs multiple decision trees, each trained on a bootstrapped sample of the dataset. By averaging the predictions across all trees, RandomForest reduces variance, as described by

Var (\hat{f}) = ρ σ^{2} + (1 - ρ) \frac{σ^{2}}{n}

(5)

where Var(\hat{f}) is the variance of the ensemble prediction, n is the number of trees, ρ is the correlation between trees, and σ^{2} is the variance of each tree’s prediction.

This averaging mitigates overfitting by reducing sensitivity to individual tree errors, leading to more stable predictions. The performance results highlight this advantage, especially in datasets with image-derived features. For instance, as presented in Figure 3, RandomForest achieved strong results in the Color (55.95% ± 0.29) and ColorSVD (55.51% ± 0.80) datasets, reflecting its robustness in capturing nonlinear interactions. With these datasets, RandomForest maintained better fit and generalization than other models.
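
The variance-reduction effect of averaging can be checked numerically. The sketch below assumes the standard result for n identically distributed predictors with variance σ² and pairwise correlation ρ, whose average has variance ρσ² + (1 - ρ)σ²/n, and compares it with a direct sum over the covariance matrix. The function names are hypothetical and the numbers are illustrative only.

```python
def ensemble_variance(rho, sigma2, n):
    """Closed-form variance of the mean of n equally correlated predictors."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / n


def variance_from_covariance(rho, sigma2, n):
    """Same quantity obtained by summing every entry of the covariance matrix."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Diagonal entries are variances, off-diagonal entries covariances.
            total += sigma2 if i == j else rho * sigma2
    return total / (n * n)


single_tree = ensemble_variance(rho=0.3, sigma2=4.0, n=1)
forest = ensemble_variance(rho=0.3, sigma2=4.0, n=100)
uncorrelated_forest = ensemble_variance(rho=0.0, sigma2=4.0, n=100)
```

With ρ = 0.3 the variance falls from 4.0 for a single tree to about 1.23 for 100 trees, and it cannot drop below ρσ² = 1.2 however many trees are added. This is why bootstrapping and random feature selection, which lower the correlation between trees, matter as much as the number of trees.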

3.6. XGBoost

XGBoost outperformed all other models on the NoImg and SVD datasets, achieving the lowest sMAPE scores of 55.11% and 54.83%, respectively, and delivered consistent results on the other datasets, as presented in Figure 4.

We attribute this to both its precise optimization and its robust regularization capabilities. The core strength of XGBoost comes from its gradient boosting framework, where trees are built sequentially to minimize a regularized objective function. The approximation of the objective uses both the gradient (first derivative) and the Hessian (second derivative) of the loss, allowing for more precise optimization. The function is approximated as follows:

L (θ) \approx \sum_{i = 1}^{n} [g_{i} θ + \frac{1}{2} h_{i} θ^{2}] + Ω (f)

(6)

where θ represents the parameters of the model, g_{i} is the gradient, h_{i} is the Hessian, and Ω (f) is a regularization term.

The second strong point of XGBoost is a solid regularization term that is based on both L1 (Lasso) and L2 (Ridge) techniques that control model complexity. The complete regularization term is described by

Ω (f) = λ \sum_{j = 1}^{m} θ_{j}^{2} + α \sum_{j = 1}^{m} |θ_{j}|

(7)

where λ controls the strength of L2 regularization (penalizing large weights), and α controls the strength of L1 regularization (encouraging sparsity by setting less important weights to zero). In auction datasets, L1 regularization helps focus the model on key features (e.g., ARTIST or TECHNIQUE) by eliminating irrelevant ones, simplifying the model and improving interpretability. L2 regularization, on the other hand, controls the size of the remaining feature weights by penalizing large coefficients. This prevents any single feature from dominating the model and helps reduce overfitting, making the model more stable and robust, especially in noisy auction datasets. Together, L1 and L2 regularization allow XGBoost to balance model complexity and generalization.
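
The interplay of the two penalties can be made concrete for a single parameter θ. The sketch below assumes the second-order objective G·θ + ½·H·θ² + λ·θ² + α·|θ|, where G and H are the summed gradients and Hessians of the samples affected by θ, and derives its minimizer in closed form. The function names are hypothetical. Note also that the XGBoost paper writes the L2 term with a factor of ½, in which case the denominator becomes H + λ rather than H + 2λ.

```python
def soft_threshold(value, alpha):
    """Shrink `value` towards zero by `alpha`; values within [-alpha, alpha] become 0."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def optimal_weight(grad_sum, hess_sum, lam, alpha):
    """Minimizer of G*theta + 0.5*H*theta^2 + lam*theta^2 + alpha*|theta|."""
    return -soft_threshold(grad_sum, alpha) / (hess_sum + 2.0 * lam)


def objective(theta, grad_sum, hess_sum, lam, alpha):
    """Regularized second-order objective for a single weight."""
    return (grad_sum * theta + 0.5 * hess_sum * theta ** 2
            + lam * theta ** 2 + alpha * abs(theta))


strong_signal = optimal_weight(grad_sum=-10.0, hess_sum=4.0, lam=1.0, alpha=2.0)
weak_signal = optimal_weight(grad_sum=1.5, hess_sum=4.0, lam=1.0, alpha=2.0)

# Brute-force check of the closed form on a fine grid of candidate weights.
grid = [i / 1000.0 for i in range(-5000, 5001)]
grid_best = min(grid, key=lambda t: objective(t, -10.0, 4.0, 1.0, 2.0))
```

A strong gradient signal yields a shrunken but non-zero weight (8/6 ≈ 1.33 here), whereas a signal weaker than α is set exactly to zero. This is the sparsity effect attributed to L1, while λ only scales the remaining weight down.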

3.7. CatBoost

CatBoost demonstrated stable performance across all the auction datasets. The results of this model are presented in Figure 5.

CatBoost’s key strength lies in its target-based encoding of categorical variables, which make up most of the auction features. Moreover, CatBoost uses ordered boosting, a method that addresses the prediction-shift problem often observed in gradient boosting. In ordered boosting, each tree is trained on a "historical" view of the data: the examples are placed in an order (by default, a random permutation), and the estimate for each example uses only the examples that precede it, which limits overfitting and target leakage. This can be useful in auction datasets, where past auction results influence price predictions. In addition, CatBoost incorporates L2 regularization, which helps control the model’s complexity by penalizing large weights, leading to more stable predictions. This is especially important in datasets like auctions, where market volatility can introduce noise. The robust regularization, combined with CatBoost’s ability to efficiently handle categorical data, results in models that generalize well, even with noisy auction prices.
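
The idea of encoding a category using only preceding examples can be sketched as follows. This is a simplified illustration of an ordered target statistic, not CatBoost's implementation: the function name, the smoothing parameter `prior_weight`, and the use of the input order as the permutation are all assumptions of this example.

```python
def ordered_target_statistics(categories, targets, prior, prior_weight=1.0):
    """Encode each row's category from the targets of earlier rows only.

    encoded = (sum of earlier targets in the same category + prior_weight * prior)
              / (number of earlier rows in the same category + prior_weight)
    """
    running_sum = {}
    running_count = {}
    encoded = []
    for category, target in zip(categories, targets):
        previous_sum = running_sum.get(category, 0.0)
        previous_count = running_count.get(category, 0)
        encoded.append(
            (previous_sum + prior_weight * prior) / (previous_count + prior_weight)
        )
        # The row's own target is added only after its encoding is computed,
        # so it never leaks into its own feature value.
        running_sum[category] = previous_sum + target
        running_count[category] = previous_count + 1
    return encoded


artists = ["A", "B", "A", "A"]          # toy stand-in for the ARTIST feature
prices = [100.0, 200.0, 300.0, 500.0]   # toy prices
prior_price = sum(prices) / len(prices)  # 275.0, used when no history exists
encoding = ordered_target_statistics(artists, prices, prior=prior_price)
```

The first appearance of each artist falls back to the prior (275.0), and later rows for artist A are encoded as 187.5 and 225.0, values that depend only on that artist's earlier prices.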

3.8. LightGBM

LightGBM performed competitively across the auction datasets, but its sMAPE scores were slightly higher compared to XGBoost, as shown in Figure 6. The model achieved a sMAPE of 57.52% ± 1.22 in the ColorSVD dataset, and a notable 14.71% ± 0.31 in the CaliforniaHousing dataset, where it performed the best. However, it underperformed compared to XGBoost and RandomForest in more complex auction datasets.

One key difference between LightGBM and XGBoost lies in their regularization strategies. While both models use L2 regularization, XGBoost employs a stronger regularization framework, incorporating both L1 and L2 penalties. This gives XGBoost an edge in controlling model complexity, especially in high-dimensional and noisy datasets like auctions, where feature interactions are nonlinear and difficult to capture. In contrast, LightGBM relies primarily on L2 regularization to prevent overfitting by penalizing large leaf weights. The absence of a strong L1 regularization component means that LightGBM is less effective at enforcing sparsity, which could explain its slightly higher error rates in auction datasets where irrelevant or noisy features might be present. While LightGBM excels in structured data, its regularization framework may not handle noise and complex feature interactions as effectively as XGBoost.

The model’s leaf-wise growth strategy, where trees grow asymmetrically based on maximum loss reduction, helps LightGBM capture deep and complex interactions between features like Colorfulness Score and SVD Entropy. This enables the model to handle high-dimensional data efficiently, making it faster and more memory-efficient. However, this extensive tree growth can also lead to overfitting if not adequately controlled, especially in noisy datasets.

3.9. Multi-Layer Perceptron (MLP)

The MLP model showed moderate performance across the auction datasets, with sMAPE scores ranging from 61.86% ± 1.14 in the Color dataset to 65.02% ± 0.75 in the ColorSVD dataset. Despite its theoretical ability to model complex nonlinear relationships, the MLP underperformed compared to tree-based ensemble methods. The model showed a slight improvement when the Colorfulness Score was introduced, but its performance decreased with the addition of SVD Entropy, as illustrated in Figure 7.

Unlike tree-based models, MLPs lack built-in mechanisms to evaluate feature importance during training, limiting the model’s ability to prioritize the most predictive features. This can dilute the learning process across many irrelevant inputs. Given the relatively small size of the auction datasets (25,408 entries), neural networks like MLP are prone to overfitting, as they tend to learn noise rather than general patterns, resulting in poor performance on unseen auction data. The absence of strong regularization methods (e.g., dropout, weight decay) in the hyperparameters suggests that overfitting was not adequately controlled. As a result, MLPs are more likely to memorize training data rather than generalizing effectively.

3.10. VIME

The VIME model exhibited considerable variability in performance across the auction datasets, with sMAPE scores ranging from 66.29% ± 1.59 in the Color dataset to 86.44% ± 2.28 in the SVD dataset. Overall, VIME underperformed compared to models like RandomForest and XGBoost. The results reached by VIME are presented in Figure 8.

Value Imputation and Mask Estimation (VIME) is a semi-supervised learning model designed specifically for tabular data. It combines supervised and unsupervised learning, leveraging labeled data while making use of additional unlabeled data. VIME’s semi-supervised nature can be an advantage in datasets where labeled data are limited, as the model can extract additional information from the unlabeled portion of the data. In the auction datasets, where labeled data (auction prices) are fully available but noisy due to external factors, the unsupervised component of VIME may have been less effective.

3.11. ModelTree

The ModelTree model exhibited moderate performance across the auction datasets, with sMAPE scores ranging from 66.59% ± 0.66 in the Color dataset to 71.68% ± 3.62 in the SVD dataset.

ModelTree is a hybrid model that combines the strengths of decision trees and linear regression. The model partitions the data recursively, like a traditional decision tree, but instead of making constant predictions at the leaf nodes, it fits a linear regression model to the data in each leaf. This enables the model to blend nonlinear partitioning with localized linear modelling, allowing it to capture both linear and nonlinear patterns in different regions of the feature space. The use of linear regression within each partition provides a localized approximation of the relationship between the features and the target variable (auction prices), offering more flexibility than constant predictions, while still maintaining the tree structure for partitioning.
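
The combination of partitioning and leaf-level regression can be illustrated with a deliberately small sketch: a single split on one numeric feature with an ordinary least-squares line fitted in each leaf. This is an assumption-laden toy version, not the ModelTree implementation used in the study, and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return mean_y, 0.0  # constant feature: fall back to the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def squared_error(xs, ys, line):
    intercept, slope = line
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def fit_one_split_model_tree(xs, ys, min_leaf=2):
    """Pick the threshold whose two leaf-level lines minimize total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        left, right = pairs[:i], pairs[i:]
        if left[-1][0] == right[0][0]:
            continue  # cannot place a threshold between identical x values
        left_x, left_y = zip(*left)
        right_x, right_y = zip(*right)
        left_line, right_line = fit_line(left_x, left_y), fit_line(right_x, right_y)
        error = (squared_error(left_x, left_y, left_line)
                 + squared_error(right_x, right_y, right_line))
        if best is None or error < best[0]:
            threshold = (left[-1][0] + right[0][0]) / 2.0
            best = (error, threshold, left_line, right_line)
    if best is None:
        raise ValueError("not enough distinct points to split")
    return best[1:]


def predict(model, x):
    threshold, left_line, right_line = model
    intercept, slope = left_line if x <= threshold else right_line
    return intercept + slope * x


xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [abs(x) for x in xs]            # V-shaped target: nonlinear overall
model = fit_one_split_model_tree(xs, ys)
global_line = fit_line(xs, ys)       # a single line cannot follow the V shape
```

On the V-shaped toy target, one global line has zero slope and predicts the same value everywhere, whereas the one-split model tree recovers both arms and predicts 2.5 at x = 2.5 and at x = -2.5.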

3.12. DeepGBM

The DeepGBM model performed poorly across all auction datasets, with sMAPE scores ranging from 76.18% ± 4.23 in the ColorSVD dataset to 87.96% ± 10.62 in the SVD dataset, indicating its difficulty in generalizing effectively. The auction datasets may be too small to train a model as complex as DeepGBM. Its hybrid structure of gradient-boosted decision trees and deep neural networks adds significant complexity, and the neural component is likely too deep for these data. Without strong regularization, the network likely memorized noisy patterns rather than learning generalizable features and failed to generate meaningful low-dimensional representations of the input features. Although DeepGBM theoretically has the capacity to capture complex feature interactions through its hybrid approach, its overly complex architecture struggled to fit the highly variable auction data.

3.13. DeepFM

DeepFM is a hybrid model that combines factorization machines with deep neural networks to model both low- and high-order feature interactions. It is designed for tasks where understanding the interactions between categorical and numerical features is critical. Despite its sophisticated structure, DeepFM performed only moderately across the auction datasets, with sMAPE scores ranging from 61.30% ± 0.76 in the NoImg dataset to 63.45% ± 0.87 in the Color dataset, as shown in Figure 9. The model did not outperform simpler models like RandomForest or XGBoost.

While this architecture theoretically provides a robust way to learn both simple and complex interactions between features, auction datasets present a unique challenge. Many of the feature interactions in auction data (e.g., ARTIST, TECHNIQUE, PRICE) are influenced by external market factors and subjective pricing that cannot easily be captured by standard interaction modeling. This might explain why DeepFM’s additional complexity did not translate into better predictive accuracy compared to tree-based models.
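
The low-order part of DeepFM can be illustrated with the standard second-order factorization-machine term, in which every pair of features interacts through the dot product of their latent vectors. The sketch below compares the naive pairwise sum with the usual linear-time reformulation. The function names, latent vectors, and inputs are made up for illustration and do not come from the study.

```python
def fm_pairwise_naive(x, v):
    """Sum over i < j of <v_i, v_j> * x_i * x_j, computed pair by pair."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Equivalent form: 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    total = 0.0
    for f in range(len(v[0])):
        weighted = [v[i][f] * x[i] for i in range(len(x))]
        total += sum(weighted) ** 2 - sum(w * w for w in weighted)
    return 0.5 * total


x = [1.0, 0.0, 2.0]                            # toy feature vector
v = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]      # one 2-D latent vector per feature
naive_value = fm_pairwise_naive(x, v)
fast_value = fm_pairwise_fast(x, v)
```

Both functions return -3.0 for this input. Because the interaction strength is a dot product of learned vectors, a pair such as a rarely seen artist and technique can still receive a sensible interaction weight, but only to the extent that the relationship is present in the recorded features.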

3.14. SAINT

Self-Attention and Intersample Attention Transformer (SAINT) is a relatively new approach that utilizes transformers for tabular data. The model applies two main attention mechanisms: self-attention, which focuses on feature interactions within a single sample, and intersample attention, which focuses on interactions between different samples. Despite its innovative architecture, SAINT delivered only moderate performance across the auction datasets, with sMAPE scores ranging from 60.57% ± 0.95 in the NoImg dataset to 62.41% ± 1.19 in the SVD dataset, as presented in Figure 10.

The self-attention mechanism in SAINT allows the model to capture relationships between features within a single sample. It operates by computing attention weights that dictate how much each feature should contribute to the prediction. The self-attention mechanism is particularly useful for handling high-dimensional and structured data, as it can model complex dependencies between features that other models (e.g., traditional decision trees) might miss. However, auction datasets contain nonlinear relationships driven by subjective factors like market volatility, which are difficult to model using feature-based dependencies alone. This likely explains why SAINT, despite capturing feature interactions, did not outperform simpler models like XGBoost and RandomForest.
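
The attention weights described above can be sketched with a single-head scaled dot-product self-attention over the feature embeddings of one sample. This is a generic illustration rather than SAINT's actual architecture: the random embeddings and projection matrices, the dimensions, and the function name are assumptions, and intersample attention (attending across the rows of a batch) is not shown.

```python
import numpy as np


def self_attention(embeddings, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over the rows of `embeddings`."""
    queries, keys, values = embeddings @ w_q, embeddings @ w_k, embeddings @ w_v
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax per row
    return weights @ values, weights


rng = np.random.default_rng(0)
n_features, dim = 6, 4  # e.g., six tabular features embedded in four dimensions
embeddings = rng.normal(size=(n_features, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))

output, weights = self_attention(embeddings, w_q, w_k, w_v)
```

Row i of `weights` sums to one and states how strongly each feature contributes to the updated representation of feature i. The mechanism can only reweight information already present in the recorded features.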

3.15. Feature Importance Plots

In this section, we analyze the feature importance across different datasets to understand the impact of various features on model performance. By examining the importance plots, we aim to identify which features consistently contribute to predictive accuracy and how the inclusion of additional features, such as colorfulness and SVD entropy, influences overall model error. This analysis provides insights into the relative significance of core features versus image-related features in auction datasets.

The feature importance plots reveal distinct patterns depending on the model and the dataset. In Figure 11b, which shows the feature importance for the ColorSVD dataset using the RandomForest model, ARTIST and TOTAL DIMENSIONS stand out as the most influential features. Image-related features such as Colorfulness Score and SVD Entropy also contribute noticeably to the model’s predictions. However, their inclusion adds complexity that does not necessarily improve performance and may even lead to higher error rates with the presented models.

On the other hand, Figure 11a, which shows the feature importance for the NoImg dataset using the XGBoost model, underscores the prominence of traditional metadata features such as TOTAL DIMENSIONS, ARTIST, and YEAR. These core features are critical to driving model performance, emphasizing their fundamental role in prediction accuracy. Interestingly, although the NoImg dataset does not include image-related features, XGBoost achieves the lowest sMAPE of all models on this dataset (55.11% ± 0.60, Table 3) using these traditional features alone.

This demonstrates the importance of key features, such as ARTIST, TOTAL DIMENSIONS, and YEAR, in the initial stages of evaluating artwork value. By identifying which factors most strongly influence auction outcomes, the models provide insights into the early steps of price prediction. With prediction errors remaining above 50% sMAPE for every model (Table 3), the study indicates that, although these models introduce a more data-driven framework for artwork evaluation, they still fail to achieve the level of accuracy required to serve as a foundation for artwork evaluation.

3.16. Summary of the Results

In summary, while image-related features in the ColorSVD dataset provide some predictive value, their inclusion does not always result in lower error rates. The analysis indicates that traditional features like TOTAL DIMENSIONS and ARTIST remain consistently powerful across models and datasets. The findings suggest that adding more complex, image-derived features may introduce diminishing returns for the evaluated models, highlighting the need to balance the benefits of added complexity against the risk of reduced model performance.

Across all datasets analyzed, the RandomForest and XGBoost models consistently emerged as top performers, achieving the lowest sMAPE values. This performance highlights their ability to generalize well across a relatively small number of features. These models also showed some resilience when color features or SVD Entropy were added. Even so, the sMAPE values, despite relatively tight error bars that indicate some stability, remain too high to suggest that the models reliably capture the complex patterns within art auction datasets.
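
For readers who want to reproduce the metric, the sketch below shows one common definition of sMAPE. This section does not restate the formula used in the study, so the variant here (mean of absolute error divided by the average of absolute actual and predicted values, bounded between 0% and 200%) is an assumption; it is at least consistent with Table 3, where values slightly above 100% appear, which a variant capped at 100% could not produce. The function name and the toy prices are hypothetical.

```python
def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error in percent (range 0 to 200).

    Assumed variant: mean of |y - y_hat| / ((|y| + |y_hat|) / 2).
    Pairs where both values are zero contribute zero error.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        total += 0.0 if denom == 0 else abs(t - p) / denom
    return 100.0 * total / len(y_true)


# Hypothetical prices and predictions, for illustration only.
prices = [50.0, 99.0, 200.0]
predicted = [100.0, 99.0, 100.0]
score = smape(prices, predicted)
```

In this toy case, predicting double or half the true price on two of three lots already yields an sMAPE of roughly 44%, which gives a sense of scale for the values of about 55% reported for the best models.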

The stronger results of XGBoost and Random Forest may be linked to their use of strong or multiple regularization techniques. XGBoost, for example, effectively combines L1 (Lasso) and L2 (Ridge) regularization, which controls model complexity and prevents overfitting. In contrast, while Random Forest does not use explicit regularization, it incorporates several implicit mechanisms. These include bootstrap sampling, where trees are trained on random data subsets and their predictions averaged to reduce variance, as well as feature subsampling, which prevents any one feature from dominating the model. Additionally, limiting tree depth and setting a minimum number of samples per leaf or split help prevent the trees from becoming too complex and overfitting. Together, these techniques act as implicit regularization, enhancing the model’s generalization.
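
The effect of the L1 and L2 penalties can be illustrated on a single tree leaf. In gradient-boosted trees of the XGBoost type, the optimal leaf weight is commonly written as the soft-thresholded gradient sum divided by the Hessian sum plus the L2 term. The sketch below follows that formula as a simplified illustration; the function names and the toy gradient statistics are hypothetical, and the code is not taken from the XGBoost library.

```python
def soft_threshold(g, alpha):
    """Shrink g toward zero by alpha (the L1 effect); return 0 inside [-alpha, alpha]."""
    if g > alpha:
        return g - alpha
    if g < -alpha:
        return g + alpha
    return 0.0


def leaf_weight(grad_sum, hess_sum, alpha=0.0, lam=1.0):
    """Regularized leaf weight: -soft_threshold(G, alpha) / (H + lambda)."""
    return -soft_threshold(grad_sum, alpha) / (hess_sum + lam)


# Hypothetical leaf with summed gradient G = -12 and summed Hessian H = 10.
unregularized = leaf_weight(-12.0, 10.0, alpha=0.0, lam=0.0)
l2_only = leaf_weight(-12.0, 10.0, alpha=0.0, lam=2.0)
l1_and_l2 = leaf_weight(-12.0, 10.0, alpha=2.0, lam=2.0)
pruned_to_zero = leaf_weight(-12.0, 10.0, alpha=15.0, lam=2.0)
```

Increasing `lam` shrinks every leaf weight smoothly, whereas a sufficiently large `alpha` sets weak leaves exactly to zero, which is how the two penalties jointly limit model complexity.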

Conversely, the Linear Model consistently exhibited the highest error across all datasets, underscoring its limitations in handling the complex, non-linear relationships likely present in the auction data. Despite being a simpler model, its poor performance indicates that linear assumptions are insufficient for this type of dataset, which likely includes complex interactions and non-linear dependencies that linear regression cannot capture effectively.

KNN and Decision Trees showed limited effectiveness, generally ranking in the mid-range of achieved sMAPE scores. Their underperformance could be due to their sensitivity to specific characteristics of the data, such as outliers or noisy features. More complex models like Multilayer Perceptron (MLP) and VIME failed to achieve satisfactory results. While these models are theoretically capable of capturing complex, non-linear relationships, they struggled to deliver competitive sMAPE scores, particularly when compared to the more stable performance of boosting methods like LightGBM and XGBoost. This underperformance could be attributed to the challenges in tuning these deep learning models for specific data applications, which require careful optimization of hyperparameters, architecture, and regularization techniques to prevent overfitting.

In summary, this analysis highlights the critical need for careful model selection that aligns with the intrinsic characteristics of the art auction dataset. While tree ensembles and boosting models like LightGBM, XGBoost, and CatBoost have shown relative strengths when handling complex features, their performance still falls short of what would be considered robust and reliable. Simpler models and the deep learning approaches compared here consistently underperform, revealing the persistent challenges of matching model complexity to the specific needs of the dataset. Overall, this analysis suggests that without a precise alignment between model capabilities and data characteristics, achieving satisfactory predictive performance remains difficult.

4. Discussion

The results of this study reveal persistent challenges in applying machine learning models to the prediction of art auction results. While tree-based ensemble models, particularly XGBoost and Random Forest, managed to perform better than other approaches, their effectiveness was still limited when dealing with the complexities of tabular data. Although these models demonstrated some predictive power across the auction datasets, their performance fell short of consistently robust results. This outcome contrasts with prior research that highlighted the effectiveness of gradient-boosting techniques in other domains, suggesting that their capabilities may be less reliable when applied to the nuances of art auction data.

While neural network architectures such as MLP, VIME, and DeepGBM theoretically have the capacity to capture complex interactions in high-dimensional data, they consistently underperformed compared to tree-based models. This outcome suggests that the specific characteristics of art auction data, such as the influence of external market factors and subjective elements like artist reputation and artwork provenance, are not effectively captured by these models. Neural networks tend to excel in larger datasets with abundant feature interactions; however, in the context of auction data, where the sample size is relatively small and features like ARTIST or TECHNIQUE dominate, tree-based models outperform due to their strong regularization ability. These results raise questions about the applicability of neural networks in auction price prediction and suggest that simpler, interpretable models may be more effective in this domain.

One key finding of this study is that the inclusion of image-based features, such as Colorfulness and SVD entropy, did not substantially improve model performance. This contrasts with previous research in the price prediction domain, where image features have been shown to enhance predictions. In the art market, however, core metadata features like ARTIST, TOTAL DIMENSIONS, and YEAR were consistently more influential in driving model accuracy. The limited impact of image features in this study may be due to the highly subjective nature of art valuation, where non-visual factors such as provenance, historical significance, and artist reputation often outweigh visual characteristics in determining price.

The implications of these findings raise concerns for stakeholders in the art market, including auction houses, investors, and analysts. Although tree-based models like Random Forest and XGBoost showed some degree of outperformance, their predictive capabilities still fall short of providing consistently accurate and transparent results. This suggests that relying on these models to optimize bidding strategies or guide investment decisions may not be as reliable as hoped. Furthermore, the feature importance analysis indicates that even traditional metadata, such as artist information, artwork dimensions, and auction histories, may not be sufficient to significantly improve predictive outcomes, highlighting the limitations of current data collection and curation practices in the art market.

The underperformance of neural networks, particularly in the context of art auction prediction, may suggest that future research should focus less on deep learning approaches and more on refining tree-based methods. However, there remains potential for exploring hybrid models that integrate the strengths of both approaches, particularly in cases where additional, higher-dimensional data can be incorporated. For example, the integration of social media sentiment, market trends, or economic indicators could provide a richer dataset for prediction, potentially enhancing the performance of more complex models. Another avenue for future research lies in exploring temporal dynamics in auction data, as the timing of auctions and broader economic conditions often play a critical role in determining final prices.

Furthermore, interpretability remains a critical factor in the practical application of predictive models in the art market. While tree-based models offer transparency through feature importance metrics, advanced techniques such as Shapley Additive Explanations (SHAP) or Local Interpretable Model-Agnostic Explanations (LIME) could provide deeper insights into other models’ behavior, deepening the understanding of the reasons behind the predictions. This is especially important in markets like art, where trust and transparency are paramount.
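
As an illustration of what Shapley-based explanations compute, the sketch below derives exact Shapley values for a tiny model by averaging each feature's marginal contribution over all feature orderings. It is a didactic example under stated assumptions: `toy_price_model`, its coefficients, the input, and the baseline are all hypothetical and unrelated to the models trained in this study, and the exhaustive enumeration is only feasible for a handful of features (practical tools rely on approximations or model-specific algorithms).

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features are switched from their baseline value to their actual value
    one at a time, and the resulting change in output is averaged over
    every possible ordering of the features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]
            new = model(current)
            phi[i] += new - prev
            prev = new
    return [p / len(orders) for p in phi]


def toy_price_model(features):
    """Hypothetical stand-in for a fitted regressor (not the study's model)."""
    artist_score, area, year = features
    return 100.0 * artist_score + 0.02 * area + 0.5 * artist_score * (year - 1900)


x = [2.0, 1750.0, 1974.0]            # hypothetical artwork
baseline = [1.0, 1000.0, 1950.0]     # hypothetical reference artwork
phi = shapley_values(toy_price_model, x, baseline)
```

The contributions in `phi` add up to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes such explanations easy to communicate to non-technical stakeholders.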

In conclusion, this study underscores the persistent challenges of applying machine learning to art auction prediction. Although tree-based ensemble models demonstrated relatively better performance, their accuracy remains limited, and neural networks encountered even greater difficulties. The findings indicate that simpler models, despite balancing accuracy and interpretability, still fall short of adequately capturing the complexities of auction data, which include non-linear relationships and subjective factors. Future research should explore hybrid models and integrate additional data sources to enhance predictive performance. However, the ongoing issue of high error rates raises significant concerns about the practical utility and long-term viability of these approaches in the art market.

Author Contributions

Conceptualization, P.M.; methodology, P.M.; software, P.M.; validation, P.M.; formal analysis, S.P.; investigation, P.M.; data curation, P.M.; writing—original draft preparation, P.M.; writing—review and editing, S.P.; visualization, P.M.; supervision, S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets were collected from publicly available sources; they can be found at: https://repod.icm.edu.pl/dataset.xhtml?persistentId=doi:10.18150/AEQF8C.

Acknowledgments

GenAI was utilized to automatically round numbers in results and to check for grammatical errors.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Data Preprocessing

This section outlines the comprehensive workflow applied to the raw data to ensure their quality and suitability for subsequent analysis. The workflow is designed to address various aspects of data preprocessing, including cleaning, filtering, and encoding, to produce a refined dataset that meets the criteria for rigorous scientific analysis.

Appendix A.1. Ensuring Data Structure

The first stage in the data processing pipeline structures the raw data to remove inconsistencies and prepare the dataset for analysis. The specific actions taken during this stage are as follows:

Column Elimination: Non-essential columns are identified and removed based on predefined criteria outlined in the column structure. This step reduces noise and ensures that only relevant variables are retained.
Date Formatting and Validation: The AUCTION DATE column is standardized to the datetime format. Rows containing invalid or missing dates are excluded from the dataset to maintain temporal accuracy (invalid ‘AUCTION DATE’ rows removed: 4).
Object Type Filtering: The dataset is filtered to include only entries where the OBJECT is classified as “Print.” This restriction narrows the scope of the analysis to graphic, non-unique works (non-print rows removed: 1016).
Artist Name Standardization:
a. Accent Removal: Accented characters in artist names are normalized using the unidecode function.
b. Replacements and Normalization: The script applies a series of regular expression-based replacements to unify various forms of artist names. This includes removing year ranges of their life span, special characters, and standardizing prefixes such as “after”.
c. Name Sorting: Artist names are further normalized by sorting the characters alphabetically, thereby making the order of names and surnames insignificant.
d. Artist Filtering: Rows that are empty or contain erroneous classifications (e.g., “print” in the artist name) are systematically removed. This step enhances the integrity of the artist data (rows removed due to empty ‘ARTIST’ or erroneous classifications: 10,454).
Handling Missing Values:
a. Years are extracted from the YEAR or PERIOD columns using regular expressions, and entries with unresolved or invalid year values are removed. This ensures chronological accuracy within the dataset (rows removed due to invalid ‘YEAR’: 52).
b. Technique Standardization: The TECHNIQUE column is cleaned and standardized to match a predefined list of accepted techniques. Special care was taken to exclude reproductions and retain only original artworks. Entries not meeting these criteria are removed (rows removed due to unwanted ‘TECHNIQUE’: 17,966).
c. Poster Exclusion: Descriptions containing terms related to posters are excluded to avoid irrelevant data.
Dimensional and Price Validation: The TOTAL DIMENSIONS column is standardized to a consistent unit (centimeters), and outliers or erroneous entries (e.g., dimensions equal to zero) are excluded. The PRICE column is converted to a numeric format, with non-numeric values discarded (rows removed due to invalid ‘TOTAL DIMENSIONS’: 1069).
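
The artist name standardization steps listed above can be sketched as follows. This is a minimal illustration and not the authors' script: the regular expressions and function names are hypothetical, and the standard-library `unicodedata` module is used as a stand-in for the `unidecode` function mentioned in the text, so results may differ for characters that have no accent-free decomposition.

```python
import re
import unicodedata


def strip_accents(text):
    """Drop combining marks after NFKD decomposition (stand-in for unidecode)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_artist(raw):
    """Lowercase, remove accents, life-span year ranges, and special characters."""
    name = strip_accents(raw).lower()
    # Assumed pattern for life spans such as "(1893-1983)" or "1893\u20131983".
    name = re.sub(r"\(?\b\d{4}\s*[-\u2013]\s*\d{4}\b\)?", " ", name)
    # Replace anything that is not a letter, digit, or whitespace.
    name = re.sub(r"[^a-z0-9\s]", " ", name)
    return " ".join(name.split())


def matching_key(raw):
    """Order-insensitive key: the sorted characters of the normalized name."""
    return "".join(sorted(normalize_artist(raw).replace(" ", "")))


cleaned = normalize_artist("Joan Mir\u00f3 (1893-1983)")
same_artist = matching_key("Mir\u00f3, Joan") == matching_key("Joan Miro")
```

Sorting the characters makes the key independent of whether the first name or the surname comes first, at the cost of treating any two names that are anagrams of each other as identical.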

Appendix A.2. Filtering Data

This stage applies filtering criteria to the processed data to focus on the most reliable and significant data points.

Dimension Filtering: Entries with TOTAL DIMENSIONS outside the range of 10 to 10,000 cm² were excluded to remove records that are unlikely to occur naturally, which could otherwise skew the analysis (rows removed due to ‘TOTAL DIMENSIONS’ being outside of range: 753).
Price Filtering: Entries with PRICE values exceeding 10,000 were removed to concentrate on transactions within a typical range for prints and to avoid artwork category misclassification (rows removed due to ‘PRICE’ being outside of range: 3).
Year-Based Filtering: Artworks created before 1900 are excluded to limit the analysis to more contemporary works, aligning with the study’s temporal focus (rows removed due to ‘YEAR’ being earlier than 1900: 2305).
Artist Frequency Filtering: Artists with fewer than 10 occurrences in the dataset were excluded. This step ensures sufficient representation of artists, thereby improving the robustness of the analysis (rows removed due to artists with fewer than 10 occurrences: 5266).
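
The four filters above can be expressed compactly. The sketch below is a plain-Python illustration with hypothetical records; the thresholds follow the text, while the function name, the record layout, and the decision to count artist occurrences only after the other filters have been applied are assumptions.

```python
from collections import Counter


def apply_filters(rows, min_area=10, max_area=10_000, max_price=10_000,
                  min_year=1900, min_artist_count=10):
    """Apply dimension, price, and year filters, then drop rare artists."""
    kept = [
        r for r in rows
        if min_area <= r["TOTAL DIMENSIONS"] <= max_area
        and r["PRICE"] <= max_price
        and r["YEAR"] >= min_year
    ]
    # Artist frequency is counted on the rows that survived the filters above.
    counts = Counter(r["ARTIST"] for r in kept)
    return [r for r in kept if counts[r["ARTIST"]] >= min_artist_count]


# Hypothetical records, for illustration only.
rows = (
    [{"ARTIST": "artist_a", "TOTAL DIMENSIONS": 1750, "PRICE": 99, "YEAR": 1974}] * 10
    + [{"ARTIST": "artist_a", "TOTAL DIMENSIONS": 1750, "PRICE": 99, "YEAR": 1890}]
    + [{"ARTIST": "artist_b", "TOTAL DIMENSIONS": 875, "PRICE": 50, "YEAR": 1963}] * 3
    + [{"ARTIST": "artist_b", "TOTAL DIMENSIONS": 5, "PRICE": 50, "YEAR": 1963}]
)
filtered = apply_filters(rows)
```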

Appendix A.3. Data Encoding

This stage transforms categorical data into numerical formats suitable for machine learning models and statistical analysis. Only ordinal encoding was applied, to model the relationship between the values within each individual column.
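
A minimal sketch of ordinal encoding is given below. The text does not specify how the category order was chosen, so the alphabetical ordering used here is an assumption, as are the function names, the example technique labels, and the use of -1 for categories not seen during fitting.

```python
def fit_ordinal(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


def transform_ordinal(values, mapping, unknown=-1):
    """Replace categories by their codes; unseen categories get `unknown`."""
    return [mapping.get(v, unknown) for v in values]


# Hypothetical TECHNIQUE values, for illustration only.
techniques = ["lithograph", "etching", "lithograph", "woodcut"]
mapping = fit_ordinal(techniques)
encoded = transform_ordinal(techniques, mapping)
```

Because the integer codes impose an order on the categories, tree-based models can split on them directly, whereas distance-based and linear models may read an unintended ranking into the codes.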

Appendix B

This appendix lists the hyperparameters used to train each model on each auction dataset.

Table A1. Lists of hyperparameters used for training of each model for AuctionResultsNoImg and AuctionResultsColor datasets.

Model Name	AuctionResultsNoImg—Hyperparameters	AuctionResultsColor—Hyperparameters
KNN	n_neighbors: 9	n_neighbors: 11
DecisionTree	max_depth: 11	max_depth: 9
RandomForest	max_depth: 9	max_depth: 6
	n_estimators: 37	n_estimators: 19
XGBoost	max_depth: 8	max_depth: 8
	alpha: 1.506511545817065e-06	alpha: 1.3322275996207066e-06
	lambda: 0.002960933875866748	lambda: 0.777925934273516
	eta: 0.014895069860775214	eta: 0.03275817414837939
	learning_rate: 0.06612242253192645	learning_rate: 0.11347994198907171
CatBoost	max_depth: 8	max_depth: 10
	l2_leaf_reg: 5.625945091561714	l2_leaf_reg: 18.349484825036434
LightGBM	num_leaves: 3749	num_leaves: 628
	lambda_l1: 0.024938755036347845	lambda_l1: 0.00015355388929401932
	lambda_l2: 6.603948300896136e-07	lambda_l2: 3.3513689033734044
	learning_rate: 0.23346094966403183	learning_rate: 0.025584699772417396
MLP	hidden_dim: 27	hidden_dim: 90
	n_layers: 4	n_layers: 4
	learning_rate: 0.0009817907858545537	learning_rate: 0.0006126544377845088
VIME	p_m: 0.3171774936756223	p_m: 0.3551020199178091
	alpha: 3.1518685147640513	alpha: 5.796587835827752
	K: 15	K: 3
	beta: 5.777807461440739	beta: 2.026360008336819
ModelTree	criterion: gradient	criterion: gradient-renorm-z
	max_depth: 3	max_depth: 2
DeepGBM	n_trees: 100	n_trees: 200
	maxleaf: 64	maxleaf: 64
	loss_de: 3	loss_de: 5
	loss_dr: 0.7	loss_dr: 0.9
DeepFM	dnn_dropout: 0.5645466529889359	dnn_dropout: 0.8953496968333118
SAINT	dim: 64	dim: 256
	depth: 2	depth: 3
	heads: 2	heads: 8
	dropout: 0	dropout: 0.5

Table A2. Lists of hyperparameters used for training of each model for AuctionResultsSVD and AuctionResultsColorSVD datasets.

Model Name	AuctionResultsSVD—Hyperparameters	AuctionResultsColorSVD—Hyperparameters
KNN	n_neighbors: 5	n_neighbors: 19
DecisionTree	max_depth: 8	max_depth: 10
RandomForest	max_depth: 11	max_depth: 12
	n_estimators: 77	n_estimators: 5
XGBoost	max_depth: 5	max_depth: 6
	alpha: 5.857949969161431e-08	alpha: 9.235891162903211e-07
	lambda: 0.34515471928125674	lambda: 1.6580418495949973e-07
	eta: 0.025202041962014803	eta: 0.03645561176974997
	learning_rate: 0.028646710508839136	learning_rate: 0.2496359003501268
CatBoost	max_depth: 10	max_depth: 9
	l2_leaf_reg: 13.293942606581755	l2_leaf_reg: 22.680333733819474
LightGBM	num_leaves: 962	num_leaves: 259
	lambda_l1: 0.002341475216791483	lambda_l1: 5.115908866062836e-07
	lambda_l2: 1.5483601387539192	lambda_l2: 1.7999234961427344e-06
	learning_rate: 0.03971767779263651	learning_rate: 0.0919724612802419
MLP	hidden_dim: 78	hidden_dim: 89
	n_layers: 2	n_layers: 5
	learning_rate: 0.0007311385951868368	learning_rate: 0.0007298057023245878
VIME	p_m: 0.19870045327747068	p_m: 0.3565456402937748
	alpha: 3.491869884196773	alpha: 3.2829661088245
	K: 5	K: 15
	beta: 2.2079468853402586	beta: 0.626661210126865
ModelTree	criterion: gradient	criterion: gradient-renorm-z
	max_depth: 3	max_depth: 3
DeepGBM	n_trees: 100	n_trees: 200
	maxleaf: 64	maxleaf: 64
	loss_de: 3	loss_de: 4
	loss_dr: 0.7	loss_dr: 0.9
DeepFM	dnn_dropout: 0.64790221084881	dnn_dropout: 0.6160828071241174
SAINT	dim: 64	dim: 128
	depth: 6	depth: 2
	heads: 8	heads: 4
	dropout: 0.2	dropout: 0.6

Figure 1. Bar chart comparing the sMAPE metric of KNN model across the datasets.

Figure 2. Bar chart comparing the sMAPE metric of DecisionTree model across the datasets.

Figure 3. Bar chart comparing the sMAPE metric of RandomForest model across the datasets.

Figure 4. Bar chart comparing the sMAPE metric of XGBoost model across the datasets.

Figure 5. Bar chart comparing the sMAPE metric of CatBoost model across the datasets.

Figure 6. Bar chart comparing the sMAPE metric of LightGBM model across the datasets.

Figure 7. Bar chart comparing the sMAPE metric of MLP model across the datasets.

Figure 8. Bar chart comparing the sMAPE metric of VIME model across the datasets.

Figure 9. Bar chart comparing the sMAPE metric of DeepFM model across the datasets.

Figure 10. Bar chart comparing the sMAPE metric of SAINT model across the datasets.

Figure 11. Feature importance plots of top performing XGBoost and RandomForest models. (a) Feature importance plot for XGBoost for NoImg dataset. (b) Feature importance plot for RandomForest for ColorSVD dataset.

Table 1. Number of unique values for each feature and its ratio to total number of values.

Calculated Field	ARTIST	TECHNIQUE	TOTAL-DIMENSIONS	YEAR	PRICE
Unique Values	395	10	2534	123	795
Unique Values/Total Count (%) *	1.55	0.04	9.97	0.48	3.13

* Total size of the data frame after preprocessing resulted in 25,408 rows.

Table 2. Distribution of the numerical core features.

Calculated Field	TOTAL-DIMENSIONS	YEAR	PRICE
Count	25,408	25,408	25,408
Mean	2259.03	1973.6	225.68
Standard deviation	1674.0	21.7	505.40
Minimal value	10.64	1900	1
25% Quantile	875	1963	50
50% Quantile	1750	1974	99
75% Quantile	3401	1985	200
Maximal value	10,000	2023	10,000

Table 3. Comparison of sMAPE scores (in %) for each model across the datasets. For each dataset, the best-performing models are highlighted in bold, and second-best models are underlined.

Method	AuctionResultsNoImg	AuctionResultsColor	AuctionResultsSVD	AuctionResultsColorSVD	CaliforniaHousing
LinearModel	101.33 ± 0.78	101.01 ± 0.84	102.75 ± 1.16	101.13 ± 0.93	28.70 ± 0.41
KNN	61.38 ± 0.35	62.68 ± 0.32	60.18 ± 0.65	60.93 ± 0.53	22.75 ± 0.51
DecisionTree	58.92 ± 0.62	58.39 ± 0.69	56.70 ± 0.78	59.21 ± 0.69	21.70 ± 0.56
RandomForest	61.20 ± 0.60	55.95 ± 0.29	56.94 ± 0.30	55.51 ± 0.80	17.50 ± 0.36
XGBoost	55.11 ± 0.60	57.50 ± 0.61	54.83 ± 0.59	58.76 ± 1.07	14.84 ± 0.25
CatBoost	58.91 ± 0.79	60.45 ± 1.02	58.20 ± 0.64	58.25 ± 0.48	14.92 ± 0.46
LightGBM	57.53 ± 2.32	58.29 ± 2.34	58.04 ± 0.59	57.52 ± 1.22	14.71 ± 0.31
MLP	62.98 ± 0.89	61.86 ± 1.14	63.68 ± 0.32	65.02 ± 0.75	17.52 ± 0.63
VIME	75.29 ± 3.37	66.29 ± 1.59	86.44 ± 2.28	74.09 ± 2.91	19.40 ± 1.69
ModelTree	68.50 ± 0.37	66.59 ± 0.66	71.68 ± 3.62	68.70 ± 1.35	23.86 ± 0.33
DeepGBM	76.92 ± 7.66	77.14 ± 7.40	87.96 ± 10.62	76.18 ± 4.23	35.13 ± 2.11
DeepFM	63.10 ± 0.76	63.45 ± 0.87	63.28 ± 0.25	63.06 ± 0.81	17.75 ± 0.36
SAINT	60.57 ± 0.95	60.75 ± 0.78	62.41 ± 1.19	60.63 ± 1.43	16.64 ± 0.30

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
