1. Introduction
The construction industry stands at a critical inflection point as Industry 5.0 principles drive fundamental transformation from traditional labor-intensive practices toward human-AI collaborative systems [1,2,3]. While artificial intelligence promises substantial operational improvements spanning cost optimization, design automation, project monitoring, and resource management [4,5], successful adoption hinges not merely on technical capability but critically on workforce perception, acceptance, and integration willingness [6,7]. Recent estimates suggest that construction productivity improvements of 30–45% are achievable through AI deployment [8,9], yet adoption rates remain below 15% across European markets [10], with perception barriers cited as the dominant impediment, surpassing technical and financial constraints [11]. Understanding and predicting these perception patterns represents a strategic priority for organizations navigating digital transformation, enabling proactive interventions that address resistance, customize training, and optimize change management resources.
Critically, the transition from Industry 4.0 to Industry 5.0 introduces a fundamental shift in how adoption barriers must be understood and addressed. Industry 4.0 adoption models predominantly focused on technical readiness, infrastructure investment, and process integration—factors amenable to straightforward resource allocation decisions [12,13,14]. In contrast, Industry 5.0 explicitly positions human-centricity alongside sustainability and resilience as co-equal pillars, recognizing that technology value creation depends primarily on effective human–technology collaboration rather than automation capability alone [15,16]. This paradigm shift means that workforce perception constitutes a distinct barrier category that cannot be overcome through technical solutions or capital investment. Unlike infrastructure gaps that can be closed with funding or skill deficits addressable through training, perception barriers reflect attitudinal, cultural, and psychological factors that require targeted communication, trust-building, and demonstrated value before adoption proceeds. Construction industry surveys consistently reveal that even organizations with adequate technical infrastructure and trained personnel report adoption failures attributable to workforce resistance, uncertainty about role changes, and skepticism regarding AI reliability in safety-critical contexts [11,17].
Predicting AI adoption perceptions faces a fundamental methodological challenge characteristic of early-stage technological transitions: data scarcity imposed by limited expert respondent pools, nascent awareness levels, and the specialized nature of industry-specific surveys [12]. Construction industry surveys typically achieve response rates of 8–12%, with meaningful samples rarely exceeding 50–100 respondents despite extensive recruitment efforts [18]. This constraint creates a tension between predictive modeling aspirations and statistical reality—conventional machine learning approaches assume hundreds or thousands of training samples, while practical applications must operate effectively with tens of samples [19,20]. The predominant focus in contemporary machine learning on large-scale datasets and deep architectures provides limited guidance for practitioners confronting small sample industrial contexts, creating a methodological gap between algorithmic capabilities and practical requirements [21].
Existing research on technology acceptance modeling, exemplified by frameworks such as TAM [22], UTAUT [23], and DOI [24], predominantly employs descriptive statistical analysis or structural equation modeling rather than predictive machine learning approaches. Conversely, the machine learning literature extensively addresses large-scale prediction tasks but provides sparse guidance on small sample regimes where the bias-variance trade-off fundamentally shapes algorithm selection [25,26]. This research bridges that gap by developing a methodologically rigorous framework specifically designed for perception prediction under realistic data constraints encountered in Industry 5.0 industrial surveys. The primary objective is to demonstrate that statistically principled approaches can achieve meaningful predictive performance with limited samples by explicitly optimizing the bias-variance trade-off through target simplification, dimensionality reduction, regularized model selection, and appropriate validation protocols [27,28].
This work makes four principal contributions. First, it establishes a complete methodological framework for small sample perception prediction encompassing data engineering, model selection, validation strategy, and performance interpretation specifically adapted to industry survey contexts. Second, it provides empirical evidence that R2 ≈ 0.50 and F1 ≈ 0.68 represent achievable and meaningful performance levels for perception prediction with n = 51 samples, establishing realistic performance expectations for similar applications. Third, it demonstrates practical applicability by enabling actionable predictions supporting targeted interventions in AI adoption initiatives. Fourth, it delivers a replicable blueprint transferable across industries, geographies, and technology adoption contexts where similar data constraints apply.
The remainder of this paper proceeds as follows. Section 2 reviews relevant literature spanning Industry 5.0 transformation dynamics, AI perception factors, technology acceptance frameworks, and machine learning approaches for small sample regimes. Section 3 presents the complete methodological framework, including experimental setup, system architecture, model selection rationale, and validation protocols. Section 4 reports experimental results for both regression and classification tasks with statistical interpretation. Section 5 discusses implications, limitations, and future research directions. Section 6 concludes with a synthesis of contributions and broader significance for Industry 5.0 applications.
3. Experimental Research
This section presents the methodological framework, system architecture, and experimental procedures employed in developing an AI-based perception prediction system for Industry 5.0 applications [70]. All Python (3.14) scripts comprising the system were systematically reviewed to ensure accurate representation of the implemented computational pipeline. The methodology adheres to rigorous statistical learning principles appropriate for small sample regimes, with explicit consideration of the bias-variance trade-off inherent in limited-data scenarios [71].
3.1. Problem Statement and the Aim of Experimental Research
The construction industry is increasingly adopting artificial intelligence (AI) to support planning, design, and project delivery [72]. Recent review evidence confirms that AI methods are being applied across the entire construction value chain—from pre-construction planning and design through construction execution to operations and facility management [73]. In practice, however, reported benefits differ substantially across organizations and job roles. This variability indicates that AI outcomes in construction are shaped not only by the availability of tools but also by the conditions under which they are introduced and used—particularly digital competencies, the intensity of ICT utilization, the extent of AI training and experience, and actual AI usage by both individuals and companies.
From the perspective of Industry 4.0 and the emerging Industry 5.0 framework, AI should be viewed as part of a broader digital ecosystem grounded in connectivity and data exchange, often complemented by IoT-enabled data sources and real-time data flows [74]. The original Industry 4.0 initiative emphasizes connected systems and data-supported process integration [14], while subsequent research highlights the diversity of definitions and implementation approaches across organizations and contexts [75]. In parallel, Industry 5.0 explicitly extends Industry 4.0 by placing stronger emphasis on sustainability, human-centricity, and resilience, where value is created primarily through effective human–technology collaboration rather than automation alone [13]. This is particularly relevant in construction, where project outcomes depend on coordination among multiple stakeholders, changing site conditions, and the quality of day-to-day decision-making.
Although the literature provides a broad overview of AI applications in construction, stakeholders still lack empirically validated approaches that would allow them to estimate the expected impact of AI adoption based on measurable readiness and usage characteristics. This limits evidence-based decisions about investments in AI tools and capability building. To address this gap, the present study operationalizes “AI impact” using (i) a composite AI Impact Index and (ii) two process-oriented outcome dimensions—perceived task automation and perceived cost reduction. These outputs are modeled as a function of digital competencies, ICT utilization intensity, AI training and experience, and AI usage at both individual and organizational levels, while accounting for demographic and organizational characteristics [76,77].
Aim of the experimental research. The aim of this paper is to develop and empirically validate a predictive AI model (a decision support tool) that estimates the expected impact of AI adoption in the construction sector based on digital competencies, ICT utilization intensity, AI training and experience, and AI usage at both individual and organizational levels. The impact is operationalized as (i) a composite AI Impact Index and (ii) two process-oriented outcome dimensions—perceived task automation and perceived cost reduction—while controlling for demographic and organizational characteristics.
Research hypothesis. Based on the literature review establishing that organizational resources, technology exposure, and facilitating conditions influence AI adoption attitudes [23,40], this research tests the hypothesis that company size positively predicts AI perception: larger organizations (medium and large enterprises) are expected to demonstrate more favorable AI adoption perceptions than smaller organizations (micro and small enterprises), reflecting differential resource availability, technology infrastructure, and formal training opportunities. This hypothesis is operationalized through statistical comparison of perception scores across company size categories and through inclusion of company size as a predictor in the regression framework, with results reported in Section 4.2 and Section 4.3.
3.2. Experimental Setup
The experimental foundation rests upon a survey-based dataset comprising perception responses related to artificial intelligence adoption in the construction industry, a sector undergoing significant transformation within the Industry 5.0 paradigm. The dataset contains 51 valid response samples collected through a structured questionnaire administered in 2025, with responses encoded in the Slovak language. The raw dataset encompasses 23 primary variables organized into four conceptual categories. Demographic variables include age group measured across five ordinal levels, company size spanning four levels from micro-enterprise to large corporation, job position distributed across seven hierarchical levels from preparation specialist to executive, and work experience captured in five duration bands ranging from less than one year to more than ten years. Technology exposure variables comprise ICT utilization level, personal AI usage, digital competencies, company AI adoption maturity, organizational digitalization level, and AI training exposure, with each measured on ordinal scales ranging from one to five. Perceived AI impact variables capture domain-specific impact assessments covering budgeting and cost management, design and planning, construction project management, marketing and customer relationship management, and material delivery and logistics, with each rated numerically. Finally, perception target variables constitute the primary outcome measures, including perceived AI impact on cost reduction serving as the primary regression target, alongside six classification targets addressing automation potential, materials optimization, project monitoring, human resources management, administrative burden reduction, and intelligent planning.
The sample size of n = 51 represents a boundary condition that fundamentally shapes the methodological approach. This constraint is not a limitation of the research design but rather reflects the realistic data availability in specialized industrial surveys where respondent pools are inherently restricted. The construction industry’s fragmented structure, combined with the nascent state of AI adoption awareness, naturally limits accessible expert respondents. With p = 15 baseline features and n = 51 samples, the effective samples-per-feature ratio approximates n/p ≈ 3.4, which falls below the commonly recommended threshold of 10–20 samples per feature for stable parameter estimation. This ratio necessitates explicit dimensionality reduction and regularization strategies to prevent overfitting and ensure generalization validity.
Target Variable Definitions
The prediction framework addresses one regression target and six classification targets, each representing distinct dimensions of perceived AI impact in construction operations.
Table 1 presents the complete target variable specification.
The primary regression target—perceived AI impact on cost reduction—was selected based on its centrality to construction business decisions and its demonstrated variance in preliminary data exploration. The six classification targets were originally measured on 5-point Likert scales but consolidated to 3-class ordinal categories (low/medium/high) to address severe class imbalance, as detailed in Section 3.3. All targets capture forward-looking perceptions rather than retrospective assessments, consistent with the predictive decision support purpose of the framework.
3.3. System Architecture
The system architecture implements a four-phase optimization strategy specifically designed for small sample statistical learning. Phase one addresses data engineering through target simplification via ordinal class consolidation from five classes to three classes, combined with feature dimensionality reduction through theoretically grounded composite index construction. Phase two governs model selection, enforcing exclusive use of low-variance, regularized, and interpretable models while explicitly excluding high-variance architectures such as random forests, gradient boosting, and deep neural networks based on bias-variance considerations. Phase three establishes the validation strategy through implementation of leave-one-out cross-validation as the primary evaluation protocol, maximizing training data utilization while providing nearly unbiased performance estimates. Phase four handles ordinal target characteristics through class weighting for residual imbalance and ordinal-appropriate regression and classification methods.
The preprocessing module implements systematic encoding transformations for all categorical variables, mapping Slovak-language survey responses to numerical ordinal scales through pattern-matching functions that handle linguistic variations and encoding artifacts. The transformation preserves ordinal relationships essential for downstream statistical analysis.
3.3.1. Encoding Procedures
All survey variables were encoded according to their measurement properties. Ordinal variables (5-point Likert scales for perception items, ordered categories for age groups and experience levels) were mapped to integer sequences preserving rank order. Specifically:
Age groups: [18–24, 25–34, 35–44, 45–54, 55+] → [1, 2, 3, 4, 5]
Company size: [Micro (1–10), Small (10–50), Medium (50–250), Large (>250)] → [1, 2, 3, 4]
Work experience: [<1 year, 1–3 years, 3–5 years, 5–10 years, >10 years] → [1, 2, 3, 4, 5]
Perception scales: 5-point Likert items [Strongly Disagree → Strongly Agree] → [1, 2, 3, 4, 5]
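A minimal sketch of this encoding step is shown below, assuming pandas is used for data handling; the dictionary contents mirror the mappings listed above, while the function name and label spellings are illustrative rather than the study’s actual identifiers.

```python
# Illustrative rank-preserving encoding of ordered survey categories.
import pandas as pd

AGE_MAP = {"18-24": 1, "25-34": 2, "35-44": 3, "45-54": 4, "55+": 5}
SIZE_MAP = {"Micro (1-10)": 1, "Small (10-50)": 2,
            "Medium (50-250)": 3, "Large (>250)": 4}
EXPERIENCE_MAP = {"<1 year": 1, "1-3 years": 2, "3-5 years": 3,
                  "5-10 years": 4, ">10 years": 5}

def encode_ordinal(series: pd.Series, mapping: dict) -> pd.Series:
    """Map ordered categorical labels to integers, preserving rank order."""
    return series.map(mapping).astype("Int64")
```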
Nominal variables (job position) were treated as ordinal approximations based on hierarchical level, though sensitivity analysis confirmed that alternative encodings produced equivalent model performance. Missing values were minimal (<3% of observations) and handled through listwise deletion given the small sample size, as imputation methods introduce additional variance in small sample contexts [52].
The original five-class perception targets exhibited severe class imbalance, with minority classes containing as few as one to two samples, and distributions that violate statistical assumptions required for reliable classifier training. The three-class consolidation scheme addresses this by mapping classes one and two to a low category representing negative or skeptical perception, class three to a medium category representing neutral or uncertain perception, and classes four and five to a high category representing positive perception. This transformation ensures minimum class sizes of eight to ten samples, enabling stable parameter estimation and meaningful cross-validation fold composition.
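A minimal sketch of the consolidation map under these assumptions (integer-coded 5-point responses; the function name is illustrative):

```python
# 5-class -> 3-class ordinal consolidation described above:
# classes 1-2 -> low (1), class 3 -> medium (2), classes 4-5 -> high (3).
CONSOLIDATION_MAP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

def consolidate_target(y_5class):
    """Collapse 5-point Likert targets into 3 ordinal classes."""
    return [CONSOLIDATION_MAP[v] for v in y_5class]
```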
Three composite indices were constructed to reduce the feature space from 15 baseline features to seven optimized features (Table 2). The AI Experience Index represents the arithmetic mean of personal AI utilization, company AI adoption, and AI training level, capturing complementary aspects of individual AI ecosystem exposure that demonstrate empirical correlation with typical Spearman correlation coefficients exceeding 0.4. The Digitalization Index composites ICT utilization, organizational digitalization level, and self-reported digital competencies, capturing the digital environment maturity surrounding each respondent. The AI Impact Index calculates the mean of five domain-specific AI impact ratings spanning budgeting, design, project management, marketing, and logistics, measuring a common latent construct of perceived AI operational impact across business functions. The optimized feature set achieves an improved samples-per-feature ratio of n/p ≈ 7.3, substantially enhancing estimation stability.
Equal weighting was applied across component variables based on: (a) absence of theoretical justification for differential weights, (b) similar variance contributions across components, and (c) simplicity and reproducibility considerations. All component variables were measured on comparable 1–5 ordinal scales, eliminating the need for standardization prior to aggregation.
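An illustrative construction of the three indices is sketched below; the column names are assumptions standing in for the survey’s actual field identifiers, and each index is a plain row-wise mean, consistent with the equal-weighting rationale above.

```python
# Sketch of equally weighted composite index construction (assumed columns).
import pandas as pd

def add_composite_indices(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["AI_Experience_Index"] = df[
        ["AI_Util_Personal_Numeric", "AI_Util_Company_Numeric",
         "AI_Training_Numeric"]].mean(axis=1)
    df["Digitalization_Index"] = df[
        ["ICT_Utilization_Numeric", "Org_Digitalization_Numeric",
         "Digital_Competencies_Numeric"]].mean(axis=1)
    df["AI_Impact_Index"] = df[
        ["AI_Impact_Budgeting", "AI_Impact_Design", "AI_Impact_ProjectMgmt",
         "AI_Impact_Marketing", "AI_Impact_Logistics"]].mean(axis=1)
    return df
```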
As an alternative to theory-driven composites, recursive feature elimination (RFE) was applied to select the most predictive subset directly from the 15 baseline features. Using Lasso regression as the base estimator with LOOCV for evaluation, RFE identified six features that maximized generalization performance: Age_Numeric, Company_Size_Numeric, ICT_Utilization_Numeric, AI_Util_Personal_Numeric, AI_Training_Numeric, and AI_Impact_Budgeting. This data-driven selection achieved R2 = 0.501, marginally outperforming the composite index approach (R2 = 0.45–0.48) and providing the final feature configuration used in reported results.
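A hedged sketch of this selection step using scikit-learn’s RFE with a Lasso base estimator follows; the alpha value is illustrative, and X_baseline and y stand for the 15 encoded features and the regression target.

```python
# Sketch of RFE with a Lasso base estimator retaining six features.
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso

rfe = RFE(estimator=Lasso(alpha=0.1, max_iter=10_000),  # illustrative alpha
          n_features_to_select=6)
rfe.fit(X_baseline, y)                 # X_baseline: DataFrame of 15 features
selected = X_baseline.columns[rfe.support_]
print(list(selected))                  # e.g., Age_Numeric, Company_Size_Numeric, ...
```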
3.3.2. Target Simplification Justification
The consolidation from five ordinal classes to three classes was evaluated against alternative schemes to ensure methodological robustness.
Table 3 presents the comparative analysis.
The adopted 3-class scheme (low: 1–2, medium: 3, high: 4–5) provides the best balance between class granularity and statistical stability. While 2-class consolidation yields marginally higher F1, it sacrifices the ability to distinguish neutral perceptions from positive ones—an important distinction for intervention targeting. The 5-class original scheme produced unstable estimates with high variance across LOOCV folds, confirming that minority classes with n < 5 samples prevent reliable parameter estimation.
3.4. Experimental Configuration
The fundamental challenge in small sample learning is the bias-variance decomposition of expected prediction error, expressed as E[(y − ŷ)^2] = Bias^2 + Variance + σ^2_noise. For n = 51 samples, high-complexity models such as random forests, gradient boosting, and deep neural networks exhibit low bias but critically high variance, leading to severe overfitting. The implemented strategy accepts moderately higher bias to achieve substantial variance reduction.
3.4.1. High Variance Baseline Comparison
To empirically validate the model selection rationale, constrained high variance models were trained and evaluated under identical LOOCV conditions.
Table 4 presents the comparative results.
Even with aggressive depth constraints (max_depth = 3), ensemble methods substantially underperform regularized linear models. Random forest achieves R2 = 0.412 (18% lower than Lasso), while gradient boosting produces R2 = 0.292 (42% lower). This performance gap demonstrates that for n = 51 samples, the variance component dominates: ensemble methods’ capacity to capture complex interactions becomes a liability rather than an asset, as they fit noise patterns that do not generalize. The results empirically confirm the theoretical bias-variance trade-off rationale and justify the exclusive use of regularized linear models.
The regression model suite comprises five carefully selected algorithms. Ridge regression employs L2 regularization, where coefficient shrinkage reduces effective model complexity, with regularization strength α selected via five-fold internal cross-validation from the range α ∈ [10^−3, 10^3]. Lasso regression utilizes L1 regularization that induces sparsity through coefficient elimination, providing implicit feature selection with maximum iterations set to 10,000 to ensure convergence. The k-nearest neighbors regressor with k = 7 implements a local averaging approach robust to noise, where the value k ≈ 7 follows established heuristics for small samples, using the distance-weighted Manhattan metric. A shallow decision tree with a maximum depth of three constrains complexity to a maximum of eight leaf nodes, ensuring an average leaf sample size of approximately six. Finally, an ensemble regressor performs simple averaging of Ridge and k-NN predictions, reducing variance without increasing individual model complexity through a statistically sound approach for small sample sizes.
The classification model suite similarly comprises algorithms selected for low variance characteristics. Logistic regression with L2 regularization provides calibrated probability outputs with balanced class weighting to address residual imbalance. The Ridge classifier offers fast linear classification with L2 regularization. The k-NN classifier with k = 5 uses an odd value to avoid ties, while distance weighting enhances performance. Gaussian Naive Bayes operates under a strong independence assumption that induces high bias but extremely low variance, often proving effective for small datasets. A shallow decision tree with a maximum depth of three provides interpretable decision rules with class balancing. Random forests with default parameters, gradient boosting machines, support vector machines with complex kernels, and neural network architectures were explicitly excluded from consideration. These exclusions are methodologically justified because with n = 51 samples, such models would fit noise rather than signal, producing optimistic training metrics that fail to generalize.
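The suites above could be assembled as follows; hyperparameter values mirror the text, while remaining settings are scikit-learn defaults and the dictionary layout is an assumption about the implementation.

```python
# Sketch of the low-variance model suites described in the text.
from sklearn.linear_model import Ridge, Lasso, LogisticRegression, RidgeClassifier
from sklearn.neighbors import KNeighborsRegressor, KNeighborsClassifier
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import VotingRegressor

regressors = {
    "ridge": Ridge(alpha=1.0),                 # alpha tuned via inner CV
    "lasso": Lasso(alpha=0.1, max_iter=10_000),
    "knn": KNeighborsRegressor(n_neighbors=7, weights="distance",
                               metric="manhattan"),
    "tree": DecisionTreeRegressor(max_depth=3, max_leaf_nodes=8),
}
# Simple equal-weight averaging of Ridge and k-NN predictions
regressors["ensemble"] = VotingRegressor(
    [("ridge", regressors["ridge"]), ("knn", regressors["knn"])])

classifiers = {
    "logreg": LogisticRegression(class_weight="balanced", max_iter=1000),
    "ridge_clf": RidgeClassifier(class_weight="balanced"),
    "knn": KNeighborsClassifier(n_neighbors=5, weights="distance"),
    "gnb": GaussianNB(),
    "tree": DecisionTreeClassifier(max_depth=3, class_weight="balanced"),
}
```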
Leave-one-out cross-validation serves as the primary validation strategy, computed as LOOCV Error = (1/n) Σ_{i=1}^{n} L(y_i, ŷ_{−i}), where ŷ_{−i} denotes the prediction for sample i from a model trained on all samples except i. This approach offers substantial advantages for the given sample size: it maximizes training set size by using 50 samples per fold, provides nearly unbiased performance estimates, and operates deterministically without random fold assignment. For classification tasks with minimum class counts below five samples, LOOCV becomes mandatory, while stratified five-fold cross-validation preserves class proportions across folds when class sizes permit.
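A sketch of this protocol is shown below. Note that with a single held-out sample per fold, per-fold R2 is undefined, so predictions are pooled across all 51 folds before scoring; X, y and the illustrative alpha are assumptions.

```python
# Pooled-prediction LOOCV evaluation for the regression target.
from sklearn.linear_model import Lasso
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import r2_score, mean_absolute_error

lasso = Lasso(alpha=0.1, max_iter=10_000)          # illustrative alpha
y_pred = cross_val_predict(lasso, X, y, cv=LeaveOneOut())
print("LOOCV R2 :", r2_score(y, y_pred))
print("LOOCV MAE:", mean_absolute_error(y, y_pred))
```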
3.4.2. Validation Protocol and Leakage Prevention
The validation framework implements strict separation between training and evaluation to prevent information leakage:
Outer evaluation: leave-one-out cross-validation (51 folds, each holding out one sample for testing)
Inner tuning: 5-fold stratified cross-validation for hyperparameter selection within each outer fold
Random state: fixed seed (42) ensures reproducibility across all stochastic operations
Feature selection: when RFE is applied, selection occurs within each outer fold using only training data
All preprocessing transformations (encoding and composite index calculation) use globally defined mappings that do not depend on target values, eliminating preprocessing-induced leakage. Regularization parameters (α for Ridge/Lasso) are selected via GridSearchCV within the inner loop, with separate tuning for each outer fold to prevent optimistic bias from parameter sharing.
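A hedged sketch of this leakage-safe nesting follows: RFE and the alpha grid search are refit inside every outer fold on training data only. X and y are assumed to be NumPy arrays; the grid and the inner KFold (rather than stratified splitting, since this is the regression target) are illustrative.

```python
# Nested evaluation: GridSearchCV inner tuning inside an outer LOOCV loop.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV, KFold, LeaveOneOut
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score

pipe = Pipeline([
    # Feature selection is refit per outer fold, so the held-out sample
    # never influences which features are kept.
    ("rfe", RFE(Lasso(alpha=0.1, max_iter=10_000), n_features_to_select=6)),
    ("model", Lasso(max_iter=10_000)),
])
inner = GridSearchCV(pipe, {"model__alpha": np.logspace(-3, 3, 13)},
                     cv=KFold(5, shuffle=True, random_state=42))

preds = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X):
    inner.fit(X[train_idx], y[train_idx])   # tuning sees training folds only
    preds[test_idx] = inner.predict(X[test_idx])
print("nested LOOCV R2:", r2_score(y, preds))
```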
Three complementary feature selection methods were implemented and consolidated to ensure robust variable selection. Spearman rank correlation is appropriate for ordinal data as it does not assume an interval scale or normality, with features ranked by absolute correlation coefficient with the target. Mutual information captures non-linear dependencies between features and target, computed using k = 5 nearest neighbors estimation. Recursive feature elimination iteratively removes the least important features based on Ridge regression coefficients until six features remain. The final feature ranking is computed as the average rank across all three methods, ensuring robust selection not dependent on any single criterion’s assumptions.
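A compact sketch of this consolidation, assuming X is a pandas DataFrame of encoded features; the rank-averaging convention (lower average rank = more important) follows the description above.

```python
# Consolidated feature ranking: Spearman + mutual information + RFE ranks.
import pandas as pd
from scipy.stats import spearmanr
from sklearn.feature_selection import mutual_info_regression, RFE
from sklearn.linear_model import Ridge

def consolidated_ranking(X: pd.DataFrame, y) -> pd.Series:
    spear = X.apply(lambda col: abs(spearmanr(col, y)[0]))
    mi = pd.Series(mutual_info_regression(X, y, n_neighbors=5,
                                          random_state=42), index=X.columns)
    rfe = RFE(Ridge(), n_features_to_select=6).fit(X, y)
    rfe_score = pd.Series(-rfe.ranking_, index=X.columns)  # higher = better
    ranks = pd.concat([spear.rank(ascending=False),
                       mi.rank(ascending=False),
                       rfe_score.rank(ascending=False)], axis=1)
    return ranks.mean(axis=1).sort_values()  # lowest average rank = best
```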
3.5. Experimental Procedure
The training procedure executes a systematic sequence beginning with data loading and preprocessing, where raw CSV data is loaded, cleaned of empty rows, and transformed through the encoding pipeline. Class distribution analysis follows, comparing original versus simplified target distributions to validate the three-class consolidation decision. Baseline evaluation trains all models on 15 original features with five-class targets to establish performance benchmarks. The optimized evaluation phase retrains models on seven composite features, retaining the original five-class target for regression to preserve prediction granularity while using three-class targets for classification tasks. A hybrid approach applies recursive feature elimination to select the top six features from the original fifteen, providing a data-driven feature subset alternative to the theory-driven composite indices. Advanced methods, including ensemble models combining Ridge and k-NN averaging, along with ordinal-aware methods where available, are evaluated for marginal improvements. Finally, a hyperparameter tuning experiment employing Optuna (v0.20.0) for nested cross-validation with 100 trials demonstrates the performance ceiling, using five-fold cross-validation in the inner loop for parameter selection and LOOCV in the outer loop for unbiased evaluation.
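A compact sketch of the inner tuning objective is given below; the outer LOOCV evaluation proceeds as in the earlier nested protocol. The search space, the Ridge choice, and the variable names X_train, y_train (the per-outer-fold training data) are illustrative, and the API shown targets recent Optuna releases (the pinned v0.20.0 would use trial.suggest_loguniform instead of suggest_float with log=True).

```python
# Sketch of an Optuna objective for the inner five-fold tuning loop.
import optuna
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

def objective(trial):
    alpha = trial.suggest_float("alpha", 1e-3, 1e3, log=True)
    scores = cross_val_score(Ridge(alpha=alpha), X_train, y_train,
                             cv=KFold(5, shuffle=True, random_state=42),
                             scoring="neg_mean_absolute_error")
    return scores.mean()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=100)   # inner loop of the nested scheme
best_alpha = study.best_params["alpha"]
```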
The computational complexity analysis reveals tractable resource requirements. Ridge and Lasso regression require O(p^2·n + p^3) time for coefficient computation via normal equations or iterative solvers, which with p = 7 and n = 51 is effectively constant-time. The k-NN algorithm requires O(n·p) time per prediction for brute-force distance computation and O(n^2·p) for full LOOCV, which remains acceptable for the given sample size. Decision trees require O(n·p·log n) time for training with depth constraints. The LOOCV protocol multiplies training cost by n, but with simple models this remains tractable with total execution under 60 s on standard hardware. Space complexity analysis shows feature matrices requiring O(n·p) ≈ 51 × 7 = 357 floating-point values, while model parameters for Ridge store p coefficients and k-NN stores the entire training set at O(n·p), yielding a total memory footprint under one megabyte. The computational approach prioritizes methodological correctness over algorithmic efficiency, which is appropriate given the small data scale where even exhaustive methods complete rapidly.
4. Results
The primary regression task predicts perceived AI impact on cost reduction, with performance reported using the coefficient of determination defined as R2 = 1 − Σ_i (y_i − ŷ_i)^2 / Σ_i (y_i − ȳ)^2. The achieved regression performance of R2 = 0.501 was obtained using Lasso regression with RFE-selected features under LOOCV evaluation, with a corresponding Mean Absolute Error of 0.551 and Root Mean Square Error of 0.709, indicating predictions deviate by less than one ordinal class on average. An R2 of 0.501 indicates that the model explains approximately 50 percent of variance in perception responses, which for perception prediction tasks based on survey data represents strong performance. This interpretation rests on three considerations. First, irreducible noise in survey responses limits predictable variance because human perception is inherently variable, with the same respondent potentially providing different answers on different occasions due to mood, context, or question interpretation, establishing a hard ceiling regardless of model sophistication. Second, high dimensionality of latent factors means that perception formation involves psychological, social, and experiential factors not fully captured by available survey variables, with unmeasured confounders contributing to residual variance. Third, sample size constraints mean that with n = 51, even optimal models exhibit estimation variance, with the true population R2 likely lying within ±0.10 of the observed value.
4.1. Regularized Model Comparison
Table 5 presents the side-by-side comparison of regularized regression models using identical LOOCV evaluation.
Lasso regression achieves the highest R2 (0.501), marginally outperforming Ridge (0.494) and Elastic Net (0.493). The performance differences are small (ΔR2 < 0.01), consistent with expectations for regularized linear models on similar feature sets. Lasso’s slight advantage likely reflects its implicit feature selection property (L1 penalty inducing sparsity), which provides additional regularization benefit in the small sample context.
4.2. Feature Importance Analysis
To support interpretability and align findings with technology acceptance constructs, feature importance was assessed through both permutation importance and coefficient magnitude analysis.
Table 6 presents the ranked importance of RFE-selected features.
The feature importance rankings align with established technology acceptance constructs [22,23]. Personal AI usage (β = 0.359) emerges as the strongest predictor, consistent with TAM’s “perceived usefulness” construct—direct experience with AI tools shapes expectations of future value. AI impact on budgeting (β = 0.310) reflects prior positive AI outcomes serving as a trust proxy. ICT utilization (β = 0.304) captures general digital competency aligned with “perceived ease of use”—respondents comfortable with digital tools anticipate smoother AI integration. Company size (β = 0.207) represents organizational facilitating conditions, confirming the company-size hypothesis developed earlier. Age emerges as a weak predictor (β = 0.084), suggesting that chronological age is less influential than experiential factors once AI exposure is controlled.
4.3. Sample Characteristics and Distribution
Table 7 presents the distribution of respondents across company size categories, revealing a concentration in medium and large enterprises that reflects the survey recruitment strategy targeting organizations with sufficient AI exposure to provide informed responses.
Table 8 shows the distribution of the primary target variable—perceived AI impact on cost reduction—across the five-point ordinal scale, demonstrating the positive skew characteristic of perception surveys where respondents tend toward favorable assessments.
The three-class consolidation scheme transforms this distribution as shown in Table 9, addressing the statistical instability of minority classes while preserving meaningful ordinal distinctions.
4.4. Company Size and AI Perception Analysis
A central hypothesis of this research posits that organizational scale influences AI adoption perceptions, with larger companies expected to demonstrate more positive attitudes due to greater resource availability, technology exposure, and organizational readiness.
Table 10 presents descriptive statistics supporting this hypothesis.
The data reveal a monotonic increase in mean perception scores with company size, from 2.75 for micro-enterprises to 4.31 for large corporations.
Table 11 presents the aggregated comparison between smaller and larger organizations.
Statistical Tests:
Mann–Whitney U = 89.5, p < 0.001
Cohen’s d = 1.47 (large effect)
Spearman ρ = 0.52, p < 0.001
The observed difference of 1.26 points on the 5-point scale represents a substantial and statistically significant effect. Cohen’s d of 1.47 exceeds the conventional threshold for a large effect (d > 0.8), indicating that company size explains meaningful variance in AI perception beyond chance. The Mann–Whitney U test confirms statistical significance (p < 0.001), while the Spearman correlation (ρ = 0.52) indicates a moderate-to-strong monotonic relationship between organizational scale and AI perception.
Table 12 presents the cross-tabulation of company size against consolidated perception classes, revealing the distributional patterns underlying the mean differences.
Chi-square test: χ2 = 18.01, df = 6, p = 0.006
The chi-square test confirms that the association between company size and perception class is statistically significant (p = 0.006). Notably, 92.3% of large company respondents fall in the high perception category compared to only 25.0% of micro-enterprise respondents—a striking 67-percentage-point gap that underscores the practical significance of organizational scale as a perception determinant.
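The reported tests can be reproduced with SciPy as sketched below; small and large stand for perception-score arrays of micro/small versus medium/large firms, company_size and perception for the respondent-level vectors, and crosstab for the Table 12 counts, all assumptions about variable naming.

```python
# Reproducing the reported group-difference statistics.
import numpy as np
from scipy import stats

u_stat, p_u = stats.mannwhitneyu(small, large, alternative="two-sided")
rho, p_rho = stats.spearmanr(company_size, perception)

def cohens_d(a, b):
    """Cohen's d with pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * np.var(a, ddof=1) +
                      (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2))
    return (np.mean(b) - np.mean(a)) / pooled

d = cohens_d(small, large)
chi2, p_chi, dof, _ = stats.chi2_contingency(crosstab)  # counts from Table 12
```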
4.5. Hypothesis Interpretation and Practical Implications
The finding that larger companies demonstrate significantly higher AI perception scores (Δ = 1.26 points, p < 0.001) provides partial support for the hypothesis that organizational scale influences AI adoption attitudes. However, this result should not be interpreted as suggesting that AI investment is redundant for smaller companies. Rather, the evidence suggests differentiated adoption strategies:
Small companies require targeted AI solutions: the lower perception scores among micro and small enterprises may reflect legitimate concerns about implementation complexity, resource constraints, and uncertain ROI at smaller scales. AI solutions for smaller companies should emphasize ease of implementation, minimal infrastructure requirements, and rapid time-to-value.
ROI expectations should be adjusted for scale: larger organizations benefit from economies of scale in AI deployment, amortizing fixed implementation costs across larger operations. Smaller companies should calibrate expectations accordingly, focusing on high-impact, narrowly scoped applications rather than enterprise-wide transformations.
Cloud-based AI services may be more appropriate than infrastructure investments: the perception gap may partially reflect awareness of capital-intensive AI implementations unsuitable for smaller organizations. Cloud-based, subscription-model AI services lower barriers to entry and may be more appropriate for companies lacking dedicated IT infrastructure.
Training and exposure drive perception improvements: the strong correlation between company size and AI perception likely reflects differential exposure to AI technologies and training opportunities. Targeted educational initiatives for smaller company employees could narrow this perception gap.
4.6. Classification Performance
Six perception classification targets were evaluated using the weighted F1-score, calculated as F_1 = 2 · (Precision · Recall) / (Precision + Recall). The achieved average F1-score of 0.681 reflects individual target performance ranging from F1 = 0.598 for human resources perception, which proved most challenging due to class imbalance, to F1 = 0.756 for administrative burden perception, with logistic regression using balanced class weights achieving the best results on most targets. The weighted F1-score of 0.681 substantially exceeds the stratified random baseline of approximately 0.35 to 0.40, demonstrating that learned models capture meaningful signal beyond class frequency memorization. For three-class ordinal classification with limited samples, this performance indicates effective class separation, where models successfully distinguish between low, medium, and high perception categories using available features; robustness to imbalance, where class weighting successfully mitigates remaining imbalance after three-class consolidation; and generalization validity, where LOOCV evaluation ensures reported metrics reflect out-of-sample performance without optimistic bias.
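A sketch of this evaluation loop follows, using the same pooled-prediction LOOCV convention as the regression task; the targets dictionary mapping the six consolidated 3-class outcomes is an assumed data structure.

```python
# Weighted-F1 LOOCV evaluation across the six classification targets.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import f1_score

clf = LogisticRegression(class_weight="balanced", max_iter=1000)
for name, y_cls in targets.items():        # six consolidated 3-class targets
    y_hat = cross_val_predict(clf, X, y_cls, cv=LeaveOneOut())
    print(name, "weighted F1:",
          round(f1_score(y_cls, y_hat, average="weighted"), 3))
```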
4.7. Performance Ceiling Analysis
Extensive hyperparameter tuning via Optuna with nested cross-validation confirmed that R2 ≈ 0.50 to 0.55 represents the achievable ceiling for this dataset. The gap between optimistic non-nested and honest nested evaluation estimates approached 0.05 to 0.08, highlighting overfitting risk even with 100 tuning trials. Expecting R2 > 0.55 is unrealistic without fundamental changes to data collection for three reasons. The Bayes error floor established by survey response noise creates irreducible error independent of model choice. Effective degrees of freedom constraints with n = 51 and p_eff ≈ 6 to 7 features necessarily limit model flexibility. Missing explanatory variables mean that perception formation depends on factors (personality traits, prior experiences, and organizational culture) that are not measured in the current instrument.
4.8. Decision Support Interface Outputs and Correlation Insights
To complement the quantitative model evaluation, the developed system provides an interpretable decision support interface that translates model inputs and composite indices into user-facing profile outputs. The interface is designed to support practical interpretation of predicted perceptions and expected impacts at both the individual respondent level and the organizational (company) level, thereby enabling targeted interventions aligned with Industry 5.0’s human-centric adoption logic.
Figure 1 presents the Individual Profile Assessment view, which allows users to enter basic demographic descriptors (job position, work experience, and age group) together with key readiness variables (digital competencies, personal AI usage, AI training, and ICT utilization). The system subsequently generates an Individual Competency Profile (radar chart) and an Individual AI Readiness score (gauge), computed from the composite readiness structure used in the modeling pipeline. The visualization provides an immediate summary of how a respondent’s capability and exposure profile relate to expected AI adoption perceptions.
From an interpretability standpoint, this output supports two main functions:
- (i) it clarifies which readiness components dominate the respondent’s profile (e.g., stronger digital competencies but only moderate AI usage), and
- (ii) it enables practical discussion of interventions (e.g., training or structured exposure) without requiring stakeholders to interpret model coefficients directly.
Figure 2 shows the company characteristics and expected AI impact assessment views. The tool captures company-level descriptors (company size, organizational digitalization level, and company AI usage level) and links them to two outputs:
- (i) a company digital maturity indicator (gauge-based digitalization index), and
- (ii) an expected AI impact profile across operational domains (budgeting, design, project management, marketing, and logistics), consistent with the AI Impact Index construction described in the methodology.
This view is particularly useful because it aligns model reasoning with decision contexts familiar to managers: instead of predicting abstract “AI adoption”, the tool expresses expected impacts in recognizable functional areas. The visual profile also supports the interpretation of heterogeneity observed in the dataset: even with similar demographic composition, organizations may differ substantially in digital maturity and AI usage, which is reflected in expected impact patterns.
To support managerial interpretation, the interface includes a hypothesis-oriented comparison focusing on organizational scale.
Figure 3 illustrates the Hypothesis Analysis: small company AI investment, where AI perception is compared across company size categories using a consistent gauge visualization. The displayed pattern is monotonic, with higher perceived AI readiness/perception associated with larger company size categories, which is consistent with the statistical results reported earlier (significant differences between smaller and larger organizations).
Importantly, the interface does not present this finding as a simplistic “small companies should not invest” conclusion. Instead, it provides an ROI Considerations by Company Size view that frames adoption as a scale-sensitive decision problem. Table 13 emphasizes proportional investment, realistic ROI horizons, and matching solution complexity to organizational resources (e.g., lightweight SaaS automation and assistive AI for micro/small firms versus integrated solutions and custom modeling for medium/large firms). This framing is consistent with the empirical finding that perception and readiness are not uniform across organizational scale, and it translates statistical evidence into a practical decision narrative.
Figure 4 presents a correlation heatmap summarizing relationships among demographic factors, readiness variables, composite indices, and perception/impact outcomes. The results support the conceptual logic embedded in the predictive framework: the AI Impact Index and perception-related outcomes show meaningful positive associations with AI experience/exposure, ICT utilization, and AI usage intensity, which indicates that perceived value is linked to actual exposure and capability-building rather than to demographic factors alone. In practical terms, this strengthens the interpretation that adoption perceptions can be improved through targeted interventions such as training, structured experimentation, and increased workflow integration of AI tools.
The heatmap also illustrates that organizational characteristics contribute to perception patterns, supporting the subgroup results presented in the company-size analysis. This provides a coherent triangulation: (i) statistical tests show significant group differences, (ii) model performance indicates predictable structure in the data, and (iii) correlation patterns confirm that readiness and usage variables align with perceived impact dimensions.
5. Discussion
5.1. Conclusion Derived from Experimental Results
The experimental results validate the methodological framework’s effectiveness for small sample perception prediction in Industry 5.0 contexts. The achieved regression performance of R2 = 0.501 and classification performance of F1 = 0.681 approach the empirically estimated performance ceiling (Section 4.7) given the boundary conditions of limited sample size and high-dimensional perception targets.
The methodological contributions of this work span four key areas. Bias-variance optimization through explicit model selection based on variance reduction rather than flexibility maximization proved essential, with low variance models, including Ridge, k-NN with moderate neighborhood size, and shallow trees consistently outperforming complex alternatives that would overfit 51 samples. Target simplification through five-class to three-class consolidation addressed statistical instability arising from severely imbalanced minority classes while preserving meaningful ordinal distinctions. Composite index construction employing theory-driven feature aggregation reduced dimensionality from 15 to seven features, improving the samples-per-feature ratio from 3.4 to 7.3 without sacrificing predictive signal. LOOCV validation maximized training data utilization and provided nearly unbiased performance estimates appropriate for the sample size.
Evidence confirming successful overfitting prevention includes regularization parameters selected via internal cross-validation rather than manual tuning, LOOCV ensuring each test prediction uses only training data for that fold, nested cross-validation for hyperparameter tuning experiments isolating tuning decisions from final evaluation, and baseline comparisons with the mean predictor and stratified random classifier verifying that models learn beyond trivial patterns.
5.1.1. Connection to Prior Literature
The findings substantively extend, confirm, and provide nuance to prior work on technology acceptance in construction contexts. The achieved R2 = 0.501 compares favorably to benchmarks from related small sample perception studies: customer satisfaction prediction typically achieves R2 ≈ 0.35–0.50 [48], while healthcare quality perception studies report R2 ≈ 0.45–0.60 [48]. The current results position construction AI perception prediction within the upper range of comparable domains, suggesting that the predictive signal in AI adoption perceptions is neither weaker nor stronger than other attitudinal constructs.
The company-size effect (Cohen’s d = 1.47) extends findings from Tjebane et al. [40], who reported qualitative differences in AI readiness by organizational scale. The current results quantify this effect with statistical precision: the 1.26-point perception gap between smaller and larger organizations represents a meaningful barrier that prior descriptive studies identified but did not estimate. This quantification enables resource allocation decisions—organizations can now estimate the magnitude of perception intervention required for different company size segments.
The feature importance rankings confirm and extend TAM and UTAUT constructs [22,23] to the AI-specific context. Personal AI usage emerging as the strongest predictor (β = 0.359) aligns with the “perceived usefulness” construct but adds the insight that direct experience matters more than formal training (AI_Training β = 0.244). This extends Emaminejad et al.’s [39] findings on trust in construction AI by suggesting that hands-on exposure may be more influential than structured educational interventions—a finding with direct implications for training program design.
The results partially contradict assumptions about demographic determinism in technology acceptance. Age emerged as the weakest predictor (β = 0.084), challenging narratives that frame AI adoption barriers as generational issues. This finding aligns with Lu and Deng’s [41] observation that digital competency mediates age effects, suggesting that chronological age itself is less relevant than accumulated digital experience.
5.1.2. Managerial Implications
The predictive framework enables concrete organizational interventions based on empirically validated perception determinants:
Targeted Training Programs: given that personal AI usage (β = 0.359) and AI training (β = 0.244) are significant predictors, organizations should prioritize hands-on AI exposure over lecture-based training. Recommended approach: implement structured AI tool trials where employees use AI-assisted design, scheduling, or budgeting tools on pilot projects, with 2–4 h of supervised experimentation weekly over 4–6 weeks before formal adoption decisions.
Role-Specific Communication Strategies: the model enables perception prediction for individual employees based on their profile characteristics. Organizations can segment their workforce and customize messaging: employees with low predicted perception scores (high ICT utilization but low AI exposure) may respond to demonstrations of AI-human collaboration, while those with prior negative AI experiences may require addressing specific concerns before broader adoption messaging.
Strategic Pilot Selection: the company-size effect (Δ = 1.26 points) suggests that AI pilots in smaller firms require additional support infrastructure. For micro and small enterprises, recommended approach: partner with technology vendors or industry associations to provide shared AI resources, reducing per-firm investment while enabling exposure that improves perception.
Workflow Redesign Before Technology Introduction: the importance of ICT utilization (β = 0.304) suggests that general digital competency precedes AI-specific acceptance. Organizations with low baseline digitalization should first establish foundational digital workflows (digital document management, collaborative scheduling tools) before introducing AI capabilities—attempting to leapfrog directly to AI may encounter resistance rooted in general digital discomfort.
Intervention Prioritization: using the model coefficients, organizations can estimate return on investment for different perception interventions. Increasing personal AI usage by one Likert point yields approximately 0.36 points improvement in cost reduction perception, while company-wide training programs yield approximately 0.24 points. This enables evidence-based allocation of limited change management resources.
5.2. Future Improvements
Incremental improvements to prediction performance are achievable through targeted extensions. Sample size expansion from 51 to 150–200 respondents would enable more complex model exploration, including ensemble methods and shallow neural networks, reduced confidence intervals around performance estimates, and potential for held-out test set evaluation beyond cross-validation. Feature space enrichment through the introduction of additional explanatory variables would incorporate psychological factors such as technology acceptance disposition and risk tolerance, demographic details, including education level and prior industry experience, and contextual variables encompassing regional economic indicators and company financial performance. Model refinement with larger samples would make viable ordinal regression architectures using proportional odds models, attention-weighted feature combinations, and transfer learning from related industrial perception datasets. These extensions represent data-driven improvements rather than algorithmic advances, consistent with the finding that current performance bounds are determined by data characteristics rather than model capacity. Performance improvements of 0.05 to 0.10 in R2 per 50 additional respondents represent realistic expectations, with asymptotic limits likely near R2 ≈ 0.65 to 0.70 even with substantially expanded datasets, reflecting fundamental uncertainty in human perception measurement.
6. Conclusions
This research establishes a methodologically rigorous framework for AI-based perception prediction in data-constrained Industry 5.0 environments, with specific application to artificial intelligence adoption in the construction sector. The work addresses a fundamental challenge in contemporary industrial research: developing reliable predictive systems when sample acquisition is inherently limited by specialized domain expertise requirements, survey response rates, and nascent technology adoption stages.
The principal contribution of this work lies in demonstrating that statistically principled approaches specifically designed for small sample regimes can achieve meaningful predictive performance without resorting to complex architectures that would inevitably overfit limited data. The achieved performance metrics of R2 = 0.501 for regression and weighted F1 = 0.681 for classification represent not merely acceptable results, but outcomes near the empirical performance ceiling established by sample size, feature dimensionality, and the inherent variability of human perception measurement. This finding carries significant implications for the broader machine learning community, where the predominant focus on large-scale datasets and deep architectures often overshadows the practical reality that many industrial and social science applications operate in data-scarce environments.
The methodological framework developed herein provides a replicable blueprint for researchers and practitioners confronting similar constraints. The systematic approach encompassing target simplification through ordinal consolidation, dimensionality reduction via theoretically grounded composite indices, exclusive deployment of low variance regularized models, and rigorous validation through leave-one-out cross-validation collectively ensures that reported performance reflects genuine generalization capability rather than spurious pattern memorization. This framework extends beyond the specific context of construction industry AI perception to any domain where expert opinion measurement intersects with limited respondent availability.
From an Industry 5.0 perspective, this research addresses a critical gap in understanding human-AI collaboration dynamics during the early stages of technological transformation. The construction sector exemplifies industries undergoing fundamental restructuring, where artificial intelligence promises significant operational improvements yet faces substantial adoption barriers rooted in workforce perception, organizational culture, and technological readiness. The ability to predict perception patterns from measurable demographic and experiential variables enables targeted intervention strategies, including customization of training programs based on predicted resistance levels, prioritization of communication efforts toward high-impact demographic segments, and resource allocation for change management initiatives aligned with predicted adoption trajectories. These applications transform perception prediction from an academic exercise into an actionable decision support tool for organizational leaders navigating technological transitions.
The research acknowledges inherent limitations that bound the scope of its conclusions. The sample size of 51 respondents, while typical for specialized industrial surveys, fundamentally constrains model complexity and introduces estimation uncertainty that wider sampling would reduce. The geographic and temporal specificity of data collection within the Slovak construction industry in 2025 limits direct generalizability to other regions, industries, or time periods, though the methodological approach remains transferable. The reliance on self-reported survey data introduces measurement error through respondent interpretation variability, social desirability bias, and potential gaps between stated perceptions and actual behaviors. The cross-sectional design captures perception at a single temporal snapshot, whereas longitudinal tracking would reveal perception evolution as AI adoption progresses and industry experience accumulates.
Extended Limitations Discussion
Geographic and Cultural Context. The Slovak construction industry operates within a Central European economic and regulatory environment that may influence AI perception patterns in ways not generalizable to other contexts. Slovakia’s construction sector is characterized by a high proportion of small and medium enterprises, relatively recent digital transformation initiatives compared to Western European markets, and specific labor market dynamics influenced by EU integration and regional economic development patterns. Cultural factors, including attitudes toward technology, organizational hierarchy, and risk tolerance, may systematically differ from other national contexts. Readers should interpret findings as hypothesis-generating for other regions rather than directly applicable predictions.
Methodological Limits of Perception Prediction. Predicting subjective perceptions from small samples faces fundamental methodological constraints. First, perception is inherently noisy—the same individual may provide different responses on different occasions due to mood, recent experiences, or question interpretation. This test-retest variability establishes an irreducible error floor that no model can overcome. Second, perception formation involves psychological constructs (risk tolerance, openness to experience, and cognitive style) not captured in the current instrument, meaning that a portion of between-individual variance is systematically unmeasurable with available features. Third, small samples (n = 51) provide limited statistical power for detecting moderate effects and produce wide confidence intervals around parameter estimates—the true population R2 may plausibly range from 0.35 to 0.65 given sampling variability.
Future Improvements for Validity. Longitudinal and multi-source data collection approaches could substantially improve prediction validity. Longitudinal designs tracking the same respondents over 12–24 months as AI exposure accumulates would enable: (a) validation of prediction stability, (b) investigation of perception change dynamics, and (c) assessment of intervention effectiveness through before-after comparisons. Multi-source validation incorporating behavioral data (actual AI tool usage logs and productivity metrics) alongside self-reported perceptions would address social desirability bias and close the attitude-behavior gap. Multi-informant approaches collecting perceptions from both individual workers and their supervisors would enable triangulation and reduce single-source bias.
Future research directions emerge naturally from these limitations and the established foundation. Longitudinal extensions tracking the same respondent cohort across multiple time points would enable investigation of perception dynamics, validation of prediction stability, and assessment of intervention effectiveness. Cross-industry comparative studies applying the identical methodological framework to manufacturing, healthcare, or logistics sectors would test framework generalizability and identify industry-specific versus universal perception determinants. International replication studies across diverse geographic and cultural contexts would illuminate the extent to which perception formation mechanisms transcend regional boundaries versus require localization. Integration of behavioral data linking predicted perceptions to actual technology adoption decisions would validate the practical utility of perception prediction and enable closed-loop optimization of intervention strategies. Advanced modeling explorations viable with expanded sample sizes could investigate hierarchical models accounting for organizational clustering effects, mixture models identifying latent perception subgroups, and causal inference frameworks distinguishing predictive associations from causal mechanisms.
The broader significance of this work extends beyond its immediate empirical findings to its demonstration that methodological rigor can extract meaningful insights even from severely constrained data environments. In an era where machine learning discourse often equates sophistication with architectural complexity and dataset scale, this research affirms the enduring value of statistical fundamentals: understanding bias-variance trade-offs, matching model complexity to sample size, implementing appropriate regularization, and validating performance through honest evaluation protocols. These principles, though developed decades ago in classical statistical learning theory, remain as essential in contemporary AI applications as they were in their original formulation.
For practitioners in Industry 5.0 contexts, this work delivers a clear message: effective AI deployment does not require massive datasets or complex deep learning systems when the problem structure, proper feature engineering, and appropriate model selection can compensate for data limitations. The construction industry and similar sectors should not delay digital transformation initiatives while awaiting ideal data conditions that may never materialize. Instead, they can proceed confidently with available data by applying methodologically sound approaches that acknowledge and work within realistic constraints.
This research ultimately contributes to the evolving understanding of human-centered artificial intelligence in industrial contexts, where technological capability must align with human perception, acceptance, and integration to achieve sustainable transformation. The developed framework and empirical findings provide both a methodological template and empirical evidence supporting the feasibility of perception-aware AI deployment strategies in data-constrained Industry 5.0 environments. Future work building upon this foundation will further refine our ability to predict, understand, and ultimately shape the human dimensions of industrial AI adoption.