3.1. Predictive Accuracy of Genomic Values for Milk and Fat Yield
To assess the predictive performance of genomic PTA values for production traits, we compared simple and full linear regression models for 305-day milk and fat yield in Holstein cows (n = 986). Observed versus predicted plots revealed clear differences in model performance (Figure 1 and Figure 2).
Simple linear regression using gPTAM and gPTAF alone showed only moderate predictive ability, with R² values of 0.117 for milk and 0.119 for fat yield; these are the proportions of phenotypic variation explained by the simple models. When herd, birth year, and their interaction were included in the full model, R² rose to 0.469 for milk and 0.507 for fat yield. These values are the explanatory (in-sample) R² of the OLS models, not the predictive R² obtained from cross-validation. The increase suggests that herd-level management and environmental conditions accounted for much of the observed phenotypic variation across farms. Data points in the full model clustered more tightly around the 1:1 reference line, indicating a markedly closer fit than with the simple model.
Pearson correlation analysis further supported these results, showing moderate but consistent associations between genomic predictions and observed performance across the population (r ≈ 0.34 for both traits). This indicates that genomic PTA values provide a meaningful basis for predicting phenotypic outcomes under subtropical production conditions. Simple linear regression models quantified these effects in more detail (Table 1). For milk yield, the estimated regression coefficient was β = 1.1895 (p < 2 × 10⁻¹⁶) with an intercept of 9366.18, explaining 11.7% of the phenotypic variation (R² = 0.117, adj. R² = 0.116). For fat yield, the estimated coefficient was β = 1.3526 (p < 2 × 10⁻¹⁶) with an intercept of 426.03, explaining 11.9% of the variation (R² = 0.119, adj. R² = 0.118). These results confirm that while gPTAM and gPTAF alone provide moderate predictive power, incorporating environmental and cohort effects can markedly enhance genomic prediction accuracy.
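The reported summary statistics are mutually consistent, which can be checked with a few lines of arithmetic. For a single-predictor regression, R² equals the squared Pearson correlation, and adjusted R² follows from R², the sample size, and the number of predictors. The sketch below is illustrative only: the function names and the toy data are assumptions introduced here, and only the values 0.117, 0.119, and n = 986 come from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, R^2).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for p predictors (intercept excluded) and n records."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Toy values, made up for illustration (NOT data from the study).
toy_pta = [-300.0, -120.0, 0.0, 150.0, 310.0, 480.0]
toy_milk = [8900.0, 9600.0, 9100.0, 9800.0, 9500.0, 10300.0]
slope, intercept, r2 = simple_ols(toy_pta, toy_milk)
r = pearson_r(toy_pta, toy_milk)
print(abs(r2 - r ** 2) < 1e-9)  # True: R^2 equals r^2 for one predictor

# Values reported in the text (n = 986, one predictor).
print(round(adjusted_r2(0.117, 986, 1), 3))  # 0.116
print(round(adjusted_r2(0.119, 986, 1), 3))  # 0.118
print(round(math.sqrt(0.117), 2))            # 0.34
```

With n close to 1000 and a single predictor, the adjustment is tiny, which is why R² and adjusted R² differ only in the third decimal place.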
Model-diagnostic plots for the full OLS models exhibited approximately linear Q–Q patterns with only minor tail deviations (Figures S1 and S2), indicating near-normal residual distributions. Although formal tests detected slight departures from normality, these were considered negligible given the large sample size. Residual–fitted plots showed no systematic trends, supporting model adequacy. Tests for heteroscedasticity did not indicate unequal variance across predicted values, and generalized variance inflation factors (GVIF) were well below commonly accepted thresholds. Overall, these diagnostics indicated that multicollinearity, residual structure, and variance heterogeneity did not materially affect the validity of the regression estimates. Detailed test statistics are provided in Supplementary Tables S1 and S2.
To further evaluate model robustness and generalizability, five-fold cross-validation was performed for both simple and full models (Table 2). For milk yield, the cross-validated R² increased from 0.117 in the simple model to 0.293 in the full model, with RMSE decreasing from 1634.3 to 1503.7 kg and MAE from 1278.2 to 1151.5 kg. The coefficient of variation (CV) of R² across folds was 13.2%, indicating moderately stable predictive performance with some variability across folds. For fat yield, the cross-validated R² increased from 0.122 to 0.363, with RMSE decreasing from 78.9 to 68.3 kg and MAE from 61.3 to 52.6 kg. However, the CV of R² across folds was 23.2%, suggesting greater sensitivity to herd or environmental variability.
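The cross-validation summary above can be illustrated with a minimal sketch of five-fold cross-validation for a single-predictor model. Several choices here are assumptions rather than details confirmed by the text: out-of-fold R² is computed as 1 − SS_res/SS_tot on the held-out fold, fold-to-fold stability is summarized as SD/mean × 100 of the fold R² values, folds are assigned by a seeded random shuffle, and the data are simulated. All function names are hypothetical.

```python
import math
import random
import statistics


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def fold_metrics(y_true, y_pred):
    """Out-of-fold R^2, RMSE, and MAE for one held-out fold."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


def k_fold_cv(x, y, k=5, seed=0):
    """Fit on k-1 folds, evaluate on the held-out fold, repeat k times."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    results = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        y_true = [y[i] for i in test_idx]
        y_pred = [intercept + slope * x[i] for i in test_idx]
        results.append(fold_metrics(y_true, y_pred))
    return results


def summarize(results):
    """Mean metrics plus the coefficient of variation (%) of fold R^2."""
    r2s = [r[0] for r in results]
    mean_r2 = statistics.mean(r2s)
    return {
        "mean_r2": mean_r2,
        "mean_rmse": statistics.mean(r[1] for r in results),
        "mean_mae": statistics.mean(r[2] for r in results),
        "cv_r2_pct": 100.0 * statistics.stdev(r2s) / mean_r2,
    }


# Simulated records (NOT study data): yield = baseline + slope * PTA + noise.
rng = random.Random(42)
sim_pta = [rng.gauss(0.0, 400.0) for _ in range(500)]
sim_milk = [9400.0 + 1.2 * v + rng.gauss(0.0, 1500.0) for v in sim_pta]

cv_results = k_fold_cv(sim_pta, sim_milk, k=5, seed=1)
summary = summarize(cv_results)
print(summary)
```

Because each fold is scored only on records the model did not see, the resulting R² is a predictive measure and will typically be lower than the in-sample R² of a model with many herd and cohort terms.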
In addition to the fixed-effects models, a linear mixed model (LMM) analysis was conducted to account for herd-level clustering by including herd as a random intercept and birth year as a fixed effect. For milk yield, PTAM remained a strong and significant predictor (β = 1.201, p < 0.001). The herd-level standard deviation (SD = 853.3 kg) indicated substantial differences among farms, reflecting the influence of management intensity, housing system, and feeding practices on yield variability; birth-year effects were not significant. For fat yield, PTAF also remained highly significant (β = 1.444, p < 0.001), but the herd-level SD was smaller (43.9 kg), suggesting that fat yield was less affected by inter-farm management heterogeneity than milk yield. DHARMa residual diagnostics for both LMMs indicated no over- or under-dispersion (milk: p = 0.92; fat: p = 0.854) and acceptable residual patterns. The milk model showed a small outlier signal (outlier test p = 0.015), whereas the fat model did not (p = 0.454) (Figures S3 and S4). These diagnostic checks support the adequacy of the LMM specification and the robustness of the fixed-effect estimates.
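For reference, the random-intercept model described above can be written out as follows. The notation is an assumption introduced here, not taken from the original text, and treating birth year as a categorical fixed effect is likewise assumed.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{PTA}_{ij}
       + \sum_{k} \gamma_k\,\mathbb{1}[\mathrm{year}_{ij} = k]
       + u_j + \varepsilon_{ij},
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here y_ij is the 305-day yield of cow i in herd j, β₁ is the slope on the genomic PTA (reported as 1.201 for milk and 1.444 for fat), γ_k are the birth-year effects, and σ_u is the herd-level SD (reported as 853.3 kg for milk and 43.9 kg for fat).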
Taken together, the diagnostic results (normality, homoscedasticity, multicollinearity, and DHARMa) support the validity of inference from both OLS and LMM frameworks. The consistency between LMM and multiple regression findings reinforces the critical contribution of herd-level structure to predictive performance, particularly for milk yield. Herd-level and cohort effects account for a substantial proportion of the variation in production, and integrating these factors substantially enhances the predictive accuracy of genomic PTA models. Moreover, cross-validation indicated that the milk yield model was more stable across validation folds, whereas the fat yield model was more sensitive to environmental variation. These results highlight that genomic PTA values (gPTAM and gPTAF) provide robust genetic signals across herds, but the extent of herd-level variance differs between traits, reflecting their environmental sensitivity.
3.2. Regional Variation in the Predictive Accuracy of gPTAM and gPTAF
The predictive performance of genomic evaluations was examined across nine major dairy production regions in Taiwan (Changhua, Chiayi, Hualien, Miaoli, Nantou, Pingtung, Tainan, Taoyuan, and Yunlin). Correlation coefficients between genomic predicted transmitting abilities and actual 305-day production traits were calculated to assess the consistency of prediction accuracy across regions (Table 3).
Boxplots (Figure 3 and Figure 4) illustrated clear regional patterns in genomic prediction accuracy. For gPTAM and Milk305ME, Tainan and Hualien exhibited the highest median correlation values (0.54 and 0.48, respectively), indicating stronger genomic predictive performance in these regions. In contrast, Pingtung and Nantou showed the lowest correlations (0.06 and 0.2, respectively), suggesting lower prediction reliability in these areas. These regional differences likely reflect variation in farm management intensity, feed quality, and cooling infrastructure between regions; herds in southern coastal areas such as Pingtung are more frequently exposed to high heat load and humidity, whereas inland mountain areas like Nantou experience greater diurnal temperature variation and feed supply fluctuation, both of which can reduce the stability of genomic predictions.
A similar trend was observed for gPTAF and Fat305ME, with Taoyuan showing the strongest median correlation (approximately 0.7), followed by Yunlin and Hualien. Nantou and Pingtung again exhibited lower correlation values, indicating regional disparities in the predictive accuracy for fat yield.
One-way ANOVA results indicated no significant differences in milk yield prediction across regions (F(8, 16) = 1.107, p = 0.408), whereas fat yield prediction showed significant variation (F(8, 16) = 2.6, p = 0.0494). These findings were further supported by G × E simple slopes analysis. PTAM slopes did not differ significantly among regions (χ² = 6.224, p = 0.621) (Table 3), confirming stable genomic prediction for milk yield. In contrast, PTAF slopes varied significantly (χ² = 18.496, p = 0.018), with Taoyuan exhibiting the strongest slope (3.85), significantly higher than several other regions (p < 0.05). Model checks did not indicate heteroscedasticity or lack of fit at the regional level, and LMM estimates remained consistent with OLS.
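The degrees of freedom reported for the one-way ANOVA, F(8, 16), imply nine groups and 25 observations in total, since df_between = k − 1 and df_within = N − k. The sketch below shows how such an F statistic is computed from grouped values. It assumes the observations are correlation coefficients grouped by region, which the degrees of freedom suggest but the text does not state outright. The region labels and all values are made up for illustration, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """One-way ANOVA on a list of groups of observations.

    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: deviations from each group's own mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up correlations for nine hypothetical regions (25 values in total).
toy_regions = {
    "region_1": [0.31, 0.42, 0.28],
    "region_2": [0.35, 0.22, 0.40],
    "region_3": [0.48, 0.51, 0.39],
    "region_4": [0.25, 0.33, 0.30],
    "region_5": [0.18, 0.27, 0.21],
    "region_6": [0.09, 0.02, 0.15],
    "region_7": [0.55, 0.49, 0.58],
    "region_8": [0.44, 0.36],
    "region_9": [0.29, 0.38],
}
f_stat, df_b, df_w = one_way_anova(list(toy_regions.values()))
print(df_b, df_w)  # 8 16
```

A p-value would then come from the upper tail of the F distribution with these degrees of freedom; `scipy.stats.f_oneway` is one readily available routine that returns both the statistic and the p-value.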
3.3. Genomic Quartile Analysis and Economic Interpretation
To further evaluate the practical implications of genomic values, cows were stratified into quartiles based on their gPTAM and gPTAF values, and corresponding production traits were compared. For gPTAM, the average 305-day milk yield increased progressively with genomic merit. Specifically, cows in the lowest quartile (Q1, bottom 25%) produced 8655 ± 1537 kg of milk, while those in the second (Q2), third (Q3), and top (Q4) quartiles yielded 9246 ± 1604 kg, 9463 ± 1661 kg, and 10,265 ± 1785 kg, respectively (Table 4). Similarly, cows grouped by gPTAF quartiles demonstrated ascending fat production, with Q1 yielding 381 ± 77 kg of fat, Q2 yielding 404 ± 82 kg, Q3 yielding 423 ± 81 kg, and Q4 reaching 453 ± 79 kg. Statistical analysis confirmed that the differences in both milk and fat yield among quartiles were highly significant (p < 0.0001), suggesting that genomic predictions have strong discriminatory power in identifying higher-yielding animals.
From an economic standpoint, the difference in milk yield between Q4 and Q1 cows amounted to 1610 kg per lactation. Assuming a representative raw-milk price of US $1.1 per kilogram for illustrative purposes, this difference translates into an estimated gross revenue advantage of US $1771 per cow (1610 kg × US $1.1/kg). These estimates are based on an assumed milk price and do not account for farm-specific feed-cost variation, but they highlight the potential economic relevance of selecting genetically superior animals.
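The quartile comparison and the revenue arithmetic above can be sketched in a few lines. The example assumes that cows are ranked by genomic PTA and split into four equal-sized groups, with ties and remainders handled by rank order; the text does not specify the exact binning rule. The function names and the eight toy records are made up for illustration, and only the quartile means 8655 kg and 10,265 kg and the price of US $1.1/kg come from the text.

```python
import statistics


def quartile_means(pta, yield_kg):
    """Mean yield per PTA quartile, with Q1 the lowest 25% of PTA values."""
    order = sorted(range(len(pta)), key=lambda i: pta[i])
    n = len(order)
    means = {}
    for q in range(4):
        members = order[q * n // 4:(q + 1) * n // 4]
        means[f"Q{q + 1}"] = statistics.mean([yield_kg[i] for i in members])
    return means


def revenue_gap(mean_top, mean_bottom, price_per_kg):
    """Gross revenue difference per cow between two quartile means."""
    return (mean_top - mean_bottom) * price_per_kg


# Toy records, made up for illustration (NOT study data).
toy_pta = [-200.0, 350.0, 90.0, -50.0, 500.0, 10.0, 260.0, -310.0]
toy_milk = [8700.0, 10100.0, 9400.0, 9000.0, 10400.0, 9300.0, 9600.0, 8500.0]
toy_means = quartile_means(toy_pta, toy_milk)
print(toy_means["Q1"], toy_means["Q4"])  # 8600.0 10250.0

# Quartile means and assumed price as reported in the text.
print(round(revenue_gap(10265, 8655, 1.1)))  # 1771
```

Because feed and other costs are not subtracted, the figure is a gross revenue difference under the assumed price, not a net profit.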
Additional quartile analysis was conducted using NM$ values. Cows in the top NM$ quartile yielded 9687 ± 1496 kg of milk, compared with 9085 ± 1716 kg in the lowest quartile, a 602 kg difference equating to an advantage of approximately US $704 per cow (Table 5). Fat yield also increased across NM$ quartiles, from 389 ± 81 kg to 443 ± 76 kg (Table 6), underscoring the comprehensive economic benefit of using NM$ in selection decisions.
Together, these findings confirm that genomic evaluation, especially when interpreted through quartile stratification, not only predicts performance outcomes but also provides actionable economic insights to support genomic-based selection strategies in Taiwanese Holstein herds.