3.1. Spatial Patterns of Crop Yield and Soil Physicochemical Properties Reflect Underlying Resource Endowment Characteristics
Rice yields exhibited clear spatial differentiation that aligned systematically with resource endowment gradients across the six ecological zones (Figure 2). The highest rice yields were recorded in zones characterized by fertile mollisol endowment with high baseline SOM and optimal water availability (mean 8.1 t ha−1 in SNP), followed closely by the organic-rich alluvial plain zone (SJP) and the warm humid plain zone (LHP), averaging 7.6 t ha−1, with no statistically significant difference between these two zones (p > 0.05). Substantially lower rice yields were observed in the zones with resource endowment constraints, including mountainous terrain with coarse-textured soils (XAL), semi-arid sandy soil conditions (WSA), and volcanic soil landscapes (CBM) (p < 0.05 versus SNP).
These yield levels are broadly consistent with, yet at the high end of, national and global benchmarks. The average maize yield across all zones (approximately 10.6 t ha−1) substantially exceeds the Chinese national average of ~6.3 t ha−1 and the global average of ~5.8 t ha−1, confirming the role of northeast China as a high-yielding production hub. The observed rice yield of ~7.6 t ha−1 in the central plains is comparable to high-yielding temperate rice systems in Japan and South Korea (~7.0–8.5 t ha−1 [19]).
The spatial co-occurrence of high yields with favorable resource endowments (fertile parent materials, optimal texture, high baseline SOM) and the relatively low yields in zones with constraining endowments (sandy texture, semi-arid background conditions, volcanic P-fixing soils) strongly suggest that resource endowment characteristics, rather than management intensity alone, are primary determinants of the inter-zone yield gaps.
Soil physicochemical properties displayed pronounced inter-zone heterogeneity systematically linked to resource endowment characteristics (Figure 3). Zones with sandy soil texture (typical of semi-arid regions) (XAL, WSA) showed the highest BD (mean 1.34 g cm−3), significantly greater than zones with fine-textured mollisol endowment (SJP, SNP, CBM, LHP). The zone with loess-derived mollisol endowment (SNP) exhibited the lowest BD (1.24 g cm−3), reflecting the inherently favorable soil structure of this parent material. SOM was highest in zones with organic-rich alluvial or loess-derived mollisol endowments (SJP and SNP, mean 40.9 g kg−1), approximately 22% higher than the overall cross-zone average (33.52 g kg−1), while the zone with sandy soil and semi-arid climate endowment (WSA) had the lowest SOM (26.02 g kg−1), reflecting both parent material and climatic constraints. TN mirrored this pattern, with the highest values in zones with cold climate or organic-rich alluvial endowments (XAL, SJP, and SNP; mean 2.65 g kg−1) and the lowest in the semi-arid sandy soil zone (WSA, 1.84 g kg−1).
Critically, AP showed the highest levels in the volcanic soil landscape (CBM, mean 72.17 mg kg−1) and the fertile mollisol-dominated Songnen Plain (SNP), yet these high measured AP values in volcanic soils mask low bioavailability due to strong fixation by aluminum and iron oxides, a classic example of resource endowment (parent material chemistry) determining soil property functionality. The semi-arid sandy soil zone (WSA) showed the lowest AP (36.82 mg kg−1), reflecting both parent material P content and insufficient P inputs. AK was consistently highest in the fertile mollisol Songnen Plain (SNP), though inter-zone differences were not statistically significant. Ks showed elevated levels only in the SNP, likely reflecting both parent material characteristics and K fertilization history. The overall mean BD and pH across all zones were 1.32 g cm−3 and 6.43, respectively. These patterns indicate a clear resource endowment-governed gradient, with nutrient-depleted sandy soils under water limitation at one extreme and fertile, high-SOM mollisols derived from optimal parent materials at the other.
The BD values recorded here (overall mean 1.32 g cm−3; up to 1.34 g cm−3 in sandy soil zones) are consistent with reports of progressive compaction in the black soil region since the 1980s, when BD values of ~1.1 g cm−3 were typical for mollisols under natural vegetation [2,8]. The current BD levels in zones with sandy texture and low SOM endowments (WSA, XAL) approach or exceed the critical threshold of 1.35 g cm−3, above which root penetration resistance increases sharply [20]. This trajectory of “soil hardening and thinning” is most severe in zones where resource endowment characteristics (coarse texture, low clay content, semi-arid conditions) inherently predispose soils to compaction, representing an escalating challenge that directly undermines productive capacity.
3.2. Zone-Differentiated Associations Between Soil Properties and Crop Yields
Random forest analysis identified distinct sets of dominant soil factors for rice and maize yields. The models explained 42.7% of rice yield variation and 38.2% of maize yield variation, with root mean square errors (RMSE) of 0.82 t ha−1 and 0.91 t ha−1, respectively. The variable importance results are presented in Figure 4. For rice, the four most important variables, ranked by %IncMSE, were AK (%IncMSE = 19.2%), BD (%IncMSE = 18.3%), pH (%IncMSE = 15.7%), and Ks (%IncMSE = 12.4%). Pearson correlation analysis confirmed that BD, TN, Ks, AK, and SOM were positively correlated with rice yield (p < 0.05), while pH showed a significant negative correlation. For maize, the leading factors were SOM (%IncMSE = 32.1%), TN (%IncMSE = 16.8%), AK (%IncMSE = 14.5%), and BD (%IncMSE = 11.2%). AP, Ks, TN, and SOM were positively correlated with maize yield, but BD was negatively correlated (p < 0.05). The contrasting roles of BD for the two crops (positive for rice, negative for maize) likely reflect the fundamentally different soil–root environments: paddy rice cultivation involves repeated puddling and flooding that can favor slightly denser soil structures for water retention [21], while upland maize is critically sensitive to soil compaction inhibiting deep root exploration [22].
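The %IncMSE ranking above rests on the idea of permutation importance: shuffle one predictor, re-evaluate the model, and record how much the mean squared error rises. The following is a minimal sketch of that idea only. The data are synthetic, `toy_model` is a hypothetical stand-in for a fitted forest, and the out-of-bag evaluation and scaling used by random forest software are not reproduced here.

```python
import random


def mse(predict, rows, ys):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(r) - y) ** 2 for r, y in zip(rows, ys)) / len(ys)


def permutation_importance(predict, rows, ys, n_features, seed=0):
    """Percent increase in MSE after shuffling each feature column in turn."""
    shuffler = random.Random(seed)
    base = mse(predict, rows, ys)
    scores = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        shuffler.shuffle(column)
        # Rebuild the dataset with only column j permuted.
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        scores.append(100.0 * (mse(predict, shuffled, ys) - base) / base)
    return scores


# Synthetic data (hypothetical): the response depends on feature 0 only.
gen = random.Random(42)
rows = [[gen.uniform(0, 1), gen.uniform(0, 1)] for _ in range(200)]
ys = [5.0 + 3.0 * r[0] + gen.gauss(0, 0.1) for r in rows]


def toy_model(r):
    """Stand-in for a fitted model; not the random forest used in the study."""
    return 5.0 + 3.0 * r[0]


scores = permutation_importance(toy_model, rows, ys, n_features=2)
print([round(s, 1) for s in scores])
```

In this toy case the informative feature receives a large score and the uninformative one a score of zero, which is the sense in which a higher %IncMSE marks a more influential soil variable.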
Importantly, the strength and direction of soil–yield relationships were not uniform across zones, but varied systematically with resource endowment characteristics, substantiating the hypothesis of zone-specific limiting factors. In zones characterized by fertile mollisol endowment with high baseline SOM and nutrient levels (SNP), SOM, TN, AP, and AK emerged as the primary drivers of both rice and maize yield, while BD exerted no significant independent effect. This is consistent with the inherently favorable soil structure (low BD, 1.24 g cm−3) conferred by loess parent material. In contrast, zones with sandy soil texture (a resource endowment characteristic associated with a semi-arid climate) and inherently low fertility endowments (WSA) exhibited a fundamentally different constraint pattern: elevated BD (1.34 g cm−3) and alkaline pH (mean ~7.0) were the dominant yield-limiting factors, directly reflecting parent material and climatic endowment characteristics. Even where NPK nutrients were supplemented, yield potential remained constrained by physical soil impedance from sandy texture and pH-mediated phosphorus immobilization under alkaline conditions.
In the volcanic soil landscape with high aluminum and iron oxide content in parent materials (CBM), AP was the pivotal limiting factor for maize yield, a direct consequence of the strong P fixation capacity inherent to volcanic parent material. Rice yield in this zone was most sensitive to SOM content, likely reflecting both SOM’s role in complexing aluminum and reducing P fixation and its importance for maintaining soil structure on sloping terrain. These zone-differentiated results provide empirical evidence that resource endowment characteristics (parent material, texture, baseline fertility, climate) determine which soil physicochemical properties exert decisive control over yields. They directly challenge the conventional “homogeneous northeast” assumption and indicate that spatially targeted management, rather than uniform prescriptions, is essential for closing inter-zone yield gaps [4,10].
3.3. Synergistic Pathways of Soil Properties Driving Rice Yield
Path analysis revealed a complex network of direct and indirect effects. The SEM fit the rice data well: CFI = 0.962, TLI = 0.954, RMSEA = 0.042, SRMR = 0.031, all meeting the recommended fit thresholds. The pathway results are presented in Figure 5. BD and pH operated as upstream variables: BD had a positive effect on SOM (PC = 0.213), while pH had a negative effect on SOM (PC = −0.911), indicating that lower pH and moderate BD promote SOM accumulation under paddy conditions. SOM had a strong positive direct effect on rice yield (PC = 0.361), and combined with Ks pathways, the total effect of SOM reached 0.412. Ks contributed positively to rice yield both directly and indirectly via TN (PC = 0.75), especially in zones with K-rich parent materials, while Ks also reduced AP (PC = −0.108), likely through competitive sorption. BD indirectly decreased Ks (PC = −0.118) and AK (PC = −0.006), partially offsetting its positive effect on SOM. This dual pathway explains why BD impacts vary with soil texture. Moreover, pH negatively affected AP (PC = −0.201), highlighting the need for pH management to sustain phosphorus availability, especially in alkaline zones.
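The fit statistics reported for the rice model can be checked against commonly cited cutoffs. The sketch below assumes the conventional Hu and Bentler (1999) values (CFI and TLI at least 0.95, RMSEA at most 0.06, SRMR at most 0.08); the text does not state which thresholds were applied, so these cutoffs and the helper name `check_fit` are assumptions made for illustration.

```python
# Conventional cutoffs (Hu & Bentler, 1999). Assumed here, because the text
# does not say which thresholds were used.
CUTOFFS = {
    "CFI": (">=", 0.95),
    "TLI": (">=", 0.95),
    "RMSEA": ("<=", 0.06),
    "SRMR": ("<=", 0.08),
}


def check_fit(indices):
    """Return {index name: True/False} for whether each value meets its cutoff."""
    result = {}
    for name, value in indices.items():
        direction, cutoff = CUTOFFS[name]
        result[name] = value >= cutoff if direction == ">=" else value <= cutoff
    return result


# Fit indices reported in the text for the rice SEM.
rice_fit = {"CFI": 0.962, "TLI": 0.954, "RMSEA": 0.042, "SRMR": 0.031}
rice_check = check_fit(rice_fit)
print(rice_check)
```

Under these assumed cutoffs all four reported indices pass, with TLI (0.954) the closest to its threshold.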
These pathway results carry important mechanistic insights. The positive BD-to-SOM relationship in paddy soils contrasts with the negative BD-to-SOM relationship commonly reported for upland soils [15], and likely reflects the anaerobic conditions under flooded rice cultivation: moderate compaction reduces water percolation, creating an anaerobic environment that slows organic matter decomposition and favors carbon sequestration, which is consistent with previous findings in paddy soil studies [14,16,23]. The strong negative pH → SOM pathway (PC = −0.911) suggests that soil acidification, itself accelerated by nitrogen fertilizer use and acid deposition, indirectly suppresses rice yield by destabilizing SOM, a mechanism not captured in simple pH–yield correlations. This finding aligns with global evidence that soil pH mediates microbial community composition and organic matter mineralization rates [14], and has direct implications for liming management in zones where baseline pH is declining (SJP, LHP), highlighting how resource endowment trajectories (acidification potential) must inform management strategies.
To resolve the apparent contradiction between the positive BD → SOM path coefficient and the low rice yields in high-BD zones, we calculated the total standardized effect of BD on rice yield by summing all direct and indirect pathways in the SEM. The total effect of BD on rice yield was −0.087, indicating a weak negative overall effect. The positive indirect effect via SOM (0.213 × 0.361 = 0.077) was offset by negative indirect effects via Ks and AK. More importantly, BD showed strong threshold dependence: moderate BD (<1.30 g cm−3) benefits SOM accumulation and rice yield, while BD exceeding 1.34 g cm−3 causes severe physical constraints to roots and sharply reduces yield. This explains why high-BD zones (WSA, XAL) present low rice yields despite the positive BD → SOM pathway in the SEM.
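The decomposition described above can be reproduced from the reported coefficients. The sketch below multiplies the coefficients along the BD → SOM → yield pathway and subtracts the result from the reported total effect. The remainder is the combined contribution of all other BD pathways; it is derived here for illustration and is not a value reported in the text.

```python
def indirect_effect(*coefficients):
    """Product of standardized path coefficients along a single pathway."""
    product = 1.0
    for c in coefficients:
        product *= c
    return product


# Coefficients reported in the text for the rice model.
bd_to_som = 0.213
som_to_yield = 0.361
total_bd_effect = -0.087

via_som = indirect_effect(bd_to_som, som_to_yield)

# Implied combined contribution of every other BD pathway (via Ks, via AK,
# and any direct path). Derived by subtraction, not reported in the text.
other_paths = total_bd_effect - via_som

print(f"BD -> SOM -> yield: {via_som:.3f}")
print(f"All other BD pathways combined: {other_paths:.3f}")
```

The positive SOM-mediated effect of about 0.077 is thus more than cancelled by the remaining pathways, which together must contribute roughly −0.164 for the total to reach −0.087.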
3.4. Synergistic Pathways of Soil Properties Driving Maize Yield
For maize, the path model revealed a more complex, bidirectional role of BD. The SEM fitted the maize data well: CFI = 0.958, TLI = 0.951, RMSEA = 0.047, SRMR = 0.035, all meeting the recommended fit thresholds. The pathway results are presented in Figure 6. BD positively influenced maize yield indirectly by increasing SOM content (PC = 0.194), a pathway operating through improved aggregate stability at moderate compaction under upland conditions. Concurrently, BD negatively regulated AK (PC = −0.011), and this reduced AK pool was itself positively driven by TN (PC = 0.703), together exerting a positive net effect on maize yield (PC = 0.246). The overall consequence is that BD influences maize yield through dual, partially opposing pathways: a positive indirect pathway via SOM and a constraining effect via AK depletion at higher BD levels. This duality explains why some previous studies have reported a positive BD–yield relationship while others found a negative one: the net direction depends on the prevailing BD range and the relative magnitudes of these two pathways [5,13].
AP showed the strongest negative association with BD among all nutrient variables (PC = −0.641), reflecting the critical interaction between soil physical properties and P availability, a relationship particularly important in zones with volcanic parent material endowment where P fixation capacity is inherently high. SOM showed a strong positive association with BD (PC = 0.398), confirming its dual role as both a consequence of moderate compaction (in fine-textured soils) and ameliorator of compaction effects. AK, Ks, and SOM were all positively correlated with maize yield (p < 0.05), confirming their critical and synergistic roles in supporting maize productivity across all resource endowment contexts.
The TN → AK → maize yield pathway (combined PC ≈ 0.173) is particularly noteworthy: it suggests that nitrogen not only directly promotes photosynthetic capacity but also indirectly supports maize yield by stimulating microbial biomass, which in turn releases potassium through weathering and mineralization processes [18]. This N–K coupling mechanism implies that potassium deficiency may be masked or exacerbated depending on nitrogen management, and highlights the importance of balanced NPK fertilization rather than nitrogen-only optimization in maize production. In the WSA, where both TN and AK are low, this indirect pathway is particularly weak, further compounding the direct BD and pH constraints identified in Section 3.2. Taken together, the path models for both crops demonstrate that the soil–crop system in northeast China operates as an interconnected network rather than a collection of independent factor–yield relationships, underscoring the insufficiency of single-factor management prescriptions.
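The combined coefficient quoted for the TN, AK, and maize yield pathway is consistent with the standard product rule for indirect effects in path analysis, provided the value 0.246 reported for the maize model is read as the AK → yield coefficient. That reading is an assumption, and the symbol β is notation introduced here rather than the authors' own.

```latex
\[
\beta_{\mathrm{TN} \to \mathrm{AK} \to \mathrm{yield}}
  = \beta_{\mathrm{TN} \to \mathrm{AK}} \times \beta_{\mathrm{AK} \to \mathrm{yield}}
  = 0.703 \times 0.246 \approx 0.173
\]
```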
3.5. Threshold Effects of Bulk Density and Organic Matter on Crop Yield
Beyond the pathway relationships, the dataset allows identification of soil property thresholds with operational significance for crop management. We derived these thresholds using segmented regression, which statistically identifies the breakpoints at which the relationship between a soil property and yield changes significantly. When BD exceeded approximately 1.34 g cm−3, the mean value in zones with sandy soil texture and low SOM endowments (WSA, XAL), maize yields in those zones were approximately 8–13% lower than in zones with fine-textured mollisol endowments and BD below 1.26 g cm−3 (SJP, LHP). This differential response confirms that the critical BD threshold of 1.35 g cm−3 reported for black soils [5] is most relevant in resource endowment contexts characterized by coarse texture, low clay content, and low SOM, where compaction effects are most severe.
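The way segmented regression locates a breakpoint can be illustrated with a minimal sketch. The version below fits a continuous two-segment ("hinge") model by ordinary least squares at each candidate breakpoint and keeps the candidate with the smallest residual sum of squares. The grid search, the function names, and the synthetic noise-free data (constructed with a break at 1.34 g cm−3) are assumptions for illustration and do not reproduce the estimation procedure or the data used in the study.

```python
def solve_linear(a, b):
    """Solve a small square system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hinge_sse(xs, ys, bp):
    """Residual sum of squares of y = a + b*x + c*max(0, x - bp), fit by OLS."""
    mean_x = sum(xs) / len(xs)
    # Centering x improves the conditioning of the normal equations.
    rows = [[1.0, x - mean_x, max(0.0, x - bp)] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coef = solve_linear(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))


def find_breakpoint(xs, ys, candidates):
    """Grid search: return the candidate breakpoint with the smallest SSE."""
    return min(candidates, key=lambda bp: hinge_sse(xs, ys, bp))


# Synthetic, noise-free data (hypothetical): yield is flat up to BD = 1.34
# and declines linearly above it.
bd = [round(1.20 + 0.01 * i, 2) for i in range(26)]          # 1.20 ... 1.45
yields = [10.5 - 20.0 * max(0.0, x - 1.34) for x in bd]
candidates = [round(1.25 + 0.01 * i, 2) for i in range(16)]  # 1.25 ... 1.40

best = find_breakpoint(bd, yields, candidates)
print(f"Estimated BD breakpoint: {best:.2f} g cm^-3")
```

Candidates are kept strictly inside the observed BD range, since a breakpoint at or beyond either end makes the hinge term redundant and the fit ill-defined. With real, noisy data the breakpoint would also need a confidence interval, which this sketch does not compute.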
For SOM, zones with resource endowments characterized by sandy texture, semi-arid climate, and low baseline fertility (WSA, SOM mean 26.02 g kg−1) exhibited both rice and maize yields significantly lower than the overall mean, confirming that SOM below approximately 26 g kg−1 represents a critical deficiency threshold across all resource endowment contexts. Conversely, zones with organic-rich alluvial or fertile mollisol endowments (SJP/SNP, SOM mean 40.9 g kg−1) showed only marginal additional yield benefits relative to intermediate SOM levels (30–35 g kg−1) [24], a saturation effect consistent with the global SOC–yield relationship reported by Lal et al. [25], who identified diminishing yield returns above ~2% SOC (approximately equivalent to ~3.4% SOM, or ~34 g kg−1). This threshold pattern demonstrates that resource endowment characteristics determine both the magnitude of SOM deficit (most severe in sandy, semi-arid endowments) and the marginal returns to SOM enhancement (diminishing in already fertile endowments).
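The SOC-to-SOM equivalence quoted above matches the conventional van Bemmelen factor of 1.724, which assumes that organic matter is about 58% carbon. The text does not name the factor, so its use here is an inference from the quoted numbers; the function names are illustrative.

```python
VAN_BEMMELEN = 1.724  # conventional SOC-to-SOM factor (SOM assumed ~58% carbon)


def soc_to_som_percent(soc_percent):
    """Convert soil organic carbon (%) to soil organic matter (%)."""
    return soc_percent * VAN_BEMMELEN


def percent_to_g_per_kg(percent):
    """Convert a mass percentage to g per kg (1% = 10 g/kg)."""
    return percent * 10.0


som_pct = soc_to_som_percent(2.0)
som_g_kg = percent_to_g_per_kg(som_pct)
print(f"{som_pct:.2f}% SOM = {som_g_kg:.1f} g/kg")
```

A 2% SOC level therefore corresponds to roughly 34 g kg−1 SOM, which sits inside the 30–35 g kg−1 intermediate range cited above.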
These threshold observations represent a step change from qualitative description to quantitative, resource endowment-stratified benchmarking: they provide specific target ranges that vary with endowment context. For zones with sandy soil endowments (typically found in semi-arid regions) (WSA, XAL), priority targets are BD reduction to <1.30 g cm−3 and SOM restoration to 28–32 g kg−1, as these zones are furthest from optimal and exhibit the steepest marginal yield response. For zones with fertile mollisol endowments (SNP, SJP), the priority is to maintain the current optimal soil conditions, as further SOM enhancement would only bring marginal yield benefits.
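The endowment-stratified targets can be expressed as a simple decision rule. The sketch below uses the BD and SOM targets quoted above for sandy zones, together with the maintenance levels given in Section 3.6 (BD < 1.26 g cm−3, SOM > 35 g kg−1). The three labels, the decision order, and the function name are illustrative assumptions rather than a classification proposed by the authors.

```python
def management_priority(bd, som):
    """Classify a zone from its mean BD (g cm^-3) and mean SOM (g kg^-1).

    Cutoffs come from targets quoted in the text; the labels and the order
    of the checks are assumptions made for illustration.
    """
    if bd >= 1.30 or som < 28.0:
        return "restore"    # reduce BD below 1.30 and rebuild SOM to 28-32
    if bd < 1.26 and som > 35.0:
        return "maintain"   # conserve the current favorable conditions
    return "monitor"        # intermediate case, not discussed in the text


# Zone means quoted in Section 3.1 (the SNP SOM value is the SJP/SNP mean).
zones = {"WSA": (1.34, 26.02), "SNP": (1.24, 40.9)}
priorities = {name: management_priority(bd, som) for name, (bd, som) in zones.items()}
print(priorities)
```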
The current average BD of 1.32 g cm−3 across northeast China is approximately 20% higher than pre-cultivation mollisol BD values (~1.10 g cm−3), and average SOM has fallen below the critical threshold specifically in zones with resource endowments predisposing to degradation (sandy texture, low clay, semi-arid climate in WSA and parts of XAL). These differential degradation trajectories directly reflect resource endowment vulnerability: zones with coarse-textured parent materials, low baseline SOM, and water limitation are inherently more susceptible to compaction and organic matter loss under intensive cultivation [2,7,26]. Reversal of these trends will require targeted interventions: subsoiling or deep tillage to break compaction layers (effective for BD > 1.35 g cm−3) must be combined with aggressive organic amendment application (straw incorporation, manure) to rebuild SOM toward the 30 g kg−1 target [6], while in fertile mollisol zones, conservation tillage and residue retention are sufficient to maintain existing favorable conditions.
3.6. From Mechanisms to Spatially Differentiated Management Implications
In zones with fertile mollisol or organic-rich alluvial endowments, management emphasis should shift to maintaining existing BD (<1.26 g cm−3) and SOM (>35 g kg−1) rather than aggressive enhancement, given diminishing returns beyond current levels [26,27]. Our results are consistent with previous studies in black soil regions [2,3], but differ from studies in southern red soil regions [7,9], where BD showed a consistent negative effect across all crops. This difference is mainly due to the paddy–upland rotation system in our study area, which creates the anaerobic conditions that allow moderate BD to have positive effects, while southern regions have more intensive tillage that leads to severe compaction. These findings carry direct implications for resource endowment-based soil management policy in northeast China. We propose management priorities stratified by resource endowment characteristics rather than by geographic zone names.
For areas characterized by sandy soil texture, the highest priority is to reduce BD by 0.1–0.2 g cm−3 through subsoil tillage every 2–3 years, which costs about RMB 200 per ha, and to increase SOM by 2–3 g kg−1 over 3–5 years through straw return, which costs about RMB 120 per ha per year. pH correction (mitigating alkalinity in these zones) is a secondary priority. These interventions can be further combined with appropriate irrigation interventions to improve water-use efficiency in these semi-arid sandy regions, an approach that has proven effective in similar contexts [28]. Fertilization inputs will have limited effect until physical constraints imposed by unfavorable texture and compaction are addressed. These interventions can increase yield by 0.8–1.2 t ha−1, with a return on investment of approximately 150% within 3 years, and BD targets of <1.30 g cm−3 and SOM targets of 28–32 g kg−1 should guide intervention intensity.
For areas with organic-rich alluvial or fertile mollisol endowments derived from loess parent materials, where baseline SOM is high (>35 g kg−1) and soil structure is favorable (BD < 1.26 g cm−3), management emphasis should shift to maintaining existing SOM levels through conservation tillage, residue retention, and reduced bare-fallow periods, while monitoring for pH decline driven by nitrogen over-application. Given diminishing marginal returns above current SOM levels, aggressive enhancement efforts are not justified by yield response curves.
For volcanic soil landscapes with high aluminum and iron oxide content in parent materials, targeted phosphorus management is critical for maize, addressing both the strong BD → AP negative pathway and the inherent P fixation chemistry. We recommend increasing the P application rate by 20–30% and applying it with organic fertilizer to reduce fixation, which costs about RMB 80 per ha per year and can increase maize yield by 0.5–0.7 t ha−1. For rice, SOM maintenance and pH optimization are primary objectives to complex aluminum and reduce fixation. Phosphorus application rates and timing must account for high fixation capacity inherent to this resource endowment.
For mountainous areas with coarse-textured spodosols derived from granitic parent materials, addressing BD compaction is the priority for both crops, complemented by TN and SOM restoration to bring soil quality toward mollisol benchmarks. The inherently low clay content and high stone content of these parent materials necessitate careful tillage management to avoid excessive compaction. For warm humid plains with intensive management and moderate baseline fertility, balanced NPK management with attention to potassium depletion is recommended, given the AK and Ks pathways identified for both rice and maize and the long history of crop K removal exceeding inputs [29].
3.7. Limitations
This study has several limitations. First, the study cannot account for interannual climate variability, which may affect absolute yield levels and the exact numerical values of the BD and SOM thresholds, nor can it capture the temporal dynamics of soil properties or carryover effects from previous crops. We expect this to have limited impact on the relative spatial patterns of soil constraints across ecological zones, because these patterns are governed by long-term resource endowment characteristics rather than short-term weather fluctuations. In addition, at the county scale such effects are averaged across thousands of fields and are therefore unlikely to alter the regional-scale relationships between soil properties and yield substantially. Furthermore, causal inference from observational cross-sectional data is inherently limited. Our SEM is constructed on well-established soil science principles (e.g., BD and pH as upstream regulators of SOM accumulation) rather than exploratory data mining, but the identified pathways should still be interpreted as mechanistic hypotheses rather than definitive causal relationships.
Second, we did not explicitly account for spatial autocorrelation in the statistical models. The 201 counties are geographically clustered, and neighboring counties may share similar soil properties and agronomic practices, which could lead to underestimated standard errors and inflated significance levels for individual county-level coefficients. We expect this to have limited impact on our core conclusions, which concern the relative ranking of soil factors across ecological zones, the mechanistic pathways linking soil properties to crop yield, and the threshold values for bulk density and soil organic matter, because the observed soil constraint patterns are driven by long-term resource endowments. Our stratified random sampling design, which allocated sampling points proportionally to cropland area within each county and distributed samples across all six ecological zones, partly mitigates this bias. Future research will incorporate spatial econometric models to explicitly account for spatial dependence and refine county-level yield predictions.
Third, our analysis did not incorporate soil biological indicators such as microbial biomass carbon, enzyme activities, or earthworm density, which are known to mediate the conversion of SOM into plant-available nutrients [31,32]. Our sensitivity analysis shows that this exclusion leads to a less than 8% bias in our threshold values, which does not change our core conclusions; integrating these biological dimensions into future pathway models could further resolve the mechanisms linking soil physicochemical properties to crop yield and partition the effects more precisely. The threshold values for BD and SOM identified here should be treated as indicative benchmarks pending validation through controlled field experiments. Despite these limitations, this large-scale, spatially comprehensive dataset provides a robust baseline for understanding spatially differentiated soil constraints in northeast China’s black soil region and lays a foundation for future long-term experimental validation [30].