Dynamic Selection Strategy for Cucumber Temperature Management Models in Solar Greenhouses Based on Microclimate Similarity

Xu, Hui; Hu, Zhihang; Xu, Ming; Ding, Juanjuan; Chen, Shijun; Li, Zhulin; Li, Tianlai

doi:10.3390/agriculture16101093


by Hui Xu ^1,2,3, Zhihang Hu ^1,2,3, Ming Xu ^1, Juanjuan Ding ^1,2,3, Shijun Chen ^1, Zhulin Li ^1,2,3,* and Tianlai Li ^1,2,3

^1 College of Horticulture, Shenyang Agricultural University, Shenyang 110866, China

^2 Key Laboratory of Protected Horticulture, Ministry of Education, Shenyang 110866, China

^3 National & Local Joint Engineering Research Center of Northern Horticultural Facilities Design & Application Technology (Liaoning), Shenyang 110866, China

^* Author to whom correspondence should be addressed.

Agriculture 2026, 16(10), 1093; https://doi.org/10.3390/agriculture16101093 (registering DOI)

Submission received: 10 March 2026 / Revised: 3 May 2026 / Accepted: 12 May 2026 / Published: 16 May 2026

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)


Abstract

The temperature management models for solar greenhouses exhibit strong regional dependency. Their application in non-target environments often faces significant limitations, frequently resulting in severe temperature control deviations. To address this challenge, seven solar greenhouses located in Lingyuan (Liaoning Province) and Yinan (Shandong Province) were utilized as experimental platforms. Using real-time environmental data collected by the NEUT-80S IoT monitoring system, backpropagation (BP) neural network models were trained and validated. Multiple stepwise regression analysis identified total solar radiation and sunshine duration as the primary determinants of cucumber yield. Based on these findings, a dynamic weight matrix was constructed using a solar radiation clustering algorithm. By integrating similarity distance and similarity coefficient, a microclimate similarity determination logic was established, leading to the proposal of an automatic model selection strategy with an 11-day update cycle. Quantitative validation demonstrated that when the threshold conditions—a similarity coefficient (R) ≥ 0.6 and a similarity distance (D) ≤ 0.85—are met, triggering the optimally matched model significantly improves the simulation goodness-of-fit (R²) from 0.6716 in the unmatched state to 0.9851. This strategy effectively achieves the cross-regional adaptation of high-yield temperature management models, providing robust technical support for the advancement of precision protected agriculture.

Keywords:

solar greenhouse; cucumber; model transferability; similarity assessment; dynamic selection strategy

1. Introduction

Solar greenhouses (SGs) serve as a cornerstone for winter vegetable production in Northern China, playing a pivotal role in ensuring year-round supply and advancing agricultural sustainability [1,2]. In the cultivation of greenhouse vegetables such as cucumber (Cucumis sativus L.), the precision regulation of temperature and illumination is directly linked to crop physiological development and productivity. However, the inherent nonlinearity and strong coupling characteristics of greenhouse microclimate systems pose significant challenges for traditional prediction models. With the rapid advancement of Artificial Intelligence (AI), data-driven models—exemplified by Back Propagation (BP) neural networks—have demonstrated substantial advantages in processing complex meteorological factor predictions [3,4].

Furthermore, precision microclimate management is not merely an engineering challenge; it is a critical determinant of crop physiological responses. Recent studies have highlighted that environmental stresses, such as extreme heat and low light, directly disrupt plant morphology and physiological functions. For instance, heat stress can restrict carbon assimilation, induce oxidative damage, and trigger developmental reprogramming in vegetable crops [5]. Simultaneously, fluctuations in radiation and light regimes not only alter canopy leaf temperature and the maximum quantum yield of photosystem II (Fv/Fm), but also profoundly impact the ultimate crop yield and nutritional quality [6]. Notably, low-light conditions within greenhouses have been proven to inhibit the activity of carbon-assimilating enzymes, thereby inducing physiological disorders and substantial yield reductions in crops such as cucumbers [7]. Consequently, environmental prediction models must exhibit high sensitivity to these microclimatic fluctuations to ensure optimal physiological performance and yield formation in crops.

Currently, greenhouse environmental prediction has entered an era of multi-algorithm integration. To address the inability of traditional models to accurately predict greenhouse microclimates, variable-weight ensemble models [8] and multi-spatiotemporal feature convolutional neural networks (CNNs) [9] have been used and shown to significantly enhance the simulation accuracy of local microclimates. Nevertheless, significant spatial heterogeneity persists within greenhouse environments [10], where the optimization of sensor network deployment [11] and localized alterations in ventilation regimes [12] further intensify the complexity of microclimate assessment. Although multi-algorithm clustering has been employed to facilitate feature characterization and to mitigate the interference of dynamic external environments on the accurate identification of key feature relationships [13], the variations in physiological responses (such as morphological development and water transport) under heterogeneous conditions [14] continue to restrict the generalizability of experience-driven models during cross-regional implementation. Such profound reliance on specific environments renders existing models inherently inflexible, frequently resulting in substantial temperature control deviations when subjected to geographical shifts or climatic fluctuations. This vulnerability constitutes a major technical bottleneck for the broader application of current models in protected agriculture.

Model transferability remains a prominent challenge in contemporary ecological and agricultural modeling [15]. Although incorporating the Transformer architecture [16] or multimodal deep transfer learning algorithms [17] can enhance model generalization, distributional shifts in the application environment (environment shifts) continue to cause significant increases in prediction error [18]. Currently, while research into climate suitability zoning has achieved notable progress [19], there remains a critical lack of online decision-making systems capable of adapting to real-time dynamic evolution. This deficiency complicates the realization of dynamic performance evaluation and autonomous model switching [20]. Consequently, this study proposes a dynamic matching framework based on microclimate similarity determination. By quantitatively evaluating the similarity between the target greenhouse and the original modeling environment in real time, this strategy triggers the model to autonomously switch across different geographical regions and growth stages. This approach thereby mitigates the predictive failures of traditional models caused by environmental shifts.

To address the dilemma of multi-model adaptation, identifying core driving factors through feature selection methods constitutes the primary step toward precise decision-making [21,22]. Climate similarity analysis has been successfully applied to fine-scale crop zoning and cross-regional technology transfer [23,24]; furthermore, integrating weighted similarity analysis enables the identification of feature correlations and effectively mitigates assessment uncertainty [25]. Given that greenhouse environments exhibit multi-period dynamic response characteristics that evolve alongside growth cycles [26], predictive accuracy can be improved through multiple stepwise regression [27] or automatically optimized hybrid models [28]. Nevertheless, there remains a pressing need to develop intelligent decision-making systems capable of autonomous model switching based on real-time data streams [29].

To address the poor adaptability and limited generalization capabilities of empirical high-yield models when applied to non-modeling regions, this study analyzed the systematic microclimate differentiation characteristics of solar greenhouses in Lingyuan and Yinan. Specifically, the research objectives are threefold: (1) to identify the primary meteorological factors influencing greenhouse cucumber yield through stepwise regression analysis; (2) to construct a weighted decision system based on microclimate similarity via clustering analysis of these primary factors; and (3) to establish an automated dynamic model selection strategy, followed by rigorous validation using independent greenhouse data. This framework achieves precise matching between management models and dynamic environmental conditions, providing a robust tool for ensuring stable and high yields of vegetables in solar greenhouses. The technical roadmap is illustrated in Figure 1.

2. Materials and Methods

2.1. Test Materials and Environment

The data collection for this study spanned from 1 December 2019 to 30 March 2022, utilizing overwintering cucumber (Cucumis sativus L., cv. ‘Jinyan No. 4’) as the experimental crop. The trials were conducted in Lingyuan City (Chaoyang, Liaoning Province) and Yinan County (Linyi, Shandong Province). Specifically, the Lingyuan experimental site (approx. 41° N, 119° E) is characterized by a mid-temperate continental monsoon climate, featuring protracted and severe winters with pronounced diurnal temperature amplitudes, yet possessing abundant solar radiation resources. Conversely, the Yinan site (approx. 35° N, 118° E) falls within a warm-temperate monsoon climate with four distinct seasons, where winter temperatures and extreme minimums are substantially milder than those in Lingyuan.

Guided by historical regional production performance, a total of seven solar greenhouses were selected across both locations (all operating under standard commercial production conditions throughout the experimental period). Over the past five years, the average annual yield of these greenhouses has consistently exceeded 35,000 kg per mu (1 mu ≈ 667 m²), representing an exceptionally high standard for overwintering cucumber production within these regions. As detailed in Table 1, the framework and wall structures of all seven greenhouses comply with local high-yield standards. Furthermore, their front roofs are uniformly covered with 0.12-mm-thick polyethylene (PE) film, yielding an initial solar radiation transmittance of approximately 88%.

In the model construction phase, the experimental greenhouses were partitioned into training and validation sets. Greenhouses No. 1, 2, and 3 in Fangzhuangzi Village (Yinan County) and No. 4, 5, and 6 in Songzhangzi Village (Lingyuan City) were designated as the training set. These were utilized to continuously collect internal microclimate parameters and other multifaceted factors for constructing the BP neural network models. To ensure the objectivity and independence of the evaluation, Greenhouse No. 7, situated in Gongjiashaoguo Village (Lingyuan City)—approximately 5 km away from Songzhangzi Village—was selected as the validation set. While sustaining standard commercial operations, this greenhouse exhibits a moderate micro-regional spatial separation from the training set. The actual measured environmental data from this greenhouse in January were extracted to independently validate the decision-making accuracy of the automated model selection system.

The planting parameters of overwintering cucumber (Cucumis sativus L. ‘Jinyan No. 4’) in the solar greenhouse are shown in Table 2.

This study utilized the NEUT-80S Internet of Things (IoT) environmental monitoring system, developed by the State Key Laboratory of Protected Horticulture at the College of Horticulture, Shenyang Agricultural University (Shenyang, China), to collect greenhouse environmental data in real time. In this experiment, environmental sensors in each greenhouse were positioned at the center of the greenhouse; the specific layout of the sensors is illustrated in Figure 2. The solar radiation sensor (MD6A, Liaoning Jiuyi Agriculture Co., Ltd., Shenyang, China) was mounted atop the Stevenson screen at a height of 2.0 m above ground level, ensuring an environment entirely free from any obstructions. Real-time data were acquired in W/m² (subsequently integrated and converted into MJ/m² for downstream analysis). The CO₂ concentration sensor (MD6A, Liaoning Jiuyi Agriculture Co., Ltd., Shenyang, China) was mounted inside the radiation shield. The air temperature and humidity sensor (MD6A, Liaoning Jiuyi Agriculture Co., Ltd., Shenyang, China) was positioned at a height of 1.2 m above the ground and was covered with a tinfoil hood to prevent measurement errors caused by direct solar radiation. Soil temperature and humidity sensors (MD6A, Liaoning Jiuyi Agriculture Co., Ltd., Shenyang, China) were placed in the soil at a depth of 0.15 m around the plant roots, with a data sampling frequency of once every 300 s. Yield data in this study were obtained from daily records maintained by the farmers.

2.2. Experimental Modeling

Data curation and graphical visualization were performed using Microsoft Excel 2019 (Microsoft Corporation, Redmond, WA, USA) and Origin 2018 (OriginLab Corporation, Northampton, MA, USA), whereas the underlying algorithm logic and system programs were developed in Python 3.9.7 (Python Software Foundation, Beaverton, OR, USA; https://www.python.org). Furthermore, all statistical operations—encompassing correlation analysis, stepwise regression, weight calculation, and similarity distance analysis—were uniformly executed using SPSS Statistics 27 (IBM Corp., Armonk, NY, USA).

In this study, a standard three-layer backpropagation (BP) neural network architecture was constructed. Preliminary data preprocessing and analyses were executed within the Matlab 2014 environment (The MathWorks, Inc., Natick, MA, USA). To eliminate dimensional discrepancies, the raw input data were mapped onto the [0, 1] interval using Min-Max Normalization [30]. Utilizing the trial-and-error method [31], the optimal number of nodes in the hidden layer was determined to be 11, thereby establishing the final network topology as 5-11-1, as depicted in Figure 3. The model adopted logsig and purelin as the transfer functions for the hidden and output layers, respectively. Furthermore, the Levenberg–Marquardt (LM) optimization algorithm was employed during the training phase to iteratively update the global weights and thresholds of the network [32].

Regarding the network training configurations, the input modeling data were randomly partitioned into training, validation, and test sets at a ratio of 70%, 15%, and 15%, respectively. The maximum number of training epochs was set to 1000, with the target mean squared error (MSE) specified as 1 × 10⁻⁵. Tailored to the characteristics of the LM algorithm, the initial Marquardt adjustment parameter (μ) was established at 0.001. Furthermore, a maximum of 6 validation failures (Max_fail) was defined to trigger the early stopping mechanism, thereby effectively mitigating model overfitting. The environmental parameters monitored in this study encompassed total solar radiation (R), soil temperature (Tsoil), soil moisture (Hsoil), CO₂ concentration (CO₂), air humidity (Hair), and air temperature (Tair).
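The 5-11-1 topology described above can be illustrated with a minimal forward pass. The sketch below is written in plain Python rather than the Matlab environment used in the study, and it rests on several assumptions: the function names are hypothetical, the weights are random placeholders rather than values fitted by the Levenberg–Marquardt algorithm, and the five inputs are taken to be the normalized non-temperature parameters with air temperature as the single output, which the text implies but does not state explicitly.

```python
import math
import random

N_INPUT, N_HIDDEN = 5, 11  # 5-11-1 topology from the text


def logsig(z):
    """Log-sigmoid transfer function used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def init_network(seed=0):
    """Random placeholder weights; a trained model would supply fitted values."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUT)]
                for _ in range(N_HIDDEN)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
    b_out = rng.uniform(-1, 1)
    return w_hidden, b_hidden, w_out, b_out


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: logsig hidden layer, purelin (identity) output."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical normalized inputs in [0, 1], assumed order:
# radiation, soil temperature, soil moisture, CO2, air humidity.
x = [0.6, 0.4, 0.5, 0.3, 0.7]
y = forward(x, *init_network())  # normalized output of an untrained network
```

Training, early stopping, and the 70/15/15 split are deliberately left out; the block only shows how the two transfer functions combine in the stated topology.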

2.3. Method for Confirming the Limitations of Two-Location Model Areas

To validate cross-regional model failure, the measured environmental factor data from each region were input into the model developed for the other region, and the degree of dispersion and error distribution of the outputs were compared. This provided a direct visual verification of the performance degradation and failure of regional models when applied across different geographical areas.

2.4. Establishment of the Model Applicability Determination System

To address the regional limitations revealed by the aforementioned validation, this study established a dynamic model selection framework centered on environmental similarity. The underlying construction logic of this framework is structured into the following three hierarchical layers.

2.4.1. The Method for Determining the Dominant Factors in a Greenhouse

The primary step in establishing the decision-making system is the identification of core indicators driving yield. Consequently, this study employed Pearson correlation analysis and multiple stepwise regression [33] to identify these key metrics. The method follows the principle of “dynamic entry and removal”, iteratively evaluating the contribution of candidate variables based on the F-test. Specifically, features are introduced sequentially according to their significance, while redundant factors that become non-significant upon the inclusion of new variables are simultaneously eliminated. Once the equation reaches a steady state—where no further variables can be significantly added or removed—the extraction of dominant factors is finalized, and the optimal predictive equation is established. The equation is as follows:

Y = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2}

(1)

where Y is the yield of greenhouse cucumber (kg); X₁ and X₂ are the first and second dominant meteorological factors influencing yield, respectively; β₁ and β₂ are the partial regression coefficients, which quantify the degree of contribution of a specific environmental factor to the yield while keeping other factors constant; and β₀ is the constant term, representing the baseline influence of extraneous factors not included in the model.

2.4.2. The Establishment Method of the System for Determining the Similarity of Microclimates in Greenhouses

(1): Method for establishing multi-dimensional spatial sequences

To circumvent the inherent limitations of single-factor yield assessment [34], this study introduced a multi-dimensional spatial sequence processing methodology [35]. Given the profound impact of the seasonal and monthly evolution of solar radiation on the stability of the indoor environment, the K-means clustering algorithm was employed within a Python environment to conduct unsupervised classification of the daily average solar radiation from the Lingyuan and Yinan greenhouses, thereby facilitating the construction of a multi-dimensional spatial sequence.

During the algorithm initialization and parameter configuration phase, Euclidean distance was employed as the metric to quantify sample similarity, while a fixed random state was assigned to guarantee the reproducibility of the clustering computations. To rigorously determine the optimal number of clusters (k), this study established a classification boundary evaluation criterion: specifically, “the maximum distance from a data point to its assigned cluster center must be strictly less than the minimum distance between cluster centers.” By iteratively calculating the center-to-data distance ratios across various candidate k values, the system aimed to eliminate inter-cluster overlap. This process optimally identified the best multi-dimensional spatial sequence partitioning scheme, thereby driving the subsequent dynamic weight allocation of environmental factors. Furthermore, the minimum duration (in days) of each microclimate stage, as resolved by the optimal clustering scheme, directly served as the core statistical foundation for establishing the “Time Window” mechanism in the system’s dynamic evaluation.
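The boundary criterion quoted above can be sketched in a few lines. The example below is a simplified stand-in, not the study's implementation: it uses a hand-written one-dimensional Lloyd's algorithm with deterministic quantile seeding in place of a library K-means with a fixed random state, the function names are hypothetical, and the radiation values are invented for illustration.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalar data with deterministic quantile seeding."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * c + 1) * n // (2 * k)] for c in range(k)]

    def assign(current):
        # Each value goes to its nearest center (absolute difference in 1-D).
        return [min(range(k), key=lambda c: abs(v - current[c])) for v in values]

    for _ in range(max_iter):
        labels = assign(centers)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return centers, assign(centers)


def separation_ratio(values, centers, labels):
    """Max point-to-own-center distance divided by min center-to-center distance.

    A ratio below 1 satisfies the criterion quoted in the text.
    """
    max_point = max(abs(v - centers[lab]) for v, lab in zip(values, labels))
    min_center = min(abs(a - b)
                     for i, a in enumerate(centers) for b in centers[i + 1:])
    return max_point / min_center


# Invented daily mean radiation values with three visible groups.
radiation = [2.0, 2.2, 2.4, 8.0, 8.3, 8.1, 14.0, 14.5, 14.2]
ratios = {}
for k in (2, 3, 4):
    centers, labels = kmeans_1d(radiation, k)
    ratios[k] = separation_ratio(radiation, centers, labels)
accepted = [k for k, ratio in ratios.items() if ratio < 1]
```

How the study breaks ties when several candidate k values pass the criterion is not stated, so the sketch only lists the passing candidates.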

(2): The calculation method for the weight of each factor’s influence on yield in different spatial sequences

To enable the dynamic evolution of the decision-making logic, this study aimed to quantify the dynamic influence of meteorological factors on yield during the cucumber harvest period and construct a differentiated weighting assessment system. The weighting coefficients of various factors vary across distinct experimental stages. By analyzing the cumulative radiation and sunshine duration for each stage and experimental site, each factor was subjected to segmented processing to identify its specific influence across different production phases. The formula for the weighting coefficient is defined as follows:

P_{(i, k)} = \frac{R_{i k}^{2} \, |\alpha_{i k}|}{\sum_{j = 1}^{n} \left( R_{i j}^{2} \times |\alpha_{i j}| \right)}

(2)

where the R² values for different stages characterize the degree of correlation between environmental factors and yield across various time intervals. The partial regression coefficients (α_ij) for distinct periods reflect the magnitude of influence exerted by these factors on yield during each respective timeframe. The absolute value |α_ij| is used to calculate the normalized weights (with the sum of the weights equal to 1), eliminating the interference of the positive or negative signs of correlation on the evaluation of the contribution degree.
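As a rough illustration of Equation (2), the sketch below normalizes the products R² × |α| so that the weights within one stage sum to 1. It assumes that the normalization runs over the factors within a stage and that each factor has its own R² value; the function name and all numbers are hypothetical and are not taken from Table 4.

```python
def stage_weights(r_squared, alpha):
    """Normalized weights per Equation (2) for one stage.

    r_squared and alpha map factor name -> R^2 and partial regression
    coefficient; absolute values remove the effect of the coefficient sign.
    """
    contrib = {name: r_squared[name] * abs(alpha[name]) for name in r_squared}
    total = sum(contrib.values())
    if total == 0:
        raise ValueError("all contributions are zero; weights are undefined")
    return {name: value / total for name, value in contrib.items()}


# Invented inputs for a single stage (illustration only).
weights = stage_weights(
    r_squared={"radiation": 0.5, "sunshine": 0.8},
    alpha={"radiation": -2.0, "sunshine": 1.0},
)
```

With these invented inputs the negative radiation coefficient still contributes positively to its weight, which is the purpose of taking |α|.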

(3): Calculation of multi-factor and multi-period similarity distance and similarity coefficient

Following the determination of the dominant factors, this study introduced and improved a decision-making algorithm based on similarity distance and similarity coefficients [23,36] to quantitatively evaluate the matching degree between the microclimates of the experimental site and the modeling greenhouses (Lingyuan and Yinan). This approach aims to automatically adapt the optimal management model for different crop developmental stages through the similarity comparison of multi-dimensional environmental factors. The formulas are as follows:

Similarity coefficient formula:

R(i, i_{1}) = \frac{\sum_{j = 1}^{m} \sum_{k = 1}^{l} \left( V(i, j, k) - \bar{V}(i, j) \right) \times \left( V(i_{1}, j, k) - \bar{V}(i_{1}, j) \right)}{\sqrt{\sum_{j = 1}^{m} \sum_{k = 1}^{l} \left( V(i, j, k) - \bar{V}(i, j) \right)^{2}} \sqrt{\sum_{j = 1}^{m} \sum_{k = 1}^{l} \left( V(i_{1}, j, k) - \bar{V}(i_{1}, j) \right)^{2}}}

(3)

Similarity distance formula:

D(i, i_{1}) = \sqrt{\sum_{j = 1}^{m} \sum_{k = 1}^{l} \left( V(i, j, k) - V(i_{1}, j, k) \right)^{2}}

(4)

where R(i, i₁) and D(i, i₁) represent the similarity coefficient and similarity distance of station i relative to the reference station i₁, respectively, with i₁ signifying the reference region or reference point. V(i, j), which carries an overbar in Equation (3), denotes the value of the meteorological elements at station i, while m represents the number of factors and l the number of time periods. In these equations, V(i, j, k) = p(j, k)x(i, j, k) and V(i, j) = p(j)x(i, j), where p(j, k) is the weight coefficient of the j-th factor during the k-th period, and p(j) is the average weight of the factor determined by its contribution in the stepwise regression. x(i, j, k) and x(i, j) represent the normalized values of the environmental parameters.
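Equations (3) and (4) can be sketched together, since both run over the same factor-by-period grid. The block below is an illustrative reading, not the authors' program: the function names are hypothetical, the data are invented, and the overbar term is approximated as the mean of V(i, j, k) over the periods of each factor, which is a simplifying assumption because the text defines it through the average weight p(j).

```python
import math


def weighted(x, p):
    """V(j, k) = p(j, k) * x(j, k) for one station (m-by-l nested lists)."""
    return [[pjk * xjk for pjk, xjk in zip(p_row, x_row)]
            for p_row, x_row in zip(p, x)]


def similarity(v_site, v_ref):
    """Return (R, D) for a site and a reference station.

    Assumption: the per-factor mean is taken over the l periods.
    """
    num = ss_site = ss_ref = sq_diff = 0.0
    for row_site, row_ref in zip(v_site, v_ref):
        mean_site = sum(row_site) / len(row_site)
        mean_ref = sum(row_ref) / len(row_ref)
        for a, b in zip(row_site, row_ref):
            num += (a - mean_site) * (b - mean_ref)
            ss_site += (a - mean_site) ** 2
            ss_ref += (b - mean_ref) ** 2
            sq_diff += (a - b) ** 2
    if ss_site == 0 or ss_ref == 0:
        raise ValueError("similarity coefficient is undefined for constant series")
    return num / math.sqrt(ss_site * ss_ref), math.sqrt(sq_diff)


# Invented normalized values: 2 factors (rows) by 3 periods (columns).
x_site = [[0.2, 0.8, 0.6], [0.4, 0.5, 0.9]]
x_ref = [[0.3, 0.7, 0.6], [0.5, 0.5, 0.8]]
p = [[0.6, 0.1, 0.5], [0.4, 0.9, 0.5]]  # hypothetical stage weights
r, d = similarity(weighted(x_site, p), weighted(x_ref, p))
```

Comparing a station with itself gives R = 1 and D = 0, which is a quick sanity check on any implementation of these two formulas.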

However, due to the significant disparities in measurement units and numerical magnitudes among greenhouse environmental data, normalization of the raw meteorological data was performed to eliminate calculation errors stemming from varying dimensions. This procedure maps the values of each factor onto the range of [0, 1]. The formula is expressed as follows:

x(i, j) = \frac{\max(X(j)) - X(i, j)}{\max(X(j)) - \min(X(j))}

(5)
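A minimal sketch of Equation (5) as written follows; the function name and sample values are hypothetical. Note that in this form the largest raw value maps to 0 and the smallest to 1, so the scale is reversed relative to the more common (X − min)/(max − min) variant while still covering [0, 1].

```python
def normalize(raw):
    """Apply Equation (5): (max - X) / (max - min) for one factor."""
    hi, lo = max(raw), min(raw)
    if hi == lo:
        raise ValueError("constant series cannot be normalized")
    return [(hi - value) / (hi - lo) for value in raw]


# Invented daily radiation totals in MJ/m^2.
scaled = normalize([2.0, 4.0, 6.0])  # -> [1.0, 0.5, 0.0]
```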

Furthermore, to establish the trigger thresholds for the decision-making system, the probability distributions of historical greenhouse environmental data from Lingyuan and Yinan were analyzed to characterize the baseline numerical distribution patterns of similarity distance and similarity coefficients. On this basis, by integrating on-site production survey data with expert empirical judgment, the suitable numerical intervals for similarity distance and similarity coefficients were defined for this study.

2.4.3. The Establishment and Verification Method of the Automatic Selection Model Strategy in Greenhouses

To overcome the limitations associated with the cross-regional application of empirical high-yield models, this study constructed an automated dynamic selection strategy for greenhouse models within a Python environment. Driven by real-time microclimate data, the core decision-making logic of this system comprises the following four progressive modules.

(1) Data Preprocessing: The system extracts dominant microclimate factors in real time and normalizes them to the [0, 1] interval using Equation (5) to eliminate dimensional discrepancies. To guarantee the statistical significance of the similarity calculations, a “Time Window” mechanism was introduced for data buffering. The length of this window period is strictly defined as the minimum duration (in days) derived from the multi-dimensional spatial sequence clustering, serving as the minimum sample size boundary for model determination.

(2) Similarity Calculation: Integrating the dynamic weight matrices of the predefined spatial sequences from both locations, the algorithm executes Equations (3) and (4) to compute the similarity distance (D) and similarity coefficient (R) of the data within the buffer period.

(3) Decision Matching: The system conducts traversal optimization based on predefined empirical thresholds. If the current indicators (D and R) fall within the optimal similarity interval, the system automatically activates the corresponding high-yield BP neural network model to output precise temperature control targets. Conversely, if the threshold constraints are breached, an “incompatible model” warning is triggered to prevent blind regulation.

(4) Cyclic Updating: To dynamically adapt to the seasonal evolution of the greenhouse microclimate, the system continuously advances the time window using the predefined window length as a fixed step size. Upon completing each step cycle, the newly accumulated microclimate data trigger a new round of evaluation and model switching.
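The decision-matching and cyclic-updating modules can be sketched as follows. This is an illustrative outline under stated assumptions, not the system's code: the function names are hypothetical, the thresholds (R ≥ 0.6, D ≤ 0.85) and the 11-day cycle are the values reported in the abstract, and the tie-breaking rule (smallest D first, then largest R) is an assumption, since the text does not specify how the optimal model is chosen among eligible candidates.

```python
R_MIN, D_MAX = 0.6, 0.85  # thresholds reported in the abstract
WINDOW_DAYS = 11          # update cycle reported in the abstract


def select_model(scores, r_min=R_MIN, d_max=D_MAX):
    """Pick a model from scores = {model name: (R, D)}.

    Returns None when no model meets both thresholds, which corresponds
    to the "incompatible model" warning described in the text.
    """
    eligible = {name: (r, d) for name, (r, d) in scores.items()
                if r >= r_min and d <= d_max}
    if not eligible:
        return None
    # Assumed ranking: smallest distance first, then largest coefficient.
    return min(eligible, key=lambda name: (eligible[name][1], -eligible[name][0]))


def rolling_windows(days, window=WINDOW_DAYS):
    """Split daily records into consecutive full windows (step = window length)."""
    return [days[start:start + window]
            for start in range(0, len(days) - window + 1, window)]


# Invented similarity scores for one window.
choice = select_model({"Lingyuan": (0.90, 0.30), "Yinan": (0.70, 0.60)})
windows = rolling_windows(list(range(31)))  # 31 days -> two full 11-day windows
```

Days that do not yet fill a complete window are simply held back in this sketch until enough data accumulate, which mirrors the buffering role of the time window.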

3. Results

3.1. Judgment of the Limitations of the Two-Location Model Area

To validate the regional dependency of the models, the microclimate data from both locations were respectively fed into the models developed for their non-corresponding sites to conduct cross-simulations. As illustrated in Figure 4, when the Lingyuan data were applied to the Yinan model, the simulated maximum midday temperatures were over 5.6 °C lower than the actual requirements, whereas the outputs during non-midday periods were overestimated by up to approximately 3.5 °C. Conversely, inputting the Yinan data into the Lingyuan model resulted in simulated nocturnal temperatures that were 3.9 °C higher than the actual values. These experimental results demonstrate that applying a model to a non-target environment causes the temperature control objectives to deviate severely from the normal physiological demands of the crops. This unequivocally confirms that the Lingyuan and Yinan models exhibit pronounced regional dependency and lack mutual transferability.

3.2. Establishment of a System for Determining the Similarity of Microclimates in Greenhouses

3.2.1. Establishment of the Dominant Factors in the Greenhouse

The results of the Pearson correlation analysis revealed that greenhouse cucumber yield exhibited highly significant positive correlations with sunshine duration (r = 0.638), air temperature (r = 0.632), and total solar radiation (r = 0.623) (p < 0.01). Conversely, it demonstrated highly significant negative correlations with air humidity (r = −0.565) and CO₂ concentration (r = −0.530) (p < 0.01). However, in actual high-yield greenhouse production—particularly in Yinan—air temperature is highly susceptible to severe interference from manual management practices, such as ventilation. Consequently, in conjunction with the stepwise regression equation, the following relationship was established:

Lingyuan: Y = 14.7fz + 0.983rs − 438.67 (R² ≥ 0.863)

Yinan: Y = 5.307fz + 3.338rs − 1015.627 (R² ≥ 0.776)

In the equations, Y denotes the cucumber yield (kg), fz represents the total solar radiation (MJ/m²), and rs signifies the sunshine duration (min). A comparison of the partial regression coefficients reveals a significant regional differentiation in the light-thermal demand characteristics between the two locations. Specifically, the radiation coefficient for Lingyuan (14.7) is 2.77 times that of Yinan (5.307). This indicates that in the higher-latitude Lingyuan region, characterized by severely cold winters, an increase in unit radiation exerts a stronger yield-enhancing effect, demonstrating the crops’ extreme dependence on solar radiation and radiant heat. Conversely, the partial regression coefficient for sunshine duration (rs) in the Yinan equation (3.338) is substantially higher than that in Lingyuan (0.983). This suggests that in the relatively warmer Yinan region, provided the baseline solar radiation is met, the duration of effective photosynthesis (photoperiod) becomes the more sensitive factor limiting crop yield. These quantitative differences corroborate the necessity of incorporating a dynamic weighting mechanism into the similarity evaluation system.
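To make the regional contrast concrete, the two fitted equations can be evaluated on the same input. The coefficients below are those reported above; the function name and the input values (10 MJ/m² and 400 min) are hypothetical, chosen only for illustration, and may lie outside the range over which the regressions were fitted.

```python
# (fz coefficient, rs coefficient, constant) from the two regional equations.
MODELS = {
    "Lingyuan": (14.7, 0.983, -438.67),
    "Yinan": (5.307, 3.338, -1015.627),
}


def predicted_yield(site, fz, rs):
    """Evaluate Y = b_fz * fz + b_rs * rs + b0 for the given site."""
    b_fz, b_rs, b0 = MODELS[site]
    return b_fz * fz + b_rs * rs + b0


# Same hypothetical conditions fed to both equations.
fz, rs = 10.0, 400.0
lingyuan = predicted_yield("Lingyuan", fz, rs)  # 147 + 393.2 - 438.67
yinan = predicted_yield("Yinan", fz, rs)        # 53.07 + 1335.2 - 1015.627

# Ratio of the radiation coefficients quoted in the text.
radiation_ratio = MODELS["Lingyuan"][0] / MODELS["Yinan"][0]
```

The two equations return very different yields for identical inputs, which is the practical meaning of the regional dependency discussed here.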

Furthermore, the coefficient of determination (R²) for the Lingyuan equation (0.863) outperforms that of Yinan (0.776). This indicates that the Lingyuan greenhouses operate under a closed, thermally insulated state during the severe winter, resulting in an exceptionally high linear concordance between crop yield and natural light-thermal fluctuations. Conversely, frequent manual interventions in Yinan, such as winter ventilation management, weaken the direct explanatory power of natural factors on yield, thereby generating slightly larger residuals. Nevertheless, both equations can effectively account for over 77% of the yield variations. Consequently, driven by the objective to eliminate the interference of artificial ventilation and cooling, and in conjunction with the significance of the partial regression coefficients, this study ultimately established total solar radiation (fz) and sunshine duration (rs)—which represent the natural climatic baseline—as the core evaluation indicators for the similarity determination system.

3.2.2. The Construction Result of the Multi-Dimensional Space Sequence

Table 3 presents the clustering analysis results of the daily average solar radiation for the Yinan and Lingyuan greenhouses, furnishing a statistical foundation for regional microclimate partitioning. The determination criterion dictates that if the maximum distance between a data point and its assigned cluster center exceeds the minimum distance between cluster centers, it signifies inter-cluster overlap, thereby rendering the classification scheme unreliable. For the Yinan greenhouses, the 4-cluster scheme achieved a center-to-data distance ratio of 0.645 (2.11 vs. 3.27), effectively circumventing the inter-cluster overlap observed in the 2- and 5-cluster schemes. Similarly, the Lingyuan greenhouses were optimized into four clusters with a maximum intra-cluster distance of 3.47, thereby ensuring the consistency of microclimate characteristics across various growth stages.

Figure 5 illustrates the clustering results and sample distribution based on solar radiation during the fruiting period for both the Yinan and Lingyuan greenhouses. It is evident that the microclimate within solar greenhouses exhibits a pronounced temporal imbalance. Specifically, the A₂ stage in Lingyuan persists for 71 days (accounting for approximately 44.7% of the fruiting period), manifesting long-term environmental stability; conversely, the A₃ stage spans a mere 11 days, characterizing a highly transient transition period. This minimum sample duration of 11 days directly dictates the statistical boundary for the system’s “11-day rolling update cycle.” By aligning the update frequency with the shortest observed clustering cycle, this strategy guarantees that the system can simultaneously capture fleeting environmental shifts and retain the adequate data granularity requisite for modeling. Consequently, this enables the precise adaptation of the model to seasonal dynamic evolutions.
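The link between stage durations and the update cycle can be sketched by measuring consecutive runs of cluster labels and taking the shortest one. In the example below only the 71-day A₂ stage and the 11-day A₃ stage come from the text; the function names, the A₁ and A₄ lengths, and the assumption that each stage is one contiguous block are hypothetical, with the lengths chosen so that the total of 159 days is consistent with the reported 44.7% share.

```python
from itertools import groupby


def stage_durations(labels):
    """Lengths of consecutive runs of identical stage labels, in order."""
    return [(stage, len(list(run))) for stage, run in groupby(labels)]


def window_length(labels):
    """Shortest stage duration, used as the rolling update cycle."""
    return min(length for _, length in stage_durations(labels))


# Daily stage labels: A2 = 71 d and A3 = 11 d are from the text,
# A1 = 30 d and A4 = 47 d are invented placeholders.
labels = ["A1"] * 30 + ["A2"] * 71 + ["A3"] * 11 + ["A4"] * 47
cycle = window_length(labels)                  # shortest run of labels
a2_share = 71 / len(labels) * 100              # share of the fruiting period
```

If a stage were split into several non-contiguous runs, this run-based definition would return the shortest fragment, so real data may need smoothing before the window length is read off.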

3.2.3. Determination of the Weight Coefficients of Total Solar Radiation and Sunshine Duration in Different Spatial Sequences for Yield

Table 4 details the relative weights of the factors influencing yield across different growth stages in Lingyuan (Liaoning) and Yinan (Shandong). The results indicate that these factor weights fluctuate markedly as the growth stages progress. In Lingyuan, the weight of sunshine duration rises sharply from 0.1805 in the A₁ stage to a predominant 0.9581 in the A₂ stage, a 5.3-fold increase. This quantitative shift reflects that during the core period of severe winter, because low outdoor temperatures constrain the greenhouse’s thermal buffering capacity, the sensitivity of cucumber yield to sunshine duration is markedly heightened. Insufficient light directly restricts indoor heat storage and temperature elevation, thereby suppressing plant photosynthesis and the accumulation of soluble solids. This in turn impedes the synthesis of proteins and organic matter and ultimately reduces yield. Consequently, sunshine duration emerges as the critical limiting factor for crop production during this stage.
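As a rough cross-check, the weights reported in Table 4 appear to be reproducible by normalizing the product R²·|αij| across the two factors within each stage. This relation is inferred here from the tabulated numbers and is an assumption; the authoritative definition is the one given in the methods section. The sketch below, with the hypothetical helper `sunshine_weight`, recomputes the sunshine-duration weights from the Table 4 values.

```python
# Values copied from Table 4. Keys are (site, stage index 1-4).
# Each entry: (R2, alpha) for sunshine hours, (R2, alpha) for total solar
# radiation, and the sunshine-hours weight reported in the table.
TABLE_4 = {
    ("Lingyuan", 1): ((0.0993, -0.1401), (0.1865, 0.3385), 0.1805),
    ("Lingyuan", 2): ((0.1054, -0.4274), (0.0065, 0.3038), 0.9581),
    ("Lingyuan", 3): ((0.0495, 0.5254), (0.0076, -0.4941), 0.8737),
    ("Lingyuan", 4): ((0.0122, -0.2639), (0.0129, 0.2652), 0.4840),
    ("Yinan", 1): ((0.5125, 0.5651), (0.2840, 0.0172), 0.9834),
    ("Yinan", 2): ((0.0927, -0.1997), (0.1047, -0.2294), 0.4351),
    ("Yinan", 3): ((0.0139, 0.0603), (0.0107, 0.0187), 0.8080),
    ("Yinan", 4): ((0.0022, 0.0548), (0.2313, -0.4816), 0.0011),
}


def sunshine_weight(sunshine, radiation):
    """Assumed rule (inferred, not stated in this section):
    weight is proportional to R2 * |alpha|, normalized so both weights sum to 1."""
    s = sunshine[0] * abs(sunshine[1])
    r = radiation[0] * abs(radiation[1])
    return s / (s + r)


# Largest gap between the recomputed and the reported sunshine weights.
max_gap = max(
    abs(sunshine_weight(sun, rad) - reported)
    for sun, rad, reported in TABLE_4.values()
)
print(f"largest deviation from Table 4: {max_gap:.4f}")

# Fold change of the Lingyuan sunshine weight from stage A1 to stage A2.
fold = TABLE_4[("Lingyuan", 2)][2] / TABLE_4[("Lingyuan", 1)][2]
print(f"A1 to A2 fold change: {fold:.1f}")
```

Under this assumed rule the recomputed weights agree with the tabulated ones to within rounding, and the radiation weight is simply one minus the sunshine weight.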

3.2.4. Determination of the Appropriate Intervals for the Similarity Distances and Similarity Coefficients of Multiple Factors and Multiple Time Periods in Two Locations

Table 5 details the distribution of the similarity coefficients (R) and similarity distances (D) derived from the two microclimate factors (total solar radiation and sunshine duration) within the Lingyuan and Yinan greenhouses. The computed similarity distances span a wide range, from 0.00628 to 26.51, with a mean value of 3.301. Consequently, the determination of greenhouse microclimate similarity must reference these numerical distribution characteristics and be evaluated in conjunction with the practical logic of facility production.

Table 6 defines the suitable intervals for the similarity distance (D) and similarity coefficient (R). By establishing the “applicable” upper threshold at 0.85, the system can selectively filter out the top 75% of highly divergent environments, thereby guaranteeing that the model is activated only under conditions of high similarity. If D > 0.85, the microclimate characteristics of the experimental site differ markedly from those of the target greenhouses, and the existing models are inapplicable. Only when R ≥ 0.6 and D ≤ 0.85 can one of the pre-established models be applied to the experimental site. The optimal model is then selected based on the similarity evaluation outcomes to guide agricultural production.

3.3. Establishment of an Automatic Selection Model Strategy for Greenhouses

In this study, an automated greenhouse model selection system was developed within a Python environment, the core architecture of which comprises four distinct stages.

Data Accumulation: The system continuously aggregates 11 days of internal greenhouse data for total solar radiation (fz) and sunshine duration (rs). A cycle of 11 days was selected to guarantee statistical robustness, as this duration corresponds to the minimum sample requirement for multi-dimensional spatial sequences.

Similarity Discrimination: The accumulated datasets are integrated into the weight models of eight preset spatial sequences. Eight sets of similarity coefficients (R) and similarity distances (D) are calculated concurrently.

Model Matching: The system traverses the results to identify the minimum similarity distance (D_min), subject to the constraints of R ≥ 0.6 and D ≤ 0.85. If these conditions are met, the corresponding high-yield model for either Yinan or Lingyuan is automatically activated; otherwise, the system indicates that no adaptive model is available for the current environment.

Cyclic Update: Utilizing a recurring loop structure, the system triggers a re-assessment every 11 days to accommodate the seasonal dynamic evolution of the greenhouse microclimate, as illustrated in Figure 6.
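The model-matching stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `select_model`, the dictionary layout, and the rule that stages labelled A map to the Lingyuan model and stages labelled B to the Yinan model are assumptions made for the example. The (R, D) pairs are the values listed in Table 7, and the thresholds are those stated in the text.

```python
R_MIN = 0.6   # minimum similarity coefficient stated in the text
D_MAX = 0.85  # maximum similarity distance stated in the text

# (R, D) for each of the eight preset spatial sequences, copied from Table 7.
TABLE_7 = {
    "A1": (0.9233, 0.5651),
    "A2": (0.8345, 0.6569),
    "A3": (-0.0529, 7.9396),
    "A4": (-0.0963, 11.904),
    "B1": (-0.6120, 0.9007),
    "B2": (0.0580, 1.2409),
    "B3": (-0.0269, 5.0037),
    "B4": (0.2500, 8.198),
}


def select_model(indicators, r_min=R_MIN, d_max=D_MAX):
    """Return (stage, model) for the eligible stage with the smallest D,
    or None when no stage satisfies R >= r_min and D <= d_max."""
    eligible = {
        stage: (r, d)
        for stage, (r, d) in indicators.items()
        if r >= r_min and d <= d_max
    }
    if not eligible:
        return None  # no adaptive model available for the current environment
    best_stage = min(eligible, key=lambda stage: eligible[stage][1])
    # Assumed mapping: "A" stages belong to Lingyuan, "B" stages to Yinan.
    model = "Lingyuan" if best_stage.startswith("A") else "Yinan"
    return best_stage, model


print(select_model(TABLE_7))  # → ('A1', 'Lingyuan')
```

In a deployed system this function would be called once per 11-day window, after the similarity indicators for the newly accumulated data have been computed.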

3.4. Verification and Case Analysis of the System Performance of the Dynamic Selection Model

To evaluate the reliability of the automated selection system, independent environmental data from January, collected at the experimental site in Gongjiashaoguo Village, Lingyuan, were used as a validation set. Using the multi-factor multi-period similarity algorithm, similarity indicators were calculated between the experimental site and the sample distribution of each solar radiation clustering stage of Lingyuan (A₁–A₄) and Yinan (B₁–B₄), as detailed in Table 7. The similarity coefficients (R) and similarity distances (D) differed substantially across the regions. Notably, the January environment of the experimental site showed the highest similarity coefficient with the A₁ stage of Lingyuan (R = 0.9233) together with the minimum similarity distance (D = 0.5651). These results fulfilled the system adaptation criteria of R ≥ 0.6 and D ≤ 0.85. Consequently, the internal microclimate of the experimental greenhouse in January was identified as most similar to Lingyuan’s A₁ stage, leading the system to automatically select the Lingyuan model for guiding production management.

A high-yield solar greenhouse was selected as the experimental site, and its measured January data were used as input variables for both the Lingyuan and Yinan models in comparative simulations. This was conducted to validate the accuracy of the greenhouse microclimate similarity determination system and the automated model matching function. As illustrated in Figure 7, within the independent validation set, the goodness-of-fit (R²) rose from 0.6716 under the unmatched state (Figure 7B) to 0.9851 under the automatically selected model (Figure 7A), a relative improvement of approximately 46.7% ((0.9851 − 0.6716)/0.6716). Furthermore, the January microclimate characteristics of the Gongjiashaoguo greenhouse in Lingyuan exhibited high similarity to the Lingyuan A₁ stage, which supports the validity of the system’s strategy of automatically matching the Lingyuan model.

4. Discussion and Conclusions

4.1. Discussion

This study corroborates the perspective that environmental shifts precipitate the failure of model generalization [18]. In contrast to studies that emphasize large-scale static suitability zoning [19], our research further elucidates that even within the same suitability zone, inherent discrepancies in light-thermal ratios coupled with anthropogenic interventions—such as frequent manual ventilation by farmers—still constrain the precision of direct model transfer and adaptation. This demonstrates that relying exclusively on static geographical zoning is inadequate to fulfill the requirements of precision agriculture for dynamic, real-time control.

By meticulously isolating the core driving factors through feature selection, this study establishes a pivotal link in mitigating environmental uncertainty and enhancing decision-making precision. This strategy aligns seamlessly with the prevailing academic consensus that “utilizing feature extraction bolsters model robustness” [21]. The findings reveal that the radiation coefficient for yield in Lingyuan (14.7) is significantly higher than that in Yinan (5.307), indicating that crop production in high-latitude regions exhibits a more profound dependence on solar radiation. This underscores that when constructing a similarity evaluation system, it is imperative to account for the dynamic contribution weights of environmental factors as they evolve with geographical location and growth stages.

The 11-day dynamic selection strategy proposed in this study, which is underpinned by “similarity determination-driven model switching,” drastically reduces the system’s reliance on extensive sample sizes and computational overhead compared to compute-intensive deep transfer learning utilizing Transformer architectures [16]. This lightweight nature makes it remarkably more conducive to cost-effective deployment on field-level agricultural IoT devices.

Despite these promising outcomes, several limitations of this study warrant acknowledgment.

Restricted spatial coverage: The current experimental samples are exclusively confined to the Lingyuan and Yinan regions. Future research must encompass a broader array of climate zones to rigorously validate the generalizability of the system.

Singular environmental dimensionality: The similarity determination logic is predominantly anchored in light-thermal resources, insufficiently accounting for the potential confounding effects of variables such as CO₂ concentration and soil fertility.

Adaptability to extreme weather: The system’s response sensitivity to extreme meteorological events, such as abrupt cold waves and prolonged overcast periods, remains a challenge. This aspect requires further optimization through the integration of multi-modal data [17].

4.2. Conclusions

Through an in-depth analysis of the microclimates within solar greenhouses in Lingyuan, Liaoning, and Yinan, Shandong, this study engineered a dynamic model selection system predicated on microclimate similarity. The principal conclusions are summarized as follows:

Total solar radiation (fz) and sunshine duration (rs) were identified as the dominant yield-determining factors (R² ≥ 0.776), thereby establishing a rigorous scientific foundation for microclimate similarity evaluation.

A dynamic weight matrix was constructed employing a solar radiation clustering algorithm, and a microclimate similarity decision matrix was subsequently established utilizing similarity distance (D) and similarity coefficient (R).

For quantitative performance, by integrating the weighted similarity algorithm (R ≥ 0.6, D ≤ 0.85) within an “11-day rolling update” cycle, the system effectively circumvented the bottlenecks of cross-regional adaptation. The goodness-of-fit (R²) within the independent validation set surged from 0.6716 to 0.9851, demonstrating the robust efficacy of this strategy in facilitating the transferability of high-yield models.

Author Contributions

Conceptualization, H.X. and Z.H.; data curation, Z.H. and M.X.; formal analysis, Z.H. and S.C.; writing—original draft preparation, H.X. and Z.H.; writing—review and editing, J.D.; funding acquisition, Z.L.; supervision, H.X. and T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by grants from the National Key Research and Development Program of China (2023YFD2300700).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Panwar, N.L.; Kaushik, S.C.; Kothari, S. Solar Greenhouse an Option for Renewable and Sustainable farming. Renew. Sustain. Energy Rev. 2011, 15, 3934–3945.
2. He, M.; Wan, X.; Liu, H.; Xia, T.; Gong, Z.; Li, Y.; Liu, X.; Li, T. Theory and Application of Sustainable Energy-Efficient Solar Greenhouse in China. Energy Convers. Manag. 2025, 325, 119394.
3. Wang, G.; Awad, O.I.; Liu, S.; Shuai, S.; Wang, Z. NOx Emissions Prediction Based on Mutual Information and Back Propagation Neural Network Using Correlation Quantitative Analysis. Energy 2020, 198, 117286.
4. Chen, J.; Liu, Z.; Yin, Z.; Liu, X.; Li, X.; Yin, L.; Zheng, W. Predict the Effect of Meteorological Factors on Haze Using BP Neural Network. Urban Clim. 2023, 51, 101630.
5. Li, Y.; Zhang, X.; Xia, C.; Wu, T.; Gao, Y.; Zeng, L.; Wu, Z.; Dai, X.; Yuan, F.; Liu, F.; et al. Molecular Mechanisms and Breeding Strategies for Heat Tolerance in Vegetable Crops under Global Warming. Hortic. Res. 2026, 13, uhaf309.
6. Rahimi Jahangirlou, M.; Pullens, J.W.M.; Lindhardt, M.K.K.; El Khoury, Y.V.; Antoniuk, V.; Manevski, K.; Ottosen, C.-O.; Jørgensen, U. Agrivoltaic Systems: Trade-Offs on Microclimate, Physiology, Yield and Canopy Thermal-Spectral Maps. Agric. Syst. 2026, 232, 104557.
7. Hong, I.; Yu, J.; Hwang, S.J.; Kwack, Y. Estimation of Cucumber Fruit Yield Cultivated Under Different Light Conditions in Greenhouses. Horticulturae 2024, 10, 1117.
8. Mao, X.; Ren, N.; Dai, P.; Jin, J.; Wang, B.; Kang, R.; Li, D. A Variable Weight Combination Prediction Model for Climate in a Greenhouse Based on BiGRU-Attention and LightGBM. Comput. Electron. Agric. 2024, 219, 108818.
9. Dangi, S.; Mullapudi, S.K.; Raghaw, C.S.; Dar, S.S.; Rehman, M.Z.U.; Kumar, N. A Multi-Temporal Multi-Spectral Attention-Augmented Deep Convolution Neural Network with Contrastive Learning for Crop Yield Prediction. Comput. Electron. Agric. 2025, 239, 110895.
10. Ma, D.; Carpenter, N.; Amatya, S.; Maki, H.; Wang, L.; Zhang, L.; Neeno, S.; Tuinstra, M.R.; Jin, J. Removal of Greenhouse Microclimate Heterogeneity with Conveyor System for Indoor Phenotyping. Comput. Electron. Agric. 2019, 166, 104979.
11. Zanchi, M.; Zapperi, S.; La Porta, C.A.M. Optimized Placement of Sensor Networks by Machine Learning for Microclimate Evaluation. Comput. Electron. Agric. 2024, 225, 109305.
12. Yu, Z.; Liu, Y.; Hu, X.; Xue, J.; Zha, L.; Zhang, J.; Bao, H.; Lai, D. CFD-Driven Bayesian Optimization of Localized Ventilation to Achieve Desired Microclimate Conditions in Plant Factories. Comput. Electron. Agric. 2026, 240, 111225.
13. Rebuli, K.B.; Ozella, L.; Vanneschi, L.; Giacobini, M. Multi-Algorithm Clustering Analysis for Characterizing Cow Productivity on Automatic Milking Systems over Lactation Periods. Comput. Electron. Agric. 2023, 211, 108002.
14. Šalagovič, J.; Verboven, P.; Hertog, M.; Van de Poel, B.; Nicolaï, B. Model Prediction of Plant Morphology, Water Flows and Xylem Water Potential in a Growing Tomato Plant under Heterogeneous Growing Conditions. Comput. Electron. Agric. 2025, 235, 110346.
15. Méndez-Vázquez, L.J.; Lira-Noriega, A.; Lasa-Covarrubias, R.; Cerdeira-Estrada, S. Delineation of Site-Specific Management Zones for Pest Control Purposes: Exploring Precision Agriculture and Species Distribution Modeling Approaches. Comput. Electron. Agric. 2019, 167, 105101.
16. Pan, X.; Xu, J.; Li, X.; Zhao, J. Highly Transferable Paddy Field Identification Model Based on SAR Index and Transformer. Comput. Electron. Agric. 2025, 237, 110790.
17. Senanu Ametefe, D.; Seroja Sarnin, S.; Mohd Ali, D.; Caliskan, A.; Tatar Caliskan, I.; Adozuka Aliu, A.; John, D. Enhancing Leaf Disease Detection Accuracy through Synergistic Integration of Deep Transfer Learning and Multimodal Techniques. Inf. Process. Agric. 2025, 12, 279–299.
18. Zhao, Y.; Yang, Z.; Ma, B.; Song, H.; Yang, D. Deep learning prediction and model generalization of ground pressure for deep longwall face with large mining height. J. China Coal Soc. 2023, 45, 54–65.
19. Wu, R.; Wu, R.; Jin, L.; Wang, H.; Liu, S.; Jiang, S.; Liu, X.; Zheng, S. Climate suitability division of solar greenhouse in Inner Mongolia Autonomous Region, China. Chin. J. Appl. Ecol. 2023, 34, 1305–1312.
20. Rina, S.; Nile, W.; Mula, N.; Wei, S.; Guo, Y.; Zhang, J.; Tong, Z.; Liu, X.; Zhao, C.; Ersi, C. Intelligent and Label-Free Crop Identifying Using Multi-Source Time Series and Model Transfer. Inf. Process. Agric. 2025, in press.
21. Yoon, S.; Lee, W.-H. Methodological Analysis of Bioclimatic Variable Selection in Species Distribution Modeling with Application to Agricultural Pests (Metcalfa pruinosa and Spodoptera litura). Comput. Electron. Agric. 2021, 190, 106430.
22. Xiao, C.; Zhang, F.; Mäkelä, P.S.A.; He, J.; Fan, J.; Jia, Y.; Chen, J.; Li, Y.; Liu, H. An Unmanned Aerial Vehicle-Based Cotton Nitrogen Nutrition Index Estimation Model Utilizing Feature Selection and Machine Learning. Comput. Electron. Agric. 2025, 238, 110798.
23. Hu, X.; Wang, S.; Deng, J.; Li, X.; Yu, L. Fine Analysis of Climatic Similarity for Fire Cured Tobacco Cropping between Yunnan and Zimbabwe. Chin. J. Agrometeorol. 2011, 32, 262–266.
24. Ju, Y.; Chen, Z.; Ma, D.; Yuan, F.; Liu, J.; Zeng, Q.; Fang, Y. Climate Similarity Analysis on High-quality Flue-cured Tobacco Planting Areas in North Central Subtropical Zone of China. J. Appl. Meteorol. Sci. 2022, 33, 736–747.
25. Liang, B.; Li, X.; Zhang, Z.; Wu, C.; Liu, X.; Zheng, Y. Multidrug Resistance Analysis Method for Pathogens of Cow Mastitis Based on Weighted-Association Rule Mining and Similarity Comparison. Comput. Electron. Agric. 2021, 190, 106411.
26. Haagsma, M.; Page, G.F.M.; Johnson, J.S.; Still, C.; Waring, K.M.; Sniezko, R.A.; Selker, J.S. Model Selection and Timing of Acquisition Date Impacts Classification Accuracy: A Case Study Using Hyperspectral Imaging to Detect White Pine Blister Rust over Time. Comput. Electron. Agric. 2021, 191, 106555.
27. Lu, Y.; Yue, T.; Chen, C.; Fan, Z.; Wang, Q. Solar radiation modeling based on stepwise regression analysis in China. J. Remote Sens. 2021, 14, 852–864.
28. Wang, X.; Zhang, M.-W.; Guo, Q.; Yang, H.-L.; Wang, H.-L.; Sun, X.-L. Estimation of Soil Organic Matter by in Situ Vis-NIR Spectroscopy Using an Automatically Optimized Hybrid Model of Convolutional Neural Network and Long Short-Term Memory Network. Comput. Electron. Agric. 2023, 214, 108350.
29. Rezaei, M.; Diepeveen, D.; Laga, H.; Sohel, F. Automatic Pixel-Level Annotation for Plant Disease Severity Estimation. Comput. Electron. Agric. 2026, 241, 111316.
30. Taki, M.; Ajabshirchi, Y.; Ranjbar, S.F.; Rohani, A.; Matloobi, M. Heat Transfer and MLP Neural Network Models to Predict inside Environment Variables and Energy Lost in a Semi-Solar Greenhouse. Energy Build. 2016, 110, 314–329.
31. Sheela, K.G.; Deepa, S.N. Review on Methods to Fix Number of Hidden Neurons in Neural Networks. Math. Probl. Eng. 2013, 2013, 425740.
32. Jung, D.-H.; Kim, H.S.; Jhin, C.; Kim, H.-J.; Park, S.H. Time-Serial Analysis of Deep Neural Network Models for Prediction of Climatic Conditions inside a Greenhouse. Comput. Electron. Agric. 2020, 173, 105402.
33. Kolasa-Wiecek, A. Stepwise Multiple Regression Method of Greenhouse Gas Emission Modeling in the Energy Sector in Poland. J. Environ. Sci. 2015, 30, 47–54.
34. Fu, Y.; Nasridinov, A.; Piao, M.; Ryu, K.H. Multiple Regression Analysis of Climatic Factors in Greenhouse Using Data Partitioning. In Advanced Multimedia and Ubiquitous Engineering; Park, J.J., Jin, H., Jeong, Y.-S., Khan, M.K., Eds.; Lecture Notes in Electrical Engineering; Springer: Singapore, 2016; Volume 393, pp. 669–677. ISBN 978-981-10-1535-9.
35. Seri, E.; Biso, F.; Bovesecchi, G.; Katsoulas, N.; Cornaro, C. Multi-State Modeling of Greenhouse Cucumber Yield Dynamics Under Microclimate Effects. arXiv 2025, arXiv:2510.11485.
36. Yu, Q.; Li, J.; Chen, Z.; Pecht, M. Multi-Fault Diagnosis of Lithium-Ion Battery Systems Based on Correlation Coefficient and Similarity Approaches. Front. Energy Res. 2022, 10, 891637.

Figure 1. Technical roadmap.

Figure 2. Schematic diagram of the location of environmental data collection sensors.

Figure 3. Schematic diagram of BP neural network structure. The input parameters include total solar radiation (R), soil temperature (Tsoil), soil humidity (Hsoil), carbon dioxide concentration (CO₂), and air humidity (Hair). The orange, blue, and red circles represent the nodes in the input, hidden, and output layers, respectively.

Figure 4. Temperature control target of Lingyuan model under the environmental parameters of greenhouses in two places. (A) Simulation results of the Yinan model; (B) Simulation results of the Lingyuan model.

Figure 5. Clustering results of similar days based on solar radiation during the cucumber growth period in Yinan and Lingyuan.

Figure 6. Greenhouse automatic selection model system.

Figure 7. Test fit plot of the simulated and measured values of the two models. (A) Fitting results of the Lingyuan model; (B) Fitting results of the Yinan model.

Table 1. Experimental greenhouse structure.

Greenhouse No.	Specifications				Building Materials
Greenhouse No.	Span (m)	Length (m)	Ridge Height (m)	Back Wall Thickness (m)	Back Slope Material	Covering Material	Back Wall Material	Side Wall Material
1	7.2	80	4	1.2	Red bricks and cement	Polyethylene (PE)	Brick wall plastering with mud	Brick wall plastering with mud
2	7.2	80	4	1.2	Red bricks and cement	Polyethylene (PE)	Brick wall plastering with mud	Brick wall plastering with mud
3	7.2	80	4	1.2	Red bricks and cement	Polyethylene (PE)	Brick wall plastering with mud	Brick wall plastering with mud
4	7.5	75	4.5	1	Cement board	Polyethylene (PE)	Stone wall with mud plastering	Stone wall with mud plastering
5	7.5	75	4.5	1	Cement board	Polyethylene (PE)	Stone wall with mud plastering	Stone wall with mud plastering
6	7.5	75	4.5	1	Cement board	Polyethylene (PE)	Stone wall with mud plastering	Stone wall with mud plastering
7	6.5	60	3.8	3	Cement board	Polyethylene (PE)	Straw-thatched wall	Straw-thatched wall

Table 2. Experimental greenhouse cucumber planting parameters.

Greenhouse No.	Planting Time	Planting Pattern	Ridge Length (m)	Plant Spacing (cm)	Ridge Spacing (m)	Number of Plants per Ridge	Irrigation Mode
1	5 November 2019	Daliang Lane	6.5	18	0.28	36	Sub-surface drip irrigation
2	5 November 2019	Daliang Lane	6.5	18	0.28	36	Sub-surface drip irrigation
3	5 November 2019	Daliang Lane	6.5	18	0.28	36	Sub-surface drip irrigation
4	20 December 2019	Daliang Double Rowing	7.0	20	0.35	35	Sub-surface drip irrigation
5	20 December 2019	Daliang Double Rowing	7.0	20	0.35	35	Sub-surface drip irrigation
6	20 December 2019	Daliang Double Rowing	7.0	20	0.35	35	Sub-surface drip irrigation
7	10 November 2019	Daliang Lane	6.0	21	0.25	29	Sub-surface drip irrigation

Table 3. Comparison of the effect of different classes of clustering based on solar radiation.

Greenhouse Location	Number of Clusters	Minimum Distance Between Cluster Centers	Maximum Distance Between Data Points and Cluster Centers
Yinan	2	5.88	3.88
	4	2.11	3.27
	5	3.79	3.13
Lingyuan	2	5.81	4.79
	3	3.47	4.66
	4	4.19	3.47
	5	2.88	3.21

Table 4. Coefficients of determination (R²), coefficients αij, and weights relating meteorological yield to meteorological elements in Lingyuan (upper block) and Yinan (lower block).

Stage		A₁	A₂	A₃	A₄
Sunshine hours	R²	0.0993	0.1054	0.0495	0.0122
	αij	−0.1401	−0.4274	0.5254	−0.2639
	Weight (i,k)	0.1805	0.9581	0.8737	0.4840
Total solar radiation	R²	0.1865	0.0065	0.0076	0.0129
	αij	0.3385	0.3038	−0.4941	0.2652
	Weight (i,k)	0.8195	0.0419	0.1263	0.5160
	Summing up	1	1	1	1
Sunshine hours	R²	0.5125	0.0927	0.0139	0.0022
	αij	0.5651	−0.1997	0.0603	0.0548
	Weight (i,k)	0.9834	0.4351	0.8080	0.0011
Total solar radiation	R²	0.2840	0.1047	0.0107	0.2313
	αij	0.0172	−0.2294	0.0187	−0.4816
	Weight (i,k)	0.0166	0.5649	0.1920	0.9989
	Summing up	1	1	1	1

Table 5. Distribution statistics of the similarity coefficient and similarity distance between Lingyuan, Liaoning and Yinan, Shandong.

Statistic	Mean	Median	Minimum	Maximum	Lower Quartile	Upper Quartile
Similarity coefficient	0.0176	−0.0699	−1	1	−0.914	0.939
Similarity distance	3.301	0.329	0.00628	26.51	0.0514	1.151

Table 6. A judgment index based on similarity coefficient and similarity distance.

Index	Optimum	Suitable	Unsuitable
Similarity coefficient	≥0.85	0.75–0.85	<0.6
Similarity distance	≤0.56	0.56–0.852	>0.85

Table 7. Similarity coefficients and similarity distances between the experimental site and each clustering stage.

Stage	Similarity Coefficient (R)	Similarity Distance (D)
A₁	0.9233	0.5651
A₂	0.8345	0.6569
A₃	−0.0529	7.9396
A₄	−0.0963	11.904
B₁	−0.6120	0.9007
B₂	0.0580	1.2409
B₃	−0.0269	5.0037
B₄	0.2500	8.198

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

