1. Introduction
Recent economic forecasts indicate that the world population will reach about 8.5 billion people by 2030. Population growth imposes strict requirements on food availability: there must be enough food in terms of quantity, and food supplies are expected to double by that time. Apart from the significant growth of the Earth's population, there are two further problems: (i) urbanization and (ii) delivery of inexpensive fresh vegetables and fruits to remote areas. Therefore, food production in greenhouses and urban farming is becoming a promising tool to tackle these problems.
Precision agriculture and plant growth in controlled artificial conditions have become the state-of-the-art research and engineering areas addressing the increasing demand for food in the nearest future [
1]. The development of autonomous greenhouse systems is attracting growing interest globally due to the shutdowns caused by the COVID-19 pandemic. For the abovementioned reasons, Wireless Sensor Networks (WSN), being an enabling technology for the Internet of Things (IoT) [
2], are able to contribute to this process and guarantee better efficiency, automation and sustainable crop yield.
Successful deployments of WSN in the agriculture related areas [
3,
4] and application of machine learning algorithms [
5,
6] to embedded and mobile devices [
7] promise a bright future for intelligent deployments.
In fact, successfully growing vegetables and operating an industrial greenhouse requires an experienced and highly skilled grower, who is capable of establishing the typical greenhouse routine: setting optimal artificial light, watering, feeding, etc. The experienced grower assesses all these factors when making decisions to guarantee optimal production. However, the grower makes decisions based on prior experience rather than on science. Adding intelligent sensing to support these decisions can positively impact agriculture: it helps to identify the ideal time to harvest, improve crop yields, reduce operational costs and ensure better resource management [
8].
The first intelligent sensors joined into WSNs performed environmental monitoring tasks, including active volcano monitoring [
9], zebra migration monitoring [
10] and buildings monitoring [
11]. However, these WSNs were not always truly intelligent and did not always possess actuation capabilities: for example, a WSN deployed for wildfire detection was rapidly destroyed by the passing fire, which was extinguished only at the final stage [
12]. Hence, there was a clear need to equip the WSNs with actuation capabilities and integrate them in existing monitoring infrastructure.
In terms of agriculture, a sensor-actuator WSN was deployed in a greenhouse for automatic climate control [
3].
Significant progress in Artificial Intelligence (AI) and machine learning, as well as the opportunity to fit these methods into embedded systems [
7], has helped evolve WSNs for precision agriculture. Recent WSN deployment for tomato monitoring engages reinforcement learning for optimal artificial light control in a greenhouse [
4]. Still, there is a lack of agricultural WSN deployments and collected datasets which serve as a driving force for further development of AI-enabled precision agriculture.
In this work, we propose a data-driven enhancement for plant growth dynamics modelling and prediction in an artificial growth system. We report on a WSN deployment consisting of off-the-shelf components (low-power sensors, video cameras and a server) in a 720 m² area of an industrial greenhouse for cucumber growing. The goal of this deployment is the assessment and prediction of the cucumbers’ growth dynamics using non-invasive real-time intelligent systems. This is essential for providing the information needed to select the optimal growth regimes in an artificial environment. To put this goal into practice, we collected a dataset of top-down images of cucumber leaves, while using the sensor data for environmental data collection and detection of anomalies, e.g., sudden and unpredictable significant changes in temperature and humidity. We used Fully Convolutional Neural Networks (FCNN) for the segmentation task, which helps to calculate the leaf area, and manual measurements of biomass to find the correlation between the leaf area and biomass.
The novelty of the paper is threefold. First, to the best of our knowledge, we are the first to demonstrate the correlation between the cucumber leaf area and its biomass using computer vision in industrial-scale experiments and to use this dependency for the prediction of the biomass based on 2D imagery data. This approach can significantly contribute to in situ growing optimization, as conditions like shading or sudden temperature change [
13] may have severe consequences for growth and yield. Second, as a result of this work and its validation on a real deployment, a state-of-the-art dataset will be available to the research community. Third, we apply FCNNs, which have been successfully used in a number of industrial applications, to the segmentation task. This procedure is integrated into the greenhouse infrastructure and can be scaled to other greenhouses for plant growing.
The paper is organized as follows: we introduce the reader to the relevant works in the area in 
Section 2. In 
Section 3 we discuss methodology used in this research. Afterwards, we detail deployment in terms of hardware, software, plants and greenhouse facility in 
Section 4. Next, data analysis is demonstrated in 
Section 5 where the methods and results are provided. In 
Section 6 we discuss limitations of our work and specific findings. Finally, we provide concluding remarks and highlight our future work in 
Section 7.
  4. Deployment
The deployment shown in 
Figure 1 was located in a dedicated area (720 m²) used for the initial plant growing stage in an industrial greenhouse facility. In this section, we detail the deployment, with special emphasis on the monitored plants, hardware and software.
  4.1. Plants
The cultivation of cucumber seedlings was carried out according to the low-volume hydroponic technology based on growing plants in rock wool substrate. Each seed was sown in a rock wool block of 10 × 10 × 6.5 cm (
Grodan delta). All rock wool blocks were preliminarily soaked in a specially prepared nutrient solution (see 
Table 1). Vermiculite was sprinkled on top of the seeds to avoid additional evaporation. In total, 496 plants were sown and evenly distributed on the floating table in rock wool blocks for the experiment.
The rock wool blocks were placed on one table, 8 m long, 1.82 m wide according to the experimental scheme (see 
Figure 2). The rock wool blocks were saturated by completely submerging them in the nutrient solution. The fertiliser solution used for the initial saturation of the blocks had an Electrical Conductivity (EC) of 1.50 mS/cm and a pH in the range of 5.3–5.5. Further watering of the plants was carried out by the partial flooding method. The necessary watering time was determined from the weight of the rock wool block. The weight of a fully saturated 10 × 10 × 6.5 cm rock wool cube is 650–660 g. When the weight dropped to 350–370 g, watering was carried out. As seedlings grow, they need more mineral nutrients. Therefore, the EC of the feeding solution was slightly increased during the experiment. However, once the seedlings have formed four true leaves and a good root system, the cultivation technology requires them to be transplanted onto rock wool slabs, which have a capacity of 16 L of nutrient solution, i.e., 4 L per plant (4 plants per slab). In our experiment, the seedlings continued to grow in a 0.65 L cube. That is why further watering based on the common technology was impossible: the last few watering procedures were carried out with distilled water (see 
Figure 3). The biomass measurements during the experiment are shown in 
Figure 4. Each point contains from 11 to 82 measurements of the biomass. In total, 480 measurements were performed.
Conductivity was measured in mS/cm with a METTLER TOLEDO conductivity meter, and pH was measured with a Sartorius PB-11 instrument. The changes of EC and pH in the rock wool blocks during the experiment are shown in 
Table 2. Solution samples were taken from the rock wool blocks with a syringe. Two samples were taken from the middle of each of the 3 zones. According to 
Table 2, the following samples were taken: samples 1 and 4 from 
Zone I, samples 2 and 5 from 
Zone II, and samples 3 and 6 from 
Zone III. These measurements confirm that all the plants were grown under equal conditions, which reduces the deviations and makes the obtained dataset relevant and statistically sound.
  4.2. Hardware
The WSN deployment consists of (i) a data collection server, (ii) WaspMote sensor nodes and (iii) Xiaomi XiaoFang 1080p cameras, organized as a WSN communicating at 2.4 GHz. The sensor nodes are based on a low-power ATmega Microcontroller Unit (MCU) and a wireless transmitter with the transmission power set to −0.77 dBm. This value was chosen based on an empirical analysis of the Received Signal Strength Indicator (RSSI), since the Link Quality Indicator (LQI) and Packet Delivery Rate (PDR) metrics are inaccessible in the industrial versions of WaspMotes. The power source is a battery pack containing three parallel 3.7 V Li-ion polymer cells with a total capacity of 6.6 Ah, which ensures long-term operation without recharging. The sensor nodes include several types of sensors: temperature, PAR, humidity and 
. Several sensor nodes are placed on each tray evenly as shown in 
Figure 2. The cameras are accessed from the server using Real Time Streaming Protocol (RTSP).
  4.3. Software and Data Storage
A schematic view of our data collection system is shown in 
Figure 5. It consists of three main components: (i) a Flask-based HTTP server implemented in the Python programming language; (ii) the distributed task queue Celery with the in-memory Redis database as a message broker with persistence enabled; and (iii) the general-purpose schema-less Database Management System (DBMS) MongoDB, which allows for the storage of unstructured data along with arbitrary binary objects using GridFS. The HTTP API server collects the sensor data using a push strategy, with the sensor nodes sending measurements every 30 min. The MongoDB database is used to store the sensor measurements and camera images. It is hosted using a DBaaS service, which was sometimes inaccessible due to the intermittent internet connectivity at the deployment site. Therefore, it is crucial to persistently store the received data locally using the Celery queue with Redis as a broker and task storage; when the internet connection is restored, the data is sent to MongoDB. Additionally, Celery is used to periodically (every 30 min) retrieve the image data from the cameras using a poll strategy. All of the software components were built using the Docker containerization system and are therefore easy to deploy.
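The store-and-forward behaviour described above can be illustrated with a minimal sketch. It is not the deployed implementation: it swaps the Celery queue and Redis broker for a local SQLite table from the Python standard library, and the class name `StoreAndForwardBuffer` and the `send` callable (standing in for an upload to the remote database) are hypothetical names introduced only for illustration.

```python
import json
import sqlite3


class StoreAndForwardBuffer:
    """Keep measurements in a local SQLite table until the remote DB accepts them.

    Illustrative stand-in for the Celery/Redis queue; not the deployed code.
    """

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pending "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.conn.commit()

    def push(self, measurement):
        """Persist one measurement (a JSON-serialisable dict) locally."""
        self.conn.execute(
            "INSERT INTO pending (payload) VALUES (?)", (json.dumps(measurement),)
        )
        self.conn.commit()

    def pending_count(self):
        """Number of measurements still waiting to be forwarded."""
        return self.conn.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def flush(self, send):
        """Forward rows in arrival order and stop at the first failure.

        `send` is a hypothetical callable that uploads one dict to the remote
        database and raises ConnectionError when the link is down.
        Returns the number of rows that were forwarded.
        """
        rows = self.conn.execute(
            "SELECT id, payload FROM pending ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            try:
                send(json.loads(payload))
            except ConnectionError:
                break  # keep this row and all later ones for the next attempt
            self.conn.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            self.conn.commit()
            sent += 1
        return sent
```

A row is deleted only after `send` returns without error, so a measurement received during an outage stays in the local table until a later `flush` succeeds.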
We used the data from sensors in order to monitor the environmental conditions and prevent undesirable and abnormal plant growth. As mentioned earlier, for the development of growth model, validation of the methodology for biomass assessment and prediction, only measurements of leaf area obtained from images were used. This was enough for accurate predictions of biomass in normal environmental growth conditions. Also, there is an opportunity to include actuators and the data from sensors into the plant growth model for fine tuning of the modeling process.
The examples of obtained measurements of temperature and humidity corresponding to the experimental results described next are shown in 
Figure 6 and 
Figure 7. They confirm that these environmental parameters stayed within permissible values during the experiment. According to the research reported in [
13] and [
34], the optimal temperature range is 18.3–32.2 °C. The best humidity range to ensure the maximum growth rate at the initial stage is 50–70%. However, the permissible humidity range that has little effect on the total yield is 35–90% [
35].
We performed the Dickey–Fuller test to verify the stationarity of the growth conditions. The p-values were less than 0.05 for both important environmental parameters, temperature and humidity. This means that the null hypothesis of a unit root is rejected and the time series can be considered stationary.
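As a rough illustration of what this test does, the sketch below computes the basic (non-augmented) Dickey–Fuller statistic with a constant term in plain Python. Instead of computing a p-value, it compares the statistic with the asymptotic 5% critical value of about −2.86. The simulated series are synthetic stand-ins for the sensor data, and the helper names are hypothetical. In practice a library routine such as `adfuller` from statsmodels would normally be used.

```python
import math
import random


def dickey_fuller_statistic(series):
    """t-statistic of gamma in: diff[t] = alpha + gamma * series[t-1] + noise."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    gamma = sxd / sxx
    alpha = mean_d - gamma * mean_x
    sse = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return gamma / std_err


# Asymptotic 5% critical value for the regression with a constant and no trend.
CRITICAL_5_PERCENT = -2.86


def simulate(phi, n, seed):
    """Synthetic AR(1) series y[t] = phi * y[t-1] + noise (not real sensor data)."""
    rng = random.Random(seed)
    values = [0.0]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, 1.0))
    return values


stationary = simulate(phi=0.5, n=500, seed=0)   # mean-reverting series
random_walk = simulate(phi=1.0, n=500, seed=0)  # series with a unit root

# A statistic below the critical value rejects the unit-root hypothesis.
is_stationary = dickey_fuller_statistic(stationary) < CRITICAL_5_PERCENT
```

A strongly negative statistic, as for the mean-reverting series, corresponds to the small p-values reported for temperature and humidity.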
For maintaining sustainable growth of the plants, it is important to monitor not only the absolute values of the environmental conditions but also their rate of change (first derivatives) [
36,
37]. Rapid changes of the environmental parameters, even within the optimal boundaries, may affect the growth dynamics and lead to abnormal plant development. Tracking these changes is only possible with distributed sensors that provide measurements with high time resolution. We performed this analysis using the obtained measurements. The results are shown in 
Figure 8, 
Figure 9 and 
Figure 10. These results demonstrate that the environmental parameters changed smoothly, which resulted in normal plant growth dynamics.
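A first-derivative check of this kind can be sketched with finite differences over the 30 min sampling interval. The readings and the 3 °C per hour threshold below are made-up values for illustration only, and the function names are hypothetical.

```python
def rates_of_change(values, interval_hours=0.5):
    """Forward differences divided by the sampling interval (units per hour)."""
    return [(b - a) / interval_hours for a, b in zip(values[:-1], values[1:])]


def rapid_changes(values, max_rate, interval_hours=0.5):
    """Indices i where the change between samples i and i+1 exceeds max_rate."""
    rates = rates_of_change(values, interval_hours)
    return [i for i, rate in enumerate(rates) if abs(rate) > max_rate]


# Made-up temperature readings in degrees Celsius, taken 30 min apart.
temperature_c = [22.0, 22.5, 23.0, 26.0, 26.2]

# Hypothetical threshold: flag changes faster than 3 degrees Celsius per hour.
flagged = rapid_changes(temperature_c, max_rate=3.0)
```

Here only the jump from 23.0 to 26.0 °C is flagged, even though every reading lies inside the optimal temperature range.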
  4.4. Image Data Collection and Annotation
According to 
Figure 2, four digital cameras with a resolution of 1920 × 1080 were mounted 2 m above the floating table. These cameras took two images sequentially every 30 min for 31 days. A total of 2494 raw images were taken by each camera, giving 9976 top-down images in total. After data cleaning and selection of only the images from the time interval that represents the active growth stage, 4 sequences of 975 images, representing 25 days of observation for each camera, were kept. This data was used for further investigation and assessment of growth dynamics. All images were flattened using calibration images to avoid distortions. In total, 248 images were annotated. The 62 images out of 975 from each of the 4 cameras were selected for annotation in the following way: from each day of observation, 3 images taken at 9:00, 15:00 and 21:00 were kept. The annotation procedure consisted of creating segmentation masks and bounding boxes for each plant in the image. Overall, 45,389 instances in 248 images were obtained after the annotation procedure.
  4.5. Inference on a Low-Power Embedded System with the AI Capabilities
In this section, we describe an experiment that realizes the second option in terms of data analysis, i.e., performing the analysis on board the sensing device instead of sending the data to a server.
Computer vision algorithms, e.g., the FCNNs described earlier, are powerful methods for semantic segmentation of images in real time. However, these neural networks are well known to be resource-hungry: they require extensive computational resources and are not well suited to low-power embedded devices. Therefore, running inference with these algorithms on single-board computers and mobile devices is a challenging task. If successful, a distributed network based on low-power systems with AI capabilities could tremendously improve the capabilities of greenhouses. It could predict the growth dynamics of each plant individually and make decisions on the specific chemical input to the soil, thereby maximizing the output of cultivation. Such an autonomous system could be powered by external batteries and perform data collection even in case of a blackout.
However, the key advantage of this system is the ability to perform data-intensive computing on board and send only the post-processed data to the server. This significantly reduces the required throughput of the data transmission system: the device can send a semantic mask, which occupies several KB, instead of the raw image, which can occupy several MB. For large-scale data collection and processing, this would significantly relieve the data transmission system and the WSN infrastructure in general.
In the current research, we report on the development of such a system for semantic segmentation of cucumbers. The Nvidia Jetson Nano single-board computer is a critical component of the proposed system. It has an on-board mobile GPU with 128 cores and can easily handle even 4K video streams. However, what matters in the case of cucumbers’ growth dynamics in the greenhouse is autonomy. Therefore, the most important parameter for the proposed research is the power consumption per processed frame, along with the ability to run the data-intensive FCNNs.
We tested the FCNN architecture on the embedded device and measured the processing time per image and the power consumption. The mean processing time is 3.5 s per image. The device is supplied with a constant voltage of 5 V, while the current varies: the power consumption is 6.5 W (1.3 A) during computation and 5 W (1.0 A) in idle mode. A power bank could easily power this system. To calculate the operation time of the proposed system, we rely on the formula $T = E \cdot U / P$, where E is the capacity of the battery (in Ah), T is the time of operation, P is the power consumption and U is the input voltage. For example, a 10,000 mAh power bank can supply the system for up to 7.69 h of continuous operation, i.e., performing neural network inference all the time. However, in our scenario, the system should capture new data and perform predictions every 30 min, staying in idle mode for the rest of the time. Therefore, it could operate for up to 10 h in such a scenario. This makes an autonomous system an excellent option for operating distributed sensor systems in areas with restricted power or communication capabilities. It also becomes a way to preserve the data even in case of blackouts or other infrastructure incidents.
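The two operation times quoted above can be reproduced with a short calculation. This is a sketch under the same simplification as the text, namely that the power bank delivers its full nominal capacity at 5 V. The function names are hypothetical, and the duty cycle assumes one 3.5 s inference per 30 min period.

```python
def runtime_hours(capacity_ah, voltage_v, power_w):
    """Operation time T = E * U / P, with E in Ah, U in V and P in W."""
    return capacity_ah * voltage_v / power_w


def duty_cycled_power(active_w, idle_w, active_s, period_s):
    """Time-weighted average power over one capture-and-predict cycle."""
    return (active_w * active_s + idle_w * (period_s - active_s)) / period_s


# Continuous inference: 10 Ah at 5 V drawing 6.5 W.
continuous = runtime_hours(10.0, 5.0, 6.5)

# One 3.5 s inference every 30 min, idle at 5 W for the rest of the period.
average_w = duty_cycled_power(6.5, 5.0, active_s=3.5, period_s=30 * 60)
duty_cycled = runtime_hours(10.0, 5.0, average_w)
```

Because inference occupies only 3.5 s of every 1800 s, the average power stays very close to the idle value, which is why the duty-cycled operation time is close to 10 h.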
  5. Data Analysis
A set of FCNNs, e.g., U-Net [
33], FCN8s, FCN16s [
38], was trained within the PyTorch framework and validated using the labeled dataset. U-Net consists of a contracting path that captures context and an expanding path that enables precise localization. It is effective for segmentation on small datasets with extensive data augmentation, and it is also effective for border segmentation, which is important for plant segmentation. The contracting path consists of 3 × 3 unpadded convolutions, each followed by ReLU, and 2 × 2 max-pooling with stride 2. The expansive path contains upsampling of the feature map, followed by 3 × 3 convolutions. After upsampling, the resulting feature map is concatenated with the corresponding feature map from the contracting path. It is then followed by two 3 × 3 convolutions, each followed by ReLU. Finally, a 1 × 1 convolution is used to map each 64-component feature vector to the output classes. After that, a pixelwise softmax over the whole feature map is calculated.
FCNN’s convolutional layers that include pooling and ReLU activations are followed by deconvolutional layers (or backwards convolutions) to upsample the intermediate tensors so that they match the width and height of the original input image.
Out of 62 images from each camera, 50 were used for training and 12 for validation. The remaining 913 images from each camera were kept as test data. To assess the quality of the trained model the average IoU between the predicted masks and the ground truth masks for validation data was evaluated using the following formula:
$$\mathrm{IoU}_{avg} = \frac{1}{N} \sum_{i=1}^{N} \frac{|G_i \cap P_i|}{|G_i \cup P_i|}$$
where $G_i$ and $P_i$ are all the possible pairs of ground truth and predicted masks, while $N$ is the number of predicted masks. The average IoU on the validation set achieved by the FCN8 semantic segmentation network was 81%. The training parameters were selected as follows: batch size = 2, learning rate = 0.008, class weight = 0.5. Images were also resized to 1280 × 720. A Principal Component Analysis (PCA) aided FCNN was also trained and evaluated for reference alongside the proposed modifications of FCN8 and FCN16 (modified FCN8). This neural network also has a contracting convolutional part and an expanding upsampling part. However, it relies on transferring the learned weights of existing classification networks and fine-tuning them as segmentation networks. In our case, the pre-trained VGG-16 layers were used for training [
39]. It achieves 82% IoU in the proposed task after 100 epochs of training. The abovementioned FCNNs are not the most advanced in the area; however, more advanced networks, e.g., Deeplab [
40], could be overkill for the proposed task of leaf area segmentation, mainly because the IoU of the trained networks is sufficient and we have no reason to speed them up. Images are captured every 30 min, so each prediction can be made at a low frame rate.
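The average IoU metric used for validation can be sketched in a few lines, assuming each mask is a binary grid and each predicted mask is paired with its ground truth mask. The nested-list representation and the function names are illustrative assumptions; a real pipeline would typically operate on tensors.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as a perfect match by convention.
    return intersection / union if union else 1.0


def average_iou(ground_truth_masks, predicted_masks):
    """Mean IoU over paired ground truth and predicted masks."""
    scores = [iou(g, p) for g, p in zip(ground_truth_masks, predicted_masks)]
    return sum(scores) / len(scores)


# Toy 2 x 2 masks: the first prediction misses one pixel, the second is exact.
ground_truth = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
predicted = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
score = average_iou(ground_truth, predicted)
```

In this toy case the per-image scores are 0.5 and 1.0, so the average is 0.75.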
Figure 11a,b presents the training and validation losses and the IoU, respectively, for FCN8 and for modified FCN8 over 100 epochs. Early stopping criteria were used to retrieve the best model during training. Examples of predicted masks on the validation dataset are shown in 
Figure 12a,b, which shows images for different stages of growth. Examples of predicted masks on the test dataset are shown in 
Figure 13a,b. As can be seen from these figures, the predicted masks are accurate and correspond well to the actual plants.
Using the sequence of 975 selected images from each camera, the per-plant leaf area was calculated. The number of plants appearing in the images varies, because there are many plants in the images, the table can move in the horizontal direction, and plants were taken for direct biomass measurements. This means that the total segmented area should be divided by the actual number of plants to obtain the average area of each plant. An example of the calculated average per-plant leaf area for the image sequence obtained from one of the cameras is shown in 
Figure 14. It should be noted that a total power interruption lasting several days occurred on the 18th day of active plant growth. In 
Figure 14 the first 760 data points are shown, representing the continuous growth. The accuracy of the proposed FCNN used for segmentation is additionally supported by the fact that it captured the diurnal fluctuations (see 
Figure 14) of the projected leaf area, which are caused by biological factors, specifically the relative motion of leaves. After the power interruption, the system switched on automatically and continued collecting images and sensor data (the remaining 215 images). This experience showed the high relevance of implementing autonomous embedded systems for greenhouses, which allow this problem to be overcome. Nevertheless, the collected images and biomass measurements were sufficient to find the dependency between leaf area and biomass. 
Figure 15 shows the approximated dependency between leaf area and biomass using Equation (
3). To construct this dependency, we used data points representing direct measurements of biomass and the corresponding FCNN-calculated leaf area during the first 18 days (the first 10 points of the biomass measurements from 
Figure 4).
      
The derived dependency for cucumbers is as follows (Equation (
4)):
Using the obtained dependency, it is possible to assess and predict the biomass from the predicted leaf area. The Verhulst model, commonly used to describe the growth of biological systems, was applied to make predictions of the leaf area (see Equation (
5)):
$$\frac{dS}{dt} = \mu S \left(1 - \frac{S}{S_{max}}\right) \qquad (5)$$
where $\mu$ is the growth rate (1/30 min), $S$ is the measured (calculated) leaf area and $S_{max}$ is the maximum leaf area in cm². Integration of Equation (
5) gives the following Equation (
6):
$$S(t) = \frac{S_{max}}{1 + \frac{S_{max} - S_0}{S_0} e^{-\mu t}} \qquad (6)$$
where $S_0$ is the initial leaf area. The Verhulst model is widely applied for assessment of the dynamics of life systems. For example, the Verhulst model was applied for spatio-temporal population control to the management of aquatic plants [
41,
42]. It should be noted that the Verhulst model was used to assess the dynamics of the projected leaf area, not of the leaf areas themselves. Broadwise plant growth can affect the values of the calculated projected leaf area. The projected leaf area has limitations for the investigated plant type because the plant is not able to grow infinitely broadwise. This effect was also observed experimentally. Also, the estimation and modeling of the biomass were carried out at the initial stage of growth. Thus, estimating the biomass accumulated at the initial stages from the projected area of the first 3–4 leaves has its limitations.
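The relation between the differential and the integrated form of the Verhulst model can be checked numerically. The sketch below assumes the standard logistic form dS/dt = mu * S * (1 - S / s_max), with `mu`, `s_max` and `s0` used as labels for the growth rate, maximum leaf area and initial leaf area. The parameter values are arbitrary illustrations, not the fitted values from this study.

```python
import math


def verhulst_area(t, mu, s_max, s0):
    """Closed-form logistic curve: leaf area after t time steps."""
    return s_max / (1.0 + (s_max - s0) / s0 * math.exp(-mu * t))


def euler_area(t_end, mu, s_max, s0, dt=0.01):
    """Integrate dS/dt = mu * S * (1 - S / s_max) with small explicit Euler steps."""
    s = s0
    for _ in range(int(round(t_end / dt))):
        s += dt * mu * s * (1.0 - s / s_max)
    return s


# Arbitrary illustrative parameters; time is counted in 30 min steps.
MU, S_MAX, S0 = 0.01, 100.0, 5.0

closed_form = verhulst_area(300, MU, S_MAX, S0)
numerical = euler_area(300, MU, S_MAX, S0)
```

The two values agree closely, which confirms that the closed-form curve is the solution of the differential equation with the given initial leaf area.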
The non-linear least squares method was used to estimate the parameters of the growth model (Equation (
6)) based on the leaf area calculations obtained by the FCNN for the first 18 days. The result of the estimation is 
 1/30 min, 
 cm²
 and 
 cm²
. The relative error of the approximation of the data by the model is 5.5%. Using these coefficients it is possible to predict (extrapolate) the leaf area growth curve. The result of the fitting to the experimental data and prediction of the leaf area for 12 days ahead is shown in 
Figure 16.
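The parameter estimation step can be illustrated with a simplified stand-in for the non-linear least squares fit. For a fixed candidate maximum area, the logistic curve becomes linear in time after a logit transform, so the growth rate and initial area follow from ordinary linear regression. The candidate maximum with the smallest squared error is then kept. The data below are synthetic and noise-free, the function names are hypothetical, and this grid-scan approach is not the procedure used in the study.

```python
import math


def logistic(t, mu, s_max, s0):
    """Logistic (Verhulst) growth curve evaluated at time t."""
    return s_max / (1.0 + (s_max - s0) / s0 * math.exp(-mu * t))


def fit_for_fixed_s_max(times, areas, s_max):
    """Regress log(S / (s_max - S)) on t: the slope is mu, the intercept gives s0."""
    z = [math.log(s / (s_max - s)) for s in areas]
    n = len(times)
    mean_t = sum(times) / n
    mean_z = sum(z) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    mu = sum((t - mean_t) * (v - mean_z) for t, v in zip(times, z)) / stt
    intercept = mean_z - mu * mean_t
    s0 = s_max / (1.0 + math.exp(-intercept))
    return mu, s0


def fit_verhulst(times, areas, s_max_candidates):
    """Scan candidate s_max values and keep the fit with the smallest squared error."""
    best = None
    for s_max in s_max_candidates:
        if s_max <= max(areas):
            continue  # the logit transform requires every area to be below s_max
        mu, s0 = fit_for_fixed_s_max(times, areas, s_max)
        sse = sum((logistic(t, mu, s_max, s0) - s) ** 2 for t, s in zip(times, areas))
        if best is None or sse < best[0]:
            best = (sse, mu, s_max, s0)
    if best is None:
        raise ValueError("no candidate s_max exceeds the largest observed area")
    return best[1], best[2], best[3]


# Synthetic, noise-free data generated from known parameters (not measurements).
times = list(range(0, 400, 10))
areas = [logistic(t, 0.01, 100.0, 5.0) for t in times]
mu_hat, s_max_hat, s0_hat = fit_verhulst(times, areas, [80.0, 90.0, 100.0, 110.0, 120.0])
```

With the fitted parameters, evaluating `logistic` at later times extrapolates the growth curve in the same way the fitted model is used for the 12-day prediction.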
Using these fitted and extrapolated values for the leaf area and derived dependency between leaf area and biomass, we calculated the predicted biomass for one month including the extrapolation interval (last 12 days). The result is shown in 
Figure 17, where the predicted biomass is presented as well as the biomass measurements that were used for construction of the dependency. Also, 
Figure 17 shows the measurements of the biomass that were taken during the last 12 days and that were not included in construction of the dependency. These last seven data points were used for validation of the prediction accuracy. The average relative error of the biomass prediction reached 
.
  6. Discussion
The cucumber (
Cucumis sativus L.) is one of the most produced crops in greenhouses worldwide. Its production rate is up to 60%. Over the last few years, several dynamic or simulation models have been proposed to predict cucumber growth and yield [
43,
44]. Such process-based models include a huge number of heterogeneous input parameters describing environmental conditions and complicated mathematical models of crop growth, and they need to be tuned and adapted for each crop and growing system. Also, precise measurements of some of these parameters are possible only manually, which is time-consuming and inefficient. This means that the efficiency of simulations can be low when farmers cannot monitor all the parameters routinely. Leaf area and leaf biomass are important morphological parameters for in situ monitoring because the leaf is vital for perceiving and capturing light. Meanwhile, the traditional approach to leaf area and biomass measurements is destructive and may harm the plants. Incorporating IoT, computer vision and ANN systems can help solve these issues, because they make it possible to monitor changes in the plant’s phenology in real time and in a non-invasive way, and to produce highly satisfactory forecasts of biomass (or other target parameters). In this study, an AI-based approach to assess and predict leaf area and plant biomass was proposed and validated. Our approach can precisely estimate and predict the overall plant biomass at the early stage of growth in a non-destructive way. There is no need to carry out open-field vs. greenhouse experiments for assessing the impact of specific parameters [
13] as it can be performed via the assessment of the leaf biomass. A discussion of 3D and laser-based approaches is provided in 
Section 2.2: Leaves Modelling.
The other important outcome of our research is that the proposed methodology can be used for fundamental research that aims at finding plant characteristics and dependencies and at assessing the plant’s response to changes of the environmental parameters with high time resolution. This in turn opens wide possibilities for investigating hidden dynamics that were impossible to observe before using standard techniques. Moreover, we studied the optimal conditions for cucumber growing based on the expert knowledge of our agronomists, so this experimental design could also be used as a baseline for future studies.
Meanwhile, some limitations of this study should be highlighted. Usually, cucumber plants grow vertically in industrial greenhouses. The main problem is that the newest top leaves could overlap the lower leaves in the images, so the plant biomass may be underestimated. This limits the ability of computer vision systems (implemented for top-view monitoring) to capture the plant’s biomass over a long-term period. Technically, this issue can be resolved by deploying additional cameras or applying mathematical algorithms to automate leaf counting, but this was beyond the scope of this investigation.
  7. Conclusions
In this article, we have reported on an approach for leaf area and biomass assessment which has been validated on a real deployment. To this end, we presented and tested an industrial deployment enabled by an AI-based sensing system for robust and accurate prediction of plant growth dynamics. To collect a dataset that includes image data, environmental conditions and biomass measurements, we conducted a one-month experiment on cucumber growth in a greenhouse. Specifically, we obtained a dataset containing sequences of 9976 top-down images from 4 cameras, 480 direct measurements of biomass for a 17-day period and environmental data from sensors. First, we labeled the obtained image dataset and trained the FCNNs to perform automatic segmentation of cucumbers, achieving an IoU of 82%. Second, the trained FCNNs were applied to the sequences of images, thus reconstructing the average per-plant leaf area and growth dynamics. Then, we established the correspondence between the leaf area and biomass using the direct measurements of the biomass. Finally, this allowed us to predict the dynamics of the biomass based on the predictions of the leaf area within 10% accuracy. Overall, we propose and evaluate a highly effective and reliable data-driven pipeline for plant growth dynamics assessment and prediction using common sensors such as 2D digital cameras.