Integration of Scales and Cameras in Nondisruptive Electronic Beehive Monitoring: On the Within-Day Relationship of Hive Weight and Traffic in Honeybee (Apis mellifera) Colonies in Langstroth Hives in Tucson, Arizona, USA

The relationship between beehive weight and traffic is a fundamental open research problem for electronic beehive monitoring and digital apiculture, because weight and traffic affect many aspects of honeybee (Apis mellifera) colony dynamics. An investigation of this relationship was conducted with a nondisruptive two-sensor (scale and camera) system on the weight and video data collected on six Apis mellifera colonies in Langstroth hives at the USDA-ARS Carl Hayden Bee Research Center in Tucson, Arizona, USA, from 15 May to 15 August 2021. Three hives had positive and two hives had negative correlations between weight and traffic. In one hive, weight and traffic were uncorrelated. The correlation between weight and traffic was stronger for longer time intervals. The traffic spread and mean, when taken separately, did not affect the correlation between weight and traffic more significantly than the exact traffic counts from videos. Lateral traffic did not have a significant impact on weight.


Introduction
Hive weight is an important indicator of colony activity [1], and many amateur and commercial operations measure weight continuously to estimate colony food reserves and to gauge optimal honey harvesting times [2,3]. Weight changes are indicative of the forager loss and gain during the day and of pollination activity [4,5]. Traffic is another important factor affecting colony dynamics. Traffic at the hive entrance may predict honey weight gain [6], and rapid traffic increases at the hive entrance may be due to robbing and swarming events [7]. Continuous video traffic measurement is a nondisruptive, robust, and inexpensive method to estimate hive traffic levels [8]. Sensor-based methods of monitoring colonies have shown their effectiveness in estimating the effects of stressors (e.g., poor nutrition or agrochemical exposure) on colony foraging activity and thermoregulation that are difficult to detect using other means such as visual colony assessments by human beekeepers [9]. Electronic beehive monitoring researchers have used sensors to measure internal and external temperature, humidity, atmospheric pressure, wind direction and speed, rainfall, shortwave radiation, weight, and traffic. While many researchers have investigated the relationship between traffic and weather (e.g., [10–12]) or weight and weather (e.g., [9,13–16]), the literature on continuous beehive monitoring, with few notable exceptions (e.g., [6]), has a dearth of studies on the relationship of hive weight and traffic. This problem is fundamental, because hive weight and traffic affect many aspects of colony dynamics. Furthermore, hive weight is a function of several factors such as colony food collection and consumption, bee development and loss, moisture gain or loss due to nectar inflow, ambient humidity and bee respiration, water inflow and outflow, robbing and swarming, and external weather events. Some of these factors are associated with colony traffic, while others are not.
For example, humid weather adds to hive weight, because moist wood (and many hives worldwide are made of wood) is heavier than dry wood. The weight change due to humidity, especially at night when bees do not fly, is therefore not associated with colony traffic. Thus, if we understand the relationship between hive weight and traffic, we can use both measurements more precisely to identify behavioral markers of Apis mellifera colonies and to improve data interpretation.
The study by Marceau et al. [6] is a rare attempt to shed light on the relationship between hive weight and the traffic at the hive entrance estimated with an electronic bee counter. The study covered five Langstroth hives with Apis mellifera for 35 days in July and August at an apiary of 22 Langstroth hives 50 km west of Quebec City. The monitored period of each day was from 9:00 to 16:00. For the first four hives, traffic counts were recorded at 16:00 for the 9:00-16:00 period, and the weight difference for every 24 h period was logged at 9:00. The fifth hive was automatically monitored at 15 min intervals using data logging equipment from 9:00 to 16:00 for the mean daily traffic activity (bees/h), and the hive weight difference (kg) was recorded for the 24 h period at 9:00. The researchers proposed the quadratic model $\mathrm{GAIN} = B_0 + B_1 \cdot \mathrm{ACT}^2$, where GAIN estimates the hive honey gain (kg), ACT is the average bee activity between 9:00 and 16:00 (bees/h), and $B_0$ and $B_1$ are model coefficients. Marceau et al. reported that the honey gain varied from 28.7 to 58.4 kg and that the average bee activity for the 35 observation days varied from 19,403 bees/h for the least productive hive to 27,408 bees/h for the most productive hive. The four resulting models were very similar, with the best curve fitting obtained on the two most productive hives with $R^2 = 0.88$ and $R^2 = 0.90$. The researchers concluded that the more active a colony was, the more honey it produced, and the minimum activity rate required to obtain a positive daily gain was 14,000 bees/h. When the daily average activity remained below 14,000 bees/h, the hive weight decreased. While the findings by Marceau et al. are significant, their investigation had several important limitations. First, hive weight can be only an approximate estimate of honey gain, because the latter is included in the former. Second, the directionality of bee motion was not taken into account.
Specifically, traffic in the vicinity of the hive consists of incoming bees, outgoing bees, and laterally flying bees, which Marceau et al. did not distinguish. Third, the researchers made no attempt to distinguish the weight associated with traffic and the weight not associated with it. Fourth, the researchers did not justify why traffic at the hive entrance was estimated from 9:00 to 16:00. Research (e.g., [9]) shows that foragers start flying out as early as 5:00 and return to the hive as late as 20:30 or even later. Fifth, the datasets described in the article do not appear to be publicly available for replication, standardization, and improvement.
Our investigation addresses the gap in the literature on the relationship of hive weight and colony traffic by investigating the within-day relationship between hive weight and traffic in the vicinity of a hive with a nondisruptive two-sensor (scale and camera) electronic beehive monitoring (EBM) system. We make the following contributions to the body of research on continuous hive monitoring. First, we formulate, prove, and experimentally validate a necessary condition for the within-day independence of weight and traffic on time periods from 1 h up to 6 h. Second, our experiments indicate that the correlation of weight and traffic becomes stronger on time periods longer than 1 h. Third, while the necessary condition for the independence was experimentally verified in our investigation, the executed χ 2 tests failed to verify the implied sufficiency condition for the within-day independence of weight and traffic for any tested time period from 1 h up to 6 h. Thus, the formulation of the within-day sufficiency conditions remains an open problem for electronic beehive monitoring and theoretical apiary science. Fourth, our experiments show that some hives had positive and some hives had negative correlations between weight and traffic. We offer several conjectures on possible causes that may warrant further investigation. Fifth, the computed correlation coefficients and the executed χ 2 tests showed that lateral traffic did not have a significant impact on weight change and may be omitted in within-day computational models that predict hive weight from traffic. Sixth, our experiments suggest that the traffic spreads and means, when taken separately, did not affect the correlation of weight and traffic more significantly than the exact traffic counts. Thus, exact traffic counts may suffice as traffic estimates. 
Finally, we made public our curated datasets of time-aligned weight and traffic measures from our field deployment at the USDA-ARS Carl Hayden Bee Research Center in Tucson, Arizona (AZ), USA, in May-August 2021. These datasets can be used as benchmarks for replication, standardization, and improvement.
Since EBM is a relatively recent branch of digital apiculture and does not yet have standard terminology, we conclude the introduction with several definitions. We use the terms bee and honeybee to refer to the Apis mellifera honeybee. We use the terms hive and beehive to refer to a standard Langstroth hive or a variant thereof with an Apis mellifera colony. We define the vicinity of a hive to be the cube-shaped space in front of the hive's entrance with dimensions 3 m × 3 m × 3 m continuously monitored with a camera-computer unit. We use the term traffic to refer to all bee traffic in the vicinity of the hive. We use the term total traffic to refer to the bee traffic that includes incoming traffic (number of bees flying into the hive), outgoing traffic (number of bees flying out of the hive), and lateral traffic (number of bees flying parallel to the landing pad of the hive) over a given period of time. We use the term electronic beehive monitoring (EBM) to refer to the acquisition and analysis of digital data on the behavior of a managed bee colony through various sensors deployed in or around the hive. We use the adjectives nondisruptive and noninvasive with respect to EBM to describe the type of EBM that requires no structural modification of the hive and no deployment of active or passive sensors inside the hive or on individual bees. EBM solutions are nondisruptive insomuch as they do not disrupt any natural cycles of the monitored colonies and preserve the sacredness of the honeybee space. We note that, unlike other state-of-the-art EBM investigations (e.g., [5,16]) that rely on disruptive solutions (e.g., radio tags on bees or structural hive modifications), we used only nondisruptive methods in our study.
The remainder of our article is organized as follows. In Section 2, we detail the materials and methods of our investigation. In Section 3, we present our results. In Section 4, we discuss our results. In Section 5, we present our conclusions. Our supplementary materials include not only the datasets and additional tables and plots but also several short videos that illustrate important hardware and software aspects of our EBM system, which the readers may want to watch before proceeding to the remainder of the article. References to the figures and tables in the supplementary materials start with the prefix S (e.g., S54).

Data
The dataset was acquired during the deployment of 10 BeePi monitors (e.g., [17]) on Apis mellifera colonies in Langstroth hives at the USDA-ARS Carl Hayden Bee Research Center in Tucson, Arizona (AZ), USA (GPS coordinates: 32°13′18.274″ N, 110°55′35.324″ W) from 20 May to 15 August 2021. All colonies had Italian queens from two breeders: one in California and one in Hawaii. The queens were all painted (blue for breeder 1; green and yellow for breeder 2) to enable queen verification throughout the experiment. All queens were one year old. No hives swarmed during the monitored period. Each BeePi monitor was equipped with a Raspberry Pi 3 model B v1.2 computer coupled to a Raspberry Pi v2 8-megapixel camera. Every 15 min, each monitor took a timestamped 30-second mp4 video at 25 frames per second of the 3 m × 3 m × 3 m cube-shaped space in front of the hive on top of which the monitor was mounted. The videos were captured from 7:00 to 20:30 due to the poor visibility at the apiary site before 7:00 and after 20:30. The videos were saved on each monitor's 5 TB USB storage device. Each of the 10 hives was placed on its own stainless steel electronic scale (Tekfa model B-2418 and Avery Weigh-Tronix model BSAO1824-200; max. capacity: 100 kg; precision: ±20 g; operating temperature: −30 °C to 70 °C) linked to a 16-bit data logger (Hobo UX120-006M External Channel data logger, Onset Computer Corporation, Bourne, MA). The hive weight was measured in kilograms (kg) and logged every 5 min, which was the default time period of the data logger. Four of the ten BeePi monitors were damaged during a severe storm in Tucson, AZ, in July 2021 and were fixed in early August 2021. Because of the resulting data acquisition gap, the data from these four hives were not used in this investigation. We henceforth refer to the remaining six hives from which the weight and traffic data were collected by the IDs used in our logs: H17, H19, H41, H43, H47, and H53.
Each of the 13,353 videos (see Table 1) collected from the six hives was processed by the BeePIV algorithm [8]. BeePIV converts video frames to particle motion frames, computes particle displacement vector fields, classifies individual displacement vectors as incoming, outgoing, or lateral, and uses vector counts (non-negative integers) to measure incoming, outgoing, and lateral bee traffic. Total traffic is estimated as the sum of the incoming, outgoing, and lateral measurements. The timestamps on weight and traffic measurements were used to time-align them into one CSV file for each hive. The final dataset consisted of six CSV files (one per hive) of time-aligned incoming (IN), outgoing (OUT), lateral (LAT), and total (TOT) counts and weight measurements. Since weight measurements were logged every 5 min while traffic measurements were logged every 15 min, each weight measurement time-aligned with a traffic measurement was computed as the mean of the three weight measurements the middle of which had the same timestamp as the traffic measurement. Weight measurements were raw in that they included external impacts (e.g., someone placing a heavy object such as a brick or a super on top of the hive). Thus, when the weight rose or dropped abruptly by ≈20 kg and returned to the previous level within 30 min, which is physically impossible in a real beehive, the measurement was considered to reflect an external impact and was replaced with the mean of the neighbors before and after it. Table 1 summarizes the information on the CSV data files.
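The time alignment and the removal of external impacts described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names `align_weights` and `replace_spikes`, the minute-of-day timestamps, the 20 kg default threshold, and the restriction to single-reading spikes are all hypothetical simplifications.

```python
from statistics import mean

def align_weights(weight_log, traffic_times, step=5):
    """Pair each traffic timestamp with the mean of three weight readings.

    weight_log: dict {minute_of_day: weight_kg}, logged every `step` minutes.
    traffic_times: iterable of minutes of day, one per video (every 15 min).
    Timestamps lacking any of the three readings are skipped (an assumption).
    """
    aligned = []
    for t in traffic_times:
        window = [weight_log.get(m) for m in (t - step, t, t + step)]
        if None not in window:
            aligned.append((t, mean(window)))
    return aligned

def replace_spikes(weights, jump_kg=20.0):
    """Replace single-reading spikes with the mean of their two neighbors.

    A reading counts as a spike when it differs from BOTH neighbors by at
    least `jump_kg`. Longer excursions (up to 30 min) are not handled here.
    """
    cleaned = list(weights)
    for i in range(1, len(weights) - 1):
        before, here, after = weights[i - 1], weights[i], weights[i + 1]
        if abs(here - before) >= jump_kg and abs(here - after) >= jump_kg:
            cleaned[i] = (before + after) / 2
    return cleaned
```

In this sketch the spike filter would be applied to the raw 5 min weight series first, and the cleaned series would then be aligned with the 15 min traffic timestamps.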

Hive Inspection and Treatment
All monitored hives had regular hive inspections carried out by the fifth author for the duration of the experiment. We counted frames of bees and mite drops and logged qualitative brood assessments. Frames of bees are counts of individual frames in a hive that are completely covered by bees on both sides. If only one side of a frame is completely covered with bees, then it is counted as 1/2 of a full frame. It should be noted that such measurements as 1/2, 1/4, 1/8 of a full frame are visual assessments by the beekeeper. Counts of frames of bees are an estimate of the overall health of a colony. Mite drops are counts of Varroa mites on a sticky board. A sticky board is a thin (≈2 cm thick) rectangular piece of corrugated plastic on which a thin film of Vaseline (or other adhesive substances such as plant oil) is placed with a paper towel. The sticky board was inserted into the screened bottom board underneath each monitored hive. As mites drop from bees in a colony, they stick to the board and can be visually counted by the beekeeper. Greater mite drops indicate higher levels of mite infestation, which may negatively impact the colony's productivity. Brood assessments are the beekeeper's qualitative assessments of the brood's condition. We used the following qualitative labels in our brood assessments: straight, straight with punctured caps, spotty, PMS (parasitic mite syndrome), no brood, and chalk brood. Straight brood indicates a productive laying queen. Straight brood with punctured caps also indicates a productive queen with potential minor laying problems. Spotty brood shows that a queen lays eggs in isolated and typically disconnected cell regions, which may cause productivity problems or colony failure later on. PMS characterizes the brood with white larvae that appear chewed or sunken on the side of some cells. Chalk brood is caused by a fungus called Ascosphaera apis. 
Frames affected by chalk brood have white chunks of mummified brood that resemble small pieces of white chalk. This disease infects a hive through reproductive spores attached to pollen, robbing bees, or tools used in already infected hives. Hives infected with chalk brood often fail and present a danger to the other hives in the apiary due to bee drift. Apivar strips were applied to all monitored hives on 8 July 2021 to treat Varroa mites. We assessed the population strength of each colony through the weight of the adult bee mass (bee mass), computed by subtracting the combined weight of the woodenware, the electronics, and the frames without the bees from the total weight of the hive [9]. The adult bee mass was measured twice on each hive: at the beginning (June 2021) and at the end (August 2021) of the monitored period.

Random Variables and Correlations
We measured hive weight and traffic as two jointly observed random variables $W$ (weight) and $T$ (traffic). Our samples were
$$(W_1, T_1), (W_2, T_2), \ldots, (W_n, T_n),$$
where $W_i$ and $T_i$, $1 \le i \le n$, are time-aligned weight and traffic measurements for a given hive (see Table 1 for specific values of $n$). We use the notation $W_t$ and $T_t$ to denote random variables whose values range over the values of $W$ and $T$ at time $t$. The correlation between two random variables is typically measured with Pearson's, Spearman's, and Kendall's correlation coefficients [18][19][20], denoted as $\rho_P$, $\rho_S$, and $\rho_K$, respectively. We tested the absence against the presence of correlation with the following hypotheses:
$$H_{\rho,0}: \rho = 0, \qquad H_{\rho,1}: \rho \neq 0,$$
where $\rho = \rho_P$, $\rho = \rho_S$, or $\rho = \rho_K$. $H_{\rho,0}$ was rejected in favor of $H_{\rho,1}$ at $p \le 0.05$. We use the notation $\hat{\rho}_P$, $\hat{\rho}_S$, and $\hat{\rho}_K$ to denote the computed estimates of $\rho_P$, $\rho_S$, and $\rho_K$. Thus, we computed Pearson, a statistical measure of linear dependency between two variables, of $n$ measurements $(W_i, T_i)$ as
$$\hat{\rho}_P = \frac{\sum_{i=1}^{n} (W_i - \bar{W})(T_i - \bar{T})}{\sqrt{\sum_{i=1}^{n} (W_i - \bar{W})^2} \sqrt{\sum_{i=1}^{n} (T_i - \bar{T})^2}},$$
where $\bar{W}$ and $\bar{T}$ are the sample means of the weight and traffic measurements.
Spearman and Kendall measure a monotonic association between two random variables and are more robust than Pearson, because they rely on ranks. The rank of an observation $W_i$, $R(W_i)$, is its position in a list of measurements for a random variable sorted in ascending order. For example, if there are four measurements $W_1 = 6.3$, $W_2 = 8$, $W_3 = 2.5$, and $W_4 = 7.4$, then the sorted list is $(W_3 = 2.5, W_1 = 6.3, W_4 = 7.4, W_2 = 8)$ and the ranks are $R(W_1) = 2$, $R(W_2) = 4$, $R(W_3) = 1$, and $R(W_4) = 3$. We computed Spearman and Kendall of $n$ time-aligned measurements $(W_i, T_i)$ as
$$\hat{\rho}_S = \frac{\sum_{i=1}^{n} (R(W_i) - \bar{R}_W)(R(T_i) - \bar{R}_T)}{\sqrt{\sum_{i=1}^{n} (R(W_i) - \bar{R}_W)^2} \sqrt{\sum_{i=1}^{n} (R(T_i) - \bar{R}_T)^2}}, \qquad \hat{\rho}_K = \frac{2}{n(n-1)} \sum_{i<j} \left( \mathbb{1}\{W_i \lessgtr W_j, T_i \lessgtr T_j\} - \mathbb{1}\{W_i \lessgtr W_j, T_i \gtrless T_j\} \right),$$
where $\bar{R}_W$ and $\bar{R}_T$ are the mean ranks, $\mathbb{1}\{\cdot\}$ is the indicator of an event, and the event $\{W_i \lessgtr W_j, T_i \lessgtr T_j\}$ refers to the situation when the comparison signs between $W_i$ and $W_j$ and between $T_i$ and $T_j$ are the same. Thus, if $W_i < W_j$, then $T_i < T_j$, and if $W_i > W_j$, then $T_i > T_j$. The event $\{W_i \lessgtr W_j, T_i \gtrless T_j\}$ refers to the situation when the comparison signs between $W_i$ and $W_j$ and between $T_i$ and $T_j$ are different. Thus, if $W_i > W_j$, then $T_i < T_j$, and if $W_i < W_j$, then $T_i > T_j$.
We computed the correlation coefficient heat map of weight (W) and the five traffic types for each hive: IN, OUT, difference between IN and OUT (IN-OUT), sum of IN and OUT (IN+OUT), and total (TOT=IN+OUT+LAT). We used the autocorrelation function (ACF) to detect non-randomness in weight and traffic viewed as time series [21]. The ACF evaluates the similarity between a time series (i.e., the signal) and its copy with a shift, which is referred to as a lag. The ACF is a function of the lag. Let $S_1, S_2, \ldots, S_n$ be a time series of observations. Then the ACF of the lag $L$ is defined as the Pearson correlation $\rho_P$ between $S_i$ and $S_{i+L}$ as
$$\hat{\rho}_P(L) = \frac{\sum_{i=1}^{n-L} (S_i - \bar{S}_{1:(n-L)})(S_{i+L} - \bar{S}_{(L+1):n})}{\sqrt{\sum_{i=1}^{n-L} (S_i - \bar{S}_{1:(n-L)})^2} \sqrt{\sum_{i=1}^{n-L} (S_{i+L} - \bar{S}_{(L+1):n})^2}},$$
where $\bar{S}_{1:(n-L)}$ and $\bar{S}_{(L+1):n}$ are the mean values of the series $S_1, S_2, \ldots, S_{n-L}$ and $S_{L+1}, S_{L+2}, \ldots, S_n$, respectively. Autocorrelation plots are used to visually assess the presence of trends and cycles in data. A trend is a pattern in a time series that does not repeat, at least within the captured period. Cyclicity is a component that regularly repeats itself over time. If a time series has a trend, the ACF does not reach zero unless the lag is sufficiently long. If a time series contains a significant cycle, the autocorrelation plot typically shows spikes at multiples of lags equal to the period.
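The three correlation coefficients and the ACF described in this section can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, Spearman is taken to be the Pearson correlation of the ranks, Kendall is taken to be the normalized difference between same-sign and different-sign pairs, and ties are assumed to be absent.

```python
from math import sqrt

def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks in ascending order (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result

def spearman(x, y):
    """Spearman correlation computed as Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall(x, y):
    """(same-sign pairs - different-sign pairs) / (n choose 2)."""
    n = len(x)
    same = different = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                same += 1
            elif s < 0:
                different += 1
    return 2 * (same - different) / (n * (n - 1))

def acf(series, lag):
    """Pearson correlation between S_i and S_{i+lag}."""
    return pearson(series[:len(series) - lag], series[lag:])
```

For the four weight measurements used as the rank example in the text, `ranks([6.3, 8, 2.5, 7.4])` returns `[2, 4, 1, 3]`.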

Weight and Traffic Changes
Since each BeePi monitor captured a 30-s video every 15 min from which the BeePIV algorithm extracted the integers IN, OUT, LAT, and TOT, we measured the change in traffic over a lag of $15k$ minutes, denoted $\Delta_k T_t$. Thus, if $k = 4$, then the lag is 1 h; if $k = 8$, then the lag is 2 h, etc. The values of $\Delta_k T_t$ were logarithmically transformed to bring the distribution of the rate of change closer to normal, because they approximately follow a log-normal distribution (see Figures S1 and S2 in the Supplementary Materials). Since weight measurements were logged every 5 min, for every $T_i$ the time-aligned $W_i$ was computed as the mean of $W_{i-1}$, $W_i$, and $W_{i+1}$. The change in weight over a lag of $15k$ minutes was computed as
$$\Delta_k W_t = W_{1+tk} - W_{1+(t-1)k},$$
where the values of $1 + tk$ and $1 + (t-1)k$ were chosen so that $W_{1+tk}$ and $W_{1+(t-1)k}$ always belonged to the video monitoring period (7:00-20:30) of the same day. The variables $\Delta_k W_t$ were assumed to be independent and identically distributed for any fixed $k$. Specifically, if $k \neq k'$, then $\Delta_k W_t$ and $\Delta_{k'} W_t$ may not have identical distributions. However, if $d$ and $d'$ are two different days, $\Delta_k W_t$ on day $d$ is assumed to be independent of $\Delta_k W_t$ on day $d'$, but the distribution of $\Delta_k W_t$ is assumed to be identical on $d$ and $d'$. The variables $\Delta_k T_t$ were analogously assumed to be independent and identically distributed. We denote the weight change over the lag $15k$ as $\Delta_k W$ and the traffic change over the same lag as $\Delta_k T$. The changes in the variance ($\sigma^2$) and the mean ($\mu$) of $T$ over the same lag are denoted $\Delta_k \sigma^2(T)_t$ and $\Delta_k \mu(T)_t$, respectively. An example of calculating $\Delta_k W$, $\Delta_k T$, $\Delta_k \sigma^2(T)_t$, and $\Delta_k \mu(T)_t$ is given in Appendix A.
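The within-day weight change over a lag can be illustrated with a short sketch. It assumes, as an interpretation of the indexing above, that the weight change is the difference between two time-aligned weights that are $k$ records (i.e., $15k$ minutes) apart on the same day; the function name `weight_changes` and the list-per-day layout are hypothetical.

```python
def weight_changes(day_weights, k):
    """Within-day weight changes over a lag of 15*k minutes.

    day_weights: time-aligned weights (kg) of ONE day at 15 min spacing,
    starting with the first record of the video monitoring period.
    Returns W[t*k] - W[(t-1)*k] for t = 1, 2, ... in 0-based indexing,
    which corresponds to W_{1+tk} - W_{1+(t-1)k} in 1-based notation.
    Processing one day at a time keeps both readings on the same day.
    """
    last_t = (len(day_weights) - 1) // k
    return [day_weights[t * k] - day_weights[(t - 1) * k]
            for t in range(1, last_t + 1)]

# Example: nine records (2 h of data) and k = 4 give two 1 h changes.
hourly = weight_changes([40.0, 40.1, 40.2, 40.3, 40.5,
                         40.6, 40.7, 40.8, 41.0], k=4)
```

Calling the function separately for each day, rather than on the concatenated series, is what prevents a change from spanning the unmonitored night period in this sketch.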

Joint Probabilities
We defined the function
$$D_{\epsilon}(Z_t) = \begin{cases} 1, & \text{if } \Delta_k Z_t \ge \epsilon, \\ 0, & \text{otherwise.} \end{cases}$$
The function divides the values of $\Delta_k Z_t$ into two categories: 0 and 1. Thus, if $\epsilon_W$ and $\epsilon_T$ are two thresholds for the change in weight and traffic, respectively, then $D_{\epsilon_W}(W_t) = 1$ signifies a change in weight between times $t-1$ and $t$ at or above $\epsilon_W$, while $D_{\epsilon_T}(T_t) = 1$ signifies a change in bee traffic between $t-1$ and $t$ at or above $\epsilon_T$. If $Z_i$ is the estimate of $Z_t$ at time $i$ (i.e., the observed value), the probability of $\{D_{\epsilon}(Z_t) = 1\}$ can be estimated as the average of the occurrences of this event in $n$ trials as
$$P^*(D_{\epsilon}(Z_t) = 1) = \frac{1}{n} \sum_{i=1}^{n} D_{\epsilon}(Z_i).$$
A necessary condition of the independence between the random variables $X_t$, $Y_t$, $X_{t-1}$, and $Y_{t-1}$ can be formulated as follows and proved as a theorem (a proof is in Appendix A).
A Necessary Condition for Independence (NCI): If $X_t$, $X_{t-1}$, $Y_t$, $Y_{t-1}$ are discrete independent random variables, then, for any thresholds $\epsilon_X$ and $\epsilon_Y$,
$$P(D_{\epsilon_X}(X_t) = 1, D_{\epsilon_Y}(Y_t) = 1) = P(D_{\epsilon_X}(X_t) = 1) \cdot P(D_{\epsilon_Y}(Y_t) = 1). \tag{11}$$
Equation (11) is a paradox in the following sense. If $\epsilon_W = \epsilon_T = 0$, both $W_t$ and $T_t$ change in most trials between times $t-1$ and $t$, which makes the left side of Equation (11) true. However, if no restrictions are placed on weight and traffic measurements through $\epsilon_W$ and $\epsilon_T$, $W_t$ does not depend on $T_t$, which makes the right side of Equation (11) true. Thus, the thresholds for weight and traffic, $\epsilon_W$ and $\epsilon_T$, must be above zero. Furthermore, the traffic threshold $\epsilon_T$ must be constrained further, because otherwise the joint probability of $\{D_{\epsilon_W}(W_t) = 1\}$ and $\{D_{\epsilon_T}(T_t) = 1\}$ is 0 whenever $P(D_{\epsilon_T}(T_t) = 1)$ is 0. To establish the feasible values of $\epsilon_W$ and $\epsilon_T$, we used the $n$ records of each dataset (see Table 1) to compute, for each $k \in \{4, 8, \ldots, 24\}$ (i.e., for lags from 1 h up to 6 h), the sets of feasible threshold values bounded above by $\epsilon^*_W$ and $\epsilon^*_T$. We also tested the necessary condition for independence (NCI) in Equation (11) on our dataset for each tuple of the three feasible values (i.e., $k$, $0 < \epsilon_W < \epsilon^*_W$, $0 < \epsilon_T < \epsilon^*_T$) by computing for each hive the threshold $\theta$ such that
$$\left| P^*(D_{\epsilon_W}(W_t) = 1, D_{\epsilon_T}(T_t) = 1) - P^*(D_{\epsilon_W}(W_t) = 1) \cdot P^*(D_{\epsilon_T}(T_t) = 1) \right| \le \theta \tag{16}$$
in order to discover the upper bound of the difference between the joint and marginal probabilities. The value of $\theta$ in Expression (16), to which we refer as maxD, was computed for all 15 traffic measurements. We also computed the argmaxima of absolute differences between the joint probability (the left-hand side of Equation (11)) and the product of the marginal probabilities (the right-hand side of Equation (11)) for all hives.
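The empirical check of the necessary condition can be sketched as a comparison of the estimated joint probability with the product of the estimated marginal probabilities over a grid of thresholds. The sketch below is illustrative only: the function names `indicator`, `nci_gap`, and `max_d`, the threshold grids, and the convention that the indicator equals 1 when a change is at or above its threshold are assumptions made for this example.

```python
def indicator(changes, eps):
    """1 where the change is at or above the threshold eps, else 0."""
    return [1 if c >= eps else 0 for c in changes]

def nci_gap(dw, dt, eps_w, eps_t):
    """|P*(joint) - P*(weight) * P*(traffic)| for one threshold pair.

    dw, dt: equally long lists of weight and traffic changes for one lag.
    Probabilities are estimated as relative frequencies over the n trials.
    """
    n = len(dw)
    a, b = indicator(dw, eps_w), indicator(dt, eps_t)
    joint = sum(x * y for x, y in zip(a, b)) / n
    return abs(joint - (sum(a) / n) * (sum(b) / n))

def max_d(dw, dt, eps_w_grid, eps_t_grid):
    """Largest gap over all threshold pairs (the quantity called maxD)."""
    return max(nci_gap(dw, dt, ew, et)
               for ew in eps_w_grid for et in eps_t_grid)
```

A gap close to zero for every feasible threshold pair is consistent with the necessary condition but, as the text stresses, does not by itself establish independence.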

X 2 Tests
The NCI implies a sufficiency criterion of the independence of X t , X t−1 , Y t , Y t−1 , which can be formulated as follows.
A Sufficiency Condition for Independence (SCI): Let $X_t$, $X_{t-1}$, $Y_t$, $Y_{t-1}$ be discrete random variables. If, for any $\epsilon_X$ and $\epsilon_Y$,
$$P(D_{\epsilon_X}(X_t) = 1, D_{\epsilon_Y}(Y_t) = 1) = P(D_{\epsilon_X}(X_t) = 1) \cdot P(D_{\epsilon_Y}(Y_t) = 1),$$
then $X_t$, $X_{t-1}$, $Y_t$, $Y_{t-1}$ are independent.
Since we cannot prove this criterion as a theorem, because, as of now, it is unclear to us how to formulate general, entomologically realistic assumptions on the distributions of the variables, we executed $\chi^2$ tests to estimate the independence of $W_t$, $W_{t-1}$, $T_t$, $T_{t-1}$ on our dataset. For each sample $(W_1, T_1), \ldots, (W_n, T_n)$, we introduced $k$ grouping intervals $\Delta_1, \ldots, \Delta_k$ for the values of $\Delta_k W$ and $m$ grouping intervals $\nabla_1, \ldots, \nabla_m$ for the values of $\Delta_k T$. We used the following hypotheses in the $\chi^2$ tests:
$$H_{\chi^2,0}: p_{rc} = p_r \cdot p_c \text{ for all } r, c; \qquad H_{\chi^2,1}: p_{rc} \neq p_r \cdot p_c \text{ for some } r, c,$$
where $p_{rc}$ is the probability of an observation belonging to $\Delta_r \times \nabla_c$, $p_r$ is the probability of $\Delta_k W \in \Delta_r$, and $p_c$ is the probability of $\Delta_k T \in \nabla_c$. We split the domain of $\Delta_k W$ and $\Delta_k T$ into sub-intervals with the same probability to ensure that the probability of $\Delta_k W_t$ falling into any sub-interval is equal for all intervals. Since the true distributions of $W$ and $T$ are unknown, we separated both domains into intervals with the same number of counts from our dataset, which is the standard approach in $\chi^2$ tests [22]. The literature on the $\chi^2$ tests has two recommendations, which we used for all lags in our $\chi^2$ tests. The first recommendation, REC 1, is that each cell in the $\chi^2$ cumulative table for $(W_t, T_t)$ contain at least 5 observations on average, which is a standard requirement for the practicability of $\chi^2$ tests [22]. The second recommendation, REC 2, is to split the domain of the investigated samples into a number of intervals $m$ in accordance with the number of samples $n$. For example, if $n$ is in $[40, 100)$, then $m \in [7, 9]$, and if $n$ is in $[100, 500)$, then $m \in [8, 12]$ [23]. If REC 2 is followed, the deviation of the histogram from the actual distribution density is minimal [24].
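The equal-count grouping and the $\chi^2$ statistic of independence can be sketched as follows. The sketch is a generic textbook computation under stated assumptions, not the code used in the study: the function names `equal_count_bins` and `chi2_statistic` are hypothetical, tied values are split between neighboring groups arbitrarily, and the p-value (which requires the $\chi^2$ distribution function) is left out.

```python
def equal_count_bins(values, m):
    """Assign each value a group index 0..m-1 with (nearly) equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for position, i in enumerate(order):
        bins[i] = position * m // len(values)
    return bins

def chi2_statistic(row_bins, col_bins, n_rows, n_cols):
    """Pearson chi-square statistic and degrees of freedom.

    row_bins, col_bins: group indices of the weight and traffic changes.
    Expected counts follow the null hypothesis p_rc = p_r * p_c.
    """
    n = len(row_bins)
    table = [[0] * n_cols for _ in range(n_rows)]
    for r, c in zip(row_bins, col_bins):
        table[r][c] += 1
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[r][c] for r in range(n_rows))
               for c in range(n_cols)]
    stat = 0.0
    for r in range(n_rows):
        for c in range(n_cols):
            expected = row_tot[r] * col_tot[c] / n
            if expected > 0:
                stat += (table[r][c] - expected) ** 2 / expected
    return stat, (n_rows - 1) * (n_cols - 1)
```

With equal-count groups, the average number of observations per cell is $n$ divided by the number of cells, which gives a quick way to check the first recommendation before running the test.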
To estimate the impact of lags on the relationship between hive weight and traffic, we computed the $H_{\chi^2,0}$ rejection ratios for all hives and lags from 1 h up to 6 h in 1 h increments. We also computed the correlation coefficients and their p-values for all hives and the same lags between $W_t$ and $T_t$ to see if these two different statistical methods agree on the impact.
Further tables are provided in the supplementary materials (see Tables S1-S5). For space considerations, we chose to present the results in terms of hive H17 as a representative of the group that included hives H17, H19, H41, H43, and H53, because these hives exhibited similar trends and patterns different from those of hive H47.
The autocorrelation plots for weight and total traffic (TOT) for hive H17 are given in Figure 2. The plots for the other hives are in the supplementary materials (Figures S3-S7). The ACF plots indicate that both weight and traffic include trend components insomuch as the amplitudes gradually decrease as the lag increases. Figure 2a,c show that trends and cycles are present in the weight data. Figures S3-S7 in the supplementary materials indicate that the weight and traffic of the other hives also exhibit trends and cyclical patterns. Figure 2b reflects a cyclical pattern in TOT with a period of 54 points (i.e., the full number of records per day) with the ACF peaks corresponding to L ≈ 54 and L ≈ 108.

Results
The computation of Expression (16) on our dataset showed that if the difference between $t$ and $t-1$ (i.e., the lag) was no longer than 6 h, then for any $0 < \epsilon_W < \epsilon^*_W$ and $0 < \epsilon_T < \epsilon^*_T$, the difference between the joint probability $P^*(D_{\epsilon_W}(W_t) = 1, D_{\epsilon_T}(T_t) = 1)$ and the product of the marginal probabilities $P^*(D_{\epsilon_W}(W_t) = 1) \cdot P^*(D_{\epsilon_T}(T_t) = 1)$ did not exceed 0.15. In other words,
$$\left| P^*(D_{\epsilon_W}(W_t) = 1, D_{\epsilon_T}(T_t) = 1) - P^*(D_{\epsilon_W}(W_t) = 1) \cdot P^*(D_{\epsilon_T}(T_t) = 1) \right| \le 0.15. \tag{18}$$
Figure 3 shows the graphs of the joint and marginal probabilities for lags of 1 h, 3 h, and 6 h and five traffic types for all values of $0 < \epsilon_W < \epsilon^*_W$ and $0 < \epsilon_T < \epsilon^*_T$. Tables 4, S6 and S7 give the argmaxima of the absolute differences between the joint probabilities and the products of the marginal probabilities for the exact counts, variances, and means of incoming, outgoing, total, and lateral traffic, as well as of the difference between incoming and outgoing traffic and the sum of incoming and outgoing traffic, together with the corresponding values of $\epsilon_W$ and $\epsilon_T$ for hive H17. Tables 5, S41 and S42 give the $\chi^2$ statistics C for the exact traffic counts, their variances, means, and different lags, and their p-values for hive H17. Table 6 gives the traffic types and lags for which $H_{\chi^2,0}$ was rejected at p ≤ 0.05, which occurred in 142 out of 540 tested cases.
Table 6. Lags for which $H_{\chi^2,0}$ was rejected for monitored hives; IN: incoming traffic; OUT: outgoing traffic; TOT: total traffic; $\sigma^2(X)$: variance of X; $\mu(X)$: mean of X; a separate symbol marks the traffic types for which $H_{\chi^2,0}$ was not rejected for any lag; bolded lags are the lags for which both recommendations at the end of Section 2.6 are satisfied.
Table 7 gives the Pearson coefficients and their p-values for hive H17 for different lags; the analogous tables for the other hives are in the supplementary materials (see Tables S58-S62). Tables 8-12 show our counts of the frames of bees, mite drops, brood quality assessments, adult bee mass measurements, and the queen status inspections.
Table 11. Bee mass (kg) in monitored hives measured using the method in [9]; N/A: not applicable.
Figure 1 shows that the correlation coefficients between weight and all traffic measurements, except for IN-OUT, are relatively close to each other.
Hive H17 was the only hive with notable correlations between IN-OUT and weight for all three coefficients. Hive H53 showed strong correlations between weight and the four traffic measurements (IN, OUT, TOT, and IN+OUT). Hives H19 and H41 showed moderately positive correlations which were not as strong as those of hive H53. Hives H17 and H43 showed negative correlations, with the H17 correlations being more negative than those of H43. H17 was the only hive for which the sign of the correlation coefficients between $W_t$ and IN-OUT was different from the sign of those for IN, OUT, TOT, and IN+OUT. All correlations of H47 were ≈0, which suggests that measured traffic and weight for this hive were uncorrelated. Table 2 shows that H47 was the only hive for which $H_{\rho,0}$ was not rejected for any coefficient and any traffic type at p ≤ 0.05. For all other hives, $H_{\rho,0}$ was rejected at p ≤ 0.05. H47 was the only hive whose weight did not increase with time. The weight of H47 oscillated between 40 and 41.5 kg throughout the experiment (see Figure 4). This observation leads to a conjecture that weight and traffic may not be correlated in hives that are not gaining weight, which merits further investigation. There may exist a traffic threshold per given lag below which a hive does not gain weight. Marceau et al. [6] experimentally concluded that for a positive daily gain the traffic must be at least 14,000 bees/h, but did not elaborate on whether this was mean traffic or real traffic. Our assumption of the independence and identical distribution of the values of $\Delta_k W_t$ (and $\Delta_k T_t$) on different days is reasonable insomuch as it is hard to assume that the differences in weight (and traffic) taken on a specific time interval on one day are related to the differences in weight (and traffic) taken on the same time interval but on a different day. More field data are required to investigate the independence and identical distribution of $\Delta_k W_t$ and $\Delta_k T_t$ for different lags.

Discussion
Tables 3, 4, S6 and S7 for hive H17, the analogous tables for the other hives in the supplementary materials, and the plots in Figure 3 give an experimental validation of the result in Inequality (18) in that the difference between the joint probabilities and the products of the marginal probabilities was relatively small. In other words, weight $W_t$ and traffic $T_t$ might have been necessarily (but not sufficiently!) independent in the monitored hives so long as the lag did not exceed 6 h. The joint probabilities and the marginal probability products in Tables 4 and S6-S22 diverge by no more than 0.15 across all 15 traffic measurements for all hives (see Inequality (18)). The maximum absolute difference occurred mostly on the median values for weight and traffic. The divergence as a function of $\epsilon_W$ and $\epsilon_T$ reached its maximum when $\epsilon_W$ and $\epsilon_T$ were such that $P^*(D_{\epsilon_W}(W_t) = 1) \approx P^*(D_{\epsilon_T}(T_t) = 1) \approx 0.5$ (see Tables S23-S40). As the tables with $P^*$ at the extrema in the supplementary materials show, this observation held for all hives. Inequality (18) suggests that the events $\{\Delta_k W_t \ge \epsilon_W\}$ and $\{\Delta_k T_t \ge \epsilon_T\}$ are independent for any $\epsilon_W$ and $\epsilon_T$, for all 15 traffic measurements and all lags from 1 h up to 6 h. Consequently, the necessary condition of independence (NCI) of weight and traffic was experimentally verified on our dataset. Of course, this verification does not imply that weight and traffic were independent on the tested lags, because our $\chi^2$ results failed to verify the implied sufficiency condition for independence (SCI). The results in Table 5, Tables S41-S57 in the supplementary materials, and Table 6 indicate that there was an association between $W_t$ and traffic measurements at p ≤ 0.05 for all hives. However, these results should be interpreted with caution, because the $\chi^2$ tests fail to distinguish $H_{\chi^2,0}$ from $H_{\chi^2,1}$ if the probabilities of the observations falling into the partitioning intervals are the same for both hypotheses.
On our dataset, the longer the lag, the fewer records were available for that lag. More field deployments are needed to increase the number of records for longer lags. Several rejections of $H_{\chi^2,0}$ at smaller lags suggest that weight and traffic may be correlated even on smaller lags. The hives whose weight and several traffic measures were related at lags of 1 h and 2 h (hives H17, H19, and H53) saw significant weight gains (+20 kg) over the entire monitored period of May-August 2021. This might indicate that the more frequently foragers leave the hive, the more weight the hive gains. It is unclear, however, whether the weight gain of a hive depends on the forager fly rate or on the efficiency of individual foragers (i.e., how much payload each forager brings back to the hive). Table 6 indicates that there is a relationship between traffic as measured by IN, TOT, IN+OUT, µ(OUT), µ(TOT), and µ(IN+OUT) and the weight for all hives.
The Pearson coefficients in Table 7 indicate that as the lag increased from 1 h to 6 h, the absolute value of the Pearson coefficient increased, which means that on longer lags $W_t$ and $T_t$ became more correlated for hive H17 and the other hives (see the tables in the supplementary materials), except for H47. This observation, however, should not be interpreted as causality, because there may have been an underlying hidden variable (e.g., the declining health of the queen) that caused the two random variables to change in tandem. The p-values in Table 7 show that for hive H17, $H_{\rho,0}$ was rejected for all lags. Similar observations (i.e., stronger Pearson correlation between $W_t$ and $T_t$ on longer lags, and $H_{\rho,0}$ rejection for all lags) can be made on the analogous tables for all other hives, except for hive H47, in the supplementary materials (see Tables S58-S62). For H47, $H_{\rho,0}$ was not rejected for any lag. H41 and H47 were the only two hives that did not have significant gains in both weight (see Figure 4) and bee mass (see Table 11) at the end of the experiment. The weight of H41 increased by only 6 kg, and the weight of H47 remained essentially the same. However, H41 had positive within-day correlations between weight and traffic, while the same correlations in H47 were negative and close to zero. H47 had lower counts of frames of bees, with the counts falling from 6 on 27 May 2021 to 5 on 13 August 2021 (see Table 8). H41 was the only other hive where the frames of bees fell (from 6 to 4) during the same period. For H47, the mite drop started at 1.7 (the lowest of all hives) at the beginning of the monitored period and rose to 3.7 in August 2021 (see Table 9). Table 10 shows that H47 and H41 were infected with PMS and had spotty brood patterns. Table 11 shows that H47 had the lowest bee mass in June 2021 and the second lowest bee mass at the end of the experiment in August 2021, which may explain why H47 had the lowest mite drop.
In H41, a supersedure queen was present in the hive for about one week. This queen was removed on 8 June 2021 (see Table 12) and replaced by a new one-year-old Italian queen from the same breeder from the queen bank at the USDA-ARS center in Tucson, AZ. The new queen might have been more productive than the removed queen, which may explain why H41 had positive and H47 negative correlations between weight and traffic in Figure 1. The supersedure queen may have mated with Africanized feral drones in Tucson, AZ, which may have affected traffic patterns in H41. In H47, an Italian queen from breeder 1 was removed on 19 May 2021 (four days after the start of the experiment) due to lack of productivity and replaced with a one-year-old Italian queen from breeder 2 from the same queen bank. Thus, H47 was the only hive where a one-year-old Italian queen from one breeder was replaced by a one-year-old Italian queen from a different breeder, which may have been a contributing factor in the lack of correlation between hive weight and traffic in this hive. All other queens survived, and no other hives were re-queened during the monitored period.
Tables 13 and 14 summarize the $H_{\chi^2,0}$ rejection ratios for all lags. Table 13 takes into account the $\chi^2$ results for all lags, while Table 14 gives the results for the lags satisfying both recommendations, REC 1 and REC 2, for $\chi^2$ tests at the end of Section 2.6. Table 13 shows that as the lag increased from 1 h to 6 h, the frequency of $H_{\chi^2,0}$ rejection also increased, indicating that for all hives $W_t$ and $T_t$ became more frequently associated as the lags became longer, which corroborates the result obtained with the Pearson coefficients in Table 7. The same tendency is observed in Table 14. Therefore, the observation that the strength of the correlation between $W_t$ and $T_t$ increased with the lag was confirmed by both the correlation coefficients and the $\chi^2$ results.
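The notion of a rejection ratio used above can be illustrated with a short Python sketch of Pearson's chi-square test of independence on contingency tables. This is a sketch under assumptions, not the procedure of Section 2.6: the function names, the way differences are binned into categories, and the example tables are all hypothetical, and the decision uses the standard 0.05 critical values of the chi-square distribution for 1 to 4 degrees of freedom instead of computing p-values.

```python
# Standard upper 5% critical values of the chi-square distribution.
CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi2_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    table: list of rows of observed counts; every row and column total
    must be positive so that all expected counts are nonzero.
    """
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def rejects_independence(table):
    """True if the independence hypothesis is rejected at the 0.05 level."""
    stat, dof = chi2_statistic(table)
    return stat > CRIT_05[dof]


def rejection_ratio(tables):
    """Fraction of tables for which independence is rejected."""
    return sum(rejects_independence(t) for t in tables) / len(tables)


# Hypothetical tables: rows are weight-difference categories,
# columns are traffic-difference categories (illustrative counts only).
tables_for_one_lag = [
    [[10, 10], [10, 10]],  # no association
    [[20, 5], [5, 20]],    # clear association
]
print(rejection_ratio(tables_for_one_lag))
```

Computing this ratio separately for each lag and comparing the values across lags mirrors the kind of summary reported in Tables 13 and 14.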
Additional field deployments will enable us to collect more data in order to investigate which lags maximize weight-traffic correlation and to find computational methods for determining the optimal duration of traffic observations for accurate weight prediction from traffic. Table 15 gives the $H_{\chi^2,0}$ rejection ratios for exact traffic measurements, traffic measurement variances, and traffic measurement means and indicates that the rejection ratios are essentially the same for the three statistics, which suggests that traffic variance and traffic mean, when taken separately, do not affect the correlation between $W_t$ and $T_t$ more significantly than exact traffic counts. Thus, the latter may suffice in computational models that relate weight and traffic.
While the necessary condition for the independence (NCI) of weight and traffic was experimentally verified, the $\chi^2$ tests failed to verify the implied sufficiency condition for independence (SCI) for any lag. A possible explanation is that $D(\cdot)$ in Equation (8) divides the values of $\Delta_k W$ and $\Delta_k T$ into two categories, while the $\chi^2$ tests divide $\Delta_k Z_t$ into multiple (more than two) categories. The computed correlation coefficients and the executed $\chi^2$ tests indicate that the relation between total traffic (TOT) and weight $W_t$ and the relation between the sum of the incoming and outgoing traffic (IN+OUT) and $W_t$ are essentially the same, which suggests that lateral traffic did not have a significant impact on weight change and may be omitted in computational models that relate weight and traffic. Additional field deployments will allow us to curate larger datasets to test the strength of the relation and, as opportunity arises, discover its statistical or mathematical nature.

Conclusions
Hive weight $W_t$ and traffic $T_t$ were more correlated on longer lags. The strength of the correlation increased with the lag when estimated with the Pearson, Spearman, and Kendall coefficients. The $\chi^2$ tests also showed that as the lag increased from 1 h to 6 h, the frequency of $H_{\chi^2,0}$ rejection increased, indicating that $W_t$ and $T_t$ became more related on longer lags for all hives. This relationship should not be interpreted as causal, because there may have been underlying hidden variables (e.g., the health of the queen) that caused $W_t$ and $T_t$ to change in tandem. Our autocorrelation analysis suggests that both weight and total traffic exhibit trends and cyclical patterns. A plausible reason why some hives had positive and others had negative correlations between $W_t$ and $T_t$ may lie in the genetic differences between the queen lines from two U.S. breeders of Italian queens, which warrants further investigation. A number of factors could have contributed to the lack of correlation between hive weight and traffic in H47: the replacement of the original queen with a queen from a different breeder, insignificant gain in bee mass, spotty brood patterns, and PMS infection. The spread in traffic and average traffic, when taken separately, did not affect the correlation of $W_t$ and $T_t$ more significantly than the exact traffic counts from videos. The correlation coefficients and the $\chi^2$ tests showed that the relation between total traffic (TOT) and weight $W_t$ and the relation between the sum of incoming and outgoing traffic (IN+OUT) and $W_t$ were essentially the same. Thus, lateral traffic (LAT) did not have a significant impact on weight change and may be omitted in computational models that relate hive weight and traffic in the vicinity of the hive.
We hope that our study and similar studies will eventually result in a methodology to separate the hive weight associated with traffic from the hive weight not associated with it, which, in turn, will lead to the construction of computational models that predict hive weight from hive traffic.
Supplementary Materials: The following supplementary materials are available at https://www.mdpi.com/article/10.3390/s22134824/s1: (1) the PDF with three video sets that illustrate how the BeePIV algorithm processes videos with different levels of bee traffic; Tables S1-S62 with additional hive-specific correlation coefficients and p-values; and Figures S1-S7 that illustrate additional hive-specific aspects of autocorrelation of weight and traffic; (2) six datasets of weight and traffic that we used for the reported research.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:
Table A1 is our dataset. If we want to measure the change in the exact counts with the lag of one hour, then the exact counts are as follows.
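Proof. Let us consider the probability $P(X_t - X_{t-1} = x, Y_t - Y_{t-1} = y)$, where $x, y \in \mathbb{R}$ are arbitrary numbers. Since $X_t$, $X_{t-1}$, $Y_t$, $Y_{t-1}$ are discrete and independent, this probability can be written as
$$P(X_t - X_{t-1} = x, Y_t - Y_{t-1} = y) = \sum_{a, b} P(X_t = a + x)\, P(X_{t-1} = a)\, P(Y_t = b + y)\, P(Y_{t-1} = b) = P(X_t - X_{t-1} = x) \cdot P(Y_t - Y_{t-1} = y).$$
Thus, $X_t - X_{t-1}$ and $Y_t - Y_{t-1}$ are independent discrete variables, and, consequently, $|X_t - X_{t-1}|$ and $|Y_t - Y_{t-1}|$ are independent as well. Therefore, for any $X$ and $Y$ in $\mathbb{R}$, we have
$$P(|X_t - X_{t-1}| \geq X, |Y_t - Y_{t-1}| \geq Y) = P(|X_t - X_{t-1}| \geq X) \cdot P(|Y_t - Y_{t-1}| \geq Y),$$
which holds if and only if $P(D_X(X_t) = 1, D_Y(Y_t) = 1) = P(D_X(X_t) = 1) \cdot P(D_Y(Y_t) = 1)$.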
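The independence of the differences argued in the proof can be checked by exact enumeration on a small example. The Python sketch below is only a sanity check under stated assumptions, not part of the original proof: the distributions `px` and `py` and the function name `diff_distributions` are hypothetical, and the four variables are taken to be mutually independent, as the proof assumes.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical marginal distributions (value -> probability).
px = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
py = {0: Fraction(1, 4), 3: Fraction(3, 4)}


def diff_distributions(px_t, px_prev, py_t, py_prev):
    """Exact joint and marginal distributions of (X_t - X_{t-1}, Y_t - Y_{t-1}).

    All four variables are assumed mutually independent, so the probability
    of each combination of values is the product of the four marginals.
    """
    joint = defaultdict(Fraction)
    combos = product(px_t.items(), px_prev.items(),
                     py_t.items(), py_prev.items())
    for (a, pa), (b, pb), (c, pc), (d, pd) in combos:
        joint[(a - b, c - d)] += pa * pb * pc * pd
    marg_x = defaultdict(Fraction)
    marg_y = defaultdict(Fraction)
    for (dx, dy), p in joint.items():
        marg_x[dx] += p
        marg_y[dy] += p
    return dict(joint), dict(marg_x), dict(marg_y)


joint, marg_x, marg_y = diff_distributions(px, px, py, py)

# The joint probability of every pair of differences equals the product
# of the corresponding marginal probabilities.
factorizes = all(
    joint.get((dx, dy), Fraction(0)) == marg_x[dx] * marg_y[dy]
    for dx in marg_x for dy in marg_y
)
print(factorizes)
```

Because the arithmetic uses `Fraction`, the comparison is exact and does not depend on floating-point tolerances.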