First Evaluation of Infrared Thermography as a Tool for the Monitoring of Udder Health Status in Farms of Dairy Cows

The aim of the present study was to test infrared thermography (IRT), under field conditions, as a possible tool for the evaluation of cow udder health status. Thermographic images (n. 310) from different farms (n. 3) were collected and evaluated using a dedicated software application to calculate automatically and in a standardized way, thermographic indices of each udder. Results obtained have confirmed a significant relationship between udder surface skin temperature (USST) and classes of somatic cell count in collected milk samples. Sensitivity and specificity in the classification of udder health were: 78.6% and 77.9%, respectively, considering a level of somatic cell count (SCC) of 200,000 cells/mL as a threshold to classify a subclinical mastitis or 71.4% and 71.6%, respectively when a threshold of 400,000 cells/mL was adopted. Even though the sensitivity and specificity were lower than in other published papers dealing with non-automated analysis of IRT images, they were considered acceptable as a first field application of this new and developing technology. Future research will permit further improvements in the use of IRT, at farm level. Such improvements could be attained through further image processing and enhancement, and the application of indicators developed and tested in the present study with the purpose of developing a monitoring system for the automatic and early detection of mastitis in individual animals on commercial farms.


Introduction
In dairy farming, the monitoring of udder health status plays an important role in the production of milk [1]. Mastitis is the most frequent disease affecting the health status of the udder and the quantity and quality of milk yielded. Its early detection, through effective and automatic monitoring, can be a way to improve farm efficiency [2]. Many indicators, methods, and devices have been developed to reach this goal. Some of these, such as the somatic cell count (SCC), the bacteriological examination of milk samples (BAC), the DeLaval Cell Counter (DCC) or the California Mastitis Test (CMT), are laboratory analyses or portable devices that are not routinely applicable to the on-farm monitoring of animal health because of their cost and time requirements [3]. Others are indicators that can be of this relationship they concluded that IRT could be a screening tool useful for the evaluation of udder inflammation status. Berry et al. (2003) [18], in a study that involved ten multiparous Holstein Friesian cows, investigated the variations of USST on a daily and within-day scale with the aim of improving the knowledge base for the development of future methods for the early detection of a possible case of mastitis. IRT was used to measure udder surface temperatures, and the possible effects of environmental factors were also investigated. Results showed that a model based on the USST values of previous days could predict the current udder temperature. Furthermore, the residual values not predicted by the model were less than 0.5 °C. This suggested that possible rises of USST related to a case of mastitis could potentially be detected by IRT. Nevertheless, to achieve a good prediction accuracy, the model needed to consider some environmental parameters. This fact led the authors to suggest that new experiments, under different field conditions, are necessary, as also remarked by many other authors [2,10,11].
Furthermore, efficient technologies must be developed if the final target is an automation of the evaluation of USST by thermography [17,18].
The aim of the present study was, therefore, a further evaluation of IRT under field conditions, considering a greater number of dairy cows from different farms reared in different ambient conditions. A significant number of thermographic images were analyzed and two different settings to classify udder health status were investigated. Some dedicated image elaborations were developed and new thermographic indices were calculated with a view to the use of IRT in a possible automatic system. Finally, the accuracy of this technology was evaluated in order to determine its applicability and effectiveness as a primary screening tool for cow udder health.

Animals and Farms
The study was carried out in January 2018. It involved 155 dairy cows reared in three different medium-size farms located in the Lombardy region (North of Italy). In detail, the experimental group was composed of: 92 Holstein Friesian cows from farm 1; 35 Holstein Friesian cows from farm 2; and 28 Holstein Friesian cows from farm 3. No cow showed any clinical sign of mastitis. Stage of lactation was different for each cow and ranged from 15 to 275 days. Cows were fed twice a day with a total mixed ration according to their stage of lactation. They were housed in free-stalls with straw bedding changed 2 or 3 times per week. Cows were milked by automatic milking systems (AMS) of different manufacturers, using specific settings for each farm. In detail, they were: one BouMatic MR-D1 (BouMatic LLC Stoughton Rd, Madison, WI, USA) for farm 1; one Fullwood Merlin M 2 (Fullwood Ltd., Grange Road, Ellesmere, Shropshire, UK) for farm 2; and one BouMatic MR-S1 for farm 3. In each farm, during the time of the experiments, neither ventilators nor water dropping were active.

Milk Sampling, Milk Sample Analysis and Definition of Udder Health Status
Immediately before acquiring the thermographic images, each mammary gland was examined in order to exclude a possible case of clinical mastitis. Clinical mastitis was diagnosed when milk from one or more glands was abnormal in color, viscosity, or consistency, with or without accompanying signs of heat, pain, or redness. Clinical signs of mastitis were assessed according to National Mastitis Council guidelines [20]. From each cow, a composite milk sample of all udder quarters was automatically collected by the milking robot. At the end of the day of sampling, a technician of the Regional Breeders Association (Associazione Regionale Allevatori Lombardia, ARAL), within a Dairy Herd Improvement Program (DHI), collected the milk samples of the farm, which were processed for SCC on the day after sampling, following international recommendations [21].
In accordance with experimental designs used in similar studies [10], SCC results were considered to classify the health status of udders. Since different thresholds are generally adopted to classify subclinical mastitis [22,23], two cutoffs were taken into consideration in order to discriminate healthy versus not healthy cases: 200,000 cells/mL and 400,000 cells/mL.
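The two-cutoff classification described above can be illustrated with a minimal sketch. The function name, the labels, and the sample values are hypothetical and are not taken from the study; the text does not say how an SCC exactly equal to the threshold was handled, so treating it as "not healthy" is an assumption.

```python
# SCC cutoffs named in the text (cells/mL).
SCC_THRESHOLDS = (200_000, 400_000)


def classify_udder(scc_cells_per_ml, threshold):
    """Label an udder from its composite-sample SCC.

    Assumption: an SCC exactly equal to the threshold counts as "not healthy".
    """
    return "healthy" if scc_cells_per_ml < threshold else "not healthy"


# Hypothetical SCC values, used only to show how the label can change
# with the cutoff; they are not data from the study.
samples = [85_000, 250_000, 520_000]
for threshold in SCC_THRESHOLDS:
    labels = [classify_udder(scc, threshold) for scc in samples]
    print(threshold, labels)
```

The middle sample shows why both cutoffs were analyzed: the same cow is "not healthy" at 200,000 cells/mL and "healthy" at 400,000 cells/mL.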

Thermographic Images Collection
Thermographic images were collected using a commercial infrared camera (Thermo GEAR-G120 EX, Nippon Avionics Co., Tokyo, Japan). It had an uncooled focal plane array detector (microbolometer) with a resolution of 320 × 240 pixels. Its accuracy was ±2 °C with a sensitivity of 0.04 °C (at 30 °C), while its physical dimensions were 21.2 cm × 7.5 cm × 13.8 cm (H × W × D). Prior to image acquisition, the ambient temperature of the milking parlor was recorded and used to allow internal compensation for this parameter (i.e., calibration) by the thermal imaging camera algorithms. The ambient temperature recorded during the experimental period ranged from 6 to 10 °C, with a mean value of 8 °C. The camera operator ensured optimum image focus during image acquisition. An emissivity of 0.98 was employed throughout, in accordance with previously published studies carried out on cow udders [2,24-29]. Thermographic images were captured by positioning the camera at udder level, at a distance of circa 0.6 m [10,28,30,31] from each udder side (collecting, as a result, two thermographic images for each udder; Figure 1A). During acquisition, the camera operator acquired at least three images for each udder side. This step guaranteed, for the analysis that followed, one clear image free of any movement of the cow or of a leg that could partially hide the udder. Only one set of thermographic images (i.e., the right and the left side of each udder) was acquired for each animal involved in the experiment. Thermographic images were acquired just before the start of milking procedures, with a view to a possible future automation of the use of USST as an indicator for the monitoring of udder health status.
All acquired images were analyzed by a dedicated software application developed in LabVIEW (National Instruments, Austin, TX, USA; version 8.5), also using some specific subroutines of the Vision Acquisition Software (NI; version 2009) and the Vision Development Module (NI; version 2009). The software application was able to work, off-line, on a set of ".bpm" files [32] formatted with a gray-level scale and a resolution of 8 bit. On each image file, the software algorithm automatically performed the following tasks:

•
It identified the pixel with the maximum intensity (PI max , Figure 1C), calculating its coordinates inside the image and its value (equal to the maximum recorded temperature in the thermographic image).

•
It calculated a range of intensities to use as thresholds, according to the following formulas: These values were selected considering both the average USSTs observed in previous experiments [2,10,11,16,19] and the increments reported in cases of subclinical and clinical mastitis. The range of intensities calculated was applied as a filter [33] to the thermographic image in order to detect the udder of the cow (Figure 1B).

•
On the filtered image, it applied a grid made by image subsections of 4 × 4 pixels.

•
On each image subsection, it calculated the pixel average intensity (i.e., the recorded average temperature of the image subsection evaluated).

•
On the resulting set of pixel average intensities, it selected the maximum value and considered that number as the recorded maximum temperature of the thermographic image evaluated (i.e., the T max ), taken as a possible index of udder health status in accordance with the results of previous scientific studies [2,17].

•
It calculated a "temperature proximity area" (AP T , Figure 1C,D) considering the coordinates of PI max as a starting point and a set of connected pixels whose intensities were different from zero after applying the following filter: where the indicator T (tolerance) was set at a level of 15 [34]. This last value was selected considering the increments of USST already found in the scientific literature [2,10,11,16,19]. Furthermore, a connectivity mode of 4 pixels was used for the recursive application of the above reported filter [35]. This value specified to the algorithm whether a pixel should be considered in the following cycle. In detail, it instructed the software application to take into consideration, for the filtering operations that followed, the pixels at the cardinal points (i.e., North, East, South and West) of the pixel under evaluation.
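The steps listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' LabVIEW implementation: the threshold range passed to `threshold_filter` is hypothetical (the formulas that defined it are not available here), and the proximity filter is assumed to keep pixels whose intensity differs from the seed intensity by no more than the tolerance T. Intensities are treated as raw 8-bit gray levels; the conversion to degrees is left out.

```python
def find_max_pixel(img):
    """Return (row, col, value) of the first pixel with maximum intensity (PI max)."""
    best = (0, 0, img[0][0])
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


def threshold_filter(img, low, high):
    """Keep intensities inside [low, high]; set every other pixel to zero."""
    return [[v if low <= v <= high else 0 for v in row] for row in img]


def block_means(img, size=4):
    """Average intensity of each non-overlapping size x size subsection."""
    rows, cols = len(img), len(img[0])
    means = []
    for r0 in range(0, rows - size + 1, size):
        for c0 in range(0, cols - size + 1, size):
            total = sum(img[r][c]
                        for r in range(r0, r0 + size)
                        for c in range(c0, c0 + size))
            means.append(total / (size * size))
    return means


def proximity_area(img, seed, tolerance=15):
    """Count the pixels 4-connected to `seed` that pass the assumed filter.

    Assumption: a pixel passes when |seed intensity - pixel intensity| <= tolerance.
    An explicit stack replaces recursion to avoid Python's recursion limit.
    """
    rows, cols = len(img), len(img[0])
    seed_value = img[seed[0]][seed[1]]
    seen = {seed}
    stack = [seed]
    while stack:
        r, c = stack.pop()
        # North, East, South, West neighbours (4-connectivity).
        for nr, nc in ((r - 1, c), (r, c + 1), (r + 1, c), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                if abs(seed_value - img[nr][nc]) <= tolerance:
                    seen.add((nr, nc))
                    stack.append((nr, nc))
    return len(seen)


# Hypothetical 8 x 8 gray-level image: cool background with one warm 4 x 4 patch.
img = [[100] * 8 for _ in range(8)]
for r in range(0, 4):
    for c in range(4, 8):
        img[r][c] = 200
img[1][5] = 210

pi_max = find_max_pixel(img)
filtered = threshold_filter(img, low=150, high=255)  # hypothetical range
t_max = max(block_means(filtered))
ap_t = proximity_area(filtered, (pi_max[0], pi_max[1]))
print(pi_max, t_max, ap_t)
```

Averaging over 4 × 4 subsections means that a single hot pixel raises T max only slightly above the surrounding patch, which is the noise-damping effect the grid is meant to provide.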
After evaluation of all acquired thermographic images, the software application reported the results in a ".txt" file in order to allow statistical analysis. For this purpose, for each udder, only the thermographic image that showed the highest value of USST was studied as a possible method for the monitoring of udder health status.
Figure 1. (A) Thermographic images collected from each udder side; (B) the thermographic image after applying the filter [1,2] and after identifying the pixel with the maximum intensity value (PI max ). Almost the whole cow udder is highlighted; as a consequence, a grid of 4 × 4 pixels can be applied in order to calculate the surface distribution of temperatures, and the maximum value of udder skin temperature (T max ) can then be identified as the maximum value within the surface temperatures calculated; (C) the location of the pixel PI max (red cross) and the AP T calculated (green contour); (D) the "temperature proximity area" (AP T ) obtained considering the coordinates of PI max and the set of connected pixels whose intensities are different from zero after applying the above reported filter [3].

Statistical Analysis
Data obtained from image elaborations were investigated through statistical analysis performed using the "R" software tool (version 3.4.3, 2017) [36]. The relationships between the dependent variable T max and the independent variables SCC and AP T were studied. The following linear model was fitted:
$$Y_{ijk} = \beta_0 + \beta_1\,SCC_i + \beta_2\,AP_{Tj} + \beta_3\,(SCC \times AP_T)_k + e_{ijk}$$
where: Y ijk were the values of the variable T max calculated from the thermographic images evaluated; SCC i (log-transformed) was the effect of the somatic cell counts performed on the milk samples collected; AP Tj was the effect of the "temperature proximity areas" calculated on the thermographic images investigated; (SCC × AP T ) k was the effect of the first order interaction between somatic cell counts and "temperature proximity areas"; and e ijk were the residual errors. To calculate the values of the model's linear coefficients (β n ), and to evaluate their significance, a linear regression analysis was performed using the procedure "lm" of the package "stats" (version 3.4.3 [37,38]). In a following phase of statistical analysis, the ability of the variable T max to discriminate a possible case of mastitis was investigated. When T max exceeded a defined threshold, a case of mastitis was supposed (i.e., the result of the statistical test was set as positive). On the basis of the SCC performed on the corresponding milk sample, udder health status was defined (i.e., "healthy" if SCC was lower than a defined threshold or "not healthy" if SCC was higher than the selected threshold).
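As an illustration of the regression step, the sketch below fits a model with an intercept, log-transformed SCC, AP T, and their interaction by ordinary least squares. It is not the "lm" procedure used in the study: it solves the normal equations in plain Python, assumes a base-10 logarithm (the text does not state the base), and runs on synthetic values generated from made-up coefficients.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_tmax_model(scc, apt, tmax):
    """Least-squares coefficients for tmax ~ 1 + log10(scc) + apt + log10(scc) * apt."""
    rows = [[1.0, math.log10(s), a, math.log10(s) * a] for s, a in zip(scc, apt)]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, tmax)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data built from made-up coefficients (not study results).
true_beta = [30.0, 0.8, 0.05, -0.02]
scc = [20_000, 50_000, 100_000, 200_000, 400_000, 800_000, 1_600_000, 60_000]
apt = [8, 3, 6, 2, 7, 1, 5, 4]
tmax = [true_beta[0] + true_beta[1] * math.log10(s) + true_beta[2] * a
        + true_beta[3] * math.log10(s) * a for s, a in zip(scc, apt)]

beta = fit_tmax_model(scc, apt, tmax)
print([round(b, 4) for b in beta])
```

Because the synthetic data contain no noise, the fit should return the generating coefficients up to rounding error, which makes the example easy to check; significance testing, which "lm" also reports, is not covered here.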
The results of the statistical test, and of the udder health status definitions, were compared and classified as follows: true positive (TP), when the statistical test correctly detected a case of not healthy udder; false positive (FP), when the statistical test highlighted a possible case of mastitis while evaluating a healthy udder; true negative (TN), when the statistical test correctly classified a case of healthy udder; and false negative (FN), when a not healthy udder was not detected by the statistical test. When all results were classified, the performance of the statistical test based on the evaluation of T max was calculated as sensitivity and specificity, in accordance with the following formulas:
$$\text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \qquad \text{Specificity} = \frac{TN}{TN + FP} \times 100$$
The statistical test gave a different pair of sensitivity and specificity values for each possible threshold used to evaluate the variable T max . For this reason, a receiver operating characteristic (ROC) curve was built using the procedures "prediction" and "performance" of the package "ROCR" (version 1.0.7 [39]). By analyzing the curve, a specific cutoff was selected, and the corresponding pair of sensitivity and specificity values was identified as the final performance reached by the variable T max in the detection of a possible case of an udder with high SCC. Furthermore, the area under the curve (AUC) was also considered to study the performance of the variable T max , and both definitions of udder health status were investigated (i.e., less than 200,000 cells/mL and less than 400,000 cells/mL).
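The classification and the two performance measures can be sketched as follows, reading a test as positive when T max exceeds the cutoff and taking the SCC-based status as the reference. The data are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation, which equals the area under the empirical ROC curve, rather than with the "ROCR" package used in the study.

```python
def confusion_counts(tmax_values, not_healthy, cutoff):
    """Return (TP, FP, TN, FN) for the rule 'positive when T max > cutoff'."""
    tp = fp = tn = fn = 0
    for t, sick in zip(tmax_values, not_healthy):
        positive = t > cutoff
        if positive and sick:
            tp += 1
        elif positive and not sick:
            fp += 1
        elif not positive and not sick:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP), as fractions."""
    return tp / (tp + fn), tn / (tn + fp)


def auc(tmax_values, not_healthy):
    """Probability that a not healthy udder has a higher T max than a healthy one.

    Ties count as one half.
    """
    pos = [t for t, sick in zip(tmax_values, not_healthy) if sick]
    neg = [t for t, sick in zip(tmax_values, not_healthy) if not sick]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical T max readings (°C) and SCC-based reference status.
tmax_values = [33.1, 33.8, 34.2, 34.9, 35.3, 35.6, 34.0, 35.0]
not_healthy = [False, False, False, True, True, True, True, False]

counts = confusion_counts(tmax_values, not_healthy, cutoff=34.5)
sens, spec = sensitivity_specificity(*counts)
print(counts, sens, spec, auc(tmax_values, not_healthy))
```

Multiplying the two fractions by 100 gives the percentages in the form used in the paper.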

Results
In a first phase of statistical analysis, the relationships between the variable T max , the values of SCC measured on the milk samples collected, and the values of AP T obtained from the elaboration of the thermographic images collected were studied. A linear regression was conducted and the values of the linear coefficients were estimated. These values, and their significance, are reported in Table 1.
Table 1. Values and significance of linear coefficients used to study the relationships between the dependent variable T max (i.e., the maximum temperature of the thermographic image evaluated) and the independent variables SCC (somatic cell count) and AP T (i.e., the "temperature proximity area"). In the linear model, the first order interaction between SCC and AP T was also included. Values and significance of linear coefficients were estimated through the procedure "lm", package "stats" of the "R" statistical software tool.
In a following phase of statistical analysis, the detection performance of the variable T max was investigated. Two different thresholds of SCC were used to classify udder health status (i.e., 200,000 cells/mL and 400,000 cells/mL). For each SCC threshold, a ROC curve was calculated from the pairs of sensitivity and specificity values shown by the statistical test when different possible cutoff levels were selected. The ROC curves obtained are reported in Figures 2 and 3.
Figures 2 and 3. ROC curves of the variable T max . For the determination of udder health status, SCC thresholds of 200,000 cells/mL and 400,000 cells/mL were used. The ROC curves were obtained through the procedures "prediction" and "performance", package "ROCR" of the "R" statistical software tool.
In a final phase of statistical analysis, the AUC and the final cutoff level were calculated for each ROC curve obtained. Final cutoff levels were identified considering the point, in the curve, closest to the best theoretical result (i.e., the point in the graph in the upper right corner equal to a sensitivity and specificity of 100%). Results obtained are reported in Table 2, while Table 3 shows the mean and standard error values of the main indicators investigated, for each criterion adopted to classify udder health status.
Table 3. Descriptive statistics of the main indicators investigated (T max , SCC and AP T ) in terms of mean and standard error (S.E.) values for each criterion adopted to classify the udder health status (i.e., criterion 1: udder health = "healthy" if SCC < 200,000 cells/mL; criterion 2: udder health = "healthy" if SCC < 400,000 cells/mL).
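The cutoff selection described above can be sketched as a search for the candidate closest to the ideal point where sensitivity and specificity both equal 100%. Measuring closeness by Euclidean distance is an assumption, since the text does not name the metric; the candidate cutoffs are simply the observed T max values, and the data are hypothetical.

```python
import math


def rates(tmax_values, not_healthy, cutoff):
    """Sensitivity and specificity (as fractions) for 'positive when T max > cutoff'."""
    pairs = list(zip(tmax_values, not_healthy))
    tp = sum(1 for t, sick in pairs if sick and t > cutoff)
    fn = sum(1 for t, sick in pairs if sick and t <= cutoff)
    tn = sum(1 for t, sick in pairs if not sick and t <= cutoff)
    fp = sum(1 for t, sick in pairs if not sick and t > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(tmax_values, not_healthy):
    """Return (cutoff, sensitivity, specificity) closest to the ideal point (1, 1).

    Assumption: closeness is the Euclidean distance to sensitivity = specificity = 1.
    """
    best = None
    for cutoff in sorted(set(tmax_values)):
        sens, spec = rates(tmax_values, not_healthy, cutoff)
        distance = math.hypot(1.0 - sens, 1.0 - spec)
        if best is None or distance < best[0]:
            best = (distance, cutoff, sens, spec)
    return best[1:]


# Hypothetical T max readings (°C) and SCC-based reference status.
tmax_values = [33.1, 33.8, 34.2, 34.9, 35.3, 35.6, 34.0, 35.0]
not_healthy = [False, False, False, True, True, True, True, False]

print(best_cutoff(tmax_values, not_healthy))
```

This criterion weights sensitivity and specificity equally; a screening system that tolerates more false alarms in exchange for fewer missed cases would need a different rule.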

Discussion
The relationship between the levels of somatic cells and the values of T max , calculated on the thermographic images collected during this study, was significant. Furthermore, when the levels of somatic cells increased, the values of T max followed the same trend. Similar results have been found by other authors. Barth et al. [40], in a study conducted on 6 cows followed for 8 days, found that USST increased when measured on quarters characterized by an SCC higher than 100,000 cells/mL (34.1 °C vs. 33.6 °C). Polat et al. [10], in a study carried out on 62 dairy cows, found a positive correlation between USST values and SCC. When SCC increased, the values of USST increased logarithmically, and the best linear model that fitted this relationship reached an R² value of 0.73. Martins et al. [41], in a study performed on 37 ewes, found that higher USSTs were related to high SCC. Therefore, even though bacteriological analyses were not performed on the milk samples collected in our study, the statistical results seem to confirm that USST could be a possible index for the detection of a not healthy state of the cow udder.
In our study, the area of the cow udder involved in a local increase of USST was also investigated, by calculating in each thermographic image what we called a "temperature proximity area" (AP T ). This variable, never considered in a previously published paper, showed a significant relationship with the trend of T max (and of SCC). When the values of AP T decreased (and the levels of SCC increased), the values of T max were higher. This result suggests that high USSTs, to be significant, should always be coupled with small values of AP T in order to highlight a real local rise of temperature due to changes in the underlying circulation and tissue metabolism [18], both caused by the presence of a local inflammation of the mammary gland [11]. This variable, therefore, could be useful to increase the detection performance of a monitoring system based on the use of the variable T max . In previous studies, in fact, T max was measured through the pixel of maximum intensity found in the thermographic image evaluated [17]. With this procedure, the accuracy of the measurement could be affected by a possible error of the thermographic sensor (due to noise, etc.). In our study, we tried to limit this effect by measuring T max not on a single spot but as an average value calculated on a small image section of 4 × 4 pixels. Nevertheless, this methodology does not avoid other possible reading errors due to the use of an infrared camera. It is well known, in fact, that commercial infrared cameras can show reading errors of up to ±2 °C [29]. Even though most of them have an internal function to automatically perform a sensor calibration, this error could negatively affect the performance of a system for the automatic monitoring of udder health status based on the evaluation of the variable T max against an absolute threshold. Furthermore, the effect of each individual animal could also negatively affect the use of this index in a field application.
Thus, a combined evaluation of the variables T max and AP T should make it possible to overcome the above cited issues and to reduce possible false positive cases. Another reason that could promote the combined evaluation of these variables is a better classification of cases where the PI max is detected outside the udder surface (for example, in the hairless area of the adjacent leg). These cases generally happen when a significant rise of USST is not locally present in the udder, and they are often coupled with a high value of the variable AP T . Therefore, a combined evaluation of T max and AP T could allow a better management of these cases and so permit the use of USST, in field conditions, for the automatic monitoring of udder health status. Future studies would be useful to confirm this hypothesis.
In previous studies, some authors have investigated which area of the cow udder is most promising for the measurement of a correct and significant USST. Porcionato et al. [42] measured USST in three areas selected considering the udder's height in the dorsal-ventral direction. Hovinen et al. [2] considered circles of dimensions 40 × 40 pixels positioned immediately above the teats on the lateral side of the cow udder, reporting that the maximum temperature recordable on each udder was not always inside the area considered for the measurements. Pezeshki et al. [19] used two rectangles of dimensions 25 × 25 pixels positioned above the teats, on the rear side of the udder. Metzner et al. [43], in order to limit possible mistakes due to personal interpretation of thermographic images, tried to define a standardized procedure based on the manual drawing of three different geometrical shapes on the rear side of each cow udder. These geometrical shapes were a polygon, two rectangles, and two lines, built by applying specific rules to the sizes of each evaluated quarter. The authors remarked that an effective detection of mastitis in a milking parlor, through an automatic monitoring system, could be reached only if the USST were measured and analyzed in a standardized way, and they suggested as main features of a possible algorithm: (1) an automatic identification of the udder shape based on the higher temperatures of this area; (2) the elimination of the anatomical parts not useful for the measurement of USST; and (3) the use of all possible pixels of the thermographic image in order to limit possible measurement mistakes due to dirt particles on the udder skin and/or other possible imaging artifacts. In our study, an algorithm for the calculation of USST from thermographic images was developed in accordance with the suggestions provided by Metzner et al. [43].
The algorithm made it possible to detect the udder surface automatically, on the basis of the higher temperatures of the udder skin, and to calculate the USST considering the maximum number of pixels of the thermographic image evaluated. This algorithm could run in real time and could be easily integrated into a real monitoring system for the surveillance of udder health status. Thus, it could be considered a possible technical solution to the needs previously stated by other authors. Further experiments would be useful in order to test this technical solution in a real field application.
In the present study, T max values were also investigated for their ability to discriminate possible not healthy states of the cow udder. Two different SCC thresholds were used to classify udder health status. The detection accuracy obtained for this indicator was: a sensitivity of 78.6% and a specificity of 77.9%, using an SCC threshold of 200,000 cells/mL; a sensitivity of 71.4% and a specificity of 71.6%, using an SCC threshold of 400,000 cells/mL; and AUC values of the ROC curves that were at the boundary between fair and good diagnostic accuracy (0.80 and 0.81, respectively). Very few researchers have studied the performance of the variable USST for the detection of a possible case of not healthy mammary gland. In a study that involved 62 dairy cows, Polat et al. [10] reported, as detection performance of the USST, a sensitivity of 83.5% and a specificity of 100% with an SCC threshold of 200,000 cells/mL and a cutoff of 34.7 °C, and a sensitivity of 95.6% and a specificity of 93.6% with an SCC threshold of 400,000 cells/mL and the same cutoff of 34.7 °C. In a study that involved 65 dairy camels, Samara et al. [7] reported, as accuracy of the index USST, a sensitivity of 89% and a specificity of 96%, with an SCC threshold of 432,000 cells/mL and a cutoff value of 35.7 °C. Our results were lower than those cited above. However, in our study, many animals were considered and the thermographic images were collected in field conditions. Furthermore, SCCs were evaluated on milk samples composed of all udder quarters of each cow. As a consequence, possible rises of SCC due to mastitis cases may have been partially masked by dilution effects.
Thus, we think that our results can be considered acceptable as a first field application of this technology, and that they confirm what was reported by other authors about the use of USST as a possible index for the rapid and noninvasive evaluation of udder health status [2,10-12,14,17,18,41,43,44]. Nevertheless, further studies would be useful in order to reach, also at farm level, a better accuracy in the automatic and early detection of possible cases of mastitis.
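The dilution effect mentioned above can be made concrete with a small worked example. The composite SCC is taken here as the yield-weighted mean of the quarter SCCs; the quarter values and yields are hypothetical and chosen only to show how one affected quarter can stay hidden in a composite sample.

```python
def composite_scc(quarter_scc, quarter_yield_kg):
    """Yield-weighted mean SCC (cells/mL) of a composite sample of all quarters."""
    total_yield = sum(quarter_yield_kg)
    weighted = sum(scc * y for scc, y in zip(quarter_scc, quarter_yield_kg))
    return weighted / total_yield


# Hypothetical cow: one quarter with a high SCC, three quarters with a low SCC,
# and equal milk yield from each quarter.
quarters = [600_000, 50_000, 50_000, 50_000]
yields = [2.5, 2.5, 2.5, 2.5]

composite = composite_scc(quarters, yields)
print(composite)
```

In this example the composite value falls below both cutoffs used in the study even though one quarter is well above them, so a local rise in USST over that quarter would be scored as a false positive against the composite reference.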

Conclusions
The variable USST showed a significant relationship with the classes of SCC, confirming that it could be a useful index for the early detection of a possible case of not healthy cow udder. The sensitivity and specificity found, considering two different classes of SCC as thresholds to classify a possible state of not healthy cow udder, were lower than those reported by other authors. Nevertheless, they were acceptable considering the large number of animals involved in the present experiment and the range of field conditions during the study, even though no validation of true cases was done by sample culture or by repetition on different days. Future experiments will facilitate improvements in the use of IRT and the development of a monitoring system for the automatic and early detection of mastitis in individual animals on commercial farms, also considering the image processing and the indicators developed and tested in the present study.