Comparison Between Non-Invasive Methane Measurement Techniques in Cattle

Simple Summary Enteric methane emissions pose a serious issue to ruminant production and environmental sustainability. To mitigate methane emissions, combined research efforts have been put into animal handling, feeding and genetic improvement strategies. For all research efforts, it is necessary to record methane emissions from individual cows on a large scale under farming conditions. The objective of this trial was to compare two large-scale, non-invasive methods of measuring methane (non-dispersive infrared methane analyzer (NDIR) and laser), in order to see if they can be used interchangeably. For this, paired measurements were taken with both devices on a herd of dairy cows and compared. Significant sources of disagreement were identified between the methods, such that it would not be possible to use both methods interchangeably without first correcting the sources of disagreement. Abstract The aim of this trial was to study the agreement between the non-dispersive infrared methane analyzer (NDIR) method and the hand held laser methane detector (LMD). Methane (CH4) was measured simultaneously with the two devices totaling 164 paired measurements. The repeatability of the CH4 concentration was greater with the NDIR (0.42) than for the LMD (0.23). However, for the number of peaks, repeatability of the LMD was greater (0.20 vs. 0.14, respectively). Correlation was moderately high and positive for CH4 concentration (0.73 and 0.74, respectively) and number of peaks (0.72 and 0.72, respectively), and the repeated measures correlation and the individual-level correlation were high (0.98 and 0.94, respectively). A moderate concordance correlation coefficient was observed for the CH4 concentration (0.62) and for the number of peaks (0.66). A moderate-high coefficient of individual agreement for the CH4 concentration (0.83) and the number of peaks (0.77) were observed. However, CH4 concentrations population means and all variance components differed between instruments. In conclusion, methane concentration measurements obtained by means of NDIR and LMD cannot be used interchangeably. The joint use of both methods could be considered for genetic selection purposes or for mitigation strategies only if sources of disagreement, which result in different between-subject and within-subject variabilities, are identified and corrected for.


Introduction
The livestock sector plays an important role in climate change, representing 14.5% of all anthropogenic emissions. Within livestock, one of the most relevant sources of greenhouse gases is enteric methane (CH 4 ) representing 39% of sector emissions [1]. Methane is a potent greenhouse gas with a high global warming potential, calculated as 25 times greater than that of carbon dioxide (CO 2 ) [2]. For these reasons, enteric CH 4 has become a great concern worldwide. In fact, agreements like "The Paris Agreement" in 2015 have pointed out the importance of the reduction of CH 4 emission by cattle.
Moreover, enteric CH 4 production is associated with considerable dietary energy losses for ruminants, ranging from 1.7% to 14.9% of gross energy intake for lactating cows [3] that could lead to decreased energy gain and productivity. Feed is typically the main production cost item in mixed and intensive systems. Wasting part of the feed energy in the form of CH 4 is not only a climate change issue but also a production problem. In this context, the reduction of livestock greenhouse gases emissions, and in particular enteric CH 4 , is posed as a top issue in the agribusiness sector.
The reduction of these emissions can be addressed from the combination of handling, feeding and genetic improvement strategies. However, in order to implement any of these strategies and especially to breed animals with lower CH 4 production, it is necessary to have individual CH 4 data on a large number of animals in commercial conditions.
Methane concentration is heritable [4], and accurate and reliable individual measurements are necessary to include this trait in the breeding programs. These measurements must be made on a large number of animals under commercial conditions [5]. Several methods to record CH 4 emissions from individual animals have been proposed, each with specific advantages, scope of application and also flaws [6][7][8]. Rapid and non-labor-extensive methods of measuring CH 4 are required in order to implement national genetic evaluations. The non-dispersive infrared analyzer CH 4 analyzer (NDIR) and the hand-held laser CH 4 detector (LMD) are two alternative methodologies that measure CH 4 concentrations in the breath of dairy cattle. Both technologies are non-invasive and allow high throughput in commercial conditions. Both methods have been proved to reliably quantify individually CH 4 concentrations in exhaled air at farm conditions [9][10][11][12].
Garnsworthy et al. [13] used for the first time a NDIR CH 4 detector to quantify CH 4 emissions from individual cows on a farm, by sampling air released by eructation during milking. With this technique, a sampling point was installed inside the feed bin of automated milking systems (AMS) and CH 4 concentration on exhaled air was continuously measured in each cow's visits to the AMS. The LMD is a hand-held gas detector for remote measurements of column density for CH 4 containing gases that was originally developed for the detection of gas leaks. However, Chagunda, et al. [14] demonstrated its ability to quantify CH 4 concentration in exhaled air and estimate enteric CH 4 output in dairy cows. Their conclusions were later corroborated by other authors [15,16].
Generating large databases of CH 4 emissions that help addressing reduction strategies has been lately proposed. Exchanging data from different measurements devices in different countries for more accurate studies or genetic evaluations would be an advantage. Thus, harmonizing measurement methodologies is a main concern in recent years, and assessing the equivalence or the statistical agreement between measuring methods is crucial to propose strategies to combine data from different devices. It is important to establish the agreement prior to measuring large numbers of individuals. One approach to analyze this agreement, usually used in medical field, is to record simultaneous repeated records on different subjects using the two methods [17] in order to analyze different sources of disagreement like differences on population means, on between-subject or on within-subject variabilities, as well as the calculation of agreement indices, such as concordance correlation coefficient (CCC) and the coefficient of individual agreement (CIA).
Up to date, few comparative studies have been conducted on these two instruments to determine their equivalence or lack thereof [18], but no study has compared their agreement on paired measurements on the same breath air samples, up to the best of our knowledge. Therefore, the aim of this study was to analyze the agreement and concordance of CH 4 emissions through LMD and the NDIR on paired measurements under field conditions.

Experimental Setup and CH 4 Concentration Determination
The LMD is a hand held open path laser measuring device. The model used in this study was LaserMethaneMini (Tokyo Gas Engineering Co., Ltd. Anritsu Devices Co., Ltd., Tokyo, Japan). The principle of the LMD measuring technology was described previously by Chagunda et al. [12][13][14]. Briefly, this device is based on infrared absorption spectroscopy using a semiconductor laser for CH 4 detection. The device must be pointed towards the nostrils of the cow from a fixed distance. Then, the LMD measures the density of the air column between the device and the animal's nostrils. The reflected laser beam is detected by the device, and its signal is processed and converted to the cumulative CH 4 concentration along the laser path in ppm-m. The LMD was connected to a tablet (Samsung Galaxy Tab A6, New Jersey, USA) running GasViewer app (Tokyo Gas Engineering Solutions, Tokyo, Japan) via Bluetooth connection for exporting and storing the data in real time at 0.5 s intervals. The effect of atmospheric ambient CH 4 concentration from the measurements was discounted using the offset function of the LMD.
The NDIR (Guardian NG Edinburg Instruments Ltd., Livinstong, UK) is one of the so-called sniffer methods that measure CH 4 concentration (ppm) in breath or exhaled air. These methods have been previously used by Garnsworthy et al. [10] to assess the CH 4 production of dairy cows at commercial farms. Briefly, a gas sampling tube from the front of a cow's head to a gas analyzer to continuously measure CH 4 concentration in the cow's breath is used. Then, air is drawn through the instrument by an integral pump between the gas inlet port and analyzer. The device used in this study had a range of 0 to 10,000 ppm. For this study, air was sampled continuously at a rate of 1 L/min through an 8 mm polyamide tube, using approximately 2 m of tube from the analyzer to cow´s nostrils. Methane concentration was recorded at 1 s intervals and stored in a datalogger (Data Recorder SRD-99; Simex Sp. z o.o, Gdańsk, Poland). Baseline or ambient CH 4 concentration was calculated as mean CH 4 concentration before starting the measurements and subtracted from the measured data. Each day before starting measurements, the NDIR analyzer was verified using standard mixtures of CH 4 in nitrogen (0.0%, 0.25%, 0.50%, 0.75% and 1.0%; MESA International Technologies INC, Santa Ana, CA, USA).
For this trial, records were measured in 29 Holstein (11) and Brown Swiss (18) dairy cows in the Fraisoro Agricultural School (Zizurkil, Spain). All animal experiments were carried out in accordance with EU Directive 2010/63/EU for animal experiments and approved by the ethics committee (NEIKER-OEBA-2017-004). Cows were 18.5% of the 1st parity, 52% of the 2nd parity, 18.5% of the 3rd parity and 11% of the 4th parity. Average days in milk at the beginning of the experiment was 102 ± 88 d. Cows were offered a partial mixed ration consisting of corn silage, grass silage and straw ad libitum, and a mean of 5 kg of concentrate supplied in the AMS.
Breath CH 4 concentration was measured on six different days during four months, except for two cows that died during the experimental period. Measurements were performed after morning unifeed distribution between 10:00 and 14:00. Animals were restrained and CH 4 was measured simultaneously with the two devices during a 5 min sampling period, obtaining a total of 164 paired measurements. An operator pointed the LMD at a cow's nostril at a fixed distance of 1 m and trying to maintain the angle from which the LMD was pointed to the cow. Data was recorded every 0.5 s in a tablet. Another operator simultaneously placed NDIR sampling tube on the cow's nostrils in order to measure CH 4 concentration in exhaled air with the NDIR and data was recorded every 1 s in a datalogger.

Calculations and Statistical Analysis
The average CH 4 concentration from LMD and NDIR was calculated as the arithmetic mean of all concentration values within each 5 min profile measured with the LMD and the NDIR analyzer respectively. The number of eructation events or peaks in each 5 min profile was also calculated for both devices as described in [18].
In order to perform a concordance analysis, phenotypes with the same units are required. Some assumptions must be taken into account in order to convert mean LMD values in ppm-m into ppm. Data obtained with LMD at a distance of one meter (D) was transformed considering an exhaled air bubble of 10 cm (X) from the cow's nostrils, and 2 ppm CH 4 concentration in the environment air (A). The resulting phenotype was calculated as: Data was analyzed using a linear mixed effects model with Kronecker product covariance structure in a doubly multivariate set-up [19] using the MIXED procedure of SAS with longitudinal repeated measures (SAS ® Institute INC, Cary, NC, EEUU, Version 7.15, 2017). A Kenward-Roger correction was used to compute the correct denominator degrees of freedom of fixed effects in the presence of repeated measures [20]. The linear mixed effects model for the ith records can be written as: where every method replication response is denoted by superscript i and y i is the method replication response (CH 4 concentration, ppm). Terms S i1 and S i2 indicate the instrument each response belongs to (LMD and NDIR) and will take a value of either zero or one to link y i to the corresponding instrument. The regression coefficients b 1 and b 2 are the respective fixed effects of instruments, b 3 denotes the fixed effect of day of measurement and b 5 is the fixed effect of the breed. Terms b i1 and b i2 are random effect parameters for each cow with each of the two devices, with bi~ND (0,G), and Z i1 and Z i2 relate responses y i to the respective methods. Term e i is the random residuals from each instrument response with e i~N D (0,R i ).
The method described in detail by Roy [19] was employed to formally test the significance of differences for the between-subject and within-subject variances of the two instruments by means of a log likelihood ratio test between models differing for various combinations of structured and unstructured variance-covariance matrices G and R i . Using the variance components of the previous model, CIA and CCC were calculated.
Repeatability for each method was calculated as the between-cow (σ 2 B ) variance divided by the between-cow and within-cow (σ 2 W ) variance.
Additionally, the repeated measures correlation (r p ) was calculated as: with σ B1B2 2 being the covariance between cows for the LMD and NDIR, and σ B1 2 and σ B2 2 being the between-subject variance for method 1 and 2, respectively. Individual level of correlation was calculated using variance components applying the statistical model described previously but including the stage of lactation and lactation number as fixed effects [21].
A Pearson correlation coefficient and Spearman´s rank correlation test were used to assess the association between the overall mean CH 4 concentrations across all paired individual cow records [22].

Results and Discussion
Strategies to address CH 4 abatement strategies need large amounts of CH 4 measurements. Combining measurements from different countries has been proposed as a possible strategy and is being studied elsewhere. However, different research groups often utilize different types of device to record CH 4 . In this context, a common goal is to decide if the measurement systems agree suitably to each other, and thus can be used interchangeably. The current study is the first directly comparing spot-sampling methods based on breath analysis called the NDIR and LMD simultaneously in the same breath air samples.
An example of a CH 4 profile of a cow as measured with the NDIR sensor and the LMD can be seen in Figure 1. Five different eructation events or CH 4 peaks can be seen with both devices. Mean number of peaks differed between instruments (Table 1), being the LMD capable of detecting higher number of peaks (4.69 vs. 4.24, p < 0.001), this could be related with the higher sensitivity of the LMD to detect small variations on CH 4 concentration, even away from the animal [11] and with the time of recording (every 0.5 s vs. 1 s with LMD and NDIR, respectively). However, these differences on the mean number of peaks are of small biological relevance. Moreover, the between subjects variance for the number of peaks did not differ between instruments and a large and positive repeated measures correlation and individual-level correlation were observed (Table 1). Thus, it can be said that both methods are capable to detect the eructation events or peaks in a similar way. In our study, a mean CH 4 concentration of 97 ppm-m with a minimum of 21 ppm-m and a maximum of 303 ppm-m were recorded with the LMD. With the NDIR sensor a mean CH 4 concentration of 1268 ppm with a minimum of 58 ppm and a maximum of 3575 ppm was recorded. The values observed with the LMD agree with those observed in other studies; for instance, Chagunda et al. [12][13][14] found average CH 4 concentrations between 107-369 ppm-m, depending on the animal activity. The values observed with the NDIR sensor are in line with those observed in other studies [23].
Before considering any method for comparison one needs to ensure that its replication error is acceptable. Repeatability is relevant to the study of method comparison because the repeatabilities of the two methods limit the amount of agreement, which is possible. If one method has poor repeatability the agreement between the two methods is likely to be poor too. As shown in Table 1, the repeatability for the concentration of CH 4 was greater with the NDIR sensor (0.42) than with the LMD (0.23). However, for the number of peaks, the repeatability with the LMD (0.20) was greater than with the NDIR sensor (0.14). The repeatability values obtained in the current trial with the NDIR sensor agree with those reported by Lassen et al. [24] made in the AMS, and with Negussie et al. [25] who reported a repeatability value of 0.36, measuring CH 4 based on CH 4 /CO 2 ratio, but were lower than those reported by Sorg et al. [18] who obtained a repeatability of 0.77 (0.13). However, the latter authors processed the NDIR data using head lifting algorithms and modeled it in a Fourier series approach on the time of day of measurement to account for diurnal patterns of CH 4 emission as described by Difford et al. [26]. Repeatability from measurements obtained with LMD was lower than that obtained with the NDIR but similar to those found by other authors [18,27]. The LMD measures were less repeatable than those of NDIR probably because the LMD measurement took place in the open space of the barn, while the NDIR sampled air in the nostrils where environmental influences are likely less variable, e.g., air movement due to ventilation and wind speed.
Though discouraged in comparison studies with repeated measures per subject, we provide the Pearson's and Spearman's correlation coefficients, which were moderately high and positive between instruments for CH 4 concentration (0.73 and 0.74, respectively) and the number of peaks (0.72 and 0.72, respectively). These correlation coefficients could be shrunk because repeated measurements per subject inflate the residual error variance. In these cases, the repeated measures correlation is more appropriate because the effect of repeated measurements per subject are accounted for and prevents a downward bias of correlations due to imprecise measurements. In this sense, repeated measures correlation and individual-level correlation between methods were high (0.98 and 0.94, respectively) for CH 4 concentration (Table 1), which was higher than that observed by Sorg et al. [18]. Individual-level correlations have been used as proxies for genetic correlations in difficult or expensive to measure traits, being the genetic correlation the most informative correlation metric for assessing how to incorporate a method in a selection index [21].
We did not observe a significant breed effect. The CH 4 concentrations population means and all variance components (between-subject and within-subject) differed (p < 0.05) between instruments (Table 1). It must be pointed out that this study aimed to assess the agreement or lack thereof between LMD and NDIR. Neither methods are considered as the gold standard, but they allow high throughput recording in a whole cattle population, which is of main interest for the dairy industry. This method comparison study aimed to evaluate whether these methods can be used interchangeably in commercial farms. Traditionally, if there is no reference method, assessing agreement has been based on the intraclass correlation coefficient and CCC. In this study a moderate CCC was observed between instruments for both CH 4 concentration (0.62) and for the number of peaks (0.66; Table 1). This CCC value was higher than that observed by Sorg et al. [18] probably due to the high variance estimates in the present study. This coefficient with fixed within-subject variability increase as between-subject variability increases and some authors question whether this coefficient is adequate in assessing agreement at the individual level, being necessary to evaluate agreement using both CCC and CIA in order to overcome the inherent shortcomings of both metrics [28,29]. The CIA evaluates agreement relative to imprecision in such a way that a good individual agreement means that the individual difference between readings from different methods is close to the difference between replicated readings within a method, and could therefore be inflated by low repeatability values. We observed higher values of CIA between instruments for both CH 4 concentration (0.83) and number of peaks (0.77) than CCC values, which can be explained by the observed moderate repeatability values (Table 1). However, CIA values were moderately high and based on the threshold of "good" interchangeability (0.455) defined by Barnhart et al. [17], CH 4 concentration measurements of both instruments might be used interchangeably.
Nevertheless, based on the criteria for the statistical agreement of the equivalence of means and total variability suggested by Roy [19], the observed differences in the mean populations, between-subject variabilities and within-subject variabilities (Table 1), CH 4 concentration obtained by means of NDIR and LMD could not be considered interchangeable. When two methods are compared without a gold standard, neither provides an unequivocally correct measurement, but CH 4 measures obtained with both devices can be used interchangeably provided that the sources of disagreement are identified and corrected. In this sense, Difford et al. [26] in a study comparing two sniffer methods with simultaneous repeated measurements proposed a process of calibration and standardization to overcome these sources of disagreement with positive results.

Conclusions
Methane concentration measurements obtained by means of NDIR and LMD instruments cannot be directly used interchangeably under commercial conditions. The joint use of both methods could be considered to establish classifications of individuals, in relation to their CH 4 emissions, in studies for genetic selection purposes or to evaluate CH 4 emissions reduction strategies only if sources of disagreement, which result in different between-subject and within-subject variabilities, are identified and corrected for.