# Evaluation of Inter-Observer Reliability of Animal Welfare Indicators: Which Is the Best Index to Use?


## Simple Summary

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Dataset

#### 2.2. Agreement Measures

#### 2.3. Confidence Intervals for Agreement Indexes

#### 2.4. Statistical Analyses

## 3. Results

#### 3.1. Agreement Measures

#### 3.2. Confidence Intervals for Agreement Indexes

## 4. Discussion

Other possible approaches to evaluating agreement are the ${\chi}^{2}$ test, calculated from a cross-classification table, or the approach based on correlation coefficients. However, both approaches appear unsuitable and, consequently, they were not implemented in this study. The ${\chi}^{2}$ test measures the degree of independence between variables, which does not necessarily coincide with concordance. In fact, association measures calculate the deviation from chance contingencies between variables [4]. Therefore, the ${\chi}^{2}$ statistic presents high values for any deviation from the association due to chance, both in case of agreement and in case of disagreement [40]. Similarly, the use of correlation coefficients, which measure deviations from linearity, is also discouraged because correlation and concordance are not the same [42]. According to Krippendorff [4], a valid index measures agreements or disagreements among multiple descriptions generated by a single coding procedure, regardless of who enacts the procedure.
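The point that association is not concordance can be illustrated with a minimal sketch. The two 2×2 tables below are hypothetical (they are not data from this study), and the function names are illustrative only. The statistic computed is the plain Pearson chi-square without continuity correction.

```python
def chi_square(table):
    """Pearson chi-square (no continuity correction) for a square table of counts."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[j] for row in table) for j in range(len(table))]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def observed_agreement(table):
    """Proportion of objects on the main diagonal (both observers agree)."""
    n = sum(sum(row) for row in table)
    return sum(table[i][i] for i in range(len(table))) / n


# Hypothetical tables: rows = observer A, columns = observer B.
full_agreement = [[50, 0], [0, 50]]
full_disagreement = [[0, 50], [50, 0]]

for name, tab in [("agreement", full_agreement), ("disagreement", full_disagreement)]:
    print(name, chi_square(tab), observed_agreement(tab))
```

Both tables yield the same chi-square value, although the observers agree on every object in the first and on none in the second.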

## 5. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A

#### Appendix A.1. $\pi $ Index

- ${P}_{o}$ is the rate of observed concordance and represents the rate of concordant judgments of two independent observers who analyze the same dataset;
- ${P}_{e}$ is the rate of the expected agreement due to chance, given by: $${P}_{e}={\displaystyle \sum}_{i=1}^{M}{p}_{i}^{2}$$
- M is the number of categories;
- ${p}_{i}$ is the proportion of objects assigned to the i-th category.
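The definitions above can be turned into a short computation. This is a sketch under two assumptions that are not stated in the text: the index takes the usual chance-corrected form $({P}_{o}-{P}_{e})/(1-{P}_{e})$, and ${p}_{i}$ is pooled over both observers, as in Scott's original formulation. The function name and the example table are hypothetical.

```python
def scott_pi(table):
    """Scott's pi for a square table of counts (rows: observer A, columns: observer B).

    Assumes p_i is the proportion of all judgments (both observers pooled)
    that fall in category i. Undefined when P_e equals 1.
    """
    n = sum(sum(row) for row in table)
    m = len(table)
    p_o = sum(table[i][i] for i in range(m)) / n
    pooled = [
        (sum(table[i]) + sum(row[i] for row in table)) / (2 * n)
        for i in range(m)
    ]
    p_e = sum(p ** 2 for p in pooled)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical 2x2 table, not taken from the study dataset.
example = [[40, 10], [5, 45]]
print(round(scott_pi(example), 3))
```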

#### Appendix A.2. $k$ and ${k}_{C}$ Indexes

- $\sum {P}_{ii}$ is the observed hit rate, denoted by ${P}_{o}$;
- $\sum {P}_{i.}{P}_{.i}$ is the proportion of agreement due to chance, denoted by ${P}_{e}$. Hence, the formula can be summarized as: $$k=\frac{{P}_{o}-{P}_{e}}{1-{P}_{e}}$$
- (a) The N objects categorized are independent;
- (b) The categories are independent, mutually exclusive, and exhaustive;
- (c) The assigners operate independently.
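A minimal sketch of the formula for $k$ follows. The two tables are hypothetical and were chosen only to show how the index behaves when the marginal totals are strongly unbalanced. The function name is illustrative.

```python
def cohen_kappa(table):
    """Return (P_o, P_e, k) for a square table of counts.

    Rows are observer A's categories, columns are observer B's.
    P_e is the sum of products of the matching row and column proportions.
    """
    n = sum(sum(row) for row in table)
    m = len(table)
    p_o = sum(table[i][i] for i in range(m)) / n
    row_p = [sum(table[i]) / n for i in range(m)]
    col_p = [sum(row[i] for row in table) / n for i in range(m)]
    p_e = sum(row_p[i] * col_p[i] for i in range(m))
    return p_o, p_e, (p_o - p_e) / (1 - p_e)


# Hypothetical tables, not taken from the study dataset.
balanced = [[45, 5], [5, 45]]   # both categories used equally often
skewed = [[95, 5], [0, 0]]      # observer A never uses the second category

print(cohen_kappa(balanced))
print(cohen_kappa(skewed))
```

In the skewed table the observed agreement is 0.95, yet ${P}_{e}$ is also 0.95, so $k$ is zero. The same pattern (high ${P}_{o}$, $k$ near zero) appears for some farms in Table 1.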

- ${P}_{oM}$ is the maximum observed proportion, obtained by adding the minimum values of the individual marginal totals.
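The sketch below shows how ${P}_{oM}$ can be obtained from the marginal totals. It assumes, following Cohen's original proposal, that the maximum attainable value is ${k}_{M}=({P}_{oM}-{P}_{e})/(1-{P}_{e})$ and that ${k}_{C}=k/{k}_{M}$. Neither formula appears in the text here, so both are assumptions, and the example table is hypothetical.

```python
def kappa_corrected(table):
    """Return (k, k_M, k_C) for a square table of counts.

    P_oM sums, category by category, the smaller of the row and column
    proportions. The case k_M == 0 is not handled in this sketch.
    """
    n = sum(sum(row) for row in table)
    m = len(table)
    row_p = [sum(table[i]) / n for i in range(m)]
    col_p = [sum(row[i] for row in table) / n for i in range(m)]
    p_o = sum(table[i][i] for i in range(m)) / n
    p_e = sum(row_p[i] * col_p[i] for i in range(m))
    p_om = sum(min(row_p[i], col_p[i]) for i in range(m))
    k = (p_o - p_e) / (1 - p_e)
    k_m = (p_om - p_e) / (1 - p_e)
    return k, k_m, k / k_m


# Hypothetical 2x2 table, not taken from the study dataset.
k, k_m, k_c = kappa_corrected([[40, 10], [5, 45]])
print(round(k, 3), round(k_m, 3), round(k_c, 3))
```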


#### Appendix A.3. ${k}_{PABAK}$

- ${P}_{o}$ is the concordance rate.
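As an illustration, the sketch below uses the standard prevalence-adjusted, bias-adjusted form of kappa, which depends on ${P}_{o}$ only. The formula is not given in the text here, so it is stated as an assumption: $(M{P}_{o}-1)/(M-1)$ for $M$ categories, which reduces to $2{P}_{o}-1$ for a binary indicator. The function name is hypothetical.

```python
def pabak(p_o, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa from the concordance rate p_o (0..1)."""
    return (n_categories * p_o - 1) / (n_categories - 1)


# A concordance rate of 0.75 gives 0.5 for a binary indicator.
print(pabak(0.75))
```

With the rounded concordance rate of 75% from Table 1, this gives 0.50, close to the reported 0.51. The small difference is presumably due to rounding of ${P}_{o}$.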

#### Appendix A.4. $H$ Index

- C is the number of concordant judgments;
- ${N}_{A}$ is the number of judgments of observer A;
- ${N}_{B}$ is the number of judgments of observer B.
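A minimal sketch of how these three quantities combine is shown below. It assumes the usual form of Holsti's coefficient, $H=2C/({N}_{A}+{N}_{B})$, which is not written out in the text here. The function name and the counts are hypothetical.

```python
def holsti_h(c, n_a, n_b):
    """Holsti's H from concordant judgments c and each observer's number of judgments."""
    return 2 * c / (n_a + n_b)


# Hypothetical counts: 75 concordant judgments out of 100 made by each observer.
print(holsti_h(75, 100, 100))
```

When both observers judge the same N objects, the expression reduces to C/N, that is, the concordance rate. This is consistent with the $H$ and ${P}_{o}$ columns of Table 1 being identical.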

#### Appendix A.5. $\alpha $ Index

#### Appendix A.6. $\Gamma $ Index

#### Appendix A.7. $J$ Index

#### Appendix A.8. $B$ Index

- ${a}_{ii}^{2}$ is the square of the values of the concordant cells;
- ${n}_{i.}$ is the total of i-th row;
- ${n}_{.i}$ is the total of i-th column.
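The terms above can be combined as in the following sketch. It assumes the standard form of Bangdiwala's index, $B=\sum {a}_{ii}^{2}/\sum {n}_{i.}{n}_{.i}$, which is not written out in the text here. The function name and the example table are hypothetical.

```python
def bangdiwala_b(table):
    """Bangdiwala's B for a square table of counts.

    Numerator: sum of squared diagonal (concordant) cells.
    Denominator: sum over categories of row total times column total.
    """
    m = len(table)
    row_tot = [sum(table[i]) for i in range(m)]
    col_tot = [sum(row[i] for row in table) for i in range(m)]
    numerator = sum(table[i][i] ** 2 for i in range(m))
    denominator = sum(row_tot[i] * col_tot[i] for i in range(m))
    return numerator / denominator


# Hypothetical 2x2 table, not taken from the study dataset.
print(bangdiwala_b([[40, 10], [5, 45]]))
```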

#### Appendix A.9. $\Delta$ Index

#### Appendix A.10. $\gamma \left(A{C}_{1}\right)$ Index

- ${P}_{o}$ is estimated with ${P}_{o}=({n}_{++}+{n}_{--})/n$;
- ${P}_{e}^{\ast}$ is estimated as: $${P}_{e}^{\ast}=2{\pi}_{+}\left(1-{\pi}_{+}\right)$$ with $${\pi}_{+}=\left({p}_{Oss1|0}+{p}_{Oss2|0}\right)/2,\quad {p}_{Oss1|0}={n}_{Oss1|0}/n\text{ and }{p}_{Oss2|0}={n}_{Oss2|0}/n.$$
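The two estimates above can be combined as in the sketch below. It assumes that the index has the chance-corrected form $({P}_{o}-{P}_{e}^{\ast})/(1-{P}_{e}^{\ast})$, as in Gwet's AC1, and it reads ${n}_{Oss1|0}$ and ${n}_{Oss2|0}$ as the number of objects that each observer assigned to one of the two categories. Both readings are assumptions. The argument names and the counts are hypothetical.

```python
def gwet_ac1(n_pp, n_pm, n_mp, n_mm):
    """AC1 for a 2x2 table.

    n_pp: both observers chose the reference category
    n_pm: only observer 1 chose it
    n_mp: only observer 2 chose it
    n_mm: neither observer chose it
    """
    n = n_pp + n_pm + n_mp + n_mm
    p_o = (n_pp + n_mm) / n
    p_obs1 = (n_pp + n_pm) / n   # share assigned to the category by observer 1
    p_obs2 = (n_pp + n_mp) / n   # share assigned to the category by observer 2
    pi_plus = (p_obs1 + p_obs2) / 2
    # 2*pi*(1 - pi) is symmetric, so the choice of reference category is irrelevant.
    p_e = 2 * pi_plus * (1 - pi_plus)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical skewed table: observer 1 never chooses the reference category.
print(round(gwet_ac1(0, 0, 5, 95), 3))
```

Unlike $k$, which is zero for this kind of table, the value here stays close to the observed agreement of 0.95.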

## Appendix B

#### Appendix B.1. $\pi $ Index

#### Appendix B.2. $k$, ${k}_{C}$, and $\alpha $ Indexes

#### Appendix B.3. ${k}_{PABAK}$ Index

#### Appendix B.4. $H$ Index

#### Appendix B.5. $J$ and $\Gamma $ Indexes

#### Appendix B.6. $\Delta$ Index

#### Appendix B.7. $\gamma \left(A{C}_{1}\right)$ Index

**Figure 1.** Boxplot of the agreement values obtained for each index with the bootstrap method and the exact bootstrap method for all the selected farms (I-IT1, E-IT1, I-IT2, I-IT3, I-IT4, I-IT5, I-IT6, I-IT7, I-PT1). Legend: ∆ = $\Delta$ index; B = $B$ index; γ = $\gamma \left(A{C}_{1}\right)$ index; Γ = $\Gamma $ index; π = $\pi $ index; ${\mathrm{k}}_{\mathrm{PABAK}}$ coincided with the related indexes: $\sigma $ index, G index and S index; k = $k$ index; H = $H$ index; grey = bootstrap method; white = exact bootstrap method. The α index is not reported in the figure as it coincided with Cohen’s $k$. The $J$ index is not reported in the figure as it coincided with Hubert’s $\Gamma $.

**Table 1.** Values of the agreement indexes for the AWIN animal-based welfare indicator “udder asymmetry” for the nine selected dairy goat farms, sorted by increasing concordance rate (${P}_{o}$).

Agreement Index ^{1} (columns $\pi$ to $\gamma \left(A{C}_{1}\right)$)

| Farm | ${P}_{o}$ ^{2} | $\pi$ | $k$ | ${k}_{c}$ | ${k}_{PABAK}$ ^{3} | $H$ | $\alpha$ | $\Gamma$ | $J$ | $B$ | $\Delta$ | $\gamma \left(A{C}_{1}\right)$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E-IT1 | 75 | 0.15 | 0.16 | 0.23 | 0.51 | 75 | 0.15 | 0.25 | 0.25 | 0.70 | 0.52 | 0.65 |
| I-IT1 | 77 | 0.24 | 0.24 | 0.24 | 0.54 | 77 | 0.24 | 0.28 | 0.30 | 0.71 | 0.54 | 0.68 |
| I-IT2 | 88 | 0.27 | 0.27 | 0.43 | 0.77 | 88 | 0.28 | 0.58 | 0.58 | 0.87 | 0.79 | 0.86 |
| I-IT3 | 92 | 0.55 | 0.55 | 0.55 | 0.84 | 92 | 0.56 | 0.69 | 0.70 | 0.90 | 0.84 | 0.90 |
| I-IT4 | 95 | 0.64 | 0.64 | 1.00 | 0.89 | 95 | 0.64 | 0.79 | 0.79 | 0.94 | 0.95 | 0.94 |
| I-IT5 | 95 | −0.02 | 0.00 | 0.00 | 0.90 | 95 | −0.01 | 0.80 | 0.81 | 0.95 | 0.95 | 0.95 |
| I-IT6 | 97 | 0.78 | 0.78 | 1.00 | 0.93 | 97 | 0.78 | 0.87 | 0.87 | 0.96 | 0.97 | 0.96 |
| I-IT7 | 97 | −0.02 | 0.00 | 0.00 | 0.93 | 97 | 0.00 | 0.87 | 0.87 | 0.97 | 0.97 | 0.96 |
| I-PT1 | 100 | 1.00 | 1.00 | 1.00 | 1.00 | 100 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |

^{1} $\pi $ [19]; $k$ [5]; ${k}_{c}$ [5]; $H$ [29]; $\alpha $ [30]; $\Gamma $ [31]; $J$ [32]; $B$ [33]; $\Delta$ [24]; $\gamma \left(A{C}_{1}\right)$ [21].

^{2} Concordance rate (${P}_{o}$, %), calculated as: $\left({n}_{11}+{n}_{22}\right)/N$.

^{3} The related indexes ($\sigma $ index [20], G index [27], and S index [28]) gave the same results.

