Observational behavior research is an important activity for zoos and aquariums, often conducted to provide insights into animal welfare and guide management decisions. This research relies on standardized protocols to ensure consistent data collection. Inter-observer reliability testing, in which untrained observers are tested against the behavior identifications of an expert observer, represents a critical internal validation process. Recent software advances have made reliability testing easier and more accessible, but there is limited guidance on what constitutes a strong reliability test. In this study, we reviewed historic reliability test data from Lincoln Park Zoo’s ongoing behavior monitoring program. We chose six representative species, which together accounted for 645 live pairwise reliability tests conducted across 163 project observers. We found that observers were tested on only approximately 25% of the behaviors listed and defined in the species ethograms. Observers did encounter a greater percentage of the ethogram with successive reliability tests, but the gap remained large. While inactive behaviors were well represented during reliability tests, social behaviors and other non-maintenance solitary behaviors (e.g., exploration, scent marking, play) occurred infrequently during tests. Although the ultimate implications of these gaps in testing are unclear, these results highlight the risks of live reliability testing as an inherently non-standardized process. We suggest several approaches to help address these limitations, including refining ethograms, reconsidering criteria, and supplementing live training with video. We hope this self-critique encourages others to critically examine their methods, enhance the quality of their behavioral data, and ultimately strengthen conclusions drawn about animal behavior and welfare.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.