Conservation Education: Are Zoo Animals Effective Ambassadors and Is There Any Cost to Their Welfare?

Animal ambassador encounters (AAE), where visitors come into close contact with animals, are popular in zoos and are advocated as promoting connection to wild species. However, educational and animal-welfare implications are relatively unknown. We conducted a systematic literature review (PRISMA) to investigate visitor and animal outcomes of AAE. We identified 19 peer-reviewed articles and 13 other records focused on AAEs. Although we found net positive or neutral impacts overall, several studies indicated that high-intensity visitor contact and long-term exposure may be detrimental to animal welfare. Most studies lacked rigour and claims were based on an absence of negative impacts rather than evidence of benefits. Multiple publications were derived from the same datasets and there were no standardised measures for either welfare or education impacts. Of the peer-reviewed articles, just two considered both education and welfare. Education studies often used perceived learning or only post-experience testing. Welfare studies used small samples (median n = 4; range 1–59), and limited measures of welfare. In order to justify the continued use of AAEs in modern zoos, animal welfare costs must be proven to be minimal whilst having demonstrable and substantial visitor educational value. Large-scale, standardised impact assessments of both education and welfare impacts are needed.


Introduction
Human-animal interactions (HAIs) are those where a reciprocal action or communication occurs between human(s) and non-human animal(s). Zoos offer the opportunity to view live animals up-close, setting them apart from other natural history organisations that commonly use animal artefacts. Despite equivocal evidence, it is often assumed that close encounters between zoo visitors and animals increase feelings of connectedness with species, ultimately inspiring visitors to protect biodiversity more generally [1][2][3]. Although HAIs occur in many forms within the zoo environment (such as walk-through exhibits and keeper-animal interactions), this study focuses on the impacts of Animal Ambassador Encounters (AAEs). These AAEs involve one-to-one interactions (also known as an encounter) between visitors and individual animals who are deemed to be acting as (animal) ambassadors for their species or a conservation cause. Such encounters include feeding experiences and opportunities to touch or hold an animal. Although we acknowledge that spontaneous animal encounters do occur when zoo animals choose to engage with visitors outside of formal AAEs offered by the zoo, this article considers only those where an animal is trained or encouraged to interact closely with humans as part of a planned visitor experience. Ambassadors are also referred to as encounter, education, close-contact, handling or program animals. Whilst all zoo animals could be considered 'ambassadors' we refer to those used specifically in close-contact experiences with visitors.
Over 75% of global zoos offer AAEs, including opportunities to touch (43%) or handfeed animals (23%) [4]. A wide variety of taxa are used as ambassadors globally [5]. Often visitors pay additional fees in order to access 'exclusive' contact with ambassadors making them valuable additional sources of revenue (average cost in the UK: ). However, income generation alone should not be considered a valid reason for AAEs.
Although zoos commonly use small animal ambassadors, such as corn snakes (Pantherophis guttatus) in 'meet and greets', using larger species such as cheetahs (Acinonyx jubatus) is a growing trend. Ambassador cheetahs tend to be hand-reared and trained specifically for public engagement. Sometimes they are raised with dogs to increase socialisation and reduce natural instincts to flee [7]. Once trained and working as an ambassador, individual cheetahs are often no longer included in breeding programs, or their hand-rearing may confer different reproductive outputs to maternally reared conspecifics [8], which has potential consequences for population genetics. Despite this, the popularity and fund-raising ability of cheetahs means ambassadors may fulfil a different role in conservation. For example, across a 14-year period the Columbus Zoo raised over USD 250,000 for cheetah conservation through the sale of cheetah merchandise and cheetah encounters [7]. However, we acknowledge that AAEs with cheetahs may be less common than encounters with other species (e.g., smaller or herbivorous species) and may therefore not be representative of the use of the diverse range of large AAE species.
Whilst AAEs are intended to fulfil an educational role by raising species awareness, there are some concerns that close-contact experiences are solely for human benefit [5]. To participate in encounters, animals must overcome natural fear responses towards unfamiliar humans [9]. Animal handling by inexperienced individuals is potentially harmful and may disrupt natural behaviours [10]. Furthermore, there is a risk that animal encounters may promote wild animal ownership or create misconceptions of tameness for wild and/or dangerous animals [11]. In addition, the act of allowing some visitors to touch and feed animals is contrary to general zoo messaging (i.e., 'not to feed the animals'). A visitor who observes an encounter may try to replicate the experience by attempting to feed or touch animals across 'stand-off' barriers and without staff supervision, risking harm to both parties. This has raised ethical questions around AAEs and, in some cases, encounters have been halted due to welfare concerns. For example, in New South Wales (NSW), Australia, specific guidelines were introduced in 1997 which prohibited visitor handling of koala (Phascolarctos cinereus) due to concerns that handling disrupted resting and feeding behaviours [12].
Nonetheless, the Association of Zoos and Aquariums (AZA) supports the use of ambassador animals as part of conservation education programs. They state that: 'studies have shown that the presentation of ambassador animals is a powerful catalyst for learning' [13]. The AZA Animal Ambassador Policy [13] requires that education and conservation messages are an integral part of the use of ambassador animals and requires all member institutions who use animal ambassadors to have a long-term management plan and detailed educational objectives. The AZA have also produced an Ambassador Animal Evaluation Tool [14] to assess an animal's suitability for public interactions, including assessment of educational potential, whether husbandry and behavioural needs can be met, and ease of transport and training. However, guidelines to assess the ongoing welfare of animals once they are part of an AAE program have been lacking.
The World Association of Zoos and Aquariums (WAZA) Animal-visitor Interaction Guidelines [15] recommends that welfare of AAE animals should be regularly assessed (using animal-focused physical and behavioural assessments) and that animals should be withdrawn from encounters if a compromise to physical and behavioural welfare is detected [16]. However, there are no standardised assessment frameworks included as part of these guidelines and it remains to be seen how zoos will interpret and implement such standards. Despite many zoos offering AAEs, the available evidence showing benefits to visitor experiences is sparse and often inferred. In general, HAIs are less commonly studied within zoo environments compared with domestic or agricultural settings (such as keeping livestock) [6,17]. Furthermore, zoos are beginning to offer more varied and unusual ambassador encounters, with unknown welfare implications, to appeal to new audiences. For example, red panda (Ailurus fulgens), known to be sensitive to disturbances in habitat, feature in feeding encounters offered at several zoos in the UK, Australia and New Zealand (author pers. obs.). A recent publication by Learmonth [16] acknowledges there may be potential benefits of HAIs but stresses that, regardless of which ethical framework is applied to AAEs, there are welfare concerns. Educational outcomes that foster pro-environmental attitudes are part of the justification for AAEs but not at the cost of welfare [16]. It is, therefore, critical to examine the evidence of impact of AAEs on both animal welfare and visitor learning to establish whether they are justifiable. This systematic review offers a first step by identifying current evidence of impact, knowledge gaps, and determining future research needs for AAE.
With this in mind, we ask:
• What is the evidence of the educational impacts of AAEs?
• What impact do AAEs have on animal welfare?
• What further evidence is necessary to establish whether AAEs are ethically justifiable in the modern zoo?

Materials and Methods
To meet our inclusion criteria, studies were required to evaluate welfare and/or education impacts as well as: (A) focus on encounters with ambassador animals; (B) be based in a zoo or aquarium; (C) focus on visitor encounters with non-domestic species.
We consider education in the broadest terms including changes to knowledge, attitude, conservation behaviour and emotions. As knowledge alone is not enough to influence conservation action [18], it is important to consider broad educational impacts.
Welfare is defined as the state of an individual as it adapts to changes within its environment [19] and includes the emotional state of an individual [20]. Good welfare is not simply about survival but about providing opportunities for animals to thrive and fulfil basic and complex needs [21]. Due to the complexity of welfare, both behavioural and physiological responses should be used as indicators. However, as species respond to experiences differently [21] careful consideration of species-specific behaviours is needed to understand welfare implications.

Developing the Boolean Term
Prior to the systematic review process, we conducted a naïve search by combining the three criteria above into a Boolean string as follows: Title-Abstract-Keywords = (Criterion A) AND (Criterion B) AND (Criterion C). We restricted our search to title, abstract and keywords as this excluded studies which mentioned ambassador animals peripherally but did not focus on them.
To ensure that all criteria were met, we separated each criterion using an AND function, forcing the search engine to look for co-occurrences. Synonyms for each of the three criteria were separated by an OR function, to search when any of the possible terms occurred. Asterisks were used to denote potential alternative spellings of words.
Our naïve search was based on commonly used terms to describe ambassador animal encounters within zoos and was as follows: (ambassador* OR encounter* OR contact* OR interaction* OR touch*) AND (zoo* OR aquari*) AND (visitor* OR guest*).
Running the naïve search through SCOPUS, Web of Science and Google Scholar produced 728 records after removal of duplicates. These databases were used as they represent three of the most commonly used sources of academic literature. Searches were conducted between May and June 2020. To ensure that all potential terms used in the animal ambassador literature were identified, we took all the title-abstract-keyword hits from the naïve search and ran them through a Python parser which identified the most frequently occurring non-function words of three or more letters. This list of words was then coded, assigning each of the top 400 words either to one of the criteria (A, B, or C) or to an irrelevant category. This produced the following extended search term: (interactions* OR contact* OR interaction* OR petting* OR feeding* OR hand* OR interactive* OR encounters* OR programs* OR touch* OR proximity* OR program* OR physical* OR interact* OR direct* OR encounter* OR live* OR close*) AND (institutions* OR zoo* OR zoos* OR captive* OR enclosure* OR aquariums* OR aquarium* OR facilities*) AND (visitors* OR visitor* OR human* OR public* OR people* OR humans* OR children* OR participants* OR adults* OR family* OR guest*).
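The parser itself is not described beyond its purpose, so the following is only a minimal sketch of how such a frequency count might look. The stop-word list, the regular-expression tokenisation and the two sample records are assumptions made for illustration and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical, deliberately short list of function words; a real parser
# would rely on a much fuller stop-word list.
FUNCTION_WORDS = {"the", "and", "for", "with", "that", "this", "from", "were", "are", "was"}

def frequent_terms(records, top_n=400, min_len=3):
    """Count non-function words of at least min_len letters across all records."""
    counts = Counter()
    for text in records:
        for word in re.findall(r"[a-z]+", text.lower()):
            if len(word) >= min_len and word not in FUNCTION_WORDS:
                counts[word] += 1
    return counts.most_common(top_n)

# Invented title-abstract-keyword snippets, for illustration only.
sample = [
    "Visitor contact with zoo animals in touch pools",
    "Effects of visitor feeding on giraffe in the zoo",
]
top_terms = frequent_terms(sample, top_n=3)
```

The ranked list would then be coded by hand against criteria A, B and C, as described above.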
In the resulting list of search terms, the word 'ambassador' was the 432nd most frequent word (appearing 33 times) and therefore outside our pre-defined limit of the top 400 words. However, as it was considered integrally important to the study it was retained and inserted into the extended search terms. Where multiple words were identified with alternative spelling, we replaced these with their common stem word plus an asterisk. The terms: 'program*, live*, close*, proximity*, physical*, institution*, facility*, enclosure*, human*, public*, people*, children*, adults*, family*' were removed from the search term as they were deemed either to be covered by an existing term or to be so broad as to open up irrelevant search avenues. The terms were then reordered based on relevance to the research question. Thus, the final search term was: (ambassador* OR petting* OR touch* OR encounter* OR contact* OR hand* OR interact* OR direct*) AND (zoo* OR aquarium* OR captive*) AND (visitor* OR participants* OR guest*).
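The structure of the final search term (synonyms joined by OR within a criterion, criteria joined by AND, each stem followed by an asterisk) can be sketched as below. The function name build_boolean_query is hypothetical; the stems are those listed in the final search term above.

```python
def build_boolean_query(*criteria):
    """Join stems with OR inside each criterion, then join criteria with AND."""
    groups = [
        "(" + " OR ".join(f"{stem}*" for stem in stems) + ")"
        for stems in criteria
    ]
    return " AND ".join(groups)

# Stems taken from the final search term reported in the text.
criterion_a = ["ambassador", "petting", "touch", "encounter",
               "contact", "hand", "interact", "direct"]
criterion_b = ["zoo", "aquarium", "captive"]
criterion_c = ["visitor", "participants", "guest"]

final_query = build_boolean_query(criterion_a, criterion_b, criterion_c)
```

Keeping the stems in lists rather than in one long string makes it easier to add, drop or reorder terms without breaking the bracketing.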

PRISMA Review Process
Our systematic review followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) [22] protocol (Figure 1).

Identification
Records were identified by searching the above Boolean term on three databases: SCOPUS, Web of Science and Google Scholar. Searches were conducted using a German-based IP address but were written in English and used the American (.com) versions of the databases.
Google Scholar weights the Boolean search term based on the order that the terms appear rather than giving an equal weighting to all terms separated by OR functions. To overcome this, we ran the search term eight times through Google Scholar alternating the order of the first word and took the 250 most relevant records as selected by Google's 'order by relevance' function. Due to large numbers of duplicates, the Google Scholar searches led to 1077 records.
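The text does not spell out how the first word was alternated. One plausible reading, given that the first criterion of the final search term contains eight stems, is that each run placed a different stem at the front. The sketch below illustrates that assumption only; the helper name rotate_first_term is hypothetical.

```python
def rotate_first_term(terms):
    """Yield one ordering per term, each with a different term moved to the front."""
    for i, lead in enumerate(terms):
        yield [lead] + terms[:i] + terms[i + 1:]

# The eight stems of the first criterion in the final search term.
criterion_a = ["ambassador", "petting", "touch", "encounter",
               "contact", "hand", "interact", "direct"]

orderings = list(rotate_first_term(criterion_a))
leading_terms = [ordering[0] for ordering in orderings]
```

Each ordering would be substituted into the first bracket of the search term before the query is submitted and the 250 most relevant records are collected.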
Additionally, the search term was run through ProQuest Global Thesis and Dissertation search to identify any relevant studies which may not appear in published research (n = 51). All records were downloaded and duplicates removed, first automatically using Python, followed by manual removal.
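The automatic deduplication step is not described in detail. A minimal sketch, assuming records are matched on a normalised title, is given below. The record fields and the example titles are invented for illustration, and near-duplicates that differ in wording would still need the manual pass mentioned above.

```python
import re

def normalise_title(title):
    """Lower-case a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def drop_duplicates(records):
    """Keep the first record seen for each normalised title."""
    seen = set()
    unique = []
    for record in records:
        key = normalise_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Invented records, for illustration only.
records = [
    {"title": "Visitor effects on giraffe behaviour", "source": "SCOPUS"},
    {"title": "Visitor Effects on Giraffe Behaviour.", "source": "Google Scholar"},
    {"title": "Touch pools and learning", "source": "Web of Science"},
]
unique_records = drop_duplicates(records)
```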

Inclusion and Exclusion Criteria
As we were specifically testing the impact of zoo-based animal encounters we excluded human-animal encounters occurring outside of zoos. We additionally excluded studies which focused on encounters with familiar people such as between keepers and animals as we wanted to evaluate the welfare implications of unfamiliar human encounters.
We restricted our search date range to between 1995 and 2020. As substantial changes were made to zoo husbandry around the early 1990s [23,24], any studies prior to 1995 were unlikely to reflect modern zoo practices.
Studies which involved domestic species, including domestic animals in petting-zoo encounters, were also excluded (n = 3) as extrapolation to non-domestic species was considered unreliable given both species differences and the small number of studies. Domestic animals, through their extensive breeding, are typically more comfortable around humans than wild species and are therefore likely to have different welfare responses to wild/non-domesticated species [25]. The aim of this research was to explore the use of non-domestic species in AAEs, rather than to compare or extrapolate between studies of domestic versus non-domestic species. The presentation environment, format, structure and messaging surrounding educational displays using non-domestic species are considered divergent from those of petting-zoo scenarios. They may also be governed by different policies and zoo guidance documents, introducing too many confounding variables for comparison between studies. Nonetheless, it is acknowledged that a direct comparison between the use of domestic versus non-domestic species in an identical encounter setting would be of future value. Domestic species were defined as any animal commonly farmed or kept as a pet. Camels were considered non-domestic animals if they were not native to the country the study was conducted in, as they are often used as working animals in their native range countries.

Screening
Titles and abstracts were considered together to determine whether the entry fitted the inclusion criteria. Rejections were made due to articles being related to politics or religion ('ambassador'), being outside of the date range (e.g., before 1995), focused on domestic species or keeper-animal encounters, or not being zoo based. Two records were irretrievable.

Eligibility
Peer-reviewed journal articles which fitted all criteria were included in the review. Other literature, e.g., conference proceedings or unpublished research, which met the criteria was analysed separately.

Analysis of Data
Peer-reviewed articles were divided into education-focused, welfare-focused or both. Content analysis was conducted on each article examining sample size, type of animal encounter and methods used. Descriptive statistics including mean, median and bootstrapped confidence intervals were calculated using R version 3.2.3 [26]. Due to the wide range of methods used, meta-analysis was not possible.
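The analysis was run in R and the bootstrap variant is not stated. As an illustration only, the sketch below shows a percentile bootstrap confidence interval in Python. The percentile method, the number of resamples, the fixed seed and the example values are all assumptions, not details from the study.

```python
import random
import statistics

def bootstrap_ci(values, stat=statistics.mean, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of `values`."""
    rng = random.Random(seed)
    # Resample with replacement, recompute the statistic, and sort the estimates.
    estimates = sorted(
        stat(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Made-up sample sizes, for illustration only.
sample_sizes = [1, 2, 2, 4, 4, 6, 12, 59]
low, high = bootstrap_ci(sample_sizes)
```

Passing stat=statistics.median gives the corresponding interval for the median.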
We identified the main claim of each article (whether the encounter had a positive, neutral/mixed or negative impact) based on statements in the abstract, summary or conclusion. We then identified evidence to support and reject these claims from within the article [27]. Each claim was evaluated based on quality of data collection (e.g., pre-post-testing, control groups, and duration of behavioural observations) as well as whether claims made were specifically tested by the data collected. Based on these rankings, studies were given a methodological robustness score and this was plotted against sample size. Scores were calculated by assigning 1 point for each robust methodology criterion met and −1 where it was not met to give an overall score (robust methods included: using repeated measures testing, using behavioural and physiological measures, testing more than one study site, having a control group, extended length of observation time, number of species tested, number of individuals surveyed and whether anecdotal conclusions were drawn). We note that small sample sizes are not always avoidable; however, flaws in methodology can be avoided through careful study design.
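The scoring rule described above (+1 for each criterion met, −1 for each not met) can be sketched as follows. The criterion names are hypothetical labels for the eight items listed, and treating each item as a simple yes/no is an assumption, since the thresholds used for items such as observation length or number of individuals are not given.

```python
# Hypothetical labels for the eight robustness criteria listed in the text.
CRITERIA = (
    "repeated_measures",
    "behavioural_and_physiological",
    "multiple_sites",
    "control_group",
    "extended_observation",
    "multiple_species",
    "adequate_sample",
    "no_anecdotal_conclusions",
)

def robustness_score(study):
    """Return +1 per criterion met and -1 per criterion not met (or not recorded)."""
    return sum(1 if study.get(name, False) else -1 for name in CRITERIA)

# Invented study meeting three of the eight criteria.
example_study = {
    "repeated_measures": True,
    "control_group": True,
    "behavioural_and_physiological": True,
}
score = robustness_score(example_study)  # 3 met, 5 unmet
```

With eight criteria the score runs from −8 to 8, which is the axis that would be plotted against sample size.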
Our study was conducted in accordance with Nottingham Trent University, School of Animal Rural and Environmental Sciences' ethical review processes.

Focus/No. Datasets
Of the peer-reviewed papers, 19 were taken to full analysis. Of these, six were education-only, 11 were welfare-only and two were both education and welfare. In addition, 13 'other' records were identified including book chapters (n = 1), MSc and PhD theses (n = 9) and conference proceedings (n = 1) (Figure 2).
Due to apparent replication of datasets, only 17 unique data sets were identified. Of the repeat datasets: two described data from a giraffe (Giraffa camelopardalis spp.) feeding experience [28,29] and two related to a single dataset on touch pool exhibits [30,31]. Overlap of data sets was also found between the peer-reviewed journals and other literature [32][33][34][35][36].

Figure 2.
Distribution of records meeting search criteria in a systematic review of ambassador animal encounters; categorized according to their focus on education, welfare and both and whether they were sourced from peer-reviewed journals or other literature.

Types of Encounters
A total of 11 different types of encounters were identified (Table 1). Amongst peer-reviewed journals, the most frequently researched encounter types were feeding experiences (n = 6), touch pools (n = 4) and education handling sessions (n = 3). From the other literature (n = 13), five studies focused on welfare, eight on education and two considered both aspects [34,37] (Table 2). Most studies focused on mammals (n = 14), with giraffes being the most frequent (n = 4).
The doctoral theses went into substantially more detail than any of the published literature, typically investigating a larger sample, considering multiple, separate investigations to evaluate welfare or education [32,33,37] or making detailed observations over an extended time period [34]. Parts of these doctoral theses have been published and have been considered in the main analysis [38,42,52]; these aspects were, therefore, excluded from our analysis of the 'other' literature.
Encounter durations ranged widely with the shortest being 3 min and the longest being unrestricted visitor contact during zoo opening hours. Similarly, the number of visitors involved in encounters ranged between 1 individual and unlimited numbers of visitors (estimate 150,000-1 million visitors per year).

Date of Publication, Journal Metrics
Education-only studies featured in the literature between 2008-2019 and welfare studies between 2013-2020. Neither education nor welfare studies were found between 1995-2008.
Peer-reviewed articles were published across 11 different journals with varying subject focus, including: psychology, endocrinology, management and politics. The most popular publication outlets for ambassador animal studies were Zoo Biology (n = 4/19) and Animals (n = 4/19). Impact Factors of journals at the time of publication ranged between 0 and 2.323 (mean 1.55; 95% CI = 1.14-1.93) (Table A1, Appendix A).

Geographical Representation
Approximately half of the peer-reviewed articles featured studies conducted in American zoos or aquaria (n = 9/19) with 4/19 studies conducted in Australia and 2/19 studies in the UK. The other studies were conducted in Canada, Germany and Italy (Table A1).
The initial Google Scholar search identified a number of records containing the words 'German', 'Germany', or 'Zoologisch' (n = 38/1077), despite search terms being written in English and searched through the American (.com) version of the search engine.

Reported Impacts
Five of the peer-reviewed education studies reported positive educational impacts of encounters. Miller et al. [57] and Wünschmann et al. [53] both made strong, evidenced claims of positive educational impact based on pre-post-knowledge testing and comparison to a control (non-animal-encounter group). Miller et al. [57] found that visitor encounters with dolphins (Tursiops truncatus) led to significantly higher knowledge, attitude and behavioural intentions compared to baseline and against non-encounter controls. The study employed a large sample (n = 331 pre-post) and included a three-month-delayed post-test (n = 128) which identified retained learning amongst encounter participants. However, Miller et al. [57] acknowledged that in addition to animal contact, HAI participants spent longer with an expert (90 min) than the comparison groups, which may have positively influenced learning. Wünschmann et al. [53] also identified positive impacts of animal handling. They found that despite similar baseline scores, children who experienced in-zoo reptile handling scored significantly higher on knowledge tests than either the in-school treatment group (involving the same educator) or the in-school control. It is unclear, however, how much of this difference was due to the novel learning location (the zoo) and how much was due to the animal contact itself.
Kisiel et al. [30] also evidenced learning using conversational and interview excerpts claiming that touch pool encounters encouraged basic scientific reasoning. However, evidence was weakened due to a lack of control group or pre-encounter comparison. Additionally, scientific reasoning was claimed based on very generalised criteria including 'observing' or 'touching' the animal. Circumstantial evidence of positive educational impact was also noted by Cater [58] who, through post-visit only surveys, identified that 84% of encounter participants could state a new fact but did not clarify what these facts were.
In the 'other' literature, education staff (n = 3) reported anecdotal benefits of zoo education in reducing visitors' fear of animals and creating memorable experiences [55]. Stanford [56] supports this, finding that children had more positive responses to rats (Rattus norvegicus) and snakes (Pituophis melanoleucus) post-contact compared to pre-contact. However, knowledge testing revealed mixed results. Woodman's [54] comparison between artefacts, posters and live animals found that posters produced the strongest learning results in terms of knowledge transfer. As no pre-knowledge testing was done, it is possible that learning was influenced by individual group differences. Nonetheless, Farmerie [57] used pre-post knowledge testing in combination with focus groups and personal journals, and identified a significant correlation between knowledge and feelings of attachment following an educational encounter with koi fish (Cyprinus carpio), which was maintained six months after the encounter. We note the educational koi program involved three days of intensive contact with animals in contrast to a seven-minute encounter in Woodman's study [54].
In contrast, two peer-reviewed studies identified negative educational impacts from encounters. Lloro-Bidart's [42] ethnographic study of lorikeet (Trichoglossus haematodus) feeding drew circumstantial conclusions that visitors often ignore available information sources and are mainly preoccupied with avoiding negative experiences such as being defecated upon. Kopczak et al. [31] also concluded that very limited learning about ecology occurred at touch pools, despite investigating the same dataset as Kisiel et al. [30], who claimed positive impacts. Lloro-Bidart's [34] non-peer-reviewed ethnography suggests that the quality of learning may be influenced by the educators themselves, after finding that educators often contradict themselves between presentations and anthropomorphise predators in order to reduce the sense of fear surrounding them.
Other (non-peer-reviewed) studies focused on the emotional and motivational aspects of learning and concluded positive and neutral impacts. For example, Knudson [50] established that touch pool visitors (n = 258) displayed empathy towards marine invertebrates. However, very few visitors explicitly expressed concern over the invertebrates' welfare, and the primary evidence of empathy was demonstrated by 'touching' or 'observing' an animal. Similarly, O'Brien et al. [35] identified that some visitors expressed concern about pollution after visiting a touch pool; however, these findings are undermined by an absence of pre-testing or comparison to non-touch exhibits.

Welfare
Of the 13 peer-reviewed welfare studies, nine used behavioural observations and five tested physiological measures (Table 3). Four peer-reviewed studies used combined physiological and behavioural measures of welfare [28,29,38,52]. All of the four welfare-focused 'other' studies reported mixed welfare impacts. All used behavioural observations, two additionally tested FGM concentrations [32,33], and one additionally tested other physiological indicators including body condition scores [57]. Amongst the 'other' literature two studies examined both behavioural and physiological response [32,33] and one behavioural only [46] (Table 2).

Sample Sizes
Sample sizes were generally very low across peer-reviewed welfare studies (Table 4). Ten of the 13 peer-reviewed studies were based at a single study site and six studies investigated fewer than five animals. Exceptions to the small sample size were Baird et al. [52] [43] sample was divided into test groups of between nine and 12 animals. Amongst the 'other' literature, sample sizes were higher and ranged between two and 73 animals although the majority were still based on under five animals.

Observations
Behavioural observations varied significantly in duration (Table 4). The shortest observations were conducted by Martin and Melfi [44] who collected data for 30 s before, 3-17 min during, and 60 s post-encounter.
Ethograms (peer-reviewed studies n = 9; 'other' literature n = 2) focused on recording gross macro behavioural expressions such as locomotion, feeding and social interactions. Although four studies (Table 3) included some more nuanced micro expressions in their ethograms, such as yawning and oral stereotypies, these were very limited and were unlikely to pick up subtle welfare responses.
Proximity to visitors was used as a measure of welfare in two of the nine behaviour studies. Szokalski et al. [47] considered distances of <2 m; 2-5 m and >5 m from visitors whilst Martin and Melfi [44] compared <1 m; 1 m and >1 m. It is unclear with what accuracy these distances were recorded.

Welfare Impact
The six peer-reviewed studies which claimed positive welfare impacts, although based on evidence, all had weaknesses, including testing fewer than 10 animals, absence of pre-testing [45], extremely short observation times [44] and testing animals at different locations pre- and post-encounter [51]. deMori et al. [28] and Normando et al. [29] reported on the same dataset and used the same analysis. Additionally, deMori et al. [28] explicitly refer to Normando et al. [29] as being a more detailed account of the behavioural observations used in both studies. Both demonstrate an ethical assessment tool for AAEs using a pilot study of giraffe feeding (n = 4), concluding that visitor feeding had no negative behavioural or ethical impact. However, because these were pilot studies, the assessments are described in limited detail, particularly in relation to physiological indicators and long-term behavioural impacts.
The peer-reviewed study by Acaralp-Rehnberg et al. [38], despite having aspects of rigour in the study design (pre-post-testing, control groups and a mix of physiological and behavioural welfare measures), involved a very small sample (n = 2 servals, Leptailurus serval). Whilst they were unable to measure the behaviour of the animals during the encounter due to a lack of CCTV in the presentation area, it is unlikely that observing this period would have revealed much about serval welfare since the animals were performing trained behaviours. However, the study relies on inferring meaning from pre- and post-encounter behaviours and observations of the non-participating serval whilst the other was participating in the encounter. Surges of pacing from the non-participating animal were used as evidence that the serval benefited from the encounter and was consequently frustrated when not involved. However, other factors, such as wanting keeper interaction or being allowed into a different space, may have also caused this behaviour. Interestingly, Acaralp-Rehnberg's [32] doctoral thesis found that visitor feeding was positive for giraffe, evidenced by increased amicable, social interactions and no change in FGM concentrations regardless of visitor frequency.
In contrast, Lynn [46]'s MSc research found that increasing visitor feeding frequency decreased locomotion and increased oral stereotypy. Both aspects of these findings are supported by the peer-reviewed literature which found that giraffe encounters had limited negative effects on behaviour but that increased visitor-frequency was linked to increased idleness [28,29,43]. Lynn [46] also noted that giraffes preferentially fed on lettuce offered by visitors compared to freely available browse. Whilst this indicates absence of fear of unfamiliar humans, it raises concerns over the long-term dietary impacts of these AAEs.
Another aspect of Acaralp-Rehnberg's [32] doctoral research considered the impact of education handling on shingle-back lizard (Tiliqua rugosa) welfare. The study found significant negative welfare impacts. Increased handling correlated with distress behaviours such as hiding, increased respiration and tongue flicking. These behaviours occurred at both low intensity (once daily) and high intensity (thrice daily) handling, suggesting negative welfare impacts of AAEs on this species. Unlike the peer-reviewed journal articles, this study examined more nuanced behaviour and measured animal welfare directly around an encounter period; however, findings were based on a very small sample (n = 2).
Strong evidence of mixed welfare impacts was provided by three peer-reviewed studies. Baird et al.'s [52] two experiments compared non-handling (exhibit and off-exhibit) animals against education handling animals, and additionally examined the impact of handling and non-handling periods on individual education animals. The study concluded that animals being handled specifically for education programs showed no significant effect on indicators of welfare as there were no differences in faecal glucocorticoid metabolite (FGM) concentrations between education and exhibit animals or between periods of handling and non-handling. However, increased handling duration was associated with higher FGM concentrations, increased frequency of undesirable and self-directed behaviours, and decreased rest. Additionally, African hedgehogs (Atelerix albiventris) that had participated in education handling for a greater number of years had significantly higher FGM concentrations than those new to handling. This suggests potential long-term detrimental welfare effects. Whilst strong in sample size and overall design, the Baird et al. [52] study did not observe behavioural changes immediately around times of handling. This weakens claims that handling has no effect on behaviour, as evidence is based on generalised behavioural and physiological measures.
Farmerie [37] also examined animals used in education sessions, focusing on koi that were only touched within their environment and which were not restricted or held. Farmerie's PhD study [37] was the only one to use body weights, body condition scores and blood samples to measure welfare in addition to behavioural observations. Fish welfare reportedly increased post-educational encounter and no significant decrease in body weight or body condition was noted. As welfare assessments were conducted by the children participating in the educational session, there was potential for inaccuracy; however, measuring was overseen by an expert.
Fish welfare was also considered by Lloro-Bidart's [34] PhD ethnography which states that sharks were lifted up in the water for the purpose of visitor touching and exhibited defensive behaviours such as 'huddling together' when in the touch pool. However, the study did not test welfare explicitly and made limited claims as to whether the fish were actually distressed by the AAE.
Two peer-reviewed studies investigated the impact of non-contact experiences: a koala photography experience [59] and a free-choice participation, non-contact encounter with penguins (Spheniscus demersus) [41]. Webster et al. [59] found mixed responses of high and low intensity photography on FGM concentrations that were strongly linked to the animal's sex. Female koalas tended to cope better with higher intensity photography than males; however, impacts were confounded by variance in experimental group composition and high individual FGM variance. Saiyed et al. [41] claim 'neutral or positive' impacts of encounters as, despite finding no negative behavioural impacts, not all penguins chose to engage with encounters and there was high individual variability in rates and duration of AAE participation. Saiyed et al.'s [41] study is of particular interest as it was the only study where the animals involved had not previously taken part in or been trained for public encounters. Moreover, the encounter design was based entirely on animal choice to join or leave the encounter and no physical human-animal contact took place [41]. In addition, animals were rewarded for participation by having the opportunity to engage with extra enrichment items during the encounter.
Rewarding animal-visitor interaction was also a feature of Majchrak et al.'s [51] study of the impact of visitors riding camels (Camelus dromedarius). Camels were given a food reward for every 75 m ride given. Salivary cortisol levels were found to be lowest during peak ride season and highest when no visitors were present and when no rides were given (pre- and post-ride season). This is claimed as evidence of a positive welfare response to an AAE but may also reflect the response to food-based rewards associated with visitor encounters. In addition, as pre- and post-season samples were collected at a different location from the AAE, it may be that other factors were responsible for changes in cortisol levels.
Kearns et al. [48], Orban et al. [43], and Szokalski et al. [47] all claim neutral or mixed welfare impacts of encounters. Kearns et al. [48] found no evidence of harmful human bacteria on the skin of cownose rays (Rhinoptera bonasus) housed in touch pools, but acknowledged that microbial diversity was lower than in the surrounding habitat. Orban et al. [43] found that giraffe engaged in high intensity (all-day) visitor feeding showed increased idleness and rumination. However, these effects were not apparent in giraffes engaged in part-day visitor feeding, and there were no impacts on stereotypic behaviours. Despite finding increased pacing during lion (Panthera leo leo) and tiger (Panthera tigris sumatrae) encounters and prior to cheetah encounters, Szokalski et al. [47] claimed that this may be linked with anticipation and responses to other conspecifics rather than being an indicator of negative welfare.
The only peer-reviewed study to acknowledge negative welfare impacts was Lloro-Bidart's [42] ethnographic study which revealed several accounts of staff acting as 'police' to protect animals from visitors and that the public feeding disrupted territorial behaviour of the birds.
Lastly, a welfare study in the 'other literature' examined differences between ambassador and exhibit cheetah (n = 73) finding personality differences, e.g., greater instances of Playful-Friendly personalities, in cheetah raised as ambassadors, and higher levels of FGM in cheetah involved in free-contact encounters compared to protected contact [33].

Figure. Study robustness against sample size. Robustness was scored by a points system (1 = anecdotal findings; 3 = post-only testing, no control; 5 = post-only testing plus control; 7 = pre-post-testing, no control; 9 = pre-post-testing plus control; 10 = pre-post- and delayed post-testing plus control; points were deducted for only one study site and for severe methodological errors, e.g., very short observation times). Sample size was capped at n = 100 for the purpose of graphing. Red squares = education studies; blue circles = welfare studies; green triangles = both education and welfare. Peer-reviewed journal articles were plotted [28–31,38,41–45,47–49,51–53,57–59].
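The points system described in the robustness figure caption can be expressed as a short scoring routine. The following is a minimal sketch only: the base scores are taken from the caption, but the size of each deduction is not stated in the text, so a deduction of one point per weakness is assumed here purely for illustration. The names `Study`, `robustness` and `plotted_sample_size` are hypothetical.

```python
from dataclasses import dataclass

# Base scores as listed in the figure caption.
BASE_SCORES = {
    "anecdotal": 1,
    "post_only": 3,
    "post_only_control": 5,
    "pre_post": 7,
    "pre_post_control": 9,
    "pre_post_delayed_control": 10,
}

# ASSUMPTION: the caption says points were deducted but not how many.
ASSUMED_DEDUCTION = 1


@dataclass
class Study:
    design: str                      # one of the BASE_SCORES keys
    sample_size: int
    single_site: bool = False
    severe_method_error: bool = False


def robustness(study: Study) -> int:
    """Base score for the design, minus the assumed deductions."""
    score = BASE_SCORES[study.design]
    if study.single_site:
        score -= ASSUMED_DEDUCTION
    if study.severe_method_error:
        score -= ASSUMED_DEDUCTION
    return max(score, 0)


def plotted_sample_size(study: Study, cap: int = 100) -> int:
    """Sample size as graphed: capped at n = 100, per the caption."""
    return min(study.sample_size, cap)
```

Under these assumptions, a hypothetical single-site study with pre-post-testing and a control group would score 8 rather than 9, and any sample larger than 100 would be plotted at 100.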
Of the peer-reviewed studies, only deMori et al. [28] and Lloro-Bidart [42] considered both education and welfare impacts. However, neither of these tested learning outcomes specifically and welfare measures were discussed only briefly. For example, deMori et al. [28] examined reasons for and satisfaction with giraffe AAEs, including whether visitors felt staff had conveyed specific conservation information. However, only word association questions were used to measure educational impacts pre- and post-experience, and willingness to sign up for conservation-related emails was used as evidence for conservation concern. As their study was primarily an introduction to their ethical assessment, measures of welfare (risk assessment, behavioural observations) are mentioned only briefly, with readers referred to other studies (e.g., [29]) for more information.
Lloro-Bidart [42] did not test any education or welfare outcomes specifically. Instead, the study examined visitor, staff and volunteer perceptions of AAEs as a whole and from these accounts, extracts were selected to indicate education and welfare impacts.

Discussion
Despite the evident popularity of AAEs within zoos, evaluation within the literature is limited. Most articles are case studies based at single sites and with small sample sizes. Although we recognize the importance of these small-scale studies as an initial investigation, larger-scale multi-zoo studies or meta-analyses are needed to understand the overall impacts. There were no studies which considered long-term impacts such as longevity, veterinary requirements or reproduction. These findings are supported by other (opportunistic) literature reviews [25,60].
Generally, doctoral and masters' theses, and their associated publications, were found to contain more rigorous methods than the other peer-reviewed studies. However, despite containing valuable and detailed impact evidence, they are often less accessible to practitioners unless those practitioners were involved in the data collection.
Only two articles [43,52] considered more than eight facilities as part of a multi-institution national approach. No study included more than one country. Although looking within a country provides important insights and avoids potential cultural differences, the limited number of multi-institutional international studies is surprising given recent trends for global zoo impact research [61,62]. However, even where large-scale zoo education evaluations have been conducted, the impact of using a live animal on learning is not assessed and evaluations draw conclusions from entrance and exit surveys. Visitor reporting of their in-visit behaviour lacks the ability to longitudinally assess individual experiences. D'Cruze et al.'s [4] overview of current ambassador experiences offered by WAZA member zoos demonstrates that global approaches are possible. However, the practicalities of developing a global education and welfare study are challenging.
Overall, there has been a general shift from focusing on educational impacts to considering animal welfare. Education-only studies were published between 2008 and 2019 but not thereafter. In contrast, welfare studies first appear in 2013 and continue. We found no education- or welfare-focused studies between 1995 and 2008. Despite our study excluding research pre-1995 (due to significant shifts in zoo practice making this early literature no longer relevant), several of the studies that we identified referred to pre-1995 literature as justification for using ambassadors [7,60].

Ambassador Animal Impacts on Conservation Education
Despite the claim by the AZA that studies have shown ambassador animals to be 'catalysts for learning' [13], only two high-quality studies supported this. Miller et al. [57] and Wünschmann et al. [53] demonstrate that knowledge and conservation attitudes increase following an animal encounter and are maintained for at least three months post-visit. Farmerie's [37] doctoral thesis also makes similar claims. However, there are potential confounding variables, namely that groups involved in animal contact also received more contact with an expert than non-contact groups, which could also increase knowledge. Likewise, comparisons between in-school and zoo-learning [53] are confounded by the effects of a novel environment. Children's perception of the novelty of a school trip significantly affects learning, with moderately novel environments promoting the most learning [63].
Other studies identify increased emotional attachment [28,56] and encouragement of scientific reasoning [30] but offer only relatively weak evidence. Most educational claims are based on anecdotal evidence from educators [34,55], self-reported knowledge increases [49,58] or testing post-AAE only [30,31]. Additionally, there is no standardised way of measuring educational outcomes, meaning that claims are dependent on interpretation. For example, Kopczak et al. [31] and Kisiel et al. [30] interpret the same conversational data from touch pools and draw different conclusions, likely due to analytical differences. Kisiel et al. [30] looked for any scientific or factual discussion, including basic descriptive comments and claims. In contrast, Kopczak et al. [31] looked for more complex constructs or concepts, i.e., relationships between the animal and its environment. The more complex conversations occurred less frequently (only 9% of conversations), thus leading Kopczak et al. to conclude weaker educational impacts than Kisiel et al. This demonstrates how critical the definition of education or the targeted educational level can be in influencing the analysis and reported outcomes. According to educational taxonomies, learning occurs at different levels, from basic recall to more complex cognitive understanding [64]. Although learning can be stated to have occurred at each level, issue awareness is unlikely to lead to the zoos' intended conservation-related behavioural outcomes [18]. For example, Ogle [49] demonstrated that touch pools increased perceived knowledge but not desire to protect species.
Most of the education studies confirm that zoo encounter experiences can promote basic exploration (such as touch and observation) and learning. However, if zoos intend to deliver conservation education, they should seek and evaluate more complex outcomes than simply increasing knowledge. Without more consistent evidence it is difficult to conclude that ambassador animals themselves have a positive impact on conservation education. In the absence of this evidence, the question is raised as to whether using ambassador animals is a valid educational technique.

Welfare Impacts on Ambassador Animals
Welfare was most frequently assessed using gross behavioural observations and measuring FGM concentrations. This is problematic for assessing AAE impact as FGM levels measure cumulative adrenal response (potentially indicative of welfare) without necessarily picking up subtle or acute responses to a specific event. Although behavioural observations measure acute responses, the ethograms used were not sufficiently detailed to identify micro-behaviours indicative of positive or negative valence.
Despite the limitations of FGM analysis, several studies considered non-significance between encounter and non-encounter groups as evidence of no impact on animal welfare [38,52,59]. However, studies did not discuss the timeframe from production to detection of FGMs, or other potential confounders that should be considered when evaluating welfare using FGM as a physiological indicator [65]. Furthermore, ambassador animals may be kept in different conditions to those on general display and therefore variances in housing may occur.
Notably, despite concluding no detriment to animal welfare, the same studies also suggested that animals who were exposed to high intensity visitor contact or who were involved in many years of animal handling had higher FGM levels and expressed more negative behaviours than non-encounter animals. This indicates potential long-term welfare impacts on ambassador animals such as chronic stress [6]. As FGM is better at measuring chronic rather than acute/short-term stress, these long-term findings are significant. Because ambassador encounters are short bursts of high intensity visitor exposure, it is also critical that their impacts are understood and assessed in the acute phase using justified measures of welfare.
We should note that, although an encounter may not cause apparent detriment to an animal, it is not necessarily of benefit. Providing appropriate welfare should not only minimise suffering but promote contentment [21,25]. Some actions such as stroking may be pleasurable to animals as they mimic social filial grooming [66]. In studies of non-domestic cats, increased pacing behaviour prior to an encounter was interpreted as anticipatory and an a priori assumption made that the encounter was therefore positive [38,47]. This is despite evidence that pacing may either indicate that the animal is positively altering their normal behaviour budgets in favour of an activity [67] or, alternatively, suggest that an aspect of welfare needs is not being met [68]. For example, pacing prior to an AAE may be a response to a lack of social or enriching stimuli within the animal's own environment rather than indicating an active desire to engage with the AAE itself. Similarly, higher levels of pacing may not mean increased desire for an AAE but instead indicate greater contrast between stimuli before and during an encounter [68]. It is therefore increasingly important to monitor the behavioural responses of animals during the AAE, as subtle cues may indicate whether they find the interactions positive or negative and so help clarify the valence of their anticipatory behaviour.
As with education impact studies, there were no standardised measures of welfare impacts on ambassador animals. Studies were mostly conducted with small sample sizes, single study sites and limited observation times. In addition, very few studies considered more than one measure of welfare. As good welfare requires all aspects of an animal's needs to be met, including behavioural and physiological [21], relying on single measure assessments only provides part of the picture. Where studies did take a more holistic approach to assessing welfare [28,29,37], these were pilots or theses and did not fully explore evaluation methodologies in detail.
Animals respond differently to encounters and some animals (or entire species) may not be suitable as ambassadors. Acaralp-Rehnberg [32] identified human interaction as potentially beneficial to giraffe (Giraffa camelopardalis) but detrimental to shingle-back lizard (Tiliqua rugosa). Similarly, Webster et al. [59], Saiyed et al. [41] and Baird et al. [52] all identify individual animal differences regarding tolerance of human interaction, which suggests a need for both individual animal profile assessments and species-based guidelines.
Welfare differences are also likely to be affected by the type of encounter. For example, non-restrictive feeding experiences allow a certain amount of animal autonomy; in contrast, educational handling where animals are held during interactions does not. Animals may also be moved between familiar and unfamiliar environments and impacts of novelty will depend on both individual and species responsiveness. Finally, during feeding encounters, the animal is positively reinforced using a food reward, yet in education handling the animal may receive no obvious positive reinforcement. Given the updated WAZA Animal-visitor Interaction Guidelines [15], which recommend that animals have choice whether or not to participate in interactions, it is likely AAE practices such as restricting or holding an animal for the sole purpose of visitor contact will have to be reconsidered in WAZA-aligned facilities.

Ethical Justification of Animal Ambassadors
In order for AAEs to be ethically justified within modern zoos there needs to be an understanding of the potential trade-offs. Weighing up the impacts on animal welfare and keeper time against potential financial and educational gains (including attitudes, emotions, knowledge and conservation behaviours) is vital for establishing whether ambassador experiences are morally viable. Currently, the evidence, in either direction, is sparse. There is an apparent need for robust evidence-based impact research into both education and welfare implications. This is particularly relevant given the recent COVID-19 virus pandemic (itself likely caused by a human-wildlife interaction [69]) which led to the temporary closure of zoos globally and a subsequent dramatic reduction in revenue [70]. With continuing public health restrictions on close contact, many AAEs are unable to take place. However, there is a risk that, where they are allowed, AAEs may be used indiscriminately as a potential fast or easy source of revenue generation. It is critical that zoos acknowledge that with such a lack of species-specific evidence for or against AAEs, caution must be applied to ensure that animals are not inadvertently harmed or their welfare compromised.
There are several frameworks which could be adapted to provide a more standardised assessment which considers welfare and education. Biasetti et al. [71] proposed that an Ethical Matrix can assess animal encounters from all perspectives. They used the example of touch pools and considered the potential stakeholders as biodiversity, the animals used, the aquarium itself, staff, and visitors. Each stakeholder was considered based on impacts on 'well-being', 'autonomy' and 'fairness'. This approach has merit as it accounts for all aspects of an animal encounter and can assess overall net justification. However, what is missing is an evidential assessment to ensure that assumed impacts are actually occurring. Without this, assessments are biased or subjective as they rely on anecdotal outcomes.
In contrast, deMori et al. [28] and Normando et al. [29] take a more experimental approach in their six-step ethical assessment of animal-visitor interactions (AVIP). This takes a holistic view, measuring both animal and human risk assessments, physiological and behavioural animal welfare and human well-being outcomes. The results are combined to produce an ethical assessment which informs whether the interaction is justifiable. AAEs can be considered justifiable if there are educational benefits and no negative impacts on welfare. This is a highly beneficial approach as it considers positive and negative impacts on all parties involved in an AAE. However, there is still a need for a more comprehensive approach to measuring education and welfare within this framework. Currently there is no standardised method for measuring education and welfare outcomes.
As education can be considered from more aspects than just knowledge acquisition, it is important that assessments consider multiple areas. This is especially significant with conservation education as conservation behaviours rely on a knowledge of how to act in addition to the desire and ability to do so [18]. Consequently, positive educational outcomes of AAEs need to demonstrate increases across all aspects. Crucially, pre-post-encounter measures need to be taken in order to isolate the impact of the encounter itself, with controls to ensure that impacts are not confounded by other factors such as the environment or having access to direct or personalised expert information.
Similarly, welfare measures need to isolate the impact of ambassador encounters. Animal welfare is highly dependent on individual factors including environmental features and animal temperament [25]. Therefore, species level conclusions can only be drawn from large scale standardised studies conducted across the same species in multiple contexts. This is advocated by Claxton [66] who recommends creating individual animal behaviour and personality profiles and comparing these across institutions. In the current literature one published paper and one thesis used a personality-based approach [33,41]. Understanding individual animals' fear response to unfamiliar humans is also important when ambassador animals must regularly interact with novel humans [9,72]. Rose [21] advocates a combined approach including behaviour observations, physical measures of body condition and movement to create a holistic assessment of each animal based on Wemelsfelder et al.'s [73] Qualitative Behaviour Assessment (QBA). This can then be compared against conspecifics at the same and other sites to establish common welfare implications.
Current behavioural observations tend to be generic and may not capture nuanced responses. Consequently, species-specific ethograms are needed to assess subtle behaviours which may indicate changes in affective state during human interaction. Studies have shown that these nuanced behaviours can provide a more detailed assessment of welfare. For example, McGowan et al. [74] noted 39 different behaviours expressed by domestic cats during litter-box defecation events including bunching of whiskers, eye squinting and moving ears to the side. Analysis of these micro-expressions provided a more comprehensive assessment of affective state than would otherwise be possible from assessment of macro-behaviours. By extension, observing micro-expressions during AAEs may improve understanding of animal responses to the encounters. Furthermore, to fully understand the welfare impact of AAEs, long-term measures such as frequency of veterinary interventions and longevity need to be considered. Much of this information is available in zoo animal records and studbooks, but the quality of these records may vary.
Hampson and Schwitzer [8] used studbook data to consider the impact of hand-rearing on the future reproductive success and longevity of four big cat species. They found that hand-reared animals tended to reproduce less, and some species had shorter lifespans compared to parent-reared individuals. Given that AAE animals are often hand-reared, this finding implies that AAE animals may also suffer from reproduction and longevity restrictions. However, as yet, there are no studies of these long-term welfare measures in AAE animals.
Finally, institutions need to be prepared for the consequences of AAEs being found to have detrimental impacts. Despite initial concerns that halting koala handling in NSW would significantly harm the local tourist industry, no such impact was seen [12]. Consequently, if zoos do halt encounters with a particular species they can do so based on strong scientific justification, and with the knowledge that it supports zoo conservation aims and is unlikely to damage tourism potential. Some may feel a temporary halt to AAEs is more appropriate until sufficient evidence is gathered, or decide to switch to species where no detrimental welfare implications are evidenced.

Systematic Review Methodology
In addition to the implications for HAIs, this study has highlighted methodological considerations for any systematic review process. Google Scholar is used frequently in systematic reviews; however, the complications of its use are rarely acknowledged. Firstly, the order of search terms matters. Consequently, a highly considered search term which includes all possible key words will be fundamentally restricted by the order in which they are written. All search engines that do not parse regular expressions, but instead have a popularity or keyword-based search, will have this problem. Although we have attempted to address this issue in our study through rotation of the Boolean string, the number of potential combinations is so vast that it is very difficult to ensure all are considered. Secondly, regardless of whether the search engine itself uses an American (.com) or English (.co.uk) address, the location the search is conducted from matters. In the case of this study the search was conducted from a German IP address leading to more papers of German origin than if the search was initiated in a different country. These factors have enormous implications as they effectively mean that a systematic literature review search using a keyword-based search engine such as Google Scholar is not fully replicable. In utilising several search-engines we ensured that literature was searched as thoroughly as possible.
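The scale of the term-order problem can be illustrated with a short sketch. It assumes, purely for illustration, that a search string is a flat list of keywords joined by OR; the terms shown are placeholders and are not the Boolean string used in this review. Rotating a string of n terms yields only n variants, whereas the number of possible orderings is n!.

```python
from math import factorial

# Placeholder terms for illustration only (not the review's actual search string).
TERMS = ["zoo", "ambassador animal", "education", "welfare"]


def rotations(terms):
    """Return each cyclic rotation of the term list as an OR-joined query."""
    return [" OR ".join(terms[i:] + terms[:i]) for i in range(len(terms))]


def count_orderings(n_terms):
    """Number of distinct orderings (permutations) of n_terms distinct terms."""
    return factorial(n_terms)


if __name__ == "__main__":
    for query in rotations(TERMS):
        print(query)
    for n in (4, 8, 12):
        print(f"{n} terms: {n} rotations, {count_orderings(n)} possible orderings")
```

With 12 distinct terms, rotation covers 12 of 479,001,600 possible orderings, which is why exhaustive coverage of term order is impractical on an order-sensitive search engine.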
Although every attempt has been made to conduct a thorough search of the literature, we acknowledge that, due to variances in search engines and IP locations, there is potential that relevant studies will be missing from our dataset. Nonetheless, what has become clear from the range of journal locations in relevant published studies, is that there is no specific home for the publication of combined education and welfare impact assessments, such as is required for AAE evaluations.

Conclusions
Currently, studies are so varied in species, methodology and rigour that it is impossible to conclude whether a particular AAE has a positive or negative impact on either education or welfare. Without this evidence, the justification for using AAEs within zoos remains questionable.
What is needed is a collective research approach which examines the same species across multiple locations and draws together individual animal profiles to assess impact and suitability. Welfare assessments must include both behavioural and physiological measures and consider nuanced responses as well as long-term implications. Education evaluations need to consider knowledge, attitude, emotional and behavioural impacts. Crucially, studies need to assess the impact of AAEs explicitly and use pre-post-testing and controls. Understanding both educational and welfare outcomes of HAIs is vital for establishing their ethical justification within the modern zoo.
Currently, encounters are occurring within zoos globally with little awareness of the impacts on the animal (or the visitor). Zoos are relying on limited evidence indicative of some potential benefits without fully exploring the undesirable outcomes. This is not an argument against animal encounters per se, but it is a criticism of the current evidence base and an urgent call for more rigorous research into ambassador animal encounters.