Impact of Stressors on Honey Bees (Apis mellifera; Hymenoptera: Apidae): Some Guidance for Research Emerge from a Meta-Analysis

Bees play an essential role in plant pollination and their decline is a threat to crop yields and biodiversity sustainability. The causes of their decline have not yet been fully identified, despite the numerous studies that have been carried out, especially on Apis mellifera. This meta-analysis was conducted to identify gaps in the current research and new potential directions for research. The aim of this analysis of 293 international scientific papers was to achieve an inventory of the studied populations, the stressors and the methods used to study their impact on Apis mellifera. It also aimed to investigate the stressors with the greatest impact on bees and explore whether the evidence for an impact varies according to the type of study or the scale of study. According to this analysis, it is important to identify the populations and the critical developmental stages most at risk, and to determine the differences in stress sensitivity between subspecies. This meta-analysis also showed that studies on climate change or habitat fragmentation were lacking. Moreover, it highlighted that technical difficulties in the field and the buffer effect of the colony represent methodological and biological barriers that are still difficult to overcome. Mathematical modeling or radio frequency identification (RFID) chips represent promising ways to overcome current methodological difficulties.


Introduction
Honey bees are important pollinators of most wild plants [1] and agricultural crops [2]. They are the most economically important group of pollinators worldwide [3] and are also crucial for maintaining biodiversity [4]. In particular, the economic contribution of the honey bee Apis mellifera to agriculture is estimated at USD 20 billion in the US and more than USD 200 billion worldwide [5].
Over the past decades, significant losses of wild and domestic bees have been reported in many parts of the world [6], threatening the ecosystem services they provide. Many hypotheses have been put forward to explain these losses, but the causes are not yet clearly identified [3]. So far, no single factor appears to act as the main driver of bee decline [7,8] and this phenomenon is now widely regarded as multifactorial [6][7][8][9][10]. Among the factors involved, biological and chemical agents are at the forefront. Indeed, bees are chronically exposed to pesticide cocktails, but also to many parasitic and infectious agents (PIAs), some of which are still emerging as they are disseminated by humans and international transport [9]. In addition, other stressors such as habitat loss, beekeeping practices, climate change or decreased abundance and diversity of floral resources are likely to contribute to the

Identification of the Key Concepts and the Relevant Keywords
The population, exposure, outcomes (PEO) method was used to define the key concepts of the analysis. Three key concepts were then identified: the target population, the stressors studied and the methods used. Keywords were listed for each key concept after reading a subset of scientific papers related to the impact of stressors on Apis mellifera. A search was subsequently performed in Scopus and Cab Abstract databases with these keywords (see Figure S1 in the Supplementary Materials for details of the search string) and resulted in the selection of 3999 articles.

Literature Search
The target population included in the study comprised subspecies of the honey bee Apis mellifera with the exception of A. m. scutellata, A. m. capensis and Africanized bees. All epidemiological units (colony, adults, brood) and development stages (eggs, larvae, pupae) were included in the analysis. The papers included in this survey were published during the last ten years (from 2007 to 2017, last access: 6 March 2017). The articles were available in full text and written in English. The primary search in the Scopus and Cab Abstract databases resulted in the selection of 3999 articles; 1187 duplicates were removed (Figure 1), then 717 articles were excluded because they dealt with A. m. scutellata or A. m. capensis (82 articles), with other organisms (73 articles), with the efficiency of veterinary treatments (117 articles), with the presence of pesticides in bee matrices (26 articles) or because they were off topic (419 articles).
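The screening counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative and are not part of the original protocol.

```python
# Counts as reported in the text; names are illustrative only.
initial_hits = 3999
duplicates = 1187
excluded = {
    "A. m. scutellata or A. m. capensis": 82,
    "other organisms": 73,
    "efficiency of veterinary treatments": 117,
    "pesticides in bee matrices": 26,
    "off topic": 419,
}

after_deduplication = initial_hits - duplicates
total_excluded = sum(excluded.values())
remaining = after_deduplication - total_excluded

print(f"after removing duplicates: {after_deduplication}")
print(f"excluded on content: {total_excluded}")
print(f"remaining articles: {remaining}")
```

The five exclusion categories add up to the 717 excluded articles, and the remainder matches the number of articles carried into the next selection step.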
A new search was performed on the remaining 2095 articles to better focus on the impact of stress on honey bees. To be included in the analysis, the title of the articles had to contain the following words: ("honey bee" or "mellifera") and ("impact" or "affect" or "effect" or "influence" or "toxicity" or "impair" or "induce"). Following this procedure, 386 articles were selected. Reviews and articles dealing with stressors considered as "anecdotal" (i.e., stressors to which bees are rarely exposed, e.g., caffeine or nanoparticles; see details in Figure S2, Supplementary Materials) were excluded from the list. After this selection process, 293 articles were included in the analysis (see Table S1 in Supplementary Materials for the references). Although our paper selection was implemented thoroughly, we acknowledge that some references may have been omitted; however, we believe that the number of these references is very small.
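The title criterion can be sketched as a simple Boolean filter. This is a minimal illustration and not the procedure actually run in the databases: it assumes case-insensitive substring matching (so "effects" matches "effect"), and the example titles are hypothetical.

```python
# Keyword groups taken from the title criterion described in the text.
SUBJECT_TERMS = ("honey bee", "mellifera")
IMPACT_TERMS = ("impact", "affect", "effect", "influence",
                "toxicity", "impair", "induce")


def title_matches(title: str) -> bool:
    """Return True if the title contains one subject term AND one impact term.

    Assumption: case-insensitive substring matching; the matching rules used
    by the original database search are not specified in the text.
    """
    lowered = title.lower()
    has_subject = any(term in lowered for term in SUBJECT_TERMS)
    has_impact = any(term in lowered for term in IMPACT_TERMS)
    return has_subject and has_impact


# Hypothetical titles, for illustration only.
example_titles = [
    "Effects of a neonicotinoid on honey bee foraging",
    "Pollen diversity in Apis mellifera colonies",
    "Thermal stress impairs learning in bumble bees",
]
selected = [t for t in example_titles if title_matches(t)]
print(selected)
```

Under this assumption a title spelled "honeybee" (one word) without "mellifera" would not be retained, which is one way such a filter can miss relevant references.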

Data Extraction
Each publication (n = 293) was reviewed using a standard protocol. General information about the authors' affiliation, country, the year of publication and the journal was recorded. Information on the subspecies, the bee life stages (larvae, pupae or adults, i.e., workers, queens, drones), the stressors studied in the publication, the endpoint measurements and the methods used was stored in a database dedicated to the analysis. Whether several stressors were tested simultaneously or not was noted. We also recorded whether the impact of the stressor was evidenced or not. Finally, we included the scale of study in the analysis. The endpoint measurements were grouped into five classes: (i) colony scale (e.g., colony weight, colony reproduction, colony survival), (ii) individual scale (e.g., physiological or anatomical measures, learning, memory, behavior, mortality), (iii) cellular scale (e.g., cell death, spermatozoa viability), (iv) molecular scale (e.g., enzyme activity, protein concentration), and (v) genetic scale (e.g., gene expression).
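One possible way to represent an extracted endpoint is sketched below. The field names, the record layout and the example values are hypothetical; the structure of the database actually used is not described in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Scale(Enum):
    """The five endpoint scales listed in the extraction protocol."""
    COLONY = "colony"
    INDIVIDUAL = "individual"
    CELLULAR = "cellular"
    MOLECULAR = "molecular"
    GENETIC = "genetic"


@dataclass(frozen=True)
class EndpointRecord:
    """One measured endpoint from one publication (hypothetical layout)."""
    publication_id: str
    year: int
    country: str
    subspecies: Optional[str]   # None when the authors did not specify it
    life_stage: str             # e.g. "larvae", "workers", "queens"
    stressors: Tuple[str, ...]  # one or more stressors tested
    endpoint: str
    scale: Scale
    method: str
    impacted: bool              # was an impact evidenced?

    @property
    def co_exposure(self) -> bool:
        """True when several stressors were tested simultaneously."""
        return len(self.stressors) > 1


# Hypothetical example record, not taken from the reviewed papers.
record = EndpointRecord(
    publication_id="example-001", year=2015, country="France",
    subspecies=None, life_stage="workers", stressors=("insecticide",),
    endpoint="mortality rate", scale=Scale.INDIVIDUAL,
    method="cage test", impacted=True,
)
print(record.co_exposure)
```

Storing one record per endpoint rather than per publication makes it straightforward to count impacted and non-impacted parameters by scale or by type of study.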

Data Analysis
Flows between two or more variables are represented by Sankey diagrams, in which the width of the arrows is proportional to the magnitude of the flow. Diagrams were produced with the online tool SankeyMATIC.
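As an illustration of how flow data can be prepared for such a diagram, the sketch below aggregates records into lines of the form `Source [amount] Target`, which is the plain-text input style SankeyMATIC accepts. The records are hypothetical and the helper name `sankey_lines` is an assumption made for this example.

```python
from collections import Counter

# Hypothetical (stressor, scale, outcome) records, for illustration only.
records = [
    ("insecticides", "individual", "impacted"),
    ("insecticides", "individual", "impacted"),
    ("insecticides", "colony", "not impacted"),
    ("PIAs", "colony", "impacted"),
]


def sankey_lines(rows):
    """Aggregate records into 'Source [amount] Target' flow lines."""
    flows = Counter()
    for stressor, scale, outcome in rows:
        flows[(stressor, scale)] += 1   # first stage: stressor -> scale
        flows[(scale, outcome)] += 1    # second stage: scale -> outcome
    return [f"{source} [{amount}] {target}"
            for (source, target), amount in sorted(flows.items())]


lines = sankey_lines(records)
print("\n".join(lines))
```

Each line describes one arrow, and the bracketed amount determines its width in the rendered diagram.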
Statistical analysis was conducted using Chi-square tests implemented with R.
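The kind of comparison involved can be sketched as follows. This is not the authors' R code: it assumes a goodness-of-fit test of impacted versus non-impacted counts against an equal split (one degree of freedom), which the text does not state explicitly, and the counts are hypothetical.

```python
import math


def chi_square_equal_split(impacted: int, not_impacted: int):
    """Chi-square goodness-of-fit test against a 50/50 split (1 d.f.).

    Assumption: the null hypothesis is that impacted and non-impacted
    parameters are equally frequent.
    """
    total = impacted + not_impacted
    expected = total / 2
    statistic = ((impacted - expected) ** 2
                 + (not_impacted - expected) ** 2) / expected
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, not taken from the meta-analysis.
statistic, p_value = chi_square_equal_split(60, 40)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

With larger samples, even a modest imbalance between the two counts can yield a small p-value, which is worth keeping in mind when interpreting differences that are small yet statistically significant.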

General Information
The articles analyzed in this meta-analysis (n = 293) were published in 128 different journals, the three most frequent being Plos One (13%), Apidologie (10%), and the Journal of Apicultural Research (6%) (Supplementary Materials, Figure S3). The number of papers published per year increased relatively steadily (Figure 2) from 2010 (24 papers) to 2016 (54 papers). This could be a sign of the scientific community's growing interest in bee health and especially its willingness to investigate the mechanisms leading to abnormal bee mortality.

What Kind of Populations were Studied in the Articles?
Worker bees were by far the most studied category of bees (67%, n = 230) and of these, foragers were widely represented (17% of the total population) (Figure 3a). Brood and colonies were studied in 13% of cases (n = 43), while drones (male bees) and queens were rarely studied (2% and 5% of the publications, respectively). Queens have been studied since 2011 whereas drones have only been included in papers since 2013 (see Figure S4 in the Supplementary Materials). Workers, and especially foragers, are the first to suffer from abnormally high mortality rates and seem to be the age class most exposed to stressors. Moreover, due to their number they constitute the greater part of the colony and probably represent the simplest biological material to study in hives. Drones and queens are key elements for colony survival due to their role in reproduction and recent studies have revealed that their reproductive capacities are altered by stressors [21][22][23][24][25][26][27][28]. In addition, nurse bees play a decisive role in larval development due to the quality of the food they produce. However, nurse bees, queens and drones are poorly studied compared to other workers.
More than half of the authors did not specify which A. mellifera subspecies was used in their experiments (Figure 3b), probably because extensive inter-subspecific cross-breeding in the field makes identification difficult. However, stress sensitivity may differ between subspecies [29,30]. Therefore, the subspecies is an important parameter to take into account and should be documented. When specified, the most studied subspecies were A. m. carnica (13%) and A. m. ligustica (9%) followed by Buckfast bees (5%) and A. m. mellifera (2%).
Therefore, it seems important to identify the populations most at risk or the critical developmental stages, to identify the differences in stress sensitivity between subspecies and potentially to define one or two indicative subspecies.

What Stressors Were Studied and Did They Have an Impact on Bee Health?
Biotic stressors were very seldom studied (11%) compared to abiotic ones (89%).

Biotic Stressors
Publications on biotic stressors (mostly parasitic and infectious agents (PIAs)) mainly concerned the parasitic mite Varroa destructor (Mesostigmata: Varroidae; 33%), the fungal agent Nosema spp. (32%) and viruses (17%). These three categories represent the most widespread parasitic and infectious agents in bee colonies (Figure 4a). Predators and the small hive beetle Aethina tumida (Coleoptera: Nitidulidae) were very little studied. However, the recent detection of the latter in Europe and the Philippines [31] and the expansion of the Asian hornet Vespa velutina (Hymenoptera: Vespidae) could reverse this trend.

Abiotic Stressors
The most studied abiotic stressors were pesticides (61%, Figure 4b). Insecticides were the most tested (half of them were neonicotinoids (Figure 4d)) while fungicides and herbicides were understudied. Beekeeping practices were relatively well studied (29%). Three quarters of the beekeeping practices under study were PIA control systems, whether they used chemicals or not (Figure 4c). Bee nutrition was relatively well investigated (17%), while queen management, wintering methods and hive transfers were little studied (≤4%). Among the PIA control methods, "hard" chemical treatment methods [32] were more studied (60%) than "soft" methods. Indeed, these products are often acaricides or fungicides that are potentially harmful to bees. Essential oils and organic acids, considered as "soft" methods [32], were studied in 23% and 12% of cases, respectively. Furthermore, Jacques et al. [33] have shown during the EPILOBEE surveillance project that poor beekeeping practices and the lack of expertise of some beekeepers represented one of the major causes of colony loss in Europe. It should be noted that the present meta-analysis only takes into account veterinary products (mostly acaricide treatments) and techniques currently used in beekeeping. Experiments studying any other active ingredients (e.g., toxicity of essential oils not used in beekeeping) were discarded from the analysis.

Stressors' Impact
In order to determine which stressors affected honey bees the most, Sankey diagrams were generated for each stressor (Figure 5). All the most studied stressors (parasitic and infectious agents, insecticides, chemical veterinary treatments and beekeeping practices other than chemical veterinary treatments) affected the majority of the parameters studied at all scales. There was an exception for veterinary treatments and insecticides, for which about 50% of the parameters studied at the colony level were not impacted. GMOs, although little studied, did not generally have much effect. In particular, results showed no effect on bee mortality and few impacts were observed at the colony or individual level. However, the studied GMOs were mainly Bt maize, and as bees are not sensitive to the Bacillus thuringiensis toxins [34], it is consistent that these GMOs were not demonstrated to have an impact. On the other hand, all the articles dealing with the impact of climate or habitat fragmentation (Figure 5) showed effects, particularly at the colony level. Since the number of publications was very low, the actual effects on bee health should be confirmed by other studies. Exposure to metals appeared to have a significant molecular impact on bees, but very few studies were conducted and none studied colony endpoints. It would therefore be interesting to fill this gap and to relate the molecular effects to the possible effects on the colony.

Co-Exposures
Only 20% of the publications investigated interactions between stressors. The most studied were interactions between different pesticides followed by pesticide-PIA interactions. Other co-exposures included PIA-nutrition or pesticide-nutrition interactions. The number of exposure combinations that could be studied is virtually unlimited; therefore, to help in the decision-making process, Henry et al. [35] proposed a procedure to narrow down the panel of options.

What Methods Were Used in the Articles to Measure Various Endpoints?
Different techniques were used to study the impact of stressors on A. mellifera according to the level of biological organization of the measured endpoint: colony, individual, cellular, molecular or genetic endpoints. Individual endpoints were by far the most studied (53%), followed by colony endpoints (21%) and molecular endpoints (14%). Only 8% and 4% of the endpoints were genetic and cellular endpoints, respectively. As colony and individual are the most prevalent endpoints in terms of their relative proportion, we will describe in detail below how these endpoints were produced.

Colony Endpoints
Many authors quantified demographic parameters (number of adult bees, quantity of capped or uncapped brood and/or eggs) or the hive production (number of honey, pollen or nectar combs). Most often (49%), the number of individuals or combs was estimated using one of several closely related frame observation techniques (Figure S5a). Frame or hive weight was also used (13%) to assess brood development and colony size or productivity.
The techniques used to quantify Nosema or Varroa infestation (e.g., samples washed with water or alcohol, powdered sugar, microscopy, sticky boards, etc.) represented 8% of the techniques used at the colony scale. Virus infection was analyzed by PCR (8% of the techniques). The mortality of the entire colony (over the winter or not) was only studied in 7% of the cases. Some authors did not specify which techniques had been implemented (8%). Other methods included moisture and temperature sensors, queen marking to evaluate queen renewal, honey extraction techniques, in vitro rearing to study the larval emergence rate, mathematical models, or field studies that attempt to correlate observed mortality rates with potentially harmful events for bees (e.g., climate events, pesticide use).

Individual Endpoints
Among the parameters studied at the individual level, 56% were non-behavioral parameters. The most frequent non-behavioral endpoint (Figure S5b) was the mortality rate (73%), which was evaluated with different techniques. Histology techniques (9%) were used to study tissue damage, and in particular, the workers' hypopharyngeal gland ultrastructure. The impact of stressors on the development, immunity or reproduction of honey bees was also assessed.
The behavioral trials (Figure S5c) mainly used the proboscis extension reflex (PER), either within conditioning protocols (21%) or without conditioning (5%). Observation cages or hives were also used extensively (18% and 6%, respectively), sometimes in association with cameras (video-tracking). The cages were cardboard boxes, Petri dishes or other devices. These devices, as well as other systems often designed by the authors and used to study phototaxis, were used to evaluate abnormal behavior, locomotion, dance and activities in the hive as well as social interactions.
Foraging is an important parameter of behavior, which is mainly studied by counting the foragers in the fields or by recording the number of bees entering and/or exiting the hive. Other methods have been identified but are rarely used, such as pollen traps or weighing of foragers. Marking individuals with color marks or with tags for harmonic radar or radio frequency identification (RFID) was used in 11% of cases. Marking was often associated with artificial feeders to study the flight parameters of foragers, or with releasing the honey bees at a distance from the hive to test their ability to return.
This inventory revealed a great number of techniques used to study a multitude of parameters. This great diversity of methods may be related to the fact that at present there are only five standard procedures to test chemicals on bees: two acute toxicity tests by ingestion or by contact with adults [36,37], a chronic oral toxicity test with adults [38] and two larval intake toxicity tests [39,40]. A test to evaluate homing success [41] is currently being ring tested for validation, and has not yet been fully accredited. Various tests have been listed [42], but standard tests are still under development and standardization efforts need to be continued.
The diversity of the tests was also related to the large number of parameters that can be evaluated. It is therefore important to identify the most relevant parameters for assessing bee health.

Did the Evidence of an Impact Vary According to the Type of Study or to the Scale of Study?
In this part of the analysis, we analyzed the impact of the stressor on the endpoint studied (e.g., workers' mortality rate, expression of detoxication genes, immune enzymes' activity, brood capping rate, etc.). The objectives were to determine if the evidence of an impact varied according to the type of study or the scale of study. The impact on the parameter could be "positive" or "negative", but this modality was not recorded.

Type of Study
We investigated whether the parameters studied in the articles were differently affected by a stressor depending on whether the study was carried out in the field (38%) or in a laboratory (48%). In the field, the difference between the number of impacted and non-impacted parameters was very small (Figure 6) yet statistically significant (p = 0.04, Chi-square test). In laboratory studies, two-thirds of the parameters were impacted by the stressor and one third was not. This difference could be explained by the effect of stress exposure, by a dose effect (the doses tested in a laboratory may be higher than the doses to which bees are exposed in the field) or by other effects such as co-exposure to multiple stressors and interactions between different products, which are difficult to control in field experiments.
Figure 6. Number of parameters impacted or not by the stressor studied according to the type of experimentation (n = 1532). (* p < 0.05; ** p < 0.01; *** p < 0.001, Chi-square test). "Field studies" refer to studies in which the treatment was performed outside, in hives placed in fields or directly in the fields; "laboratory studies" refer to studies in which the treatment was conducted in the laboratory.

Scale of Study
When comparing the results obtained at different scales of study, at the colony scale the difference between the number of impacted and non-impacted parameters was not significant (Figure 7, p = 0.607, Chi-square test). On all other scales, significantly more parameters were demonstrated to be impacted by a stressor than not impacted. This result points to the buffering effect of the colony, which compensates for individual effects. Therefore, the same result was observed for parameters studied in the field and parameters studied at the colony level; the difference between impacted and not-impacted parameters was not significant. However, colony endpoint measurements were the main parameters studied in field tests. Therefore, we could not determine whether this effect on the impact of stressors was linked to the type of experimentation or to the buffer effect of the colony. It is very likely that both were involved and other confounding factors may also be responsible for this. Nevertheless, these two effects represent important methodological and biological barriers. Indeed, under natural conditions, it is very difficult to control bees' exposure to stressors and their interactions [9]. In addition, conducting robust studies requires a very large number of replicates, which may pose methodological problems in the field [43]. Mathematical modeling methods might circumvent these technical difficulties.

Scale of Study
When comparing the results obtained at different scales of study, the difference between the number of impacted and non-impacted parameters was not significant at the colony scale (Figure 7, p = 0.607, Chi-square test). At all other scales, significantly more parameters were shown to be impacted by a stressor than not impacted. This result is consistent with the buffering effect of the colony, which compensates for individual effects. The same pattern was observed for parameters studied in the field as for parameters studied at the colony level: the difference between impacted and non-impacted parameters was not significant. However, colony endpoint measurements were the main parameters studied in field tests. Therefore, we could not determine whether the absence of a significant difference was linked to the type of experimentation or to the buffer effect of the colony. It is very likely that both were involved, and other confounding factors may also have contributed.
Nevertheless, these two effects represent important methodological and biological barriers. Indeed, under natural conditions, it is very difficult to control bees' exposure to stressors and their interactions [9]. In addition, robust studies require a very large number of replicates, which may pose methodological problems in the field [43]. Mathematical modeling methods might circumvent these technical difficulties.
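The comparison of impacted and non-impacted parameter counts described above can be illustrated with a minimal sketch. It assumes that the Chi-square test is a goodness-of-fit test against an equal split between the two categories, which the text does not state explicitly. The function name `chi_square_equal_split` and the counts are hypothetical and are not taken from the meta-analysis.

```python
import math


def chi_square_equal_split(impacted, not_impacted):
    """Chi-square goodness-of-fit test against an equal (50/50) split.

    Assumption: the null hypothesis is that impacted and non-impacted
    parameters are equally frequent. Returns (statistic, p_value) for
    one degree of freedom.
    """
    total = impacted + not_impacted
    if total <= 0:
        raise ValueError("at least one parameter must be counted")
    expected = total / 2
    statistic = (
        (impacted - expected) ** 2 + (not_impacted - expected) ** 2
    ) / expected
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, NOT data from the meta-analysis.
colony_stat, colony_p = chi_square_equal_split(impacted=52, not_impacted=48)
individual_stat, individual_p = chi_square_equal_split(impacted=70, not_impacted=30)

print(f"colony scale:     chi2 = {colony_stat:.2f}, p = {colony_p:.3f}")
print(f"individual scale: chi2 = {individual_stat:.2f}, p = {individual_p:.5f}")
```

With counts close to an even split, as in the hypothetical colony-scale example, the test does not reject the null hypothesis, whereas a strongly unbalanced split does. A non-significant result of this kind does not by itself distinguish an absence of effect from an effect masked by colony-level compensation.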
The buffer effect of the colony is very difficult to take into account in experiments and risk assessment. When the individuals of a colony are affected by a stressor and the colony sets up measures to compensate for these individual effects, the impact of the stressor will not be detected even though the colony is suffering. For example, Henry et al. [12] showed that when a colony loses foragers at an abnormally high rate, it changes the way reproductive effort is allocated between worker and drone brood: the production of males is delayed, while the production of workers is strengthened. Colony size is then maintained, as is honey production. Thus, the colony appears to be in good health even though its foragers are disappearing, and the delay in male production may be problematic for mating. This also raises the question of the time scale of an experiment: how long can a colony compensate for a stress without visibly suffering? Are the tests long enough to observe deleterious effects on colonies? It is essential to set up techniques that address these issues. Radio frequency identification (RFID) chips are a first solution, since they enable real-time observation of foragers' disappearance.
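The masking described above can be made concrete with a deliberately simple toy model. It is only a sketch of the compensation idea and is not the model used by Henry et al. [12]. The function `simulate_colony`, its parameters and all numerical values are hypothetical. The model assumes a fixed daily brood capacity and a colony that shifts brood from drones to workers to offset extra forager losses, up to the size of its drone allocation.

```python
def simulate_colony(days, extra_forager_loss, brood_per_day=1000,
                    baseline_drone_brood=200, baseline_worker_loss=800,
                    start_workers=20000):
    """Toy compensation model; every parameter value is hypothetical.

    Each day the colony rears `brood_per_day` individuals. Extra forager
    losses are offset by rearing workers instead of drones, but only up
    to the baseline drone allocation. Returns the final worker count and
    the total number of drones reared.
    """
    workers = start_workers
    drones_reared = 0
    for _ in range(days):
        # Brood reallocated from drones to workers (capped).
        shift = min(extra_forager_loss, baseline_drone_brood)
        drone_brood = baseline_drone_brood - shift
        worker_brood = brood_per_day - drone_brood
        workers += worker_brood - baseline_worker_loss - extra_forager_loss
        drones_reared += drone_brood
    return workers, drones_reared


for loss in (0, 150, 300):
    workers, drones = simulate_colony(days=30, extra_forager_loss=loss)
    print(f"extra loss {loss:>3}/day: workers = {workers}, drones reared = {drones}")
```

In this sketch, a moderate extra loss leaves the worker population unchanged while drone production falls, so an endpoint based on colony size alone would report no impact. The decline in workers only becomes visible once the extra loss exceeds what the drone allocation can absorb, which mirrors the question of whether tests run long enough, or measure the right endpoints, to reveal the stress.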

Conclusions
This meta-analysis highlights the great diversity of techniques used by researchers in honey bee experiments and the need to standardize protocols. To do so, the populations at risk, the critical stages of development and the most relevant parameters to be measured should be identified. New standard tests should be developed, especially to better study the sub-lethal effects of stressors on bee health. In addition, greater importance should be given to the bee subspecies studied, to understanding the differences in their sensitivity to stress and, if possible, to identifying one or two indicator subspecies. Moreover, this study highlighted the need to break through two important methodological and biological barriers that make risk assessment difficult: the technical difficulties encountered in field tests and the buffer effect of the colony. New technologies such as RFID chips or mathematical modeling could help to overcome these obstacles. This study also highlighted innovative research paths, particularly with regard to the impact of climate and habitat fragmentation, which, according to the few studies already carried out, could have significant deleterious consequences on bee colonies. Finally, as also pointed out by Benuszak et al. [20], efforts to increase the number of studies on the impact of co-exposures and metabolites should be continued. In order to develop standard protocols, the search for biomarkers as diagnostic tools seems to be an interesting route of exploration. This biomarker research should be facilitated by "omics" techniques such as genomics, transcriptomics, proteomics, and metabolomics.
Supplementary Materials: The following are available online at http://www.mdpi.com/1424-2818/12/1/7/s1. Figure S1: Search string used to screen the SCOPUS and CAB ABSTRACT databases to select articles related to the impact of stressors on Apis mellifera published between 2007 and 2017, Figure S2: List of stressors considered as "anecdotal" to discard related articles from the review on the impact of stressors on Apis mellifera published between 2007 and 2017, Table S1: List of the 293 articles related to the impact of stressors on Apis mellifera published between 2007 and 2017 included in the analysis, Figure S3: Percentage of articles related to the impact of stressors on Apis mellifera (n = 293) included in the study and published in scientific journals between 2007 and 2017 (last access to database: 6 March 2017), Figure S4: Number of publications (n = 293) related to the impact of stressors on Apis mellifera published between 2007 and 2017 studying the different bee categories according to year of publication, Figure S5: Proportion of the different methods used at the colony scale (a), at the individual scale in non-behavioral trials (b) and in behavioral trials (c), at the cellular (d), the molecular (e) and the genetic (f) scales ("others": set of methods, each representing 2% or less; see Table S2 for details), Table S2: Detail of the "Others" sections of Figure 4 and Figure S3.