How to Influence the Results of MCDM?—Evidence of the Impact of Cognitive Biases

Multi-criteria decision-making (MCDM) methods aim at dealing with certain limitations of human information processing. However, cognitive biases, which are discrepancies of human behavior from the behavior of perfectly rational agents, might persist even when MCDM methods are used. In this article, we focus on two among the most common biases—framing and loss aversion. We test whether these cognitive biases can influence in a predictable way both the criteria weights elicited using the Analytic Hierarchy Process (AHP) and the final ranking of alternatives obtained with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). In a controlled experiment we presented two groups of participants with a multi-criteria problem and found that people make different decisions when presented with different but objectively equivalent descriptions (i.e., frames) of the same criteria. Specifically, the results show that framing and loss aversion influenced the responses of decision makers during pairwise comparisons, which in turn caused the rank reversal of criteria weights across groups and resulted in the choice of a different best alternative. We discuss our findings in light of Prospect Theory and show that the particular framing of criteria can influence the outcomes of MCDM in a predictable way. We outline implications for MCDM methodology and highlight possible debiasing techniques.


Literature Review
Multi-criteria decision-making methods have been developed and widely used in the past decades to deal with the limitations of human decision making that arise in complex decision environments. Specifically, when people have to solve decision problems involving multiple conflicting criteria, the processing load required to evaluate the information and make a decision becomes very high. This often results in a tendency to simplify the problem by using intuitive or heuristic approaches instead of rational or analytic ones, and can cause subjective judgments and the loss of important information [1,2]. Importantly, even experts have been found to have difficulty assessing complex trade-offs and to resort to simplified decision making [3]. Thus, in such situations better solutions can be achieved by applying MCDM techniques, which can handle large amounts of information and calculations [4].
Despite the popularity of MCDM methods, certain limitations of human decision making might persist even when decision support aids, such as MCDM methods, are used. Specifically, as long as human judgment is used as a basis for subsequent calculations and model building, the influence of the processes underlying human judgment should be studied and accounted for. This would make it possible to avoid distortions that might occur even in the most mathematically sophisticated models. It is also important to remember that MCDM is a helpful tool that can support decisions; however, the responsibility for taking the final decision always lies with the decision maker [5]. In the past decade, there has been a resurgence of papers in the operational research (OR) literature highlighting the importance of studying "various behavioral effects [that] can be embedded in and effect OR processes" [6]. The authors discuss in detail the importance and high potential of behavioral OR for improving OR methods and propose nine topics for a corresponding research agenda. One of these topics is the study of cognitive aspects, such as cognitive biases. Cognitive biases are systematic discrepancies of human behavior from the behavior of perfectly rational agents. In other words, instead of behaving according to the principles of probability and logic, humans make systematic errors in judgment and in their perception of the problem, which leads to erroneous decision making. Cognitive biases as a major behavioral effect have been extensively studied in other fields, such as cognitive psychology, behavioral economics and medical science [7-9], but are often overlooked in MCDM and OR [10,11]. As noted by Borrero et al. [12], there is a lack of empirical evidence on how cognitive factors influence the effectiveness of MCDM methods. Moreover, it is not clear whether these methods contribute to diminishing cognitive biases or whether, on the contrary, cognitive biases diminish the effectiveness of these tools.
Crucially, cognitive biases are systematic and inherent to the human mind. This means that cognitive biases, unlike individual differences and motivational biases, are much more universal, culturally independent and predictable [13]. Therefore, identifying their possible influence on particular stages of MCDM could help prevent frequently occurring problems. Moreover, debiasing techniques, which are effective ways of limiting the occurrence of cognitive biases, could be applied to prevent their influence.
Arnott [14] identified a taxonomy of 37 cognitive biases in Decision Support Systems (DSS), such as the framing bias, the loss aversion bias, the reference dependence bias, the anchoring bias and others, and proposed potential improvements for the design of DSS. Similarly, Montibeller & von Winterfeldt [10,15] provide a review of cognitive biases in Decision and Risk Analysis and MCDM. The authors highlight a number of cognitive biases that occur at different stages of MCDM and propose debiasing solutions. Importantly, Montibeller & von Winterfeldt [10] point out that although cognitive biases in OR have often been identified as a problem, very few studies have tested their influence on OR experimentally. One example of such research is the study by George [16], which tested a decision support system designed to mitigate the effects of the anchoring bias in the context of house appraisals. Due to the anchoring bias, individuals are disproportionately influenced by the initial information presented to them (the "anchor"), and their subsequent judgments during decision making are accordingly biased towards this initial information. The results of the study showed that the anchoring bias remained robust even when automated decision support was used. Another study, conducted by Ahn & Vazquez Novoa [17], investigated the impact of the decoy effect on relative performance evaluation and a possible debiasing capacity of Data Envelopment Analysis (DEA). The decoy effect is a cognitive bias whereby the inclusion of a dominated alternative can influence the preference for non-dominated alternatives (for example, consumers change their preference between two options when presented with a third option).
The authors showed that although the utility comparison of two alternatives was biased by the decoy effect, the addition of supplementary information about DEA results (efficiency scores and the mention of existing slacks) helped in debiasing the evaluation.
Despite these first attempts to study experimentally the impact of cognitive biases on different OR techniques, even fewer studies have addressed the issue of cognitive biases in MCDM specifically. The only study on this topic, to our knowledge, was done by Ferretti et al. [18]. In this paper the authors tested several methods to reduce the overconfidence bias when eliciting continuous probability distributions in the context of multicriteria decision analysis. Overconfidence describes the tendency of people to erroneously assess probabilities by underestimating variability and overestimating the tails of the distribution. Results revealed that participants were subject to the overconfidence bias, and that the debiasing techniques had a positive, though limited, debiasing effect.

Current Study
In the current paper we present the first experimental study on the influence of two major cognitive biases, i.e., framing and loss aversion, on MCDM. A framing bias is said to occur when people make different decisions if presented with different but objectively equivalent descriptions (i.e., frames) of situations or outcomes. For instance, describing a surgery as yielding a 90% survival rate (positive frame) versus a 10% mortality rate (negative frame) presents objectively equivalent information framed in two different ways [19]. Patients are more likely to choose to undergo a surgery described in the former than in the latter way. Crucially, the direction of the opinion change that occurs when different frames are presented is predictable from the loss aversion bias. This cognitive bias captures the extreme sensitivity of humans to losses compared to gains, i.e., a loss is perceived as being larger than a gain of the same amount. Framing and loss aversion are strongly interlinked and will therefore be studied together in the current paper. The existence of these cognitive biases has been previously demonstrated by numerous laboratory experiments and field studies in the literature (for reviews and meta-analyses, see [20-23]). These phenomena were extensively described in Prospect theory, developed by Kahneman and Tversky [1,7,13,24-26]. In 2002, Kahneman received the Nobel Memorial Prize in Economic Sciences for his work developing this theory. An overview of Prospect theory will be presented in the following subsection of this paper.
Turning to the susceptibility of MCDM to biases, it is important to note that cognitive biases might impact those steps in MCDM methods that involve judgments by decision makers. According to Montibeller & von Winterfeldt [10], these steps include the generation of alternatives and objectives, the development of criteria for the objectives, the elicitation of utility or value functions over criteria levels and the elicitation of weights for criteria. In this paper we will focus on the last step, i.e., the weighting of criteria. Ample earlier research has already demonstrated that using different weight elicitation methods yields systematic and persistent differences in the numbers which decision makers assign to the weights of the criteria [27]. Here we will focus on a single weight elicitation method and test whether the mere difference in the formulation of the criteria can still yield different results in weight assignment.
We consider two questions:

1. Can different ways of framing criteria have an impact on how decision makers evaluate them?
2. If framing and loss aversion biases are induced at early stages of weight elicitation (i.e., at the stage of pairwise comparisons), does it affect both the final ranking of criteria weights and the final ranking of alternatives? (see Figure 1)

As to the first question, we hypothesize that if a criterion is framed in terms of losses (e.g., jobs lost following a reorganization), it might be perceived as being more important than when it is framed in terms of gains (e.g., jobs saved following a reorganization). In order to elicit weights, we chose to use the AHP method, which is one of the most used methods across different fields, suited both for individual and group decision making [28]. Although many other methods have been since developed, the AHP remains one of the most widely used MCDM methods [29]. As the AHP is believed to be in accordance with psychological principles [30], it can be considered to be extremely suitable to test for robustness to psychological biases.
As to the second question, it is important to identify the specific stage at which framing and loss aversion biases are induced and, crucially, to test for their influence on the subsequent stages of the MCDM. As is common practice, we chose to use a method other than AHP, namely TOPSIS, to obtain the final ranking of the alternatives [31]. TOPSIS was used instead of AHP because it is less time-consuming for the participants. Moreover, in order to have better control over our experimental variable, we wanted to limit the involvement of human judgment to a single MCDM stage, whereas alternative selection using AHP would require additional input from the decision makers. Note, though, that this method was chosen for simplicity, as one of the most common classical MCDM methods, and other methods could have been equally suitable to test our hypotheses. We will thus target the following three stages of the MCDM:

• the stage of individual pairwise comparisons (AHP)
• the stage of criteria ranking (AHP)
• the stage of alternative selection (TOPSIS)

All three of these stages have to be tested, as an effect of framing and loss aversion at one stage could well disappear in the subsequent stages. For instance, both cognitive biases could impact the ratings given by the decision makers at the stage of an individual pairwise comparison, but this effect might be mitigated by other pairwise comparisons, subsequent normalization or weight calculation techniques, and thus result in no significant effect on the final ranking of alternatives. For example, the authors of [32] studied criteria weight elicitation using three different techniques and showed that although each technique led to different criteria weights, all techniques led to the selection of the same final alternative.
An effect of framing and loss aversion found only at the stage of individual pairwise comparisons, but not in later steps, would indicate that AHP helps in diminishing the influence of cognitive biases on decision making. If, however, different ways of framing the same criteria lead to the assignation of different criteria ranks and to the final selection of different alternatives, this would be evidence for the sensitivity of MCDM to framing and loss aversion biases and hence, for the need of effective debiasing techniques.
In this research we evaluated the effect of framing and loss aversion biases on the judgments of decision makers in a controlled online experiment. Two groups of participants were asked to make pairwise comparisons on logically equivalent criteria which were framed in two different ways (positive vs. negative frames). Following recommendations to keep the number of criteria at seven or less for consistency and redundancy reasons [33], we chose to include six criteria. As the experiment was planned at the beginning of the COVID-19 global pandemic, when Lithuania (where the experiment was conducted) was under quarantine, we chose to design an MCDM problem revolving around the topic of COVID-19. As noted by O'Keefe [11], many experimental studies within OR are artificial, as they use students as participants and present problems that are not necessarily relevant to them. In the current study we addressed this criticism and chose a topic that was relevant to a great number of citizens, considering the global epidemiological situation. Moreover, we recruited most participants in COVID-19-related groups on social media (Facebook), thus both ensuring they were personally interested in the issue and reaching a larger and more varied sample of participants. Finally, the MCDM problem used in our experiment involved qualitative criteria, such as policy impacts, which are particularly suited to be analyzed using AHP [34].
In the next sub-section, we will present Prospect Theory together with the framing and loss aversion biases in further detail, followed by a brief description of the AHP and TOPSIS methods.

The most famous illustration of the framing effect comes from the seminal work by Tversky and Kahneman [35]. In their experiment two groups of participants were presented with a hypothetical scenario in which they had to imagine that the US is preparing for an outbreak of a deadly Asian disease, which is expected to kill 600 people. They were told to choose one of two alternative programs to combat the disease. The first group received the following formulation of the two programs and their consequences:

If Program A is adopted, 200 people will be saved. If Program B is adopted, there is a 1/3 probability that 600 people will be saved, and a 2/3 probability that no people will be saved.

72% of the participants in this group chose Program A. The other group of participants received a different description of the two programs:

If Program C is adopted, 400 people will die. If Program D is adopted, there is a 1/3 probability that nobody will die, and a 2/3 probability that 600 people will die.
In the second group only 22% of the respondents opted for Program C. Thus, although Programs A and C were logically equivalent (200 saved and 400 dead), participants radically shifted their preferences depending on the formulation of the alternatives. The observation that people make different decisions when presented with different but objectively equivalent descriptions (i.e., frames) of the same problem is contrary to the principle of description invariance inherent to rational choice [7]. In order to explain this and related behavioral effects, Kahneman & Tversky [1,26] proposed an alternative to expected utility theory, called Prospect theory. The evaluation proposed in Prospect theory is similar to that of earlier models using weighting functions, namely:

$$V = \sum_{i} \pi(p_i)\, v(x_i),$$

where $V$ is the expected utility, $x_i$ are the potential outcomes, $p_i$ is the respective probability of each outcome, $\pi$ is the probability weighting function, and $v$ is a function that assigns a value to an outcome $x_i$. However, this value function differs from the standard utility function, as it passes through the reference point and is S-shaped and asymmetrical (Figure 2). That is, the value of a potential outcome is set to

$$v(x) = \begin{cases} x^{\alpha}, & \text{for } x \geq 0, \\ -\lambda(-x)^{\beta}, & \text{for } x < 0, \end{cases}$$

where $v$ is the function that assigns a value to an outcome $x$, and $\alpha$ and $\beta$ denote the adjustable coefficients, $\alpha$ specifying the concavity and $\beta$ the convexity of the value function. The value function satisfies the constraints $0 < \alpha \leq 1$ and $0 < \beta \leq 1$. The parameter $\lambda$ denotes loss aversion, with $\lambda = 2.25$ as estimated from experimental data [25].
Thus, the value function proposed by this theory reflects three major cognitive biases:

• Reference dependence bias: the value function (gains and losses) is defined in terms of deviations from a reference point and not in terms of absolute magnitudes. Thus, decisions are made relative to some status quo or baseline and are sensitive to framing. This differs from expected utility theory, in which a rational agent is indifferent to the reference point.
• Loss aversion bias: the value function is steeper for losses than for gains. This suggests that a loss is perceived as being larger than a gain of the same amount (e.g., the aversion to losing 10 euros seems stronger than the attractiveness of gaining 10 euros). In other words, humans are oversensitive to losses. Again, this differs from expected utility theory, where individuals should value the same amount equally, independently of whether it is a gain or a loss.
• Risk aversion bias: the value function is concave for gains and convex for losses. This means that when choices involve gains, people are risk averse and prefer a certain gain to a probable gain, even if the latter has equal or greater expected utility. Conversely, when choices involve losses, people are risk seeking and prefer options that help avoid sure losses.
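The asymmetry of the value function can be illustrated with a minimal Python sketch. The loss aversion parameter λ = 2.25 is the value cited above; the exponents α = β = 0.88 are an assumption made here for illustration (the text only requires that they lie in (0, 1]), and the function name `prospect_value` is hypothetical.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Subjective value of an outcome x, measured from the reference point x = 0.

    alpha and beta are illustrative assumptions; lam = 2.25 is the loss
    aversion estimate cited in the text.
    """
    if x >= 0:
        return x ** alpha           # concave branch for gains
    return -lam * (-x) ** beta      # convex and steeper branch for losses


gain = prospect_value(10.0)    # subjective value of gaining 10 euros
loss = prospect_value(-10.0)   # subjective value of losing 10 euros

# When alpha == beta, the ratio |v(-x)| / v(x) equals lam for any x > 0,
# i.e., the loss weighs lam times as much as the equally sized gain.
print(gain, loss, abs(loss) / gain)
```

Under these assumed parameters, a loss of 10 euros is valued 2.25 times as strongly as a gain of 10 euros, which is the asymmetry described by the loss aversion bias.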
Turning back to the Asian disease problem, the reference point adopted in the first group was positively framed in terms of the number of lives saved (gains), while in the second group it was negatively framed in terms of lives lost (losses). Therefore, participants exhibited risk aversion in the first group and chose the solution with a certain outcome of saving 200 lives. In the second group, however, they exhibited loss aversion and risk seeking, as they opted for the risky option that helped avoid a sure loss of 400 lives. Although the Asian disease problem was an artificial laboratory experiment, it is somewhat similar to multiple-criteria decision making, as participants have to choose an alternative based on several conflicting criteria. Thus, if the biases described above have very strong effects in this type of experiment, it is likely that comparable effects might occur in real MCDM decision making.
Although all three cognitive biases characterized by Prospect theory are relevant to MCDM, we will focus in this paper on the first two, namely reference dependence and loss aversion, which are directly linked to the framing bias. Specifically, framing occurs because individuals evaluate information relative to a reference point, which is often the status quo, and they evaluate it differently depending on whether this information is related to gains or losses with respect to this reference point. Thus, we hypothesize that when evaluating the importance of a criterion, decision makers will be sensitive to the frame used to formulate this criterion. For instance, in a pairwise comparison the comparison between Criterion 1 and Criterion 3 will depend on the way they are framed: the outcome of comparing Criterion $1^{-}$ with Criterion $3_{\text{constant}}$ is expected to differ from the outcome of comparing Criterion $1^{+}$ with Criterion $3_{\text{constant}}$, where Criterion $1^{-}$ is a negatively framed criterion, Criterion $1^{+}$ is the same criterion framed positively, and the framing of Criterion $3_{\text{constant}}$ is kept constant.

AHP
The AHP technique can handle both quantitative and qualitative information [36]. In AHP the decision problem is decomposed into a hierarchical structure, incorporating the goal of the decision problem, the criteria and the alternatives. In order to calculate the priority of each criterion with respect to the goal, and the priority of each alternative with respect to each criterion, the AHP uses comparative judgments carried out on pairs of criteria or alternatives. Importantly, the decision maker compares only two elements at a time and can thus concentrate on the properties of the elements in question without having to think about the other elements [37]. The pairwise comparisons are made on a fundamental scale from 1 to 9 (Table 1). This yields a pairwise reciprocal comparison matrix $A$ [38], in which $A_1, \ldots, A_n$ denote the objects to be compared (criteria or alternatives) and $w_1, \ldots, w_n$ denote their weights:

$$A = \begin{pmatrix} w_1/w_1 & w_1/w_2 & \cdots & w_1/w_n \\ w_2/w_1 & w_2/w_2 & \cdots & w_2/w_n \\ \vdots & \vdots & \ddots & \vdots \\ w_n/w_1 & w_n/w_2 & \cdots & w_n/w_n \end{pmatrix},$$

where the rows and the columns correspond to $A_1, \ldots, A_n$.

Local priorities (weights) are calculated from the comparison matrix with the eigenvalue method [38,39]:

$$A\vec{p} = \lambda_{\max}\vec{p},$$

where $A$ is the pairwise comparison matrix, $\vec{p}$ is the priorities vector and $\lambda_{\max}$ is the maximal eigenvalue. The consistency of the judgments the decision maker provided during the pairwise comparisons is checked using the consistency index (CI):

$$CI = \frac{\lambda_{\max} - n}{n - 1},$$

where $\lambda_{\max}$ is the maximal eigenvalue of the matrix $A$ and $n$ is the size of the comparison matrix $A$. The Consistency Ratio (CR) is then calculated as

$$CR = \frac{CI}{RI},$$

using the Random Index RI, which is the average CI value from a random simulation of 500 pairwise comparison matrices (the values for RI are given in Table 2). If $CR \leq 0.1$, the inconsistency is acceptable; otherwise the judgments should be reviewed. Note, though, that acceptable levels of consistency can still be obtained when aggregating multiple individual inconsistent comparison matrices, if the number of individual decision makers is sufficiently high [40].
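The eigenvalue method and the consistency check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the principal eigenvector is approximated by power iteration (one of several possible numerical routes), the 3 × 3 comparison matrix is hypothetical, and the Random Index values are the commonly cited ones for matrices of size 3 to 7, which may differ slightly from those in Table 2. The function names are hypothetical as well.

```python
def ahp_priorities(matrix, iterations=100):
    """Approximate the priorities vector p and lambda_max by power iteration."""
    n = len(matrix)
    p = [1.0 / n] * n
    for _ in range(iterations):
        q = [sum(matrix[i][j] * p[j] for j in range(n)) for i in range(n)]
        total = sum(q)
        p = [value / total for value in q]   # keep the weights summing to 1
    # Estimate lambda_max as the mean of (A p)_i / p_i over all rows.
    ap = [sum(matrix[i][j] * p[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(ap[i] / p[i] for i in range(n)) / n
    return p, lambda_max


# Commonly cited Random Index values (assumption; see Table 2 of the paper).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}


def consistency_ratio(lambda_max, n):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical reciprocal judgments on the 1-to-9 scale for three criteria.
A = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights, lambda_max = ahp_priorities(A)
cr = consistency_ratio(lambda_max, len(A))
print(weights, lambda_max, cr)
```

For a perfectly consistent matrix, $\lambda_{\max} = n$ and therefore CI = CR = 0; the hypothetical matrix above is only slightly inconsistent, so its CR falls well below the 0.1 threshold.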
The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is used only for the choice of an optimal alternative. The criteria weights are elicited using a different method prior to the application of TOPSIS. TOPSIS is based on the ranking of alternatives through distance measures. Specifically, the optimal alternative is defined as the one that is closest to an ideal solution and at the same time most distant from the negative ideal solution. Both of these solutions are derived within the method. While the ideal solution aims at minimizing the cost criteria and maximizing the benefit criteria, the negative ideal solution minimizes the benefit criteria and maximizes the cost criteria.
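First, a decision matrix $A$ with $m$ alternatives and $n$ criteria is created, with the intersection of each alternative and criterion given as $x_{ij}$, where $i = 1, \ldots, n$ indexes the criteria and $j = 1, \ldots, m$ indexes the alternatives.

Then this matrix is normalized to obtain $R = (r_{ij})_{n \times m}$ by applying the normalization method:

$$r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{k=1}^{m} x_{ik}^{2}}}.$$

The third step involves generating the weighted normalized decision matrix $T = (t_{ij})_{n \times m}$, expressed as:

$$t_{ij} = \omega_i \, r_{ij},$$

where $\omega_i$ is the weight of the $i$th criterion and $\sum_{i=1}^{n} \omega_i = 1$. The weight for each criterion can be derived by using various methods, such as AHP [37,41], the simple multiattribute rating technique (SMART) [42], tradeoff weighting [43], etc.

Then the ideal solution $S^{+}$ and the negative ideal solution $S^{-}$ are defined as follows:

$$S^{+} = \{s_1^{+}, \ldots, s_n^{+}\}, \qquad s_i^{+} = \begin{cases} \max_{j} t_{ij}, & i \in B^{+}, \\ \min_{j} t_{ij}, & i \in B^{-}, \end{cases}$$

$$S^{-} = \{s_1^{-}, \ldots, s_n^{-}\}, \qquad s_i^{-} = \begin{cases} \min_{j} t_{ij}, & i \in B^{+}, \\ \max_{j} t_{ij}, & i \in B^{-}, \end{cases}$$

where $B^{+}$ and $B^{-}$ are associated with benefit and cost criteria, respectively. Using the $n$-dimensional Euclidean distance, the distance $D_j^{+}$ between every alternative and the ideal solution is calculated with:

$$D_j^{+} = \sqrt{\sum_{i=1}^{n} \left(t_{ij} - s_i^{+}\right)^{2}},$$

and the distance $D_j^{-}$ between every alternative and the negative ideal solution is calculated with:

$$D_j^{-} = \sqrt{\sum_{i=1}^{n} \left(t_{ij} - s_i^{-}\right)^{2}}.$$

Finally, the relative closeness of each alternative to the ideal solution (denoted as $C_j$) is obtained by the following equation:

$$C_j = \frac{D_j^{-}}{D_j^{+} + D_j^{-}}.$$

The alternatives are ranked according to their $C_j$ value (the highest value represents the best solution).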
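The TOPSIS steps described above can be summarized in a short Python sketch. It assumes vector normalization and a decision matrix stored with one row per alternative and one column per criterion; the scores, weights and the function name `topsis` are hypothetical and are not taken from the experiment.

```python
import math


def topsis(scores, weights, is_benefit):
    """Return the relative closeness of each alternative (one row of `scores`).

    scores[a][c] is the score of alternative a on criterion c,
    weights[c] is the weight of criterion c (weights sum to 1),
    is_benefit[c] is True for benefit criteria and False for cost criteria.
    """
    n_alt, n_crit = len(scores), len(weights)
    # Vector normalization of each criterion, followed by weighting.
    norms = [math.sqrt(sum(scores[a][c] ** 2 for a in range(n_alt)))
             for c in range(n_crit)]
    t = [[weights[c] * scores[a][c] / norms[c] for c in range(n_crit)]
         for a in range(n_alt)]
    # Ideal and negative ideal solutions, criterion by criterion.
    columns = list(zip(*t))
    ideal = [max(col) if is_benefit[c] else min(col)
             for c, col in enumerate(columns)]
    negative = [min(col) if is_benefit[c] else max(col)
                for c, col in enumerate(columns)]
    closeness = []
    for row in t:
        d_plus = math.dist(row, ideal)       # distance to the ideal solution
        d_minus = math.dist(row, negative)   # distance to the negative ideal
        closeness.append(d_minus / (d_plus + d_minus))
    return closeness


# Hypothetical example: three alternatives, one cost and one benefit criterion.
scores = [[250.0, 7.0], [200.0, 6.0], [300.0, 9.0]]
closeness = topsis(scores, weights=[0.4, 0.6], is_benefit=[False, True])
best = max(range(len(closeness)), key=closeness.__getitem__)
print(closeness, best)
```

Because the criteria weights enter the calculation before the distances are computed, any shift in the weights elicited with AHP propagates directly into the closeness values and can therefore change which alternative is ranked first.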

Participants
Participants were recruited on a voluntary basis in COVID-19-related groups on the social media platform Facebook. A small number of participants were recruited among university students. Participants completed the experiment on the online experimentation platform Qualtrics. Following the experimental part, participants were asked to fill in a short sociological questionnaire and provide information on their gender (55% were female in group A and 53% in group B), education level (80% had a higher education diploma in group A and 75% in group B) and age (in group A participants were aged between 18 and 63, with a mean age of 36; in group B participants were aged between 18 and 59, with a mean age of 33). Participants were free to quit the experiment whenever they wanted, which helped ensure that only interested and fully engaged participants completed the questionnaire. In total, 248 participants completed both the experimental and the sociological parts of the test (41 participants were removed, as they did not answer all questions).

Procedure
The experiment was based on a classical multiple-criteria decision problem. Participants were randomly assigned to one of two groups, A or B. They were first presented with a short description of the decision problem, in which they had to help the government choose the best policy to stop a second wave of the COVID-19 virus. Specifically, they were informed that the state institutions have to choose the best policy among three alternatives. Participants were told that the financial and human resources of the state are limited, and therefore the policy alternatives had to be evaluated based on six conflicting criteria. These criteria described the estimated outcomes of each of the policies. The first three criteria were: the proportion of doctors infected with the virus (depends on the availability of protective equipment in hospitals and testing capacities), the number of new COVID-19 local outbreaks (depends on the funds and human resources allocated to testing, tracing and isolating new cases), and the number of new imported COVID-19 cases (depends on the funds and human resources allocated to controlling borders and ensuring that people self-isolate after travelling to countries with a high infection risk). The other three criteria were the availability of hospital beds for COVID-19 patients, the availability of PCR tests, and the availability of disinfection liquid and other protective equipment. These three criteria depended on the allocation of funds from the government, the effective management of medical care units and laboratories, and the organization of supply shipping.
Both the criteria and alternatives were formulated based on real information on COVID-19-related policies taken from official websites of the Ministry of Health. Participants were told that they would first evaluate the importance of the criteria in pairwise comparisons on a scale from 1 to 9 (the classical AHP scale). Once the weights for each criterion were obtained by means of AHP, the alternatives were evaluated and ranked using TOPSIS. For this we used a pre-filled decision matrix with the scores of each alternative on each of the six criteria.
The criteria weight elicitation involved 15 pairwise comparisons (this number is derived from the formula $n(n-1)/2$, where $n$ is the number of criteria). The experimental manipulation consisted in framing two of the six criteria ("Hospital beds" and "Doctors") differently for each of the groups A and B. Specifically, the two criteria were framed in terms of losses in one group ("Proportion of hospital beds already occupied by COVID-19 patients" and "Proportion of doctors infected with COVID-19") and in terms of gains in the other ("Proportion of hospital beds still available for COVID-19 patients" and "Proportion of doctors who avoided COVID-19 infection"). The other four criteria were identical across both groups. To test for the presence of the framing and loss aversion biases at the pairwise comparison level, we concentrated on the first eight pairwise comparisons, four in the test condition and four in the control condition (see the structure of the test and control conditions in Table 3), while the remaining pairwise comparisons were used only in subsequent stages of MCDM. In the test condition one member of each pair was framed either in terms of losses (Criteria $1^{-}$ and $2^{-}$ in participant group A) or in terms of gains (Criteria $1^{+}$ and $2^{+}$ in participant group B). The framing of the second member (Criteria $3_{\text{constant}}$ and $4_{\text{constant}}$) was kept constant across experimental groups. The index minus (as in Criterion $1^{-}$) indicates that the criterion was framed in terms of losses, the index plus (as in Criterion $1^{+}$) indicates that the criterion was framed in terms of gains, while the index 'constant' (as in Criterion $3_{\text{constant}}$) indicates that the framing of this criterion was kept constant. Criteria $3_{\text{constant}}$ and $4_{\text{constant}}$ correspond to "the number of new COVID-19 local outbreaks" and "the number of new imported COVID-19 cases".
In the control condition the framing of both members of each pair was kept constant across groups, i.e., both groups received identical criteria pairs in the control condition. Note that although the constant criteria were also framed in terms of gains or losses, the framing of these criteria was not manipulated and its valence remained the same across groups. Thus, if there is an effect of framing on pairwise comparisons, we expect to find a difference between participant groups A and B in the test condition specifically. Importantly, a small difference between groups A and B could occur in the control condition as well, due to individual differences. However, this difference should be much smaller than in the test condition.
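The count of 15 pairwise comparisons follows directly from enumerating all unordered pairs of the six criteria, as the following Python sketch shows. The criterion labels are shorthand introduced here for illustration, and the order of the pairs does not reflect the order in which the comparisons were presented in the experiment.

```python
from itertools import combinations

# Shorthand labels for the six criteria (illustrative, not the exact wording).
criteria = [
    "Doctors",
    "Local outbreaks",
    "Imported cases",
    "Hospital beds",
    "PCR tests",
    "Protective equipment",
]

pairs = list(combinations(criteria, 2))   # every unordered pair exactly once
n = len(criteria)

# Both numbers are 15 for six criteria.
print(len(pairs), n * (n - 1) // 2)

for first, second in pairs:
    print(f"{first} vs. {second}")
```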
Two simple additional questions served as controls for whether participants are actively engaged in the experiment (the first question was: "The sum of 5 + 3 equals to:"; the second was formulated as a pairwise comparison between "funds allocated for vaccines against COVID-19" and "funds allocated for controlling the number of customers in supermarkets"). These questions were included in between experimental questions and were excluded from analyses.

Stage of Pairwise Comparisons
Prior to analysis, we inspected the answers of the participants to the two simple additional questions, as well as the Consistency Ratio of the pairwise comparisons they provided. Participants who did not answer the additional control questions accurately and those whose pairwise comparisons were inconsistent were excluded from the analysis (23.4%; of the removed participants, only five were removed due to inaccurate answers on both additional control questions), resulting in a total of 102 participants in group A and 96 participants in group B.
As our data were nested (each participant had to make several pairwise comparisons), we analyzed the datasets using linear mixed-effects regression modeling (R package lme4 [44]). We constructed a model with the Scores assigned by participants in the pairwise comparisons as the dependent variable. Group (A vs. B) and Condition (Test vs. Control) were included as contrast-coded fixed effects, and intercepts for Participants and Pairwise comparisons as random factors. P-values were obtained by likelihood ratio tests of the full model against the model without the effect or interaction in question. We found a significant effect of Group ($\beta = 0.44$, SE = 0.12, $\chi^2(1) = 12.71$, $p < 0.001$) and a significant Group × Condition interaction ($\beta = -1.48$, SE = 0.21, $\chi^2(1) = 48.16$, $p < 0.001$), but no effect of Condition ($p > 0.5$). Separate models for the Test and Control conditions revealed that the interaction arose because the effect of Group was not significant in the control condition, whereas in the test condition there was a significant difference between groups A and B ($\beta = 1.18$, SE = 0.25, $\chi^2(1) = 21.82$, $p < 0.001$). Thus, our prediction that framing and loss aversion biases impact the scores assigned to criteria in the pairwise comparisons was borne out. Moreover, the fact that a difference between groups was found only in the test (framing) condition, but not in the control (no-framing) condition, shows that these differences were indeed caused by framing and loss aversion, and not by mere individual differences between the participants of the two groups.
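The analysis itself was run in R with lme4; the following Python sketch only illustrates how a likelihood ratio statistic with one degree of freedom, such as the $\chi^2(1)$ values reported above, is converted into a p-value. The function names are hypothetical, and the sketch relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def lrt_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic: twice the difference in log-likelihoods."""
    return 2.0 * (loglik_full - loglik_reduced)


def lrt_p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    return math.erfc(math.sqrt(chi_square / 2.0))


# Statistics reported in the text (Group, Group x Condition, Group in test).
reported = {"Group": 12.71, "Group x Condition": 48.16, "Group (test)": 21.82}
p_values = {name: lrt_p_value_df1(value) for name, value in reported.items()}

for name, p in p_values.items():
    print(f"{name}: chi2(1) = {reported[name]}, p = {p:.2e}")
```

All three statistics correspond to p-values below 0.001, in line with the thresholds reported above.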

Criteria Weights
Once the pairwise comparisons were elicited from participants, criteria weights were calculated and aggregated using the R package AHPsurvey [45]. This package allows the researcher to adjust comparisons based on consistency, as well as to extract, calculate and aggregate the weights of the criteria. We transformed the scores of the pairwise comparisons to the balanced scale [46], as it has been shown to increase the accuracy of the results and to decrease both the spread of weights and the inconsistency compared to the 1-to-9 scale [47].
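To make the scale transformation concrete, the following Python sketch implements one common formulation of the balanced scale, in which a judgment x on the 1-to-9 scale is mapped to a local weight w = 0.45 + 0.05x and the matrix entry becomes w / (1 − w). This formulation is an assumption for illustration; the transformation applied by the package used in the study may differ in detail, and the function name is hypothetical.

```python
def balanced_scale(score):
    """Map a 1..9 judgment to the balanced scale; reciprocal judgments (< 1) are mirrored."""
    if score <= 0:
        raise ValueError("judgments must be positive")
    if score < 1:
        return 1.0 / balanced_scale(1.0 / score)
    if score > 9 + 1e-9:  # small tolerance for floating-point reciprocals
        raise ValueError("judgments must lie between 1/9 and 9")
    w = 0.45 + 0.05 * score  # local weight, evenly spaced from 0.5 to 0.9
    return w / (1.0 - w)


# Transformed values for the nine integer judgments.
transformed = [balanced_scale(x) for x in range(1, 10)]
print([round(value, 2) for value in transformed])
```

Because the local weights are evenly spaced, intermediate judgments are compressed relative to the 1-to-9 scale, while the endpoints 1 and 9 are preserved.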
There is a variety of methods to aggregate the judgments of individual decision makers. Two of the most common are the aggregation of individual priorities (AIP) and the aggregation of individual judgments (AIJ). The former is used when the group of decision makers is assumed to act as separate individuals, while the latter is preferred when members of the group are thought to act together as one unit [48]. The decision problem in the current paper spans multiple disciplines in which full specialization is not achievable. Thus, the participants of this study could be seen either as individuals with different backgrounds, and thus different approaches to the problem, or as a single category of citizens dealing with the same problem on a daily basis. We therefore used both aggregation techniques and compared their outcomes.

Aggregation Using Individual Priority Weights (AIP)
The individual priority weights, which are the weights per criterion for each participant, were computed using the Dominant Eigenvalues method described in [49]. The list of individual priorities for each participant can be found in Appendix A. The geometric mean was used to aggregate individual priority weights. The aggregated priority weights for each participant group can be found in Table 4. The weights and ranks of the three most important criteria that were affected by framing are in bold.
The results reveal that the ranking of criteria weights differs across groups. While in group A the top three criteria were ranked in the descending order C2 >> C1 >> C3, in group B the order changed to C3 >> C2 >> C1. This rank reversal stems from the weights assigned to the two criteria that underwent framing: the criteria "Hospital beds" and "Doctors" received much higher weights in group A than in group B. For instance, the criterion "Hospital beds" received a 51% higher weight in group A (0.262) than in group B (0.174). Taken together, criteria C1 and C2 obtained almost 11 percentage points more weight in group A (C1 + C2 = 0.454) than in group B (C1 + C2 = 0.346). Consequently, the proportion of weight given to the remaining criteria is much smaller in group A than in group B. Crucially, this difference between groups is clearly caused by the framing effect and loss aversion: the two criteria in question were framed as losses in group A and thus received more weight than in group B, where those same criteria were framed as gains. The implications of these findings are discussed in the Discussion section.
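The AIP procedure described in this subsection can be sketched in a few lines of Python. The example derives each participant's priority vector as the principal eigenvector of their comparison matrix (via power iteration) and then aggregates the vectors with a criterion-wise geometric mean. The two three-criteria matrices are hypothetical, the function names are illustrative, and the final renormalization to a sum of 1 is an assumption; the study itself used an R package on six criteria.

```python
import math


def eigen_weights(matrix, iterations=200):
    """Principal eigenvector of a pairwise comparison matrix, scaled to sum to 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


def aggregate_priorities(weight_vectors):
    """AIP: geometric mean of each criterion's weight across participants, renormalized."""
    m = len(weight_vectors)
    n = len(weight_vectors[0])
    geo = [math.prod(wv[k] for wv in weight_vectors) ** (1.0 / m) for k in range(n)]
    total = sum(geo)
    return [g / total for g in geo]


# Hypothetical comparison matrices of two participants over three criteria.
participant_1 = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
participant_2 = [[1, 1 / 2, 1], [2, 1, 2], [1, 1 / 2, 1]]

individual = [eigen_weights(participant_1), eigen_weights(participant_2)]
group_weights = aggregate_priorities(individual)
print([round(w, 3) for w in group_weights])
```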

Aggregation Using Individual Judgements (AIJ)
AIJ aggregates the individual judgments of all decision makers into a single group pairwise comparison matrix. The geometric mean, rather than the arithmetic mean, was used to aggregate the individual comparison matrices, as it has the advantage of increasing the consistency for the whole group [36]. The aggregated pairwise comparison matrices for group A and group B are presented in Tables 5 and 6, respectively. The aggregated weights for each participant group can be found in Table 7. The weights and ranks of the three most important criteria that were affected by framing are in bold.
As with the AIP aggregation method, the results of weight aggregation using AIJ reveal that the ranking of criteria weights differs across groups A and B. This difference lies in the ranking of the top three criteria, whereby the criteria "Hospital beds" and "Doctors" received much higher weights in group A than in group B. Here too, the criterion "Hospital beds" received a 52% higher weight in group A (0.300) than in group B (0.197). Moreover, criteria C1 and C2 together obtained 13 percentage points more weight in group A (C1 + C2 = 0.523) than in group B (C1 + C2 = 0.39). Thus, in group A more than half of the total weight was assigned to the framed criteria. Consequently, the proportion of weight given to the remaining criteria was much smaller in group A than in group B. As mentioned above, this effect is caused by the framing and loss aversion biases: the two criteria responsible for the rank reversal were precisely the framed ones. They received more weight when framed as losses (group A) than when framed as gains (group B). Although the final weights attributed to the criteria by the two aggregation techniques differed, the order of the criteria ranking was the same for both AIJ and AIP.
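The AIJ route can be sketched analogously in Python: the individual comparison matrices are first combined entry by entry with the geometric mean, and a single priority vector is then derived from the group matrix. The two participant matrices and all function names are hypothetical, and the eigenvector is approximated by power iteration; this is an illustration of the technique, not a reproduction of the analysis in the study.

```python
import math


def aggregate_judgments(matrices):
    """AIJ: element-wise geometric mean of the participants' comparison matrices."""
    m = len(matrices)
    n = len(matrices[0])
    return [
        [math.prod(mat[i][j] for mat in matrices) ** (1.0 / m) for j in range(n)]
        for i in range(n)
    ]


def eigen_weights(matrix, iterations=200):
    """Principal eigenvector of a pairwise comparison matrix, scaled to sum to 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


# Hypothetical comparison matrices of two participants over three criteria.
participant_1 = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
participant_2 = [[1, 1 / 2, 1], [2, 1, 2], [1, 1 / 2, 1]]

group_matrix = aggregate_judgments([participant_1, participant_2])
aij_weights = eigen_weights(group_matrix)
print([round(w, 3) for w in aij_weights])
```

The geometric mean keeps the group matrix reciprocal, which the arithmetic mean does not guarantee. For perfectly consistent toy inputs such as these, AIJ and AIP yield the same weights; with real, slightly inconsistent judgments the two generally differ.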

Alternative Ranking
In order to rank the three alternatives, we used a pre-filled decision matrix containing the scores of each alternative on each of the six criteria. The two sets of aggregated criteria weights, one obtained using the AIP and one using the AIJ aggregation technique, were taken from the previous step. Thus, we calculated the final alternative ranking in two ways: Table 8 shows the results obtained with the AIP-aggregated weights, and Table 9 shows the results obtained with the AIJ-aggregated weights. The results show that the best alternative differs between groups A and B. Namely, in group A Alternative 3 was chosen as the best solution, whereas in group B the winner was Alternative 2. The same alternative ranking was obtained with the weights from both AIP and AIJ.
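The following Python sketch illustrates how a change in criteria weights alone can change the winning alternative in TOPSIS, the ranking technique named in this paper. It implements the standard steps (vector normalization, weighting, distances to the ideal and anti-ideal solutions, relative closeness). The decision matrix, the two weight vectors and the function names are hypothetical; the study's six-criteria decision matrix is not reproduced here.

```python
import math


def topsis(decision_matrix, weights, is_benefit):
    """Return the relative closeness of each alternative to the ideal solution."""
    n_crit = len(weights)
    # Vector normalization of each criterion column, followed by weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in decision_matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in decision_matrix]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal values depend on whether a criterion is a benefit or a cost.
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    anti_ideal = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)
        d_minus = math.dist(row, anti_ideal)
        scores.append(d_minus / (d_plus + d_minus))
    return scores


def best_alternative(scores):
    """Index of the alternative with the highest closeness score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical scores of three alternatives on two benefit criteria.
matrix = [[9, 2], [5, 5], [2, 9]]
benefit = [True, True]

scores_a = topsis(matrix, [0.7, 0.3], benefit)  # first criterion weighted more
scores_b = topsis(matrix, [0.3, 0.7], benefit)  # second criterion weighted more
print(best_alternative(scores_a), best_alternative(scores_b))
```

With the decision matrix held fixed, shifting weight from one criterion to the other is sufficient to change which alternative comes out on top, which is the mechanism through which biased weights propagate to the final ranking.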

Discussion
Dealing with the limitations of human information processing is anything but straightforward. Cognitive biases, which are discrepancies of human behavior from the behavior of perfectly rational agents, have been shown to strongly impact decision making. Therefore, MCDM methods that involve judgments by decision makers are also likely to be affected by those biases. In the current study we focused on two of the most studied cognitive biases, namely the framing and loss aversion biases. The results of our study show that:

1. By framing the criteria in a particular way, it is possible to influence the responses given by decision makers during AHP pairwise comparisons.
2. This caused the rank reversal of criteria weights across groups and resulted in the choice of different best alternatives.
3. The exact influence of different framings is predictable by the Prospect theory and can be explained by the loss aversion bias.
In this section we first discuss our findings in light of the Prospect theory. We then discuss the implications of these results for MCDM methods, and we conclude by proposing ways of avoiding or diminishing the effects of these cognitive biases.

Discussion of Results and Interpretation in Light of Prospect Theory
In this research we tested whether framing and loss aversion biases can influence in a predictable way both the criteria weights elicited using the AHP method and the final ranking of alternatives derived with the TOPSIS technique. The framing bias describes the tendency of people to make different decisions when presented with different but objectively equivalent descriptions (i.e., frames) of a situation or object. Specifically, things framed in terms of losses tend to be perceived as more important than when they are framed in terms of gains. This heightened sensitivity to losses is predictable and is referred to as loss aversion. In this paper we conducted a controlled experiment in which participants were presented with a real-world multi-criteria problem. Two groups of participants were asked to make pairwise comparisons on logically equivalent criteria that were framed in two different ways (positive vs. negative frames).
First, we hypothesized that participants presented with criteria framed as losses would perceive them as more important than participants presented with the same criteria framed as gains. This would result in differences between groups in the evaluation of the framed criteria during the AHP pairwise comparisons. Second, we hypothesized that the subsequent stages of MCDM would in turn be impacted by those cognitive biases. The results of the study confirmed our hypotheses. We found that, at the stage of pairwise comparisons of criteria, the two groups of participants assigned significantly different scores to the criteria when these were framed differently (as losses vs. as gains). Moreover, the difference between groups was found only in the test condition, which involved these framing differences. Conversely, in the control condition, where participants received criteria with exactly the same framing, no significant difference between groups was observed. This confirms that the differences found in the test condition were indeed caused by framing and loss aversion biases, and not by mere individual differences between the participants of the two groups. Thus, our results show that the two cognitive biases occurred at an early stage of MCDM, i.e., during criteria weight elicitation.
Second, we calculated group-aggregated criteria weights for each group using two techniques, the aggregation of individual priorities (AIP) and the aggregation of individual judgments (AIJ). We found that, independently of the aggregation technique used, the ranking of criteria weights differed across groups. While in group A the top three criteria were ranked in the descending order C2 >> C1 >> C3, in group B the order changed to C3 >> C2 >> C1. Crucially, this rank reversal was caused by the two framed criteria, "Hospital beds" and "Doctors", which received more weight when framed as losses (group A) than when framed as gains (group B). These findings are in line with the predictions of Prospect theory, which postulates that framing effects occur because individuals evaluate information not in isolation, but relative to a reference point, which is often the status quo. Furthermore, they evaluate this information differently depending on whether it is perceived as a gain or a loss with respect to this reference point. Thus, in our experiment the participants of both groups evaluated the importance of the criteria relative to the subjective status quo, that is, the situation as it is. Group A was presented with the criterion "Hospital beds" framed in terms of losses ("Proportion of hospital beds already occupied by COVID-19 patients"). The negatively framed criterion suggested a loss of hospital beds, and this triggered loss aversion, that is, the reluctance to lose something one has in the present situation. By contrast, the positively framed version of this criterion ("Proportion of hospital beds still available for COVID-19 patients") did not signal any negative change in the current situation, and participants were thus less sensitive to it.
Therefore, although both versions of this criterion were logically equivalent (e.g., 55% of available beds is equivalent to 45% of occupied beds), the participants of this experiment did not evaluate them equally. This points to the crucial role that the formulation of criteria can play in the outcomes of weight elicitation.
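The asymmetry between gains and losses discussed above is often formalized with the value function of Prospect theory, which is concave for gains, convex for losses, and steeper for losses than for gains. The Python sketch below uses the functional form and the frequently cited parameter estimates from the Prospect theory literature (α = β = 0.88, λ = 2.25); these parameters are not estimated in this paper and are assumptions used purely for illustration, as is the function name.

```python
def prospect_value(outcome, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Subjective value of an outcome measured relative to the reference point (0)."""
    if outcome >= 0:
        return outcome ** alpha  # gains: concave
    return -loss_aversion * ((-outcome) ** beta)  # losses: convex and steeper


# A gain and a loss of the same objective size, e.g. 10 units around the status quo.
gain_value = prospect_value(10)
loss_value = prospect_value(-10)
print(f"gain: {gain_value:.2f}, loss: {loss_value:.2f}")
```

Under these assumed parameters, a loss weighs more than twice as heavily as a gain of the same size, which matches the direction of the effect observed for the negatively framed criteria.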
Finally, we used TOPSIS to obtain the ranking of the alternatives. The results revealed that here too, the framing and loss aversion biases induced during the previous stages of MCDM influenced the choice of the best alternative. We found that different alternatives were chosen because of the differences in criteria weights between the two participant groups: while in group A the best solution was Alternative 3, in group B the winner was Alternative 2. These results provide evidence that the effect of cognitive biases remains strong throughout the process of multi-criteria decision-making.

Implications for MCDM
The results of this paper show that the framing and loss aversion biases influenced the responses of decision makers during pairwise comparisons, which in turn caused the rank reversal of criteria weights across groups and resulted in the choice of a different best alternative. Note also that only two of the six criteria were framed differently across the two groups, and this was sufficient to change the final result of MCDM. This suggests that a conscious or unconscious framing of even a small proportion of criteria can strongly impact the outcomes of the MCDM procedure. In the former case, there is a risk that the MCDM process can be manipulated, as the Prospect theory provides accurate predictions as to how each frame affects the decision makers. That is, in order to force the selection of a particular alternative, the criterion on which this alternative scores best would first be identified. This criterion would then be framed so as to receive more weight from the decision makers. Thus, one could frame the criteria in such a way that a desired criteria ordering is achieved, which would raise the probability that the alternative in question is selected. This could undermine the whole process of MCDM and even be dangerous if wrong solutions are chosen in crucial decision-making processes, such as the choice of medical [50][51][52] or engineering [53] solutions. When framing and loss aversion are induced involuntarily, the results of the MCDM process could be compromised as well, as they would reflect the sensitivity of decision makers to cognitive biases instead of their true opinion about the decision problem in question. As noted by Hämäläinen et al. [6], cognitive biases could cause an erroneous interpretation of the MCDM results: a successful intervention could be falsely attributed to the MCDM method, while a failure of such an intervention could be attributed to factors other than the method itself.
Although this experiment was performed on non-experts and on a specific topic, evidence from previous studies in other fields suggests that the effect of cognitive biases could persist in different populations making decisions about different problems. Recall that cognitive biases, unlike individual differences and motivational biases, are much more universal, culturally independent and predictable [13,14]. A variety of studies in psychology and behavioral economics showed that the framing and loss aversion biases influence both novices and experts, even though experts might be less sensitive to these effects (for reviews, see [20,22]). Interestingly, the framing bias was found to affect the decisions of mathematically trained participants [54] and even of professionals in business and finance [3]. Thus, even fields that involve technical and statistical knowledge, which could be expected to promote "rationality", are not immune to framing and loss aversion biases. Similarly, these cognitive biases have been found to influence decision making in a variety of fields, such as management [55], medical science [9,56], finance [57,58], engineering [59,60], law [61] and others. It is therefore likely that MCDM problems close to any of these fields might suffer from the influence of cognitive biases. Indeed, MCDM methods have been widely used in a variety of technical environments related to engineering, industry and finance, for instance, in the selection of a suitable sewer network plan for a city [62], in the selection of the best waste lubricant oil regenerative technology [63], and in financial risk evaluation [64]. Furthermore, MCDM methods have also been extensively applied to policy, economic and societal issues, such as the economic development of government units [65], low-carbon energy technology policies [66] and economics [67].
All these examples point to the popularity of MCDM methods and to their importance in the decision-making process in major fields of our everyday life. It is therefore essential to guarantee the accuracy of these tools and carefully test MCDM methods for their sensitivity to cognitive biases.

Solutions
If the framing and loss aversion biases can have such negative effects on MCDM techniques, what solutions could reduce them? The first and simplest solution would be to ensure that the biases do not occur. As the occurrence of cognitive biases is predictable, these psychological effects can be prevented in a systematic way. Therefore, debiasing techniques should be carefully studied and applied whenever human judgment is involved in the process of MCDM. Several ways of reducing the framing and loss aversion biases have been proposed. For example, an experimental study [68] showed that the magnitude of the framing effect could be reduced or even eliminated if the participants are warned about the possibility of bias. The authors found that both weak and strong warning conditions were effective in reducing bias in participants who were highly involved in the task, while only strong warning messages helped the participants with low involvement. In a similar way, [10] suggest that the loss aversion bias can be reduced when the logic of the symmetry of gains and losses is explicitly shown to the participants. In addition, listing the advantages and disadvantages of each criterion prior to decision making could be an effective debiasing technique [50]. Further experimental studies should evaluate the efficiency of these and other debiasing methods in MCDM.
In addition to these debiasing techniques, another option for dealing with cognitive biases would be to use MCDM methods that could potentially be less subject to these psychological effects. Although most MCDM methods do not take into account the psychological states of decision makers during decision making [69], fuzzy AHP does make it possible to deal with vagueness and uncertainty [70]. This method is a development of the classical AHP in which decision makers can give vague or imprecise responses during pairwise comparisons, instead of the crisp or exact numerical values used in classical AHP [71]. For that purpose, fuzzy linguistic assessment variables are used in fuzzy AHP [72]. A number of studies [73][74][75] have provided evidence that fuzzy AHP is more effective than the traditional AHP method in handling the imprecise judgments of decision makers. Thus, the cognitive biases observed in our study might be reduced or circumvented when using the fuzzy version of the AHP. However, there is a risk that these cognitive biases would persist even with this method, as the framing and loss aversion biases were induced by the framing of the criteria, and not by the type of scale used. It would therefore be useful to test experimentally whether the use of the linguistic scale and fuzzy logic in AHP could act as a counterforce to these cognitive biases.
Finally, attempts have been made in recent years to reduce the effect of cognitive biases post hoc, i.e., in the MCDM method itself. One such example is the study by Phochanikorn & Tan [76]. The authors argue that once loss aversion is induced, it should be accounted for by the MCDM method. Thus, they define and calculate a loss aversion parameter (λ), based on the gain and loss value of each alternative. The results of the study show that the ranking of the alternatives changes as the loss aversion parameter increases. In a similar way, Deniz [77] proposed a way to correct for the expected loss aversion bias during criteria weight elicitation. After calculating criteria weights using the AHP method, the author calculated a debiased version of these weights by distributing the difference between the first and the second highest weight to the other criteria proportionally to the initial weights. This way of debiasing the criteria weights, however, seems somewhat arbitrary, as it presupposes that the two highest ranked criteria are always the biased ones. While this might sometimes be the case, in other cases the framed criteria could be less important and might thus induce a loss aversion bias at a lower level in the criteria ranking. Still, the idea of predicting the amount of induced loss aversion is noteworthy, as it could make it possible to at least partially reverse the negative effect of this cognitive bias. Several other recent studies used the theoretical background of Prospect theory to develop more accurate MCDM techniques [78][79][80]; however, none of them addressed the issue of cognitive biases.
In conclusion, our paper provides the first experimental evidence of the impact of framing and loss aversion biases on MCDM. We showed that these cognitive biases can strongly influence the responses of decision makers in the pairwise comparisons of criteria. This in turn caused rank reversal in criteria weights and resulted in the choice of different best alternatives. In other words, our results show that different framings can influence both the weights of criteria and the selection of the best alternative. As these effects are predictable by the Prospect theory, we call attention to the risk of conscious or unconscious result manipulation in MCDM. We highlight possible debiasing techniques that could reduce or eliminate framing and loss aversion biases. Further studies should test experimentally to what extent these techniques are effective and what the most appropriate ways of dealing with this problem are.