A Modified Analytic Hierarchy Process Suitable for Online Survey Preference Elicitation

Abstract: A key component of multi-criteria decision analysis is the estimation of criteria weights, reflecting the preference strength of different stakeholder groups related to different objectives. One common method is the Analytic Hierarchy Process (AHP). A key challenge with the AHP is the potential for inconsistency in responses, resulting in potentially unreliable preference weights. In small groups, interactions between analysts and respondents can compensate for this through reassessment of inconsistent responses. In many cases, however, stakeholders may be geographically dispersed, with online surveys being a more cost-effective means to elicit these preferences, making renegotiating with inconsistent respondents impossible. Further, the potentially large number of bivariate comparisons required using the AHP may adversely affect response rates. In this study, we test a new "modified" AHP (MAHP). The MAHP was designed to retain the key desirable features of the AHP but be more amenable to online surveys, reduce the problem of inconsistencies, and require substantially fewer comparisons. The MAHP is tested using three groups of university students through an online survey platform, along with a "traditional" AHP approach. The results indicate that the MAHP can provide statistically equivalent outcomes to the AHP but without problems arising due to inconsistencies.


Introduction
The estimation of objective or criteria importance weights is an integral component of most multi-criteria decision analysis studies [1]. A range of methods has been applied in the literature to assess objective weights, each with advantages and disadvantages [2–9]. Comparative studies of these methods suggest in some cases that the objective weights may vary considerably between methods [10], although others have found higher correlations between the results of the different methods [11].
The Analytic Hierarchy Process (AHP) [12,13] is one commonly applied approach for deriving such weights. The AHP has been used in a number of applications covering business (e.g., [14]), manufacturing (e.g., [15]), transport (e.g., [16]), healthcare (e.g., [17]), and a wide range of applied environmental and natural resource management studies [18–24]. It has become one of the most common approaches used for preference elicitation for multi-criteria decision analysis [25]. The method estimates the relative importance of each criterion being assessed to different stakeholders or decision-makers through a series of pairwise comparisons. The derived weights subsequently help determine which set of outcomes, given a range of different options, may be overall most preferable to different stakeholder groups and/or decision-makers based on their preference sets.
A key challenge facing the use of the AHP is the issue of inconsistencies in the responses by those whose preferences are being elicited. Preference weightings are highly subjective, and inconsistency between responses to different combinations of comparisons is a common problem facing the AHP, particularly when decision-makers are confronted with many sets of comparisons [26]. Respondents do not necessarily cross-check their responses, and even if they do, ensuring a perfectly consistent set of responses when many subcomponents are compared is difficult. The discrete nature of the 1-9 scale in the traditional AHP approach can also contribute to inconsistency, as a perfectly consistent response may require a fractional preference score [27]. Inconsistency can also arise through errors in entering judgments, lack of concentration, and inappropriate use of extremes [24,28,29].
The pairwise comparisons underpinning preference choice in the AHP are believed to make the process of assigning importance weights relatively easy for respondents, as only two sub-components are being compared at a time [1]. However, as more criteria are compared, the required number of pairwise comparisons increases quadratically, requiring $(n^2 - n)/2$ comparisons, where $n$ is the number of objectives or criteria to be compared. For example, comparing two criteria requires only one pairwise comparison, three criteria require three pairwise comparisons, four criteria require six pairwise comparisons, and so on. The propensity for respondents to be inconsistent in their responses also correspondingly increases when they are confronted with many sets of comparisons [26,30,31].
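To make the growth in respondent burden concrete, the count of pairwise comparisons can be sketched in a few lines of Python. The function name `pairwise_comparisons` is our own illustrative choice, and the larger values of n are arbitrary examples rather than figures taken from any particular study.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct pairs among n criteria, i.e. (n^2 - n) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Illustrative values of n only; small cases match the examples in the text.
for n in (2, 3, 4, 10, 22):
    print(f"{n} criteria -> {pairwise_comparisons(n)} pairwise comparisons")
```

The count grows with the square of n, so doubling the number of criteria roughly quadruples the number of questions a respondent must answer.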
Such inconsistency in the comparison matrix may result in incorrect priority weighting being associated with each of the attributes being considered [28]. Inconsistency is generally considered to affect the reliability of the resultant set of preference weights derived from the AHP, with reliability inversely related to the level of inconsistency [32–34].
The increased development of online survey platforms provides both opportunities and challenges for preference elicitation using the AHP. Foremost, the number of stakeholders that can be captured in the analysis can be greatly increased, providing greater information to decision-makers around the distribution of preferences across different stakeholder groups. The ease of implementing online surveys, the larger number of potential respondents they can reach, and the relatively low cost per response have made online surveys widely attractive for gaining preference information from the general community, as well as from special interest groups where individuals may be hard to identify a priori.
A key advantage of the use of online surveys is that it allows access to relevant stakeholders who may be geographically dispersed, spatially sparse, or otherwise hard to reach, even if not large in absolute numbers. For example, Thadsin, et al. [35] employed an online AHP survey to assess satisfaction with the working environment within a large real estate firm with offices spread across the UK.
The issue of inconsistencies and how they may be reduced, however, creates challenges for the use of online surveys to elicit preferences. The elicitation of preferences using the AHP has largely relied on interactions with stakeholders, either in small groups (e.g., workshops) or through the use of a survey instrument. In small group settings or where respondents are known, a common practice is to request respondents to re-consider inconsistent responses. This is less practical for online surveys. For example, the anonymity of respondents in online surveys makes following up with individual responses impossible. This lack of direct interaction with the respondents creates additional challenges for deriving preference weights through approaches such as the AHP.
The consequence of these factors is that high levels of inconsistency are relatively common in online AHP surveys. For example, Tozer and Stokes [36] and Hummel, et al. [37] respectively found 75% and 73% of respondents provided inconsistent responses. Excluding these observations substantially limits the benefits of the survey. Ideally, these inconsistencies would be identified, and respondents would be requested to reconsider their choices in the light of these identified inconsistencies. Some studies have built-in checks prompting respondents to reconsider their choices if inconsistency is observed [38,39]. Feedback from respondents about these prompts was largely skeptical, as some felt that the researchers were trying to get them to choose a particular "right" answer [38]. Other online AHP systems have only employed (n − 1) comparisons (e.g., A against B, B against C, but not A against C) to avoid the potential for inconsistencies (e.g., [31,40]). A wide variety of approaches has also been developed to ex post "adjust" the results to reduce inconsistencies and improve the reliability of the preference scores [41–49].
Given that a key factor contributing to these inconsistencies is the use of pairwise comparisons to elicit preferences, an alternative approach for preference data collection may be more beneficial for use in online surveys. Pascoe, et al. [50] developed an alternative approach to online data collection that is consistent with the AHP in terms of how the objectives are scored (i.e., 1-9 scale) and how the results are analyzed but does not rely on the use of pairwise comparisons. The method was developed to assess a large number (i.e., 22) of operational objectives identified by stakeholder representatives considering different fishery management options. The stakeholders to be consulted were geographically dispersed across over 3000 km of coastline, making an online survey the only practical option. The large number of objectives also made bivariate comparison approaches impractical, as the time required to complete the survey was expected to result in low completion rates, and the high cognitive burden was likely to increase inconsistencies in responses.
The approach, which we term the modified AHP (MAHP) for the purposes of this study, was developed and applied by Pascoe, et al. [50] without validation due to time constraints. The purpose of the current study is to (retrospectively) validate the approach and present the approach to a broader practitioner audience. Both the MAHP and "traditional" bivariate comparison-based AHP are compared through an experimental trial, and the results are compared at the individual respondent level. The experiment involved university students from three different classes. The scenario presented was a hypothetical dinner invitation, where they were asked to identify their relative preference for type of drink, main meal, and dessert from a selection of two, three, and four options, respectively. The students were asked to complete both the MAHP and the traditional AHP (i.e., bivariate comparison) format in each case. The results of the preferences derived using both methods were then compared.
The structure of the paper is as follows. The next section presents the methods applied and describes the survey undertaken. The following section presents the results for the case of two, three, and four alternatives, respectively, as well as an assessment of the usability of each approach from the perspective of the respondent. Finally, the advantages and disadvantages of the new approach are discussed, along with a (qualitative) comparison to other preference elicitation approaches.

AHP vs. MAHP
The traditional AHP approach [12] is based on a series of bivariate comparisons. Preferences are expressed on a nine-point scale (Figure 1), with 1 indicating equal preference and 9 indicating an extreme preference for one of the sub-components. Preferences are assumed to be symmetrical, such that if A against B has a preference of $a_{AB} = 9$, then $a_{BA} = 1/a_{AB} = 1/9$.
With the MAHP, we avoid some of these pitfalls by modifying the way in which the data are collected and analyzed, taking into account the symmetry assumption underlying the AHP. In the online survey, respondents were presented with a nine-point importance scale against which they could assess the importance of each objective. A nine-point scale was selected (rather than an "out of 10," as might occur with a Likert scale), as it allows five categories to be defined with mid-points between them, consistent with the traditional AHP approach. The options being assessed are presented together as a choice matrix, with the preference score of each being explicitly compared with the preference scores of the other options (Figure 2).
A separate value is derived for each objective row (e.g., $a_i$ = 1 to 9) given the importance level identified in Figure 2, and the relative score between any two objectives, $a_{i,j}$, is derived from the difference between them, such that:
$$a_{i,j} = \begin{cases} a_i - a_j + 1 & \text{if } a_i \ge a_j \\ \dfrac{1}{a_j - a_i + 1} & \text{if } a_i < a_j \end{cases} \qquad (1)$$
As with the traditional AHP, preferences are assumed to be symmetrical. The use of the difference rather than the nominal value is to provide a relative preference similar to what is derived from the traditional AHP bivariate comparison. For example, a score of 9 for A and a score of 7 for B is equivalent to an element score of $a_{AB} = 3$ and $a_{BA} = 1/3$ in the traditional AHP bivariate comparison. If both are equal (irrespective of the value assigned to both), then the equivalent element score is $a_{AB} = a_{BA} = 1$. The use of the difference also reduces the influence of the initial anchor score [51,52]. For example, a pair of scores (9, 7) and (5, 3) will provide the same element score.
Common to both approaches is a comparison matrix of scores (A), given by:
$$A = \begin{bmatrix} 1 & a_{1,2} & \cdots & a_{1,n} \\ 1/a_{1,2} & 1 & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ 1/a_{1,n} & 1/a_{2,n} & \cdots & 1 \end{bmatrix}$$
In the case of the traditional AHP, the values of $a_{i,j}$ are determined directly from the bivariate comparison in Figure 1. For the MAHP, the values of $a_{i,j}$ are derived from Equation (1) and based on the responses to the comparisons illustrated in Figure 2.
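The construction of the MAHP comparison matrix from a respondent's ratings can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the difference rule implied by the worked examples in the text (scores of 9 and 7 give an element of 3, equal scores give 1, and the reverse comparison is the reciprocal), and the function names `mahp_element` and `mahp_matrix` are hypothetical.

```python
from fractions import Fraction


def mahp_element(score_i: int, score_j: int) -> Fraction:
    """Relative score a_ij from two 1-9 ratings (assumed difference rule)."""
    for s in (score_i, score_j):
        if not 1 <= s <= 9:
            raise ValueError("ratings are expected on the 1-9 scale")
    diff = score_i - score_j
    if diff >= 0:
        return Fraction(diff + 1)      # e.g. 9 vs 7 gives 3; equal scores give 1
    return Fraction(1, -diff + 1)      # reciprocal case, e.g. 7 vs 9 gives 1/3


def mahp_matrix(scores):
    """Full reciprocal comparison matrix built from one rating per option."""
    return [[mahp_element(si, sj) for sj in scores] for si in scores]


# Hypothetical ratings for three options, for illustration only.
for row in mahp_matrix([9, 7, 4]):
    print([str(a) for a in row])
```

Exact fractions are used so that the reciprocal property can be checked without rounding error; a respondent supplies only n ratings, and all n(n − 1)/2 pairwise elements follow from them.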
Two general approaches used for determining the weights given the comparison matrix are the eigenvalue method (EM) developed by Saaty [12] and the geometric mean method (GMM) developed by Crawford and Williams [53]. While the former approach has been employed in a wider range of studies, the latter has been found to be less susceptible to the influence of extreme preferences, as well as having better performance around other aspects of theoretical consistency (e.g., less susceptible to rank reversibility if the preference set changes and greater transitivity properties) [54].
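For this study, we used the Geometric Mean Method (GMM) [53,55] to derive the weights for each attribute ($\omega_i$), given by:
$$\omega_i = \frac{\left(\prod_{j=1}^{n} a_{i,j}\right)^{1/n}}{\sum_{k=1}^{n} \left(\prod_{j=1}^{n} a_{k,j}\right)^{1/n}}$$
where $n$ is the number of attributes being compared, and $\prod_{j=1}^{n} a_{i,j} = a_{i,1} \times a_{i,2} \times \ldots \times a_{i,n}$ (i.e., the product of the elements of row $i$). This approach is used for both the AHP and the MAHP analysis.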
The level of inconsistency in the results of both approaches is measured by the Geometric Consistency Index (GCI), given by:
$$GCI = \frac{2}{(n-1)(n-2)} \sum_{i<j} \left[\log a_{i,j} - \log\left(\frac{\omega_i}{\omega_j}\right)\right]^2$$
where $n$ is the number of attributes being compared within a level of the hierarchy [56]. This is compared to a randomly generated value for an n × n matrix to derive a consistency ratio (CR). Values of CR ≤ 0.1 are generally considered acceptable [56]. Given this, the critical values for the GCI giving a CR = 0.1 are GCI = 0.315 for n = 3, GCI = 0.353 when n = 4, and GCI = 0.370 for n > 4 [56].
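The weight and consistency calculations can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the code used in the study: it assumes normalized row geometric means for the GMM weights and the usual GCI form based on squared natural-log residuals, and the names `gmm_weights`, `gci`, and `gci_threshold` are hypothetical. The thresholds are the critical values quoted in the text.

```python
import math


def gmm_weights(A):
    """Normalized row geometric means of a reciprocal comparison matrix."""
    n = len(A)
    gms = [math.prod(row) ** (1.0 / n) for row in A]
    total = sum(gms)
    return [g / total for g in gms]


def gci(A):
    """Geometric Consistency Index (assumed natural-log form), for n >= 3."""
    n = len(A)
    if n < 3:
        raise ValueError("GCI is only defined for three or more alternatives")
    w = gmm_weights(A)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            residual = math.log(A[i][j]) - math.log(w[i] / w[j])
            total += residual ** 2
    return 2.0 * total / ((n - 1) * (n - 2))


def gci_threshold(n):
    """Critical GCI values corresponding to CR = 0.1, as quoted in the text."""
    return {3: 0.315, 4: 0.353}.get(n, 0.370)


# Illustrative matrices only: one perfectly consistent, one circular.
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
circular = [[1, 9, 1 / 9], [1 / 9, 1, 9], [9, 1 / 9, 1]]
for name, A in (("consistent", consistent), ("circular", circular)):
    print(name, [round(x, 3) for x in gmm_weights(A)], gci(A) <= gci_threshold(3))
```

With only two alternatives the index is undefined (the denominator is zero), which corresponds to the observation that a single pairwise comparison cannot be inconsistent.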

Survey Design and Implementation
An online survey was developed based on a hypothetical scenario. The respondents were invited to a (hypothetical) dinner, and the host wished to accommodate dietary and taste preferences for their guests. As they could not guarantee all options would be available, the strength of preference for each was to be estimated. The development of the hypothetical scenario was to provide respondents with a set of alternatives for which they most likely had an established preference (e.g., meal type), and hence would be able to indicate these preferences using the two approaches to the best of their abilities.
The dinner was to consist of two courses, accompanied by a choice of sparkling or still water. The preference for water type formed the two-alternative scenario. The main course consisted of a meat, vegetarian, and fish option (three alternatives), while dessert consisted of four options.
Students from the University of Wollongong were recruited from three classes to participate in the survey after class (with the promise of real cake as a reward for their participation). The students were asked to express their preferences using both the set of bivariate comparisons (i.e., the AHP, as in Figure 1) and the set of options presented as one table (i.e., the MAHP, as in Figure 2). The students were also asked to indicate which approach they found easier (again, using both the AHP and the MAHP). Information on the demographic characteristics of the respondents was also collected.
The survey was approved by both the CSIRO and the University of Wollongong Social Science Human Research Ethics Committees (approval number 216/23 and 2023/272, respectively).

Demographics and Response Rates
A total of 31 students initially agreed to participate in the online survey, although only 25 completed the survey (the other six entered the survey, gave their consent, ate the cake, but did not record any other response). Of those who completed the survey, 11 were male and 14 were female. Ages ranged from 18 to 50, with a mean age of 29.
Respondents were also asked about dietary requirements, as these could have influenced the responses. The survey was designed to ensure that there was a vegetarian option at each meal, with the "main meal" also including a fish option. One respondent was vegetarian, three were pescatarian, and the remainder had no specific food-type preference.

Two Alternatives: Sparkling or Still Water?
The first set of comparisons involved two alternatives: sparkling or still water. The distributions of responses were similar (Figure 3), with a clear preference for still water over sparkling using both the MAHP and the traditional pairwise approach. The mean values were also similar, and the preferences were not statistically significantly different based on a paired t-test of responses (Table 1). The preference ranking differed between elicitation methods for only two individuals (i.e., the derived weight was higher than 0.5 for an alternative using one elicitation approach, but less than 0.5 using the other). For the other respondents, the ranking of the options was consistent across both approaches.
As there were only two options to compare (requiring just one bivariate comparison in the case of the AHP), inconsistency in choices was not possible.
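The respondent-level comparison between the two elicitation methods rests on a paired t-test of the derived weights. A minimal sketch of the test statistic is given below, using only the Python standard library; the function name `paired_t` and the weights in the example are hypothetical and are not data from the survey.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)                      # sample standard deviation of differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical weights for one alternative from five respondents (not survey data).
mahp_weights = [0.75, 0.50, 0.80, 0.67, 0.25]
ahp_weights = [0.80, 0.50, 0.75, 0.90, 0.20]
t_stat, dof = paired_t(mahp_weights, ahp_weights)
print(f"t = {t_stat:.3f}, DF = {dof}")
```

The statistic would then be compared with the critical value of the t distribution for the given degrees of freedom; only respondents who answered under both methods can contribute a pair.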

Three Alternatives: Beef, Fish, or Vegetable Lasagna?
With three alternatives, preferences for each were more variably distributed. All individuals generally exhibited a preference for one of the options over the other two, although which alternative was preferred varied by individual, as might be expected given different taste preferences (Figure 4).
While all completed the MAHP comparison, only 18 of the 25 respondents completed the AHP pairwise comparison. Of these, half were found to be inconsistent (Figure 5). The MAHP results were also tested for inconsistencies, which may arise as an artifact of the discrete nature of the preference options. However, as expected, the GCI was zero for most of the observations and well below the critical level (0.315) for all (Figure 5).
As above, the preference scores for each individual derived using both approaches were compared using a paired t-test. The scores were tested using all data, and separately for individuals who were consistent in the AHP comparisons and those who were inconsistent in the AHP comparisons (Table 2). In all cases, the scores for each meal were not statistically significantly different between the two approaches, reflecting the wide variation in preferences in each of the options.
Table 2. Paired t-tests between the derived weights for the three-alternative question from each approach. The t-test was undertaken for all observations, a subset of only those that were consistent in the traditional AHP, and the remainder of observations that were inconsistent in the traditional AHP.
The rankings of the derived preference weights using each approach were also compared (Table 3). In most cases, when considering all observations, the ranks of the alternatives were the same using both approaches, but for several individuals the ranks changed substantially between the two preference elicitation methods. For example, the derived preferences for five individuals resulted in one option being ranked first using the MAHP but third using the AHP (Table 3). Less extreme rank changes were observed when considering only those observations that were consistent (as defined by the GCI), although the number of rank changes was still a relatively high proportion, especially for the first two ranked options.
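The rank comparison can be illustrated with a small cross-tabulation of (MAHP rank, AHP rank) pairs. The sketch below is one plausible way to tally such a table and is not taken from the study; the function names `ranks` and `rank_crosstab` and the example weights are hypothetical, and ties are broken by option order as a simplifying assumption.

```python
from collections import Counter


def ranks(weights):
    """Rank options from 1 (largest weight) downwards; ties keep option order."""
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    result = [0] * len(weights)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank_crosstab(mahp_weights, ahp_weights):
    """Count (MAHP rank, AHP rank) pairs over all respondents and options."""
    table = Counter()
    for w_mahp, w_ahp in zip(mahp_weights, ahp_weights):
        for pair in zip(ranks(w_mahp), ranks(w_ahp)):
            table[pair] += 1
    return table


# Hypothetical weights for two respondents and three options (not survey data).
mahp = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
ahp = [[0.1, 0.3, 0.6], [0.2, 0.5, 0.3]]
for pair, count in sorted(rank_crosstab(mahp, ahp).items()):
    print(pair, count)
```

Pairs on the diagonal, such as (1, 1), indicate agreement between the two methods, while a pair such as (1, 3) corresponds to an option ranked first under one method and third under the other.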

Four Alternatives: Dessert
The four-alternative comparison again resulted in similar distributions using both approaches, with similar mean preference scores for each (Figure 6). However, as might be expected given the number of pairwise comparisons for the AHP, nearly all individuals provided inconsistent responses (Figure 7). In contrast, none of the responses from the MAHP had GCI values greater than the critical value (i.e., 0.353). The AHP responses at the individual level were also statistically significantly different from the MAHP responses for three of the four alternatives considered (Table 4). As nearly all respondents provided inconsistent responses to the AHP comparisons, there were too few consistent responses to test the differences in the distributions by consistency level.
As with the three-alternative example, the rankings of the derived preference weights using each approach for the four alternatives were also compared (Table 5). In most cases, the ranks of the alternatives were the same using both approaches, but in some cases the ranks changed substantially. For example, one respondent ranked an option first using the MAHP but fourth using the AHP pairwise comparison. These cases were limited, and in most cases where the ranks changed, these did so by only one level. As nearly all the observations were inconsistent (based on the GCI) for the AHP results, a consistent subset could not be considered separately (as in Table 3).
Table 5. Ranks of the four-alternative responses using each approach. The numbers in the table represent the number of observations where an alternative had the same relative rank (e.g., (1, 1), (2, 2), etc.) or different ranks (e.g., (1, 2), (1, 3), etc.) based on the derived preference weights.
Figure caption (partially recovered): The brown distributions represent the preferences for fruit estimated using both methods (AHP and MAHP, respectively), the red distributions represent the preferences for […] cream estimated using both methods, the green distributions represent the preferences for […] estimated using both methods, and the yellow distributions represent the preferences for p[…] estimated using both methods.

Respondents' Preferences for Each Method (a User Perspective)
At the end of the survey, respondents were also asked to indicate which approach they thought was easier. This was asked using both approaches. There was no apparent strong preference for one approach or another (Figure 8), with preferences distributed across the range for both. The first pair of distributions in Figure 8 (i.e., the blue distributions) represent the preference for the MAHP approach ("Modified") derived from the pairwise (AHP) and grouped (MAHP) comparisons, respectively, while the second pair (i.e., the green distributions) represent the preference for the traditional AHP ("Pairwise"), again derived from the pairwise (AHP) and grouped (MAHP) comparisons, respectively. As there were only two alternatives, there are no inconsistencies, and the distributions are reciprocal.
Figure 8. Perceptions of ease of use of each approach (derived using each approach). The black dot represents the mean of the distribution. The first part of the label represents the alternative considered, while the second part of the label represents the approach used to elicit the preference. For example, "Modified.AHP" represents the user preference for the use of the MAHP approach but derived using the traditional AHP. Conversely, "Pairwise.MAHP" represents the user preference for the use of the traditional AHP approach but derived using the MAHP. The light blue distributions represent the preferences for the use of MAHP estimated using both methods (AHP and MAHP, respectively), and the green distributions represent the preferences for the pairwise AHP estimated using both methods.
The paired t-test suggested that there was no statistically significant difference between the preference scores of the individuals based on the method used (Table 6). However, comparing the samples as a whole, the mean of the preference for the MAHP approach was statistically significantly different (and slightly higher) than the mean of the preference for the pairwise (AHP) approach when using the MAHP (t = 2.146, DF = 48), but not statistically significantly different when using the pairwise AHP approach (t = −0.1345, DF = 36).
Table 6. Paired t-tests between the derived weights from each approach. In this case, the ease of use of each of the alternative methods was considered.
When comparing preference ranks, the preference ranking changed for 11 of the 18 individuals across the two methods (i.e., the derived weight was equal to or higher than 0.5 for one of the options using one approach, but less than 0.5 using the other). We define "rank consistency" in Figure 9 as observations that were consistently ranked first or second using both approaches (compared with the GCI measure of inconsistency). While potentially spurious, the respondents who were most consistent in terms of rank of the two alternatives were those who also had the greatest preference for the MAHP approach.
In all the cases above, the paired t-test and ranking comparisons were only applied to respondents who provided a response to both the AHP and the MAHP version of the questions in the survey. Several respondents systematically did not complete the pairwise (AHP) comparisons but completed the grouped (MAHP) comparisons (Figure 10), possibly suggesting that these individuals did not feel confident in completing the pairwise comparisons.
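The restriction to respondents who answered both versions can be illustrated with a minimal Python sketch. The response values, the field names, and the helper `paired_t` are hypothetical and are not the study's data or code; the sketch only shows how incomplete pairs drop out before the paired statistic is computed.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom (n - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical derived weights for one option; None marks a skipped AHP question.
responses = [
    {"mahp": 0.6, "ahp": 0.5},
    {"mahp": 0.7, "ahp": None},   # completed MAHP only, so excluded below
    {"mahp": 0.5, "ahp": 0.3},
    {"mahp": 0.8, "ahp": 0.5},
]

# Keep only respondents with both an AHP and an MAHP response.
complete = [r for r in responses if r["mahp"] is not None and r["ahp"] is not None]

t_stat, df = paired_t([r["mahp"] for r in complete],
                      [r["ahp"] for r in complete])
print(f"n = {len(complete)}, t = {t_stat:.3f}, DF = {df}")
```

Because the paired test discards anyone without both responses, its sample can be smaller than the sample available for either method alone.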

Advantages of MAHP over Traditional AHP for Online Surveys
The increased availability of online survey providers and the decreasing costs of data collection through such providers offer opportunities for analysts to increase the range of stakeholders surveyed, and their geographic distribution, to support decision-making. The loss of direct interaction with survey respondents, however, also comes at a potential cost of increased inconsistency in the results (and with that, potentially biased weightings), as the ability to feed back to individuals is limited. This has also sparked an increase in the number of methods that attempt to retrospectively adjust AHP scores to reduce the influence of inconsistencies (e.g., [8,57,58]).
As noted in the introduction, inconsistencies can arise for a number of reasons, including as an artefact of the discrete nature of the responses when a fractional preference score may be required to totally eliminate inconsistencies [27,59]. The MAHP is still susceptible to this, although its effect on the results was relatively minor. For the four-alternative example (i.e., dessert), seven of the 25 MAHP responses had a GCI = 0, while a further seven had a GCI < 0.05. The largest inconsistency observed was GCI = 0.24, well below the threshold value of 0.353 for four variables. In contrast, only one of the 18 respondents who completed the bivariate AHP comparisons had a GCI = 0, with all others having a GCI greater than the critical value (i.e., GCI > 0.353), with a maximum value of GCI = 1.7. The substantial difference in the level of inconsistency between the two approaches suggests that the discrete nature of the responses has only a relatively small impact on inconsistency, with the other causes of inconsistency identified in the literature having a substantially greater impact.
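To illustrate how GCI values of this kind arise, the following minimal Python sketch computes the index for two hypothetical four-alternative comparison matrices. It assumes the standard geometric-mean form of the index, GCI = 2/((n − 1)(n − 2)) Σ_{i<j} (ln a_ij − ln(w_i/w_j))², which this section does not restate. The matrices and function names are illustrative only and are not taken from the survey data.

```python
import math

GCI_THRESHOLD_N4 = 0.353  # threshold quoted in the text for four alternatives

def row_geometric_means(matrix):
    """Row geometric means; proportional to the derived priority weights."""
    n = len(matrix)
    return [math.prod(row) ** (1 / n) for row in matrix]

def gci(matrix):
    """Geometric consistency index of a reciprocal comparison matrix."""
    n = len(matrix)
    if n < 3:
        return 0.0  # one or two alternatives cannot be inconsistent
    g = row_geometric_means(matrix)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            error = math.log(matrix[i][j]) - (math.log(g[i]) - math.log(g[j]))
            total += error ** 2
    return 2 * total / ((n - 1) * (n - 2))

# Perfectly consistent matrix built from hypothetical weights (a_ij = w_i / w_j).
w = [0.4, 0.3, 0.2, 0.1]
consistent = [[wi / wj for wj in w] for wi in w]

# Hypothetical respondent who answers 9 for every pair (i preferred to j for i < j).
capped = [[1 if i == j else (9 if i < j else 1 / 9) for j in range(4)]
          for i in range(4)]

print(gci(consistent) < GCI_THRESHOLD_N4)  # consistent response
print(gci(capped) > GCI_THRESHOLD_N4)      # inconsistent response
```

The second matrix is inconsistent because a chain of "extreme" preferences implies ratios well beyond the scale maximum of 9, which the bounded scale cannot express.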
In our study, differences between the MAHP- and the AHP-derived scores increased with increasing complexity. With three or fewer options to compare, both approaches produced statistically similar results (i.e., not statistically different). With four options, the proportion of inconsistent responses increased substantially for the AHP but not for the MAHP. The derived weights were also statistically significantly different for three of the four dessert options, reflecting the effects of inconsistencies on the derived preference weights.
The preference weight rankings were also found to change in some cases with the approach used. With the MAHP, the relative rankings of alternatives play an implicit but key role in the elicitation process, as all alternatives are compared at the same time. As a result, the relative rankings of the alternatives are obvious to the respondent and will influence how they score each alternative. In contrast, the traditional pairwise AHP compares only two alternatives at a time, and the implicit ranking across all alternatives (if not the preference weights) is obscured. With only one direct comparison in the "water" scenario, both approaches produced similar ranks (with only two exceptions). In contrast, with the four-alternative scenario, the ranking of each option was more varied between the two elicitation approaches. This does not explain, however, the contradictory outcomes in the final scenario where the two approaches themselves were compared for ease of use. In this case, both may have been influenced by some of the many other drivers of inconsistency in responses, particularly as the cake was waiting and respondents may have rushed the last question to ensure they got their fair share (a consequence of the tragedy of the commons).
The ease with which the respondents can complete the survey task is an indicator of the acceptability of the weighting approach (e.g., [47,60,61]). In our study, respondents were all able (and willing) to complete the comparisons when presented as a group (MAHP), but several were not willing (or able) to complete the pairwise (AHP) comparisons. This represents a revealed preference for the MAHP comparison approach over the pairwise comparison. The ease-of-use scores elicited using the MAHP also suggested a significant preference for that approach, although no statistical difference was apparent when the AHP was used to assess both approaches, at least not within the subset of those who completed the AHP comparison.

Advantages of MAHP over Other Weighting Methods
A range of methods other than the traditional AHP has been applied in the literature to assess preference weights. While these were not explicitly tested in this study, some general advantages of the MAHP can be inferred.
A commonly applied approach is to use ranking- or rating-based approaches [2][3][4][5][6][7]. These largely involve scoring each alternative, usually from 0 to 100, with variants of the approach either awarding the most preferred a value of 100 and then scoring the other options relative to this, or awarding the least preferred a value of 10 and again scoring the other options relative to this [3]. In other approaches, points are allocated across alternatives such that they sum to a given value (e.g., 100) [4]. These approaches have the advantage that they avoid issues around inconsistency. However, comparative studies of these methods suggest in some cases that the objective weights may vary considerably between methods [7,10], although others have found higher correlations between the results of the different methods [11].
The MAHP approach has some benefits over these other rating-based approaches in that the resultant weights are not sensitive to the highest scores given. For example, a score of A = 9, B = 7 yields a difference of AB = 2, the same as for A = 5 and B = 3, and hence the same derived weight. In contrast, a rating-based approach would result in closer weights between the options in the first case (9,7) than in the second case (5,3).
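The contrast between difference-based and rating-based weights can be sketched in a few lines of Python. The mapping from a score difference to a pairwise comparison value is not spelled out in this section, so the function `pairwise_from_scores` below is an assumption made purely for illustration (a difference d ≥ 0 maps to d + 1, so equal scores give 1, and negative differences give the reciprocal). Any mapping that depends only on the difference would show the same invariance. Weights are derived with the row geometric mean, which is also an assumption here.

```python
import math

def pairwise_from_scores(s_i, s_j):
    """Assumed (hypothetical) mapping from a score difference to a comparison value."""
    d = s_i - s_j
    return d + 1 if d >= 0 else 1 / (1 - d)

def difference_based_weights(scores):
    """Weights from the row geometric means of the difference-derived matrix."""
    n = len(scores)
    means = [
        math.prod(pairwise_from_scores(si, sj) for sj in scores) ** (1 / n)
        for si in scores
    ]
    total = sum(means)
    return [m / total for m in means]

def rating_weights(scores):
    """Simple rating-based weights: each score divided by the sum of scores."""
    total = sum(scores)
    return [s / total for s in scores]

# Same difference (2) at different levels of the scale.
diff_high = difference_based_weights([9, 7])
diff_low = difference_based_weights([5, 3])

# Rating-based weights for the same two responses.
rate_high = rating_weights([9, 7])   # [0.5625, 0.4375]
rate_low = rating_weights([5, 3])    # [0.625, 0.375]

# A respondent who scores every option 9 ends up with equal weights.
all_extreme = difference_based_weights([9, 9, 9])
```

Under these assumptions the difference-based weights are identical for (9,7) and (5,3), whereas the rating-based weights are closer together for (9,7) than for (5,3).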
Other approaches have also been developed to reduce the number of comparisons required using the AHP. For example, Abastante et al. [62] developed a parsimonious AHP method that involved first rating each criterion (e.g., 0-100) and then using bivariate comparisons on a subset of alternatives using a set of reference criteria defined by the initial ratings, with linear interpolation used to assess the weights for the excluded criteria. Duleba [63] also applied the approach to a transport development decision problem and validated the method against a traditional AHP approach, and Jamal Mahdi and Esztergár-Kiss [64] applied it to identify the importance of different criteria to the choice of tourism destinations. While the parsimonious AHP method helps reduce the number of pairwise comparisons required, it requires interaction with the decision-makers to first assess the initial rankings in order to determine the appropriate sets of pairwise comparisons. It is therefore less amenable to online surveys, where direct interactions are infeasible and the set of respondents may be large.
Best-worst analysis has also been used in online surveys, largely to rank alternatives or attributes (e.g., [65]). These generally require counts of responses over the survey sample and are less amenable to eliciting individual preferences. Rezaei [66,67] and Liu et al. [68] developed variants of a best-worst multicriteria decision-making method where decision-makers first identified their best and worst alternatives, and pairwise comparisons were then developed based on comparing other alternatives to these best-worst alternatives. The decision-makers were then resurveyed to elicit their individual preferences. This additional step increases the complexity of the approach relative to the MAHP, which can be undertaken in a single step. Other best-worst scaling approaches have been developed in a form analogous to choice experiments for use in online surveys. While these have been successfully applied to elicit preferences (e.g., [69][70][71]), they are often limited to relatively few alternatives and require large samples to develop robust weightings.

Other Advantages of MAHP
The MAHP also has some additional advantages over the pairwise AHP approach. For example, the MAHP limits the potential range of the option scores to a maximum difference of 9. With the AHP, if AB = 9 and BC = 9, it would imply that AC should be substantially greater than 9, although the AHP is constrained to a maximum of AC = 9 (contributing to inconsistency in the results). In the MAHP, if respondents select extreme values (e.g., all "extremely important" with a value of 9) for all options, the outcome is an equal weight for each. This is also the case if all are considered less important.
A further advantage of the MAHP is that it is similar in appearance to a standard Likert scale, which is commonly applied in social science research [72]. As a result, its appearance will be familiar to many potential survey respondents. Kusmaryono et al. [73] found that 90% of studies using a Likert scale employed odd choice category values (e.g., 5, 7, 9) and that the use of a rating scale with an odd number greater than five was the most effective in terms of reliability and validity coefficients. While Likert scales are often considered ordinal, Wu and Leung [74] found that those with more than seven points were highly correlated with cardinal scales and hence could be considered appropriate for arithmetic operations. The use of a nine-point scale in the MAHP is thus consistent with best practice. This may also explain the higher response rate in the survey to the MAHP question than to the traditional AHP bivariate comparison, as the question format was more familiar to respondents.
While not considered in this study (which was aimed at comparing the "traditional" and modified approach to the initial comparisons), the MAHP approach allows for the potential for irrelevant alternatives to be effectively removed from the comparison set. This can be applied through the addition of an NA or zero option (i.e., not at all important), signaling that this option is irrelevant and can be removed from the set of comparisons (and assigned a weight of zero). In contrast, the traditional AHP and other systems produce a non-zero weight for all options in the set considered. Related to this, addition or subtraction of an alternative using the traditional AHP may result in a change in the relative weightings of the previous alternatives, and in some cases reverse the ranking of these alternatives [75]. In the MAHP, adding or removing an alternative does not affect the relative relationship of those retained and hence is not susceptible to rank reversal as a result of changes in the choice set.
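A minimal Python sketch of the NA option is given below. It is not the study's implementation: the score-difference mapping (`pairwise_from_scores`), the use of row geometric means, and the example scores are all assumptions made for illustration. An option marked `None` (NA) is dropped from the comparison set and assigned a weight of zero, and the ordering of the retained options is then compared with the ordering obtained from the full set.

```python
import math

def pairwise_from_scores(s_i, s_j):
    """Assumed (hypothetical) mapping from a score difference to a comparison value."""
    d = s_i - s_j
    return d + 1 if d >= 0 else 1 / (1 - d)

def weights_with_na(scores):
    """Derive weights from scores; options scored None (NA) get a weight of zero."""
    active = {k: v for k, v in scores.items() if v is not None}
    n = len(active)
    means = {
        k: math.prod(pairwise_from_scores(v, u) for u in active.values()) ** (1 / n)
        for k, v in active.items()
    }
    total = sum(means.values())
    weights = {k: 0.0 for k in scores}
    weights.update({k: m / total for k, m in means.items()})
    return weights

def ranking(weights, names):
    """Order the named options from highest to lowest weight."""
    return sorted(names, key=lambda k: weights[k], reverse=True)

# Hypothetical respondent: all four options scored, then option C marked NA.
full = weights_with_na({"A": 9, "B": 7, "C": 4, "D": 2})
reduced = weights_with_na({"A": 9, "B": 7, "C": None, "D": 2})

retained = ["A", "B", "D"]
print(ranking(full, retained), ranking(reduced, retained))
```

In this sketch the order of the retained options is unchanged because each derived comparison depends only on the two scores involved, so a higher-scored option always has the larger row geometric mean regardless of which other options are present.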

Limitations of MAHP
The MAHP was developed for ease of application as an online survey, as the purpose for which it was originally developed involved surveying a wide range of interest groups that were geographically dispersed over a large region, making face-to-face interviews impractical [50]. The use of online surveys in general for preference elicitation, while becoming more common, still has a range of issues. Foremost among these is the issue of self-selection, as it is likely that those with strong views are more likely to respond. This issue is common to all online surveys irrespective of the method employed to elicit preferences.

Conclusions
The aim of this study was to formally present an alternative approach to preference elicitation that may be beneficial particularly when using online surveys.
The approach derives from the traditional AHP, and hence we have termed it the modified AHP (MAHP). Testing the two approaches through an experimental trial generally revealed that the MAHP produced statistically similar results to the AHP for relatively simple comparisons (i.e., two or three options) but potentially performed better (in terms of fewer inconsistent results) when a higher number of options was compared. Ease of use, a key perceived benefit of the AHP, was also considered similar between the two approaches by those who responded to this question (with the MAHP having a significantly higher preference in one of the comparisons). Higher non-response to the AHP questions also suggested a revealed preference for the MAHP.
The approach is not aimed at replacing other existing alternative approaches (such as best-worst analysis or the parsimonious AHP) but provides an additional tool in the decision analyst's toolbox. The approach is conceptually simple and provides weight estimates consistent with (and in many cases more robust than) traditional AHP methods.

Figure 1. Example of a bivariate comparison used in the traditional AHP. A separate bivariate comparison is required for each set of alternatives.

Figure 2. Example of bivariate comparison in the MAHP. Additional alternatives are included as additional rows, and all alternatives are compared in the same question.

Figure 3. Two-alternative comparison preference distributions. The black dot represents the mean of the distribution. The light blue distributions represent the preferences for sparkling water estimated using both methods (AHP and MAHP, respectively), and the green distributions represent the preferences for still water estimated using both methods.

Figure 4. Three-alternative comparison preference distributions. The black dot represents the mean of the distribution. The red distributions represent the preferences for beef estimated using both methods (AHP and MAHP, respectively), the pink distributions represent the preferences for fish estimated using both methods, and the yellow distributions represent the preferences for lasagna estimated using both methods.

Figure 5. Consistency indicator of responses for the three-alternative comparison: (a) MAHP and (b) AHP. Blue bars indicate consistent responses; red bars indicate inconsistent responses. The black vertical line indicates the GCI where CR = 0.1.

Figure 6. Four-alternative comparison preference distributions. The black dot represents the mean of the distribution. The brown distributions represent the preferences for fruit estimated using both methods (AHP and MAHP, respectively), the red distributions represent the preferences for ice cream estimated using both methods, the green distributions represent the preferences for jelly estimated using both methods, and the yellow distributions represent the preferences for pie estimated using both methods.

Figure 7. Consistency indicator of responses for the four-alternative comparison: (a) MAHP and (b) AHP. Blue bars indicate consistent responses; red bars indicate inconsistent responses. The black vertical line indicates the GCI where the CR = 0.1.

Figure 8. Perceptions of ease of use of each approach (derived using each approach). The black dot represents the mean of the distribution. The first part of the label represents the alternative considered, while the second part of the label represents the approach used to elicit the preference. For example, "Modified.AHP" represents the user preference for the use of the MAHP approach but derived using the traditional AHP. Conversely, "Pairwise.MAHP" represents the user preference for the use of the traditional AHP approach but derived using the MAHP. The light blue distributions represent the preferences for the use of MAHP estimated using both methods (AHP and MAHP, respectively), and the green distributions represent the preferences for the pairwise AHP estimated using both methods.

Figure 9. Rank consistency when the scores are estimated using both approaches. The blue bars represent the preference weighting given to the MAHP by respondents who were rank-consistent in their relative ranking of each approach when estimated using each approach. The red bars represent the preference weighting given to the MAHP by respondents who were rank-inconsistent in their relative rankings across the two approaches. Those who were consistent in their responses tended to favor the MAHP (based on their preference weighting).

Figure 10. The number of responses to each question. The red indicates a non-response, which occurred only for bivariate (AHP)-related questions. All MAHP questions were answered by all respondents.

Table 1. Paired t-tests between the derived weights for the two-alternative question from each approach.

Table 2. Paired t-tests between the derived weights for the three-alternative question from each approach. The t-test was undertaken for all observations, a subset of only those that were consistent in the traditional AHP, and the remainder of observations that were inconsistent in the traditional AHP.

Table 4. Paired t-tests between the derived weights for the four-alternative question from each approach.