A Fuzzy Delphi Consensus Methodology Based on a Fuzzy Ranking

Abstract: The Delphi multi-round survey is a procedure that has been widely and successfully used to aggregate experts' opinions about some previously established statements or questions. Such opinions are usually expressed as real numbers together with some comments. The evolution of the consensus can be shown by an increase in the agreement percentages and a decrease in the number of comments made. A consensus is reached when this percentage exceeds a previously set threshold. If this threshold has not been reached, the moderator modifies the questionnaire according to the comments he/she has collected, and the following round begins. In this paper, a new fuzzy Delphi method is introduced. On the one hand, the experts' subjective judgments are collected as fuzzy numbers, enriching the approach. On the other hand, such opinions are collected through a computerized application that is able to interpret the experts' opinions as fuzzy numbers. Finally, we employ a recently introduced fuzzy ranking methodology, which satisfies many properties in accordance with human intuition, in order to determine whether the expert's fuzzy opinion is favorable enough (compared with a fixed fuzzy number that indicates Agree or Strongly Agree). A cross-cultural validation was performed to illustrate the applicability of the proposed method. The proposed approach is simple for two reasons: it does not need a defuzzification step for the experts' answers, and it can consider a wide range of fuzzy numbers, not only triangular or trapezoidal fuzzy numbers.


Introduction
The Delphi method was developed during the 1950s and 1960s by Dalkey and Helmer [1] and Rieger [2] to forecast the impact of technology on warfare. This approach consists of a survey conducted in several rounds that presents some common characteristics:
1. The process can be conducted by a moderator.
2. An anonymous group of experts is invited to participate through mail or online questionnaires and to give their independent opinions about the items of the questionnaire (the survey is conducted anonymously and does not require meeting the experts in person).
3. Iterative surveys (often more than two, up to three or four rounds) are usually used.
4. The experts give their opinions about each item numerically or by using labels, and they make some comments to improve the statement of the item from their respective points of view.
5. After each round, a report with the results from the previous round is sent to the experts for the next round, so they can modify their opinions in order to increase the collective agreement, taking into account the report and the comments made by other experts.
6. This process is repeated until some consensus conditions are reached (or after a previously set number of rounds).
Although the traditional Delphi methods have been widely accepted as an effective tool and have been used in a wide range of applications, problems of ambiguity and uncertainty in experts' opinions still remain. Human judgement is considered to be an emotional, complex, perceptual, subjective, and personal phenomenon, involving many domains of an individual's life experience. In general, classic rating scales (e.g., 5- or 7-point Likert scales) inherently use crisp numbers to measure human thinking. However, due to the complicated nature and uncertainty of human judgment, it is very difficult to obtain an accurate numerical value for evaluating it.
The Fuzzy Delphi Method (FDM) was developed to overcome this problem through the combination of fuzzy theory and the classical methodology [3][4][5][6][7][8][9]. In general, FDMs use linguistic variables in designing the questionnaires to gather opinions from experts and, after that, they apply a defuzzification step; that is, a step in which linguistic labels (or fuzzy numbers) are reduced to real numbers. As a consequence, the process suffers a great loss of information. The reason for considering defuzzification is not the difficulty of operating with fuzzy numbers (which is reasonably easy in some contexts) but the fact that there is no universally accepted methodology for ranking fuzzy numbers. Although many procedures have been introduced in the last fifty years, almost all of them suffer from the same drawback: it is relatively easy to find two fuzzy numbers whose ordering under the procedure is the opposite of the ordering that any expert would propose. In other words, these procedures lead to results that, in many cases, are counter-intuitive, so they cannot be relied on in a decision-making process.
To face the above-mentioned problems, the main aim of this paper is to develop a simple theoretical and methodological approach that leads to diverse applications in many fields and can be employed without a defuzzification step, considering a wide range of fuzzy numbers to collect the experts' judgments. On the one hand, the experts' opinions are collected as fuzzy numbers, which enrich the different ways of expressing a (possibly subjective) opinion. On the other hand, such opinions are collected through a computerized application that is able to interpret the experts' opinions as fuzzy numbers. In addition, in order to overcome the ordering drawback, we propose employing a recently introduced fuzzy ranking methodology (see [10]) to determine whether the expert's fuzzy opinion is favorable enough (compared with a fixed fuzzy number that indicates Agree or Strongly Agree). The main advantages of this ranking methodology are the following: (1) it satisfies a large set of reasonable properties (see [11,12]); (2) it agrees with human intuition in practical cases; (3) it is applicable to the whole set of fuzzy numbers (it is not restricted to triangular or trapezoidal fuzzy numbers); (4) it is very easy to compute and interpret in practice, and it overcomes certain shortcomings that appear when applying other more complex algorithms; and (5) in the case of triangular or trapezoidal fuzzy numbers, the procedure is particularly simple and intuitive (see [13]).
In the traditional case, Delphi studies usually consider a certain level of agreement (e.g., more than 80% on a 5-point Likert scale in the two top measurements, desirable/highly desirable) as a consensus [14]. In this paper, we propose a method that extends this idea to the fuzzy context: the moderator sets a particular fuzzy number for indicating Agree (or Strongly Agree), and he/she compares each expert's opinion to this fuzzy threshold by using the proposed fuzzy ranking. After that, the percentage of experts' opinions that are greater than or equal to this threshold determines whether a sufficient level of consensus is attained or not. Furthermore, to reach a consensus, it is often required that no comments are made by the experts (which means that they do not know how to improve the item).
To illustrate the approach, we consider a cross-cultural adaptation of a questionnaire performed using fuzzy answers in a healthcare study. In this context, a visual analogue scale (introduced by Freyd [15]) is often used. For example, a patient indicates the amount of pain that he/she feels by marking a point on a continuous line whose extremes range from none to an extreme amount of pain. In this case, the choice appears continuous since the answer does not take discrete jumps, as a categorization of none, mild, moderate, and severe would suggest. The scale was devised to capture this idea of an underlying continuum. No training is required to determine a score, and traditional statistical techniques can be used to analyze the scores. Therefore, it is commonly used to communicate the intensity of symptoms between the patient and the medical and healthcare professionals. Taking into account that a visual analogue scale is often used in epidemiologic and clinical research to measure the intensity or frequency of various symptoms [16], this research also investigates the development of a computer application for the rapid collection of fuzzy data.
The organization of the rest of the paper is as follows. First, we give some background on fuzzy numbers and fuzzy ranking. Next, we present the framework of the proposed fuzzy Delphi method, in which the computerized application to collect experts' opinions as TFNs plays an important role. In the fourth section, we describe the results of an application of this method to the problem of consensus in healthcare, and later, we discuss the obtained results. Finally, some conclusions are given, and prospective work is proposed.

Preliminaries
In this section, we include some basic definitions and notations about fuzzy numbers and the fuzzy ranking considered in this paper for a good comprehension of the rest of the manuscript.

Fuzzy Numbers
Let $\mathbb{R}$ and $\mathbb{I}$ stand for the set of all real numbers and the closed real interval $[0, 1]$, respectively. A fuzzy set on $\mathbb{R}$ is an arbitrary function $A : \mathbb{R} \to \mathbb{I}$ (no additional assumptions are imposed on a fuzzy set). However, although a fuzzy number is a fuzzy set, there is no unique definition of a fuzzy number because distinct properties can be required. As a consequence, several notions of fuzzy number can be found in the literature (see, for instance, [17][18][19]). For our purposes, we will employ the following one.
A fuzzy number $A$ (for short, an FN) of the real line $\mathbb{R}$ is a fuzzy set $A : \mathbb{R} \to \mathbb{I}$ satisfying: (1) normality ($A(x_0) = 1$ for some $x_0 \in \mathbb{R}$); (2) fuzzy convexity ($A(\lambda x + (1 - \lambda) y) \geq \min\{A(x), A(y)\}$ for all $x, y \in \mathbb{R}$ and $\lambda \in [0, 1]$); and (3) upper semicontinuity (if $x_0 \in \mathbb{R}$ and $\varepsilon > 0$, there is $\delta > 0$ such that $A(x) - A(x_0) < \varepsilon$ whenever $|x - x_0| < \delta$). Some researchers replace the normality condition with the existence of an absolute maximum. The function $A$ is usually referred to as the membership function of the FN. Each real number $A(x) \in [0, 1]$ can be interpreted as the degree to which the point $x$ belongs to the FN $A$.
For each $\alpha \in (0, 1]$, the α-level set (or α-cut) of the FN $A$ is the crisp set $A_\alpha = \{x \in \mathbb{R} : A(x) \geq \alpha\}$, and the kernel (or core) of $A$ is $\ker A = A_1$. Each level set is a (bounded or unbounded) closed interval of the real line (with respect to the Euclidean topology). In general, when $A$ is an FN, the set $\{x \in \mathbb{R} : A(x) > 0\}$ can be closed, open, or neither. To maintain the closedness of the level sets, we define the support of an FN $A$ as the set $\mathrm{supp}(A) = \mathrm{cl}(\{x \in \mathbb{R} : A(x) > 0\})$, where $\mathrm{cl}(\Omega)$ denotes the closure of a subset $\Omega \subseteq \mathbb{R}$ in the Euclidean topology. In such a case, its support is also a closed interval. Notice that $A_\alpha \subseteq A_\beta \subseteq \mathrm{supp}(A)$ for all $\alpha, \beta \in (0, 1]$ such that $\beta \leq \alpha$.
Each level set and the support of an FN can be bounded or unbounded in $\mathbb{R}$. For our purposes, we will only consider FNs whose supports are bounded in $\mathbb{R}$. Accordingly, we will denote by $\mathcal{F}$ the set of all FNs of the real line with bounded support. In such a case, if we use the convention $A_0 = \mathrm{supp}(A)$, then each level set is a non-empty, closed, and bounded real interval, so it can be denoted by $A_\alpha = [\underline{a}_\alpha, \overline{a}_\alpha]$ for each $\alpha \in \mathbb{I}$, where $\underline{a}_\alpha$ and $\overline{a}_\alpha$ are, respectively, the inferior and superior extremes of the α-level set $A_\alpha$ of the FN $A$.
Although FNs can be represented by very general functions, we prefer to restrict our study to FNs with simple (but general enough) shapes because these are the FNs that are most frequently used in practical applications. For instance, given four real numbers $a_1, a_2, a_3, a_4 \in \mathbb{R}$ such that $a_1 \leq a_2 \leq a_3 \leq a_4$, a trapezoidal fuzzy number (for short, a TFN), denoted by $A = (a_1/a_2/a_3/a_4)$, is the FN defined by (as shown in Figure 1):
$$A(x) = \begin{cases} \dfrac{x - a_1}{a_2 - a_1}, & \text{if } a_1 < x < a_2, \\ 1, & \text{if } a_2 \leq x \leq a_3, \\ \dfrac{a_4 - x}{a_4 - a_3}, & \text{if } a_3 < x < a_4, \\ 0, & \text{in any other case.} \end{cases}$$
The real numbers $a_1$, $a_2$, $a_3$, and $a_4$ are usually called the corners of the FN $A$ because, when $a_1 < a_2 < a_3 < a_4$, they correspond to the vertices of the trapezoid that we obtain when the real function $A$ is plotted. Triangular FNs, denoted by $(a_1/a_2/a_4)$, are trapezoidal FNs such that $a_2 = a_3$. The previous definition extends the notion of a real number to the fuzzy setting because, when $a_1 = a_2 = a_3 = a_4 = r \in \mathbb{R}$, the FN $(r/r/r/r)$ (which takes the value 1 if $x = r$ and the value 0 in any other case) represents the real number $r$.
TFNs are appropriate tools to represent both the imprecision that is necessarily associated with each measuring instrument and the subjective opinions that several experts could express about a finite set of items. For instance, the TFN (8.5/8.7/8.8/9) could represent a very good, but imprecise, opinion about the quality of a wine when the range interval [0, 10] is considered.
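The definitions above can be made concrete with a short R sketch. It is only an illustration under stated assumptions: the function names `tfn_membership` and `tfn_alpha_cut` are hypothetical (they are not part of Appendix A), and a TFN is stored as a numeric vector of its four corners. The α-cut formula is the one that follows from the two linear sides of the trapezoid.

```r
# Membership function of the TFN (a1/a2/a3/a4), evaluated at a numeric vector x.
# Hypothetical helper; corners must satisfy a1 <= a2 <= a3 <= a4.
tfn_membership <- function(x, corners) {
  stopifnot(length(corners) == 4, !is.unsorted(corners))
  a1 <- corners[1]; a2 <- corners[2]; a3 <- corners[3]; a4 <- corners[4]
  out <- numeric(length(x))          # value 0 outside the support
  left  <- x > a1 & x < a2           # increasing side
  core  <- x >= a2 & x <= a3         # kernel [a2, a3]
  right <- x > a3 & x < a4           # decreasing side
  out[left]  <- (x[left] - a1) / (a2 - a1)
  out[core]  <- 1
  out[right] <- (a4 - x[right]) / (a4 - a3)
  out
}

# alpha-level set [lower, upper] of a TFN; alpha = 0 gives the support
# (this follows the convention A_0 = supp A).
tfn_alpha_cut <- function(alpha, corners) {
  stopifnot(alpha >= 0, alpha <= 1, length(corners) == 4)
  c(lower = corners[1] + alpha * (corners[2] - corners[1]),
    upper = corners[4] - alpha * (corners[4] - corners[3]))
}

wine <- c(8.5, 8.7, 8.8, 9)          # the wine opinion used as an example in the text
tfn_membership(c(8.6, 8.75, 9.5), wine)
tfn_alpha_cut(0.5, wine)
```

Storing only the four corners is enough for TFNs because every level set is determined by linear interpolation between the support and the kernel.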
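Basic operations on the real line can also be extended to the family $\mathcal{F}$ by Zadeh's Extension Principle [20], that is, by defining
$$(A \ast B)(z) = \sup\{\min\{A(x), B(y)\} : x, y \in \mathbb{R},\ x \ast y = z\} \quad \text{for all } z \in \mathbb{R},$$
where $\ast \in \{+, -, \cdot, /\}$ is a traditional operation (notice that the division can only be considered when the real number 0 does not belong to the support of the divisor, see [18,21,22]). This definition is equivalent to that obtained by interval arithmetic on the α-level sets [23]: for instance, if $A, B \in \mathcal{F}$, then
$$(A + B)_\alpha = [\underline{a}_\alpha + \underline{b}_\alpha,\ \overline{a}_\alpha + \overline{b}_\alpha] \quad \text{for all } \alpha \in \mathbb{I}.$$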
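For TFNs, the interval-arithmetic description of the sum reduces to adding the corners one by one. The following R sketch illustrates this and checks it at a few levels α; the names `tfn_add` and `tfn_cut` are hypothetical helpers introduced only for this example, and TFNs are assumed to be stored as corner vectors.

```r
# Sum of two TFNs given by their corner vectors c(a1, a2, a3, a4).
# For TFNs, adding alpha-cuts endpoint by endpoint amounts to adding corners.
tfn_add <- function(a, b) {
  stopifnot(length(a) == 4, length(b) == 4)
  a + b
}

# Endpoints of the alpha-level set of a TFN with corner vector p.
tfn_cut <- function(alpha, p) {
  c(p[1] + alpha * (p[2] - p[1]), p[4] - alpha * (p[4] - p[3]))
}

A <- c(1, 2, 3, 5)
B <- c(0, 1, 1, 2)
S <- tfn_add(A, B)

# The alpha-cut of the sum equals the sum of the alpha-cuts at every level.
for (alpha in c(0, 0.25, 0.5, 1)) {
  stopifnot(isTRUE(all.equal(tfn_cut(alpha, S),
                             tfn_cut(alpha, A) + tfn_cut(alpha, B))))
}
S
```

The same shortcut does not hold for products or quotients of TFNs, whose results are in general no longer trapezoidal.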

Fuzzy Ranking
As we commented in the Introduction, ranking FNs is not an easy task. Many approaches have been introduced, but many of them produce counter-intuitive results when the FNs are twisted, that is, when their graphic representations share several common points, giving rise to intricate relative positions. The pointwise binary relation among functions is not useful here because such functions have a concrete meaning as generalizations of real numbers. In this context, in [10], Roldán López de Hierro et al. introduced a novel methodology for ranking FNs whose main characteristic is that it agrees with human reasoning in most cases. To describe it, let $\mu$ denote the Euclidean measure of subsets of $\mathbb{R}$ (in practice, the measure of a bounded real interval $[a, b]$ is $b - a$). Given two FNs $A, B \in \mathcal{F}$, let us consider the subsets of $\mathbb{I}$ defined as:
$$I_{A,B} = \{\alpha \in \mathbb{I} : \underline{a}_\alpha \leq \underline{b}_\alpha \text{ and } \overline{a}_\alpha \leq \overline{b}_\alpha\}, \qquad I_{B,A} = \{\alpha \in \mathbb{I} : \underline{b}_\alpha \leq \underline{a}_\alpha \text{ and } \overline{b}_\alpha \leq \overline{a}_\alpha\}.$$
In a way, the set $I_{A,B}$ (which is an interval when $A$ and $B$ are TFNs) represents the family of probabilistic levels $\alpha$ in which the FN $A$ is less than or equal to $B$ with regard to the binary relation that is going to be defined. Therefore, the respective measures of the sets $I_{A,B}$ and $I_{B,A}$ must be compared, and the binary relation $\preccurlyeq$ on $\mathcal{F}$ given by Equation (1) is defined from this comparison. This binary relation is not based on any ranking index, and, as we have commented, it satisfies a long list of reasonable properties in accordance with human intuition (see [10]). In [13], the authors completely described how this ranking methodology works when it is applied to compare two TFNs. We highlight that we will use the ranking process given by Equation (1) in order to obtain the agreement percentage that we consider in the main sections (the necessary code can be found in Appendix A).

A Novel Fuzzy Delphi Methodology
In this section, we describe a novel FDM to address any problem in which a consensus among many experts is needed regarding one or more items. Our approach assumes that the experts' opinions are given by FNs, which generalizes other methodologies based on real numbers and/or linguistic labels and also permits the judges to express their opinions using a range of ambiguity that is full of information. Before explaining the main steps of the proposed FDM, we highlight the work done to program a computerized tool, developed in cooperation with professionals of the healthcare system, to facilitate a rapid and secure application of the fuzzy method.

A Computerized Application to Collect Experts' Opinions as TFNs
Starting from the visual analogue scale, in healthcare studies, we can consider FNs described as follows. Respondents select a representative rating point on a bounded interval and indicate higher or lower rating points, depending on the relative ambiguity of their judgment (see Figure 2). In this case, the interval is the largest support that can be considered for any FN in the study, i.e., [0, 10], and it corresponds to a 10-centimeter line in the printed questionnaire. This free-response format gives triangular FNs and lets us collect fuzzy data without training. However, we need to use a ruler to measure the values corresponding to the FN and save the data manually before applying the FDM.
Following this idea, we have developed a computerized application to collect the fuzzy data. In the first stage of the implementation, we considered green/blue/red marks on a bar to gather the opinions in terms of triangular FNs. Respondents select a representative rating point using the blue mark on a bar that represents the bounded interval [0, 10] and indicate lower and higher rating points with the green and red marks, respectively (see Figure 3). When the survey is finished, the fuzzy data are exported to an .xlsx file (which can be easily imported in RStudio [24] for fuzzy computations).
The second stage of the implementation considers TFNs, $A = (a_1/a_2/a_3/a_4)$, for the answers. In this case, respondents were instructed to select a representative rating interval using two blue marks within a bounded bar and then move the green and the red marks to indicate lower and upper rating points, respectively (see Figure 4). The green and red marks give two numbers, $a_1$ and $a_4$, corresponding to the support $[a_1, a_4]$ of the TFN, and the interval defined by the blue marks corresponds to the kernel $[a_2, a_3]$ of the TFN. With this computer application, the ambiguity of the judgement can be easily and automatically translated into a quantitative form based on TFNs and collected for applying the FDM explained in the next subsection.
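The mapping from marks to corners can be sketched in R as follows. This is not the code of the application described above, which is not reproduced in the paper; `marks_to_tfn` and its argument names are hypothetical, and the only assumptions are the ones stated in the text: the green and red marks give $a_1$ and $a_4$, the two blue marks give $a_2$ and $a_3$, and all marks lie on the bar [0, 10].

```r
# Hypothetical helper: turn four slider positions into a TFN corner vector.
# green -> a1, the two blue marks -> a2 and a3, red -> a4.
marks_to_tfn <- function(green, blue_low, blue_high, red, lower = 0, upper = 10) {
  corners <- c(a1 = green, a2 = blue_low, a3 = blue_high, a4 = red)
  if (any(corners < lower | corners > upper)) {
    stop("all marks must lie within the bar [", lower, ", ", upper, "]")
  }
  if (is.unsorted(corners)) {
    stop("marks must satisfy a1 <= a2 <= a3 <= a4")
  }
  corners
}

# A trapezoidal answer, and a triangular one (both blue marks at the same point).
marks_to_tfn(7, 8, 9, 10)
marks_to_tfn(6.5, 8, 8, 9)
```

Rejecting unordered marks at collection time means that every stored answer is a valid TFN before any ranking is attempted.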

The Proposed Fuzzy Delphi Method
In this subsection, we describe the steps we have to follow to carry out the proposed FDM and the framework in which it can be developed. Without loss of generality and to facilitate its comprehension, we will assume that the experts' opinions are given as TFNs (in fact, the computerized application was designed only to capture the four corners of this kind of FN). Nevertheless, the methodology implemented in Equation (1) can be applied even if the experts' opinions are modeled by FNs with more complicated shapes, which means that the following procedure can be applied to a wide range of FNs collected by a fuzzy questionnaire.
Suppose that a committee of experts (or judges), represented by $E_1, E_2, \ldots, E_n$, is asked about $m$ items (or decision criteria). Let $A_i^k = (a_i^k / b_i^k / c_i^k / d_i^k)$ be a TFN representing the fuzzy performance rating assigned to the $i$-th item (e.g., the agreement with the cultural translation to Spanish of the item of the QoD-LTC questionnaire) by expert $E_k$ (where $i \in \{1, 2, \ldots, m\}$ and $k \in \{1, 2, \ldots, n\}$).
In the next lines, we describe the proposed fuzzy Delphi approach:
Step 1: To create an initial version of the questionnaire, describing the items as clearly as possible. At the same time, set the necessary elements to identify the consensus. For instance, in the research we will describe in the next section, bearing in mind the maximum support [0, 10] to express an opinion, we have chosen the triangular FN $C = (8/9/10)$ to indicate Agree and Strongly Agree (other researchers could use any other threshold). To reach the consensus on the $i$-th item, two conditions must hold: (1) there is at least 80% agreement among experts about the item, that is, at least 80% of the experts' opinions $A_i^1, A_i^2, \ldots, A_i^n$ about the $i$-th item satisfy $C \preccurlyeq A_i^k$ (this means that the expert's opinion is very favorable to the item); (2) there are no comments from experts (if there is at least one comment, we understand that the item could be improved).
Step 2: To create a group of people who are experts on the subject and who agree to participate in the survey.
Step 3: To email the questionnaire and collect the answers, which measure the level of agreement with each of the items. In this step, experts are invited to respond via the web using a free-response format fuzzy rating scale-based questionnaire involving TFNs.
Step 4: After receiving the experts' opinions and the corresponding comments, the next step is to compute the agreement percentage $p_i$ among experts about the $i$-th item ($i \in \{1, 2, \ldots, m\}$) as the quotient between the number of experts' opinions for which $C \preccurlyeq A_i^k$ (with $k$ from 1 to $n$) and the number $n$ of experts, multiplied by 100. Then, compute the number $N_i$ of comments made by the set of experts about the $i$-th item.
Step 5: If the consensus is reached for the $i$-th item using the criteria previously set in the first step, then this item will not be posed again to the experts in the next round. On the contrary, if the consensus is not attained on some items (because the percentage $p_i$ is not high enough or because some comments remain), the moderator will modify the items on which there is no consensus by following the experts' comments, and he/she will complete a report in order to let all experts know about the other experts' opinions. The new questionnaire and the report will be submitted to the experts, and Step 3 starts again.
Step 6: The procedure will stop after several iterative surveys (often more than two, up to three or four rounds), either when the consensus is reached on each item or after a previously set number of rounds.
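Steps 4 and 5 can be sketched in R as follows. This is a minimal illustration, not the authors' implementation: the function names `agreement_percentage` and `consensus_reached` are hypothetical, and the outcome of each fuzzy comparison is assumed to be already available as a logical vector (TRUE when an expert's opinion is at least the threshold FN under the ranking of Equation (1)). The example counts, 8 and 11 favorable opinions out of 13 with 4 and 0 comments, are the ones reported for Item 1 in the results.

```r
# favorable: logical vector with one entry per expert, TRUE when the expert's
# opinion is ranked greater than or equal to the threshold FN C.
agreement_percentage <- function(favorable) {
  stopifnot(is.logical(favorable), length(favorable) > 0)
  100 * sum(favorable) / length(favorable)
}

# Consensus rule of Step 1: at least `threshold` percent agreement
# and no comments at all.
consensus_reached <- function(favorable, n_comments, threshold = 80) {
  agreement_percentage(favorable) >= threshold && n_comments == 0
}

round1 <- c(rep(TRUE, 8),  rep(FALSE, 5))   # 8 of 13 favorable opinions
round2 <- c(rep(TRUE, 11), rep(FALSE, 2))   # 11 of 13 favorable opinions

round(agreement_percentage(round1), 1)
consensus_reached(round1, n_comments = 4)
round(agreement_percentage(round2), 1)
consensus_reached(round2, n_comments = 0)
```

Keeping the two conditions in a single predicate makes it explicit that a high percentage alone is not enough: an item with full agreement but one pending comment would still go to the next round.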

Survey Outcomes (Results)
Many questionnaires and measurements have been developed to assess medical conditions and quality of life or death. Measuring the quality of the dying experience is important for improving care for dying patients. However, few instruments exist that assess the quality of one's dying process in Spain. This study uses the new FDM proposed to validate the design of a new Spanish version of the "Quality of Dying in Long-Term Care" (QoD-LTC) [25], one of the most widely used scales to assess the quality of dying in long-term care facilities.
For Spain, 11 items were proposed (m = 11). This section summarizes how consensus and stability evolved through rounds 1 to 3 of the FDM in terms of the agreement percentages and the number of comments (we also include the mean of the answers).
A heterogeneous group of thirteen experts (eight nurses, three psychologists, and two doctors) was considered (n = 13). The average age was 43 years, and the average professional experience was 10.8 years (5 are dedicated to care work, 8 to teaching and clinical research, and 1 to both tasks).
The initial version of the questionnaire was submitted to the 13 experts, and by using the computerized tool, we collected their opinions about the 11 items. For instance, Figure 5 represents the 13 experts' opinions $\{A_1^k = (a_1^k / b_1^k / c_1^k / d_1^k)\}_{k=1}^{13}$ collected for Item 1 in blue, and the comparison FN $C = (8/9/10)$ in red. We can observe that, although the relative positions of some TFNs $A_1^k$ and $C$ are clear, there are other cases in which even experts on fuzzy ranking could have some doubts about which FN is greater. Hence, it was necessary to apply the binary relation $\preccurlyeq$ to decide whether $A_1^k \preccurlyeq C$ or vice versa. Such comparisons can be observed in Figure 6a, which shows an arrow diagram with the 13 experts' opinions as blue circles (denoted by E1, E2, . . . , E13) and the comparison FN, $C$, in red. We have used the following criterion: each arrow points to the largest fuzzy number with respect to the fuzzy binary relation $\preccurlyeq$; that is, if $A_1^k \prec C$, we have plotted the arrow $E_k \longrightarrow C$; if $C \prec A_1^k$, the represented arrow is $C \longrightarrow E_k$; and if $A_1^k \approx C$, we have plotted $E_k \longleftrightarrow C$. As we can see, eight opinions were greater than $C$ with respect to the fuzzy binary relation $\preccurlyeq$, and five of them were lower than $C$. As a consequence, the percentage of agreement with the cultural translation to Spanish for Item 1 of the QoD-LTC questionnaire was 8/13 = 61.5%, which was clearly insufficient to reach the consensus. Moreover, four comments were collected from the experts for Item 1 in the first round, which confirms that the consensus was not attained. These data can be found on the third line of Table 1, associated with Item 1, where we can notice that the agreement percentage was 61.5%, and the number N of comments was four.
Taking into account the experts' suggestions, the moderator modified Item 1 in the questionnaire, and it was submitted to the experts in the second round. The experts had access to other experts' opinions, and they expressed their opinions about the new version of Item 1. In the second round, as we can see in Figure 6b, 11 opinions were greater than $C$ with respect to the fuzzy binary relation $\preccurlyeq$, and only 2 of them were lower than $C$. As a result, the percentage of agreement for Item 1 in the second round was 11/13 = 84.6%, which was greater than 80%. As there were no comments, we considered that the consensus was reached for Item 1. Accordingly, Item 1 was not included in the third round of the questionnaire. The corresponding information can also be found in the third line of Table 1 in the columns that are dedicated to the second round.
This process was implemented for each one of the 11 items considered in the questionnaire. In order not to be repetitive, we summarize the results obtained for each item as follows. Figure 7a-p represents the fuzzy data corresponding to the 13 experts' opinions (in blue), ranked for each round and each item. In this ranking, the comparison FN $C$ is plotted in red. Table 1 summarizes the results of the fuzzy Delphi rounds. For each round, the first four columns give four numbers $(R_I, R_{C1}, R_{C2}, R_S)$ corresponding to the fuzzy mean of the 13 experts' opinions; the next column gives the computed agreement percentage (%); the last column is the number of comments made about each item, denoted by N. Figure 7a-p lets us easily check the obtained agreement percentages. We can interpret the sixth line of Table 1 (corresponding to Item 4) as follows. In the first round, the agreement percentage among experts was 46.2%, and there were eight comments. As the consensus was not reached, a second round was necessary. In the second round, the agreement percentage among experts was 76.9%, and there were three comments. Consequently, a new round was carried out. Finally, in the third round, the agreement percentage was 84.6% with no comments. The consensus about Item 4 was reached after three rounds. These data are represented in Figure 7g-i, which corresponds to Item 4. It is interesting to notice that, in the case of Item 3, there was 100% agreement on the proposed translation. However, this item was included in the second round because there was still one comment that could improve the statement of the item.
Finally, we highlight that the consensus on all items was reached after three rounds.
Table 1. The results of the fuzzy Delphi rounds, with the reasons why an item must be included in the following round highlighted in gray.

Discussion
The Delphi results show a change in experts' views towards consensus and stability. For all the items, an increase in the percentage agreements was observed over the three rounds. The highest disagreement percentage in round 1 corresponds to Item 4, demonstrating that by taking into account the experts' comments, these views can be considerably altered. The number of comments decreased in each round. This reduction supports the evolution of the consensus.
The proposed method, which follows the original methodology for the case of fuzzy answers, is suggested to be an effective way to measure group consensus. The main characteristics of the proposed FDM are: (1) the computerized tool to collect the experts' opinions, which avoids any kind of imprecision when measuring the TFNs plotted on paper, and (2) the fuzzy binary relation $\preccurlyeq$, which contains the necessary fuzzy complexity in order to decide whether the expert's opinion is favorable enough to the item.
This methodology is distinct from other previously introduced FDMs because it does not involve any kind of defuzzification at any stage of the process. For us, it is very important that the experts' opinions are expressed in terms of FNs because these judgements need to involve a certain level of ambiguity that cannot be modeled by using real numbers and/or linguistic labels. Hence, if we apply any kind of defuzzification on the experts' opinions at any step of the process, we will suffer a great loss of information that will cause the use of fuzzy elements to seem unjustified throughout the process. Although many researchers have attempted to develop FDMs, few of them have proposed methods without a defuzzification step, and we have not found any method that could be applied to the whole set of FNs in the literature.

Conclusions and Prospect Works
The Delphi multi-round survey is a procedure that has been widely and successfully used to aggregate experts' opinions. The Fuzzy Delphi Method (FDM) is the modified and enhanced version of the classical Delphi technique. In this paper, the proposed method expands a classic Delphi method to incorporate the fuzzy data in experts' opinions. The evolution of consensus is shown by the increase in agreement percentages and a decrease in the number of comments made.
In this paper, we have presented a novel FDM that overcomes some of the shortcomings that usually appear in this context. On the one hand, we have assumed that the experts' opinions can be modeled by FNs, which are more appropriate to represent the complex human judgement about each item of the questionnaire. On the other hand, we have developed a computerized tool in order to easily collect the experts' responses. This information system is, for the moment, restricted to collecting TFNs, but, in prospective work, we plan to develop a computer tool that will be capable of collecting any kind of FN. By comparing the experts' fuzzy responses with a fuzzy agreement level (implemented as a particular FN), the moderator is able to determine whether a sufficient percentage of experts reasonably agrees with the statement given in each item of the questionnaire. If such a minimal percentage is not attained, the moderator can start a new round after modifying the questionnaire according to the experts' comments. Some advantages of the proposed FDM are the following:
• It is simple and can be employed without loss of information (that is, without a defuzzification step).
• It can be applied to a wide range of FNs, not only TFNs.
• The computer application developed for collecting the fuzzy data saves time and costs in handling fuzzy questionnaires.
A cross-cultural adaptation was used to illustrate the new approach. Many processes in healthcare that need a consensus are suitable for applying the introduced methodology. Further research is needed in this field of study in order to complete the development of new products, services, and techniques that may also need to consider an FDM.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

FN: Fuzzy number
TFN: Trapezoidal fuzzy number
FDM: Fuzzy Delphi method

Appendix A. R Code for Ranking Two TFNs
In Appendix A, we include the necessary code in order to rank two TFNs by using Roldán López de Hierro et al.'s methodology [10] described in Equation (1).
Suppose that we wish to rank two TFNs, $A = (a_1/a_2/a_3/a_4)$ and $B = (b_1/b_2/b_3/b_4)$, for which we know their corners. We identify such TFNs in R as the vectors FNa = c(a1, a2, a3, a4) and FNb = c(b1, b2, b3, b4). Next, we describe all the functions we need.
The function LeftInterval(a, b) takes two vectors a = c(a1, a2) and b = c(b1, b2) as inputs representing the inferior corners of the TFNs $A$ and $B$, that is, $a_1$, $a_2$, $b_1$, and $b_2$, and it produces as output the compact (closed and bounded) subinterval $I \subseteq \mathbb{I}$ of all values $\alpha \in \mathbb{I}$ such that $\underline{a}_\alpha \leq \underline{b}_\alpha$. This interval is described as a vector c(α, β), where α and β are the extremes of that interval. If the interval is empty, the output is c(−1,−1) (in what follows, we use the number −1 to represent a computing mistake or the empty set).
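The description of LeftInterval can be sketched as follows. This is not the authors' original code, which is not reproduced here; it is a minimal reconstruction under the assumption that the lower extreme of the α-level set of a TFN is the linear function $\underline{a}_\alpha = a_1 + \alpha (a_2 - a_1)$, so that the inequality $\underline{a}_\alpha \leq \underline{b}_\alpha$ reduces to the sign of a linear function of α on [0, 1]. The name `LeftIntervalSketch` is hypothetical and is chosen to avoid confusion with the original function.

```r
# Sketch: levels alpha in [0, 1] at which the lower extreme of the alpha-cut
# of A does not exceed that of B. a = c(a1, a2), b = c(b1, b2).
# Returns c(alpha, beta), or c(-1, -1) when no level qualifies.
LeftIntervalSketch <- function(a, b) {
  f0 <- a[1] - b[1]   # difference of lower extremes at alpha = 0
  f1 <- a[2] - b[2]   # difference of lower extremes at alpha = 1
  if (f0 <= 0 && f1 <= 0) return(c(0, 1))    # inequality holds at every level
  if (f0 > 0 && f1 > 0) return(c(-1, -1))    # inequality never holds
  root <- f0 / (f0 - f1)                     # level where both extremes coincide
  if (f0 <= 0) c(0, root) else c(root, 1)
}

LeftIntervalSketch(c(1, 2), c(2, 3))
LeftIntervalSketch(c(0, 4), c(2, 2))
LeftIntervalSketch(c(3, 4), c(1, 2))
```

Because the difference of the two lower extremes is linear in α, checking its sign at α = 0 and α = 1 is enough to decide which of the four cases applies.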
The function Intersection(i1, i2) takes two intervals, each described as a vector c(α, β), and returns their intersection (or c(−1,−1) if either input is empty or if the intervals do not overlap):

    Intersection <- function(i1, i2) {
      # i1 and i2 are intervals c(alfa, beta); c(-1, -1) encodes the empty set
      alfa1 <- i1[1]
      beta1 <- i1[2]
      alfa2 <- i2[1]
      beta2 <- i2[2]
      if (alfa1 == -1 || alfa2 == -1) {
        return(c(-1, -1))
      } else {
        if (max(alfa1, alfa2) <= min(beta1, beta2)) {
          return(c(max(alfa1, alfa2), min(beta1, beta2)))
        } else {
          return(c(-1, -1))
        }
      }
    }

The function Interval(FNa, FNb) takes two TFNs (described as the vectors FNa = c(a1, a2, a3, a4) and FNb = c(b1, b2, b3, b4)) as inputs, and it computes the interval $I_{A,B}$ given as a vector c(α, β). If the interval is empty, the output is c(−1,−1).

The function LengthInterval(i) takes a compact interval (described as a vector c(α, β)) as input, and it computes the length β − α (which is a real number). If the interval is empty, the output is −1.

The function Ranking2TraFN(FNa, FNb, TextoA, TextoB) takes as inputs two TFNs (described as the vectors FNa = c(a1, a2, a3, a4) and FNb = c(b1, b2, b3, b4)) and two character strings (linguistic labels), TextoA and TextoB, that represent these TFNs, and it produces a character string describing their ranking as output. Notice that this function asks us for the linguistic labels that we want to use to denote the fuzzy numbers. It is usual to call the first fuzzy number "A" and the second one "B". In this way, the function will produce one of the following three outputs: "A < B", "A ∼ B", or "A > B". However, when working with more than two fuzzy numbers, it is usual to use other labels. In this case, if we write other linguistic labels, we could get outputs such as "C < D", "A1 ∼ B2", or "FN1 > FN2". The output depends on the text that we introduce into the arguments.
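As an illustration of the conventions above, the length computation can be sketched in a few lines of R. This is not the original LengthInterval code, only a minimal version consistent with its description; the name `LengthIntervalSketch` is hypothetical, and the empty interval is assumed to be encoded as c(−1,−1).

```r
# Sketch: length of a compact interval given as c(alpha, beta);
# the code c(-1, -1) stands for the empty interval and yields -1.
LengthIntervalSketch <- function(i) {
  stopifnot(length(i) == 2)
  if (i[1] == -1) return(-1)   # empty-interval (or error) code is propagated
  i[2] - i[1]
}

LengthIntervalSketch(c(0.25, 1))
LengthIntervalSketch(c(-1, -1))
```

Propagating −1 instead of returning 0 keeps the empty interval distinguishable from a degenerate interval with a single level, whose length is 0.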
The function menu() will ask the user for the four corners of the TFNs $A$ and $B$ together with their linguistic labels TextoA and TextoB, and after checking that the inequalities $a_1 \leq a_2 \leq a_3 \leq a_4$ and $b_1 \leq b_2 \leq b_3 \leq b_4$ hold, it outputs the ranking of the FNs $A$ and $B$ by employing the labels TextoA and TextoB (if the inequalities are not fulfilled, it outputs a warning message). This function uses the R package "FuzzyNumbers" in order to represent the involved TFNs, so it is necessary to install this package beforehand.