Group Decision-Making Based on Artificial Intelligence: A Bibliometric Analysis

Abstract: Decisions concerning crucial and complicated problems are seldom made by a single person. Instead, they require the cooperation of a group of experts in which each participant has their own individual opinions, motivations, background, and interests regarding the existing alternatives. In the last 30 years, much research has been undertaken to provide automated assistance in reaching a consensual solution supported by most of the group members. Artificial intelligence techniques are commonly applied to tackle critical group decision-making difficulties. For instance, experts' preferences are often vague and imprecise; hence, their opinions are combined using fuzzy linguistic approaches. This paper reports a bibliometric analysis of the ample literature published in this regard. In particular, our analysis: (i) shows the impact of this topic and its upward publication trend; (ii) identifies the most productive authors, institutions, and countries; (iii) discusses authors' and journals' productivity patterns; and (iv) recognizes the most relevant research topics and how the interest in them has evolved over the years.


Introduction
Making decisions in complex and uncertain situations frequently requires the cooperation of a team of experts, each one with their own background, opinions, motivations, etc. As Huber [1] already noticed in 1984, in these circumstances, experts usually need to spend considerable time in meetings to reach a collective agreement. For more than 30 years, research on Group Decision-Making (GDM) systems has pursued saving much of this time by providing automated support to accomplish consensual decisions [2,3]. Figure 1 sketches the general GDM framework, where a group of experts wants to make a collective decision among a set of alternatives. First, they express their individual preferences on the alternatives. Then, those preferences are combined using an aggregation function. Since the resulting collective preference typically does not achieve the experts' consensus, a feedback mechanism assists the experts in adjusting their preferences to increase the consensus level.
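To make this loop concrete, the following minimal Python sketch iterates aggregation, consensus measurement, and feedback; the averaging aggregation, the naive consensus measure, and the fixed adjustment step are illustrative assumptions, not the model of any particular GDM system.

```python
import numpy as np

def consensus_loop(preferences, threshold=0.9, max_rounds=10):
    """Toy consensus-reaching process: each row holds one expert's
    ratings of the alternatives, scaled to [0, 1]."""
    prefs = np.array(preferences, dtype=float)
    for _ in range(max_rounds):
        collective = prefs.mean(axis=0)        # aggregation: simple average
        consensus = 1.0 - np.abs(prefs - collective).mean()
        if consensus >= threshold:             # enough agreement reached
            break
        prefs += 0.3 * (collective - prefs)    # feedback: nudge experts towards the group
    return collective, consensus

# Three experts rating four alternatives
collective, level = consensus_loop([[0.9, 0.2, 0.5, 0.1],
                                    [0.7, 0.4, 0.6, 0.2],
                                    [0.2, 0.8, 0.3, 0.9]])
print(collective, round(level, 2))
```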
Using both science mapping and performance analysis, this paper answers a series of Research Questions (RQs) concerning the publication and citation trends in this area, the most productive authors, institutions, and countries, their productivity and collaboration patterns, and the most relevant research topics and their evolution over time.
The remainder of this paper is organized as follows: Section 2 introduces the materials and methods used to undertake our bibliometric analysis; Section 3 reports the analysis results and provides some discussion regarding the research questions above; finally, Section 4 summarizes the conclusions of our work.

Materials and Methods
This section describes the systematic procedure we have followed to analyze the literature on AI-GDM.

Bibliometric Workflow
To undertake our analysis systematically, we adopted the workflow suggested by Cobo et al. [29], which is similar to others proposed in the literature, such as PRISMA [37] or the one by Börner et al. [38]. Figure 2 shows the adopted workflow, which is organized in three stages:

1.
Data retrieval. As many experts have stated [39][40][41], obtaining all the articles relevant to a literature review is unrealistic. The objective is, therefore, to achieve an unbiased publication sample that represents the population satisfactorily.
A sample of 2862 bibliometric records was gathered from the Clarivate WoS database using a query organized in four lines. The first line sets the topic of the analysis; the NEAR/0 operator forces (Making OR Support) to follow Group Decision immediately, while tolerating spaces and the '-' character (e.g., the query matches articles containing either Group Decision Making or Group Decision-Making). As this paper focuses on the application of AI techniques to GDM, Line 2 limits the scope to the WoS category Computer Science Artificial Intelligence. Line 3 sets the time period of the records: every article published until 2019. Finally, Line 4 specifies the WoS indexes against which the query is run. As a final remark, the criterion for selecting WoS instead of other databases, such as Google Scholar or Dimensions, is its reputation for outstanding data quality [42].

2.
Data normalization. Bibliographic data are sometimes not sufficiently normalized [29,30]: an author may appear differently in several records, the same concept may correspond to distinct keywords, etc. These problems can bias the subsequent analysis. For this reason, we preprocessed the data to guarantee its normalization.

3.
Data analysis. The normalized data were examined using two widespread bibliometric procedures [43]: performance analysis and science mapping. Both techniques have been successfully applied in recent studies (e.g., [25,26,28]) because they complement each other very well: performance analysis determines the importance of the bibliometric elements, and science mapping models how those elements are interrelated.

Performance Analysis
The primary method to assess research performance is citation analysis [44]. The Hirsch index [35], typically known as the h-index, is probably the most commonly accepted citation analysis indicator [45]. If the index is used to quantify an author's productivity, it is defined as follows: an author has index h whenever h of her n papers have at least h citations each, and the remaining n − h papers have no more than h citations each.
Furthermore, the h-index concept can be adapted to account for the performance of any bibliographic element: articles [46], journals [45], research organizations, etc.
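As an illustration, here is a minimal Python sketch of how the h-index can be computed from a list of citation counts (the function name and the example data are ours):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # -> 4
```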

Science Mapping
Three complementary techniques [47] were applied to identify the key research topics, the significance and role that those topics play, and how the interest in the topics has evolved over time. The following sections introduce these techniques.

Thematic Network Identification
A method called co-word analysis [36] was used to recognize the most relevant topics in AI-GDM research. Co-word analysis works by measuring the co-occurrence frequency of pairs of the articles' keywords. Co-occurrences are first normalized [48], typically using the equivalence index [34]. Then, a clustering algorithm groups the keywords as a function of the computed equivalence indexes [49], with each group corresponding to a thematic network, i.e., to a key topic. In particular, the clustering algorithm we applied was simple centers [34].
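A minimal Python sketch of the counting and normalization steps just described, assuming the usual form of the equivalence index, e_ij = c_ij² / (c_i · c_j), where c_i is the number of articles containing keyword i and c_ij the number containing both i and j; the keywords are toy data, and the clustering step itself is omitted:

```python
from collections import Counter
from itertools import combinations

# Each article is represented by its set of keywords (toy data)
articles = [
    {"consensus", "fuzzy-sets", "gdm"},
    {"fuzzy-sets", "gdm", "topsis"},
    {"consensus", "gdm"},
]

occ = Counter()      # c_i : number of articles containing keyword i
co_occ = Counter()   # c_ij: number of articles containing both i and j
for keywords in articles:
    occ.update(keywords)
    co_occ.update(frozenset(pair) for pair in combinations(keywords, 2))

# Equivalence index e_ij = c_ij^2 / (c_i * c_j), in [0, 1]
equivalence = {}
for pair, c_ij in co_occ.items():
    i, j = sorted(pair)
    equivalence[(i, j)] = c_ij ** 2 / (occ[i] * occ[j])

for pair, e in sorted(equivalence.items(), key=lambda item: -item[1]):
    print(pair, round(e, 2))
```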

Strategic Diagrams
The role that each thematic network plays in AI-GDM research was modeled using the density and centrality measures. Density [34] accounts for the thematic network's internal coherence by examining the links between keywords inside the network. Centrality [50] estimates the degree of interaction of the network with other networks by analyzing the links between keywords inside and outside the network.
Strategic diagrams are then used to provide a global representation of the role of all topics. In these diagrams, the x-axis and y-axis denote the network's centrality and density, respectively. Thus, networks are classified according to the quadrant where they are placed [51,52]; see Figure 3.
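A rough Python sketch of how density, centrality, and the resulting quadrant classification could be computed; the link strengths are toy values, the scaling constants sometimes applied to these measures are omitted, and the quadrant thresholds are simply the medians, which is one common, but not the only, choice:

```python
from statistics import median

def density(internal_links):
    """Mean strength of the links joining keywords inside the network."""
    return sum(internal_links) / len(internal_links)

def centrality(external_links):
    """Total strength of the links joining the network with other networks."""
    return sum(external_links)

# Toy networks: (name, internal link strengths, external link strengths)
networks = [
    ("consensus",  [0.8, 0.6, 0.7], [0.5, 0.4, 0.3]),
    ("topsis",     [0.3, 0.2],      [0.6, 0.7]),
    ("fuzzy-sets", [0.9, 0.8],      [0.1, 0.2]),
]

points = {name: (centrality(ext), density(intern)) for name, intern, ext in networks}
c_med = median(c for c, _ in points.values())   # threshold on the x-axis (centrality)
d_med = median(d for _, d in points.values())   # threshold on the y-axis (density)

for name, (c, d) in points.items():
    quadrant = ("motor theme" if c >= c_med and d >= d_med else
                "basic and transversal theme" if c >= c_med else
                "highly developed and isolated theme" if d >= d_med else
                "emerging or declining theme")
    print(f"{name}: centrality={c:.2f}, density={d:.2f} -> {quadrant}")
```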

Maps of Conceptual Evolution
As the years go by, the vocabulary authors employ evolves: some new words appear, while others fall into disuse. Hence, the keyword set used in each period provides information on whether the number of researched topics increases (new terms are included in the set), decreases (old words are erased from the set), or remains stable. Following the indications given in [47], we used the Inclusion index to track the vocabulary evolution in AI-GDM.
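A minimal Python sketch, assuming the usual definition of the Inclusion index between the keyword sets A and B of two themes from consecutive periods, |A ∩ B| / min(|A|, |B|):

```python
def inclusion_index(keywords_a, keywords_b):
    """Keyword overlap between two themes, in [0, 1]."""
    a, b = set(keywords_a), set(keywords_b)
    return len(a & b) / min(len(a), len(b))

theme_period_1 = {"consensus", "fuzzy-sets", "owa-operators"}
theme_period_2 = {"consensus", "fuzzy-sets", "topsis", "vikor-method"}
print(inclusion_index(theme_period_1, theme_period_2))  # 2 shared / min(3, 4) ≈ 0.67
```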

Results and Discussion
The following sections summarize the results of our analysis and answer the research questions this paper targets.
Figure 4 shows the number of published papers per year. Blue and yellow denote periods of stability and growth, respectively. In particular, four stages can be distinguished:

1.
During the first ten years (from 1991 to 2000), the fundamental ideas were proposed and developed in 82 articles.

2.
The subsequent nine years (from 2001 to 2009) correspond to a growth period, in which 540 articles were published.

3.
A short period of three years (from 2010 to 2012) with a stable publication rate (121.33 articles per year on average, accumulating a total of 364 papers).

4.
A rapid growth period that lasts up to the present day (from 2013 to 2019), in which 1856 articles have been published.

What Is the Impact of the Research Literature on AI-GDM? (RQ2)
Citations to the published literature on AI-GDM have also followed an upswing trend.
Table 1 summarizes the authors who have published the highest number of papers, including the total number of citations that those papers have received and the authors' h-index (limited to the sample).

Is There Any Authors' Productivity Pattern? (RQ4)
Figure 6 represents the number of authors per year. As the number of articles increases over time, the number of authors rises as well. There is a total of 3514 authors. Although most of them have published a rather small number of papers (67.92% of the authors have written only one paper in 29 years), a small group of authors have contributed a much larger number of articles (8.22% of the authors have published at least five articles). This fact is not surprising, as it is consistent with one of the fundamental laws in bibliometrics: Lotka's law [53] (also known as the inverse square law). In 1926, after analyzing authors' productivity in different domains, Lotka found that the number of authors with n papers is usually inversely proportional to n². In our case, 2387 authors have written one article; hence, Lotka's law predicts that the number of authors who have published n papers should be 2387/n². Figure 7 compares the empirical distribution found in the sample with the distribution predicted by Lotka's law, showing that both distributions match closely.
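A minimal Python sketch of the prediction underlying Figure 7: starting from the 2387 single-paper authors observed in the sample, Lotka's law predicts roughly 2387/n² authors with exactly n papers (the comparison with the actual empirical counts is omitted):

```python
single_paper_authors = 2387   # authors with exactly one paper in the sample

def lotka_prediction(n):
    """Expected number of authors publishing exactly n papers (inverse square law)."""
    return single_paper_authors / n ** 2

for n in range(1, 6):
    print(f"{n} paper(s): about {lotka_prediction(n):.0f} authors predicted")
```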

How Do the Most Productive Authors Collaborate? (RQ5)
Much as industrial production relies on teamwork, academic literature is increasingly the result of the collective work of several researchers [54]. Figure 8, which plots the average number of co-authors per paper per year, shows that research on AI-GDM follows this trend too. Accordingly, studying the collaboration between authors is of remarkable interest [45].
The graph in Figure 9 represents how the most productive authors collaborate. Each node accounts for one of the top 1% most prolific authors. The size of each node is proportional to its eigenvector centrality in the collaboration network. This centrality models the importance of a node considering not only the number and weights of its connections to other nodes but also the influence of those nodes in the network [55]. There is an edge between two nodes whenever the corresponding authors have published some paper together. Edge thickness is proportional to its weight, i.e., to the number of papers that both researchers have co-authored.
Table 3 shows the journals that have published the most articles. Again, the table includes the total number of papers, the citations per journal, and the journal's h-index. The last column will be described in Section 3.8. Analogous to Lotka's law for authors' productivity (see Section 3.4), there is another bibliometric law for journal productivity, called Bradford's law [56]. It predicts an inverse relationship between the number of papers published in an area and the number of journals where those papers appear. In other words, a few journals usually account for a large portion of the total publications, while a large number of journals publish only a few articles in the area.
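A rough Python sketch of the zone-splitting procedure applied in the next paragraph: journals are ranked by productivity, divided into three zones holding roughly the same number of articles each, and the Bradford multiplier is estimated from the first two zones. The journal names and counts below are hypothetical, chosen so that the law holds exactly:

```python
def bradford_zones(journal_counts, zones=3):
    """Split journals, sorted by productivity, into zones that hold
    roughly the same number of articles each."""
    ranked = sorted(journal_counts.items(), key=lambda kv: -kv[1])
    target = sum(journal_counts.values()) / zones
    groups, accumulated = [[]], 0
    for journal, papers in ranked:
        groups[-1].append(journal)
        accumulated += papers
        if accumulated >= target * len(groups) and len(groups) < zones:
            groups.append([])
    return groups

# Hypothetical per-journal article counts
counts = {"J1": 260, "J2": 190, "J3": 130, "J4": 120, "J5": 110, "J6": 95,
          "J7": 80, "J8": 70, "J9": 65, "J10": 60, "J11": 55, "J12": 45,
          "J13": 40, "J14": 30}
z1, z2, z3 = bradford_zones(counts)
n = len(z2) / len(z1)                                  # Bradford multiplier
print([len(z) for z in (z1, z2, z3)],
      "predicted zone 3 size:", len(z1) * n ** 2)      # [2, 4, 8] ... 8.0
```

With these toy counts, the predicted and observed sizes of Zone 3 coincide; the empirical data discussed next deviate from that ideal behavior.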
In our case, there are 2862 papers in the sample: 1016 published in conferences and 1846 published in journals. Although a total of 32 journals have published the 1846 journal articles, 9 of them have published 66% of the articles, i.e., journal productivity concentration is even higher than the one predicted by Bradford's law. Figure 11 compares the cumulative distributions of the empirical data and the data predicted by Bradford's law, according to the procedure proposed by Egghe and Rousseau [57]. Roughly speaking, suppose that the journals in the sample are sorted according to the number of articles and split into 3 groups, each one including approximately one-third of all articles. Those groups are named Bradford's zones. They are registered in Table 3's last column and highlighted with different colors in Figure 11 (Zone 1 in blue, Zone 2 in pink, and Zone 3 in yellow). Zone 1 comprises 2 core journals. Zone 2 includes 5 journals; thus, Bradford's constant n is 5/2 = 2.5. Although Bradford's law predicts that the number of journals in Zone 3 should be 2 × 2.5² = 12.5, the empirical number of journals is much bigger: 25. Trying with different numbers of zones produces even more distant results.
In the first period, 1991-2009, according to the strategic diagram shown in Figure 12, the GDM research field was focused on 17 themes. Nine of them stand out since they are motor, basic, and transversal: fuzzy-sets, public-investment-decision, multi-attribute-group-decision-making, trapezoid-fuzzy-numbers, consistency (i.e., approaches to measure the level of consistency of the information provided by the experts), information-retrieval, OWA-operators, TOPSIS, and decision-making.
Taking into account the performance measures shown in Table 4, the themes OWA-operators, decision-making, and consistency accumulated more than 100 documents each. Considering the citations achieved, OWA-operators is the most cited theme, reaching more than 10,000 citations. Moreover, consistency and TOPSIS, with more than 6000 citations, also achieved a significant impact.
In the next period, 2010-2014, as shown in Figure 13, the GDM research field delved into the following ten themes (motor plus basic and transversal): OWA-operators, majority (i.e., the soft-computing approach that relaxes the total consensus, seeking the alternative supported by most experts), analytical-network-process, consistency, additive-consistency, vague-set-theory, TOPSIS, linguistic-variables, fuzzy-sets, and decision-making.
Bear in mind that, according to the performance measures in Table 5, the themes consistency, TOPSIS, fuzzy-sets, OWA-operators, vague-set-theory, and linguistic-variables accumulated at least 100 documents each. Furthermore, the theme consistency, with more than 15,000 citations, almost doubles the impact of the second most cited theme. Moreover, the themes TOPSIS, fuzzy-sets, and vague-set-theory stand out with more than 6000 citations.
As Figure 14 shows, the research field revolved around 12 main themes in the last period, 2015-2019: term-sets, AHP, Vikor-method, similarity-measures, consensus, consensus-reaching-process, multi-attribute-group-decision-making, supplier-selection, multi-criteria-group-decision-making, uncertainty, fuzzy-sets, and linguistic-term-sets. Moreover, according to the performance measures shown in Table 6, except for linguistic-term-sets, the main themes listed above accumulated a large number of documents (more than 100). Taking into account the citations achieved, term-sets was the most cited theme, with more than 11,000 citations. In comparison with the previous periods, the themes have had a high impact considering this period's short citation window. In addition, the themes AHP, similarity-measures, multi-attribute-group-decision-making, and consensus achieved more than 4000 citations.

Table 5. Performance of the themes in the 2010-2014 period.

Theme | Number of Documents | Number of Citations | h-Index
Consistency | 376 | 15,073 | 70
OWA | 188 | 8671 | 50
Fuzzy-sets | 173 | 7027 | 49
OWA-operators | 168 | 6936 | 47
Vague-set-theory | 157 | 7584 | 49
Linguistic-variables | 100 | 4755 | 36
Decision-making | 81 | 2841 | 30
Analytic-network-process | 76 | 3980 | 32
Additive-consistency | 74 | 2519 | 30
Majority | 65 | 2531 | 27
Consistency-measures | 28 | 1486 | 19
Choquet-integral | 27 | 1136 | 15
Group-members | 17 | 1463 | 15
Recommender-system | 17 | 902 | 10
Personality | 12 | 138 | 5
Neural-networks | 11 | 390 | 7
Fuzzy-game-theory | 8 | 223 | 8
Multidimensional-analysis | 7 | 191 | 4

Table 6. Performance of the themes in the 2015-2019 period.

This section discusses the thematic network evolution, describing how these themes evolved through the years and how the topics emerged and changed. For that purpose, an evolution map [47] is provided, in which each column represents a period. There is a link between the themes of two consecutive periods if both themes have keywords in common. Indeed, the link strength is proportional to the Inclusion index (the more keywords they have in common, the thicker the link).

Therefore, analyzing the themes across the three consecutive periods, we can summarize the conceptual evolution of AI-GDM in seven thematic areas (Figure 15): (i) multi-attribute/criteria in GDM, (ii) analytical network process, (iii) decision-making and uncertainty, (iv) fuzzy sets, (v) recommender systems, (vi) consensus and majority, and (vii) agent systems.
Furthermore, for each thematic area, a set of bibliometric indicators was calculated to show its performance and impact. In that way, Table 7 shows, for each thematic area, the total number of documents, the number of citations achieved, and the h-index. It is worth noting that the documents were associated with each thematic area using the algebraic union of the documents belonging to its themes, so the same document may be counted in different thematic areas; that is, the sum of the documents could differ from the total number of documents analyzed in this study. Considering the thematic areas shown in Figure 15 and their performance measures, we should point out that AI-GDM research has mainly focused on the multi-attribute/criteria area, as it is the largest one (it has the highest number of documents) and it also achieves the highest citation count. The thematic area fuzzy sets also comprises a significant number of documents, which have been highly cited.
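Referring to the union-based counting described above, a minimal Python sketch (theme names and document identifiers are toy data) showing how one document can contribute to several thematic areas:

```python
# Documents (by id) associated with each theme (toy data)
theme_docs = {
    "topsis":       {1, 2, 3},
    "vikor-method": {3, 4},
    "fuzzy-sets":   {2, 5, 6},
    "consensus":    {6, 7},
}

# Each thematic area groups one or more themes
areas = {
    "multi-attribute/criteria in GDM": ["topsis", "vikor-method"],
    "fuzzy sets":                      ["fuzzy-sets"],
    "consensus and majority":          ["consensus"],
}

for area, themes in areas.items():
    docs = set().union(*(theme_docs[t] for t in themes))   # algebraic union
    print(area, "->", len(docs), "documents")
```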

What Are the Main Application Domains? (RQ11)
WoS provides a classification system, called research areas, that organizes publications into 252 areas according to their subjects. Research literature on AI-GDM spreads over a variety of application domains: 19.67% of the papers fall into the Engineering area, 15.79% into Operations Research Management Science, 7.97% into Automation Control Systems, etc. The word cloud in Figure 16 shows the foremost application domains; words have been abbreviated, and their size is proportional to the number of articles classified in the corresponding areas.

Conclusions and Future Challenges
In this paper, a systematic and highly automated bibliometric workflow has been followed to analyze the literature on group decision-making based on artificial intelligence. Our longitudinal analysis shows that:

•
Research on AI-GDM is increasing as the number of papers and citations to those papers is growing substantially.

•
Most research has been carried out by Chinese universities. Nevertheless, a few Spanish researchers lead the field in terms of productivity and collaboration network centrality.

•
Two basic bibliometric laws hold to a great extent, Lotka's law and Bradford's law, which model authors' and journals' productivity concentrations, respectively.

•
AI-GDM is being applied to a variety of domains, including engineering, operations research management science, automation control systems, robotics, economics, telecommunications, imaging science, etc.

•
In summary, the conceptual evolution of the AI-GDM research field comprises seven thematic areas: multi-attribute/criteria in GDM, analytical network process, decision-making and uncertainty, fuzzy sets, recommender systems, consensus and majority, and agent systems.

Finally, recent literature on AI-GDM reveals the following trends and challenges:

•
There is an increasing need to support the consensus of very large groups of decision-makers. This need arises in several contexts, such as social networks, e-democracy platforms, crowd-funding systems, group recommender systems, etc. Those large groups are typically decomposed into smaller ones by applying different clustering algorithms, such as hierarchical clustering [58], discriminant analysis [59], etc.

•
In classical GDM, a small group of experts needs to make a consensual decision. Nowadays, the experts' group is often replaced by internet users, and, as a result, natural language processing techniques have started to be applied to mine the linguistic information that is subsequently processed by GDM systems [60].

•
As AI-GDM problems become more complex, advanced models and simulations are required to support the experts' group dynamics [61], e.g., for identifying the most influential experts, detecting manipulative and non-cooperative behaviors, etc.

•
Deep learning has started to be used [62] for (i) estimating the importance (or weight) of the experts, their preferences, and their relationships, and (ii) learning the optimal settings of parameterized aggregation operators.

Conflicts of Interest:
The authors declare no conflict of interest.