1. Introduction
Effective problem solving and innovation across different disciplines no longer rely on individual expertise [
1]. Collaboration in teams has become increasingly crucial [
2]. Today’s global challenges demand a mixture of perspectives, knowledge and skills that surpass the capacity of any single individual. Therefore, teamwork and collaboration are essential competencies across all fields of study and professional practice [
3].
Various pedagogical approaches have been explored to encourage teamwork and collaboration, which can be categorized as either instructor-led or student-led. Instructor-led approaches leverage the instructor’s expertise in creating teams and scenarios that simulate real-world collaborative challenges [
4,
5], whereas student-led approaches rely on students organizing and managing their teams [
6].
Collaborative learning, defined as learning within a group through discussion and joint knowledge construction [
7], has been effectively employed in education as a pedagogic strategy to enhance and complement individual learning [
8] as well as boost academic performance [
9]. Compared with traditional grouping methods such as random and self-organized grouping, the advanced algorithm-supported grouping approaches not only improve academic performance but also provide a better collaborative experience, enhance learning motivation and increase overall satisfaction [
10,
11].
Effective automatic grouping methods rely on three crucial aspects. First, selection criteria determine who gets placed together. These criteria could be skills, interests, learning styles, or previous experience. Second, the grouping type influences how people are assigned. Are they meant to collaborate, share knowledge, or simply interact socially? The purpose of the group dictates this. Finally, grouping algorithms are the computer programs that take these criteria and type into account, automatically assigning people to groups based on the chosen parameters [
11].
According to Moreno et al. [
7], effective grouping promotes better interaction and produces better learning outcomes by taking into account a variety of group formation characteristics. Academic achievement is not the only requirement for a successful group. It has been shown that a variety of learner-related factors, such as learning styles, communication and leadership abilities, personality traits, social interaction and knowledge level significantly affect the grouping process [
12,
13,
14,
15].
Li et al. [
11] emphasize the importance of distinguishing between static and dynamic student characteristics. They defined static characteristics as the fixed attributes of students, such as learning styles, communication and leadership skills, which remain unchanged throughout the learning process. In contrast, dynamic characteristics are those that cannot be determined at the beginning of the course and change throughout the learning process, such as levels of social interaction.
Once the grouping criteria are established, regardless of the specific characteristics, three types of groupings can be formed based on the degree of uniformity or similarity: homogeneous, heterogeneous and mixed grouping [
16]. In homogeneous grouping, each group is formed of individuals who are similar to one another. In heterogeneous grouping, each group is characterized by greater diversity among its members. Mixed grouping seeks homogeneity in certain characteristics and heterogeneity in others, so that group members are similar in some respects and different in others. Research has shown that in an inquiry-based learning context, homogeneous grouping is more beneficial for teaching than heterogeneous grouping [
17]. However, in a didactic setting, heterogeneous grouping is more effective [
18]. In the majority of studies, heterogeneous grouping refers to teaching students of various ages and skill levels together in the same classroom, typically in groups of two to five. As stated by Sukstrienwong [
12], students of advanced, average and low ability have benefited academically and socially from heterogeneous grouping in elementary grade levels.
Several algorithmic approaches have been presented by the learner group formation research community to effectively address this difficulty. Algorithms present a promising new method for enhancing team formation. Examples of these methods include clustering, the greedy algorithm, the ant colony algorithm and genetic algorithms [
19,
20,
21,
22]. Genetic algorithms (GAs) and other evolutionary algorithms are robust optimization methods, often grouped with machine learning techniques, that can handle several parameters at once. These algorithms can analyze large amounts of data and identify patterns for effective team formation. For example, an evolutionary approach can take the multidimensional traits of a student model as input and adapt to various group compositions when generating groups [
7,
12]. The k-means algorithm is a data mining technique used for group formation by identifying patterns and extracting information from big data sets. It uses methods that combine machine learning, statistics and database systems [
19].
Numerous studies have addressed the challenge of group formation in collaborative learning by creating systems designed to create groups based on their proposed approaches. As shown in
Figure 1, Liang et al. [
23] created an intelligent group formation system that incorporated a variety of algorithms, including homogeneous and heterogeneous algorithms, as well as the jigsaw algorithm. This system offered a comprehensive solution for supporting teachers in the formation and analysis of groups using learning log data derived from the BookRoll learning system [
24]. Moreover, Sukstrienwong [
12] proposed a GA with a particular fitness function that focuses on using students’ learning styles and educational backgrounds as the core criteria for group assignments. This approach aimed to establish a fair grouping mechanism, ensuring optimal allocation of heterogeneous students within groups to maximize learning outcomes. To achieve this, researchers developed a web-based application called “Genetic Algorithm for Student Group Formation (GAFSG)”. According to the authors, GAFSG supported various group sizes and diverse grouping attributes. It allowed teachers to adjust and determine various attributes to be considered during the student grouping process and consider multiple heterogeneity factors based on the specific needs of the student population.
However, among those applications presented in
Figure 1, two studies developed specific applications and conducted experiments to provide innovative technical contributions, with plans for future improvements and performance enhancements [
23,
25]. One study developed a standalone web application integrated into Sakai, which is the e-learning platform of the Universitat Politècnica de València [
26]. Only one study explicitly mentioned a publicly accessible web application [
12], but searches using the provided URL and name yielded no results. Researchers still need to improve the accessibility and usability of their applications to better promote research findings and serve in educational practices.
In this field, some highly cited review studies have conducted in-depth analyses and summaries of computational techniques (i.e., algorithms) and students’ key attributes used to support team formation in collaborative learning environments [
16,
27]. Furthermore, another review study comprehensively explored the application of GA in team building, covering aspects such as student attributes, genetic operators, initial settings, and termination conditions [
22]. While existing research has explored the application of algorithms in team formation, it may have overlooked a comprehensive evaluation of these algorithms’ effects in actual teaching environments. Therefore, this systematic review examines the algorithms applied in team formation together with the grouping types and criteria they use. It also compares the learning effects of algorithm-supported teaming with those of traditional teaming methods in actual teaching environments and reviews the assessment methods and experimental designs (data collection and analysis methods) of the studies. In addition, it outlines the academic levels and professional backgrounds of the students involved in algorithm-supported teaming. Overall, this review may also serve as an update to previous literature reviews in this field. It aims to provide educators and researchers with deeper insights to help them use algorithms effectively to promote collaborative learning and interdisciplinary collaboration.
3. Results
We conducted a systematic literature review to explore the development and application of group formation algorithms in collaborative learning environments. A total of 20 articles satisfied our search criteria. As shown in
Figure 3, there has been a clear rise in publications on team formation algorithms in collaborative learning since 2020. This trend peaked in 2021, the year with the highest number of published journal articles. The rise coincides with the COVID-19 pandemic that began in early 2020. As remote learning and digital collaboration tools became essential, researchers may have directed their focus towards developing algorithms for enhancing online and hybrid learning environments. The peak in publications in 2021 could reflect these efforts reaching maturity, whereas the decline in 2022 might suggest a stabilization of research trends as researchers adapted to the ‘new normal’ of education.
Nevertheless, while there was a decrease in 2022, the overall trend suggests ongoing research activity and strong academic interest in this field. This not only reflects continued focus on collaborative learning and group formation algorithms but also highlights their potential and importance for future research.
3.1. Algorithms of Team Formation
In this section, we provide a comprehensive and detailed analysis of the algorithms used in 20 articles published between 2014 and 2023. Forming groups is a classic optimization challenge that demands an appropriate solution from among tens of thousands of possible group configurations. The algorithm is one of the critical factors used to automatically generate teams.
Table 3 illustrates the utilization of algorithms throughout all the literature under analysis. This study discovered that GAs were the most favored approach for grouping within the academic community, with research utilizing GAs constituting 70% of the total shown in
Figure 4. GAs are capable of finding optimal or near-optimal solutions from a multitude of possible solutions within a limited time frame. Their advantage lies in the ability to flexibly generate groups based on different criteria while maintaining a certain level of computational efficiency [
12,
15,
32]. The flexibility and efficiency of this algorithm may be one of the key reasons for its widespread application across various fields and problem-solving strategies.
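To give a sense of the scale of the search space described above, the number of distinct ways to split a class into equal-sized groups can be computed directly. The short Python sketch below is illustrative only; the function name `count_equal_partitions` and the class sizes are assumptions made for the example and do not come from the reviewed studies.

```python
from math import factorial


def count_equal_partitions(n_students: int, group_size: int) -> int:
    """Count the ways to split n_students into unlabeled groups of equal size.

    Uses n! / ((k!)^g * g!), where k is the group size and g the number of groups.
    """
    if group_size <= 0 or n_students <= 0 or n_students % group_size != 0:
        raise ValueError("n_students must be a positive multiple of group_size")
    n_groups = n_students // group_size
    return factorial(n_students) // (factorial(group_size) ** n_groups * factorial(n_groups))


print(count_equal_partitions(12, 3))  # 15400
print(count_equal_partitions(30, 5))
```

For 12 students in groups of three there are already 15,400 configurations, and for 30 students in groups of five the count is on the order of 10^17, which is why exhaustive search quickly becomes impractical.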
A GA simulates the principles of natural selection and genetics, optimizing solutions through an iterative process, which makes it particularly effective in tackling complex problems where multiple factors and constraints need to be considered. A GA’s general design is shown in
Figure 5. It starts at generation 0 (Gen = 0) with a randomly generated initial population. All individuals are evaluated by a fitness function. Reproduction, crossover and mutation are three common GA operators applied within a single generation to produce new offspring. These operators aim to preserve the chromosomes (or parts of them) that represent superior solutions under the principles of natural selection [
33]. In general, the reproduction operator uses the fitness function to choose the individuals that will make up the next generation, so fitter individuals are more likely to be selected. The crossover operator combines two chromosomes (parents) to create new chromosomes (offspring). It can be implemented in a variety of ways, including heuristic crossover, two-point crossover and single-point crossover [
34]. The mutation operator randomly alters part of a chromosome, which introduces new genetic material and helps maintain diversity in the population. GAs are therefore valued not only in theoretical research but also in practical applications.
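The general GA loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design used in any of the reviewed studies: each student is represented by a single hypothetical numeric score, a chromosome is a permutation of student indices cut into consecutive groups, fitness is the negative variance of the group means, reproduction uses tournament selection, crossover is a single-point variant adapted to permutations, mutation swaps two students, and the best individual is always carried forward (elitism).

```python
import random
from statistics import mean, pvariance


def decode(chromosome, group_size):
    """Cut a permutation of student indices into consecutive groups."""
    return [chromosome[i:i + group_size] for i in range(0, len(chromosome), group_size)]


def fitness(chromosome, scores, group_size):
    """Higher is better: groups whose mean scores are close to each other."""
    group_means = [mean(scores[s] for s in g) for g in decode(chromosome, group_size)]
    return -pvariance(group_means)


def select(population, fitnesses, rng, k=3):
    """Reproduction by tournament: the fittest of k random individuals is chosen."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitnesses[i])]


def crossover(parent_a, parent_b, rng):
    """Keep a prefix of parent A, then add the missing students in parent B's order."""
    cut = rng.randrange(1, len(parent_a))
    head = parent_a[:cut]
    taken = set(head)
    return head + [s for s in parent_b if s not in taken]


def mutate(chromosome, rng, rate=0.2):
    """Swap mutation: with probability `rate`, exchange two students."""
    child = chromosome[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def evolve(scores, group_size, pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)
    n = len(scores)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]  # Gen = 0
    for _ in range(generations):
        fitnesses = [fitness(c, scores, group_size) for c in population]
        best = population[max(range(pop_size), key=lambda i: fitnesses[i])]
        offspring = [best[:]]  # elitism: the best individual survives unchanged
        while len(offspring) < pop_size:
            a = select(population, fitnesses, rng)
            b = select(population, fitnesses, rng)
            offspring.append(mutate(crossover(a, b, rng), rng))
        population = offspring
    fitnesses = [fitness(c, scores, group_size) for c in population]
    best = population[max(range(pop_size), key=lambda i: fitnesses[i])]
    return decode(best, group_size), max(fitnesses)


# Hypothetical test scores for 12 students, grouped into teams of four.
scores = [55, 90, 72, 64, 88, 47, 79, 93, 60, 70, 85, 51]
groups, best_fitness = evolve(scores, group_size=4)
print(groups, best_fitness)
```

Because of elitism, the best fitness in the population can never decrease from one generation to the next. A real system would replace the single score with the multidimensional student attributes discussed in this review.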
The following content elaborates on the innovative improvements made to GAs in various aspects in numerous studies. These enhancements include the optimization and adjustment of genetic operators, as seen in [
10,
14,
25,
32,
35], aiming to enhance the search efficiency and quality of solutions of the algorithm. Additionally, some studies focused on refining the fitness functions [
12,
15], aiming to more accurately assess individual fitness and promote a more effective evolutionary direction. Moreover, improvements to the algorithmic model itself have been a focal point [
11], adjusting the algorithm’s structure and processes to accommodate more complex application scenarios. Furthermore, several studies have explored the integration of GAs with other algorithms [
23,
36,
37], a cross-disciplinary fusion that not only expands the application scope of GAs but also provides new insights and methods for tackling more complex problems. These advancements and integrations continuously propel GAs toward greater efficiency, precision and adaptability, showcasing the immense potential and value of GAs in solving real-world problems.
For instance, Chen and Kuo [
15] provided an innovative approach to forming groups using GAs that incorporates a penalty function. This method takes into account the heterogeneity of students’ knowledge levels and learning roles, as well as the homogeneity of social interactions among group members, as assessed through social network analysis. A fitness function was employed to assess the availability, and subsequently, a globally optimized solution was developed by allocating varying weights to student characteristics. As a result, collaborative groups with balanced educational attributes are formed within a problem-based collaborative learning context. Additionally, in Krouska and Virvou’s study [
32], the authors introduced a novel GA for grouping students within a social network-based learning system. This approach allows for a more thorough exploration of the problem space and introduces new genetic information into the population, effectively preventing the algorithm from getting stuck in local optima. Aside from that, this approach outperforms the basic GA technique in terms of efficiency and accounts for a broader range of parameters than typically observed.
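To make the idea of a weighted fitness function with a penalty term more concrete, the following Python sketch scores a candidate grouping. It is an assumption-laden illustration and not the formula of Chen and Kuo: the student fields (`id`, `knowledge`, `role`, `friends`), the three component measures, the default weights and the group-size constraint used for the penalty are all hypothetical.

```python
from statistics import pstdev


def group_fitness(group, weights=(1.0, 1.0, 1.0)):
    """Score one group of student dicts (all field names are hypothetical).

    Rewards a wide spread of knowledge levels, a variety of learning roles
    and dense social ties inside the group.
    """
    w_know, w_role, w_social = weights
    knowledge_spread = pstdev([s["knowledge"] for s in group])
    role_diversity = len({s["role"] for s in group}) / len(group)
    ids = {s["id"] for s in group}
    # Count directed friendship ties that stay inside the group.
    ties = sum(len(s["friends"] & (ids - {s["id"]})) for s in group)
    possible = len(group) * (len(group) - 1)
    social_cohesion = ties / possible if possible else 0.0
    return w_know * knowledge_spread + w_role * role_diversity + w_social * social_cohesion


def penalized_fitness(groups, min_size, max_size, penalty=10.0, weights=(1.0, 1.0, 1.0)):
    """Sum the group scores and subtract a fixed penalty per constraint violation."""
    raw = sum(group_fitness(g, weights) for g in groups)
    violations = sum(1 for g in groups if not (min_size <= len(g) <= max_size))
    return raw - penalty * violations


pair = [
    {"id": 1, "knowledge": 0.2, "role": "leader", "friends": {2}},
    {"id": 2, "knowledge": 0.8, "role": "recorder", "friends": {1}},
]
print(group_fitness(pair))
print(penalized_fitness([pair], min_size=3, max_size=4))
```

The penalty term lets the search visit infeasible groupings (here, groups of the wrong size) while steering it away from them, which is the usual motivation for penalty functions in constrained optimization.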
However, beyond the enhanced GA, several studies have explored the integration of GAs with other computational methods. This interdisciplinary approach leverages the strengths of different algorithms to address complex problems more effectively. For example, in Berge and Mark’s study [
37], they introduced the “Team Machine” as a tool designed to form student teams with the highest level of diversity. The Team Machine incorporates a variety of search algorithms, including the Greedy Randomized Adaptive Search Procedure (GRASP) and GAs. The GRASP, serving as a precise local search technique, is focused on finding optimized solutions within a constrained search area. On the other hand, GAs, as a population-based search strategy, excel in expanding the search horizon by exploring a wider array of solution spaces, thereby enhancing the quality of solutions initially identified by the GRASP. By synergistically utilizing both algorithms, this method not only bolsters the comprehensive exploration of potential team configurations but also aims to refine and elevate the best solutions discovered.
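The GRASP stage of such a hybrid can be sketched as follows. This is a generic GRASP written for illustration, not the Team Machine implementation, and the subsequent GA refinement stage is omitted. The single categorical trait per student, the diversity objective, the `alpha` parameter controlling the size of the restricted candidate list and the swap-based local search are all assumptions.

```python
import random


def diversity(teams, trait):
    """Objective: number of distinct trait values per team, summed over teams."""
    return sum(len({trait[s] for s in team}) for team in teams)


def construct(trait, team_size, rng, alpha=0.3):
    """Greedy randomized construction: fill one seat at a time, picking at
    random from a restricted candidate list (RCL) of the best candidates."""
    unassigned = list(range(len(trait)))
    teams = []
    while unassigned:
        team = []
        while unassigned and len(team) < team_size:
            seen = {trait[s] for s in team}
            # Students who would add a new trait value to the team rank first.
            ranked = sorted(unassigned, key=lambda s: trait[s] in seen)
            rcl = ranked[:max(1, int(alpha * len(ranked)))]
            pick = rng.choice(rcl)
            team.append(pick)
            unassigned.remove(pick)
        teams.append(team)
    return teams


def local_search(teams, trait):
    """Swap students between teams for as long as a swap improves the objective."""
    improved = True
    while improved:
        improved = False
        for a in range(len(teams)):
            for b in range(a + 1, len(teams)):
                for i in range(len(teams[a])):
                    for j in range(len(teams[b])):
                        before = diversity([teams[a], teams[b]], trait)
                        teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
                        if diversity([teams[a], teams[b]], trait) > before:
                            improved = True
                        else:  # undo a swap that did not help
                            teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
    return teams


def grasp(trait, team_size, iterations=20, seed=0):
    """Repeat construction plus local search and keep the best grouping found."""
    rng = random.Random(seed)
    best, best_score = None, -1
    for _ in range(iterations):
        teams = local_search(construct(trait, team_size, rng), trait)
        score = diversity(teams, trait)
        if score > best_score:
            best, best_score = [t[:] for t in teams], score
    return best, best_score


# Hypothetical majors of nine students, to be split into teams of three.
majors = ["cs", "cs", "cs", "bio", "bio", "bio", "art", "art", "art"]
teams, score = grasp(majors, team_size=3)
print(teams, score)
```

In a hybrid of the kind described above, the groupings returned by repeated GRASP runs could seed the initial population of a GA, which would then continue the search from those starting points.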
Table 3.
Overview of technical team formation algorithms.
| Algorithms | Number | Literature |
|---|---|---|
| Genetic algorithm | 14 | [10,11,12,13,14,15,23,25,32,35,36,37,38,39] |
| Team formation algorithm based on coalition structure generation, Belbin’s theory and Bayesian learning | 1 | [26] |
| Variable neighborhood search algorithm | 1 | [40] |
| k-means algorithm | 1 | [19] |
| Group algorithm | 1 | [41] |
| Cluster and prune | 1 | [42] |
| Minimum entropy collaborative grouping | 1 | [43] |
Furthermore, as demonstrated in
Figure 4 and
Table 3, among the selected articles for this analysis, six papers employed non-genetic algorithms. Each of these algorithms was unique, having been independently developed and applied, including the variable neighborhood search algorithm [
40], the k-means algorithm [
19], the group algorithm [
41], the cluster and prune method [
42] and minimum entropy collaborative grouping [
43]. The diverse selection of algorithms in these studies not only enriches the methodological landscape of group formation but also offers a broader perspective and potential solutions for addressing specific grouping challenges. For instance, Lambić et al. [
40] developed an application utilizing the variable neighborhood search algorithm, aimed at addressing the process of group formation as a mathematical optimization problem. Aside from that, Ramos et al. [
19] employed the k-means algorithm, integrating three distinct similarity distance metrics—the Euclidean distance, Manhattan distance and cosine similarity—along with key student attributes extracted from learning paths for group formation. This comprehensive approach of applying multiple metrics and student characteristics for grouping demonstrates an innovative strategy and method for addressing complex grouping challenges in the educational sector.
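The three similarity measures named above have simple definitions, shown here in plain Python together with the assignment step of k-means. The sketch is illustrative and is not the implementation of Ramos et al.; the helper names and the example vectors are assumptions. Note that standard k-means is defined with the Euclidean distance, so using the other two measures is a variation on the basic algorithm.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences, coordinate by coordinate."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cosine_distance(a, b):
    """Turn the similarity into a dissimilarity so it can be minimized."""
    return 1.0 - cosine_similarity(a, b)


def nearest_centroid(point, centroids, distance=euclidean):
    """Assignment step of k-means: index of the closest centroid."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))


student = [10, 10]
centroids = [[1, 1], [10, 4]]
print(nearest_centroid(student, centroids, euclidean))        # 1
print(nearest_centroid(student, centroids, cosine_distance))  # 0
```

The example shows why the choice of measure matters: the Euclidean distance assigns the student to the centroid that is closest in magnitude, whereas the cosine measure assigns the student to the centroid with the most similar profile shape.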
In summary, this trend highlights the widespread application and dominant position of GAs in the research of automatic group formation, motivating researchers and developers to further explore GAs and other optimization techniques. Despite the clear advantages of GA, 30% of the studies employed other algorithms, indicating that there is still room for exploring new methods and improving existing ones in the field of automatic grouping. The diversity of these methods not only provides researchers with the flexibility to choose or develop the most suitable algorithm based on specific needs but also paves the way for interdisciplinary collaboration. It encourages experts from fields such as computer science, artificial intelligence, educational technology and psychology to work together in developing more efficient and intelligent grouping systems.
3.2. Grouping Types of Team Formation
In the automated process of forming groups, the selection of algorithms, together with the type of grouping and the student characteristics chosen, plays a pivotal role in determining the effectiveness of the grouping. Group formations can be categorized into three main types: homogeneous, heterogeneous and mixed. This section reviews the types of groups that the algorithms mentioned earlier (such as GAs) form, and the characteristic attributes they require, when producing groups with homogeneous or heterogeneous features according to optimization functions. As shown in
Figure 6, 15% of the studies adopted homogeneous grouping methods [
14,
19,
35], while 30% of the research opted for heterogeneous grouping approaches [
11,
12,
26,
32,
42,
43]. Most notably, over half of the studies explored mixed grouping methods that combined both homogeneous and heterogeneous characteristics, highlighting the significance and prevalence of mixed grouping methods in current research.
3.2.1. Homogeneity
Homogeneous grouping gathers members with similar characteristics, creating groups that exhibit a high level of similarity among members when all characteristics are considered collectively. According to Jensen and Lawson [
17], when students with similar beginning reasoning abilities are grouped together, they tend to have more positive attitudes toward collaboration in the context of inquiry-based learning, and this leads to greater performance. Similarly, in their study, Sanz-Martínez et al. [
44] discovered that when learning engagement is homogeneous, team assignments exhibit higher quality due to increased interactions and self-efficacy. While Vygotsky [
45] suggests that heterogeneity among group members and their resources is beneficial, homogeneous compositions in learning engagement patterns can prevent the neglect and isolation of learners during group activity [
46].
In this systematic review, only three studies exclusively utilized homogeneous grouping methods [
14,
19,
35], accounting for 15% of the total. For instance, Oscar et al. [
35] introduced a strategy for forming homogeneous groups within collaborative learning settings aimed at enhancing cooperation and improving educational outcomes, both collectively and individually. The establishment of these groups relies on personality traits, specifically employing the Big-Five personality model (extraversion, agreeableness, conscientiousness, neuroticism and openness). Furthermore, they measured the homogeneity across all students in a group using the fitness function of a GA. Empirical evidence has shown that homogeneous groups formed through GAs achieve better learning outcomes compared with traditional methods of grouping students based on their preferences. Similarly, Ramos et al. [
19] demonstrated through experimental validation that 75% of students experienced an improvement in their grades when grouped homogeneously.
3.2.2. Heterogeneity
Heterogeneous grouping, in contrast, combines members with differing or complementary characteristics. This arrangement results in greater diversity among group members, fostering a culture of mutual complementarity and learning that can boost the team’s capacity for innovation and problem-solving diversity. In the didactic condition, Jensen and Lawson [
17] found that heterogeneous groups performed better than homogeneous groups. Likewise, in terms of knowledge, skills and abilities, including team behavior skills, heterogeneous teams are likely to be more successful than homogeneous ones. This is because diverse teams have access to a wider range of knowledge and perspectives, and members can learn from each other and generate new ideas by combining their qualifications [
47,
48]. In conclusion, the practice of heterogeneous grouping is increasingly favored as it more effectively accommodates a variety of educational settings, leading to the desired outcomes in collaborative learning.
As illustrated in
Figure 6, six studies opted for heterogeneous grouping criteria [
11,
12,
26,
32,
42,
43], which accounted for 30% of the total. Based on Belbin’s team role theory, Alberola et al. [
26] developed an artificial intelligence tool for creating heterogeneous teams in the classroom. This theory suggests that a successful team should consist of eight distinct roles—plant, resource investigator, coordinator, shaper, monitor evaluator, team worker, implementer and finisher—to foster successful teamwork. Compared with traditional team formation methods, students believe this new approach enhances the level of collaboration, leading to greater satisfaction with their teammates’ cooperation and a higher regard for their teammates. Krouska and Virvou [
32] developed an innovative GA for creating heterogeneous groups. The algorithm takes into account the three key dimensions of learning within the social networking-based learning environment—academic, cognitive and social dimensions—while utilizing the characteristics generated from these dimensions to foster team collaboration and enhance learning outcomes at both the group and individual levels. The vast majority of students reported receiving effective support from other group members during project work, indicating that the heterogeneous group configuration achieved the desired positive effects in practice. To create heterogeneous groups more efficiently, Toni and Ramon [
43] designed an advanced algorithm called “minimum entropy collaborative grouping” based on complex network theory. The results indicate that groups formed through this algorithm are more efficient, exhibit lower uncertainty, have stronger interconnections and demonstrate a higher level of maturity.
3.2.3. Homogeneity and Heterogeneity
Mixed grouping adopts a more adaptable strategy, applying homogeneous grouping for certain characteristics while utilizing heterogeneous grouping for others. This method not only preserves similarities among group members but also introduces diversity within the group, promoting mutual understanding and sparking innovation and adaptability within the team.
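One simple way to express a mixed criterion as an optimization function is to reward spread on the attributes that should differ and to penalize spread on the attributes that should match. The Python sketch below is a hypothetical illustration rather than a formula from the reviewed studies; the attribute names are invented, and all attributes are assumed to be numeric and normalized to a common scale.

```python
from statistics import pstdev


def spread(group, attribute):
    """Population standard deviation of one numeric attribute within a group."""
    return pstdev([student[attribute] for student in group])


def mixed_score(group, similar_on, diverse_on):
    """Higher is better: low spread on `similar_on`, high spread on `diverse_on`."""
    reward = sum(spread(group, a) for a in diverse_on)
    cost = sum(spread(group, a) for a in similar_on)
    return reward - cost


group = [
    {"engagement": 0.5, "knowledge": 0.2},
    {"engagement": 0.5, "knowledge": 0.8},
]
# Mixed: similar engagement, diverse knowledge.
print(mixed_score(group, similar_on=["engagement"], diverse_on=["knowledge"]))
# Purely homogeneous criterion: every attribute should match.
print(mixed_score(group, similar_on=["engagement", "knowledge"], diverse_on=[]))
```

Purely homogeneous and purely heterogeneous grouping fall out as special cases in which one of the two attribute lists is empty.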
In 55% of the studies, the focus was not solely on homogeneous or heterogeneous grouping criteria. As shown in
Table 4, among these 11 articles, three studies adopted criteria for intergroup homogeneity and intragroup heterogeneity [
13,
36,
37], two implemented mixed grouping strategies that combine homogeneous and heterogeneous characteristics [
15,
40], another two allowed for either homogeneous or heterogeneous grouping [
23,
39], and four papers offered the flexibility to choose among homogeneous, heterogeneous or mixed grouping methods [
23,
25,
38,
41]. This demonstrates the diversity and flexibility of grouping approaches.
Lin et al. [
36] created an advanced method that combines GAs and the Technique for Order Preference by Similarity to Ideal Solution method. Based on this, a web-based grouping support system was developed, aimed at assisting educators in effectively grouping students according to intergroup homogeneity and intragroup heterogeneity. In this way, students with stronger skills within a group can support those who need help. That aside, groups are balanced based on multiple criteria, which ensures equity among groups and reduces disparities. As a result, this method not only makes competition between groups fairer but also enhances learning enthusiasm and helps improve learning outcomes. In addition, Chen and Kuo [
15] implemented a novel method for creating groups that combines GAs with penalty functions. The goal was to construct collaborative learning groups with a balanced distribution of learning characteristics. This approach considers the heterogeneity of students’ knowledge levels and learning roles, as well as the homogeneity of social interactions among group members. It not only improves students’ academic achievement but also enriches their interactions in a problem-based collaborative learning context.
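The two criteria of intergroup homogeneity and intragroup heterogeneity can be quantified with two small measures, sketched below in Python. This is an illustrative assumption and does not reproduce the GA and TOPSIS procedure of Lin et al.; each student is reduced to a single hypothetical score and the function names are invented.

```python
from statistics import mean, pstdev


def intra_group_heterogeneity(groups):
    """Average within-group spread of scores (larger means more mixed groups)."""
    return mean(pstdev(g) for g in groups)


def inter_group_homogeneity(groups):
    """Negative spread of the group means (closer to zero means fairer groups)."""
    return -pstdev([mean(g) for g in groups])


balanced = [[90, 50], [80, 60]]  # strong and weak students share a group
streamed = [[90, 80], [60, 50]]  # students sorted into groups by score

for name, grouping in (("balanced", balanced), ("streamed", streamed)):
    print(name, intra_group_heterogeneity(grouping), inter_group_homogeneity(grouping))
```

On this toy data the balanced grouping has identical group means and a larger within-group spread, whereas the streamed grouping has groups that are internally similar but far apart from each other.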
The mixed grouping approach provides researchers with the greatest freedom, enabling them to select the most suitable grouping strategy according to their specific educational goals and learning tasks. This strategy integrates the benefits of both homogeneous and heterogeneous grouping, fostering mutual comprehension and cooperation among members while also encouraging team creativity and flexibility. Educators and researchers are exploring various ways for mixed grouping to address the varied learning demands and objectives in a more personalized and dynamic manner. According to a detailed analysis of 20 pieces of research, it is clear that both heterogeneous and homogeneous grouping have their own advantages and specific situations where they are suitable. However, the mixed grouping approach garnered more attention due to its flexibility and multiple benefits. The mixed grouping method’s diversity and adaptability may significantly improve learning results, team cooperation and innovative capacities. This approach enables educators to adapt grouping strategies flexibly according to the particular requirements and learning objectives of students, thus enhancing support for the learning process.
3.2.4. Characteristics
There are two types of student characteristics: dynamic and static [
11]. Static characteristics include gender, age, prior knowledge and learning styles, which do not change, or change only slowly. Dynamic characteristics, such as a student’s interaction levels and emotional states, change over the course of the learning process and therefore cannot be fully captured at a single point in time. In this literature review, as shown in
Figure 7, the most common approach was to group students on static characteristics alone, which was done in 11 papers (55% of the total). Only 25% of the studies took into account both static and dynamic characteristics. Studies that grouped students solely on dynamic characteristics were rare, with only one paper taking this approach. Additionally, three papers described teachers defining grouping criteria based on the content of collaborative activities, without specifying the exact characteristics to be included.
One possible reason for the widespread use of static characteristics in grouping research is that these traits cover several key dimensions needed for grouping in the educational field. Especially in educational settings, a student’s prior academic performance, study habits and professional skills as well as communication and leadership skills, which are required for teamwork, are crucial. These static characteristics include learning styles, prior academic performance, learning roles, gender, age, major, coding skills, leadership level, knowledge levels, communication skill level and personality traits. Due to their relative stability and ease of measurement, these traits have become the preferred choice in educational grouping research. In contrast, dynamic characteristics mainly involve students’ social interactions and emotional states, which may change during the learning process, including social interactions, interpersonal relationships and social networks. Due to the fluidity and complexity of dynamic characteristics, studies based solely on these traits for grouping are relatively rare. Most research that included dynamic characteristics tended to also incorporate static traits, aiming to achieve a more comprehensive student profile and more effective grouping strategies through integrated analysis. Integrating both static and dynamic student characteristics results in a more effective collaborative outcome compared with relying solely on one type of characteristic [
11,
32]. For example, to create optimal groups, Chen and Kuo [
15] considered the topic knowledge levels, learning roles and social interactions of the students. Similarly, Krouska and Virvou [
32] examined 17 variables related to social, cognitive and academic domains to create groups.
In conclusion, static characteristics are frequently used due to their broad coverage of key dimensions required for educational grouping and their relative stability and ease of measurement. Although dynamic characteristics are crucial for understanding students’ social and emotional states, they often need to be combined with static traits in practical applications to achieve more effective student grouping. This finding points to potential future research directions, namely exploring how to better utilize dynamic characteristics and how to optimize the combination of dynamic and static traits to achieve higher efficiency and effectiveness in educational grouping.
3.3. Outcomes and Evaluation
3.3.1. Reviewed Research Results
All selected articles conducted empirical experiments in classroom settings and included control groups. These experiments primarily aimed to compare team formation methods supported by innovative algorithms against traditional grouping approaches. The data presented in
Figure 8 clearly show that the combination of random and self-organized grouping methods was the most frequently discussed approach in the literature, with six studies employing this comparative approach [
10,
11,
13,
15,
36,
40]. Following closely were five studies that compared the algorithm-supported methods against instructor-led team formations [
14,
23,
35,
37,
41], with some research focusing solely on self-organized or random grouping as a control. Notably, some studies also engaged in comparisons between algorithms, such as contrasting an improved GA with a simple GA [
32]. Across the reviewed studies, experimental validation consistently showed that the grouping methods supported by the proposed algorithms outperformed traditional approaches in several key aspects: higher academic performance [
10,
11,
13,
14,
15,
19,
35,
36,
37,
38,
40,
41], increased satisfaction [
10,
26], improved collaborative experiences [
11,
15,
25,
39] and a positive impact on student engagement and affect [
23]. Furthermore, the results from the improved GA were also superior to those of the simple GA [
32].
3.3.2. Evaluation
In this review, 20 related works were analyzed along three core dimensions: research design, data collection and data analysis methods. All of the reviewed studies used empirical research methods: 19 were quantitative studies and only one was a mixed-methods study. As shown in the upper left bar chart in
Figure 9, among the 20 articles analyzed, controlled experiments were the most common method of data collection, being employed in 18 studies. Academic performance tests followed, used in 13 articles. Additionally, the survey method was adopted in 12 studies, underscoring its important role in the data collection process. The upper right bar chart in
Figure 9 further reveals the application of data analysis methods. Notably, three articles [
11,
23,
39] used observational methods to gather qualitative data, which were subsequently analyzed using quantitative content analysis. This approach transforms qualitative data, such as text, into numerical data that can be analyzed quantitatively. By combining the depth of qualitative data with the precision of quantitative research, it enables researchers to examine qualitative data systematically and objectively. Both qualitative and quantitative analysis methods were used, with descriptive statistics (applied in 13 articles) and T-tests (applied in eight articles) being the most common quantitative data analysis methods. One study [
15] combined interviews that were analyzed qualitatively with performance tests that were analyzed quantitatively using analysis of variance. Because a single article can fall into several study design categories, the total number of articles depicted in the bubble charts exceeds that in the bar charts. The bubble charts offer a perspective on the concentration and diversity of research methods. The lower left bubble chart shows the mix of research designs and data collection methods and suggests that empirical quantitative research is the most common methodology. The lower right bubble chart shows the combination of data analysis methods and research designs.
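The quantitative content analysis mentioned above can be illustrated with a minimal sketch. It assumes that observation notes have already been segmented and hand-coded; the code labels (`question`, `explanation`, `off_task`) and the function name `code_frequencies` are hypothetical and are not taken from any reviewed study.

```python
from collections import Counter

def code_frequencies(coded_segments):
    """Turn hand-coded text segments into counts and proportions per code.

    coded_segments: list of (text, code) pairs produced by a human coder.
    Returns {code: (count, proportion_of_all_segments)}.
    """
    counts = Counter(code for _, code in coded_segments)
    total = sum(counts.values())
    return {code: (n, n / total) for code, n in counts.items()}

# Invented example segments for illustration only.
segments = [
    ("How do we split the tasks?", "question"),
    ("The loop stops because i reaches n.", "explanation"),
    ("Which city should we pick?", "question"),
    ("Did you watch the match?", "off_task"),
]
print(code_frequencies(segments))
```

Once the codes are expressed as counts or proportions, they can be summarized or compared between groups with the same statistical tools used for test scores.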
In a study conducted by Oscar et al. [
38], an experiment was designed to compare traditional grouping based on student preferences with an improved GA integrated with the Big Five Inventory. During the academic years B-2019 and A-2020, 238 students from systems engineering programs at the University of Nariño, Mariana University and CESMAG University in San Juan de Pasto, Colombia participated in the experiment, engaging in collaborative learning activities across 14 programming and related courses. Courses numbered 1–10 were designated as the experimental group, where students were grouped using an improved GA after taking the Big Five Inventory test. Courses numbered 11–14 served as the control group, where teams were formed based on student preferences without the Big Five Inventory test. After grouping, teachers assigned tasks, and students engaged in collaborative activities, with post-tests conducted for both the experimental and control groups at the end of the experiment. The students completed a collaborative project in each course, and after the activities, they filled out a questionnaire covering seven collaboration indicators: participation and decision making, conflict management, problem resolution, internal communication, external communication, collaboration and leadership. Using descriptive statistics, the study found that the collaborative performance of groups formed by the proposed approach (experimental group) was generally superior to that of the traditionally formed groups (control group). Furthermore, the experiment suggested that forming groups without considering personality traits typically resulted in inferior outcomes.
The impact of various grouping techniques on learning outcomes was investigated in a study by Amara et al. [
41] by contrasting teacher-organized groups with groups created by an algorithm. The study involved 54 students from a private secondary school participating in a collaborative learning project in a French language class. The project required student teams to gather information about a selected city, visit the city together, communicate with teammates and incorporate photos and videos into their project, culminating in a report about the chosen city. For this purpose, two types of groups were established: nine groups of three students each, created manually by teachers, served as the control group, and nine groups of three students each, created automatically by an algorithm, served as the experimental group. Pre-tests and post-tests were conducted before and after the collaborative activity to assess the learning outcomes. Descriptive analysis of the test results showed that at the pre-test stage, the average scores of both groups were nearly identical and relatively low, reflecting limited knowledge about the written report topic among students in both the control and experimental groups at the outset. However, in the post-test phase, although the average scores of both groups improved, the increase was larger in the experimental group. This result suggests that the algorithm-based grouping method had a positive impact on students’ learning outcomes, achieving greater success compared with the teacher-managed control group.
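The pre-test/post-test comparison used in studies of this kind can be sketched as follows. This is a minimal illustration, not the analysis performed in any reviewed study: the helper names (`gains`, `describe`, `welch_t`) are hypothetical, the scores are invented, and Welch’s t statistic is included only because T-tests were among the most common analysis methods in the reviewed works.

```python
from math import sqrt
from statistics import mean, variance

def gains(pre, post):
    """Per-student improvement from pre-test to post-test."""
    return [after - before for before, after in zip(pre, post)]

def describe(values):
    """Basic descriptive statistics: sample size, mean, sample standard deviation."""
    return {"n": len(values), "mean": mean(values), "sd": sqrt(variance(values))}

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Invented scores for illustration only; NOT data from any reviewed study.
ctrl_pre, ctrl_post = [8, 9, 7, 10], [10, 12, 8, 12]
exp_pre, exp_post = [8, 9, 7, 10], [13, 15, 11, 15]

ctrl_gain = gains(ctrl_pre, ctrl_post)
exp_gain = gains(exp_pre, exp_post)
print(describe(ctrl_gain), describe(exp_gain))
print(round(welch_t(exp_gain, ctrl_gain), 3))
```

Comparing gains rather than raw post-test scores takes into account that the two groups may start from slightly different baselines.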
3.4. Education Content
Research on automatic team formation covers a wide range of educational content, yet the distribution is uneven. As depicted in
Figure 10, the data reveal that only 25% of the studies focused on school education, while 75% were dedicated to higher education, with 15 studies specifically targeting this sector. Within higher education, the fields of computer science and engineering dominated, accounting for 50% of the research and significantly surpassing other educational areas. The specific courses involved include Programming [
10,
14,
25,
35,
36], Data Structures [
12], Discrete Mathematics [
19] and Software Technology [
32]. This bias towards STEM fields may be attributed to the researchers and developers themselves being faculty members in computer science or engineering who, for convenience in sampling, chose their own students as participants. Nonetheless, business and education majors also held certain proportions at 15% and 10%, respectively, indicating that research on automatic team formation is gradually expanding to other educational fields. The courses involved in these fields include the MBA program [
26] and Modern Educational Technologies [
11].
However, compared with the extensive research in higher education, studies at the school level accounted for only 25%, a relatively low proportion. Courses involved at the school level include math problem-solving classes [
23], French language classes [
41] and some activities that require group collaboration in report writing [
15]. This suggests that there is significant room for improvement in the application and research of automatic team formation at the foundational education stage. Moreover, given the importance of teamwork across various majors and courses, especially in the fields of medicine and the arts, encouraging more research in these areas is particularly crucial. By expanding the range of majors involved in the research, a more comprehensive exploration and understanding of the application and effectiveness of automatic team formation technology across different educational backgrounds can be achieved, thereby fostering innovation and improvement in educational practices.
In summary, although current research on automatic team formation has achieved certain results in the field of higher education, it still needs to be expanded to more educational fields and learning stages, especially those professions that require strong teamwork skills. Through such expansion, automatic team formation technology can be more comprehensively evaluated and utilized, thereby enhancing the collaboration skills and learning outcomes of students across different disciplines and stages of learning. Additionally, the potential for research at the school level should not be overlooked, and future studies should continue to shift more focus toward this area.
4. Discussion
In our systematic review, we examined 20 studies exploring how algorithms can be used to create effective teams for educational settings and identified four key areas for analysis. These areas included the algorithms themselves, the characteristics that influence group formation, the research methods used in the studies and the specific technical methods used in real-world educational settings. In this section, we will also discuss the challenges and limitations identified in the literature, along with a critical evaluation of the studies themselves.
4.1. Main Findings of Algorithm-Based Team Formation Research
Our review revealed that Genetic Algorithms (GAs) were the most common algorithm used for team formation, appearing in 70% (14 out of 20) of the studies. However, simple GAs can struggle with large and complex problems, often converging prematurely on solutions that are not the best. To overcome this limitation, researchers developed enhanced GAs. These improved algorithms incorporate better genetic operators and fitness functions, allowing them to explore a wider range of possibilities and find better solutions. This is achieved by introducing new variation into the population (genetic information) and preventing the algorithm from getting stuck in local optima (i.e., suboptimal solutions).
The studies showed that these enhanced GAs outperform simple GAs. They consistently found higher quality solutions and could handle a larger number of factors when forming groups. Importantly, they achieved this within a reasonable amount of time for most real-world applications.
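To make the mechanics concrete, the following is a minimal sketch of a simple GA for group formation. It is not the algorithm of any reviewed study. It assumes a single numeric score per student, a permutation encoding in which consecutive slices form the groups, and a fitness function that rewards balanced group means; all function names and parameter values are illustrative assumptions.

```python
import random
from statistics import pstdev

def split_into_groups(order, group_size):
    """Cut a permutation of student indices into consecutive groups."""
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]

def fitness(order, scores, group_size):
    """Higher is better: the negative spread of the group mean scores."""
    groups = split_into_groups(order, group_size)
    means = [sum(scores[i] for i in g) / len(g) for g in groups]
    return -pstdev(means)

def swap_mutation(order, rng):
    """Exchange two random positions; the result is still a permutation."""
    child = order[:]
    a, b = rng.sample(range(len(child)), 2)
    child[a], child[b] = child[b], child[a]
    return child

def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place and fill the rest in the order seen in p2."""
    lo, hi = sorted(rng.sample(range(len(p1)), 2))
    kept = set(p1[lo:hi])
    filler = [gene for gene in p2 if gene not in kept]
    return filler[:lo] + p1[lo:hi] + filler[lo:]

def evolve(scores, group_size, pop_size=30, generations=200, seed=0):
    """Return groups of student indices (len(scores) should divide evenly)."""
    rng = random.Random(seed)
    n = len(scores)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda o: fitness(o, scores, group_size), reverse=True)
        elite = population[: pop_size // 5]  # elitism: carry over the best 20%
        children = []
        while len(elite) + len(children) < pop_size:
            # Parents are drawn from the better half of the population.
            p1, p2 = rng.sample(population[: pop_size // 2], 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < 0.3:  # assumed mutation rate
                child = swap_mutation(child, rng)
            children.append(child)
        population = elite + children
    best = max(population, key=lambda o: fitness(o, scores, group_size))
    return split_into_groups(best, group_size)
```

The enhanced GAs discussed in the reviewed studies differ from this sketch mainly in the two places it keeps deliberately simple: the genetic operators and the fitness function, which in practice combine several student characteristics rather than a single score.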
However, there is still room for improvement. One area of focus is optimizing the GA model itself. For example, a study proposed a “feature categorization model”—an enhanced GA specifically designed to address the challenge of forming groups when there is little initial data (cold start problem) [
11].
In addition, a number of studies have integrated GAs with other algorithms to improve the grouping results. This hybrid strategy draws on the strengths of several algorithms to tackle challenging problems. For instance, the Team Machine in Berge Mark’s [
37] study combines a GA with GRASP. By using both algorithms together, this method supports a thorough exploration of possible team configurations while refining the best solutions found. Additionally, to handle the trade-offs in multi-objective grouping optimization, Lin et al. [
36] proposed an approach that enhances a GA with the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) and, building on it, developed a web-based group support system to assist educators. Researchers have also investigated a range of algorithms other than GAs. Six papers in this review used non-genetic approaches, including the variable neighborhood search algorithm, the k-means algorithm, the cluster and prune method and minimum entropy collaborative grouping. This varied selection of algorithms offers a wider view and possible solutions for particular grouping problems.
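The way TOPSIS resolves trade-offs between several objectives can be sketched generically. The code below shows the standard TOPSIS ranking steps only; how Lin et al. couple it with their GA is not described here, so the idea of scoring candidate groupings on two criteria, along with the criteria, weights and function name, is an assumption made for illustration.

```python
from math import sqrt

def topsis(matrix, weights, benefit):
    """Return one closeness score in [0, 1] per alternative (higher is better).

    matrix  : rows are alternatives, columns are criteria
    weights : one non-negative weight per criterion
    benefit : True where larger values are better, False where smaller are better
    """
    n_crit = len(weights)
    # Step 1: vector-normalize each column, then apply the criterion weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 2: build the ideal and anti-ideal points, criterion by criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    # Step 3: relative closeness = distance to worst / (distance to ideal + to worst).
    scores = []
    for row in weighted:
        d_ideal = sqrt(sum((x - i) ** 2 for x, i in zip(row, ideal)))
        d_worst = sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        total = d_ideal + d_worst
        scores.append(d_worst / total if total else 0.5)
    return scores

# Hypothetical candidate groupings scored on two invented criteria:
# column 0 = within-group diversity (benefit), column 1 = imbalance between groups (cost).
candidates = [[0.9, 0.2], [0.5, 0.5], [0.2, 0.9]]
print(topsis(candidates, weights=[0.5, 0.5], benefit=[True, False]))
```

In a multi-objective setting, a ranking of this kind allows a single preferred grouping to be selected from candidates that each favor a different objective.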
Next, the grouping type and criteria are also key elements in the grouping process. In terms of grouping type, six studies used only heterogeneous grouping in both the algorithmic and experimental phases, while only three used homogeneous grouping exclusively. The largest number of studies (11) considered both homogeneous and heterogeneous grouping criteria in the algorithmic phase. In the experiments of these 11 studies, three used homogeneous intergroup but heterogeneous intragroup grouping, two used homogeneous grouping for some characteristics but heterogeneous grouping for others, and six used homogeneous, heterogeneous or mixed grouping. This finding suggests that, during the group formation process, researchers tend to explore and experiment with different grouping types to find the most appropriate grouping strategy for a particular learning environment and goal.
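The distinction between intragroup heterogeneity and intergroup homogeneity can be made measurable with two simple statistics. The sketch below assumes a single numeric attribute per student (for example, a prior test score); the function names and the example scores are hypothetical.

```python
from statistics import mean, pvariance

def intragroup_heterogeneity(groups):
    """Average variance inside the groups: larger means more mixed groups."""
    return mean(pvariance(g) for g in groups)

def intergroup_imbalance(groups):
    """Variance of the group means: smaller means the groups resemble each other."""
    return pvariance([mean(g) for g in groups])

# Invented scores for four students, arranged in two different ways.
mixed = [[90, 50], [85, 55]]      # strong and weak students paired together
streamed = [[90, 85], [55, 50]]   # students grouped by similar level

for name, grouping in (("mixed", mixed), ("streamed", streamed)):
    print(name, intragroup_heterogeneity(grouping), intergroup_imbalance(grouping))
```

A grouping that is heterogeneous within groups and homogeneous between groups scores high on the first measure and low on the second, whereas a purely homogeneous grouping shows the opposite pattern.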
In this review, grouping criteria were divided into static and dynamic types. Static characteristics refer to attributes that remain unchanged over a short period, such as learning styles, previous academic performance, learning roles, gender, age, major, programming skills, leadership levels, knowledge levels, communication skill levels and personality traits. Studies that chose static characteristics as grouping criteria accounted for the highest proportion, reaching 55%. This may be because these characteristics are relatively stable and easy to measure, making them the preferred choice in educational grouping research. Another possible reason is that static characteristics cover a wide range, and most of these traits are commonly used standards for grouping. Dynamic characteristics, on the other hand, refer to attributes that may change during the learning process, including social interactions and emotional states. Studies using only dynamic characteristics were the least common, accounting for only 15% of the studies. Research that considered both static and dynamic characteristics was also a trend, with 25% of the studies incorporating information such as students’ social levels and social networks into the grouping criteria on top of static characteristics. This classification reveals how researchers choose grouping criteria based on the stability and measurability of characteristics in group formation studies. Moreover, studies that combine static and dynamic characteristics indicate that considering the multidimensional traits of students is necessary to understand and address the complexity of learning groups more comprehensively. These diversified grouping criteria not only help form more efficient and cohesive learning groups but also provide more flexible and detailed grouping strategies for educational practices.
To gain a deeper understanding of the research designs in the literature, this review analyzed three aspects: data collection methods, data analysis methods and research methods. In terms of data collection methods, comparative experiments (N = 18), exams (N = 13) and surveys (N = 12) were the primary tools. For data analysis methods, descriptive statistics (N = 13) and T-tests (N = 8) were predominant. Regarding research methods, 19 studies employed empirical quantitative research methods, while one study used a mixed research method that combines quantitative and qualitative approaches. The widespread application of empirical quantitative research reflects the emphasis on data-driven decision making and evidence-based practices in educational research. The adoption of mixed research methods further enriched the depth and breadth of the research, enabling researchers to understand and explain research phenomena along multiple dimensions. In comparison with traditional grouping methods (such as random grouping and self-organizing teams), all studies reached the same conclusion: algorithm-based team formation methods showed more positive results. This finding not only highlights the potential of algorithms to enhance the efficiency and effectiveness of team formation but also provides valuable references for future educational practices and research. Some subjects in higher education, such as science and engineering, education and business, have benefited from these team formation approaches. By adopting more scientific and systematic approaches to team formation, it is possible to improve learning outcomes and the quality of team collaboration, thereby promoting innovation and development in more educational areas.
4.2. Challenges and Limitations of Algorithm-Based Team Formation Research and Future Directions
Although numerous studies consistently indicate that technology-assisted grouping methods outperform traditional manual grouping in terms of efficiency and effectiveness, traditional grouping strategies still dominate in real-world educational settings. The reasons behind this phenomenon are complex and varied. Firstly, most of these innovative grouping algorithms remain in the initial stages of research and development and have seldom been turned into mature, user-friendly products. Secondly, even where research results have been successfully transformed into applications, their user experience may not be intuitive or convenient, which limits accessibility and use in real-world educational contexts. Moreover, the promotion and marketing of these applications may not have achieved the expected results, failing to attract sufficient attention from educators and learners, which in turn limits their widespread adoption. In addition, challenges may arise in data collection, processing and privacy protection because these methods rely on extensive student data, such as personal characteristics, academic performance and preferences. Therefore, despite the great potential demonstrated by technology-assisted grouping methods in theory and experiments, numerous challenges must be overcome before they can replace traditional methods in practice, including enhancing product usability, strengthening user training and implementing effective marketing strategies.
Meeting these challenges will also require fostering interdisciplinary collaboration among the psychology, education, computer science and engineering fields to jointly develop a comprehensive product suitable for most collaborative learning scenarios, supported by continued user training and support.
5. Conclusions
We provided a comprehensive overview of the development and application of automatic group formation techniques to enhance teamwork and collaboration in collaborative learning. Automated team creation, supported by advanced algorithms, has emerged as a significant area of focus within collaborative learning. Manually generating effective groups is a complex task, but several techniques are available to automate this process. We examined the different algorithms used, the types of groups being formed, the criteria considered for grouping and the educational content involved. We also reviewed the experimental designs of these studies, including how data were collected and analyzed and the methods used to evaluate the success of the algorithms. By analyzing these aspects, we aimed to identify gaps in current research, encourage further exploration in this field and provide insights for future research.
Our review shows that the most popular team formation algorithm was the Genetic Algorithm (GA), which was used in 14 studies, accounting for 70% of the works. Enhanced GAs improve performance and accuracy through better genetic operators and fitness functions. In terms of grouping type, we found that researchers tend to explore and experiment with different grouping types to find the most appropriate grouping strategy for a particular learning environment and goal. Studies that chose static characteristics such as knowledge level, learning roles and personality traits as grouping criteria accounted for the highest proportion, reaching 55%. However, there is a growing trend of considering both static and dynamic characteristics, such as social interactions, to create well-rounded teams.
The main methods for gathering data were comparative experiments (N = 18), exams (N = 13) and surveys (N = 12). Descriptive statistics (N = 13) and T-tests (N = 8) were the most common data analysis techniques. According to all studies that conducted this comparison, algorithm-based team formation techniques produced more successful outcomes than traditional grouping approaches (such as random grouping and self-organizing teams). Notably, science and engineering, education and business have benefited from automatic team formation approaches. These findings highlight the potential of efficient and effective team formation approaches and provide valuable references for future educational practices and research.
Although technology-assisted grouping techniques have demonstrated significant promise in theoretical investigation and experimental validation, their application in real-world scenarios faces numerous challenges, such as developing user-friendly and comprehensive products, promotion and marketing of products and privacy protection. For future improvements, fostering interdisciplinary collaboration among psychology, education, computer science and engineering should be encouraged to jointly develop a comprehensive product suitable for most collaborative learning scenarios. Likewise, improving the usability of the product, boosting customer support and training and implementing efficient marketing plans are all essential stages.
A limitation of this review is that it only included analyses of 20 journal articles, excluding conference papers, pre-print articles and books. This restriction means that the analysis covered a narrower range of the research field than if these other sources had been considered. However, our findings suggest that despite the limited application of team formation systems in current collaborative learning settings, their potential and attractiveness still merit further attention and investigation.