Exploring Collaborative Problem Solving Behavioral Transition Patterns in Science of Taiwanese Students at Age 15 According to Mastering Levels

This study analyzed the collaborative problem solving (CPS) behavioral transition patterns of 53,859 Taiwanese students in science at age 15 by using an online Taiwanese CPS assessment that was designed according to the Programme for International Student Assessment (PISA) 2015 CPS framework. Because student behavior changes over the testing period, the CPS target skills that corresponded to the assessment items can be viewed as a CPS behavioral sequence. Hence, a lag sequential analysis (LSA) was applied to explore the significance of the interactions among the CPS skills. The behavioral sequence was coded according to the level of mastery (0, 1, or 2) of items. The CPS transition patterns were analyzed in three gaps, namely the gender gap, the urban–rural gap, and the achievement gap. The findings showed that “Monitoring and repairing the shared understanding” was a crucial CPS skill in science. Moreover, female students who followed the rules of engagement effectively exhibited higher scores than male students did in monitoring the results of their actions and evaluating their success in solving the problem. No obvious differences were observed in the urban–rural gap, whereas differences were observed in the achievement gap.


Introduction
According to a review of relevant research and frameworks, key sustainability competencies include systems thinking, interdisciplinarity, anticipatory competence, values and ethics, normative competence, critical thinking and appraisal, interpersonal competence, intrapersonal competence, communication skills, strategic thinking and planning, personal engagement, evaluation skills, and dealing with uncertainty and resilience [1][2][3]. Moreover, according to Education for Sustainable Development, student-centered methodologies, project or problem-based learning, case study, simulation, and cooperative inquiry are widely used to study sustainability competencies [4]. Collaborative problem solving (CPS) is a crucial competency for students to communicate and contribute to problem solving with team members in school or future workplaces [5][6][7]. Therefore, education systems need to update curriculum and broaden scope so that students can learn CPS competency for life and employment in the 21st century [7]. Nonetheless, developing a large-scale, standardized CPS assessment that includes scenarios, numbers of team members, collaboration, problem solving, and contexts to understand student performance in CPS is challenging [8].
Smart-classroom development can benefit problem-based learning and collaborative inquiry [4]. In general, two types of CPS are offered in computerized assessments in smart classrooms: human to human and human to agent. Regarding human-to-human assessments, for example, CPS units were developed to assess social skills and cognitive skills for

1. What are the general CPS behavioral transition patterns of Taiwanese students in science?
2. What are the differences in CPS behavioral transition patterns between genders?
3. What are the differences in CPS behavioral transition patterns between urbanized sectors?
4. What are the differences in CPS behavioral transition patterns between achievement groups?

Table 1. Matrix of collaborative problem solving skills for PISA 2015 [7].

Taiwanese CPS Online Assessment System
A Taiwanese CPS online assessment system including units in math, science, reading, and social science scenarios was developed for students in Grades 5 to 10 according to the PISA CPS framework shown in Table 1 [9][10][11][12]. The science scenario contained two units, the Water Purification unit (Figure 1) and the Slurpy unit (Figure 2). In the Water Purification unit, a task taker (TT) works with two computerized agents to purify dirty water using hands-on materials. One agent is highly collaborative, whereas the other is noncollaborative. The highly collaborative agent always gives positive feedback on how to do the task. However, the noncollaborative agent sometimes gives negative feedback, such as disagreeing with the TT and other computerized agents and making negative comments about the work. The role of the TT is to lead the team and assess the performance of the water filter designs. In the Slurpy unit (Figure 2), the TT communicates and collaborates with a computerized agent to use ice and refrigerants (salt, sugar, monosodium glutamate, and water) to decrease the highest temperature. The team composition (TT and the computerized agent) is asymmetrical. They separately use different proportions of various refrigerants and ice to decrease and record the corresponding temperature. In the end, they collaborate to find the most effective refrigerant and ratio of refrigerant to ice to decrease the temperature. The Water Purification and Slurpy units contain 13 and 17 items, respectively. According to the answering path of each item, students receive a score of 0, 1, or 2 points.

Furthermore, according to the answering path in the two conversation layers of each item, which targets only one skill shown in Table 1, students receive a score of 0, 1, or 2 points based on the following score guide. A score of "0" indicates that students respond with or provide incorrect information that has little relevance to the task. These students contribute minimally to achieving group goals when interacting with team members, always work alone, and do not help the team solve the problem during the mission. A score of "1" indicates that students respond with or provide correct information or actions and fix their understanding when prompted by the computer agents. In contrast, a score of "2" indicates that students are actively involved in the task and select actions that contribute to the teamwork based on the information provided. These students can communicate with team members, mediate conflicts, and take the initiative to overcome obstacles effectively [7,10,12].
Figure 2. Screenshot of the Slurpy unit [9].
Because the units are designed to be conversational scenarios, the item responses show the proficiency of students in mastering the corresponding CPS skills, and the responses to two adjacent items show the behavior pattern between the corresponding CPS skills. For example, the profile of the Water Purification unit is shown in Table 2. The unit contains four tasks and 13 items in total. The second item in Task 2 involves the CPS skill (C2) "Enacting plans." In this item, the TT must implement the plan as discussed with the two agents. Hence, in the next item, the TT and the two agents must share their understanding of the result after implementation; this item corresponds to (D2) "Monitoring results of actions and evaluating success in solving the problem." This means that the CPS assessment unit mimics a fluent conversation to achieve the common goal of solving the given problem. Therefore, we can analyze specific patterns such as the students' behavioral transition pattern from (C2) to (D2). Specifically, if students have mastered the skill (C2), we can analyze whether they also perform the skill (D2) well. Overall, the Taiwanese CPS online assessment system illustrates that the human-to-agent approach is feasible for measuring the 12 CPS skills of students based on the framework shown in Table 1. Moreover, the internal consistency analysis and multidimensional item response theory model of a large-scale assessment have shown that the CPS scales are reliable and valid, respectively. The results obtained via the CPS online assessment are also consistent with PISA 2012 [12,13].

LSA
To address the four research questions, lag sequential analysis (LSA), a common method of identifying behavioral transition patterns, was employed in this study. LSA can be used to determine whether a given code (e.g., an activity or behavior "E") is followed by another code (e.g., an activity or behavior "F") more often than expected by chance. If the observed frequency of EF in a sequence is significantly higher than the expected frequency (i.e., the corresponding p value is less than or equal to the significance level), then the behavioral transition pattern indicates that E tends to be followed by F [14][15][16][17][18][19][20]. LSA has been applied in various research areas such as clinical interactions, education, social behavior in animals, communication processes, and children's play [20]. In education research, Cheng and Hou (2015) applied sequential analysis to explore students' behavioral transition patterns from affective, cognitive, and metacognitive perspectives during online peer assessment [15]. LSA has also been used to analyze students' discussions and interactive behaviors in project-based learning [15][16][17][18][19][20]. As displayed in Table 2, each item can receive a score of 0, 1, or 2 points to show the student's level of mastery. Hence, using a suitable coding scheme combining the mastery levels and CPS skills, we examined the behavioral transition patterns of students with varying CPS skill mastery in this study.
In addition, three different gaps were analyzed, namely the gender gap, urban-rural gap, and achievement gap. The gender gap was analyzed by comparing the performance of male and female students. To analyze the urban-rural gap, the students were categorized by school location into three urbanized sectors: commercial areas, emerging and traditional industrial districts, and less developed and remote areas. Then, the participants' behavioral patterns were compared in each area. To measure the achievement gap, the students were divided into a high-score group and a low-score group, and the corresponding CPS behavior patterns were compared. The study also compared the behavioral patterns of male and female participants in each group based on the proposed coding scheme with respect to the response sequence of items.

Methods
The behavioral transition patterns of students with different levels of mastery in CPS skills and the three gaps are discussed in this section; the proposed coding scheme is also introduced. Additionally, this section contains definitions and descriptions of the three urbanized sectors and the participant populations.

Coding Scheme
Based on the scenario designs of the Water Purification and Slurpy units, the student CPS skill sequences were A2, A1, B1, C1, C3, D1, C2, D2, D1, C2, D3, D3, D3, and A1, C1, C2, B2, C1, A3, B3, B1, C2, C3, D2, D1, C1, B1, C2, D2, D3, respectively. However, the students' level of mastery of these items was different. Hence, the study proposed a coding scheme combining CPS skills with students' level of mastery. For instance, if students earn 0 points on the first item of the Water Purification unit, then the mastery level of these students is below average. Hence, the coding of these students for (A2) "Discovering the type of collaborative interaction to solve the problem, along with goals" is A20. If students earn 1 point, then the mastery level of these students is average. Hence, the corresponding coding is A21. Finally, if students earn 2 points, then the mastery level is proficient, and the corresponding coding is A22. Table 3 lists the proposed coding scheme for each of the 12 CPS skills.
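The coding scheme described above can be illustrated with a short Python sketch. The two skill sequences are copied from the text; the function name `encode_responses` and the example scores are hypothetical and serve only to show how a skill label and a mastery level combine into a code such as A21.

```python
# Target CPS skill of each item, in presentation order (taken from the text).
WATER_PURIFICATION = ["A2", "A1", "B1", "C1", "C3", "D1", "C2",
                      "D2", "D1", "C2", "D3", "D3", "D3"]
SLURPY = ["A1", "C1", "C2", "B2", "C1", "A3", "B3", "B1", "C2",
          "C3", "D2", "D1", "C1", "B1", "C2", "D2", "D3"]


def encode_responses(skills, scores):
    """Combine each item's skill with its 0/1/2 score, e.g. A2 and 1 -> 'A21'."""
    if len(skills) != len(scores):
        raise ValueError("one score per item is required")
    if any(score not in (0, 1, 2) for score in scores):
        raise ValueError("scores must be 0, 1, or 2")
    return [f"{skill}{score}" for skill, score in zip(skills, scores)]


# Made-up scores for one hypothetical student on the Water Purification unit.
example_scores = [2, 1, 0, 2, 2, 1, 2, 2, 1, 0, 1, 2, 2]
example_codes = encode_responses(WATER_PURIFICATION, example_scores)
```

Each student's responses to a unit thus become a fixed-length sequence of codes, and it is this sequence that the transition analysis operates on.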
In this study, only 23 behavioral transition patterns (see Table 4) in the CPS skills at the three levels of mastery could be discussed because the analysis was limited by the online CPS assessment design. Because the student CPS skill sequences are part of a fixed conversational design in each science scenario, only some behavioral transition patterns could be analyzed using LSA. Table 4 shows which transition patterns could be analyzed by considering both the Water Purification and Slurpy units. The rows indicate the starting behaviors, and the columns contain the subsequent behaviors. Checked cells indicate that the behavioral transition patterns could be discussed in this study. For example, in Sequence 1 of the Water Purification unit, the first two items are associated with the CPS skills A2 and A1. Hence, the behavioral transition pattern from A2 to A1 can be discussed, so the (2,1) cell is checked in Table 4. Specifically, according to the proposed coding scheme, the transition patterns A20 (below average in A2), A21 (average in A2), and A22 (proficient in A2) followed by A10 (below average in A1), A11 (average in A1), and A12 (proficient in A1) can be discussed using LSA. Therefore, the (2,1) cell represents the following nine transition patterns potentially exhibited by students in the science scenarios: A20→A10, A21→A10, A22→A10, A20→A11, A21→A11, A22→A11, A20→A12, A21→A12, and A22→A12.
Table 4. Behavioral transition patterns (indicated from the row to the column) that can be discussed in this study (checks).
Based on the proposed coding scheme, LSA was applied to identify the CPS behavioral transition patterns. First, the frequency of each lag-1 pattern (i.e., each behavioral transition pattern) was counted in the sequential data. Then, the normalized differences between the observed and expected frequencies were computed under the independence assumption. A CPS behavioral transition pattern was considered significant if the corresponding p value was below 0.001.
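The following Python sketch illustrates the two steps described above: listing the skill pairs that occur between adjacent items, and testing a lag-1 transition against the independence assumption. The text does not state the exact normalization, so the sketch assumes the adjusted residual commonly used in LSA and a one-sided normal p value; both are assumptions, as are the function names and the toy data. How transitions that occur at more than one item position are pooled is also left to the caller.

```python
import math
from collections import Counter

# Item-by-item target skills of the two science units, as listed in the text.
WATER_PURIFICATION = ["A2", "A1", "B1", "C1", "C3", "D1", "C2",
                      "D2", "D1", "C2", "D3", "D3", "D3"]
SLURPY = ["A1", "C1", "C2", "B2", "C1", "A3", "B3", "B1", "C2",
          "C3", "D2", "D1", "C1", "B1", "C2", "D2", "D3"]


def adjacent_skill_pairs(*skill_sequences):
    """Distinct (start, next) skill pairs over adjacent items in each unit."""
    pairs = set()
    for seq in skill_sequences:
        pairs.update(zip(seq, seq[1:]))
    return pairs


def lag1_adjusted_residuals(code_pairs):
    """Test each (start_code, next_code) cell against independence.

    code_pairs holds one (start_code, next_code) tuple per observed transition.
    Returns {(start, next): (z, p)}, where z is the adjusted residual (an
    assumed normalization) and p is its one-sided upper-tail normal p value.
    """
    n = len(code_pairs)
    observed = Counter(code_pairs)
    row_totals = Counter(start for start, _ in code_pairs)
    col_totals = Counter(nxt for _, nxt in code_pairs)
    results = {}
    for start, row_n in row_totals.items():
        for nxt, col_n in col_totals.items():
            expected = row_n * col_n / n
            variance = expected * (1 - row_n / n) * (1 - col_n / n)
            if variance == 0:
                continue  # degenerate table: the residual is undefined
            z = (observed[(start, nxt)] - expected) / math.sqrt(variance)
            p_upper = 0.5 * math.erfc(z / math.sqrt(2))
            results[(start, nxt)] = (z, p_upper)
    return results


analyzable = adjacent_skill_pairs(WATER_PURIFICATION, SLURPY)

# Made-up counts for 100 hypothetical students on one C2 -> C3 item pair.
toy_pairs = ([("C22", "C32")] * 40 + [("C22", "C31")] * 10
             + [("C21", "C32")] * 10 + [("C21", "C31")] * 40)
toy_result = lag1_adjusted_residuals(toy_pairs)
significant = sorted(k for k, (z, p) in toy_result.items() if p < 0.001)
```

With the two sequences from the text, `analyzable` contains 23 distinct skill pairs, which agrees with the number of transition patterns reported in this study. In the toy data, only the transitions C22→C32 and C21→C31 fall below the 0.001 threshold.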

Participants and Procedures
The Taiwanese CPS online assessment was administered to 53,859 students in Grades 9 and 10 (approximately 15 years old), of whom 27,656 were male and 26,203 were female. All participants completed both the Water Purification and Slurpy units within the science scenarios. Students who participated in the assessment had been taught the basic concepts of CPS in a 10-minute lesson by their teachers or shown a video recorded by the team of the Teachers' Collaborative Problem Solving Teaching Competency Project in Taiwan [9]. Furthermore, the students completed the Exercise Plan unit to understand how to use the CPS online assessment's interface.

Three Types of Townships in Taiwan
Hou et al. [21] divided 358 boroughs and townships of Taiwan into six groups according to both the 2000 census and 2004 population statistics for stratification. Among these six groups, statistical tests revealed five significant sociodemographic variables, namely age, education level, industrial structure, occupation, and personal income, which are highly related to levels of development among boroughs and townships. For the present study, the six groups were regrouped into three sectors (commercial areas, emerging and traditional industrial districts, and less developed and remote areas) according to the core population density, the percentage of the population with more than a junior-college-level education, the percentage of the population aged 15-64, and the percentage of the service population. The participants were classified into these three sectors by their school zip codes. The study included 23,489 participants in commercial industrial areas, 23,196 participants in the emerging and traditional industrial districts, and only 7,174 participants in the less developed and remote areas. In this study, the behavioral transition patterns of students in each of the three sectors were analyzed to assess the urban-rural gap.

High-Score and Low-Score Groups
Based on their total scores in the Water Purification and Slurpy units, participants were divided into two groups: a high-score group and a low-score group. Students whose total scores were equal to or greater than that of the student in the 33rd percentile were assigned to the high-score group. By contrast, students whose scores were equal to or below that of the student in the 66th percentile were assigned to the low-score group. The high-score and low-score groups contained 20,820 and 16,683 students, respectively. Table 5 shows the total numbers of participants and their proportions in the high-score and low-score groups.
Table 5. The number of participants in the high-score and the low-score groups.
Group | Number of participants | Proportion
High-score group | 20,820 | 39%
Low-score group | 16,683 | 31%

Results
This section describes the comparison results of CPS behavioral transition patterns in the science scenarios and those in the gender gap, urban-rural gap, and achievement gap.

Overall CPS Behavioral Transition Patterns in the Science Scenarios
Figure 3 shows the behavioral transition patterns from (C2) "Enacting plans" followed by (C3) "Following rules of engagement" with the three mastery levels (below average, average, and proficient). The straight, black arrows indicate that the observed frequency was significantly larger than the expected frequency (i.e., p < 0.001) in this study. For example, C22 and C32 indicate that students were proficient in enacting plans and following the rules of engagement, respectively. The sequence of C22 followed by C32 was statistically significant (p < 0.001); hence, C22→C32. In other words, if students can enact plans effectively in science, then they can also effectively follow the rules of engagement in science. However, if students were below average or average in enacting plans (C2) in science, no significant evidence suggested how well they would perform in following the rules of engagement (C3) in science.


Accounting for the CPS skills (D1) "Monitoring and repairing the shared understanding" and (C1) "Communicating with team members about the actions to be/being performed," Figure 4 illustrates that if students can effectively monitor and repair shared understanding in science, then they can also communicate effectively with team members about the actions to be/being performed in science.
As illustrated in Figure 5, the behavioral transition pattern from C1 to C2 suggested that students who were proficient in (C1) "Communicating with team members about the actions to be/being performed" were likely to be either average or proficient in (C2) "Enacting plans." In addition, if students could not communicate effectively with team members about actions (C1), then they were unlikely to be proficient in (C2) "Enacting plans."
According to Figures 4 and 5, students who could proficiently monitor and repair shared understanding (D1) were likely to be average or proficient in enacting plans (C2) because D12→C12 (Figure 4), C12→C21 (Figure 5), and C12→C22 (Figure 5). Furthermore, according to Figure 3, some of these students were proficient in following the rules of engagement (C3) because C22→C32. Thus, (D1) "Monitoring and repairing the shared understanding" is a crucial CPS skill in science.

Comparison of Male and Female Groups
The comparison of male and female students revealed a difference in only one behavioral transition pattern (from C3 to D2) out of the 23 patterns studied. As shown in Figure 6, no difference was observed in the transition patterns of male and female students who had below average or average mastery of (C3) "Following rules of engagement." However, female students who followed the rules of engagement (C3) were likely to perform better than male students in (D2) "Monitoring results of actions and evaluating success in solving the problem." If female students could not follow the rules of engagement, then they remained likely to monitor the results of their actions and evaluate success in solving the problem either well or very well.
Figure 6. Behavioral transition patterns from C3 to D2 in (a) male students and (b) female students.

Comparison of Three Urbanized Sectors
According to the behavioral transition pattern from (C1) "Communicating with team members about the actions to be/being performed" to (C2) "Enacting plans" exhibited by all participants (Figure 7a), no statistically significant pattern from C10 to C22 was observed. That is, overall, students who could not communicate with team members about the actions to be/being performed could not enact plans very well in science. In the comparison of this transition pattern among the three urbanized sectors, students whose schools were located in emerging and traditional industrial districts or less developed and remote areas and who exhibited low performance in communicating with team members about the actions to be/being performed were unlikely to enact plans in science (Figure 7c,d). However, students whose schools were located in commercial industrial areas could enact plans proficiently in science (Figure 7b).



Comparison of High-Score and Low-Score Groups
Significant differences were revealed between high-score and low-score groups (the achievement gap) because the coding scheme combines CPS skills and mastery levels. Figure 8 illustrates the behavioral transition patterns from A20, A21, and A22 to A10, A11, and A12. According to Figure 8a,b, if students belonged to the high-score group and could discover the type of collaborative interaction to solve the problem, along with goals (A2), then they were likely to be either average or proficient in (A1) "Discovering perspectives and abilities of team members." Similar to the behavioral transitional patterns from A2 to A1 (Figure 8), Figure 9 demonstrates that if students in the high-score group were average or proficient in (B2) "Identifying and describing tasks to be completed," then they were likely also average or proficient in (C1) "Communicating with team members about the actions to be/being performed." Moreover, as seen in Figure 10, students in the high-score group who were Similar to the behavioral transitional patterns from A2 to A1 (Figure 8), Figure 9 demonstrates that if students in the high-score group were average or proficient in (B2) "Identifying and describing tasks to be completed," then they were likely also average or proficient in (C1) "Communicating with team members about the actions to be/being performed." Moreover, as seen in Figure 10, students in the high-score group who were average or proficient in (B3) "Describe roles and team organization" were likely able to build a shared representation and negotiate the meaning of the problem (B1). Additionally, even if students in the high-score group were below average in B3, they still could achieve an average performance in B1.
Figure 8. Behavioral transition patterns from A2 to A1 in the (a) high-score and (b) low-score groups.
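The combined codes used above (for example, A22 for skill A2 at mastery level 2) can be illustrated with a minimal Python sketch. The function names `encode` and `lag1_transitions` and the two student records are hypothetical and serve only to show how a skill label and a mastery level form one code and how adjacent codes are counted as transitions; they are not the study's data or implementation.

```python
from collections import Counter


def encode(skill: str, mastery: int) -> str:
    """Combine a CPS skill label and a mastery level (0, 1, or 2) into one code."""
    if mastery not in (0, 1, 2):
        raise ValueError("mastery level must be 0, 1, or 2")
    return f"{skill}{mastery}"


def lag1_transitions(sequence):
    """Count adjacent (given, target) code pairs in one coded sequence."""
    return Counter(zip(sequence, sequence[1:]))


# Hypothetical responses for two students: (skill, mastery) in item order.
students = [
    [("A2", 2), ("A1", 2), ("B2", 1), ("C1", 1)],
    [("A2", 2), ("A1", 1), ("B2", 0), ("C1", 0)],
]

# Pool the lag-1 transition counts over all students.
counts = Counter()
for responses in students:
    coded = [encode(skill, level) for skill, level in responses]
    counts.update(lag1_transitions(coded))

# Transitions that start from proficiency in A2 (code A22).
from_a22 = {pair: n for pair, n in counts.items() if pair[0] == "A22"}
```

In this toy example, `from_a22` holds one A22→A12 and one A22→A11 transition, which corresponds to the kind of A2-to-A1 pattern summarized in Figure 8.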
Similar to the behavioral transition patterns from A2 to A1 (Figure 8), Figure 9 demonstrates that if students in the high-score group were average or proficient in (B2) "Identifying and describing tasks to be completed," then they were likely also average or proficient in (C1) "Communicating with team members about the actions to be/being performed." Moreover, as seen in Figure 10, students in the high-score group who were average or proficient in (B3) "Describing roles and team organization" were likely able to build a shared representation and negotiate the meaning of the problem (B1). Additionally, even if students in the high-score group were below average in B3, they could still achieve an average performance in B1.
Figure 10. Behavioral transition patterns from B3 to B1 in the (a) high-score and (b) low-score groups.
Regarding the behavioral transition patterns from C2 to C3 (Figure 11), most students in the high-score group could follow rules of engagement (C3) proficiently. However, most students in the low-score group could not enact plans (C2). In addition, if students in the high-score group were average or proficient in (C2) "Enacting plans," then they could follow the rules of engagement (C3) proficiently. Based on the significant transition pattern from (C3) "Following rules of engagement" to (D2) "Monitoring the results of actions and evaluating success in solving the problem," most students in the high-score group (Figure 12a) were proficient in C3 and average or proficient in D2.
Figure 11. Behavioral transition patterns from C2 to C3 in the (a) high-score and (b) low-score groups.
Figure 12. Behavioral transition patterns from C3 to D2 in the (a) high-score and (b) low-score groups.

Discussion
Since 2013, two similar PISA CPS units that require students to make multiple-choice selections and collaborate with team agents to solve problems have been created to assess Taiwanese students' CPS skills in science [10,12,22]. Herborn et al. [23] compared human-to-agent and human-to-human tests in the same scenario from the original PISA 2015 CPS assessment and found no significant differences between types of collaboration partners. Moreover, according to the PISA report on CPS, CPS performance correlates most strongly with science (0.77); this correlation is greater than that with either mathematics or reading [22]. Hence, the CPS behavioral transition patterns exhibited in Taiwanese students' science assessments were selected for discussion in this study.
Graesser et al. [24] mentioned that if students are proficient in "enacting plans," then they can likely be observed to adaptively respond to and make progress on group goals. From the overall analysis of CPS behavioral transition patterns, we found that if students exhibit high performance in enacting plans, then they also exhibit high performance in following rules of engagement. Additionally, students who could monitor and repair shared understanding were likely to have more conversations and communicate with other team members. By contrast, if a team failed to solve a given problem, then the team members often exhibited less reflective discourse and an inability to transform their discussion into an executable plan to solve the problem [25,26].
PISA results [22] suggested that, overall, girls score significantly higher than boys do in CPS. In addition, past comparisons of girls and boys indicated that girls like to communicate and collaborate with others, whereas boys tend to work independently [27]. However, this phenomenon has evolved over time, and the gender gap has decreased in science, although girls still demonstrate better reception and interpretation than boys do [10,28–35]. The C3→D2 transition results reveal slight differences between girls and boys, suggesting that if girls can follow the rules of engagement, then they can also monitor the results of actions and evaluate success. However, some boys who could follow the rules of engagement still could not monitor results or evaluate success.
The behavioral transition patterns among the three city sectors suggest that if students whose schools are located in emerging and traditional industrial districts or in less developed and remote areas cannot communicate with team members about the actions to be/being performed, then they also cannot enact plans in science. However, the school sector was not the most important factor influencing CPS skills. Instead, a previous study found that being more physically active or attending more physical education classes per week had a greater influence on CPS [22].

Conclusions and Future Work
In this study, a coding scheme that combines student CPS skills and mastery levels in a human-agent online CPS assessment was proposed to understand the behavioral transition patterns of CPS skills in science by applying LSA. The study provides the following major findings:
1. The overall behavioral transition patterns exhibited by 15-year-old Taiwanese students suggested that those who effectively monitor and repair shared understanding (D1) can also effectively communicate with team members about their actions (C1). Students who effectively communicate with team members about their actions (C1) are also likely to be average or proficient in enacting plans (C2). Students who enact plans (C2) effectively can also follow the rules of engagement (C3) proficiently. Therefore, (D1) "Monitoring and repairing the shared understanding" is a crucial CPS skill in science. This finding suggests that reminding students to continually monitor and repair shared understanding during teamwork is helpful in science class, especially in courses that involve collaborative science experiments.

2. Regarding the behavioral transition patterns of students compared by gender, female students who could effectively follow the rules of engagement (C3) were likely to score higher than male students did in the CPS skill (D2) "Monitoring the results of actions and evaluating success in solving the problem." This observation suggests that, in science classes, teachers should focus on the transition pattern from C3 to D2 in male students who are proficient in C3.

3. Regarding the urban-rural gap, no obvious differences were observed in the behavioral transition patterns of the three city sectors, except for that from C1 to C2. Students attending schools in the city and commercial industrial areas performed slightly better than did those attending schools in the emerging and traditional industrial districts and the less developed and remote areas. Students in all three urbanization areas who could effectively communicate with team members about their actions (C1) could also enact plans (C2) effectively. Most students in all three urbanization areas who could not communicate with team members about their actions (C1) also could not enact plans (C2), except for some students in the city and commercial industrial areas.

4. More differences in behavioral transition patterns were observed during analysis of the achievement gap because of the coding scheme, which combines CPS skills and mastery levels. Students in the high-score group tended to be average or proficient in (A1), (C1), (B1), and (C3) following (A2), (B2), (B3), and (C2), respectively. In addition, if students in the high-score group were proficient in C2, then they were likely to be average or proficient in D2 because of the C22→C32, C32→D21, and C32→D22 transitions. Moreover, few students in the low-score group exhibited the behavioral transition patterns C22→C30, C22→C31, and C22→C32. Hence, teachers may design class activities that encourage students to prompt other team members to perform their tasks after enacting plans.
This study was conducted in Taiwan; hence, the results cannot be directly generalized to other countries because of differences in curriculum guidelines and commonly used learning models. However, the model, which included a coding scheme and a method of applying LSA, may be used in other countries to identify students' behavioral transition patterns. Moreover, the CPS assessment platform has been integrated into a large adaptive learning platform in Taiwan that also includes the corresponding CPS learning materials. Therefore, in the future, teachers may analyze their students' behavioral transition patterns in class by using the assessment units and apply the results to select appropriate activities and teaching materials.
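As a rough illustration of how the coding scheme and LSA could be applied to other data sets, the following Python sketch computes adjusted residuals (z-scores) for lag-1 transitions. It assumes the commonly used adjusted-residual formula, z = (o − e) / sqrt(e(1 − p_given)(1 − p_target)), and the conventional 1.96 threshold; neither is confirmed by this section as the exact computation used in the study. The function name `adjusted_residuals` and the coded sequences are hypothetical.

```python
import math
from collections import Counter


def adjusted_residuals(sequences):
    """Return {(given, target): z} for lag-1 transitions pooled over sequences."""
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    total = sum(pairs.values())

    # Marginal totals: how often each code appears as the given or target code.
    given_totals, target_totals = Counter(), Counter()
    for (given, target), n in pairs.items():
        given_totals[given] += n
        target_totals[target] += n

    z = {}
    for given, g_n in given_totals.items():
        for target, t_n in target_totals.items():
            expected = g_n * t_n / total
            variance = expected * (1 - g_n / total) * (1 - t_n / total)
            if variance > 0:  # skip degenerate cells
                z[(given, target)] = (pairs[(given, target)] - expected) / math.sqrt(variance)
    return z


# Hypothetical coded sequences (skill code + mastery level), not study data.
sequences = [["C22", "C32", "D22"]] * 3 + [["C20", "C30", "D20"]] * 3

z_scores = adjusted_residuals(sequences)
# Assumed convention: z > 1.96 marks a transition occurring more often than chance.
significant = {pair for pair, z in z_scores.items() if z > 1.96}
```

With real data, the `significant` set would list the transitions that a figure of behavioral transition patterns could display; the pooled counting here treats all students in a group as one sample, which is a simplification.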
To further enrich the literature related to online CPS assessment, other subjects such as math, reading, and social science can be included to explore overall and individual behavioral transition patterns. The findings also suggest that teachers should design additional activities in class to address their students' weaker transition patterns. In addition, when PISA releases the students' secondary data from the CPS assessment, the proposed coding scheme may be applied to analyze students' behavioral transition patterns within and between regions. Moreover, the cultural differences of each region may influence the behavioral transition patterns when students work together to solve a problem. Therefore, a culture gap between regions or between Asia and the West can be analyzed in the same manner according to PISA secondary data.

Data Availability Statement: Data are not publicly available but may be made available on request from the corresponding author.