Implementing Alternative Assessment Strategies in Chemistry Amidst COVID-19: Tensions and Reflections

The COVID-19 pandemic in the first quarter of 2020 disrupted teaching and learning worldwide, in mainstream schools and in institutes of higher learning. Singapore was not spared. With the closure of schools in early April, it was clear that the delivery and assessment of our freshman general chemistry course had to be overhauled for the new semester. While the delivery of Home-based Learning (HBL) was a challenge for all educators, it was a mammoth roadblock for chemistry courses because of laboratory classes. Besides being thrust into learning and using new technology tools for online lessons, instructors also had to quickly explore and design alternative assessments to substitute for in-person written examinations and tests. This paper documents the struggles that played out in the decision to implement concept map assessments and “split-half” laboratory classes for safe distancing. Although these interventions are not novel, we confronted tensions as we sought to address academic integrity, administrative guidelines, and our own inadequacy, particularly in concept map assessments. In light of positive and negative feedback from both staff and students, we draw lessons to enhance future implementation and to guide further research.


Introduction
Never in our boldest ambitions did the authors dare to implement a 100% online course in general chemistry. While we had previously experimented with a blended approach of offering face-to-face lessons with pre-recorded lectures to meet academic quality and requirements, online learning was always viewed with skepticism: How do we motivate and keep an eye on lagging online learners, and how could we run laboratory classes if chemistry is fully taught online? A "normal" semester would typically see 20 to 30% of physical hours substituted by online learning, with in-person assessments and examinations retained. COVID-19 changed a significant part of our thinking and teaching practices.
This paper presents our experiences designing alternative assessment for learning using concept map assignments for a general physical and inorganic chemistry course. This course was taught in the semester beginning April 2020, and so was caught right at the peak of the COVID-19 pandemic. It was read by 504 students (mostly freshmen, with a small minority of repeat students) enrolled in the diploma courses of chemical engineering, pharmaceutical, food, and biological sciences.
Spread over 14 weeks, this 60 h subject addresses fundamental concepts in atomic and electronic structure, chemical bonding, stoichiometry, basics of chemical kinetics, and ionic equilibria. There were 18 h of laboratory instruction (six face-to-face sessions) focusing on preparation of stock and standard solutions, titration stoichiometry, choice of indicators, and an introduction to the autotitrator. At the point of writing this paper, the semester was coming to an end. With the winding down, it was timely to look back at what was implemented. As we moved towards online delivery of classes and alternative forms of assessment to replace high-stakes examinations and tests, we encountered several sources of tension. We believe that COVID-19 provided a catalyst to rethink assessment strategies in higher education. In this paper, we attempt to address the following questions:
• What were some of the major factors (both high-level administrative guidelines and on-the-ground considerations) we balanced for assessment designs?
• How did the need for safe distancing impact laboratory scheduling and learning?
• What were staff and students' qualitative perceptions of concept maps as an alternative assessment strategy?
• What could we do better in future implementation, and what are some unanswered questions which could be grounds for continued research?

Pivoting to Home-Based Learning (HBL)
As the COVID-19 situation evolved in the first quarter of 2020 in Singapore, the academic management team initially held out hope that it would be "business as usual" for the April semester. It was all "antenna up" as the team monitored the situation and discussed possible contingencies across the whole institute. We were initially expecting a partial closure of campus, causing a delay of at most a few weeks. However, when the authorities announced Phase 1 of the "circuit breaker" (CB) from 7 April to 4 May 2020, it became clear that our academic calendar would be severely impacted. The education ministry announced home-based learning (HBL) as the default schooling mode for all schools and tertiary institutes. With the extension of Phase 1 to 2 June, all hopes of a normal semester vanished completely.
Institutional technological resources were adequate in helping staff to pivot to HBL rapidly. About one month before the start of the semester, professional development and preparedness were well underway. Our main platforms for content creation and delivery were Panopto, Microsoft Teams, and the Blackboard Learning Management System (LMS). The chemistry team had also previously produced lecture recordings for use in earlier semesters, which we could quickly deploy for HBL. When the administration permitted lectures to be delivered either as recordings or live in Microsoft Teams, the instructors wasted no time in completing all the lecture recordings. Based on recommended institutional guidelines that were in place before the pandemic, we incorporated the best practices and best-effort design as quickly as possible to prepare for this unprecedented semester. For example, the contents were chunked into shorter sub-topics to accommodate the shorter attention spans of digital natives [1]. To scaffold the learning process and provide interactivity, formative self-check quizzes with auto-feedback were provided [2][3][4]. The second author (the course coordinator) recorded a video clip to familiarize freshmen with the teaching schedule and overall architecture of the LMS site. Tutorials were mandated to be conducted in real time on Microsoft Teams to maximize student engagement and attendance. Staff were thus confronted with a steep learning curve in conducting tutorials online and in real time. In the first two weeks of the semester, tutorials had not yet begun, so the instructors took the opportunity to contact their tutorial groups through an informal check-in call. This allowed both instructors and students to "touch base" informally online.
Besides learning how to conduct synchronous lessons on Microsoft Teams within a short period of time, the teaching team had to quickly decide how to manage and re-design assessment plans due to the disruption of on-site assessment and laboratory classes.

Tension between Safe Distancing and Scheduling: The "Split-Half" Laboratory Classes
The disruption of laboratory classes presented a severe existential crisis for science educators. While virtual or simulation laboratory instruction could provide a viable substitute with comparable student outcomes [5][6][7][8][9] ([10] p. 53), we had neither the time nor the expertise to rapidly design these bespoke resources. Other formats were mooted. These included using (easier) asynchronous demonstration videos, accompanied by assessment of planning or data analysis skills, a strategy widely adopted during campus closure in the pandemic [11]. These were eventually abandoned, as the administration was cognizant of the fact that on-site skills training remains a core mission of the institute. With the deterioration of the pandemic situation, it was clear that we had to make a decision fast. We then toyed with the idea of "kitchen science" ([10] p. 57) or home-based laboratory activities, where students could produce digital contents to demonstrate learning [3,12] ([10] p. 93). However, practically speaking, this approach was appropriate either for experienced students [13] who were familiar with the theory and safety guidelines, or for non-chemistry majors ([10] p. 57), [14]. The freshmen in our chemistry course were neither, forcing us to again abandon this approach. The idea of live-streaming our laboratory classes [15] did not come to mind as the campus was closed, cutting off access to chemicals and glassware. In addition, a large proportion of our students did not have prior titration experience from their high school education. For example, about 50% of the students in the main author's chemical engineering classes had not done titration before, so substitution with other distance learning formats would not be optimal. Looking at our options and our teaching schedule, we decided to delay the in-person laboratory classes until the end of Phase 1, while awaiting further instructions from the education ministry and the administration. Refer to Appendix A, Table A1.
A reprieve came when the education ministry announced, towards the end of Phase 1, that laboratory or studio classes could resume on campus due to the need to access equipment and materials. However, safe distancing had to be observed during lessons. Critical planning considerations were (1) the lead time needed to prepare chemicals and distribute PPE, given that all supply deliveries had virtually ceased during the CB period, (2) safe distancing of at least 1 m (about 3 feet) between persons in the laboratory venue, and (3) how to prepare students to return to campus after almost eight weeks of HBL. Towards the end of Phase 1, the administration promptly granted laboratory technicians permission to return to campus after 2 June for preparation. Staff members in the orientation committee contacted PPE suppliers and coordinated the distribution to the laboratory instructors. In a normal semester, each laboratory class comprised around 25 students and one instructor in the laboratory, occurring once every two weeks. Knowing that we could not meet safe distancing if we kept to the old schedules, the administration and course chairs decided to split laboratory classes into two groups (the "split-half" lab). For chemistry, it meant that classes now took place every week. Each instructor met one half of the class per week, on the same day in consecutive weeks, for the same laboratory task. This no doubt placed constraints on timetabling, venues, and manpower.
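The split-half arrangement described above can be illustrated with a minimal sketch. The function names, the student labels, and the rule of alternating students between the two halves are illustrative assumptions; the text only states that each class of around 25 students was split into two groups that attended the same task in consecutive weeks, with the first session held in week 8.

```python
def split_half(students):
    """Split one laboratory class into two groups of near-equal size."""
    # Assumption: simple alternation; the actual grouping rule is not stated.
    return students[0::2], students[1::2]


def weekly_schedule(students, first_week, n_tasks):
    """Map each week to (task number, attending group) for one class.

    Each task occupies two consecutive weeks: group A first, then group B.
    """
    group_a, group_b = split_half(students)
    schedule = {}
    for task in range(n_tasks):
        schedule[first_week + 2 * task] = (task + 1, group_a)
        schedule[first_week + 2 * task + 1] = (task + 1, group_b)
    return schedule


roster = [f"S{i:02d}" for i in range(1, 26)]  # hypothetical class of 25
plan = weekly_schedule(roster, first_week=8, n_tasks=2)
for week, (task, group) in sorted(plan.items()):
    print(f"week {week}: task {task}, {len(group)} students in the laboratory")
```

Under these assumptions, no more than 13 students share the laboratory at a time, at the cost of doubling the number of weekly sessions each instructor runs.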
Two weeks before classes began, all students were required to complete a 20-item basic safety quiz (normally completed in person to ensure adherence), and watch a video on lab safety on the LMS. To emphasize the importance of safety, tutors constantly monitored the submission and marks in the LMS and reminded all students to complete and pass the quiz. With much stress, anxiety, and anticipation, our first laboratory session began in the 8th week, delayed by one month. Laboratory classes continued into the mid-term break to make up for lost time. In the meantime, lectures and tutorials were progressing as scheduled.
Our LMS repository of pre-lab packages, built up over the semesters, proved its value more than ever. For every task, the LMS package consists of a pre-lab assignment with short, open-ended questions related to the theoretical principles of the task, and video resources curated by the team. The latter were self-made, complemented with YouTube clips. For example, we had our own video to demonstrate the use of the analytical balance, FLASH-designed packages, and PowerPoint recordings with narration on titration tasks and the color changes of common indicators used in the practical work. The video clips ranged from 1 to 10 min in duration, addressing the theory, experimental techniques, and data recording. We were also able to substitute an in-person class with another online package, prepared for blended learning in earlier semesters. It comprised a 4.5 min video to review key titration skills and a formative LMS assignment with five open-ended questions. This resource was flexibly deployed during the revision week in week 14, when all on-campus and online lessons stopped temporarily. The questions were intended to probe deeper-level understanding. For example, students were asked why it is not advisable to add too much indicator during the titration, and why the titrant concentration should be moderate (not too high or low) to ensure a practical, readable titer value.

Our First Bold Experimentation: Concept Maps as Assessment for Learning
With the CB underway, the administration announced the removal of end-semester examinations and mid-terms, forcing all courses to adopt continuous assessment (CA). Before the semester rolled on and even as it commenced, there were intense discussions between the academic heads and management, deliberating the necessary adjustments to assessment plans while ensuring academic rigor and integrity. The instructions and broad guidelines were then relayed to the instructors on the ground, allowing for customization to meet specific subject needs. Needless to say, our assessment plans changed constantly during that challenging period as we continuously aligned curriculum delivery to the national pandemic posture.
In a normal semester, the mid-term and end-of-semester examinations accounted for more than 50% of the subject grade. It was fortunate that we could still grade the existing post-lab tests and datasheets in our laboratory classes, but we still had to design assessments to substitute for examinations and tests. Several tensions were experienced. The most obvious choice was to pivot to the online LMS quiz system, as these quizzes can be easily graded. However, administrative guidelines capped the weightages of online assessments, on valid grounds of upholding academic integrity: online test environments were viewed as more vulnerable to academic misconduct [16][17][18][19][20]. Thus, the subject team decided that it was not worth the effort to design online tests, as these could not fully replace the lost assessments. Students' assessment workload was also subject to an administrative cap, limiting our options to break up the weightages into smaller tasks. Weighing the pros and cons of the different options and the guidelines to comply with, the teaching team decided on concept map assignments.
Concept maps are known to promote deep and meaningful learning, a precept born out of cognitive and developmental psychology, seen in the seminal work by Novak [21,22]. A concept map is a visual representation of basic conceptual units called nodes, connected by lines to show the relationships between them. Nodes joined by labeled relational connections are called propositions. Nodes and propositions are then organized in a hierarchical manner to show the unfolding of macroscopic to grainy, underlying substructures [21]. Where assessment of learning is concerned, concept mapping is also known to expose learners' misconceptions [21,23]. As an educational institute, we cannot ignore assessment for learning. The literature shows that concept mapping abilities are weakly related to performance test scores [24][25][26]. Some found a facilitative effect [27], while others found a correlation with more traditional tests when used as a tool for talent selection into niche curricula [28]. Another critical consideration was how to uphold academic integrity in whatever we assessed. Unlike online tests, we thought concept maps required more customization effort and were thus harder to plagiarize. Moreover, any form of direct copying from the lecture materials can be easily spotted.
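To make the terms node, proposition, and hierarchy concrete, the following is a minimal sketch of how a concept map could be represented as data. The class and function names and the small example map are illustrative assumptions, not the format students submitted nor a tool used in the course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """Two nodes joined by a labelled link, e.g. atom -> contains -> nucleus."""
    source: str
    link: str
    target: str


def unlabelled(propositions):
    """Return connections that lack a link label, so state no proposition."""
    return [p for p in propositions if not p.link.strip()]


def hierarchy_depth(propositions, root):
    """Count the hierarchy levels reachable from the root node."""
    children = {}
    for p in propositions:
        children.setdefault(p.source, []).append(p.target)
    level, frontier, seen = 0, [root], {root}
    while frontier:
        level += 1
        next_frontier = []
        for node in frontier:
            for child in children.get(node, []):
                if child not in seen:  # guard against cyclic links
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return level


# Hypothetical partial map on atomic structure, for illustration only.
example_map = [
    Proposition("atom", "contains", "nucleus"),
    Proposition("atom", "contains", "electrons"),
    Proposition("nucleus", "contains", "protons"),
    Proposition("electrons", "occupy", "orbitals"),
    Proposition("electrons", "", "chemical bonding"),  # line drawn, no label
]

print(hierarchy_depth(example_map, "atom"))
print([(p.source, p.target) for p in unlabelled(example_map)])
```

In this sketch, a line drawn without a label is flagged separately because, by the definition above, only labelled connections form propositions.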

Design of the Concept Map Tasks
The topics chosen for this assignment were atomic structure, chemical bonding, chemical kinetics, and equilibria. Atomic structure and chemical bonding were bundled as one assignment set, while kinetics and equilibria were bundled as the second. In both assignments, we included a short practice question worth about 20% of the maximum marks. The objective of the practice question was to allow students to apply basic concepts to a "test-style" question. We chose these topics for two reasons. Firstly, atomic structure can be readily integrated with chemical bonding, and likewise reaction kinetics with equilibrium. Secondly, topics on solution preparation and stoichiometry were already extensively assessed in the laboratory. A concept map assignment focusing on these four topics would address the remaining learning outcomes.
Concept mapping has a long history of application in science education and professional development [22][24][25][26][27][29][30][31]. However, this tool was entirely new to us. None of the authors had prior experience using concept maps for teaching and assessment. It was unsurprising that some colleagues were initially uncomfortable with grading an open-ended piece of work, with no telling how varied students' responses could be. One other concern was that students would simply have no idea how to design a concept map. These are valid issues, since concept map grading requires student training and is rather time-consuming and subjective [21,32].
To implement the assessment, we also needed to articulate the task, response format, and grading system [33]. We implemented a constrained task [33] by providing key phrases and concepts for students to map. For example, in atomic structure and chemical bonding, we asked for summaries on atomic and electronic structure, molecular shapes, intermolecular forces, and chemical bonding, and asked students to show how these core concepts were linked. For the kinetics and equilibria assignment, we required students to map out how rate and equilibrium expressions are written and also the factors affecting rate and equilibrium. In terms of the mode, given the pandemic, we chose hand-written or computer-typed responses [33]; oral interviews were not practical given the cohort size. Though we initially allowed photographed work to be submitted, these submissions were eventually returned to students (post-CB) in exchange for the originals, on the grounds of academic integrity. To address the possibility that students were not experienced mappers, we provided a very simplified, partial concept map on atomic structure (see Figure 1). In addition, we relaxed the task demands to allow students to submit alternative formats, such as notes in bulleted or table form, or a hybrid.
Educ. Sci. 2020, 10, x FOR PEER REVIEW 5 of 15
We customized a generic grading template, provided by school academic advisors and various colleagues outside the chemistry team, to suit the needs of the course. The generic rubric included criteria that were literature-based best practices [21], such as grading for connected nodes with labelled lines, valid propositions, and hierarchical structuring of concepts from broad to narrow. Based on this, the subject team customized a 3-band rubric ("Excellent", "Good" and "Poor") to grade the validity of the core concepts, the extent of connectedness between core concepts, and the sequencing of presentation. Table 1 shows a summary of the grading rubric.
Within the "Excellent" and "Good" bands, we allowed for a one- to two-point variation in scoring. "Poor" was easier to score and thus an absolute zero was used. As seen from the rubric, we intentionally placed more weightage on the proposition criteria, as we were aware that students did not have much practice in concept mapping. Complex skills such as hierarchical structuring and sequencing might be challenging at the first attempt.
Note to Table 1: Except for "Poor", a one to two-point range was used in the "Excellent" and "Good" bands.
The first assignment brief on atomic structure and chemical bonding was released as a video clip on the LMS in week 5, around the middle of May 2020. This was timed purposely so that students would have already viewed the required lecture videos. The grading rubrics and academic discipline rules were also communicated to the students. To deter plagiarism, students were required to submit a declaration of originality with their work. Students were given four weeks to submit their work in person during the laboratory sessions in week 8 or 9. Refer to Appendix A, Table A1.

Data Collection and Analysis
Two of the co-authors co-graded the first assignment. A month later around week 15, grading was completed and samples of students' work were discussed and shared amongst the authors in a meeting. The intention was to flesh out what we thought were strong and weak maps, and to discuss how to help the students make a better map in the second assignment. One of the co-authors then made an 11-min review video clip to bring students through the mistakes in the practice question. A logical sequencing between atomic structure and chemical bonding concepts was also presented to the students. An exemplar map submitted by a student was shown in the review clip. The second assignment brief on reaction rates and equilibrium was released in week 13 and students were again given four weeks to turn in their work in person during one of the laboratory classes. The main author and one other co-author joined in the grading work. The marking team thus comprised the two experienced co-authors who had earlier marked the first map, plus two new ones. The instructors exchanged notes to briefly calibrate grading expectations after trial marking one to two classes.
Given that concept maps were a very new form of assessment for us, we decided to implement a very quick "dip-stick" survey to determine students' perceptions. The survey was distributed using the Microsoft Forms platform, which all students could access using their institutional accounts. This took place after the release of the assignment 1 review clip. A Yes/No question was first posed to students, and depending on their response, they were routed to an open-ended question to elicit positive or negative comments. The questions were:
• I found the concept map assignment useful in consolidating my learning (Yes/No).
• I found the concept map assignment useful in consolidating my learning because _____.
• I did not find the concept map assignment useful in consolidating my learning because _____.
To complete our assessment plan, we graded a set of post-laboratory tests (open-book) and fill-in-the-blanks reports. In addition, we also implemented one written, 15-mark short test to assess the last topic on ionic equilibria during one of the laboratory lessons. To prevent students from circulating the test contents to the next group of peers, we designed three versions per assessment, matched to the same level of difficulty. Instructors were allowed to randomly select a particular test version for a particular class in the week.
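The random choice of a test version for each class can be sketched as follows. The version labels, class identifiers, and the optional seed are hypothetical; the text only states that three matched versions existed and that instructors could randomly select one for a given class in the week.

```python
import random

VERSIONS = ("A", "B", "C")  # three matched versions; labels are hypothetical


def assign_versions(class_ids, seed=None):
    """Pick one test version at random for each class.

    A seed is optional and only makes the draw reproducible for record keeping.
    """
    rng = random.Random(seed)
    return {class_id: rng.choice(VERSIONS) for class_id in class_ids}


# Hypothetical class identifiers for one week of laboratory sessions.
week_classes = ["CE01-A", "CE01-B", "FS02-A", "FS02-B"]
print(assign_versions(week_classes, seed=2020))
```

Because each class draws independently, two halves of the same class may or may not receive the same version, which is what makes circulated test contents less useful to the next group.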
Frequency responses from the first Yes/No question were tabulated by diploma courses. The open-ended comments from the two free-response questions were qualitatively gleaned to obtain high-level perceptions on the utility of the tasks. As the focus of this paper is on perceptions of the concept map task, we present a qualitative sketch of the work quality, instead of undertaking a quantitative analysis of students' performance.

Findings of Concept Map Survey and Informal Student Feedback
The perception survey garnered 351 responses, a response rate of approximately 69.6%. Chemical engineering freshmen made up about 29%, with each of the other diploma courses accounting for about 16% to 19%. This profile is fairly representative of the subject cohort, as there are usually six chemical engineering classes, while the rest of the courses typically make up three to four classes each. Table 2 presents the breakdown of survey responses by courses. Overall, 87.7% of the respondents felt that the concept map assignment was useful to them. Table 3 provides the breakdown of "Yes" and "No" responses across the five diploma courses.

Table 3. "Yes" (N = 308) and "No" (N = 43) responses by diploma courses (columns: Diploma Course, Yes, No).

Students who found the concept map useful often mentioned that it enabled them to re-organize their learning into something visual. Comments also showed that some students understood linkages between concepts as they wrote up their assignment. Some comments also revealed the underlying "cognitive struggle" that students were engaged in. Not only do students need to grasp the domain knowledge; concept mapping also requires thought organization, presentation, and language skills to convey meaning to another map reader ([21] p. 17). Representative comments reflecting such sentiments are provided in Appendix B.
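The headline percentages can be checked directly from the counts reported in the text (504 students in the cohort, 308 "Yes" and 43 "No" responses). The short sketch below only reproduces that arithmetic; the variable names are ours.

```python
cohort = 504           # students reading the course
yes, no = 308, 43      # survey responses reported for Table 3

respondents = yes + no
response_rate = 100 * respondents / cohort
yes_share = 100 * yes / respondents

print(f"{respondents} responses, response rate {response_rate:.1f}%")
print(f"'Yes' share of respondents: {yes_share:.1f}%")
```

Both figures agree with the values quoted above to one decimal place.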
As for the negative comments, some students felt that they had other study styles and that making a concept map was incongruent with them. Accordingly, some students felt that the assignment constricted them to a particular style; others even felt that it was a waste of time, as they were simply "regurgitating" contents from the notes and paraphrasing them into a certain structure. Some comments alluded to cognitive or language challenges, such as the inability to link concepts together, the effort expended to design a map, or the difficulty of succinctly paraphrasing their work. Some of the comments also showed that students preferred a more traditional mode of assessing domain knowledge, such as quizzes or practice questions. Some also commented that concept maps are best used as study techniques, rather than graded assessments. In the students' view, the ability to answer traditional assessment questions was a better indicator of their academic achievement. Representative comments reflecting these sentiments are shown in Appendix C.
Clearly, the positive comments attest to the strengths of concept maps in helping learners consolidate and organize content across different media (be it print or video) and then re-represent these structures logically. This finding was consistent with earlier studies [24,30]. Interestingly, even for some nay-sayers, mapping still played a facilitative role, as the task itself forced them to confront and resolve their learning gaps by reviewing study materials. As one student wrote: "for those parts of the topic where I was unsure, I still felt equally unsure after completing the summary assignment. Only after re-watching the lecture videos and watching some educational videos online was when I fully understood the topics." From the comments, students appeared to be more actively engaged in making efforts to grasp the content or to represent their knowledge structures visibly. Proponents of concept mapping acknowledge that concept mapping is not an easy task. It requires intentional effort and intrinsic motivation, just as active learning should [21,25].
Negative comments were symptomatic of the cognitive challenges of the task, and were similar to literature findings [24,25,30]. It was unsurprising that some students encountered difficulties in linking concepts together, found concept maps a waste of time or time-consuming to do, or found them incompatible with their learning style. It was also fairly evident that students were more comfortable being assessed with familiar assessments such as written tests and quizzes, for they felt that concept mapping was unrepresentative of their academic performance. Interestingly, student perceptions appeared to echo past work, in that mapping skills are not related to test-taking abilities [24][25][26].
Given the deep-rooted role of assessment in students' educational experience, it was not surprising that they came with an exam-oriented mindset. The fixation with examinations and tests might also have made it more challenging to convince students of the long-term benefits of concept mapping skills. In one interesting exchange, a student messaged the main author to ask how students' knowledge would be assessed with no written tests or examinations. The main author reminded the student that the post-laboratory tests, concept map assignments, and the short written test would have addressed all the concepts. Perhaps the final reply of this student reflected the impact of educational conditioning: "Mmm yeah i get it now thank you. I was thinking this way because i tend to do better in exams haha".

Qualitative Feedback and Perceptions of Staff on Student Performance
As expected, staff noted that indeed, students were unable to make deep connections and linkages. Most of the linkages were shallow, where lecture notes contents were paraphrased or presented in another way. From the quality of the work submitted, it appeared that there was a lack of integrative reconciliation ( [21] p. 104). Integrative reconciliation occurs when the learner recognized the relationship between related concepts, resulting in fine, differentiated levels of understanding. For example, one way we could have seen integrative reconciliation was to compare similarities or differences between the factors that affected rate versus equilibrium. Very few students picked this up. Another extension along the same idea was the fact that solid reactant concentrations are excluded in rate and equilibrium constant expressions. The practice question involved a computational problem where a solid reactant was used. The concept of omitting solid reactants in equilibrium expressions was not explicitly taught in the lecture material, so students had to do some research and reading up independently. The instructors noted that many students sought guidance on how to tackle the question. However, very few students extended their learning from the computational problem to their concept map, to realize that solid concentrations are excluded from both rate and equilibrium considerations. Integrative reconciliation is a good indicator of deep learning and could trigger an "aha" moment, or "felt significance" that comes with gaining profound insights ( [21] p. 18).
In terms of sequencing of ideas, few maps had unique or novel structures. The most common one was simply a node of "rate" or "equilibrium", which directly branched out to factors such as "reactant concentrations", "temperature", and "catalyst". One student left a deep impression by starting the map at the top level with "Reactions", followed by "reversible" and "irreversible". This student then proceeded to connect rate and equilibrium from these nodes. This was perhaps the most representative of a well-defined logic of sequencing, starting from the macro idea down to the grainier concepts. See Figure 2 for the reproduced schematic map by the student. There were also a few misconceptions which surfaced. For example, one student thought that activation energy was linked to temperature and reactant concentrations (by connecting a bubble labelled as "factors that lower activation energy" to these factors). Another thought that a catalyst allowed the activation energy to be achieved faster, and so more products could be obtained. These misconceptions were obvious only if students explicitly used language or arrows to communicate their (flawed) ideas. Thus, language is a double-edged sword in that it either reflects correct understanding or exposes misconception. Language expression is part of the critical learning experience from concept mapping ([21] p. 17).
One unsettling observation emerged as the assessments unfolded. Staff noted that although the concept map scores were acceptable, students did not do as well in the other written assessments such as the post-laboratory tests and the short test. This was perhaps expected, since the literature shows that concept map skills are not clearly associated with test performance [24][25][26]. However, this observation warrants more detailed research for two reasons. First, the (more favorable) weightage allocated to valid expressions of core concepts could have led to this self-fulfilling outcome; this was an aspect we expected students to cope better with, as the quality of work did show. Second, the concept map assessed different content from the written tests, so a direct comparison may not be meaningful. As a future practice, we could follow up with a written test to assess concepts directly related to the concept map assignment. This would help students perceive the (assessment) value and importance of the concept map and so increase acceptance.
Intentionally downplaying the connection and hierarchy scores served a pragmatic purpose for assessment parity, but resulted in poorer differentiation of learning quality, as noted by one of the co-authors. This practice could also mar the educative validity of the assessment. We propose adopting a tiered, factor-scoring approach that places a premium on hierarchy and integrative reconciliation. For example, every valid hierarchy would score 3 to 10 times the points of the associated valid proposition, and every valid integrated concept would score two to three times the points assigned to the hierarchy ([21] p. 107). Giving correct examples could also be scored positively. In future, we could also conduct online consultations during the lecture periods to provide training and support for students. For example, we could give guided practice on map construction, show examples of strong and weak maps, or expose students to online concept mapping tools like Bubbl.us.
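The arithmetic of the tiered approach can be illustrated with a minimal sketch. The Python snippet below is only an illustration under stated assumptions: the names `MapCounts` and `tiered_score`, the default factors (5 for a hierarchy, 2 for an integrated concept), and the one point awarded per correct example are hypothetical choices within the ranges quoted above, not values prescribed by [21] or used in our marking.

```python
from dataclasses import dataclass


@dataclass
class MapCounts:
    """Counts of valid elements a marker identified in one concept map."""
    propositions: int   # valid concept-link-concept statements
    hierarchies: int    # valid levels of hierarchy
    integrations: int   # valid integrated concepts (integrative reconciliation)
    examples: int = 0   # correct examples attached to concepts


def tiered_score(
    counts: MapCounts,
    proposition_points: float = 1.0,   # assumed base value
    hierarchy_factor: float = 5.0,     # assumed; quoted range is 3 to 10
    integration_factor: float = 2.0,   # assumed; quoted range is 2 to 3
    example_points: float = 1.0,       # assumed; the text gives no value
) -> float:
    """Return a tiered score in which hierarchy and integration carry a premium."""
    if not 3 <= hierarchy_factor <= 10:
        raise ValueError("hierarchy_factor should lie between 3 and 10")
    if not 2 <= integration_factor <= 3:
        raise ValueError("integration_factor should lie between 2 and 3")

    # Each hierarchy is worth a multiple of one proposition,
    # and each integrated concept a multiple of one hierarchy.
    hierarchy_points = hierarchy_factor * proposition_points
    integration_points = integration_factor * hierarchy_points

    return (
        counts.propositions * proposition_points
        + counts.hierarchies * hierarchy_points
        + counts.integrations * integration_points
        + counts.examples * example_points
    )


if __name__ == "__main__":
    student_map = MapCounts(propositions=12, hierarchies=3, integrations=2, examples=2)
    print(tiered_score(student_map))
```

With these assumed defaults, a map with 12 valid propositions, 3 valid hierarchies, 2 integrated concepts, and 2 correct examples would score 12 + 15 + 20 + 2 = 49 points, so the two integrated concepts contribute more than the twelve propositions combined. Note that the total has no upper bound, since it grows with every additional valid element a student adds.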

Impact of "Split-Lab" on Student Learning
The impact of safe distancing on laboratory scheduling and student engagement was another noteworthy lesson. We were largely able to continue with the existing laboratory curriculum during the pandemic while adhering to safe distancing guidelines. These guidelines primarily meant that classes did not mingle during transitions, particularly at arrival and dismissal. After splitting one class of 25 students into two laboratory sessions timed one week apart, instructors noticed that students were more engaged and proactive in their laboratory work. In previous semesters, students worked in pairs, and it was entirely possible for a student to assume a "free rider" attitude by depending entirely on a more competent buddy. COVID-19 squarely placed a learning challenge on the students: they now handle and manage all the bench work individually. With a smaller class size, instructors also reported less stress in classroom management and were able to devote more time and attention to skills guidance. The instructor team also thought that this scheduling would facilitate the future implementation of a skills observation test. While the "split-lab" scheduling arrangement added more strain to laboratory support resources, it enhanced student learning, as expected. In hindsight, holding out for face-to-face classes, coupled with this scheduling option (with no other choice in sight), reaped benefits for both instructors and students.

What is Next: Unanswered Questions in Higher Education Assessment Practices Post-COVID
Our experiences directly informed us that buy-in from students could influence the perceived efficacy of new forms of assessment. However, the dilemma was that during the COVID-19 implementation, it was impossible for us to change mindsets quickly. Students, through their high school years, have encountered mainly written examinations and tests as indicators of their academic achievement. Almost overnight, their first lesson experience in our freshman course was thrown into disarray. Their suspicions about the validity of the assessment are understandable. In addition, student motivation and readiness are crucial in an online course, a finding that is already well documented in the online education literature [4,34].
Deploying four instructors to mark the second assignment eased the marking load. However, we did not specifically harmonize the grading of each individual piece of work, except to set some broad guidelines. We found it difficult to design an instructor exemplar map for validation purposes, a practice recommended by some authors [32]. Likewise, it was not possible to exhaustively identify all the combinations of hierarchy and connections of concepts. This means that the tiered, factor-based grading approach ([21] p. 107) is of limited practical use for now. It also means that concept maps are inherently not capped by a maximum score [28], and perhaps do not lend themselves very well to traditional scoring methods. They are highly individualized pieces of expression akin to a painting, whose value and beauty lie in the eye of the beholder ([21] p. 97). In assessment and evaluation, this is a potentially discomforting tension for institutions. Proponents of concept maps acknowledge that there is some subjectivity, but contend that this weakness does not compromise construct validity; the concept map gives a fairly good indication of deep, interleaved learning ([21] p. 105). Our current work cannot yet provide an answer to the issue of score consistency and validity. Perhaps more research is needed before concept maps are more widely accepted by all stakeholders (students, instructors, and institutions). Such research could look into ways to enhance student receptivity or into the design of a standardized marking rubric that clearly accounts for the multi-dimensional flavor of concept maps. Taking a step further, what about a future where the race to the top of academic performance is not capped by a ceiling? After all, since talent has no limits [28], this scenario is not totally unfathomable.
COVID-19 has also challenged institutional resources in terms of laboratory scheduling and support. As we move into the fall semester, the plan is to continue with asynchronous lectures and synchronous tutorial lessons. However, the "split-lab" continues to be enforced for adherence to national guidelines. Safety remains the top priority and thus precludes all other scheduling options. Clearly, with a smaller laboratory class size, there are beneficial outcomes in both student learning and classroom management. The key question is how long the institution can sustain this format. While the team could explore live streaming of laboratory classes [15], the manpower set-up is fairly laborious and even more challenging to manage. As of now, we hope to continue with on-campus classes. We are still caught in the tension between economic efficiency and student outcomes, for which there is no direct answer yet.

Conclusions
This reflection paper describes the team's experiences in managing the HBL implementation and assessment of a freshman-level general chemistry course during the COVID-19 pandemic in Singapore in the April semester of 2020. Besides the steep learning curve to produce recorded content and deliver synchronous lessons on newly learned technology platforms, the team had to grapple with how to conduct laboratory classes under safe distancing and how to substitute traditional, high-stakes written assessments. While the "split-lab" unsurprisingly improved student engagement and classroom management, it came at the expense of manpower resources and teaching facilities.
Although the use of concept maps is not a novel idea in science education, the team had no prior experience with it. In the eventual decision to roll out this assessment approach, we had to balance administrative guidelines concerning academic integrity, student workload, and how much we could realistically manage as we raced to continue with "business as usual". The outcomes informed us that we could better scaffold students' development of concept mapping skills in future rounds of implementation, and integrate concept maps as part of our existing assessment approaches. These are issues we can address at the ground level. However, there are others that are more challenging to resolve, which originate largely from pragmatic, age-old mindsets and practices of various stakeholders. Examples include students' perception of alternative assessments, the practice of putting a maximum mark on grading, possible institutional misgivings about the inherent variability in this assessment approach, and the effort required for grading. These concerns provide grounds for meaningful research in the future, as we hope for a fast return to a new normal.
Author Contributions: Original draft preparation, results analysis, P.N.L.; First subject coordinator role, inputs and review, Y.T.C.; Second coordinator role, inputs and review, Y.T.; Inputs and review, X.X. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no funding.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix B
Examples of positive comments from the concept map survey (all grammatical or spelling errors are verbatim):
• "it an easier way for me to study because its more visual and organized"
• "the content is more organised and visual"
• "it allows me to revise my notes and make it more compact so its easier"
• "it gave an overview of how the topics are connected"
• "it allows me to recap on the topics being taught to me and circulating my thoughts on how to convey it."
• "it pushed me to make my own notes instead of using the notes provided to me."
• "It made me relook at my notes, lectures and tutorials, and I have to rewrite them down into my own words, so it made me more familiar with the topic"
• "it summarises what ive learnt and allows me to view the topic as a whole. In addition, it allows me to pin point the important details"
• "it made me think and analyse more deeply about how the different topics are connected"
• "it structures my thoughts, allowing me to have a clear picture"
• "it gives me the opportunity to find out and know what are the important concepts rather than learning every words mentioned in the lecture video by heart. in other words, it helps me to differentiate between what is relevant and what is not."

Appendix C
Examples of negative comments from the concept map survey (all grammatical or spelling errors are verbatim):
• "trying to connect everything through mindmaps or point form don't make it easier to understand, it actually makes it more confusing for learning"
• "I do not know how to link them together"
• "I felt that I didn't properly understand how to do the linking between topics to make them interconnected and hence got stuck at that part"
• "It proved difficult for me to try and link main concepts of different topics together and also, the outcome of the concept map was too messy, with too many words and arrows linking to each other, which made me even more confused"
• "it took me very long to plan and think of what to pick and type/write out."
• "it was difficult to paraphrase in our own words, since it is science usually keywords and sentences are fixed. It may have been mistaken for plagiarising but in reality we were just writing proper scientific sentences"
• "I would prefer it not to be graded as I think concept maps and notes are done to help with my own revision and not done for a graded assignment."
• "its a summary and i do my own notes so this is kind of wasting my time"
• "i already have a preferred learning method so while i dont think the summary assignment way of learning if bad, its not something i would take over what i usually do."
• "im suppose to follow a certain format to consolidate my learning instead of doing it the way i want to. Thus, my primary focus for this assignment is to get the format right, instead of focusing on the content that ive learnt is"
• "is just copying of the slides waste a lot time can just read the slide and understand plus some people do their own notes also then need do this notes again waste a lot time."
• "the idea of a concept map assignment is pointless and stupidly time consuming, we shouldnt be graded by how much information we can write down in a summarized form when what actually matters is our understanding of the subject. having concept map assignments is a nod to the fact that the school has run out of ways to allocate our grade and therefore uses this approach to tabulate our grade for the semester. its pointless to grade us on our ability to rephrase and regurgitate the lecture notes onto a piece of paper or a blank document."
• "i prefer more practice questions"
• "i prefer worksheets"
• "I felt like there was a better way to ensure that we are up to topic"
• "I don't think seeing how the concepts link together helps me in memorizing the content."
• "I think it would be more beneficial to answer structured questions instead."
• "Personally concept maps do not reflect my understanding of a topic. I think that small quizzes are more useful when it comes to consolidating my learning, but that is just my learning style."