Online Professional Learning in Response to COVID-19—Towards Robust Evaluation

As COVID-19 continues to impact education worldwide, systems and organizations are rapidly transitioning their professional learning to online mode. This raises concerns, not simply about whether online professional learning can result in equivalent outcomes to face-to-face learning, but more importantly about how best to evaluate online professional learning so we can iteratively improve our approaches. This case study analyses the evaluation of an online teacher professional development workshop for the purpose of critically reflecting upon the efficacy of workshop evaluation techniques. The evaluation approach was theoretically based in a synthesis of six seminal workshop evaluation models, and structured around eight critical dimensions of educational technology evaluation. The approach, involving collection of pre-workshop participant background information, pre-/post-workshop teacher perceptions data, and post-workshop focus group perceptions, enabled the changes in teacher knowledge, skills, and beliefs to be objectively evaluated, while also providing qualitative information to improve future iterations of the workshops along a broad range of dimensions. The evaluation approach demonstrated that the professional learning that was shifted into online mode in response to COVID-19 could unequivocally result in significant improvements to professional learning outcomes. More importantly, the evaluation approach is critically contrasted with previous evaluation models, and a series of recommendations for the evaluation of technology-enhanced teacher professional development workshops is proposed.


Introduction
The recent COVID-19 crisis has impacted education across an enormous variety of industries [1-3] and sectors [4-6], leading to the rapid transition of learning to online mode. Educational experts warn against making direct comparisons between previous face-to-face teaching and what they call "emergency remote teaching" that has occurred in response to COVID-19 [7]. However, it is clear that facilitating online learning can be challenging for teachers, and that professional learning is needed so that teachers have the capabilities to teach effectively online [8-10].
Consequently, effectively evaluating the impact of online professional learning workshops is crucial, in order to understand the impact of different professional learning strategies and how they can be improved. There are many different theoretical models for undertaking and evaluating professional learning of teachers, for instance, the Guskey model [11], the Kirkpatrick model [12], and the interconnected evaluation model [13], to name a few. Yet, when it comes to evaluation of online professional development workshops on the pedagogical use of technology, there is a notable lack of rigor with respect to the theoretical framing, data collected, and analyses used. In fact, a recent review of professional learning evaluation approaches [14] found that only 5 out of 41 studies used any theoretical framing to guide their evaluation, a minority (18 out of 41) used both quantitative and qualitative evaluation methods, and evaluations typically focused on only a narrow range of educational dimensions.
The purpose of this article is two-fold. Firstly, the paper aims to critically examine approaches to workshop evaluation, with a view to determining optimal practices. Secondly, the paper reports on a case study evaluation of a professional learning program focused on the use of technology for teaching that was rapidly switched from face-to-face to online mode in response to COVID-19. The case study demonstrates that it is possible to run impactful professional learning in online mode, but more importantly, is used to highlight salient issues relating to workshop evaluation approaches. Based on this critical examination, we propose a range of recommendations to underpin the evaluation of online professional learning workshops. Limitations and future directions are also proposed.

Background
The role of technology in education has been the subject of a great deal of interest within the research community in terms of its importance in supporting the delivery of quality teaching and its implications for student learning outcomes [15]. Communication technologies have been of critical importance in the recent COVID-19 pandemic, where online tools have enabled continuity of learning around the world. For instance, teachers often use Learning Management Systems (LMSs) and other online technologies to provide access to learning materials and interactive learning experiences within the higher education domain, high schools, primary education, and even early childhood settings [16]. However, supporting the effective use of such tools requires corresponding professional learning to be provided, necessitating thorough and responsive planning [17]. One of the most commonly used ways to provide teachers with professional training is through professional learning workshops [18]. Such programs ideally map the student learning outcomes of the curriculum to teachers' skills, knowledge, and beliefs with the goal of preparing the teacher for practice. Fundamentally, the goal of such programs is to change the teacher by making improvements in teachers' knowledge, self-efficacy, pedagogical beliefs, domain expertise, and the learning environments they offer [19].
The effective measurement of teacher change as a product of professional learning workshops is often supported through the deployment of professional development evaluation models [20]. A range of theoretical models have been used to underpin professional development evaluation, each with varying emphases. One of the most relevant explanations of teacher change with respect to teacher professional development is the one originally presented nearly four decades ago by Guskey [11]. According to Guskey, changes in teachers' beliefs and attitudes are influenced by teacher observations of changes in students' learning outcomes, which in turn can be influenced by the impact of the professional learning on teacher practice. Logically, appropriate evaluation of teacher professional development programs requires the concept of teacher change to be considered as the basis of evaluation design. In a recent review of 41 technology-focused professional development workshop studies, only one of the workshops used the Guskey model [14]. Using the Guskey evaluation model in that study enabled analysis of participants' learning, their reaction to the professional development, the organizational support provided to the teachers, and participants' use of new knowledge and skills in applying ICT to teaching, as well as identification of further training needs. However, while Guskey's evaluation model maintains an adequate level of focus on teacher change, it does not focus on the professional development itself. More specifically, Guskey's model does not place substantive emphasis on the design, participant needs, and organizational requirements of the professional development.
The evaluation of teacher change has not been the only evaluation measure for professional development programs. For example, Kirkpatrick's 4-layer model [12] forms professional development evaluation around teachers' reactions to the program, behaviors, learning, and longitudinal impacts. This model performs the evaluation of the workshop in a linear fashion where all four layers of the evaluation are of equal value. This model is the most used evaluation model in the area of professional development workshop evaluation, with 3 out of 41 studies in the review by Ahadi et al. [14] using it. According to the findings of that review, the Kirkpatrick model enabled analysis of participants' reactions, exploration of evidence of learning, examination of behavioral changes, and the seeking of evidence of results based on changes in the workplace as a result of the training intervention. However, this model does not consider characteristics of the organization and work environment, or characteristics of the individual trainee, as crucial input factors which can impact the effectiveness of the professional development program [21].
Over time, more and more professional development program evaluation models have been introduced to address different evaluation goals. For instance, the CIPP (Context Input Process Product) evaluation model [22] emphasizes program improvement more than program effectiveness, systematically guiding both evaluators and stakeholders in posing relevant questions and conducting assessments at the beginning of the professional development program, while it is in progress, and at its end. The interconnected evaluation model [13] focuses on the different paths that professional development can take, whereby changes in different areas, such as attitudes and beliefs, knowledge and skills, student outcomes, and teacher behavior, do not form a linear relationship and can be of equal value. Holton's HRD (Human Resource Development) evaluation model [23] is specifically designed to detect causal influences on professional development outcomes, motivated by research findings showing that changes in teachers' learning, individual performance, and organizational performance could be impacted by explicit influences, such as motivation to learn, transfer climate, and external events. Desimone [24] introduced a critical-feature-based measurement for analyzing the impact of teacher professional development interventions. This model focuses on measuring the features of professional development activities that make them effective for increasing teacher learning and changing practice, and ultimately for improving student learning, including content focus, active learning, coherence, duration, and collective participation.
According to the findings of Ahadi et al. [14], none of the professional development technology workshop evaluations reviewed completely utilized all elements of the aforementioned evaluation models. In fact, the results of that study revealed that the majority of the professional development workshop evaluations lacked any sort of theoretical basis, included subjective measures of teacher change, and did not draw upon more than one professional development model to overcome the limitations of each. Additionally, the few studies (5 out of 41 papers) that did mention the use of the Kirkpatrick or Guskey model did not fully utilize these models and only took some parts of the evaluation framework into account.
Previous work by Lai and Bower [25] has shown that evaluation of technology use in education is usually limited to only a few dimensions. The review by Ahadi et al. [14] of technology-focused teacher professional learning studies showed that the constructs evaluated were typically limited to only a few of the eight dimensions examined by Lai and Bower [25] (learning, behavior, affective, technologies, pedagogies, design, community, institution). More specifically, very little attention has been given to the evaluation of the institutional environment (7% of the papers) or the degree of presence or community experienced (2%). Surprisingly, only a minority of studies examined pedagogical aspects of the workshops. If professional learning in the use of technology for education is to be comprehensively evaluated, then it is important to understand how evaluation theories and models can be effectively applied across a broad range of pertinent evaluation dimensions.
In this case study, we critically examine the planning, implementation, delivery, and, more particularly, a framework for evaluation, for a series of online technology enhanced pedagogy teacher professional development workshops, to showcase some of the critical issues at stake. We show how a comprehensive and robust evaluation approach, based on a synthesis of theoretical models and eight dimensions of educational technology evaluation can provide rigorous analysis of teacher professional development workshops. The paper also casts critical reflections on the use of theory, breadth of evaluation, objectivity of measures, use of qualitative data, and assessment of longitudinal impact, as well as pragmatic and industry considerations, when evaluating professional learning in the education field. The aim is to provide a framework for rigorous evaluation of (but not limited to) teacher professional development workshops that support the use of technology in education.

Background Context
In early 2020, Cinglevue International, a company specializing in the development of innovative learning technology solutions, and Macquarie University developed a series of free professional development workshops to explore the ways that teachers can utilize technology to enhance classroom pedagogies. The aim was to support participants to discover how technology can be used to help students learn through acquisition, practice, inquiry, discussion, and collaboration, in line with Laurillard's Conversational Framework [26]. Practical examples were to be provided using a learning management system called Virtuoso, being developed by Cinglevue International. The workshops were also intended to showcase how learning design and analytics could be used to support the individual needs of students across different areas of a designated curriculum. The professional learning was designed to be applicable to teachers in all subject areas and across all year levels, including pre-service teachers. These workshops were to involve a reflective research component, and participants who completed all aspects of the research reflection were to receive a shopping voucher.
The initial intention was to run face-to-face workshops; however, by March of 2020 it became clear that it would not be possible to operate in face-to-face mode due to the rapidly escalating COVID-19 situation. This led to the workshops being rapidly transformed into online mode. In particular, the workshop was adapted so that:
• a web-conferencing system was used to enable remotely located participants to still benefit from the live presentation and interaction that they would have received in face-to-face mode;
• the structure and instruction were more tightly defined and pre-planned, so that participants could more easily follow them remotely;
• the content was more explicit and self-contained to account for the fact that participants would not as easily be able to ask individual questions of facilitators during the presentations;
• explicit opportunities were inserted into the workshop to ask participants about their thoughts regarding the professional learning and technology;
• slides were placed online (into Google Slides) so that all presenters in different locations could update and work from a common presentation; and
• the final professional learning reflection was conducted as an online rather than face-to-face group discussion.
This led to a workshop that the team felt could be effectively implemented in online mode, with the specific content and structure of the workshop explained in the next section.

Webinar Content
The workshop was implemented as a 4-hour webinar, with the first session held in April 2020, followed by two repeated sessions implemented in April and May of 2020. The workshop was delivered over a single half-day during Australian school holidays in order to mitigate disruption to the normal teaching schedule. The webinars were hosted online on CISCO Training Center. Figure 1 provides an overview of the content explored in the workshop. Each webinar was composed of five main sessions where, in each session, one type of pedagogy (acquisition, practice, inquiry, discussion, and collaboration) was first reviewed in detail, followed by a live demonstration of how teachers could utilize Virtuoso to instantiate the pedagogy being examined. Each session was followed by a discussion to understand participants' perspectives on the reviewed pedagogical component, as well as to receive suggestions, ideas, and any concerns with respect to the pedagogy and its implementation on Virtuoso. In addition, a short session in between the main sessions was dedicated to the review of text mining, and particularly sentiment analysis and topic modeling, followed by a demonstration of how these techniques are implemented as useful toolkits on Virtuoso. In addition to the five main sessions, the last session of the webinar was dedicated to a focus-group session where participants were asked to elaborate on the questions related to the pedagogy, workshop, and Virtuoso. Small rest breaks were dispersed throughout the workshop. By the end of the workshop, it was intended that teachers would achieve the following outcomes:
• Understand, explain, and effectively apply different pedagogies in Laurillard's Conversational Framework (acquisition, practice, inquiry, discussion, collaboration).
• Understand and effectively apply the features of different tools within the Virtuoso learning management and analytics platform to advance student learning outcomes.

Case Study Method
This research used a case study investigation to examine and showcase pertinent themes relating to effective implementation and evaluation of technology-oriented professional learning workshops. A case study is an empirical inquiry that investigates a case or cases by addressing the "how" or "why" questions concerning the phenomenon of interest [27]. This research method is one of the most frequently used qualitative research methodologies [28] and has long been contested terrain in social sciences research, characterized by varying approaches espoused by different research methodologists. The advantage of case study research is that it can be readily used to combine qualitative and quantitative research methods in order to promote reliability of findings and robust reporting. In this instance, the reflective case study enabled the research team to apply theory-based professional learning evaluation approaches and critically reflect on the extent to which the evaluation techniques could be used to enhance future iterations of the workshop implementations.

Participants
The call for expressions of interest in participating in the research component of the workshop was made available at the end of the registration form, where participants were asked to give permission to analyze the data collected during the workshop for research purposes. A total of 64 teachers who showed interest in participating in the workshop gave permission to analyze their webinar interaction data in addition to the data collected via the pre-workshop and post-workshop surveys. Eventually, the total number of workshop attendees was 14, 16, and 12 (42 participants in total) for the first, second, and third webinar, respectively, from which 29 participants volunteered to engage with the research component of the workshop. Based on the data collected from the teachers, the majority of participants were female (62%), with most of the participants teaching early childhood (56%). A total of seven participants were pre-service teachers and the other 21 were in-service teachers teaching a variety of subjects, including maths, music, STEM subjects (science, technology, engineering, and mathematics), and English. Most of the in-service teachers were aged between 25 and 35, with an average teaching experience of 4 years. Further information about the participants is available in the Results section.

Collection of Research Data
The evaluation frameworks reviewed in the literature section of this study provided a data collection schedule for performing professional development evaluation in three stages: pre-workshop, during-workshop, and post-workshop. As a result, the data was collected from the participants at these three times, with pre-workshop data collected during workshop registration (see Figure 2). Figure 2 also summarizes the research motive, evaluation goal, data collection method, analysis method, and analysis environment. The quantitative data collected from the participants included their responses to the Likert scale questions of the pre-/post-workshop surveys. The qualitative data collected from the participants included text data collected from the CISCO Training Center chat environment, the audio data collected from the focus-group sessions, and participants' responses to the open-ended questions of the post-workshop survey. The data collected from the participants was selected to enable the research team to fulfill a range of research functions, which are outlined in the following subsections. Note that some aspects of the qualitative data analysis (such as topic modeling and sentiment analysis) would require extensive explanation and are not crucial to fulfilling the main objectives of this paper; hence, those analysis results have not been included in this article. In addition, given privacy and ethical concerns around the collection of users' interactions with Virtuoso and the CISCO Training Center tool, we have not reported upon system log data collected from the participants during the workshop.

Pre-Workshop Survey
In line with the previous professional development models, the overall objectives of the pre-survey were to:
• Collect background information on participating teachers so that the workshop could be tailored to address participants' needs, requirements, and expectations.
• Establish a baseline for evaluating the change in participants' knowledge, skills, and beliefs regarding the use of technology in education that resulted from the professional learning.
Accordingly, a range of questions were designed to achieve these objectives (see Appendix A for the instrument questions). Questions were in Likert scale format with 7 points used to collect the degree of participant agreement, with the multi-item anchored scale enabling objective comparison and variance to be established.
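To illustrate how a common anchored scale supports comparison, the following minimal Python sketch converts 7-point Likert labels to numeric scores and summarizes one item. It is not the instrument or the analysis used in this study: the label wording, the 1 to 7 coding direction, the function names, and the sample responses are all assumptions made for illustration (the actual items are given in Appendix A).

```python
import statistics

# Assumed anchor wording for a 7-point agreement scale (hypothetical);
# higher scores are assumed to mean stronger agreement.
LIKERT_7 = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Somewhat Disagree": 3,
    "Neutral": 4,
    "Somewhat Agree": 5,
    "Agree": 6,
    "Strongly Agree": 7,
}


def encode(responses):
    """Convert Likert labels to numeric scores, rejecting unknown labels."""
    try:
        return [LIKERT_7[label] for label in responses]
    except KeyError as err:
        raise ValueError(f"Unknown Likert label: {err}") from None


def summarize(scores):
    """Return the mean and sample variance of a list of numeric scores."""
    return statistics.mean(scores), statistics.variance(scores)


# Hypothetical responses to a single survey item, for illustration only.
item_responses = ["Agree", "Strongly Agree", "Neutral", "Somewhat Agree", "Agree"]
scores = encode(item_responses)
mean_score, var_score = summarize(scores)
print(scores, round(mean_score, 2), round(var_score, 2))
```

Because every item shares the same coding, the mean and variance of any two items, or of the same item before and after the workshop, can be compared directly.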

Post-Workshop Survey
In light of the evaluation frameworks reviewed in the literature section of the paper, the overall goals of the post-workshop survey were set as follows:
• Evaluate the perceived quality of the workshop, in terms of the extent to which it supported the improvement of teaching practice, and the degree to which the professional learning was expected to be applied in practice.
• Evaluate the impact of the workshop along a broad range of dimensions, including the extent to which the workshop supported the achievement of the intended learning outcomes, engaged and motivated teacher participation, and the overall quality of the design, technology, and pedagogy used.
• Analyze the usability of Virtuoso and the extent to which it could be utilized in schools to support the enhancement of teaching and learning outcomes (industry outcome).
To this end, the survey participants were asked about their level of satisfaction with the workshop in light of the eight dimensions of the Lai and Bower educational technology evaluation framework [25], enabling broad evaluation of the workshop. They were also asked the same set of seven questions that they had already answered during the registration pre-survey, enabling the research team to somewhat objectively investigate the impact of the workshop on participants' beliefs, skills, and knowledge. Participants were also given the opportunity to provide open-ended feedback about the good and bad aspects of the workshop and suggestions for improvements. This evaluation design accords with the post-professional learning evaluation objectives outlined by Clarke [13], Stufflebeam [29], and Kirkpatrick [12]. The way that the professional learning evaluation approach accords with the theoretical models from the literature is discussed in the Results and Discussion sections to follow. It should be noted that the open-ended questions included items relating to the Virtuoso platform, to provide the industry partner with commercial information about the perceived utility of their platform.

Focus Group Session
The final research component of the webinar was the focus-group session where participants were asked to elaborate on the answers that they provided to previously asked questions in the pre-workshop and post-workshop surveys (see Appendix B for the list of questions used in the focus group sessions). Specifically, the goal of the focus group session was to collect feedback regarding: Harvesting and analyzing such qualitative data aims to provide a deeper understanding of teacher perceptions and dispositions relating to different topics. In addition, using text mining approaches, such as word clouds, topic modeling, and sentiment analysis, allows automatic extraction of valuable information which can be used to gauge teachers' beliefs and needs in more detail. The sessions were recorded via CISCO Training Center, as well as with Echo360 screen capture software as a backup screen recording tool. The transcripts of the focus group sessions were obtained using the otter.ai web portal. The data collected from a total of 29 participants was used as a basis for thematic analysis related to the research aspects of the workshop.
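As a rough illustration of the simplest of these text mining approaches, the word counts that sit behind a word cloud could be computed along the following lines. This is a minimal Python sketch, not the analysis performed in this study: the stop-word list, the function name word_frequencies, and the sample transcript are all hypothetical.

```python
import re
from collections import Counter

# A small, assumed stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {
    "the", "a", "an", "and", "to", "of", "it", "is", "was", "in", "i", "for", "with",
}


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent non-stop-words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


# Hypothetical transcript excerpt, for illustration only.
transcript = (
    "The tool was useful. I liked the tool and the design space. "
    "Useful for lesson design."
)
print(word_frequencies(transcript, top_n=3))
```

Word frequencies only indicate which terms recur; they say nothing about sentiment or context, which is why they would normally be read alongside a thematic analysis of the same transcripts.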

Analysis and Reporting of Data
The descriptive analysis of the quantitative data was completed using SPSS. The test of significance of changes in participants' beliefs, knowledge, and skills as a result of attending the workshop was performed using the R scripting environment. The audio-to-text conversion of the focus-group sessions was performed using the online Otter platform. The qualitative data analysis, including sentiment analysis, topic modeling, and thematic analysis, was performed in NVivo 12. Figures were generated using Microsoft Excel. The Results section below presents indicative outcomes of the workshop evaluation for the purposes of illustration. The results and associated commentary are intended to showcase and interrogate more broadly several critical issues associated with the evaluation of technology-focused professional learning workshops.

Analysis of the Pre-Workshop Survey
Stufflebeam's model [30] is among the few professional development evaluation frameworks that place a great emphasis on the analysis of data collected at the pre-workshop stage. The first stage of Stufflebeam's Context-Input-Process-Product model deals with analysis of the context, where the result of the analysis directly impacts the design, content, and even, in some cases, the professional development objectives. Hence, the first step of the analysis happens prior to holding the professional development. This information was particularly useful since it enabled the professional development to be designed in a way that could address the diverse range of participants (for instance, by making sure the level of explanation catered to both novice and expert teachers).
Structuring questions as 7-point Likert scale items on a common scale from Strongly Agree to Strongly Disagree enabled participants' beliefs, confidence, and experience levels to be gauged and consistently compared. Figure 3 provides a summary of the participants' responses (orange bars in the figure) to those questions. Based on the analysis of pre-workshop survey data, the majority of the participants expressed confidence in their ability to utilize technology to apply a variety of different pedagogies, with 8 participants not feeling confident about utilizing technology to enact pedagogy. All participants were found to be familiar with different teaching pedagogies, which enabled the workshop designers to increase the difficulty level of the pedagogical content. All participants except one expressed confidence in teaching using different pedagogies. These higher levels of confidence in using technology and enacting pedagogy separately, but relatively lower confidence in using technology to enact pedagogy, indicated to the teaching and research team that a focus on developing integrated technological-pedagogical knowledge was important.

Post Workshop Survey Analysis
Evaluation of the immediate impact of the workshop is the most common type of evaluation for technology professional development [14]. The evaluation of the immediate impact of the professional development is the only evaluation phase which is common among the six reviewed professional development evaluation frameworks.
Analysis of pre- and post-workshop surveys enabled participants' knowledge, skills, and beliefs to be compared before and after attending the workshop (shown in Figure 3). The Venn diagram in Figure 4 summarizes the total impact of the workshop on the participants' beliefs, knowledge, and skills in using technology in education. Based on our analysis, nearly half (49%) of the responses show an improvement. Only 12% of the responses did not indicate a change in teachers' level of skills, knowledge, or values as a result of attending our professional development workshop. Figure 5 represents a breakdown of responses according to the questions asked. In order to examine the statistical significance of the changes observed for each question, a paired t-test was conducted (see Table 1). According to the results of the test of significance of mean differences, teachers' confidence about conceptualizing the use of technology in teaching and their confidence in their ability to utilize technology to apply a variety of different pedagogies both improved substantially as a result of attending the professional development program. Teachers' confidence about using technology in education, their belief in the impact of technology on teaching effectiveness, and their confidence in their ability to teach using different pedagogies also improved to a statistically significant degree. This demonstrates that professional development adapted into online mode in response to COVID-19 can still have a significant impact on professional learning outcomes. Teachers' knowledge of different teaching pedagogies and their opinion about the importance of using technology in teaching were, however, found to be unchanged. This may have been because of the relatively high initial ratings that teachers provided, which did not leave extensive room for improvement.
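For readers unfamiliar with the test, the paired t statistic for a single survey item can be sketched as follows. The tests in this study were run in R; the Python version below is only an illustration under stated assumptions: the function name paired_t, the 1 to 7 coding with higher scores meaning stronger agreement, and the pre/post scores are all hypothetical and are not the study's data.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (mean difference, t statistic, degrees of freedom) for paired scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per participant")
    if len(pre) < 2:
        raise ValueError("at least two paired scores are required")
    # Work on each participant's own change, which is what makes the test "paired".
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t_stat, n - 1


# Hypothetical scores for one item from the same eight participants.
pre_scores = [4, 5, 3, 6, 4, 5, 4, 3]
post_scores = [5, 6, 5, 6, 5, 7, 5, 4]
mean_diff, t_stat, df = paired_t(pre_scores, post_scores)
print(f"mean difference = {mean_diff:.3f}, t({df}) = {t_stat:.3f}")
```

The resulting statistic is compared against a t distribution with n - 1 degrees of freedom to obtain a p-value. Because Likert responses are ordinal, a nonparametric alternative such as the Wilcoxon signed-rank test is sometimes preferred as a robustness check.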
More generally, it can be seen that the approach of using pre- and post-workshop Likert scale ratings of skills, knowledge, and beliefs enabled the objective impact of the workshops along sub-elements of those dimensions to be gauged.
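The share of paired responses that improved, stayed the same, or declined between the two surveys can be tallied in a few lines. The following is a minimal Python sketch under assumed data; the function name classify_changes and the scores are hypothetical, and higher scores are assumed to indicate a more favorable rating.

```python
from collections import Counter


def classify_changes(pre, post):
    """Return the percentage of paired responses that improved, were unchanged, or declined."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per response")
    if not pre:
        raise ValueError("at least one paired response is required")
    labels = Counter(
        "improved" if after > before else "declined" if after < before else "unchanged"
        for before, after in zip(pre, post)
    )
    total = len(pre)
    return {
        label: 100 * labels[label] / total
        for label in ("improved", "unchanged", "declined")
    }


# Hypothetical paired ratings, for illustration only.
pre_ratings = [4, 5, 3, 6, 4, 5, 4, 3]
post_ratings = [5, 5, 5, 6, 3, 7, 5, 4]
print(classify_changes(pre_ratings, post_ratings))
```

Reporting all three categories together makes it easier to see whether responses that did not improve were already high (a ceiling effect) or actually moved in the opposite direction.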
Other Likert scale questions, relating to each of the eight dimensions of the Lai and Bower evaluation framework (Affective Elements, Behavior, Design, Institutional Environment, Learning, Presence, Teaching or Pedagogy, Technology) [25], allowed the research team to start unpacking which areas of the workshop may have led to these changes. Figures 6 and 7 review participants' responses to the Likert scale questions of the post-workshop survey (see Appendix C for the list of questions and Table A1 for responses). Based on the results of this survey, it was noted that, overall, participants found this online teacher training workshop a positive experience (see Table A1 and Figure 6).
Additional post-workshop survey items were included to provide the industry partner with valuable commercial feedback about their platform. The feedback received on Virtuoso was positive overall. Two-thirds of the participants expressed interest in using Virtuoso in school, with 17 participants showing interest in attending a follow-up workshop to explore the pedagogical framework in greater depth and to have more hands-on experience with Virtuoso. A total of 15% of the participants expressed that the workshop could have provided more hands-on opportunities to use Virtuoso. As can be seen, the approach of using post-workshop Likert scale ratings of skills, knowledge, and beliefs enabled the objective impact of the workshops to be gauged and industry outcomes to also be met.
The open ended qualitative items in the post-survey, that were in addition to the Likert scale items, provided qualitative and explanatory feedback on the workshops that was analyzed in conjunction with the Focus Group interviews.

Positive Aspects
Analysis of participants' open-ended responses to focus group and survey questions is an important aspect of the workshop evaluation because it provides insight into how and why the professional learning had certain sorts of impact. The open-ended responses were analyzed according to themes in Lai and Bower's evaluation of educational technology framework [25]. Findings from the three areas of Design, Technology, and Pedagogy have been included below, for illustrative purposes.

Workshop Design and Implementation (Design)
In order to evaluate Lai and Bower's Design aspect of the workshop [25], we aimed to gauge teachers' opinions of the professional development workshop. With respect to the workshop structure, participants found it useful that the workshop was delivered in webinar format rather than face-to-face. A total of 16 participants declared that the best aspect of the workshop was the hands-on experience with Virtuoso. It was noted that the workshop had great conceptual value for the pre-service teachers. A total of six participants found the workshop very useful in terms of how the pedagogy was structured and segmented into groups and the way it was presented. Participants found the communication with the workshop organizers prior to the workshop very clear and well organized. A total of 23 references were found in the participants' comments to the link between theory and practice, and many concluded that the best aspect of the workshop was the fact that theory and practice were merged and implemented together as a whole.

Technology
In order to evaluate Lai and Bower's Technology aspect of the professional development [25], we explored teachers' comments on the functionality, accessibility, and perceived usefulness of Virtuoso. An entire section of the post-workshop survey was dedicated to analysis of Virtuoso with respect to ease of use, user friendliness, and affective elements. Thematic analysis of the text data revealed a total of 14 references to the usefulness of Virtuoso, declaring it a valuable tool. In two cases, participants particularly enjoyed the collaborative design space where teachers could work together to create lessons. They mentioned that Virtuoso is "like a one-stop-shop tool" with "really good user interface". Being able to upload YouTube links, its pedagogy integration capabilities, its text mining features, and its potential to provide big data for analysis were among the main advantages of Virtuoso that were mentioned.
We also aimed to gauge teachers' attitudes towards text mining technologies in educational settings and analyze the usefulness of these technological toolkits for teachers. We used open-ended questions and allowed participants to discuss their opinions on the application of text mining approaches in the post-workshop instrument and the focus group sessions. Overall, participants found text mining a useful and valuable approach for the automatic extraction of information from students' online submissions. Examining coded references to positive sentiments using the NVivo 12 sentiment analysis engine revealed that teachers "liked the text mining" and found "... that assessment data is a big selling point for teachers", and that the workshop had influenced teachers' beliefs, for example that "integration of data and single view of a student are crucial for teachers to improve (and personalize) learning".

Pedagogical Framework
In line with Lai and Bower's Teaching/Pedagogy construct [25], it was decided to examine participants' comments on the pedagogical practice explored during the workshop. Participants believed that the presentation of the pedagogical framework was easy to follow, comprehensive in terms of covering different aspects of pedagogy, and easy to use as the concepts were mostly known. The fact that the pedagogical framework promotes higher order thinking skills was appreciated by participants. Overall, they found it a good refresher on pedagogical practices, and, specifically, the fact that each learning type was followed by a demonstration of how it happens in practice was welcomed.

Participants' Suggestions
Analysis of the suggestions made by the participants is highly emphasized in both the Clarke and Stufflebeam evaluation approaches. This is because the design and implementation of repeated workshops can benefit significantly from a range of opinions on how to improve the program. The following reviews the results of the analysis of the participants' suggestions for improvement, again in accordance with the dimensions of Lai and Bower's framework [25].

Workshop Design and Implementation
Compilation of the feedback received from the participants revealed that a large number of participants believed that (a) Virtuoso needed to be made available to them prior to the workshop, and (b) there should be more emphasis on practice rather than a focus on pedagogy itself. A total of 8 references were found where participants requested more breaks during the workshop. Having prior access to workshop handouts, the ability to see other participants, including other subjects in the demo, less information on the slides and more action, and tailoring Virtuoso at early stages of the workshop were among the suggestions made to improve the workshop. Lastly, there were two expressions of interest in including a demo of the student view of the tool so that teachers can evaluate how hard or easy the tool would be for younger children.

Technology
The range of suggestions made for improvements to Virtuoso was found to be diverse, reflecting the different technological preferences of different individuals. Several features were requested, including a student activity monitoring system, a plagiarism detection tool, an overall snapshot of students' progress, easy and simple alerts on students who need attention, system capabilities to meet the requirements of different subjects (Maths, algebra, etc.) and languages (for teaching different languages), easy-to-use reporting features for teachers (reports to parents, principals, students, etc.), and a strong feedback system for students' work (including MCQs). A second category of suggestions related to help and support for end users. A total of 19 references were found where participants expressed a need to have quick access to help. Being able to access online support (7 references), information sessions for schools, online tutorials, manuals and guidelines on how to use the system, initial training workshops for schools, lesson design templates, and community pages were all seen as helpful utilities that should be considered for future implementations of Virtuoso.

Pedagogical Framework
The professional development workshop was designed to target a wide audience with different levels of teaching experience (students, teachers, pre-service teachers). As a result, the range of collected suggestions for improvements was wide. For instance, one of the experienced teachers expressed that the pedagogical component must be more explicit, while another participant asked for more links to other existing pedagogies. One participant suggested including links from the pedagogy to the relevant sections of the curriculum used, as well as guidance on how to report using state-based requirements. A pre-service teacher suggested that the content of the pedagogy should be made available to the participants prior to their webinar attendance.

Relating Evaluation to Theory
Performing rigorous evaluation requires in-depth consideration of what aspects of the professional development are examined by different professional development evaluation models, and the way in which they are examined. This enables the analysis of the workshop from different viewpoints, since each professional development model tends to broach evaluation from a different perspective. Table 2 reviews the degree of overlap of our workshop evaluation model with different stages of the six well-known professional development evaluation models summarized in the background section. As can be seen in Table 2, the evaluation process used for the workshop has a high degree of overlap with Kirkpatrick's evaluation constructs, including workshop evaluation, evaluation of participants' gains in knowledge and skills, and changes in participants' beliefs. According to Table 2, the evaluation constructs of the current study have the highest degree of overlap with Stufflebeam's evaluation model [30]. This is because that evaluation framework places a high emphasis on participants' characteristics in order to adjust the program in favor of the participants' requirements. In the implementation of our workshops, we considered the level of competence, skills, and working status of the teachers (pre-service, in-service), and, using the data collected prior to the workshop, appropriate program adjustments were made. The workshop evaluation method used in the current study has the least overlap with the Desimone and Guskey evaluation models. This is because measuring teacher change in light of these two evaluation models requires quantifying changes in teachers' behavior at the workplace (school, university), which was beyond the scope of this study. The practical reason for this was that the software was still in beta phase at the time of conducting the evaluation, so it was not yet ready to be released to schools.
Holton mainly focuses on secondary influences on the outcome of a professional learning program. In this study, we did not collect any data on secondary influences, such as personality traits or motivation to learn, because including a large number of questions in the instruments could strike certain participants as overwhelming. However, one secondary influence was measured to gauge teachers' ability in the collaborative design of a module on Virtuoso. Regarding Clarke's evaluation model, the only evaluation element that can be identified in our evaluation process is the evaluation of the workshop as a whole, which was carried out using the post-workshop instrument. This is because Clarke's evaluation model places a great emphasis on the analysis of the longitudinal impacts of teacher professional learning programs, which was beyond the scope of our study (as previously explained).

Relating to Dimensions of Educational Technology Evaluation
As mentioned in the literature review section, Lai and Bower [25] found eight evaluation dimensions that represent different technological aspects that educational research/programs tend to evaluate. These dimensions include affective elements, behavior, design, institutional environments, learning, presence, teaching or pedagogy, and technology. In this work, we based the design of the evaluation framework for the assessment of teachers' technology use on these eight dimensions [25]. Table 3 reviews the elements of each of these eight dimensions that were present in our evaluation, as well as when they were used.
Our pre-/post-workshop surveys mainly focused on self-perceptions to evaluate any improvement in participants' skills, gain in knowledge, or change in values, beliefs, or perceptions regarding technology use for teaching. Participants were asked to give overall feedback on the workshop and its content. During the discussion sessions of the webinar, we asked participants how likely it is that their schools might consider using Virtuoso and what sort of support they would like to receive in order to prepare to use Virtuoso. In addition, participants were asked about the quality of institutional support with respect to their school policies and institutional interventions. With respect to evaluation of learning, we asked participants about their prior knowledge of using different pedagogies in their practice. For technology evaluation, we asked the participants how easy it was to use Virtuoso and how useful the functionality of the platform would be in practice. Overall, our workshop evaluation plan covers all eight dimensions and 17 of the 32 sub-elements (see Table 3).

Discussion and Lessons Learnt
The results of the workshop evaluation showed that the online workshop had a significant impact on participants and was generally well received. We do not see this as an atypical result; with careful planning, we believe that a large number of professional learning initiatives can be successfully transitioned from face-to-face to online mode in response to the COVID-19 pandemic. What we believe is more important for the field is to deeply reflect on how we evaluate online professional learning and teaching, so that we more fully capture and understand the impact of our approaches.
As reviewed in the Results section, it was decided to frame the evaluation approach used in this study around other professional development evaluation models. The evaluation method used in the current study has the highest overlap with Stufflebeam's CIPP evaluation model. This means the main emphasis of our evaluation framework is on measuring workshop outcomes in the light of teacher change and overall effectiveness of the workshop. Attending to teacher needs, expectations, and required interventions, together with a more detailed understanding of attending teachers' characteristics, can help to shape workshop planning and design. For a better understanding of teachers' attitudes towards Virtuoso, workshop content, and presentations, it was realized that collecting user-system interaction data (within Virtuoso), as well as data on teachers' engagement with the presentation environment (CISCO's Training Center), is important. This aligns well with the CIPP implementation evaluation phase, where attitudes, participation, and user satisfaction can be analyzed in a more detailed fashion. Evaluation of Holton's motivational elements, performance influences, transfer design, transfer capacity, and opportunity to use are among a few things that can be taken on board to complement the design of the focus group session and the post-workshop survey. Based on our analysis, the post-workshop instrument also aligns well with the post-workshop phase evaluation areas of the other five evaluation models (see Table 2), and this will be explored in future work.
The evaluation framework used in this study covers all eight dimensions of educational technology evaluation [25] that are addressed throughout the learning technology literature. However, once again, pragmatic constraints reduced the number of elements that were included for evaluation, as overloading participants with too many question items is time-consuming and could potentially result in less considered answers to the questions that are asked. With respect to evaluation of technological constructs, evaluation of participants' affective elements (see Table 3) can be useful. We did not evaluate the capability of the system to enable participants' interaction, collaboration, or cooperation, and this is something that we plan to do in subsequent webinars. Institutional environment factors, such as school policies and interventions, should also be considered for evaluation because the learning management system should be capable of handling different requirements imposed by school policies, as well as different school interventions. Assessment of the cognitive or mental load of different modules of Virtuoso can also be considered for a more detailed evaluation of the product. Such measures can be quantified using established instruments. As different teachers have different learning preferences and strategies, established instruments can be used prior to the workshop to adjust the workshop design and content development plan based on the participants' learning characteristics. This should increase the knowledge gain and skills improvement that result from attending the workshop.
As a teacher professional development workshop, the core goals were to increase participants' knowledge, improve their lesson design skills, and, ideally, positively impact upon their beliefs about the use of technology in teaching. The objective measures collected as part of the pre- and post-workshop surveys enabled us to determine that the workshop was largely successful in achieving those goals. It was realized that a more rigorous analysis involving the collection of longitudinal data is required to more reliably measure the impact upon teachers. That is, in order to examine the practical impact of the changes in teachers' knowledge, skills, and beliefs, there is a need to identify changes in teachers' professional practice. This is in line with the Guskey and Desimone evaluation models, where changes in teachers' behavior are considered a product of the impact of the workshop. Desimone treats changes in students' learning outcomes as the ultimate measure for evaluating the impact of the workshop on teachers. By placing teachers at the center, Guskey measures changes in teachers' beliefs, attitudes, and perceptions as a product of changes in students' learning outcomes, which in turn are driven by changes in teachers' teaching behavior. Both of these perspectives require examination of changes in teachers' behavior and students' benefits. Once again, the collection of this longitudinal and field-based data was not included because the system had not yet been deployed in school settings, which highlights some of the practical industry considerations that affect the evaluation of educational technology workshops.
If participant time were not an issue, well-established instruments for each construct being investigated (motivation, cognitive load, etc.), or at least a subset of their questions, would be used to design the focus group sessions, pre-workshop survey, post-workshop sessions, and discussion sessions of teacher professional development. However, it is not possible to use the full version of such established instruments in workshop evaluation unless there is a very specific area of interest; otherwise, it overloads participants with too many items. As an alternative, the comprehensive and broadly applicable framework presented here may, thus, be of use to other researchers.

Conclusion, Future Work, and Recommendations
Rapidly transitioning from face-to-face to online professional learning is a necessary response to COVID-19, and, in all circumstances, effective evaluation of professional learning is critical to understand impact and inform future practice. Drawing from different established evaluation frameworks makes it possible to look at the evaluation of technology professional development for educators through different lenses. This provides a means to design the evaluation process in accordance with approaches that will best suit the context and purpose in question, and cater to both the participants' and organizers' needs. In this paper, the application of an integrated evaluation framework for the assessment of professional development workshops was showcased. Appropriate collection of data from the participants prior to, during, and after the workshop enabled systematic qualitative and quantitative analysis of the teacher technological professional development. The use of Lai and Bower's technological dimensions [25] and the evaluation constructs present in each dimension made in-depth analysis of different aspects of technology related to the workshops possible. Using the Lai and Bower evaluation dimensions [25] as part of the theoretical framework for designing research objectives and, consequently, the post-workshop survey and focus group questions made it possible to more comprehensively identify the strengths and weaknesses of the workshop. The consistent design and analysis of the pre- and post-workshop surveys based on common perception items made it possible to gauge the statistically significant impact of the workshop on participants' skills, beliefs, and knowledge. In addition, using qualitative analysis and evaluation elements of the Lai and Bower constructs [25] enabled reasons for Likert scale perceptions to be explained, for instance, how the integrated text mining tool in Virtuoso resulted in high technology perceptions.
The qualitative component of the professional development evaluation model also resulted in the collection of suggestions for workshop and technology improvements, which were of high commercial value to the industry partner.
Thematic analysis of the qualitative feedback from teachers revealed that Virtuoso was identified as a useful teaching tool, with expressions of interest in using Virtuoso at schools. Overall, the workshop was found to be useful for the participants according to their comments and feedback. In addition, the statistical analysis of the workshop impact on participants revealed that the workshop resulted in significant changes in teacher beliefs, knowledge, and skills. Thus, the results of the data analysis for the first round of teacher professional development workshops show that the workshop outcomes were generally met. The key issues identified by the participants helped us suggest resolutions in order to make improvements to the upcoming workshops, which was valuable to both the industry partner and the research team.
Based on the experience gained from analyzing the evaluation framework used in the current study, the following points are suggested for consideration in the evaluation of technological teacher workshops. Theoretical frameworks for evaluating professional learning and teacher change should be a starting point for any research examining the impact of teacher professional learning, so that complete and robust evaluation approaches can be developed. Workshop registrations should integrate pre-surveys that help to understand teacher needs and dispositions, as well as knowledge and skills. This enables the workshop to be better customized to participant needs, as well as providing a baseline for objective measures of teacher change. Greater breadth of items should be included in professional development evaluations, to promote more holistic understanding and analysis of workshop impact (for instance, using the Lai and Bower framework [25]). Quantitative evaluation (e.g., Likert scale ratings) should be complemented by qualitative evaluation, so that objective measures of teacher change can be accompanied by explanations of how and why the changes occurred. Where possible, longitudinal impact should be gauged; however, in practice, this is often constrained by the professional learning and research context. Pragmatic and contextual considerations need to moderate any professional learning evaluation attempt; applying overly long or intrusive approaches to professional learning evaluation may adversely impact on participant experience and motivation.
There were limitations to the case study presented. Due to privacy concerns and ethical considerations, it was not possible to analyze the data collected during the workshop. Evaluation of real-time data can be particularly useful to monitor the engagement level of the participants with the workshop content and investigate participants' immediate impressions of the showcased online learning management system, Virtuoso. In line with Guskey's evaluation model, the evaluation of the longitudinal impact of the workshop on teachers would have been of great value. For future iterations, we are planning to analyze the longitudinal impacts of the workshop on the teachers in terms of changes in their teaching style and teaching strategies, to evaluate the impact of such changes on students' learning outcomes, and to trace changes in teachers' beliefs and values as a product of changes in practice, which in turn are a product of attending the workshop. This is in line with the last stage of professional development program evaluation models, the analysis of the impact of the workshop on teachers' practice. Importantly, re-application of the instruments and protocols adopted in the first iteration of the workshops will enable us to determine whether changes to future versions of the workshop result in statistically or qualitatively discernible improvements to participant outcomes as compared to previous iterations. This sort of objective comparison between workshop approaches is rare within the professional learning evaluation field, but it is critical to advance the science of teacher education.

Virtuoso feedback
Please provide your response to the following questions (7-point Likert scale items SD-SA).
1. Virtuoso was easy to use.
2. Virtuoso would enable me to improve the quality of my teaching.
3. I found sentiment analysis (text mining toolkit) useful.
4. I found topic modeling (text mining toolkit) useful.
Please answer the following questions:
5. Would you be interested in using Virtuoso within your school?
6. Would you be interested in attending a separate follow-on workshop relating to data analytics, educational evaluation, machine learning and artificial intelligence in education?
7. Would you be willing to be contacted to clarify or elaborate about any of your responses?
8. What concerns do you have about teaching using Virtuoso?
9. What support do you feel you need in order for your online lessons that use Virtuoso to be as successful as possible?