Designing a Collaborative Virtual Conference Application: Challenges, Requirements and Guidelines

Due to the recent COVID-19 pandemic that has swept the globe, more people are working from home. People use synchronous applications to communicate remotely because they are not able to meet face-to-face. However, few research studies on the issues surrounding the virtual conference application, particularly those that include collaborative activities, have been conducted. The usability study recruited 16 participants (in four groups of four) to communicate synchronously while performing collaborative activities, such as drawing together on a shared screen. According to the findings of the usability study, users do not often use the collaborative tools provided by the current virtual conference application. This is due to low exposure and unfamiliarity with the use of collaborative tools. The findings also show that users frequently do not turn on the web camera due to several reasons, including privacy, connectivity issues, the environment, and background distraction. Turning on the web camera can also cause anxiety due to shyness in front of the camera. However, some participants prefer to turn on the web camera so that they can see each other's reactions when performing collaborative activities. The article provides several guidelines to assist in the design of virtual conference applications, including a simple, familiar, and intuitive interface to encourage the use of collaborative tools, and also introduces the use of virtual avatars as a way to represent oneself during online meetings to allow affective sharing while respecting the privacy of its users.


Introduction
This research is primarily focused on a virtual conference application that has been overlooked for the past few years. With the recent COVID-19 pandemic still taking its course around the world, more people are resorting to online communication applications to adapt to the new normal [1]. This includes performing different online tasks such as meeting, learning, discussing, and problem solving through a video conference. However, the current pandemic has shown that 96 percent of users are frustrated when collaborating on any online communication format [2]. A previous study found, in one of its user studies, that users prefer interactive sharing spaces to video alone [3]. Current virtual conference applications still use text, audio, and video to perform complex collaborative activities [4][5][6][7]. This paper focuses on the virtual conference application by identifying the issues and limitations encountered while performing collaborative activities. The paper is organized as follows: Section 2 presents several theories and concepts on Human-Computer Interaction (HCI), collaboration, video conference, communications and the increase in the use of virtual conference applications during the current pandemic, and user experience. Section 3 describes the usability study that was conducted to evaluate user experience with the current virtual conference application. The results of the usability study are presented in Section 4. Section 5 discusses the findings of the usability study. Section 6 contains design recommendations, proposed system architecture, and an introduction to the user interface design for the virtual conference application. Finally, the paper concludes with a discussion on its limitations and recommendations for future work.

Related Works
This paper focuses on the current user experience of virtual conference applications when performing collaborative activities. Due to the current pandemic outbreak, there appears to be very limited research in this area. However, the study is still able to learn from previous research on remote collaboration. To understand the user experience of the current virtual conference application, the paper outlines human-computer interaction and collaboration. The paper also discusses how the COVID-19 pandemic affects online collaboration, as well as issues surrounding the communication of emotions and non-verbal interaction. Then, we conduct a review on literature related to user experience and remote collaboration. Finally, the study should be able to assist in determining the indicators based on the influencing factors of user experience.

HCI and Collaboration
Human-Computer Interaction (HCI) research has focused on computer systems that have a group impact, mainly on Computer-Supported Cooperative Work (CSCW). CSCW is based on how groups work together and how technology affects group behavior [8]. Groupware is a set of software tools that support a group of people working together. Different theories have been created to combine the cooperative work of people through hardware, software, and networking. One of these is the CSCW matrix theory, which uses the time space matrix to categorize groupware [9]. The first dimension is concerned with whether the interaction is performed at the same time (synchronous) or at different times (asynchronous). The second dimension is concerned with whether the individuals are located at the same location (collocated) or at different locations (remote). Combining these two dimensions results in four categories of groupware for CSCW: collocated and synchronous, remote and synchronous, collocated and asynchronous, and finally remote and asynchronous. Figure 1 shows the different example applications being categorized in the time space matrix.
Figure 1. Example applications categorized in the time space matrix: ... [10]; (b) Example of different location and time using email application [11]; (c) Tabletop System for same location and time matrix [12]; (d) Kiosk System for same location different time matrix [13].
Referring to Figure 1, each of the quadrants has its own set of advantages and disadvantages. For example, quadrant 1 is doing collaborative tasks at the same time and same location. This benefits tasks that are performed directly face-to-face. There is no input and output delay between users, as they can collaborate quickly without the use of the internet. However, performing collaborative work at the same place and time requires a larger space for users to move around and a larger system for users to collaborate effectively. Collaborative Multi-Mobile System (CMMS) is an example of CSCW with the same time and location, where users can place multiple mobile devices on the table in a collocated manner to create a large surface for collaborative activity [14,15]. Quadrant 2 is carrying out the tasks at different times but at the same location, such as the kiosk. The benefit is that information can be shared at any time without the need for the parties to meet. However, users must return to the same place to retrieve the information required. Quadrant 3 has the advantage of performing collaboration at different locations at the same time without meeting face-to-face. However, the downside is that it requires a higher bandwidth to stream videos over the distance. Depending on the internet quality, the streaming can sometimes be delayed by a few seconds. In collaborative work, users want to make quick changes, but the delay in a real-time application can cause frustration. Finally, quadrant 4 collaborates at different locations and times. Although there are no delays with this method, users are rarely able to perform any real-time activities for quick actions and changes.
This paper focuses on synchronous collaborative interaction groupware in quadrant 3 of the time space matrix. Examples of groupware that support synchronous collaborative interaction are virtual conference applications, remote desktop applications, instant messaging, and multiuser text editing [16]. The research has mainly focused on the current video conference or telecommunication applications that users rely on to communicate at a distance and at the same time using text, audio, and video.
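The two dimensions of the time space matrix can be sketched as a small lookup. The following is only an illustrative sketch: the names `Quadrant` and `classify` are hypothetical and do not come from the cited literature, and the quadrant numbering follows the order used in the discussion of Figure 1 above.

```python
from enum import Enum


class Quadrant(Enum):
    """Quadrants of the time space matrix, numbered as in the text above."""
    SAME_TIME_SAME_PLACE = 1            # e.g., tabletop system
    DIFFERENT_TIME_SAME_PLACE = 2       # e.g., kiosk system
    SAME_TIME_DIFFERENT_PLACE = 3       # e.g., virtual conference application
    DIFFERENT_TIME_DIFFERENT_PLACE = 4  # e.g., email application


def classify(synchronous: bool, collocated: bool) -> Quadrant:
    """Map the two matrix dimensions (time, place) to a quadrant."""
    if collocated:
        if synchronous:
            return Quadrant.SAME_TIME_SAME_PLACE
        return Quadrant.DIFFERENT_TIME_SAME_PLACE
    if synchronous:
        return Quadrant.SAME_TIME_DIFFERENT_PLACE
    return Quadrant.DIFFERENT_TIME_DIFFERENT_PLACE


# A virtual conference is synchronous and remote, so it falls in quadrant 3.
virtual_conference = classify(synchronous=True, collocated=False)
```

Under this sketch, the groupware studied in this paper corresponds to `classify(synchronous=True, collocated=False)`.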

COVID-19 Pandemic and Online Collaboration
The COVID-19 pandemic has generally forced people to use online applications such as virtual conference application or social media platforms to help connect people at a distance. People have relied on these platforms to perform a variety of activities such as learning, meeting, and collaborating with their peers. Schools have forced teachers and students to shift from the usual face-to-face settings to online settings for teaching and learning. Meetings between workers and co-workers in companies also take place in a virtual setting to share, present, and discuss information at hand. Some people use virtual conference applications to perform a collaborative task and to discuss ideas and solutions to a given problem.
Several previous studies on online communication have been conducted in order to find ways to support remote collaboration during the pandemic. One previous study has used remote collaboration to design homes without the need for the designer to be present [17]. Collaborative activity can also be used for group learning, which has been shown to be more effective than individual learning [18]. According to a previous study, students now learn better by collaborating with peers online, which is preferred by students [19]. For remote collaborative learning, previous research has found that the provision of opportunities to collaborate, discuss, and reflect on their professional development is beneficial [20]. Previous research identified opportunities for new and augmented approaches for remote collaborative research through Distributed Participatory Design by distributing geographies, backgrounds, ages, and abilities to overcome these new barriers during and after COVID-19 [21]. Previous studies have also looked at security issues such as user privacy, which is more vulnerable to abuse, especially during remote collaboration [22]. Currently, there is a lack of studies on the issues and limitations of performing remote collaboration using the current virtual conference application, particularly during a pandemic period. Section 2.3 compares the various virtual conference applications available for real-time collaboration and their features.

Virtual Conference Application
There are several virtual conference applications or video conferences available on the market that can be used for remote collaboration. The most popular virtual conference applications have a 2D interface with a mouse, keyboard, and web camera for remote interaction/collaboration. Table 1 compares several virtual conference applications based on the features provided. Collaborative tools such as whiteboard, hand raise, room breakout, and annotation are available in all virtual conference applications. However, a previous study identified multiple weaknesses in current virtual conference applications, such as security, privacy, media quality, reliability, capacity, and technical difficulties [23]. The virtual conference application also has a limited view as it displays on a flat 2D screen at a reduced scale [4]. The virtual conference application also has a fixed web camera placed at the edge of the display [24]. As a result, the current setup has a fixed viewing angle that requires the user to be in front of the screen. There are also issues with security, quality, and latency when communicating at a distance [23,25,26].

Communicating Emotions and Non-Verbal Interaction While Online
The current virtual conference application does not allow for the sharing of emotional awareness while communicating between local and remote participants. Previous research has shown that limited bonding capability becomes an issue, especially during remote collaboration, and the researchers of this paper would like to study the impact of collaborative teams on affective states [34]. This means that remote collaborators are not aware of other people's emotions, such as happy, sad, neutral, frustrated, excited, and more. According to Meluso et al. [26], the greatest challenge for the current virtual conference application is conveying emotion. Emotions may be able to provide valuable information about the relationship between collaborators, allowing them to understand each other's emotions and feelings. Chikersal et al. [35] have suggested that, in the future, the virtual conference application will be able to synchronize members' facial expressions, as sharing facial expressions may enhance team performance. During this difficult time, many people have been using virtual conference applications for meetings, learning, and working. However, collaborators often turn off the video feed due to privacy, unstable connections, and limited data. Therefore, this research would like to understand more about the underlying issues that arise when the webcam is not available, particularly during a collaborative activity.
There is also a lack of non-verbal interaction options in the current virtual conference application. The keyboard and mouse are the main modes of interaction in the current video conference. The current virtual conference applications lack non-verbal interaction options such as hand gestures, eye gaze, and body movement [36]. A comparison revealed that non-verbal remote collaboration is better than the conventional virtual conference applications that rely solely on the mouse [37]. Participants have shown a general interest in non-verbal communication compared to only verbal communication [38]. Previous research has investigated ways to include non-verbal communication into areas such as communication behaviors, remote collaborations, and sharing cues. Faucett et al. [39] have developed a system that can sense and provide real-time feedback about non-verbal communication to improve communication behaviors. Non-verbal gestures can be useful in social interactions, especially in remote collaboration [7]. Therefore, this research would like to understand more on the underlying issues during non-verbal communication during collaborative activities.

User Experience (UX)
User Experience (UX) has been broadly defined as a method of understanding the feeling of users while using a system or application [40]. This includes all of the feelings, thoughts, actions, and sensations of engaging in some activity while using the system or application [41]. According to a systematic review conducted, there are four core factors that affect user experience, namely the user, the system, the context, and the temporal aspects [42]. The first major aspect of user experience is the user themselves, as they can come from different backgrounds in terms of age, gender, culture, etc., which can affect how the system is used. The interactive system can be designed to meet various goals and user needs; however, new technology has been introduced over the years, resulting in a different user experience than the previous system with old technology. Context is another important factor that influences user experience, as different conditions, physical environment, or even culture, can affect how users interact with the system or application. Finally, the temporal factor is a critical factor in user experience, as there would be no experience without time. For the time being, the study focuses solely on the user, system, and context aspect of the user experience. According to Virpi Roto [43], the user experience is subjective because the user's state influences system perception, which affects the experience and the user's state.
Previous studies have performed usability evaluation as a compulsory process to help determine the user experience of an application or system. The data can be obtained through questionnaires and interviews as a tool in a usability evaluation. In most commercial platforms, the evaluation of user experience is widely used. For example, Mishra et al. [44] used an online questionnaire to assess the ease of use and responsiveness of using augmented reality technology for online shopping. Hussain et al. [45] used a questionnaire to determine the user experience of Amazon Kindle application in terms of perceived ease of use, perceived visibility, perceived enjoyability, and perceived efficiency. Another variable that can determine the user experience is how the applications or systems are used. Chen et al. [46] calculated usage by ranking selected platforms for education and business based on the number of keyword user comments. Pal et al. [47] used a system usability scale questionnaire (SUS) to determine the user experience of smart voice assistants. Furthermore, interviews can be used to determine the user experience of products. Van Der Linden et al. [48] inquired about the usage frequency of interactive devices and 3D applications in virtually immersive environments and tablet usage for learning, respectively. For example, Følstad and Skjuve [49] used interviews to determine the user experience of using chatbots to provide better services to customers. Kelleci and Aksoy [50] used focus group interviews and observation to study the user experience of virtual classroom simulations used for the training of teachers.
Several studies on user experience for virtual conference applications for remote collaboration have been conducted. Thomaschewski et al. [51] designed an augmented reality awareness interface to support remote teams by using the usability and user experience score. Huidong Bai et al. [38] evaluated the user experience of mixed reality remote collaboration using gaze and hand gesture interaction. To validate the user experience, the study used the variables of mental workload, system usability, and preference. Another remote collaboration study looked into whether wearables could help improve user experience in remote collaboration by providing spatial annotation and by view-sharing data to a mixed reality headset [52]. It also measured the mental workload of the user to evaluate user experience. The subjective collaborative quality, satisfaction, and performance are used for understanding user experience in the remote collaboration study by projecting a gesture on a remote user from a local user using a mixed reality headset [53]. Most of the remote collaboration studies evaluate user experience either using virtual reality, augmented reality, or mixed reality 3D technology, but a few studies involve 2D-based remote collaboration. Anton et al. [54] compared the user experiences of 2D-based video conferences, 3D video interfaces, and 3D with annotations. The results showed that 3D with annotation has a higher rating compared to just 2D-based interface. Currently, there is a lack of study on remote collaboration using virtual conference applications, especially during the COVID-19 pandemic, which is what motivated this study. Moreover, in order to understand the user experience, this research would like to focus more on the users' comments and feedback to evaluate the current virtual conference application for collaborative tasks. 
Finally, based on the previous study, several measurements have been identified to aid in the construction of a user study, particularly on remote collaboration on the current virtual conference applications. A guideline is also proposed based on the comments and feedback provided by the users.

Methodology
The aim of this study is to identify and understand the issues with the current virtual conference application with respect to usability and collaborative experience while performing collaborative activities. The study adapted the usability study to identify issues related to the usability and layout of the current virtual conference applications based on users' feedback. It also implemented a user-based evaluation to evaluate user experience on current virtual conference applications.
The study focuses on identifying the limitations and issues encountered while performing collaborative activities on the current virtual conference application. The key research questions are: (1) What are the main issues and limitations of using current virtual conference applications while performing collaborative activity? (2) How does it affect the usability and user experience while using the current virtual conference application? (3) What are some recommendations or guidelines for improving the usability of the current virtual conference application? To answer these research questions, the exploratory experimental design with mixed method was used to identify the problems and limitations faced when using a virtual conference application for collaborative purposes.
The Zoom meeting application was used to test the software in this experiment. This was because, when compared to other commercial meeting applications, the participants in this study were more familiar with using the Zoom meeting application for day-to-day online synchronous activities, such as meeting, learning, and collaborating. The study designed its own data-collection tools to help validate the analysis to achieve the objectives because very few reviews have been conducted to explain the area of this study. The study collected data through a questionnaire and focus group interviews. The questionnaire required participants to answer 46 questions relating to usability and user experiences. The study collected self-reporting answers from the participants, which included attributes such as terms of usage frequency, ease of use, affective input, collaborative experience, and system usability scale (SUS). Self-reporting was used in lieu of log tools, as the questionnaire answers provided a better insight on the overall perception and preferences of the participants when using virtual conference applications outside of the experiment. Additionally, five questions were asked during the focus group interview. The questions asked participants to elaborate on their overall experience using the virtual conference application, to state why they may or may not switch on their web cameras during virtual conferences, whether they noticed any affective feedback from their fellow meeting buddies, and to provide additional comments and suggestions to improve the current virtual conference application. The data for the questionnaire was collected and analyzed using Google forms. Prior to the analysis, the focus group discussion was transcribed using a speech-to-text transcription software (YouTube transcription).
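For readers unfamiliar with the System Usability Scale, the following is a minimal sketch of the standard SUS scoring rule (ten items rated 1 to 5, with odd-numbered items positively worded and even-numbered items negatively worded). It assumes the standard ten-item questionnaire; the paper does not state exactly how its SUS responses were scored, and the function name `sus_score` is hypothetical.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    `responses` is a sequence of ten integers in the range 1-5,
    ordered as items 1 to 10 of the standard SUS questionnaire.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item responses")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if item_number % 2 == 1:
            # Odd items are positively worded: contribution is rating - 1.
            total += rating - 1
        else:
            # Even items are negatively worded: contribution is 5 - rating.
            total += 5 - rating
    # The summed contributions (0-40) are scaled to 0-100.
    return total * 2.5
```

A participant who answers 3 (neutral) on every item would receive a score of 50 under this rule.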

Participants
The experiment recruited 16 undergraduate and postgraduate (13 female, three male) students, aged between 16 and 40 years old from the Faculty of Computer Science and Information Technology, University Putra Malaysia, due to their familiarity with computers and technological background. A facilitator was also present during the experiment to help with instrument setup, distribution of materials, and to give instructions to the participants. The 16 participants were divided into five groups of three or four members, with only one group being studied at any one time.

Apparatus and Materials
Participants in the usability study were required to prepare their own computer to perform collaborative activities. Their computer should be equipped with a web camera, microphone, keyboard, and mouse, and the laptop should be set up on a desk. Each participant required a stable internet connection in order to complete the remote collaboration task that had been set. The participants were also required to install a virtual conference application (Zoom) to perform most of the collaborative activities.

Tasks
The tasks designed in this study were intended to understand users' interaction and navigation when they are performing remote collaboration activities. Before the start of the experiment, participants were briefed on the outline of the study and consent was obtained. Although each participant joined remotely from different locations, the facilitator served as a host and set up the camera, laptop, WiFi, and virtual conference application from his location to record the interaction, as well as welcomed the participants to the virtual experiment space. Figure 2 shows the experimental setup for the usability study. The participants were required to use the virtual conference application (Zoom). The session was recorded. Each group had a total of three participants. One of the participants was made the host and was asked to create a new meeting. The other participants were then invited to join the meeting. The group was given an instruction that read "You are required to collaboratively draw Your Dream Holiday", and they were given 15 min to complete the task collaboratively using the tools readily available on the virtual conference application they were using. After the participants had completed the experiment, the recording was saved.
The group then moved to the next section of the study, in which they each had to fill out a questionnaire about their recent collaborative virtual conference experience. To better understand the issues and challenges associated with performing online collaborative tasks, a focus group discussion was then carried out.

Results
The following sections show the results of the questionnaire that was given to the participants after they completed the tasks in the study. The data collected include the usage frequency of the several features available on the virtual conference application (e.g., annotation, screen sharing, etc.), the perceived ease of use, affective input, collaborative experience, and the usability of the application as measured by the System Usability Scale (SUS).

Usage Frequency of Features on the Virtual Conferencing Application
The frequency of feature usage describes how frequently participants utilized the features available on the virtual conference application during the study. The features included were items such as the web camera, the microphone, the chat function, and so on. Based on the results of the questionnaire, the most frequently used tools in the study when performing collaborative activities were primarily the share screen feature and the microphone feature (18%). The less frequently used tools were video recording, the reaction to indicate emotions, the raising hand feature (i.e., to signal permission to speak up), the breakout room feature, the file sharing feature, the whiteboard feature, the chat feature, and the web camera feature (i.e., enabling the camera to be switched on). This suggests that the participants only used the features marginally. It is possible that the participants were not familiar with the virtual conference application and had not fully explored all of its features. However, it strongly indicates that, in general, only a few features were necessary to carry out collaborative tasks during a virtual conference. Figure 3 depicts a diverging stacked bar chart of feature usage frequency for the current virtual conference application for collaborative purposes.
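As a rough illustration of how a diverging stacked bar can be laid out from questionnaire responses, the sketch below converts response counts for one feature into bar segments centered on the neutral category. It is a sketch under stated assumptions only: the five-category scale, the function name `diverging_segments`, and the example counts are hypothetical and are not the data collected in this study.

```python
def diverging_segments(counts):
    """Return (left_edge, width) pairs, in percent, for one diverging bar.

    `counts` lists the number of responses per category, ordered from the
    most negative to the most positive, with the neutral category in the
    middle. The neutral segment is centered on zero.
    """
    if len(counts) % 2 == 0:
        raise ValueError("expected an odd number of categories")
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses to plot")
    percentages = [100.0 * count / total for count in counts]
    middle = len(percentages) // 2
    # Start far enough left that half of the neutral segment sits below zero.
    left_edge = -(sum(percentages[:middle]) + percentages[middle] / 2)
    segments = []
    for width in percentages:
        segments.append((left_edge, width))
        left_edge += width
    return segments


# Hypothetical counts for a single feature from 16 respondents
# (never, rarely, sometimes, often, always); NOT the study's data.
example = diverging_segments([2, 2, 4, 4, 4])
```

Each `(left_edge, width)` pair could then be passed to a horizontal bar plotting routine, one call per response category.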

Ease of Use
According to the questionnaire, 73% of the participants stated that the virtual conference application was easy to use, while 27% stated otherwise. Further analysis reveals that the recording feature, the reaction feature, the raise hand feature, the breakout room feature, the screen sharing feature, the chat feature, and the microphone feature were rated as the easiest to use on the virtual conference application by participants. Participants also reported that the file sharing feature, along with the whiteboard and annotation features, were more difficult to use. This finding is interesting, as the annotation feature, the file sharing feature, and the whiteboard feature were all designed to facilitate the completion of collaborative tasks but were rated poorly by the participants. Figure 4 shows the diverging stacked bar chart of ease of use for each feature provided by the virtual conference application.

The participants in the usability study were also asked when they would be willing to turn on the web camera. It was discovered that participants neither enabled nor disabled the web camera continuously. The participants stated that they usually enabled the web camera "when being asked", which rated the highest. The second most important factor influencing whether they switched on the web cameras was having to "give a presentation", followed by "performing group activity". Figure 6 shows the results of different situations in which participants were willing to turn on their web cameras during a video conference meeting.
The usability study also asked participants how they felt when other group members' web cameras were turned off while performing collaborative activities. The majority of the participants (69%) were neutral and impartial, meaning it did not matter to them if the web camera was switched on or off by other group members. Around 20% of the participants associated the web cameras being turned off with negative feelings, with 13% describing a sense of loneliness and isolation and another 6% describing feelings of boredom and demotivation. On the contrary, the remaining 12% of the participants reported feeling more comfortable and relaxed (less anxious) when their peers turned off the camera. Figure 7 depicts the breakdown of the feelings experienced by the participants when the web cameras of their peers were turned off during a virtual meeting.

Collaborative Experience and System Usability Scale (SUS)
The SUS was used to investigate the collaborative experience of the participants when using the virtual conference application, as well as the usability of the application. Overall, the participants rated the collaborative experience as good (µ = 4.27). Figure 8 shows the distribution of collaborative experience gained through the use of a virtual conference application.
[Figure 8: Collaborative Experience. Response scale: strongly disagree, disagree, neutral, agree, strongly agree. Statements recoverable from the figure: 5) I enjoyed the experience. 6) I was able to focus on the task activity. 7) I am confident that we completed the task correctly. 8) My partner and I worked together well. 9) I was able to express myself clearly. 10) I was able to understand partner's message. 11) Information from partner was helpful.]

The SUS also reveals that the current virtual conference application received a mean total score of 63, which is interpreted as average usability [55]. Two items that the participants had rated poorly were brought to light through the SUS: (1) it was difficult to learn how to use the current virtual conference application quickly due to the many features, and (2) it was time-consuming. Figure 9 depicts a diverging stacked bar chart for the SUS of the virtual conference application when collaborative activities are completed.
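The text does not spell out how the total SUS score is computed. As a minimal sketch, assuming the standard ten-item SUS (odd-numbered items worded positively, even-numbered items worded negatively, each rated from 1 to 5), the per-participant score and the group mean could be computed as below. The function names and the sample ratings are illustrative only and are not data from this study.

```python
from statistics import mean


def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten ratings (1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten item responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be on the 1-5 scale")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 (positive wording): contribution is rating - 1.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 (negative wording): contribution is 5 - rating.
            total += 5 - rating
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5


def mean_sus(all_responses):
    """Average the individual SUS scores across participants."""
    return mean(sus_score(r) for r in all_responses)


# Illustrative ratings only (not taken from the usability study).
sample = [
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
print(mean_sus(sample))
```

Under this scheme, a participant who answers "neutral" to every item scores 50, so a group mean of 63 sits somewhat above the midpoint of the scale.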

Discussion
According to the findings of the study, users generally viewed current synchronous applications as better than asynchronous applications in terms of collaborative experience and system usability. However, some features in the application were found to be underutilized, such as the record feature, the reactions feature, the raise hand feature, the breakout room feature, the file sharing feature, the whiteboard feature, the chat feature, the annotation feature, and the camera feature. These features were essential for collaborative purposes, but participants did not use them frequently enough. From the study, users believed that the file sharing, whiteboard, and annotation features were not easy to use, which perhaps contributed to their underutilization.
The study also gathered feedback on web camera usage during virtual conferences. Most of the participants preferred to keep their web cameras off unless specifically asked to turn them on. According to the results of the survey, most of the participants were unconcerned about other members not turning on their cameras during an online engagement. This was further supported by the findings of the focus group discussion, in which participants discussed the issues they faced while performing collaborative activities on the current virtual conference application. Qualitative data extracted from the transcription of the focus group discussion is labeled using the G[Number]M/F[Number] arrangement, where G[Number] denotes the group number, e.g., Group 1 to 5, M/F denotes the sex of the participant, e.g., Male or Female, and [Number] denotes the participant's identification number, e.g., 1 to 16. The indented text is the feedback from the participants.

Underutilization of the Collaborative Tools
The majority of the participants in this study expressed that they had very little experience in drawing collaboratively on their virtual conference platform. Previous work supported this by stating that, despite knowing how to use a video conferencing application, participants had never performed any collaborative task via video conference [56]. Some of the participants in this study found it very interesting to be able to draw collaboratively, as they had not experienced this in their previous virtual conferences.
Participants also had difficulty activating the annotate feature for drawing, as it was designed to be hidden within a menu. As a result, the participants were unable to locate the functions they wanted to use, and there was no indication of whether a function had been activated. The following excerpt highlights a conversation during the usability study, in which a participant in Group 5 was trying to find the draw function:
G5M14: I want to change to go back to the drawing.
G5M15: Oh, uh you click on the draw right click on the draw and then it will go back to the drawing yeah.
According to Nielsen Norman Group, menus such as the hamburger or hidden menu can degrade the user experience [57]. In their studies, hidden menus reduced the discoverability of functions by about half, while increasing task time and perceived task difficulty. This effect was evident in our usability study, as the participants found it difficult to locate the required function, which increased task time and perceived task difficulty. Hence, there is a clear need to design a more intuitive interface for collaborative purposes in virtual conference applications. A previous study presented several ways to improve the usability of the virtual conference application through an immersive interface [58]. However, using virtual reality and augmented reality technologies to improve usability can be costly and difficult to implement. A simpler solution must be devised to address this issue.

Connectivity and Other Technical Issues
Another issue discovered during the study was related to connectivity and other technical issues, such as a poor-quality internet connection. An unstable or slow internet connection can disconnect users from the group. Moreover, disconnection can result in the loss of any collaborative work that the group has already completed. Previous work also demonstrated that participants encountered freezing problems due to an unstable internet connection [59]. During an interview session with the participants in the usability study, they mentioned that, since their internet was not stable, they turned off their cameras while interacting to reduce bandwidth consumption and maintain a more stable connection:
G3F7: Because when we do the usual class, if all students switched on their cameras, the internet will be slower, it would freeze or lag a lot.
G3F8: Sometimes, Zoom takes a lot of internet bandwidth if we switched on the camera and it will always cause disconnection.

G5M16: The biggest reason is that I'm not sure when will I be (accidentally) disconnected so I'm trying to reduce the numbers, uh, data that transfer in between. Yeah so that's the biggest reason I decide not to show myself to others.
Some participants also expressed their frustration with technical issues, such as action delay. For instance, it took a few seconds for their drawing to appear on the screens of others due to a poor internet connection. This could be because the virtual conference application was not completely in real time:
G3M9: Yes, because of the internet connection, we cannot all talk at the same time. So, there is going to be lag issues. Sometimes, I'd thought that they could not hear me because of the internet.
G3F7: Same . . . it is difficult because of the unstable network connection.
G3F8: And, sometimes it's not real time. They're talking about something else, and then suddenly we interact about something else. So, it's a bit difficult.
Some participants mentioned that virtual conference applications for collaborative activities consume a lot of data. The participants admitted that the more fun an activity is, the more data it consumes. This situation was not ideal because most participants had limited data plans from their internet service provider:
G1F1: For me it's fun to do this activity, but if we are using Zoom platform, it uses a lot of data yeah so, I don't really like.
G1F2: Because if we are using Zoom, it actually needs a lot of Internet (data) consumption.
Based on a previous survey, data usage via virtual conference applications can be very expensive [60]. The reason for this is the high bandwidth consumption required by video conferences to transmit video and audio data across the internet. A previous study using video conferencing also encountered technical and connectivity issues such as having no audio or video feed, software incompatibility, and sometimes being unable to connect to the internet [61]. Previous research showed that disabling video feed and using only audio in rural areas with high internet latency can greatly improve the user experience [62].

Limited Sharing of Emotions and Expressions
Most of the participants in the usability study preferred to turn off their web camera during online communication for several reasons, such as privacy, a less-than-ideal background environment, and a lack of confidence or low self-esteem. Some users stated that they preferred their web cameras to be switched off by default in adherence to the Islamic teachings that require Muslims to dress modestly in public:
G5M15: I just like it like that, maybe due to privacy because when we do something, like when we draw, our faces look too focused, and we don't want other people to look at your face at that time. I don't know, for me, I just like the privacy and since others have also not turned on their cameras, so I choose to not (turn on my camera) too.
G1F1: Because we are Muslims, so we need to wear a hijab for that.
Another reason is the surrounding environment, which may not be suitable for sharing with others. One of the participants took part in the study outdoors and therefore decided not to turn on the web camera.
G1F2: I'm outside, so my line is not so good. That's why I just closed the video.
Interestingly, a few participants from the study also mentioned that they felt more relaxed when their web cameras were turned off. They explained that being shy made them feel more anxious when the camera was turned on. However, some participants expressed that they preferred to turn on the web camera while doing the collaborative work in order to see how others reacted.
G3M9: I also love to switch on the camera because it's more fun because we can see their reactions and so on.
Thus, when users turn off their web camera during an online interaction, they are unable to share and communicate affective feedback with one another. Due to this isolation, the results from the usability study show that most of the participants feel neutral or lonely during synchronous collaborative communication. Therefore, a balanced solution is required that can express the affective feedback of all parties involved, while also respecting the wishes of those who value their privacy highly.
According to a previous study, nearly half of the students did not agree to keep their web cameras on during classes due to anxiety, shyness, and privacy concerns [63]. That research highlighted a lack of study on the causes of this behavior and proposed investigating the matter further in the future. Several theories relating to shyness and sociability (the likelihood that users engage socially with others) are discussed in [64]. Although some users prefer privacy due to shyness, anxiety, or to increase focus, others still rely on visual cues such as facial expressions and hand gestures to help them interpret the message. Previous research has suggested that a lack of non-verbal and visual cues may increase the risk of clients and therapists misinterpreting what is communicated [65].
One possible solution for sharing non-verbal cues without jeopardizing privacy is by replacing the live video feed with virtual forms of a person, such as through an avatar. Previous studies have shown that avatars can help improve subsections of social presence such as co-presence and behavioral interdependence [66,67]. However, using avatars for this purpose has some limitations because they can only capture and display a limited range of emotions and expressions. Previous work has also shown that lower graphical detail on the avatar further limits the expressions that can be shown, and in turn lowers the emotional and affective states [66]. Basic graphics in avatars also present a trust issue with users, with previous results showing that users have a higher level of trust in the avatar when a more realistic avatar is used, as compared to basic-looking avatars [67][68][69].

Background Distractions
One recurrent problem when users try to collaborate synchronously online is the inevitable background noises and visuals that interrupt their activities. The background noise can be subtle or too loud to ignore. In most cases, participants had to repeat their sentences due to their voice being muffled by the background noises:
G2F4: I can draw using color, there is the box where the color located (There is some background noise behind the scene)
G2F5: I use the box at the top, number 2 (Background noise-C draw orange rectangle, B draw blue rectangle) (G2F4 distracted by his little brother)
Sometimes, participants were distracted by something else while performing collaborative tasks on the virtual conference application. From the observation, one of the participants in Group 3 was interrupted while doing the collaborative activity. Distractions can also take the form of an unwelcome presence in the background:
G3F7: Please wait for a moment. (Others were coloring the sea while G3F7 is being interrupted by talking to someone behind the background).
G3F7: Can you move over? (G3F7 was letting someone to move across behind the scene while the web camera was disabled)
G4F10: There might be distractions from brothers, or like maybe their parents always come into the room without knocking, without realizing they're in class, so that's why.
Distractions can occur in the virtual world as well, such as when the default layout of the platform is perceived to be cluttered and interfering with the flow of the tasks.

G5F13: Okay, I don't think that turning on the camera is, uh, it's not, it's not, it's unnecessary because I want the whiteboard to be full screen. I think having the, having the video on the side might clutter the experience.
Another research study found that distraction can come from background sound, background video movement, or an unwanted presence [70]. Other factors that may cause distractions include a poor camera angle, bad lighting, walking while holding the camera, behavioral distraction (eating in front of the camera), and background noises [68]. There are also issues due to the home environment, such as seeing yourself on screen and other forms of interruption, which can be distracting.
Previous research has suggested that using a larger face and reducing background visibility could help reduce distraction [70]. The larger face can help reduce background distraction by allowing users to focus more on the speaker's facial expression. Another previous study designed its system to be as simple as possible to minimize distraction when performing video annotation [71], namely by reducing the number of features available on the screen, simplifying the icons, and reducing background visuals and noise in the video feed.

Design Guidelines and Recommendations
The results and discussion of this user experience study serve as the basis for a set of design recommendations that will be useful in the future when designing virtual conference applications for collaborative purposes. According to the findings of the user experience study, there is a need for an intuitive interface to help increase the usage and ease of use of such synchronous, collaborative tools. For instance, the interface should be designed to be as close as possible to the real-world settings of a meeting or collaborative site. This would put the participants at ease, as they would recognize the virtual location as being somewhat familiar to the real-world meeting site. Likewise, the tools and additional features provided by the synchronous, collaborative virtual conference application should also be designed to encourage exploration when engaging in online collaboration activities with others.
Another key finding from this study is the need to provide for an alternative method of representing facial expressions when communicating online. Although respondents stated that they were mostly shy or reluctant to switch on their cameras and show themselves (and their backgrounds) to others, they had also stated that they would appreciate the facial cues and reading responses from others. This study proposes the use of an avatar to achieve this. Avatars use less bandwidth to display expression than a live video feed from a web camera, which consumes a lot of data. Avatars also help to improve privacy as they are less intrusive compared to using a web camera. The virtual character should not be too distracting and can be disabled if users want to. This could help increase user engagement and awareness, as the virtual avatar can share expressions without having to show the user's actual face or background.

Proposed Architecture System Design
The overall architecture of the proposed natural and affective virtual conference application is illustrated in Figure 10. The proposed system is made up of both local and remote clients, together with a server that connects multiple local and remote clients. On the local client, the web camera captures the video as an input for the virtual conference application. For privacy and data management, the local user can control the input through the input control layer by enabling, switching, or disabling it. The video source data is then preprocessed and used to track facial expressions and hand gestures. Once the facial expressions and gestures have been detected, the system identifies the corresponding output through the affect and action modules: the affect module determines the user's facial expression, while the action module determines the user's gestures. The affect and action data are then packaged and sent through a real-time network layer, which connects to a cloud server and transfers the data from the local client to the server. The server then distributes the affect and action data to all remote clients that connect to it.
At the remote client, the affect and action data received from the server are interpreted and sent to the avatar controller and the action controller, respectively. The avatar controller manages and displays the avatar animation by showing real-time emotion based on the affect data received from the local client.
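The packaging and distribution steps described above can be sketched as follows. This is only an illustration under stated assumptions: the message fields, the JSON encoding, and the in-memory `RelayServer` class are hypothetical stand-ins for the real-time network layer and cloud server, whose actual protocol the design does not specify.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class AffectActionMessage:
    """One update from a local client (all field names are hypothetical)."""
    sender_id: str
    affect: str                    # classified expression, e.g. "happy"
    action: Optional[str] = None   # detected gesture action, e.g. "raise_hand"
    timestamp: float = 0.0


def pack(message: AffectActionMessage) -> bytes:
    """Package affect and action data for the network layer (JSON assumed)."""
    return json.dumps(asdict(message)).encode("utf-8")


def unpack(payload: bytes) -> AffectActionMessage:
    """Interpret a payload received at a remote client."""
    return AffectActionMessage(**json.loads(payload.decode("utf-8")))


class RelayServer:
    """In-memory stand-in for the cloud server; it does no real networking."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, List[bytes]] = {}

    def connect(self, client_id: str) -> None:
        self._inboxes[client_id] = []

    def publish(self, payload: bytes) -> None:
        # Distribute the update to every connected client except the sender.
        sender = unpack(payload).sender_id
        for client_id, inbox in self._inboxes.items():
            if client_id != sender:
                inbox.append(payload)

    def receive(self, client_id: str) -> List[AffectActionMessage]:
        # Hand over and clear the pending updates for one remote client.
        pending = self._inboxes[client_id]
        self._inboxes[client_id] = []
        return [unpack(p) for p in pending]


server = RelayServer()
for client in ("client_a", "client_b", "client_c"):
    server.connect(client)

update = AffectActionMessage("client_a", affect="happy", action="thumbs_up",
                             timestamp=time.time())
server.publish(pack(update))
for msg in server.receive("client_b"):
    print(f"{msg.sender_id}: avatar shows {msg.affect}, action {msg.action}")
```

Only a short label per update crosses the network in this sketch, rather than video frames, which is in keeping with the motivation that an avatar should need less bandwidth than a live camera feed.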
The action controller displays actions such as raising a hand, waving, and giving a thumbs up. The actions are shown hovering over the avatar, based on the action data received from the local client. The avatar and actions are then displayed on the user interface. As a result, remote users can see other users' facial expressions and actions through the avatar without users having to reveal their own faces to others. The avatar partially substitutes for a public video display while still allowing users to share affects and actions.
Figure 11 depicts the architecture flow process of the proposed affective virtual conference application for remote collaboration. Figure 11a shows the flow process for determining emotion based on the facial expressions of users. To detect a facial expression, a face tracking algorithm based on previous research must be implemented [72]. First, a frame is captured and image preprocessing is performed, which includes converting the image to grayscale and removing the background. Then, the image is adjusted through pose correction to straighten the face. If a face is detected, the system extracts the facial features into three main cues: the eyes, face, and eyebrows. If no errors are found, the extracted features are passed to a Support Vector Machine (SVM) to classify the facial expression. Using the provided training dataset, the system classifies different types of emotion, such as happy, sad, angry, neutral, excited, and curious, among others. This information is then used by the avatar module to help the avatar express emotion. However, only one expression is displayed at a time.
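The final classification step can be illustrated with a small sketch of how a trained linear SVM could pick a single expression. It assumes a one-vs-rest linear SVM whose weights have already been learned; the three-number feature vector (one value per cue), the weights, and the biases below are hypothetical placeholders, not values from the proposed system, and training itself is omitted.

```python
from typing import Dict, List, Optional

# Hypothetical pre-trained one-vs-rest linear SVM parameters.
# Each feature vector is assumed to be [eye_cue, face_cue, eyebrow_cue].
WEIGHTS: Dict[str, List[float]] = {
    "happy": [0.2, 1.0, 0.1],
    "sad": [-0.3, -1.0, -0.2],
    "neutral": [0.0, 0.0, 0.0],
}
BIASES: Dict[str, float] = {"happy": 0.0, "sad": 0.0, "neutral": 0.1}


def decision_scores(features: List[float]) -> Dict[str, float]:
    """Linear SVM decision value w . x + b for each expression label."""
    return {
        label: sum(w * x for w, x in zip(weights, features)) + BIASES[label]
        for label, weights in WEIGHTS.items()
    }


def classify_expression(features: Optional[List[float]]) -> Optional[str]:
    """Return the single expression to show on the avatar, or None.

    None as input stands for "no face detected in this frame"; the caller
    would then capture another frame instead of updating the avatar.
    """
    if features is None:
        return None
    scores = decision_scores(features)
    # Only one expression is displayed at a time: the highest-scoring label.
    return max(scores, key=scores.get)


print(classify_expression([0.5, 0.9, 0.2]))
print(classify_expression(None))
```

Taking the label with the largest decision value is one common way to reduce a multi-class SVM to a single output, which matches the constraint that the avatar shows one expression at a time.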
Figure 11b, in turn, depicts the flow for producing an action from a user's hand gesture. For hand tracking, the proposed virtual conference application makes use of some of the techniques developed in a previous study [73]. The flow starts by capturing a frame from the video source; the frame is then processed by converting it to grayscale and applying background subtraction, Gaussian blurring, and image thresholding. Next, the system extracts a contour and its convex hull from the processed image and detects convexity defects. The number of convex hull points and defects indicates whether the processed image contains a hand. If no hand is detected, the system captures another frame and repeats the process. Once a hand is detected, a simple algorithm based on the number of hull points and defects attempts to determine the gesture. If the gesture cannot be determined this way, the system falls back on a Support Vector Machine (SVM) with a training dataset so that different types of gesture can be recognized with sufficient accuracy. When the gesture has been determined, the system looks up the corresponding action. Finally, a virtual action, such as a thumbs up, a raised hand, or clapping, is displayed on the screen.
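To make the hull-and-defect step concrete, the following pure-Python sketch computes a convex hull (Andrew's monotone chain) and the convexity defects of a contour. It is an illustration under stated assumptions, not the method of [73]: the contour is assumed to be a simple polygon given as an ordered list of distinct integer points, and `min_depth` is a hypothetical threshold for ignoring shallow dents.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[int, int]


def cross(o: Point, a: Point, b: Point) -> int:
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; collinear boundary points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    upper: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity_defects(contour: List[Point], min_depth: float):
    """Return (start, end, deepest_point, depth) for each sufficiently deep dent."""
    hull = set(convex_hull(contour))
    hull_idx = [i for i, p in enumerate(contour) if p in hull]
    if len(hull_idx) < 3:
        return []
    n = len(contour)
    defects = []
    for k, start in enumerate(hull_idx):
        end = hull_idx[(k + 1) % len(hull_idx)]
        a, b = contour[start], contour[end]
        edge_len = hypot(b[0] - a[0], b[1] - a[1])
        deepest, depth = None, 0.0
        i = (start + 1) % n
        while i != end:                      # contour points under this hull edge
            d = abs(cross(a, b, contour[i])) / edge_len
            if d > depth:
                deepest, depth = contour[i], d
            i = (i + 1) % n
        if deepest is not None and depth >= min_depth:
            defects.append((a, b, deepest, depth))
    return defects
```

For a crude three-pronged outline such as `[(0, 0), (10, 0), (10, 10), (7, 3), (5, 11), (3, 3), (0, 10)]`, the function reports two defects, at `(7, 3)` and `(3, 3)`, which correspond to the two valleys between the prongs. A rule of the kind described above could then map the defect count to a gesture and hand over to the SVM when the count is ambiguous.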
Figure 11. Flow process of the proposed natural and affective virtual conference application: (a) detecting the user's facial expression and displaying it on the avatar; (b) detecting the user's hand gesture and displaying it as an action cue.
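Both flows end by handing a label to the real-time network layer described in the architecture overview. The sketch below illustrates that hand-off with an in-memory stand-in for the cloud server. The JSON field names, the `RelayServer` class, and the choice not to echo a message back to its sender are all assumptions made for illustration; the paper does not specify a wire format or transport.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def pack_update(user: str, affect: Optional[str] = None,
                action: Optional[str] = None) -> str:
    """Package affect and action data for the network layer (hypothetical format)."""
    return json.dumps({"user": user, "affect": affect, "action": action})


@dataclass
class RelayServer:
    """In-memory stand-in for the server that connects the clients."""
    inboxes: Dict[str, List[str]] = field(default_factory=dict)

    def connect(self, user: str) -> None:
        self.inboxes[user] = []

    def publish(self, message: str) -> None:
        """Distribute a packaged update to every connected remote client."""
        sender = json.loads(message)["user"]
        for user, inbox in self.inboxes.items():
            if user != sender:           # assumption: no echo to the sender
                inbox.append(message)


def unpack_update(message: str) -> Dict[str, Optional[str]]:
    """Remote side: recover the data for the avatar and action controllers."""
    return json.loads(message)
```

Only the labels travel over the network in this sketch, never the video frames, which is the property the avatar design relies on for privacy and lower data use.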

Proposed User Interface Design
The proposed user interface design for the virtual conference application is shown in Figure 12. The user is required to enter a name, the name of the group to be created or joined, and the group password to be set or entered. After entering this information, the user clicks either the Join Group or the Create Group button. Create Group creates a new, empty group with the group name and password set by the user. Join Group admits the user to an existing group, using the group name and password provided by the user who created it. Once inside a group, the user has access to several functions. The top-left corner contains buttons for annotating, such as sketching, erasing, and coloring, as well as application functions such as screen sharing, recording, video mode, and microphone control. The top-right corner holds the settings menu, with options to invite members, leave the group, or edit group settings. Chat messages between group members are displayed in the bottom-left corner. To highlight actions taken by users, each action is shown in yellow. The bottom-right corner is used to manage group members, for example by muting them.
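The create and join behavior can be sketched as a small in-memory registry. The class and method names are hypothetical, and two details go beyond what the design states: the password is stored as a salted PBKDF2 hash rather than in plain text, and creating a group whose name is already taken raises an error.

```python
import hashlib
import hmac
import os
from typing import Dict, List, Tuple


class GroupRegistry:
    """Hypothetical in-memory store of groups, keyed by group name."""

    def __init__(self) -> None:
        # group name -> (salt, password digest, member names)
        self._groups: Dict[str, Tuple[bytes, bytes, List[str]]] = {}

    @staticmethod
    def _digest(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def create_group(self, group: str, password: str) -> None:
        """Create a new, empty group protected by the given password."""
        if group in self._groups:
            raise ValueError(f"group {group!r} already exists")  # assumption
        salt = os.urandom(16)
        self._groups[group] = (salt, self._digest(password, salt), [])

    def join_group(self, user: str, group: str, password: str) -> bool:
        """Admit the user if the group exists and the password matches."""
        entry = self._groups.get(group)
        if entry is None:
            return False
        salt, digest, members = entry
        if not hmac.compare_digest(digest, self._digest(password, salt)):
            return False
        members.append(user)
        return True

    def members(self, group: str) -> List[str]:
        return list(self._groups[group][2])
```

In this sketch the creator joins through the same `join_group` call as everyone else, which matches the description of Create Group producing an empty group.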
The most prominent feature of the proposed virtual conference application compared with others is the viewing mode, which lets users switch between avatar, video, and off modes. In avatar mode, avatars appear at the bottom of the screen, with the number of avatars depending on how many users have joined. Each avatar expresses itself based on its user's facial expression. When a user switches to video mode, the avatar is replaced with the web camera video stream from that user's computer. Users who want more privacy can turn off both video and avatar mode.
Another feature that differentiates this system from others is that it provides action cues, such as hand up and love, when a certain gesture is performed. Because the cues appear beside the avatar, other users notice them more readily. Chat messages are also displayed above the avatar, which improves awareness and helps users keep track of the current conversation. The proposed virtual conference application further improves awareness by displaying color-coded name cues during whiteboard annotation and screen sharing. These cues indicate who is annotating, which can be useful during collaboration.
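One way to model the three viewing modes and the action cues is a small state object per participant. This is a sketch only: the names `ViewMode`, `Participant`, and `render_tile` are hypothetical, and the decision to show expressions and action cues only in avatar mode is an assumption, since the design does not say how cues behave in video or off mode.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ViewMode(Enum):
    AVATAR = "avatar"
    VIDEO = "video"
    OFF = "off"


@dataclass
class Participant:
    name: str
    mode: ViewMode = ViewMode.AVATAR
    expression: str = "neutral"       # latest affect label for this user
    action: Optional[str] = None      # latest action cue, e.g. "hand_up"


def render_tile(p: Participant) -> Dict[str, Optional[str]]:
    """Describe what the interface shows for one participant."""
    if p.mode is ViewMode.VIDEO:
        return {"name": p.name, "show": "video"}
    if p.mode is ViewMode.OFF:
        return {"name": p.name, "show": "none"}
    # Avatar mode: the avatar mirrors the expression and carries the action cue.
    return {"name": p.name, "show": "avatar",
            "expression": p.expression, "action_cue": p.action}
```

Switching modes is then a single assignment, for example `participant.mode = ViewMode.OFF`, after which the tile no longer exposes any expression data.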

Conclusions and Future Works
In this paper, a usability study was carried out to determine the issues faced by users while performing collaborative activities on current virtual conference applications. Our findings show that most of the features offered by existing applications are not fully utilized. Users also found common collaborative features such as file sharing, whiteboard, and annotation difficult to use. A simpler, more intuitive interface design is needed to better assist users. The study also revealed that most participants would prefer to turn off the web camera for reasons of privacy, internet instability, and reducing mobile data consumption. Participants nevertheless rated the collaborative experience positively and the system usability as acceptable.
Following that, the paper described a set of design guidelines to address the issues raised. These include ways to encourage the use of collaborative tools by redesigning the interface to look more like the real world. Other solutions include providing avatar-based virtual characters that consume less data, improve privacy, and increase social awareness among participants.
One of the goals of this study is to understand more about human-to-human distance communication through technology, especially during the ongoing COVID-19 pandemic. The study also contributes to the research community by providing guidelines for improving collaboration and affective sharing, which are often lacking in current virtual conference applications.
One limitation of the study is that it tested only one application (Zoom). It would be interesting to compare different virtual conference applications to determine their issues and challenges. Another limitation is that the study did not include participants from different backgrounds, owing to the pandemic and the resulting movement control order restrictions. An extension of the study will be conducted with participants from different backgrounds to obtain a more representative sample.
The next step is to redesign the interface of a virtual conference application to make it more user-friendly, thereby encouraging collaboration among users. An avatar-based solution to increase the social awareness and social presence of group members is also in the works, in which affective qualities can be communicated among group members without the need to publicly enable the web camera. In the near future, this study will include log recordings in addition to the self-reporting that is currently in place. This should provide better insight into the behavior of users when performing collaborative activities online in a more natural setting.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.