An Intelligent and Interactive Interface to Support Symmetrical Collaborative Educational Writing among Visually Impaired and Sighted Users

It is often uncomfortable for disabled individuals, especially those with vision impairment, to conduct educational activities in collaboration with people that have perfect vision. This can be because of the former’s lack of confidence, vision capability, and acceptance. Information and communications technology (ICT) has played a vital role in giving support to people with visual impairments so that they can overcome their issues. This study proposes innovative solutions that address the challenges faced by partially or completely visually impaired people. It provides an interactive and intelligent interface, which they may use to perform educational activities, such as editing, writing, or reviewing documents, in collaboration with people without visual impairments. The system provides high-quality awareness features by sending them instant voice notifications about the actions and events occurring in the shared environment. A speech-recognition engine has been integrated into the system to allow users to interact with the application through voice commands. The system is evaluated through experiments, where people with visual impairment and people without visual impairment were engaged in collaborative writing. The obtained results are encouraging. The users showed curiosity in the system and were able to focus on the productive task instead of their disability.


Introduction
According to the World Health Organization (WHO), more than 253 million people are visually impaired [1]. Given that they make up a substantial part of society, it is imperative to ensure that they can participate actively in social activities and interact with others effectively. The objective should be to help them become self-reliant and confident. Existing information and communications technology (ICT) tools have been designed primarily for users with normal vision. Users with visual impairment have very few interactive user interface (UI) components available to them, and those that exist are of limited use. To interact with such applications, these users employ assistive technologies and add-ons, such as braille translators, voice recognition tools [2], and speech synthesizers/screen readers [3,4].
Computer supported cooperative work (CSCW) [5] allows users to work collaboratively on a single goal or task. CSCW helps users to obtain diverse knowledge and skills, which is not easily achieved if they work alone. Thus, individuals can be more productive if they work collaboratively in a group [6]. However, individuals with visual impairments struggle when they interact with such applications [7], as these are not designed with them in mind. Add-ons [8] and interface wrappers [9] have been developed to make applications accessible to them; however, the applications are still not fully usable.
In CSCW applications, effective work depends on all aspects of cooperative work, i.e., interaction, cooperation, collaboration, awareness, and coordination [10]. The most important factors are awareness of, and interaction with, the applications themselves [11,12]. The system must provide an interactive interface that is easily operable by users and gives complete awareness of the actions and events happening in the collaborative environment. Normally, awareness features are built as popup notifications and color formatting over contents. These mechanisms are visual and do not work for users with severe visual impairment. Similarly, the standard input/output devices used to interact with an application (mouse, keyboard, liquid crystal display (LCD) monitor, and so on) are easy for a sighted user but hard or impossible to use for individuals who cannot see. Existing solutions therefore need advanced components and features (audible alerts, sound beeps, speech-based input, and so on) specifically designed for people with visual impairments so that they can participate equally in collaborative work.
The focus of this research is to enable persons with visual impairments to become self-reliant, confident, and independent, and to help them participate equally alongside sighted individuals. To this end, a framework was developed. This framework includes speech-based inputs and awareness functions designed particularly for users who are blind. The proposed system allows visually impaired users to work in a collaborative environment with sighted users on a single goal, and it has a special voice command input feature that lets blind users interact with the application efficiently. Interaction through voice allows them to use the application without any hassle or assistance. These two features complement each other with the aim of making an interactive and intelligent interface. The application is embedded with well-structured information/notification components and has an easy-to-use interface. The occurrence of every event is shared with the users to keep every participant on the same track. Moreover, a communication service is added so that information may be easily exchanged between individuals with and without visual impairments.

Literature Review
When a group of users works on a single task, the productivity and quality of the work increase and decision making becomes faster and more efficient [13]. This is achieved through a groupware environment that provides collaboration and coordination functions [7]. To achieve high-quality coordination and collaboration, group awareness and coordination support are essential. Group awareness relates to the activities of each user working in the shared platform, information about the participating authors, and the effect of each author's activity on the activity of others [14].
A survey was conducted [15] with blind people to gauge whether they had any past experience of group activity and, if so, what issues and deficiencies they had faced. Respondents who had faced no issues were asked what they would expect from such an environment. A questionnaire was distributed by email among special education institutes worldwide and individuals classified as visually impaired. Around 150 responses were received, and the results showed that the blind community needs a system that is designed specifically for them and helps them work on a single task in collaboration with sighted people.

Overview of Computer Supported Cooperative Work (CSCW) Systems
CSCW systems are used by many large organizations. They help collaborators because they manage asynchronous as well as synchronous communication [16]. Java applets made multiuser (JAMM) [17] and synchronous asynchronous structured shared editor (SASSE) [18] are examples of these collaborative systems. Inconsistency is considered one of the major issues in these systems.
Inconsistency refers to the duplication of the same content in different parts of the document due to collaborative work. Yang et al. [19] presented a consistency model to avoid inconsistency problems. The Clay system [20], proposed by Locasto et al., allows users to work in a synchronized manner from different geographic locations.
Gutwin and Greenberg [21] developed a descriptive theory of awareness that guides the developers of groupware applications on the importance of and need for awareness functions in a shared workspace. The proposed framework helps designers to understand the concept for the purposes of designing awareness support and improving the quality of group awareness in a collaborative environment. The group awareness knowledge-based system (GAKS) [5] is a web-based application that enhances the coauthors' document writing abilities by providing them with elaborate and innovative awareness functionalities. It provides synchronous/asynchronous contextual communication tools and a work proximity detector so that users can efficiently produce and coordinate their actions. Big Watch (BW) [22] is a framework that provides flexible and extensible awareness functionalities to its users. It can be integrated into any application to enhance its event-based awareness functions. The framework reduces the development cost and extends the awareness information in a unified way.
An extensible markup language (XML) based co-authoring platform was created by Qingzhang et al. [23] for collaborative working. Another XML-based framework was presented by Ho, Leong et al. [24]. In XML-based systems, shared documents are converted to and stored in XML format. The advantage of XML is that it stores information in a structured format that is easy for both machines and humans to read. Processing the documents thus becomes easier, which helps with managing resources and access as well as locking the content of documents. Another system, WoTel [25], allows its authors to hold video conferences to share ideas while working on a shared document. Multimedia systems are integrated into it to support group communication.
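To make the point about structure and content locking concrete, the following is a minimal sketch of a shared document stored as XML with a per-section lock. It does not reproduce the platforms in [23] or [24]; the element names, the `locked_by` attribute, and the helper functions are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared document: each section has an id and a lock owner.
# An empty "locked_by" attribute means the section is free.
DOC = """<document title="Report">
  <section id="s1" locked_by="">Introduction text</section>
  <section id="s2" locked_by="alice">Methods text</section>
</document>"""


def find_section(root, section_id):
    """Return the section element with the given id, or None."""
    return root.find(f".//section[@id='{section_id}']")


def try_lock(root, section_id, user):
    """Lock a section for `user` if it is free or already theirs."""
    section = find_section(root, section_id)
    if section is None:
        raise KeyError(section_id)
    owner = section.get("locked_by", "")
    if owner and owner != user:
        return False  # someone else is editing this section
    section.set("locked_by", user)
    return True


def edit(root, section_id, user, new_text):
    """Replace a section's text only if `user` holds its lock."""
    section = find_section(root, section_id)
    if section is None or section.get("locked_by") != user:
        return False
    section.text = new_text
    return True


root = ET.fromstring(DOC)
```

Because the lock lives on the structural element rather than on a character range, two authors can edit different sections at the same time while a second writer on the same section is refused.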
A co-authoring system named TeNDeX was developed by Hodel et al. [26], which allows its users to edit documents synchronously. Unlike conventional co-authoring platforms, the proposed system saves a document's content in a database, which increases the efficiency of data retrieval. Joeris et al. [27] proposed another application for synchronous collaboration, which supports the engineering domain. The work done in [28] allows its users to write mathematical expressions collaboratively. Within its interface, it provides options for writing formulae, obtaining suggestions from previously written formulae, reusing them, and evaluating them. In case of difficulty, it also suggests currently available expert authors to the user so that they may get help from them.

Technologies and Applications for the Visually Impaired to Help in Document Writing
Assistive applications contain special user interface/user experience (UI/UX) components, which enable blind users to produce work through them. There are three main ways to manage UI/UX for these kinds of users: well-formatted, good-quality visual content for partially blind or color-blind people, who have sight but are unable to see clearly; text-to-speech functions and special assistive hardware devices for the completely blind, who cannot see; and sound/speech-based alerts and voice input controls for both groups. Achieving all of these in one framework is a challenge. However, some frameworks support individual features. For instance, speech-based assistants [29][30][31] specifically designed for visually impaired people are already in use, and a similar kind of application may be designed to help them with their educational and learning activities. Many chat bots have appeared that use natural language processing (NLP) to communicate with users and do not even let the communicator know that there is a bot behind the screen [32]. One good application of such systems is for elderly and disabled people who suffer from loneliness and do not have a rich social life. The system can provide them with the benefit of a distributed network and help them stay updated with their surroundings [33].
The Google Docs UI developed by Mori et al. [9] addressed the major problems faced by people who are blind while using Google Docs [34] via screen readers. The proposed UI has the same look and feel as the original Google Docs, but the accessibility of interactive elements was improved by integrating a new standard of (X)HTML interactive widgets (links, menus, buttons, and so on). To improve orientation for the blind, accessible rich internet applications (ARIA) [35] landmarks and hidden labels were added to the modified layout. The existing editor was replaced with the TinyMCE (Tiny Moxiecode Content Editor) editor, which is more accessible through the keyboard and screen reader. In addition, summary attributes were added to the document list tables to provide quick information about the document list. The real-time informative message issue was solved by using Ajax scripting.
A Microsoft Word add-in prototype [36] was developed to improve the usability and accessibility of collaborative writing for visually impaired individuals. The research began with a baseline usability study [37], conducted to identify the accessibility and usability issues that arise when visually impaired people use the collaborative writing features of Microsoft Word. The authors proposed a Word add-in prototype [8] that uses Windows message boxes to present the revisions and comments of the document. It is compatible with the Job Access with Speech (JAWS) screen reader and a standard keyboard. In their next proposal, they used an iterative design approach conducted in two rounds of one usability study [38]. A group of blind candidates shared their feedback and suggestions after each iteration to improve the current version of the add-in. Based on the suggested improvements, the authors modified the prototype.
TalkMaths [39] provides blind people with a system that helps them create and edit high-precision mathematical formulas. Automatic speech recognition (ASR) and Dragon NaturallySpeaking (DNS) are used to recognize speech and give textual output in the form of a "parse tree". Moreover, TalkMaths can recognize and detect syntactic errors. The initial version of TalkMaths [40] was designed for English language users only. An editing mode was also devised for the system, in which the user can only delete the last typed digit. To improve the editing mode, work has started on the "DNS select-and-say" approach, which resembles the mouse "point-and-click" strategy. Users select a specific label by dictating its position, as each box has a sequence number attached to it. After selecting the appropriate box, the user dictates the correct statement to overwrite the existing one.
Writing Mathematics by Speech [41] uses speech input techniques to enable blind students to read, write, and edit mathematical expressions as quickly and precisely as their sighted peers. The proposed solution is an extension of the linear access to mathematics for braille device and audio-synthesis (LAMBDA) system [42], which is based on the functional integration of a linear mathematical code and an editor to visualize, write, and manipulate formulas. It was designed to be used with braille peripherals and vocal synthesis. Dragon NaturallySpeaking 9 is used for speech recognition. The proposed prototype is made up of a script written for the deployment of Dragon NaturallySpeaking, two dictionaries (one for text input and one for mathematical input), and a Python script that enables the LAMBDA editor to perform actions on the mathematical expressions.
A web-based application [43] takes speech input for the writing of mathematical formulas. It is highly accessible and has good usability features. The proposed application is context-sensitive, and its functionality is divided into categories, each of which is forced to use a specific syntax; this reduces the risk of errors in speech recognition and ultimately produces an accurate formula. The application requires prosody to minimize voice readout problems, which affect the desired result. On the development side, the author used the Java language and the Extensible Hypertext Markup Language (XHTML) + Voice Profile, which controls voice processing and supports fast voice recognition. The final mathematical formula is written in MathML, which is preferred with regard to existing standards. A two-layered system was introduced: the bottom layer is written in Java and runs on Jetty, and for the top layer, a graphical user interface (GUI) and the Opera browser were chosen because they support XHTML + Voice Profile technology.
A Software Model to Support Collaborative Mathematical Work between Braille and Sighted Users [44] provides an environment in which people who are blind can work cooperatively with sighted people. The system synchronizes two different views of a mathematical formula, one for the person who is blind and the other for sighted people. The expression is presented to the blind user in braille, whereas a graphical illustration is used for sighted people. Support functions were included to allow blind people to perform calculations easily. A hybrid entry method was also used to insert simple expressions via a keyboard and complicated expressions via speech. The Universal Maths Conversion Library (UMCL) [45] is used to deal with the fact that each piece of math software has its own associated braille code; switching from one braille code to another is made possible through this library. UMCL consists of one major segment, whereas the input and output segments depend on the number of mathematical formulae present. Canonical MathML, a method to unite MathML segments, is used to reduce the evaluation time for the mathematical formulae.
Supporting Cross-Modal Collaboration in the Workplace [46] presents a cross-modal tool for collaborative editing of diagrams between visually impaired and sighted users. Initially, the authors had designed a single-user auditory interface [47] to construct node-and-link diagrams, such as organizational charts, flow diagrams, unified modeling language (UML) diagrams, and transport maps. The proposed system extends that system and offers different views. The Graphical View is similar to a typical diagram editor, with a toolbar, mouse clicks, drag-and-drop functions, and keyboard shortcuts. In the Hierarchical Auditory View, the diagram is translated into auditory form from a tree-like hierarchical data structure to support non-visual interaction. In the Spatial Haptic View, the PHANTOM Omni haptic device (a 3D mouse with a 'pen') displays the contents on a vertical plane where nodes act as magnetic points, and the user simply traces the stylus across the lines. The system allows the haptic and auditory hierarchical views to work together: the user locates an item and gives a command so that the PHANTOM locates it on the virtual plane.
An Initial Investigation into Non-Visual Computer Supported Collaboration [48] provides a collaborative environment in which visually impaired users interact with and manipulate simple graphs. It advances an existing application, Graph-Builder [49], which uses the PHANTOM Omni device to allow browsing and modification of bar graphs via haptic force feedback. The proposed system uses two PHANTOM Omni devices to build a collaborative environment. Two users are allowed to manipulate the same graph simultaneously, but cannot concurrently modify the same bar. Auditory signals let one user know the other's location. For interaction, two features are employed: "Come to Me" (a user uses their Omni device to haptically drag the other user's device to their current location) and "Go to You" (a user lets their Omni device get dragged to the other user's proxy).
Other works have also been conducted, alongside educational activities, to support CSCW between blind and sighted persons. Multimodal tools and interfaces [50] have been developed to facilitate communication and interaction between a user who is blind and those who cannot hear. They use a modality replacement function for information transfer and enable communication between the users. Stacy et al. [51] explored the creation and management of accessibility in a shared environment and identified the challenges and solutions of collaborative accessibility. Based on these experiences, they proposed new methodologies and technologies to support collaborative accessibility in the home. Winberg et al. [52] reviewed the collaboration between users with visual impairments and sighted individuals across different modalities. They set up an environment in which both types of users play a game. An auditory interface is provided to participants who have visual impairments, whereas a visual interface is available for sighted users. Issues with the collaborative interface were observed, and revised design principles were presented for users with visual impairments. Another paper [53] presents the methodologies used by users with visual impairments while interacting with computer systems and describes their pros and cons. Based on these analyses, the authors presented recommendations for the user interfaces of groupware and chat applications designed for persons with visual impairments, to enhance their usability without losing users' interest. Finally, a prototype named Blind Internet Relay Chat (BIRC) was proposed and its advantages and limitations were discussed.

Limitations and Relevant Recommendations
Some of the systems that we reviewed implement basic techniques, such as screen readers, sound alerts, and popup notifications, whereas others have advanced features, such as speech-based input, voice alerts, and braille peripherals, to help the visually impaired perform group document writing activities involving text content or mathematical expressions. Table 1 presents the characteristics of the discussed systems based on the attributes chosen for comparative study and analysis. The application platform describes whether an application is accessible to its users through a web portal, whether installation is required, or whether it is an add-in that needs to be integrated with an already installed application. The application objective states exactly what the application does to support its end user. When designing an application, consideration of its audience is another important factor. It helps to select and design the best input/output approaches for its users instead of using standard mechanisms. An awareness mechanism should be implemented very efficiently to make an application interactive and responsive. Advanced interactive components, such as speech-based input, assistive hardware devices, and voice alerts, which are not very common input/output methods, are needed for blind users to interact with the application [54]. The user's workspace describes whether an application is usable by only one user at a time or whether multiple users can work simultaneously. Some platforms must be used at the same place and time, others can be used at different times and in different places, and some use a hybrid approach, i.e., a mixture of both [7]. When users are blind, security and privacy must be explicitly defined and handled. Various approaches have been used to integrate user interaction and awareness into the system. These include dialogue boxes, warning messages, sound beeps, speech alerts, text to speech, and so on. Text to speech, sound beeps, and speech alert techniques are very popular. However, users have to listen to long speeches to obtain the information they require [4], and if a user misses a speech or is unable to grasp the information in time, they must listen to the speech again. Systems should convey the maximum amount of information in limited speech and give the user some control over the speech, such as stop, repeat, and skip-item functions.
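The stop, repeat, and skip-item controls recommended above can be sketched as a small queue of short spoken notifications. This is an illustrative sketch only, not code from any reviewed system; the `SpeechQueue` class and the `speak` callback (a stand-in for a real text-to-speech call) are assumptions.

```python
from collections import deque


class SpeechQueue:
    """Queue of short spoken notifications with skip/repeat/stop controls.

    `speak` is a stand-in for a real text-to-speech call (an assumption);
    any callable that accepts a string will do.
    """

    def __init__(self, speak):
        self._pending = deque()
        self._last = None
        self._speak = speak

    def push(self, message):
        """Add a notification to the end of the queue."""
        self._pending.append(message)

    def next(self):
        """Speak the next pending message, if any, and return it."""
        if not self._pending:
            return None
        self._last = self._pending.popleft()
        self._speak(self._last)
        return self._last

    def repeat(self):
        """Speak the most recently spoken message again."""
        if self._last is not None:
            self._speak(self._last)
        return self._last

    def skip(self):
        """Drop the next pending message without speaking it."""
        return self._pending.popleft() if self._pending else None

    def stop(self):
        """Discard everything that is still pending."""
        self._pending.clear()
```

Keeping each message short and letting the listener repeat or skip individual items avoids replaying a long speech from the start when a single item was missed.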
In mathematical writing applications, the major issue is editing an already written formula. Some basic solutions have been proposed for this issue, such as dictating the position of the content and then updating it [39], or going through the whole expression and updating the content after reaching the particular area. The solution to this problem might be the same as discussed earlier: the user must have control over the spoken content. Another limitation is the extensive structure that the user needs to speak in the case of very long and complex expressions. Predictions could be added to speed up the process. A memory could also be added to store frequently used expressions and re-use them with a single call. The system should be interactive to enhance the application's usability. For instance, it should respond to every action undertaken by a visually impaired user. The response time should also be kept small so that the system can work efficiently and meet the requirements of the user.
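As a rough illustration of the memory and prediction ideas, the sketch below stores expressions under short spoken names and suggests stored expressions that start with what the user has dictated so far. The `ExpressionMemory` class, its method names, and the plain-text formula notation are hypothetical; a real system would work on a structured representation such as MathML.

```python
from collections import Counter


class ExpressionMemory:
    """Stores named expressions and suggests them by prefix, most-used first."""

    def __init__(self):
        self._named = {}          # spoken name -> expression text
        self._usage = Counter()   # expression text -> number of recalls

    def remember(self, name, expression):
        """Store an expression under a short spoken name."""
        self._named[name] = expression

    def recall(self, name):
        """Return a stored expression with a single call and count the use."""
        expression = self._named[name]
        self._usage[expression] += 1
        return expression

    def predict(self, spoken_prefix, limit=3):
        """Suggest stored expressions that begin with the dictated prefix."""
        matches = {e for e in self._named.values() if e.startswith(spoken_prefix)}
        # Most frequently recalled first; ties are broken alphabetically.
        ranked = sorted(matches, key=lambda e: (-self._usage[e], e))
        return ranked[:limit]
```

Ranking by recall count means that the expressions a user reuses most often are offered first, which shortens the amount of speech needed for long formulas.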
When we work in a collaborative environment, authentication, authorization, and the security of users' personal information become important aspects of the application [55]. Some systems have privacy and security techniques implemented, but only at a basic level. More work is needed on security, especially when the user of the application is a blind person. Another approach that can be used to authenticate users is their voice. As the user of the application is a blind person, authentication can be completed through spoken words. Some systems have voice input features, but none of them offers voice-based authentication.
Multiple efforts are in progress to standardize educational software applications for visually impaired users, and some standards have been developed for the implementation and realization of these applications [56]. However, the door remains open for further research.

An Interactive Web Co-Authoring Platform for the Visually Impaired
A system that allows its users to work jointly on a single goal/task even if they are located in different places is called collaboration software [11,57]. If the goal is to write a single shared document, this is called co-authoring [58]. For better results in a co-authoring system, the users must know about each update, change, and any present and past activity that has taken place in the shared environment [2]. These activities should be properly synchronized, and coordination between the users is an important factor. An individual should be able to stay up to date about the group activities; this is the main objective of a co-authoring system.
The comparison of factors described in Table 1 supported us while designing the proposed system. We analyzed each factor with care and selected the options that best support both types of users, those who can see and those who cannot. The proposed application is a web-based platform where users who are visually impaired perform collaborative writing activities with sighted individuals. Sighted users give input through a standard keyboard and mouse, whereas speech-based commands are available for persons who are blind. Shortcut keys are also available to both types of users for interacting with the application. The system acknowledges each action performed in the shared environment through speech for its users who are visually impaired, whereas users with normal vision do not need any special function for this. Users with visual impairments listen to all the activities happening in the shared document, whereas popup notifications and sound beeps are embedded for sighted users. The application is accessible from different places at different times, which gives its users the flexibility to work at any time and from any place without any hurdle. The system is developed with standard web technologies, such as Synchronized Multimedia Integration Language (SMIL), XML, and Hypertext Transfer Protocol Secure (HTTPS). Amaya [59], an open source library, is used as the kernel. The proposed system, Web-Based Co-authoring Framework for Blind (WCFB), is a three-layered distributed model as shown in Figure 1.
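The split between speech for visually impaired users and popups with beeps for sighted users can be sketched as a simple routing step. This is a hedged illustration, not the WCFB implementation; the `User` type, the event dictionary keys, and the channel names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    visually_impaired: bool


def describe(event):
    """Turn an event record into one short sentence."""
    return f"{event['author']} {event['action']} {event['target']}"


def route_notification(event, recipients):
    """Return (user name, channel, message) tuples for every recipient.

    Visually impaired users receive the message on a speech channel;
    sighted users receive a popup plus a sound beep.
    """
    message = describe(event)
    deliveries = []
    for user in recipients:
        if user.visually_impaired:
            deliveries.append((user.name, "speech", message))
        else:
            deliveries.append((user.name, "popup", message))
            deliveries.append((user.name, "beep", ""))
    return deliveries
```

Every participant receives the same event, so both groups stay equally aware of the shared document; only the delivery channel differs.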
We integrated the Thot libraries into the system and modified their existing functionalities according to our system's requirements where needed. This helped us to customize the basic library features and adapt them to our specific user needs. We designed an intelligent framework whose major role is to capture all the activities in the shared, and potentially distributed, cloud environment and to suggest corresponding notifications and actions. We designed a localized and distributed inference engine and managed a knowledge base containing a list of rules. Whenever any activity occurs, all the facts about it are collected, the inference engine finds the best-matching rule in the knowledge base, and the corresponding action is formulated. We designed and developed web-based document access and reporting modules, which contain some major libraries, such as the text-to-speech library, session manager, and notification system, and integrated them in a managed way so that they can work concurrently within the application without conflicts. They are responsible for delivering access to the document, presenting well-formatted awareness information to end users, and managing authorization for the different elements of the system. In addition, we integrated the speech recognition engine into the system, which enables users to work with the application through speech commands. We developed a trigger model that contains a list of transcribed phrases and their corresponding actions. We also developed a function that identifies the correct transcribed phrase in the trigger model and proposes the corresponding action. A detailed overview of each module of the system is given below in Figure 1.
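The trigger model described above, a list of transcribed phrases mapped to actions plus a function that picks the intended phrase, can be sketched as follows. The paper does not state how the matching is done; this sketch assumes approximate string matching with Python's standard `difflib`, and the phrases, action names, and `resolve_command` function are hypothetical.

```python
import difflib

# Hypothetical trigger model: transcribed phrase -> action identifier.
TRIGGER_MODEL = {
    "open document": "OPEN_DOCUMENT",
    "read section": "READ_SECTION",
    "save document": "SAVE_DOCUMENT",
    "who is online": "LIST_PARTICIPANTS",
}


def resolve_command(transcript, cutoff=0.6):
    """Map a possibly noisy transcript to an action.

    Returns None when no phrase in the trigger model is similar enough,
    so the application can ask the user to repeat the command.
    """
    # Normalize case and whitespace before comparing.
    normalized = " ".join(transcript.lower().split())
    matches = difflib.get_close_matches(
        normalized, list(TRIGGER_MODEL), n=1, cutoff=cutoff
    )
    return TRIGGER_MODEL[matches[0]] if matches else None
```

A similarity cutoff is one way to tolerate small recognition errors (for example, "safe document" for "save document") while still rejecting unrelated speech.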

Amaya's Thot Library
A document may contain different components, such as headings, sections, pages, images, and tables. If the document is stored in a well-structured format, it is easy to manipulate it and to track all the activities happening in each of its components. Similarly, the participants were assigned different roles to work on different sections of the document, and thus they must know about any modifications and edits that occur in their assigned areas. The logical structure of the documents also helps to log this kind of information.

Symmetry 2019, 11, 238 9 of 23

Amaya's Thot library was used to keep the documents in a well-structured way. It is an open source library (the source code is easily accessible), which has several application program interfaces (APIs) available in its software architecture. They are easy to manage, and their functionality can be enhanced with little modification of their existing functions [59]. Many document manipulation libraries are part of Amaya. The Amaya Thot library keeps documents structured by separating presentation, structure, and content. It uses an abstract tree data structure, whose nodes are the headings, paragraphs, phrases, lists, and so on. The logical structure of the documents is maintained by a set of rules that are part of the document type definitions (DTDs). Structured documents are built by assembling these rules, the attributes of the documents, and the elements of the documents. The presentation schema is composed of the views used to present the documents, where a view is a combination of rule sets for presenting a document to the user. The box tree is an intermediate structure that uses the abstract tree and the presentation schema from Thot to present the views. Amaya communicates with its outer environment through the API provided by the Thot interface. Several hooks are available to send, receive, and manipulate the contents of documents.
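To make the idea of an abstract tree with per-element ownership concrete, the following is a minimal Python sketch. It is not the Thot API: the `Node` class, its fields, and `elements_assigned_to` are hypothetical names introduced only for illustration, and the author names are borrowed from the example in Figure 3.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """One element of the logical structure (hypothetical type, not a Thot class)."""
    kind: str                      # e.g. "document", "section", "heading", "paragraph"
    text: str = ""
    author: Optional[str] = None   # co-author assigned to this element, if any
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child element and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants in document order."""
        yield self
        for child in self.children:
            yield from child.walk()


def elements_assigned_to(root: Node, author: str) -> List[Node]:
    """Collect every element whose assigned author matches `author`."""
    return [node for node in root.walk() if node.author == author]


# Example: two sections, each assigned to a different co-author.
doc = Node("document")
intro = doc.add(Node("section", author="Jack"))
intro.add(Node("heading", "Introduction", author="Jack"))
intro.add(Node("paragraph", "First draft of the introduction.", author="Jack"))
second = doc.add(Node("section", author="Robert"))
second.add(Node("heading", "Section 2", author="Robert"))
```

Because content is held in a tree rather than as flat text, an edit can be attributed to a specific element, and the co-author assigned to that element can be looked up directly.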

Intelligent Interface for the Blind Awareness (IIBA) Framework
IIBA is responsible for enhancing awareness functions for its users. This layer bridges the interface and kernel layers. The major components of IIBA are described below.

The facts and events catcher captures every activity that happens in the co-authoring environment, such as a session beginning or ending, a document being opened, an object being edited, the type of edit, the action performed, and so on. These facts are stored in a dedicated storage space so that they can be delivered to a consumer application later on [5].

The local events manager manages all the events generated by local authors and applications. These events are kept in a time-ordered list implemented as a circular buffer. They are later sent to the other users working in the shared environment to make them aware of the activities happening in the system, which gives them visibility of what is going on at the other side.

The events are given as input to the inference engine (IE), which analyzes them using the rules available in the knowledge base and performs actions accordingly. First-order predicate logic (without function symbols) [60] is used to write the rules, which consist of two parts, the premise and the actions. The rules are well defined and were tested carefully so that they contain no syntactic, semantic, or lexical errors. The following rule triggers when an author opens a document and the visually impaired author must be notified of this action:

…
Author(section_1) = Y
Blind(Y) = "true"
Role(Y) = "writer" /* Y can write section_1 */
Session_Active(Y) = "true"
Author(section_1) = X
Role(X) = "annotator" /* X requested to see the update in section_1 */
Action(X) = "open_document"
Then
SendSpeechAlert(Y) <- "X opened the documents at CURRENT_TIME"
EndRule

To define a rule, we wrote predefined terminals in boldface. Constants and character sequences are enclosed in double quotes (" "). A variable symbol is represented by an identifier. The names of functions and actions are presented in italics. Comments are enclosed by the symbols /* */. The equal operator (=) is used to compare two items, whereas the assignment operator (<-) assigns a value. The semantics of the conditions of the above rule are as follows:

1. Author(section_1) = Y defines the set Y of co-authors of Section 1 of the document.

2. Blind(Y) = "true" checks whether author Y is visually impaired.

3. Role(Y) = "writer" checks that the role of author Y is writer.

4. Session_Active(Y) = "true" checks whether author Y is currently logged in.
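The premise and action of the rule above can be sketched in Python as follows. This is an illustrative approximation under stated assumptions, not the authors' inference engine: the fact store layout, the function names, and the author names `alice` and `bob` are hypothetical, and the premise is hard-coded rather than parsed from the rule language.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Facts captured for one event, keyed by (predicate, argument).
# Author(section_1) is multi-valued, so it is stored as a set of co-authors.
# "alice" and "bob" are hypothetical author names.
facts: Dict[Tuple[str, str], object] = {
    ("Author", "section_1"): {"alice", "bob"},
    ("Blind", "alice"): "true",
    ("Role", "alice"): "writer",
    ("Session_Active", "alice"): "true",
    ("Role", "bob"): "annotator",
    ("Action", "bob"): "open_document",
}


def premise_holds(facts: Dict[Tuple[str, str], object], x: str, y: str) -> bool:
    """Evaluate the premise for one binding of the variables X and Y."""
    authors = facts.get(("Author", "section_1"), set())
    return (
        y in authors
        and facts.get(("Blind", y)) == "true"
        and facts.get(("Role", y)) == "writer"
        and facts.get(("Session_Active", y)) == "true"
        and x in authors
        and facts.get(("Role", x)) == "annotator"
        and facts.get(("Action", x)) == "open_document"
    )


def fire_open_document_rule(
    facts: Dict[Tuple[str, str], object], current_time: str
) -> List[Tuple[str, str]]:
    """Return (recipient, speech alert text) for every binding that satisfies the premise."""
    alerts = []
    authors = sorted(facts.get(("Author", "section_1"), set()))
    for x, y in permutations(authors, 2):
        if premise_holds(facts, x, y):
            alerts.append((y, f"{x} opened the documents at {current_time}"))
    return alerts
```

With the facts shown, only the binding X = bob, Y = alice satisfies every condition, so a single speech alert addressed to alice is produced. Removing any one fact, for example the active session of alice, yields no alert.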
Group Awareness by Speech Alerts: Whenever a member joins, edits, or updates the document, all the others receive a speech alert that notifies them about the update. The editing rule for a figure can be explained as follows: author X is working on the first section of the document, the role of author Y is that of an editor, author X has started the session, and author Y is visually impaired. Author Y is editing the figure, and a …

Interested Users Activity: When a user leaves a comment on a line in the document, the interested users receive a speech alert if they are visually impaired. This alert is accompanied by a popup notification for any interested user who is not blind. The corresponding rule states that author X is adding a comment on line number (LN) and that a popup notification has been sent to Y about it.

The inference engine starts automatically and keeps gathering information whenever an author logs in. The data presenter gathers the information generated by the inference engine, makes it presentable, and transmits it to the document access and reporting interface.
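The time-ordered circular buffer used by the local events manager can be approximated with a bounded double-ended queue. The sketch below is an assumption about one reasonable design, not the implementation used in WCFB; the `Event` fields, the class name, and the capacity are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Event:
    timestamp: float   # seconds on any consistently comparable clock
    author: str
    action: str        # e.g. "open_document", "edit", "comment"


class LocalEventsBuffer:
    """Fixed-capacity buffer; the oldest event is dropped when the buffer is full."""

    def __init__(self, capacity: int) -> None:
        self._events: Deque[Event] = deque(maxlen=capacity)

    def push(self, event: Event) -> None:
        """Record a locally generated event."""
        self._events.append(event)

    def drain(self) -> List[Event]:
        """Return the buffered events ordered by time and empty the buffer."""
        ordered = sorted(self._events, key=lambda e: e.timestamp)
        self._events.clear()
        return ordered


# Example: with capacity 2, the first of three events is overwritten.
buffer = LocalEventsBuffer(capacity=2)
for t, action in [(1.0, "open_document"), (2.0, "edit"), (3.0, "comment")]:
    buffer.push(Event(t, "Jack", action))
pending = buffer.drain()
```

A bounded buffer keeps memory use constant during bursts of activity, at the cost of losing the oldest events if they are not forwarded to the other users in time.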

Web-Based Document Access and Reporting Interface
This is an interface layer whose responsibility is to present an interactive and intelligent interface to users as they work in a shared environment. It uses the HTTPS protocol to exchange information between the client and the server. HTTPS is a secure protocol: information travels over the network in encrypted form, which protects the content.

The session manager keeps track of every login and logout and controls users' access to the content within the application. It acts as a security check that verifies authors and protects against unauthorized access.

The notification system provides updates to users in the form of notifications. For users with visual impairments, it delivers notifications as sounds and beeps along with speech alerts, whereas for sighted users it uses popups and alert boxes. For example, an opening door sound along with the user's name communicates that an author has joined the session. A short beep along with the author's name indicates that someone has added comments to the document. Similarly, a long beep with the author's name indicates that a new role has been assigned to an author. For users with normal sight, all these notifications are sent through popup alerts.

Text-to-speech libraries are embedded into the system to read the content displayed on the screen. They are used specifically by blind users and are activated on demand. To make them compatible with the application, hidden meta information was added to each element and icon on the screen. This meta content is not visible to the user, but it allows the text-to-speech synthesizer to speak the description of an element or icon when a blind user navigates over the screen.
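The mapping from events to notification channels described above can be sketched as a small dispatch function. The event type names, the dictionary layout of the returned notification, and the popup wording are assumptions made for illustration; only the three sound cues come from the text.

```python
from typing import Dict

# Sound cue per event type, as described in the text.
# The event type keys themselves are hypothetical identifiers.
SOUND_CUES: Dict[str, str] = {
    "joined_session": "opening door sound",
    "comment_added": "short beep",
    "role_assigned": "long beep",
}


def build_notification(event_type: str, author: str, recipient_is_blind: bool) -> Dict[str, str]:
    """Choose the delivery channel based on the recipient's sight."""
    if event_type not in SOUND_CUES:
        raise ValueError(f"unknown event type: {event_type}")
    if recipient_is_blind:
        # Audio channel: a sound cue followed by the spoken author name.
        return {"channel": "audio", "cue": SOUND_CUES[event_type], "speech": author}
    # Visual channel: a popup alert (the wording here is an assumption).
    return {"channel": "popup", "text": f"{author}: {event_type}"}
```

For instance, `build_notification("comment_added", "Jack", True)` selects the audio channel with a short beep followed by the author's name, while the same event for a sighted recipient becomes a popup.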

Speech Recognition Engine
The speech recognition engine captures the speech spoken by a user and converts it from analog sound waves into digital form. The speech processor breaks the speech down into small sound units called phonemes. The acoustic model compares the generated phonemes with the standard pronunciation of the words available in the system's dictionary. The language model builds word sequences using grammar and knowledge of the statistical frequencies of words [61]. On the basis of these comparisons, the speech recognition engine finds the most likely word sequence based on the probabilities and returns the best-matched phrase. Figure 2 presents the architecture of the speech recognition engine integrated into the WCFB framework.

The engine used for speech recognition in WCFB is Annyang [62]. It is an open source JavaScript library that enables web applications to be controlled by voice commands. It converts the speech into a transcribed phrase. The triggers model contains a list of action items mapped to a list of transcribed phrases. The trigger identifier compares the generated transcribed phrase with the action directory stored in the triggers model and performs the corresponding action. The common functions that can be performed through speech input are opening and closing the document and applying textual formatting, such as font size, style, and color. The user can ask for the status of other online members. The chat messenger can be opened by a speech command and then used to communicate through speech. The users can also ask for updates made by other members.
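The relationship between the triggers model and the trigger identifier can be sketched as a lookup table from transcribed phrases to actions. The sketch below is written in Python for illustration and does not use or reproduce the Annyang API; the phrases, action names, and the normalization step are assumptions.

```python
from typing import Callable, Dict, List, Optional

# Record of performed actions, standing in for real application behaviour.
actions_log: List[str] = []


def make_action(name: str) -> Callable[[], None]:
    """Build a placeholder action that only records that it ran."""
    def action() -> None:
        actions_log.append(name)
    return action


# Triggers model: transcribed phrase -> action (hypothetical phrases).
TRIGGERS: Dict[str, Callable[[], None]] = {
    "open document": make_action("open_document"),
    "close document": make_action("close_document"),
    "who is online": make_action("report_online_members"),
    "open chat": make_action("open_chat"),
}


def normalize(phrase: str) -> str:
    """Lower-case the phrase and collapse repeated whitespace."""
    return " ".join(phrase.lower().split())


def identify_trigger(transcribed: str) -> Optional[Callable[[], None]]:
    """Trigger identifier: find the action mapped to the transcribed phrase."""
    return TRIGGERS.get(normalize(transcribed))


def handle(transcribed: str) -> bool:
    """Run the matching action; return False when no trigger matches."""
    action = identify_trigger(transcribed)
    if action is None:
        return False
    action()
    return True
```

Exact matching after normalization is the simplest possible identifier. A deployed system would likely need to tolerate recognition errors, for example by matching against several phrasings per action.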

Evaluation of WCFB and Usability Testing
In WCFB, users require login credentials to interact with the platform. The users are managed by assigning different roles to them, and a role may change over time: a member who acts as a writer may later have the responsibility of a reviewer. Audio alerts are available for individuals who are blind, whereas sighted users get visual notifications and popup alerts. Visually impaired users can learn the list of active participants by listening and can communicate with them within the application. Users can communicate with each other via a chat messenger and thus hold discussions when required. Because it is a web-based application, users can join at any time and may work remotely. In Figure 3, three users are active in the system and two are working on the same document. Jack produces the "Introduction" section of the document and Robert works on Section 2. In addition, we added a function to give voice commands to the system. For this, a user must have a microphone connected to and configured on his or her machine.

To evaluate the framework, experiments were conducted in which sighted and blind candidates participated and were asked to produce an article cooperatively. Two parameters were selected: the usability of the UI components and the acceptance level of the application by its users.

Participants
To conduct the experiments in different environments, we visited several special educational institutes and engaged students who are visually impaired and students who are sighted. The selection criteria were that candidates be visually impaired or sighted, have good experience with using computer applications, and have previously completed collaborative educational activities. We were able to recruit 92 candidates, all of whom were used to working with a computer on a daily basis to perform educational activities. Of the 92, 61 were visually impaired and 31 were sighted. The candidates were divided into 11 groups, and each group contained two teams. Each team contained both sighted and visually impaired candidates, so that the blind users experienced working with sighted users and vice versa. This also served the objective of the proposed system, which was to build a collaborative environment for sighted and visually impaired candidates. Table 2 presents the details of each group and team formed for the experiments. The participants were aged between 17 and 31 years and were of both genders. They were college and university level students.

                        Sighted   Blind   Sighted   Blind
Group-1 Participants    3         2       2         2

… both the application's basic features and functions and gave them an understanding of how they would interact with the application. The tutorial covered an introduction to the application's UI components available for both sets of users and familiarized them with the help module. They were given a demonstration of how they could interact with the application and operate it. They were also told about the shortcut keys available for both sets of users. During the experiment, no assistance was provided to any participant, and they were asked to use the help module in case of any difficulty or problem. Feedback was taken from all participants and was used to further enhance the usability and effectiveness of the WCFB.

Activity Goals
Eleven experimental sessions (one with each group) were conducted to evaluate the system. Two activities were performed in each session. The candidates were asked to write articles in a collaborative environment on two different topics: (1) the role of the Pakistan Super League (PSL) in bringing cricket back to the country, and (2) the future of politics in the country. The two topics were selected because they are very common discussion points for the age group that participated in the experiment. Moreover, at the time of the experiment, both topics were heavily discussed in the daily news and on social media. For activity 1, topic 1 was selected and team 1 was asked to use an existing system to produce the piece, i.e., Google Docs UI for the Blind [9], whereas team 2 was provided with WCFB. For activity 2, team 1 was given WCFB, whereas team 2 was asked to use Google Docs UI for the Blind to complete a write-up on topic 2. In this way, both teams used both applications one after the other and were able to give feedback about them. Each team was given an hour to complete one topic.

Post-Experiment Questionnaire
After the completion of the experiments, a questionnaire was given to each user to collect feedback. The responses were given on a rating scale from 1 to 10, where 1 was the lowest score and 10 was the highest.

The questions were grouped into six categories: interaction, collaboration, coordination, awareness, communication, and recommendation. Each category contained two or three questions. Table 3 shows the list of questions alongside the results obtained from the participants. The numbers represent the averages of the scores given by the authors for each question.

The results obtained from the questionnaire were qualitatively evaluated. The approach helped us analyze our solution to see whether it is useful for collaborative writing.

Figure 4 presents the average scores obtained for the attributes used to group the questions. The results show that the participants rated the interaction and collaboration factors almost the same, but for coordination, awareness, and recommendation, they preferred WCFB over the Google Doc UI.

Wilcoxon Signed Rank Results
The Wilcoxon signed-rank test [64] is a non-parametric statistical hypothesis test used to compare two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ, i.e., it is a paired difference test [65]. It may be used as an alternative to the paired Student's t-test (also called the t-test for matched pairs or the t-test for dependent samples) when the population cannot be assumed to be normally distributed [66]. Descriptive statistics obtained from the post-experiment questionnaire are presented in Table 6. They are not needed to calculate the Z-value of the Wilcoxon signed-rank test, but they help with the interpretation of the data. Table 7 presents all the rank statistics required to calculate the Wilcoxon signed ranks and the Z and p values. We can see from the table's legend that, for interaction, 28 participants have a negative rank, i.e., they gave a higher score to Google Doc than to WCFB; 48 participants have a positive rank, i.e., they scored WCFB higher than Google Doc; and 16 participants gave equal scores to both interfaces. The ranks obtained for the other categories are also available in Table 7.
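The quantities discussed above (negative ranks, positive ranks, ties, Z, and p) can be computed as in the following Python sketch. It uses the standard normal approximation with a tie correction and derives Z from the sum of positive ranks; statistical packages may instead report Z from the smaller rank sum, which changes only its sign. The function names and the small data set are hypothetical and are not the study's data.

```python
import math
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Rank values from 1, giving tied values their average rank; also return tie group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes: List[int] = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def wilcoxon_signed_rank(baseline: Sequence[float], proposed: Sequence[float]) -> Dict[str, float]:
    """Paired test; a positive rank means the `proposed` score is higher than `baseline`."""
    if len(baseline) != len(proposed):
        raise ValueError("samples must be paired")
    diffs = [p - b for b, p in zip(baseline, proposed)]
    nonzero = [d for d in diffs if d != 0]  # equal scores are counted as ties and excluded
    if not nonzero:
        raise ValueError("all paired differences are zero")
    ranks, tie_sizes = average_ranks([abs(d) for d in nonzero])
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    n = len(nonzero)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - sum(t ** 3 - t for t in tie_sizes) / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return {
        "negative": sum(1 for d in nonzero if d < 0),
        "positive": sum(1 for d in nonzero if d > 0),
        "ties": len(diffs) - n,
        "w_plus": w_plus,
        "w_minus": w_minus,
        "z": z,
        "p": p_two_sided,
    }


# Hypothetical ratings on the 1-10 scale from eight participants.
google_doc_scores = [6, 7, 7, 7, 5, 6, 6, 5]
wcfb_scores = [8, 7, 9, 6, 8, 7, 9, 5]
result = wilcoxon_signed_rank(google_doc_scores, wcfb_scores)
```

For a sample as small as this one, the normal approximation is rough and an exact test would be preferable; with 92 participants, as in the study, the approximation is commonly used.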
Table 8 presents the Z and p values obtained after performing the Wilcoxon signed-rank test on the questionnaire's scores. The results obtained from the Wilcoxon signed-rank test are almost the same as those obtained from the t-test analysis. In both tests, only the p-value for collaboration is greater than 0.05, whereas all the other factors have values less than 0.05. Both tests therefore support the conclusion that WCFB offers its users a more effective interface than Google Doc.

During the feedback session with the participants, users who were blind expressed their appreciation for the input method whereby they could use their voices to insert data. In WCFB, the collaborators had an option to review the document sections accessed by other co-authors and to suggest modifications. The object or section on which the communication happens is called the "work focus", and this activity is named context-based focused … The user who wants to review any section of the document takes the initiative to start communication. The other co-author of that section is notified about the potential communication and has the option to accept or reject it. Once the communication is established, it is highlighted in both users' environments. Whenever a co-author modifies the section, the concerned user is notified about that activity. The user with visual impairments gets speech-based notifications, whereas the sighted user gets visual notifications.

Similarly, a shared document contains interlinked components, for example, tables, figures, and formulas and their respective legends. These interlinked components may be composed by different authors. As an example, one author may be assigned to compose a table while another has the responsibility of writing its legend. We call this property "work proximity". To maintain consistency, when a user completes a modification to an interlinked component, the concerned author is notified about the modification so that he or she can update the assigned section or component accordingly. Neither functionality is available in the Google Doc UI.

WCFB also includes an embedded chat messenger, which strengthens communication between the authors. Google Doc, on the other hand, only allows users to add comments or send an email to communicate with each other, and the co-authors must go through the whole document to review the comments. The recommendation score of the system is high when compared with the Google Doc UI application. It improves the confidence level of the users according to the kind of participants.
The users also proposed some suggestions and improvements during the feedback. It was observed that, at times, there is a burst of audio alerts because multiple users are working on a shared document at the same time. The system includes a feature that can log all activities so that the user can listen to them later. The users liked the environment because it required minimal external assistance, and the assignment of different roles enhanced their exposure. Privacy is another factor that participants asked for. The results obtained from the experiments indicate that the application is acceptable in the community for which it is intended. The feedback suggests that the system develops the interest of its users and enhances their writing skills. The overall outcome seems very promising, and the participants' suggestions encourage us to make the application more workable and to add more features and components to it. We conclude that a web co-authoring platform for visually impaired users can promote enthusiasm and interest for group activities.

Conclusions and Future Work
The paper summarizes recent developments in document building tools and applications specifically designed for users with visual impairments, for whom very few special functions are available, which restricts them from fully utilizing all the features of an application. The proposed platform is designed particularly for visually impaired individuals. It enhances group awareness among blind collaborators by informing them about all actions and activities happening in the shared environment, and it allows them to interact with the application through voice and speech commands.

The application's performance was evaluated through multiple experiments, and a questionnaire was distributed to the participants to gauge their experience and the application's effectiveness. The responses show that the results are very promising and that further research would be fruitful. In the future, we plan to extend our interface to handle more complex objects, such as multimedia content, including figures, diagrams, and images. Our next goal is to implement this framework in a commercial context and to evaluate its usability with other differently-abled groups.

Figure 1. Core architecture of the Web-based Co-authoring Framework for the Blind (WCFB).

Figure 2. The architecture of the speech recognition engine.

Figure 3. The interface of the WCFB application.

Figure 4. Average scores of attributes based on the questionnaire.

Table 1. Comparison of assistive technologies and applications for the blind for document writing.

Table 2. Participants' details of the WCFB testing and experiments.

Table 3. Results of the questionnaire.

Table 5. Summary of t-test results against the questionnaire's categories.

Table 6. Descriptive statistics of the Wilcoxon signed-rank against the questionnaire's categories.

Table 7. Ranks obtained against the questionnaire's categories.

Table 8. Test statistics for the Wilcoxon signed ranks test.