ODK Scan: Digitizing Data Collection and Impacting Data Management Processes in the Tuberculosis Control Program of Pakistan

Syed Mustafa Ali 1,*, Rachel Powers 2,*, Jeffrey Beorse 3,*, Farah Naureen 1,*, Arif Noor 1, Naveed Anjum 1,*, Muhammad Ishaq 1, Javariya Aamir 1 and Richard Anderson 3

1 Mercy Corps, Pak Palace, Murree Road, Rawal Chowk, Islamabad 44000, Pakistan; anoor@mercycorps.org (A.N.); muishaq@mercycorps.org (M.I.); jaamir@mercycorps.org (J.A.)
2 Information Systems, Village Reach, Seattle, WA 98102, USA
3 Computer Science and Engineering, University of Washington, Seattle, WA 98105, USA; anderson@cs.washington.edu
* Correspondence: symustafa@mercycorps.org (S.M.A.); rachel.powers@villagereach.org (R.P.); jbeorse@cs.washington.edu (J.B.); fnaureen@mercycorps.org (F.N.); nbuzdar@mercycorps.org (N.A.); Tel.: +92-321-413-7073 (S.M.A.)

Abstract: The present grievous situation of tuberculosis can be improved by efficient case management and timely follow-up evaluations. With the advent of digital technology, this can be achieved by quick summarization of patient-centric data. The aim of our study was to assess the effectiveness of the ODK Scan paper-to-digital system during a testing period of three months. A sequential, explanatory mixed-method research approach was employed to elucidate technology use. Training, smartphones, the application and 3G-enabled SIMs were provided to four field workers. At the beginning, baseline measures of the data management aspects were recorded; these were compared with endline measures to see the impact of ODK Scan. Additionally, at the end, users’ feedback was collected regarding app usability, user interface design and workflow changes. A total of 122 patients’ records were retrieved from the server and analysed for quality. It was found that ODK Scan recognized 99.2% of multiple-choice bubble responses and 79.4% of numerical digit responses correctly. However, the overall quality of the digital data was lower than that of manually entered data. Using ODK Scan, a significant time reduction was observed in data aggregation and data transfer activities; however, data verification and form filling activities took more time. Interviews revealed that field workers saw value in using ODK Scan, but they were more concerned about its time-consuming aspects. Therefore, it is concluded that minimal disturbance in the existing workflow, continuous feedback and value additions are the important considerations for the implementing organization to ensure technology adoption and workflow improvements.


Introduction
Tuberculosis (TB) is still a grave global public health concern. It causes ill health in millions of people and numerous deaths each year worldwide [1]. Despite the availability of effective treatment and strategies to improve access to treatment regimens, multidrug-resistant and extensively drug-resistant TB cases are emerging due to the lack of adherence to treatment protocols. This non-adherence is mainly because of the long treatment course (6 to 8 months) and the unpleasant side effects of the drug treatment [2].
Pakistan is among the TB high-burden countries, with the emergence of a large number of new TB cases (approximately 420,000 every year) and prevalence of multidrug-resistant TB. Pakistan ranks fifth for drug-susceptible TB and fourth for multidrug-resistant TB [3].
The burden of disease caused by TB and the effectiveness of different programmatic interventions are better understood through an interpretation of the available epidemiological data [4]. The program management function, Monitoring and Evaluation (M&E), makes the best use of the available resources (including data) to guide and support programmatic decision-making. The M&E system plans and conducts data collection and analysis to inform program managers and keep performance measures on track [5].
Unfortunately, this immense resource is either underutilized or not used at all [4].
Health information systems used in disease-specific health programs are designed to collect patient-centric data and generate actionable information to guide decision-making. Likewise, TB control programs mostly maintain and manage paper-based patient recording and reporting systems [6].
Fidelity to treatment protocols is ensured through case management (DOTS strategy) and follow-up evaluations (sputum specimens for microscopic examination or culture at specific points of the treatment period) [7]. Generally, in TB control programs, paper-based patient records are used to prepare periodic reports of patient enrolment, retention and treatment progress. These data summaries are also sent to different management levels for aggregation, analysis and reporting [6]. However, paper-based recording and reporting systems suffer from some inherent challenges, which include checking data quality, data entry and aggregation, assigning treatment outcomes correctly, filling gaps in data, analysing data and generating actionable information. These data management activities are inefficient, work-intensive and time-consuming [8]. The advent of digital technology, however, has positively impacted organizational workflow [9]. Electronic Health Information Systems (eHIS) have generated better results on the three fronts of TB control programs: patient care, resource management and disease surveillance [8].
Increasingly, mobile health (mHealth) interventions are attracting the attention of public health professionals for their ability to improve the public health situation [10]. mHealth refers to the application of portable hand-held devices and communication capabilities in health care programs [11]. During the past decade, mHealth has been continuously evolving within the eHealth domain, where it has offered great potential to overcome many traditional barriers in delivering care and to reduce distance, time and cost [12]. Fortunately, most developing countries have already taken mHealth initiatives to unleash the power and reach of smartphones and consider them an important resource for front-line workers [10].
The most commonly reported uses of mHealth tools are data collection, electronic decision support, provider education and training, and communication [10].
The use of mHealth for data collection, in particular, is more likely to improve public health in low- and middle-income countries [13]. Noteworthy, however, are the unpredictable implementation challenges and complicated situational factors that affect the introduction of mHealth applications in primary healthcare settings of the developing world [14,15].
In the Public-Private Mix model of the TB control program of Pakistan, we tested the ODK Scan paper-to-digital system, chosen from among freely available, open-source mHealth applications, for scanning paper forms and digitizing the data. Digitization of data collection was tried with the objective of reducing the time and resources spent on data entry work. Additionally, ODK Scan was chosen because it allows supervisors and managers to assess data quality by viewing image snippets. The aims of this pilot study were to: (1) assess data quality aspects at both the app and user levels; (2) assess the quantitative impact of ODK Scan on the data management processes; and (3) collect feedback on the application users' experience regarding app usability, user interface design and workflow changes.

Description of Technology
Open Data Kit (ODK) Scan is an Android-based mobile application and a part of the ODK 2.0 suite of tools [9,16]. It uses the device's built-in camera to capture an image of a scan-compatible paper form, which it feeds into an image processing algorithm. This algorithm segments the form into small image snippets that each contain a single data field from the original form [17]. The algorithm then converts hand-filled bubbles, structured number boxes, and QR codes into an editable record in the ODK 2.0 database. ODK Scan does not process free handwritten text; instead, it creates an empty text field in the record that can be filled in manually. Each digitized field includes a copy of its associated image snippet from the original form, which can be used to verify digitized fields or to manually transcribe handwritten text. Once the record is in the ODK 2.0 database, users can use the other tools, such as ODK Survey and ODK Tables, to view, edit, verify, aggregate, and sync the data or to create customized services, interactions and reports with the data [16].

ODK Scan Workflow
Before ODK Scan can be used, a scan-compatible form must be designed using the form designer. Paper copies of the form image are printed out and the form's definition files are copied to the Android-based device. After the form is filled in, the user takes a picture using the built-in camera of the smartphone. ODK Scan detects the data fields and auto-digitizes them. The user can then launch ODK Survey to view and verify the data fields, and edit them if necessary. Once the user is sure about the data quality, the patient record is synced with an aggregate server. If no internet connection is available, the user can delay synchronization and continue collecting data until connectivity is restored. At any time, the user can view all synced records on the smartphone and run customized reports to view actionable information. For this trial we created a report that generates information about patients whose follow-up appointments are either overdue or due in the coming days. Aside from the syncing step, all steps of the ODK Scan workflow can be undertaken without an internet connection.
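The follow-up report mentioned above can be illustrated with a minimal Python sketch. This is not the report built for the trial: the record fields (`patient_id`, `next_follow_up`) and the seven-day look-ahead window are assumptions introduced here purely for illustration, since the actual TB03 column names and report logic are not given in the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PatientRecord:
    # Hypothetical fields; the real TB03 column names are not stated in the paper.
    patient_id: str
    next_follow_up: date


def follow_up_report(records, today, window_days=7):
    """Return (overdue, due_soon) lists, each sorted by follow-up date.

    overdue:  follow-up date is before `today`
    due_soon: follow-up date falls within [today, today + window_days]
    """
    horizon = today + timedelta(days=window_days)
    by_date = lambda r: r.next_follow_up
    overdue = sorted((r for r in records if r.next_follow_up < today), key=by_date)
    due_soon = sorted(
        (r for r in records if today <= r.next_follow_up <= horizon), key=by_date
    )
    return overdue, due_soon


if __name__ == "__main__":
    sample = [
        PatientRecord("A", date(2016, 5, 1)),
        PatientRecord("B", date(2016, 5, 12)),
        PatientRecord("C", date(2016, 6, 30)),
    ]
    overdue, due_soon = follow_up_report(sample, today=date(2016, 5, 10))
    print("Overdue:", [r.patient_id for r in overdue])
    print("Due soon:", [r.patient_id for r in due_soon])
```

Because the function only reads records already held in memory, a report of this kind needs no network access, which is consistent with the offline use described above.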

Intervention
Among the nine recording and reporting formats used in the TB control program of Pakistan, one recording format (TB03) was selected (Figure 1). Its scan-compatible form was designed using the ODK Scan Form Designer to create machine-readable fields for the app to recognize (Figure 2). The scan-compatible form consists of three types of machine-readable data fields (structured number boxes, bubbles and QR codes) and one type of non-machine-readable data field (handwritten text fields) (Figure 3). After the form design was prepared, it was pre-tested, necessary modifications to the design were made, and the forms were printed and bound into a register book. The intervention started at the beginning of May 2016 and lasted for three months in district-level field settings. Informed consent was sought from four field workers belonging to two enrolled districts (Chakwal and Hafizabad) of the TB control program of Pakistan. The field workers (District Field Supervisors) were trained in using the scan-compatible TB03 form and the app workflow and were given smartphones (Samsung Galaxy J7 and Huawei Y6) and scan-compatible TB03 registers. Additionally, we provided them with 3G-enabled mobile SIMs for internet connectivity and a monthly top-up for sharing data via the internet. We imposed no restriction on accessing or downloading any other mobile applications. Field workers were allowed to use the smartphones freely, including for making calls, text messaging and internet browsing.

Design
A sequential, explanatory mixed-method research approach was employed to elucidate the technology use for an effective implementation of the ODK Scan paper-to-digital system. This was done by applying the following three methods:
(1) Assessment of the data quality was done at both the app and user levels. At the application level, the accuracy of ODK Scan's ability to recognize bubbles and number boxes was measured by comparing the raw Scan values (the digitized values which Scan's algorithm produced) with the data appearing in the image snippets of the corresponding data fields. At the user level, the workflow allowed users to improve the quality of the data by validating and manually editing data fields. User-level data quality was assessed by comparing the manually and digitally entered records with hard copies of the TB03 register.
(2) Time measures regarding form filling, data transfer, aggregation and verification were recorded in the routine data management process using direct observation. They were then compared with the endline measures for the same data management processes using the ODK Scan system.
(3) Face-to-face interviews, using a semi-structured questionnaire, were carried out with each of the field workers. Their app usability and workflow modification experiences were explored to assess the effectiveness of the ODK Scan paper-to-digital system from a usability perspective.

App-level quality measure:
During an intervention period of three months, 122 patients' paper-based records were scanned, digitized and checked for quality.
All of this data was synced and available on the cloud-based ODK Aggregate server for access and remote analysis. The first task was to assess ODK Scan's ability to correctly parse the user-filled machine-readable data fields: the structured number boxes and the bubbles. Accuracy rates for these fields were calculated by manually reviewing the 122 records, consisting of 10,600 digits and 8,000 bubbles, and comparing Scan's raw predicted value against the ground truth value of the image snippet. The recognition accuracy for the bubbles was found to be much higher than that of the number boxes. Including blank fields (fields that users did not fill in), the Scan app classified 99.2% of bubbles and 79.4% of digits correctly (Table 1). The misclassified number fields were further investigated by mapping accuracy measures of the number boxes on the TB form to specific fields on the physical paper. It was found that fields located on the right side of the page, the side farthest from the register book's binding, suffered from lower accuracy than the number boxes located closer to the center or left-hand side (Figure 4).

User-level quality measure:
In the existing manual data entry process, responsibility for verifying data is shared between a field worker and a data management officer (DMO). The DMO is responsible for data entry and quality at the next level of management. When synced records were checked and compared with the quality of manually entered records, it was found that the manual data entry process achieved a data accuracy rate of 94%, while the ODK Scan process only reached 86%. The accuracy of the data in the manual data entry process was calculated by comparing the scanned version of the hard copies of the TB03 forms to the Excel spreadsheets they were transcribed into. Similarly, in the ODK Scan process, data accuracy was calculated by comparing an exported Excel sheet from the ODK Aggregate server to the scanned version of the hard copies of the TB03 forms.
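The accuracy comparisons described above amount to counting, per group of fields, how often a digitized value matches the value read from the paper form or image snippet. The following Python sketch shows one way such rates could be computed; the tuple layout and the sample values are assumptions made for illustration and are not the study's data or tooling.

```python
from collections import defaultdict


def accuracy_by_group(samples):
    """Compute the fraction of matching values per group.

    `samples` is an iterable of (group, predicted, truth) tuples, where
    `group` might be a field type ("bubble", "digit") or a page position,
    `predicted` is the digitized value and `truth` is the value read from
    the image snippet. A blank field is represented by an empty string on
    both sides, so correctly detected blanks count as correct.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, truth in samples:
        total[group] += 1
        if predicted == truth:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


if __name__ == "__main__":
    # Made-up illustrative values, not the study's records.
    sample = [
        ("bubble", "A", "A"),
        ("bubble", "", ""),   # blank field recognized as blank
        ("digit", "7", "1"),  # misread digit
        ("digit", "3", "3"),
    ]
    for group, rate in accuracy_by_group(sample).items():
        print(f"{group}: {rate:.1%}")
```

Using the page position of each number box as the group key, instead of the field type, would give a per-location breakdown of the kind summarized in Figure 4.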

Impacting Data Management Processes
In the TB control program, the routine data management process begins with the collection of data from all enrolled General Practitioners (GPs). Once the consolidated paper-based report is prepared, it is transferred to the next level of management (Sub-recipient office), where it passes through multiple desk reviews and is finally entered into an Excel spreadsheet. The digitization of paper-based records takes a significant amount of time before the data are shared with the next level of management (Principal Recipient office) for finalization and progress review meetings (Figure 5). Time spent on form filling, data collection, transfer, aggregation and verification was measured by direct observation.
At baseline, these measures were recorded by observing the routine data management process. At the endline, the same measures were recorded by observing the ODK Scan workflow. In the routine data management process, aggregation of records required manual data entry into a spreadsheet, and digitization of one record took 2 minutes. This digitization step was reduced to 10 seconds in the ODK Scan workflow. Moreover, the data verification process was greatly altered: in the ODK Scan process the field worker took sole responsibility for viewing and verifying data, while in the routine data management process this responsibility was shared between the field worker and the DMO. In terms of time, verification of paper-based records took the DMO 20 seconds per record, while in the ODK Scan process it took the field worker 3 minutes per record. Verification in the ODK Scan workflow included checking the data for completeness, consistency, and bubble and number recognition accuracy. The ODK Scan process provided a significant time reduction for the data transfer and aggregation steps; however, the time required for verification and form filling increased compared to the corresponding steps of the routine data management process (Table 2, Figure 6).
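As a worked illustration of the per-record figures quoted above, the short Python sketch below tabulates the change for the two steps whose times are stated in the text (aggregation and verification). Form filling and data transfer are omitted because their per-record times are not given here, so the totals are partial and should not be read as the overall effect of the intervention. Note also that the work shifts between roles: verification moves from the DMO to the field worker.

```python
# Per-record times in seconds, taken from the text above.
ROUTINE = {"aggregation": 120, "verification": 20}
ODK_SCAN = {"aggregation": 10, "verification": 180}


def change_per_step(before, after):
    """Seconds added (positive) or saved (negative) per record for each step."""
    return {step: after[step] - before[step] for step in before}


def total_seconds(times, n_records=1):
    """Total time for the listed steps across n_records records."""
    return sum(times.values()) * n_records


if __name__ == "__main__":
    print(change_per_step(ROUTINE, ODK_SCAN))
    # Scaled to the 122 records collected during the pilot.
    print(total_seconds(ROUTINE, 122), total_seconds(ODK_SCAN, 122))
```

For these two steps alone, the 110 seconds saved on aggregation are outweighed by the 160 seconds added to verification, which is in line with the field workers' concern about the time-consuming aspects of the workflow.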

Feedback on App Usability and Workflow Modifications
The field workers quickly learned how to use the smartphones and swiftly became comfortable with the application's workflow: most of them were already using smartphones in their personal lives. Nevertheless, the implementation of ODK Scan was guided by the principles of minimal disturbance to the current workflow and generating value for users, to improve the chances of adoption. Face-to-face interviews were conducted independently with each of the field workers at the end of the intervention period. These interviews were supported by a semi-structured questionnaire, which included hints to guide the interviewer and ensure the completeness of the feedback. The interviews were conducted by a research team member in the language of the enrolled field workers (Urdu).
All interviews were digitally recorded, transcribed in Urdu and then translated into English. The interview recordings and the corresponding translations were then checked by another team member for consistency and completeness. The data are organized by the steps of the app workflow for ease of analysis and presentation.

Form Filling:
The ODK Scan process begins with filling in the ODK Scan-compatible form. This includes marking the correct bubbles for the bubble fields, linking guide dots for the structured number fields and writing in the text fields. Field test results showed that this step took more time than the corresponding step of the routine data management process.
In terms of learning to fill in the new form, all of the field workers adapted to the form well and found it easy to locate the desired field. Nonetheless, almost all field workers were unhappy about the time-consuming aspect of form filling for the ODK Scan process (Table 3), while at the same time they acknowledged the benefit of using number boxes and bubbles. One District Field Supervisor (DFS) said,

Scanning form:
Scanning a Scan-compatible form requires the user to take a picture of the paper form using the smartphone's built-in camera. This lets the application process the form in the background and detect the machine-readable, hand-marked data fields. Once the processing of the image is finished, a notification appears in the top toolbar. Field workers understood the importance of taking a good quality picture, and one described it as, "A good picture is one which is not blurred, and properly focuses the form and covers all the edges of forms. No corner should be cut." Field workers considered this step important and easy to learn. Knowing this step to be important for accurate classification results, they preferred to take the image in a relatively calm place. One field worker said, "Because this is delicate task and to make its editing error-free, I conduct this exercise at home."

Verifying Data:
When an image is processed, the record is stored in the ODK 2.0 database. The user then launches the ODK Survey app by pressing the "Transcribe" button to verify the data. This allows the field worker to move field by field, checking for any necessary edits. Most of the field workers considered the data verification process time-consuming, which may have contributed to the relatively lower data quality in comparison to the manual data management process. One field worker explained this process as,

Regular sputum smear microscopy is an important medical aspect in ensuring adherence to the treatment protocol and declaring the treatment outcome of a TB patient. It is extremely difficult to review paper-based records and prepare a list of patients whose sputum smear microscopy is due in the next month. This time-consuming exercise is iterative and is considered a fundamental responsibility of the field workers for tracking the treatment progress of the registered patients. With the ODK Scan process, generating this actionable information became easier by running a customized follow-up visit report on the smartphone in offline mode (Figure 7). This enabled field workers to keep a constant check on the treatment progress of the patients registered in their areas of control. Moreover, all field workers talked very positively about the benefits of this report. One of them said,

Suggestions for Improvement
After analyzing the content of the interviews, it was evident that the field workers completely understood the app workflow and appreciated the benefits of using ODK Scan in improving their work efficiency and performance. Moreover, field workers were eager to continue using the application. In addition to simple aggregation reports, field workers wanted to enhance the features of the application. For instance, they suggested including a bookmark feature to tag a patient record if the patient cannot be reached by telephone at the first attempt. As the number of records available in a table on the user's device grows, the most popular request was a patient look-up feature. Because TB treatment involves multiple visits by the same patient for medical checkups, it is becoming increasingly difficult for field workers to retrieve the relevant patient's record from the default ODK Tables interface when the patient returns. A look-up feature would help field workers avoid re-scanning and re-verifying the form, and the follow-up service would be updated in the existing patient's record.
Currently, synchronization moves data in both directions: new records on the device are sent to the server and, at the same time, new records available on the server, synced by others, are downloaded onto the user's device. Deleting a record, either deliberately or mistakenly, therefore poses a risk of data loss. Defining user permission attributes to restrict data access to only the appropriate individuals would improve the Scan workflow. Lastly, one of the field workers suggested that, instead of scanning, a direct-to-digital solution seems more feasible.

Discussion
This paper compares paper-based and technology-assisted data management processes and tries to extrapolate the value of mHealth solutions for capturing remote health data. In particular, the present study assesses the usability and technological aspects of ODK Scan, supported by the perspective of the field workers in relation to their work settings.
It is also noted that, in the developing country context, a prerequisite to the success of mHealth interventions is that developers and implementers consider the field workers' expectations, needs and work settings [18,19]. Moreover, a gap between work practices and the use of new technologies is attributed to the poor implementation design of technology-based interventions [20]. In the Public-Private Mix model of the TB control program of Pakistan, despite the availability of direct-to-digital data collection solutions, the ODK Scan paper-to-digital system was chosen because of current healthcare system requirements. The healthcare system largely relies on paper-based records for recording, reporting, verification, validation and other purposes. This setting informed the selection of the ODK Scan paper-to-digital system. In other mHealth interventions where Android-based smartphones are used primarily for data collection, the ability of smartphones to transfer data in real time from remote areas to centrally located offices is achieved by exploiting internet connectivity [15,21].
Similarly, in this field test, internet connectivity benefited field workers in transferring the data, and the labor involved in sharing the data in the form of hard copies was avoided.
However, the ability of smartphones to transfer data using 3G internet connectivity instead of WiFi would be highly beneficial. Beyond the digitization of patients' records, technology also supports report generation, case management and patient follow-up [22].
The data quality achieved through the ODK Scan process is not impressive when compared with that of its manual counterpart. This drop in quality can be explained by the change in roles, as field workers were previously not solely responsible for ensuring data quality. In terms of time, the routine process of data review takes much longer than the new process of data review and verification. Therefore, an alternative approach to improving data quality would be to carry out supportive supervision, so that an acceptable level of quality in the final data set could be achieved [23]. Additionally, effective data verification approaches require determination of the data quality characteristics and the application of quality standards [24]. The benefits of mHealth technologies improve with increased validity of the digital data [25].
In the ODK Scan pilot test, unrestricted mobile and internet usage was authorized in good faith to generate ownership and responsibility among the field workers. It has also been reported that a free-use policy generated a sense of ownership and empowerment among health workers by recognizing the significance and convenience of smartphone usage in their practices [26].
The most challenging aspect of the use of ODK Scan was filling in Scan-compatible forms that included bubbles, number boxes and text fields. Additionally, this field test recorded significant time consumption in the data verification and editing processes. On one hand, viewing and verifying all processed data values is beneficial for ensuring data quality; on the other hand, it requires the field worker to spend more time finalizing the scanned record after correcting Scan's errors. Therefore, more errors in the machine-readable data fields resulted in frustration among the field workers. Furthermore, the lower recognition accuracy of the number boxes indicates a need for additional application engineering work to make ODK Scan a more practical and workable solution. However, it is also an essential consideration for an implementing organization to include as many machine-readable data fields as required [27]. Additionally, the opinion of the field workers must be included during the form design process, so that the workload is anticipated and subsequently accepted by them and only important data fields are chosen for the ODK Scan paper-to-digital system. The complexity of mHealth interventions requires change management considerations, including changes in the behavior of healthcare providers and professionals, and in the systems and processes for delivering care [28,29].
Considering the short duration of the ODK Scan pilot phase, its impact on TB case management and treatment outcomes could not be measured. After feedback-informed technology enhancements, an expansion plan for the TB control program of Pakistan will be prepared. In the current mHealth domain, a key challenge is to make strong evidence available for advocating the proliferation of mHealth use in improving health outcomes. Therefore, in future studies, we aim to measure ODK Scan's effectiveness in ensuring fidelity to treatment protocols and improving treatment outcomes.

Conclusion
Although it is too early to demonstrate a direct link between digitization of data collection and improvement in health outcomes, mHealth technologies enable public health professionals and program managers to make data available for use in an easy and timely fashion. Prompt generation of actionable information from the digital records is a major advantage of the technology. However, integrating application use within the routine workflow, with minimal disturbance to existing practices, and providing continuous feedback are important factors for technology acceptance among the field workers. Efforts made to enhance technology features and improve users' experiences are important considerations for the successful implementation of mHealth interventions in developing countries. However, a gradual adoption is necessitated by human resource factors and the pace of technological advancements. Hence, it is equally important to devise a national-level policy, or at least an organizational policy, for the integration of mHealth technologies into public health programs to maximize the opportunities to reach needy communities.

Figure 1: Original TB03 Form used in Routine Data Management Process

Figure 2: Scan-compatible TB03 form used in ODK Scan Paper-to-digital System displaying Bubbles, Number Boxes and Text Fields

Figure 4: Mapping of the accuracy measures of the number boxes

Figure 5: Routine Data Management Process in TB Control Program of Pakistan

Figure 6: Redesigned Data Management Process in ODK Scan paper-to-digital system

Figure 7: Process of Follow up Visit Report (left to right)

Table 1: Measurement of Accuracy Rates for Recognition of Bubbles and Number Boxes

Table 2: Impact of ODK Scan on various processes of Data Management

Table 3: Self-administered questionnaire items
Note: 1 corresponds to very easy and 5 corresponds to difficult.

"This experience was very good. This is excellent. Especially dots and bubbles have eliminated any chance of the data management officer contacting us for clarification or to address any readability issue."

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 2 September 2016 doi:10.20944/preprints201608.0232.v2
Peer-reviewed version available at Future Internet 2016, 8, 51; doi:10.3390/fi8040051

"I performed verification by pressing next option to move from one field to another. I checked all data points of one form and verify each field to confirm that data written on form is properly scanned and all mistakes are rectified. The information is easily viewable."

Sending data to the next management level was significantly easier with the ODK Scan process. In the Scan process, this simply involves getting to an area with internet connectivity and sending data to the server by syncing it. Conversely, in the routine data management process, it is very time- and energy-consuming: copies of the TB register must be sent to the office through a reliable courier service. One of the field workers explained this process as, "Using sync feature is easier because I do editing mostly at home then I reach an area of connectivity and sync forms. While, for sending paper forms via courier, first I have to carry big sized TB03 register to photocopier to make the copies of TB03, then I have to visit courier office to send the parcel. It will not reach the same day, at least it takes one day and then I have to call office to confirm if they have received the parcel/documents."

"It would be very great and easy for us if ODK can sync forms via 3G Internet connectivity. As it is, I scan and edit the forms at home and for syncing forms I have to travel to place where I