Towards Multimodal Equipment to Help in the Diagnosis of COVID-19 Using Machine Learning Algorithms

COVID-19 occurs due to infection through respiratory droplets containing the SARS-CoV-2 virus, which are released when someone sneezes, coughs, or talks. The gold-standard exam to detect the virus is the Real-Time Polymerase Chain Reaction (RT-PCR); however, it is an expensive test, may require up to 3 days after infection for a reliable result, and, under high demand, labs can be overwhelmed, causing significant delays in providing results. In this context, biomedical data (oxygen saturation level (SpO2), body temperature, heart rate, and cough) acquired from individuals can be used to help infer COVID-19 infection using machine learning algorithms. The goal of this study is to introduce the Integrated Portable Medical Assistant (IPMA), a multimodal piece of equipment that collects biomedical data, such as oxygen saturation level, body temperature, heart rate, and cough sound, and helps infer the diagnosis of COVID-19 through machine learning algorithms. The IPMA can store the biomedical data for continuous studies and can be used to infer other respiratory diseases. A Quadratic kernel-free non-linear Support Vector Machine (QSVM) and a Decision Tree (DT) were applied to three datasets with data of cough, speech, body temperature, heart rate, and SpO2, obtaining an Accuracy rate (ACC) of up to approximately 88.0% with an Area Under the Curve (AUC) of 0.85 using QSVM, and an ACC of up to 99% with an AUC of 0.94 using DT, for COVID-19 infection inference. When applied to the data acquired with the IPMA, these algorithms achieved 100% accuracy. Regarding ease of use, 36 volunteers reported that the IPMA has high usability, according to the two metrics used for evaluation, the System Usability Scale (SUS) and the Post Study System Usability Questionnaire (PSSUQ), with scores of 85.5 and 1.41, respectively.
In light of the worldwide needs for smart equipment to help fight the COVID-19 pandemic, this new equipment may help with the screening of COVID-19 through data collected from biomedical signals and cough sounds, as well as the use of machine learning algorithms.


Introduction
COVID-19 is a well-known contagious infectious disease caused by the new Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [1]. Since its first detection in 2019, new variants of SARS-CoV-2 have emerged [2], such as the United Kingdom (UK) variant (B.1.1.7). A drop in the SpO2 level is one indicator of infection, so it is important to monitor it regularly. However, other respiratory diseases, such as cold and flu, also reduce the SpO2 level to within the range of 90-95% without causing any major health concerns.
The use of Artificial Intelligence (AI) based on biomedical data for the diagnosis of respiratory diseases is quite recent. For instance, ref. [18] presented a systematic review of works that address the diagnosis of pneumonia through several biomedical signals (including the most common ones: body temperature, abnormal breathing, and cough) and using different techniques of AI, such as Logistic Regression (LR), Deep Learning (DL), Least Absolute Shrinkage and Selection Operation (LASSO), Random Forest, Classification and Regression Trees (CART), Support Vector Machine (SVM), fuzzy logic, and k-Nearest Neighbour (K-NN), among others. The study found that AI could help to reduce the misdiagnosis of COVID-19, since there is significant overlap in COVID-19's and other respiratory diseases' symptoms.
Recent research has been conducted using samples of sounds from individuals to infer infection by COVID-19 [19][20][21][22][23]. For instance, in [22], the public crowdsourced Coswara dataset, consisting of coughing, breathing, sustained vowel phonation, and counting (one to twenty) sounds recorded on a smartphone, was used for this purpose. In another study, data from the INTERSPEECH 2021 Computational Paralinguistics (ComPaRe) challenge were used to infer COVID-19, in a binary classification, through coughing sounds and speech, using two subsets from the Cambridge COVID-19 Sound database [22]. The first subset is the COVID-19 Cough Sub-Challenge (CCS), which consists of cough sounds from 725 audio recordings, and the second is the COVID-19 Speech Sub-Challenge (CSS), with only speech sounds from 893 audio recordings. In another study [21], an analysis of a crowdsourced dataset of respiratory sounds was performed to correctly classify healthy and COVID-19 sounds using 477 handcrafted features, such as Mel Frequency Cepstral Coefficients (MFCCs), zero-crossing rate, and spectral centroid, among others. In [23], an audio texture analysis was performed on three different signal modalities of COVID-19 sounds (cough, breath, and speech signals), using Local Binary Patterns (LBPs) and Haralick's method as the feature extraction methods. Unlike these cough-sound studies, another study [24] used biomedical data (body temperature, heart rate, and SpO2), collected with a wearable device from 1085 quarantined healthy and unhealthy individuals, to infer COVID-19 infection.
In contrast with the aforementioned studies, the all-in-one Integrated Portable Medical Assistant (IPMA) equipment introduced here uses all these measurements taken together from the individual to improve the diagnosis accuracy of a possible infection due to SARS-CoV-2, instead of using isolated measurements. Thus, this work introduces the IPMA, which is a piece of non-invasive, real-time, any-time equipment for large-scale COVID-19 screening that can be used for daily screening of large populations, such as students at school, employees at work, and other people in general public areas, such as parks, public transit, and more. The IPMA uses four bio-markers (cough, heart rate variability, oxygen saturation level, and body temperature) to infer SARS-CoV-2 infection through machine learning algorithms.
It is worth mentioning that some wearable devices (such as smartwatches and fitness bands), as well as mobile apps, have been launched to measure cough, heart rate, body temperature, and oxygen saturation levels. Yet, some of these apps and devices lack clinical validity, as their measurements are inaccurate and show no significant correlation with those of certified clinical devices [25]. In fact, these devices are usually sold as approximate information products that show discrete measurements, indicating that their purpose is to serve as preventive alert devices to "increase users' awareness of their general well-being" [26]; they have no clinical validation or medical certification, nor do they even indicate the possibility of COVID-19 infection (as they do not have embedded algorithms to do that). In addition, such wearable devices present inconsistency in the quality of data acquisition, and there is no standardization for data collection or sensor placement [27]. Unlike these wearable devices and apps, the IPMA houses medically certified devices and takes their measurements in sequence, storing them in a database. In addition, as previously mentioned, although some recent studies have separately used human sounds [19,20,28,29,30] and some biomedical data [21] to infer infection by COVID-19, in our study, we used all these signals together as input to machine learning algorithms, in order to achieve a higher accuracy rate. Furthermore, we verified the usability of our equipment with volunteers, in terms of the device itself, using the System Usability Scale (SUS), and of the user interface, using the Post Study System Usability Questionnaire (PSSUQ).

Goals
There were three main goals in this study. First was the development of multimodal equipment to automatically acquire data from medically certified devices, without opening them, thus keeping their medical certification, using linear actuators to turn them on, as well as cameras and an algorithm based on a VGG16 pre-trained network to read the devices' displays. Second was the development of machine learning algorithms to help infer COVID-19 infection. Third was the creation of a database with collected biomedical measurements, which can be further used to infer other respiratory diseases.

Hardware
The IPMA used in this study is shown in Figure 1, in two different versions (see the details of these two versions in Table 1 and Figure 2). Inside the IPMA, there are an oximeter, an automatic blood pressure device, a thermometer, and a microphone. These sensors are responsible for acquiring the oxygen saturation level (SpO2), Heart Rate (HR), Blood Pressure (BP), Body Temperature (T), and cough from the volunteer. Data from the blood pressure device were not used in this study. After every use, the IPMA was completely sterilized, for 2.5 min, using UV-C lamps (this sterilization is shown for version B in Figure 3; version A must be placed inside a box with UV-C radiation to be sterilized). It is worth mentioning that there was no contact of the volunteer with the UV-C light, which can be harmful to the eyes and skin. In version B, the structure is made of transparent polycarbonate, which blocks the UV-C spectrum. For safety, the UV-C lamps are turned on only if the IPMA's door is closed, and a limit switch and a control relay cut the power immediately if the IPMA's door is opened, turning off the UV-C lamps.
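The door interlock described above reduces to a simple rule: the relay only energizes the lamps while the door is closed and a sterilization cycle is running. A minimal sketch of that logic follows; the function and constant names are illustrative, not the IPMA's actual firmware:

```python
STERILIZATION_SECONDS = 150  # the 2.5 min UV-C cycle mentioned above

def lamp_should_be_on(door_closed: bool, cycle_active: bool) -> bool:
    """The UV-C lamps are energized only while the door is closed AND a
    sterilization cycle is running; opening the door cuts power at once."""
    return door_closed and cycle_active
```

In hardware, the limit switch enforces this rule even if the software fails, since it physically interrupts the lamp circuit when the door opens.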

Software
In order to use the IPMA, a Graphical User Interface (GUI) was developed, which includes a mobile application. The mobile application can store data from the device and send the data to the physician. Furthermore, there is a Flask server that runs a website in an embedded computer, which can be accessed through any browser. On that website, there is a GUI where the user can interact with the IPMA.

Machine Learning Algorithms
Machine learning algorithms were used here for two purposes: (1) to recognize the characters displayed on the screen using Optical Character Recognition (OCR), from images captured by the cameras; (2) to help the screening of COVID-19 by performing data analysis and classification on the biomedical data captured by the IPMA.
The information about the biomedical data from the individual is acquired by opticam SQ11 mini-cameras focused on the devices' displays, so that the devices do not need to be opened to obtain that information, thus preserving their medical certification. Once images are acquired, VGG16 is used to adjust the image to a correct position to allow character recognition. The images need to be in the correct position, so that the OCR algorithm can locate and recognize key pixels inside the image to locate the segments of the display and determine whether they are on or off.

Image Preprocessing
Before applying the OCR algorithm, we performed some image preprocessing steps to segment and standardize the position of the 7-segment display in each image. As the thermometer is firmly fixed to the IPMA hardware structure, in the preprocessing stage, its image is simply rotated and cropped according to the region of interest. However, the oximeter needs more complex preprocessing, as it is attached to a moving structure that adjusts itself to the arm length, and the opening angle of the device also changes the display's position relative to the camera, requiring an adjusting algorithm to handle these changes. For this task, we performed transfer learning using a pre-trained network called VGG16. VGG stands for Visual Geometry Group, and VGG16 is a standard deep Convolutional Neural Network (CNN) architecture with 16 weight layers (13 convolutional and 3 fully connected). It was proposed in [22] to classify images into 1000 different categories. The model with the original weights was provided by the Keras API [23].
We adapted VGG16 to find specific points in the oximeter image and used these points to adjust the image for the OCR algorithm. These points are presented in green in Figure 4. To adapt the model, we created a dataset composed of 22 images taken from the oximeter, with the positions of the marks defined manually. Fifteen of them are images of the oximeter displaying SpO2 values, and 7 show the display turned off or an incomplete measurement, as shown in Figure 4. The input and output layers of VGG16 were modified to our needs. The input layer was swapped for one matching the shape of our images (288 × 352 × 3, i.e., height × width × colour channels), and the original output was replaced by a dense layer of 1024 neurons with the Rectified Linear Unit (ReLU) activation function, connected to a final layer of 8 neurons, one pair of row and column coordinates for each of the 4 desired points.
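A sketch of the adapted network, assuming a standard Keras setup: the layer sizes follow the text (288 × 352 × 3 input, a 1024-neuron ReLU dense layer, and 8 outputs for the row/column coordinates of the 4 keypoints), while the exact way the new head is attached is an assumption:

```python
import tensorflow as tf

def build_keypoint_model(weights="imagenet"):
    # Pre-trained VGG16 backbone without the original 1000-class head
    base = tf.keras.applications.VGG16(
        include_top=False,
        weights=weights,
        input_shape=(288, 352, 3),  # height x width x colour channels
    )
    base.trainable = False  # original VGG16 weights are kept frozen
    x = tf.keras.layers.Flatten()(base.output)
    x = tf.keras.layers.Dense(1024, activation="relu")(x)
    out = tf.keras.layers.Dense(8)(x)  # (row, col) for each of the 4 points
    return tf.keras.Model(base.input, out)
```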
In the training procedure, the original weights of VGG16 were not modified; only the weights of the newly added layers were trained. Of the 22 images, 16 were used for training and 6 for validation, with the images in each group selected at random. To avoid overfitting and to increase the noise and motion robustness of the model, multiple random image transformations, such as rotation, Gaussian noise addition, contrast change, and shear, were applied at random during training (see Figure 5). These transformations were performed at each epoch by selecting random samples and applying the image processing techniques to them; this procedure allows generating multiple training samples from a single image. A normalization procedure was applied to the image pixel intensities, dividing them by 255 (the maximum intensity for images with 8 bit colour resolution) to obtain a maximum value of 1. The Mean-Squared Error (MSE) loss metric was used in the training process, and the Adam optimizer was applied. To keep the best weights and avoid overfitting, a checkpoint function was created, which saves the model with the lowest validation MSE. The model was trained for 150 epochs with 32 randomly generated images in each epoch. The code was developed and trained using the free version of the Google Colab virtual machine (the training was performed with a GPU) using Python and TensorFlow [24].
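The training setup above (frozen backbone, MSE loss, Adam optimizer, checkpoint on the lowest validation MSE) can be sketched as follows; `train_data` is assumed to yield the 32 randomly augmented (image, keypoints) pairs per epoch, with pixel intensities already divided by 255:

```python
import tensorflow as tf

def compile_and_train(model, train_data, val_data, epochs=150):
    model.compile(optimizer="adam", loss="mse")  # MSE loss + Adam optimizer
    ckpt = tf.keras.callbacks.ModelCheckpoint(
        "best.weights.h5",
        monitor="val_loss",       # validation MSE
        save_best_only=True,      # keep only the lowest-val-MSE weights
        save_weights_only=True,
    )
    return model.fit(train_data, validation_data=val_data,
                     epochs=epochs, callbacks=[ckpt])
```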
TensorFlow Lite was used to convert the model into a 32 bit version, allowing it to be embedded on the IPMA's Raspberry Pi board, where it is used to find the key points in the images. A perspective transformation is then performed to align and segment the SpO2 reading, and finally, a vertical flip is applied. The full image alignment procedure applied to the oximeter is presented in Figure 6.
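The perspective transformation is a planar homography computed from the 4 detected keypoints and the 4 corners of the target rectangle. Below is a minimal NumPy sketch of the same computation OpenCV performs in `cv2.getPerspectiveTransform`; the IPMA's actual implementation is not shown in the text, so this is illustrative:

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 homography H that maps 4 source points onto 4
    destination points (with H[2, 2] fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    """Apply the homography to a single (x, y) point."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

Once H is known, every pixel of the display region can be re-sampled into an axis-aligned rectangle, after which the vertical flip is a simple array reversal.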

Character Recognition
Once the images are aligned using VGG16, the OCR algorithm binarises them, and a template matching algorithm is used to locate the segments of the 7-segment digits in the image. This is performed in two steps: first finding the position of the full number, then performing the template matching on each digit. Afterwards, the algorithm checks whether each segment is activated and compares the pattern with a 7-segment display table, thus determining which digit, from 0 to 9, is being displayed. This algorithm achieved 100% recognition accuracy in our tests. Thus, once the digits are recognized, it is possible to determine the number shown on the display from the position of each digit. Figure 7 shows the diagram of the OCR recognition system.
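The final lookup step can be sketched as a table from segment on/off patterns to digits. The segment ordering used here (segments a to f clockwise from the top, then the middle segment g) is an assumption; only the table-lookup idea comes from the text:

```python
# 7-segment patterns in (a, b, c, d, e, f, g) order: 1 = segment lit
SEGMENT_TABLE = {
    (1, 1, 1, 1, 1, 1, 0): 0,
    (0, 1, 1, 0, 0, 0, 0): 1,
    (1, 1, 0, 1, 1, 0, 1): 2,
    (1, 1, 1, 1, 0, 0, 1): 3,
    (0, 1, 1, 0, 0, 1, 1): 4,
    (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6,
    (1, 1, 1, 0, 0, 0, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}

def decode_display(digit_patterns):
    """Map each digit's segment pattern to its value, then combine the
    digits by position into the full displayed number."""
    value = 0
    for pattern in digit_patterns:
        value = value * 10 + SEGMENT_TABLE[tuple(pattern)]
    return value
```

For example, the patterns for the digits "9" and "7" combine into the SpO2 reading 97.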

COVID-19 Inference
In our research, we used cough and speech sounds (from the CCS and CSS of the Cambridge COVID-19 Sound database [22]), in addition to the body temperature, heart rate, and SpO2 data from [17], to train our machine learning algorithms (a Quadratic kernel-free non-linear Support Vector Machine (QSVM) and a Decision Tree (DT) with Gini's diversity index to split the data), and then applied them to the data collected by the IPMA. Figure 8 shows the block diagram for these algorithms. Prior to any processing, we re-sampled the cough and speech data to a frequency of 16 kHz. For each audio sample, the Mel spectrogram was extracted with a window frame of 25 ms and an overlap of 50%. These spectrograms are computed by extracting the coefficients relative to the compositional frequencies with the Short-Time Fourier Transform (STFT), as done in [31]. Next, these Mel spectrograms are converted to grey-scale, and 512 Local Ternary Pattern (LTP) features are extracted. This feature extraction method is an extension of LBPs that uses a constant threshold to quantize pixels into three values, i.e., -1, 0, and 1 [32,33]. For the heart rate, temperature, and SpO2 signals, the feature vector is composed of their values. The three feature vectors obtained are used to train two different classifier models: for cough and speech, a QSVM is used, whereas for heart rate, body temperature, and SpO2, a DT is used. For each dataset, approximately 70% of the data were used as the training set and 30% for testing. Furthermore, the full training set was analysed with k-fold cross-validation, with k = 10. The test set and the IPMA's values were checked using each classifier model, and a score vector of the individual being infected or non-infected was obtained for the IPMA set (P1, P2, and P3). This score is the distance from the sample to the decision boundary: a positive score for a class indicates that the sample is predicted to be in that class, and a negative score indicates otherwise. Finally, for the IPMA, the sample scores were summed, and if the highest value was related to the COVID-19 score, the sample was assigned as "infected".
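The LTP extraction described above can be sketched as follows, under common assumptions (a 3 × 3 neighbourhood and a fixed threshold t, both not specified in the text): the ternary code is split into an "upper" (+1) and a "lower" (-1) binary pattern, and concatenating their 256-bin histograms yields exactly the 512 features mentioned:

```python
import numpy as np

def ltp_features(gray, t=5):
    """512-dim Local Ternary Pattern vector for a grey-scale image."""
    g = np.asarray(gray, dtype=int)
    c = g[1:-1, 1:-1]  # centre pixels
    # 8 neighbours, clockwise from the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    upper = np.zeros_like(c)
    lower = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        n = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        upper += (n >= c + t).astype(int) << bit   # ternary code == +1
        lower += (n <= c - t).astype(int) << bit   # ternary code == -1
    hist_upper = np.bincount(upper.ravel(), minlength=256)
    hist_lower = np.bincount(lower.ravel(), minlength=256)
    return np.concatenate([hist_upper, hist_lower])  # 256 + 256 = 512
```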
The data collected by the IPMA from 36 volunteers (from Ecuador and Brazil, shown in Appendix A) were used to evaluate the classifier models previously trained with the CCS, the CSS, and the database from [17]. The cough signals were preprocessed, and the feature extraction was performed over these three datasets and the IPMA database. Table 2 shows the results regarding the Accuracy (ACC) and Area Under the Curve (AUC) obtained for the datasets from the CCS, CSS, and [17], using QSVM for cough and speech and DT for the dataset with body temperature, heart rate, and SpO2. For the IPMA's data, an ACC of 100% was achieved.

Evaluation Metrics Applied to the IPMA
The methodology used to evaluate the IPMA is based on its ease of use and if the GUI is engaging for the volunteer. In order to evaluate the IPMA and the GUI, two scales were used here, which were the System Usability Scale (SUS) and the Post Study System Usability Questionnaire (PSSUQ).
The SUS is a methodology to evaluate the usability, effectiveness, and efficiency of a system, which was developed by Brooke in 1986 as a 10-question survey [34]. In the SUS, the volunteer gives a score for each question, ranging from 1 to 5. The value 1 means the volunteer totally disagrees with the sentence being asked, whereas 5 means the individual totally agrees. The SUS sentences are shown below [35]:

1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.
The sentences alternate between odd-numbered questions, which present a positive view of the system, and even-numbered questions, which present a negative view; each positive question has a similarly themed negative counterpart, written in a different way. The Total SUS score (T) is calculated using Equation (1), where p_i is the score given to the i-th question; note that the parity of the question determines how it contributes to the total score:

T = 2.5 × [Σ(odd i) (p_i - 1) + Σ(even i) (5 - p_i)]    (1)
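Equation (1) can be written as a small function; for example, a respondent answering 5 to every positively worded (odd) question and 1 to every negatively worded (even) question obtains the maximum score of 100:

```python
def sus_score(p):
    """Total SUS score from the 10 item responses p[0..9] (each 1..5).
    Odd-numbered questions contribute (p - 1); even-numbered ones
    contribute (5 - p); the sum is scaled by 2.5 to range 0..100."""
    assert len(p) == 10
    odd = sum(p[i] - 1 for i in range(0, 10, 2))    # questions 1,3,5,7,9
    even = sum(5 - p[i] for i in range(1, 10, 2))   # questions 2,4,6,8,10
    return 2.5 * (odd + even)
```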
Although the SUS scale ranges from 0 to 100, it is not a percentile scale; there is a graph used to correlate the percentile scale with the SUS score, with the average (50%) represented by the score 68 [36]. In addition, according to [37], scores of the SUS above average are associated with "Good" products.
As previously mentioned, another important aspect of usability that should be evaluated is the user experience with the computational application, which is addressed by the PSSUQ. This scale is a 16-item standardized questionnaire, in which each item is scored from 1 ("I strongly agree") to 7 ("I totally disagree"), with 4 being "neutral". It is used to evaluate the user experience with computer systems and applications and was developed by IBM in 1988 from a project called System Usability Metrics (SUMS) [36]. It follows a 7-point Likert scale, and the result is the average score of the questions.
An interesting aspect of this scale is that it can be split into three subcategories to evaluate three different aspects of the user experience: System Usefulness (SYSUSE), Information Quality (INFOQUAL), and Interface Quality (INTERQUAL). The SYSUSE is calculated by averaging the results from Questions 1 to 6, whereas the INFOQUAL and INTERQUAL average the results of Questions 7 to 12 and 13 to 15, respectively. Additionally, there is the overall score, which is the average of all questions, including the 16th question.
In the PSSUQ, lower scores mean better evaluations, thus indicating a higher level of usability. The neutral value is 4, whereas values closer to 1 represent better usability and closer to 7 represent worse usability. The sentences in the PSSUQ scale are [38]:

1. Overall, I am satisfied with how easy it is to use this system.
2. It was simple to use this system.
3. I was able to complete the tasks and scenarios quickly using this system.
4. I felt comfortable using this system.
5. It was easy to learn to use this system.
6. I believe I could become productive quickly using this system.
7. The system gave error messages that clearly told me how to fix problems.
8. Whenever I made a mistake using the system, I could recover easily and quickly.
9. The information (such as online help, on-screen messages, and other documentation) provided with this system was clear.
10. It was easy to find the information I needed.
11. The information was effective in helping me complete the tasks and scenarios.
12. The organization of information on the system screens was clear.
13. The interface of this system was pleasant.
14. I liked using the interface of this system.
15. This system has all the functions and capabilities I expect it to have.
16. Overall, I am satisfied with this system.
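The PSSUQ scoring described above (subscale averages over the stated item ranges, lower being better) can be sketched as:

```python
def pssuq_scores(p):
    """Subscale and overall means from the 16 item responses p[0..15]
    (each 1..7). SYSUSE = items 1-6, INFOQUAL = items 7-12,
    INTERQUAL = items 13-15, overall = items 1-16."""
    assert len(p) == 16
    mean = lambda xs: sum(xs) / len(xs)
    return {
        "SYSUSE": mean(p[0:6]),
        "INFOQUAL": mean(p[6:12]),
        "INTERQUAL": mean(p[12:15]),
        "OVERALL": mean(p),
    }
```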

Evaluation Protocol
Following [39], testing a large number of participants does not necessarily provide much more information than testing just a few people, since even a few participants are able to find most of the usability issues. In their model, each user finds about 31% of the potential problems, so a usability test performed by five users should be enough to detect around 80% of the potential problems of a product or website. After those five users, the same findings continue to be observed repeatedly without discovering anything new. Based on their study, we selected 18 people for the tests of each IPMA (in Ecuador and Brazil), resulting in a total of 36 individuals.
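This follows the classic problem-discovery model, in which the share of problems found by n users is 1 - (1 - 0.31)^n; with five users, this gives roughly 84%, consistent with the "up to 80%" figure cited above:

```python
def problems_found(n_users, p_single=0.31):
    """Expected share of usability problems uncovered by n users,
    assuming each user independently finds about 31% of them."""
    return 1 - (1 - p_single) ** n_users
```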
In the evaluations, the volunteer had to follow the instructions to have his/her biological data collected, and after using the equipment, he/she filled out the SUS and PSSUQ forms. Therefore, it was possible to know their opinion about the usability of the IPMA. Due to the current sanitary restrictions, the equipment was only evaluated with volunteers that did not have ongoing COVID-19 infection, which means they were either recovered or had never had the disease. The evaluation protocol consists of the following steps:

1. Volunteers were given an explanation about the whole process of the use of the equipment.
2. They filled out a questionnaire that included their birthday, gender, and health questions.
3. The system asked the volunteer to open the IPMA's door and make a 10 s forced cough.
4. Afterwards, the system asked the volunteer to speak a phonetically balanced sentence.
5. Next, the volunteer was informed that the system would take the measurements. The volunteer was asked to place his/her arm properly inside the IPMA. Then, measurements took place after pressing the start button.
6. Once the measurements were finished, the IPMA asked the volunteer to remove his/her arm from the IPMA.
7. Further, the system acknowledged the volunteer and started the UV-C disinfection process.
8. Finally, the volunteer was asked to fill out two forms (SUS and PSSUQ).

Evaluation Conducted in Ecuador
In these trials, it was possible to evaluate the equipment's aspects, including its mechanical features and usability, with 18 volunteers. Regarding the age of the individuals, 72.22% (13 individuals) were young adults (15-29 years old) and 27.78% (5 individuals) were adults (30-64 years old). The results for the measurements, as well as photos of the experimental setup, can be viewed in Appendix B.
In terms of the usability of the equipment, the SUS provided a good result (78.19), which was above the average (68). All individuals but one gave scores that reached values better than the average. Regarding the interface evaluation, the scores showed that the GUI was considered to have a high level of usability, as the average was 1.6 (with subscales SYSUSE scoring 1.5, INFOQUAL scoring 1.8, and INTERQUAL scoring 1.4). Appendix B shows the results of the SUS and PSSUQ for Ecuadorian volunteers.
Regarding the SYSUSE subscale, individuals' opinions showed that the app was intuitive, even on first use. On the other hand, some individuals, especially older ones, felt that the need to have a Gmail account was a drawback of the application. In the INFOQUAL subscale, individuals commented on the system's functionality; for instance, they suggested that the IPMA should emit warnings when the network signal is weak or the hand position is not correct. Finally, in the INTERQUAL subscale, the interface was considered friendly by most users; however, they suggested that more graphs would make the system better.

Evaluation Conducted in Brazil
The 18 Brazilian volunteers followed the GUI, which guided them in the process of filling out the form, capturing the audio recordings, and taking the measurements. Afterwards, they were asked to fill out both the SUS and PSSUQ forms to evaluate the IPMA. The detailed results from the IPMA trials are presented in Appendix A.
From the SUS perspective, the equipment scored 81.11 for the Brazilian volunteers, a value considered above average for usability. Five of them (27.78%) scored the equipment below average (68), 2 out of those 5 scores being slightly below average (67.5 and 65). However, the majority of the individuals (13 of them, i.e., 72.22%) considered the equipment quite useful.
The PSSUQ scores showed that the volunteers found the GUI quite useful, with an average score of 2.4 (where the INFOQUAL had a score of 2.8, the SYSUSE had a score of 2.2, and the INTERQUAL had a score of 2.0). This shows, in general, that the Brazilian volunteers also had a good experience with the equipment.

Conclusions
This work introduced the all-in-one Integrated Portable Medical Assistant (IPMA), a piece of equipment that allows biomedical data acquisition from individuals, infected or not with COVID-19, through cough sounds and three biomedical signals (heart rate, oxygen saturation level, and body temperature) acquired from medically certified devices. The values of these biomedical data were obtained by reading the devices' displays with cameras, using a VGG16-based alignment network and an OCR algorithm that achieved 100% recognition accuracy. All these signals collected by the IPMA feed pre-trained machine learning algorithms (which achieved approximately ACC = 88.0% with AUC = 0.85 using QSVM, and ACC = 99% with AUC = 0.94 using DT) to infer possible COVID-19 infection, with 100% accuracy, thus indicating to the individual when it is time to seek medical care. It is important to note that, although the data collected by the IPMA were all from individuals without ongoing COVID-19 infection, our machine learning algorithms showed good performance in inferring COVID-19 from different databases.
Regarding the evaluation from the individuals who used the IPMA, it was considered successful, since it achieved an average score over 68 on the SUS, which means the equipment was considered "above average" (or, in other words, good equipment). Additionally, the PSSUQ presented low scores for both IPMA versions, which means high overall usability.
It is worth mentioning that the IPMA has great advantages, as it is non-invasive, real-time, any-time equipment for large-scale data acquisition and screening, and it may be used for daily screening of students, workers, and people in public places, such as schools, workplaces, and public transportation, to quickly alert of group outbreaks. Furthermore, due to its portability, it is suitable for use in hospitals, in clinics, or at home. Additionally, the IPMA was designed to be user-friendly, with a comprehensive GUI, and safe, since it uses UV-C light to disinfect itself.
With the widespread use of the IPMA and collection of data, a new database is being created, which will be quite useful for new studies about alternative parameters to infer COVID-19 infection and other respiratory diseases, mainly because, more than two years after its outbreak, COVID-19 is still strongly threatening the world with its new variants. Mobile applications (apps) were also developed here to allow the compilation of the main physiological signals captured by the equipment and, then, visualize them in mobile phones.
The IPMA was also proven to be functional, and the evaluations conducted with individuals showed that the measurements can be performed easily, while the results can be stored for further analysis and for machine learning training. As future works, we expect to use this equipment to evaluate other respiratory diseases, such as cold, flu, or pneumonia, by training the machine learning with more data.
Regarding a comparison between the IPMA with other equipment with the same purpose, this was not possible, since, as far as we know, there is no other equipment that tries to perform data acquisition from medically certified devices, create a database, and apply machine learning algorithms in the same way as we did here. Additionally, we tested our equipment with known scales to evaluate usability and user interface quality, in order to validate our research.
The limitations of our work are mainly the need for an AC outlet near the equipment to power the UV-C lamps, as well as the need for batteries for each medical device, since we could not power them directly with the main IPMA battery due to the risk of losing their medical certification. Additionally, the structure became bigger because of the need for linear actuators and cameras, as we could not obtain the measurement signals directly from the electronic board of the medically certified devices, to avoid losing their medical certification.
As future works, we plan to perform more trials in different geographical areas to collect more data and verify the usability with people of different regions. Furthermore, we will add some improvements to the user interface to make it more user-friendly and to improve the visualization in the mobile application. Finally, tests will be performed at public healthcare institutions to evaluate the use of the device.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of UFES/Brazil (CAAE: 64800116.9.0000.5542).

Informed Consent Statement:
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of UFES/Brazil (CAAE: 64800116.9.0000.5542).

Table A1 shows the data of oxygen saturation, heart rate, and body temperature taken with the IPMA, in addition to the birth date, gender, and whether the volunteer was infected with COVID-19. In this table, SpO2 is the oxygen saturation level (in percent), HR is the Heart Rate (in bpm, beats per minute), Temp is the body temperature (in degrees Celsius), and SBP and DBP are, respectively, the Systolic and Diastolic Blood Pressure, measured in millimetres of mercury (mmHg). In the column Gender, M stands for Male and F for Female. In the column COVID, R stands for Recovered and N for Negative; RT-PCR refers to the Real-Time Polymerase Chain Reaction, NS means No Symptoms, ME means Medical Evaluation, and PE means Personal Evaluation. The columns Onset and Finish refer to the dates on which the symptoms began and ended, respectively.

Tables A3 and A4 show the SUS scores given by the volunteers who tried the IPMA. The first column shows the volunteers, from 1 to 18; in the second line, the numbers refer to the SUS sentences, and the column "SUS" is the overall score calculated for each volunteer. The last line is the average.

The tables with the PSSUQ scores given by the volunteers at the end of the experiment follow: Table A5 shows the Ecuadorian results and Table A6 the Brazilian results. The numbers 1 to 18 in each table represent the volunteers who took the evaluation; in the second line, there are the numbers of the PSSUQ sentences and the scores for each subscale and for the overall PSSUQ. In this line, S stands for the subscale SYSUSE, F for INFOQUAL, Q for INTERQUAL, and P for PSSUQ (overall score). The last line is the averages.

3. How do you know whether you currently have COVID-19 or not? Please specify if you took a test. Single choice from a set containing "I do not think I have COVID-19 now (no test was made)", "RT-PCR test", "Quick test", "Medical evaluation", "Personal evaluation", and "Other".
4. If you checked "Other", please specify how you know whether or not you have COVID-19. The answer here is a text explaining the symptoms the user had in the last week.
5. Symptoms onset and finish date. The answers here are the dates of the beginning and end of the symptoms for those who had the disease.