Article

Improving Machine Learning Classification Accuracy for Breathing Abnormalities by Enhancing Dataset

Mubashir Rehman, Raza Ali Shah, Muhammad Bilal Khan, Syed Aziz Shah, Najah Abed AbuAli, Xiaodong Yang, Akram Alomainy, Muhammad Ali Imran and Qammer H. Abbasi
1 Department of Electrical Engineering, HITEC University, Taxila 47080, Pakistan
2 Department of Electrical and Computer Engineering, COMSATS University Islamabad, Attock Campus, Attock 43600, Pakistan
3 Research Centre for Intelligent Healthcare, Coventry University, Coventry CV1 5FB, UK
4 College of Information Technology, United Arab Emirates University (UAEU), Abu Dhabi 15551, United Arab Emirates
5 School of Electronic Engineering, Xidian University, Xi’an 710071, China
6 School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
7 School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
8 Artificial Intelligence Research Centre (AIRC), Ajman University, Ajman 20550, United Arab Emirates
* Author to whom correspondence should be addressed.
Sensors 2021, 21(20), 6750; https://doi.org/10.3390/s21206750
Submission received: 27 August 2021 / Revised: 30 September 2021 / Accepted: 8 October 2021 / Published: 12 October 2021
(This article belongs to the Section Internet of Things)

Abstract

The recent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (COVID-19), has led to a global pandemic with a high mortality rate. The main complication of COVID-19 is rapid respiratory deterioration, which may cause life-threatening pneumonia. Global healthcare systems currently face a scarcity of resources to assist all critical patients simultaneously, so non-critical patients are mostly advised to self-isolate or quarantine at home, where only limited healthcare services are available. According to research, nearly 20–30% of COVID-19 patients require hospitalization, while almost 5–12% may require intensive care due to severe health conditions. This pandemic requires global healthcare systems that are intelligent, secure, and reliable, and tremendous efforts have already been made to develop non-contact sensing technologies for the diagnosis of COVID-19. The most significant early indication of COVID-19 is rapid and abnormal breathing. In this research work, RF-based technology is used to collect real-time breathing abnormality data. Based on these data, a large dataset of simulated breathing abnormalities is then generated using the curve fitting technique for developing a machine learning (ML) classification model. The advantages of generating simulated breathing abnormality data are twofold: it counters the daunting and time-consuming task of real-time data collection, and it improves the accuracy of the ML model. Several ML algorithms are exploited to classify eight breathing abnormalities: eupnea, bradypnea, tachypnea, Biot, sighing, Kussmaul, Cheyne–Stokes, and central sleep apnea (CSA). The performance of the ML algorithms is evaluated in terms of accuracy, prediction speed, and training time for both real-time and simulated breathing data. The results show that the proposed platform classifies real-time breathing patterns with a maximum accuracy of 97.5%, whereas introducing simulated breathing data increases the accuracy up to 99.3%. This work has a notable medical impact, as the introduced method mitigates the challenge of collecting a large, realistic dataset during a pandemic.

1. Introduction

Coronaviruses are a large family of viruses that includes the Middle East respiratory syndrome coronavirus (MERS-CoV), the severe acute respiratory syndrome coronavirus (SARS-CoV), and the latest virus, SARS-CoV-2, which causes COVID-19. COVID-19 symptoms include respiratory tract illness and acute viral pneumonia with respiratory failure, which can lead to death [1]. There are various ways to diagnose COVID-19 infection, of which monitoring the breathing rate (BR) is considered one of the most significant. Therefore, investigating the BR and its connection with COVID-19 symptoms is now a popular area of research [2]. BR is usually defined as the number of breaths an individual takes per minute at rest. A normal BR is 10 to 24 breaths per minute (bpm), while an abnormal BR in adults can be categorized as hyperventilation (bpm > 24), hypoventilation (bpm < 10), or apnea [3]. In non-COVID scenarios, BR may increase with fever, illness, and other medical conditions.
In COVID scenarios, determining the BR or breathing activity of patients is considered very important, as abnormal breathing measurements may indicate a deterioration in the patient’s health [4]. BR can be measured through manual counting; however, this method is unreliable and prone to error. Measuring BR therefore usually involves the expertise of a health professional and is typically performed in hospital. The best method for BR measurement in hospitals is spirometry, which measures the airflow during inhalation and exhalation. Other methods include electrical impedance pneumography (EIP), capnography, and inductance pneumography (IP) [5]. However, these methods require hospitalization. Given the clinical emergency caused by COVID-19, hospital visits for BR monitoring increase the risk of spreading the virus. Most patients do not show breathing distress at first, and healthcare professionals must send these patients home for self-monitoring. According to medical research, patients with minor clinical conditions may worsen in the second week of COVID-19 infection. Therefore, patients with normal breathing function do not require hospitalization and should be monitored using telemedicine methods during self-isolation [6]. In contrast, for patients facing acute breathing distress, real-time BR monitoring is essential.
Breathing can lose its regular rhythm because of numerous medical conditions, such as injury or metabolic disorders, and abnormal breathing patterns can be shallow, deep, or fast. The patterns considered here are eupnea (normal breathing), bradypnea, tachypnea, Biot, Kussmaul, sighing, Cheyne–Stokes, and CSA. Eupnea is normal breathing with a uniform pattern and rate, while in tachypnea the BR is faster than in eupnea. Biot respiration is characterized by deep breaths with regular periods of apnea, and Kussmaul breathing is deep and fast. Bradypnea is shallow and slow breathing with a uniform pattern, while sighing is normal breathing punctuated by sighs. Cheyne–Stokes breathing is defined by a gradual increase and decrease in BR, whereas CSA is breathing that repeatedly stops and starts during sleep. Details of the breathing patterns and their causes are given in Figure 1.
Non-contact monitoring during the pandemic is a promising way to combat the spread of COVID-19, and the most significant indication of COVID-19 is abnormal breathing. Therefore, classifying breathing patterns using ML is worthwhile and of great significance. The dataset of breathing patterns required for building the ML model is obtained by recording test subjects’ breathing. During a pandemic, this approach yields only a limited amount of data, insufficient for developing a reliable ML model. Therefore, simulated breathing patterns are needed to overcome the scarcity of real-time breathing data from actual patients. In this research, real-time breathing pattern data were collected through a non-contact approach using a software-defined radio (SDR) platform. Subsequently, a large dataset of simulated breathing patterns was generated using the curve fitting technique. The obtained results were validated by evaluating various ML algorithms in terms of accuracy, prediction speed, and training time. This approach can be utilized in COVID as well as non-COVID scenarios and has many innovative healthcare applications.

2. Literature Review

Several contact-based and non-contact technologies for breath monitoring have been presented in the literature. Contact-based technologies rely on devices such as wearable sensors and smartwatches [7,8]; these devices are expensive, bulky, and often inconvenient for patients. To avoid this inconvenience, non-contact technologies have also been proposed, whose advantages include continuous monitoring at home and even during sleep. Most non-contact technologies are either camera-based [9] or RF-based [10]. Camera-based breath monitoring needs a depth camera or a thermal imaging camera, and there are limitations to these approaches: depth cameras are expensive and have a high computational cost, while thermal imaging is vulnerable to ambient temperature. RF-based non-contact technologies leverage the propagation of electromagnetic (EM) waves, from which breathing information can be extracted over a wireless medium.
Furthermore, RF-based breath monitoring includes various technologies, such as radar, Wi-Fi, and SDR. For RF sensing, these technologies can exploit channel state information (CSI) or received signal strength (RSS). There are numerous techniques for radar-based breath monitoring, including Doppler radar [11] and frequency-modulated continuous-wave (FMCW) radar [12]. These radar-based techniques require high-cost, specialized hardware operating at high frequencies. For example, the Vital-Radio system [13] uses an FMCW radar with a wide bandwidth from 5.46 GHz to 7.25 GHz to track breathing and heart rates. For Wi-Fi-based breath monitoring, numerous approaches using RSS and CSI have been reported. The authors of [14] explored the use of RSS measurements on the links between wireless devices to find the BR and location of a person in a home environment. Similarly, a complete architecture for recovering breathing signals from noisy Wi-Fi signals was presented in [15]. A further study [16] provided a non-contact CSI-based breath monitoring system. Schmidt [17] applied the Hampel filter to the CSI series to eliminate outliers and high-frequency noise, and then measured the BR by performing an FFT on all CSI streams, targeting frequencies between 0.1 Hz and 0.6 Hz. Wang et al. [18] proposed a BR monitoring system using the CSI of a single pair of commercial Wi-Fi devices. Wi-Fi-based RF sensing has several advantages, such as cost-effectiveness and ready availability; however, it also has disadvantages, such as a lack of scalability and flexibility and limited reporting of Orthogonal Frequency Division Multiplexing (OFDM) subcarriers [19].
SDR-based breath monitoring has been investigated by various authors [20,21,22,23,24]. It is considered the most efficient among the RF-based techniques, as it offers a flexible, portable, and scalable solution. Additionally, this technology permits selection of the operating frequency and the transmitted/received power, and it allows simple implementation of signal processing algorithms. ML has also been exploited in breath monitoring to help accurately classify various breathing abnormalities, and several authors have used ML for classifying breathing patterns [25]. However, several studies were unable to obtain reasonable accuracy, and some only classified basic breathing patterns, such as normal and fast breathing [26]. Consequently, there is a need for a platform that can monitor and accurately classify a diverse range of breathing patterns. A summary of the technologies discussed in the literature is shown in Figure 2.

3. System Architecture

The system architecture consists of four layers, as shown in Figure 3. The functionality of each layer is explained below:

3.1. Data Extraction Layer

The first layer is the data extraction layer, which is responsible for extracting the breathing data. This layer has three main blocks: the transmitter, the receiver, and the wireless channel. The transmitter consists of a transmitter PC and a transmitter universal software radio peripheral (USRP). First, random data bits are generated and mapped to quadrature amplitude modulation (QAM) symbols in the transmitter PC. These QAM symbols are then split into parallel frames, and reference data symbols are inserted in each parallel frame; on the receiver side, these reference symbols are used for channel estimation. In each frame, zeros are placed at the band edges and one zero at DC. Next, the frequency-domain signal is converted into a time-domain signal by applying the inverse fast Fourier transform (IFFT). Subsequently, a cyclic prefix (CP) is added to every frame by repeating the last one-fourth of its samples at the start; on the receiver side, the CP is used to eliminate frequency and time offsets.
Then, the host PC sends this data to the USRP kit through gigabit Ethernet. The data is digitally upconverted and translated into analog form using a digital upconverter (DUC) and a digital-to-analog converter (DAC), respectively. The USRP then passes this analog signal through a low pass filter (LPF) and mixes it up to a user-specified carrier frequency. Before transmitting the resultant signal through an omnidirectional antenna, the USRP passes it through a transmit amplifier for gain adjustment. After propagating through the wireless channel, the transmitted signal is received at the receiver USRP using an omnidirectional antenna. The received signal is passed through a low noise amplifier (LNA) and a drive amplifier (DA) for noise suppression and gain adjustment, respectively. The resulting signal then goes through an LPF, an analog-to-digital converter (ADC), and a digital downconverter (DDC), which filters and decimates the signal. Finally, the resultant samples are transferred to the host PC over a gigabit Ethernet cable. At the host PC, the CP is removed from each frame, and time and frequency offsets are corrected by applying the Van de Beek algorithm [27]. Afterward, the fast Fourier transform (FFT) converts the time-domain samples into frequency-domain symbols. Finally, breathing patterns are detected by extracting the amplitude response of the frequency-domain signal.
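For readers who want to reproduce the transmit chain described above, the following MATLAB-style sketch shows only the core OFDM frame construction (QAM mapping, IFFT, and cyclic prefix). The QAM order, the omission of reference symbols and edge/DC nulls, and all variable names are illustrative assumptions rather than the authors’ exact implementation.

```matlab
% Minimal sketch of OFDM frame construction (assumptions: 256 subcarriers,
% 4-QAM, CP equal to one quarter of the symbol; reference symbols and
% edge/DC nulls mentioned in the text are omitted for brevity).
nSub    = 256;                            % number of OFDM subcarriers
M       = 4;                              % QAM order (assumed)
dataSym = randi([0 M-1], nSub, 1);        % random data symbols
qamSym  = qammod(dataSym, M);             % map to QAM constellation
ofdmSym = ifft(qamSym, nSub);             % frequency domain -> time domain
cp      = ofdmSym(end - nSub/4 + 1:end);  % last one-fourth of the samples
txFrame = [cp; ofdmSym];                  % prepend cyclic prefix before transmission
```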
The wireless channel for the OFDM system can be regarded as a narrowband flat fading channel, which can be represented in the frequency domain, as shown in Equation (1):
Ȳ = H × X̄ + N̄        (1)
where Ȳ and X̄ denote the received and transmitted wireless signal vectors, respectively, N̄ is the additive white Gaussian noise, and H represents the OFDM channel frequency response for all subcarriers; H can be estimated from Ȳ and X̄. Here, the OFDM system uses 256 subcarriers for data transmission on a 20 MHz channel. The channel frequency response for all subcarriers can be represented by Equation (2) as:
H = [ H_11  H_12  …  H_1s
      H_21  H_22  …  H_2s
       ⋮     ⋮    ⋱    ⋮
      H_k1  H_k2  …  H_ks ]        (2)
where k represents the number of OFDM subcarriers and s the number of acquired samples. The frequency response of the channel for a single subcarrier i is denoted by H_i and is a complex value, given in Equation (3) as:
H_i = |H_i| exp(j∠H_i)        (3)
where |H_i| and ∠H_i are the amplitude and phase response of OFDM subcarrier i, respectively. For indoor lab environments with multipath components, the channel frequency response H_i of subcarrier i is expressed in Equation (4) as:
H_i = Σ_{n=0}^{N} r_n · e^{−j2π f_i τ_n}        (4)
where N is the total number of multipath components, r_n and τ_n are the attenuation and propagation delay on the n-th path, respectively, and f_i represents the frequency of the i-th subcarrier.
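As a concrete illustration of how the CSI amplitude used in the rest of the pipeline can be obtained from Equation (1), the sketch below performs a simple least-squares estimate of H at the reference symbols. The variable names Yref and Xref (received and transmitted reference symbols arranged as subcarriers × samples) are assumptions for illustration.

```matlab
% Minimal sketch, assuming Yref and Xref hold the received and transmitted
% reference symbols (subcarriers x samples), consistent with Equation (1).
Hest   = Yref ./ Xref;    % per-subcarrier least-squares channel estimate
ampCSI = abs(Hest);       % amplitude response |H_i|, used for breathing detection
phCSI  = angle(Hest);     % phase response, not used further in this work
```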

3.2. Data Preprocessing Layer

Raw CSI data received from the extraction layer is sent to the data preprocessing layer. This layer is further divided into four sublayers.

3.2.1. Subcarrier Selection

The first step in the data preprocessing layer is subcarrier selection. For each activity, 256 OFDM subcarriers are acquired at the receiver. The sensitivity of each subcarrier to breathing activity differs; therefore, for reliable detection, subcarriers that are less sensitive to breathing activity must be eliminated. To this end, the variance of each subcarrier is measured, and on this basis the less sensitive subcarriers are discarded, as shown in Figure 4a for all OFDM subcarriers.
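A variance-based selection of this kind can be sketched as follows; the threshold of half the maximum variance and the variable names are assumptions used only for illustration.

```matlab
% Minimal sketch of subcarrier selection, assuming csi is a
% (subcarriers x samples) matrix of CSI amplitudes.
v       = var(csi, 0, 2);      % variance of each subcarrier over time
keepIdx = v > 0.5 * max(v);    % keep breathing-sensitive subcarriers (assumed threshold)
csiSel  = csi(keepIdx, :);     % subcarriers retained for further processing
```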

3.2.2. Outliers Removal

After subcarrier selection, wavelet filtering is applied to eliminate outliers from the raw data while retaining sharp transitions; this can be seen in Figure 4b for all OFDM subcarriers. For wavelet filtering, soft heuristic SURE thresholding with the scaled noise option is applied to the coefficients, using the “sym5” wavelet at decomposition level 4.
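The quoted settings correspond closely to MATLAB’s Wavelet Toolbox denoising options; a sketch under that assumption is given below (the mapping of the scaled noise option to the 'sln' flag is itself an assumption).

```matlab
% Minimal sketch of the wavelet filtering step (Wavelet Toolbox): soft
% heuristic SURE thresholding, level 4, 'sym5' wavelet, applied per subcarrier.
den = zeros(size(csiSel));
for k = 1:size(csiSel, 1)
    den(k, :) = wden(csiSel(k, :), 'heursure', 's', 'sln', 4, 'sym5');
end
```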

3.2.3. Data Smoothing

A moving average filter with a window size of 8 is applied for data smoothing, which removes high-frequency noise; this can be seen in Figure 4c for all OFDM subcarriers. The output of the moving average filter with window size 8 can be represented by Equation (5):
y[n] = (1/N) Σ_{i=0}^{N−1} x[n−i]        (5)
where y[n] is the current output, x[n] is the current input, and N is the window size of the moving average filter.
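Equation (5) is a standard causal moving average; one way to apply it along the sample dimension of the denoised CSI is sketched below (variable names carried over from the previous sketches are assumptions).

```matlab
% Minimal sketch of Equation (5): 8-point moving average along each subcarrier.
N         = 8;                                       % window size
csiSmooth = filter(ones(1, N) / N, 1, den, [], 2);   % filter along the sample dimension
```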

3.2.4. Data Normalization

Finally, the waveform data is normalized so that its maximum and minimum values map to 1 and −1, respectively, using Equation (6):
ȳ[n] = (y[n] − offset) / scale        (6)
where ȳ[n] is the normalized data and y[n] is the input data. The input waveform is offset and scaled by appropriate values to acquire the normalized waveform. For example, the normalized waveform for a single OFDM subcarrier is shown in Figure 4d. After performing the above steps, processed CSI data is obtained to classify breathing patterns through ML algorithms.
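Equation (6) amounts to mapping each subcarrier trace onto the range [−1, 1]; a minimal sketch, with the offset and scale computed per subcarrier, is shown below.

```matlab
% Minimal sketch of Equation (6): per-subcarrier normalization to [-1, 1].
offset  = (max(csiSmooth, [], 2) + min(csiSmooth, [], 2)) / 2;   % mid-range value
scale   = (max(csiSmooth, [], 2) - min(csiSmooth, [], 2)) / 2;   % half the peak-to-peak range
csiNorm = (csiSmooth - offset) ./ scale;                         % implicit expansion (R2016b+)
```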

3.3. Data Simulation Layer

The amount of real-time breathing data collected experimentally is not sufficient to train a robust ML classification model. Therefore, a simulation model inspired by [28] was developed to overcome this data scarcity. Based on the characteristics of the actual real-time breathing pattern data, the simulation model generates abundant, high-quality simulated breathing data. As breathing is a continuous process of inhalation and exhalation, breathing signals measured by the non-contact method can be approximated by sinusoidal waveforms. Here, breathing patterns are simulated through the curve fitting function available in MATLAB. Curve fitting is usually performed to describe experimental data points theoretically with a model (equation or function) and to obtain the model’s parameters. In this work, the MATLAB curve fitting function is used to model and generate all breathing patterns [29]. Since all breathing patterns can be represented by sinusoidal waveforms, the real-time breathing patterns are modelled as a sum of seven sinusoidal terms, as given in Equation (7):
y = Σ_{i=1}^{n} a_i sin(b_i x + c_i)        (7)
where a_i is the amplitude, b_i the frequency, and c_i the phase of each sinusoidal term, x represents the OFDM samples, and n is the total number of sinusoidal terms in the summation (here, n = 7). The coefficient values for the eight breathing patterns are shown in Table 1; for illustration, simulated and real-time Biot breathing patterns are shown in Figure 5. Real-time breathing patterns are prone to fluctuations; therefore, to bring the simulated patterns closer to the real-time data, noise is introduced into them using the additive white Gaussian noise (AWGN) function available in MATLAB [30]. A large amount of simulated data can be generated by slightly varying the AWGN values or the coefficient values in Table 1. At the output of the data simulation layer, simulated CSI for the eight breathing patterns is generated and used for classification.
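A minimal MATLAB sketch of this simulation step is given below: a sum-of-seven-sines model is fitted to one measured pattern, and noisy variants are then generated with the AWGN function. The 25 dB SNR, the number of generated variants, and the variable names are assumptions for illustration only.

```matlab
% Minimal sketch: fit Equation (7) to one normalized real-time pattern and
% generate noisy simulated copies (Curve Fitting and Communications Toolboxes).
y = csiNorm(1, :)';                 % one measured breathing trace (column vector)
x = (1:numel(y))';                  % OFDM sample index
f = fit(x, y, 'sin7');              % y = sum a_i*sin(b_i*x + c_i), i = 1..7 (Table 1 coefficients)
ySim    = f(x);                     % clean simulated breathing pattern
simData = zeros(numel(ySim), 100);  % 100 noisy variants (assumed count)
for trial = 1:100
    simData(:, trial) = awgn(ySim, 25, 'measured');   % add noise at an assumed 25 dB SNR
end
```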

3.4. Data Classification Layer

In this layer, the processed and simulated CSI is used for training and testing. First, various statistical features of the processed CSI are extracted, and the performance of the ML algorithms is evaluated in terms of accuracy, prediction speed, and training time. Likewise, statistical features are extracted from the simulated CSI, after which the performance of the ML algorithms is evaluated on the simulated data. Details of the statistical features are given in Table 2. As the accuracy of ML algorithms depends on the size and type of the dataset, it can be enhanced by enlarging the dataset; in this work, the dataset was enlarged by introducing a large amount of simulated breathing data. Furthermore, random five-fold cross-validation was applied for classification.
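The sketch below illustrates this classification step for one of the evaluated models, a cosine KNN with random five-fold cross-validation, using a handful of the Table 2 features; the feature subset, the number of neighbors, and the variable names are assumptions.

```matlab
% Minimal sketch: extract a few Table 2 features per example and train a
% cosine KNN with five-fold cross-validation (Statistics and ML Toolbox).
% 'examples' is assumed to be a cell array of CSI vectors, 'labels' their classes.
feat = zeros(numel(examples), 4);
for i = 1:numel(examples)
    s = examples{i};
    feat(i, :) = [min(s), max(s), mean(s), rms(s)];   % subset of Table 2 features
end
mdl = fitcknn(feat, labels, 'NumNeighbors', 10, 'Distance', 'cosine');
cv  = crossval(mdl, 'KFold', 5);        % random five-fold cross-validation
acc = 1 - kfoldLoss(cv);                % cross-validated classification accuracy
```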

4. Results and Discussion

In this section, the experimental setup is presented, and the results are discussed.

4.1. Experimental Setup

The experimental setup monitors and classifies eight breathing patterns by detecting small-scale activities in a real-time wireless medium through fine-grained CSI. It contains three main blocks: the transmitter, the receiver, and the wireless channel, as shown in Figure 6. The transmitter block consists of a transmitter PC and a USRP model 2922, which is utilized as the SDR hardware performing the generic RF functionality, while the receiver block consists of a receiver PC and a receiver USRP. Omnidirectional antennas are used to capture the variations in the CSI due to breathing. The RF signal generated by the transmitter reaches the receiver through multiple paths in the indoor lab environment. When an individual is present, an additional path is created due to diffraction or reflection of the signals by the human body. Therefore, the impact of human breathing on the signal’s propagation is captured at the receiver in the form of CSI.
Five volunteers participated in the study and performed the breathing patterns; their details are given in Table 3. Each volunteer sat in a relaxed position, and both the transmitter and receiver USRPs were positioned parallel to the volunteer’s abdomen at a distance of 1 m. All volunteers were professionally trained to perform each breathing pattern. Ten datasets were collected from each of the five volunteers for each of the eight breathing patterns, giving 400 experiments in total, and each breathing activity was performed for 30 s.

4.2. Breathing Patterns’ Monitoring

This section describes the monitoring of breathing patterns using SDR-based RF sensing. The CSI amplitude response was exploited to analyze the breathing patterns, and the variations in the amplitude response were observed for each breathing experiment over 3500 OFDM samples. For illustration, the results from subject 2 for the eight breathing patterns are depicted in Figure 7 for a single OFDM subcarrier. Eupnea is normal breathing with a uniform rate and pattern and a BR between 12 and 24 bpm; Figure 7a shows 12 breaths in half a minute, which lies within the range of normal breathing. Bradypnea is shallow and slow breathing with a uniform pattern, as can be observed in Figure 7b, which shows 6 breaths in half a minute, within the range of slow breathing. In tachypnea, the BR is faster than in eupnea, which can be verified in Figure 7c, showing 15 breaths in half a minute. Biot is characterized by deep breathing with regular periods of apnea, and Figure 7d depicts deep breaths followed by apnea. Sighing is normal breathing punctuated by sighs, as observed in Figure 7e, which shows normal breathing punctuated by frequent deep breaths. Cheyne–Stokes is defined by a gradual increase and decrease in BR, which is clearly visible in Figure 7f. Kussmaul is deep and fast breathing, as observed in Figure 7g. Finally, CSA is breathing that repeatedly stops and starts during sleep; Figure 7h shows periods of no breathing interrupting normal breathing.
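For orientation only, the sketch below shows one simple way a breathing rate could be read off a single subcarrier’s amplitude trace by peak counting; this is not the classification method used in the paper, and the peak-prominence setting and sampling-rate variable are assumptions (with roughly 3500 samples per 30 s activity, Fs is on the order of 117 samples/s).

```matlab
% Minimal sketch: estimate breaths per minute from one normalized subcarrier
% trace by counting peaks (not the authors' classification method).
Fs    = 3500 / 30;                                        % approximate CSI sampling rate (samples/s)
trace = csiNorm(1, :);                                    % one subcarrier's amplitude trace
[~, locs] = findpeaks(trace, 'MinPeakProminence', 0.5);   % assumed prominence threshold
bpm = numel(locs) * 60 / (numel(trace) / Fs);             % breaths per minute
```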

4.3. Breathing Patterns’ Classification

In this section, the results of the ML algorithms for the classification of breathing patterns are discussed. To assess the performance of each ML algorithm, a confusion matrix with eight predicted and true classes was used. The diagonal entries of the matrix represent the cases where the actual and predicted classes match, while the off-diagonal entries show where the ML algorithm performed poorly. The performance of the ML algorithms was evaluated based on prediction speed, accuracy, and training time: prediction speed is measured in observations per second, accuracy is expressed as a percentage, and training time is measured in seconds. Initially, the real-time breathing pattern data were used to train four ML algorithms; the confusion matrix results are shown in Table 4, and it can be seen that all algorithms classified these breathing patterns successfully. Then, ten thousand simulated breathing patterns were used to train the ML algorithms; the resulting confusion matrices are shown in Table 5, and it can be seen that these algorithms classify the breathing patterns even more successfully.
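Confusion matrices such as those in Tables 4 and 5 can be produced from the cross-validated model of the earlier sketch as follows (the variable names carried over are assumptions).

```matlab
% Minimal sketch: confusion matrix and overall accuracy from the
% cross-validated KNN model of the previous sketch.
pred = kfoldPredict(cv);               % out-of-fold predictions
cm   = confusionmat(labels, pred);     % rows: true class, columns: predicted class
acc  = sum(diag(cm)) / sum(cm(:));     % overall classification accuracy
```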
Finally, Table 6 compares performance on the real-time and simulated breathing data. It can be observed that accuracy improves for all algorithms on the simulated breathing data compared to the real-time breathing data. For the cosine k-nearest neighbor (KNN) classifier, accuracy increases from 97.5% to 99.3%; for the complex tree algorithm, from 96.8% to 98.4%; for the ensemble boosted tree algorithm, from 85.6% to 94.7%; and for the linear support vector machine (SVM), from 75.5% to 84.9%. The comparison of the ML algorithms in terms of accuracy is also illustrated in Figure 8, which shows that accuracy improves for all algorithms when the breathing pattern dataset is enlarged through simulation.

5. Conclusions

In this article, a solution is proposed to the problem of collecting a large dataset of abnormal breathing patterns during a pandemic; this approach improves the classification accuracy of the ML model. A non-contact SDR platform was used to collect real-time breathing pattern data based on variations in CSI, and the real-time data were used to generate a large simulated dataset through the curve fitting technique. This increases the dataset size, which helps in building a reliable ML model. Different ML algorithms were exploited to classify breathing patterns for real-time as well as simulated breathing data, and it was verified that accuracy improves when simulated data is introduced for training. This research can be used in COVID and non-COVID scenarios. The results indicate that the developed platform is accurate and robust for monitoring human breathing, and its accuracy can be further enhanced by introducing more simulated breathing data. The future applications of this work are numerous: it can be used as a pre-examination tool to provide clues about the nature of a patient’s illness, for individual monitoring at home during the day, and even in public places for infection monitoring in crowds. This work has a few limitations: experiments were performed on single subjects in an indoor lab environment, and actual patients were not included in the data collection. Future work will address these limitations.

Author Contributions

Conceptualization, M.R. and M.B.K.; data curation, M.R. and M.B.K.; formal analysis, M.R. and M.B.K.; funding acquisition, N.A.A., Q.H.A., M.A.I., A.A. and S.A.S.; investigation, M.R., M.B.K., S.A.S. and M.A.I.; methodology, M.R. and M.B.K.; project administration, N.A.A., Q.H.A., S.A.S., A.A. and M.A.I.; resources, M.R. and M.B.K.; software, M.R. and M.B.K.; supervision, N.A.A., Q.H.A., R.A.S., A.A., X.Y. and M.A.I.; validation, Q.H.A., R.A.S., S.A.S.; visualization, Q.H.A., R.A.S., S.A.S.; writing—original draft, M.R. and M.B.K.; writing—review and editing, Q.H.A., R.A.S. and N.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by EPSRC, grant numbers EP/R511705/1 and EP/T021063/1.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of COMSATS University Islamabad, Attock Campus.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X. Clinical Course and Risk Factors for Mortality of Adult Inpatients with COVID-19 in Wuhan, China: A Retrospective Cohort Study. Lancet 2020, 395, 1054–1062. [Google Scholar] [CrossRef]
  2. Khan, M.B.; Zhang, Z.; Li, L.; Zhao, W.; Hababi, M.A.M.A.; Yang, X.; Abbasi, Q.H. A Systematic Review of Non-Contact Sensing for Developing a Platform to Contain COVID-19. Micromachines 2020, 11, 912. [Google Scholar] [CrossRef] [PubMed]
  3. Clinical Methods: The History, Physical, and Laboratory Examinations, 3rd ed.; Walker, H.K.; Hall, W.D.; Hurst, J.W. (Eds.) Butterworths: Boston, MA, USA, 1990; ISBN 978-0-409-90077-4. [Google Scholar]
  4. Xu, Z.; Shi, L.; Wang, Y.; Zhang, J.; Huang, L.; Zhang, C.; Liu, S.; Zhao, P.; Liu, H.; Zhu, L. Pathological Findings of COVID-19 Associated with Acute Respiratory Distress Syndrome. Lancet Resp. Med. 2020, 8, 420–422. [Google Scholar] [CrossRef]
  5. Von Schéele, B.H.C.; Von Schéele, I.A.M. The Measurement of Respiratory and Metabolic Parameters of Patients and Controls before and after Incremental Exercise on Bicycle: Supporting the Effort Syndrome Hypothesis. Appl. Psychophysiol. Biofeedback 1999, 24, 167–177. [Google Scholar] [CrossRef] [PubMed]
  6. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X.; et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506. [Google Scholar] [CrossRef] [Green Version]
  7. Vafea, M.T.; Atalla, E.; Georgakas, J.; Shehadeh, F.; Mylona, E.K.; Kalligeros, M.; Mylonakis, E. Emerging Technologies for Use in the Study, Diagnosis, and Treatment of Patients with COVID-19. Cell. Mol. Bioeng. 2020, 13, 249–257. [Google Scholar] [CrossRef] [PubMed]
  8. Liaqat, D.; Abdalla, M.; Abed-Esfahani, P.; Gabel, M.; Son, T.; Wu, R.; Gershon, A.; Rudzicz, F.; Lara, E.D. WearBreathing: Real World Respiratory Rate Monitoring Using Smartwatches. Proc. ACM Int. Mobile Wearable Ubiquitous Technol. 2019, 3, 1–22. [Google Scholar] [CrossRef]
  9. Bae, M.; Lee, S.; Kim, N. Development of a Robust and Cost-Effective 3D Respiratory Motion Monitoring System Using the Kinect Device: Accuracy Comparison with the Conventional Stereovision Navigation System. Comput. Methods Prog. Biomed. 2018, 160, 25–32. [Google Scholar] [CrossRef] [PubMed]
  10. Shah, S.A.; Abbas, H.; Imran, M.A.; Abbasi, Q.H. RF Sensing for Healthcare Applications. In Backscattering and RF Sensing for Future Wireless Communication; John Wiley & Sons, Ltd.: London, UK, 2021; pp. 157–177. ISBN 978-1-119-69572-1. [Google Scholar]
  11. Chuma, E.L.; Iano, Y. A Movement Detection System Using Continuous-Wave Doppler Radar Sensor and Convolutional Neural Network to Detect Cough and Other Gestures. IEEE Sens. J. 2020, 21, 2921–2928. [Google Scholar] [CrossRef]
  12. Purnomo, A.T.; Lin, D.-B.; Adiprabowo, T.; Hendria, W.F. Non-Contact Monitoring and Classification of Breathing Pattern for the Supervision of People Infected by COVID-19. Sensors 2021, 21, 3172. [Google Scholar] [CrossRef] [PubMed]
  13. Adib, F.; Mao, H.; Kabelac, Z.; Katabi, D.; Miller, R.C. Smart Homes That Monitor Breathing and Heart Rate. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Korea, 18–23 April 2015; Association for Computing Machinery: New York, NY, USA; pp. 837–846. [Google Scholar]
  14. Patwari, N.; Brewer, L.; Tate, Q.; Kaltiokallio, O.; Bocca, M. Breathfinding: A Wireless Network That Monitors and Locates Breathing in a Home. IEEE J. Select. Top. Sig. Proc. 2013, 8, 30–42. [Google Scholar] [CrossRef] [Green Version]
  15. Abdelnasser, H.; Harras, K.A.; Youssef, M. UbiBreathe: A Ubiquitous Non-Invasive WiFi-Based Breathing Estimator. In Proceedings of the 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing, Hangzhou, China, 22–25 June 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 277–286. [Google Scholar]
  16. Wang, Z.; Jiang, K.; Hou, Y.; Dou, W.; Zhang, C.; Huang, Z.; Guo, Y. A Survey on Human Behavior Recognition Using Channel State Information. IEEE Access 2019, 7, 155986–156024. [Google Scholar] [CrossRef]
  17. Schmidt, R. Multiple Emitter Location and Signal Parameter Estimation. IEEE Trans. Anten. Propag. 1986, 34, 276–280. [Google Scholar] [CrossRef] [Green Version]
  18. Wang, F.; Zhang, F.; Wu, C.; Wang, B.; Liu, K.R. Respiration Tracking for People Counting and Recognition. IEEE Int. Things J. 2020, 7, 5233–5245. [Google Scholar] [CrossRef]
  19. Shah, S.A.; Fioranelli, F. RF Sensing Technologies for Assisted Daily Living in Healthcare: A Comprehensive Review. IEEE Aerosp. Electron. Syst. Mag. 2019, 34, 26–44. [Google Scholar] [CrossRef] [Green Version]
  20. Al-Wahedi, A.; Al-Shams, M.; Albettar, M.A.; Alawsh, S.; Muqaibel, A. Wireless Monitoring of Respiration and Heart Rates Using Software-Defined-Radio. In Proceedings of the 2019 16th International Multi-Conference on Systems, Signals Devices (SSD), Istanbul, Turkey, 21–24 March 2019; pp. 529–532. [Google Scholar]
  21. Praktika, T.O.; Pramudita, A.A. Implementation of Multi-Frequency Continuous Wave Radar for Respiration Detection Using Software Defined Radio. In Proceedings of the 2020 10th Electrical Power, Electronics, Communications, Controls and Informatics Seminar (EECCIS), Malang, Indonesia, 26–28 August 2020; pp. 284–287. [Google Scholar]
  22. Rehman, M.; Shah, R.A.; Khan, M.B.; AbuAli, N.A.; Shah, S.A.; Yang, X.; Alomainy, A.; Imran, M.A.; Abbasi, Q.H. RF Sensing Based Breathing Patterns Detection Leveraging USRP Devices. Sensors 2021, 21, 3855. [Google Scholar] [CrossRef] [PubMed]
  23. Rehman, M.; Shah, R.A.; Khan, M.B.; Ali, N.A.A.; Alotaibi, A.A.; Althobaiti, T.; Ramzan, N.; Shaha, S.A.; Yang, X.; Alomainy, A. Contactless Small-Scale Movement Monitoring System Using Software Defined Radio for Early Diagnosis of COVID-19. IEEE Sens. J. 2021, 21, 17180–17188. [Google Scholar] [CrossRef]
  24. Muin, F.; Apriono, C. Path Loss and Human Body Absorption Experiment for Breath Detection. In Proceedings of the 2020 27th International Conference on Telecommunications (ICT), Bali, Indonesia, 5–7 October 2020; pp. 1–5. [Google Scholar]
  25. Lee, S.; Park, Y.-D.; Suh, Y.-J.; Jeon, S. Design and Implementation of Monitoring System for Breathing and Heart Rate Pattern Using WiFi Signals. In Proceedings of the 2018 15th IEEE Annual Consumer Communications Networking Conference (CCNC), Las Vegas, NV, USA, 12–15 January 2018; pp. 1–7. [Google Scholar]
  26. Khan, M.B.; Yang, X.; Ren, A.; Al-Hababi, M.A.M.; Zhao, N.; Guan, L.; Fan, D.; Shah, S.A. Design of Software Defined Radios Based Platform for Activity Recognition. IEEE Access 2019, 7, 31083–31088. [Google Scholar] [CrossRef]
  27. Van de Beek, J.-J.; Borjesson, P.O.; Boucheret, M.-L.; Landstrom, D.; Arenas, J.M.; Odling, P.; Ostberg, C.; Wahlqvist, M.; Wilson, S.K. A Time and Frequency Synchronization Scheme for Multiuser OFDM. IEEE J. Select. Areas Commun. 1999, 17, 1900–1914. [Google Scholar] [CrossRef] [Green Version]
  28. Wang, Y.; Hu, M.; Zhou, Y.; Li, Q.; Yao, N.; Zhai, G.; Zhang, X.-P.; Yang, X. Unobtrusive and Automatic Classification of Multiple People’s Abnormal Respiratory Patterns in Real Time Using Deep Neural Network and Depth Camera. IEEE Int. Things J. 2020, 7, 8559–8571. [Google Scholar] [CrossRef]
  29. Sum of Sines Models-MATLAB & Simulink. Available online: https://www.mathworks.com/help/curvefit/sum-of-sine.html (accessed on 18 August 2021).
  30. Add White Gaussian Noise to Signal-MATLAB Awgn. Available online: https://www.mathworks.com/help/comm/ref/awgn.html (accessed on 18 August 2021).
Figure 1. Breathing patterns and causes.
Figure 2. Literature review summary.
Figure 3. System architecture.
Figure 4. Data preprocessing layer. (a) Subcarrier selection, (b) outliers removal, (c) data smoothing, (d) data normalization.
Figure 5. Simulated and real-time Biot breathing.
Figure 6. Non-contact breathing sensing experimental setup.
Figure 7. Breathing pattern results. (a) Eupnea, (b) bradypnea, (c) tachypnea, (d) Biot, (e) sighing, (f) Cheyne–Stokes, (g) Kussmaul, (h) CSA.
Figure 8. Comparison of the algorithms’ accuracy.
Table 1. Coefficient values for simulated breathing patterns.

Coefficient | Eupnea | Bradypnea | Tachypnea | Biot | Sighing | Kussmaul | Cheyne–Stokes | CSA
Amplitude
a1 | 0.506 | 0.409 | 0.406 | 0.435 | 0.256 | 0.547 | 0.775 | 0.427
a2 | 0.342 | 0.303 | 0.326 | 0.260 | 0.333 | 0.282 | 0.302 | 0.437
a3 | 0.064 | 0.441 | 3.852 | 0.179 | 0.296 | 0.198 | 0.232 | 0.176
a4 | 0.077 | 0.174 | 1.838 | 0.564 | 0.192 | 0.192 | 6.850 | 0.329
a5 | 0.218 | 0.109 | 3.693 | 0.458 | 0.147 | 0.123 | 6.855 | 0.240
a6 | 0.043 | 0.075 | 0.066 | 0.237 | 0.179 | 0.120 | 0.205 | 0.202
a7 | 0.066 | 0.607 | 1.692 | 0.789 | 0.152 | 9.275 | 0.115 | 0.164
Frequency
b1 | 0.001 | 0.011 | 0.027 | 0.002 | 0.001 | 0.036 | 0.001 | 0.004
b2 | 0.021 | 0.012 | 0.002 | 0.015 | 0.018 | 0.039 | 0.016 | 0.025
b3 | 0.005 | 0.001 | 0.025 | 0.010 | 0.011 | 0.003 | 0.004 | 0.000
b4 | 0.019 | 0.008 | 0.029 | 0.012 | 0.021 | 0.032 | 0.021 | 0.007
b5 | 0.003 | 0.014 | 0.025 | 0.004 | 0.004 | 0.025 | 0.021 | 0.018
b6 | 0.029 | 0.005 | 0.009 | 0.007 | 0.029 | 0.029 | 0.018 | 0.022
b7 | 0.022 | 0.001 | 0.029 | 0.001 | 0.007 | 0.000 | 0.005 | 0.011
Phase
c1 | 2.388 | −1.035 | −0.297 | 3.864 | −2.571 | −2.797 | 2.968 | 2.462
c2 | −0.373 | 2.775 | 0.207 | −2.694 | −1.710 | −0.073 | 1.868 | 2.009
c3 | −1.513 | 3.456 | 2.941 | 3.302 | 2.123 | 0.331 | −2.013 | 0.944
c4 | 1.107 | −1.130 | 0.915 | −0.770 | 2.059 | −2.012 | −3.590 | −2.466
c5 | 2.434 | 1.926 | −0.119 | −2.595 | −1.478 | −1.583 | −0.497 | 2.735
c6 | −1.227 | 2.053 | −0.194 | −0.835 | −1.264 | −1.548 | 1.154 | −1.577
c7 | −0.075 | 1.091 | 4.008 | −4.899 | −1.650 | 3.133 | −1.390 | −1.952
Table 2. Statistical features.

Sr. No. | Statistical Feature | Detail | Equation
1 | Minimum | Minimum value in data | X_min = min(x_k)
2 | Maximum | Maximum value in data | X_max = max(x_k)
3 | Mean | Data mean | X_m = (1/L) Σ_{k=1}^{L} x_k
4 | Variance | Spread of data | X_SD = Σ_{k=1}^{n} (x_k − X_m)²
5 | Standard deviation | Square root of variance | X_v = sqrt( (1/(L−1)) Σ_{k=1}^{L} (x_k − X_m)² )
6 | Peak-to-peak value | Variations in data about the mean | X_pp = X_max − X_min, k = 1, 2, …, L
7 | RMS | Root mean of square data | X_RMS = sqrt( (1/L) Σ_{k=1}^{L} x_k² )
8 | Kurtosis | Peak sharpness of a frequency-distribution curve | X_K = (1/L) Σ_{k=1}^{L} (|x_k| − X_m)⁴ / X_RMS⁴
9 | Skewness | Measure of symmetry in data | X_S = (1/L) Σ_{k=1}^{L} (|x_k| − X_m)³ / X_RMS³
10 | Interquartile range | Mid-spread of data | X_IQ = X_3 − X_1
11 | Waveform factor | Ratio of the RMS value to the mean value | X_W = X_RMS / X_m
12 | Peak factor | Ratio of maximum value of data to RMS | X_P = max(x_k) / X_RMS, k = 1, 2, …, L
13 | FFT | Frequency information about data | X_FFT = Σ_{n=−L}^{L} x(n) e^{−j2πnk/N}
14 | Frequency Min | Minimum frequency component | X_fmin = Min(X_FFT)
15 | Frequency Max | Maximum frequency component | X_fmax = Max(X_FFT)
16 | Spectral Probability | Probability distribution of spectrum | X_SP = FFT(d)² / Σ_{k=−L}^{L} FFT(k)²
17 | Spectrum Entropy | Measure of data irregularity | X_H = Σ_{k=−L}^{L} p(d) ln(p(d))
18 | Signal Energy | Measure of energy component | X_SE = Σ_{k=−L}^{L} |p(d)|²
Table 3. Details of volunteers.

Sr. No. | Gender | Age (Years) | Weight (Pounds) | Height (Inches) | Body Mass Index
1 | Male | 26 | 168 | 68 | 25.4
2 | Male | 28 | 144 | 71 | 20.3
3 | Male | 31 | 114 | 70 | 16.8
4 | Male | 31 | 113 | 69 | 16.6
5 | Male | 31 | 143 | 68 | 21.5
Table 4. Confusion matrix for real-time breathing data.

Algorithms | Actual/Predicted | Eupnea | Bradypnea | Tachypnea | Biot | Sighing | Kussmaul | Cheyne–Stokes | CSA
Cosine KNNEupnea34876563904715
Bradypnea583512427010021
Tachypnea13471339624011311
Biot321236203460
Sighing00003648200
Kussmaul261173361830
Cheyne–Stokes00080436380
CSA43293110003546
Complex TreeEupnea3491740760090
Bradypnea483596040020
Tachypnea00348700220141
Biot543035880050
Sighing00003649100
Kussmaul005502358706
Cheyne–Stokes820190036210
CSA0040700003243
Ensemble Boosted TreeEupnea21226130781001340
Bradypnea1013007062004800
Tachypnea003372001620116
Biot99420347800310
Sighing00003644600
Kussmaul002501493461015
Cheyne–Stokes6002820033620
CSA001073003602541
Linear SVMEupnea2367631138272 569789
Bradypnea7121958174476118934376
Tachypnea2029671401230544
Biot5194973222640267701
Sighing0000344920100
Kussmaul0050191713314321
Cheyne–Stokes192143012113628027780
CSA0069600002954
Table 5. Confusion matrix for simulated breathing data.

Algorithms | Actual/Predicted | Eupnea | Bradypnea | Tachypnea | Biot | Sighing | Kussmaul | Cheyne–Stokes | CSA
Cosine KNNEupnea13,4934377307423
Bradypnea3313,539422022012
Tachypnea1797213,334490178
Biot433613,6040300
Sighing000013,648020
Kussmaul3501 134010
Cheyne–Stokes20031113,6421
CSA4929132108613,523
Complex TreeEupnea12,96031803650070
Bradypnea6813,4910840025
Tachypnea1013,35200940203
Biot4625013,570 18
Sighing000013,6292100
Kussmaul00140113,62103
Cheyne–Stokes113120450013,4800
CSA00358023013,287
Ensemble Boosted TreeEupnea12,5807980130001420
Bradypnea37512,95500003200
Tachypnea0013,25200325073
Biot1003496012,13400180
Sighing000013,49915100
Kussmaul001015213,456041
Cheyne–Stokes12761060013,4560
CSA001377012059012,094
Linear SVMEupnea10,0030003647000
Bradypnea19998023649000
Tachypnea17010,02703606000
Biot00310,00036210260
Sighing14900013,496050
Kussmaul0000365010,28400
Cheyne–Stokes00003650010,0000
CSA000036500010,000
Table 6. Performance of ML algorithms.

Algorithms | Real-Time: Accuracy (%) | Real-Time: Prediction Speed (obs/s) | Real-Time: Training Time (s) | Simulated: Accuracy (%) | Simulated: Prediction Speed (obs/s) | Simulated: Training Time (s)
Cosine KNN | 97.5 | ~2200 | 306.35 | 99.3 | ~500 | 2583.60
Complex Tree | 96.8 | ~410,000 | 11.16 | 98.4 | ~86,000 | 140.77
Ensemble Boosted Tree | 85.6 | ~80,000 | 390.58 | 94.7 | ~44,000 | 2897.90
Linear SVM | 75.5 | ~98,000 | 219.75 | 84.9 | ~32,000 | 1184.90
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

