Improving Machine Learning Classification Accuracy for Breathing Abnormalities by Enhancing Dataset

The recent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as coronavirus disease (COVID)-19, has appeared as a global pandemic with a high mortality rate. The main complication of COVID-19 is rapid respirational deterioration, which may cause life-threatening pneumonia conditions. Global healthcare systems are currently facing a scarcity of resources to assist critical patients simultaneously. Indeed, non-critical patients are mostly advised to self-isolate or quarantine themselves at home. However, there are limited healthcare services available during self-isolation at home. According to research, nearly 20–30% of COVID patients require hospitalization, while almost 5–12% of patients may require intensive care due to severe health conditions. This pandemic requires global healthcare systems that are intelligent, secure, and reliable. Tremendous efforts have been made already to develop non-contact sensing technologies for the diagnosis of COVID-19. The most significant early indication of COVID-19 is rapid and abnormal breathing. In this research work, RF-based technology is used to collect real-time breathing abnormalities data. Subsequently, based on this data, a large dataset of simulated breathing abnormalities is generated using the curve fitting technique for developing a machine learning (ML) classification model. The advantages of generating simulated breathing abnormalities data are two-fold; it will help counter the daunting and time-consuming task of real-time data collection and improve the ML model accuracy. Several ML algorithms are exploited to classify eight breathing abnormalities: eupnea, bradypnea, tachypnea, Biot, sighing, Kussmaul, Cheyne–Stokes, and central sleep apnea (CSA). The performance of ML algorithms is evaluated based on accuracy, prediction speed, and training time for real-time breathing data and simulated breathing data. The results show that the proposed platform for real-time data classifies breathing patterns with a maximum accuracy of 97.5%, whereas by introducing simulated breathing data, the accuracy increases up to 99.3%. This work has a notable medical impact, as the introduced method mitigates the challenge of data collection to build a realistic model of a large dataset during the pandemic.


Introduction
Coronavirus is a large family of viruses with various forms, such as the Middle East respiratory syndrome coronavirus (MERS-CoV), severe acute respiratory syndrome coronavirus (SARS-CoV), and the latest virus, SARS-CoV-2, also known as COVID-19. COVID-19 infection symptoms include respiratory tract illness, acute viral pneumonia with respirational failure, and death [1]. There are various ways to diagnose COVID-19 infection, of which monitoring breathing rate (BR) is considered one of the most significant indications of COVID-19. Therefore, investigating the BR and its connection with COVID-19 symptoms is now a popular area of research [2]. BR is usually defined as the number of breaths an individual takes per minute when resting. A normal BR is 10 to 24 breaths per minute (bpm), while an abnormal BR for adults can be categorized as hyperventilation (bpm > 24), hypoventilation (bpm < 10), or apnea [3]. For non-COVID scenarios, BR may increase with fever, illness, and other medical conditions.
For COVID scenarios, it is considered very important to determine the BR or breathing activity of the patients as abnormal breathing measurements may indicate a deterioration in the patient's health [4]. The BR can be measured through manual counting; however, this method is unreliable and prone to error. Therefore, measuring BR usually involves the expertise of a health professional, so it is usually performed in the hospital. The best method for BR measurement in hospitals is spirometry, which calculates the airflow during inhalation and exhalation. Other methods include electrical impedance pneumography (EIP), capnography, and inductance pneumography (IP) [5]. However, these methods require hospitalization. Due to the clinical emergency caused by COVID-19, BR monitoring of the COVID patients increases the risk of virus spread by visiting hospitals. Most patients do not show breathing distress at first, and healthcare professionals must send these patients back home for self-monitoring. According to medical research, patients with minor clinical conditions may worsen in the second week of COVID-19 infection. Therefore, patients with normal breathing functions do not necessitate hospitalizations and must be monitored using telemedicine methods during self-isolation [6]. In contrast, for patients facing acute breathing distress, real-time BR monitoring is very much mandatory.
Breathing can lose its regular rhythm because of numerous medical conditions such as potential injury or metabolic disorders. Breathing abnormalities can have breathing patterns that are shallow, deep, or fast. These abnormal patterns include eupnea, bradypnea, tachypnea, Biot, Kussmaul, sighing, Cheyne-Stokes, and CSA. Eupnea is normal breathing with a uniform pattern and rate, while BR is faster in tachypnea than in eupnea. Deep breaths characterize biot respiration with regular periods of apnea, and Kussmaul is deep and fast breathing. Bradypnea is shallow and slow breathing with a uniform pattern, while sighing is normal breathing punctuated by sighs. Cheyne-Stokes breathing is defined by a gradual increase and decrease in BR, whereas CSA is breathing that repeatedly stops and starts during sleep. Details of the breathing patterns and their causes is given below in Figure 1.
Non-contact monitoring during the pandemic situation is a promising solution to combat the spread of COVID-19. The most significant indication of COVID-19 is abnormal breathing. Therefore, classifying breathing patterns using ML is worthwhile and of great significance. The dataset of breathing patterns required for building the ML model is obtained by assessing test subjects' breathing patterns. This approach for capturing different breathing patterns yields a limited set of data during pandemic situations, insufficient for developing a reliable ML model. Therefore, there is a need for simulated breathing patterns to overcome the scarcity of real-time breathing data from actual patients. This research collected real-time breathing patterns data through a non-contact approach using the software-defined radio (SDR) platform. Subsequently, a large dataset of simulated breathing patterns was generated using the curve fitting technique. The obtained results were validated by evaluating various ML algorithms based on accuracy, prediction speed, and training time. This approach can be utilized for COVID as well as non-COVID scenarios and has many innovative healthcare applications. and training time. This approach can be utilized for COVID as well as non-COVID scenarios and has many innovative healthcare applications.

Literature Review
There are several contact-based and non-contact technologies for breath monitoring presented in the literature. Contact-based technologies require wearable sensors and smartwatches, etc. [7,8]. The devices used in contact-based technologies are expensive, heavy, and are often inconvenient for patients. To avoid this inconvenience, non-contact technologies have also been proposed. The advantages of non-contact technologies include continuous monitoring at home and even during sleep. Most non-contact technologies use camera-based imaging [9] or are RF-based [10]. Camera-based imaging breath monitoring needs a depth camera or thermal imaging camera. There are limitations in camera-based technologies; for example, depth cameras have a high computational cost and are expensive, while thermal imaging is vulnerable to ambient temperature. RF-based non-contact technologies leverage the propagation of electromagnetic (EM) waves that can be extracted through a wireless medium.
Furthermore, RF-based technology for breath monitoring includes various technologies, including radar, Wi-Fi, and SDR. For RF-signal sensing, these technologies can exploit channel state information (CSI) or received signal strength (RSS). There are numerous techniques for radar-based breath monitoring, including the Doppler radar [11] and frequency-modulated continuous-wave (FMCW) [12]. These radar-based techniques require high-cost, specialized hardware working at high frequency. Furthermore, the Vital-Radio system [13] uses a FMCW radar to track breathing and heart rates with a wide bandwidth from 5.46 GHz to 7.25 GHz. For Wi-Fi-based breath monitoring, numerous

Literature Review
There are several contact-based and non-contact technologies for breath monitoring presented in the literature. Contact-based technologies require wearable sensors and smartwatches, etc. [7,8]. The devices used in contact-based technologies are expensive, heavy, and are often inconvenient for patients. To avoid this inconvenience, non-contact technologies have also been proposed. The advantages of non-contact technologies include continuous monitoring at home and even during sleep. Most non-contact technologies use camera-based imaging [9] or are RF-based [10]. Camera-based imaging breath monitoring needs a depth camera or thermal imaging camera. There are limitations in camera-based technologies; for example, depth cameras have a high computational cost and are expensive, while thermal imaging is vulnerable to ambient temperature. RF-based non-contact technologies leverage the propagation of electromagnetic (EM) waves that can be extracted through a wireless medium.
Furthermore, RF-based technology for breath monitoring includes various technologies, including radar, Wi-Fi, and SDR. For RF-signal sensing, these technologies can exploit channel state information (CSI) or received signal strength (RSS). There are numerous techniques for radar-based breath monitoring, including the Doppler radar [11] and frequencymodulated continuous-wave (FMCW) [12]. These radar-based techniques require high-cost, specialized hardware working at high frequency. Furthermore, the Vital-Radio system [13] uses a FMCW radar to track breathing and heart rates with a wide bandwidth from 5.46 GHz to 7.25 GHz. For Wi-Fi-based breath monitoring, numerous approaches using RSS and CSI are mentioned in the literature. The authors of [14] explored the use of RSS measurements on the links between wireless devices to find the BR and location of a person in a home environment. Similarly, a complete architecture for finding breathing signals from noisy Wi-Fi signals was presented in [15]. A further study [16] provided a non-contact CSI-based breath monitoring system. Schmidt [17] applied the Hampel filter on the CSI series to eliminate outliers and high-frequency noises, and then, BR was measured by performing FFT on all the CSI streams targeting frequencies between 0.1 Hz and 0.6 Hz. Wang et al. [18] proposed a BR monitoring system using the CSI through a single pair of commercial Wi-Fi devices. Wi-Fi-based RF sensing has several advantages, such as cost-effectiveness and ready availability. However, it also has disadvantages, such as lack of scalability and flexibility, and under-reporting Orthogonal Frequency Division Multiplexing (OFDM) subcarriers [19].
SDR-based breath monitoring has been investigated by various authors [20][21][22][23][24]. SDRbased breath monitoring is considered the most efficient among all the RF-based techniques, as it offers a flexible, portable, and scalable solution. Additionally, this technology permits the selection of the operating frequency and transmitted/received power. Moreover, it allows the simple execution of signal processing algorithms. ML has also been exploited for breath monitoring to help accurately classify various breathing abnormalities. Several authors have used ML for classifying breathing patterns [25]. However, several studies were unable to obtain reasonable accuracy, and some were only successful in classifying basic breathing patterns, including normal and fast breathing [26]. Consequently, there is a need for a platform to monitor and accurately classify a diverse range of breathing patterns. The summary of various technologies discussed in the literature is shown in Figure 2.
Sensors 2021, 21, x FOR PEER REVIEW 4 of 17 approaches using RSS and CSI are mentioned in the literature. The authors of [14] explored the use of RSS measurements on the links between wireless devices to find the BR and location of a person in a home environment. Similarly, a complete architecture for finding breathing signals from noisy Wi-Fi signals was presented in [15]. A further study [16] provided a non-contact CSI-based breath monitoring system. Schmidt [17] applied the Hampel filter on the CSI series to eliminate outliers and high-frequency noises, and then, BR was measured by performing FFT on all the CSI streams targeting frequencies between 0.1 Hz and 0.6 Hz. Wang et al. [18] proposed a BR monitoring system using the CSI through a single pair of commercial Wi-Fi devices. Wi-Fi-based RF sensing has several advantages, such as cost-effectiveness and ready availability. However, it also has disadvantages, such as lack of scalability and flexibility, and under-reporting Orthogonal Frequency Division Multiplexing (OFDM) subcarriers [19]. SDR-based breath monitoring has been investigated by various authors [20][21][22][23][24]. SDRbased breath monitoring is considered the most efficient among all the RF-based techniques, as it offers a flexible, portable, and scalable solution. Additionally, this technology permits the selection of the operating frequency and transmitted/received power. Moreover, it allows the simple execution of signal processing algorithms. ML has also been exploited for breath monitoring to help accurately classify various breathing abnormalities. Several authors have used ML for classifying breathing patterns [25]. However, several studies were unable to obtain reasonable accuracy, and some were only successful in classifying basic breathing patterns, including normal and fast breathing [26]. Consequently, there is a need for a platform to monitor and accurately classify a diverse range of breathing patterns. The summary of various technologies discussed in the literature is shown in Figure 2.

System Architecture
The system architecture consists of four layers, as shown in Figure 3. The functionality of each layer is explained below:

System Architecture
The system architecture consists of four layers, as shown in Figure 3. The functionality of each layer is explained below:

Data Extraction Layer
The first layer is the data extraction layer, which is responsible for breathing data extraction. There are three main blocks in this layer containing the transmitter, receiver, and wireless channel. The transmitter contains transmitter PC and transmitter universal software radio peripheral (USRP). First, random data bits are generated and mapped to quadrature amplitude modulation (QAM) symbols in the transmitter PC. Then, these QAM symbols are further split into parallel frames. After that, reference data symbols are inserted in each parallel frame. On the receiver side, the reference symbols will be used for channel estimation. Then in each frame, zeros are positioned at the edges and 1 zero at DC. Next, frequency domain signals are converted into time-domain signals by applying inverse fast Fourier transform (IFFT). Subsequently, a cyclic prefix (CP) is introduced in every frame by repeating the last one-fourth of the points at the start. On the receiver side, the CP will be used in the elimination of frequency and time offset.

Data Extraction Layer
The first layer is the data extraction layer, which is responsible for breathing data extraction. There are three main blocks in this layer containing the transmitter, receiver, and wireless channel. The transmitter contains transmitter PC and transmitter universal software radio peripheral (USRP). First, random data bits are generated and mapped to quadrature amplitude modulation (QAM) symbols in the transmitter PC. Then, these QAM symbols are further split into parallel frames. After that, reference data symbols are inserted in each parallel frame. On the receiver side, the reference symbols will be used for channel estimation. Then in each frame, zeros are positioned at the edges and 1 zero at DC. Next, frequency domain signals are converted into time-domain signals by applying inverse fast Fourier transform (IFFT). Subsequently, a cyclic prefix (CP) is introduced in every frame by repeating the last one-fourth of the points at the start. On the receiver side, the CP will be used in the elimination of frequency and time offset.
Then, the host PC sends this data to the USRP kit through gigabit ethernet. First, this data is digitally upconverted and translated into an analog form using a digital upconverter (DUC) and digital to analog converter (DAC), respectively. The USRP then passes this analog signal through a low pass filter (LPF) and mixes it up to a user-specified frequency. Before transmitting the resultant signal using an omnidirectional antenna, USRP moves it through a transmit amplifier for gain adjustment. After passing through the wireless channel, the transmitted signal is received at the receiver USRP device using an omnidirectional antenna. This received signal is then moved through a low noise amplifier (LNA) and drive amplifier (DA) for noise element elimination and gain adjustment, respectively. The subsequent signal is passed through LPF, then an analog to digital converter (DAC), followed by the digital down converter (DDC) for filtering and decimating the signal. Eventually, the resultant signal is moved to the host PC using a gigabit ethernet cable. Now CP is eliminated from each frame at the host PC; additionally, time and frequency offset are removed by applying the Van de Beek algorithm [27]. Afterward, the fast Fourier transform (FFT) converts the time domain samples into frequency domain symbols. Finally, breathing patterns are detected by extracting the amplitude response of the frequency domain signal. Then, the host PC sends this data to the USRP kit through gigabit ethernet. First, this data is digitally upconverted and translated into an analog form using a digital upconverter (DUC) and digital to analog converter (DAC), respectively. The USRP then passes this analog signal through a low pass filter (LPF) and mixes it up to a user-specified frequency. Before transmitting the resultant signal using an omnidirectional antenna, USRP moves it through a transmit amplifier for gain adjustment. After passing through the wireless channel, the transmitted signal is received at the receiver USRP device using an omnidirectional antenna. This received signal is then moved through a low noise amplifier (LNA) and drive amplifier (DA) for noise element elimination and gain adjustment, respectively. The subsequent signal is passed through LPF, then an analog to digital converter (DAC), followed by the digital down converter (DDC) for filtering and decimating the signal. Eventually, the resultant signal is moved to the host PC using a gigabit ethernet cable. Now CP is eliminated from each frame at the host PC; additionally, time and frequency offset are removed by applying the Van de Beek algorithm [27]. Afterward, the fast Fourier transform (FFT) converts the time domain samples into frequency domain symbols. Finally, breathing patterns are detected by extracting the amplitude response of the frequency domain signal.
The wireless channel for the OFDM system can be regarded as a narrowband flat fading channel, which can be represented in the frequency domain, as shown in Equation (1): where Y and X denote the received and transmitted wireless signal vectors, respectively, N is the additive white Gaussian noise, and H represents the OFDM channel frequency response for all subcarriers, and this H can be estimated from Y and X. Here, the OFDM where k represents the OFDM subcarriers, and s represents acquired samples. The frequency response of the channel for a single subcarrier i is denoted by H i , and is a complex value, given in Equation (3) as: where |H i | and ∠H i are the amplitude and phase response of OFDM subcarrier i, respectively. For indoor lab environments with multipath components, the channel frequency response H i of subcarrier i is expressed in Equation (4) as: where N is the total number of multipath components, and r n and τ n are the attenuation and propagation delay on the n th path, respectively, while f i represents the frequency of the i th subcarrier.

Data Preprocessing Layer
Raw CSI data received from the extraction layer is sent to the data preprocessing layer. This layer is further divided into four sublayers.

Subcarrier Selection
The first step in the data preprocessing layer is subcarrier selection. After each activity, 256 OFDM subcarriers are acquired at the receiver. It is realized that the susceptibility of each subcarrier is distinct for the breathing experiment. Therefore, for good detection, eliminating all subcarriers that are less susceptible to breathing activity is necessary. Therefore, the variance of all subcarriers is measured. Based on this, all subcarriers that are less susceptible to breathing activity are eliminated, as shown in Figure 4a for all OFDM subcarriers.

Outliers Removal
After subcarrier selection, wavelet filtering is applied to eliminate the outliers from raw data by retaining sharp transition. This can be seen in Figure 4b for all OFDM subcarriers. For wavelet filtering, "scaled noise option" and "soft heuristic SURE" thresholding is used on coefficients by choosing "syms5" and "level 4".

Data Smoothing
The moving average filter of "size 8" is applied for data smoothing, which removes high-level frequency noise, and this can be seen in Figure 4c for all OFDM subcarriers. The output of the moving average filter of window "size 8" can be represented by Equation (5): where y[n] is the current output, x[n] is the current input, and N is the window size of the moving average filter.

Outliers Removal
After subcarrier selection, wavelet filtering is applied to eliminate the outliers from raw data by retaining sharp transition. This can be seen in Figure 4b for all OFDM subcarriers. For wavelet filtering, "scaled noise option" and "soft heuristic SURE" thresholding is used on coefficients by choosing "syms5" and "level 4".

Data Smoothing
The moving average filter of "size 8" is applied for data smoothing, which removes high-level frequency noise, and this can be seen in Figure4c for all OFDM subcarriers. The output of the moving average filter of window "size 8" can be represented by Equation (5): where [ ] is the current output, [ ] is the current input, and is the window size of the moving average filter.

Data Normalization
Finally, waveform data is normalized to the maximum and minimum value to 1 and −1, respectively, by using the following Equation (6):

Data Normalization
Finally, waveform data is normalized to the maximum and minimum value to 1 and −1, respectively, by using the following Equation (6): where y[n] is the normalized data, and y[n] is the input data. Here, input waveform data is scaled and offset by some values to acquire normalized waveform. For example, the normalized waveform for a single OFDM subcarrier is shown in Figure 4d. After performing the above steps, processed CSI data is obtained to classify breathing patterns through ML algorithms.

Data Simulation Layer
The amount of real-time breathing experimentation is not sufficient to train a robust ML classification model. Therefore, a simulation model inspired by [28] was developed to overcome the data scarcity issue. Based on the characteristics of actual real-time breathing patterns data, a simulation model was developed to generate abundant and high-quality simulated breathing data. As breathing is a continuous process of inhalation and exhalation, breathing signals measured by the non-contact method can be approximated by the sinusoidal waveforms. Here, breathing patterns are simulated through the curve fitting function available in MATLAB. The curve fitting is usually performed to theoretically describe experimental data points with a model (equation or function) and acquire the model's parameters. In this research work, the curve fitting function of MATLAB is used to model and generate all breathing patterns [29]. As all breathing patterns can be repre- sented by sinusoidal waveform, therefore by using the curve fitting function of MATLAB, real-time breathing patterns are modelled by the sum of seven sinusoidal terms, which can be represented by Equation (7): where a is the amplitude, b is the frequency, and c is the phase for each sinusoidal term, while x represents OFDM samples, and n is the total number of sinusoidal terms in the summation. The coefficients' values for eight breathing patterns are shown in Table 1.
Here, for illustration purposes, simulated and real-time Biot breathing patterns are shown in Figure 5. Real-time breathing patterns are prone to fluctuations; therefore, to make simulated breathing patterns closer to real-time data, the additive white Gaussian noise (AWGN) function available in MATLAB introduces noise into the simulated breathing patterns [30]. A huge amount of simulated data can be generated by slight variations in AWGN values or the coefficients' values shown in Table 1. At the output of the data simulation layer, simulated CSI for eight breathing patterns is generated and used for classification purposes. where [ ] is the normalized data, and [ ] is the input data. Here, input waveform data is scaled and offset by some values to acquire normalized waveform. For example, the normalized waveform for a single OFDM subcarrier is shown in Figure 4d. After performing the above steps, processed CSI data is obtained to classify breathing patterns through ML algorithms.

Data Simulation Layer
The amount of real-time breathing experimentation is not sufficient to train a robust ML classification model. Therefore, a simulation model inspired by [28] was developed to overcome the data scarcity issue. Based on the characteristics of actual real-time breathing patterns data, a simulation model was developed to generate abundant and high-quality simulated breathing data. As breathing is a continuous process of inhalation and exhalation, breathing signals measured by the non-contact method can be approximated by the sinusoidal waveforms. Here, breathing patterns are simulated through the curve fitting function available in MATLAB. The curve fitting is usually performed to theoretically describe experimental data points with a model (equation or function) and acquire the model's parameters. In this research work, the curve fitting function of MATLAB is used to model and generate all breathing patterns [29]. As all breathing patterns can be represented by sinusoidal waveform, therefore by using the curve fitting function of MATLAB, real-time breathing patterns are modelled by the sum of seven sinusoidal terms, which can be represented by Equation (7): where is the amplitude, is the frequency, and is the phase for each sinusoidal term, while represents OFDM samples, and is the total number of sinusoidal terms in the summation. The coefficients' values for eight breathing patterns are shown in Table  1. Here, for illustration purposes, simulated and real-time Biot breathing patterns are shown in Figure 5. Real-time breathing patterns are prone to fluctuations; therefore, to make simulated breathing patterns closer to real-time data, the additive white Gaussian noise (AWGN) function available in MATLAB introduces noise into the simulated breathing patterns [30]. A huge amount of simulated data can be generated by slight variations in AWGN values or the coefficients' values shown in Table 1. At the output of the data simulation layer, simulated CSI for eight breathing patterns is generated and used for classification purposes.

Data Classification Layer
In this layer, the processed and simulated CSI is used for training and testing purposes. First various statistical features of the processed CSI are extracted, and the performance of the ML algorithms is evaluated based on accuracy, prediction speed, and training time. Likewise, statistical features for the simulated CSI are extracted. After this, the performance of the ML algorithms is evaluated for the simulated data. The details of the statistical features are shown in Table 2. As the accuracy of ML algorithms depends upon the size and type of the dataset, it can be enhanced by enlarging the dataset. In this work, the dataset size was enhanced by introducing a large amount of simulated breathing data. Furthermore, random five-fold cross-validation was applied for classification purposes.
Standard deviation Square root of variance Peak-to-peak value Variations in data about the mean X p−p = X max − X min (k = 1, 2, . . . , L)

Results and Discussion
In this section, the experimental setup is presented, and the results are discussed.

Experimental Setup
The experimental setup monitors and classifies eight breathing patterns by detecting small-scale activities in a real-time wireless medium through acquiring fine-grained CSI.
The experimental setup contains three main blocks: the transmitter, receiver, and wireless channel, as shown in Figure 6. The transmitter block contains a transmitter PC, and the USRP model 2922 is utilized as SDR hardware to perform generic RF functionality. In contrast, the receiver block consists of a receiver PC and receiver USRP. Omnidirectional antennas are also used to capture the variation in the CSI due to breathing. The RF signal generated by the transmitter reaches the receiver through multipaths in the indoor lab environment. When an individual is present in the lab environment, an additional path is created due to the human body's diffraction or reflection of signals. Therefore, the impact of human breathing on the signal's propagation is acquired on the receiver side in the form of CSI.
Sensors 2021, 21, x FOR PEER REVIEW 11 of 17 generated by the transmitter reaches the receiver through multipaths in the indoor lab environment. When an individual is present in the lab environment, an additional path is created due to the human body's diffraction or reflection of signals. Therefore, the impact of human breathing on the signal's propagation is acquired on the receiver side in the form of CSI. Figure 6. Non-contact breathing sensing experimental setup.
Five volunteers participated in the study and performed the breathing patterns; their details are given in Table 3. Each volunteer was sitting in a relaxed position. Both the transmitter and receiver USRPs were positioned parallel to the abdomen of the volunteer at a 1m distance. All volunteers were professionally trained to perform each breathing pattern. Ten datasets were collected from five volunteers for eight breathing patterns to perform 400 experiments. Each breathing activity was performed for 30 s by each volunteer.

Breathing Patterns' Monitoring
This section describes the monitoring of breathing patterns using SDR-based RF sensing. The CSI amplitude response was exploited to analyze the breathing patterns. The variations in amplitude frequency response were observed for each breathing experiment over 3500 OFDM samples. For illustration purposes, the results from subject 2 for eight breathing patterns are depicted in Figure 7 for a single OFDM subcarrier. Eupnea is normal breathing with a uniform rate and pattern with a BR between 12 to 24 bpm. From Figure 7a, it can be observed that there are 12 breaths in a half-minute, which lies within the range of normal breathing. Bradypnea is shallow and slow breathing with a uniform pattern, as can be observed in Figure 7b, showing 6 breaths in a half-minute, which lies in the range of slow breathing. In tachypnea, the BR is faster than eupnea, which can be verified in Figure 7c, which shows there are 15 breaths in a half minute. Biot is characterized by deep breathing with regular periods of apnea, and Figure 7d depicts deep breaths followed by apnea. Sighing is normal breathing punctuated by sighs. This can be observed in Figure 7e, which shows normal breathing punctuated by frequent deep breaths. Five volunteers participated in the study and performed the breathing patterns; their details are given in Table 3. Each volunteer was sitting in a relaxed position. Both the transmitter and receiver USRPs were positioned parallel to the abdomen of the volunteer at a 1-m distance. All volunteers were professionally trained to perform each breathing pattern. Ten datasets were collected from five volunteers for eight breathing patterns to perform 400 experiments. Each breathing activity was performed for 30 s by each volunteer.

Breathing Patterns' Monitoring
This section describes the monitoring of breathing patterns using SDR-based RF sensing. The CSI amplitude response was exploited to analyze the breathing patterns. The variations in amplitude frequency response were observed for each breathing experiment over 3500 OFDM samples. For illustration purposes, the results from subject 2 for eight breathing patterns are depicted in Figure 7 for a single OFDM subcarrier. Eupnea is normal breathing with a uniform rate and pattern with a BR between 12 to 24 bpm. From Figure 7a, it can be observed that there are 12 breaths in a half-minute, which lies within the range of normal breathing. Bradypnea is shallow and slow breathing with a uniform pattern, as can be observed in Figure 7b, showing 6 breaths in a half-minute, which lies in the range of slow breathing. In tachypnea, the BR is faster than eupnea, which can be verified in Figure 7c, which shows there are 15 breaths in a half minute. Biot is characterized by deep breathing with regular periods of apnea, and Figure 7d depicts deep breaths followed by apnea. Sighing is normal breathing punctuated by sighs. This can be observed in Figure 7e, which shows normal breathing punctuated by frequent deep breaths. Cheyne-Stokes is defined by a gradual increase and decrease in BR, and Figure 7f clearly shows a gradual increase and decrease in BR. Kussmaul is deep and fast breathing, which can be observed in Figure 7g. Furthermore, CSA is a type of breathing in which breathing repeatedly stops and starts during sleep and this can be seen in Figure 7h, which shows periods of no breathing during normal breathing. Cheyne-Stokes is defined by a gradual increase and decrease in BR, and Figure 7f clearly shows a gradual increase and decrease in BR. Kussmaul is deep and fast breathing, which can be observed in Figure 7g. Furthermore, CSA is a type of breathing in which breathing repeatedly stops and starts during sleep and this can be seen in Figure 7h, which shows periods of no breathing during normal breathing.

Breathing Patterns' Classification
In this section, the results of ML algorithms for the classification of breathing patterns are discussed. To assess the performance of each ML algorithm, a confusion matrix was used, with eight predicted and true classes. The diagonal entries of the matrix represent

Conclusions
In this article, a solution is proposed for the problem of collecting a large dataset of abnormal breathing patterns during the pandemic. This approach improves the classification accuracy of the ML model. A non-contact SDR platform was used to collect real-

Conclusions
In this article, a solution is proposed for the problem of collecting a large dataset of abnormal breathing patterns during the pandemic. This approach improves the classification accuracy of the ML model. A non-contact SDR platform was used to collect real-time breathing patterns data based on variations in CSI. The real-time data was used to generate a large, simulated dataset through the curve fitting technique. This increases the dataset size, which will help in building a reliable ML model. Different ML algorithms were exploited for breathing patterns classification for real-time as well as simulated breathing data. It was verified that accuracy improves by introducing simulated data for training purposes. This research work can be used for COVID and non-COVID scenarios. The results indicate that the developed platform is accurate and robust for monitoring human breathing, and its accuracy can be further enhanced by introducing more simulated breathing data. The future applications of this research work are numerous, for example, it can be used as a pre-examination tool for patients to provide clues about the nature of the illness, it can be used in homes for individual monitoring during the day, and it can even be deployed in public places for infection monitoring in crowds. This work has a few limitations, including that experiments were performed on single subjects in an indoor lab environment, and actual patients were not chosen for data collection. So, future recommendations for this research work would remove all of the limitations stated in this research.