Development of an Ultrasonic Doppler Sensor-Based Swallowing Monitoring and Assessment System

Existing swallowing evaluation methods using X-ray or endoscopy are qualitative. The present study develops a swallowing monitoring and assessment system (SMAS) that is nonintrusive and quantitative. The SMAS comprises an ultrasonic Doppler sensor array, a microphone, and an inertial measurement unit to measure ultrasound signals originating only from swallowing activities. Ultrasound measurements were collected for combinations of two viscosity conditions (water and yogurt) and two volume conditions (3 mL and 9 mL) from 24 healthy participants (14 males and 10 females; age = 30.5 ± 7.6 years) with no history of swallowing disorders and were quantified for 1st peak amplitude, 2nd peak amplitude, peak-to-peak (PP) time interval, duration, energy, and proportion of two or more peaks. The peak amplitudes and energy decreased significantly with viscosity, and the PP time interval and duration increased with volume. The correlation between the time measures (r = 0.78) was higher than that between the amplitude measures (r = 0.30), and the energy was highly correlated with the 1st peak amplitude (r = 0.86). The proportion of two or more peaks varied from 76.8% to 87.9% by viscosity and volume. Further research is needed to examine the concurrent validity and generalizability of the ultrasonic Doppler sensor-based SMAS.


Introduction
Early screening of and intervention for dysphagia are important because dysphagia can cause aspiration, pneumonia, dehydration, and malnutrition, resulting in a decreased quality of life and even death [1]. Dysphagia, a disturbance in transferring solid or liquid food from the mouth to the stomach [2,3], occurs in neurologic patients and the elderly [4,5]. Note that laryngeal closure and
analysis if values of the microphone or IMU are greater than designated values (e.g., microphone >0.2 mV or IMU >200 deg/s). A seven-ultrasonic Doppler sensor array with two transmitters and five receivers in a curved strip (Figure 1b; DA-2.0M7-CRBK, Digital Echo Co., Republic of Korea; frequency = 2 MHz ± 3%, length (L) × width (W) × thickness (T) = 38.0 mm × 15.5 mm × 3.0 mm, kerf = 1 mm, and the radius of curvature = 133 mm) is custom-designed for its close contact to the neck and securing a wide capture area of reflected ultrasound signals due to a pharyngeal movement. The sensor array is housed in a strip pad to measure movements in a sufficiently large pharyngeal area and connected to the device by a flexible polyvinyl chloride (PVC) cable. Next, an omnidirectional electret condenser microphone (WM-54BH, Panasonic Corporation, Japan; frequency = 20~16,000 Hz, radius = 9.7 mm, sensitivity = −42 ± 2 dB, and signal to noise ratio >60 dB) and an IMU (MPU-6500, TDK Corporation, Japan; 6-axis gyro accelerometer, L × W × T = 3 mm × 3 mm × 0.9 mm) are used to record signals due to vocalization, coughing, and/or movements around the neck, which also can affect ultrasound signals. Lastly, a Bluetooth chip (nRF52832, Nordic Semiconductor, Norway; frequency = 2.4 GHz and data rate = 2 Mbps) is used for wireless communication of signals.
A signal processing program is developed in the present study for the acquisition, rectification, smoothing, and quantification of signals from the sensor array. First, ultrasound measurements of a particular time period are excluded as noises if those of the microphone or IMU of the time period exceed the designated values (Figure 2a). Second, all the filtered ultrasound signals are converted to positive values and then smoothed by a simple moving average algorithm (Figure 2b).
Lastly, the smoothed signals are quantified by six measures (1st peak amplitude, 2nd peak amplitude, peak-to-peak (PP) time interval, duration, energy, and the number of peaks) referring to Lee [43], as shown in Figure 2c. A peak is operationally defined as a point at which the signal stops increasing and starts decreasing while its amplitude remains above 0.1 mV; the PP time interval (unit: ms) as the time interval between the 1st and 2nd peaks; the duration (unit: ms) as the time difference between the starting point, when the signal begins to increase above a designated level, and the ending point, when the signal reaches the baseline level; and the energy (unit: mV²) as the sum of the squared amplitudes over the duration [44].
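The processing steps described above (noise exclusion, rectification, smoothing, and quantification) can be sketched in Python as follows. This is only an illustrative sketch, not the SMAS software: the function names, the window length, the sampling rate, the onset level, and the choice to zero out gated samples are assumptions. Only the 0.2 mV, 200 deg/s, and 0.1 mV thresholds come from the text.

```python
from typing import Dict, List, Optional

MIC_LIMIT_MV = 0.2     # example microphone threshold given in the text
IMU_LIMIT_DPS = 200.0  # example IMU threshold given in the text
PEAK_MIN_MV = 0.1      # minimum amplitude of a peak given in the text


def gate_noise(us: List[float], mic: List[float], imu: List[float]) -> List[float]:
    """Zero out ultrasound samples whose microphone or IMU value exceeds its
    limit (assumption: excluded samples are set to 0 rather than dropped)."""
    return [0.0 if abs(m) > MIC_LIMIT_MV or abs(g) > IMU_LIMIT_DPS else u
            for u, m, g in zip(us, mic, imu)]


def rectify(signal: List[float]) -> List[float]:
    """Convert all samples to positive values."""
    return [abs(v) for v in signal]


def moving_average(signal: List[float], window: int = 5) -> List[float]:
    """Centered simple moving average; edges use the samples available."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        seg = signal[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def find_peaks(signal: List[float], min_amp: float = PEAK_MIN_MV) -> List[int]:
    """Indices where the signal stops rising and starts falling above min_amp."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1] and signal[i] > min_amp]


def quantify(signal: List[float], fs_hz: float,
             onset_level: float = 0.05) -> Optional[Dict[str, Optional[float]]]:
    """Compute the six measures for one smoothed swallow segment.
    onset_level (hypothetical) marks both the start and the end of the swallow."""
    above = [i for i, v in enumerate(signal) if v > onset_level]
    if not above:
        return None
    start, end = above[0], above[-1]
    peaks = find_peaks(signal)
    ms_per_sample = 1000.0 / fs_hz
    return {
        "n_peaks": len(peaks),
        "peak1_mv": signal[peaks[0]] if peaks else None,
        "peak2_mv": signal[peaks[1]] if len(peaks) > 1 else None,
        "pp_ms": (peaks[1] - peaks[0]) * ms_per_sample if len(peaks) > 1 else None,
        "duration_ms": (end - start) * ms_per_sample,
        "energy_mv2": sum(v * v for v in signal[start:end + 1]),
    }
```

A typical call chain under these assumptions would be `quantify(moving_average(rectify(gate_noise(us, mic, imu))), fs_hz)`, where `fs_hz` is the (unspecified) sampling rate of the ultrasound channel.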

Participants
A total of 24 healthy participants (14 males and 10 females; age = 30.5 ± 7.6 years) were recruited to examine the characteristics of ultrasound signals during pharyngeal swallowing using the SMAS. No history of swallowing disorders or problems with food intake was reported by the participants.

Experiment Procedure
The swallowing experiment was performed in the following four phases: introduction, preparation, measurement, and debriefing. In the introduction phase, the purpose and procedure of the experiment were explained to the participant and informed consent was obtained. In the preparation phase, the SMAS was placed around the neck, the ultrasonic Doppler sensor array coated with water-soluble gel was vertically attached with an elastic band near the right side of the laryngeal prominence, and then the participant was asked to say his/her name aloud to check if the signals of the ultrasonic Doppler sensor array were adequately measured. In the measurement phase, signals were recorded three times for each of four swallowing conditions: 3 mL and 9 mL of thin liquid (water) and thick liquid (Yoplait Plain, Binggrae Co., Ltd., Republic of Korea). The experiment was planned to immediately stop if aspiration occurred. Lastly, the SMAS was dismounted, a debriefing was conducted, and monetary compensation was provided for the participation. The experiment was approved (PIRB-2019-E021) by the Institutional Review Board at Pohang University of Science and Technology.

Analysis Methods
An ANOVA was conducted to identify the significance of bolus viscosity and bolus volume on the ultrasound signal measures (1st peak amplitude, 2nd peak amplitude, PP time interval, duration, and energy) of pharyngeal swallowing. Pearson's correlation analysis was performed to examine the relationships between the ultrasound signal measures. Outliers were removed so that the coefficients of variation (CVs) of swallowing signals were kept below particular levels (CV < 0.4 for the amplitude and energy measures and CV < 0.2 for the time measures). A paired t-test was conducted for post-hoc analysis of significant factors and a z-test was conducted for testing the percentage of two or more peaks by bolus viscosity and bolus volume. Statistical testing in the present study was conducted at α = 0.05 unless otherwise specified and Minitab 19 (Minitab LLC., State College, PA) was used for statistical analysis.
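For readers who want to reproduce the post-hoc comparison outside Minitab, the paired t statistic can be sketched with the Python standard library alone. The function name and the example inputs are hypothetical; the sketch returns only the statistic and its degrees of freedom, and the p-value would still have to be looked up from a t distribution.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def paired_t(a: List[float], b: List[float]) -> Tuple[float, int]:
    """Return (t, df) for the paired differences a[i] - b[i].

    Raises ZeroDivisionError if all differences are identical.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1
```

Here `a` and `b` would hold one measure (for example, duration) for the same participants under two conditions, such as 3 mL and 9 mL.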

Results
An ANOVA with two within-subjects factors (viscosity and volume) on the swallowing ultrasound signal measurements (Table 1) shows that viscosity was significant for 1st peak amplitude, 2nd peak amplitude, and energy, and that volume was significant for 1st peak amplitude, PP time interval, and duration. No interaction effects between viscosity and volume were significant for any of the swallowing measures.
Table 1. ANOVA results of swallowing ultrasound signal measurements (*: p < 0.05; **: p < 0.01).

2nd Peak Amplitude
As displayed in Figure 4, the average 2nd peak amplitude changed more with viscosity (9.2%) than with volume (4.4%). The average 2nd peak amplitude of thick liquid was significantly lower (by 0.04 mV) than that of thin liquid (t[37] = 2.24, p = 0.03). On the other hand, the average change (0.02 mV) in 2nd peak amplitude with volume was not significant (t[37] = −1.01, p = 0.32).

Peak-to-Peak (PP) Time Interval
As displayed in Figure 5, the average PP time interval increased slightly more with volume (5.9%) than with viscosity (5.2%). The average change (29.4 ms) in PP time interval with viscosity was not significant (t[37] = −1.56, p = 0.13). On the other hand, the average PP time interval of 9 mL was significantly longer (by 33.2 ms) than that of 3 mL (t[37] = −2.12, p = 0.04).

Duration
As displayed in Figure 6, the average duration increased more with volume (5.6%) than with viscosity (1.4%). The average change (12.5 ms) in duration with viscosity was not significant (t[37] = 0.60, p = 0.55). On the other hand, the average duration of 9 mL was significantly longer (by 49.1 ms) than that of 3 mL (t[37] = −2.79, p = 0.01).

Correlations between Swallowing Signal Measures
As displayed in Figure 8, significant correlations were found for eight of the ten pairs of the five swallowing signal measures; the two exceptions were 1st peak amplitude and duration, and 2nd peak amplitude and PP time interval. Of the significant correlations, strong correlations (r ≥ 0.7) were found for two pairs (1st peak amplitude and energy; PP time interval and duration) and a moderate correlation (0.4 < r < 0.7) for one pair (2nd peak amplitude and energy). The correlation between the time measures (r = 0.78) was higher than the correlation between the peak amplitude measures (r = 0.30); the correlations between the time measures and the peak amplitude measures were low (r = −0.27 to 0.18). Lastly, energy was more highly correlated with the peak amplitude measures (r = 0.86 with 1st peak amplitude and r = 0.5 with 2nd peak amplitude) than with the time measures (r = −0.24 with PP time interval and r = 0.17 with duration).
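The correlation coefficients and the strength bands used above can be illustrated with a short Python sketch. The function names are hypothetical, and applying the bands to the absolute value of r (so that negative correlations are graded by magnitude) is an assumption; the text states the bands only for positive r.

```python
import math
from typing import List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson's product-moment correlation coefficient of two samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength(r: float) -> str:
    """Grade a correlation: strong (>= 0.7), moderate (0.4 to 0.7), else low."""
    magnitude = abs(r)  # assumption: bands apply to |r|
    if magnitude >= 0.7:
        return "strong"
    if magnitude > 0.4:
        return "moderate"
    return "low"
```

With the reported values, `strength(0.86)` and `strength(0.78)` give "strong", `strength(0.5)` gives "moderate", and `strength(0.30)` gives "low".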

Number of Peaks
As displayed in Figure 9, the proportion of two or more peaks changed more with viscosity (∆ = 11.1%) than with volume (∆ = 7.9%). Figure 9a shows that the proportion of two or more peaks for thin liquid (87.9%) was significantly higher than that for thick liquid (76.8%; z = 2.03, p = 0.04). Figure 9b shows that the proportion of two or more peaks for 9 mL (86.5%) was slightly higher than that for 3 mL (78.6%), but the difference was not significant (z = 1.46, p = 0.15).
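A two-proportion z-test of the kind reported above can be sketched as follows. The pooled standard error is an assumption (the text does not state which variant was used), the function name is hypothetical, and the counts in any call must come from the actual data because the number of swallows per condition is not given here.

```python
import math
from typing import Tuple


def two_proportion_z(k1: int, n1: int, k2: int, n2: int) -> Tuple[float, float]:
    """Return (z, two-sided p) for comparing proportions k1/n1 and k2/n2,
    using the pooled standard error (assumed variant)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

As a check on the conversion from z to p, `math.erfc(2.03 / math.sqrt(2))` is about 0.042, in line with the reported p = 0.04 for viscosity.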

Reproducibility of Measurement
The distribution of the coefficient of variation (CV, the ratio of the standard deviation to the mean) presented in Table 2 indicates that the reproducibility of the measurement varies mainly by ultrasonic measure: duration has the highest reproducibility, followed by the PP time interval, energy, 1st peak amplitude, and 2nd peak amplitude. The CV of the three repeated measurements collected in each swallowing condition was calculated, and the distribution of the CV was then constructed by bolus viscosity, bolus volume, and ultrasonic measure (Table 2). More than 80% of the measurements of duration and PP time interval had a CV ≤0.2, and those of energy, 1st peak amplitude, and 2nd peak amplitude had a CV ≤0.4. On the basis of the CV analysis results, the present study excluded measurements of duration and PP time interval if their CV was >0.2 and measurements of energy, 1st peak amplitude, and 2nd peak amplitude if their CV was >0.4.
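The CV-based exclusion rule can be sketched in Python as follows. The dictionary keys and function names are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev
from typing import List

# CV limits stated in the text: 0.2 for time measures, 0.4 for the others
CV_LIMITS = {
    "duration": 0.2,
    "pp_interval": 0.2,
    "energy": 0.4,
    "peak1": 0.4,
    "peak2": 0.4,
}


def coefficient_of_variation(values: List[float]) -> float:
    """Ratio of the sample standard deviation to the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / abs(m)


def keep_condition(measure: str, trials: List[float]) -> bool:
    """True if the repeated trials of one condition pass the CV limit."""
    return coefficient_of_variation(trials) <= CV_LIMITS[measure]
```

For example, three durations of 880, 900, and 920 ms have a CV of about 0.02 and would be kept, whereas three peak amplitudes of 1, 2, and 3 mV have a CV of 0.5 and would be excluded.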

Discussion
The present study develops a wireless neck-band-type SMAS comprising a slim and curved ultrasonic Doppler sensor array, a microphone, and an IMU so that ultrasound signals due to swallowing can be collected effectively. The single-crystal flat disk transducer (diameter = 28 mm and thickness = 33.5 mm) of the portable ultrasonic detector DF-4001 (Martec Med LLC., Brazil) used by Soria et al. [45], Cagliari et al. [31], and Santos and Filho [33] needs to be located carefully on the lateral tracheal border just below the cricoid cartilage during swallowing measurement, while neck-related motions other than swallowing, such as speech production and coughing, are restricted. In contrast, the SMAS has a sensor array consisting of two transmitters and five receivers with a radius of curvature of 133 mm and a thickness of 5 mm; because its swallowing detection range is wider than that of the single sensor of the DF-4001, the sensor array can be located easily on the neck. Furthermore, the SMAS does not require restrictions of neck-related motions, and the curved strip-type sensor array can be placed on the neck easily, securely, and with a snug fit using an elastic band. The signal processing algorithm of the SMAS can exclude ultrasound measurements as noise from the subsequent analysis if measurements of the microphone or the IMU exceed a designated value, on the assumption that the corresponding measurements are due to activities other than swallowing. To the best of our knowledge, the ultrasonic Doppler sensor-based system developed in the present study is the first of its kind that is wearable, wireless, and measures signals originating only from swallowing activities.
The ANOVA and correlation analysis in the present study identified the effects of bolus viscosity and volume on the swallowing ultrasound signal measures (1st peak amplitude, 2nd peak amplitude, duration, PP time interval, energy, and number of peaks) and the relationships among these measures. To the best of our knowledge, no previous studies have examined these effects and relationships. Note that Soria et al. [45], Cagliari et al. [31], and Santos and Filho [33], who used an ultrasonic fetal detector, reported ultrasound signal measurements for various bolus viscosity and volume conditions and tested their differences between age groups. The ANOVA results showed that viscosity was significant mainly for the amplitude and energy measures, volume was significant for the time measures, and the interaction between viscosity and volume was not significant for any of the swallowing signal measures. Next, the correlation analysis found that the time measures were strongly correlated (r = 0.78), the amplitude measures were weakly correlated (r = 0.30), the amplitude measures were weakly correlated with the time measures (r = −0.27 to 0.18), and energy had higher correlations with the amplitude measures (r = 0.86 with 1st peak amplitude and r = 0.5 with 2nd peak amplitude) than with the time measures (r = −0.24 with PP time interval and r = 0.17 with duration).
The present study identified that the 1st peak amplitude (23.6%~29.0%), 2nd peak amplitude (9.2%), energy (44.9%~47.6%), and proportion of two or more peaks (11.1%) significantly decreased as the viscosity increased. Note that the amplitude of the ultrasound signal decreases as the velocity of a moving object decreases [46]. Thus, the reductions in peak amplitudes, energy, and proportion of two or more peaks can be explained by the movements of the swallowing-related organs being slower for thick liquid with a high viscosity than for thin liquid with a low viscosity. A similar result was reported by Cagliari et al. [31]: a decrease in the peak intensity of the ultrasound signal with an increase in bolus viscosity (91.1~92.7 dB for water and 90.0~92.4 dB for yogurt). Lastly, the first and second peaks may be related to the elevation of the hyolaryngeal complex and its return to the original position, respectively, which needs to be substantiated by VFSS. The hyolaryngeal complex is elevated by the synergistic contraction of the suprahyoid, thyrohyoid, and long pharyngeal muscles (leading to the opening of the upper esophageal sphincter and the transfer of a bolus from the oral cavity to the esophagus) and finally returns to its original position [47].
Next, the present study found that the 1st peak amplitude (8.2%~16.4%), PP time interval (5.9%), duration (5.6%), energy (17.0%~22.9%), and proportion of two or more peaks (7.9%) increased as the bolus volume increased from 3 mL to 9 mL. Note that the effects of the volume on the 1st peak amplitude and energy were significant only for thin liquid. The increase in the peak amplitude and energy can be interpreted as faster movements of the swallowing-related organs caused by higher innervations commanded from the brain for 9 mL than for 3 mL. Lastly, the increases in the PP time interval and duration with the bolus volume are to be expected.
As shown in Table 3, the average swallowing durations of the healthy participants (902.2 ± 149.3 ms for thin liquid and 889.7 ± 163.8 ms for thick liquid) in the present study were similar (difference <10%) to those reported by Cagliari et al. [31], who used an ultrasonic Doppler sensor, and Nascimento et al. [48], who used videofluoroscopy, but quite different (difference >30%) from those reported by Santamato et al. [32] and Youmans and Stierwalt [49], who used a microphone. It is noteworthy that shorter durations for thick liquid than for thin liquid were commonly found in the ultrasonic Doppler sensor-based studies, while the opposite was found in the videofluoroscopy-based study. Note that increases in the swallowing duration with the bolus volume were commonly observed in previous studies. Since the swallowing analysis results using an ultrasonic Doppler sensor are more similar to those using the gold standard videofluoroscopy than to those using a microphone, it can be inferred that the ultrasonic Doppler sensor technique has higher validity than the microphone technique in swallowing research.
The wireless neck-band-type SMAS in the study can be effectively used to monitor swallowing activities in daily life, quantify swallowing functions, and screen those with swallowing problems. As VFSS and FEES are limited in terms of intrusiveness and subjectivity, the ultrasonic Doppler sensor-based SMAS can monitor swallowing activities safely and unobtrusively in daily life. Quantitative information such as peak amplitude, PP time interval, duration, energy, and the number of peaks of ultrasound signals from swallowing can be analyzed, and their changes before and after particular treatments can be reported. Lastly, screening services for dysphagia are possible using the SMAS by monitoring swallowing activities and identifying an abnormal pattern of swallowing.
Further research is needed to examine the concurrent validity and generalizability of the ultrasonic Doppler sensor-based SMAS. A comparative analysis of ultrasound signals and VFSS images is needed to check the concurrent validity of the SMAS. Although the trend of SMAS signals measured from healthy participants was examined in the present study, the significance of the swallowing ultrasound signal measures needs to be understood by relating them to VFSS analysis results. Furthermore, the measurements of the SMAS need to be collected from those diagnosed with dysphagia at various severity levels and from various causes, including neurological disorders, congenital conditions, and muscular conditions, and then compared with those of healthy individuals. Then, a statistical model for the screening of dysphagia using the SMAS can be developed by gathering large-scale normative data from healthy individuals and patients with dysphagia.