Exploring the Clinical Utility of Gustatory Dysfunction (GD) as a Triage Symptom Prior to Reverse Transcription Polymerase Chain Reaction (RT-PCR) in the Diagnosis of COVID-19: A Meta-Analysis and Systematic Review

Background: The diagnosis of COVID-19 is made using reverse transcription polymerase chain reaction (RT-PCR) but its sensitivity varies from 20 to 100%. The presence of gustatory dysfunction (GD) in a patient with upper respiratory tract symptoms might increase the clinical suspicion of COVID-19. Aims: To perform a systematic review and meta-analysis to determine the pooled sensitivity, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR−) and diagnostic odds ratio (DOR) of using GD as a triage symptom prior to RT-PCR. Methods: PubMed and Embase were searched up to 20 June 2021. Studies published in English were included if they compared the frequency of GD in COVID-19 adult patients (proven by RT-PCR) to COVID-19 negative controls in case control or cross-sectional studies. The Newcastle-Ottawa scale was used to assess the methodological quality of the included studies. Results: 21,272 COVID-19 patients and 52,298 COVID-19 negative patients were included across 44 studies from 21 countries. All studies were of moderate to high risk of bias. Patients with GD were more likely to test positive for COVID-19: DOR 6.39 (4.86–8.40), LR+ 3.84 (3.04–4.84), LR− 0.67 (0.64–0.70), pooled sensitivity 0.37 (0.29–0.47) and pooled specificity 0.92 (0.89–0.94). While history/questionnaire-based assessments were predictive of RT-PCR positivity (DOR 6.62 (4.95–8.85)), gustatory testing was not (DOR 3.53 (0.98–12.7)). There was significant heterogeneity among the 44 studies (I2 = 92%, p < 0.01). Conclusions: GD is useful as a symptom to determine if a patient should undergo further testing, especially in resource-poor regions where COVID-19 testing is scarce. Patients with GD may be advised to quarantine while repeated testing is performed if the initial RT-PCR is negative. Funding: None.

A multipronged surveillance and containment strategy consisting of active detection of COVID-19 cases, contact tracing and early isolation [18][19][20][21], coupled with social distancing [22][23][24], appears to be effective in controlling the COVID-19 outbreak. However, the major constraints [25-28] to blanket testing of populations are the availability of trained personnel to administer the swabs and run the tests, cost, materials (swab sticks, sample media, reagents) and the turnaround time for the reverse transcription polymerase chain reaction (RT-PCR) test on respiratory samples. In addition, the sensitivity of the "gold standard" RT-PCR ranges from 20 to 100% depending on the time from exposure and symptom onset [29], and clinicians should not rely on a single negative RT-PCR test to exclude COVID-19 if clinical suspicion is high [29,30].
While COVID-19 infections present most commonly as an acute upper respiratory tract infection (URTI) (fever, cough, sore throat, myalgia) [31], there are a number of peculiar symptoms which differentiate COVID-19 from other viral infections. It has been shown that olfactory dysfunction (OD) and GD are common among COVID-19 patients [32][33][34]. Carrillo-Larco et al. [35] found that the prevalence of GD among COVID-19 patients in 6 included studies varied widely from 5 to 89%, with heterogeneous definitions of GD. Smell refers to the perception of odour by the olfactory fibres in the roof of the nasal cavity [36], while taste refers to the perception of salty, sweet, sour, bitter and umami by the tongue, carried by cranial nerves VII, IX and X [37]. On the other hand, flavour is a complex perception and refers to the combination of smell, taste and trigeminal sensation (pain, tactile and temperature) [ [40,41]) in COVID-19 patients, GD is less well studied.
Taste is important for quality of life, appetite and satiety, and is part of a defence mechanism against hazards [42]. More importantly, if GD as a symptom possesses high diagnostic value, it may be used in isolation or in combination with other specific symptoms as part of a screening questionnaire to determine if a patient should undergo further testing, especially in resource-poor regions where COVID-19 testing is scarce. It may also be used to determine the level of clinical suspicion of COVID-19, so that appropriate isolation measures are instituted before repeated testing is performed if the first RT-PCR is negative [30]. Post-viral OD is well established among viral upper respiratory tract infections [43][44][45][46], which reduces the diagnostic value of OD in differentiating COVID-19 from other viral infections. Therefore, this study aims to determine if GD, with or without OD, may instead be used as a discriminatory criterion to predict a patient's COVID-19 status.
Published meta-analyses of the diagnostic value of GD in COVID-19 are sub-optimal. Hoang et al. [47] only pooled data for one subgroup analysis in April 2020, reporting an odds ratio (OR) of 12.7 of GD in COVID-19 versus patients with acute respiratory infections without detectable virus, including only 2 studies with a total of 392 patients. Liou et al. [48] performed a meta-analysis in May 2020 and reported the sensitivity, specificity, positive predictive value, negative predictive value and accuracy of combined taste or smell alteration in the prediction of COVID-19 across 6 studies but did not report statistics for GD (with or without OD).
The study aims to perform a systematic review and meta-analysis to determine the pooled sensitivity, specificity, positive likelihood ratio (positive LR), negative likelihood ratio (negative LR) and diagnostic odds ratios of using gustatory dysfunction as a triage symptom prior to RT-PCR in the diagnosis of COVID-19.
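As an illustration of the outcome measures named above, the following minimal sketch computes them for a single 2×2 table of GD status against RT-PCR result. The class name, function name and counts are hypothetical and are not taken from any included study; the sketch also assumes no cell is zero, and it does not reproduce the pooling across studies performed in this meta-analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one study: GD (index symptom) against RT-PCR (reference test)."""
    tp: int  # GD present, RT-PCR positive
    fp: int  # GD present, RT-PCR negative
    fn: int  # GD absent, RT-PCR positive
    tn: int  # GD absent, RT-PCR negative


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, LR+, LR- and DOR for one 2x2 table.

    Assumes every cell is non-zero; no continuity correction is applied.
    """
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    # DOR equals LR+ / LR-, which simplifies to (TP * TN) / (FP * FN).
    dor = (t.tp * t.tn) / (t.fp * t.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example = TwoByTwo(tp=40, fp=10, fn=60, tn=90)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Within a single table the DOR is exactly LR+ divided by LR−. Pooled estimates need not satisfy this identity, because each measure is pooled separately across studies.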

Definition of GD
For the purposes of this meta-analysis, GD is defined as the presence of quantitative (ageusia (complete loss of taste), hypogeusia (diminished sense of taste) and hypergeusia (increased gustatory sensitivity)) or qualitative dysfunction (dysgeusia (distorted taste perception) and phantogeusia (phantom taste perception)) or a combination of the above [36,42], either reported, measured, or both. The list of abbreviations can be found in Table A3.

Systematic Review Protocol
The methodology follows a similar study previously published by the author on the clinical utility of OD in COVID-19 [49]. The review protocol was not registered on any registry.
The Preferred Reporting Items for Systematic reviews and Meta-analyses (PRISMA) Statement [50] was used to structure the systematic review and meta-analysis as shown in Table A4. No ethics approval was required.

Information Sources and Search Strategy
Studies were eligible if they were indexed on PubMed or Embase. Cochrane Central Register of Controlled Trials (CENTRAL) was not searched as trials were irrelevant to the present study. The search was performed on 20 June 2021. The search strategy is included in Table A1 and was not limited by publication date as some articles are indexed prior to publication.

Study Selection and Data Collection
Screening of titles and abstracts was performed by 2 independent researchers (K.W.P., S.L.T.) to determine if the studies met the inclusion criteria. If abstracts were not available, the full text was retrieved and analysed. Any disagreements between the 2 researchers were resolved by discussion and by consulting a third, senior researcher (L.S.N.). Duplicate studies were removed using EndNote X9 and then by hand. Data were extracted from eligible studies into Excel sheets by 1 researcher (K.W.P.) and then cross-checked by a 2nd researcher (S.L.T.). The extracted data included the author, year of publication, study design, country, GD testing method, COVID-19 testing method and number of cases reporting GD among COVID-19 positive and negative patients. All clarifications with authors were made via email.
The Newcastle-Ottawa scale [51] was used to assess the methodological quality of the included studies. Each item was allocated 1 point except for the item on the "Comparability of cases and controls on the basis of age and URTI symptoms", which was allocated 2 points. The studies were classified as having low (7-9 points), moderate (4-6 points) and high risk of bias (1-3 points). Assessment was performed by 2 independent researchers (K.W.P., S.L.T.) and any disagreements were resolved by consulting the senior researcher (L.S.N.).
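The scoring bands described above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and because the text does not assign a band to a total of 0 points, treating 0 as high risk is an assumption made here.

```python
def risk_of_bias(nos_score: int) -> str:
    """Map a Newcastle-Ottawa total (0-9 points) to a risk-of-bias band.

    Bands follow the text: 7-9 low, 4-6 moderate, 1-3 high.
    A score of 0 is not covered by the text; it is treated as high risk here
    by assumption.
    """
    if not 0 <= nos_score <= 9:
        raise ValueError("Newcastle-Ottawa total must be between 0 and 9")
    if nos_score >= 7:
        return "low"
    if nos_score >= 4:
        return "moderate"
    return "high"


# Hypothetical scores for illustration only.
for score in (8, 5, 2):
    print(score, risk_of_bias(score))
```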

Inclusion and Exclusion Criteria
We compared the frequency of GD in adult patients (at least 18 years) stratified by COVID-19 test results using the reverse transcription polymerase chain reaction (RT-PCR). Studies were included if they compared the frequency of GD in COVID-19 positive patients (proven by RT-PCR) to COVID-19 negative controls in case control or cross-sectional studies. Appropriate controls were defined as patients who were suspected of having COVID-19 infection, or who fulfilled local guidelines for COVID-19 testing, but were COVID-19 negative on RT-PCR testing. Only studies published in English were included.

Life 2021, 11, x FOR PEER REVIEW 6 of 28

Study Characteristics
A total of 21,272 COVID-19 positive patients and 52,298 COVID-19 negative patients were included across the 44 studies as seen in Figure A1 and Table A2. The patients were from 21 countries across the major continents, as illustrated in Figure 2. With reference to Figure A1, all studies utilised RT-PCR as the COVID-19 diagnostic testing method. Most studies collected data regarding GD via questionnaires or structured interviews, except for 3 studies which utilised gustatory testing [139][140][141]. Among the 44 included studies, 7 studies [142][143][144][145][146][147][148] did not test for GD or state that GD symptoms were explicitly asked for.

Risk of Bias
Using the Newcastle-Ottawa scale [51] to assess the risk of bias in each of the included studies, most of the studies were of moderate risk of bias except for 6 studies [34,149-153] which had high risk of bias, as shown in Figure A1. Most studies utilised hospital instead of community controls, failed to control for age as a variable, failed to blind patients and interviewers to the COVID-19 test result during assessment of GD, and failed to report the non-response rate of their study.
Subgroup analysis Comparison 1 failed to show a statistically significant difference between the DOR in Group A as compared to Group B (test for subgroup differences, p = 0.74, Figure 3). Among the 31 studies in Group B with low to moderate risk of bias and in which GD symptoms were explicitly asked for or tested, there was still significant heterogeneity (I² = 91%, p < 0.01).
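To make the heterogeneity statistic concrete, the sketch below uses the widely used Higgins and Thompson definition, I² = max(0, (Q − df)/Q) × 100%, where Q is Cochran's Q and df is the number of studies minus one. Whether the analysis software used this exact definition is an assumption, and the Q value below is back-calculated from the reported I² for illustration rather than reported in the text. Function names are hypothetical.

```python
def i_squared(q: float, df: int) -> float:
    """Higgins and Thompson I-squared, as a percentage, from Cochran's Q."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def implied_q(i2_percent: float, df: int) -> float:
    """Cochran's Q implied by a given I-squared (valid for 0 <= I-squared < 100)."""
    return df / (1.0 - i2_percent / 100.0)


# Group B: 31 studies, so df = 30; I-squared of 91% is taken from the text.
df_group_b = 30
q_group_b = implied_q(91.0, df_group_b)  # back-calculated, not a reported value
print(f"Implied Q for Group B: {q_group_b:.1f}")
print(f"I-squared recovered from that Q: {i_squared(q_group_b, df_group_b):.1f}%")
```

An I² near 91% means that, under this definition, Q is roughly eleven times its degrees of freedom, so most of the observed variation between study estimates is attributed to between-study differences rather than sampling error.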

The funnel plot shown in Figure 7 and Peters' test (p = 0.61) did not detect the presence of publication bias.

Discussion
This meta-analysis is the largest study describing the utility of GD in the diagnosis of COVID-19, with 44 included studies, comprising 21,272 COVID-19 positive patients and 52,298 COVID-19 negative controls. It demonstrates that GD as a symptom has high DOR, low sensitivity, high specificity, moderate positive LR and low negative LR in predicting COVID-19 RT-PCR positivity. The DOR of GD was 6.39 (4.86-8.40), lower than that published by Hoang et al. [47] (2 studies, n = 519, DOR 12.7 (7.90-20.4)), but similar to that reported in the Cochrane review by Struyf et al. (6 studies, n = 9286, DOR 6.60 (5.30-8.27)) [154]. Translating this into clinical practice, a patient presenting with upper respiratory tract symptoms and GD likely has COVID-19 and should be quarantined even if the first RT-PCR is negative. However, the absence of GD is insufficient to rule out a COVID-19 infection.
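The clinical translation above can be illustrated with the standard odds form of Bayes' theorem, in which post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below uses the pooled LR+ (3.84) and LR− (0.67) from this meta-analysis; the 10% pre-test probability is a hypothetical value chosen for illustration, and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


LR_POS = 3.84    # pooled positive likelihood ratio reported in this study
LR_NEG = 0.67    # pooled negative likelihood ratio reported in this study
PRE_TEST = 0.10  # hypothetical prevalence among patients presenting for testing

with_gd = post_test_probability(PRE_TEST, LR_POS)
without_gd = post_test_probability(PRE_TEST, LR_NEG)
print(f"GD present: {with_gd:.1%}")
print(f"GD absent:  {without_gd:.1%}")
```

Under this assumed pre-test probability, the presence of GD raises the probability of COVID-19 to roughly 30%, while its absence lowers it only to about 7%, which is why the absence of GD cannot be used to rule out infection. The absolute figures change with the local pre-test probability.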
Comparing the clinical utility of GD (with or without OD) in this study with that of OD (with or without GD) reported by Pang et al. [49] and of either GD, OD, or both reported by Kim et al. [155] in predicting COVID-19 RT-PCR positivity, GD has the lowest DOR and sensitivity but equivalent specificity. The data in Table 1 suggest that the presence of either GD, OD, or both may be the best of the 3 screening criteria for predicting COVID-19 RT-PCR positivity.

While GD is useful in predicting COVID-19 RT-PCR positivity, the mechanism by which COVID-19 induces GD is still uncertain. Human angiotensin-converting enzyme in the literature among COVID-19 patients [159]. One theory is that Toll-like receptors (TLRs) and interferons (IFN) may disrupt normal taste transduction or cell renewal in taste buds [160]. Another theory is that salivary gland dysfunction leads to hyposalivation with subsequent taste impairment [161]. There is also a growing body of evidence that COVID-19 has neuro-invasive potential, with positive RT-PCR from cerebrospinal fluid samples [162]. An alternative mechanism of GD is postulated to be cranial nerve VII, IX and X dysfunction with disruption of the central nervous system pathways, but this remains controversial [163]. In addition, the optimal method of ascertaining GD remains controversial. Singer-Cornelius et al. [164] suggested that there are large discrepancies between questionnaire-based assessments and gustatory testing, with only 25.6% (10/39) of patients who reported GD demonstrating a measurable deficit on taste strip testing (Burghart Messtechnik GmbH, Wedel, Germany). One possible explanation is the presence of the "ceiling effect" and the inability to discriminate subtle levels of GD with taste strips of just four different concentrations [165]. While this problem might be alleviated by the use of extended taste strip testing with additional concentrations [165], such testing might be time-consuming and further increase the risk of exposure to infectious oral secretions. Our study suggests that history/questionnaire-based assessments were predictive of RT-PCR positivity but gustatory testing was not; we therefore propose that the former be utilised in assessing GD for the purposes of COVID-19 risk assessment.
Amongst the various screening tools, the use of questionnaires to triage patients into low and high-risk groups for COVID-19 has proven to be effective through different stages of a pandemic. During the initial period of disease outbreak when numbers are high and detection is key, the utility of questionnaires rests in their potential for wide coverage at low cost [111,166]. In January 2020, the first online questionnaire about COVID-19 was launched in China based on early data collected from the initial cases, to stratify the population based on their risk of having COVID-19 and determine the need for further testing or a medical consult. In a span of three weeks, the questionnaire was adopted by all the Chinese provinces and 38 other overseas countries, amassing close to 20,000 responses [166]. Correlating the number of confirmed cases out of these responses facilitated the identification of risk factors for COVID-19 and, more importantly, demonstrated how questionnaires could be deployed as a rapid, nationwide screening tool and provide the necessary prompts to particularly high-risk groups for early detection.
Beyond the emergent phase and with international travel resuming amidst COVID-19, questionnaires were adapted as part of travel screening for passengers to fine-tune the global response to the pandemic [167]. In the surveillance phase, questionnaires were also used abroad, such as in the US and UK, to gather public perception about the rapidly moving infection and subsequently correct misconceptions through more targeted official press releases [168]. Thus, it is evident that questionnaires have multi-pronged utility. With GD being reported as both a common and possibly early symptom of COVID-19 [169], the inclusion of GD in symptom-based questionnaires could not only make them more relevant as a screening tool to aid early detection but also help to educate the public and allay the distress and functional impact that come with GD [170].
In the current season where the disease is increasingly being regarded as endemic [171], the move away from gold standard tests with RT-PCR towards self-administered antigen rapid test (ART) kits is testimony to how COVID-19 may progressively be treated akin to a cold. The need for formal testing might be obviated and replaced with either self- or clinician-based clinical diagnosis for isolation and home recovery. For example, Singapore has pioneered a home recovery program (HRP) as the default care arrangement for all COVID-19 patients, unless they belong to a vulnerable age group (80 years and above) or have not completed their vaccinations [172]. HRP now constitutes 40% of daily cases in a bid to reduce the strain on public healthcare inpatient resources [173]. Recovery has become patient-directed with instructions to monitor and upload their vital signs online, while an HRP buddy periodically checks in on their symptoms and progress via telephone calls [174]. The use of self-administered symptom-based questionnaires, featuring GD, may be developed to complement such a recovery program independent of testing. This allows patients to systematically track their clinical progress while offering a potential database of valuable information regarding the clinical course of COVID-19 across demographics and profiles.
A major contributory factor that has permitted countries like Singapore to adopt such methods is their high national vaccination rates and low mortality for COVID-19 patients (estimated to be 0.1% especially for the young and healthy population). However, it has been reported that GD is a possible side effect of COVID-19 vaccinations [175]. In Europe, a small handful of COVID-19 naïve patients reported having new-onset olfactory or taste dysfunction following their COVID-19 vaccinations, but their symptoms lasted for less than two weeks. It is conjectured that post-vaccine inflammation in the olfactory neuroepithelium could contribute to transient olfactory disorder, but there is little established evidence in the current literature [175]. Should GD become a more common or established side effect of vaccinations, whether temporary or permanent, it might confound the use of GD as a potential early screening symptom for COVID-19.
We recognise that there was considerable heterogeneity among the 44 studies in this meta-analysis. Possible sources include the different populations sampled across 21 countries; the lack of a standardised questionnaire in various languages to elicit GD; the conduct of studies at different time points of the pandemic (where later studies might be influenced by media coverage of chemosensory dysfunction and COVID-19); the assessment of GD after COVID-19 test results were known in some studies (recall bias); and the failure of other studies to enquire explicitly about GD symptoms. The COVID-19 variants, especially the prevalent Delta variant, differ in their virulence but, more importantly, may be associated with less olfactory and gustatory dysfunction. Subgroup analyses attempted to explore some of the above sources of heterogeneity, but the subgroup differences were not statistically significant.
The limitations of this meta-analysis include an inability to analyse the duration, severity and recovery of GD, and their possible implications for prognosis, due to insufficient data. The included studies were of moderate to high risk of bias and failed to control for age and other confounders. This meta-analysis only included studies published in English, which introduces a selection bias as the data might not be representative of the non-native English-speaking regions of the world. Future research should be directed towards basic science on the pathophysiology of GD in COVID-19, comparing the performance of various COVID-19 clinical prediction scoring systems and evaluating GD among patients with the different COVID-19 variants.

Conclusions
GD has high DOR, low sensitivity, high specificity, moderate positive LR and low negative LR in predicting COVID-19 RT-PCR positivity. While the included studies were heterogeneous, this meta-analysis provides evidence on the clinical utility of using GD in a screening questionnaire to determine if a patient should undergo further testing, especially in resource-poor regions where COVID-19 testing is scarce. It may also be used to determine the level of clinical suspicion of COVID-19, so that the patient may be advised to quarantine while repeated testing is performed if the initial RT-PCR is negative [30]. There is insufficient evidence to recommend using gustatory testing over questionnaire-based assessment of GD.
Institutional Review Board Statement: Ethical review and approval were waived for this study, due to the use of publicly available secondary data.
Informed Consent Statement: Patient consent was waived due to the use of publicly available secondary data.

Data Availability Statement:
The authors will share data and the full statistical code upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.
Figure A1. Description of Included Studies and Newcastle-Ottawa Scale Risk of Bias Assessment.

Appendix A
Appendix C Table A2. Citations for Included Studies.
3
Data collection process (item 9): Specify the methods used to collect data from reports, including how many reviewers collected data from each report, whether they worked independently, any processes for obtaining or confirming data from study investigators, and if applicable, details of automation tools used in the process.
Data items (item 10a): List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (e.g., for all measures, time points, analyses), and if not, the methods used to decide which results to collect.
Data items (item 10b): List and define all other variables for which data were sought (e.g., participant and intervention characteristics, funding sources). Describe any assumptions made about any missing or unclear information. Location where reported: 3.
Study risk of bias assessment (item 11): Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and if applicable, details of automation tools used in the process.
Synthesis methods (item 13a): Describe the processes used to decide which studies were eligible for each synthesis (e.g., tabulating the study intervention characteristics and comparing against the planned groups for each synthesis (item #5)). Location where reported: NA.
Synthesis methods (item 13b): Describe any methods required to prepare the data for presentation or synthesis, such as handling of missing summary statistics, or data conversions. Location where reported: NA (only studies with complete data were included).
Synthesis methods (item 13c): Describe any methods used to tabulate or visually display results of individual studies and syntheses. Location where reported: 3.
Synthesis methods (item 13d): Describe any methods used to synthesize results and provide a rationale for the choice(s). If meta-analysis was performed, describe the model(s), method(s) to identify the presence and extent of statistical heterogeneity, and software package(s) used.
Synthesis methods (item 13e): Describe any methods used to explore possible causes of heterogeneity among study results (e.g., subgroup analysis, meta-regression). Location where reported: 4.
Synthesis methods (item 13f): Describe any sensitivity analyses conducted to assess robustness of the synthesized results. Location where reported: NA.
Reporting bias assessment (item 14): Describe any methods used to assess risk of bias due to missing results in a synthesis (arising from reporting biases). Location where reported: 3.
Certainty assessment (item 15): Describe any methods used to assess certainty (or confidence) in the body of evidence for an outcome. Location where reported: NA.
Study selection (item 16a): Describe the results of the search and selection process, from the number of records identified in the search to the number of studies included in the review, ideally using a flow diagram.
Study selection (item 16b): Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded.