Article

The Relationship between Mustard Import and COVID-19 Deaths: A Workflow with Cross-Country Text Mining

1
AI Data Analytics Lab, Beijing Normal University-Hong Kong Baptist University (BNU-HKBU United International College), Zhuhai 519087, China
2
Division of Science & Technology, Beijing Normal University-Hong Kong Baptist University (BNU-HKBU United International College), Zhuhai 519087, China
3
School of Economics and Management, Harbin Institute of Technology Shenzhen, Shenzhen 518000, China
*
Author to whom correspondence should be addressed.
Healthcare 2022, 10(10), 2071; https://doi.org/10.3390/healthcare10102071
Submission received: 10 September 2022 / Revised: 6 October 2022 / Accepted: 8 October 2022 / Published: 18 October 2022

Abstract

We developed a workflow for searching and screening natural products that draws on worldwide experiences shared by users of an online platform, illustrated how a text-mining approach can identify candidates for coping with COVID-19, and statistically tested the natural product identified. We built a knowledge base consisting of three ontologies pertaining to 7653 narratives. Mustard emerged from text mining and knowledge engineering as an important candidate related to COVID-19 outcomes. The findings indicate that, after controlling for the containment index, the net import of mustard is related to reduced total and new COVID-19 deaths in the pre-vaccination period, with a considerable effect size (|r| > 0.2).

1. Introduction

Many developing economies have inadequate doses to vaccinate their populations [1]. Countries with high vaccination rates are experiencing a new round of the outbreak, largely caused by new COVID-19 variants such as Delta and Lambda. Recent research indicates the potential for COVID-19 variants to escape from neutralizing humoral immunity [2,3], and the effectiveness of vaccination has been found to be lower among people with the Delta variant than those with the Alpha variant [4]. There is an urgent need for non-vaccination interventions against this disease [1].
Drug-repurposing opportunities identified in previous studies need to be evaluated over time, and clinical trials are still in progress [5,6]. While a large body of COVID-19 research has been published on drug discovery, few studies have suggested exploiting potential natural products [7]. Previous works on natural product identification have proposed approaches that mine genome and metabolomics data [8,9]. However, such methods are limited by the scope of their search and are usually time-consuming. A more efficient and novel way of gaining insights from data in healthcare research is to mine and analyze online text with natural language processing (NLP) or text analytics [10]. For example, recent healthcare studies have gauged public opinion and sentiment on topics such as COVID-19 vaccine boosters [11] and intimate partner violence [12] by mining textual data from social media.
In this study, we developed a new workflow for searching and screening natural products that could potentially be used to cope with COVID-19, based on 7653 narratives. To illustrate this approach, we explored online narrative texts collected from a large tourism platform, identified a potential natural food with a knowledge graph approach, and tested it statistically. We hypothesize that natural food consumption is related to COVID-19 outcomes. We then test this hypothesis using data on COVID-19 deaths and cases drawn from the Johns Hopkins University (JHU) database.
The number of medicine discovery projects based on big data and data mining has been growing in recent years [13,14]. However, most studies so far have built their databases by collecting published COVID-19 studies [15,16]. We compiled a dataset from a large Chinese tourism website on which a large population of users upload and share travel writings that describe their experiences at various overseas destinations. The use of online travel writing helps us to efficiently gain insights into the consumption of “special” local foods or the consumption culture of each destination [17,18]. The content downloaded includes 7653 travel writings, as well as the upload date and destination. The data were grouped and combined according to destination, so that each document consists of all the travel writings on a particular country. In total, we compiled a database representing 209 countries, covering all major regions and cultures in the world. The dataset reflects local conditions within a time period before the pandemic, so this type of data was considered more appropriate than Wikipedia, from which we cannot efficiently learn in-depth information on natural products in each location and cannot specify a particular time period.

2. Methods

2.1. A Workflow with NPN Approach

As the workflow (Figure 1) is used to isolate the best cases of countries from natural products narratives (NPN), which may provide hints on natural products, we deleted countries with highly restrictive responses to COVID-19. As successful government interventions such as closing schools and banning gatherings are highly effective at controlling community transmission during COVID-19 [19,20,21], countries with a lower containment and health index are considered more promising for the detection of natural solutions. We used the Containment and Health Index, which was developed by the Oxford Coronavirus Government Response Tracker (OxCGRT) project; a higher value of the index indicates a stricter response to the pandemic (100 for the strictest response). The index is based on thirteen policy-response indicators including school closures, workplace closures, travel bans, testing policy, contact tracing, face coverings, and vaccine policy. If countries with low containment observe fewer COVID-19 deaths or cases, there might be some unknown causes. Countries that did not achieve 50% in the COVID Data Transparency Index (totalanalysis.com/Covid19/TAIndex, accessed on 5 August 2021) were deleted in data mining, as the numbers of deaths and confirmed cases in these countries might not be accurate.
We ended up with a final list of countries with both a low containment index and low total COVID-19 confirmed cases (Table 1). Although African countries lack effective containment and health facilities, many of them have surprisingly low levels of total cases per million (<1000), e.g., Burundi, Burkina Faso, Congo Dem. Rep., Madagascar, Niger, Senegal, Somalia, Sudan, and Tanzania. S1 figures (supporting information [22]) show that most of the selected countries observe fewer COVID-19 deaths than the world average.
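The screening rules described in this section and in the note to Table 1 can be sketched as a simple filter. This is a minimal illustration rather than the authors' code: the `Country` record, the `screen_countries` helper, and all example values are hypothetical, and only the cut-offs (50% transparency, top 20 by lowest containment index, and 10,000 or 1000 total cases per million) are taken from the text.

```python
from typing import NamedTuple, Optional


class Country(NamedTuple):
    name: str
    containment: float             # Containment and Health Index (0-100)
    cases_per_million: float
    transparency: Optional[float]  # percent; None when no score is available


def screen_countries(rows, top_n=20):
    """Keep 'best-case' countries following the rules in the Table 1 note."""
    # Group A: transparency score available and at least 50%.
    group_a = [r for r in rows if r.transparency is not None and r.transparency >= 50]
    # Group B: no transparency score available.
    group_b = [r for r in rows if r.transparency is None]
    kept = []
    for group, max_cases in ((group_a, 10_000), (group_b, 1_000)):
        # Take the countries with the lowest containment index first,
        # then drop those whose case counts are too high.
        lowest = sorted(group, key=lambda r: r.containment)[:top_n]
        kept += [r for r in lowest if r.cases_per_million <= max_cases]
    return sorted(kept, key=lambda r: r.containment)


# Hypothetical records, for illustration only.
rows = [
    Country("A", 16.0, 9, None),       # kept: group B, cases <= 1000
    Country("B", 28.0, 1873, None),    # dropped: group B, cases > 1000
    Country("C", 45.0, 4595, 80.0),    # kept: group A, cases <= 10,000
    Country("D", 60.0, 49977, 75.0),   # dropped: group A, cases > 10,000
    Country("E", 20.0, 500, 30.0),     # dropped: transparency below 50%
]
selected = [c.name for c in screen_countries(rows)]
```

The additional manual exclusions mentioned in the Table 1 note (Singapore and Korea Rep.) are not reproduced in this sketch.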

2.2. Data Analysis

Knowledge engineering, a sub-field of artificial intelligence, involves transferring human knowledge into a database and representing expert knowledge and reasoning [23]. We adopted knowledge engineering logic to develop a system that sorts text data, isolates knowledge, and compiles a domain-specific knowledge base. The knowledge base consists of three ontologies: food, drink, and smell. While natural food and drink might serve as medicine, substances in the air may influence the process by which a virus spreads from infected people to others. The tokenizers of NLTK (Natural Language Toolkit) were used to convert narrative text into structured data. All files were then merged on the basis of these ontologies. This approach enabled us to connect concepts among sub-datasets and identify natural products spanning several countries, thereby addressing a common limitation of previous knowledge mining studies [24].
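The conversion of narratives into ontology-tagged records, followed by merging across countries, can be sketched as follows. This is an illustrative assumption, not the study's pipeline: a plain regular-expression tokenizer stands in for the NLTK tokenizers, the narratives are invented English sentences rather than the original Chinese text, and the `ONTOLOGY` lexicon, `tokenize`, and `tag_documents` are hypothetical names.

```python
import re
from collections import defaultdict

# Hypothetical lexicon for the three ontologies named in the text.
ONTOLOGY = {
    "food": {"mustard", "cassava", "yassa"},
    "drink": {"tea", "coffee"},
    "smell": {"incense", "spice"},
}


def tokenize(text):
    """Stand-in tokenizer: lower-case alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tag_documents(docs):
    """Map each (ontology, term) pair to the set of countries mentioning it."""
    merged = defaultdict(set)
    for country, text in docs.items():
        for token in tokenize(text):
            for ontology, terms in ONTOLOGY.items():
                if token in terms:
                    merged[(ontology, token)].add(country)
    return dict(merged)


# One merged document per country; the sentences are made up for illustration.
docs = {
    "Senegal": "Yassa is cooked with onions and mustard, served with tea.",
    "Norway": "Hot dogs with mustard were sold on every corner.",
}
merged = tag_documents(docs)
```

Keying the merged result by ontology and term, rather than by country, is what allows a single product to be linked to several countries at once.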

3. Results

3.1. Corpus Development and Knowledge Engineering

We developed a corpus of Chinese words related to eat, drink, food, smell, dish, cooking, taste, etc. The corpus was employed to screen out the sentences that do not contain food, drink, or smell information. These invalid sentences were dropped from the knowledge base. Then, seven research assistants (college students) were recruited and trained. They manually annotated each valid sentence with tags indicating the actual product name used among local people and classified the tags under each ontology. The annotated data were checked and verified until inter-rater reliability was over 90%.
The purpose of the knowledge engineering was to identify similarities among the sample countries in terms of food-consumption culture. The more documents connected to a node, the more important the factor should be. A notable challenge with the text-mining approach was isolating good candidates from the large number of nodes. We removed those nodes that are commonly consumed globally. A good candidate among the potential factors identified from the data is mustard, which emerges as an important node linking multiple documents/countries from different angles (Figure 2). Mustard has been mentioned frequently in narratives about Moroccan food and its dye industry. It has also been noted in other narratives as a common ingredient in Senegal’s Yassa (local dish), hot dogs in Norway, and brunch in Finland and Australia.
Mustard consumption is measured by net import, which is defined as imports minus exports of mustard. Net import data (product category 210330: mustard flour and meal and prepared mustard, by country) were provided by World Integrated Trade Solution (WITS), which was co-developed by the World Bank and the United Nations Conference on Trade and Development (UNCTAD). Net import is a good proxy for national consumption, particularly when consumption data for a specific product category are not available [27,28].
We tested the relationship between net import of mustard and COVID-19 deaths worldwide. If the nutrition and consumption of this food are related to the prevention, or cure, of this disease, then the net import of mustard should be negatively related to the number of COVID-19 deaths. The first COVID-19 vaccinations took place in mid-December 2020. To test the effect of mustard without interference from vaccination, we collected worldwide COVID-19 data from 1 March to 10 December 2020, a time period before vaccination.
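The text does not state which inter-rater reliability statistic was used. As one possible reading, the sketch below computes mean pairwise percent agreement across annotators; the `percent_agreement` helper, the annotator labels, and the tags are hypothetical.

```python
from itertools import combinations


def percent_agreement(annotations):
    """Mean pairwise agreement; annotations maps annotator -> one tag per sentence."""
    agree = total = 0
    for tags_a, tags_b in combinations(annotations.values(), 2):
        for a, b in zip(tags_a, tags_b):
            agree += (a == b)
            total += 1
    return agree / total


# Hypothetical tags from three annotators for the same four sentences.
annotations = {
    "ra1": ["mustard", "tea", "cassava", "mustard"],
    "ra2": ["mustard", "tea", "cassava", "mustard"],
    "ra3": ["mustard", "tea", "corn", "mustard"],
}
score = percent_agreement(annotations)
needs_another_round = score <= 0.90  # re-check annotations until agreement exceeds 90%
```

Chance-corrected statistics such as Cohen's or Fleiss' kappa would be stricter alternatives to raw agreement.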
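The node-selection logic (more linked documents means a more important node, with globally common products removed) can be sketched as below. The way globally common nodes were identified is not specified in the text, so the `global_share` cut-off is an assumption, as are the `rank_nodes` helper and the example data.

```python
def rank_nodes(term_countries, sample, all_countries, global_share=0.8):
    """Rank terms by the number of sample countries they link to.

    Terms mentioned in more than `global_share` of all countries are treated
    as globally common and removed (an assumed rule, for illustration).
    """
    ranked = []
    for term, countries in term_countries.items():
        if len(countries) / len(all_countries) > global_share:
            continue  # consumed almost everywhere, so uninformative
        links = len(countries & sample)
        if links:
            ranked.append((term, links))
    # Most-linked nodes first; ties broken alphabetically.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical mentions, for illustration only.
all_countries = {"Morocco", "Senegal", "Norway", "Finland",
                 "Australia", "Spain", "Belgium", "Qatar"}
sample = {"Morocco", "Senegal", "Norway", "Finland", "Australia"}
term_countries = {
    "rice": set(all_countries),                       # mentioned everywhere
    "mustard": {"Morocco", "Senegal", "Norway", "Finland", "Australia"},
    "baobab": {"Senegal"},
    "paella": {"Spain"},                              # not in the sample
}
ranking = rank_nodes(term_countries, sample, all_countries)
```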
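The net import definition amounts to a per-country subtraction. The sketch below assumes import and export figures are held in dictionaries keyed by country; the `net_import` helper, the figures, and their units are hypothetical.

```python
def net_import(imports, exports):
    """Net import per country = imports - exports.

    Countries missing from either table are skipped rather than imputed.
    """
    return {c: imports[c] - exports[c] for c in imports.keys() & exports.keys()}


# Hypothetical trade figures (units assumed, e.g., thousand USD).
imports = {"A": 120.0, "B": 40.0, "C": 15.0}
exports = {"A": 20.0, "B": 90.0}
net = net_import(imports, exports)
```

A negative value, as for country "B" here, indicates a net exporter.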

3.2. Hypothesis Testing

The fruits or vegetables mentioned in the narratives pertaining to countries with low levels of COVID deaths and containment index include cassava, pepper, cabbage, papaya, banana, orange, avocado, olive, cocoa, coconut, pineapple, mango, apple, watermelon, almonds, tomato, cucumber, shallot, fennel, pepper, potato, cherry, beans, corn, ginger, cinnamon, calyx pear, chamomile, sisal, eggplant, radish, red pomelo, tricholoma matsutake, jujube, jackfruit, avocado, litchi, plum, sugarcane, polo, baobab fruit, sakya, papaya, kiwifruit, and cactus fruit. We compared this list with the WITS food list and tested those foods for which we could find valid data in the WITS database. Both import and export data for 2018 were downloaded. Food categories consisting of multiple types of food were not considered (e.g., “080450—Fruit, edible; guavas, mangoes and mangosteens, fresh or dried”), as one cannot tell which particular food might cause an effect. Some vegetables or fruits, such as “cucumbers/gherkins” and “coconuts”, show non-trivial but small negative associations with COVID deaths (absolute correlation coefficient around 0.1), but none of these negative correlations exceeds 0.2 in absolute value (see Table 2).
We then analyzed other vegetables and fruits from the WITS trade database and ended up with 41 other types of foods for correlation tests. Grapes show a small but notable relationship with COVID deaths (correlation coefficient > 0.1 for total and new deaths). Again, mustard shows a much stronger relationship with COVID deaths than these 41 types of foods.
We used regression models with robust standard errors in the statistical tests. Because multiple observations, i.e., daily deaths, were recorded for each country, the determination of statistical significance was based on cluster-robust standard errors (clustered by country). This approach is usually more conservative, as well as more accurate, for hypothesis testing [29]. We hypothesize that mustard consumption is associated with COVID-19 outcomes. The dependent variables that we investigate in the statistical tests are total and new deaths of COVID-19. Correlation analyses indicate considerable associations [30] between net import of mustard and total deaths (Pearson’s r = −0.24, p < 0.05), and between net import of mustard and new deaths (Pearson’s r = −0.21, p < 0.05). The direction of the relationships is consistent with our hypothesis.
We developed two models to test the hypothesis (Table 3). As a healthy economy and demographics may assist nations in combatting the COVID-19 pandemic [31,32], we selected the following confounders: population, life expectancy, GDP per capita, cardiovascular death rate, diabetes prevalence, percent of population aged 70+, population density, and number of hospital beds (per thousand), which were drawn from Our World in Data (Global Change Data Lab). As national infrastructure may also facilitate an efficient reaction to COVID-19, we also controlled for confounding factors such as electricity and mobile subscriptions, drawing data from the World Bank.
We tested whether the net import of mustard has a negative relationship with total deaths. The results indicate that Model 1 explains a good proportion of the variance in total deaths (R² = 0.514) and shows a significant and negative effect of net import on total deaths (p = 0.020) after controlling for all of the confounding variables above. We next tested with Model 2 whether the net import of mustard could predict new deaths caused by COVID-19. The same set of variables was controlled for. Again, the net import of mustard shows a significant and negative relationship with new deaths (p = 0.034).
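To make the role of clustering concrete, the sketch below computes Pearson's r and a cluster-robust standard error for the slope of a simple regression in which daily observations are nested within countries. It is a simplified, bivariate illustration under stated assumptions: the authors' models include control variables, the helper names and data are hypothetical, and the standard error is the basic sandwich form without a small-sample correction.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def clustered_slope(x, y, clusters):
    """OLS slope of y on x (with intercept) and its cluster-robust standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    # Sum the scores (x - mean) * residual within each cluster before squaring,
    # so that correlated errors inside a country are not treated as independent.
    score = defaultdict(float)
    for a, b, g in zip(x, y, clusters):
        score[g] += (a - mx) * (b - intercept - slope * a)
    se = math.sqrt(sum(s ** 2 for s in score.values())) / sxx
    return slope, se


# Hypothetical data: net import is constant within a country, deaths vary by day.
x = [1, 1, 2, 2, 3, 3]
y = [10, 12, 8, 9, 5, 4]
clusters = ["A", "A", "B", "B", "C", "C"]
r = pearson_r(x, y)
slope, se = clustered_slope(x, y, clusters)
```

With only three clusters the estimate is purely illustrative; cluster-robust inference is generally considered reliable only with many clusters.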

3.3. Sensitivity Analysis for Additional COVID-19 Outcomes

COVID-19 outcomes can be alternatively measured with the number of confirmed cases. After controlling for all of the above confounding factors, the results indicate that the net import of mustard is negatively and significantly related to total (p = 0.046) and new cases (p = 0.037) (Table 4). Correlation analysis results indicate a considerable effect size for the associations between the net import of mustard and total cases (Pearson’s r = −0.19) and between the net import of mustard and new cases (Pearson’s r = −0.23).
Because deaths and confirmed cases are strongly influenced by governmental factors such as lockdown, stringency, testing policy, the extent of contact tracing, requirements to wear face coverings, and policies around vaccine rollout [33], we ran additional tests by dividing all sample countries into two groups: those with a high Containment Index (above 70) and those with a low Containment Index (≤70). Figure 3a–d indicates that the net import of mustard consistently has negative relationships with total (new) deaths and total (new) cases for both groups of countries. The negative relationship is stronger for countries with a high containment index, with r ranging from −0.749 to −0.688.
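The subgroup comparison can be sketched by splitting observations at the containment threshold of 70 and computing a correlation within each group. The record layout, the `correlation_by_containment` helper, and the data are hypothetical; only the threshold and the direction of the comparison come from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_containment(records, threshold=70.0):
    """records: iterable of (containment_index, net_import, outcome) tuples."""
    groups = {"high": ([], []), "low": ([], [])}
    for containment, net_import, outcome in records:
        key = "high" if containment > threshold else "low"  # 70 itself counts as low
        groups[key][0].append(net_import)
        groups[key][1].append(outcome)
    # Require at least three points per group before reporting a correlation.
    return {k: pearson_r(x, y) for k, (x, y) in groups.items() if len(x) > 2}


# Hypothetical observations, for illustration only.
records = [
    (75, 1, 9), (80, 2, 7), (72, 3, 2),   # high containment
    (40, 1, 5), (55, 2, 6), (70, 3, 4),   # low containment
]
result = correlation_by_containment(records)
```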

4. Discussion

The proposed workflow can accelerate the identification of potential natural products against COVID-19. This novel method has been illustrated by drawing from experiences and writings relating to natural products from over 200 countries. Although narratives pertaining to a particular country are contextual, a cross-country comparison of such data may reveal new and natural solutions against the disease. Since both cross-country NPN and product data can be collected from online or public sources, our approach is general and time-efficient, and requires no additional inputs from metabolic engineering or gene expression.
The case of mustard provides a proof of concept. The net import of mustard has been found to be associated with reduced deaths, particularly when nations manage to raise the containment index above 70 (r becomes around −0.7). The findings are robust to the use of confirmed case data. These findings provide new explanations of why some countries, although they have fewer resources in terms of vaccines, crisis containment, and healthcare facilities, have not been seriously harmed by the pandemic.
By examining the relationships between the net import of other foods and COVID-19 outcomes, we found several notable positive correlations (>0.2) between COVID-19 total deaths and the net import of almond (0.317), pineapple (0.292), grapes (0.277), and apples (0.238). These fruits, although not associated with a reduced number of COVID-19 deaths, might provide new hints on the factors relating to the growing number of COVID-19 deaths. Future research could address this issue by investigating the nutrition or chemical mechanisms underlying the connections between these fruits and COVID-19.
This study was limited by the scope of the destination data. Although we collected a fairly good number of travel writings on 209 nations or regions, the dataset does not cover every country in the world and does not convey a complete picture of local consumption and culture. Future studies could investigate the molecular mechanism pertaining to mustard, particularly mustard seed, which has been used to make sauce and dye in our sample countries. Mustard has been found to be useful in the treatment of pneumonia, asthma, or cough and in relieving pain symptoms, such as headaches and neuralgia. Since fermentation with Lactobacillus plantarum can enhance the anti-inflammatory activity of mustard leaves, mustard leaves fermented by this microorganism may be helpful in the treatment of inflammation [34]. The SARS-CoV-2 protein 3CLPro is essential for successful viral replication. A recent study found that a glucosinolate derivative found in mustard seeds is a potent inhibitor of SARS-CoV-2 3CLPro [35].
Future work may also try other data-driven or algorithm-driven approaches to the identification of foods related to COVID-19. While our workflow is developed on the basis of a knowledge graph that is exploratory in nature and built on domain-specific knowledge, deep learning applications in NLP, particularly BERT models, have good potential for the efficient identification of food-related words or terms from large social network sites, such as Twitter. For example, in a recent healthcare study, researchers developed an NLP algorithm and accurately extracted sleep parameters from polysomnography text noted in electronic medical records [36].

Author Contributions

G.Z. developed the idea and research. G.Z. wrote the first draft of the manuscript, and all other authors discussed the method and results. F.Y. collected and validated the text data. L.Z. edited the manuscript. G.Z., L.Z., and H.W. performed the analyses. H.W. generated all figures. All authors have read and agreed to the published version of the manuscript.

Funding

G.Z. acknowledges support from the National Natural Science Foundation of China (Key Program, grant code 71832015), Higher Education Enhancement Plan by the Guangdong Education Department (UICR0400011-21), UIC Research Grant (R202027), and Joint Research Project from the Guangdong Planning Office of Philosophy and Social Science (GD20XGL55).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Sample data and aggregated statistics for replication and academic research purposes are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Boum, I.Y.; Ouattara, A.; Torreele, E.; Okonta, C. How to ensure a needs-driven and community-centred vaccination strategy for COVID-19 in Africa. BMJ Glob. Health 2021, 6, e005306.
  2. Garcia-Beltran, W.F.; Lam, E.C.; St Denis, K.; Nitido, A.D.; Garcia, Z.H.; Hauser, B.M.; Feldman, J.; Pavlovic, M.N.; Gregory, D.J.; Poznansky, M.C.; et al. Multiple SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity. Cell 2021, 184, 2372–2383.e9.
  3. Zhou, D.; Dejnirattisai, W.; Supasa, P.; Liu, C.; Mentzer, A.J.; Ginn, H.M.; Zhao, Y.; Duyvesteyn, H.M.E.; Tuekprakhon, A.; Nutalai, R.; et al. Evidence of escape of SARS-CoV-2 variant B.1.351 from natural and vaccine-induced sera. Cell 2021, 184, 2348–2361.e6.
  4. Lopez Bernal, J.; Andrews, N.; Gower, C.; Gallagher, E.; Simmons, R.; Thelwall, S.; Stowe, J.; Tessier, E.; Groves, N.; Dabrera, G.; et al. Effectiveness of Covid-19 vaccines against the B.1.617.2 (Delta) variant. N. Engl. J. Med. 2021, 385, 585–594.
  5. Gaziano, L.; Giambartolomei, C.; Pereira, A.C.; Gaulton, A.; Posner, D.C.; Swanson, S.A.; Ho, Y.-L.; Iyengar, S.K.; Kosik, N.M.; Vujkovic, M.; et al. Actionable druggable genome-wide Mendelian randomization identifies repurposing opportunities for COVID-19. Nat. Med. 2021, 27, 668–676.
  6. Guy, R.K.; DiPaola, R.S.; Romanelli, F.; Dutch, R.E. Rapid repurposing of drugs for COVID-19. Science 2020, 368, 829–830.
  7. Shyr, Y.; Berry, L.D.; Hsu, C.Y. Scientific rigor in the age of COVID-19. JAMA Oncol. 2021, 7, 171–172.
  8. Doroghazi, J.R.; Albright, J.C.; Goering, A.W.; Ju, K.S.; Haines, R.R.; Tchalukov, K.A.; Labeda, D.P.; Kellehe, N.L.; Metcalf, W.W. A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol. 2014, 10, 963–968.
  9. Kersten, R.D.; Yang, Y.L.; Xu, Y.; Cimermancic, P.; Nam, S.J.; Fenical, W.; Fischbach, M.A.; Moore, B.S.; Dorrestein, P.C. A mass spectrometry–guided genome mining approach for natural product peptidogenomics. Nat. Chem. Biol. 2011, 7, 794–802.
  10. Elbattah, M.; Arnaud, É.; Gignon, M.; Dequen, G. The role of text analytics in Healthcare: A review of recent developments and applications. In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, Vienna, Austria, 11–13 February 2021; pp. 825–832.
  11. Ong, S.-Q.; Pauzi, M.B.M.; Gan, K.H. Text mining and determinants of sentiments towards the COVID-19 vaccine booster of twitter users in Malaysia. Healthcare 2022, 10, 994.
  12. Xu, H.; Zeng, J.; Tai, Z.; Hao, H. Public attention and sentiment toward intimate partner violence based on Weibo in China: A text mining approach. Healthcare 2022, 10, 198.
  13. Nagashima, T.; Shirakawa, H.; Nakagawa, T.; Kaneko, S. Prevention of antipsychotic-induced hyperglycaemia by vitamin D: A data mining prediction followed by experimental exploration of the molecular mechanism. Sci. Rep. 2016, 6, 26375.
  14. Sarangdhar, M.; Tabar, S.; Schmidt, C.; Kushwaha, A.; Shah, K.; Dahlquist, J.E.; Jegga, A.G.; Aronow, B.J. Data mining differential clinical outcomes associated with drug regimens using adverse event reporting data. Nat. Biotechnol. 2016, 34, 697–700.
  15. Michel, F.; Gandon, F.; Ah-Kane, V.; Bobasheva, A.; Cabrio, E.; Corby, O.; Gazzotti, R.; Giboin, A.; Marro, S.; Mayer, T.; et al. Covid-on-the-Web: Knowledge graph and services to advance COVID-19 research. In Proceedings of the International Semantic Web Conference, Athens, Greece, 1–6 November 2020; pp. 294–310.
  16. Steenwinckel, B.; Vandewiele, G.; Rausch, I.; Heyvaert, P.; Taelman, R.; Colpaert, P.; Simoens, P.; Dimou, A.; De Turck, F.; Ongenae, F. Facilitating the analysis of COVID-19 literature through a knowledge graph. In Proceedings of the International Semantic Web Conference, Athens, Greece, 1–6 November 2020; pp. 344–357.
  17. Tsai, C.-T.S. Memorable tourist experiences and place attachment when consuming local food. Int. J. Tour. Res. 2016, 18, 536–548.
  18. Kim, S.; Choe, J.Y.; King, B.; Oh, M.; Otoo, F.E. Tourist perceptions of local food: A mapping of cultural values. Int. J. Tour. Res. 2021; in print.
  19. Brauner, J.M.; Mindermann, S.; Sharma, M.; Johnston, D.; Salvatier, J.; Gavenčiak, T.; Stephenson, A.B.; Leech, G.; Altman, G.; Mikulik, V.; et al. Inferring the effectiveness of government interventions against COVID-19. Science 2021, 371, eabd9338.
  20. Hsiang, S.; Allen, D.; Annan-Phan, S.; Bell, K.; Bolliger, I.; Chong, T.; Druckenmiller, H.; Huang, L.Y.; Hultgren, A.; Krasovich, E.; et al. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 2020, 584, 262–267.
  21. Liang, L.L.; Tseng, C.H.; Ho, H.J.; Wu, C.Y. Covid-19 mortality is negatively associated with test number and government effectiveness. Sci. Rep. 2020, 10, 12567.
  22. Coronavirus Resource Center, Data Notes by Regions, by Johns Hopkins University & Medicine. Available online: https://coronavirus.jhu.edu/region (accessed on 5 August 2021).
  23. Shi, Z. Intelligence Science: Leading the Age of Intelligence; Tsinghua University Press: Beijing, China, 2021; pp. 1–31.
  24. Cernile, G.; Heritage, T.; Sebire, N.J.; Gordon, B.; Schwering, T.; Kazemlou, S.; Borecki, Y. Network graph representation of COVID-19 scientific publications to aid knowledge discovery. BMJ Health Care Inform. 2021, 28, e100254.
  25. Van Eck, N.J.; Waltman, L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010, 84, 523–538.
  26. Van Eck, N.J.; Waltman, L. Text mining and visualization using VOSviewer. arXiv 2011, arXiv:1109.2058.
  27. Feuerstein, S. Do coffee roasters benefit from high prices of green coffee? Int. J. Indus. Org. 2002, 20, 89–118.
  28. Michael, S.W. Application of a dynamic panel data estimator to cross-country coffee demand: A tale of two eras. J. Econ. Dev. 2009, 34, 1–17.
  29. Cameron, A.C.; Gelbach, J.B.; Miller, D.L. Robust inference with multiway clustering. J. Bus. Econ. Stat. 2011, 29, 238–249.
  30. Ellis, P.D. The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results; Cambridge University Press: Cambridge, UK, 2010.
  31. Gaye, B.; Khoury, S.; Cene, C.W.; Kingue, S.; N’Guetta, R.; Lassale, C.; Baldé, D.; Diop, I.B.; Dowd, J.B.; Mills, M.C.; et al. Socio-demographic and epidemiological consideration of Africa’s COVID-19 response: What is the possible pandemic course? Nat. Med. 2020, 26, 996–999.
  32. Sharma, A.; Borah, S.B.; Moses, A.C. Responses to COVID-19: The role of governance, healthcare infrastructure, and learning from past pandemics. J. Bus. Res. 2021, 122, 597–607.
  33. Haider, N.; Osman, A.Y.; Gadzekpo, A.; Akipede, G.O.; Asogun, D.; Ansumana, R.; Lessells, R.J.; Khan, P.; Hamid, M.M.A.; Yeboah-Manu, D.; et al. Lockdown measures in response to COVID-19 in nine sub-Saharan African countries. BMJ Glob. Health 2020, 5, e003319.
  34. Le, B.; Anh, P.T.N.; Yang, S.H. Enhancement of the anti-inflammatory effect of mustard kimchi on RAW 264.7 macrophages by the Lactobacillus plantarum fermentation-mediated generation of phenolic compound derivatives. Foods 2020, 9, 181.
  35. Guijarro-Real, C.; Plazas, M.; Rodríguez-Burruezo, A.; Prohens, J.; Fita, A. Potential in vitro inhibition of selected plant extracts against SARS-CoV-2 chymotripsin-like protease (3CLPro) activity. Foods 2021, 10, 1503.
  36. Rahman, M.; Nowakowski, S.; Agrawal, R.; Naik, A.; Sharafkhaneh, A.; Razjouyan, J. Validation of a natural language processing algorithm for the extraction of the sleep parameters from the polysomnography reports. Healthcare 2022, 10, 1837.
Figure 1. A workflow for exploring natural products narratives (NPN). Yellow boxes represent the use of cross-country data for the screening and selection of “best case” countries, e.g. those with lower confirmed cases and deaths, and for statistical testing. Blue boxes show a knowledge engineering process that transforms text into structured data and then visualizes promising nodes (natural products) in the knowledge graph. The orange box indicates domain knowledge in the knowledge engineering process. The green box shows experiments needed before confirming the identification of natural products.
Figure 2. Knowledge graph of nodes connected to mustard. The graph was made by VOSviewer [25,26]. Circle size reflects the number of links. Nodes from the same country are clustered with the same color.
Figure 3. Factors correlated with the net import of mustard. (A,B) Total and new deaths caused by COVID-19 are negatively related to net import (log transformed). Brown lines represent a high containment index (>70), while blue lines represent a low containment index (≤70). (C,D) Total and new COVID-19 confirmed cases are negatively related to net import. The correlations are compared with different levels of the containment index.
Table 1. Countries ranked by containment index.
Country Name | Containment Index | Total Cases/M | Country Name | Containment Index | Total Cases/M
Nicaragua | 12.50 | 881 | Somalia | 38.69 | 280
Tanzania | 16.37 | 9 | Finland | 44.94 | 4595
Burundi | 17.26 | 58 | Denmark | 48.21 | 14,238
Yemen Rep. | 18.45 | 70 | Norway | 50.60 | 6750
Afghanistan | 23.81 | 1195 | United Arab Emirates | 52.62 | 17,203
Central African Rep. | 24.40 | 1018 | Netherlands | 52.98 | 31,289
Congo Dem. Rep. | 24.70 | 144 | Singapore | 52.98 | 9953
Sudan | 26.19 | 416 | South Africa | 55.36 | 13,359
Mauritius | 26.79 | 397 | Sweden | 57.14 | 25,819
Niger | 27.08 | 66 | Australia | 57.14 | 1095
Mauritania | 28.57 | 1873 | Korea Rep. | 58.63 | 686
Burkina Faso | 28.57 | 140 | Bahrain | 59.52 | 51,209
New Zealand | 32.14 | 427 | Belgium | 60.71 | 49,977
Syria | 33.04 | 456 | Lithuania | 61.31 | 22,963
Congo Rep. | 35.71 | 1046 | Germany | 61.31 | 13,065
Eswatini | 36.90 | 5553 | Qatar | 61.90 | 48,246
Tajikistan | 36.90 | 1282 | Morocco | 62.20 | 9749
Senegal | 36.90 | 962 | United Kingdom | 63.10 | 24,265
Madagascar | 36.90 | 626 | Luxembourg | 63.10 | 56,119
Haiti | 38.10 | 815 | Spain | 63.69 | 35,428
Note: Total cases/M is total cases per million. Countries in the table with both a low containment index and low total confirmed COVID-19 cases were included in the analysis. Countries that failed to manage data transparently (<50% in the Covid Data Transparency Index) were excluded from data mining. Group A lists countries with good data transparency (≥50%), and group B includes countries for which transparency data are not available. In both groups, the 20 countries with the lowest containment index were listed. In group A, total cases vary widely, ranging from 427 to 56,119. As the aim of data mining was to explore “best-case” countries that may provide hints of natural foods against COVID-19, we deleted countries with total confirmed cases above 10,000. Singapore and Korea Rep. have demonstrated good administrative practice in pandemic control, so both countries were removed from the sample. In group B, where countries have smaller variance, we deleted those with total cases above 1000.
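The screening rules in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Country` record, the `screen` function, and all transparency scores are hypothetical (the table does not report transparency values, so the group membership shown here is a placeholder), and the name-based exclusion is applied to both groups for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Country:
    name: str
    containment: float             # containment index
    cases_per_million: float       # total confirmed cases per million
    transparency: Optional[float]  # percent; None when no score is available


def screen(countries: List[Country], top_n: int = 20) -> Tuple[List[Country], List[Country]]:
    """Apply the screening rules described in the note to Table 1."""
    # Group A: transparency >= 50%. Group B: no transparency score.
    # Countries scoring below 50% fall into neither group.
    group_a = [c for c in countries if c.transparency is not None and c.transparency >= 50]
    group_b = [c for c in countries if c.transparency is None]

    def lowest(group: List[Country]) -> List[Country]:
        # Keep the top_n countries with the lowest containment index.
        return sorted(group, key=lambda c: c.containment)[:top_n]

    removed_by_name = {"Singapore", "Korea Rep."}
    keep_a = [c for c in lowest(group_a)
              if c.cases_per_million <= 10_000 and c.name not in removed_by_name]
    keep_b = [c for c in lowest(group_b)
              if c.cases_per_million <= 1_000 and c.name not in removed_by_name]
    return keep_a, keep_b


# Containment index and cases/M are taken from Table 1 where a real name is used;
# every transparency value below is a placeholder, not a reported score.
sample = [
    Country("New Zealand", 32.14, 427, 80.0),
    Country("Finland", 44.94, 4595, 80.0),
    Country("Luxembourg", 63.10, 56119, 80.0),
    Country("Singapore", 52.98, 9953, 80.0),
    Country("Tanzania", 16.37, 9, None),
    Country("Niger", 27.08, 66, None),
    Country("Hypothetical X", 20.0, 1500, None),
    Country("Hypothetical Y", 10.0, 100, 30.0),
]

kept_a, kept_b = screen(sample)
print([c.name for c in kept_a])
print([c.name for c in kept_b])
```

Keeping the thresholds (10,000 and 1000) as explicit literals makes it easy to check how sensitive the resulting "best-case" list is to those cut-offs.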
Table 2. Correlation tests between selected foods and COVID-19 outcomes.
| Food | WITS code | N | Total cases | New cases | Total deaths | New deaths |
|---|---|---|---|---|---|---|
| Potatoes | 70,190 | 14,757 | 0.0864 | 0.0946 | 0.1272 | 0.0735 |
| Cucumbers and gherkins | 70,700 | 12,300 | −0.0672 | −0.0694 | −0.106 | −0.0729 |
| Fruits of the genus Capsicum or of the genus Pimenta | 70,960 | 14,621 | −0.0302 | −0.0465 | −0.05 | −0.0459 |
| Mushrooms and truffles | 71,230 | 14,191 | −0.0294 | 0.0636 | −0.0426 | −0.0023 |
| Coconut | 80,110 | 14,378 | −0.0201 | −0.0801 | −0.0628 | −0.1166 |
| Almonds | 80,212 | 14,312 | 0.231 | 0.1922 | 0.3168 | 0.2081 |
| Pineapples | 80,430 | 13,420 | 0.2359 | 0.1707 | 0.2916 | 0.1785 |
| Oranges | 80,510 | 14,534 | −0.014 | −0.0022 | −0.0838 | −0.074 |
| Citrus fruit | 80,590 | 10,325 | −0.0767 | −0.0579 | −0.0662 | −0.0522 |
| Grapes | 80,620 | 13,837 | 0.2066 | 0.227 | 0.2766 | 0.2479 |
| Papaws | 80,720 | 8381 | 0.1354 | 0.1258 | 0.1708 | 0.1062 |
| Plums and sloes | 80,940 | 12,994 | 0.0083 | 0.0238 | 0.0039 | 0.0275 |
| Apples | 81,330 | 11,232 | 0.1649 | 0.1726 | 0.2379 | 0.1767 |
Note: WITS product codes and correlation coefficients are shown.
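For readers who want to reproduce this kind of screening, the sketch below computes a correlation coefficient between a trade series and a COVID-19 outcome. It assumes the Pearson coefficient, which the table note does not specify, and the function name `pearson` and the two short series are hypothetical illustrations rather than the study's data.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: net import of one product and a log-transformed outcome.
net_import = [1.0, 2.0, 3.0, 4.0, 5.0]
log_total_deaths = [10.0, 8.0, 6.0, 4.0, 2.0]

r = pearson(net_import, log_total_deaths)
print(f"r = {r:.4f}")
```

Running the same function once per product code and outcome would yield a grid of coefficients with the same shape as Table 2.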
Table 3. Mustard net import and deaths caused by COVID-19.
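Model 1: Total Deaths; Model 2: New Deaths.

| Variable | Model 1 Coef. | Model 1 SE | Model 1 t | Model 1 p > t | Model 2 Coef. | Model 2 SE | Model 2 t | Model 2 p > t |
|---|---|---|---|---|---|---|---|---|
| Net import | −0.011 | 0.005 | −2.37 | 0.020 | −0.011 | 0.005 | −2.17 | 0.034 |
| Containment index | 0.010 | 0.011 | 0.90 | 0.371 | 0.014 | 0.010 | 1.48 | 0.144 |
| Log(population) | 1.051 | 0.088 | 11.91 | 0.000 | 0.692 | 0.067 | 10.32 | 0.000 |
| Life expectancy | −0.146 | 0.063 | −2.33 | 0.023 | −0.173 | 0.036 | −4.74 | 0.000 |
| Log(GDP per capita) | 1.181 | 0.357 | 3.31 | 0.001 | 0.315 | 0.227 | 1.39 | 0.169 |
| Cardiovasc death rate | −0.002 | 0.001 | −1.70 | 0.094 | −0.003 | 0.001 | −3.69 | 0.000 |
| Diabetes prevalence | −0.053 | 0.040 | −1.33 | 0.189 | −0.032 | 0.033 | −0.96 | 0.340 |
| Aged 70 | 0.097 | 0.051 | 1.90 | 0.062 | 0.089 | 0.032 | 2.78 | 0.007 |
| Log(population density) | 0.008 | 0.129 | 0.06 | 0.953 | −0.115 | 0.093 | −1.24 | 0.220 |
| Hospital beds (1000) | −0.144 | 0.059 | −2.46 | 0.016 | −0.140 | 0.039 | −3.55 | 0.001 |
| Electricity | 0.054 | 0.019 | 2.81 | 0.006 | 0.082 | 0.015 | 5.62 | 0.000 |
| Mobile subscriptions | −0.023 | 0.008 | −3.07 | 0.003 | −0.008 | 0.004 | −1.83 | 0.072 |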
Note: Unit of net import: million USD. Total and new deaths are log transformed. The p values are from two-sided t tests; SEs are clustered at the country level (73 countries); p values < 0.05 indicate statistical significance. Model 1: number of obs. = 19,530; F(12, 72) = 91.93; Prob > F = 0.000; R² = 0.514; Root MSE = 1.981. Model 2: number of obs. = 13,412; F(12, 72) = 18.90; Prob > F = 0.000; R² = 0.431; Root MSE = 1.480.
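Because each country contributes many observations, the standard errors in the note are clustered by country. The sketch below shows the idea for the simplest case of one regressor plus an intercept; it is not the twelve-regressor model reported in the table. The function name, the data, and the finite-sample correction (G/(G−1) × (N−1)/(N−K), a common default in statistical packages) are assumptions, since the note does not state which correction was used.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence, Tuple


def clustered_slope_se(x: Sequence[float], y: Sequence[float],
                       clusters: Sequence[Hashable]) -> Tuple[float, float]:
    """OLS slope of y on x (with intercept) and its cluster-robust standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Sum the score (x_i - x_bar) * residual_i within each cluster, so that
    # correlated errors inside a country are not treated as independent.
    score = defaultdict(float)
    for xi, yi, g in zip(x, y, clusters):
        residual = yi - (intercept + slope * xi)
        score[g] += (xi - x_bar) * residual

    n_clusters = len(score)
    k = 2  # estimated parameters: intercept and slope
    correction = (n_clusters / (n_clusters - 1)) * ((n - 1) / (n - k))
    variance = correction * sum(s ** 2 for s in score.values()) / sxx ** 2
    return slope, math.sqrt(variance)


# Hypothetical panel: three countries with two observations each.
net_import = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log_deaths = [11.9, 10.1, 7.8, 6.2, 3.9, 2.1]
country = ["A", "A", "B", "B", "C", "C"]

slope, se = clustered_slope_se(net_import, log_deaths, country)
print(f"slope = {slope:.3f}, clustered SE = {se:.3f}, t = {slope / se:.2f}")
```

With clustering, the relevant degrees of freedom come from the number of clusters rather than the number of observations, which is consistent with the F(12, 72) statistics reported for 73 countries.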
Table 4. Sensitivity analysis: confirmed cases as COVID-19 outcomes.
Model 3: Total Cases; Model 4: New Cases.

| Variable | Model 3 Coef. | Model 3 SE | Model 3 t | Model 3 p > t | Model 4 Coef. | Model 4 SE | Model 4 t | Model 4 p > t |
|---|---|---|---|---|---|---|---|---|
| Net import | −0.010 | 0.005 | −2.030 | 0.046 | −0.017 | 0.008 | −2.120 | 0.037 |
| Containment index | 0.012 | 0.010 | 1.210 | 0.228 | 0.007 | 0.018 | 0.370 | 0.710 |
| Log(population) | 0.942 | 0.079 | 11.900 | 0.000 | 0.763 | 0.141 | 5.420 | 0.000 |
| Life expectancy | −0.047 | 0.069 | −0.680 | 0.502 | −0.159 | 0.072 | −2.210 | 0.030 |
| Log(GDP per capita) | 1.502 | 0.348 | 4.310 | 0.000 | 1.383 | 0.417 | 3.310 | 0.001 |
| Cardiovasc death rate | 0.000 | 0.001 | 0.240 | 0.813 | −0.001 | 0.001 | −0.390 | 0.701 |
| Diabetes prevalence | −0.020 | 0.041 | −0.490 | 0.627 | −0.028 | 0.052 | −0.540 | 0.593 |
| Aged 70 | −0.002 | 0.046 | −0.050 | 0.963 | 0.011 | 0.058 | 0.190 | 0.848 |
| Log(population density) | 0.008 | 0.081 | 0.100 | 0.920 | −0.009 | 0.119 | −0.080 | 0.940 |
| Hospital beds (1000) | −0.074 | 0.054 | −1.380 | 0.172 | −0.088 | 0.067 | −1.320 | 0.192 |
| Electricity | 0.011 | 0.019 | 0.600 | 0.553 | 0.039 | 0.020 | 1.990 | 0.050 |
| Mobile subscriptions | −0.014 | 0.006 | −2.130 | 0.036 | −0.014 | 0.009 | −1.500 | 0.137 |
Note: Unit of net import: million USD. Total and new cases are log transformed. The p values are from two-sided t tests; SEs are clustered at the country level (74 countries); p values < 0.05 indicate statistical significance. Model 3: number of obs. = 20,853; F(12, 73) = 42.48; Prob > F = 0.000; R² = 0.349; Root MSE = 2.358. Model 4: number of obs. = 18,889; F(12, 73) = 13.39; Prob > F = 0.000; R² = 0.287; Root MSE = 2.173.
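Since the outcomes are log transformed and net import is measured in million USD, each net import coefficient can be read as an approximate percentage change in the outcome per additional million USD of net import. The short calculation below makes that reading explicit. It assumes the natural logarithm was used (the notes do not state the base), the helper name `percent_change` is hypothetical, and the result describes an association, not a causal effect.

```python
import math


def percent_change(coef: float, delta: float = 1.0) -> float:
    """Percent change in the outcome for a `delta` increase in the regressor,
    given a log-linear model ln(outcome) = ... + coef * regressor."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# Net import coefficients as reported in Tables 3 and 4.
net_import_coefs = {
    "total deaths (Model 1)": -0.011,
    "new deaths (Model 2)": -0.011,
    "total cases (Model 3)": -0.010,
    "new cases (Model 4)": -0.017,
}

for outcome, coef in net_import_coefs.items():
    print(f"{outcome}: {percent_change(coef):.2f}% per additional million USD")
```

For small coefficients such as these, the exact figure is close to 100 times the coefficient, so −0.011 corresponds to roughly a 1.1% lower outcome per additional million USD.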
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Zhan, G.; Yang, F.; Zhang, L.; Wang, H. The Relationship between Mustard Import and COVID-19 Deaths: A Workflow with Cross-Country Text Mining. Healthcare 2022, 10, 2071. https://doi.org/10.3390/healthcare10102071
