Identifying Key Drivers of Foodborne Diseases in Zhejiang, China: A Machine Learning Approach
Abstract
1. Introduction
2. Data
2.1. Data Source
2.2. Data Integration
2.3. Data Description
3. Methods
3.1. Machine Learning Algorithms
3.2. Model Implementation
4. Results
4.1. Trends in Pathogen Composition of Foodborne Diseases in Zhejiang Province
4.2. Risk Drivers of Foodborne Disease Types
4.2.1. Model Performance Comparison
4.2.2. Feature Importance Analysis
4.3. Time Pattern Analysis of Foodborne Disease Types
5. Conclusions
6. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Code | Items | Contents |
---|---|---|
1 | Case ID | *** |
2 | Outpatient Number | *** |
3 | Follow-up Visit (Yes/No) | No |
4 | Hospitalized (Yes/No) | No |
5 | Inpatient Number | *** |
6 | Gender | Female |
7 | Date of Birth | 1999-01-01 |
8 | Age | 26 |
9 | Occupation | Healthcare workers |
10 | Workplace | *** |
11 | Patient Category | Other districts in this city |
12 | Address—Province | Zhejiang |
13 | Address—City | Ningbo |
14 | Address—District/County | Haishu |
15 | Detailed Address | *** |
16 | Onset Date | 2023-02-16 02 |
17 | Consultation Date | 2023-02-16 16 |
18 | Time of Death | - |
19 | Main Symptoms and Signs | [Digestive system] Nausea, vomiting once/day |
20 | Medical History | None |
21 | Preliminary Diagnosis | [Viral] Norovirus |
22 | Antibiotic Use Before Consultation (Yes/No) | No |
23 | Suspected Foodborne Case (Yes/No) | Yes |
24 | Biological Sample (Yes/No) | Yes |
25 | Food Name | beef noodles |
26 | Food Category | Meat and meat products |
27 | Processing and Packaging Method | Catering service industry |
28 | Purchase Province | Zhejiang |
29 | Purchase City | Ningbo |
30 | Purchase District/County | Jiangbei |
31 | Detailed Purchase Address | *** |
32 | Purchase Location Type | Shop |
33 | Eating Location—Province | Zhejiang |
34 | Eating Location—City | Ningbo |
35 | Eating Location—District/County | Jiangbei |
36 | Detailed Eating Location Address | *** |
37 | Eating Location Type | Catering |
38 | Number of People Dining Together | 1 |
39 | Eating Time | 2023-02-10 19 |
40 | Food Sample Collected (Yes/No) | No |
41 | Other People Affected (Yes/No) | No |
42 | Sample ID | *** |
43 | Sample Type | Fecal sample |
44 | Sample Value | 5 |
45 | Sample Unit | Pcs |
46 | Sample Collection Date | 2023-02-13 |
47 | Test Item | Norovirus |
48 | Testing Institution | *** |
49 | Testing Date | 2023-02-13 |
50 | Qualitative Result | + |
51 | Testing Unit | |
52 | Strain Isolated (Yes/No) | Yes |
53 | Strain ID | *** |
54 | Identification Method | Normal PCR |
55 | Target Gene Detection | - |
56 | Serotyping | - |
57 | Identification Conclusion | Norovirus |
91 | Strain Depository Organization | *** |
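
The onset, consultation, and eating times in the sample record above are recorded to the hour (for example, `2023-02-16 02`). As a minimal sketch of how an interval in hours could be derived from two such fields, the following assumes the layout is `YYYY-MM-DD HH`; the format string and the helper name `hours_between` are illustrative assumptions, not taken from the study's pipeline.

```python
from datetime import datetime

# Assumed timestamp layout "YYYY-MM-DD HH", inferred from the sample record.
TIMESTAMP_FORMAT = "%Y-%m-%d %H"


def hours_between(start: str, end: str) -> float:
    """Return the number of hours elapsed from `start` to `end`."""
    t0 = datetime.strptime(start, TIMESTAMP_FORMAT)
    t1 = datetime.strptime(end, TIMESTAMP_FORMAT)
    return (t1 - t0).total_seconds() / 3600.0


# Values copied from the sample record (items 16, 17, and 39).
onset = "2023-02-16 02"
consultation = "2023-02-16 16"
eating = "2023-02-10 19"

delay_hours = hours_between(onset, consultation)   # onset to consultation
exposure_hours = hours_between(eating, onset)      # eating to onset

print(delay_hours, exposure_hours)
```

For the sample record, the onset-to-consultation interval is 14 hours.
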
Code | Variables | Description | Data Cleaning Rules | Data Sources |
---|---|---|---|---|
1 | Prefecture | The patient’s municipal address | Encoded with the corresponding administrative division codes, for example: Hangzhou → 3301; Nanjing → 3201; Wuhan → 4201. | Zhejiang Provincial Center for Disease Control and Prevention, 2014–2023. |
2 | Food | Food category | The encoding for food categories is as follows: Grains and their products (including starch sugars, baked goods, and various staple foods) = 1; Meat and meat products = 2; Oils and fats = 2; Aquatic animals and their products = 3; Eggs and egg products = 4; Dairy products = 4; Fruits and their products (including dried and preserved fruits) = 5; Vegetables and their products = 6; Fungi and their products = 6; Algae and their products = 6; Legumes and their products = 7; Nuts and seeds and their products = 7; Beverages and frozen drinks = 8; Alcoholic beverages and their products = 8; Packaged drinking water (including bottled water) = 8; Mixed foods = 9; Various foods = 9; Blank = 10; Packaged bulk products = 10; Unknown foods = 10; Condiments = 10; Other foods = 10; Candies, chocolates, honey, and their products = 10; Infant foods = 10 | |
3 | Age | The patient’s age, which includes different units such as “years,” “months,” and “days.” | The age values are standardized and converted into numerical values expressed in years. For example: “40 years old” → 40; “1 year and 4 months” → 1 + 4/12 ≈ 1.33; “8 months” → 8/12 ≈ 0.67; “9 months and 30 days” → 9/12 + 30/365 ≈ 0.83 | |
4 | Sex | Record the patient’s gender. | Convert to numerical values: Male = 1; Female = 0. | |
5 | Occupation | Indicating the occupation of the patient | The coding for numerical values is as follows: Catering industry = 1; Migrant workers = 2; Unknown = 4; Teachers = 3; Students = 3; Others = 4; Farmers = 2; Dispersed children = 2; Commercial services = 1; Pastoralists = 2; Fishermen = 2; Administrative staff = 3; Preschool children = 3; Healthcare workers = 3; Retired personnel = 3; Homemakers and unemployed = 2; Workers = 2. | |
6 | Eat place | The type of location where food is consumed. | The categorical variables were encoded as numerical values: Catering industry = 1; Canteen = 2; Household = 3; Rural banquets = 1; Retail market = 4; Other = 5; School = 2; Type of eating venue = 5. | |
7 | Purchase | Represents the type of location where food is purchased. | The categorical variables were encoded as follows: Catering Industry = 1, Canteen = 2, Household = 3, Street Vendor = 1, Retail Market = 4, Others = 6, and Shop = 5. | |
8 | Bacteria | Indicating the bacterial classification or category. | Classification codes: Salmonella = 1; Norovirus = 2; Vibrio parahaemolyticus = 3; Escherichia coli = 4; Others = 0 | |
9 | Diagnosis | The interval between the time of onset and the time of consultation, measured in hours. | Calculated as the time of consultation minus the time of onset, expressed in hours. | |
10 | GDP | Gross Domestic Product | Gross Domestic Product in prefectures of Zhejiang province during 2014–2023. Unit: billion yuan | |
11 | GDP1 | Gross Domestic Product of the Primary Industry | Gross Domestic Product of the Primary Industry in prefectures of Zhejiang province during 2014–2023. Unit: billion yuan | |
12 | GDP2 | Gross Domestic Product of the Secondary Industry | Gross Domestic Product of the Secondary Industry in prefectures of Zhejiang province during 2014–2023. Unit: billion yuan | |
13 | GDP3 | Gross Domestic Product of the Tertiary Industry | Gross Domestic Product of the Tertiary Industry in prefectures of Zhejiang province during 2014–2023. Unit: billion yuan | |
14 | Average GDP | Gross Domestic Product per capita | Gross Domestic Product per capita in prefectures of Zhejiang province during 2014–2023. Unit: yuan | |
15 | Household | Total Number of Households | Total Number of Households in prefectures of Zhejiang province during 2014–2023. Unit: household | |
16 | Population | Total Population | Total Population in prefectures of Zhejiang province during 2014–2023. Unit: ten thousand people | |
17 | Mortality | Mortality rate | Mortality rate in prefectures of Zhejiang province during 2014–2023. Unit: ‰ | |
18 | Employment | Number of Employed Persons | Number of Employed Persons in the Entire Society at the End of the Year in prefectures of Zhejiang province during 2014–2023. Unit: Ten Thousand Persons | |
19 | Income disposable | The per capita disposable income of urban and rural residents | The per capita disposable income of urban and rural residents in prefectures of Zhejiang province during 2014–2023. Unit: yuan | |
20 | Consumption expenditure | The per capita living consumption expenditure | The per capita living consumption expenditure of urban and rural residents in prefectures of Zhejiang province during 2014–2023. Unit: yuan | |
21 | Total agriculture | The total output value of agriculture | The total output value of agriculture, forestry, animal husbandry, and fishery in prefectures of Zhejiang province during 2014–2023. Unit: billion yuan | |
22 | Sown area | The area of crop planting | The area of crop planting in prefectures of Zhejiang province during 2014–2023. Unit: thousand hectares | |
23 | Total grain yield | Total Grain yield | Total Grain yield in prefectures of Zhejiang province during 2014–2023. Unit: million tons | |
24 | Cereal yield | Cereal yield | Cereal yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | Zhejiang Statistical Yearbook |
25 | Rapeseed yield | Rapeseed yield | Rapeseed yield in prefectures of Zhejiang province during 2014–2023. Unit: ten thousand tons | |
26 | Cotton yield | Cotton yield | Cotton yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
27 | Fruit yield | Fruit yield | Fruit yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
28 | Meat yield | Meat yield | Meat yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
29 | Pork yield | Pork yield | Pork yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
30 | Egg yield | Egg yield | Egg yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
31 | Milk yield | Milk yield | Milk yield in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
32 | Fish yield | Aquatic product yield | Aquatic product yield in prefectures of Zhejiang province during 2014–2023. Unit: ten thousand tons | |
33 | Marine fish yield | Marine fish yield | Marine fish yield in prefectures of Zhejiang province during 2014–2023. Unit: ten thousand tons | |
34 | Freshwater fish yield | Freshwater fish yield | Freshwater fish yield in prefectures of Zhejiang province during 2014–2023. Unit: Ten Thousand tons | |
35 | agricultural plastic | The usage of agricultural plastic film | The usage of agricultural plastic film in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
36 | Fertilizer | Usage of Agricultural Fertilizers | Usage of Agricultural Fertilizers in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
37 | Nitrogen | Nitrogen fertilizer usage | Nitrogen fertilizer usage in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
38 | Phosphate | Phosphate fertilizer usage | Phosphate fertilizer usage in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
39 | Potassium | Potassium fertilizer application | Potassium fertilizer application in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
40 | Compound | Compound fertilizer usage | Compound fertilizer usage in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
41 | Pesticide | Pesticide Usage | Pesticide Usage in prefectures of Zhejiang province during 2014–2023. Unit: tons | |
42 | Wholesale retail | Total Sales of Goods in Wholesale and Retail Trade Above Designated Size | Total Sales of Goods in Wholesale and Retail Trade Above Designated Size in prefectures of Zhejiang province during 2014–2023. Unit: hundred million yuan | |
43 | Income | Total Fiscal Revenue | Total Fiscal Revenue in prefectures of Zhejiang province during 2014–2023. Unit: hundred million yuan | Zhejiang Statistical Yearbook |
44 | Expenditure | Local Government Fiscal Expenditure | Local Government Fiscal Expenditure in prefectures of Zhejiang province during 2014–2023. Unit: hundred million yuan | |
45 | Public expenditure | General Public Service Expenditure | General Public Service Expenditure in prefectures of Zhejiang province during 2014–2023. Unit: hundred million yuan | |
46 | Temperature | Annual Average Temperature | Annual Average Temperature in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 °C | |
47 | Temperature1 | Average Temperature in January | Average Temperature in January in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
48 | Temperature2 | Average Temperature in February | Average Temperature in February in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
49 | Temperature3 | Average Temperature in March | Average Temperature in March in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
50 | Temperature4 | Average Temperature in April | Average Temperature in April in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
51 | Temperature5 | Average Temperature in May | Average Temperature in May in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
52 | Temperature6 | Average Temperature in June | Average Temperature in June in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
53 | Temperature7 | Average Temperature in July | Average Temperature in July in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
54 | Temperature8 | Average Temperature in August | Average Temperature in August in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
55 | Temperature9 | Average Temperature in September | Average Temperature in September in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
56 | Temperature10 | Average Temperature in October | Average Temperature in October in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
57 | Temperature11 | Average Temperature in November | Average Temperature in November in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
58 | Temperature12 | Average Temperature in December | Average Temperature in December in prefectures of Zhejiang province during 2014–2023. Unit: Celsius | |
59 | Precipitation | Annual Precipitation | Annual Precipitation in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
60 | Precipitation1 | Precipitation in January | Precipitation in January in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
61 | Precipitation2 | Precipitation in February | Precipitation in February in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
62 | Precipitation3 | Precipitation in March | Precipitation in March in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
63 | Precipitation4 | Precipitation in April | Precipitation in April in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
64 | Precipitation5 | Precipitation in May | Precipitation in May in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
65 | Precipitation6 | Precipitation in June | Precipitation in June in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
66 | Precipitation7 | Precipitation in July | Precipitation in July in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
67 | Precipitation8 | Precipitation in August | Precipitation in August in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
68 | Precipitation9 | Precipitation in September | Precipitation in September in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
69 | Precipitation10 | Precipitation in October | Precipitation in October in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
70 | Precipitation11 | Precipitation in November | Precipitation in November in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
71 | Precipitation12 | Precipitation in December | Precipitation in December in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 mm | |
72 | Sunshine | Total Annual Sunshine Hours | Total Annual Sunshine Hours in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
73 | Sunshine1 | Sunshine Hours in January | Sunshine Hours in January in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | Zhejiang Statistical Yearbook |
74 | Sunshine2 | Sunshine Hours in February | Sunshine Hours in February in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
75 | Sunshine3 | Sunshine Hours in March | Sunshine Hours in March in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
76 | Sunshine4 | Sunshine Hours in April | Sunshine Hours in April in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
77 | Sunshine5 | Sunshine Hours in May | Sunshine Hours in May in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
78 | Sunshine6 | Sunshine Hours in June | Sunshine Hours in June in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
79 | Sunshine7 | Sunshine Hours in July | Sunshine Hours in July in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
80 | Sunshine8 | Sunshine Hours in August | Sunshine Hours in August in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
81 | Sunshine9 | Sunshine Hours in September | Sunshine Hours in September in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
82 | Sunshine10 | Sunshine Hours in October | Sunshine Hours in October in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
83 | Sunshine11 | Sunshine Hours in November | Sunshine Hours in November in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
84 | Sunshine12 | Sunshine Hours in December | Sunshine Hours in December in prefectures of Zhejiang province during 2014–2023. Unit: 0.1 h | |
85 | water resources | Total Water Resources | Total Water Resources in prefectures of Zhejiang province during 2014–2023. Unit: hundred million cubic meters | |
86 | water supply | Total Water Supply | Total Water Supply in prefectures of Zhejiang province during 2014–2023. Unit: hundred million cubic meters | |
87 | Hospitals | Number of Hospitals and Health Centers | Number of Hospitals and Health Centers in prefectures of Zhejiang province during 2014–2023 (Units). | |
88 | Hospital beds | Number of Hospital Beds | Number of Hospital Beds in prefectures of Zhejiang province during 2014–2023 (Units). | |
89 | Doctors | Number of Doctors | Number of doctors in prefectures of Zhejiang province during 2014–2023 (Units). | |
90 | Insurance | Number of Basic Medical Insurance Enrollees | Number of Basic Medical Insurance Enrollees in prefectures of Zhejiang province during 2014–2023. Unit: ten thousand persons | Zhejiang Statistical Yearbook |
91 | Climate policy | Climate Policy Index | The China climate policy uncertainty (CCPU) index of Ma et al. [24]. It was constructed with manual auditing and the MacBERT deep learning model at the national, provincial, and major-city levels, covering January 2000 to December 2022. The index is based on 1,755,826 articles from six mainstream newspapers in China: People’s Daily, Guangming Daily, Economic Daily, Global Times, Science and Technology Daily, and China News Service. Its construction comprises six parts: data collection, data cleaning, manual auditing, model construction, index calculation and normalization, and technical validation. | [24] |
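
The age cleaning rule in the variable table above converts mixed-unit ages to years by dividing months by 12 and days by 365. The following is a minimal sketch of that conversion. The English input strings, the regular expression, and the function name `age_in_years` are hypothetical illustrations; the raw surveillance records may use a different wording or language.

```python
import re

# Conversion factors follow the cleaning rule: months / 12, days / 365.
_UNIT_IN_YEARS = {"year": 1.0, "month": 1.0 / 12.0, "day": 1.0 / 365.0}

# Matches a number followed by a unit, e.g. "4 months" or "30 days".
# The English unit words are an assumption made for this sketch.
_AGE_PART = re.compile(r"(\d+(?:\.\d+)?)\s*(year|month|day)")


def age_in_years(text: str) -> float:
    """Convert an age string with mixed units into a value in years."""
    parts = _AGE_PART.findall(text.lower())
    if not parts:
        raise ValueError(f"unrecognised age string: {text!r}")
    return sum(float(value) * _UNIT_IN_YEARS[unit] for value, unit in parts)


for raw in ["40 years old", "1 year and 4 months", "8 months", "9 months and 30 days"]:
    print(raw, "->", round(age_in_years(raw), 2))
```

Strings without a recognisable number and unit raise an error rather than silently returning zero, so unparsed records can be inspected separately.
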
Code | Variable | Count | Mean | Std | Min | Max |
---|---|---|---|---|---|---|
1 | age | 56,970 | 32.11 | 22.00 | 0 | 99 |
2 | sex | 56,970 | 0.55 | 0.50 | 0 | 1 |
3 | occupation | 56,970 | 2.41 | 0.74 | 1 | 4 |
4 | food | 51,607 | 4.82 | 2.91 | 1 | 10 |
5 | purchase | 49,570 | 4.44 | 1.77 | 1 | 6 |
6 | eat place | 51,468 | 3.33 | 1.24 | 1 | 5 |
7 | bacteria | 56,970 | 3.25 | 2.00 | 1 | 6 |
8 | diagnosis | 56,970 | 27.45 | 41.32 | 0 | 2221 |
9 | City code | 56,970 | 3305.84 | 3.31 | 3301 | 3311 |
10 | year | 56,970 | 2019.34 | 2.67 | 2014 | 2023 |
11 | GDP | 42,749 | 6181.53 | 4608.54 | 971.47 | 18,753 |
12 | GDP1 | 42,749 | 193.87 | 92.02 | 21.05 | 382 |
13 | GDP2 | 42,749 | 2608.88 | 1731.65 | 369.93 | 7413 |
14 | GDP3 | 42,749 | 3112.68 | 2561.17 | 457.63 | 12,287.31 |
15 | average GDP | 42,749 | 98,051.57 | 31,005.40 | 39,721 | 167,134 |
16 | household | 48,937 | 1,515,721.00 | 783,651.17 | 339,224 | 2,664,533 |
17 | population | 48,444 | 452.29 | 253.31 | 76.66 | 846.75 |
18 | mortality | 41,061 | 5.92 | 0.79 | 4.5 | 8 |
19 | employment | 42,749 | 385.57 | 201.08 | 72.14 | 759.68 |
20 | Income disposable | 41,332 | 48,961.18 | 11,069.14 | 22,426 | 70,281 |
21 | consumption expenditure | 41,332 | 31,268.95 | 7043.39 | 13,875 | 46,440 |
22 | total agriculture | 42,749 | 310.34 | 145.54 | 30.29 | 589.82 |
23 | sown area | 42,749 | 201.09 | 67.76 | 13.45 | 320.18 |
24 | total grain yield | 44,594 | 56.25 | 24.41 | 2.48 | 122.34 |
25 | cereal yield | 42,749 | 514,214.28 | 213,044.53 | 16,832 | 1,158,805 |
26 | rapeseed yield | 43,252 | 2.33 | 1.77 | 0.13 | 6.62 |
27 | cotton yield | 40,022 | 880.62 | 1323.54 | 1 | 9488 |
28 | fruit yield | 42,749 | 699,250.71 | 397,609.49 | 63,002 | 1,499,253 |
29 | meat yield | 42,749 | 107,953.67 | 56,592.71 | 4686 | 322,988 |
30 | pork yield | 42,749 | 74,989.71 | 50,859.05 | 1524 | 250,047 |
31 | egg yield | 42,749 | 36,035.15 | 24,953.02 | 189 | 130,870 |
32 | milk yield | 40,379 | 17,794.60 | 16,598.84 | 2 | 65,073 |
33 | freshwater fish yield | 43,252 | 12.15 | 13.27 | 0.06 | 60.46 |
34 | agricultural plastic | 43,252 | 6505.72 | 3589.66 | 325 | 13,567 |
35 | fertilizer | 42,749 | 70,619.17 | 25,618.09 | 3907 | 112,811 |
36 | nitrogen | 43,252 | 31,332.97 | 16,029.53 | 1500 | 78,100 |
37 | phosphate | 43,854 | 6618.99 | 3886.90 | 0 | 14,700 |
38 | potassium | 43,252 | 5428.94 | 2510.99 | 100 | 12,500 |
39 | compound | 43,252 | 26,427.04 | 14,814.84 | 1300 | 56,000 |
40 | pesticide | 43,252 | 3841.39 | 1705.31 | 324 | 7510 |
41 | wholesale retail | 42,464 | 8950.83 | 11,985.53 | 312.34 | 46,799.786 |
42 | income | 42,464 | 1189.07 | 1186.74 | 119.02 | 4590.08 |
43 | expenditure | 42,464 | 863.83 | 595.26 | 72.23 | 2542.09 |
44 | public expenditure | 42,464 | 86.45 | 49.71 | 11.12 | 210.01 |
45 | temperature1 | 41,332 | 73.22 | 17.02 | 35 | 115 |
46 | temperature2 | 41,332 | 85.26 | 22.42 | 49 | 137 |
47 | temperature3 | 41,332 | 131.29 | 12.83 | 105 | 160 |
48 | temperature4 | 41,332 | 177.92 | 10.35 | 148 | 204 |
49 | temperature5 | 41,332 | 222.94 | 13.27 | 190 | 254 |
50 | temperature6 | 41,332 | 254.28 | 10.57 | 226 | 280 |
51 | temperature7 | 41,332 | 292.79 | 15.19 | 256 | 329 |
52 | temperature8 | 41,332 | 294.70 | 13.20 | 259 | 334 |
53 | temperature9 | 41,332 | 255.47 | 12.77 | 232 | 286 |
54 | temperature10 | 41,332 | 202.73 | 12.30 | 180 | 234 |
55 | temperature11 | 41,332 | 151.80 | 14.48 | 126 | 184 |
56 | temperature12 | 41,332 | 87.19 | 18.41 | 48 | 134 |
57 | temperature | 41,332 | 186.07 | 7.73 | 168 | 200 |
58 | precipitation1 | 41,332 | 795.06 | 500.62 | 55 | 2153 |
59 | precipitation2 | 41,332 | 897.30 | 538.90 | 169 | 2764 |
60 | precipitation3 | 41,332 | 1376.98 | 522.21 | 467 | 2879 |
61 | precipitation4 | 41,332 | 1187.64 | 602.80 | 369 | 3576 |
62 | precipitation5 | 41,332 | 1746.45 | 820.46 | 523 | 4241 |
63 | precipitation6 | 41,332 | 2764.66 | 995.91 | 877 | 5882 |
64 | precipitation7 | 41,332 | 1894.35 | 1338.43 | 208 | 6353 |
65 | precipitation8 | 41,332 | 2018.23 | 1231.23 | 223 | 4990 |
66 | precipitation9 | 41,332 | 1745.05 | 1166.36 | 62 | 6069 |
67 | precipitation10 | 41,332 | 823.14 | 780.02 | 30 | 3577 |
68 | precipitation11 | 41,332 | 845.91 | 531.14 | 57 | 2804 |
69 | precipitation12 | 41,332 | 576.98 | 445.65 | 45 | 2048 |
70 | precipitation | 41,332 | 16,680.64 | 2846.76 | 11,892 | 25,596 |
71 | sunshine1 | 41,332 | 973.60 | 387.02 | 440 | 1957 |
72 | sunshine2 | 41,332 | 954.48 | 395.54 | 139 | 1748 |
73 | sunshine3 | 41,332 | 1179.75 | 250.30 | 601 | 1810 |
74 | sunshine4 | 41,332 | 1405.16 | 333.94 | 639 | 2097 |
75 | sunshine5 | 41,332 | 1356.47 | 314.98 | 707 | 2102 |
76 | sunshine6 | 41,332 | 1057.39 | 302.46 | 334 | 1832 |
77 | sunshine7 | 41,332 | 1866.88 | 609.40 | 626 | 2918 |
78 | sunshine8 | 41,332 | 2061.95 | 511.96 | 827 | 3140 |
79 | sunshine9 | 41,332 | 1499.85 | 387.90 | 690 | 2451 |
80 | sunshine10 | 41,332 | 1328.47 | 385.16 | 318 | 2392 |
81 | sunshine11 | 41,332 | 993.67 | 311.68 | 300 | 1698 |
82 | sunshine12 | 41,332 | 1138.37 | 430.89 | 324 | 1917 |
83 | sunshine | 41,332 | 15,811.29 | 1932.72 | 10,986 | 19,961 |
84 | water resource | 41,332 | 110.21 | 58.08 | 6.93 | 257.23 |
85 | water supply | 41,332 | 17.36 | 7.77 | 1.45 | 57.38 |
86 | hospitals | 42,749 | 160.27 | 100.40 | 28 | 403 |
87 | hospital beds | 42,749 | 30,391.50 | 20,020.09 | 3851 | 87,950 |
88 | doctors | 42,749 | 20,681.80 | 13,479.55 | 2609 | 57,455 |
89 | insurance | 42,749 | 324.46 | 193.54 | 5.21 | 954.43 |
90 | climate policy | 56,970 | 0.86 | 0.70 | 0 | 10.78 |
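
The variable table lists the annual temperature, precipitation, and sunshine variables in tenths of a unit (0.1 °C, 0.1 mm, 0.1 h), so the means above need to be divided by 10 to read them in conventional units. A small sketch of that rescaling, using the annual means from the table, is given below. The monthly temperature values appear to be on the same 0.1 scale judging by their magnitudes, but that is an inference and is not stated in the source, so they are left out here.

```python
# Annual means copied from the descriptive-statistics table; each is stored
# in tenths of the conventional unit, as listed in the variable table.
annual_means_tenths = {
    "temperature": 186.07,      # 0.1 degrees Celsius
    "precipitation": 16680.64,  # 0.1 mm
    "sunshine": 15811.29,       # 0.1 h
}


def from_tenths(value: float) -> float:
    """Rescale a value stored in tenths of a unit to whole units."""
    return value / 10.0


annual_means = {name: from_tenths(v) for name, v in annual_means_tenths.items()}

for name, value in annual_means.items():
    print(f"{name}: {value:.1f}")
```

Under this reading, the mean annual temperature is about 18.6 °C, mean annual precipitation about 1668 mm, and mean annual sunshine about 1581 h.
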
Class/Metric | Random Forest (Train) | XGBoost (Train) | Random Forest (Valid) | XGBoost (Valid) |
---|---|---|---|---|
Class 0 Precision | 0.8 | 0.94 | 0 | 0 |
Class 0 Recall | 0.05 | 0.42 | 0 | 0 |
Class 0 F1-score | 0.1 | 0.58 | 0 | 0 |
Class 0 Support | 77 | 77 | 11 | 11 |
Class 1 Precision | 0.84 | 0.85 | 0.72 | 0.73 |
Class 1 Recall | 0.96 | 0.97 | 0.86 | 0.86 |
Class 1 F1-score | 0.89 | 0.9 | 0.78 | 0.79 |
Class 1 Support | 15,264 | 15,264 | 3864 | 3864 |
Class 2 Precision | 0.9 | 0.9 | 0.43 | 0.41 |
Class 2 Recall | 0.49 | 0.55 | 0.17 | 0.18 |
Class 2 F1-score | 0.64 | 0.69 | 0.25 | 0.25 |
Class 2 Support | 2484 | 2484 | 620 | 620 |
Class 3 Precision | 0.89 | 0.91 | 0.69 | 0.7 |
Class 3 Recall | 0.83 | 0.84 | 0.64 | 0.64 |
Class 3 F1-score | 0.86 | 0.87 | 0.66 | 0.67 |
Class 3 Support | 9142 | 9142 | 2245 | 2245 |
Class 4 Precision | 0.92 | 0.95 | 0.45 | 0.5 |
Class 4 Recall | 0.36 | 0.58 | 0.14 | 0.29 |
Class 4 F1-score | 0.52 | 0.72 | 0.21 | 0.29 |
Class 4 Support | 849 | 849 | 215 | 215 |
Accuracy | 0.86 | 0.87 | 0.7 | 0.71 |
Macro Avg Precision | 0.87 | 0.91 | 0.46 | 0.47 |
Macro Avg Recall | 0.54 | 0.67 | 0.36 | 0.38 |
Macro Avg F1-score | 0.6 | 0.75 | 0.38 | 0.4 |
Macro Avg Support | 27,816 | 27,816 | 6955 | 6955 |
Weighted Avg Precision | 0.86 | 0.88 | 0.68 | 0.68 |
Weighted Avg Recall | 0.86 | 0.87 | 0.7 | 0.71 |
Weighted Avg F1-score | 0.85 | 0.87 | 0.68 | 0.68 |
Weighted Avg Support | 27,816 | 27,816 | 6955 | 6955 |
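
The macro and weighted averages in the table above differ because the classes are highly imbalanced: the macro average gives each class equal weight, whereas the weighted average weights each class by its support. The sketch below recomputes both from the XGBoost validation F1-scores in the table. Because the inputs are the rounded two-decimal values, the results can differ from the reported averages in the last digit.

```python
# Validation-set F1-scores and supports for XGBoost, copied from the table.
f1_by_class = {0: 0.00, 1: 0.79, 2: 0.25, 3: 0.67, 4: 0.29}
support_by_class = {0: 11, 1: 3864, 2: 620, 3: 2245, 4: 215}


def macro_average(scores: dict) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_average(scores: dict, supports: dict) -> float:
    """Mean over classes weighted by the number of samples in each class."""
    total = sum(supports.values())
    return sum(scores[c] * supports[c] for c in scores) / total


macro_f1 = macro_average(f1_by_class)
weighted_f1 = weighted_average(f1_by_class, support_by_class)
total_support = sum(support_by_class.values())

print(f"macro F1    = {macro_f1:.2f}")
print(f"weighted F1 = {weighted_f1:.3f}")
print(f"support     = {total_support}")
```

The gap between the two averages is driven by the small classes (0, 2, and 4), whose low validation F1-scores pull the macro average down while contributing little to the weighted one.
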
Parameter | Coef | Std Err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
ar.L1 | 0.2408 | 0.351 | 0.687 | 0.492 | −0.446 | 0.928 |
ma.L1 | 0.1107 | 0.383 | 0.289 | 0.773 | −0.640 | 0.861 |
ar.S.L12 | −0.1659 | 0.141 | −1.178 | 0.239 | −0.442 | 0.110 |
ma.S.L12 | −0.5302 | 0.141 | −3.769 | 0.000 | −0.806 | −0.254 |
sigma2 | 0.3394 | 0.039 | 8.746 | 0.000 | 0.263 | 0.415 |
Ljung–Box (L1) (Q): | 0.07 | Jarque–Bera (JB): | 18.23 | | | |
Prob(Q): | 0.80 | Prob(JB): | 0.00 | | | |
Heteroskedasticity (H): | 1.19 | Skew: | −0.51 | | | |
Prob(H) (two-sided): | 0.63 | Kurtosis: | 4.90 | | | |
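
The z statistics, p-values, and interval bounds in the coefficient table above are consistent with standard Wald-type inference: z is the coefficient divided by its standard error, the two-sided p-value comes from the standard normal distribution, and the 95% interval is the coefficient plus or minus about 1.96 standard errors. The sketch below recomputes these quantities from the reported coefficients and standard errors. The normal approximation is an assumption about how the table was produced, and small differences from the reported values are expected because the standard errors are rounded to three decimals.

```python
import math

# (coefficient, standard error) pairs copied from the coefficient table.
estimates = {
    "ar.L1": (0.2408, 0.351),
    "ma.L1": (0.1107, 0.383),
    "ar.S.L12": (-0.1659, 0.141),
    "ma.S.L12": (-0.5302, 0.141),
}

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_summary(coef: float, std_err: float):
    """Return (z, two-sided p-value, lower bound, upper bound)."""
    z = coef / std_err
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p, coef - Z_975 * std_err, coef + Z_975 * std_err


results = {name: wald_summary(c, se) for name, (c, se) in estimates.items()}

for name, (z, p, lo, hi) in results.items():
    print(f"{name:9s} z={z:7.3f}  p={p:.3f}  CI=[{lo:.3f}, {hi:.3f}]")
```

Of the four terms, only the seasonal moving-average coefficient (`ma.S.L12`) has an interval that excludes zero.
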
References
1. Holst, M.M. Contributing Factors of Foodborne Illness Outbreaks—National Outbreak Reporting System, United States, 2014–2022. MMWR Surveill. Summ. 2025, 74, 1–12.
2. Mixão, V.; Pinto, M.; Brendebach, H.; Sobral, D.; Santos, J.D.; Radomski, N.; Uldall, A.S.M.; Bomba, A.; Pietsch, M.; Bucciacchio, A.; et al. Multi-country and intersectoral assessment of cluster congruence between pipelines for genomics surveillance of foodborne pathogens. Nat. Commun. 2025, 16, 3961.
3. Sadilek, A.; Caty, S.; DiPrete, L.; Mansour, R.; Schenk, T., Jr.; Bergtholdt, M.; Jha, A.; Ramaswami, P.; Gabrilovich, E. Machine-learned epidemiology: Real-time detection of foodborne illness at scale. NPJ Digit. Med. 2018, 1, 36.
4. Chen, Y.; Wan, G.; Song, J.; Dai, J.; Shi, W.; Wang, L. Food Safety Practices of Food Handlers in China and their Correlation with Self-reported Foodborne Illness. J. Food Prot. 2024, 87, 100202.
5. Xue, J.; Zhang, W. Understanding China’s food safety problem: An analysis of 2387 incidents of acute foodborne illness. Food Control 2013, 30, 311–317.
6. Thaivalappil, A.; Young, I.; Paco, C.; Jeyapalan, A.; Papadopoulos, A. Food safety and the older consumer: A systematic review and meta-regression of their knowledge and practices at home. Food Control 2020, 107, 106782.
7. He, Y.; Wang, J.; Zhang, R.; Chen, L.; Zhang, H.; Qi, X.; Chen, J. Epidemiology of foodborne diseases caused by Salmonella in Zhejiang Province, China, between 2010 and 2021. Front. Public Health 2023, 11, 1127925.
8. Qi, X.; Alifu, X.; Chen, J.; Luo, W.; Wang, J.; Yu, Y.; Zhang, R. Descriptive study of foodborne disease using disease monitoring data in Zhejiang Province, China, 2016–2020. BMC Public Health 2022, 22, 1831.
9. Duchenne-Moutien, R.A.; Neetoo, H. Climate Change and Emerging Food Safety Issues: A Review. J. Food Prot. 2021, 84, 1884–1897.
10. Li, W.; Huang, T.; Liu, C.; Wushouer, H.; Yang, X.; Wang, R.; Xia, H.; Li, X.; Qiu, S.; Chen, S.; et al. Changing climate and socioeconomic factors contribute to global antimicrobial resistance. Nat. Med. 2025, 31, 1798–1808.
11. Wang, Z.; Huang, C.; Liu, Y.; Chen, J.; Yin, R.; Jia, C.; Kang, X.; Zhou, X.; Liao, S.; Jin, X.; et al. Salmonellosis outbreak archive in China: Data collection and assembly. Sci. Data 2024, 11, 244.
12. Archer, E.J.; Baker-Austin, C.; Osborn, T.J.; Jones, N.R.; Martínez-Urtaza, J.; Trinanes, J.; Oliver, J.D.; González, F.J.C.; Lake, I.R. Climate warming and increasing Vibrio vulnificus infections in North America. Sci. Rep. 2023, 13, 3893.
13. Simpson, R.B.; Zhou, B.; Naumova, E.N. Seasonal synchronization of foodborne outbreaks in the United States, 1996–2017. Sci. Rep. 2020, 10, 17500.
14. Liu, Y.; Liu, F.; Zhang, J.; Gao, J. Insights into the nature of food safety issues in Beijing through content analysis of an Internet database of food safety incidents in China. Food Control 2015, 51, 206–211.
15. Pires, S.M.; Desta, B.N.; Mughini-Gras, L.; Mmbaga, B.T.; Fayemi, O.E.; Salvador, E.M.; Gobena, T.; Majowicz, S.E.; Hald, T.; Hoejskov, P.S.; et al. Burden of foodborne diseases: Think global, act local. Curr. Opin. Food Sci. 2021, 39, 152–159.
16. Lake, I.R. Food-borne disease and climate change in the United Kingdom. Environ. Health 2017, 16, 117.
17. Jin, C.; Levi, R.; Liang, Q.; Renegar, N.; Springs, S.; Zhou, J.; Zhou, W. Testing at the Source: Analytics-Enabled Risk-Based Sampling of Food Supply Chains in China. Manag. Sci. 2021, 67, 2985–2996.
18. Gao, Q.; Levi, R.; Renegar, N. The Link between Food Safety and Zoonotic Disease Risks at Wholesale and Wet Markets in China. SSRN Electron. J. 2020.
19. Jin, C.; Levi, R.; Liang, Q.; Renegar, N.; Zhou, J. Food safety inspection and the adoption of traceability in aquatic wholesale markets: A game-theoretic model and empirical evidence. J. Integr. Agric. 2021, 20, 2807–2819.
20. Levi, R.; Singhvi, S.; Zheng, Y. Economically Motivated Adulteration in Farming Supply Chains. Manag. Sci. 2020, 66, 209–226.
21. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
22. Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 2, 18–22.
23. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Available online: https://dl.acm.org/doi/10.1145/2939672.2939785 (accessed on 21 June 2025).
24. Ma, Y.; Liu, Z.; Ma, D.; Zhai, P.; Guo, K.; Zhang, D.; Ji, Q. A news-based climate policy uncertainty index for China. Sci. Data 2023, 10, 881.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jin, C.; Qi, X.; Wang, J.; Chen, L.; Chen, J.; Yin, H. Identifying Key Drivers of Foodborne Diseases in Zhejiang, China: A Machine Learning Approach. Foods 2025, 14, 2857. https://doi.org/10.3390/foods14162857