Risk Factors Associated with Diarrheal Episodes in an Agricultural Community in Nam Dinh Province, Vietnam: A Prospective Cohort Study

In Vietnam, data on the risk factors for diarrhea at the community level remain sparse. This study aimed to provide an overview of diarrheal diseases in an agricultural community in Vietnam, targeting all age groups. Specifically, we investigated the incidence of diarrheal disease at the community level and described the potential risk factors associated with diarrheal diseases. In this prospective cohort study, a total of 1508 residents were enrolled during the 54-week study period in northern Vietnam. The observed diarrheal incidence per person-year was 0.51 episodes. For children aged <5 years, the incidence per person-year was 0.81 episodes. Unexpectedly, the frequency of diarrhea was significantly higher among participants who used tap water for drinking than among participants who used rainwater. Participants who used a flush toilet had less frequent diarrhea than those who used a pit latrine. The potential risk factors for diarrhea included the source of water used in daily life, drinking water, and type of toilet. However, the direct reason for the association between potential risk factors and diarrhea was not clear. The infection routes of diarrheal pathogens in the environment remain to be investigated at this study site.


Introduction
Diarrheal diseases are a significant issue for global health. Worldwide, of the nearly 5.3 million deaths of children aged less than 5 years in 2018, 8% were attributed to diarrhea, arising from a combination of factors [1], including poverty [2,3], malnutrition [4], poor sanitation and hygiene [5], unsafe drinking water [6,7], and poor healthcare systems [8]. In recent years, the mortality associated with diarrhea among children living in developed

Baseline Information through Face-to-Face Interviews Using a Questionnaire
In September 2014, trained health workers began to visit designated candidate households. The enrolment criteria for participation in this study were once again confirmed with the head of each household or with individual household members. If the household met the criteria, the health workers collected the baseline information of each household through face-to-face interviews using a questionnaire. This questionnaire collected information about the demographic and socioeconomic characteristics of the household, including age; sex; the number of household members; water sources for daily use and for drinking (e.g., tap water, water truck, tube well/hand pump, open well, rainwater, canal/river, lake/pond, others); water treatment (e.g., boiling water before drinking or not); distance from the WPP; type of toilet (e.g., flush toilet with septic tank, pit latrine); animal husbandry (e.g., pigs, buffalos, dogs, cattle, cats, chickens, ducks, or geese); and wealth index (Figure S1). Multiple-choice questions were used to inquire about the water sources used for daily use. A representative from each household was provided with multiple options. Based on the responses, each water source was categorized as "used" or "not used" by each household. Toilet facility types were categorized into a flush toilet with septic tank, pit latrine, or other (not categorized).
Household crowding and lower household socioeconomic status could be risk factors for diarrhea [20][21][22]. To evaluate household crowding, we calculated housing space (m²) per person as an index of crowding, with a cut-off value of 12 m², based on guidelines for healthy housing produced by the World Health Organization (WHO) [23]. To assess the socioeconomic status of each household, data on the ownership of assets were collected and summarized using the principal component analysis (PCA) methodology [24][25][26]. The respondents were apportioned into three levels of wealth according to their possession of household assets, including radio, refrigerator, television, video cassette recorder (VCR), motorcycle, bicycle, car, mobile phone, landline phone, fan, washing machine, sewing machine, air conditioner, computer, Internet, and rice mill machine.
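The crowding index described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Household` record and its field names are hypothetical, and treating a value strictly below the 12 m² cut-off as "crowded" is an assumption, since the text gives the cut-off but not how a value of exactly 12 m² was handled.

```python
from dataclasses import dataclass

CROWDING_CUTOFF_M2 = 12.0  # cut-off stated in the text (m² per person)


@dataclass
class Household:
    household_id: str       # hypothetical identifier
    housing_area_m2: float  # total housing area in square metres
    n_members: int          # number of household members


def space_per_person(household: Household) -> float:
    """Housing space per person (m²), the crowding index used in the text."""
    if household.n_members <= 0:
        raise ValueError("a household must have at least one member")
    return household.housing_area_m2 / household.n_members


def is_crowded(household: Household, cutoff: float = CROWDING_CUTOFF_M2) -> bool:
    """Assumption: a household is 'crowded' when space per person is below the cut-off."""
    return space_per_person(household) < cutoff


if __name__ == "__main__":
    # Illustrative values only, not study data.
    for h in (Household("A", 75.0, 5), Household("B", 40.0, 5)):
        print(h.household_id, round(space_per_person(h), 1), is_crowded(h))
```

Keeping the cut-off as a named constant makes it easy to rerun the classification with a different threshold as a sensitivity check.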
Questionnaire forms were translated from English to Vietnamese and then backtranslated to ensure accuracy when completing the final version. Any discrepancies were resolved by discussion among the research team.

Global Positioning System (GPS) for Measuring Distances
The GPS coordinates of all houses, the health center, and the WPP were collected, and the distance from each household to the WPP was calculated using QGIS software (version 2.8.2, Open-source software, Raleigh, NC, USA).
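For readers who want to reproduce a straight-line distance from GPS coordinates without GIS software, the following sketch uses the haversine (great-circle) formula. This is a substitute technique chosen for illustration; the text states only that QGIS was used and does not say which distance method or projection was applied. The function names are hypothetical, and the 1.5 km "near"/"far" threshold is the one reported later in the Results.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
NEAR_THRESHOLD_KM = 1.5      # "near" (<1.5 km) vs "far" (>=1.5 km), as reported in the Results


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_category(distance_km: float) -> str:
    """Classify a household by its straight-line distance from the WPP."""
    return "near" if distance_km < NEAR_THRESHOLD_KM else "far"


if __name__ == "__main__":
    # Hypothetical coordinates, not the real household or WPP locations.
    d = haversine_km(20.300, 106.100, 20.310, 106.110)
    print(round(d, 2), distance_category(d))
```

As with the QGIS calculation, this gives a straight line between two points, not the length of the water pipes connecting them.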

Definition of Diarrhea
We defined diarrhea as the presence of three or more watery or loose fecal discharges within 24 h, following WHO criteria [27]. Any reoccurrence of diarrhea within 14 days was considered part of the prior diarrheal episode, whereas an episode of diarrhea after a diarrhea-free gap of 14 or more days was taken to represent a new episode [28].
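The episode rule above can be expressed as a short counting routine. This is a sketch under one stated interpretation: the "gap" is taken to be the number of diarrhea-free days between two consecutive days with diarrhea, and a gap of 14 or more such days starts a new episode. The function name and input format (a list of dates on which the WHO definition was met) are assumptions made for illustration.

```python
from datetime import date
from typing import Iterable

GAP_DAYS = 14  # a diarrhea-free gap of this many days or more starts a new episode


def count_episodes(diarrhea_days: Iterable[date]) -> int:
    """Count episodes from the dates on which a person met the diarrhea definition."""
    days = sorted(set(diarrhea_days))
    if not days:
        return 0
    episodes = 1
    for previous, current in zip(days, days[1:]):
        # Assumption: the gap is the count of diarrhea-free days strictly between the two dates.
        free_days = (current - previous).days - 1
        if free_days >= GAP_DAYS:
            episodes += 1
    return episodes


if __name__ == "__main__":
    # Illustrative dates only: the first three fall in one episode, the last starts a second.
    days = [date(2014, 11, 1), date(2014, 11, 2), date(2014, 11, 10), date(2014, 12, 1)]
    print(count_episodes(days))
```

Counting the gap in diarrhea-free days, rather than in calendar days between onsets, is a design choice; a different reading of the definition would shift the boundary by one day.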

Routine Follow-Up for Collecting Data for Diarrhea
This study was carried out for 54 weeks, from 27 October 2014 to 16 November 2015. For longitudinal diarrheal surveillance, each health worker made twice-weekly visits to between 15 and 60 households to ask whether any member of the household had experienced a diarrheal episode. If an episode was reported, the health worker collected the diarrheal sample in a sample pot. Data on the occurrence of diarrhea were also collected through interviews with the participants who had had diarrhea or, in the case of children, their family members. Sample pots were distributed to all participating households beforehand, and households were resupplied immediately when their pots ran out. Diarrheal samples were analyzed in a different study. In the current study, a diarrheal episode was counted only when it was accompanied by a diarrheal sample, and these confirmed data were used for the analysis. A meeting was held once per month at the Hien Khanh commune health station to ensure that all procedures were undertaken properly by the health workers. Data collection was suspended for two weeks in February 2015 during the Tet (Lunar New Year) holidays.

Data Management
Data management was performed at the Hien Khanh commune health station and the National Institute of Hygiene and Epidemiology. Data were double-entered and managed using FoxPro 7.0 software (Microsoft, Lewistown, PA, USA).

Risk Factor Analysis
We employed maximum likelihood statistical models to assess the association between various risk factors, including basic demographic factors, and diarrheal episodes. All statistical analyses were carried out with R version 3.5.0 (R Core Team, Vienna, Austria), using the glmmADMB package [29]. As the number of diarrheal episodes was overdispersed, a negative binomial generalized linear mixed model (NB-GLMM) was employed. The response variable was the number of diarrheal episodes. The explanatory variables were demographic and environmental factors. Demographic factors included age, sex, and level of wealth. Other factors included the use or non-use of each source of water for daily use (tap water, water truck, tube well, open well, rainwater, or lake/pond); the source of drinking water (tap water, rainwater, or other); the distance from the WPP; the type of toilet facility (flush toilet with septic tank, pit latrine, or other); and the presence or absence of animal husbandry (pigs, buffalos, dogs, cattle, cats, chickens, ducks, geese, or other animals). Behavioral factors, such as the practice of boiling water, were also included. As each household comprised several residents, "household" was included as a random factor [30]. First, for the univariate NB-GLMM, each explanatory variable was entered separately into the model, with "household" as a random factor. Second, to control for confounding variables, a multivariate NB-GLMM was performed. The response and explanatory variables were the same as those in the univariate NB-GLMM and were entered into the multivariate NB-GLMM with "household" again as a random factor. Variance inflation factor (VIF) values of more than 5 were taken to indicate a likely multicollinearity issue [30]. In that case, we removed one or more variables and re-evaluated the resulting model, repeating this analysis until all explanatory variables had VIF values of less than 5 and suitable combinations of variables had been determined. For each multivariate NB-GLMM with the relevant variables, the best model for predicting the factors affecting episodes of diarrhea was chosen using backward stepwise model selection based on the Akaike Information Criterion (AIC) [31]. We report 95% confidence intervals, and p values of less than 0.05 were considered statistically significant. For all analyses of local risk factors for diarrheal episodes, we used participants' age when the study began (27 October 2014).
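The VIF screening step can be illustrated with a small, dependency-free sketch. It uses the textbook linear-model definition, in which the VIF of each predictor is the corresponding diagonal element of the inverse of the predictors' correlation matrix. This is an assumption made for illustration: the text does not state how VIF values were obtained for the NB-GLMM, and the variable names and data below are hypothetical.

```python
from statistics import mean

VIF_THRESHOLD = 5.0  # values above this were treated as a likely multicollinearity issue


def _pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (neither constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def _invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return {name: VIF} for a dict mapping predictor names to numeric columns."""
    names = list(columns)
    corr = [[_pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = _invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


if __name__ == "__main__":
    # Hypothetical 0/1 indicators and a distance column, not study data.
    data = {
        "rainwater_daily": [1, 1, 0, 1, 0, 0, 1, 0],
        "rainwater_drinking": [1, 1, 0, 1, 0, 1, 1, 0],
        "distance_km": [0.5, 2.1, 1.0, 1.8, 0.7, 2.5, 1.2, 0.9],
    }
    values = vif(data)
    print({k: round(v, 2) for k, v in values.items()})
    print("above threshold:", [k for k, v in values.items() if v > VIF_THRESHOLD])
```

In an iterative screen like the one described, a variable from each highly collinear pair would be dropped and the VIF values recomputed until all fall below the threshold.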

Socioeconomic Characteristics of Participants
The study population consisted of 311 households, with a total of 1508 residents who were aged from 0 to 97 years. The geographical distribution of the randomly selected households is shown in the map (Figure 2). Throughout this portion of the study, between the end of October 2014 and the middle of November 2015, a total of 311 households completed follow-up. We did not use any data from residents of the households who were born or died during our study period. The demographic characteristics of the participants are summarized in Table 1. More than half of the participants were female (56.4%). The mean number of household members was 4.86 (SD: 1.22). The mean number of children aged less than 5 years in each household was 1.28 (SD: 0.50). All 311 households lived in houses made of similar materials, i.e., bricks. The mean housing area was 75.1 m² (SD: 35.6 m²). The proportion of households who used tap water for daily use was 92.6%, whereas 65.3% used rainwater. No households used canal or river water for daily use; therefore, we excluded it as a variable for analysis. Although one household answered "Other" about the water source used for daily use, the details of this other water source were not specified, and this risk-factor variable was not used for the analysis. Concerning the variables of water for drinking, we categorized these into three types of water sources: tap water, rainwater, and others. The proportion of households who used tap water, rainwater, and others for drinking was 35.1%, 64.4%, and 0.5%, respectively. The answer of "Other" for drinking water included mineral water (four individuals) and truck water (three individuals), which we combined into one category, "Other", for the analysis. For the type of toilet facility, 78.4% of households had a flush toilet (with septic tank), 17.8% used a pit latrine, and the remaining 3.7% were of other types.
The distribution of distances from the WPP showed two peaks and was categorized into "near" (<1.5 km from the WPP) and "far" (≥1.5 km from the WPP) (Figure S2). The education level achieved among individuals aged more than 15 years was as follows: lower than primary school, 28/907 (3.1%); primary school, 52/907 (5.7%); secondary high-school, 475/907 (52.4%); high school, 346/907 (38.1%); intermediate college, 5/907 (0.6%); and university, 1/907 (0.1%). These data were only used for basic demographic information and not for any further analysis because of the nature of this study design, which was focused on all age groups, including children who were yet to receive any formal education. Most households (90.7%) owned animals; 13.7% had pigs, 1.9% had buffalos, 73.9% had dogs, 10.7% had cattle, 52.1% had cats, 70.9% had chickens, and 17.4% had ducks or geese. In this study site, wealth disparities were not large, based on the data relating to the ownership of household assets (Table S1).

Diarrheal Episodes
We identified 572/1508 participants who reported at least one episode of diarrhea during the study period. In total, there were 791 episodes of diarrhea over 1561.7 person-years (study period × total number of participants: 1.04 years (54 weeks) × 1508 participants) in the study population. The cumulative number of diarrheal episodes per individual during the study period varied widely: 936/1508 (62.1%) had none; 407/1508 (27.0%) had one episode; 125/1508 (8.3%) had two episodes; 32/1508 (2.1%) had three episodes; 5/1508 (0.3%) had four episodes; 2/1508 (0.1%) had five episodes; and 1/1508 (0.1%) had eight episodes. The case with the highest number of episodes was that of a 2-year-old child. The population age distribution and the individuals who had at least one episode of diarrhea are shown in Figure 3. The age histogram shows three peaks, at around 5, 30, and 60 years of age (Figure 3). Due to the nature of the study, we focused on households with children aged less than 5 years. Individuals with diarrhea (colored in Figure 3) showed three similar peaks. Overall, the incidence of diarrheal episodes was 0.51 episodes per person-year (95% CI 0.47-0.55). For children aged less than 5 years, the incidence was 0.81 episodes per person-year (95% CI 0.71-0.91). Broken down into one-year age bands, the incidence in this group was as follows: under 1 year, 1.04 episodes per person-year (95% CI 0.67-1.41); 1-2 years, 0.86 (0.67-1.06); 2-3 years, 0.83 (0.61-1.06); 3-4 years, 0.82 (0.62-1.01); 4-5 years, 0.60 (0.40-0.82). We found that the younger the child, the higher the incidence (Table S2, Figure S3). There was no difference in the occurrence of diarrhea between males and females.
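The overall incidence can be re-derived from the episode distribution reported above. The sketch below uses only the counts and person-years given in the text. The interval is a simple normal approximation for a Poisson count, which is an assumption: the text does not state how the reported confidence intervals were calculated, so this interval need not reproduce the published bounds exactly.

```python
import math

# Number of participants by cumulative number of episodes, as reported in the text.
EPISODE_DISTRIBUTION = {0: 936, 1: 407, 2: 125, 3: 32, 4: 5, 5: 2, 8: 1}
PERSON_YEARS = 1561.7  # as reported in the text


def totals(distribution):
    """Return (participants, participants with >=1 episode, total episodes)."""
    participants = sum(distribution.values())
    with_episode = sum(n for k, n in distribution.items() if k > 0)
    episodes = sum(k * n for k, n in distribution.items())
    return participants, with_episode, episodes


def incidence_rate(episodes, person_years, z=1.96):
    """Episodes per person-year with an approximate interval.

    Assumption: Poisson count with a normal approximation, SE = sqrt(episodes) / person_years.
    """
    rate = episodes / person_years
    se = math.sqrt(episodes) / person_years
    return rate, rate - z * se, rate + z * se


if __name__ == "__main__":
    participants, with_episode, episodes = totals(EPISODE_DISTRIBUTION)
    rate, lower, upper = incidence_rate(episodes, PERSON_YEARS)
    print(participants, with_episode, episodes)
    print(f"{rate:.2f} episodes per person-year (approx. interval {lower:.2f}-{upper:.2f})")
```

Summing the distribution this way is also a quick consistency check on the reported totals of participants, affected participants, and episodes.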

Diarrheal Risk Factors
Based on the univariate NB-GLMM, age, rainwater for daily use, drinking water (tap water versus rainwater), distance from the WPP, and toilet facility type (flush toilet versus pit latrine) were significantly associated with diarrheal episodes. Participants who used rainwater for daily use showed a significantly lower risk of diarrhea than those who did not (incidence rate ratio: IRR 0.67, 95% CI 0.53-0.85). For participants who used tap water for drinking, the risk of diarrhea was significantly higher than for those who used rainwater (IRR 1.54, 95% CI 1.21-1.95) (Table 2). More than 99% (1500/1508) of the participants boiled the water before drinking, and boiling showed no significant association with diarrhea in the study. Regarding domestic animal husbandry, no significant difference was found between participants who kept animals and those who did not. Ownership of household livestock of any kind did not affect the incidence of diarrhea (Table 2). In terms of toilet facility type, a significantly reduced risk was found among participants using a flush toilet (with septic tank) versus those using a pit latrine (IRR 0.69, 95% CI 0.51-0.92). Participants who lived near (<1.5 km) the WPP reported a higher number of diarrheal episodes than participants who lived far (≥1.5 km) from the WPP (IRR 1.43, 95% CI 1.1-1.86) (Table 2).
The explanatory variables that had a significant relationship with the number of diarrheal episodes in the multivariate NB-GLMM, such as age, rainwater for daily use, water for drinking, distance from the WPP, and type of toilet facility, were also significant in the univariate NB-GLMM (Tables 2 and 3). Regarding the model selection, a complete model with all explanatory variables for multivariate NB-GLMM was initially prepared.
Then, the explanatory variable water for drinking was excluded from the model because the variables rainwater for daily use and water for drinking had high collinearity (VIF > 6) (Table 3). After this variable was removed, multicollinearity was no longer present for any of the other explanatory variables (all VIF < 3), and the model selection was finalized. As a result, the final model retained five significant explanatory variables: age, tap water for daily use, rainwater for daily use, distance from the WPP, and toilet facility type (Tables 3 (Model 1) and S3). Additionally, we selected another model with explanatory variables similar to the previous multivariate NB-GLMM, except that rainwater for daily use was excluded and water for drinking was included instead (Table 3 (Model 2)). The results did not change compared with the previous results, except that the water for drinking variable was retained rather than the rainwater for daily use variable (Table S4). The difference in the AIC between the two models was less than 1, indicating equally supported models. Regarding tap water for daily use, multivariate analysis showed a significantly higher number of diarrheal episodes in participants who used tap water for daily use than in those who did not (IRR 1.90, 95% CI 1.16-3.09). Univariate analysis also showed a higher risk, but this was not significant (IRR 1.59, 95% CI 0.97-2.62). No adjustment for multiplicity was made for the multiple testing because of the exploratory nature of the analysis. There were no notable discrepancies between the results of the univariate NB-GLMM and multivariate NB-GLMM, except for the differences in significance described above.
All results reflecting the increased risk for diarrheal episodes related to both tap water for daily use and water for drinking were essentially similar.
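For readers less familiar with incidence rate ratios, the sketch below shows how an IRR and its interval relate to a coefficient on the log scale in a count model with a log link. It assumes Wald-type intervals (estimate ± 1.96 standard errors on the log scale); the text does not state which interval method was used, and the helper names are hypothetical. The only numbers used are the IRR and CI for rainwater for daily use quoted above.

```python
import math


def irr_with_ci(beta, se, z=1.96):
    """Convert a log-scale coefficient and its standard error to (IRR, lower, upper).

    Assumption: Wald-type interval, exp(beta +/- z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_scale_from_irr(irr, lower, upper, z=1.96):
    """Back out an approximate (beta, se) from a reported IRR and its 95% CI."""
    beta = math.log(irr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


if __name__ == "__main__":
    # Reported univariate result for rainwater for daily use: IRR 0.67 (95% CI 0.53-0.85).
    beta, se = log_scale_from_irr(0.67, 0.53, 0.85)
    print(f"beta = {beta:.3f}, se = {se:.3f}")
    irr, lower, upper = irr_with_ci(beta, se)
    print(f"IRR = {irr:.2f} ({lower:.2f}-{upper:.2f})")
```

An IRR below 1 corresponds to a negative coefficient (fewer expected episodes), and an interval that excludes 1 corresponds to a coefficient interval that excludes 0.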

Discussion
This report presents an assessment of the incidence of diarrheal episodes among all age groups in our cohort study and the characteristics of participants concerning the frequency of diarrheal episodes.

Incidence of Recurrent Diarrhea
In our study, the incidence of diarrhea in all age groups was 0.51 episodes/person-year (95% CI 0.47-0.55), whereas the incidence in children aged less than 5 years only was 0.81 episodes/person-year (95% CI 0.71-0.91). Many studies have investigated diarrhea in children aged less than 5 years [28,[32][33][34], which is the age group most vulnerable to diarrheal morbidity and mortality. However, our study targeted all age groups because we were interested in community-based management of diarrhea, including communicable types of diarrhea caused by, e.g., norovirus. Generally speaking, cohort studies targeting all age groups tend to show lower incidence rates than those targeting only children [35], although the results may differ slightly depending on the study design and area.
A systematic review by Fischer and colleagues identified community-based cohort studies of children aged 0 to 59 months in developing countries [36]. They reported that the estimated incidence of diarrhea in children under 5 declined from 3.4 episodes per person-year in 1990 to 2.9 episodes per person-year in 2010 [36]. Although we could not find the actual incidence in 2015 using the methodology adopted in the present study, we assume that it may still be higher than the incidence in our data, which was 0.81 episodes per person-year. Incidence estimates are difficult to compare because of the context-specific situation of each study area, although our study also found higher rates of diarrheal incidence in children aged less than 5 years.
Some studies have been conducted by other research teams in Vietnam that focused only on adults [35,37] or all age groups combined [38]. For example, a study conducted by Pham-Duc et al. in Hanam province of northern Vietnam, an agricultural area with a similar climate and environmental conditions to those at our study site, reported a diarrheal incidence rate of 0.28 episodes per person-year in an open cohort of 867 adults aged 16 to 65 years [37]. Their report did not include any children aged less than 15 years. In a longitudinal study in India that included all age groups, in both urban and rural areas, an overall incidence rate of 0.12 (95% CI 0.11-0.14) episodes/person-year was reported, whereas the rate among children aged less than 5 years was 0.51 (95% CI 0.44-0.58) episodes/person-year [39]. Among children aged less than 5 years, the incidence of diarrhea was highest in infants (aged < 1 year) at 1.07 episodes/person-year [39]. Individuals living in rural areas exhibited a lower incidence rate of diarrhea.

Diarrheal Risk Factors
Contrary to expectations, we found a trend of higher rates of diarrheal disease in households that used tap water rather than rainwater. There are various reports on the relationship between diarrhea and tap water or rainwater. A study in Kabul, Afghanistan, reported that approximately half of all households in five districts within Kabul used home piped water [32]. Here, the authors reported that a tendency of risk reduction was seen among households using an open well versus a piped water source, and they concluded that no association between disease prevention and piped water was found. A cross-sectional study conducted in a second city in Senegal also found no significant association between the source of drinking water and the occurrence of diarrhea [40].
Conversely, a cohort study in southern Vietnam reported that a lack of access to tap water was significantly associated with increased hospitalization due to diarrhea [16]. Similarly, Pham-Duc et al. reported the benefits of tap water in their nested case-control study in an agricultural community in northern Vietnam [37]. Based on a conditional logistic regression analysis, their data showed that participants who lived in households using rainwater for drinking had a significantly higher risk of diarrhea than those living in households using tap water for drinking [37]. Therefore, our findings were not consistent with those of the studies conducted in other parts of Vietnam, which demonstrated the benefits of drinking tap water for reducing the incidence of diarrhea.
Although studies often report that public water supply systems can reduce diarrheal burden [41], the systems themselves may sometimes generate their own dangers because of the complexity of continuously managing the entire supply system. For instance, maintenance of the water supply system is difficult in our study area, where power failures frequently occur [42], and transient changes in water pressure caused by power failures can influence water quality [43]. This effect might be expected to be stronger for households closer to the WPP. LeChevallier et al. reported methods to prevent the intrusion of contaminants that may cause health problems. The American Water Works Association Research Foundation (AWWARF) report showed that low water pressure in otherwise satisfactory water distribution pipes can draw enteric organisms present in the surrounding soil into the pipes [43]. A burst water pipe can negate any health benefits if the water becomes contaminated, although there were no records of this type of incident in our study area. Even in the absence of an actual burst pipe, however, low water pressure in distribution systems and intermittent supply are well-known risk factors for outbreaks of waterborne disease [43].
In our study site, rooftop rainwater harvesting is one of the main water sources; this practice began a few decades ago before tap water was piped into this area. In 2011, Ozdemir et al. reported that rainwater harvesting at the household level was widely practiced and was a primary source of drinking water in Southeast Asia; it was also economically feasible in southern Vietnam [44]. However, Meera et al. had already noted in 2006 that there was considerable contamination of household-harvested rainwater with pathogenic microorganisms [45]. Other studies conducted in Australia and the U.K. also showed the presence of a large variety of pathogenic microorganisms in household-harvested rainwater [46,47]. Hamilton et al. summarized the benefits and harm rainwater can present for human health [48]. Their review showed that microbial contamination of rainwater had been identified in many epidemiological studies worldwide, but it is possible to develop contamination prevention strategies using a variety of approaches.
Even if the water source is slightly contaminated, there are ways to ensure safety at the point of use. One of them is boiling water, which is customarily done to make tea in Vietnam. Anecdotal reports from residents in our study site suggest that rainwater was considered ideal for sweetening tea. Conversely, they tended to be hesitant to use tap water for tea because of the smell of chlorine [49]. Our questionnaire survey showed that boiling was a common answer given as a treatment practice. Boiling is likely to help maximize the safety of rainwater, especially when it is consumed as tea. However, we do not know precisely how long water was boiled prior to drinking or which water was boiled.
Based on the limited information obtained from this study, it can be strongly recommended that toilet facilities be upgraded to the flush toilet type to prevent diarrhea. Other studies have also compared the incidence of diarrhea among users of flush toilets and pit latrines.
A historic cohort study conducted in Kenya with HIV-infected mothers and their infants showed an association between flush toilets and a reduced risk of diarrhea [50]. Improved toilets reduce the risks of both diarrhea and stunting in children [51]. Designing toilets in a way that facilitates hygienic behavior may help prevent diarrhea. As in the study conducted in Laos, providing soap at handwashing facilities and making it available may help reduce the risk of diarrhea [52]. In order to reduce the burden of diarrheal diseases more effectively, community-wide sanitation can be introduced in addition to household-level sanitation [53].

Potential Risk Factors That Remain Unidentified Following Our Study
There are further concerns regarding potential risk factors that remain unidentified following our study. For instance, most households were unlikely to be aware of the duration for which their rainwater was stored, as it is continuously harvested from cumulative rainfall, calling into question the extent of the presence of pathogens. The use of tap water following a power failure may also be a risk factor for diarrhea [54]. Respondents could not clearly remember when their tap water supply had stopped and restarted due to frequent power failures [42]. Targeted research into point-of-use water may facilitate the development of effective water storage systems at a household level to effectively protect against these dangers.
The presence of animals near human dwellings has been reported to be associated with diarrhea in humans [55], although our study showed no increased risk for diarrhea linked with animal husbandry. In cases of diarrhea caused by Giardia spp. and Cryptosporidium spp., transmission between animals and humans was suspected due to environmental contamination with both animal and human feces in the same setting [18,56]. In our study, any such transmission did not appear to increase the risk of diarrhea. Although other studies have also considered open-field defecation in their analysis, we did not, because all households had toilet facilities.

Limitations
Our study has several limitations. First, there was no way to authenticate the actual occurrence of diarrhea using our passive diarrhea detection system. Although all participants in our study had already been registered at the Hien Khanh commune health station, they could choose to consult their preferred commune health stations without relying on the system they were registered with [57]. This could have led to an underestimation of the incidence of diarrhea. However, in our study site, participants seemed willing to consult the Hien Khanh commune health station to address any diarrheal disease. Participants tended not to use other commune health stations in cases of diarrhea because the Hien Khanh commune health station was easily accessible and more convenient. Moreover, health workers who were staff members at the Hien Khanh commune health station were hired for this study and trained to communicate with both participants and physicians and to define diarrhea correctly according to WHO criteria [58]. This system reduced the likelihood of missing cases of diarrhea.
Second, in this study, only the incidence of diarrheal episodes was used as a response variable to analyze the relationship between risk factors and household socioeconomic and environmental characteristics. Ideally, a more comprehensive approach, using more precise response or explanatory variables, such as the presence of diarrheal pathogens in human fecal samples and/or water samples, would have been used to confirm diarrheal pathogen transmission pathways. Because many types of pathogens cause diarrhea, each with its own complex, dynamic transmission patterns, pathogen-specific epidemiological data would have been useful for making assumptions about diarrheal pathogen exposure and for the eventual development of targeted interventions. To develop appropriate preventive measures, potential contamination sources, particularly water sources at the public and the household levels, must be investigated. If there is a problem with the water pipes, proper mapping information is also important. In this analysis, the distances from the WPP are straight lines on a map, not the actual lengths of the water pipes. Therefore, at this stage, we can only suggest the importance of factors influencing the incidence of diarrhea; we are still far from being able to develop appropriate prevention measures.
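To illustrate what a straight-line distance on a map means in practice, the following is a minimal sketch of a great-circle (haversine) calculation between two coordinate pairs. It is not the procedure used in this study: the function name, the Earth-radius constant, and the example coordinates are all hypothetical and are given only for illustration.

```python
import math

# Assumption: mean Earth radius in kilometres (not a value taken from the study).
EARTH_RADIUS_KM = 6371.0088


def straight_line_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points.

    Inputs are latitude and longitude in decimal degrees.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical coordinates, NOT the real locations of the WPP or any household.
wpp = (20.35, 106.10)
household = (20.36, 106.12)
print(f"{straight_line_km(*wpp, *household):.2f} km")
```

A distance of this kind is a lower bound on the length of the pipe connecting the two points, because a pipe network that follows roads or property lines cannot be shorter than the direct path.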
Third, we did not investigate detailed human behaviors other than boiling water. It would have been helpful to investigate the effects of individuals' practices and attitudes regarding the choice of water source and water treatment. These could differ among individuals and may have affected our study results. However, we could not determine who treated which water sources, when, or how. The lack of detailed information on local treatment practices for water purification just prior to its use is recognized as a limitation of our results. Moreover, our results might have been influenced by a combination of factors, such as the duration of water storage at a household; personal hygiene practices, such as handwashing; and differences in water treatment.
Fourth, although seasonality of diarrheal disease has been recognized in Ho Chi Minh City, southern Vietnam [59], we did not consider factors related to seasonality in our analysis using data from our study area in northern Vietnam, where the climate is humid and tropical (Table S5). Finally, the severity of diarrheal episodes was not considered. Diarrhea can be especially severe in children, so episodes in children may have been more likely to be noticed and reported than episodes in adults.

Significance of This Study and Its Implications
Despite these limitations, our study still provides a valuable overview of diarrheal incidence in this agricultural community in northern Vietnam. We were able to identify potential risk factors, namely the water source used for daily life, the water source used for drinking, and the type of toilet facility (flush toilet or pit latrine). Although we initially assumed that livestock ownership would be one of the main risks for diarrheal disease in our study area, our results did not support this assumption. Our community survey suggests that more targeted studies should be conducted, initially on issues around water sources and the type of toilet. These should include investigating whether and to what extent tap water and rainwater are contaminated, how often power failures occur in this study area, and how power failures can affect water quality. Moreover, if water supply pipes and toilet tanks are in close proximity to each other, it is necessary to verify whether the proximity affects the water quality during leakage.
One of the implications of our study is that it is important to identify households that are not benefiting from sanitation-related infrastructure. In such cases, we recommend basic personal hygiene practices to address sanitation problems, such as washing hands before eating, boiling drinking water, and storing water properly. Regardless of the origin of the water, and even if the water was boiled, longer periods of storage in storage units were shown to be one of the risks for diarrhea [60]. Water quality improvement at the point of water use is suggested to have a more significant effect on preventing diarrhea [61]. There are various point-of-use quality improvement interventions, including boiling, chlorination, flocculation, filtration, and solar disinfection. Further research may be helpful to estimate the magnitude of the effects of these interventions in different home settings.

Conclusions
We identified potential risk factors for diarrhea incidence, such as the water source used for daily use and drinking and toilet facility type. Particularly in this rural area in the north of Vietnam, public infrastructure and sanitation are inadequate. Sanitation is a primary barrier against fecal-oral transmission of the pathogens that cause diarrhea. Although pathogen exposure is only one cause of diarrhea, all individuals, especially children, who are most vulnerable to diarrheal morbidity and mortality, should be protected from environmental contamination by diarrheal pathogens. However, our study could not provide sufficient evidence of diarrheal pathogen transmission pathways to suggest concrete preventive measures. Therefore, we encourage further investigation tailored to the sources of potential diarrheal pathogens in this setting. Regardless of the level of development of the infrastructure, it is important to start by improving rural living conditions at the household level to prevent diarrhea. In particular, domestic treatment systems for drinking water and sanitation of toilets are important in this study area. We are convinced that future research will help guide the development of adapted public health interventions to minimize the burden of diarrheal disease in this particular local community in Vietnam.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph19042456/s1. Figure S1. Questionnaire for socioeconomic status; Figure S2. Age pyramid of the study population under 5 years old and diarrhea episode; Figure S3. Distribution of the study population according to the distance from WPP and diarrhea episode; Table S1. Ownership of household assets; Table S2. Distribution of diarrheal episodes per year of age under 5 years old; Table S3. Model selection of multivariate NB-GLMM (Model 1); Table S4. Model selection of multivariate NB-GLMM (Model 2); Table S5. Number of diarrheal cases and temperature and rainfall amount in Nam Dinh province.

Informed Consent Statement: Written informed consent was obtained from all household heads that participated in our study prior to enrolment. All information and sample collections were performed after explaining this study's objectives and possible risks and benefits. The above committees approved the handling and storage of the collected data.

Data Availability Statement: The data presented in this study are available on request from the corresponding author, Hanako Iwashita, at iwashita.hanako@twmu.ac.jp.