SARS-CoV-2 Wastewater Surveillance in Ten Cities from Mexico

We aimed to estimate the lead time and infection prevalence from SARS-CoV-2 wastewater (WW) monitoring compared with clinical surveillance data in Mexico to generate evidence about the feasibility of a large-scale WW surveillance system. We selected 10 WW treatment plants (WWTP) and 5 COVID-19 hospitals in major urban conglomerates in Mexico and collected twice-weekly 24-h flow-adjusted composite samples during October–November 2020. We concentrated WW samples by polyethylene glycol precipitation and employed reverse transcription quantitative PCR (RT-qPCR) assays targeting the N1 and N2 regions of the nucleocapsid (N) gene. We detected and quantified SARS-CoV-2 RNA in 88% and 58% of the raw WW samples from WWTPs and COVID-19 hospitals, respectively. The WW RNA daily loads led the active cases by more than one month in large and medium WWTP sites. WW-based case estimates were 2- to 20-fold higher than registered active cases. Developing a continuous monitoring surveillance system for SARS-CoV-2 community transmission through WW is feasible and informative, but faces three main challenges: (1) WW system data (catchment area, population served), (2) capacity to maintain the cold chain and process samples, and (3) supplies and personnel to ensure standardized procedures.


Introduction
The emergence of SARS-CoV-2 marked the start of the largest pandemic in more than a century. Mexico reported its first confirmed COVID-19 case on 28 February 2020 and experienced two waves that peaked in July and December 2020 [1]. Early in the pandemic, Mexico monitored its progression through sentinel surveillance, using reverse transcriptase polymerase chain reaction (RT-PCR) tests in clinical cases that fulfilled an epidemiological case definition [2]. After 28 October 2020, rapid antigen (Ag-T) tests for SARS-CoV-2 infections were additionally included [3]. Surveillance systems rely on people seeking medical care and overrepresent symptomatic cases, providing an incomplete picture of COVID-19 cases. Among seropositive people in Mexico, 67.3% were asymptomatic from August to November 2020 [1]. Clinical surveillance is also subject to reporting delays, so it fails to provide an early warning system.
Wastewater-based surveillance has been proposed as a complementary system to detect early changes in the epidemic dynamic and to estimate the burden of infection. SARS-CoV-2 infection is accompanied by viral shedding through the stools of asymptomatic, symptomatic, and recovered individuals, all of which is captured by wastewater [4–9]. The scientific community has been studying the presence of SARS-CoV-2 in the water cycle since the beginning of the pandemic, reporting the detection and quantification of non-infective SARS-CoV-2 RNA in hospital discharges, raw and treated wastewater, and primary sludge from WWTPs across the world [10].
In Mexico, there have been several reports quantifying SARS-CoV-2 RNA in different environmental samples during the first year of the pandemic. During the first months of the pandemic, SARS-CoV-2 RNA was quantified in influent and secondary sludge from two WWTPs in Querétaro (Central Mexico) [11]. The WW-based infection prevalence was 6.5 to 260 times higher than the government-reported cases in the five central municipalities of the state of Hidalgo (Central Mexico) in mid-2020 [12]. The viral titers found in three out of four WWTPs in the Monterrey Metropolitan Area (Northern Mexico) were associated with clinical COVID-19 surveillance indicators, preceding the rise in reported clinical cases by two to seven days, and the WW-based infection prevalence was significantly higher than the active cases reported by health authorities between June and December 2020 [13]. Higher concentrations of SARS-CoV-2 RNA were quantified in water samples collected from rivers near human settlements and in WWTPs than in samples collected from rivers outside Tapachula city (Southern Mexico) between September 2020 and January 2021 [14]. All groundwater samples from sinkholes in the state of Quintana Roo were negative for SARS-CoV-2 RNA from August 2020 to January 2021, whereas most of the raw WW samples were positive [15].
In Mexico and other countries, the SARS-CoV-2 RNA concentrations in raw WW have been correlated with different clinical COVID-19 surveillance indicators to establish the lead time of the early warning signal and the infection prevalence estimation provided by environmental surveillance. Different clinical surveillance indicators have been used, such as the weekly COVID-19 case rate in the USA [16], weekly COVID-19 cases in Argentina [17], and newly hospitalized patients in Sweden [18]. Other studies showed a lead of one to two weeks with respect to officially confirmed cases in India [19] and a lead of two to three days for positive cases in Canada [20].
The ability of WW-based surveillance to provide informative data depends on various factors, including the population coverage of WWTPs, dilution of wastewater by industrial or pluvial sources, and losses of water in the system, among others [21]. A review showed that reports around the world are not comparable in terms of gene copies detected and lag time between monitored RNA and reported cases, because of varying sewerage systems and climatic conditions that affect the virus degradation rate. The lead time for the early warning signal is more applicable to places where the sewerage system is well connected to the sampled WWTP [22]. Mexico is a middle-income country with high subnational inequality. More than 90% (92.42%) of the population uses at least basic sanitation services, but this proportion is lower in the Southern regions than in the Central and Northern regions of Mexico.
The present study systematizes the comparison with clinical surveillance data and the infection prevalence estimation from WW monitoring in ten cities in Mexico. A two-month raw WW sampling campaign conducted between October and November 2020 allowed us (1) to evaluate the presence of SARS-CoV-2 RNA in WW from WWTP influents and COVID-19 hospital effluents, (2) to analyze the lead time of WW-based surveillance compared to a clinical indicator, for lags of up to 50 days, and (3) to estimate the number of infected subjects in cities under different epidemic stages, in order to generate evidence about the feasibility of a large-scale WW surveillance system in a middle-income country.

Instruments and Processes Standardization
We developed four instruments to register data: (1) WWTP information form (Supplementary Table S1), (2) Hospital information form (Supplementary Table S1 Figure S2). We also developed two process standardization guides: (1) Sampling process guideline, which explained the sampling process at WWTP influent/effluent and hospital effluent points, including the sampling and storage procedures and the use of personal protection equipment, and (2) Transportation process guideline, which explained the transport process from the sampling point to the lab while maintaining the cold chain. We undertook a two-week pilot phase in three WWTPs (two in Mexico City, Cerro de la Estrella and Ciudad Deportiva, and one in Cuernavaca, Acapantzingo) to adjust data collection instruments, sampling processes, logistics (e.g., cold chain), and lab processes (i.e., sample concentration with polyethylene glycol and RNA extraction). During this phase, we collected 20 wastewater samples (12 raw and 8 treated).

Sampling Sites Selection
We selected 10 WWTPs for sampling using the following inclusion criteria: (1) located in a major urban conglomerate (>100,000 inhabitants), (2) located in a municipality within the 90th percentile of municipalities by COVID-19 attack rate (confirmed cases per 1000 inhabitants), and (3) influent flow above 10 L/s (to minimize viral load variability due to flow changes). We identified municipalities that fulfilled the criteria using the official COVID-19 dashboard from the National Institute of Public Health [https://www.insp.mx/informacion-institucional-covid-19.html (accessed on 30 September 2020)] and the official WWTP inventory from the National Water Commission (CONAGUA). The selected WWTPs were: Agua Prieta (Guadalajara Metropolitan Area), Cerro de la Estrella (Mexico City), San Francisco (Puebla Metropolitan Area), Zaragoza (Mexicali), León (León), Reynosa I (Reynosa), Norponiente (Cancún), Acapantzingo (Cuernavaca), La Raya (Oaxaca Metropolitan Area), and Zona Noreste (Villahermosa). WWTP personnel were contacted and asked to collaborate in the study through the official channels of CONAGUA; all selected WWTPs agreed to collaborate. We then selected 5 hospitals for effluent sampling using the following inclusion criteria: (1) designated for COVID-19 patients, (2) located within the catchment area of one of the selected WWTPs, and (3) held 50% or more of the COVID-19 patient beds in the WWTP's catchment area, which we estimated using official data from the Health Ministry. The 5 selected hospitals were in Guadalajara, Mexicali, Reynosa, Cancún, and Cuernavaca. Hospital personnel were contacted and asked to collaborate in the study by local CONAGUA personnel; all 5 hospitals agreed to participate. Sampled sites are shown in Figure 1.

WW Sampling
Sampling personnel manually collected 12 grab samples to obtain 24-h flow-adjusted composite WW samples, hourly registering wastewater temperature, electrical conductivity, color, smell, pH, and ambient temperature. We sampled all 15 sites twice weekly (Wednesdays and Sundays) from 7 October to 30 November 2020. Samples were sent to the lab in Cuernavaca by air or land courier, depending on location, in containers at 4 °C. Composite samples obtained on 23, 27, and 30 September in two WWTPs (pilot phase) and on 21 and 28 October in eight WWTPs were analyzed to determine five-day biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total nitrogen (N), and phosphorus (P).
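As an illustration of what a flow-adjusted composite involves, the sketch below splits a composite volume across the grab samples in proportion to the flow registered at each grab. The proportional rule, the function name, the flow readings, and the 250 mL composite volume used in the example are assumptions made for illustration; the sampling guideline itself is not reproduced here.

```python
def aliquot_volumes(flows_l_per_s, composite_volume_ml):
    """Split a composite volume across grab samples in proportion to flow.

    Assumed rule (not taken from the sampling guideline): each grab
    contributes a share of the composite equal to its share of total flow.
    """
    total_flow = sum(flows_l_per_s)
    if total_flow <= 0:
        raise ValueError("total flow must be positive")
    return [composite_volume_ml * q / total_flow for q in flows_l_per_s]


# Hypothetical flow readings (L/s) at the 12 grab times of one sampling day.
flows = [30, 28, 25, 24, 26, 32, 38, 40, 39, 36, 34, 31]
volumes = aliquot_volumes(flows, composite_volume_ml=250)
```

Under this rule, grabs taken at peak flow contribute the largest aliquots, so the composite concentration approximates the flow-weighted daily mean.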

Sample Concentration
We conducted viral particle harvesting by separation in two phases and precipitation with polyethylene glycol (PEG) and NaCl [23]. As a safety measure, the 250 mL containers were pasteurized in a water bath at 60 °C for 90 min before opening. The container was then centrifuged at 8500 rpm for 30 min to remove larger solid particles. The resulting supernatant was filtered through a 0.45 µm membrane (Merck, Millipore, Burlington, MA, USA, Cat. HAWP04700), and the filtrate was collected in sterile containers. Then, 40 mL of a sterile solution of PEG 8000 (50% w/v) (Sigma, Aldrich, St. Louis, MO, USA, Cat. 89510) and NaCl (0.3 M) (Merck, Cat. 106404) were added to 200 mL of filtered wastewater. The containers were mixed by gentle inversion until homogenized and kept at 4 °C overnight. Samples were then centrifuged at 8500 rpm for 2 h or until a pellet was visible. The supernatant was discarded, and the pellet was resuspended in 140 µL of nuclease-free sterile distilled water to continue with the extraction of the viral genome.

Viral RNA Extraction, Detection, and Quantification
Viral RNA was purified with the commercial QIAamp Viral RNA Mini kit according to the manufacturer's instructions (Qiagen, Cat. 52906). The detection of SARS-CoV-2 RNA was performed by reverse transcriptase quantitative polymerase chain reaction (RT-qPCR) with the GoTaq® Probe 1-Step RT-qPCR System kit (Promega, Madison, WI, USA, Cat. A6121) and the RT-qPCR 2019-nCoV diagnostic kit, which includes primers targeting the N1 and N2 regions of the SARS-CoV-2 nucleocapsid (N) gene, commercialized by Integrated DNA Technologies (IDT, Coralville, IA, USA, Cat. 10006770) and validated by the US Centers for Disease Control and Prevention [24] and by the Mexican Institute for Diagnosis and Epidemiological Reference (InDRE, DGE-DSAT-04663-2020). The RT-qPCR reactions were carried out on the CFX96 Real-Time PCR Detection System (Bio-Rad, Hercules, CA, USA). For absolute quantification, we used the positive control 2019-nCoV-N (IDT, Cat. 10006625), a plasmid supplied at 200,000 copies/µL that contains the complete N gene of SARS-CoV-2, with which we made 10-fold serial dilutions to generate the standard curve. The number of copies of the SARS-CoV-2 N gene present in the WW samples was quantified by titration of the N1 and N2 gene segments. We defined the RNA concentration in the sample as the higher of the quantified results for the N1 and N2 gene segments and reported its Log10 transformation. Each RNA sample, standard curve, and recommended control was analyzed in triplicate, and cycle threshold (Ct) values were used to calculate the average SARS-CoV-2 N gene copies/mL of each sample. Ct values <40 cycles were considered positive for SARS-CoV-2, as previously proposed [25]. The data obtained from technical replicates showed low dispersion and were reproducible. However, RNA samples whose technical replicates differed by more than 50% were re-assayed.
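The absolute quantification step can be illustrated with a generic standard-curve calculation: Ct is regressed on the Log10 copy number of the serial dilutions, and the fitted line is inverted to convert a sample Ct into copies. This is a minimal sketch of the usual qPCR approach, not the instrument software's procedure; the function names and the standard-curve Ct values are hypothetical.

```python
import math


def fit_standard_curve(copies_per_reaction, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies_per_reaction]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical 10-fold dilution series and mean Ct of each triplicate.
standards = [1e5, 1e4, 1e3, 1e2, 1e1]
standard_cts = [20.0, 23.3, 26.6, 29.9, 33.2]
slope, intercept = fit_standard_curve(standards, standard_cts)
sample_copies = copies_from_ct(26.6, slope, intercept)
```

Converting copies per reaction to copies/mL of wastewater additionally requires the elution, template, and concentrated sample volumes, which are left out of this sketch.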
The limit of RNA detection for our method was 0.001 copies/mL, and the non-detectable samples were assigned a value of LOD/2, resulting in a Log10 value of −3.3. The viral load of N gene copies in WW samples was adjusted by WW flow, temperature, and mean travel time to the WWTP, as explained in Section 2.9 [26].
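The handling of non-detects and the choice between N1 and N2 can be sketched as follows. The LOD and the LOD/2 substitution come from the text; treating a quantified value below the LOD as a non-detect, and the function name, are assumptions of this sketch.

```python
import math

LOD_COPIES_PER_ML = 0.001  # limit of detection stated in the text


def sample_log10_concentration(n1_copies_per_ml, n2_copies_per_ml,
                               lod=LOD_COPIES_PER_ML):
    """Log10 concentration of a sample from its N1 and N2 results.

    Each argument is a concentration in copies/mL, or None when the target
    was not detected. The higher detected value is used; if neither target
    is detected, LOD/2 is substituted.
    """
    detected = [v for v in (n1_copies_per_ml, n2_copies_per_ml)
                if v is not None and v >= lod]  # assumption: below-LOD counts as non-detect
    value = max(detected) if detected else lod / 2
    return math.log10(value)


non_detect = sample_log10_concentration(None, None)
quantified = sample_log10_concentration(120.0, 85.0)
```

With the stated LOD, a non-detect maps to log10(0.0005), which rounds to the −3.3 reported above.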

Recovery Efficiency Test
To evaluate the viral particle harvesting protocol, we used the φITL-1 DNA bacteriophage, a member of the Podoviridae family that lacks an envelope and is lytic for the bacterium Ralstonia solanacearum. The φITL-1 was propagated, harvested, purified, and titrated to obtain a solution of 3.3 × 10^5 PFU (plaque-forming units) per milliliter. We spiked 17 wastewater samples with one milliliter of this solution and processed them following the viral particle harvesting protocol described above. DNA extraction from φITL-1 was performed with the phenol-chloroform method.
To have a reference template to generate a standard DNA curve of φITL-1 and to determine the number of genomic copies in water samples, we amplified three regions of the φITL-1 genome by endpoint PCR and independently cloned them into pCR2.1. The plasmid DNA of the three constructs was purified (with the alkaline lysis technique) and quantified by fluorometry, and the number of copies was calculated using the online tool "DNA Calculator" [http://www.molbiotools.com/dnacalculator.html (accessed on 30 April 2021)]. We then generated 10-fold serial dilutions, from single copies up to 1 × 10^5 copies (6 standards), of each of the three constructs. Subsequently, the qPCR performance of the three pairs of oligonucleotides was evaluated in conjunction with their corresponding standards. The qPCR reactions were performed with the BRYT Green kit (Promega, Cat. A6001) in the CFX96 Real-Time PCR Detection System (Bio-Rad). Together, these experiments indicated that the ITL1-TR2 oligonucleotide pair exhibited excellent qPCR performance, as shown in the Supplementary Methods and Table S2. Finally, each genomic DNA sample of the bacteriophage φITL-1 (purified from the spiked wastewater samples), the standards, and the controls were analyzed by qPCR in triplicate with the ITL1-TR2 oligonucleotides. Cycle threshold (Ct) values were used to calculate the percent recovery of the method. The mean recovery efficiency of viral particles for the concentration method, across the 17 analyzed samples, was 29.2% (range 4.5 to 53.1%).
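For clarity, the recovery calculation and its later use as a correction factor can be written out as below. This is a sketch under the assumption that percent recovery is the ratio of recovered to spiked genome copies; the function names and the example copy numbers are hypothetical, and only the 0.292 mean recovery comes from the text.

```python
def percent_recovery(recovered_copies, spiked_copies):
    """Share of spiked phage genome copies recovered after concentration."""
    if spiked_copies <= 0:
        raise ValueError("spiked_copies must be positive")
    return 100.0 * recovered_copies / spiked_copies


def correct_for_recovery(measured_copies_per_ml, recovery_fraction=0.292):
    """Scale a measured concentration up by the mean recovery fraction."""
    return measured_copies_per_ml / recovery_fraction


# Hypothetical spike of 1.0e5 copies, of which 2.92e4 were recovered.
example_recovery = percent_recovery(2.92e4, 1.0e5)
corrected = correct_for_recovery(29.2)
```

Because a single mean value is applied to all samples, sample-to-sample variation in recovery (4.5 to 53.1% here) is not propagated by this correction.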

Clinical-Based Surveillance Data
We obtained data on the number of COVID-19 cases from the Epidemiological Surveillance System for Viral Respiratory Disease (SISVER). Briefly, this is a central government system that registers all confirmed COVID-19 cases from public and private healthcare units in the country. During the study period, in public healthcare units, a confirmatory diagnostic test (RT-PCR and antigen test) was only applied to patients who fulfilled the criteria for Severe Acute Respiratory Infection (suspected case plus at least one sign: dyspnea, chest pain, or desaturation), except in 475 units that applied a diagnostic test to 10% of outpatient cases as part of a sentinel surveillance system. Therefore, the surveillance system does not include asymptomatic or pauci-symptomatic cases [27]. We used active cases as the epidemiological indicator from SISVER, following the operational definition used in Mexico: confirmed cases that started symptoms within 14 days of the sampling date. We decided not to include daily deaths among confirmed cases as an indicator because of the delay in updating this information in the Epidemiological Surveillance System at that time. We registered the testing rate per 100,000 inhabitants for the municipality where each WWTP is located, at the end of the sampling period, from the COVID-19 monitoring dashboard hosted on the National Institute of Public Health site.
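The active-case indicator can be computed from a list of symptom onset dates as sketched below. The function name and the example dates are hypothetical, and counting both ends of the 14-day window as included is an assumption, since the operational definition does not state how the boundaries are treated.

```python
from datetime import date, timedelta


def active_cases(onset_dates, sampling_date, window_days=14):
    """Count confirmed cases with symptom onset in the window before sampling.

    Assumption: the window is closed at both ends, i.e. onset on the
    sampling date or exactly `window_days` earlier is counted.
    """
    window_start = sampling_date - timedelta(days=window_days)
    return sum(1 for onset in onset_dates
               if window_start <= onset <= sampling_date)


# Hypothetical onset dates of confirmed cases in one municipality.
onsets = [date(2020, 10, 1), date(2020, 10, 10), date(2020, 10, 20),
          date(2020, 10, 21), date(2020, 10, 25)]
n_active = active_cases(onsets, sampling_date=date(2020, 10, 21))
```

Here the window runs from 7 to 21 October, so three of the five cases count as active.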

Lead Time of WW-Based Surveillance
For RNA concentrations in WW to be an early warning marker for COVID-19 cases, we expected RNA concentrations as a function of time to be displaced to the left, relative to the COVID-19 metric function. Therefore, the early warning time would correspond to the magnitude of the displacement of the RNA function that brings it most in harmony with the case function. Thus, we looked for optimal harmonization by locating the maximum of Pearson's Rho between the two functions for a given lag in days.
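The per-site lag search can be sketched as follows: the RNA series is interpolated to daily values, the case series is displaced one day at a time, and the lag with the highest Pearson's Rho is kept. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names, the `min_pairs` overlap requirement, and the synthetic data are hypothetical, and the pooled fixed-effects step is not covered.

```python
import math


def interpolate_daily(days, values):
    """Linearly interpolate sparse observations to one value per day."""
    points = list(zip(days, values))
    daily = {}
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        for d in range(d0, d1 + 1):
            daily[d] = v0 + (v1 - v0) * (d - d0) / (d1 - d0)
    return daily


def pearson(xs, ys):
    """Pearson's Rho; returns None when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def best_lag(rna_daily, cases_daily, max_lag=50, min_pairs=10):
    """Return (lag, rho) maximizing Rho between RNA on day t and cases on day t + lag."""
    best = None
    for lag in range(max_lag + 1):
        common = [d for d in sorted(rna_daily) if d + lag in cases_daily]
        if len(common) < min_pairs:  # assumption: skip lags with too little overlap
            continue
        rho = pearson([rna_daily[d] for d in common],
                      [cases_daily[d + lag] for d in common])
        if rho is not None and (best is None or rho > best[1]):
            best = (lag, rho)
    return best


# Hypothetical twice-weekly log10 RNA loads (day index, value) for one site.
sample_days = [0, 4, 7, 11, 14, 18, 21, 25, 28, 32, 35, 39, 42, 46, 49, 53]
log10_rna = [1.0, 1.2, 1.1, 1.5, 1.9, 2.6, 3.4, 3.9,
             4.1, 3.8, 3.0, 2.5, 2.3, 1.6, 1.4, 1.3]
rna_daily = interpolate_daily(sample_days, log10_rna)
# Synthetic case curve with the same shape, arriving 10 days later.
cases_daily = {d + 10: 100 + 50 * v for d, v in rna_daily.items()}
lag, rho = best_lag(rna_daily, cases_daily)
```

On this synthetic pair the search recovers the 10-day displacement built into the case curve. With real data the Rho-versus-lag curve can have more than one local maximum, so inspecting the whole curve is more informative than the single best lag.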
The procedure to locate the maximum Rho was implemented in three steps. Firstly, to displace the RNA function in one-day increments, we linearly interpolated values of log10 RNA daily adjusted copies between the observed twice-weekly values, by WWTP. Secondly, by WWTP, we estimated Rho for one-day displacements of the active case function (COV), from 0 to 50 days. Lastly, to estimate the maximum jointly analyzing all sites, we fitted fixed-effects linear models of the form:

Infection Prevalence Estimation
We estimated SARS-CoV-2 infection prevalence in the WWTP catchment area using a Monte Carlo approach [28]. We ran 1000 estimations for each sample and obtained the median and interquartile range. Four parameters were obtained from the literature. The rest of the parameters were measured on-site by field personnel.
Our overall equation was the following:
And [Half-life in sampling conditions] (h) was estimated through the following equation:
where t1/2,1 is the known half-life (11.8 h) [30], T1 is the temperature at which said half-life was originally estimated (20 °C), T2 is the daily mean wastewater temperature (of the 12 measurements taken at grab sampling times), and Q10 is a temperature-dependent adjustment factor (2.5) [29]. For [Recovery percentage] we used a value of 0.292, our mean virus recovery, as previously explained.
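The temperature adjustment of the half-life can be illustrated with the standard Q10 relation, in which the half-life shortens by a factor of Q10 for every 10 °C above the reference temperature. The sketch below assumes that standard form and a first-order decay over the in-sewer travel time; the function names are hypothetical, while the constants (11.8 h, 20 °C, 2.5) are those given above.

```python
KNOWN_HALF_LIFE_H = 11.8   # half-life at the reference temperature, from the text
REFERENCE_TEMP_C = 20.0    # T1, from the text
Q10 = 2.5                  # temperature adjustment factor, from the text


def adjusted_half_life(mean_ww_temp_c):
    """Half-life (h) at the day's mean wastewater temperature T2.

    Assumed standard Q10 form: t(T2) = t(T1) / Q10 ** ((T2 - T1) / 10).
    """
    exponent = (mean_ww_temp_c - REFERENCE_TEMP_C) / 10.0
    return KNOWN_HALF_LIFE_H / Q10 ** exponent


def remaining_fraction(travel_time_h, half_life_h):
    """Fraction of RNA left after first-order decay during in-sewer travel."""
    return 0.5 ** (travel_time_h / half_life_h)


half_life_at_30c = adjusted_half_life(30.0)  # warm site: shorter half-life
fraction_left = remaining_fraction(travel_time_h=2.0, half_life_h=half_life_at_30c)
```

At 30 °C this form gives a half-life of 11.8 / 2.5 = 4.72 h, which illustrates why warmer sites with long travel distances lose a larger share of the signal before sampling.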
We obtained four variables from the literature: [fecal excretion rate] (copies/mL) [31]; [normal fecal load] (g/(day × person)) [32] (value for middle- and low-income countries); [fecal density] (g/mL) [33]; and [% of infected who shed SARS-CoV-2 fecally] [34]. The distributions, mean values, and standard errors we used are shown in Supplementary Table S3.
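To show how the Monte Carlo estimate combines these inputs, the sketch below assumes a simple mass balance: the daily RNA load reaching the plant, corrected for recovery and in-sewer decay, is divided by the RNA shed per infected person per day and by the share of infected people who shed. This mass-balance form is an assumption of the sketch rather than a transcription of the study's equation. All parameter ranges are placeholders chosen only so the code runs; they are not the distributions of Supplementary Table S3, and the function name is hypothetical.

```python
import random
import statistics

SECONDS_PER_DAY = 86_400
ML_PER_L = 1_000

# PLACEHOLDER uniform ranges for illustration only; NOT the values or
# distributions reported in Supplementary Table S3.
SHEDDING_LOG10_COPIES_PER_ML = (6.0, 8.0)  # log10 copies per mL of feces
FECAL_LOAD_G_PER_DAY = (150.0, 350.0)      # g per person per day
FECAL_DENSITY_G_PER_ML = (1.0, 1.1)        # g per mL
FRACTION_SHEDDING = (0.3, 0.6)             # share of infected who shed fecally


def estimate_infected(conc_copies_per_ml, flow_l_per_s, remaining_fraction,
                      recovery=0.292, n_draws=1000, seed=0):
    """Median and interquartile range of infected people in the catchment."""
    rng = random.Random(seed)
    measured_load = conc_copies_per_ml * ML_PER_L * flow_l_per_s * SECONDS_PER_DAY
    # Undo method losses (recovery) and in-sewer decay (remaining_fraction).
    shed_load = measured_load / (recovery * remaining_fraction)
    draws = []
    for _ in range(n_draws):
        copies_per_ml_feces = 10 ** rng.uniform(*SHEDDING_LOG10_COPIES_PER_ML)
        feces_ml_per_day = (rng.uniform(*FECAL_LOAD_G_PER_DAY)
                            / rng.uniform(*FECAL_DENSITY_G_PER_ML))
        shedders = shed_load / (copies_per_ml_feces * feces_ml_per_day)
        draws.append(shedders / rng.uniform(*FRACTION_SHEDDING))
    q1, median, q3 = statistics.quantiles(draws, n=4)
    return median, (q1, q3)


# Hypothetical sample: 10 copies/mL at 100 L/s with 80% of the RNA surviving transit.
median_cases, (q1_cases, q3_cases) = estimate_infected(10.0, 100.0, 0.8)
```

Reporting the median and interquartile range, as in the text, keeps the summary robust to the long right tail that the shedding rate introduces.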
From the information provided by WWTP personnel, we identified the municipalities served by each WWTP and obtained their populations from the 2015 intercensal survey. Afterward, we used the measured values of BOD5, COD, total nitrogen, and phosphorus to estimate the populations served by each WWTP, according to the methodology initially described by Van Nuijs [35]. Briefly, this method multiplies the measured parameter by the flow and divides the result by a constant, which is 59 for BOD5, 128 for COD, 12.5 for total nitrogen, and 1.7 for phosphorus.

Results
Table 1 shows the median, minimum, and maximum SARS-CoV-2 RNA quantification, average daily flow (L/s), WW temperature (°C), and WW distance traveled (km) by WWTP, as well as sewage system and WW characteristics and reported hospital WW chlorination. SARS-CoV-2 RNA was detected and quantified in all WWTPs except Zona Noreste (Villahermosa), which operated at maximum capacity throughout the sampling period, in a heavy rainfall scenario. Zaragoza (Mexicali), Reynosa I (Reynosa), and Acapantzingo (Cuernavaca) had at least one sample with undetectable RNA; in the remaining six WWTPs, we detected and quantified SARS-CoV-2 RNA in all samples. Complete data of RNA quantification by date are available in Supplementary Table S4.
In Mexico, sewage provision and water treatment are not homogeneous across cities or areas. The sampled WWTPs operate in cities of different sizes and with characteristics that are known to be associated with RNA degradation in the environment. Agua Prieta, in the Guadalajara Metropolitan Area, was the biggest sampled WWTP, with an average inflow of 4009 to 5331 L/s for a population of ca. 3.5 million in the metropolitan area; it is located in a state reporting a high proportion of basic sanitation services and in the municipality with the lowest testing rate. The second biggest WWTP was Cerro de la Estrella in Mexico City, with a population of ca. 3.2 million, a high proportion of basic sanitation services, and a high testing rate. The third was San Francisco in Puebla. Mexicali, León, and Reynosa have middle-size WWTPs in middle-size cities. In contrast, La Raya (Oaxaca) had the lowest average inflow (28 to 38 L/s) and served the second smallest population (ca. 500,000), in a state reporting 80% runoff and the lowest proportion of basic sanitation services, but it had the highest testing rate. Acapantzingo in Cuernavaca and Norponiente in Benito Juárez (Cancún) are smaller WWTPs serving smaller populations.

SARS-CoV-2 RNA in WW of WWTPs and COVID-19 Hospitals
We measured the highest WW temperatures at the Norponiente WWTP in Cancún (27.7 to 30.3 °C), followed by the WWTPs in Mexicali and Reynosa. The lowest temperatures were recorded at Cerro de la Estrella in Mexico City, San Francisco in Puebla, and Acapantzingo in Cuernavaca (19, 20, and 21 °C, respectively). The average WW travel distance was the longest (16.5 km) at the Cerro de la Estrella WWTP (Mexico City) and the shortest (2 km) at the Zona Noreste WWTP (Villahermosa).
The results of the quantification of the virus RNA in log 10 copies/mL during the pilot phase in Cuernavaca and Mexico City are shown in Figure 2A. All raw wastewater samples were positive, while all treated wastewater samples were non-detectable.
SARS-CoV-2 RNA was quantified in 58% of hospital effluent samples. The highest RNA concentrations were quantified in the samples from hospitals that reported no disinfection method for their discharges. In contrast, hospitals reporting discharge chlorination had no detectable results, except for one sample each (Figure 2B).
Figure 3 shows the temporal trend of both the environmental adjusted Log10 SARS-CoV-2 RNA daily loads and the clinical surveillance active cases for nine sites, stratified as big, medium, and small WWTPs. Table 2 shows the correlation coefficients (Rho) and the corresponding lead times in days for RNA daily loads compared to active cases, by site. We found a lead time of over a month for all sites combined and for each big and medium WWTP site. The correlation was stronger for all sites combined and for Mexico City, Guadalajara, and León. A modest correlation was found for Puebla, Mexicali, and Reynosa. The small WWTP sites were not included in Table 2 because the resulting Rho was small and the trends were not clear, as can be seen in Figure 3C. In some cases, the curve of Rho vs. lag days was bimodal; thus, we present the two lags corresponding to the two highest correlations observed. The first lead time occurred within the first days for Guadalajara, Mexico City, and Mexicali. The second lead time ranged from 35 to 43 days for active cases. The complete correlation series of Rho vs. lag days for adjusted Log10 SARS-CoV-2 RNA daily loads with active cases are shown in Supplementary Figure S3.
Table 3 shows the summary of WW-based estimated cases, catchment area population, and prevalence, compared to clinical surveillance-based active cases and testing rate, for each WWTP during the sampling period. Complete data by date are available in Supplementary Table S5. The ratio of estimated to active cases shows that, for all sites, WW-based surveillance estimated a larger number of cases than clinical-based surveillance. In the Center and South of Mexico, including Mexico City, Puebla, and Oaxaca, we estimated twice as many cases based on WW as were reported by clinical-based surveillance. These sites had the highest testing rates. In contrast, in the North and West regions, including Mexicali, Reynosa, and Guadalajara, WW-based estimated cases were 20-fold higher than clinical-based active cases. These sites had lower testing rates.
Table 3. WW-based estimated cases, catchment area population, and prevalence, compared to clinical-based active cases, municipal population, and prevalence for the period October–November 2020 in Mexico.

The WW-based prevalence was higher (7.2 to 71.2 estimated cases per 1000 people) than the clinical-based prevalence (0.3 to 1.6 active cases per 1000 people).
Estimates of the catchment area population calculated from the physicochemical parameters (BOD5, COD, N, P) were 4% to 42% smaller than the municipal population for all WWTPs.
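The population estimate from physicochemical parameters can be illustrated with the calculation below, which multiplies each measured concentration by the daily flow and divides by the per-capita constant. The constants are those given in the methods; reading them as grams per person per day, with concentrations in mg/L, is an assumption of this sketch, and the function name and example concentrations are hypothetical.

```python
SECONDS_PER_DAY = 86_400
MG_PER_G = 1_000.0

# Per-capita constants from the text; the unit (g per person per day) is assumed.
PER_CAPITA_G_PER_DAY = {"BOD5": 59.0, "COD": 128.0, "N": 12.5, "P": 1.7}


def population_estimates(conc_mg_per_l, flow_l_per_s):
    """Estimate the population served from each physicochemical parameter."""
    flow_l_per_day = flow_l_per_s * SECONDS_PER_DAY
    estimates = {}
    for parameter, conc in conc_mg_per_l.items():
        load_g_per_day = conc * flow_l_per_day / MG_PER_G
        estimates[parameter] = load_g_per_day / PER_CAPITA_G_PER_DAY[parameter]
    return estimates


# Hypothetical influent concentrations (mg/L) at a plant receiving 1000 L/s.
example = population_estimates(
    {"BOD5": 295.0, "COD": 640.0, "N": 62.5, "P": 8.5}, flow_l_per_s=1000.0
)
```

The example concentrations were chosen so that all four parameters agree on 432,000 people; with field data the four estimates typically differ, and how they are combined into a single catchment population is a further choice not covered here.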

Discussion
We aimed to evaluate the presence of SARS-CoV-2 RNA in WW from WWTP influents and COVID-19 hospital effluents, identify the lead time of WW monitoring compared to clinical surveillance, and estimate the number of infected subjects in cities in Mexico. We detected and quantified SARS-CoV-2 RNA in 88% of influent wastewater samples; samples in heavy rainfall scenarios were all undetectable. Our results showed that an increase in adjusted SARS-CoV-2 RNA daily loads in WW had a high to moderate correlation with an increase in active cases 35 to 43 days later in sites with big and medium WWTPs. Our estimates showed that the number of infected subjects in the study period could have been 2 to 20 times higher than the number of clinical cases detected through the clinical surveillance system. Our results suggest that SARS-CoV-2 detection in WW is a feasible and informative procedure to be conducted in Mexican cities, mainly in those having a better-connected sewerage network and a higher flow WWTP.
The early warning capability of wastewater surveillance for COVID-19 has been extensively discussed in the literature [13,16–20,36,37]. Our study contributes to this body of work by showing a longer lead time, over one month, in the big and medium WWTP sites, using standardized procedures and proposing a robust statistical methodology focused on harmonizing the case and viral load distributions. Early COVID-19 studies reported highly heterogeneous and site-specific lags between changes in WW viral load and changes in clinical surveillance, ranging from a few days to several weeks [38], implying that lags cannot be compared across sites [22,26]. There are some possible explanations for why our results show a longer lead time than those previously reported. One is that we extended the analysis up to 50 days, and we have not found another paper considering such a long period. It has also been recognized that lead time results are limited by the accuracy of the clinical data, which depends on testing accessibility, care-seeking behavior, and reporting delays in the clinical surveillance system [39]. WW system characteristics could also influence the results, as better-connected sewer networks provide better results [22]. In our study, the highest correlations with active cases, found at lead times of 35 to 43 days, support the conclusion that better-connected sewerage networks, in our case represented by the big and medium WWTPs in states with higher coverage of basic sanitation services, provide stronger correlation coefficients. Because multiple factors delay the detection of cases through the clinical surveillance system, WW-based surveillance can, conceptually, provide information about COVID-19 prevalence and dynamics [40].
We chose the clinical surveillance indicator of active cases, defined as the confirmed cases that started symptoms within the previous 14 days of the sampling date, following the operational definition used in Mexico. We decided not to include the daily deaths among confirmed cases as an indicator, because of the delay in the update of this information in the Epidemiological Surveillance System at that time. While there is no consensus on the best indicator to use, other authors have used cumulative incidence and COVID-19 cases [16], COVID-19 cases per week in Argentina [17], newly hospitalized patients in Sweden [18], confirmed cases in India [19], and positive cases in Canada [20].
WW-based surveillance allowed us to estimate the number of prevalent infection cases. In our study, WW-based cases were 2- to 20-fold higher than the clinical-based active cases for the sampling period, capturing asymptomatic or mild infection cases that were shedding the virus but were not detected by hospitals and clinics. The included municipalities had dissimilar SARS-CoV-2 attack and testing rates, which could influence the comparison with WW-based estimates. Estimates of the catchment area population calculated from the physicochemical parameters (BOD5, COD, N, P) were 4% to 42% smaller than the municipal population for all WWTPs, indicating that WWTPs were capturing a fraction of the municipal population. This estimation of the population that contributed to the WW sample is relevant for calculating the prevalence (number of cases/population in the catchment area). In Mexico, Padilla-Reyes et al. [13] showed results for a WW-based case estimation for the city of Monterrey, comparing it to the clinical surveillance data. Robotto et al. [26] also estimated the cases expected according to the WW signal but did not consider the natural degradation of the viral RNA. Hart and Halden [29] and Ahmed et al. [41] did not include the recovery efficiency of the method in their estimations, whereas we did, although only as a point value. Similar to recent studies [25,42–44], in this work we used a non-enveloped bacteriophage to calculate the recovery efficiency of viral particles. It is possible that the recovery efficiency differs between SARS-CoV-2 and the bacteriophage, which could have contributed to an incorrect estimate of the amount of SARS-CoV-2 in the wastewater samples. In-sewer travel time has rarely been reported, but we adjusted our results for this variable.
The shedding profile of SARS-CoV-2, including the shedding rate, the beginning, and the duration of shedding are still being studied to improve the calculations of infection prevalence and reduce the uncertainties that exist in our results [45,46].
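To make the role of each adjustment concrete, the following is a minimal sketch of a mass-balance back-calculation of infections from a WW RNA signal. It is not the exact model used in this study: the functional form (first-order in-sewer decay, a point value for recovery, a fixed per-person fecal shedding load) and every parameter name and value are assumptions chosen for illustration.

```python
import math

def estimate_infected(rna_copies_per_l, flow_l_per_day, recovery_fraction,
                      decay_rate_per_h, travel_time_h,
                      feces_g_per_person_day, shedding_copies_per_g):
    """Back-calculate the number of people shedding from a WW RNA measurement."""
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    # Correct for RNA lost during concentration and extraction (point value).
    corrected = rna_copies_per_l / recovery_fraction
    # Undo assumed first-order decay during in-sewer travel.
    at_source = corrected * math.exp(decay_rate_per_h * travel_time_h)
    # Daily RNA load arriving at the plant (copies/day).
    daily_load = at_source * flow_l_per_day
    # Assumed RNA shed per infected person per day (copies/person/day).
    per_person = feces_g_per_person_day * shedding_copies_per_g
    return daily_load / per_person

def prevalence(infected, catchment_population):
    """Prevalence = number of cases / population in the catchment area."""
    return infected / catchment_population

# Hypothetical inputs, not values from this study.
infected = estimate_infected(
    rna_copies_per_l=1e4, flow_l_per_day=1e8, recovery_fraction=0.5,
    decay_rate_per_h=0.0, travel_time_h=0.0,
    feces_g_per_person_day=100.0, shedding_copies_per_g=1e7,
)
```

Because the shedding load sits in the denominator, any uncertainty in the shedding rate carries over proportionally to the estimated number of infections, which is why the shedding profile dominates the uncertainty of such estimates.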
Recent studies [47-49] describe pasteurization of wastewater samples as an important factor in the degradation of SARS-CoV-2 RNA. In contrast, another study [50] reports that pasteurization may lead to a slight increase in the recovery of SARS-CoV-2 RNA. Robinson et al. [51] report that pasteurization did not significantly reduce the SARS-CoV-2 signal when the RNA was extracted immediately afterwards, whereas the signal decreased significantly when the RNA was purified 24-36 h after pasteurization. In this work, we pasteurized all samples with the same procedure, including those used to calculate the percentage of viral recovery; for this reason, we believe that all analyzed samples share the same bias. However, pasteurization may have contributed to the degradation of the viral RNA, which could have led to an underestimate of the amount of SARS-CoV-2.
Our results showed that secondary treatment and disinfection applied to WW in treatment plants were effective in eliminating the genetic material and, therefore, the coronavirus. A previous study in Mexico agrees with this finding [11]. On the other hand, SARS-CoV-2 RNA was still detectable in treated wastewater analyzed in Santiago de Chile in May and June 2020, although at a lower concentration than in raw wastewater [52]. In a previous study, chlorination of COVID-19 hospital wastewater discharges was effective in removing genetic material and, therefore, coronavirus in 94% of effluent samples, as reviewed by Achack et al. [53]. An additional study carried out in China during the SARS-CoV-1 epidemic detected SARS-CoV-1 RNA in untreated wastewater and found only one weakly positive sample after the first disinfection process; all samples were negative at the end of the disinfection process [54]. Our results are consistent with these previous findings.
Our study had several limitations. First, Villahermosa experienced heavy rainfall and flooding during the sampling period, which is likely why all our samples from that site were negative for SARS-CoV-2 RNA. This observation is consistent with other reports that rainfall and flooding can dilute WW samples and prevent the detection of genetic material [55,56]. Additionally, WWTPs in Mexico do not have precise information about their catchment area, and we approximated it from census data for the served municipalities. Epidemiological data from clinical surveillance are grouped at the municipal level, while the catchment areas of the WWTPs do not necessarily coincide with the municipal boundaries. We included an estimate of the number of inhabitants in the catchment area of each WWTP, based on concentrations of nitrogen, phosphorus, and oxygen demand in some of the samples. Through this method, we estimated the catchment population to be a smaller fraction (5% to 40%) of the municipal population in all sites. Other research has used the design capacity of a WWTP in the back-calculations as the number of inhabitants living in its catchment area, but this number is dynamic and varies daily [35]. A GIS-based estimation using census population and sewershed maps could provide more precise information about the population served by each facility, as described for Monterrey [13]. However, even with this information, there are important population movements because of work, tourism, visits, and other phenomena, as well as other contributing sources of wastewater, so a population calculation based on physicochemical parameters can provide a better denominator for the prevalence estimation.
Further studies will need to employ appropriate methods to estimate and validate the population covered by the studied WWTP and consider how to incorporate the dynamics of the population into the result interpretation [45,46].
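As an illustration of a population estimate based on physicochemical parameters, the sketch below divides the daily load of each marker by an assumed per-capita load and combines the marker-specific estimates with a median. The per-capita loads, the marker names, and the use of the median are placeholders chosen for illustration; they are not the values or the procedure used in this study.

```python
from statistics import median

def population_from_marker(conc_mg_per_l, flow_l_per_day, per_capita_g_per_day):
    """Population equivalent from one marker: daily load / per-capita load."""
    load_g_per_day = conc_mg_per_l * flow_l_per_day / 1000.0  # mg -> g
    return load_g_per_day / per_capita_g_per_day

def catchment_population(concentrations_mg_per_l, flow_l_per_day,
                         per_capita_loads_g_per_day):
    """Median of the marker-specific population estimates (an assumption)."""
    estimates = [
        population_from_marker(conc, flow_l_per_day,
                               per_capita_loads_g_per_day[marker])
        for marker, conc in concentrations_mg_per_l.items()
    ]
    return median(estimates)

# Hypothetical per-capita loads (g/person/day) and measured concentrations (mg/L).
per_capita = {"oxygen_demand": 60.0, "nitrogen": 12.0, "phosphorus": 2.0}
measured = {"oxygen_demand": 300.0, "nitrogen": 54.0, "phosphorus": 11.0}
population = catchment_population(measured, flow_l_per_day=1e8,
                                  per_capita_loads_g_per_day=per_capita)
```

A median is less sensitive than a mean to a single marker inflated by a non-domestic source such as an industrial discharge, although any combination rule would need validation against known sewershed populations.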

Conclusions
Establishing a long-term monitoring system for COVID-19 in Mexico will be a challenge, mainly due to costs and logistics. Supply lines need to be stable to acquire materials that were in short supply during the pandemic; we frequently faced both shortages and overpricing. Transportation was also a challenge, since the distance from the 10 included cities to our lab ranged from only 10 km to over 2000 km. A network of laboratories, such as the public health laboratory network in Mexico, will be needed to facilitate transportation and analysis without compromising the cold chain. While challenging, we consider a monitoring system feasible and informative if three main conditions are fulfilled: (1) adequate data regarding the WW system (catchment area, population served, WW sources), (2) capacity to maintain the cold chain and process samples (shorter transportation times), and (3) investment in supplies and training of personnel to ensure standardized procedures. We think these challenges may also apply to other low- and middle-income countries.
Figure S3: Correlation series of Rho for lead time in days for the functions of Log10 adjusted SARS-CoV-2 RNA daily loads with active cases, by WWTP, Mexico, October-November 2020.
Supplementary Table S5: Estimated cases by WW SARS-CoV-2 RNA quantification and active cases on the clinical surveillance system.
References [31-34]