Inverted Covariate Effects for First versus Mutated Second Wave Covid-19: High Temperature Spread Biased for Young

(1) Background: Here, we characterize COVID-19’s waves, following a study presenting negative associations between first wave COVID-19 spread parameters and temperature. (2) Methods: Visual examinations of daily increases in confirmed COVID-19 cases in 124 countries determined first and second waves in 28 countries. (3) Results: The first wave spread rate increases with country mean elevation, median population age, and time since wave onset, and decreases with temperature. Spread rates decrease above 1000 m, indicating that high ultraviolet (UV) exposure decreases the spread rate. The second wave associations are the opposite, i.e., spread increases with temperature and young age, and decreases with time since wave onset. The earliest second waves started 5–7 April at mutagenic high elevations (Armenia, Algeria). The second waves also occurred at the warm-to-cold season transition (Argentina, Chile). Second vs. first wave spread decreases in most (77%) countries. In countries with late first wave onset, spread rates better fit second than first wave-temperature patterns. In countries with ageing populations (for example, Japan, Sweden, and Ukraine), second waves only adapted to spread at higher temperatures, not to infect the young. (4) Conclusions: First wave viruses evolved towards lower spread. Second wave mutant COVID-19 strain(s) adapted to higher temperature, infecting younger ages and replacing (also in cold conditions) first wave COVID-19 strains. Counterintuitively, low spread strains replace high spread strains, rendering prognostics and extrapolations uncertain.


Introduction
Spread parameters of the Covid-19 pandemic decrease with temperature [1]. This could be a direct effect of temperature causing faster aerosol evaporation, limiting travel time and distance of airborne droplets with viral particles. Alternatively, high temperature due to insolation is a proxy for ultraviolet light (UV) exposure. UV is highly mutagenic and can decrease viral viability. Prediction or early detection of second waves could be valuable for policy decisions [2] and seems more accurate than usually believed [3]. The same is true for determining climatic conditions favorable to viral spread [4,5]. Surprisingly, comparisons among Italian regions show that in May, temperature increased viral spread, a pattern opposite to that observed in March [6]. Hence, this observation on Italian regions predicts that when comparing different countries, second wave spread parameters could increase with temperature. Figure 1 plots slopes of exponential regressions on time of daily new cases (calculated as a function of days since first 100 confirmed cases) as a function of mean country elevation. Exponents (which estimate contagion rates) increase with temperature up to 900–1000 m, then drop above 1000 m, especially for landlocked high elevation countries. This analysis potentially disentangles co-linearities between temperature and UV.
The trend below 1000 m confirms previously described effects of temperature on spread parameters, as temperature decreases with elevation. The drop in the exponents above 1000 m elevation indicates direct UV effects, probably by increasing deleterious mutations. These observations are for exponents estimated for the first Covid-19 wave, for each country (Table 1).
We use this pattern as preliminary evidence to justify the working hypothesis that changes in epidemiological patterns between first and second waves could be due to mutations.

Covid-19 Viruses Evolve Over Time
The number of mutations in a country increases with time since first wave onset (r = 0.561, two-tailed p = 0.00084, Figure 2). Time since onset is indeed proportional to replicational cycles, and viral population evolution. No meaningful correlation was observed between mutation numbers and country mean temperature or elevation. The results remain qualitatively unchanged after excluding extreme datapoints (Nepal, London, UK) from the analysis.
Identical mutations sometimes occur in different populations of the Covid-19 virus [18], a phenomenon called parallel evolution. Close positions of African, Asian, European and South American countries with high elevation in Figure 1 potentially suggest parallel evolutions at high altitudes affecting spread parameters of these distant viral lineages. Strengthening this point, Georgia, which was not initially included in our sample, has a low first wave slope = 0.0346 with a mean altitude of 1432 m.
Figure 1. Axes: slope of exponential regression, 1st wave; mean country elevation, m (X). Countries contributing to the negative trend with elevation, down to 110 m: r = −0.375, two-tailed p = 0.01. Note that for countries above 1000 m, landlocked or isolated countries tend to fit the negative trend (for example, Bolivia, Ethiopia, Armenia, and Afghanistan) as opposed to countries with large coastal populations (for example, Chile and South Africa) and landlocked Nepal and Switzerland, probably contaminated by tourists from low elevation countries. Peru has a low slope and a large coastal population. Data are from Table 1.

Determination of First and Second Waves
Here, we study exponents estimated for second Covid-19 waves, derived from visually examining daily new cases in 123 countries. We explored temporal-, geographic-, demographic- and temperature-associated patterns of second wave spread parameters. We examined graphs plotting daily numbers of new confirmed cases (as daily updated at https://www.worldometers.info/coronavirus/ [2]) for 123 countries. Second waves were visually determined, with examples in Figure 3 (Iran and Argentina). Second waves occurred in 26 countries, along patterns shown for Iran (broken first wave, second wave started from a low rate). The pattern shown for Argentina (new slope after inflection in first wave still in its growing phase (note the logarithmic scale of the y axis of Figure 3B)) occurs only in one other country, i.e., neighboring Chile. For Argentina and Chile, the new second wave slope occurred during the hot-to-cold season transition, in early April, corresponding to an early October northern hemisphere seasonal shift.

Figure 3. First wave onsets are defined from the day the cumulative total number of confirmed cases passes 100. Onset of second waves is determined visually. All countries but Chile follow the general pattern, as in the example for Iran, where the new wave follows a decrease; Chile follows the pattern of Argentina. Note the log scale for the Figure 3B y axis; this presentation mode was chosen in order to visually enhance slope change. Data are from Table 1.
The lower second vs. first wave slopes in Figure 3 are not due to temperature increase, as could be expected from negative correlations between first wave slopes and temperature [1]. This is because for Argentina and Chile (Table 1), lower slopes correspond to the hot-to-cold season transition, not to a cold-to-hot transition. Table 1 compares the first and second wave slopes.
Table 1. Exponential slopes of first and second Covid-19 waves in countries with two detected waves. Column 1, Country; Column 2, T, mean annual temperature; Column 3, E, mean elevation; Column 4, D, density; Column 5, A, median age in that country. Start S1 for the first wave is the date when cumulated total confirmed cases reached 100; start S2 for the second wave is visually estimated as in Figure 3. Numbers following the second wave onset date indicate differences with onset dates determined by other methods, see Sections 3.11 and 3.12. Slopes are the exponent b from the exponential regression y = a * exp(b * x), where y is the number of new daily cases and x the number of days since 100 cumulated cases for the first wave, or since second wave start.
First wave data were completed by data from [1] and countries with mean elevation >900 m (indicated with *). In Kenya and Sri Lanka, erratic data prevent estimating first wave slopes.
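As an illustration of the slope definition in the Table 1 caption, the following minimal Python sketch finds the first wave onset (cumulative cases reaching 100) and estimates the exponent b of y = a * exp(b * x). It assumes the fit is done by ordinary least squares on ln(y), which the text does not specify; the function names and the synthetic series are hypothetical and are not data from Table 1.

```python
import math

def first_wave_onset(daily_new, threshold=100):
    """Index of the first day on which cumulative cases reach the threshold."""
    total = 0
    for day, n in enumerate(daily_new):
        total += n
        if total >= threshold:
            return day
    return None  # threshold never reached

def exponential_slope(daily_new):
    """Fit y = a * exp(b * x) by least squares on ln(y); returns (a, b).

    Assumption: log-linear fitting. Days with zero new cases are skipped
    because ln(0) is undefined.
    """
    pts = [(x, math.log(y)) for x, y in enumerate(daily_new) if y > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_ly = sum(ly for _, ly in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in pts)
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)
    return a, b

# Synthetic daily new cases growing at a rate of 0.12 per day (hypothetical).
series = [round(5 * math.exp(0.12 * d)) for d in range(40)]
onset = first_wave_onset(series)
a, b = exponential_slope(series[onset:])  # x counts days since onset
```

On this synthetic series the recovered exponent b is close to the growth rate used to generate it; on real data, the choice of the end date of the fitted period would also matter.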

Geographical Second Wave Clusters
Visual data examinations such as in Figure 3 for 123 countries, on 31 May, detect second waves in 28 countries from four continents (Africa (2), Americas (North, 4 and South, 2), Asia (12) and Europe (8)). For Kenya and Sri Lanka, first wave slopes could not be determined (Table 1). Earliest second waves are from Armenia and Poland (5 April), and Algeria (7 April).

Second Wave Slopes versus First Wave Slopes
Second wave slopes are lower than first wave slopes for 20 among 26 countries (exceptions include Guatemala, Kazakhstan, Lithuania, Philippines, Portugal, and Singapore), a statistically significant majority (two tailed sign test, p = 0.0047). The mean second wave slope decreases by 43% as compared with the first wave slope.
Second wave slopes (open triangles, Figure 4) increase with temperature (r = 0.537, two tailed p = 0.00321). Unknown mechanisms enable second wave viral population spread at high temperatures. Earliest second wave occurrences at high elevations (Armenia, Algeria) may not be circumstantial. High UV regimes, increasing mutation rates, could occasionally favor selection of temperature-adapted viruses.

Time Since Start of First Wave for Low Slopes
For some countries, first wave slopes are closer to the regression line defined by second wave slopes than to the regression line defined by the first wave slopes. These countries are indicated in Figure 4 by filled triangles (second wave onset date before country): 9/3 Guatemala, 13/3 Portugal, 18/3 Armenia, 21/3 Lithuania, North Macedonia, 23/3 Malta, 25/3 Oman, 26/3 Afghanistan, Kazakhstan, 27/3 Cuba, 29/3 Peru, 30/3 Bolivia, Kyrgyzstan, 4/4 Rwanda, 10/4 El Salvador, 2/5 Tajikistan. On 31 May, the mean time since first wave onset in these countries was 65.19 days, significantly less than 76.32 days since first wave onset in remaining countries that fit best the negative trend (two tailed t-test, p = 0.0228). Hence, first wave viral population dynamics evolved with low spread in the latter.

Slopes and Times Since Start of First and Second Waves
Time since first wave onset increases with spread slope (r = 0.4968, p = 0.00018, two tailed test, Figure 5A). Outliers with high slopes despite recent start associate with high elevation; outliers with low slope despite early first wave have developed marine commerce. Time since first wave start could be a proxy of temperature, as early first waves occurred in February vs. late ones that occurred in April. Seasonal temperature might decrease slopes at their start for countries with a late first wave.
However, mean annual temperature across countries does not correlate with the time since first wave onset. Hence, the effect in Figure 5A seems independent of mean temperature.
This contrasts with patterns for the second wave, where time since second wave onset correlates negatively with second wave slope (r = −0.5649, p = 0.0026, two tailed test, Figure 5B). Hence, second wave viral populations could increase their spread over time, possibly implying adaptation.

Elevation and Population Density
Mean country elevation correlates negatively with time since first wave onset (r = −0.6095, p = 0.0000016, two tailed test). This suggests that the pandemic reached more elevated and possibly isolated countries later. Low elevation also associates with ports and probable spread via marine commerce.
Notably, at this point, no pandemic property (Table 1) correlates with population density. One would have expected that slopes increase with population densities, but this is not the case (first wave, r = −0.1779 and p = 0.2068; second wave, r = 0.0128 and p = 0.94845, two tailed tests). It seems that most COVID-19 cases are in dense urban centers. These densities could vary among different cities, but mean country density does not reflect this. New York City and Singapore could have similar urban densities, but population densities for their respective countries vary due to size differences in surrounding low population areas. Hence, no correlation could be observed using our simple method.

Median Age and Spread Rates
First wave virus strains mainly hit the elderly. Hence, the positive correlation between slope and median population age in Figure 6A (r = 0.414, one tailed p = 0.0011) fits the expected higher contagiousness in ageing populations. Note outliers as indicated in Figure 6A. For the second wave, the opposite association occurs, i.e., slopes are the highest for countries with low median age (r = −0.418, two tailed p = 0.0023, Figure 6B). This new information is crucial for future management of the pandemic. Second wave viruses apparently adapted to infect the larger reservoir of potential younger hosts, in addition to adapting to spread at higher temperatures.
Data gathered until mid-June find second waves in additional countries. For countries with median ages above 36 years (Bulgaria, Japan, Moldova, Serbia, Sweden, and Ukraine), the trend for these second wave slopes fits that observed for the first wave slopes as a function of median age in Figure 6A. This indicates that in these countries with late second wave onset and high median age, viruses only adapted to seasonal temperature increase, but not to the relatively few young in these populations.
In some countries, the second wave could be an artefact due to sudden policy changes such as increasing daily test numbers, which increase numbers of new reported cases, but do not reflect any epidemiological change. Other second wave slopes estimated after 31 May fit the trend in Figure 6B. This is the case for Bahrain, the Democratic Republic of Congo, Ghana, Guatemala, Iran, Israel, and Jordan. These patterns could be explained by policy differences between countries. Our interpretation remains biological and suggests that viruses evolve in relation to host populations and climatic conditions because country-specific sampling artefacts are unlikely to produce overall patterns across countries such as those in Figures 5 and 6.
Figure caption (partial): For second wave slopes, the figure plots the residual values after adjusting second wave slopes for time since second wave start (regression in Figure 4B), which corresponds to the main correlate of second wave slopes. Data are from Table 1.
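The residual adjustment mentioned in the figure caption (second wave slopes adjusted for time since second wave start) can be sketched as follows. This is a minimal illustration assuming a simple linear least-squares regression of slope on time since onset; the function names and the five data points are hypothetical and are not values from Table 1.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys):
    """Observed y minus the value predicted by the fitted line."""
    slope, intercept = linear_fit(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical values: days since second wave onset and second wave slopes.
days_since_onset = [10, 20, 30, 40, 50]
second_wave_slopes = [0.09, 0.07, 0.06, 0.03, 0.02]
res = residuals(days_since_onset, second_wave_slopes)
```

The residuals sum to zero and are uncorrelated with time since onset by construction, so any remaining association with another variable (here, median age) is not attributable to a linear effect of time since onset.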

Eyeballing versus Statistical Evaluation of Second Wave Onset Date
Second wave detection by visual examinations (eyeballing) has an arbitrary component. However, statistical methods mimicking the process underlying visual detections, which assume the onset of a new wave could occur any day, are biased towards false positive detections [14,15]. In addition, the size of the moving window used in these methods is also arbitrary, which can lead to false negatives if the exponential increase period of the new wave is much shorter than the window. Analyses accounting for different window sizes in different countries are unrealistic and beyond the scope of our analyses. Figure 7 presents the visual and statistical estimation of the onset date of the second wave for Sweden. Visual estimation considers the minimum preceding a clear increase pattern in daily new confirmed cases, indicated by a triangle. This is the first of June (1/6), one day after we stopped our initial sampling of second waves. Figure 7B plots the Pearson correlation coefficient r calculated between the number of days since February 15 and the daily number of new confirmed cases for Sweden, the very data from Figure 7A. Calculations are done for a moving window of 20 consecutive days, and r is plotted as a function of the first day included in that moving window. Figure 7B shows that the highest r values are for the first wave, at the beginning of the epidemic in Sweden. Then r values decrease and increase again towards a second local maximum of r = 0.55, on 1 June.
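The moving-window calculation described for Figure 7B can be sketched in Python as below. This is a minimal illustration under stated assumptions: the window of 20 consecutive days comes from the text, while the function names, the handling of constant windows, and the synthetic rise-decline-rise series are hypothetical and are not the Swedish data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # r is undefined for a constant window; treated as no trend
    return sxy / math.sqrt(sxx * syy)

def moving_window_r(daily_new, window=20):
    """r between day index and daily new cases, indexed by window start day."""
    days = list(range(len(daily_new)))
    return [
        pearson_r(days[s:s + window], daily_new[s:s + window])
        for s in range(len(daily_new) - window + 1)
    ]

# Synthetic series: first rise, decline, then a second rise (hypothetical).
series = (
    [10 * d for d in range(30)]
    + [290 - 10 * d for d in range(1, 30)]
    + [10 * d for d in range(1, 30)]
)
r_by_start = moving_window_r(series)
```

In this sketch, a candidate second wave onset would be the window start day at which r reaches a second local maximum after the first wave has declined; as noted in the text, the result depends on the arbitrary window size.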
Patterns from Figure 7A are typical and show how relatively small the differences are between visual and statistical estimations for the onset of the second wave. The first value following second wave onset dates in Table 1 indicates the number of days between the date set by eyeballing and that set by the statistical method. A value of −1 means that eyeballing determined the onset day one day earlier than the presumably more objective method using moving windows. Shifts between dates set by both methods are random. In half of the cases, eyeballing detects earlier, and in the other half later, second waves than the moving window method. Most shifts (77%) are small, between −5 and +5 days.
Hence, eyeballing is not biased as compared with an objective method for detecting second waves. However, eyeballing has the advantage that it does not arbitrarily set a priori the window size for detection of second waves, but rather adjusts its detection between two clear extreme dates during which a more or less monotonic increase in daily new cases occurred. In addition, eyeballing is rapid and simple. More formal statistical methods estimating P values of non-stationarity (meaning a new phenomenon in the time series) based on Monte-Carlo approaches require heavy computational efforts, justified for much messier data, such as climatological data [14,15].

Total Numbers of Tests
An important caveat to our analyses is that for each country, they assume equal sampling effort across the whole period under study. However, numbers of tests vary across periods, hence an increase in the number of confirmed cases could result from an increase in tests, rather than from a second wave. The former is a sampling artefact, whereas the latter is a natural phenomenon.
For that reason, for countries with second waves in Table 1, we used total daily numbers of tests wherever these data were available for the relevant period at [7], and repeated visual examinations for daily percentages of positive tests among all tests done that day. The second value following second wave onset dates in Table 1 indicates differences in numbers of days between onset dates determined by eyeballing percentages vs. numbers of positive cases. Value −1 indicates that eyeballing numbers of positive cases detects an onset date that is earlier by one day than eyeballing percentages of positive cases. A bias exists in these differences, i.e., using numbers of positive cases detects later onset dates in 16 among 19 countries where both approaches produced different dates. This is a statistically significant majority of countries according to a one-tailed sign test (p = 0.0022).
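The one-tailed sign test reported above can be reproduced with an exact binomial calculation, assuming that under the null hypothesis each country is equally likely to show an earlier or a later onset date. The function name is hypothetical; the counts (16 among 19) are those given in the text.

```python
from math import comb

def sign_test_one_tailed(successes, n):
    """Exact one-tailed sign test: P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n

# 16 among 19 countries with later onset dates from case numbers.
p = sign_test_one_tailed(16, 19)  # 1160 / 524288, about 0.0022
```

A two-tailed version of this test would double the tail probability, since the binomial distribution with probability 0.5 is symmetric.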
This suggests that using more information (number of positive cases and total number of tests) enables earlier detection of the phenomenon than when using only numbers of positive cases. In addition, because percentages detect earlier second waves, this means that increases detected according to numbers of positive cases did not produce false positives, but rather false negatives for the period that second waves remained undetected when using numbers rather than percentages. This indicates that increases in testing efforts do not occur independently of onsets of second waves. We suggest that medical experts probably sense very early on a change in the kinetics and increase testing efforts at these periods.

Discussion
Analyses confirm that the spread of first wave COVID-19 decreases with temperature. They indicate that UVs also decrease the spread of first wave COVID-19. Second wave COVID-19 is characterized by a lower spread and by infecting younger age classes. Second wave spread increases with temperature.
This inversion of trends between first and second waves, at an interval of one to two months, is highly peculiar. The possibility that a different virus was cryptic and minor during the first wave and became dominant as conditions changed during the second wave cannot be excluded. However, trends with time and mutation numbers suggest that a specific virus evolved from one state to another. The earliest second waves, in high elevation countries, suggest UV-induced high mutation rates hastened adaptation. Adaptations could independently arise in different virus populations [18].
Alternative explanations relate to human behaviors and policies. Negative trends of spread with temperature in winter and positive ones in spring could reflect tendencies to stay inside in winter, and during warmer weather. Trends with population age could also be explained by seasonal differences in behaviors of different age groups. However, this would imply different complex explanations for each of the three independent pattern inversions described here, with temperature, population age, and time since wave onset. Non-random mutations could channel changes of viral RNA between two local structural optima, as described for COVID-19 in [19], one putatively adapted to cold and one to warm weather. This is more parsimonious than assuming different explanations for each of the three correlations reported here. This mutation hypothesis is also in line with observations that the earliest second waves usually occur at high altitudes where UVs could increase mutation rates.
This does not exclude the possibility of combined effects of mutations and behavioral changes in human populations (more alerted authorities and public adapting their behaviors), as well as fewer susceptible hosts (most infection-prone individuals have already been infected). However, the most parsimonious explanation is also likely to be the main factor in the case of a combined factor scenario.
Note that analyses determined clear patterns in relation to various cofactors of the pandemic, despite uncertainties in data. For example, the ratio of reported to unreported cases [20] apparently varies hugely between countries depending on their mode of counting and public health policy, rendering predictions for the future of the pandemic highly uncertain. A striking major point is unexplained and open to optimistic interpretations from a health-oriented point of view. Usually, strains with high spread replace those with low spread. However, low spread second wave viruses replace fast spread first wave viruses, in an increasing number of countries. This could suggest that third wave spread could further decrease. Another counterintuitive point is that for the sample of examined countries, viral spread does not increase with population density. Hence, accepted knowledge in relation to epidemics seems inadequate regarding the current pandemic. Consequently, prognostics and interpretations of observed patterns, whether pessimistic or optimistic, cannot be trusted, as these are based on previous knowledge contradicting the current fast-to-slow spread evolution of COVID-19.

Conclusions
Current analyses suggest that a third wave will occur with a possibly lower spread than for the first and second waves. A study is currently in progress to examine its characteristics, in particular the correlations with the geo-climatic and demographic variables highlighted in [1] and in this article.