# A New Algorithm for Identifying Possible Epidemic Sources with Application to the German Escherichia coli Outbreak


## Abstract


## 1. Introduction

- A new point, the TWC Alfa, and a new scalar field, the TWC Alfa Map: these two entities constitute an estimation of the outbreak of the assigned epidemic;
- A scalar field, the TWC Beta, whose goal is to show the possible diffusion map of the epidemic;
- Another scalar field, the TWC Gamma, and other mathematical entities show an estimation of the future diffusion of the epidemic.

- Case 1: The Chikungunya fever epidemic of 2007;
- Case 2: The Foot and mouth disease epidemic of 1967 in Great Britain;
- Case 3: The Golden Square cholera epidemic of 1854 in London;
- Case 4: The Russian influenza in Sweden in 1889–1890;

- The Distance of the peak of the map from the target (outbreak);
- The Sensitivity of the target location on the map;
- The Specificity of the target location on the map;
- The Percent of the searching area proposed by any algorithm.

## 2. The Topological Weighted Centroid Algorithm

#### 2.1. Some Mathematical Details about TWC-α Method

$Px_r$ and $Py_r$ are the x and y coordinates of the assigned point.

$Cx_r$ and $Cy_r$ are the x and y coordinates of the point in the plane, and […] is the sum of the squares of the distances of point $C_r$ from the assigned points $P_i$.

$d_{i,j}$, $d_{i,k}$ = Euclidean distance between two of the assigned points.

[…] TWC ($\alpha_n$), as $n \to \infty$, into a non-trivial attractor. With Equation (5a), for each distance ($d_{i,j}$) we take into account the average of the other distances ($d_{i,k}$, with k ≠ i and k ≠ j). In fact, without Equation (5a), the convergence point of TWC ($\alpha_n$), as $n \to \infty$, corresponds to the mean of the two points having the minimal Euclidean distance (see the proof in Appendix A), while with Equation (5a) the final attractor is the point in space whose average distance from the other points is minimal. This point need not be unique because the matrix of the distances generated by Equation (5a) is not symmetric (see Appendix B).

- Initialize $\alpha_{(0)} = 0$ at the first cycle; all the components of the vector **w**($\alpha_n$) at this point will be equal to 1 and the TWC ($\alpha_n$) will have the same coordinates as the center of mass.
- At the next cycle, increase α by a small positive quantity.
- Equations (7) and (8) will show an entropy reduction and an increase in the free energy (see Appendix C), and then the TWC ($\alpha_n$) will move in a specific direction of the plane (Equation (4)).
- When the free energy (Equation (7)) attains the global maximum, the process terminates at $\alpha^* = \alpha_n$.
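The iteration above can be sketched in a few lines of Python. Equations (4) to (8) are not reproduced in this section, so the sketch assumes a Boltzmann-like weight $w_i(\alpha) = \exp(-\alpha \bar{d}_i)$, where $\bar{d}_i$ is the mean Euclidean distance of point i from the other points. The weight form, the function names and the step size are illustrative assumptions, not the authors' definitions, and the free-energy stopping rule is left out. The sketch only illustrates the two limiting behaviors stated in the text: at α = 0 every weight equals 1 and the weighted centroid coincides with the center of mass, and for large α the centroid approaches the point whose average distance from the others is minimal.

```python
import math

def mean_distances(points):
    """Mean Euclidean distance of each point from all the other points."""
    n = len(points)
    return [
        sum(math.dist(p, q) for j, q in enumerate(points) if j != i) / (n - 1)
        for i, p in enumerate(points)
    ]

def weighted_centroid(points, alpha):
    """Weighted centroid with ASSUMED weights w_i = exp(-alpha * mean_dist_i)."""
    d = mean_distances(points)
    d_min = min(d)
    # Subtracting d_min rescales every weight by the same factor, so the
    # centroid is unchanged, but it avoids underflow for large alpha.
    w = [math.exp(-alpha * (di - d_min)) for di in d]
    total = sum(w)
    x = sum(wi * p[0] for wi, p in zip(w, points)) / total
    y = sum(wi * p[1] for wi, p in zip(w, points)) / total
    return (x, y)

def trajectory(points, alpha_step=0.05, n_steps=200):
    """Centroid positions for alpha = 0, alpha_step, 2 * alpha_step, ..."""
    return [weighted_centroid(points, k * alpha_step) for k in range(n_steps + 1)]

# Hypothetical toy data: four clustered points and one outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
path = trajectory(points)
start, end = path[0], path[-1]
print("alpha = 0 :", start)   # the center of mass
print("alpha = 10:", end)     # pulled toward the most central point
```

In this toy configuration the outlier drags the center of mass away from the cluster, while increasing α moves the centroid back toward the cluster point with the smallest mean distance.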

The TWC ($\alpha_n$) evolution is also very informative and can be retained for at least two reasons:

- All the TWC ($\alpha_n$) points represent the best path with which to reach the maximum of the free energy of the weighted mean of the assigned points, starting from the center of mass. This path is usually a nonlinear and non-monotonic curve.
- The set of points belonging to the TWC ($\alpha_n$) trajectory can be used to transform the plane into a scalar field, TWSF ($\alpha_n$), where the proximity of each geometrical point to this trajectory can be measured.
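As a rough illustration of turning a trajectory into a scalar field, the sketch below scores every cell of a small grid by its proximity to the nearest trajectory point. The exponential decay, the `scale` parameter and the toy trajectory are assumptions made for the example; the paper's own TWSF definition is not reproduced in this section.

```python
import math

def proximity_field(trajectory, width, height, scale=10.0):
    """Return field[row][col] in (0, 1]; 1 means the cell lies on the trajectory."""
    field = []
    for row in range(height):
        line = []
        for col in range(width):
            # Distance from this grid cell to the closest trajectory point.
            d_min = min(math.dist((col, row), p) for p in trajectory)
            line.append(math.exp(-d_min / scale))  # ASSUMED decay function
        field.append(line)
    return field

# Hypothetical trajectory given as (x, y) grid coordinates.
toy_trajectory = [(10, 10), (12, 11), (15, 13), (19, 14)]
field = proximity_field(toy_trajectory, width=30, height=30)
print(field[10][10], field[0][0])
```

Any monotonically decreasing function of distance would serve the same illustrative purpose; the exponential is chosen only because it keeps the values in (0, 1].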

TWC ($\alpha^*$) represents, therefore, the point at which the weighted mean of the assigned points attains the maximum free energy. In many applications this remarkable point can represent, or point out, the source of the process because it is also the point where the entropy is minimal, so it is the point from which (if you were to put yourself there) the other points generate maximum information; in other words, this is the point of negentropy. On the other hand, the center of mass is the point where the entropy is maximum, so it is also the point from which (if you were to put yourself there) the distribution of assigned points is least informative. We can also find $\alpha^*$ by using Newton’s method or the fixed point algorithm (see Appendix D). The conditions associated with Newton’s method or the fixed point algorithm indicate existence and uniqueness, that is, convergence to a unique solution.

#### 2.2. Details of TWC-β Method

[…] $\beta^*$ for which the point TWC ($\beta^*$) of the trajectory begins to turn back. This is because $v_i(\beta^*)$ is the vector of weights defined by a specific value of β, where the entropy is the smallest ($\beta^*$), as computed by the following:

[…] $\beta^*$ (see Appendix A, from Equation (A16) to Equation (A22)).

The $\beta^*$ parameter will now be used to define the proximity of each geometrical point (all the grid points that define the space) to the assigned points:

[…]$_k$ has been changed to TWSF ($\beta^*$).

[…] $\beta^*$ as the parameter minimizing the entropy

[…] ($\beta^*$), which is generated from the $\beta^*$ parameter.

#### 2.3. Some Mathematical Details of TWC-γ Method

TWC ($\gamma_i$) analyzes the weighted distances of each of the assigned points from the others. In fact, the TWC ($\gamma_i$) is the set of points connecting the center of mass to each one of the assigned points. Consequently, each one of the assigned points will be described by a vector, **z**, of weights.

[…] TWC ($\gamma_i$) points for each of the assigned points. Therefore, each component of this set of points represents the weighted average of all the points with respect to an increasing value of the γ parameter, in relation to any one of the assigned points. The starting point of each TWC ($\gamma_i$) is located at the center of mass. The last TWC ($\gamma_i$) terminates at the point where, for each of the assigned points, the entropy of the weighted average is minimized according to Equation (29).

[…] TWC ($\gamma_i$):

TWC ($\gamma_i$) defines a set of trajectories whose dynamics is the output of the many-to-many interactions among the distances of all the assigned points. The TWC-γ map is the scalar field, TWSF ($\beta^*(\gamma_i)$), measuring the global proximity of each geometrical point of the two-dimensional space to all these TWC ($\gamma_i$) trajectories.

[…] TWSF ($\beta^*(\gamma_i)$):

[…] TWC ($\gamma_i$), in the discrete space;

[…] $\beta^*$ parameter

[…] TWSF ($\beta^*(\gamma_i)$), which depends on the $\beta^*$ parameter.

#### 2.4. A Short Synthesis of TWC Method

TWC ($\alpha^*$) is a new point of the space from which the other assigned points (input data points of interest) have minimum entropy with maximum free energy. This point has been shown to mark the source of the dynamic process underlying the occurrence of points of interest. This prediction tool has a number of benchmark algorithms described in Section 4. The set of points belonging to the TWC ($\alpha_k$), k = 0, 1, …, trajectory transforms the plane into a scalar field where the proximity of each geometrical point (points on the grid of the map) to this trajectory can be measured. Parameter $\beta^*$ is the critical value of β at which the entropy of the weighted mean of the assigned points is minimal and it is used to define the TWC-β scalar field, TWSF ($\beta^*$). The diffusion probability could be calculated by measuring the intensity of the scalar field. The diffusion probability determines the probability that a new event could occur at a geometrical point of the map. A scalar field in physics is basically used to associate a scalar value (like temperature or electric potential energy) to every point in the space. The gradient (or minus the gradient) of a scalar field is a vector field; for example, the negative gradient of electric potential is the electric field. Therefore the TWSF ($\beta^*$) represents a property of the space which is similar to electric potential. We will interpret these points, trajectories, and scalar fields as possible sources of the disease or indicators as to where the disease will next spread.

TWC ($\gamma_i$) provides information about the weighted distances of each of the assigned points from the others. TWC ($\gamma_i$) is the set of points connecting the center of mass to each one of the assigned points. TWC ($\gamma_i$) can be used to build up a matrix of nonlinear trajectories connecting the points of interest, which may be interpreted as the dynamic movement of the disease outbreak. The TWC ($\beta^*(\gamma_i)$) points are transformed into a map that is the scalar field TWSF ($\beta^*(\gamma_i)$), measuring the global proximity of each geometrical point of the two-dimensional space to all of the TWC ($\gamma_i$) trajectories by using the $\beta^*$ parameter. The point with minimum entropy and maximum free energy may be interpreted, in the context of many applications, as a remarkable point that represents, or can point out, the source of a spreading phenomenon like an epidemic outbreak. The TWC-β and TWC-γ sets of points do not yet have any benchmarking algorithm.

## 3. Four Epidemics Already Known

#### 3.1. The Chikungunya Fever Epidemic of 2007

#### 3.2. The Foot and Mouth Disease Epidemic of 1967

#### 3.3. The Golden Square Cholera Epidemic of 1854

#### 3.4. The Russian Influenza in Sweden in 1889–1890

## 4. The Algorithms Used for Comparison with TWC-α

[…] $D^N$ and the shape of the decay distance function, $F(D^N)$, and the sum as a composition function, S, without specific assumptions about other factors. Following this approach, the anchor point, $Y^*$, is located in a region with a high “Hit Score” [14].
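A generic distance-decay hit score of this kind can be sketched as follows: for every grid cell, a decay function of the distance to each event is evaluated and the results are summed. The negative exponential decay, the `sigma` parameter, the function names and the toy events are assumptions chosen for illustration; they are not the specific formulas of the algorithms compared in this section.

```python
import math

def negative_exponential(d, sigma=5.0):
    """ASSUMED decay function F: the score falls off exponentially with distance."""
    return math.exp(-d / sigma)

def hit_score_map(events, width, height, decay):
    """scores[y][x] = sum over events of decay(distance from cell (x, y) to event)."""
    return [
        [sum(decay(math.dist((x, y), e)) for e in events) for x in range(width)]
        for y in range(height)
    ]

# Hypothetical events as (x, y) grid coordinates: a small cluster plus one outlier.
events = [(5, 5), (6, 5), (5, 6), (15, 15)]
scores = hit_score_map(events, 20, 20, negative_exponential)

# The cell with the highest score is the candidate anchor point.
peak = max(
    ((x, y) for y in range(20) for x in range(20)),
    key=lambda xy: scores[xy[1]][xy[0]],
)
print("peak of the hit score map:", peak)
```

With summation as the composition function, the peak falls inside the densest cluster of events rather than near the isolated one.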

#### 4.1. The Rossmo Algorithm

#### 4.2. The Negative Exponential Summation Algorithm (NES)

#### 4.3. The Likelihood Variance Maximization Algorithm (LVM)

σ, $\sigma^*$ = the width (and the optimal width) of the bell of the decay function.

#### 4.4. The Mexican Probability Algorithm (Mex Prob)

σ, $\sigma^*$ = the width (and the optimal width) of the bell of the decay function.

## 5. Results

#### 5.1. The Results of the Comparison of the Four Algorithms with TWC

[…] TWC ($\alpha^*$) and especially the TWSF ($\alpha_n$) have been considered in this comparison, because they are very useful for estimating the outbreak of a point distribution.

- The **distance** from the peak of each algorithm to the real outbreak is a relative distance: it is expressed as a percentage of the main diagonal of the window grid generated by the software. For each data distribution (dataset) our software draws a grid map of 600 × 600 pixels, where all the points are embedded in a sub-window of 500 × 500 pixels.
- The **sensitivity** is defined as the value of the scalar field of each algorithm (each pixel value of the scalar field generated by each algorithm is scaled between 0 and 1) at the place where the real outbreak is located. The **specificity** is defined as the percentage of points of the whole window whose value is smaller than the sensitivity of each algorithm.
- The **search area** in which the real outbreak can be found in the scalar field of each algorithm is defined as follows: we divide the scalar field generated by each algorithm into 20 bins of equal length and then calculate the extent of the area in which the real outbreak is included. Finally we express the value of this bin area in relation to the area of the global window.
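The four indices can be sketched on a toy scalar field as shown below. This is one possible reading of the definitions above, not the authors' software: the function name is hypothetical, the specificity is taken as the share of cells scoring below the target cell, and the search area is taken as the share of cells falling in the target's bin or a higher one.

```python
import math

def evaluate_map(field, target, n_bins=20):
    """field: 2-D list of scores; target: (row, col) of the known source."""
    rows, cols = len(field), len(field[0])
    lo = min(min(row) for row in field)
    hi = max(max(row) for row in field)
    span = (hi - lo) or 1.0
    norm = [[(v - lo) / span for v in row] for row in field]  # scale to [0, 1]
    cells = [(r, c) for r in range(rows) for c in range(cols)]

    # Distance from the peak to the target, relative to the grid diagonal.
    peak = max(cells, key=lambda rc: norm[rc[0]][rc[1]])
    diagonal = math.hypot(rows - 1, cols - 1)
    distance_pct = 100.0 * math.dist(peak, target) / diagonal

    # Sensitivity: normalized field value at the target cell.
    sensitivity = norm[target[0]][target[1]]

    # Specificity (ASSUMED reading): share of cells scoring below the target.
    below = sum(norm[r][c] < sensitivity for r, c in cells)
    specificity_pct = 100.0 * below / len(cells)

    # Search area (ASSUMED reading): share of cells in the target's bin or above.
    def bin_of(v):
        return min(int(v * n_bins), n_bins - 1)

    target_bin = bin_of(sensitivity)
    inside = sum(bin_of(norm[r][c]) >= target_bin for r, c in cells)
    search_area_pct = 100.0 * inside / len(cells)

    return {
        "distance_pct": distance_pct,
        "sensitivity": sensitivity,
        "specificity_pct": specificity_pct,
        "search_area_pct": search_area_pct,
    }

# Toy 5 x 5 field peaking at the center, with the source placed on the peak.
toy = [[10 - (abs(r - 2) + abs(c - 2)) for c in range(5)] for r in range(5)]
result = evaluate_map(toy, target=(2, 2))
print(result)
```

Under this reading, specificity and search area sum to roughly 100% whenever the target falls in a bin with few ties, which matches the pattern in most rows of the result tables.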

**Foot and Mouth Disease**

| Algorithm | Distance from Outbreak | Sensitivity | Specificity | Search Area | Rank |
|---|---|---|---|---|---|
| TWC Alfa | 0.7400% | 94.6000% | 99.9825% | 0.0175% | 1 |
| Rossmo | 6.1400% | 90.6200% | 99.7500% | 0.2950% | 2 |
| NES | 6.1400% | 75.5700% | 99.8650% | 0.3700% | 3 |
| LVM | 6.1400% | 87.7400% | 99.7600% | 0.3775% | 4 |
| Mex Prob | 6.0000% | 83.3000% | 99.7800% | 0.5775% | 5 |

**Chikungunya Fever**

| Algorithm | Distance from Outbreak | Sensitivity | Specificity | Search Area | Rank |
|---|---|---|---|---|---|
| TWC Alfa | 0.0000% | 100.0000% | 99.9650% | 0.0000% | 1 |
| Rossmo | 0.7300% | 97.6600% | 99.9800% | 0.0200% | 2 |
| NES | 1.0300% | 96.2100% | 99.9725% | 0.0275% | 3 |
| LVM | 0.7300% | 98.7900% | 99.8875% | 0.1125% | 4 |
| Mex Prob | 2.9100% | 97.0600% | 99.7500% | 0.2500% | 5 |

**London Cholera**

| Algorithm | Distance from Outbreak | Sensitivity | Specificity | Search Area | Rank |
|---|---|---|---|---|---|
| TWC Alfa | 4.4900% | 95.3100% | 99.5375% | 0.0025% | 1 |
| Mex Prob | 5.1600% | 93.7800% | 99.3625% | 0.4775% | 2 |
| LVM | 5.1600% | 96.0600% | 99.2900% | 0.7100% | 3 |
| NES | 4.9600% | 86.5100% | 99.6575% | 0.7350% | 4 |
| Rossmo | 5.1600% | 97.3900% | 98.6800% | 1.3200% | 5 |

**Russian Influenza**

| Algorithm | Distance from Outbreak | Sensitivity | Specificity | Search Area | Rank |
|---|---|---|---|---|---|
| NES | 3.9100% | 85.1200% | 99.8800% | 0.2475% | 1 |
| TWC Alfa | 3.1600% | 67.8600% | 99.8500% | 0.4550% | 2 |
| Mex Prob | 6.3200% | 86.5800% | 99.4050% | 1.4900% | 3 |
| LVM | 6.5500% | 88.1100% | 99.0975% | 1.6525% | 4 |
| Rossmo | 19.9300% | 92.5400% | 94.4700% | 3.1150% | 5 |

| Algorithm | Foot and Mouth | Chikungunya | London Cholera | Russian Influenza | Rank Average |
|---|---|---|---|---|---|
| TWC Alfa | 1 | 1 | 1 | 2 | 1.25 |
| NES | 3 | 3 | 4 | 1 | 2.75 |
| Rossmo | 2 | 2 | 5 | 5 | 3.50 |
| Mex Prob | 5 | 5 | 2 | 3 | 3.75 |
| LVM | 4 | 4 | 3 | 4 | 3.75 |

[…] “**searching zone**” and does not have an arbitrary link to the “binning strategy” chosen by the researchers. The searching area, in fact, may take different sizes in relation to the binning segmentation of the scalar field.

- All the algorithms performed fairly well in each of the five tests (these four cases plus E. coli). This suggests that their foundation is robust and solid;
- The TWC (α) results show this method to be more effective than the other algorithms (in most of the cases its searching area is one order of magnitude smaller than the searching area of the other algorithms);
- It is evident that more than one index must be combined to evaluate the performance of any algorithm dedicated to geographic profiling. In this field the methodological research remains open and we hope to offer a contribution in the near future;
- Only the TWC (α) was tested in this comparison; the other quantities generated by the TWC algorithm, TWC (β) and TWC ($\gamma_i$), present different types of key information about the dynamics of the process that none of the existing algorithms can currently claim.

#### 5.2. The New Information about the Virtual Dynamics of the Process

[…] TWSF ($\beta^*(\alpha_n)$);

[…] TWSF ($\beta^*$);

TWC ($\gamma_i$), instead, provides an estimation of the epidemic diffusion when the different locations (points) start to communicate with each other. This dynamic information may be interpreted as the dynamic movement of the epidemic. We have named this scalar field TWSF ($\beta^*(\gamma_i)$);

- a. TWSF ($\beta^*$): the present (the time of data collection);
- b. TWSF ($\beta^*(\alpha_n)$): the recent past (the beginning of the process);
- c. TWSF ($\beta^*(\gamma_i)$): the near future (the next step of the process).

- d. TWSF ($\beta^*$) = $t_0$;
- e. TWSF ($\beta^*(\alpha_n)$) = $t_0 - \Delta x_1$;
- f. TWSF ($\beta^*(\gamma_i)$) = $t_0 + \Delta x_2$.

[…] $\Delta x_1$ and $\Delta x_2$, but we hypothesize their logical implication:

[…] **intensity** (the area size beyond 95% of the scalar field) of each one of the four analyzed epidemics, according to the three TWC scalar fields: α, β and γ.

| p > 0.95 | Foot and Mouth | Chikungunya | London Cholera | Russian Influenza |
|---|---|---|---|---|
| TWC α | 0.0150% | 0.225% | 0.3050% | 0.0125% |
| TWC β | 0.0950% | 0.1800% | 0.1150% | 0.0525% |
| TWC γ | 0.0900% | 0.1875% | 0.0775% | 0.0200% |

**Figure 1.** Areas of p > 0.95 according to the Topological Weighted Centroid (TWC) α, β and γ, in Foot and Mouth Disease.

- Data about Foot and Mouth disease were collected when the epidemic was in its first week, before reaching its peak. Despite this, α is small and β is big, as if the epidemic had already reached its peak and as if the hot area of its diffusion would remain stable in the next step (γ is high). See Figure 5(a,b). This discrepancy can be explained by the fact that, unlike the other three epidemics, FMD evolution and spread depend basically on wind action, since there is no direct contact between animals living on distinct farms. Wind therefore represents another unstable variable which is probably not taken into account by our algorithm.
- Data about Chikungunya fever were collected in the main phase of outbreak development. This corresponds quite well to the algorithm solution (α is small and β is big). The estimation is a further increase of the hot area in the next step (γ is bigger than β) (see Figure 6(a,b)).
- Data about Cholera correspond to the end of the epidemic outbreak. The values of the TWC parameters are consistent with a final state of evolution, since α is huge and β and γ become increasingly smaller (see Figure 7(a,b)).
- The data set of Russian flu basically reflects an early phase of development in quantitative terms (number of cases in each location) but a peak for the epidemic in qualitative terms (number of locations). The values of the TWC parameters correspond to this evolution phase, since β is the biggest. In the next step the “hot” area of fever decreases (see Figure 8(a,b)).

## 6. A Special Case of Epidemic Outbreak: The HUS German Epidemics in May–June 2011

#### 6.1. The German Dataset

| State | HUS Cases and Suspected HUS | Cumulative Incidence (Cases per 100,000 Population) |
|---|---|---|
| Hamburg | 59 | 3.33 |
| Bremen | 11 | 1.66 |
| Schleswig-Holstein | 21 | 0.74 |
| Mecklenburg-Vorpommern | 10 | 0.61 |
| Hesse | 31 | 0.51 |
| Saarland | 5 | 0.49 |
| Lower Saxony | 28 | 0.35 |
| North Rhine-Westphalia | 31 | 0.17 |
| Berlin | 3 | 0.09 |
| Baden-Württemberg | 8 | 0.07 |
| Bavaria | 5 | 0.04 |
| Thuringia | 1 | 0.04 |
| Rhineland-Palatinate | 1 | 0.02 |
| Brandenburg | 0 | 0 |
| Saxony | 0 | 0 |
| Saxony-Anhalt | 0 | 0 |
| TOTAL | 214 | 0.26 |

| ID | State | City used | Why used | Lat | Long | Q |
|---|---|---|---|---|---|---|
| 1 | Hamburg | Hamburg | Exact match | 53°33'55''N | 10°00'05''E | 59 |
| 2 | Bremen | Bremen | Exact match | 53°4'33''N | 8°48'27''E | 11 |
| 3 | Schleswig-Holstein | Kiel | Capital | 54°19'31''N | 10°8'26''E | 21 |
| 4 | Mecklenburg-Vorpommern | Schwerin | Capital | 53°38'0''N | 11°25'0''E | 10 |
| 5 | Hesse | Frankfurt | Largest city | 50°6'37''N | 8°40'56''E | 31 |
| 6 | Saarland | Saarbrücken | Capital | 49°14'0''N | 7°0'0''E | 5 |
| 7 | Lower Saxony | Hanover | Capital | 52°22'N | 9°43'E | 28 |
| 8 | North Rhine-Westphalia | Düsseldorf | Capital | 51°14'N | 6°47'E | 31 |
| 9 | Berlin | Berlin | Exact match | 52°30'2''N | 13°23'56''E | 3 |
| 10 | Baden-Württemberg | Stuttgart | Capital | 48°46'43''N | 9°10'46''E | 8 |
| 11 | Bavaria | München | Capital | 48°31'52''N | 11°57'50''E | 5 |
| 12 | Thuringia | Erfurt | Capital | 50°59'0''N | 11°2'0''E | 1 |
| 13 | Rhineland-Palatinate | Mainz | Capital | 50°0'0''N | 8°16'16''E | 1 |
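Before coordinates like those in the table can be used numerically, they have to be converted from degrees, minutes and seconds to decimal degrees. The sketch below does this for a few rows of the table and then computes a case-weighted mean position. The parser, its regular expression and the choice of weighting by the Q column are illustrative assumptions, not a description of the authors' preprocessing.

```python
import re

# Matches strings such as 53°33'55''N or 52°22'N (seconds are optional).
DMS = re.compile(r"(\d+)°(\d+)'(?:(\d+)'')?([NSEW])")

def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + int(seconds or 0) / 3600
    return -value if hemisphere in "SW" else value

# A few rows copied from the table: (city, latitude, longitude, cases Q).
rows = [
    ("Hamburg", "53°33'55''N", "10°00'05''E", 59),
    ("Frankfurt", "50°6'37''N", "8°40'56''E", 31),
    ("Hanover", "52°22'N", "9°43'E", 28),
    ("Berlin", "52°30'2''N", "13°23'56''E", 3),
]

decimal = {
    city: (dms_to_decimal(lat), dms_to_decimal(lon), q)
    for city, lat, lon, q in rows
}

# Case-weighted mean position of these rows (ASSUMED weighting by Q).
total_q = sum(q for _, _, q in decimal.values())
mean_lat = sum(lat * q for lat, _, q in decimal.values()) / total_q
mean_lon = sum(lon * q for _, lon, q in decimal.values()) / total_q
print(decimal["Frankfurt"], (mean_lat, mean_lon))
```

For Frankfurt the conversion gives roughly 50.110° N and 8.682° E.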

**Figure 9.** (**a**) Latitude and longitude of the first 13 German towns; (**b**) Quantity of suspected cases in the first 13 German towns.

#### 6.2. The TWC-α Method and the Real Outbreak

**Figure 10.** (**a**) TWC ($\alpha^*$) points out the outbreak close to Frankfurt; (**b**) The scalar field of $\alpha_n$, the TWSF ($\alpha_n$).

[…] $\alpha_n$ points generated with the TWC (α) method, while Figure 11 shows the dynamics of decreasing the entropy of the system during this search process.

**Figure 12.** TWC (α), the possible outbreak source, on Google Earth at Long 8.683 and Lat 50.117, and its distance from the Erlenbach stream.

#### 6.3. The TWC-β Map

**Figure 13.** TWC-β scalar field, TWSF ($\beta^*$): the deeper the red, the higher the concentration of epidemics (the deep red zone represents around 3/1,000 of the whole area).

- The highest probability of epidemic diffusion (p > 0.95) covers an area representing 0.32% of the total area of the map (see Figure 14); 66% of this area is around Frankfurt, while 34% is in the Hamburg cluster. The intensity of the scalar field was segmented into 20 levels; the 20th is the area where the intensity is largest, so the probability of new events should be higher there (p > 0.95).

**Figure 14.** Probability of the epidemics in the TWC-β map, in relation to the global areas of the map (20 bins).
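The segmentation into 20 levels can be sketched as a histogram over the normalized field, with each level reported as a share of the total map area. The function name and the toy field are hypothetical; only the choice of 20 equal-width levels comes from the text.

```python
def bin_area_percentages(field, n_bins=20):
    """Share of map area (in %) falling in each of n_bins equal-width levels."""
    values = [v for row in field for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    counts = [0] * n_bins
    for v in values:
        level = min(int((v - lo) / span * n_bins), n_bins - 1)
        counts[level] += 1
    return [100.0 * c / len(values) for c in counts]

# Toy 10 x 10 field whose intensity grows steadily from 0 to 99.
toy_field = [[10 * r + c for c in range(10)] for r in range(10)]
areas = bin_area_percentages(toy_field)
print("area of the 20th level (p > 0.95):", areas[-1], "%")
```

The last entry of the returned list corresponds to the 20th level, that is, the cells whose normalized intensity exceeds 0.95.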

#### 6.4. The TWC-γ Map

**Figure 15.** (**a**) TWC-γ map, where the darker the red, the higher the concentration of epidemics (the darkest red zone represents approximately 1/1,000 of the whole area). (**b**) TWC-γ rebuilt all the possible paths among the 13 locations.

- It shows two independent circuits (cliques). The first includes Hamburg, Schleswig-Holstein and Mecklenburg-Vorpommern, and the second includes Frankfurt, the TWC (α) point, Rhineland-Palatinate and Baden. These two circuits, by means of a feedback loop, should be the main engines of the HUS epidemics, according to the TWC-γ.
- Frankfurt is, in this case, the center of the graph (see Figure 18, the red point).
- Hamburg, Thuringia and Frankfurt are the nodes with maximum “betweenness”.

#### 6.5. Comparison with the Other Algorithms

**German Escherichia coli**

| Algorithm | Distance from Outbreak | Sensitivity | Specificity | Search Area | Rank |
|---|---|---|---|---|---|
| TWC Alfa | 1.0700% | 97.6000% | 99.8675% | 0.0105% | 1 |
| NES | 0.7600% | 93.5000% | 99.9700% | 0.0300% | 2 |
| Rossmo | 0.7600% | 91.3500% | 99.9675% | 0.0350% | 3 |
| LVM | 0.7600% | 93.6100% | 99.9725% | 0.0400% | 4 |
| Mex Prob | 4.4300% | 93.6000% | 99.8475% | 0.3200% | 5 |

| Algorithm | FMD | Chikungunya | London Cholera | Russian Influenza | German HUS | Rank Average |
|---|---|---|---|---|---|---|
| TWC Alfa | 1 | 1 | 1 | 2 | 1 | 1.20 |
| NES | 3 | 3 | 4 | 1 | 2 | 2.60 |
| Rossmo | 2 | 2 | 5 | 5 | 3 | 3.40 |
| LVM | 4 | 4 | 3 | 4 | 4 | 3.80 |
| Mex Prob | 5 | 5 | 2 | 3 | 5 | 4.00 |
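The rank averages in the table are plain arithmetic means of the per-epidemic ranks, which the short check below reproduces. The variable names are illustrative; the rank values are copied from the table.

```python
# Ranks per algorithm, in the order:
# FMD, Chikungunya, London Cholera, Russian Influenza, German HUS.
ranks = {
    "TWC Alfa": [1, 1, 1, 2, 1],
    "NES": [3, 3, 4, 1, 2],
    "Rossmo": [2, 2, 5, 5, 3],
    "LVM": [4, 4, 3, 4, 4],
    "Mex Prob": [5, 5, 2, 3, 5],
}

# Arithmetic mean of the five ranks for each algorithm.
averages = {name: sum(r) / len(r) for name, r in ranks.items()}

# Algorithms ordered from best (lowest average rank) to worst.
ordering = sorted(averages, key=averages.get)

for name in ordering:
    print(f"{name:10s} {averages[name]:.2f}")
```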

#### 6.6. TWC (γ) and German HUS Dynamics

**Figure 19.** German HUS: estimations of the hot areas of diffusion of the epidemic according to TWC α, β and γ.

## 7. Oahu (Hawaii): How to Predict 3 Months before the Intensity of a Food Epidemic

**Oahu 2010: Number of Cases Each Month**

| Month | Number of Cases |
|---|---|
| Jan | 108 |
| Feb | 109 |
| March | 114 |
| April | 98 |
| May | 79 |
| June | 79 |
| July | 93 |
| August | 109 |
| Sept | 92 |
| Oct | 134 |
| Nov | 114 |
| Dec | 108 |

| Time Steps | n = 0 | n = 1 | n = 2 | n = 3 | n = 4 | n = 5 | n = 6 | n = 7 | n = 8 | n = 9 | n = 10 | n = 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Months | Jan | Feb | March | April | May | June | July | August | Sept | Oct | Nov | Dec |
| TWC Beta | 0.1925% | 0.1000% | 0.1575% | 0.1275% | 0.2450% | 0.1600% | 0.1000% | 0.0575% | 0.2525% | 0.0625% | 0.0925% | 0.0575% |
| TWC Gamma | 0.1075% | 0.0500% | 0.1275% | 0.1150% | 0.1350% | 0.3025% | 0.0950% | 0.0450% | 0.0925% | 0.1250% | 0.1100% | 0.0900% |

The Beta and Gamma series are the same for every alignment; only the shift Delta and the resulting linear correlation differ:

| Alignment | Delta | Linear Correlation |
|---|---|---|
| Gamma(n) = Beta(n) | 0 | 0.28 |
| Gamma(n) = Beta(n+1) | 1 | −0.24 |
| Gamma(n) = Beta(n+2) | 2 | −0.22 |
| Gamma(n) = Beta(n+3) | 3 | 0.44 |
| Gamma(n) = Beta(n+4) | 4 | −0.16 |

**Figure 21.** Linear Correlation between the Highest Epidemic Intensity in TWC Beta and TWC Gamma Scalar Fields, With Different Temporal Steps.
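The lagged comparison can be sketched as a Pearson correlation between Gamma(n) and Beta(n + Delta), computed over the months where both series overlap. The truncation to the overlapping months and the function names are assumptions about how the series were aligned; the monthly values are the tabulated ones. Under this assumption the sketch should give values close to the tabulated 0.28 for Delta = 0 and 0.44 for Delta = 3.

```python
import math

# Monthly p > 0.95 areas (in %), January to December, as tabulated.
beta = [0.1925, 0.1000, 0.1575, 0.1275, 0.2450, 0.1600,
        0.1000, 0.0575, 0.2525, 0.0625, 0.0925, 0.0575]
gamma = [0.1075, 0.0500, 0.1275, 0.1150, 0.1350, 0.3025,
         0.0950, 0.0450, 0.0925, 0.1250, 0.1100, 0.0900]

def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(gamma, beta, delta):
    """Correlate Gamma(n) with Beta(n + delta) over the overlapping months."""
    overlap = len(gamma) - delta
    return pearson(gamma[:overlap], beta[delta:])

correlations = {d: lagged_correlation(gamma, beta, d) for d in range(5)}
for d, r in correlations.items():
    print(f"Delta = {d}: r = {r:+.2f}")
```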

## 8. Discussion

TWC ($\alpha^*$) predicted “backwards”, or retrodictively, the source of the epidemic, despite the program (and also the authors) not being told of its nature or location (Hamburg), which was unknown at the time of analysis and was to be located six hundred km from the second source (Frankfurt). It is therefore not irrational to believe that this approach could be applied in real-world situations during future epidemic outbreaks to describe and better understand the spatio-temporal features of infection risk and spread.

## 9. Conclusions

[…] **topological** terms. Consequently, the probabilistic approaches and the gravitational approaches are neither the best solutions nor the only ones.

- To include a new algorithm (based on the TWC philosophy) able to identify a set of possible and **different outbreaks** given one spatial distribution of events.
- To include **a third spatial dimension** in space analysis, and consequently a new metric able to consider the **energy** needed to complete a path (and not only the distance).
- To add to the latitude and the longitude a list of meaningful **qualitative attributes** for each event, and to find a way to collectively process all these features.
- To integrate **the time flow** into the TWC approach in such a way as to explain how the maps change when some attributes of the spatial events change, and which of those attributes could possibly be the cause-effect link between these changes.

## References

1. Buscema, M.; Grossi, E.; Breda, M.; Jefferson, T. Outbreaks source: A new mathematical approach to identify their possible location. Phys. A **2009**, 388, 4736–4762.
2. Buscema, M.; Terzi, S. PST: An evolutionary approach to the problem of multi dimensional scaling. WSEAS Trans. Inform. Sci. Appl. **2006**, 3, 1704–1710.
3. Buscema, M. The West Nile Virus. Presented at the Department of Mathematical and Statistical Sciences, University of Colorado, Denver, CO, USA, 2009.
4. Buscema, M.; Breda, M.; Grossi, E.; Catzola, L.; Sacco, P.L. Semantics of Point Spaces through the Topological Weighted Centroid and Other Mathematical Quantities: Theory & Applications. In Data Mining Applications Using Artificial Adaptive Systems; Tastle, W., Ed.; Springer: New York, NY, USA, 2012.
5. Rossmo, D.K. Geographic Profiling; CRC Press: Boca Raton, FL, USA, 2000.
6. Le Comber, S.C.; Rossmo, D.K.; Hassan, A.N.; Fuller, D.O.; Beier, J.C. Geographic profiling as a novel spatial tool for targeting infectious disease control. Int. J. Health Geogr. **2011**.
7. Stevenson, M.D.; Rossmo, D.K.; Knell, R.J.; Le Comber, S.C. Geographic profiling as a novel spatial tool for targeting the control of invasive species. Ecography **2012**, 35, 704–715.
8. Buscema, M.; Sacco, P.L.; Grossi, E.; Lodwick, W. Spatiotemporal Mining: A Systematic Approach to Discrete Diffusion Models for Time and Space Extrapolation. In Data Mining Applications Using Artificial Adaptive Systems; Tastle, W., Ed.; Springer: New York, NY, USA, 2012.
9. Rezza, G.; Nicoletti, L.; Angelici, R.; Romi, R.; Finarelli, A.C.; Panning, M.; Cordioli, P.; Fortuna, C.; Boros, S.; Solvi, G.; et al. Infection with chikungunya virus in Italy: An outbreak in a temperate region. Lancet **2007**, 370, 1840–1846.
10. Reynolds, L.A.; Tansey, E.M. Foot and Mouth Disease: The 1967 Outbreak and Its Aftermath; Wellcome Trust Centre for the History of Medicine at UCL: London, UK, 2001.
11. Snow, J. Report on the Cholera Outbreak in the Parish of St. James, Westminster during the Autumn of 1854; Churchill: London, UK, 1984; pp. 97–120.
12. Cameron, D.; Jones, I.G. John Snow, the Broad Street pump and modern epidemiology. Int. J. Epidemiol. **1983**, 12, 393–396.
13. Skog, L.; Hauska, H.; Linde, A. The Russian influenza in Sweden in 1889–90: An example of geographic information system analysis. Eurosurveillance **2008**, 13, pii: 19056.
14. Levine, N. CrimeStat III: A Spatial Statistical Program for the Analysis of Crime Incident Locations; NCJ 209264; The National Institute of Justice: Washington, DC, USA, 2004; pp. 10.1–10.2.
15. Brantingham, P.L.; Brantingham, P.J. Environmental Criminology; Waveland Press Inc.: Prospect Heights, IL, USA, 1981.
16. Brantingham, P.L.; Brantingham, P.J. Patterns in Crime; Macmillan: New York, NY, USA, 1984.
17. Rossmo, D.K. Target patterns of serial murderers: A methodological model. Amer. J. Crim. Justice **1993**, 17, 1–21.
18. Canter, D.V.; Larkin, P. The environmental range of serial rapists. J. Environ. Psychol. **1993**, 13, 63–69.
19. Canter, D.; Tagg, S. Distance estimation in cities. Environ. Behav. **1975**, 7, 59–80.
20. Canter, D. Mapping Murder: The Secrets of Geographic Profiling; Virgin Publishing: London, UK, 2007.
21. Canter, D. Modeling the Home Location of Serial Offenders. In Proceedings of the 3rd Annual International Crime Mapping Research Conference, Orlando, FL, USA, 11–14 December 1999.
22. Canter, D.; Coffey, T.; Huntley, M.; Missen, C. Predicting serial killers’ home base using a decision support system. J. Quant. Criminol. **2000**, 16, 457–478.
23. Buscema, M.; Breda, M.; Catzola, G. The Topological Weighted Centroid, and the Semantic of the Physical Space: Theory. In Artificial Adaptive Systems in Medicine; Buscema, M., Grossi, E., Eds.; Bentham: London, UK, 2009; pp. 69–78.
24. Grossi, E.; Buscema, M.; Jefferson, T. The Topological Weighted Centroid, and the Semantic of the Physical Space: Application. In Artificial Adaptive Systems in Medicine; Buscema, M., Grossi, E., Eds.; Bentham: London, UK, 2009; pp. 79–89.
25. O’Leary, M. A New Mathematical Technique for Geographic Profiling. In Proceedings of The NIJ Conference, Washington, DC, USA, 17–19 June 2006.
26. Buscema, M. Pst Cluster, Version 20.1, Semeion Software #34; Semeion: Rome, Italy, 2012.
27. Frank, C.; Faber, M.S.; Askar, M.; Bernard, H.; Fruth, A.; Gilsdorf, A.; Höhle, M.; Karch, H.; Krause, G.; Prager, R.; et al. Large and ongoing outbreak of haemolytic uraemic syndrome, Germany, May 2011. Eurosurveillance **2011**, 16, 2–4.
28. SurvStat, Berlin: Robert Koch Institute. German. Available online: http://www3.rki.de/SurvStat (accessed on 24 May 2011).
29. Buchholz, U.; Bernard, H.; Werber, D.; Bohmer, M.M.; Remschmidt, C.; Wilking, H.; Delere, Y.; an der Herden, M.; Adlhoch, C.; Dreesman, H.; et al. German outbreak of Escherichia coli O104:H4 associated with sprouts. N. Engl. J. Med. **2011**, 365, 1763–1770.
30. Buscema, M.; Sacco, P.L. Auto-Contractive Maps, the H Function, and the Maximally Regular Graph (MRG): A New Methodology for Data Mining. In Applications of Mathematics in Models, Artificial Neural Networks and Arts; Chapter 11; Capecchi, V., Buscema, M., Contucci, P., D’Amore, B., Eds.; Springer Science+Business Media B.V.: London, UK, 2010; pp. 227–275.

## Appendix A

## Appendix B

## Appendix C

[…] $P_n(\alpha)$ is given by Equation (5). Next, we can define the quantity

[…] $K_B T \equiv 1/\alpha$, where $K_B$ is the Boltzmann constant and $\tau \equiv K_B T$ is the thermal energy. Using the above definition, we can study the behavior of the system as the temperature $1/\alpha$ changes from ∞ to 0. At high temperature (α small), the quantities $P_n(\alpha)$ are almost independent of n and the free energy is large and negative. The point is close to the center of mass, i.e., to the point of minimum squared distance from all other points in the system.

[…]$^{S}(\alpha)$. In this “phase”, the value of […] is close to the point of maximum density.

## Appendix D

## Appendix E

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Buscema, M.; Grossi, E.; Bronstein, A.; Lodwick, W.; Asadi-Zeydabadi, M.; Benzi, R.; Newman, F.
A New Algorithm for Identifying Possible Epidemic Sources with Application to the German *Escherichia coli* Outbreak. *ISPRS Int. J. Geo-Inf.* **2013**, *2*, 155-200.
https://doi.org/10.3390/ijgi2010155
