This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

An algorithm that forecasts volcanic activity using an event tree decision-making framework and logistic regression has been developed, characterized, and validated. The suite of empirical models that drives the system was derived from a sparse and geographically diverse dataset composed of source modeling results, volcano monitoring data, and historic information from analog volcanoes. Bootstrapping techniques were applied to the training dataset to allow for the estimation of robust logistic model coefficients. Probabilities generated from the logistic models increase with positive modeling results, escalating seismicity, and rising eruption frequency. Cross validation yielded a series of receiver operating characteristic curves with areas ranging between 0.78 and 0.81, indicating that the algorithm has good forecasting capabilities. Our results suggest that the logistic models are highly transportable and can compete with, and in some cases outperform, non-transportable empirical models trained with site-specific information.

Volcanism combines complex geophysical and geochemical processes that vary in composition, duration, and intensity from location to location and episode to episode. Therefore, successful eruption forecasting will only be achieved through the use of methods that simultaneously weigh empirical experience and real time interpretation of processes [

This research initiative aims to improve and standardize the event tree forecasting process. The method employed here also uses Newhall and Hoblitt’s generic event tree infrastructure. However, we have augmented its decision-making process with a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a geographically diverse dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g., seismic), source modeling results, and historic eruption activity information. The regression uses a generalized linear model routine (GLMR) that assumes a binomial response variable and employs a logit link function. This process allows trends in the relationship between modeling results, monitoring data, historic information, and the known outcome of the events to drive the formulation of the statistical models. It yields a static set of logistic models that weight the contributions of empirical experience and monitoring data relative to one another. These models estimate the probability of a particular event occurring based on the current values of a set of predefined explanatory variables, are highly transportable, and do not need to be redefined for each volcano being monitored. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes in short-term forecasts. In addition, a rigorous cross validation process is used to document the algorithm’s performance, identify optimum detection thresholds, and quantify false alarm rates. The methodology is easily extensible: recalibration can be performed, or new branches added to the decision-making process, with relative ease.
Such a system could aid federal, state, and local emergency management officials in determining the proper response to an impending eruption and can be easily deployed by a National Volcano Early Warning System (NVEWS).

This paper is organized into several sections. It begins with an overview of the volcanic eruption forecasting algorithm, logistic models, and cross validation analysis. The next section provides a detailed discussion of forecasts generated by the algorithm. The final section discusses the conclusions that can be drawn from this work and provides an overview of future research initiatives that may improve the algorithm’s performance.

An event tree is a decision-making framework that estimates the probability of a set of pre-defined events using a tree-like decision process. Each branch or node of the tree represents a set of possible outcomes for a particular event, which increase in specificity with each step. Progression to the next branch occurs when the probability estimate for the preceding node exceeds a pre-defined threshold. Our algorithm is rooted in a general state of unrest and grows to forecast volcanic activity in increasing detail, such as the probability of eruption, intensity, and location.
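The threshold-gated progression described above can be sketched in a few lines of Python. This is only an illustration: the node names follow the intrusion, eruption, and intensity stages discussed in the text, while the threshold values and the function name `walk_event_tree` are placeholders introduced here, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Node order follows the text: unrest -> intrusion -> eruption -> intensity.
# The thresholds are illustrative placeholders, not values from the paper.
NODES: List[Tuple[str, float]] = [
    ("intrusion", 0.9),
    ("eruption", 0.5),
    ("intensity", 0.2),
]


def walk_event_tree(probabilities: Dict[str, float]) -> List[str]:
    """Return the stages passed, stopping at the first node whose
    probability falls below its threshold."""
    passed = ["unrest"]  # the tree is rooted in a general state of unrest
    for name, threshold in NODES:
        # A node is passed when its probability meets or exceeds the threshold.
        if probabilities.get(name, 0.0) < threshold:
            break
        passed.append(name)
    return passed


if __name__ == "__main__":
    print(walk_event_tree({"intrusion": 0.95, "eruption": 0.30, "intensity": 0.90}))
```

Because later nodes are only evaluated after earlier ones pass, a high intensity probability has no effect while the eruption node remains below its threshold.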

Schematic representation of the event tree, where the clone label indicates that the tree structure at that point is identical to that below. The color codes represent the USGS ground-based hazard declarations and are superimposed over their respective event tree branches.

The event tree utilized in this research is shown in

where

• Node 1.

where _{1}

• Node 2.

where

and _{2,N} is a node specific set of

• Node 3.

where

and _{3,N} is a node specific set of

• Node 4.

where

and _{4,N} is a node specific set of

• Node 5.

where

The probability of an event occurring at a particular event tree node is defined as

where ^{th}_{1}) _{2}) _{3}) _{4}) _{1}) _{2}) _{3}).

Raw logistic regression training data. Columns are grouped as episode (Year, Volcano), response variables (VEI, Er, In), independent variables (MM, TNE, TCSM, Days, TEH), and reference (Ref.).

Year | Volcano | VEI | Er | In | MM | TNE | TCSM | Days | TEH | Ref. |
---|---|---|---|---|---|---|---|---|---|---|
1993 | Medicine Lake^{(2,3,4)} | 0 | 0 | 0 | 0 | 115 | 6.0e+21 | 2492 | 1 | [ |
1993 | Makushin^{(3,4)} | 0 | 0 | 1 | 1 | 0 | 0 | 365 | 12 | [ |
1994 | Hengill^{(2,3,4)} | 0 | 0 | 1 | 1 | 63450 | 7.7e+23 | 1607 | 0 | [ |
1995 | Trident^{(2,3,4)} | 0 | 0 | 1 | 1 | 69 | 3.2e+19 | 137 | 13 | [ |
1996 | Lassen Peak^{(2,3,4)} | 0 | 0 | 0 | 0 | 110 | 3.4e+21 | 1460 | 1 | [ |
1996 | Eyjafjallajökull^{(2,3,4)} | 0 | 0 | 1 | 1 | 144 | 5.2e+19 | 114 | 2 | [ |
1996 | Akutan^{(2)} | 0 | 0 | 1 | 1 | 1194 | 7.6e+22 | 32 | 34 | [ |
1996 | Iliamna^{(2,3,4)} | 0 | 0 | 1 | 1 | 1477 | 2.1e+21 | 382 | 2 | [ |
1996 | Peulik^{(2,3,4)} | 0 | 0 | 1 | 1 | 0 | 0 | 365 | 2 | [ |
1997 | Kilauea^{(2,3)} | 1 | 1 | 1 | 0 | 1869 | 1.9e+22 | 20 | 63 | [ |
1998 | Kiska^{(2,3,4)} | 0 | 0 | 0 | 0 | 0 | 0 | 365 | 3 | [ |
1998 | Grimsvötn^{(2,3,4)} | 3 | 1 | 1 | 1 | 31 | 9.4e+20 | 10 | 29 | [ |
1999 | Shishaldin^{(2,3,4)} | 3 | 1 | 1 | 0 | 688 | 9.0e+22 | 42 | 34 | [ |
1999 | Fisher^{(2,3,4)} | 0 | 0 | 0 | 1 | 0 | 0 | 365 | 1 | [ |
2000 | Katla^{(2,3,4)} | 0 | 0 | 1 | 1 | 12460 | 1.2e+23 | 2190 | 4 | [ |
2000 | Kilauea^{(2,3,4)} | 0 | 0 | 0 | 0 | 48 | 5.8e+20 | 13 | 63 | [ |
2000 | Three Sisters^{(2,3,4)} | 0 | 0 | 1 | 1 | 0 | 0 | 1460 | 1 | [ |
2000 | Hekla^{(2,3,4)} | 3 | 1 | 1 | 1 | 196 | 3.4e+20 | 15 | 9 | [ |
2000 | Eyjafjallajökull^{(2,3,4)} | 0 | 0 | 1 | 1 | 170 | 4.1e+20 | 365 | 2 | [ |
2001 | Etna^{(2,3,4)} | 2 | 1 | 1 | 1 | 414 | 1.0e+23 | 28 | 115 | [ |
2001 | Okmok^{(4)} | 0 | 0 | 0 | 1 | 19 | 8.1e+21 | 2 | 16 | [ |
2001 | Aniakchak^{(2,3,4)} | 0 | 0 | 0 | 1 | 13 | 5.8e+20 | 64 | 1 | [ |
2002 | Hood^{(2,3,4)} | 0 | 0 | 0 | 0 | 86 | 7.8e+22 | 60 | 2 | [ |
2002 | Etna^{(2,3,4)} | 3 | 1 | 1 | 1 | 353 | 2.1e+23 | 94 | 115 | [ |
2003 | Veniaminof^{(2)} | 2 | 1 | 1 | 1 | 103 | 6.2e+20 | 1050 | 22 | [ |
2004 | Grimsvötn^{(2,3,4)} | 3 | 1 | 0 | 0 | 920 | 6.3e+21.9 | 490 | 29 | [ |
2004 | Spurr^{(3,4)} | 0 | 0 | 0 | 0 | 2743 | 5.1e+20 | 239 | 2 | [ |
2004 | Etna^{(2,3,4)} | 1 | 1 | 1 | 1 | 156 | 4.8e+21 | 186 | 115 | [ |
2004 | Saint Helens^{(2,3,4)} | 2 | 1 | 1 | 1 | 1094 | 1.5e+23 | 21 | 14 | [ |
2005 | Augustine^{(2,3,4)} | 3 | 1 | 1 | 1 | 2007 | 3.1e+20 | 80 | 9 | [ |
2006 | Korovin^{(2,3,4)} | 1 | 1 | 0 | 1 | 377 | 1.4e+21 | 329 | 7 | [ |
2007 | Pavlof^{(2)} | 2 | 1 | 1 | 0 | 2 | 8.8e+18 | 30 | 39 | [ |
2007 | Upptyppingar^{(2,3,4)} | 0 | 0 | 1 | 0 | 3124 | 6.5e+20 | 133 | 0 | [ |
2008 | Yellowstone^{(2,3)} | 0 | 0 | 1 | 1 | 2594 | 6.9e+22 | 49 | 0 | [ |
2008 | Paricutin^{(2,3,4)} | 0 | 0 | 0 | 0 | 0 | 0 | 21900 | 1 | [ |
2008 | Hengill | 0 | 0 | 0 | 0 | 3309 | 1.1e+24 | 10 | 0 | [ |
2008 | Okmok^{(2,3,4)} | 4 | 1 | 1 | 1 | 464 | 4.9e+21 | 100 | 16 | [ |
2008 | Kasatochi^{(2,3,4)} | 4 | 1 | 1 | 0 | 1489 | 7.4e+24 | 22 | 1 | [ |
2009 | Redoubt^{(2,3,4)} | 3 | 1 | 1 | 1 | 4219 | 3.9e+21 | 365 | 6 | [ |
2010 | Eyjafjallajökull^{(2,3,4)} | 4 | 1 | 1 | 1 | 4019 | 1.1e+22 | 100 | 2 | [ |

The event tree was adapted to issue warnings according to the United States Geological Survey (USGS) ground-based volcanic hazard system. Our system assigns colors ranging from green to red to a particular volcano according to its current eruption hazard condition, where green = normal, yellow = advisory, orange = watch, and red = warning.

Logistic regression is a statistical modeling method used to analyze multivariate problems [

The logistic function,

where

where _{0}_{m}_{m}_{0}_{m}

Using this technique, we derived a suite of logistic models to compute the prior probability for the ^{th}

where _{n}^{th}_{n}

The goodness-of-fit (G) of a logistic model is often assessed using a likelihood ratio test [_{n})

where _{null}_{full}^{2}
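For reference, the standard textbook forms of the logistic model and the likelihood ratio statistic can be sketched as follows. The symbols and function names (`beta`, `probability`, `likelihood_ratio_G`) are assumptions introduced here for illustration; the sketch uses the conventional definitions, P = 1 / (1 + e^{-z}) with z = β₀ + Σ βₘ xₘ and G = −2 ln(L_null / L_full), and is not taken from the authors' code.

```python
import math
from typing import Sequence


def logistic(z: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probability(beta: Sequence[float], x: Sequence[float]) -> float:
    """P(event | x), with beta[0] the intercept and beta[1:] the slopes."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return logistic(z)


def log_likelihood(beta: Sequence[float],
                   X: Sequence[Sequence[float]],
                   y: Sequence[int]) -> float:
    """Bernoulli log-likelihood of the binary outcomes y under the model."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = probability(beta, xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def likelihood_ratio_G(ll_null: float, ll_full: float) -> float:
    """G = -2 ln(L_null / L_full), expressed with log-likelihoods."""
    return -2.0 * (ll_null - ll_full)
```

Under the usual assumptions, G is compared against a chi-squared distribution whose degrees of freedom equal the number of explanatory variables added to the null (intercept-only) model.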

A database composed of monitoring data, source modeling results, and historic eruption information from a series of volcanic unrest episodes is required for deriving the logistic model coefficients. Unfortunately, no such database existed in the public domain at the time of this study, so one had to be constructed for this research. Ideally, a large and diversified set of data is desired for identifying the set of explanatory variables that will produce the most robust logistic model. It is also desirable to identify a large number of events that either culminate or fail to culminate in an eruption. However, published accounts of volcanic unrest events vary greatly in detail and do not contain a consistent set of observations. This problem is exacerbated by the fact that events which eventually result in an eruption tend to be published, while those that fail are not. This has artificially biased the open literature toward eruptive events as opposed to those that eventually fail [

The database constructed for this work is listed in

Superscripts above the volcano name represent that sample’s participation in the derivation of logistic coefficients for each node. Note that the logistic regression is performed after the

Anomalous geologic activity (e.g., volcanic unrest) is detected using the method described in [

where c is a constant, _{1}_{3}_{3}−Q_{1}
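A quartile-based outlier rule of this kind can be sketched as below. The exact expression used by the authors did not survive in this copy, so the upper-fence form Q₃ + c(Q₃ − Q₁) is an assumption, as are the default c = 1.5 and the function names.

```python
import statistics
from typing import Sequence


def upper_fence(history: Sequence[float], c: float = 1.5) -> float:
    """Assumed anomaly threshold: Q3 + c * (Q3 - Q1) of past observations."""
    q1, _median, q3 = statistics.quantiles(history, n=4)
    return q3 + c * (q3 - q1)


def is_anomalous(value: float, history: Sequence[float], c: float = 1.5) -> bool:
    """Flag a new observation that exceeds the upper fence."""
    return value > upper_fence(history, c)


if __name__ == "__main__":
    daily_event_counts = [1, 2, 3, 4, 5, 6, 7, 8]  # made-up background seismicity
    print(is_anomalous(100, daily_event_counts))   # far above the background
    print(is_anomalous(5, daily_event_counts))     # within the background range
```

The same test could be applied to any monitored quantity with a stable background distribution, such as daily event counts or event magnitudes.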

The severity of the anomaly is estimated from the weighted summation of a collection of binary variables. The general form of the severity estimate is defined as

where _{m}_{m}

where each binary variable is defined in _{m}_{1} values of 0.25, 0.50, 0.75, and 1.0 represent low, moderate, heightened, and extreme levels of unrest. This triggering mechanism differs from those described in previously published event tree implementations (e.g., [

Explanatory variable names, descriptions, and possible values for node 1.

Explanatory Variable | Description | Value |
---|---|---|
_{sr} | Seismicity Rate | 0/1 |
_{df} | Surface Deformation | 0/1 |
_{lm} | Large Magnitude | 0/1 |
_{md} | Model Indicates Intrusion | 0/1 |
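The severity estimate, a weighted sum of the four binary node 1 variables, can be illustrated as follows. The weights did not survive in this copy; equal weights of 0.25 are assumed here only because they reproduce the 0.25, 0.50, 0.75, and 1.0 levels quoted in the text. The dictionary keys reuse the variable subscripts (sr, df, lm, md), and the function names are hypothetical.

```python
from typing import Dict

# Assumed equal weights; the source's actual weights are not available here.
WEIGHTS: Dict[str, float] = {"sr": 0.25, "df": 0.25, "lm": 0.25, "md": 0.25}

# Severity labels as given in the text.
LEVELS: Dict[float, str] = {
    0.25: "low",
    0.50: "moderate",
    0.75: "heightened",
    1.00: "extreme",
}


def severity(flags: Dict[str, int]) -> float:
    """Weighted sum of binary unrest indicators (missing flags count as 0)."""
    return sum(weight * flags.get(name, 0) for name, weight in WEIGHTS.items())


def severity_label(flags: Dict[str, int]) -> str:
    """Map the severity value to its descriptive level."""
    return LEVELS.get(severity(flags), "none")


if __name__ == "__main__":
    print(severity_label({"sr": 1, "lm": 1}))
```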

Ideally, the regression uses many combinations of random samples from the population we are attempting to model. After many iterations, a distribution of parameter estimates, such as the sample mean or regression coefficients, can be produced and their true value estimated. In this case, however, there are no additional data available. Therefore, a bootstrapping approach is used to estimate the distribution of logistic model coefficients for nodes 2, 3, and 4 using selected subsets of data from

Bootstrapping is a resampling technique that produces

Model coefficients for each node were estimated from the distributions generated after 50,000 bootstrapped GLMR iterations. The distribution of each logistic model coefficient appears to be bimodal. The two modes represent cases where the regression is either properly constrained or ill constrained. When the regression is ill constrained, the GLMR fails after reaching its maximum number of iterations and returns an invalid set of model coefficients. Conversely, properly constrained cases are runs where the GLMR converges to a solution within its maximum number of iterations. Therefore, the ill-constrained results are rejected as outliers, and the properly constrained distributions are retained and used to estimate the logistic model coefficients. In all cases, the median value of the truncated distribution is used to estimate each model coefficient.
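The bootstrapping procedure can be sketched with a small, self-contained Python example. It is a simplification under stated assumptions: a single explanatory variable, a hand-written Newton-Raphson fit in place of the GLMR, and rejection of non-converged fits at fit time rather than by truncating the coefficient distributions afterwards. All names (`fit_logistic`, `bootstrap_coefficients`) and the toy data are hypothetical.

```python
import math
import random
import statistics
from typing import Optional, Sequence, Tuple


def _logistic(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(x: Sequence[float], y: Sequence[int],
                 max_iter: int = 25, tol: float = 1e-8
                 ) -> Optional[Tuple[float, float]]:
    """Newton-Raphson fit of P(y=1) = logistic(b0 + b1*x).
    Returns None when the fit is ill constrained (no convergence)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = _logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # information matrix X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # singular: e.g. separated or constant data
            return None
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if not (math.isfinite(b0) and math.isfinite(b1)):
            return None
        if abs(d0) < tol and abs(d1) < tol:
            return b0, b1
    return None                       # hit the iteration limit


def bootstrap_coefficients(x: Sequence[float], y: Sequence[int],
                           n_boot: int = 2000, seed: int = 0
                           ) -> Tuple[float, float, int]:
    """Median coefficients over converged fits to resampled data."""
    rng = random.Random(seed)
    n = len(x)
    fits = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        result = fit_logistic([x[i] for i in idx], [y[i] for i in idx])
        if result is not None:
            fits.append(result)
    if not fits:
        raise ValueError("no bootstrap replicate converged")
    b0 = statistics.median(f[0] for f in fits)
    b1 = statistics.median(f[1] for f in fits)
    return b0, b1, len(fits)


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # toy explanatory variable
    ys = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]   # toy binary outcomes
    print(bootstrap_coefficients(xs, ys, n_boot=500))
```

Using the median rather than the mean keeps the estimate insensitive to the occasional replicate that converges to an extreme but finite solution.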

The logistic models for nodes 2, 3, and 4 are shown in Equations 18–20 and a description of each explanatory variable is listed in

Logistic model explanatory variable names, descriptions, and possible range of values.

Explanatory Variable | Description | Value |
---|---|---|
_{MM} | Unrest consistent with intrusion model | 0 or 1 |
_{NE} | Average Number of Earthquakes Per Day | 0 – ∞ |
_{CSM} | Average Normalized Cumulative Seismic Moment Per Day | 0 – ∞ |
_{DAYS} | Episode Duration in Days | 0 – ∞ |
_{ERH} | Average Eruption History | 0 – ∞ |

The influence of the source modeling results (_{MM}

Logistic functions derived from bootstrapping process for the intrusion node, where the black and red curves represent _{MM}_{MM}_{MM}

Logistic functions derived from bootstrapping process for the Eruption node, where the black and red curves represent _{MM}_{MM}_{MM}

Logistic functions derived from bootstrapping process for the intensity node, where the black and red curves represent _{MM}_{MM}_{MM}

The spatial probability density function (PDF) for estimating the probability of vent formation (V) at the ^{th}_{def}_{seis}

where

Binary classification problems attempt to categorize the outcome of an event into one of two categories, either true (1) or false (0). This process can result in one of four possible outcomes that are defined as follows:

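• True positive (TP): the event is predicted to occur and does occur.

• True negative (TN): the event is predicted not to occur and does not occur.

• False positive (FP): the event is predicted to occur but does not occur.

• False negative (FN): the event is predicted not to occur but does occur.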

The quality of a binary classifier is assessed through a receiver operating characteristic (ROC) analysis. A ROC curve is generated by plotting the prediction algorithm’s true positive rate (TPR or sensitivity) versus its false positive rate (FPR or 1-specificity). These parameters are defined as

and are both a function of the decision (detection) threshold,

where both expressions are a function of

The prediction power of the forecasting algorithm is characterized using a bootstrapped, leave-one-out (LOO), cross validation methodology. This process requires the removal of one sample from the training data, regeneration of the statistical model using the remaining data, and prediction of the outcome of the removed sample via the new model. Cross validation for each forecasting stage was conducted using 50,000 bootstrapped datasets. This is repeated for each sample in the training set for a collection of detection thresholds that range between 0.0 and 1.0. If the resulting probability is greater than or equal to the threshold, the outcome of the event is declared to be true. If the probability is less than the threshold, the outcome of the event is declared to be false. Since the outcome of each of the training events is known, the number of TP, TN, FP, and FN detections can be determined as a function of the detection threshold and plotted in ROC space.
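The leave-one-out procedure and the threshold sweep can be sketched as follows. The sketch uses the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN) and a trapezoidal estimate of the area under the ROC curve; the bootstrapping step is omitted, and the `fit` and `predict` callables and all function names are placeholders introduced here.

```python
from typing import Callable, List, Sequence, Tuple


def leave_one_out(samples: list, outcomes: list,
                  fit: Callable, predict: Callable) -> List[float]:
    """For each sample, fit on the remaining data and predict the held-out one."""
    predictions = []
    for i in range(len(samples)):
        model = fit(samples[:i] + samples[i + 1:], outcomes[:i] + outcomes[i + 1:])
        predictions.append(predict(model, samples[i]))
    return predictions


def roc_points(probs: Sequence[float], outcomes: Sequence[int],
               thresholds: Sequence[float]) -> List[Tuple[float, float]]:
    """(FPR, TPR) pairs, one per detection threshold."""
    points = []
    for t in thresholds:
        tp = fp = tn = fn = 0
        for p, y in zip(probs, outcomes):
            declared_true = p >= t      # rule stated in the text
            if declared_true and y == 1:
                tp += 1
            elif declared_true and y == 0:
                fp += 1
            elif y == 1:
                fn += 1
            else:
                tn += 1
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((fpr, tpr))
    return points


def auroc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    thresholds = [i / 100 for i in range(101)]   # 0.0 to 1.0
    probs = [0.9, 0.8, 0.3, 0.2]                 # made-up LOO probabilities
    truth = [1, 1, 0, 0]
    print(auroc(roc_points(probs, truth, thresholds)))
```

Sweeping thresholds from 0.0 to 1.0 ensures the curve spans from the point where every event is declared true to the point where none is, provided no predicted probability is exactly 1.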

ROC curves for the intrusion, eruption, and intensity event tree nodes are shown in

Receiver Operating Characteristics for the intrusion event tree node. The AUROC value of approximately 0.78 suggests this node will have fair to good predictive capabilities. TPR, FPR, accuracy, and precision estimates of 71%, 29%, 71% and 85% are obtained at the optimum detection threshold, (0.91).

Receiver Operating Characteristics for the eruption event tree node. The AUROC value of approximately 0.81 suggests this node will have fair to good predictive capabilities. TPR, FPR, accuracy, and precision estimates of 75%, 21%, 78%, and 74% are obtained at the optimum detection threshold, (0.47).

Receiver Operating Characteristics for the intensity event tree node. The AUROC value of approximately 0.80 suggests this node will have fair to good predictive capabilities. TPR, FPR, accuracy, and precision estimates of 73%, 19%, 78%, and 70% are obtained at the optimum detection threshold, (0.21).

Block diagrams highlighting the functionality of the forecasting algorithm are shown in _{MM}

Schematic diagram showing the data flow to and from the forecasting algorithm.

Upon the detection of unrest, the algorithm’s trigger state transitions from 0 to 1. While in the trigger state, data corresponding to the appropriate explanatory variables are routed directly to nodes 2–5 for a user-specified number of days. During this time the algorithm produces probability estimates for each node, a daily color-coded hazard declaration, an unrest severity estimate, and a vent location probability map. Once the specified number of days has been reached, the trigger state returns to 0, and outlier detection is reinitialized.
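The trigger behavior described above amounts to a two-state machine, sketched here under simple assumptions: one update per day, a fixed forecasting window, and detections ignored while the trigger is active. The class name `Trigger` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trigger:
    hold_days: int            # user-specified forecasting window, in days
    state: int = 0            # 0 = outlier detection, 1 = forecasting
    days_in_state: int = 0

    def step(self, unrest_detected: bool) -> int:
        """Advance one day and return the trigger state for that day."""
        if self.state == 0:
            if unrest_detected:
                self.state = 1
                self.days_in_state = 0
        else:
            self.days_in_state += 1
            if self.days_in_state >= self.hold_days:
                # Window elapsed: return to outlier detection.
                self.state = 0
                self.days_in_state = 0
        return self.state


if __name__ == "__main__":
    trigger = Trigger(hold_days=3)
    detections = [False, True, False, False, False, False]
    print([trigger.step(d) for d in detections])
```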

Internal functionality of the forecasting algorithm, where gray indicates processes internal to the algorithm, blue represents external data sources, and green identifies products.

The logistic models can be recalibrated upon the introduction of new samples into the training dataset. Once the models have been redefined and validated, the old models are replaced, and the new functionality is instantly available at all sites being monitored. This guarantees that all forecasting results are derived from the most recent set of logistic models.

The volcanic eruption forecasting algorithm (VEFA) was tested against unrest episodes occurring at four volcanoes: Grimsvötn, Okmok, Yellowstone National Park, and Mount Saint Helens. A detailed discussion of forecasts generated for the Icelandic example is presented, since it best illustrates the mechanics of our process. Results from all four examples are compared to forecasts generated by the Bayesian Event Tree Eruption Forecasting (BETEF) 2.0 application [

Grimsvötn is located in Iceland approximately 200 m below the northwestern portion of the Vatnajökull icecap (See

Location of the Grimsvötn Volcano (red triangle).

Since Grimsvötn is a subglacial volcano, GPS data is used to identify deformation consistent with a magmatic intrusion. GPS data acquired by Icelandic Meteorological Office stations posted on the Icelandic Institute of Earth Science (IES) website was used to estimate the source parameters for this episode [

Surface deformation was modeled using a spherical source in a semi-infinite elastic half space. The vertical and radial displacements at the surface are given by

where

Estimated Mogi source parameters derived from GFUM 240 days before and 30 days after the 2011 Grimsvötn Eruption, where positive or negative values of C indicate uplift or subsidence.

Sample | Δh | Δr | d | C |
---|---|---|---|---|
Pre-eruption | 40 | 36 | 3.33 | 0.0011 ^{3} |
Post-eruption | 250 | 468 | 1.60 | –0.0061 ^{3} |
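The table values can be checked against the standard Mogi point-source expressions, Δh = C d / (r² + d²)^{3/2} and Δr = C r / (r² + d²)^{3/2}. This form is an assumption, since the displacement equations are not reproduced in this copy. The units (C in km³, d and r in km, displacements converted to mm) and the 3.0 km horizontal distance to the station are likewise assumed for illustration only.

```python
from typing import Tuple

KM_TO_MM = 1.0e6


def mogi_displacements(C: float, d: float, r: float) -> Tuple[float, float]:
    """Vertical and radial surface displacement for a point (Mogi) source.

    C: source strength (assumed km^3), d: source depth (km),
    r: horizontal distance from the source (km). Returns values in km.
    """
    R3 = (r * r + d * d) ** 1.5
    return C * d / R3, C * r / R3


if __name__ == "__main__":
    # Pre-eruption parameters from the table; r = 3.0 km is hypothetical.
    dh_km, dr_km = mogi_displacements(C=0.0011, d=3.33, r=3.0)
    print(round(dh_km * KM_TO_MM, 1), round(dr_km * KM_TO_MM, 1))
```

With these assumed inputs the sketch returns roughly 41 mm of uplift and 37 mm of radial displacement, close to the pre-eruption row of the table, which suggests the assumed form and units are at least consistent with the listed parameters.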

The vent location search area is defined using Equation 26 and the pre-eruptive source parameters shown in

Modeled deformation field preceding the 2011 Grimsvötn eruption.

The VEFA was used to assess the probability of volcanic activity at Grimsvötn preceding its 2011 eruption. Since raw GPS data from GFUM is not available in this situation, only seismic data was used for unrest detection. The boxplots shown in _{l}

Boxplots highlighting the distribution of seismicity beneath the Grimsvötn caldera between 2005 and 2011, where the events per day and magnitude whiskers are set to 1.5 times the interquartile range. Monitoring thresholds are 8.0 events per day and a _{l}

Monitoring was initiated on 24 November 2010 and triggered on an anomalously large seismic event (_{L}

Algorithm state as a function of processing day. (

VEFA input parameters. (_{NE}_{DAYS}_{CSM}_{DAYS}

The color code declarations shown in

Forecasts of volcanic activity preceding the 2011 Grimsvötn eruption, where the intrusion, eruption, and intensity probabilities and thresholds are shown by the red, black, and blue solid and dotted lines, respectively. Introduction of positive modeling results occurs on episode day 134. (

A time series of selected spatial PDFs highlighting probable volcanic vent locations at Grimsvötn are shown in ^{2}, ^{2}, reduce

Spatial Probability Density Maps for volcanic activity preceding Grimsvötn’s 2011 eruption, where the black ellipse outlines the approximate perimeter of the caldera. The plot for day 177 has been enlarged to emphasize regions with a higher probability of vent formation within the quantitatively constrained area.

The results generated by the VEFA are compared with those derived from the BETEF v2.0 application. The BETEF tool was developed by Warner Marzocchi, Laura Sandri, and Jacopo Selva of the Istituto Nazionale di Geofisica e Vulcanologia (INGV) and is freely available via the internet. It employs the statistical processing methodology described in [

The models used for each example were trained using monitoring and modeling data acquired from the most recent unrest episode preceding the event under test. Modeling results (_{MM}_{ERH}_{NE}_{CSM}_{MM}_{X}_{DAYS}

Forecasts for Grimsvötn and three other volcanoes were compared. The additional volcanoes are located in the western portion of North America and frequently exhibit varying forms of unrest. Okmok is an active shield volcano located on the northern portion of Umnak Island in the center of the Aleutian Arc. Comparison forecasts precede Okmok’s 2008, VEI 4, eruption. Mount Saint Helens is an active stratovolcano that is situated in the Cascade Mountain Range in the southwestern section of Washington State and last erupted in 2004. Forecasts for Mount Saint Helens are generated over a period of seismic activity in February 2011 that was not due to volcanic unrest. The Yellowstone caldera is located in the Snake River Valley in the northwestern portion of Wyoming and last erupted several thousand years ago. Forecasts for Yellowstone are derived over an unprecedented episode of volcanic unrest that occurred in late 2010. The outcome of each test event and its associated seismicity level, in terms of low or high, is listed in

Test event outcome for each algorithm stage and associated seismicity level, where 1 or 0 indicates whether the event occurred or not and if seismicity levels were high or low.

Volcano | Intrusion | Eruption | Intensity (VEI>1) | High Seismicity |
---|---|---|---|---|

Grimsvötn | 1 | 1 | 1 | 1 |

Mount Saint Helens | 0 | 0 | 0 | 0 |

Okmok | 1 | 1 | 1 | 0 |

Yellowstone | 1 | 0 | 0 | 1 |

Intrusion forecast comparisons for each example are shown in

Intrusion probability comparisons for selected episode days, where each time sample is highlighted with a circle and the VEFA and BETEF results are shown in blue and red. (

Eruption forecast comparisons are shown in

A comparison of eruption intensity estimates is shown in

Eruption probability comparisons for selected episode days, where each time sample is highlighted with a circle and the VEFA and BETEF results are shown in blue and red. (

Intensity probability comparisons for selected episode days, where each time sample is highlighted with a circle and the VEFA and BETEF results are shown in blue and red. (

An algorithm that forecasts volcanic activity using an event tree analysis system and logistic regression has been developed, characterized, and validated. The suite of logistic models that drives the system was derived from a geographically diverse dataset composed of source modeling results, monitoring data, and the historic behavior of analog volcanoes. This allows the algorithm to simultaneously utilize a diverse set of information in its decision-making process. A bootstrapping analysis of the training dataset allowed for the estimation of robust logistic model coefficients. Probabilities generated from the logistic models increase with positive modeling results, escalating seismicity, and high eruption frequency. The cross validation analysis produced a series of ROC curves with AUROC values in the 0.78–0.81 range, indicating that the algorithm has good forecasting capabilities. In addition, the ROC curves allowed for the determination of the false positive rate and optimum detection threshold for each stage of the algorithm.

Modeling results had a significant influence on the probability estimates for the intrusion and eruption nodes and a moderate effect on the intensity node. Logistic functions illustrated in

A comparison of the performance of the VEFA and BETEF further illustrates the power of using source modeling information to produce short-term intrusion and eruption forecasts. Source modeling data significantly enhanced the VEFA’s forecasting capabilities, especially in situations where little or no associated seismicity exists. This point is confirmed by the differences in the BETEF and VEFA intrusion and eruption forecasts for Okmok and Mount Saint Helens. In both cases, the VEFA leveraged source modeling information to confirm or deny that the observed unrest was the result of fluid motion. While the BETEF was also given this information during the development and evaluation of its statistical models, it was unable to make the physical connection between the input data and the source of the unrest. Comparison results suggest that a static suite of empirical statistical models derived from a geographically diverse and sparse dataset is transportable and can compete with, and in some cases outperform, non-transportable empirical models trained exclusively from site-specific information. This assessment is supported by comparison results showing that the VEFA forecasts are more consistent with the actual outcome of the test events listed in

To the best of our knowledge, this is the first time logistic regression has been used to forecast volcanic activity. Furthermore, the derivation of optimized detection thresholds allowed for the quantification of the USGS hazard levels and the determination of an associated false positive rate. This was made possible by the data contained in the volcanic unrest database constructed exclusively for this research. The incorporation of source modeling data into the event tree’s decision-making process has initiated the transition of volcano monitoring applications from simple mechanized pattern recognition algorithms to a physical-model-based system. This paper shows that the VEFA has potential for forecasting volcanic activity at various locations throughout the world. Moreover, it can potentially aid civil authorities in determining the proper response to an impending eruption and can be easily implemented by an NVEWS. It should be stressed, however, that this algorithm is meant to assist, not replace, the volcanologist in assessing the potential hazard associated with volcanic unrest episodes. Its results should be weighed carefully against a scientist’s personal experience and all other available information. It must also be stressed that a low probability of occurrence means the event is unlikely, but not impossible. There will be situations where an unlikely event occurs.

Future work will focus on expanding the training dataset. Emphasis will be placed on identifying additional explanatory variables that can further enhance the predictive power of the algorithm. The Earth Observatory of Singapore, Nanyang Technological University, is leading a research initiative to develop a global volcanic unrest database referred to as WOVOdat [

This research was funded, in part, by Florida Space Research Program award 66018006-Y2. Seismic event catalogs used in this study were derived from open catalogs posted on the internet by the USGS, University of Alaska Fairbanks, Alaska Volcano Observatory (AVO), Icelandic Meteorological Office, and the INGV. GPS data was acquired from the Icelandic Meteorological Office and Icelandic Institute of Earth Science websites [