Simulation of the 2003 Foss Barge-Point Wells Oil Spill: A Comparison between BLOSOM and GNOME Oil Spill Models

The Department of Energy’s (DOE’s) National Energy Technology Laboratory’s (NETL’s) Blowout and Spill Occurrence Model (BLOSOM), and the National Oceanic and Atmospheric Administration’s (NOAA’s) General NOAA Operational Modeling Environment (GNOME) are compared. Increasingly complex simulations are used to assess similarities and differences between the two models’ components. The simulations presented here are forced by ocean currents from a Finite Volume Community Ocean Model (FVCOM) implementation that has excellent skill in representing tidal motion, and with observed wind data that compensates for a coarse vertical ocean model resolution. The comprehensive comparison between GNOME and BLOSOM presented here, should aid modelers in interpreting their results. Beyond many similarities, aspects where both models are distinct are highlighted. Some suggestions for improvement are included, e.g., the inclusion of temporal interpolation of the forcing fields (BLOSOM) or the inclusion of a deflection angle option when parameterizing wind-driven processes (GNOME). Overall, GNOME and BLOSOM perform similarly, and are found to be complementary oil spill models. This paper also sheds light on what drove the historical Point Wells spill, and serves the additional purpose of being a learning resource for those interested in oil spill modeling. The increasingly complex approach used for the comparison is also used, in parallel, to illustrate the approach an oil spill modeler would typically follow when trying to hindcast or forecast an oil spill, including detailed technical information on basic aspects, like choosing a computational time step. We discuss our successful hindcast of the 2003 Point Wells oil spill that, to our knowledge, had remained unexplained. The oil spill models’ solutions are compared to the historical Point Wells’ oil trajectory, in time and space, as determined from overflight information. Our hindcast broadly replicates the correct locations at the correct times, using accurate tide and wind forcing. While the choice of wind coefficient we use is unconventional, a simplified analytic model supported by observations, suggests that it is justified under this study’s circumstances. We highlight some of the key oceanographic findings as they may relate to other oil spills, and to the regional oceanography of the Salish Sea, including recommendations for future studies.


Foss Barge-Point Wells Oil Spill
The Foss Barge-Point Wells Spill was the basis of a comparison between two offshore spill trajectory models: BLOSOM and GNOME.
At 00:05 on 30 December 2003, heavy marine fuel oil #6 (IFO 380) spilled into the Puget Sound as it was pumped onto Foss tank barge 248-P2, from a Chevron/Texaco loading terminal at the Point Wells Asphalt facility near Richmond Beach in Shoreline, Washington. Based on gauge readings, 5712 gallons of fuel were accidentally released by overtopping; around 1075 gallons were recovered from the deck, and an estimated 4600-4800 gallons (about 110-114 bbl) spilled into the Puget Sound [10,11]. A timeline of oil spill movement is provided in Figure 1. A map showing the area of interest in the Salish Sea is provided in Figure 2, and the approximate spill path, including the points of interest mentioned in Figure 1 and in this section, are depicted in Figure 3.
Initially, oil drifted approximately 6 miles south along the eastern shore of Puget Sound, then began moving northwest towards the western shore. By 09:00, oil was observed within a mile from Port Madison, and was observed entering Port Madison 3 h later [10]. By the afternoon of 30 December, some of the oil slick beached between Point Jefferson and Indianola, and beaching increased by the morning of 31 December [10]. The oiled shoreline extended approximately 1.5 miles; an initial shoreline oiling survey estimated 3.5 acres were covered. An assessment by the United States Coast Guard A timeline of oil spill movement is provided in Figure 1. A map showing the area of interest in the Salish Sea is provided in Figure 2, and the approximate spill path, including the points of interest mentioned in Figure 1 and in this section, are depicted in Figure 3.   A timeline of oil spill movement is provided in Figure 1. A map showing the area of interest in the Salish Sea is provided in Figure 2, and the approximate spill path, including the points of interest mentioned in Figure 1 and in this section, are depicted in Figure 3.

Ambient Forcing for Oil Spill Models-The Hydrodynamic Model
One of the most challenging aspects of oil spill modeling is accurately simulating ocean currents. In 2011, the Pacific Northwest National Laboratory developed a hydrodynamic model of Puget Sound known as the Salish Sea Model. This model has been used as a tool for coastal estuarine research, nearshore restoration planning [12], water-quality management [13], and assessment of climate change effects [14,15]. The Salish Sea Model is one of the most detailed hydrodynamic models for the region, carefully calibrated for circulation and tidal exchange to primarily address water quality concerns, and thus, is a reasonable choice for oil spill modeling.
The Salish Sea Model was developed using the Finite Volume Community Ocean Model (FVCOM) framework [16]. The unstructured finite volume model grid extends to the west through the Strait of Juan de Fuca, to the north through Georgia Strait, thus covering all areas of the Puget Sound ( Figure 5). Grid cells vary in size between 250 m within inlets and bays, to 3.5 km in the Juan de Fuca Strait. In the vertical, the model uses 10 terrain-following layers (11 sigma levels) with higher density near the surface, and the top layer is about 3.2% of the local depth. Bathymetry is obtained from the University of Washington's Digital Elevation Model (DEM) [17] and the Department of Fisheries and Oceans Canada. The model bathymetry is typically smoothed for water quality simulations to prevent pressure gradient errors in a sigma-stretched coordinate system [18]. However, for applications such as oil spill transport, the model was operated without smoothing to account for the effects of shallow depths along complex shoreline. Open boundary conditions at the Strait of Juan de Fuca and Georgia Strait are forced by 15 min XTide stations, a harmonic tide predictor based on NOAA's National Oceanic Service algorithms [19]. There are additional water inputs throughout the domain including: (a) 19 major rivers with flows determined by USGS gages, (b) estimated runoff from 45 watersheds, and (c) wastewater discharge from 99 industrial outfalls. The Weather Research and Forecasting (WRF) Model on a 12 km grid [20] was used for meteorological forcing. It should be noted that wind from WRF was only used for the hydrodynamic model, while local wind measurements were used to force the oil spill models. The wind with which the ocean model was forced was not used to drive trajectories, as the oil spill happened in a relatively small area (oil beached about 8 km away from where the spill originated) relative to the atmospheric model resolution. Also, wind observations are likely to be an accurate representation of the wind acting on the sea surface during the spill, as we show below.   Table 1 shows the error statistics associated with the same five days; error statistics are deemed to be within an acceptable range.
Results from the model simulation were written at hourly intervals in NetCDF files that included information, such as sea-surface height, water velocity, salinity, and temperature at locations across the grid and at different depths. GNOME and BLOSOM each required a subset of the information; both used the sea-surface velocity, and while BLOSOM used an elevation raster to determine the coastline, GNOME used the ocean model boundary as the coastline. GNOME has a readily available toolkit with scripts to convert outputs from a variety of hydrodynamic models into a NetCDF file directly compatible with GNOME. The script mapped variables, determined the grid boundary based on grid connectivity information, and removed excess information, such as water velocity at depths, since in this application, GNOME is only being used for surface trajectories. Meanwhile, BLOSOM was capable of directly reading Salish Sea Model outputs because BLOSOM developers included an option for the relevant variables directly within the model code.

Additional Ambient Forcing for Oil Spill Models
Oil spill models use ocean currents to force the trajectories of oil parcels. However, ocean models do not perfectly simulate all of the physical processes that may be responsible for transporting oil, and it is possible that missing processes may need to be included through parametrizations [21]. As mentioned above, we used an FVCOM ocean model implementation to simulate the Salish Sea that has typically been used for water quality, circulation, and habitat restoration purposes. While some work was performed to optimize the ocean model to drive surface trajectories in this study, additional optimization could include improving the vertical grid resolution, using higher-resolution wind forcing (if available), and, possibly, transitioning to smaller grid sizing at the area of the spill. It was deemed that deficiencies in the vertical resolution resulted in an underestimation of the wind forcing; consequently, wind data was used to parametrize the effects of both windage and turbulent transfer of momentum, as a linear combination. Observational and theoretical justifications for this approach are described and discussed below. Additional forcing in the form of a stochastic model was used to simulate diffusion.

Wind Data
NOAA has a wind station at West Point (NOAA NDBC-WPOW1) about 8 miles south of where the Point Wells oil spill originated (47.662 N, 122.436 W); wind data from this source for 30 and 31 December 2003 (Figure 7) was used. A concern with this data source was that the wind coming from the east would be underrepresented due to land elevating quickly next to the coast, thus potentially causing a shading effect. A different wind source on the opposite side of the channel (Kingston wind station at 47.7940 N, 122.4940 W) was therefore used to evaluate the wind from the NOAA station. At Kingston station, wind coming from north and south is underestimated due to the presence of land, however, we were able to confirm that the wind having an east-to-west component was consistent across the channel (Figure 8). The trajectory driven exclusively by 6% of the NOAA NDBC-WPOW1 wind strongly suggests the importance of wind forcing in replicating the oil's trajectory (Figure 9, compare to schematic trajectory in Figure 4). The locations for wind stations are also shown in Figure 9.
Additional wind data from the Loyal Heights weather station (47.690 N, 122.383 W; not shown) were tested but were not found to help explain the observed trajectory; Loyal Heights station is about 100 m above sea level, and is therefore unlikely to be representative of the wind forcing at the ocean surface (wind-stress force over the ocean is usually computed from wind at either 2 m or 10 m above the sea surface). We did not use this wind data in our comparison.

Diffusion
To parameterize spreading of oil due to turbulent diffusion, GNOME uses a random walk with a constant diffusion coefficient. BLOSOM offers other options regarding both the diffusion scheme and the diffusion coefficient (e.g., using a diffusion coefficient read from the ocean model solution or computed through a Smagorisnky scheme, or options under development like a random flight

Diffusion
To parameterize spreading of oil due to turbulent diffusion, GNOME uses a random walk with a constant diffusion coefficient. BLOSOM offers other options regarding both the diffusion scheme and the diffusion coefficient (e.g., using a diffusion coefficient read from the ocean model solution or computed through a Smagorisnky scheme, or options under development like a random flight scheme) as detailed in Duran [21]. To make the simulations directly comparable, a random walk with

Diffusion
To parameterize spreading of oil due to turbulent diffusion, GNOME uses a random walk with a constant diffusion coefficient. BLOSOM offers other options regarding both the diffusion scheme and the diffusion coefficient (e.g., using a diffusion coefficient read from the ocean model solution or computed through a Smagorisnky scheme, or options under development like a random flight scheme) as detailed in Duran [21]. To make the simulations directly comparable, a random walk with constant diffusion coefficient was also used for BLOSOM. Turbulent diffusion parameterized through a random walk is expected to cause oil parcels to spread around the trajectory computed without turbulent diffusion, and affect how oil beaches along the coastline. Different values for the diffusion coefficient were tested to identify when the simulated spread resembled the observed spread, as described below.  . Trajectory (orange, red circles mark locations at hourly marks) initiated at the same time and location as the oil spill, resulting from forcing exclusively with 6% of the wind from the NOAA wind station. Also shown are the locations of NOAA (NDBC-WPOW1) and Kingston wind stations (white circles with black cross) and the approximate locations of oil at different times, as observed from overflights (white squares; see Figure 4).

Exploring Differences in the Oil Spill Models through Simulations
We use diverse simulations to compare GNOME and BLOSOM. Gradually increasing in complexity, these simulations are designed to isolate and compare different aspects of the models, or the oil spill we wish to hindcast. Thus, each difference among the models is illustrated through a test. Using, as a starting point, the wind information described in the "Wind Data" subsection above, tests three through nine illustrate the Foss barge -Point Wells (from now on, Point Wells) hindcast. . Trajectory (orange, red circles mark locations at hourly marks) initiated at the same time and location as the oil spill, resulting from forcing exclusively with 6% of the wind from the NOAA wind station. Also shown are the locations of NOAA (NDBC-WPOW1) and Kingston wind stations (white circles with black cross) and the approximate locations of oil at different times, as observed from overflights (white squares; see Figure 4).

Exploring Differences in the Oil Spill Models through Simulations
We use diverse simulations to compare GNOME and BLOSOM. Gradually increasing in complexity, these simulations are designed to isolate and compare different aspects of the models, or the oil spill we wish to hindcast. Thus, each difference among the models is illustrated through a test. Using, as a starting point, the wind information described in the "Wind Data" subsection above, tests three through nine illustrate the Foss barge -Point Wells (from now on, Point Wells) hindcast.

Initial Conditions and Technical Details
Information on the Point Wells Spill detailed in the reports by ENTRIX, Inc., [10] and The Foss-Pt.Wells Natural Resource Trustees [11] was used to create a baseline for the comparison. Most tests use an instantaneous release of oil except where noted, and the duration of each simulation was set to 30 h, starting on 30 December 2003 at 00:05. Two spill locations were used: the one closer to shore accurately represents the location of the spill, another location further offshore was used for a sensitivity test.
Throughout our tests, we used a fixed computational time step of 6 min, and sensitivity tests (Appendix A.1) showed this value to be a good compromise between satisfying the numerical requirement known as Courant-Friedrichs-Lewy (CFL) condition (explained in Appendix A.1) and computational efficiency. Oil parcels' positions are plotted approximately every 15 min, to allow a clear visualization of the trajectories, and to make the trajectories from both models directly comparable. Baseline parameters used in the simulations are listed in Table 2. The first couple of simulations are idealized test cases that allow for a direct comparison between the model's handling of advection due to ocean currents and wind, in the most simplified way possible.
BLOSOM uses, for its computations, an equidistant spatial reference system with units of meters; for this study, the Universal Transverse Mercator (UTM) Zone 10 North with the datum World Geodetic System 1984 was used. GNOME computes oil parcel trajectories using longitude and latitude directly, with an internal conversion factor that varies as a function of latitude, which makes their computations compatible with the metric system. Thus, BLOSOM assumes a tangent plane approximation, while GNOME computes advection on a sphere. These different approaches result in some divergence in the trajectories that can be assessed by computing a trajectory while keeping the ocean current velocity constant in space and time.
Our first test therefore consists of advecting a single oil parcel with a constant ocean sea-surface velocity set to (−0.03, −0.1) m/s and no wind ( Table 3). The constant velocity was chosen so that it would result in a trajectory comparable in length to the Point Wells oil spill trajectory, but without beaching. The difference in advection algorithms resulted in a trajectory separation of 76 m over 30 h ( Figure 10).

Parameters for Test 1, Ocean Currents Integration Currents
Spatially constant and steady currents = (−0.03, −0.1) m/s Wind source None Start location Actual initial location Wind advection coefficient None Figure 10. Comparison of advection algorithms using constant ocean currents; test 1.

Test 2: Wind Handling
The purpose of the second test (Table 4) was to test the advection due to wind. This test was kept comparable to the first one by requiring that the wind advective velocity were equal to the ocean advective velocity used in test 1. The wind was therefore set to (−0.5, −1.667) m/s and the wind advection coefficient set to 6% of the wind velocity. To be directly comparable with GNOME, BLOSOM did not use deflection due to the effect of earth's rotation on the wind's transfer of momentum to the ocean's surface (this deflection is explained in test 3 below). Since GNOME handles wind inputs with limited precision (a characteristic motivated by the inherent uncertainty in atmospheric models), the resulting separation for this test was greater than in the previous test, Figure 10. Comparison of advection algorithms using constant ocean currents; test 1.

Test 2: Wind Handling
The purpose of the second test (Table 4) was to test the advection due to wind. This test was kept comparable to the first one by requiring that the wind advective velocity were equal to the ocean advective velocity used in test 1. The wind was therefore set to (−0.5, −1.667) m/s and the wind advection coefficient set to 6% of the wind velocity. To be directly comparable with GNOME, BLOSOM did not use deflection due to the effect of earth's rotation on the wind's transfer of momentum to the ocean's surface (this deflection is explained in test 3 below). Since GNOME handles wind inputs with limited precision (a characteristic motivated by the inherent uncertainty in atmospheric models), the resulting separation for this test was greater than in the previous test, reaching about 95 m over a 30 h simulation ( Figure 11). The separation due to wind was about 26% greater than the difference due to ocean currents, over the same period. reaching about 95 m over a 30 h simulation ( Figure 11). The separation due to wind was about 26% greater than the difference due to ocean currents, over the same period.

Test 3: Wind Advection Scheme
Both models integrate the ocean currents using an Euler integration scheme, therefore, advection due to ocean currents should not result in any differences beyond those found in test 1. However, how BLOSOM and GNOME process wind data includes an additional distinction that could result in trajectory differences beyond those found in test 2. The turbulent transfer of momentum from wind

Test 3: Wind Advection Scheme
Both models integrate the ocean currents using an Euler integration scheme, therefore, advection due to ocean currents should not result in any differences beyond those found in test 1. However, how BLOSOM and GNOME process wind data includes an additional distinction that could result in trajectory differences beyond those found in test 2. The turbulent transfer of momentum from wind into the ocean results in trajectories that are not aligned with the wind's direction, but are deflected to the right of the wind direction (in the northern hemisphere)-this is caused by Earth's rotation (Coriolis effect). BLOSOM simulates this deflection, while GNOME assumes that wind advection is in the same direction as wind. This aspect of GNOME's design is a consequence of the inherent uncertainty of numerically simulated wind, and how that may apply to emergency response scenarios.
This test aims to qualitatively understand the difference in wind advection schemes by computing, with each model, a trajectory for an oil parcel originating from the same location, advected by the ocean currents from the hydrodynamic model, and wind data from NOAA WPOW1 station (Table 5). The wind advection coefficient was set to 6%. Table 5. Parameters for test 3, wind deflection part 1.

Start location
Actual initial location Wind advection coefficient 6% Deflection for GNOME Default (none) Temporal interpolation for BLOSOM Default (none) BLOSOM's trajectory illustrates the deflection to the right of GNOME's trajectory, including a difference on the beaching location ( Figure 12).  To confirm if the totality of the trajectory difference was only due to the wind deflection, an additional test (Table 6) was then designed. Wind data was manipulated to produce a wind time series that GNOME could read, and that would include the same deflection that BLOSOM computes internally. We were able to confirm that most, but not all, of the difference found in Figure 12 was due to the deflection, yet a small difference remained ( Figure 13); this difference is explored in test 4.  Figure 13. The same two plots shown in Figure 12 are shown along with an additional trajectory by GNOME, that now includes deflection due to earth's rotation; rotation was included directly to the wind data, forcing GNOME to replicate the deflection computed internally by BLOSOM. GNOME's trajectory with deflection agrees well with BLOSOM's trajectory, however, some difference remains; test 3 part 2.

Test 4: Temporal Interpolation
An additional distinction between BLOSOM and GNOME is that the latter includes temporal interpolation of the forcing fields by default. By adding temporal interpolation to BLOSOM as a beta feature, we were able to assess if the difference in trajectories that remained in the previous test could be explained. The parameters used for this test can be found in Table 7.
With both models using temporal interpolation, trajectories were now close enough ( Figure 14) to where the remaining difference could be explained by the different approaches used to compute trajectories, as illustrated in the first two tests.

Start location
Actual initial location Wind advection coefficient 6% Deflection for GNOME Deflection added manually Figure 13. The same two plots shown in Figure 12 are shown along with an additional trajectory by GNOME, that now includes deflection due to earth's rotation; rotation was included directly to the wind data, forcing GNOME to replicate the deflection computed internally by BLOSOM. GNOME's trajectory with deflection agrees well with BLOSOM's trajectory, however, some difference remains; test 3 part 2.

Test 4: Temporal Interpolation
An additional distinction between BLOSOM and GNOME is that the latter includes temporal interpolation of the forcing fields by default. By adding temporal interpolation to BLOSOM as a beta feature, we were able to assess if the difference in trajectories that remained in the previous test could be explained. The parameters used for this test can be found in Table 7.
With both models using temporal interpolation, trajectories were now close enough ( Figure 14) to where the remaining difference could be explained by the different approaches used to compute trajectories, as illustrated in the first two tests. Table 7. Parameters for test 4, temporal interpolation.

Start location
Actual initial location Wind advection coefficient 6% Deflection for GNOME Deflection added manually Temporal interpolation for BLOSOM Temporal interpolation included J. Mar. Sci. Eng. 2018, 6, x 18 of 43 Figure 14. The trajectories for GNOME and BLOSOM, both with deflection included, as seen in Figure  13, are compared to the same BLOSOM trajectory, but now including temporal interpolation; test 4. The trajectory from GNOME with added deflection, and the trajectory from BLOSOM with added interpolation, now resemble each other closely.

Test 5: Differences in Beaching
Once the above differences were clarified, we turned out attention to other potential sources of discrepancy, including how the coastline and beaching are treated.
For this test, we explore how the models respond, with their default options, to ocean currents and a more typical wind coefficient value of 3%, thus further clarifying the role of wind as a driver for this oil spill. The parameters used for this test can be found in Table 8. Table 8. Parameters for test 5, differences in beaching.

Parameters for Test 5, Differences in Beaching
Start Location Actual initial location Wind Advection Coefficient 3% Deflection for GNOME None Temporal interpolation for BLOSOM None GNOME includes a refloating algorithm that empirically describes the adhesiveness of the oil to the shoreline; a "half-life" parameter can be set by the user, representing the number of hours over Figure 14. The trajectories for GNOME and BLOSOM, both with deflection included, as seen in Figure 13, are compared to the same BLOSOM trajectory, but now including temporal interpolation; test 4. The trajectory from GNOME with added deflection, and the trajectory from BLOSOM with added interpolation, now resemble each other closely.

Test 5: Differences in Beaching
Once the above differences were clarified, we turned out attention to other potential sources of discrepancy, including how the coastline and beaching are treated.
For this test, we explore how the models respond, with their default options, to ocean currents and a more typical wind coefficient value of 3%, thus further clarifying the role of wind as a driver for this oil spill. The parameters used for this test can be found in Table 8. GNOME includes a refloating algorithm that empirically describes the adhesiveness of the oil to the shoreline; a "half-life" parameter can be set by the user, representing the number of hours over which half of the oil on a given shoreline can be removed by an offshore wind, diffusive transport, or from a sea level that is equal, or higher than when the oil originally beached [9]. Samaras et al. [22] present half-life values for different beach types. We use a half-life value of 24 h for GNOME simulations, which is representative of a sand or a gravel beach. While a marsh would typically have a much larger refloat half-life, the Doe-Kag-Wats marsh is mostly protected by a part-sand and part-gravel beach, with a small opening into the marsh. BLOSOM does not include a refloat option. Table 8. Parameters for test 5, differences in beaching.

Start Location
Actual initial location Wind Advection Coefficient 3% Deflection for GNOME None Temporal interpolation for BLOSOM None In this simulation, BLOSOM's trajectory beaches in about an hour and a half, and about 200 m northeast of the initial location. GNOME also beached at a similar location, but in about half an hour. GNOME creates a rasterized shoreline map from loaded current data for the purposes of tracking the oil beaching. This generated shoreline has a finite resolution across the entire grid, so GNOME provides an option to restrict the model domain, thus creating finer resolution of the shoreline in the area of interest, however, the beaching still occurs slightly offshore from the FVCOM boundary due to this approximation ( Figure 15a). Before the beaching, there was some divergence between the BLOSOM and GNOME trajectories ( Figure 15b); this separation is somewhat higher than those detected in tests 1 and 2 ( Figures 10 and 11), probably due to differences described in tests 3 and 4.
Due to the refloating algorithm, GNOME's oil parcel continued its trajectory about 18 h after beaching (at 18:53), subsequently moving north (Figure 15c), remaining within about 5 km of the spill's origin. simulations, which is representative of a sand or a gravel beach. While a marsh would typically have a much larger refloat half-life, the Doe-Kag-Wats marsh is mostly protected by a part-sand and partgravel beach, with a small opening into the marsh. BLOSOM does not include a refloat option. In this simulation, BLOSOM's trajectory beaches in about an hour and a half, and about 200 m northeast of the initial location. GNOME also beached at a similar location, but in about half an hour. GNOME creates a rasterized shoreline map from loaded current data for the purposes of tracking the oil beaching. This generated shoreline has a finite resolution across the entire grid, so GNOME provides an option to restrict the model domain, thus creating finer resolution of the shoreline in the area of interest, however, the beaching still occurs slightly offshore from the FVCOM boundary due to this approximation ( Figure 15a). Before the beaching, there was some divergence between the BLOSOM and GNOME trajectories ( Figure 15b); this separation is somewhat higher than those detected in tests 1 and 2 ( Figures 10 and 11), probably due to differences described in tests 3 and 4.
Due to the refloating algorithm, GNOME's oil parcel continued its trajectory about 18 h after beaching (at 18:53), subsequently moving north (Figure 15c), remaining within about 5 km of the spill's origin.   3.2.6. Test 6: Sensitivity to Initial Position As seen in the previous test, the simulated trajectories do not cross the channel with a 3% wind coefficient. An additional experiment exploring cross-channel transport was conducted, with no wind. Trajectories were initiated at both the actual initial location, and at an initial location that is about 730 m offshore, and 165 m north. This test was run on default values (i.e., BLOSOM does not include temporal interpolation); parameters used for these simulations are shown in Table 9. Table 9. Parameters to test 6, sensitivity to initial location.

Parameters to Test 6, Sensitivity to Initial Location
Wind None Start location Nearshore vs. offshore location Temporal interpolation for BLOSOM None When initiated offshore from the actual spill location, trajectories without wind forcing do cross the channel ( Figure 16A). This is because an eddy just offshore from the coast, induced a relatively strong cross-channel component. Starting at about 18:00 on 30 December, the models begin diverging markedly. GNOME's trajectory does a south-north oscillation, while BLOSOM's trajectory goes on to beach at the correct location at 06:00 on 31 December, about 14 h later than the time when the actual beaching began. GNOME's trajectory has not beached by the end of the simulation. As seen in the previous test, the simulated trajectories do not cross the channel with a 3% wind coefficient. An additional experiment exploring cross-channel transport was conducted, with no wind. Trajectories were initiated at both the actual initial location, and at an initial location that is about 730 m offshore, and 165 m north. This test was run on default values (i.e., BLOSOM does not include temporal interpolation); parameters used for these simulations are shown in Table 9.
When initiated offshore from the actual spill location, trajectories without wind forcing do cross the channel ( Figure 16A). This is because an eddy just offshore from the coast, induced a relatively strong cross-channel component. Starting at about 18:00 on 30 December, the models begin diverging markedly. GNOME's trajectory does a south-north oscillation, while BLOSOM's trajectory goes on to beach at the correct location at 06:00 on 31 December, about 14 h later than the time when the actual beaching began. GNOME's trajectory has not beached by the end of the simulation.
By contrast, when using no wind but with both model's trajectories initiated at the correct location, oil moves northward up the coast, then southward slightly offshore, then again northward but closer to the shore, oscillating with the tides ( Figure 16B). In this case, beaching for each model happens at locations about 700 m away.  By contrast, when using no wind but with both model's trajectories initiated at the correct location, oil moves northward up the coast, then southward slightly offshore, then again northward but closer to the shore, oscillating with the tides ( Figure 16B). In this case, beaching for each model happens at locations about 700 m away.

Test 7: Number of Particles While Using Turbulent Diffusion
Two parameters were tested through some diffusion tests: the diffusion coefficient itself and the number of particles. Greater values of the diffusion coefficient cause greater spread, while the number of particles may also influence the amount of spreading. Using a fixed diffusion coefficient of 10,000 cm 2 /s, we first test 30 oil particles ( Figure 17) against 1000 particles ( Figure 18); parameters for these simulations are listed in Table 10. We consider some of the implications of the number of particles that we use by presenting Oil Holding Capacity calculations in Appendix A.2.
This test suggests that thirty particles is a good approximation if we judge by the simulated spread at the beaching locations, which approximately matches the observed spread of beached oil, between Point Jefferson and Indianola (marked as observed trajectory in plots). BLOSOM's along-path spread is similar whether using 30 or 1000 particles. However, because GNOME's trajectory samples currents outside of the observed trajectory, there is spurious spreading, especially with 1000 particles. The greater the number of particles, the greater the possibility that trajectories sample a greater variety of ocean currents as they are diffused by the random walk displacements. GNOME's simulation beaches at additional locations when the number of particles is higher, also, some trajectories cross the channel further north.

Test 8: Release Period
Once it was established that thirty particles would result in approximately the desired spread due to diffusion, we next tested if the period of release would make a difference. So far, all tests have used an instantaneous release, with oil released at 00:05. Here, we compare an instantaneous release to a release lasting 15 min, which is the estimated actual duration of the overflow (Table 11). The instantaneous release of the thirty-particle simulation ( Figure 17) is very similar to a 15 min release ( Figure 19) when using BLOSOM. GNOME's trajectories remained roughly the same as well, however, spreading increases with the 15 min release after crossing the channel ( Figure 19); this difference can again be attributed to the random walk algorithm.   As mentioned above, GNOME includes a rebeaching option, where particles that have beached can still move back into the ocean and continue their trajectory. This refloating ability explains the northward movement of particles from the GNOME simulation.

Test 8: Release Period
Once it was established that thirty particles would result in approximately the desired spread due to diffusion, we next tested if the period of release would make a difference. So far, all tests have used an instantaneous release, with oil released at 00:05. Here, we compare an instantaneous release to a release lasting 15 min, which is the estimated actual duration of the overflow (Table 11). The instantaneous release of the thirty-particle simulation ( Figure 17) is very similar to a 15 min release ( Figure 19) when using BLOSOM. GNOME's trajectories remained roughly the same as well, however, spreading increases with the 15 min release after crossing the channel ( Figure 19); this difference can again be attributed to the random walk algorithm.

Test 9: Diffusion Coefficient
So far, our tests with diffusion have used a constant diffusion coefficient of 10,000 cm 2 /s; here, we test simulations with a coefficient of 100,000 cm 2 /s (Table 12). With the higher value, spreading along the BLOSOM trajectory seems to better match the width of the mid-channel observed trajectory, even if a few oil parcels diverge from the observed path ( Figure 20). Likewise, the width of the oil in the GNOME simulation seems closer to the observed path while crossing the channel, however beaching happens further north with the larger coefficient.

Start location
Actual initial location

Test 9: Diffusion Coefficient
So far, our tests with diffusion have used a constant diffusion coefficient of 10,000 cm 2 /s; here, we test simulations with a coefficient of 100,000 cm 2 /s (Table 12). With the higher value, spreading along the BLOSOM trajectory seems to better match the width of the mid-channel observed trajectory, even if a few oil parcels diverge from the observed path ( Figure 20). Likewise, the width of the oil in the GNOME simulation seems closer to the observed path while crossing the channel, however beaching happens further north with the larger coefficient.

Discussion
We first discuss the hindcast of the Point Wells oil spill, including some discussion of the regional oceanography as it may apply to other oil spills. We then discuss the differences between GNOME and BLOSOM.

Hindcasting the Historical Foss Point Well Spill
Further physical information will be needed to confirm our findings, however, we are able to make some compelling suggestions regarding what drove the Point Wells oil spill, insight that, to our knowledge, had so far remained elusive.
The trajectories are mainly forced with an accurate representation of the tides (which is a major driver of ocean currents in our region; Figure 6), as well as observed measurements of the wind

Discussion
We first discuss the hindcast of the Point Wells oil spill, including some discussion of the regional oceanography as it may apply to other oil spills. We then discuss the differences between GNOME and BLOSOM.

Hindcasting the Historical Foss Point Well Spill
Further physical information will be needed to confirm our findings, however, we are able to make some compelling suggestions regarding what drove the Point Wells oil spill, insight that, to our knowledge, had so far remained elusive.
The trajectories are mainly forced with an accurate representation of the tides (which is a major driver of ocean currents in our region; Figure 6), as well as observed measurements of the wind velocity (Figures 7 and 8). A quick inspection of the wind speeds during the spill, immediately suggest that wind was a major driver for the Point Wells spill: Meier and Höglund [23] (pp. 101-129) show that the advection of oil at the very surface tends to be dominated by wind whenever wind speeds reach or exceed about 4 m/s. Indeed, the trajectory of a parcel initiated at the location and time of the spill, and forced with only wind, is quite suggestive of the actual path ( Figure 9).
Under these circumstances, the advection of oil at the surface is often successfully modeled as a linear combination of ocean currents at the surface, and a velocity derived from wind, often called wind drift or windage. A wind drift velocity is typically calculated as a coefficient of 3%, multiplying the wind velocity. However, there are two points to be made regarding this general statement: (1) Main components of ocean currents at the surface typically include a geostrophic component; in our case, this is likely well represented by the tidal motion from our FVCOM model, and a wind-driven component that is independent of the wind drift mentioned above. The wind-driven current is given by some representation of the turbulent transfer of momentum from wind to the ocean's surface. The solution to this problem is an ocean current that spirals while decaying exponentially with depth; it is, therefore, very sensitive to its vertical dependence.
(2) Determining a correct wind drift coefficient is not trivial, the difficulties arise from a wide variety of ambient and dynamical considerations, some of which are discussed in Duran [21]. However, that is not the end of the list. As an additional example: naturally occurring surfactant has been shown to increase the wind speed drift velocity by 25% [24]. If we consider a typical wind drift coefficient of 0.03, then, under the influence of surfactant, the wind drift speed, computed from the wind speed U, becomes 0.03U(1 + 0.25) = 0.0375U. This illustrates naturally occurring phenomena, that would be very difficult to detect without in situ measurements, and that would increase the effective wind drift coefficient from 3 to almost 4%. This is as far as windage is concerned, however, additional processes, such as Stokes drift, Coriolis-Stokes force, and Langmuir circulation, are often parameterized directly from the wind velocity as well, and using a similar parameterization (e.g., Weisberg et al. [25], and references therein).
In Appendix A.3, we show that the model we use has coarse vertical resolution, and therefore, underestimates the wind-driven component of the sea-surface velocity by about a factor of four. We show this using a simplified analytical model, which is in good agreement with very high-resolution observations of vertical shear in ocean surface currents [26]. To compensate, a wind-driven velocity is directly parameterized from the wind, in the same way that a wind-drift velocity would be parameterized (see e.g., the surface velocity in Figure A1). As mentioned above, the FVCOM model used in this study does an excellent job of simulating tides ( Figure 5), however, for Lagrangian transport applications we can recommend a higher resolution near the surface to better resolve the turbulent transfer of momentum near the surface. This is because solutions for the turbulent transfer of momentum, from wind to the ocean, decay exponentially with depth, and are therefore sensitive to a coarse vertical resolution. Near the sea surface, the vertical resolution for ocean models used to simulate Lagrangian transport should be maximized, and is typically on the order of tens of centimeters (e.g., [27][28][29]), in marked contrast with the 4 to 6 m resolution of the uppermost layer of the model used for this study (except very near the coast, where the vertical resolution is higher). Indeed, even high vertical ocean model resolutions will need further parameterizations at times, since ocean models to not typically represent processes such as wind drift or Stokes drift. A natural follow up study towards calibrating this model for sea-surface trajectories could compare our results with an FVCOM implementation that uses a higher vertical-resolution, while continuing to simulate tides with a high skill.
In our study, a relatively high wind coefficient of 6%, in conjunction with the tide-driven ocean currents, cause the simulated trajectory to match the observed trajectory in time and space ( Figure 14). We do not find the high value for the wind coefficient surprising, because we are almost surely parameterizing at least two processes, the underestimated wind-driven current and windage. The former parameterization is remediating a coarse vertical resolution, while the latter is not typically included in ocean models, yet, is often a main driver when wind speed exceeds 4 m/s. It is also possible that additional wind-related processes (e.g., Stokes drift, Coriolis-Stokes force and/or Langmuir circulation) were at play. All of these processes can be parameterized with the same or a similar parameterization, a coefficient multiplying the wind velocity with a slight deflection to the right of the wind direction. Thus, the 6% coefficient likely represents at least two, perhaps more processes. Future studies should include detailed information, or at least a parameterization, of the processes related to surface waves, so as to gain insight to their role during this spill. We note that although Stokes drift is related to surface gravity waves, it can be effectively parameterized as a percentage of the local wind [30].
The inclusion of an angle of deflection for the parameterization of wind-driven processes was consequential in the hindcast of this oil spill. In this case, the angle computed internally by the default BLOSOM algorithm [21] did well. However, as discussed in Duran [21], the actual value of deflection does vary according to a number of conditions, making it difficult to simulate accurately. It is therefore recommended to have an option that allows the oil spill modeler to select an angle manually. Such an approach would allow the user to select an angle that matches observations, similar to the approach with the diffusion coefficient where the user selects a value to match the spread observed through overflights, satellite images or boat reports [9]. An example of an oil spill where wind drift was also very important, but where no deflection angle was appropriate, is given in Nakata et al. [31]. The value of 3.5% for the windage coefficient that they used, is consistent with our discussion, since their ocean upper layer was only 2 m deep, compared to the upper layer of the model we used, which is two to three times thicker (4-6 m).
When including deflection due to earth's rotation, tides, and a wind advection coefficient of 6% of the wind speed, the simulated trajectory does a good job of replicating the trajectory observed with overflights. The trajectory tends to hit the correct locations at the right times, although a bit too north during the first six hours of the simulation, and a few hours late after the first twelve hours (see BLOSOM's trajectory in Figure 12). Thus, BLOSOM's trajectory matches the observed path both in time and space, as can be seen by comparing to the timelines in Figures 1 and 9. The agreement strongly suggests that wind and tides were the major drivers during the 2003 Point Wells spill. This is because we are using observed winds (Figure 8), and because the tidal component of the ocean currents is almost perfectly simulated with the FVCOM model we use (Figure 6). Indeed, the trajectory forced only with 6% of the wind (Figure 9), is highly suggestive. Additionally, diffusion representing unresolved small-scale processes, helped match the observed spreading along that trajectory (Figures 19 and 20). It thus seems persuasive that winds, tides, and small-scale processes were the physical drivers during the 2003 Point Wells spill.
In the test where trajectories were initiated offshore from where the oil spill occurred and with no wind (Figure 16A), the cross-shore transport was due to eddies that spin off when the tidal currents interact with promontories (not shown). Thus, we have found two processes that may cause cross-shelf channel in our study region: cross-channel wind and eddies.

Integration Geometry
We now discuss the differences between GNOME and BLOSOM, as illustrated by the different tests we did, Table 13 includes a summary relating test numbers and the aspects being tested. The procedure an oil spill modeler would typically pursue for a hindcast or forecast is summarized in an infographic ( Figure 21).  Figure 21. Diagram summarizing steps, as used in this study, for hindcasting an oil spill.
Differences were found between trajectories computed by GNOME and BLOSOM due to the geometry over which the trajectories are integrated. BLOSOM uses a tangent plane approximation while GNOME integrates over a sphere. The differences were small, at about 30 to 40 m in the first six hours, and 60 to 80 m during the first day. We note that integrating wind as the only forcing mechanism (test 2), produced a somewhat higher difference than integrating only the ocean currents (test 1; compare Figures 10 and 11). The differences from tests one and two should be considered additive when the trajectory is forced by both ocean currents and wind forcing, as is often the case. As mentioned above, ocean currents are often unstable, consequently it possible that single Differences were found between trajectories computed by GNOME and BLOSOM due to the geometry over which the trajectories are integrated. BLOSOM uses a tangent plane approximation while GNOME integrates over a sphere. The differences were small, at about 30 to 40 m in the first six hours, and 60 to 80 m during the first day. We note that integrating wind as the only forcing mechanism (test 2), produced a somewhat higher difference than integrating only the ocean currents (test 1; compare Figures 10 and 11). The differences from tests one and two should be considered additive when the trajectory is forced by both ocean currents and wind forcing, as is often the case. As mentioned above, ocean currents are often unstable, consequently it possible that single trajectories may separate considerably over time due to these small initial differences. However, bulk calculations of trajectories (which is the typical approach during oil spill modeling) should give similar results.
The tangent plane approximation used by BLOSOM can be formally justified as an asymptotic expansion in terms of the length scale of interest, divided by the Earth radius, times the tangent of the latitude [32,33]. For the latitudes spanning the conterminous United States, about 25 to 49 to degrees North, the tangent plane approximation is appropriate for lengths of about one thousand kilometers or less. The Gulf of Mexico is approximately 1000 km in the zonal direction, and approximately 1000 km in the meridional direction, and therefore, BLOSOM is adequate for modeling a spill that encompasses the entire Gulf of Mexico. However, off Alaska's coasts, about 60 to 72 degrees North, the tangent plane approximation is adequate for distances up to about 500 km. Thus, due caution is needed when using BLOSOM for large spills in, say, Alaska, northern Europe or the Arctic.

Including the Effect of Earth's Rotation
The inclusion (BLOSOM) or not (GNOME) of the deflection caused by earth's rotation when computing the component of oceanic trajectories forced by wind (wind-forcing parameterization) is found to be consequential for this hindcast (test 3). The inclusion of earth's rotation, nudges the simulated trajectory to stay within the observed trajectory envelop for longer, and, importantly, it causes the oil parcel to beach at the correct location ( Figure 12). GNOME does not include the effect of earth's rotation because developers considered that, in the context of a quick response scenario, the uncertainty inherent to forecast winds would overcome the precision afforded by including this effect. However, because the wind used in our hindcasts comes from field measurements (Figures 7 and 8), the uncertainty due to forecasts is not an issue. In general, whenever the wind velocity is accurate, the inclusion of the deflection angle might result in an improvement to the simulation, this is because it approximates the physics of different wind-forced drivers that are missing in ocean models (e.g., [21]).

Interpolation of Wind Forcing
As a follow up to test 3, we used test 4 to understand the remaining difference between GNOME and BLOSOM trajectories, after including earth's rotation into both models ( Figure 13). Thus, as in test 3, the same deflection that BLOSOM computed internally, was added into the wind data that GNOME used to force the trajectory, but additionally, in test 4, temporal interpolation of the forcing fields was added to BLOSOM. This was done to match GNOME, that includes temporal interpolation by default. This test revealed ( Figure 14) that temporal interpolation explains an additional part of the difference between trajectories seen in Figure 13. We attribute the small difference remaining after the inclusion of temporal interpolation to the integration geometry as described in tests one and two.
Sensitivity to temporal interpolation when computing trajectories in the ocean is known, and has been discussed (e.g., [34]); it is especially important for coarse temporal resolutions. Future releases of BLOSOM will include temporal interpolation of the forcing fields.
Regarding spatial interpolation, the effect on currents from an ocean model with reasonable spatial resolution would likely be overcome by the use of a random walk scheme (turbulent diffusion) and would, therefore, only add to the computational cost without a benefit. Thus, both the GNOME and BLOSOM developer team currently do not prioritize using a spatial interpolation scheme beyond nearest neighbor. A study on the performance of integration and interpolation pairs for marine transport applications is ongoing, and will be reported elsewhere [35].

Differences in Beaching Algorithms
Test 5 was designed to understand differences in beaching algorithms. GNOME includes a refloat option based on an empirical parameter (half-life) that describes the adhesiveness of the oil to the shoreline. This half-life parameter is the number of hours in which half the oil on a given shoreline is expected to be removed with offshore wind or turbulent diffusion [9]. The spill described by Nakata et al. [31] is an example where the refloating of oil was important.
Based on results from test 4, and to make the trajectories directly comparable, wind deflection was introduced manually to the wind data for GNOME, thus, both models include wind deflection; likewise, BLOSOM uses temporal interpolation. We note that an additional difference between the models that affects beaching, is that GNOME uses the ocean model boundary as the coastline while BLOSOM uses an elevation raster provided by the user. In this test, the wind advection coefficient is reduced from 6% to a more typical value of 3%, therefore, this test also helps illustrate what a modeler would see, when using a typical value for the wind advection coefficient (Figure 15a). The trajectories diverge probably due to an initial perturbation caused by the additive effect of the different integration geometries detailed in tests one and two, that is amplified by the ocean currents ( Figure 15b); however, both models end up beaching at similar locations (Figure 15a). A more in-depth discussion on coastline treatment for oil spill model beaching can be found in Samaras et al. [22]. It is important to note that neither GNOME nor BLOSOM are designed to simulate beaching with the level of detail that can be found in other pioneering studies. GNOME, in particular, is intended as a quick response tool, consequently, the beaching location is only approximated (Figure 15a). In Appendix A.2, we show that the number of particles we use for the diffusion tests (30 and 1000) are reasonable regarding the type, and dimension, of beach that was impacted. Calculations with 30 particles suggests that, if a detailed beaching study were desired, a greater number of particles might be needed. However, we also note that trajectories, especially GNOME's, may deteriorate considerably as the number of particles increases, as noted below (Section 4.7). Thus, there is a trade-off to be made. For our purposes (a study that is primarily focused on the oil's trajectory), 30 particles gives good results, while maintaining reasonable oil holding capacity (OHC) values (Appendix A.2).
After about eighteen hours, GNOME's trajectory initiates again after refloating, to then move north (Figure 15c). In a real-life situation, both GNOME's and BLOSOM's simulation would be rejected, since they do not replicate the observed path, suggesting that the mechanisms forcing these trajectories are inadequate.
During a response scenario, the refloating of oil can be modeled alternatively by re-initializing an oil spill simulation once a need has been detected (for example, through overflights). Thus, BLOSOM does not contemplate, at present, the inclusion of a refloating option in future releases.

Sensitivity to Initial Position
In test 6, we explore initializing the trajectories offshore from the actual incident locations, about 730 m offshore and 165 m north. These trajectories do cross the channel even without any wind, a marked difference from the trajectories using the correct initial position ( Figure 16A).
To put this initial location offset into context, ocean models in the Gulf of Mexico are often 3 to 4 km resolution, with the higher-resolution models having resolutions of about 1 km. Thus, the correct and the offset initial locations, could have been within the same ocean model grid point of a high-resolution model of the Gulf of Mexico. However, the Salish Sea has much smaller spatial scales, and therefore requires a much finer ocean model. Thus, the distance between the actual and offset locations of the spill, spans about 4 ocean model grid cells. This is an example of the sensitivity to the initial location, a result of ocean currents varying considerably across the channel. As mentioned above, ocean currents are often unstable, and amplify exponentially any small, initial difference in the initial position [36,37] BLOSOM and GNOME differ in their final positions when the trajectories are initiated at the offshore location. This illustrates how small differences in oceanic trajectories can cause a large difference in the end result. This is because ocean currents are chaotic at length scales that are often smaller than the distance traveled by most trajectories of interest, which implies that small differences at any time during the trajectory can be exponentially amplified, causing distinct differences even within a few hours (see e.g., [36] or [37]). This test does not include any wind forcing, thus, the small differences (that eventually resulted in larger differences) can be attributed to the distinct geometries used in the ocean current integration, as illustrated in test 1. Exponential separation of trajectories is generally not relevant in bulk trajectory computations, especially when using a random component to simulate small-scale processes.
The trajectories initiated offshore do not coincide with the observed trajectory envelope ( Figure 16A). Since a trajectory can only be judged to be accurate if it visits the correct spatial locations at the correct times, BLOSOM's trajectory, in this test, is an example of getting the correct result (i.e., correct beaching location, although about 14 h late) for the wrong reasons (i.e., by initiating the trajectory offshore from the actual initial location).
In this test, we also run a simulation in which the trajectories are initialized at the correct location but, in contrast with test 5, without wind forcing ( Figure 16B). In this case, the trajectories are not forced towards the coast as fast as in test 5, instead, they oscillate with the tides, approximately parallel to the coastline to then beach not far from where the spill originated. This confirms that no wind, or typical values of wind forcing, are inadequate to force this oil spill, given the ocean model used in this study, as discussed above.

Turbulent Diffusion: Number of Particles
Tests one through six suggest what physical processes drove the Point Wells oil spill trajectory, namely, ocean currents plus 6% of wind, as illustrated in test three. Having identified appropriate ambient forcing, the next step a modeler would typically pursue, would be to add turbulent diffusion in an attempt to match the spread of pollutants, ideally along an observed trajectory envelope. In this test, BLOSOM does not use temporal interpolation and GNOME does not include wind deflection; these features were only added, respectively, in previous tests to understand the model's differences. However, having illustrated such differences, and because these features, respectively, are not currently available in the oil spill models, the simulations that follow will use default configurations.
In BLOSOM and GNOME simulations, the spread due to diffusion is controlled mainly by the diffusion coefficient, which both models allow the user to select. However, the spread is also affected by the number of particles used. If there are not enough particles, then the random walk algorithm will not represent the correct solution to the diffusion equation.
In test 7, a fixed diffusion coefficient of 10,000 cm 2 /s and 30 particles are selected. The turbulent spreading causes oil parcels to cover more of the area that was observed to be covered, providing an accurate estimate of the beaching, thus nudging the simulations towards greater realism ( Figure 17). We note that the higher the number of particles, the greater the amounts of oil refloating in GNOME's simulation, causing spurious northward transport.
To assess the effect of the number of particles, this experiment is repeated while using 1000 particles ( Figure 18). In this test, GNOME's trajectories diverge even further from the observed path, and the spread becomes less realistic, suggesting that 1000 particles is not an improvement. BLOSOM, however, does reach a desirable spread, suggesting that with 1000 particles, there is a modest improvement; note the width of BLOSOM's beaching approximately matching the observed width in Figure 18.
These results highlight the importance of simulating the diffusion-free trajectory as accurately as possible.

Release Period
All simulations, so far, used instantaneous releases at 00:05, therefore, the effect of releasing oil over a period of 15 min, an estimate of the duration of the actual spill, is assessed in test 8. No notable difference is found with this test, suggesting that an instantaneous release is adequate for simulation purposes.

Turbulent Diffusion: Diffusion Coefficient
In test 9, the diffusion coefficient is tested by using different values, while keeping the number of particles at 30. A coefficient of 100,000 cm 2 /s causes the spread to be closer to the width of the observed path envelope (Figure 20), suggesting an improvement relative to a diffusion coefficient of 10,000 cm 2 /s and 30 particles (Figure 17), but it is not necessarily an improvement relative to BLOSOM's simulation with a coefficient of 10,000 cm 2 /s and 1000 particles (Figure 18). Since a smaller turbulent diffusion coefficient is preferable (if the same results can be obtained by increasing the number of particles), the conclusion is that a coefficient of about 10,000 cm 2 /s and 1000 particles is the preferred setting for BLOSOM. Some fine-tuning is likely possible by testing numbers of particles between 10 and 100 thousand, but was not pursued.
Regarding GNOME's trajectories for both values of diffusion coefficients, the more significant factor continues to be that the trajectory does not match the observed path, due to not including the effect of earth's rotation when forcing with wind, as identified in test three. More trajectories diverge when using the 100,000 cm 2 /s coefficient; these differences are inherent to the random walk algorithm and are therefore expected. Therefore, beyond the differences found in test 3 and test 7 (rebeaching of oil), this test does not find any further noteworthy differences between GNOME and BLOSOM.
This test finalizes the simulations that a modeler would need to understand what simulation parameters best replicate the Point Wells oil spill. Consequently, all relevant differences between GNOME and BLOSOM have been illustrated.

Conclusions
While not a rigorous proof, it is very compelling that the simulation, initiated at the correct location and time, forced with accurate wind observations and accurate simulated tides, replicates the observed trajectory. The simulated trajectory broadly reaches the correct locations at the correct times, through the duration of the spill, to finally beach at the correct location and time (beaching started around 14:30 and continued through the afternoon and night of 30 December, up to the morning of 31 December). Since it is highly unlikely that such a combination of events would happen by chance, we conclude that our hindcasts make a compelling case for what forced the Point Wells oil spill, despite the fact that it is unusual for oil spill modelers to use a wind coefficient as high as 6%. This conclusion is supported by observational studies, and a simplified analytic model. The analytic model was used to quantify the effect of the coarse ocean model vertical resolution on the surface velocity (Appendix A.3), and it agrees well with very high-resolution observations recently reported by [26].
Deflection for the full 6% of the wind speed coefficient was shown to be important; this is in agreement with multiple studies (e.g., references in [21]) that have found that deflection may be part of each of the physical processes that are parameterized from wind.
To the best of our knowledge, this paper is the first to build a compelling case for what drove the historical Point Wells spill. The drivers for particle motion were tidal currents (including their interaction with promontories), wind, and small-scale processes represented by diffusion.
A dedicated study will be needed to fully understand the physics driving the Point Wells spill. In this regard, we can recommend future studies to include the effect of waves, because Stokes drift is known to be an important driver at the sea surface [27,38]. Stokes drift can be parameterized directly from wind, despite it being due to surface waves [30]. Thus, the inclusion of waves should provide further insight into the nature of the wind forcing that we have found to be important. Likewise, ocean simulations with a higher vertical resolution, that continue to accurately replicate tides, should also be helpful. We note the strong wind (5-10 m/s) blowing consistently towards the impacted beach, during the period of time over which beaching was observed. There was also very strong wind (>10 m/s) during the first 6 h of the spill (Figure 7). The Point Wells spill was difficult to forecast during response activities [39], presumably due to inadequate forcing. It is plausible that forcing was inadequate because of the strong dependence on wind. Understanding the drivers behind the Point Wells spill will allow oil spill modelers to be better prepared for future forecasts or hindcasts.
While initially developed for different purposes, GNOME and BLOSOM are similar models that are complementary in nature. For the particular hindcast used for this study, BLOSOM gives better results due to the inclusion of an internally computed deflection angle for the parameterization of wind-forced advection. However, we do not suggest that this implies that BLOSOM is a superior model. That this particular hindcast depends so heavily on a wind-driven parameterization, and that the wind we used comes from representative observations, may or may not be representative of other spills. Additionally, wind deflection could be included in a GNOME simulation by modifying the data, if so desired.
We are able to make two main recommendations for future releases: (1) that BLOSOM include temporal interpolation of the forcing fields, and (2) that GNOME include an option to allow for an angle of deflection when using wind-forced parameterizations.
velocity u over one computational time step ∆t to the size of the spatial grid ∆x (cf. [40] (pp. 171-175)). Thus, the CFL condition is defined as We say that the CFL condition is satisfied whenever C < 1. If the CFL condition is not satisfied, then an oil parcel could overshoot through one or more grid cells without taking into consideration the velocity of those grid cells. Thus, the trajectory could artificially ignore valid information, resulting in a scrambled sampling of the true velocity field and, consequently, in spurious trajectories.
In numerical analysis' terminology, satisfying the CFL condition guarantees that the finitedifference scheme used to solve a differential equation has a numerical domain of dependence, on initial or boundary conditions, which includes the differential equation's domain of dependence, on initial or boundary conditions (e.g., [41] (pp. 98-100)).
From the CFL condition, we can derive an upper bound for acceptable computational time steps. This upper bound can be computed using the maximum velocity and a minimum grid size in the model The information needed for this upper bound is readily available from the ambient data used by any oil spill model.
It is in BLOSOM's development plan to compute this time step upper bound when initializing a simulation to inform the user of any possible violations of the CFL condition or, alternatively, of an inefficiently small time step. When familiarizing oneself with a new ocean model, it is convenient to run a few tests to determine the ideal time step; it should satisfy the CFL condition but also, it should not be smaller than necessary since this can easily result in run-times that are too long to be practical.
For these simulations, an estimate of the time step ∆t was computed using a distance of 200 m and a velocity of 0.5 m/s (mean velocity + 1.5 standard deviations) in ∆t < ∆x min u max = 200 0.5 ≈ 6.7 min.. Thus, we obtain an a priori estimate of what the time step should be, this estimate can be confirmed running some simulations. The first test will be our benchmark, it uses a time step of one minute (such a small time step ensures convergence), and is shown in Figure A1.
Next a 6 min time step was used ( Figure A2), the trajectories are identical to those with a time step of one minute, thus confirming our a priori estimate. However, with a computational time step of 18 min, BLOSOM's trajectory started to deteriorate ( Figure A3). Notice that GNOME and BLOSOM's trajectories differ due to advection algorithms (test 3 above) and, therefore, their sensitivity to the computational time step will be different. Any small differences can and will be amplified due to unstable currents.
With a 36 min computational time step, the trajectory is completely spurious (Figure A4), to the point that trajectories end well into land. This experiment was not repeated with GNOME, however, similarly bad results are expected as this spurious behavior is a property of the numerical method, and both models use the same method. The difference in the advection algorithm should be small compared to the error induced by such a large computational time step. Also, when the time step is very large, the diffusion does not cause much spreading, this is because the random increment is added every time step, consequently, there is insufficient random spread when the number of time steps is small.
In summary, with a large time step, the trajectory is sampling an inaccurate velocity field (by overshooting valid information); additionally, the random walk is not converging to the solution of the diffusion equation. These tests confirm that the time step estimate of less than 6.7 min was a good choice (because the trajectory using 6 min time step is very similar to the trajectory using a 1 min time step, confirming convergence). not be smaller than necessary since this can easily result in run-times that are too long to be practical.
For these simulations, an estimate of the time step t Δ was computed using a distance of 200 m and a velocity of 0.5 m/s (mean velocity + 1.5 standard deviations) in min 7 . 6 5 . 0 Thus, we obtain an a priori estimate of what the time step should be, this estimate can be confirmed running some simulations. The first test will be our benchmark, it uses a time step of one minute (such a small time step ensures convergence), and is shown in Figure A1. Figure A1. GNOME and BLOSOM simulations using a 1 min time step. This simulation includes 6% wind, ocean currents and some diffusion.
Next a 6 min time step was used ( Figure A2), the trajectories are identical to those with a time step of one minute, thus confirming our a priori estimate. Figure A1. GNOME and BLOSOM simulations using a 1 min time step. This simulation includes 6% wind, ocean currents and some diffusion. However, with a computational time step of 18 min, BLOSOM's trajectory started to deteriorate ( Figure A3). Notice that GNOME and BLOSOM's trajectories differ due to advection algorithms (test 3 above) and, therefore, their sensitivity to the computational time step will be different. Any small differences can and will be amplified due to unstable currents.
With a 36 min computational time step, the trajectory is completely spurious (Figure A4), to the point that trajectories end well into land. This experiment was not repeated with GNOME, however, similarly bad results are expected as this spurious behavior is a property of the numerical method, Figure A2. Same as plot A1 but both models using a 6 min time step.

Appendix A.2. Computations Related to the Number of Parcels in Our Simulations
In this section, we use data for the historical spill, such as impacted area and beach length, and volume of oil, to determine if our selections for number of parcels are reasonable.
Three and a half acres were impacted, or about 14,164 m 2 . Roughly 2/3 of the total impacted area was heavily impacted. Half of the heavily impacted area was a gravel beach, and the other half a sand beach, roughly 4721 m 2 each half, for a total heavily impacted area of 9442 m 2 . The maximum estimated amount of oil was 4800 gallons or about 18.2 m 3 . Figure A4. Same as Figure A1 but with a 36 min computational time step for BLOSOM.

Appendix A.2 Computations Related to the Number of Parcels in Our Simulations
In this section, we use data for the historical spill, such as impacted area and beach length, and volume of oil, to determine if our selections for number of parcels are reasonable.
Three and a half acres were impacted, or about 14,164 m 2 . Roughly 2/3 of the total impacted area was heavily impacted. Half of the heavily impacted area was a gravel beach, and the other half a sand beach, roughly 4721 m 2 each half, for a total heavily impacted area of 9442 m 2 . The maximum estimated amount of oil was 4800 gallons or about 18.2 m 3 .
We can compute an upper bound for the oil holding capacity (OHC), by assigning all the oil at once, to the area that was heavily impacted: 18.2 m 3 /9442 m 2 = 0.0019 m 3 /m 2 . This is just below the mean, and about an order of magnitude smaller than the maximum volume of oil per area reported for OHC in [22] (their Table 2). Thus, data from this spill is consistent with data collected and published elsewhere.
If we divide the total volume 18.2 m 3 by 30, the number of oil parcels used in some of our simulations, we get that each parcel would have a volume of about 0.061 m 3 . For the 1000 parcel simulation, each parcel would represent about 0.0182 m 3 of oil. Using the upper bound for volume of oil per meter squared, computed above from Point Wells spill data, this means each parcel would impact, at least, an area of 0.061/0.0019 = 32.1 m 2 . This is the minimum area that each parcel would need to impact. When using a thousand parcels, each parcel would need to impact at least about 9.6 m 2 .
The length of the beach that was impacted was reported to be 2400 m (1.5 miles); considering the impacted area, this means the stretch of impacted beach was, in average, almost 6 m wide. Considering the minimum impacted area computed above for the 30 parcel scenario, this implies a reasonable stretch of beach of about 5 m, which is about 6 m wide. For the 1000 parcel scenario, each parcel would need to impact a stretch of beach of about 1.6 m by 6 m wide, which is also reasonable. Thus, we can conclude that our number of particles are not unreasonable, as far as the OHC is concerned.
In terms of the impacted area, we know a beach stretch of 2400 m was impacted, and if we divide this by 30 oil parcels, each parcel must stretch about 80 m long, which is also not an unreasonable quantity, and if we divide it by 1000 oil parcels, each parcel would need to stretch for about 2.4 m along the beach.
With the 30 parcel scenario, we arrived at a stretch of beach 5 m long when considering OHC, and a stretch of beach of about 80 m long when considering the length of the beach. This discrepancy suggests that, while neither of the values are unreasonable, it is likely, however, that if our goal was to model the beaching in more detail, a higher number of oil parcels would likely be desirable.
We emphasize that, with the level of detail that we are modeling, we are only looking for the trajectories to end at the correct beach. We do not suggest, or explore, anything regarding the distribution of oil on the beach. The computations in this appendix are merely to show that the number of particles we use are not unreasonable, given the data we know.

Appendix A.3 Parameterizing Missing Wind-Forced Physics
In this appendix, we explain the rationale behind using 6% of the wind speed. The ocean model solution we used has a relatively coarse resolution in the vertical, using only ten terrain-following levels. The shallowest velocity in this FVCOM implementation is a velocity depth averaged over the upper 4-6 m (depending on the bathymetry, as the vertical coordinate is terrain-following). Such a vertically averaged Ekman velocity is considerably smaller than the Ekman velocity at the sea surface. To show this, we used a simplified model for the wind-driven velocity from [42] (pp. 22-24) using a constant wind forcing of (u, v) = (5, 5) meters/second, which is representative of the wind during the Point Wells spill. We use the solution of this model to depth-average the wind-driven ocean current velocity over the upper 6m, and we find that the magnitude is about one fourth of that at the magnitude at the surface ( Figure A5). We therefore conclude that in our region of interest, the ocean model Ekman sea-surface circulation is likely underestimated by a factor of about 4. A typical wind coefficient for parameterizing the turbulent transfer of wind momentum is 3% of the wind speed, a value that can be derived approximately, from first principles (e.g., [43,44] (pp. 150-156)). terrain-following). Such a vertically averaged Ekman velocity is considerably smaller than the Ekman velocity at the sea surface. To show this, we used a simplified model for the wind-driven velocity from [42] (pp. 22-24) using a constant wind forcing of (u, v) = (5, 5) meters/second, which is representative of the wind during the Point Wells spill. We use the solution of this model to depthaverage the wind-driven ocean current velocity over the upper 6m, and we find that the magnitude is about one fourth of that at the magnitude at the surface ( Figure A5). We therefore conclude that in our region of interest, the ocean model Ekman sea-surface circulation is likely underestimated by a factor of about 4. A typical wind coefficient for parameterizing the turbulent transfer of wind momentum is 3% of the wind speed, a value that can be derived approximately, from first principles (e.g., [43,44] (pp. 150-156)). However, wind does not only cause an Ekman circulation. An additional forcing from wind that moves oil parcels at the sea surface and is not usually included in ocean models is windage. Indeed, windage is known to dominate over ocean currents when wind speed exceeds 4-5 m/s [23] (pp. 101-129). Common values for windage range from 1-4% of the wind speed [9]. In our simulations, we found that adding 6% of the wind to the ocean currents reproduced the historical trajectory relatively well. While 6% is within bounds of what has been observed and suggested in the literature for parameterizations of wind-related processes (cf. references in [21]), it is rather large for most applications. We suggest that while windage was a necessary addition, part of the 6% we used was needed to make up for the coarse resolution of the FVCOM implementation, as explained above. Thus, the 6% percent we use is meant as a linear combination (e.g., [45] (pp. 109-110)) of at least two independent physical processes: windage and turbulent transfer of momentum from the wind to the surface of the ocean (i.e., Ekman drift). We note that there are other processes that could be at play, and would be parameterized in the same way, thus similarly (and additionally) contributing as a linear combination of forcing mechanisms-please refer to the discussion above. This would further support the use of a large coefficient.
The analytical model we used above, to show that the ocean model vertical resolution is too coarse, is supported by recent, very high-resolution observations focusing on the upper meters of the ocean. In their study, Laxague et al. [26] confirm that "the current magnitude averaged over the upper 1 cm of the ocean is shown to be nearly four times the average over the upper 10 m, even for mild forcing". This constitutes a reasonably good agreement with the simplified theoretical model we use, which includes a law-of-the-wall velocity for the first centimeter as described in Csanady [42] (pp. [22][23][24]. Averaging over the upper 10 m (as in the observations by Laxague et al. [26]), our simplified, theoretical model predicts that the velocity over the first centimeter would be 5 times greater than the average over the upper ten meters, which compares reasonably well to the almost 4-times greater velocity found by Laxague et al. [26] in the upper two centimeters.
Further research will be needed to isolate the physical mechanisms driving the Point Wells oil spill.