Avian Conservation Areas as a Proxy for Contaminated Soil Remediation

Remediation prioritization frequently fails to evaluate the underlying ecological value of different sites systematically. This study presents a novel approach to delineating sites that are both contaminated by any of eight heavy metals and of high habitat value to high-priority species. The conservation priority of each planning site was based on the projected distributions of eight protected bird species, simulated using 900 outputs of species distribution models (SDMs) and the subsequent application of a systematic conservation tool. The distributions of heavy metal concentrations were generated using a geostatistical joint-simulation approach. The uncertainties in the heavy metal distributions were quantified in terms of variability among 1000 realization sets. Finally, a novel remediation decision-making approach was presented for delineating contaminated sites in need of remediation based on the spatial uncertainties of multiple realizations and the priorities of conservation areas. The results demonstrate that up to 42% of areas of high conservation priority are also contaminated by one or more of the heavy metals of interest. Moreover, as the proportion of land proposed for remediation increased, the projected area of pollution-free habitat also increased, as did the overall uncertainty in terms of the false positive contamination rate. These results indicate that the proposed decision-making approach successfully accounted for the intrinsic trade-offs among a large area of pollution-free habitat, low false positive rates and robustness of expected decision outcomes.


Information of Selected Species
The eight selected bird species inhabit agricultural areas of Taiwan. These species are locally threatened by various pressures and are protected by the Taiwanese government [1]. These eight species are not an exhaustive list of all endangered species that are affected by heavy metal contamination, but serve as an example to demonstrate the effectiveness of the remediation prioritization method for species conservation proposed in this study.
Accipiter trivirgatus is a resident raptor species that is relatively common in wooded areas and in open areas at lower elevations. It preys on rodents, birds, lizards and frogs. Milvus migrans is a resident raptor species that inhabits agricultural and aquatic areas in the lowlands of Taiwan. The population was common and widespread in Taiwan before the 1970s but has declined sharply since then (fewer than 300 individuals after 1990) for a variety of reasons, not least of which is poisoning. It feeds on various vertebrates but mainly on carcasses. Otus lettia is a nocturnal raptor species that is relatively common in wooded areas at lower elevations of Taiwan. It preys on insects, rodents, birds, frogs, and lizards. Hydrophasianus chirurgus is a rare resident species that prefers ponds covered by floating plants. It once had a vast distribution and large population in Taiwan but is currently limited to a few hundred individuals in southern Taiwan. It feeds on aquatic insects, tadpoles, spiders, and snails. Rostratula benghalensis is a common resident species of Taiwan that inhabits rice paddies and wetlands on plains. It feeds on insects, snails, earthworms, and seeds. Glareola maldivarum is a relatively uncommon summer visitor to Taiwan that prefers partially barren land, especially arid farmland. It feeds mainly on aerial insects. Garrulax taewanus is a species endemic to Taiwan and is uncommon in lower elevation areas. It prefers shrubs and forages mainly on insects, fruits and seeds. Acridotheres cristatellus is a resident species of Taiwan that prefers open areas. Its population has been greatly reduced by interspecific competition with invasive mynas. It prefers foraging close to the ground and feeds mainly on insects and fruits.

Background Information of Heavy Metal Samples
Sample units corresponded to grid cells of 100 ha, which was also the cell resolution of the distribution simulations. The heavy metal concentration in each 100 ha grid cell was calculated as the average of 30 points within the cell. A total of 2183 soil sample units from across Taiwan were considered. The mountainous center of the study area was excluded: its soil had been identified early in the survey process as containing relatively low heavy metal concentrations and was therefore not included in the extensive national survey.
Note: The x axis represents distance from 0 to 160 km. The y axis represents semivariance in (log(mg/kg + 1))^2. All of the variograms were derived using GS+ software.
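The semivariance plotted in the variograms can be illustrated with a minimal sketch of the classical empirical estimator applied to log-transformed concentrations. This is not the GS+ procedure itself. The function names, the coordinates, the concentrations, the lag tolerance and the choice of a base-10 logarithm are all assumptions made for illustration.

```python
import math

def log_transform(concentrations_mg_kg):
    # log(c + 1) transform; base 10 is an assumption (the text only says "log")
    return [math.log10(c + 1.0) for c in concentrations_mg_kg]

def empirical_semivariance(coords_km, values, lag_km, tol_km):
    """Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)),
    taken over the N(h) pairs whose separation is within tol_km of lag_km."""
    squared_diff_sum, n_pairs = 0.0, 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            distance = math.dist(coords_km[i], coords_km[j])
            if abs(distance - lag_km) <= tol_km:
                squared_diff_sum += (values[i] - values[j]) ** 2
                n_pairs += 1
    return squared_diff_sum / (2 * n_pairs) if n_pairs else float("nan")

# Hypothetical grid-cell centres (km) and cell-averaged concentrations (mg/kg)
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
z = log_transform([9.0, 99.0, 9.0, 99.0])

gamma_1km = empirical_semivariance(coords, z, lag_km=1.0, tol_km=0.1)
gamma_2km = empirical_semivariance(coords, z, lag_km=2.0, tol_km=0.1)
```

Repeating the calculation over a sequence of lags gives the points to which a variogram model is then fitted.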

Generalized Linear Model (GLM)
The GLM provides the probability of species presence at each location based on the driving factors considered [3]. The model quantifies the connection between species occurrences and the driving factors according to the following equation:
\[ \ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ji} \]
where $p_i$ is the probability of species occurrence at location $i$; $k$ is the number of driving factors; $x_{ji}$ is driving factor $j$ at location $i$; $\beta_0$ is the estimated intercept coefficient; and $\beta_j$ is the estimated coefficient corresponding to driving factor $j$.
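As a minimal sketch of how such a model turns driving factors into a presence probability, the following assumes the logit link commonly used for presence/absence data. The function name, the coefficients and the factor values are hypothetical and are not taken from the study.

```python
import math

def glm_presence_probability(x, beta0, beta):
    """Logistic GLM: p = 1 / (1 + exp(-(beta0 + sum_j beta_j * x_j)))."""
    # Linear predictor built from the intercept and one coefficient per factor
    eta = beta0 + sum(b_j * x_j for b_j, x_j in zip(beta, x))
    # Inverse logit maps the linear predictor to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical location with two driving factors
p_neutral = glm_presence_probability(x=[0.5, 2.0], beta0=-1.0, beta=[2.0, 0.0])
p_higher = glm_presence_probability(x=[1.5, 2.0], beta0=-1.0, beta=[2.0, 0.0])
```

In the first call the linear predictor is zero, so the predicted probability is 0.5. Raising the first factor, which has a positive coefficient, raises the probability.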

Generalized Additive Model (GAM)
The GAM provides the probability of species presence at each location based on nonlinear functions of the driving factors considered [3]. The model quantifies the connection between species occurrences and driving factors in the polynomial form given in the following equation:
\[ \ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \sum_{j=1}^{k} f_j(x_{ji}) \]
where $p_i$ is the probability of species presence at location $i$; $k$ is the number of driving factors; $x_{ji}$ is driving factor $j$ at location $i$; $\beta_0$ is the estimated intercept coefficient; and $f_j$ is the polynomial function of driving factor $j$, whose estimated coefficients are denoted $\beta_j$.
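The additive structure can be sketched as a sum of one polynomial per driving factor passed through an inverse logit. This is an illustration only. The logit link, the polynomial degree, the coefficient values and the function names are assumptions, and a fitted GAM would estimate the smooth terms from data rather than take them as inputs.

```python
import math

def polynomial_term(coeffs, x):
    # coeffs[d] multiplies x ** (d + 1); the constant is carried by beta0
    return sum(c * x ** (d + 1) for d, c in enumerate(coeffs))

def gam_presence_probability(x, beta0, poly_coeffs):
    """Additive predictor: eta = beta0 + sum_j f_j(x_j), each f_j a polynomial."""
    eta = beta0 + sum(polynomial_term(c, x_j) for c, x_j in zip(poly_coeffs, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical quadratic terms for two driving factors
coeffs = [[1.0, -1.0], [0.5, -0.25]]
p_a = gam_presence_probability(x=[1.0, 2.0], beta0=0.0, poly_coeffs=coeffs)
p_b = gam_presence_probability(x=[0.5, 2.0], beta0=0.0, poly_coeffs=coeffs)
```

Unlike the linear case, the response to the first factor is not monotonic here: the quadratic term peaks at 0.5, so `p_b` exceeds `p_a`.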

Support Vector Machine (SVM)
Based on the driving factors $x_j$ ($j = 1, 2, \ldots, k$), SVM assigns each location $i$ to one of two classes with corresponding labels $y_i = \pm 1$ (−1: species absence; +1: species presence). The optimal separating hyperplane is defined by maximizing the margin between the training points of classes −1 and +1. The discriminant function is defined by the following equation:
\[ f(\mathbf{x}) = \mathbf{w}^{T}\mathbf{x} + b \]
where $\mathbf{x}$ is the vector of driving factors; $\mathbf{w}$ is the normal vector of the hyperplane, with $k$ elements; and $b$ is the intercept of the hyperplane, which is a scalar. $f(\mathbf{x}) \geq 0$ represents species presence; in contrast, $f(\mathbf{x}) < 0$ represents species absence. $\mathbf{w}$ and $b$ were derived by minimizing the following equation [4]:
\[ \min_{\mathbf{w},\, b,\, \xi} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + \gamma \sum_{i=1}^{N} \xi_i \]
subject to
\[ y_i\left(\mathbf{w}^{T}\mathbf{x}_i + b\right) \geq 1 - \xi_i, \qquad \xi_i \geq 0, \]
where $\xi_i$ is the slack variable of training point $i$; $\gamma$ is a parameter that sets the penalty for misclassification; and $N$ is the number of training points.
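To make the roles of the hyperplane, the slack variables and the penalty parameter concrete, the sketch below evaluates a linear discriminant and the standard soft-margin objective for a given hyperplane. It does not perform the minimization. The weights, training points and penalty value are hypothetical, and the slack is taken as the smallest value that satisfies the margin constraint for each point.

```python
def discriminant(w, b, x):
    # f(x) = w . x + b
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b

def classify(w, b, x):
    # +1: species presence (f(x) >= 0); -1: species absence (f(x) < 0)
    return 1 if discriminant(w, b, x) >= 0 else -1

def soft_margin_objective(w, b, X, y, gamma):
    """Return 0.5 * ||w||^2 + gamma * sum(slack) and the slack values."""
    # Smallest slack satisfying y_i * f(x_i) >= 1 - slack_i and slack_i >= 0
    slack = [max(0.0, 1.0 - y_i * discriminant(w, b, x_i)) for x_i, y_i in zip(X, y)]
    objective = 0.5 * sum(w_j * w_j for w_j in w) + gamma * sum(slack)
    return objective, slack

# Hypothetical hyperplane and three training locations with two factors each
w, b = [1.0, 0.0], 0.0
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 1.0]]
y = [1, -1, 1]

objective, slack = soft_margin_objective(w, b, X, y, gamma=2.0)
labels = [classify(w, b, x_i) for x_i in X]
```

Only the third point lies inside the margin, so it alone has non-zero slack. A larger penalty would weight that violation more heavily against the margin width.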

Sequential Gaussian Simulation and Normal Score Transformation
Sequential Gaussian simulation (sGs) is a stochastic simulation that generates a set of realizations for each cell based on the intrinsic statistical characteristics of the available data, rather than simply estimating a single output such as a mean. Each realization produced by sGs is based on the conditional distributions of the observed variable. The simulation visits the cells of the entire surface sequentially, generating each value from the known data and from the values already simulated at nearby cells. The method assumes a multi-Gaussian distribution for each variable. The data therefore require a prior normal score transformation to ensure the normality of the distribution [5]. The normal score transformation is as follows:
\[ y(u) = G^{-1}\left(H^{*}\left(z(u)\right)\right) \]
where $y(u)$ is the transformed variable at site $u$, $G^{-1}(\cdot)$ is the inverse Gaussian cumulative distribution function (cdf), and $H^{*}$ is the sample cdf of the U-WEDGE factor $z$.
The transformed variable $y(u)$ satisfies the assumption of a Gaussian distribution, so sGs can be applied to $y(u)$. When the simulation is finished, the simulated normal scores are back-transformed to obtain simulated values on the original scale [6].
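The transformation and its inverse can be sketched with the Python standard library. This is a simplified illustration rather than the procedure used in the study. The plotting position (rank − 0.5)/n for the sample cdf, the linear interpolation in the back-transformation, the clamping at the tails, the function names and the data values are all assumptions, and ties are not treated specially.

```python
from statistics import NormalDist

def normal_score_transform(values):
    """Map each value to the standard normal quantile of its sample cdf."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the cdf strictly inside (0, 1); an assumption
        scores[i] = std_normal.inv_cdf((rank - 0.5) / n)
    return scores

def back_transform(score, values, scores):
    """Invert the transform by linear interpolation in the (score, value) table."""
    table = sorted(zip(scores, values))
    if score <= table[0][0]:
        return table[0][1]   # clamp below the smallest observed score
    if score >= table[-1][0]:
        return table[-1][1]  # clamp above the largest observed score
    for (s0, v0), (s1, v1) in zip(table, table[1:]):
        if s0 <= score <= s1:
            if s1 == s0:
                return v0
            return v0 + (v1 - v0) * (score - s0) / (s1 - s0)

# Hypothetical skewed sample
data = [3.0, 0.5, 12.0, 1.2, 7.5]
scores = normal_score_transform(data)
recovered = [back_transform(s, data, scores) for s in scores]
```

The transform preserves rank order, so the median observation maps to a score of zero and back-transforming the scores recovers the original data.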