Use of Random Forest Model to Identify the Relationships among Vegetative Species, Salt Marsh Soil Properties, and Interstitial Water along the Atlantic Coast of Georgia

Abstract: Saltmarshes, known to be ecologically sensitive areas, face disturbances such as vegetation dieback due to anthropogenic activities such as construction. The current construction specifications recommended by state highway agencies do not specifically require documenting or restoring any prior saltmarsh soil/interstitial water properties, nor do they require re-establishing saltmarsh vegetation; restoring the abiotic properties and appropriate vegetation would enhance the long-term functionality and ecology of a disturbed area. In order to have a successful restoration of disturbed saltmarshes with healthy vegetation, the relationship between vegetative species and the properties of saltmarsh soils and interstitial water must be fully understood. In this study, field and laboratory tests were conducted for the soil samples from eight different saltmarsh sites in the Southeastern US Atlantic coastal region, followed by the development of a random forest model; the aim is to identify correlation among saltmarsh predominant vegetation types, redox potential, and salinity. The results reveal that moisture content and sand content are two main drivers for the bulk density of saltmarsh soils, which directly affect plant growth and likely root development. Moreover, it is concluded that deploying modern machine learning algorithms, such as random forest, can help to identify desirable saltmarsh soil/water properties for re-establishing vegetative cover in less time after construction activities.


Introduction
Saltmarshes provide a healthy environment for wildlife and plants, stabilize the coastline, protect structures from flooding, and improve water quality [1][2][3]. These areas have high net primary productivity and accommodate salt-tolerant vegetative species (halophytes), which play a vital role in coastal protection including soil erosion control [4,5].
The construction, reconstruction, and maintenance of transportation infrastructure cause disturbances in saltmarshes by altering storm-water runoff patterns [6,7], increasing soil bulk density [8], and changing the hydrology [9]. Utilizing heavy equipment, staging construction materials, and constructing access/egress roads in these environmentally sensitive areas alters surface elevation [10] and surface soil properties, including bulk density [4], redox potential [11], alkalinity [12], soil water content [13], and hydraulic conductivity [14], all of which affect vegetation health and result in the loss of ecological functionality in the impacted areas [8,15,16]. Saltmarsh dieback is identified by vegetation loss, yielding large expanses of bare mud [17], and is considered a geographically independent event [18]; some examples include Spartina alterniflora (syn. Sporobolus alterniflorus) dieback throughout the Mississippi River deltaic plain [19], Spartina alterniflora and Juncus roemerianus dieback in Georgia [18], Spartina alterniflora dieback in Louisiana [20] and Spartina townsendii dieback in Great Britain [21]. The causes of such vegetation diebacks are likely different and remain unknown in most cases [22], but construction activity and the accumulation of vegetation wrack are believed to be potential causes of vegetative dieback in saltmarshes [17,23]. Re-establishing the proper physicochemical environment after construction increases the likelihood of vegetative success [24]. Although the soil property changes caused by construction in saltmarshes expedite the dieback process, the relationship between soil properties and vegetation health is not fully understood [18,25].
A key restoration practice in an impacted saltmarsh is to re-establish native halophytes [26]. To ensure successful vegetation re-establishment, the structure and the composition of the underlying hydric soil should be returned to baseline or improved conditions once postconstruction restoration practice is carried out [27], but these data are often unavailable. Establishing the baseline based on reliable data improves the likelihood of returning hydrologic and ecological functionality if sites were restored to the pre-existing conditions after construction activity [28,29]. To improve the efficacy of restoration efforts in impacted saltmarshes, the research presented herein investigates in situ saltmarsh soil physical and chemical properties to provide a baseline for the re-establishment of soil properties with the expectation to re-establish vegetative success and ecological function after construction impacts.

Soil Properties and Vegetation
Because fundamental soil properties such as organic matter content, texture, and bulk density influence the physicochemical environment in saltmarshes, the roles of each factor should be considered in the re-establishment success of target species. Soils low in organic matter tend not to be capable of supporting vegetation due to poor nutrient cycling in restored wetlands [30]. Soil organic matter is the key source of the essential nutrients for vegetative species growth [31].
Soil texture influences both soil organic matter retention and moisture content; finer textured soils with high clay or silt content have a greater capability of holding water and organic matter compared to coarse soils with high sand content [11,32]. Soil texture contributes to bulk density, which reflects the soil's structural stability to support vegetation growth against the erosive impacts of tidal flooding; however, a bulk density greater than 1.6 g/cm³ generally is not suitable for root and plant growth in saltmarshes [33]. Because highly compacted soils restrict plant growth and root development, the following maximum bulk density values (bulk density thresholds for root penetration) are recommended based upon soil texture [34][35][36]: for clay, sandy clay, silty clay and clay loam; for silt loam, silty clay loam, silt, silt loam, sandy clay loam, clay loam, sandy loam and loam; 1.6 g/cm³ for sand and loamy sand. An increase of soil bulk density from 1.1 to 1.4 g/cm³ yielded a 42% reduction in oxygen diffusion rate through waterlogged saltmarsh soil, while the induced changes in soil bulk density from 1.1 to 1.7 g/cm³ resulted in a 75% reduction in the rate of oxygen diffusion [37]. Further, as soil organic matter decreases, bulk density increases, which can inhibit vegetation growth in restored saltmarshes [38,39]. Therefore, low organic matter content and high bulk density impact biogeochemical processing and restrict the root growth of some hydrophytes [40].
Statistical regression models have successfully used organic matter content as a predictor of soil bulk density [41,42]. Linear and exponential models for bulk density prediction are based on organic matter content, but such statistical models show high variability in the bulk density response.
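As an illustration of the kind of exponential model mentioned above, the following minimal sketch fits BD = a·exp(−b·OM) by ordinary least squares on ln(BD). The functional form, the function name `fit_exponential`, and the input values are assumptions made for this example; the models in [41,42] may be parameterized differently, and the numbers below are synthetic rather than measurements from this study.

```python
import math

def fit_exponential(om_percent, bulk_density):
    """Fit BD = a * exp(-b * OM) by least squares on ln(BD) (assumed model form)."""
    n = len(om_percent)
    log_bd = [math.log(bd) for bd in bulk_density]
    mean_x = sum(om_percent) / n
    mean_y = sum(log_bd) / n
    sxx = sum((x - mean_x) ** 2 for x in om_percent)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(om_percent, log_bd))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # ln(BD) = ln(a) - b * OM, so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope

# Synthetic, illustrative values only (NOT data from this study):
# organic matter in percent, bulk density in g/cm^3.
om = [1.0, 3.0, 6.0, 10.0, 15.0, 25.0]
bd = [1.6 * math.exp(-0.06 * x) for x in om]

a, b = fit_exponential(om, bd)
print(f"a = {a:.3f} g/cm^3, b = {b:.3f} per % OM")
```

Because the synthetic points lie exactly on an exponential curve, the fit recovers the generating parameters; with field data the residual scatter would reflect the high variability noted above.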

Interstitial Water Properties and Saltmarsh Vegetation
Re-establishing a native halophyte in a disturbed saltmarsh can be expedited by mimicking the relationship among interstitial water parameters, such as salinity, redox potential (Eh), and hydrogen ion concentration index of solution (pH), and vegetation health. In other words, successful restoration is accomplished by considering the ideal range of salinity, pH, and Eh for halophyte re-establishment [25]. Some typical saltmarsh species such as S. alterniflora are water-tolerant and ideal for colonization and proliferation in low-Eh environments through anatomical adaptations such as aerenchyma formation, which allows the plant to transport oxygen from its shoots to roots [43]. Such water-tolerant species take advantage of the anatomical adaptation strategies to combat the intensity of soil reduction by transferring oxygen from the shoots to roots and restricting the buildup of ethanol to toxic levels [43].
S. alterniflora and J. roemerianus, two predominant halophytes along the Southeastern Atlantic coast, tend to establish and develop in higher-salinity and lower-Eh saltmarshes than Schoenoplectus tabernaemontani [44]. The root system and some sophisticated metabolic adaptations such as ion exclusion in roots and ion secretion in shoots by salt glands are two important strategies helping S. alterniflora adapt to high salinity (more than 45 PSU) environments [45,46]. On the other hand, Borrichia frutescens is a ubiquitous plant in saltmarshes along the Atlantic coastline in the United States where it is not exposed to daily tidal inundations [27,47]. This species is able to grow in high salinity soils ranging from 20 to 50 PSU [48].
To date, the use of modern machine learning methods, such as random forest (RF), for estimating saltmarsh soil properties and classifying vegetation type has not been investigated or published. In this study, the RF algorithm as a versatile method is used to characterize soil attributes at rooting depth and classify vegetation canopy at the sampling sites. Moreover, we use RF to determine the most important parameters that affect the establishment and development of saltmarsh vegetation. The outcome of this study helps to guide engineers to conduct successful restoration practices in disturbed tidal saltmarshes. Knowing the relationship between halophytes and soil parameters optimizes restoration design and provides target species with ideal growth conditions. Further, findings from this study are beneficial for monitoring saltmarshes and detecting the changes in soil conditions due to both anthropogenic and naturogenic disturbances.

Study Sites
Eight saltmarsh sites along Georgia's Atlantic coast were selected based on their proximity to infrastructure improvement projects (Figure 1). Each selected site was divided into three different zones (A, B, C) depending on the dominant vegetative species for soil sampling. The dominant vegetative species were determined based on which species represented more than 50% coverage in each selected site.

Sample Collection and Soil Physical Characteristics
Triplicate soil samples from each site were obtained to quantify the variability within and between locations. A total of 24 soil samples were collected in the rooting zone at the sites and kept intact in sealed waterproof containers to avoid moisture loss. All samples were transported to a laboratory within four hours and stored at 4 °C for the measurement of moisture content, organic matter content, and particle size distribution. The bulk density of an undisturbed soil sample from the root zone was measured in accordance with the core method [49]. Porewater was withdrawn from the root zone using a pushpoint sampler and 60 mL syringes [50]; the salinity, pH, and Eh of the porewater were measured in the field using a calibrated HI98194 portable meter (Hanna Instruments, Woonsocket, RI, USA).
Particle size distribution was determined by sieve (American Society for Testing and Materials (ASTM) D1140-17) and hydrometer (ASTM D422) methods, and the results were used to classify the soils following the USDA soil classification system. Moisture and organic matter content were measured based on ASTM D2216-10 and ASTM D2974-87, respectively.

Random Forest
Tree-based models are nonlinear and extremely flexible for data fitting. However, single tree models suffer from high variance. To overcome this issue, tree-based ensemble methods have been widely adopted. Random forest (RF) is one of the popular ensemble methods that has been widely applied across different fields; it leverages a collection of trees that are constructed from bootstrapped samples of a training data set. Besides bootstrapping, RF considers the split candidates from a randomly chosen subset of the original feature set, which results in decorrelated trees and reduces prediction variance. RF has been used in soil science for characterizing and modeling soil organic matter distribution [51][52][53][54]. This study aims to explore the utility of RF for classifying halophyte types based on soil and interstitial water properties and to identify the most important parameters that influence the halophyte type and community at saltmarshes found along Georgia's coast.
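The two ingredients described above, bootstrapped training samples and a random feature subset at each split, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the Gini split criterion, the tree depth limit, and the number of trees are all choices made for the example, and the data are synthetic.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that most reduces Gini impurity, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold: midpoint of neighbouring values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, n_try, rng, depth=0, max_depth=5):
    """Grow one tree; a leaf is a class label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    # Only a random subset of features is considered at each split,
    # which is what decorrelates the trees in the forest.
    feature_ids = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, feature_ids)
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left],
                      n_try, rng, depth + 1, max_depth),
            grow_tree([r for r, _ in right], [y for _, y in right],
                      n_try, rng, depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, labels, n_trees=50, n_try=1, seed=0):
    """Fit n_trees trees, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Classify a row by majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Synthetic, illustrative data only (NOT measurements from this study).
# Columns: [salinity (PSU), Eh (mV)]; labels are hypothetical vegetation classes.
X = [[4, -100], [5, -120], [3, -90], [6, -110],
     [28, -300], [30, -320], [26, -280], [29, -310]]
y = ["low-salinity species"] * 4 + ["high-salinity species"] * 4

forest = fit_forest(X, y)
print(predict_forest(forest, [5, -105]))
print(predict_forest(forest, [27, -305]))
```

In practice a maintained library implementation would normally be preferred; the sketch is only meant to make the bootstrapping and feature-subsampling steps concrete.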

Saltmarsh Soils Physical Properties
The soil properties obtained from the selected sites are summarized in Figure 2 and Table 1. The soil particle size fractions show considerable variability in soil composition within a relatively small area at a saltmarsh site (Figure 2). For example, site 3.A is near site 3.B, but they are classified as clay and sandy loam, respectively. Site 3.A has 44.57% clay and 26.15% sand content, while 3.B contains 17.07% clay and 71.80% sand. Site 7 (classified as loamy sand) has the highest average bulk density (1.56 g/cm³) and the lowest average organic matter content (1.38%), with 80% sand on average (Figure 2). On the other hand, site 4 has the lowest average bulk density (0.24 g/cm³) as well as the highest organic matter and moisture content (Table 1). Fine-textured soil and high organic matter content result in the low bulk density of soils at site 4. Soil texture has a high spatial variability in saltmarshes and is expected to change when disturbed by construction, so spatial variability should be considered in restoration practice because it influences soil bulk density, organic matter retention, water retention, Eh, and salinity, all of which contribute to native vegetation re-establishment. Organic matter content ranged from a minimum of 0.24% at site 7.A to a maximum of 28.8% at site 4.C (Table 1). Soil at site 4.C has the highest organic matter content as well as the finest texture (clay and silt) among the study sites. Sites 3 and 7 have relatively higher sand content and bulk density, which expedite water drainage and particulate organic matter loss due to daily inundations. The sandy soils at these sites drain rapidly and contain little organic matter and thus few nutrients; because of this, these sites are not likely to support a diverse vegetation community.
Moreover, high sand soils in close proximity to seawater tend to have high salinity due to a high rate of moisture loss through drainage and evaporation, which makes a less favorable environment for vegetation growth. It is believed that the vegetative species that can survive at lower pH levels would be the dominant vegetative species in this soil environment [11].

Relationship among Vegetative Species, pH, Organic Matter, and Elevation Gradient
S. alterniflora is the dominant halophyte at sites 1, 2, 6, and 8, and S. tabernaemontani is dominant at sites 4 and 5. Sites 3 and 7 are dominated by J. roemerianus and B. frutescens, respectively.
During the site visit in June 2018, anaerobic conditions and circumneutral pH were observed at all sampling sites. Table 2 summarizes the measurements of salinity, pH, and Eh at all sampling sites along with the dominant vegetative species. The pH level for all the sites exceeds 4 (Table 2), which increases the saltmarsh capability to support halophytes because acidic soils tend to be low in nutrients necessary for vegetation growth, such as nitrogen and phosphorus [55]. S. tabernaemontani and J. roemerianus grow at lower pH, while B. frutescens and S. alterniflora grow at sites that have a higher pH than those of the other halophytes (Table 2). To confirm this observation, a Tukey HSD test was performed, which indicated that mean pH differs significantly among the dominant halophytes (Table 3). As shown in Table 3, a significant difference in mean pH was found between B. frutescens and J. roemerianus, B. frutescens and S. tabernaemontani, S. alterniflora and J. roemerianus, and S. alterniflora and S. tabernaemontani.

Relationship of Bulk Density and Organic Matter to Vegetation
Bulk density (BD) differs clearly among the dominant vegetation types, suggesting that BD can influence vegetation patterns in saltmarshes (Figure 3). S. tabernaemontani grows in soils with a BD of 0.478 g/cm³, which is significantly lower than that of B. frutescens (p-value = 0.0001), J. roemerianus (p-value = 0.017), and S. alterniflora (p-value = 0.033) (Figure 3). Sites 3 and 7 show the highest average bulk density among all study sites and are dominated by J. roemerianus and B. frutescens, respectively (Figure 3). These two species appear to be capable of dominating high bulk density soils.
Bulk density influences water and gas movement within the soil [4], and because of this, it affects saltmarsh plant productivity and growth rate. Bulk density at sites 3.B and 7.B surpasses the bulk density thresholds for root penetration. Sites 3.B (sandy loam) and 7.B (loamy sand) have bulk densities of 1.504 g/cm³ and 1.667 g/cm³, respectively, which restricts vegetation growth at these two sites. Organic matter is an important factor influencing soil bulk density in tidal marshes: as organic matter decreases or as sand content increases, bulk density increases. Site 7 has the lowest average organic matter content, the highest bulk density, and the highest sand content. Therefore, it is inferred that bulk density is a function of the percentages of mineral and organic matter in the soil substrate.

Discussion
The RF model indicates that Eh and salinity are the two most important parameters for halophyte classification (Figure 4). In other words, Eh and salinity are two contributing factors dictating the vegetation type and structure at a saltmarsh site. The RF classification model had an accuracy (the number of correctly classified data instances over the total number of data instances) of 100%. Further, the RF model suggests that moisture content and salinity are the most important parameters for predicting bulk density and Eh, respectively (Figure 5). The MSE and R² for the bulk density model are 0.037 and 0.849, respectively. The Eh model has an MSE of 3339.231 and an R² of 0.321.

Figure 5. The parameter importance for predicting bulk density and redox potential (Eh) by random forest.
The measured versus predicted values of the bulk density and Eh models based on moisture content and salinity have strong correlations of 0.964 and 0.872, respectively (Figures 6 and 7). The measured Eh of the samples varied from −12.1 to −373.5 mV, with a tendency toward underprediction (i.e., more negative) at the higher measured Eh (i.e., less negative) (Figure 6). The measured soil bulk density varied from 0.314 to 1.501 g/cm³, and the prediction of bulk density does not show a clear tendency toward underprediction or overprediction (Figure 7). The slopes of both regression lines are nearly 1 (Figures 6 and 7), suggesting that the model predictions are close to what was observed, and the p-values of both models are less than 0.05.
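For readers who want to reproduce this kind of model evaluation, the sketch below computes the mean squared error (MSE), the coefficient of determination (R²), and the Pearson correlation between measured and predicted values. The function names and the numbers are hypothetical and chosen only for illustration; how the reported metrics were computed in this study (for example, on out-of-bag or held-out samples) is not specified here, so the sketch makes no assumption about that.

```python
import math

def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical bulk density values in g/cm^3 (NOT data from this study).
observed = [0.3, 0.6, 0.9, 1.2, 1.5]
predicted = [0.4, 0.6, 0.9, 1.1, 1.4]

print("MSE:", mse(observed, predicted))
print("R^2:", r_squared(observed, predicted))
print("r:  ", pearson_r(observed, predicted))

# A constant offset leaves the correlation at 1 but lowers R^2, because
# correlation ignores systematic bias while R^2 as defined here penalizes it.
biased = [o + 0.5 for o in observed]
print("r (biased):  ", pearson_r(observed, biased))
print("R^2 (biased):", r_squared(observed, biased))
```

The last two lines illustrate why a strong measured-versus-predicted correlation and a modest R² are not contradictory in general: the two statistics answer different questions.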
Knowing the relationship between vegetation and hydric soil attributes improves a saltmarsh restoration practice and leads to successful vegetation re-establishment. Saltmarshes adjacent to a construction site or exposed to future disturbances should be characterized in terms of the soil, interstitial water, and vegetative species prior to disturbance. When the targeted soil and interstitial water properties such as Eh, bulk density, and salinity are returned, the time required for re-establishing the vegetative cover after construction activity is likely to be reduced, and the density and vigor of natural vegetation in disturbed areas are likely to improve, thus leading restoration practitioners toward a stronger chance of favorable outcomes. Therefore, the results from this study can help to guide scientists and engineers toward successful restoration in disturbed tidal saltmarshes.

Conclusions
Understanding the relationship among halophytes, soil, and interstitial water parameters can optimize restoration designs and provide the target species with ideal growth conditions.

• Mean bulk densities for sites supporting S. tabernaemontani and B. frutescens are 0.323 g/cm³ and 1.560 g/cm³, respectively. B. frutescens was able to establish and develop in soils that have a relatively high bulk density, up to 1.670 g/cm³, in comparison to the other vegetation, which is a result of high sand content or low organic matter content. B. frutescens was found in soils with the highest average bulk density (around 1.560 g/cm³) and the lowest average organic matter content (i.e., 1.383 percent). We found that S. tabernaemontani grows in the soil with the lowest average bulk density (0.478 g/cm³) and the highest average organic matter content (13.83 percent) in comparison to the other vegetative species observed in this study.

• At the 95% confidence level, the salinity level of S. tabernaemontani is significantly different from that of B. frutescens and S. alterniflora. S. tabernaemontani has the lowest and S. alterniflora has the highest average salinities, which are 3.783 PSU and 27.873 PSU, respectively. High salinity inhibits S. tabernaemontani growth in coastal marshes, while the other studied species tend to be more salt tolerant. Vegetative species in coastal marshes have different tolerances to salinity, so these tolerances should be considered in any restoration practice for disturbed salt marshes.

• The results of the random forest models indicate that the soil properties of saltmarshes are interrelated and influenced by interstitial water and vegetative species. With the random forest models, the targeted soil/interstitial water properties such as redox potential (Eh), bulk density (BD), and salinity can be predicted, which can help to reduce the time required for re-establishing the vegetative cover after construction activity and can be beneficial for saltmarsh restoration.

Funding: The work presented in this paper is part of a research project (RP 17-11) sponsored by the Georgia Department of Transportation. The contents of this paper reflect the views of the authors, who are solely responsible for the facts and accuracy of the data, opinions, and conclusions presented herein. The contents may not reflect the views of the funding agency or other individuals.