Stream Chemistry and Forest Recovery Assessment and Prediction Modeling in Coal-Mine-Affected Watersheds in Kentucky, USA

Abstract: Kentucky is one of the largest coal-producing states; surface coal mining there has altered natural land cover, caused soil loss, and changed water quality. This study explored relationships between actively mined and reclaimed areas, vegetation change, and water quality parameters. The study evaluated 58 watersheds using Landsat-derived variables (reclamation age and the percentages of mining, reclaimed forest, and reclaimed woods) as well as topographic variables (elevation, slope, drainage density, and infiltration). Water samples were collected in spring (n = 9), summer (n = 14), and fall (n = 58) 2017 to study changes in water quality variables (SO₄²⁻, alkalinity, conductivity, Ca²⁺, Mg²⁺, Mn²⁺, Al³⁺, Fe²⁺, and Fe³⁺) in response to changes in land cover. Pearson correlation analyses indicated that conductivity has strong to very strong relationships with water quality variables related to coal mining and with land cover variables, but not with Al³⁺, Fe²⁺, Fe³⁺, Mn²⁺, elevation, slope, or drainage density. In addition, separate regression analyses were performed, with conductivity values based on samples collected in the fall. First, conductivity responses to mining percentage, reclamation age, and topographic variables were examined (adjusted R² = 0.818, p < 0.01). Next, vegetation cover change parameters were added to the same model, which slightly improved the fit (adjusted R² = 0.826, p < 0.01). Finally, reclamation age and mining percentage were used to explain the quantity of reclaimed forested areas as a percentage of watersheds. The model was significant (p < 0.01), with an adjusted R² of 0.641. Results suggest that mining percentage and reclamation age may be predictors of the quantity (area as a percentage) of reclaimed forests.
This study indicated that conductivity is a predictable water quality indicator that is highly associated with coal-mine-related stream chemistry (CMRSC) in areas where agriculture and urban development are limited. Water quality is not suitable for various purposes due to the presence of contaminants, especially in mined sites. These findings may help the scientific community and key state and federal agencies improve their understanding of water quality attributes in watersheds affected by coal mining and refine land reclamation practices while such practices are in action.


Introduction
Coal deposits were formed millions of years ago, before dinosaurs roamed the earth, and have been used by our ancestors for the past 25,000 years [1]. They are the largest source of energy in the world, with significant reserves in the U.S., China, Russia, Australia, and India [2]. In the U.S., there are three major regions in 25 states with coal mining activities generating more than fifty percent of the electricity produced in the nation [3]. The Appalachian region, including Kentucky, Tennessee, Virginia, and West Virginia, produces more than one-third of the coal in the nation [2]. Mining activities increased rapidly throughout the world due to the high demand for precious minerals and energy in recent decades, accounting for a total of 17.3 billion tons in 2020 [4] from a global area of 101,583 km² [5,6].
Water is an essential natural resource; however, human activities, including land use and climate change, impact both water quantity and quality, which creates scarcity issues and threatens aquatic biodiversity [7]. The impacts of coal mining activities on surface and groundwater quality have been documented by various authors [8–11]. Coal mining is one of the dominant factors degrading water quality in the Appalachian region, where hundreds of headwater streams are impacted [12–15]. Coal mining releases numerous pollutants directly or indirectly; the coal fuel cycle is among the most dangerous activities for the earth's ecosystems, threatening human health; contaminating air, water, soil, sediment, and vegetation; and contributing to global warming [3,10,16,17].
Commercial coal production started in Kentucky before it was recognized as a state. In 1790, 18 tons (20 short tons) of coal were produced in the first commercial coal operation in what is now Lee County. The demand for energy led to an increase in coal production, which peaked at more than 162 million tons in 1990 [18]. Kentucky was the fifth-largest coal-producing state in 2016, with 42 million tons of coal produced [19]. Although production has declined consistently since 1990, seventy-nine percent of Kentucky's net electricity generation was coal-fired in 2017 [19].
Even though coal-generated electricity is less expensive than other energy sources, its impact on the environment is under debate [12,20]. The Surface Mining Control and Reclamation Act (SMCRA) of 1977 created two programs: one for regulating active coal mines and a second for reclaiming abandoned mine lands [21]. Many studies, however, have shown that reclamation practices often yield ineffective results [12,20,22,23].
Major coal mining impacts generally observed in Kentucky and Central Appalachia are natural land cover loss, hydrological pattern changes, valley fill, acid drainage, and water quality degradation (Figure 1) [20]. Mountaintop coal mining, a method widely used in the Appalachian region, involves blasting rocks, clearing forests, and removing soil to reach coal reserves [24,25]. For example, the landscape was greatly modified by smelting and mining activities in Copperhill, Tennessee, located in the Blue Ridge Mountains at the convergence of northern Georgia, western North Carolina, and southern Tennessee [26]. Activities began in 1854 and resulted in a desert-like landscape during the 1900s [26]. Central Appalachia has the highest earth movement rate in the United States, with each surface mine generating large quantities of spoil that are typically translocated to stream valleys close to mining areas [27]. Mountaintop-removal coal mining requires forest clearing, which in some cases may be visible at a 1:3,000,000 scale in National Agriculture Imagery Program (NAIP) aerial images. Mountaintop mining poses a potential threat to the intact character of Appalachian forests [28]. Interior forest loss (a change from interior forest to forest edge) in southern Appalachia between 1991 and 2002 was 1.75 to 5 times higher than direct forest loss (loss from edges), which may have been the result of mountaintop mining [27]. Land cover alterations from active mines to reclaimed mines in central Appalachia occurred almost entirely in forested areas [20].
Gastauer et al. (2018) described modern functional and phylogenetic approaches as powerful tools for enhancing the success of mine land rehabilitation, together with additional advanced techniques such as remote sensing and metabarcoding [28]. Science-based reclamation and closure plans are in place to minimize coal mining impacts on the environment; however, high concentrations of contaminants remain widespread in the environment despite extensive remediation efforts by the US EPA, other federal agencies, and state agencies [29].
The Forest Reclamation Approach (FRA) provides a set of guidelines, based on research conducted over several decades, to promote the regrowth of forests on reclaimed mine lands; it requires using loose soil [23]. This could potentially lead to elevated concentrations of ions and metals; moreover, the effect of FRA on water quality needs further investigation [25]. Source-control practices for total dissolved solids that incorporate FRA may improve mine water quality [23]. Assessing the effect of FRA on water quality may require additional data on the spatial extent of FRA locations and reclamation age.
Reclaimed forests and other areas can be detected with remote sensing techniques [30,31] and assessed against water quality. Wei et al. (2011) found that, after seven years of monitoring, water quality improvement was more obvious in sub-watersheds that were heavily affected by past mining activities and reclaimed by reforestation than in lands with abandoned mines [32]. These findings demonstrate that good reclamation practices can have a positive influence on water quality over time.
Besides forest loss, surface coal mining may have a significant effect on soil hydrological properties [33]. Surface mining may have negative effects on watershed hydrologic characteristics [22,34]. As shown in Figure 2, valley fills can bury headwater streams under tens to hundreds of meters of spoil [24,25], which minimizes the drainage capacity of a valley [33,35].
Clark and Zipper (2016) researched hydrologic differences between a reforested area and a grass-covered area that were reclaimed 14 years prior to the study with similar processes [36]. The reforested area had a higher infiltration rate and the grassy area had more surface flow paths; however, the water quality in these two areas was not documented (Figure 3).
The Clean Water Act Amendment of 1972 (CWA) established standards for regulating pollutant discharges into surface water [37]. However, groundwater and surface water from mined areas may be contaminated with numerous toxic solutes such as sulfate (SO₄²⁻), iron (Fe), aluminum (Al), and selenium (Se) [38,39]. Increased conductivity, pH changes, and elevated dissolved ion levels are commonly found in streams near surface mines [12]. A relationship exists between water quality degradation and coal mining at various watershed scales, for example, elevated SO₄²⁻, alkalinity, conductivity, Ca²⁺, Mg²⁺, Mn²⁺, Al³⁺, and Fe²⁺/Fe³⁺ [12,23,40–42].
Green et al. (2000) found that conductivity fluctuated seasonally and was highest in summer and lowest in spring, which may be due to the dilution effect of water in the watershed [43]. Similarly, in consecutive seasons, conductivity was two or more times higher in mined watersheds than in unmined watersheds [43,44] (Table 1). These findings suggest that precipitation or seasonal conditions may affect the results; however, differences between unmined and mined areas are stable if water samples are collected under the same conditions, e.g., in the same season and on a clear day without rain.
Coal mining typically releases pyrite (FeS₂), which forms in association with coal [45–49]. When pyrite comes in contact with water and oxygen, it is oxidized by autotrophic bacteria, leading to acid mine drainage (Equation (1)) [50]. Carbonate minerals, e.g., calcite (CaCO₃) and dolomite (CaMg(CO₃)₂), can neutralize the acidity (Equations (2) and (3)) [51].
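Equations (1)–(3) do not appear in this text. As a hedged illustration only, the reactions are often written with the textbook stoichiometry below; the forms used in [50,51] may differ (for example, in how far iron oxidation is carried or whether carbonate dissolution is written to bicarbonate or to CO₂).

```latex
\begin{align}
% Pyrite oxidation releasing ferrous iron, sulfate, and acidity (commonly cited form)
2\,\mathrm{FeS_2} + 7\,\mathrm{O_2} + 2\,\mathrm{H_2O} &\rightarrow 2\,\mathrm{Fe^{2+}} + 4\,\mathrm{SO_4^{2-}} + 4\,\mathrm{H^+} \\
% Acid neutralization by calcite
\mathrm{CaCO_3} + \mathrm{H^+} &\rightarrow \mathrm{Ca^{2+}} + \mathrm{HCO_3^-} \\
% Acid neutralization by dolomite
\mathrm{CaMg(CO_3)_2} + 2\,\mathrm{H^+} &\rightarrow \mathrm{Ca^{2+}} + \mathrm{Mg^{2+}} + 2\,\mathrm{HCO_3^-}
\end{align}
```

Written this way, the reactions are consistent with the elevated SO₄²⁻, Ca²⁺, Mg²⁺, and alkalinity reported for streams draining mined watersheds.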
Topographic variables may play a role in stream chemistry, even though coal mine water quality researchers have rarely used them. Some land cover change studies (urban and agricultural) utilized topographic variables as contributing factors to stream chemistry [52,53]. Chen and Lu (2014) found that the mean watershed elevation and slope significantly correlated with conductivity [53]. In contrast, Haidary et al. (2013) did not find any significant relationship among the watershed slope, drainage density, and conductivity [52]. Bhatt et al. (2018) found a very strong correlation between elevation and electrical conductivity along a pristine watershed in the central Himalaya [54]. Pond et al. (2008) did not find any relation between elevation and stream chemistry in coal mine sites because the role of elevation in regulating stream chemistry was negligible in comparison to the acidic environment that controls the dissolution process and overall biogeochemical dynamics within the landscape [41]. For example, the most acidic waters that have been measured percolated through an underground mine near Redding, California. According to a USGS study, mine water at Iron Mountain, California, had exceptionally high sulfate concentrations and a pH of 0.5 [10]. The site is undergoing remediation by the US EPA [55].
Various studies have documented a negative correlation between coal mining and the health of local ecology, stream habitat, community structure, and ecosystem functions [56–58]. Prior studies have focused on water quality degradation [12,23,44,45] or land change due to coal mining [20,27], whereas few studies have focused on coal-mine-related stream chemistry (CMRSC) in mined watersheds along with land change [42,58,59], specifically the vegetation cover change in eastern Kentucky. Among those studies that have been conducted, Hopkins et al. (2013) found a high correlation between the mining percentage and both SO₄²⁻ and conductivity, while Hopkins et al. (2013) and Merriam et al. (2015) created general linear models between land use (surface mining, residential development, and underground mining) and specific conductivity [42,59]. Future comprehensive research should focus on the interrelationship among land, vegetation, and water quality changes over time after disturbances caused by active mining and reclamation [42]. Moreover, data are needed to determine how the spatial and temporal extent of surface coal mining affects the watershed based on the vegetation cover change, reclamation age, and topographic factors. The aim of the current research was to examine how coal mining affected water quality and temporal vegetation cover change between 1986 and 2017 in mined watersheds on a regional scale.

Study Site
This research was conducted in eastern Kentucky, U.S.A. (Figure 4). The research area covers the Johnson Creek, Troublesome Creek, and Quicksand Hydrologic Unit Code 10 (HUC10) watersheds in Magoffin, Knott, Perry, and Breathitt Counties. The research area was approximately 1768 km², including 58 stream reach watersheds, which were sub-watersheds of the HUC10 watersheds (Figure 5). Stream reach watersheds ranged from 2.65 km² to 16.94 km². This study defined unmined watersheds as those that were less than 5% mined [59,60]. Similarly, since fertilizer usage would have affected conductivity, only watersheds in which combined developed and agricultural land did not exceed 5% were selected, to control for agricultural and urban effects on CMRSC. The research area contained watersheds that were 2.5% to 90% mined, with reclamation ages ranging from 4 to 29 years.
Environments 2024, 11, x FOR PEER REVIEW
The weather was stable and there was no surface runoff on any of the sampling days. During the first two trips, 9 (spring) and 14 (summer) water samples were collected at drainage exit points. A water sample collected at a watershed's exit point was taken to represent that watershed's geological structure and land cover types [42]. Under dry (no precipitation) conditions, a perennial stream in a watershed is recharged by groundwater, and all stream water exits through the lowest elevation point within the watershed. Three water samples were collected from each exit point at five-meter intervals toward the headwaters.

Data Collection, Preparation, and Analysis
This study followed Kentucky ambient/watershed water quality monitoring procedures for collecting water samples [61]. Water samples were collected in high-density polyethylene (HDPE) bottles that were cooled to 4 °C. The samples collected in the spring and summer were sent to the West Virginia University National Research Center for Coal and Energy Water Analysis Lab for analysis. The samples were filtered at the laboratory through 11-micron filter paper (Whatman 1001-125, Pennsylvania, USA) to separate the sediments from the water. The samples were then analyzed for SO₄²⁻, alkalinity, conductivity, Ca²⁺, Mg²⁺, Mn²⁺, Al³⁺, Fe²⁺, and Fe³⁺. In the fall, a third sampling campaign measured conductivity in situ at a larger number of sites (n = 58) using a Hydrolab Quanta Water Probe (OTT Hydromet, Kempten, Germany). The conductivity levels were measured at the drainage exit points during the sampling. Data from the spring and summer sampling were used for a bivariate correlation analysis. Data from the fall visit, which had the largest sample size, were used in the regression models. Figure 6 shows a schematic of the water quality data collection, preparation, and analysis steps.

Vegetation Cover Change Data Collection and Analysis Method
We used ArcGIS Desktop 10.5 for all GIS analyses [62]. Landsat, National Agriculture Imagery Program (NAIP), and Digital Orthophoto Quadrangle images acquired between 1986 and 2017 were obtained (NASA Landsat Program 1986-2017; Kentucky Geography Network 1990-2016). Additionally, the National Land Cover Dataset (NLCD) was downloaded from the Multi-Resolution Land Characteristics Consortium (MRLC 1992-2011). These images were compiled for the mined and unmined boundaries to derive the vegetation cover data and to support the accuracy assessment. Spectral reflectance values of near-infrared (NIR) wavebands are commonly used in combination with other spectral bands to detect different land cover classes [63–66]. Satellite images are particularly beneficial for detecting land cover conversion from the natural vegetation cover (e.g., deforestation) [67–71]. We followed the methodology of Kriegler et al. (1969) (Equation (4)) to derive NDVI maps between 1986 and 2017 for the vegetation cover change analysis [74]. NDVI measures vegetation greenness as the normalized difference between NIR and visible red spectral reflectance [75,76]:
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \quad (4)$$
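As a minimal sketch of the NDVI step, the snippet below computes the normalized difference of NIR and red reflectance for a small grid. The function names and the reflectance values are hypothetical and are not taken from the study, which used the ArcMap Image Analysis tool on Landsat scenes.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel, or None where both bands are zero."""
    total = nir + red
    if total == 0:
        return None  # no signal in either band; NDVI is undefined
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectance values for a 2 x 2 window.
nir_band = [[0.45, 0.30], [0.20, 0.0]]
red_band = [[0.05, 0.10], [0.20, 0.0]]
grid = ndvi_grid(nir_band, red_band)
```

Dense vegetation reflects strongly in the NIR and absorbs red light, so forested pixels tend toward higher NDVI values than barren mine surfaces.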
We studied unmined (mining percentage ≤ 5%) and mined sites covered with undisturbed forest, reclaimed land (reclaimed forest, reclaimed woods, and reclaimed grass/pasture land), and barren ground. We extracted NDVI maps from clear Landsat 5 and 8 images (<10% cloud coverage), which were taken between 1986 and 2014.
Using the Image Analysis tool of ArcMap (ArcGIS Desktop) 10.5 and Equation (4), time series NDVI maps were derived from all Landsat images acquired in 1986, 1990, 1994, 1999, 2002, 2006, 2010, 2014, and 2017, since RGB-NDVI produces the most accurate results [77]. The NDVI data were further grouped into five classes following the framework of [20,77]: (1) barren (active mine); (2) reclaimed grassland; (3) reclaimed woodland (<50% forest cover); (4) reclaimed forest (>50% forest cover); and (5) undisturbed forest. Assigning years to the barren pixels with the Con tool and then running the Cell Statistics tool created a composite map (Figure 7). The resulting map displayed the location of barren (active mine) areas with the latest mining year. Next, we compared barren areas with the 2017 vegetation classification. For instance, for an area that was mined in 1994, we compared its barren state with its 2017 class (e.g., reclaimed woods or reclaimed forest). Consequently, a temporal NDVI class change within a specific area was considered to be a vegetation cover change for that specific area and time. Forest that was converted to barren ground during coal mining in a particular year was expected to be converted to a reclaimed forest after several years. The vegetation cover change was estimated based on the vegetation improvement, from no vegetation (barren) to a final vegetation cover type (reclaimed grass, reclaimed woods, or reclaimed forest). We evaluated the classified pixels against National Land Cover Dataset (NLCD), Digital Orthophoto Quadrangle (DOQ), and NAIP imagery. If more than half of a pixel contained trees, it was classified as forest; if less than half of a pixel contained trees, it was classified as woods, regardless of the remaining class or classes in the pixel.
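The composite step can be sketched as follows, assuming the Con tool stamps each barren pixel with its image year (0 otherwise) and the Cell Statistics tool keeps the per-pixel maximum. That reading is an assumption inferred from "latest mining year"; the function name and the masks are hypothetical.

```python
def latest_mining_year(barren_by_year):
    """Build a composite grid holding the latest year each pixel was barren.

    barren_by_year maps an image year to a 2D list of booleans
    (True = pixel classified as barren/active mine in that year).
    Pixels that were never barren keep the value 0.
    """
    years = sorted(barren_by_year)
    first = barren_by_year[years[0]]
    rows, cols = len(first), len(first[0])
    composite = [[0] * cols for _ in range(rows)]
    for year in years:
        mask = barren_by_year[year]
        for r in range(rows):
            for c in range(cols):
                if mask[r][c]:
                    # Keep the most recent barren year seen so far.
                    composite[r][c] = max(composite[r][c], year)
    return composite


# Hypothetical barren masks for two of the image years.
masks = {
    1994: [[True, False], [True, False]],
    2002: [[False, False], [True, True]],
}
composite = latest_mining_year(masks)
```

Comparing each non-zero cell of such a composite with the 2017 class gives the barren-to-current vegetation change described above.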
Vegetation cover change was evaluated from the latest barren land to the current (2017) vegetation cover. The Zonal Histogram tool of ArcMap 10.5 produced counts of the classified pixels (barren, grass, woods, and forest) in each watershed.
An accuracy assessment was conducted to evaluate and validate the land cover or NDVI classes derived from the satellite images using ancillary, secondary, or in situ (ground truth) data [78]. NAIP images, digital orthophoto quadrangle images, and NLCD were used to assess the classification accuracy. Randomly selected stratified points (200 in total, 50 for each class) were used for data from 1990, 2002, and 2014 [20]. NLCD (1992) and DOQ (1985) images were used for the accuracy assessment of data from 1990; NLCD (2004), DOQ (2000–2001), and NAIP (2004) images for data from 2002; and NLCD (2011) and NAIP (2014) images for data from 2014. Reclaimed woods were separated from reclaimed forests by visually inspecting the abundance of trees in a pixel: if a pixel contained more than 50% trees, it was classified as a reclaimed forest. The accuracy assessment yielded overall accuracies of 79.5%, 80.5%, and 89.5% for 1990, 2002, and 2014, respectively, with Kappa (κ) values of 67.6%, 73.1%, and 83.1% for the same periods. The overall accuracy was close to or above the 80% target threshold, which [20] treated as high accuracy. The overall accuracy and Kappa statistics suggest that the vegetation cover change data were valid and usable for further statistical analysis. Accuracy assessment results are reported in Table 2.
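For readers unfamiliar with the two statistics, the sketch below computes overall accuracy and Cohen's kappa from a confusion matrix. The matrix is hypothetical (50 stratified points per class, as in the sampling design) and does not reproduce Table 2.

```python
def overall_accuracy(matrix):
    """Share of reference points whose mapped class matches the reference class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement expected from the row and column marginals.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical confusion matrix: rows = mapped class, columns = reference class.
confusion = [
    [45, 3, 1, 1],
    [4, 40, 5, 1],
    [1, 6, 38, 5],
    [0, 2, 6, 42],
]
accuracy = overall_accuracy(confusion)
kappa = cohens_kappa(confusion)
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance alone.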
NDVI-based land cover classes provided an opportunity to compute the average reclamation age (Equations (5) and (6)), mining percentage (Equation (7)), reclaimed forest percentage (Equation (8)), and reclaimed woods percentage (Equation (9)) for each watershed studied. Many of the watersheds were mined multiple times in different years; therefore, the average reclamation year for each watershed was calculated as the average of the initial reclamation years (latest mining years) weighted by pixel counts (Equation (5)):
$$\text{avg rec year} = \frac{\sum_i \text{year}_i \, n_i}{\sum_i n_i} \quad (5)$$
where year_i is a latest mining year and n_i is the number of pixels with that year in the watershed.
$$\text{Average Reclamation Age} = 2017 - \text{avg rec year} \quad (6)$$
The mining percentage was defined as the ratio of the sum of mined pixels over all active mining years to the total pixel count N of a watershed, multiplied by 100:
$$\text{Mining percentage} = \frac{\sum_i n_i}{N} \times 100 \quad (7)$$
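A short worked sketch of the weighted average reclamation age and the mining percentage follows. The pixel counts, function names, and the `reference_year` parameter are hypothetical illustrations of the calculation described above.

```python
def average_reclamation_age(pixels_by_year, reference_year=2017):
    """Reference year minus the pixel-weighted mean of the latest mining years."""
    total_pixels = sum(pixels_by_year.values())
    weighted_year = sum(year * n for year, n in pixels_by_year.items()) / total_pixels
    return reference_year - weighted_year


def mining_percentage(pixels_by_year, watershed_pixels):
    """Mined pixels summed over all mining years, as a percentage of the watershed."""
    return 100.0 * sum(pixels_by_year.values()) / watershed_pixels


# Hypothetical watershed: 300 pixels last mined in 1994, 100 pixels in 2006,
# out of 4000 pixels in total.
mined = {1994: 300, 2006: 100}
age = average_reclamation_age(mined)          # 2017 - 1997 = 20 years
percent_mined = mining_percentage(mined, 4000)  # 400 / 4000 = 10 %
```

Weighting by pixel count means that a large area mined early dominates the average, even if a small area was disturbed more recently.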

Topographic Data
The KYAPED DEM (5 ft), aggregated to a 30 m horizontal spatial resolution, was used to derive topographic variables. The mean elevation and mean slope for each watershed were then calculated with the zonal statistics tool. The drainage density (total stream length per unit watershed area) was computed using the following equation [79]:
$$D_d = \frac{\sum L}{A} \quad (10)$$
where D_d is the drainage density, ∑L is the total length of streams within the watershed, and A is the area of the watershed. The NHD 100K stream shapefile was extracted with the identity and intersect tools, and all streams were assigned to their watershed using the spatial join tool. Next, the total stream length was calculated with the summary statistics tool for each watershed. Then, the drainage density was obtained for each watershed using Equation (10).
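The drainage density calculation reduces to a single division once stream lengths have been summed per watershed. The sketch below assumes lengths in km and area in km²; the function name and values are hypothetical.

```python
def drainage_density(stream_lengths_km, area_km2):
    """Total stream length divided by watershed area (km per km^2)."""
    if area_km2 <= 0:
        raise ValueError("watershed area must be positive")
    return sum(stream_lengths_km) / area_km2


# Hypothetical watershed of 10 km^2 with three stream segments.
density = drainage_density([12.0, 5.5, 2.5], 10.0)  # 20 km / 10 km^2 = 2.0
```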
Hydrologic Soil Group maps were created with the Soil Data Viewer tool [80]. Hydrologic Soil Group maps comprise soil infiltration rates in seven ordinal categories (Box 1). These categories were translated to numbers. Based on the categories and their representative scores, the mean infiltration was calculated for each watershed with zonal statistics.
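The translation of ordinal soil groups to numbers might look like the sketch below. The study does not report the scores it assigned, so the mapping here (higher score = higher infiltration, dual classes ranked between single groups) is purely an assumption for illustration.

```python
# Hypothetical scores; the actual values used in the study are not stated.
HYPOTHETICAL_SCORES = {
    "A": 7, "A/D": 6, "B": 5, "B/D": 4, "C": 3, "C/D": 2, "D": 1,
}


def mean_infiltration(cell_groups):
    """Average ordinal infiltration score over the cells of one watershed."""
    scores = [HYPOTHETICAL_SCORES[group] for group in cell_groups]
    return sum(scores) / len(scores)


# Hypothetical watershed with four soil-map cells.
watershed_score = mean_infiltration(["B", "B", "C", "D"])  # (5 + 5 + 3 + 1) / 4
```

Averaging ordinal scores treats the spacing between categories as equal, which is a simplification worth keeping in mind when interpreting the infiltration coefficient.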

Box 1. Hydrologic Soil Group Classification
Hydrologic soil groups are based on estimates of runoff potential. Soils are assigned to one of four groups according to the rate of water infiltration when the soils are not protected by vegetation, are thoroughly wet, and receive precipitation from long-duration storms. The soils in the United States are assigned to four groups (A, B, C, and D) and three dual classes (A/D, B/D, and C/D). The groups are defined as follows: Group A: soils having a high infiltration rate (low runoff potential); Group B: soils having a moderate infiltration rate; Group C: soils having a slow infiltration rate; Group D: soils having a very slow infiltration rate (high runoff potential). If a soil is assigned to a dual hydrologic group (A/D, B/D, or C/D), the first letter is for drained areas and the second is for undrained areas. Only the soils that in their natural condition are in group D are assigned to dual classes. Source: [80]

Empirical Models
We used the mined, reclaimed woods, reclaimed forest, and reclamation age variables derived from Landsat images as the vegetation cover and mining (mining percentage) variables (Table 3) to develop three regression models, as specified in Equations (11)–(13). All topographic variables (elevation, slope, drainage density, and infiltration) were added to the models as control variables. Infiltration was an ordinal variable, whereas all other variables were continuous.
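Table 3. Dependent and independent variables and their descriptions.
Dependent variable: Conductivity (µS/cm), measured in a stream at the exit point of a watershed.
Independent variables: mined, reclamation age, reclaimed forest, reclaimed woods, elevation, slope, drainage density, and infiltration.

First, we aimed to predict conductivity without a vegetation cover effect, using all other independent and control variables:
$$\text{Conductivity} = \beta_0 + \beta_{\text{mined}} X_{\text{mined}} + \beta_{\text{reclamation age}} X_{\text{reclamation age}} + \beta_{\text{elevation}} X_{\text{elevation}} + \beta_{\text{slope}} X_{\text{slope}} + \beta_{\text{drainage density}} X_{\text{drainage density}} + \beta_{\text{infiltration}} X_{\text{infiltration}} + e \quad (11)$$
Second, we added the vegetation cover variables to the previous model to predict conductivity and determine how vegetation recovery affects conductivity:
$$\text{Conductivity} = \beta_0 + \beta_{\text{mined}} X_{\text{mined}} + \beta_{\text{reclamation age}} X_{\text{reclamation age}} + \beta_{\text{elevation}} X_{\text{elevation}} + \beta_{\text{slope}} X_{\text{slope}} + \beta_{\text{drainage density}} X_{\text{drainage density}} + \beta_{\text{infiltration}} X_{\text{infiltration}} + \beta_{\text{reclaimed forest}} X_{\text{reclaimed forest}} + \beta_{\text{reclaimed woods}} X_{\text{reclaimed woods}} + e \quad (12)$$
Finally, we created an empirical model to predict vegetation recovery with the independent variables:
$$\text{Reclaimed Forest} = \beta_0 + \beta_{\text{mined}} X_{\text{mined}} + \beta_{\text{reclamation age}} X_{\text{reclamation age}} + \beta_{\text{elevation}} X_{\text{elevation}} + \beta_{\text{slope}} X_{\text{slope}} + \beta_{\text{drainage density}} X_{\text{drainage density}} + \beta_{\text{infiltration}} X_{\text{infiltration}} + e \quad (13)$$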
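The models above are ordinary least squares regressions summarized by adjusted R². The study fitted them in SPSS; the pure-Python sketch below only illustrates the technique on synthetic data with two predictors, and every name and value in it is hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y = b0 + b1*x1 + ... via the normal equations.

    Returns (coefficients with intercept first, R^2, adjusted R^2).
    """
    rows = [[1.0] + list(r) for r in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    n, p = len(y), k - 1
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, r2, adjusted_r2


# Synthetic watersheds: columns are mining percentage and reclamation age.
X = [[10, 5], [20, 25], [40, 10], [60, 15], [80, 29]]
# Synthetic conductivity generated as 100 + 8 * mined - 3 * age (no noise).
y = [165.0, 185.0, 390.0, 535.0, 653.0]
coefficients, r2, adjusted_r2 = fit_ols(X, y)
```

Adjusted R² penalizes additional predictors, which is why it is the appropriate statistic for comparing the model without vegetation variables against the model that includes them.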

Statistical Analyses
Standard deviation, skewness, kurtosis, and the standard errors of skewness and kurtosis were calculated for each variable as part of the descriptive statistics. Kurtosis values between −2 and 2 were accepted as an indication of an approximately normal distribution [81,82]. Skewness values between −1 and 1 were accepted as indicating a normal distribution [83]. In addition, histogram plots with normal distribution curves visually corroborated the descriptive statistics. Multicollinearity was evaluated using tolerance and the variance inflation factor (VIF); all independent variables had tolerance values larger than 0.1 and VIF values smaller than 5.0, indicating the absence of multicollinearity [84]. The Durbin-Watson value (1.5 < d < 2.5) [85] indicated the absence of autocorrelation. In addition, scatterplots of the residual distribution, the P-P plot, and the plot of standardized residuals against standardized predicted values were examined to check homoscedasticity. The data were analyzed using SPSS v. 22. Results with p ≤ 0.05 were accepted as significant in the models.
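The screening rules above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study: the function names are hypothetical, the skewness and kurtosis are simple moment-based estimators (SPSS reports bias-adjusted versions, so values can differ slightly for small samples), and the VIF shortcut applies only to a model with exactly two predictors.

```python
import math


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness (0 for a symmetric sample)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2).
    Tolerance is the reciprocal of the VIF."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    return num / sum(e * e for e in residuals)


def passes_screening(values):
    """Apply the thresholds quoted in the text: |skewness| < 1, |kurtosis| < 2."""
    return abs(skewness(values)) < 1.0 and abs(excess_kurtosis(values)) < 2.0
```

With more than two predictors, the VIF of each predictor would instead come from regressing it on all the others; a statistics package is the practical route there.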

Variations in Measured Chemical Parameters
Concentrations of measured chemical parameters in stream water samples collected during spring 2017 followed the order alkalinity > SO4 > Ca > Mg >> Fe > Al. In the samples collected during summer 2017, the order was SO4 >> alkalinity > Ca > Mg within the watersheds of eastern Kentucky. Variations in the concentrations of measured chemical parameters during spring and summer, together with watershed characteristics, are presented in Table 4. Chemistry data are not available for the fall samples, so Table 4 includes chemistry data from spring and summer only; fall EC data are discussed in the text.

The measured electrical conductivity (EC) varied from 88.3 to 280.3 µS/cm, with a mean value of 168.2 ± 64.5 µS/cm, during spring (n = 9) and from 73.1 to 2743.3 µS/cm, with a mean value of 798.3 ± 782.1 µS/cm, during summer (n = 22). Overall, the average EC in all measured watersheds during spring and summer (n = 31) was 615.4 ± 716.9 µS/cm. The average EC of the samples collected during fall (n = 58) was 786.1 ± 501.0 µS/cm. The EC of our samples was slightly higher than the EC reported for Australian rivers receiving coal mine waste discharge (EC = 741.7 µS/cm) (Belmer and Wright 2020). The EC value exceeds the WHO guideline.

The alkalinity varied from 24.3 to 69.4 mg/L, with a mean value of 44 ± 15.2 mg/L, during spring (n = 9) and from 22.1 to 310.8 mg/L, with a mean value of 121.9 ± 72.7 mg/L, during summer (n = 22). The average alkalinity in all measured watersheds during spring and summer (n = 31) was 99.3 ± 70.9 mg/L.

The sulfate concentration ranged from 17 to 60.3 mg/L, with a mean value of 32.3 ± 14.1 mg/L, during spring (n = 9) and from 9.95 to 1906.7 mg/L, with a mean value of 356.9 ± 524.7 mg/L, during summer (n = 22). The average sulfate concentration in all measured watersheds during spring and summer (n = 31) was 262.6 ± 463.8 mg/L. The sulfate concentration exceeds the WHO guideline value.

The calcium concentration ranged from 7.1 to 24.4 mg/L, with a mean value of 14.2 ± 5.5 mg/L, during spring (n = 9) and from 5.4 to 297.1 mg/L, with a mean value of 74 ± 77.2 mg/L, during summer (n = 22). The average calcium concentration in all measured watersheds during spring and summer (n = 31) was 56.6 ± 70.3 mg/L. The magnesium concentration ranged from 4.6 to 16.8 mg/L, with a mean value of 8.7 ± 4.3 mg/L, during spring (n = 9) and from 3.1 to 311.1 mg/L, with a mean value of 64.7 ± 84.5 mg/L, during summer (n = 22). The average magnesium concentration in all measured watersheds during spring and summer (n = 31) was 48.5 ± 75.3 mg/L.

The aluminum concentration ranged from 0 to 1.5 mg/L, with a mean value of 0.23 ± 0.5 mg/L, during spring (n = 9). The aluminum concentration exceeds the WHO guideline value. The iron concentration ranged from 0.05 to 1.05 mg/L, with a mean value of 0.26 ± 0.3 mg/L, during spring (n = 9).

Based on the concentration data of the measured chemical parameters, aluminum and sulfate appeared to be the prime concerns for health and environmental issues. High aluminum concentrations are toxic and severely affect the nervous system, and they have been reported as a possible cause of severe diseases such as Alzheimer's disease, Crohn's disease, dementia, inflammatory bowel disease, anemia, sclerosis, autism, breast cancer and cyst, pancreatic necrosis, and diabetes mellitus [86,87].
High sulfate concentrations are primarily due to anthropogenic sources and may harm humans, animals, plants, or aquatic life [88]. High sulfate concentrations in drinking water cause a laxative effect and taste impairment that varies with the associated cations present in the system; they cause diarrhea in adults and sometimes severe problems in infants and elderly people [89,90]. Exceedingly high sulfate concentrations disturb the ionic balance in plant tissue. As a consequence, harmful impacts may occur in ecosystems [91]. Although no precise health-based guideline values have been proposed for chemicals in drinking water such as sulfate, iron, and aluminum, exposure of the population to such chemicals needs to be controlled by establishing a recommended maximum exposure limit [92]; this is also supported by the US EPA.
Coal mining sites increase acid drainage within the landscape; as a consequence, biogeochemical dynamics are altered and the dissolution rate of minerals is accelerated, in addition to the direct input from coal mining waste. In such an acidic environment, where mountaintop mining exposes abundant fresh reactive mineral surfaces, high concentrations of trace elements, including highly toxic ones (e.g., As, Cd, Co, Cr, Hg, Mn, Se, Sb, and Tn, etc.), are released into the environment, affecting the ecology of the ecosystem and the whole environment of that landscape. We plan to measure these trace elements in a future study and evaluate their impacts and mechanisms.
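The combined spring-and-summer means reported in this section are consistent with a sample-size-weighted average of the seasonal means. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and small differences from the reported values are expected because the seasonal means are themselves rounded.

```python
def pooled_mean(groups):
    """Sample-size-weighted mean of several group means.

    groups: iterable of (n, mean) pairs.
    """
    groups = list(groups)
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (n, mean) for spring and summer, taken from the text above.
seasonal_means = {
    "EC (µS/cm)": [(9, 168.2), (22, 798.3)],
    "alkalinity (mg/L)": [(9, 44.0), (22, 121.9)],
    "sulfate (mg/L)": [(9, 32.3), (22, 356.9)],
    "calcium (mg/L)": [(9, 14.2), (22, 74.0)],
    "magnesium (mg/L)": [(9, 8.7), (22, 64.7)],
}

for name, groups in seasonal_means.items():
    print(f"{name}: pooled mean = {pooled_mean(groups):.1f}")
```

The pooled standard deviations cannot be recovered the same way from the rounded summaries alone without also accounting for the difference between the seasonal means, so they are not recomputed here.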

Descriptive Statistics
This research covers areas mined between 1986 and 2017; however, considering Kentucky's long history of coal production, the actual mined (mining percentage) areas were expected to be larger than the values displayed in Table 5. The mined without reclaimed forests (RF) variable shows values similar to those of the mined area variable because forest recovery, expressed as a percentage of a watershed, was low. The average RF (reclaimed forest) percentage was 8.21%, which means that, on average, only 21% of the mined area of a watershed was converted to a reclaimed forest between 1986 and 2017. The mean reclaimed woods percentage was 17.42%, which was more than double that of the reclaimed forest.

Numerous studies have found a link between macroinvertebrate community structures and physicochemical parameters in streams [39,41,60,93]. For example, Bernhardt et al. (2012) and Griffith et al. (2012) reported that a conductivity greater than 300 µS/cm negatively affects the aquatic environment [60,93]. Moreover, conductivity > 500 µS/cm indicates ecological impairment and decreased biological diversity [39,41]. The results of our study indicate that in 42 of 58 (72%) watersheds in the study area, the conductivity was higher than 300 µS/cm, and, in 32 (55%) of them, the conductivity was over 500 µS/cm. Therefore, impairment of and negative impacts on aquatic ecology are expected in more than half of the watersheds due to changes in natural biogeochemical processes, although we did not perform any biological assessments.
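The exceedance shares quoted above (42 of 58 and 32 of 58 watersheds) are simple threshold counts. The following Python sketch shows the calculation; the function name and the conductivity readings are hypothetical and are not the study's data.

```python
def share_above(values, threshold):
    """Return (count, percent) of values strictly greater than threshold."""
    count = sum(1 for v in values if v > threshold)
    return count, 100.0 * count / len(values)


# Hypothetical conductivity readings in µS/cm (illustration only).
sample_ec = [120.0, 310.0, 505.0, 790.0, 2400.0]

for limit in (300.0, 500.0):
    count, percent = share_above(sample_ec, limit)
    print(f"> {limit:.0f} µS/cm: {count} of {len(sample_ec)} ({percent:.0f}%)")
```

Applied to the counts in the text, 42/58 and 32/58 round to the reported 72% and 55%.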

Bivariate Correlations
Among all variables, the reclaimed forest and drainage density variables were transformed using a square root transformation to obtain a normal distribution. The linearity between independent and dependent variables was checked using scatter plots prior to running bivariate correlations and Ordinary Least Squares (OLS) regression models. Pearson bivariate correlation matrices for all sampling dates and variables were used to find significant correlations among variables (Table 6).
There was no linear or curvilinear relationship between the reclaimed forest and conductivity variables; however, the remaining scatter plots displayed linear relations between independent and dependent variables. Subsequently, another variable was created to test the influence of reclaimed forest on conductivity, i.e., the mined without reclaimed forest variable (mined without RF or mined w/o RF). This variable represented the numerical difference between the mined and reclaimed forest percentages. We expected a potential influence of the mined and reclamation age variables on the reclaimed forest. While checking the data assumptions, a curvilinear relationship was discovered between the mined and reclaimed forest variables. However, the log transformation of the mined variable yielded a linear relationship between the variables. We did not find any correlation among the vegetation cover; mining percentage; conductivity; and Al 3+ , Fe 2+ , Fe 3+ , and Mn 2+ . We concluded that either the metals were not dissolved and were removed during filtration, or the Al 3+ , Fe 2+ , Fe 3+ , and Mn 2+ concentrations were too low to detect the correlations.
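The derived variable and the transformations described above can be sketched in a few lines of Python. The function names and the example watershed values are hypothetical; the base-10 logarithm is assumed because the fitted reclaimed forest equation later in the text is written with log 10.

```python
import math


def mined_without_rf(mined_pct, reclaimed_forest_pct):
    """Mined percentage minus reclaimed forest percentage of a watershed."""
    return mined_pct - reclaimed_forest_pct


def sqrt_transform(value):
    """Square root transform, as applied to reclaimed forest and drainage density."""
    return math.sqrt(value)


def log10_transform(value):
    """Base-10 log transform, as applied to the mined variable (assumed base)."""
    return math.log10(value)


# Hypothetical watershed (percent of watershed area), for illustration only.
watershed = {"mined": 40.0, "reclaimed_forest": 9.0}

derived = {
    "mined_wo_rf": mined_without_rf(watershed["mined"], watershed["reclaimed_forest"]),
    "sqrt_reclaimed_forest": sqrt_transform(watershed["reclaimed_forest"]),
    "log10_mined": log10_transform(watershed["mined"]),
}
print(derived)
```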
Comparisons between independent and dependent variables are displayed visually with maps and symbols in Figures 8-13. Bivariate correlation matrix results were consistent with the linearity validation results. Linear relationships that were observed in scatterplots were significant at the p ≤ 0.05 or p ≤ 0.01 level (Table 6). The bivariate correlation analysis demonstrated that conductivity was strongly related to CMRSC variables [58,94], except for Fe 2+ , Fe 3+ , Al 3+ , and Mn 2+ . An in-depth literature review also suggested that conductivity was highly associated with surface coal mining in eastern Kentucky and West Virginia [42,58,60]. Based on this information, we decided to use conductivity as a representative variable for CMRSC variables in multivariate regression models.
We did not find a significant correlation between the topographic variables and conductivity (fall), except for infiltration, which was strongly correlated with conductivity.

The mined and reclamation age variables were significant (p < 0.01; Table 7). The standardized beta (B) coefficients for mined and reclamation age were 0.70 and −0.32, respectively (Table 7), which indicates that the mined parameter was positively and strongly related to conductivity, whereas the reclamation age was significantly, but negatively, related to conductivity. Partial correlations for the coefficients also confirmed that the mined parameter was more influential on conductivity than the reclamation age. Infiltration was not a significant factor; thus, it was excluded from the model. The following regression equation was predicted for conductivity:

For this model, the constant value was higher than that reported by [42,59], which might be due to inadequate control sites. However, the coefficients were comparable to those found by [42,59]. This supports the hypothesis that the reclamation age has less influence on conductivity, and that conductivity is best predicted by the mining percentage [42,59,60]; however, none of these studies reported a correlation between the reclamation age and conductivity.

Effects of Vegetation Cover Change on Conductivity
The bivariate Pearson correlation between mined without RF and conductivity was very strong and positive (R = 0.90; Table 6). Multivariate regression results indicated a significant overall goodness of fit for a model in which two predictors (mined without RF and reclamation age) significantly predict conductivity (R 2 = 0.832, adjusted R 2 = 0.826, and F(2,55) = 136.659, p < 0.01). The regression model accounted for 82.6% of the variance in the conductivity prediction (Table 8). The model showed that mined without RF (p < 0.01) and reclamation age (p < 0.026) were significant coefficients (Table 8). Unstandardized beta coefficients (β) were 0.799 (mined w/o reclaimed forest) and −0.164 (reclamation age). Pearson bivariate correlation results suggested a strong positive correlation between reclaimed woods, infiltration, and conductivity. Conversely, the model showed that neither reclaimed woods nor infiltration was a significant contributor to predicting conductivity. For reclaimed woods, p values were higher than 0.05 and t values were smaller than 1; therefore, reclaimed woods was excluded in both steps. The multivariate analysis yielded the following equation for predicting conductivity:

The model had a slightly higher adjusted R 2 value (0.826) than Regression Model A (0.818). There was no difference between an undisturbed forest and a reclaimed forest from a land cover standpoint, suggesting that reclaimed forest areas can be treated as undisturbed. Comparing the coefficients of the mined and mined without RF parameters, mined without RF was a better predictor of conductivity.
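The fit statistics reported for this model are linked by standard OLS identities, which the Python sketch below evaluates. The sample size n = 58 and predictor count k = 2 are inferred from the reported F(2,55); the function names are hypothetical, and because R 2 is rounded to three decimals in the text, the recomputed F value only approximates the reported one.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic of an OLS model with k and n - k - 1 degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


n_watersheds, n_predictors = 58, 2  # inferred from F(2,55)
r2_model_b = 0.832                  # reported R^2 for this model

print(f"adjusted R^2 = {adjusted_r2(r2_model_b, n_watersheds, n_predictors):.3f}")
print(f"F(2,55) = {f_statistic(r2_model_b, n_watersheds, n_predictors):.1f}")
```

The recomputed adjusted R 2 matches the reported 0.826, and the F value falls within about half a unit of the reported 136.659, which is the size of discrepancy expected from rounding R 2.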
We were able to improve the model by removing the reclaimed forested areas from the mined area. The result indicated a positive correlation between the reclaimed forest and conductivity mitigation [32]. Our findings agree with [23], who showed that there is a potential for total dissolved solids-source-control practices that incorporate FRA to improve mine water discharge quality. The vegetation data covered a 31-year period (1986-2017), since data prior to 1986 were not dependable. Measuring the geographic extent of the reforested areas for the entire 31-year period proved challenging due to data inconsistency. It should be noted that the range of the reclaimed forest percentage was relatively low within the given time interval. This might explain the absence of a direct relationship between reclaimed forest and conductivity. However, recent improvements in Landsat imagery will likely provide more reliable data. In addition, forest recovery is expected to increase in the future. This experiment should be repeated to obtain a clearer idea of the influence of reclaimed vegetation on conductivity and water quality.

Effects of Reclamation Age and Mined Operation (Mining Percentage) on Reclaimed Forest
We used multivariate regression to predict the relationship between the dependent variable reclaimed forest and the independent variables reclamation age and mining percentage. The results indicated that the model was significant (R 2 = 0.641, adjusted R 2 = 0.628, F(2,55) = 49.02, p < 0.01). The regression model accounted for 62.8% of the variance in the reclaimed forest prediction (Table 9), and the mined and reclamation age were significant coefficients (p < 0.01 for both; Table 9). Adding the mined parameter improved the reclaimed forest prediction greatly (R 2 changed from 0.250 to 0.641). Standardized beta coefficients (B) showed that the log mined (0.692) and reclamation age (0.654) were positive and had similar effects in the reclaimed forest prediction (Table 9). None of the topographic variables were added to the model because no significant correlations were found between the reclaimed forest percentage and topographic variables. Multivariate regression analysis yielded the following equation for the prediction of a reclaimed forest:

$$\text{Reclaimed Forest} = \left[ \left(1.83 \times \log_{10} \text{Mined}\right) + \left(0.121 \times \text{Reclamation Age}\right) - 1.953 \right]^{2} \qquad (16)$$

McElfish and Beier (1990) reported that coal surface mine reclamation efforts in southeastern USA are usually assessed after five years of mining activity [95]. Conversely, the model estimates that 25 years is necessary for 10% of a mined watershed to be converted to a reclaimed forest (Equation (16)). These results suggest that five years may not be enough time to assess reclamation success. Many scientists have studied effective techniques to reclaim forests in areas impacted by coal mining by species composition, comparing mined and unmined sites, hydrologic properties, or best revegetation practices [23,36,96,97]. Conversely, few studies have measured the quantity of the reclaimed forest in relation to the reclamation age at the watershed level.
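Equation (16) can be evaluated directly, as in the minimal Python sketch below. The function name is hypothetical, and the units are assumptions: mined is taken as a percentage of the watershed and reclamation age in years. The squared form reflects the square root transformation applied to the reclaimed forest variable, so the result is only meaningful when the bracketed term is non-negative.

```python
import math


def predicted_reclaimed_forest(mined_pct, reclamation_age_years):
    """Evaluate Equation (16): reclaimed forest as a percentage of the watershed.

    Assumes mined_pct > 0 (percent of watershed) and age in years. The bracketed
    term predicts sqrt(reclaimed forest), so it should be non-negative for the
    back-transformed value to be meaningful.
    """
    root = (1.83 * math.log10(mined_pct)
            + 0.121 * reclamation_age_years
            - 1.953)
    return root ** 2


# Illustrative inputs only (not values from the study's tables).
for age in (5, 15, 25):
    estimate = predicted_reclaimed_forest(10.0, age)
    print(f"mined = 10%, age = {age} yr -> reclaimed forest = {estimate:.2f}%")
```

Because reclamation age enters linearly inside the square, the predicted reclaimed forest percentage grows faster than linearly with age once the bracketed term is positive.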

Conclusions
This study explored the relationships between water quality parameters; reclaimed forest percentage (dependent variables); mining percentage; reclamation age; reclaimed woods, slope, elevation, and drainage density; and infiltration in stream-reach watersheds affected by coal mining. We found that reclamation age is a significant factor for predicting conductivity in reclaimed mines. Even though reclamation age was not a primary factor, it appeared to increase the accuracy of the conductivity prediction. This research used more convenient and accurate methods for collecting and analyzing data about mining percentage and reclamation age than previous studies in the Appalachian region states. Furthermore, reclamation age and topographic factors were not evaluated on a regional scale in previous studies.
We also investigated the effects of the reclaimed forest and reclaimed woods on conductivity to improve Regression Model A, which included only the mining percentage (mined) and reclamation age for conductivity prediction. We did not find a direct relation between the conductivity and the reclaimed forest percentage or the reclaimed woods percentage; however, by subtracting the reclaimed forest from the mining percentage and correlating the result with conductivity, we were able to moderately improve the results. The Regression Model A results were consistent with earlier studies finding that coal mining had a major impact on water quality degradation and forest structure.
In addition, we used reclamation age and mining percentage to estimate the quantity of reclaimed forested areas in the watershed. Reclaimed forest quality is usually measured by its resemblance to the original forest. We concluded that the quality of the reclaimed forest is important for evaluating the reclamation success; however, quantity is equally important. The quantity of a reclaimed forest, coupled with quality, provides a better assessment of reclaimed areas when advanced remote sensing techniques are used. For example, high-resolution aerial images or Light Detection and Ranging (LIDAR) data may be used to define the composition of the reclaimed forested area with an accurate delineation of the mined area.
We included topographic variables (slope, elevation, drainage density, and infiltration) to take into account the possible influences of topographic variations in the regression models. The results indicated no such relation, except that infiltration was positively correlated with conductivity (fall); however, it was not a significant predictor in the regression models. Studies by [41,52] found similar results.
Overall, this study suggests that conductivity is a predictable water quality indicator that is highly associated with CMRSC where agriculture and urban areas are limited. Furthermore, our assessment of the vegetation cover change may provide insight into the reclamation success in terms of restoring deforested areas. Water quality in the region is degraded by coal mining activities, especially through the high concentrations of sulfate and aluminum among the measured parameters. Our findings may help the scientific community and regulating agencies improve their understanding of water quality characteristics in watersheds affected by coal mining. This can be accomplished by refining land reclamation practices using advanced techniques and monitoring protocols.

Figure 1. Effects of coal mining on stream water. Left side of valley (unmined): natural infiltration; precipitation infiltrates efficiently (e.g., trees intercept rain, roots create porosity, topsoil provides effective infiltration). Right side of valley (mined): poor infiltration, stream pollution proportional to the mined area, and surface flow not properly accommodated (e.g., compacted soil, topsoil loss). The figure was created by Oguz Sariyildiz.

Figure 2. Two valley fills side by side near Chavies, Kentucky [37.373367, −83.351092]. A valley fill is an engineered earthen and rock structure where excess soil and rocks from surface mining or, in some cases, underground mining are deposited. They were built in approximately 1995 (right) and 2013 (left). They are 350 m and 300 m deep, respectively. Source: Google Earth.

Environments 2024, 11, x FOR PEER REVIEW

Figure 3. Photos of the same scene (near Whitesburg, Kentucky [37.031036, −82.710169]) in different years. They show how surface mining affects the natural land cover and the appearance of the land cover during recovery. Areas mined in 1995 turned to woods, while areas mined in 2005 turned to grass and bush.

Figure 4. Location of the research area.

Figure 5. Research area (HUC10 watersheds), studied stream reach watersheds, and sampling locations (exit point of the stream reach watersheds).

Water Samples, Data Collection, and Analysis Method
National Hydrography Dataset (NHD) 100K stream shapefile (USGS 2007-2014), HUC10 watershed polygons (USGS 2007-2014), and stream reach watersheds (Kentucky Division of Mine Permits 2004) were used to determine watershed boundaries and water sample locations. Three field trips were conducted to collect water samples: 15 May 2017 (spring), 21-24 July 2017 (summer), and 21-25 October 2017 (fall). The weather was stable and there was no surface runoff on any of the sampling days. During the first two trips, 9


Figure 6. Water quality data analysis steps to assess variables.

Figure 7. Active mining areas by years between 1986 and 2017.

Table 1. Seasonal mean conductivity (µS/cm) values for four classes of streams (unmined, valley filled, valley fill with residences, and mined with no valley fill) from [43]. n represents the sample size. Numbers in parentheses are standard deviations. Unmined: non-mined areas; filled: valley fills with coal mine spoils; filled/residential: valley fills and residential areas together; mined: mined areas without valley fills.

Table 2. (a) Accuracy assessment for land cover classes for 1990. (b) Accuracy assessment for land cover classes for 2002. (c) Accuracy assessment for land cover classes for 2014.

Table 4. Watershed characteristics with measured chemical parameters during spring and summer 2017. WHO guideline values from 2017. nd represents no data. Columns left to right: watersheds, total mined area, mined area without RF and RW, average reclamation age, alkalinity, electrical conductivity, sulfate, aluminum, calcium, iron, magnesium.

Table 5. Descriptive statistics for dependent and independent variables.

Table 7. (a) Regression Model A summary. (b) Regression Model A coefficients.

Table 8. (a) Regression Model B summary. (b) Regression Model B coefficients.

Table 9. (a) Regression Model C summary. (b) Regression Model C coefficients.