Modes of Occurrence, Elemental Relationships, and Economic Viability of Rare Earth Elements in West Virginia Coals: A Statistical Approach

Abstract: Rare earth elements and yttrium (REY) are essential for manufacturing technologies vital to economic and national security. As the demand for REY increases and conventional ores become depleted, attention is turning to unconventional resources like coal as a source for these elements. As the nation's second-largest coal producer, West Virginia (WV) has the potential to transition into producing REY. This study utilizes open-access coal chemistry data from the USGS COALQUAL database to assess the potential of WV coal deposits as resources for REY and to gain insight into elemental modes of occurrence and possible enrichment mechanisms. Results suggest that clay minerals dominate the inorganic fraction of most samples and that REY concentrations are primarily proportional to the inorganic content. A few samples deviate from this trend due to mineralogic differences and impacts of post-depositional processes, including possible hydrothermal fluid influences. An ash-basis economic assessment identified 71 promising samples in the data set. The majority of promising samples were sourced from lower to lower-middle Pennsylvanian coal seams in the Kanawha, New River, and Pocahontas formations. Future studies should investigate these deposits using direct analytical methods to better characterize vertical and lateral heterogeneity in REY concentrations and confirm modes of occurrence.


Introduction
Rare earth elements are critical for essential technologies used in renewable energy, communication, transportation, and national defense. They include the 15 lanthanide elements and sometimes yttrium and scandium, which are chemically similar to the lanthanides and can occur in the same mineral deposits. This paper assesses rare earth element content in the context of lanthanide elements plus yttrium, which are commonly referred to as "REY". Scandium was excluded as a rare earth element because, although the element is geochemically similar to REY, it often does not co-occur with REY [1]. There is great interest in developing new domestic resources for REY as conventional ores are being depleted while demand is increasing worldwide [2]. Coal is a good potential resource for these elements because in many cases it contains elevated concentrations of REY that are comparable to those of conventional ore deposits. REY in coal occur in both organic and inorganic associations [2]. The US also has the world's largest proven coal reserves, and mining infrastructure is already in place. Additionally, the combustion of coal for electricity generation produces fly ash as a waste product that may have high concentrations of certain rare earths and can potentially be utilized as a source for REY [3–6]. Acid mine drainage, another waste product associated with coal mining, has also been found to be enriched in a few critical REY [7,8].
To assess the potential of coal deposits as REY resources, it is imperative to gain insight into REY modes of occurrence and to understand the depositional and diagenetic processes responsible for REY distribution. This will help identify promising coal seams [...] northern seams were generated during more seasonal conditions in topogenous mires [25].

Materials & Methods
Coal chemistry data for the state of WV were downloaded from the USGS COALQUAL (version 3.0) database (https://ncrdspublic.er.usgs.gov/coalqual/) (accessed on 24 February 2021). The database contains measurements for major, minor, and trace elements, ash yield, and other characteristics for unweathered, full-bed US coal samples. Methodologies for determining elemental concentrations may vary based on when and where the sample was analyzed, potentially leading to inconsistencies in accuracy and precision [26]. Additionally, the WV data set is biased toward bituminous coals sourced from stratigraphically older coal seams within the Central Appalachian Basin in the southern portion of the state, where the majority of mining takes place. Regardless, the COALQUAL database is the largest publicly-available resource for WV coal chemistry data. At the time of download, the COALQUAL database contained 609 sample records for the state of WV. Seven samples with blank seam names, one sample with coordinates falling outside state boundaries, and thirty-two samples with incomplete sets of REY measurements were removed, resulting in a data set containing 569 samples. Sample locations are shown in Figure 1.

Data Substitutions
The presence of data below detection limits (BDL) presents a challenge when using COALQUAL data for analyses. The database reports all BDL data at 0.7 times the detection limit, with an "L" indicator in the qualifier column. The use of the 0.7 multiplier was suggested by Connor et al. [27] and supported by Gluskoter et al. [28] as a way to create a fully quantitative data set when a majority of samples are above the detection limit (ADL), since substituting values between 0 and 1 times the detection limit for a small number of samples has little overall effect on statistical analyses. For correlation and cluster analyses, the data set was modified by removing elements with greater than 20% of data BDL or with a large number of missing observations. Then, an additional 102 records with missing observations were deleted, leaving a data set containing 467 samples. All remaining BDL data were substituted at 0.7 times the detection limit.
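The screening rule described above can be illustrated with a minimal sketch. It is written in plain Python rather than the R used in the study, and the function names, the toy qualifier flags, and the choice to keep an element sitting exactly at the 20% threshold are assumptions made for illustration, not details taken from the study.

```python
def bdl_fraction(qualifiers):
    """Fraction of records flagged "L" (below detection limit)."""
    return sum(1 for q in qualifiers if q == "L") / len(qualifiers)


def screen_elements(qualifier_table, max_bdl_fraction=0.20):
    """Keep elements whose BDL share does not exceed the threshold.

    Assumption: an element at exactly 20% BDL is kept, because the text
    removes elements with "greater than 20%" of data BDL.
    """
    return [
        element
        for element, qualifiers in qualifier_table.items()
        if bdl_fraction(qualifiers) <= max_bdl_fraction
    ]


def substitute(detection_limit, multiplier=0.7):
    """Value used in place of a BDL measurement."""
    return multiplier * detection_limit


# Hypothetical qualifier flags for ten samples ("" = above detection limit).
toy_flags = {
    "element_a": ["", "", "L", "", "", "", "", "", "L", ""],   # 20% BDL
    "element_b": ["L", "L", "", "L", "", "", "", "", "", ""],  # 30% BDL
}
kept = screen_elements(toy_flags)
```

In this toy case only `element_a` survives the screen, and a detection limit of 2.0 ppm would be replaced by roughly 1.4 ppm.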
Data for the economic assessment were modified slightly differently because the calculation relies on the concentrations of all REY but no other elements. This part of the analysis uses the original 569 samples, with substitutions made for all BDL data using a methodology developed by Lin et al. [29], which applies a different multiplier (Q factor) to each REY. Summary statistics were generated separately for BDL and ADL data to help determine appropriate Q factors and are shown in Table 1.
Table 1. Summary statistics for below detection limit (BDL), above detection limit (ADL), and Q factor-adjusted data. Minimum (Min.), mean, and maximum (Max.) values in ppm (whole coal-basis). S.D. is standard deviation; n/a means not applicable.
To evaluate how different Q factors and subsequent substitutions for BDL data influence statistical analysis, mean REY values were calculated using a series of Q factors between 0 and 1, and then the relative standard deviation (RSD) of the means was assessed. For each element with BDL data, the detection limit was multiplied by the following Q factors: 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0. Mean values were calculated for each Q factor iteration and were used to calculate the RSD for each element. Smaller RSD percentages indicate that the means are more tightly clustered, and therefore less sensitive to the choice of Q factor, while larger percentages indicate more dispersion. Results for individual elements, total REY (TREY), and the concentration of both critical and excessive REY (as defined by Seredin [24]) are shown in Figure 2. The elements La, Ce, Sm, and Y have no BDL data, so no substitutions were necessary. The effect of substituting with Q factors was minimal (RSD < 12.4%) for Nd, Eu, Tb, Yb, Lu, TREY, critical, and excessive elements. RSD for Gd was 19.4%, for Er it was 37%, and for Pr, Dy, Ho, and Tm it was >50%. Figure 3 shows that individual REY results are largely correlated with the percentage of BDL data, with larger BDL proportions resulting in higher RSD. For elements with RSD below 12.4% and BDL proportions less than 25%, a Q factor of 0.7 was used. For the remaining elements, Q factors were chosen based on the mean values of the ADL data. Q factors and summary statistics for the adjusted data are shown in Table 1.
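The Q factor sensitivity test can be sketched as follows. This is an illustrative plain-Python version, not the authors' code: it assumes a single detection limit per element and uses the sample standard deviation (n − 1) when computing the RSD of the eleven means, neither of which is stated in the text. All names and values are hypothetical.

```python
from statistics import mean, stdev

# Q factors 0, 0.1, ..., 1.0 as listed in the text.
Q_FACTORS = [i / 10 for i in range(11)]


def substituted_mean(adl_values, n_bdl, detection_limit, q):
    """Mean concentration after replacing each BDL record with q * DL."""
    values = list(adl_values) + [q * detection_limit] * n_bdl
    return mean(values)


def rsd_of_means(adl_values, n_bdl, detection_limit):
    """Relative standard deviation (%) of the means across all Q factors."""
    means = [
        substituted_mean(adl_values, n_bdl, detection_limit, q)
        for q in Q_FACTORS
    ]
    return 100.0 * stdev(means) / mean(means)


# Hypothetical element: four ADL measurements (ppm), detection limit 1 ppm.
adl = [2.0, 3.0, 4.0, 5.0]
rsd_no_bdl = rsd_of_means(adl, 0, 1.0)    # no BDL records
rsd_few_bdl = rsd_of_means(adl, 1, 1.0)   # 20% of records BDL
rsd_many_bdl = rsd_of_means(adl, 5, 1.0)  # about 56% of records BDL
```

With these toy values the RSD is zero when there are no BDL records and grows as the BDL share increases, which is the qualitative behavior described for Figure 3.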

Pearson Correlation
To assess the strength of inorganic association for elements in WV coals, Pearson correlation coefficients were calculated progressively for all samples through the full range of ash yields (in ascending order) using the cor() function in R. Pearson correlation measures the strength of the linear relationship between two variables, producing a coefficient between −1 and 1. A coefficient of zero indicates no linear correlation, a coefficient of 1 indicates a perfect positive correlation, and −1 indicates a perfect negative correlation. Matrix correlation was also performed to generate Pearson correlation coefficients for each element versus every other element in the data set to provide further insight into elemental modes of occurrence. Before performing matrix correlation, the data were transformed using symmetric pivot coordinates, which have been shown to provide more robust results when working with compositional data, including coal geochemistry data [12,30]. Pivot coordinates were created using the pivotCoord() function (robCompositions package), and then matrix correlation was performed using the cor() function in R.
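One possible reading of the progressive calculation is sketched below in plain Python (the study itself used cor() in R). The sketch assumes "progressively" means a cumulative window: samples are sorted by ash yield and the coefficient is recomputed each time the next sample is added. That interpretation, the `min_samples` parameter, and the toy data are assumptions, and the pivot coordinate transformation is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either input is constant


def progressive_correlation(ash, element, min_samples=3):
    """Correlation over the k lowest-ash samples, for increasing k.

    Returns (highest ash yield in the window, coefficient) pairs.
    """
    pairs = sorted(zip(ash, element))  # ascending ash yield
    results = []
    for k in range(min_samples, len(pairs) + 1):
        window_ash = [p[0] for p in pairs[:k]]
        window_element = [p[1] for p in pairs[:k]]
        results.append((window_ash[-1], pearson(window_ash, window_element)))
    return results


# Hypothetical data: an element that scales exactly with ash yield.
toy_ash = [25.0, 5.0, 15.0, 10.0, 20.0]
toy_element = [2.0 * a + 1.0 for a in toy_ash]
trend = progressive_correlation(toy_ash, toy_element)
```

For an element that tracks ash yield exactly, every window returns a coefficient of 1; real elements with mixed associations would produce a curve that varies across the ash range.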

Hierarchical Clustering
Agglomerative hierarchical clustering was used to provide additional insight into elemental relationships and modes of occurrence. Hierarchical clustering provides information on the degree of affinity between elements in coal [11,12]. The height at which elements are joined in the resulting dendrogram indicates the degree of similarity, with greater heights indicating less similarity. Hierarchical clustering was performed with the hclust() function in R, using the average-linkage method and Pearson correlation as the measure of dissimilarity. These methods have been shown to be robust when working with coal geochemistry data [11,12].
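A minimal sketch of average-linkage agglomerative clustering on a correlation-based dissimilarity is given below in plain Python (the study used hclust() in R). The conversion of Pearson correlation to a dissimilarity as 1 − r is an assumption, since the text does not state the exact transformation, and the element profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_linkage(profiles):
    """Cluster elements; returns merges as (members_a, members_b, height)."""
    names = list(profiles)
    # Assumed dissimilarity: 1 - r, so strongly correlated elements are close.
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }

    def between(c1, c2):
        # Average linkage: mean of all pairwise member dissimilarities.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        height, i, j = min(
            (between(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical concentration profiles across five samples.
toy_profiles = {
    "La": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Ce": [2.0, 4.0, 6.0, 8.0, 11.0],
    "Fe": [5.0, 1.0, 4.0, 2.0, 3.0],
}
merges = average_linkage(toy_profiles)
```

The merge heights correspond to the dendrogram heights described above: in this toy case La and Ce join first at a low height, and Fe joins last at a much greater height.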

K-Means Clustering
K-means cluster analysis is a type of unsupervised machine learning algorithm used to identify underlying patterns in a data set and to partition data points into a series of clusters based on certain similarities by reducing the intra-cluster sum of squares. In this analysis, K-means was used to identify elemental patterns and chemical similarities between coal samples. The analysis was performed using the kmeans() function (stats package) in R. Data were scaled using the scale() function and the optimal number of clusters was determined using the elbow method with the fviz_nbclust() function (factoextra package). Coal samples were assigned to eight unique clusters. Effect sizes to determine relative enrichment between clusters were calculated by dividing the difference between the cluster mean and the full population mean by the standard deviation of the population.
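The effect size calculation described above reduces to a standardized mean difference. The plain-Python sketch below is illustrative only: it uses the population form of the standard deviation (dividing by n), whereas the study may have used the sample form (n − 1) that R's sd() returns, and the concentrations are hypothetical.

```python
from statistics import mean, pstdev


def effect_size(cluster_values, population_values):
    """(cluster mean - population mean) / population standard deviation."""
    return (mean(cluster_values) - mean(population_values)) / pstdev(
        population_values
    )


# Hypothetical concentrations (ppm) for all samples and for one cluster.
population = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, S.D. 2
cluster = [7.0, 9.0]                                    # mean 8
enrichment = effect_size(cluster, population)
```

A positive value indicates that the cluster is enriched in the element relative to the full data set, and a negative value indicates depletion. Here the cluster mean sits 1.5 standard deviations above the population mean.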

REY Economic Assessment
To help identify coal seams that can potentially serve as REY resources, a methodology proposed by Seredin and Dai [4] was used to evaluate each sample in the data set. Ash-basis rare earth oxide (REO) concentrations for each sample were calculated using the formula (REY)2O3. Market outlook (C_outl) coefficients were calculated using ash-basis REY concentrations by dividing the ratio of critical REY to TREY by the ratio of excessive REY to TREY. Samples were designated as highly promising, promising, or unpromising based on the following criteria developed by Seredin and Dai [4]: highly promising if REO concentration is at least 1000 ppm and C_outl is greater than 2.4, promising if REO concentration is at least 1000 ppm and C_outl is between 0.7 and 2.4, and unpromising if REO concentration is less than 1000 ppm or C_outl is less than 0.7.
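The screening criteria can be expressed as a short plain-Python sketch. It is illustrative rather than the authors' implementation: the function names are hypothetical, the critical and excessive REY sums are passed in as inputs rather than derived from element lists, the boundary values 0.7 and 2.4 are assumed to fall in the promising class, and the oxide conversion assumes every element is reported as a sesquioxide, as the (REY)2O3 formula implies.

```python
OXYGEN_MASS = 15.999  # g/mol


def oxide_factor(atomic_mass):
    """Mass of (REY)2O3 per unit mass of the REY it contains."""
    return (2 * atomic_mass + 3 * OXYGEN_MASS) / (2 * atomic_mass)


def outlook_coefficient(critical_ppm, excessive_ppm, total_ppm):
    """C_outl: (critical / TREY) divided by (excessive / TREY)."""
    return (critical_ppm / total_ppm) / (excessive_ppm / total_ppm)


def classify(reo_ppm, c_outl):
    """Apply the REO and C_outl thresholds given in the text."""
    if reo_ppm < 1000 or c_outl < 0.7:
        return "unpromising"
    if c_outl > 2.4:
        return "highly promising"
    return "promising"  # assumption: 0.7 <= C_outl <= 2.4


# Hypothetical ash-basis sample (ppm).
toy_c_outl = outlook_coefficient(critical_ppm=300.0, excessive_ppm=200.0,
                                 total_ppm=1000.0)
toy_class = classify(reo_ppm=1200.0, c_outl=toy_c_outl)
yttrium_factor = oxide_factor(88.906)  # Y to Y2O3
```

Because TREY appears in both ratios, C_outl reduces to the critical sum divided by the excessive sum. The toy sample has a coefficient of about 1.5 and is classed as promising.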

Pearson Correlation
Previous studies have established that positive correlations with ash yield generally indicate that an element has an inorganic association in coal, while negative correlations may indicate an organic association [10,31]. Calculating Pearson correlations between each element and progressively increasing ash yield produced a few notable trends. For most elements, correlation coefficients were either weakly positive or oscillated between weak positive and weak negative values throughout the full ash range, possibly due to mixed organic-inorganic relationships. However, Al and Si produced strong positive coefficients for the full range of ash values, indicating that these elements are inorganically associated and are the main ash-forming components in WV coals. This is in line with previous studies that have suggested clay minerals constitute the majority of mineral matter in most coal seams [32,33]. Two elements, Br and total sulfur (TS), produced negative correlation coefficients for the full range of ash yields. Although the correlations were relatively weak, results suggest these two elements may have organic associations in both high- and low-ash coals. An organic association for Br is supported by a study on Pennsylvanian-age coals from nearby eastern Kentucky [34]. Elements with weak correlations in low ash ranges but high correlations across higher-ash samples include Ti, Cr, K, Li, Mg, Th, V, Ga, Pb, Sc, and the light rare earths Ce, La, and Nd. Weak correlation in the low ash range may indicate that the elements are more likely to have organic or mixed organic-inorganic associations in low-ash samples but strong inorganic associations in higher-ash samples. Figure 4 shows correlation coefficient trends for REY across the range of ash yields. The REY form noticeable groups, with Ce, La, and Nd trending together and representing the strongest correlations with ash yield. Tb trends by itself with the poorest correlation, and the remaining REY trend in a loose group where Y and the heavy rare earths, Lu and Yb, have a stronger correlation than Sm and Eu.
Minerals 2022, 12, 1060
Results of progressive ash yield correlation were expanded upon by matrix correlation, which provided insight into the specific inorganic associations of elements in coal. Matrix correlation results are shown in Figure 5. Unsurprisingly, Al and Si were strongly correlated with each other as well as with Ti, and moderately with K. These elements are known to be primarily associated with clay minerals in coal [19,35,36]. Ti also strongly correlates with Nb (0.68), which is most commonly associated with zircon in coal [19]. The correlation between Nb and Zr in this analysis was 0.86, suggesting that zircon is a major mode of occurrence for Nb. However, Nb can also be present in TiO2 minerals (e.g., anatase/rutile or ilmenite), often finely dispersed among clay minerals [36,37]. Caesium does not correlate well with Si or Al, but has a moderately strong correlation with K. Previous studies have shown that Cs occurs in potassium-rich clay-forming minerals such as feldspar and mica, as well as within illite and mixed-layer clays [19,38]. Likewise, Mg correlates moderately well with K, but not with Si and Al in this analysis. Magnesium can be present in illite and mixed-layer clays, but may also be present in chlorite and various carbonates (e.g., calcite, ankerite, and dolomite) or may have organic associations [19,39,40]. Lithium, which is most commonly found in aluminosilicates in coal [19], has a strong to moderately strong correlation with Si, Al, and Ti in this analysis. Thorium and Cr have moderate correlations with Si and Ti and a strong correlation with each other. Thorium also has a moderate correlation with Sc and a strong correlation with Hf, Ce, and La. These elemental associations are in agreement with previous studies that have found Th in aluminosilicates, zircon, monazite, and xenotime in coal [19,41,42]. Compared to Th, Cr has slightly weaker correlations with Si, Ti, Hf, Ce, and La, but a stronger correlation with Sc. This may suggest that Cr and Th have some similar modes of occurrence; however, published literature indicates that Cr is most commonly found in spinel-group minerals (e.g., chromite), aluminosilicates (e.g., Cr-bearing clays), and in association with organic matter [15,19,40,41,43]. Scandium has also most commonly been associated with aluminosilicates and organic matter, but can occur as a trace element in zircon and xenotime [19,44]. Weak correlations with Si and Al in this analysis suggest that aluminosilicates may not be a major mode of occurrence for Sc in WV coals.
Several elements in this analysis correlate with only one other element. For example, Fe and As correlate only with each other (correlation coefficient of 0.62). The dominant mode of occurrence in coal for both of these elements is the mineral pyrite, although they can also be present in other sulfides, aluminosilicates, carbonates, and in association with organic matter [19,39,40]. Zinc and Cd share a moderate correlation and are predominantly associated with sphalerite in the literature, but they have also been found in pyrite and silicates in coal [19,39]. Moreover, Zn has also been found in the organic fraction of low-rank coals [19]. Cobalt and Ni also share a moderate correlation in this analysis. Both of these elements are known to have a variety of modes of occurrence in coal, but are primarily associated with sulfides, organic matter, and aluminosilicates [19]. Lead and Ag each have a moderate correlation with Cu. Copper is most commonly found in sulfides (e.g., pyrite and chalcopyrite) in coal, but may also occur in association with clay minerals and organic matter [19,40].
Silver has also been found in sulfide minerals and the organic fraction of coal, as well as in the native metal form [19]. Likewise, Pb is primarily associated with sulfides (e.g., pyrite and galena) and organics in coal, but can be present in numerous mineral phases including selenides (e.g., clausthalite), sulfates, carbonates, phosphates, and aluminosilicates [19]. Vanadium has a moderate correlation with Nd, but does not correlate with any other element. Vanadium is mainly associated with organic matter and aluminosilicates in coal [19,39]. Tantalum has a moderate correlation with Hf. Both of these elements are primarily associated with zircon in coal, although they have been identified in anatase, aluminosilicates, and (rarely) in association with organic matter [19]. Correlation between Hf, Th, and Ti does suggest possible associations with anatase and zircon. Beryllium has a moderate correlation with Y, but does not correlate with any other element. Modes of occurrence for beryllium are not well defined in the literature, but most studies have found either organic or silicate associations [19,39]. A very weak negative correlation with Si may indicate significant organic associations for Be in WV coals. Manganese has a moderate correlation with Mg and is most commonly found in carbonates including calcite, siderite, and ankerite, all of which have been identified in Appalachian Basin coals [19,37]. However, Mn has also been identified in silicates and in association with organic matter [19]. Strontium has a moderate correlation with Ca, but Ca also has a strong correlation with TS. All three of these elements can be introduced into a coal seam by circulating groundwater and TS and Ca are also connected to the original peat-forming plants [11,19]. Sulfur can be present in several forms in coal, but primarily occurs in sulfide minerals (e.g., pyrite) or in an organic form [19]. 
Organic S is thought to be the dominant form in low-S coals [19] like those used in this analysis. This is supported by a moderate negative correlation between TS and the major ash-forming elements, Si and Al. Like TS, Ca and Sr can have organic associations in coal, and the three elements have also been identified in sulfate minerals, including gypsum [11,19]. Calcium and Sr are also commonly found in carbonate minerals and crandallite [19,45]. However, Sr can occur in a variety of other minerals including goyazite, celestine, barite, and aluminosilicates, while Ca has also been identified in apatite in coal [19,41,45–48].
Elements that do not correlate well with any other element used in this analysis include Na, B, Ba, Br, F, Ga, Ge, Hg, Mo, P, Sb, Se, U, W, and Zn. These elements likely have mixed organic/inorganic associations, are present in a multitude of inorganic phases, or are present in a very low, narrow range of concentrations.
The REY form noticeable groups in terms of correlation with each other and with other elements. The lightest REY, La and Ce, have a strong correlation with each other and Th and a moderate correlation with Sm, Eu, Cr, and Sc. Lanthanum and Ce, along with Sm, Eu, Sc, and Th have been identified in phosphate minerals (e.g., monazite, crandallite, and apatite), zircon, carbonates, and aluminosilicates [19,39,45,47]. Samarium and Eu have a moderate to strong correlation with all other REY, except Nd and Y, and a moderate correlation with Sc. As previously noted, these elements have been found accompanying other LREY in phosphates and other minerals. The heaviest REY, Yb and Lu, have a strong correlation with each other, a moderately strong correlation with Sm and Eu, and a moderate correlation with Sc. The HREY are commonly associated with phosphate minerals (e.g., xenotime), organic matter, and aluminosilicates [19,39,45,47]. Ytterbium has also been identified in pyrite cement in a Central Appalachian Basin coal seam underclay [45]. Previous studies have suggested that HREY are more likely to form stable complexes with organic matter or to be adsorbed to clay mineral surfaces than the LREY [4,19,21,47]. Ytterbium and Lu correlations with ash yield are slightly weaker than ash yield correlations for La and Ce, possibly suggesting more organic affinity for the HREY in WV coals. Interestingly, Nd and Y moderately correlate with each other, but do not correlate with any other REY. Nd also has a moderate correlation with V and Y has a moderate correlation with Be.
These correlations could suggest that Nd and Y might be associated with organic matter or silicates, but it is unclear whether these relationships are real or if they are spurious correlations. Neodymium is generally assumed to have modes of occurrence similar to La and Ce and has been observed co-occurring with those elements in phosphate minerals (e.g., apatite and monazite) and aluminosilicates [39,45,47]. Yttrium has been observed in phosphate minerals (e.g., xenotime and crandallite) and zircon, but may also occur in association with organic matter and aluminosilicates [19,39,45,49]. Terbium has a moderate correlation with Sm and Eu but does not correlate with any other element. It has been detected in chalcopyrite in a Northern Appalachian Basin underclay, co-occurring with Dy and Er [47]. Published literature suggests that Tb has modes of occurrence similar to HREY [19,39].
Previous studies from within the Appalachian Basin have found REY associated with phosphate minerals, zircon, clay minerals, and organic matter [20,45,47,49,50]. Hower et al. [49,50] investigated the high-REY Fire Clay coal seam in eastern Kentucky, a seam that contains a volcanic-ash-fall tonstein. The authors identified REY in zircon, monazite, and crandallite [49,50]. Using particle size and density separations, Lin et al. [20] determined that REY had both organic and inorganic associations in a Central Appalachian Basin coal sample. REY-bearing minerals have also been directly observed within underclays associated with Appalachian Basin coal seams. Montross et al. [45] identified several REY-bearing phosphate minerals (e.g., apatite, monazite, and crandallite) in four coal seam underclay samples from WV. These minerals were present as discrete, detrital grains housed in the clay matrix and authigenic pore-filling minerals. Individual REY were detected as components of a clay coating on framboidal pyrite and in various types of cement. Yang et al. [47] also examined Appalachian Basin underclay samples. They found that REY primarily occurred within detrital phosphate minerals embedded in the clay matrix or as authigenic phosphates filling pore spaces. The pore-filling phosphates were presumed to have been formed during diagenesis or epigenetic infiltration events. However, baseline REY content was generally attributed to the original detrital input of REY-bearing minerals. The authors suggest that the post-depositional remobilization and redistribution of REY may also lead to a portion of the REY being adsorbed onto mineral surfaces [47]. REY concentration in coal seams is likely to be governed by similar processes. Detrital mineral sources remained similar throughout the Pennsylvanian subperiod, as source terrain and sediment distribution patterns remained relatively unchanged [51]. 
Therefore, the types of REY-bearing minerals initially introduced into the palaeomires should be the same as those contained in the underclays. However, the mechanisms controlling remobilization and redistribution of REY in coal would be influenced by differences in the depositional environment and the larger organic fraction within the resulting coal. Direct analytical methods should be applied to Appalachian coals in future studies to confirm individual REY modes of occurrence.

Hierarchical Clustering
An agglomerative hierarchical clustering algorithm was used to gain further insight into elemental relationships and modes of occurrence. The dendrogram shown in Figure 6 provides a visual representation of the degree of affinity between elements. At the highest level of the dendrogram, the elements are split into two major clusters, one of which contains the REY (right side of diagram). All of the REY used in the analysis appear together in a subcluster, except for Nd and Y. This subcluster also contains Sc, which appears more closely related to the heaviest REY (Yb and Lu) than to the LREY, and U, which joins at the highest level of the subcluster, indicating less similarity. This subcluster of REY is part of a higher-level cluster containing Nb, Zr, Si, Al, Ti, Li, Th, Cr, and Hf, elements that are commonly associated with zircon, clay minerals, TiO2 minerals (e.g., anatase and rutile), and phosphate minerals including monazite and crandallite. Yttrium and Nd are part of a small subcluster that also contains V, Be, and Ga. The latter three elements are mainly associated with organic matter and silicates (clay minerals in the case of V and Ga) in coal [19]. However, although these elements are clustered together, they do not have a very high degree of similarity. The subcluster containing Y and Nd is also part of a higher-level cluster containing W, Ge, Sb, Br, Co, Ni, Pb, Ag, and Cu. Many of these elements can be found in a variety of minerals, including sulfides and aluminosilicates, but all have been shown to have at least some association with organic matter [19].
The major (non-REY containing) cluster on the left side of the dendrogram contains several subclusters. The first contains Cd, Zn, Fe, As, Hg, and Mo, all elements that are primarily associated with sulfide minerals in coal [19]. The second subcluster contains Ca, TS, Sr, Ba, B, and Na. Sulfur, B, Sr, and Ba have been used as indicators for the depositional environment of the coal-forming peat (e.g., marine or freshwater influences) [19]. All of these elements have been identified in association with organic matter, but may also be present in a variety of minerals in coal [19]. Potassium, Mg, Cs, and Mn are also clustered together, although K, Mg, and Cs are more similar to each other than they are to Mn. The former three elements are most commonly found in aluminosilicates (e.g., illite, mixed-layer clays, feldspar, and mica), although Mg can also be present in carbonates and organic associations. Manganese has been identified in aluminosilicates as well, but is more common in carbonates, and sometimes organic matter [19].
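The grouping of elements described above can be illustrated with a minimal sketch of agglomerative clustering. The distance measure (1 minus the Pearson correlation) and the average-linkage rule are assumptions chosen for illustration, not necessarily the settings used to produce Figure 6, and the concentrations are hypothetical values rather than COALQUAL data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def agglomerate(profiles):
    """Average-linkage agglomerative clustering of elements.

    profiles maps an element name to its (ideally log-ratio transformed)
    values across samples. Returns the merge history as a list of
    (members_a, members_b, distance) tuples, lowest-level merges first.
    """
    names = list(profiles)
    # Assumed dissimilarity: 1 - r, so strongly correlated elements are close.
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}

    def link(c1, c2):
        # Average linkage: mean pairwise distance between two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        d, i, j = min((link(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical concentrations for six samples (illustration only).
profiles = {
    "La": [10, 20, 30, 40, 50, 60],
    "Ce": [22, 41, 58, 83, 99, 121],
    "Fe": [5, 1, 4, 2, 6, 3],
    "As": [50, 12, 41, 19, 63, 30],
}
merges = agglomerate(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

In this toy example La and Ce merge first and Fe and As merge second, and the two pairs join only at a much larger distance, which is the kind of structure a dendrogram displays.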

K-Means Clustering
K-means cluster analysis was used to assign coal samples to eight clusters based on geochemical patterns to identify samples that may have similar elemental modes of occurrence. These similarities may be due to common depositional conditions or the influence of similar post-depositional processes. The algorithm assigned 120, 65, 162, 45, 50, 6, 16, and 3 samples to Clusters 1 through 8, respectively. Figure 7 shows the relative enrichment (effect size) between clusters for each element used in the analysis. For most Clusters, Si and Al enrichment is proportional to enrichment in ash, except for Cluster 4 where the elements are more depleted than ash content, and Cluster 6, where Al is slightly more enriched than ash and Si. Enrichment patterns for the elements Mg, K, and Ti, which appear to be primarily associated with clay minerals in this analysis and in previous studies, also generally follow the same pattern as ash yield, Si, and Al. The enrichment pattern of Li mimics that of ash yield and the primary clay forming elements for all clusters except Cluster 8, where Li is slightly depleted while ash, Si, and Al are enriched. Likewise, the enrichment patterns of Cr, Cs, Sc, and Th mimic ash yield, for all clusters except Cluster 6, where those elements are depleted. Moreover, the enrichment patterns for many REY follow the general trend of ash yield across clusters, with some apparent exceptions. In Cluster 6, the elements Ce, Eu, La, Lu, Sm, Tb, and Yb are depleted even though the cluster is enriched in ash. In Cluster 2, REY are noticeably more depleted than ash, while in Cluster 8, REY are enriched far beyond the level of ash. Enrichment patterns for elements that do not have a significant association with the inorganic fraction are more complex. The trend for TS is contrary to ash in Clusters 1, 2, 4, 5, 6, and 7, indicating the presence of significant organic associations for sulfur. 
The trend for Br is contrary to ash in Clusters 1, 3, 5, and 6, again indicating the presence of organic associations in these clusters.
It is evident that samples from Clusters 2, 6, and 8 have influences that set them apart from the general elemental trends. Compared to other clusters, Cluster 6 is highly enriched in Na, B, Ba, Be, Ga, Ge, and Sr. Compared to other ash-enriched clusters, Cluster 6 is depleted in all of the rare earth elements except for Nd and Y. Enrichment in Na, Sr, and Ba suggests a seawater influence during the depositional or post-depositional periods. Sr and Ba have been shown to occur in certain coals as carbonate and sulfate minerals, including calcite, barite, celestine, and gypsum that can be of detrital, syngenetic, or epigenetic origin [19,39,41,48]. Cluster 6 is one of only three clusters in this analysis to show a relative enrichment in calcium, possibly indicating the presence of carbonates, which are usually authigenic in coal [19,39,51]. Calcium may also be present in LREE-bearing phosphate minerals including apatite and crandallite [45–47]. The six samples constituting Cluster 6 are high volatile bituminous coals from seams in the Kanawha Formation, located along the Kanawha River and its tributaries in the northwestern section of the Kanawha-New River Basin.
It is known that this region is affected by saline brines of marine origin flowing upward under artesian pressure from Mississippian-age and older formations [52]. It is possible that the coal samples in Cluster 6 were influenced by these brines post-deposition. The Mississippian brines, in particular, are known to be highly enriched in Sr, which would explain the excessive relative enrichment in this cluster [53]. Although accessory B-bearing minerals (e.g., tourmaline or mica), illite, or intra-basinal brines could elevate B concentrations [15,[54][55][56][57], Goodarzi and Swaine [58] suggested that B concentrations between 50-110 ppm in coal indicate a mildly-brackish water influence during or shortly after deposition. The B content in the Cluster 6 samples ranges from 45-226 ppm, possibly suggesting the influence of brackish groundwater. Although B content is sometimes used as an indicator of depositional conditions, an influx of seawater-derived brines after deposition could introduce a significant source of the element into the coal seam. Further, in seawater and related precipitates, Y/Ho and Zr/Hf ratios are high, with seawater ratios ranging from 44 to 74 and 85 to 130, respectively. Precipitated minerals contain similar but usually slightly lower ratios [59]. Holmium measurements for Cluster 6 are all below the detection limit, so they cannot be used to assess Y/Ho ratios, but the Zr/Hf ratios range from 53.2 to 138.6, which is further evidence of a seawater influence. This may explain why Y is one of only two REY enriched in this cluster since an influx of brines with anomalous Y concentrations may have mobilized and leached the other REY. In contrast, excess Y could have been incorporated into mineral precipitates or adsorbed onto mineral or organic matter. 
A post-depositional brine influence is also supported by Mastalerz and Drobniak [60], who suggested that in Ga-enriched coal, the element may be adsorbed onto coal surfaces within pore spaces during the post-depositional flow of Ga-enriched fluids.
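The Zr/Hf screening described above can be sketched as follows. The seawater range of 85 to 130 is the one quoted in the text; the function names and the sample values are hypothetical, and measurements below the detection limit (as for Ho in Cluster 6) are represented by None.

```python
SEAWATER_ZR_HF = (85.0, 130.0)  # Zr/Hf range for seawater quoted in the text

def element_ratio(numerator_ppm, denominator_ppm):
    """Return the ratio, or None if either value is missing or non-positive
    (e.g., a measurement below the detection limit)."""
    if numerator_ppm is None or denominator_ppm is None:
        return None
    if denominator_ppm <= 0:
        return None
    return numerator_ppm / denominator_ppm

def within(ratio, bounds):
    """True if the ratio lies inside the closed interval given by bounds."""
    low, high = bounds
    return ratio is not None and low <= ratio <= high

# Hypothetical (Zr, Hf) pairs in ppm, for illustration only.
samples = {
    "sample_a": (106.4, 2.0),
    "sample_b": (190.0, 2.0),
    "sample_c": (277.2, 2.0),
    "sample_d": (150.0, None),  # Hf below detection limit
}

ratios = {name: element_ratio(zr, hf) for name, (zr, hf) in samples.items()}
flags = {name: within(r, SEAWATER_ZR_HF) for name, r in ratios.items()}
for name in samples:
    print(name, ratios[name], flags[name])
```

Because precipitated minerals are reported to carry similar but usually slightly lower ratios than seawater, a value just below the lower bound would not by itself rule out a brine influence; the flag is only a first-pass screen.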
Cluster 8 is highly enriched in REY compared to other clusters. Dai et al. [18] indicate that in high-REY coal, the elements are often hosted within aluminum-phosphate-sulfate minerals of the alunite supergroup. These minerals are formed by the chemical weathering of sulfides [61]. Enrichment in TS, Ni, and Cd in Cluster 8 indicates the presence of sulfide minerals [10]. As sulfides are oxidized and dissolved in an aqueous solution during coalification or post-coalification weathering, they produce acidic groundwater that can dissolve additional mineral species and mobilize REY [61]. Water and sediments impacted by acidic coal mine waters often exhibit elevated REY concentrations and, when normalized to shale or the upper continental crust (UCC), present a concave pattern indicating relative enrichment in the middle REY [22,62]. UCC normalized Tb/La and Lu/Tb ratios in Cluster 8 range from 1.88 to 5.98 and 0.52 to 0.72, respectively, signifying medium REY enrichment. It appears likely that the Cluster 8 samples were impacted by acid mine waters. This is supported by significant enrichment in elements commonly associated with Appalachian coal mine drainage, including Mn, Ni, Cd, Zn, and TS. REY would have been transported within the fluid, and precipitated from solution, adsorbed onto clay minerals or organic material, or trapped within pore spaces in the Cluster 8 coal samples [63,64]. The REY are enriched far beyond ash content, suggesting that they may have some organic association in Cluster 8 samples. However, it is difficult to draw strong conclusions about this cluster because it only contains 3 samples.
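The UCC-normalized ratios used above can be computed as in the following sketch. The reference concentrations are placeholders of approximately the right magnitude and should be replaced with the values from whichever UCC compilation is being used; the sample is hypothetical. Treating Tb/La above 1 together with Lu/Tb below 1 as the signature of medium REY enrichment is an interpretation of the ratios reported for Cluster 8, not a rule stated in the text.

```python
# Placeholder UCC reference concentrations in ppm (assumed for illustration;
# substitute the compilation actually used for normalization).
UCC_PPM = {"La": 30.0, "Tb": 0.64, "Lu": 0.32}

def normalize(sample_ppm, reference_ppm=UCC_PPM):
    """Divide each element concentration by its reference concentration."""
    return {el: sample_ppm[el] / reference_ppm[el] for el in reference_ppm}

def medium_rey_signature(sample_ppm, reference_ppm=UCC_PPM):
    """Return (Tb/La, Lu/Tb, flag) using normalized concentrations.

    The flag is True when the medium REY (Tb) is enriched relative to both
    the light (La) and heavy (Lu) ends of the series.
    """
    n = normalize(sample_ppm, reference_ppm)
    tb_la = n["Tb"] / n["La"]
    lu_tb = n["Lu"] / n["Tb"]
    return tb_la, lu_tb, (tb_la > 1.0 and lu_tb < 1.0)

# Hypothetical whole-coal concentrations in ppm.
sample = {"La": 30.0, "Tb": 1.92, "Lu": 0.576}
tb_la, lu_tb, enriched = medium_rey_signature(sample)
print(round(tb_la, 2), round(lu_tb, 2), enriched)
```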
Cluster 2 is depleted in Si, Al, and other clay-associated elements but has an average ash yield close to the population mean. Enrichment in Ca may indicate the presence of calcium carbonates, while Fe enrichment most likely indicates the presence of pyrite or possibly siderite [19,51]. These authigenic minerals appear to make up a significant portion of the mineral matter in Cluster 2 samples. About 85% of samples in this group come from the Monongahela formation, which is known to contain abundant lacustrine carbonates, including micrites interbedded with argillaceous limestone and calcareous shale [65]. Since calcium carbonates in coal are not typically enriched in REY [51], a lack of detrital REY-bearing minerals in the original peat mire probably accounts for the relatively depleted status of these elements. Calcium in Appalachian coal is thought to be primarily sourced from groundwater [51], so it is possible that some of the existing REY leached out of the coal bed. This may explain why some of the REY in this cluster are slightly more depleted than those in Cluster 1, even though Cluster 1 is more depleted in ash.
Clusters 1 and 3 are the largest clusters and have elemental enrichment patterns that appear to be largely controlled by the ash, or inorganic, content of the coal samples. Cluster 1 is the most depleted in ash, Si, Al, and clay-associated elements, K, Ti, Cr, Cs, Li, and Sc. It is also depleted in other elements most commonly associated with the inorganic fraction of coal, including Ca, Fe, Ga, Th, V, Nb, Hf, Ta, and Zr, but is slightly enriched in TS and Br, which supports a high organic content in these samples. REY are depleted in this cluster due to the low inorganic content since mineral matter is the original primary source of REY in coal. Cluster 3 has an ash yield range similar to that of Cluster 2, but is slightly depleted in Ca, Fe, and TS. Silicon and Al contents are close to the population means and proportional to ash enrichment, suggesting that clay minerals likely make up a significant portion of the inorganic fraction. The average concentration of most elements in this cluster, including REY, are also close to the population means and proportional to ash yield.
Cluster 4 is the only cluster with an effect size greater than 1 for Mo and is the most enriched in As, both of which are most often associated with sulfide minerals and organic matter in coal [19,40]. The most common sulfide association for As is pyrite, where it can substitute for S₂²⁻ or, in some cases, for Fe [19]. In the present analysis, the correlation between As and Fe in Cluster 4 samples is 0.66, indicating some likely association with pyrite. The correlation between Mo and Fe, however, is very poor. Finkelman et al. [39] suggested that Mo may have a variety of associations in coal, including sulfides, silicates, and organics. Molybdenum is easily mobilized and redistributed due to high solubility even in low-oxygen conditions, resulting in highly concentrated subsurface deposits [66]. Diehl et al. [67] found that most As in Appalachian coals occurs in epigenetic pyrite and that enrichment is associated with late-stage precipitation from hydrothermal fluids. In addition to mobilizing Mo, hydrothermal fluids can also be responsible for REY mobilization and redistribution [4]. Most of the REY in Cluster 4 are enriched to a slightly higher degree than ash yield, which could potentially be explained by a hydrothermal influence. The cluster is also enriched in P, but no clear correlation exists with REY. If hydrothermal processes were at play, the REY may have multiple modes of occurrence in this cluster, including organic and inorganic.
Cluster 5 has a higher average ash yield and is more enriched in most clay-associated elements than Cluster 4 but is less enriched in REY. It is also more depleted in sulfide-related elements, including Fe, TS, Cd, Cu, Co, Ni, As, and Sb. However, it is the second most Se-enriched cluster. According to the West Virginia Geologic and Economic Survey, the highest Se concentrations in WV are present in coals of the Allegheny and upper Kanawha formations, particularly in the south-central part of the state [68]. Seventy-eight percent of samples in Cluster 5 come from these formations. A portion of Se in these samples may exist in the form of lead selenides, as the correlation between Pb and Se is 0.65. Lead selenides, such as clausthalite, are a common trace mineral in Appalachian Basin coal and have been identified within partings and filling organic matter pore space in WV samples [37]. However, some Se is likely to be directly bound or adsorbed to organics and could also occur in sulfide minerals [37,69]. Yudovich & Ketris [69] proposed two types of Se accumulation in coal: a mostly syngenetic accumulation of Se in high sulfur coals under reducing conditions or mostly epigenetic accumulation under the influence of oxidized, Se-rich waters. The coals of Cluster 5 are low-S, with TS values ranging from 0.136 to 1.87 ppm. An influx of oxidized water and subsequent leaching could explain enrichment in Se and depletion of sulfide-related elements. At low temperatures, selenides are more stable than sulfides in oxidizing conditions and are a common feature of low-temperature hydrothermal systems [70]. Oxidized fluid flow may also partially explain the REY distribution in Cluster 5. Cerium and La are more enriched than the other REY used in the analysis. Ce³⁺ can be oxidized to Ce⁴⁺, leading to an accumulation of Ce in oxidized zones, while the remaining REY are leached to new accumulation zones [71].
However, this does not explain why La is enriched similarly to Ce.
Cluster 7 is also enriched in Se, but unlike Cluster 5, it is also enriched in some common sulfide-related elements, including Fe, Cd, Cu, Co, Ni, Sb, and Zn. However, TS is depleted. Cluster 7 is also notable because it is highly enriched in K relative to the other clusters. As the most ash-enriched group, samples would contain a significant amount of clay minerals. Closer inspection reveals that the high cluster mean for K is largely driven by the stratigraphically older samples (lower to lower-middle Pennsylvanian). However, K concentrations for all samples in the cluster are within the top 7% of values for the entire WV data set. A mineralogical study of underclays associated with central Appalachian coal seams found that illite is more common and well-crystallized in lower and lower-middle Pennsylvanian age underclays than in stratigraphically younger clays. However, poorly-crystallized illite was found in a few samples from younger formations, including the Allegheny, which is also represented in Cluster 7. The authors concluded that illite in the lower to lower-middle Pennsylvanian underclays primarily represents the original sediments, with little to no alteration. In contrast, the younger underclays likely experienced in-situ alteration by post-depositional acid leaching during peat accumulation [25]. High concentrations of illite in Cluster 7 samples would explain enrichment in K, as well as the relatively high enrichment in Mg and Cs, as these elements are readily adsorbed onto the surface of or incorporated into the molecular structure of illite [19,72]. Potassium-rich clay could also serve as a host of Cr, Cd, Cu, Co, Ni, Pb, and Zn in this cluster of samples. Previous research has shown that heavy metal ions can be adsorbed onto alkaline clay mineral surfaces or, in some cases, bound within the clay minerals [19,73]. Like Cluster 5, samples in Cluster 7 may have experienced an influx of oxidizing, Se-rich waters.
But unlike Cluster 5, there is no correlation between Pb and Se. Instead, Se may be adsorbed to the large clay fraction, as Se has been shown to exist in a sorbed selenate state in oxidized coals [69]. Furthermore, Se may also be associated with organic matter, sulfides, or selenides [19,37,69]. REY enrichment in Cluster 7 generally appears proportional to enrichment in ash. Zircon is a potential host of Y in this cluster, as the correlation coefficient for Zr and Y is 0.68.
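Within-cluster correlations such as those quoted above (As and Fe in Cluster 4, Pb and Se in Cluster 5, Zr and Y in Cluster 7) can be computed as in the following sketch. The record layout, the function names, and the concentrations are hypothetical; samples with a missing value for either element are skipped, and clusters with fewer than three complete pairs are left out.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation coefficient (undefined if either series is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_by_cluster(rows, elem_a, elem_b, min_pairs=3):
    """Correlate two elements separately within each cluster label."""
    pairs = defaultdict(list)
    for row in rows:
        a, b = row.get(elem_a), row.get(elem_b)
        if a is not None and b is not None:
            pairs[row["cluster"]].append((a, b))
    result = {}
    for cluster, values in pairs.items():
        if len(values) >= min_pairs:
            xs, ys = zip(*values)
            result[cluster] = pearson(xs, ys)
    return result

# Hypothetical Pb and Se concentrations (ppm) for two clusters.
rows = [
    {"cluster": 5, "Pb": 4.0, "Se": 2.0},
    {"cluster": 5, "Pb": 6.0, "Se": 3.0},
    {"cluster": 5, "Pb": 9.0, "Se": 3.5},
    {"cluster": 5, "Pb": 12.0, "Se": 6.0},
    {"cluster": 5, "Pb": 5.0, "Se": None},  # missing Se, skipped
    {"cluster": 7, "Pb": 10.0, "Se": 5.0},
    {"cluster": 7, "Pb": 12.0, "Se": 2.0},
    {"cluster": 7, "Pb": 15.0, "Se": 4.0},
    {"cluster": 7, "Pb": 18.0, "Se": 3.0},
]
r = correlation_by_cluster(rows, "Pb", "Se")
for cluster, value in sorted(r.items()):
    print(cluster, round(value, 2))
```

Because the clusters differ greatly in size, a coefficient from a small cluster carries far more uncertainty than one from a large cluster and should be read accordingly.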
Ash yield, or the inorganic content of coals, plays a role in determining the REY concentration of samples. It is generally assumed that TREY concentration should correlate well with ash yield [33]. This assumption proves to be accurate when performing correlation with the WV dataset as a whole. However, K-means cluster analysis identified groups of samples that did not necessarily follow this trend. As shown in Figure 8, Clusters 2 and 3 have similar ranges of ash yield, but they have different ranges of TREY concentration (Figure 9). This is due to differences in the specific type of mineral matter present in these samples. Cluster 2 appears to be dominated by authigenic minerals like carbonates, while Cluster 3 likely had a more significant detrital input, which allowed allogenic REY-bearing minerals to accumulate. Likewise, Cluster 5 samples are generally higher in ash than Cluster 4, but Cluster 4 samples are more enriched in TREY. These results indicate that the relationship between REY and ash yield can vary due to differences in depositional environments and post-depositional processes within a coal basin.
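The comparison of ash yield and TREY across clusters can be illustrated with a short sketch of a per-cluster effect size. The definition used here, the cluster mean minus the population mean divided by the population standard deviation, is an assumption made for illustration and may differ from the effect size plotted in Figure 7; the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def effect_sizes(rows, variables):
    """Return {cluster: {variable: (cluster mean - population mean) / population sd}}."""
    population = {}
    for v in variables:
        values = [row[v] for row in rows]
        population[v] = (mean(values), pstdev(values))

    grouped = defaultdict(list)
    for row in rows:
        grouped[row["cluster"]].append(row)

    sizes = {}
    for cluster, members in grouped.items():
        sizes[cluster] = {}
        for v in variables:
            pop_mean, pop_sd = population[v]
            cluster_mean = mean([m[v] for m in members])
            sizes[cluster][v] = (cluster_mean - pop_mean) / pop_sd
    return sizes

# Hypothetical ash yield (%) and whole-coal TREY (ppm) values.
rows = [
    {"cluster": 1, "ash": 4.0, "trey": 30.0},
    {"cluster": 1, "ash": 6.0, "trey": 34.0},
    {"cluster": 2, "ash": 10.0, "trey": 20.0},
    {"cluster": 2, "ash": 12.0, "trey": 24.0},
    {"cluster": 3, "ash": 10.0, "trey": 60.0},
    {"cluster": 3, "ash": 12.0, "trey": 64.0},
]
es = effect_sizes(rows, ["ash", "trey"])
for cluster in sorted(es):
    print(cluster, {k: round(v, 2) for k, v in es[cluster].items()})
```

In this toy data set the two higher-ash clusters share the same ash effect size but have TREY effect sizes of opposite sign, mirroring the contrast between Clusters 2 and 3 described above.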

Economic Assessment of REY in WV Coals
In many cases, higher ash coal samples have higher whole coal REY concentrations than lower ash samples. But, when REY concentration is considered on an ash basis, the opposite is often true: high ash coals can end up with lower concentrations than their low ash counterparts. Therefore, when assessing the economic viability of coal as a source of REY, it is crucial to consider the type of coal product being used. If REY are being extracted directly from whole coal, it is evident that the most valuable coals would be those with the highest in-situ REY content. Additionally, materials co-produced during coal mining and use, such as overburden rocks, refuse, and underclays, are often enriched in REY relative to the associated coal seam [74], so coals with the highest whole coal REY concentration would likely also have associated refuse containing high REY concentrations. However, if coal ash is to be used as a raw material for extracting REY, one needs to identify coals with the highest ash basis REY concentrations. Seredin and Dai [4] developed a method to assess the economic viability of coal using the ash-basis REO sum and market outlook coefficients (C_outl) based on the ratio of critical to excessive REY. To identify economically promising samples, they proposed a total REO cutoff of 1000 ppm and a C_outl of >0.7.
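The screening logic described above can be sketched as follows. The ash-basis conversion divides the whole-coal concentration by the ash yield fraction. The assignment of elements to the critical group (Nd, Eu, Tb, Dy, Er, Y) and the excessive group (Ce, Ho, Tm, Yb, Lu) is the grouping usually attributed to Seredin and Dai and is an assumption here, since the groups are not listed in this section. Treating the 1000 ppm cutoff as inclusive is also an assumption, and all sample values are hypothetical.

```python
CRITICAL = ("Nd", "Eu", "Tb", "Dy", "Er", "Y")   # assumed grouping
EXCESSIVE = ("Ce", "Ho", "Tm", "Yb", "Lu")       # assumed grouping

def to_ash_basis(whole_coal_ppm, ash_yield_pct):
    """Convert a whole-coal concentration to an ash-basis concentration."""
    return whole_coal_ppm * 100.0 / ash_yield_pct

def outlook_coefficient(rey_ppm):
    """Ratio of critical to excessive REY; the total REY term cancels out."""
    critical = sum(rey_ppm[el] for el in CRITICAL)
    excessive = sum(rey_ppm[el] for el in EXCESSIVE)
    return critical / excessive

def is_promising(reo_ash_ppm, c_outl, reo_cutoff=1000.0, c_outl_cutoff=0.7):
    """Apply the REO cutoff (assumed inclusive) and the C_outl > 0.7 criterion."""
    return reo_ash_ppm >= reo_cutoff and c_outl > c_outl_cutoff

# Hypothetical REY concentrations (ppm); La, Pr, Sm, and Gd belong to neither
# group and are omitted.
rey = {"Nd": 20.0, "Eu": 1.0, "Tb": 1.0, "Dy": 4.0, "Er": 2.0, "Y": 22.0,
       "Ce": 40.0, "Ho": 1.0, "Tm": 0.5, "Yb": 2.0, "Lu": 0.5}
c_outl = outlook_coefficient(rey)

# A low-ash coal with modest whole-coal REO versus a high-ash coal with more.
low_ash = to_ash_basis(60.0, 5.0)     # 60 ppm REO at 5% ash
high_ash = to_ash_basis(150.0, 30.0)  # 150 ppm REO at 30% ash
print(round(c_outl, 3), low_ash, high_ash)
print(is_promising(low_ash, c_outl), is_promising(high_ash, c_outl))
```

The hypothetical low-ash sample passes the ash-basis cutoff while the high-ash sample does not, even though the latter has the higher whole-coal concentration.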
Coal ash is a promising resource for REY because they are non-volatile elements that become concentrated in ash during coal combustion, and vast quantities of ash are generated as a byproduct of electricity generation. In 2019 alone, the US produced 29,319,239 short tons of fly ash and 9,150,680 short tons of bottom ash [75,76], meaning coal ash is an abundant potential resource. Market outlook coefficients for WV COALQUAL samples ranged from 0.12 to 4.03, and total ash-basis REO concentration ranged from 317.0 ppm to 2825.01 ppm. Using the criteria established by Seredin and Dai [4], 68 samples fell in the promising range and 3 in the highly promising range. However, for the purposes of this discussion, all 71 samples will be referred to as promising. These samples represent 12.5% of the entire WV data set. The samples were sourced from 30 different coal seams (Table S1), almost entirely located within the Central Appalachian Basin in the southern half of the state (Figure 10). Based on the proportion of promising to unpromising samples, lower to lower-middle Pennsylvanian coals appear to be the most promising (Figure 11). In particular, the New River formation had the largest proportion of promising samples (28.10%), followed by the Pocahontas formation (20.75%), and the Kanawha formation (9.65%). However, the COALQUAL sample distribution is skewed toward stratigraphically older coal seams in the southern part of the state. While this may have resulted in a less accurate economic viability assessment for stratigraphically younger coals in northern WV, these coals are not mined as intensively as those in southern WV.
The promising samples are split between four of the eight K-means clusters. Cluster 1 has 31 promising samples, followed by Cluster 3 with 20 promising samples, Cluster 4 with 12 samples, and Cluster 8 with 2 samples. The remaining 6 promising samples were not used during cluster analysis due to a high proportion of missing values for various non-REY elements. Although Clusters 1 and 3 have relatively low whole coal TREY, they are low in ash, so the REY are not diluted by an abundance of other inorganic materials when converted to an ash basis concentration. 99% of Cluster 1 samples and 87% of Cluster 3 samples come from coal seams in the New River, Pocahontas, or Kanawha formations. REY enrichment in these clusters is generally proportional to inorganic content and does not appear to be greatly affected by post-depositional processes. About 89% of samples in Cluster 4 also come from the New River, Pocahontas, and Kanawha formations. Hydrothermal fluids may have enhanced REY enrichment in these samples. Although Clusters 1 and 3 have a higher total number of promising samples, Cluster 4 has a higher proportion of samples that are considered promising (27%, compared to 26% and 12% for Clusters 1 and 3). Cluster 8 has the highest proportion of promising samples (about 67%), but with only 3 total samples, the cluster itself is anomalous.
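The proportions quoted above follow directly from the cluster sizes reported in the K-means section, as the short check below shows. All counts are taken from the text; only the variable names are introduced here.

```python
# Cluster sizes assigned by the K-means algorithm (Clusters 1 through 8).
cluster_sizes = {1: 120, 2: 65, 3: 162, 4: 45, 5: 50, 6: 6, 7: 16, 8: 3}

# Promising samples per cluster, plus those excluded from cluster analysis.
promising = {1: 31, 3: 20, 4: 12, 8: 2}
unclustered_promising = 6

total_promising = sum(promising.values()) + unclustered_promising
shares = {c: 100.0 * promising[c] / cluster_sizes[c] for c in promising}

print("total promising:", total_promising)
for cluster in sorted(shares):
    print(f"Cluster {cluster}: {shares[cluster]:.1f}% promising")
```

The unrounded shares for Clusters 1 and 4 are 25.8% and 26.7%, so the difference between them is less than one percentage point.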
Although Seredin and Dai [4] set the ash-basis REO cutoff grade at 1000 ppm, they suggested that the cutoff could be reduced to 800-900 ppm in coal seams greater than 5 m thick. It can be argued that the cutoff grade could also be reduced for other reasons, including advances in REY extraction efficiency and the use of coal seams already being mined for electricity generation, resulting in reduced costs associated with acquiring the REY. If REO cutoff grade were reduced to 800 ppm in this analysis, it would increase the number of promising samples from 71 to 181 (about 31.8% of total WV samples). Future research should focus on assessing the economic implications of new REY extraction technologies and utilizing existing coal ash deposits or coal seams currently mined for power generation to determine whether potential cost reductions would justify lowering the REO cutoff grade.
The economic assessment methodology can also be applied to evaluate samples on a whole coal basis using an REO cutoff of 300 ppm, the concentration currently considered viable by the US Department of Energy [77]. Using this cutoff, the WV COALQUAL data set yields only 1 sample in the promising range: a Cluster 8 sample from the New River formation with an REO concentration of 320.05 ppm. Two other samples have REO concentrations greater than 270 ppm and C_outl values in the promising range. Both of these samples are from Cluster 7 and are located in the New River and "New River-Kanawha" formations. Cluster 7 samples have the highest range of ash yields and may host some of their REY in zircon.
Although Cluster 6 did not yield any promising samples, it is interesting that this cluster has the highest C_outl values, ranging from 2.40 to 4.03. This is because Nd and Y, both critical REY, are relatively enriched while the remaining REY are depleted. These coals and others along the Kanawha River and tributaries may be worth investigating as a source of Nd and Y, since excessive REY and the radioactive elements Th and U are present in relatively low proportions.

Conclusions
The results of Pearson correlation and K-means cluster analysis suggest that clay minerals dominate the inorganic fraction of most WV COALQUAL samples and that REY concentrations are generally proportional to inorganic content. However, in samples that appear to be dominated by authigenic minerals, such as calcium carbonates, REY are more depleted than clay-dominated samples with similar ash yields.
Based on published literature and results of matrix correlation and hierarchical clustering, REY in WV coals are likely to be associated with phosphate minerals (e.g., monazite, xenotime, and crandallite), zircon, and possibly aluminosilicates. Yttrium and Nd may have modes of occurrence that differ from the remaining REY, possibly including organic affiliations.
An ash basis economic assessment found 71 economically promising samples in the WV data set. The majority of promising samples came from coal seams in the Kanawha, New River, and Pocahontas formations in the Central Appalachian Basin.
While statistics are a standard indirect method to assess elemental modes of occurrence, there are limitations to this type of analysis. Coal geochemical data are compositional in nature and can produce misleading results when traditional statistical methods (e.g., correlation or hierarchical clustering) relying on Euclidean geometry are applied [12,30]. This problem can be overcome by first transforming the coal chemistry data using log-ratio transformation techniques including symmetric pivot coordinates, which are used in this study [12,30]. Further, statistical methods can only provide information about possible mineral phase associations (e.g., sulfides or aluminosilicates) and cannot be used to discern an element's presence within a specific mineral (e.g., pyrite or calcite) [19]. Elements in coal can have unexpected mineral associations (e.g., Pb and Se occurring in clausthalite rather than sulfides) and can be present in very different concentrations even within the same mineral (e.g., As being enriched in epigenetic pyrite relative to syngenetic pyrite) [19,67]. Mixed organic-inorganic associations can also complicate results and the degree of these associations can be influenced by coal rank or ash yield [10,39]. Therefore, statistical results must be interpreted using careful consideration of geochemical principles and can only be verified using direct analytical methods including transmission electron microscopy (TEM), scanning electron microscopy-energy dispersive X-ray spectroscopy (SEM-EDS), and X-ray diffraction (XRD).
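To make the log-ratio idea concrete, the sketch below applies the centered log-ratio (clr) transformation to a single hypothetical composition. This is deliberately a simpler technique than the symmetric pivot coordinates used in this study and is shown only to illustrate why log-ratios remove the dependence on the closing constant. The function name and input values are assumptions.

```python
import math

# Centered log-ratio (clr) sketch. This is NOT the symmetric pivot
# coordinate transformation used in the study; it is a simpler
# log-ratio transformation shown for illustration only.


def clr(parts):
    """Return the centered log-ratio of a strictly positive composition.

    Each part is expressed as its log minus the mean of the logs,
    i.e. log(part / geometric mean of all parts).
    """
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical three-part composition, in percent and as fractions.
percent = [10.0, 20.0, 70.0]
fraction = [0.1, 0.2, 0.7]

print(clr(percent))
print(clr(fraction))
```

The two printed vectors agree to within floating-point rounding because rescaling every part by the same factor shifts each log and the mean log equally. The clr values also sum to zero, which is one reason pivot-type coordinates are often preferred when correlations between individual parts are of interest.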
The COALQUAL database comprises full-bed coal samples that do not reflect vertical heterogeneity. These samples were collected at different times, for various purposes, and were analyzed using methods with varying degrees of precision and accuracy. Only broad inferences regarding bulk samples can be made during statistical analysis of COALQUAL data. The WV data set is biased against stratigraphically younger samples, so these coal seams are underrepresented in the analyses. However, the COALQUAL database is the largest publicly accessible source of coal chemistry data in WV and was leveraged in this study to obtain insight into possible REY modes of occurrence and identify coals that may serve as promising REY resources.
To fully understand the potential of WV coals as a resource for REY, samples should be investigated through direct methods such as TEM, SEM-EDS, and XRD, to accurately quantify REY content and modes of occurrence and to characterize the vertical variations within samples. Waste products related to the mining and utilization of these coals, including coal ash and abandoned mine discharge, should be located, cataloged, and analyzed to identify existing refuse that may serve as valuable REY resources.