Mid-Infrared Spectroscopy as a Potential Tool for Reconstructing Lake Salinity

Many aquatic ecosystems in Australia are impacted or threatened by salinisation; however, there is a paucity of records detailing the changes in salinity of individual water bodies that extend beyond a few decades. One way to overcome this issue is the use of inference models, which have typically been based on biological proxies. This pilot project investigates the potential for mid-infrared spectroscopy (MIRS) to provide an alternative method of reconstructing past salinity levels in Australian lakes. A small (19 lakes) calibration dataset was used to develop a MIRS-based lake water salinity inference model (measured vs. inferred salinity, based on leave-one-out cross-validation, R² = 0.64). This model and a previously published diatom–salinity model were both used to infer salinity levels in Tower Hill Lake in south-eastern Australia, over the last 60 years. Comparisons between these reconstructions and measured salinity data from Tower Hill Lake indicate that salinities inferred by the MIRS model more closely resembled the measured values than those produced using the diatom model, predominantly in terms of the actual values inferred, but also with regard to the trends observed. This supports the hypothesis that MIRS can provide a valuable new tool for reconstructing lake salinity.


Introduction
Salinity is a leading cause of soil degradation in Australia, resulting from alterations to land use, particularly vegetation clearance. In Australia, salinity has major economic and environmental impacts; over 2 million hectares of land is already affected by dryland salinity, whilst a further 3.7 million hectares has a high potential to be affected [1]. Without successful intervention, this is likely to increase to 17 million hectares by 2050 [2]. Many aquatic ecosystems in Australia are also impacted or threatened by salinisation [3,4]. For example, increasing salinity has already halved the number of bird species present in Western Australian wetlands [5]. However, long-term monitoring data are not available for the majority of Australian wetlands and lakes. Such information is important for understanding the extent and degree of lake salinisation induced by land-use change, establishing realistic remediation targets, and for inferring past climate, particularly in non-outlet lake systems. Consequently, alternative methods, such as palaeolimnology, are required to improve understanding of salinity within Australia [6,7].
Palaeolimnological methods that quantify past environmental conditions predominantly rely on the development of spatial calibration models. Within this paper, the term "spatial calibration model" refers to a multivariate regression model (typically using partial least-squares regression) used to determine the relationships between: (1) the value of the proxy in sediment samples from a number of different lakes or wetlands across a large spatial area; and (2) selected physical and chemical parameters from the same waterbodies, sampled at the time of sediment collection. Such predictive calibration models rely on the assumption that the processes responsible for differences over a spatial gradient will also be applicable for changes that have occurred over time. Furthermore, these models will only be accurate if the variable of interest (e.g., salinity) is the environmental factor that has the strongest influence on the selected proxy [8].
A number of studies have previously developed quantitative reconstructions of salinity within Australia, predominantly using spatial calibration models based on diatom and ostracod assemblages [9–15]. There are, however, a number of difficulties associated with some of these. For example, although diatom-based models have often been used to quantify past salinity levels overseas, in Australia their applicability is frequently limited by the impact of land-use changes, as diatoms are also sensitive to water quality shifts associated with land-use change [16]. Diatom-based inference models can also be hampered by poor diatom preservation in many Australian wetlands [17]. Taphonomic problems such as preferential transportation, different settling rates, and varying susceptibility to breakage or dissolution can also reduce the accuracy of diatom reconstructions. Finally, even when considering recent sediments that are not subject to the above issues, significant disparities between diatom-inferred values and monitoring data may still be observed [18]. Such issues can also apply to other biological proxies used to infer past changes in salinity within Australian waterways. For example, distinct differences between living foraminiferal communities and the assemblages preserved within sediments have been reported [19], and differential dissolution of ostracods dependent on both carapace morphology and the sedimentary microenvironment has also been observed [20]. Consequently, there is a need to develop a new, alternative method of determining past salinity levels.
Mid-infrared spectroscopy (MIRS) offers several advantages over more traditional reconstructions based on species assemblage data of biological proxies. Acquisition of the spectral data is rapid and several hundred samples can be measured in one day. The rapid rate of measurement, combined with the low cost of this method, allows for a greater number of sediment samples to be analysed given the same temporal and financial constraints [21–23]. Consequently, higher resolution studies and increased replication of samples are possible. Furthermore, less specialised knowledge and training are required, relative to species identification and taxonomy. Accordingly, MIRS represents a promising alternative to more costly and labour-intensive analyses of biological community assemblages.
The basic principle of MIRS is that infrared radiation stimulates molecular vibrations and, as a consequence of quantum mechanical behaviour, this radiation is absorbed at specific wavenumbers that correspond to those vibrations [24]. Consequently, the wavenumbers at which the mid-infrared radiation is absorbed can provide information about the molecular structure of the sample [25]. As every molecule has a unique chemical composition, it also has a unique infrared spectrum. Lake sediments consist of biogenic material derived from organisms formerly living in a lake and its catchment, as well as minerogenic material eroded from surrounding soils [26]. MIRS spectra of sediment consist of a combination of spectral signatures from all these components. The use of predictive calibration models provides one means of disentangling this information and extracting both qualitative and quantitative information about lake sediments and the conditions that create them. MIRS is often used in soil science, where predictive models enable a wide range of chemical parameters (Table 1) [17,20–23,26–30] to be reliably determined more quickly and cheaply than traditional analyses [31,32]. Given that MIRS can be used to quantify a range of variables in dried soils, it should be equally suitable for inferring similar properties of dried lake sediments.
Table 1 notes: 1 indicates that, instead of a multivariate calibration model (as presented here), these papers used the peak height or area to calculate the amount present; 2 indicates that the correlation (R) between the variables is stated, rather than the coefficient of determination (R²).
In comparison to soils, fewer properties of sediments (lacustrine, riverine, or marine) have been quantified using MIRS (Table 1) [22,33–36,40,41]. Furthermore, where MIRS methods have been applied to sediments, they have predominantly utilised a mathematical approach [40] whereby the concentration of a compound in a dilute potassium bromide (KBr) disc is linearly related to the absorbance at specific wavenumbers. Thus, by knowing the spectral peaks associated with specific compounds and determining the relative height of these, the amounts of compounds present in a sample can be calculated. This approach, along with similar methods based on peak area rather than height, has been used for quantitative determination of silica, carbonate, kaolinite, and other minerals in sediments derived from lake, marine, and riverine environments [42–44]. MIRS has also been used for more detailed investigations, such as identifying humic materials [45,46].
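The peak-height approach described above can be illustrated with a minimal Python sketch. It assumes a straight-line relationship through the origin between absorbance and concentration, fits the slope from standards of known concentration, and inverts it for an unknown sample. The function names and all numbers are hypothetical and are not taken from the cited studies.

```python
def fit_peak_calibration(concentrations, absorbances):
    """Least-squares slope for absorbance = slope * concentration (no intercept)."""
    numerator = sum(c * a for c, a in zip(concentrations, absorbances))
    denominator = sum(c * c for c in concentrations)
    return numerator / denominator


def concentration_from_peak(absorbance, slope):
    """Invert the linear calibration to estimate concentration from peak height."""
    return absorbance / slope


# Hypothetical standards: concentration of a compound in the KBr disc
# and the corresponding peak height in absorbance units.
standards_conc = [0.5, 1.0, 2.0, 4.0]
standards_abs = [0.05, 0.10, 0.20, 0.40]

slope = fit_peak_calibration(standards_conc, standards_abs)
unknown_conc = concentration_from_peak(0.25, slope)
print(slope, unknown_conc)
```

A peak-area variant would follow the same pattern, with integrated band area replacing peak height as the predictor.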
Although spatial calibration models are not widely used for MIRS analysis of sediments (Table 1), there has been some development in this area. For example, predictive models based on MIRS have previously been used to determine the amount of total organic carbon, total inorganic carbon, total organic nitrogen, and biogenic silica within lake sediments [21,23,47]. Such research demonstrates the applicability of MIRS for determining a range of environmental parameters. One advantage of spatial calibration models is that they allow parameters that influence the sediments but do not have a strong spectral signature (within the range being considered) to be inferred. For example, infrared spectroscopy has been used to determine water quality parameters at the time of sediment deposition, such as total organic carbon content [48]. The current paper, therefore, investigates the hypothesis that MIRS can also be used to infer the lake water salinity at the time of sediment deposition.

Materials and Methods
This study focused on lakes from the western plains of Victoria, Australia (Figure 1) that had a salinity less than 20 g·L⁻¹. This criterion was implemented because changes in salinity below this point tend to have the greatest ecological impact with regard to turnover of species. Many species that inhabit lakes above this salinity level can tolerate a very wide range of salinity [49]; thus, changes in salinity will have a proportionally lesser effect. Applying this criterion also enabled us to compare our data with an existing reconstruction [9] derived from a diatom-based calibration model [17]. The latter performs best at lower salinities (presumably because of the greater turnover discussed above); thus, using this range would enable an appropriate comparison of the models produced.
Based on previously recorded salinities [17], 44 lakes were initially selected for sampling; however, during the field campaign 15 of these were found to be dry, the salinities of 9 others exceeded the selection criterion, and 1 lake was inaccessible. Consequently, 19 lakes were incorporated in the calibration dataset. Surface sediments were collected from either the deepest part of the lake or from the centre of the lake when no bathymetric data were available. Sediments were collected using a modified Hongve corer [50]. The sediment/water interface was suctioned off using a large-bore 50 mL medical syringe; the material removed did not exceed 2 mm in core depth. This technique was used as the surface sediments were too soft to remove in a more traditional manner (i.e., using a spatula or core cutter). Given the very rapid sedimentation rates observed in almost all Australian lakes since European arrival [51–54] and the thin layer of sediment collected, it is assumed that the samples represent a few years of accumulation, at most.
Electrical conductivity (EC), as an indicator of salinity, was measured at the time of sediment sampling using a TPS 90-FL Field Lab multimeter that was calibrated before each use. EC measurements were taken from within the photic zone, approximately 1 m below the water surface. The salinity was calculated by multiplying the measured EC value by 0.64 [55]. Water temperature, pH, and dissolved oxygen (DO) were also measured in the field using this device. As the majority of these lakes are shallow (<5 m deep), minimal stratification is assumed and thus no attempt was made to measure changes in these parameters with depth through the water column. Water samples were collected approximately 50 cm below the surface for the analysis of chloride (Cl), alkalinity, aluminium (Al), iron (Fe), calcium (Ca), potassium (K), magnesium (Mg), sodium (Na), total nitrogen (TN), total phosphorus (TP), and sulphur (S). Water samples were also collected and filtered to allow the dissolved concentrations of the latter 7 parameters to be determined, along with phosphate and total carbon (TC). Analyses of these samples were undertaken at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) Land and Water laboratories using standard methods. This range of chemical data was intended to provide information about the calibration lakes and their settings with only a subset consisting of Cl, Ca, dissolved N, EC, and pH incorporated in the calibration dataset. These parameters are all ecologically meaningful yet, despite previous research indicating that MIRS-based models can quantify these parameters within soils [29,31], these methods have yet to be applied to lake sediments. In contrast, well-developed MIRS-based calibration models already exist for quantifying the proportion of organic carbon, inorganic carbon, and biogenic silica in lake sediments; thus these compounds were not considered here.
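The EC-to-salinity conversion described above amounts to a single multiplication, sketched below in Python. The factor 0.64 is the one stated in the text [55]; the function name is hypothetical, and the units are an assumption (the text does not state the EC units, so EC is taken here to be in mS·cm⁻¹, giving salinity in g·L⁻¹).

```python
EC_TO_SALINITY_FACTOR = 0.64  # conversion factor stated in the text [55]


def ec_to_salinity(ec):
    """Convert electrical conductivity to salinity.

    Assumption: ec is in mS/cm and the result is in g/L; the source
    gives the factor but not the units.
    """
    if ec < 0:
        raise ValueError("electrical conductivity cannot be negative")
    return EC_TO_SALINITY_FACTOR * ec


# Hypothetical field reading
print(ec_to_salinity(10.0))
```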
An existing sediment core from Tower Hill forms the basis of the salinity reconstruction developed herein. This core was collected in 2000 with both collection and subsampling details described elsewhere [9]. To briefly summarise, two cores were collected approximately 50 m in from the western edge of the lake. One of these was predominantly used for dating and midge studies whilst the second core was primarily used for the diatom reconstruction; remnant samples from the latter have been used herein. The chronology consisted of ten ²¹⁰Pb dates constrained by a marked change in stratigraphy that corresponded to the date the lake was most recently inundated [9]. Sediments below this stratigraphic marker could not be reliably dated using ²¹⁰Pb and thus have not been included within this study.
Prior to MIRS analyses, samples were air-dried and machine-ground to a fine powder (nominal particle size of <100 µm) in a single-puck Rocklabs© mill. Samples were analysed as neat powders using a rapid-scanning Fourier-transform spectrometer (Bio-rad 175C, Hercules, CA, USA) with an extended-range KBr beamsplitter and DTGS (deuterated triglycine sulphate) detector (spectral range of 8300-440 cm⁻¹) at 4 cm⁻¹ resolution. Spectral frequencies were referenced against an internal He-Ne laser to give a precision and accuracy of 0.01 cm⁻¹. The diffuse reflectance accessory (Harrick™ DRS-3SO) used off-axis geometry and was set up for maximum energy without removing stray specular radiation. An initial KBr blank spectrum was run to test the spectrometer performance and as a reference for calculating the sample spectra in absorbance units. Each spectrum was acquired over 60 scans, with acquisition and processing taking 1 min per sample. Only the MIRS range (4000-470 cm⁻¹) was used for further analysis. Spectra were baseline corrected prior to model development. All data were mean centred and chemical data were autoscaled in The Unscrambler V 9.8 (Camo Software AS, Oslo, Norway). Chemical parameters were then related to the spectral dataset using the kernel partial least-squares (PLS) regression model available in The Unscrambler. A PLS analysis was performed for each variable individually using the autofit function, which selects the number of axes that the software determines best explains the variance in the data. The plots of residual variance were used to confirm the optimal number of axes. Models were initially developed using untransformed chemical data; however, given the skewed distribution observed for some variables, a square-root transformation was also applied. Cross-validation of PLS models was performed using an iterative leave-one-out procedure (LOOCV). In this procedure, one lake is omitted from the calibration model, the salinity is predicted for that lake, and the prediction is compared to the measured value. The model is then re-run with the first lake reinstated and a second lake left out. This process continues until each lake has been omitted in turn, which provides an estimate of the error associated with the model.
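To make the leave-one-out procedure concrete, the following Python sketch fits a single-response PLS model on mean-centred data and cross-validates it by omitting one sample at a time. It is an illustration only: it uses a simple NIPALS-style PLS1 rather than the kernel PLS algorithm in The Unscrambler, omits autoscaling and the square-root transformation, and all function names and data are hypothetical.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation of mean-centred X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X of maximum covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # no covariance left between X and y residuals
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove the part explained by this component
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ei * wi for ei, wi in zip(e, w))
        y_hat += q * t
        e = [ei - t * li for ei, li in zip(e, load)]
    return y_hat


def loocv(X, y, n_components):
    """Leave-one-out predictions and root mean squared error of prediction."""
    preds = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        preds.append(pls1_predict(model, X[i]))
    rmsep = math.sqrt(sum((p - o) ** 2 for p, o in zip(preds, y)) / len(y))
    return preds, rmsep


# Hypothetical toy data: y is an exact linear function of two predictors
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 3.0], [3.0, 1.0], [4.0, 4.0]]
y = [1.0 + 2.0 * a - b for a, b in X]
preds, rmsep = loocv(X, y, n_components=2)
print(preds, rmsep)
```

In a real application each row of `X` would be a baseline-corrected spectrum and `y` the measured (or square-root transformed) EC of the overlying lake water.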
An independent validation was performed to assess the validity of salinity values inferred from the MIRS calibration model. Reconstructed values from a sediment core collected from Tower Hill Lake in 2000 were compared to historical monitoring data. Salinity data were available for 1983-1990 from the Tower Hill Lake State Game Reserve, from two sampling locations. EC data were available for 1993-1996 from Australian Water Technologies. To facilitate comparisons between the different datasets, EC was again converted to salinity using a conversion factor of 0.64 [55]. Due to irregular sampling frequencies, annual averages have been calculated from these datasets. In addition, a single salinity measurement was also available for 2007 and 2008 [56]. Finally, the MIRS-inferred salinity reconstruction was compared with a diatom-inferred salinity record from this site [9]. Errors on the two different reconstructions have been calculated as ± the root mean squared error (RMSE) of the model.
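The two data-handling steps described above (annual averaging of irregularly sampled monitoring data, and error bounds of ± RMSE on each inferred value) can be sketched as follows. The function names and the example records are hypothetical; they are not the Tower Hill Lake data.

```python
from collections import defaultdict


def annual_means(records):
    """Average irregularly sampled (year, salinity) records by calendar year."""
    by_year = defaultdict(list)
    for year, value in records:
        by_year[year].append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


def error_band(inferred, rmse):
    """Lower and upper bounds for an inferred value, as inferred ± RMSE."""
    return inferred - rmse, inferred + rmse


# Hypothetical monitoring records: (year, salinity)
records = [(1983, 6.0), (1983, 8.0), (1984, 5.0)]
print(annual_means(records))
print(error_band(5.5, 0.7))
```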

Results
The calibration lakes were all well-oxygenated and predominantly circumneutral to slightly alkaline (Table 2). The majority of lakes had a water depth less than 5 m, and water temperatures within the upper metre ranged between 17 and 23 °C (Table 2). Nutrient levels were frequently low, although a few lakes within the calibration set had higher nutrient levels (Figure 2). Although the criterion used to select the lakes was that the salinity was less than 20‰, the highest recorded salinity within the calibration dataset was 10.2‰. Temperature, pH, EC, Cl, and Na all showed a relatively even distribution across their gradients (Figure 2). Although DO had a relatively even distribution, there were a few outliers with higher values. The remaining variables were all skewed to varying degrees, with more samples in the low-mid values and fewer samples with higher values (Figure 2). A pronounced outlier was visible in each of TN, TP, and S (Figure 2); the outlying sample for TN and TP was from Yallakar Lake whilst the outlying S sample was from Immensal North. EC was strongly correlated with Na and Cl (R = 0.94 and 0.89, respectively). EC showed moderate correlations with K and Mg (R = 0.47 and 0.53, respectively), but no correlation with Al, Ca, or TP. A weak negative correlation (R = −0.3) was observed between EC and alkalinity. These results reinforce previous findings that Na and Cl are the dominant salts present in surface waters from this area [57] and justify the use of EC as a measure of salinity within this study [55].
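The correlations reported above are stated as R; assuming these are Pearson correlation coefficients (the text does not name the statistic), they can be computed as in the following minimal Python sketch. The function name and the example values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for EC and Na in five lakes
ec = [1.2, 2.5, 4.0, 7.5, 9.8]
na = [0.3, 0.6, 1.1, 1.9, 2.4]
print(pearson_r(ec, na))
```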
Significant variability was observed in the spectra of the different sediment samples (Figure 3), especially within the lower spectral range, which suggests varying mineralogy or organic content. However, of the chemical parameters examined, only salinity showed a significant relationship with the spectral data. No significant axes were identified for any parameter other than EC and √EC, and thus R² values could not be calculated for the other variables. It is noted, however, that only a small number of chemical parameters were assessed and these results do not preclude other environmental or chemical variables from also influencing the spectra. The results do, however, indicate that the salinity at the time of sediment deposition exerts some influence on the spectral signature of lake sediments. Interestingly, as the salinities increase, the spectral differences between samples diminish; thus, less variability is observed between the spectra of lakes 16-19 than is seen between lakes 1-5 (Figure 3). A biplot of the PLS components (Figure 4A) shows that samples with salinity less than 2.5‰ (lakes 1-8) form a separate group to those with higher salinities. This separation indicates that the spectral signature of the sediments reflects, to some degree, the salinity of lake water overlying the sediments at the time of their deposition. The lakes with low salinities tend to be more widely dispersed on the scatter plot (Figure 4A) than lakes of higher salinities, again reflecting the greater variability observed in the spectra of the former (Figure 3).
[Figure captions, partially recovered: each parameter is arranged in order of increasing value for that parameter, with lake names and locations shown in Figure 1; see Figure 1 for lake locations and Table 2 for physical and chemical data.]
A plot of the residual variance indicates the initial amount of variance within the dataset, the amount remaining after each component is sequentially added to the model, and the amount of variance that is explained by each component (Figure 4B). In this plot, the "calibration" set refers to a model which incorporates all 19 lakes and thus shows the fit of the model to the training data. In contrast, the validation data are derived from the iterative LOOCV procedure and thus provide an indication of the prediction error of the model (Figure 4B). For example, the calibration data suggest that a five-component model would account for 90% of the variance within the dataset, with a nine-component model accounting for 99% of the variance. In contrast, the validation data show that a five-component model explains only ~65% of the variance. Furthermore, within the validation data, 19% of the variance cannot be explained regardless of how many additional components are included. This indicates that some of the explained variance within the calibration set is not related to salinity but rather is attributable to other differences in the spectra. This raises the possibility that another variable, not captured within the calibration dataset, contributes to the spectral variability. This could potentially relate to the variability observed within samples from lakes with low salinity values discussed above. However, the high residual variance could also indicate a significant component of noise within the model, which could result from the small number of lakes incorporated within the calibration dataset.
The residual variance should decrease as additional components are added to the model. If the residual variance remains the same, it indicates that the additional component is reflecting variance that has already been captured by one or more of the existing components. If the variance increases with the addition of a component, this indicates that the component is associated with noise in the calibration dataset. Consequently, the slight increase in residual variance observed within the validation data for a six-component model can be attributed to noise and thus indicates that a five-component model is the best choice for these data. A five-component PLS model for √EC, as a measure of salinity, was subsequently developed and performed reasonably (R² LOOCV = 0.64, RMSEP = 0.7 g·L⁻¹). It is noted, however, that the model underestimates the variance within the calibration set, with values in the lower range overestimated and values in the upper range underestimated (Figure 4C).
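The selection rule described above (stop adding components once the validation residual variance rises) can be written as a short Python function. This is a sketch of that rule only, not of The Unscrambler's autofit function; the function name and the variance values are hypothetical.

```python
def choose_n_components(validation_residual_variance):
    """Return the number of components before the first increase.

    validation_residual_variance[k] is the cross-validated residual
    variance of a model with k components (index 0 = no components).
    """
    v = validation_residual_variance
    for k in range(1, len(v)):
        if v[k] > v[k - 1]:
            return k - 1  # component k added noise, so keep k - 1
    return len(v) - 1  # variance never increased; keep all components


# Hypothetical validation curve with a slight rise at six components
curve = [1.00, 0.70, 0.55, 0.45, 0.40, 0.35, 0.36, 0.34]
print(choose_n_components(curve))
```

A stricter variant could also stop when the decrease falls below a small tolerance, which would capture the case where an added component only repeats variance already explained.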
The first two components (PC1 and PC2) have 43% and 29% influence on the model, respectively (Figure 4B). Further components have less effect on the model. The influence of the spectral regions on each component in the salinity model is indicated by the loading weights. Although the full spectral range used is shown, wavenumbers associated with higher loading weights will have the greatest influence on the model, whilst those with smaller loadings will have less influence. The loading plot (Figure 5) shows that the relationship between PC1 and salinity is predominantly influenced by kaolinite, signified by the three peaks between 3600 and 3700 cm⁻¹ [40,58]. The peak at 918 cm⁻¹ may also indicate an influence of kaolinite on PC1, but this peak is not definitive. Many compounds with bonds visible within the lower wavenumbers have spectra that overlap, and the resulting complexity of the spectra can hinder identification. PC1 is also negatively related to organic matter (OM), as indicated by the alkyl peaks around 2800-2900 cm⁻¹ [40]. As the CH₃ stretching vibrations at 2965 cm⁻¹ are of a similar magnitude to the CH₂ vibrations at 2853 and 2922 cm⁻¹, the OM seen as a negative correlation in PC1 probably represents a range of organic materials with short alkyl chain lengths. A small band at 3060 cm⁻¹ is characteristic of C=C-H stretching and is seen in spectra of humic materials. Weak, positive relationships are observed between PC2 and organic carbon, particularly C-H bonds shown by the peaks at 2850 and 2950 cm⁻¹ [40,44,59]. In contrast to PC1, the stronger, positive peaks for CH₂ vibrations at 2853 and 2922 cm⁻¹ in PC2 suggest that longer-chained materials, such as lipids, are responsible. Although quartz peaks around 800, 1040, 1878, and 1990 cm⁻¹ [40,44,59] are negatively related to both PCs, they exert a much stronger influence on PC2 (Figure 5). Three small carbonate peaks are observed on the loading plot at 1470, 1810, and 2523 cm⁻¹ [22,31,36], influencing both PC1 and PC2.
Given the significant relationship between MIRS-predicted and measured salinity in the spatial calibration model, the model was applied to down-core samples from Tower Hill Lake. The resulting MIRS-inferred salinity reconstruction (Figure 6) indicates that a sharp decrease in inferred salinity values occurred between 1946 and 1955. This decrease was followed by a gradual, long-term increasing trend that persisted until the late 1980s. Subsequent to this, the MIRS-inferred salinity levels fluctuated from 5‰ to 6‰ but exhibited a greater degree of variability between samples than had previously been observed. The MIRS-inferred values are within the lower range of salinity measured between 1983 and 1990 ( Figure 6). However, the salinity data collected in the 1980s show a higher degree of variability than that seen within the MIRS reconstruction, with the annual averages varying between 5.4‰ and 12.3‰. In contrast, the MIRS inferred values agree relatively well with salinity measurements reported in the 1990s, where the annual averages ranged between 4.3‰ and 7.0‰. Climatic changes could explain the differences in variance observed, however, part of these observed differences are likely to be associated with the higher frequency of sampling during the 1980s as this would result in a greater range of values being captured, as well as increasing the potential for extreme values to be recorded. stronger influence on PC2 ( Figure 5). Three small carbonate peaks are observed on the loading plot at 1470, 1810, and 2523 cm −1 [22,31,36], influencing both PC1 and PC2. Given the significant relationship between MIRS-predicted and measured salinity in the spatial calibration model, the model was applied to down-core samples from Tower Hill Lake. The resulting MIRS-inferred salinity reconstruction (Figure 6) indicates that a sharp decrease in inferred salinity values occurred between 1946 and 1955. 
This decrease was followed by a gradual, long-term increasing trend that persisted until the late 1980s. Subsequent to this, the MIRS-inferred salinity levels fluctuated from 5‰ to 6‰ but exhibited a greater degree of variability between samples than had previously been observed. The MIRS-inferred values are within the lower range of salinity measured between 1983 and 1990 ( Figure 6). However, the salinity data collected in the 1980s show a higher degree of variability than that seen within the MIRS reconstruction, with the annual averages varying between 5.4‰ and 12.3‰. In contrast, the MIRS inferred values agree relatively well with salinity measurements reported in the 1990s, where the annual averages ranged between 4.3‰ and 7.0‰. Climatic changes could explain the differences in variance observed, however, part of these observed differences are likely to be associated with the higher frequency of sampling during the 1980s as this would result in a greater range of values being captured, as well as increasing the potential for extreme values to be recorded. The diatom-inferred values showed little resemblance to the MIRS-inferred values throughout most of the core, although detailed comparisons are difficult given the lower sample frequency seen in the diatom reconstructions. Prior to 1990s, the diatom model shows little variation whereas the MIRS-inferred values suggest an initial decline followed by a generally increasing trend up-core. After 1990, the diatom-inferred values show some initial variability followed by an almost exponential increase in salinity between 1995 and 2000. In contrast, the MIRS model suggests a period The diatom-inferred values showed little resemblance to the MIRS-inferred values throughout most of the core, although detailed comparisons are difficult given the lower sample frequency seen in the diatom reconstructions. 
Prior to 1990s, the diatom model shows little variation whereas the MIRS-inferred values suggest an initial decline followed by a generally increasing trend up-core. After 1990, the diatom-inferred values show some initial variability followed by an almost exponential increase in salinity between 1995 and 2000. In contrast, the MIRS model suggests a period of relatively stability occurred between 1990 and 2000, with fluctuations varying between 5‰ and 6‰. Thus, there is a cross-over period between when the diatom model indicates lower salinity values than the MIRS model, and when the diatom model shows higher variability than the MIRS model. There is also a significant increase in error associated with diatom-inferred values above 3‰ and thus there is a slight overlap between the two models from 1955 to 1970 and 1995 to 1998 when the associated errors are included ( Figure 6). The diatom-inferred salinity values are much lower than the measured values during the 13 years for which the latter data is available. There is no overlap between any of the measured values and the inferred salinity values, even when the error range of the diatom model is considered. Finally, the highest diatom-inferred salinity values during the period for which monitoring data is available, correspond to some of the lowest salinity values observed within the monitoring data.
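The component-selection rule and the validation statistics reported in this section can be illustrated with a minimal Python sketch. It is not the software used in this study. The function names and the example variance values are hypothetical, and R2 is computed here as the coefficient of determination, which is an assumption because the definition used for R2 LOOCV is not restated in this section.

```python
import math


def select_n_components(validation_variance):
    """Pick the number of components from validation residual variances.

    validation_variance[i] is the residual variance of a model with
    i + 1 components. Components are added while the variance strictly
    decreases; an unchanged or increased variance is treated as a
    redundant or noise-related component, following the rule in the text.
    """
    for i in range(1, len(validation_variance)):
        if validation_variance[i] >= validation_variance[i - 1]:
            return i  # the first i components were sufficient
    return len(validation_variance)


def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmsep(measured, predicted):
    """Root mean squared error of prediction."""
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical validation variances for models with 1 to 7 components:
# the slight rise at the sixth component points to a five-component model.
example_variance = [0.90, 0.70, 0.50, 0.40, 0.35, 0.36, 0.30]
n_components = select_n_components(example_variance)  # -> 5
```

In a leave-one-out setting, `predicted` would hold the value inferred for each lake by a model fitted to the remaining lakes, so both statistics describe out-of-sample performance rather than goodness of fit.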

Discussion
This study demonstrates the potential to use MIRS to reconstruct lake salinity in Australia, and potentially around the world. The inference model's ability to infer salinity is predominantly linked to changes in the mineralogy of the sediments with changes in the organic content also contributing a small component. These results agree well with MIRS studies of Australian soils, where inference models for EC were strongly influenced by kaolinite, smectite, gibbsite, and quartz [31]. Hence, we would caution that this initial model is reflective of the environmental and geological conditions found within the study region. As such, the model itself should not be applied to areas outside of this region; however, the technique of using MIRS to develop a calibration model and subsequently infer past salinity could be applied elsewhere. Similarly, the model presented here is only suitable for lakes with a low salinity; it should not be assumed that the model can be extrapolated beyond the range of salinities observed. Further research is required to determine whether higher salinities can be inferred using MIRS, and if so, whether the relationships with mineralogy observed here are still maintained for lakes with higher salinities.
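The caution against extrapolation can be made operational with a simple range check. The Python sketch below is illustrative only: the function name and the numbers are hypothetical and are not taken from the calibration dataset. It flags any inferred salinity that falls outside the range of salinities measured in the calibration lakes.

```python
def flag_extrapolated(inferred, calibration_values):
    """Return True for each inferred value outside the calibration range.

    Values beyond the minimum or maximum salinity of the calibration
    lakes are extrapolations and should be interpreted with caution.
    """
    lower = min(calibration_values)
    upper = max(calibration_values)
    return [not (lower <= value <= upper) for value in inferred]


# Hypothetical calibration salinities and down-core inferred values.
calibration = [0.5, 2.0, 6.0]
down_core = [0.4, 3.0, 6.0, 7.5]
flags = flag_extrapolated(down_core, calibration)
# -> [True, False, False, True]
```

A check of this kind only identifies samples that lie outside the calibrated range; it does not indicate whether the relationships with mineralogy hold at higher salinities.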
Within the model presented here, the relationships observed between sediment composition and salinity probably reflect a combination of weathering and depositional influences, with a minor contribution from in-lake processes. For example, clay minerals are common weathering products that generally do not require intense chemical weathering and can therefore be formed by physical weathering during low rainfall dry phases [60,61]. In contrast, a high degree of chemical weathering, typically associated with intense leaching and high rainfall, is required to produce silica from mafic igneous rocks [60], such as those found in the study area. Furthermore, high stream flow is generally required for aqueous transport of quartz into lake sediments, whereas lighter clay minerals are more easily transported by lower streamflow. Thus, high levels of quartz are likely to indicate high precipitation that, in turn, would be likely to lower the salinity levels of lakes within the affected area. It is possible for both clay minerals and quartz to be transported into lakes by wind, although as is the case for stream flow, less energy is required to transport clay minerals. This could also contribute to the observation that high salinity values were associated with a greater amount of clay minerals. For example, if in dry conditions, clay minerals from soils were preferentially transported by wind [62], this would increase the amount of clay minerals present within the lake sediments. Thus, differences in the weight of minerals and preferential transport by either wind or water could act in conjunction with different weathering regimes to produce the observed relationships between the mineral composition of lake sediments and the salinity present at the time of sediment deposition.
Despite the predictive power of the MIRS salinity model, the reconstruction from the Tower Hill Lake sediments did not fully capture the variability observed within the measured values (Figure 5). This is probably attributable to the smoothing influence of sedimentary processes in lakes, including sediment mixing. Given that a number of the properties inferred from MIRS are related to increased rates of weathering and delivery of materials from lake catchments, it may be that the model is less well suited to inferring subdecadal changes in lake salinity. However, this is a characteristic of most lake-based salinity reconstructions, as biological proxies are also often affected by catchment processes. These processes can affect variables such as light availability, nutrient supply, or pH and thus influence the community composition of a range of proxies. As an example, the diatom reconstruction used herein captured even less of the variability in the monitoring data.
The MIRS-inferred reconstruction shows some trends that broadly correspond to regional rainfall. During the first half of the 20th century, the southwest region of Victoria experienced approximately 50 years of relatively low precipitation, with the eastern side of Tower Hill Lake drying out in 1942 [9]. This was followed by a very wet period commencing in the 1950s [56] that led to flooding of several lakes, including Lake Corangamite and Lake Colac. This timing correlates well with the decrease in salinity observed between ~1946 and ~1955 in the MIRS reconstruction. The long-term increase in salinity values is consistent with a long-term decrease in precipitation. Despite this, the MIRS-inferred values do not reflect the quasi-decadal variability seen within the precipitation records (Figure 6). From the 1980s onwards, the rainfall shows less variability, and more coherence is apparent between the MIRS-inferred salinity reconstruction and the historic rainfall data.
The diatom-based salinity reconstruction shows little variation before 1990, with inferred values ranging between 1.0‰ and 1.6‰ (Figure 6). During the next five years, inferred values are slightly higher (~2‰); however, it is not until ~1996 that values increase markedly, with the three uppermost samples rising sharply. This marked increase in diatom-inferred salinity levels during the latter part of the 1990s corresponds to lower-than-average precipitation, including a drought in 1997 [63]. Unfortunately, no measured salinity data are available for 1995-2000; thus, the latter relationship cannot be verified. It is worth noting that, in the 2007-2008 austral summer, after 12 years of prolonged drought within the region (including pronounced drought conditions during 2002, 2006, and 2008) [63], measured salinity values within Tower Hill Lake varied between 5.0‰ in November and 9.6‰ in April [56]. Consequently, the high salinity level indicated by the diatom-inferred reconstruction for the late 1990s appears unrealistic.
Given the lack of agreement throughout the record between the measured salinity data and the diatom-inferred reconstructions, MIRS should be considered as an alternative method for reconstructing past salinity levels in Australian lakes. Whilst the MIRS reconstruction does not capture variability on the subdecadal scale, it reflects the long-term trends and infers values that more accurately reflect the measured data. Consideration should, therefore, be given to expanding the number of lakes incorporated within the model to improve the reliability and statistical performance as well as expanding the range of lake water salinities covered.
Finally, it is noted that the diatom model is intended to capture a biological response to salinity, while the MIRS calibration model appears to be more related to catchment processes (weathering and erosion) than in-lake conditions. One implication of this is that the MIRS model is primarily responding to climatic changes on a local, or possibly regional, scale rather than salinity per se. Consequently, MIRS calibration models may potentially provide information pertaining to drought history as indicated by the mineralogy. Although similar information may be obtained through more traditional chemical analyses or X-ray fluorescence, MIRS offers a cheaper, faster alternative to these methods [35,36]. As such, it represents an intriguing, cost-effective means of examining past climatic change in an area that is predicted to become more prone to drought in the future. Given that recent increases in salinity have been observed in many water bodies within Australia, this method could also prove an invaluable tool for ongoing management of these water bodies; for example, by providing the baseline data required to set realistic targets for remediation work.

Conclusions
This study demonstrates the feasibility of using MIRS for reconstructing lake salinity. However, the number of lakes used in this study is small, with all lakes located in the same geographical region. Consequently, additional region-specific calibration models would need to be developed before the method could be applied elsewhere. Further research would also be required to develop a salinity model that is applicable for lakes with higher salinity values. Despite this caveat, our study indicates that MIRS could be a valuable technique for reconstructing lake salinity (including drought history) in south-eastern Australia.