Usage of Airborne Hyperspectral Imaging Data for Identifying Spatial Variability of Soil Nitrogen Content

Abstract: Soil is a significant natural resource composed of organic and inorganic material. Nitrogen, one of the essential elements, is traditionally measured using laboratory methods. The development of hyperspectral imaging enables the cost-effective acquisition of both spectral and spatial information for detecting physical, chemical, and biological attributes of soil samples. The presented work evaluates the suitability of airborne hyperspectral imaging for determining soil nitrogen content and producing a soil nitrogen map on a pixel-wise basis. The spatial variability of the soil nitrogen content was measured at two fields located at Rudice, northeast of Brno, Czech Republic, using laboratory methods and a handheld spectrometer. The soil reflectance was also recorded using airborne imaging spectroscopy sensors. A partial least squares regression (PLSR) was used to develop a model calibrated on the data collected with the portable spectrometer and to predict the total nitrogen in the soils from the airborne hyperspectral images. The coefficient of determination (R²) of the PLSR model presented in this paper reached 0.44. The model's performance could be improved by using a handheld spectrometer with a wider spectral range, collecting the field data and the hyperspectral images in the same period, and enlarging the sample size.


Introduction
Soil is a fundamental, non-renewable natural resource that people rely on for the production of food, fiber, and energy. It is also a habitat for microorganisms and earthworms and the foundation for buildings and other constructions [1]. Fundamentally, soil is a complex matrix that consists of organic and inorganic mineral matter, water, and air. The organic material in soils ranges from decomposed and stable humus to fresh, particulate residues of various origins. The distribution of these different organic pools in soil influences biological activity, nutrient availability and dynamics, soil structure and aggregation, and water-holding capacity [2]. The inorganic mineral fractions are often described by their particle size distribution (proportions of sand, silt, and clay) and by additional subclasses in various classification systems [3]. Nitrogen (N) is one of the essential elements that affect vegetative growth and plant development because it plays a central role in all metabolic processes, as well as in cellular structure and genetic coding [4]. Its association with phosphorus (P) plays a vital role in plant growth, as these two elements interact with each other. Therefore, it is essential to investigate the N content in soils and obtain information on its spatial distribution, which would improve the efficiency of field nitrogen management, increase the economic benefit from agricultural production, and contribute to sustainable agriculture [5].
The goal of the research was to determine the total soil nitrogen (Ntot), with its concentration expressed as a percentage of the dry weight of soil (%). Total nitrogen comprises all plant-accessible and inaccessible forms, which can convert into one another. The natural nitrogen cycle depends on the ability of organisms to bind and convert inert atmospheric nitrogen (N₂) and decompose proteins. The basic mineral forms, ammonium (NH₄⁺) and nitrate (NO₃⁻), are also important accessible forms of nitrogen for plant nutrition and indicators of soil viability. Both forms of nitrogen move in the soil sorption complex, indicating the degree of eutrophication and the efficiency of the nitrogen cycle.
ISPRS Int. J. Geo-Inf. 2021, 10, 355
The nitrogen content in soils is usually measured using laboratory methods. However, these are often time consuming, expensive, and destructive. Therefore, new techniques such as laboratory spectroscopy are being developed to minimize the disadvantages of traditional laboratory methods [6]. Examples of popular spectrometers include ASD FieldSpec 3, Peristrom NIR System 6500, and FOSS XSD Rapid Content Analyzer Spectrometer [5,7-9]. These devices mostly measure with spectral bandwidths of 1 to 2 nm over a wavelength range of 300 to 2500 nm.
Another option is to use remote sensing methods, either aerial or satellite. Aerial remote sensing is generally used to acquire data for smaller areas (national scale), while satellite remote sensing is used for covering larger areas (global scale). In addition, there are three types of optical data. The first is panchromatic, which is characterized by high or very high spatial resolution and a single band formed by the total light energy in the visible spectrum. Multispectral data have a higher spectral resolution, consisting of multiple spectral bands with broader bandwidths, at the cost of a lower spatial resolution than panchromatic data. Therefore, a pan-sharpening method is sometimes employed to improve the spatial resolution of multispectral data. For example, Sentinel-2A has 13 spectral bands in the visible, NIR (near-infrared), and SWIR (short-wavelength infrared) spectra with a spatial resolution of 10, 20, or 60 m, depending on the band. The bandwidths range from 15 to 175 nm [10]. Hyperspectral data contain many more spectral bands than multispectral data. Their main advantage, however, is that the bands are contiguous and narrow, so the spectral curve of a surface is continuous. Hyperspectral images are used for soil mapping [11] or identifying types of iron and clay minerals [12].
Aerial and satellite hyperspectral images enable the mapping of relatively large areas over a short period. However, the raw data must be corrected to eliminate the influence of the atmosphere before soil properties are determined [13]. Hyperspectral imaging combines conventional spectroscopy with imaging techniques to acquire spectral and spatial information for detecting physical, chemical, and biological attributes of the samples [14,15]. A soil spectrum is generated by directing radiation containing all relevant frequencies at the sample. Depending on the constituents present in the soil, the radiation causes individual molecular bonds to vibrate, either by bending or stretching, and they absorb light to various degrees. The resulting absorption spectrum has a characteristic shape that can be used for analytical purposes [16]. The visible and near-infrared (vis-NIR) regions, encompassing wavelengths between 400 and 2500 nm, contain useful information on organic and inorganic materials in the soil. Absorptions in these regions can be used to detect mineral content associated with iron, soil organic matter, clay, carbonates, or water [17-20]. They can also be used to detect soil constituents such as soil organic carbon (SOC) or total nitrogen, as a result of the stretching and bending of NH, CH, and CO groups [21-25]. Viscarra Rossel and Behrens [26] present a summary of important fundamental absorptions in the mid-infrared (mid-IR) region and the occurrence of their overtones and combinations in the vis-NIR regions, which can help with the interpretation of soil constituents.
Diffuse reflectance spectra of soil in the vis-NIR regions are largely nonspecific due to the overlapping absorptions of soil constituents. This inherent lack of specificity is compounded by scattering effects caused by soil structure or specific components such as quartz. All these factors result in complex absorption patterns that need to be mathematically extracted from the spectra and correlated with soil properties. Hence, the analysis of soil diffuse reflectance spectra requires a sophisticated statistical technique to discern the response of the soil attributes from spectral characteristics [27]. The most common calibration methods for soil applications are based on linear regressions, namely, stepwise multiple linear regression (SMLR), principal component regression (PCR), and partial least squares regression (PLSR) [28]. The main reason for using SMLR is the inadequacy of more conventional regression techniques such as multiple linear regression (MLR) and the lack of awareness among soil scientists of the existence of full-spectrum data compression techniques such as PCR and PLSR. Both PCR and PLSR can cope with data containing large numbers of highly collinear predictor variables. They are related techniques, and in most situations, their prediction errors are similar. However, PLSR is often preferred by analysts because it relates the response and predictor variables so that the model explains more of the variance in the response with fewer components, and the algorithm is therefore computationally faster. The use of data mining techniques such as neural networks (NN) [23,24], multivariate adaptive regression splines (MARS) [25], and boosted regression trees [26] is increasing. Viscarra Rossel et al. [29] combined PLSR with bootstrap aggregation (bagging-PLSR) to improve the robustness of the PLSR models and produce predictions with uncertainty. MLR, PCR, and PLSR are linear models, while the data mining techniques can handle nonlinear data.
Viscarra Rossel and Lark [28] used wavelets combined with polynomial regressions to reduce the spectral data, account for non-linearity, and produce accurate and parsimonious calibrations based on selected wavelet coefficients. Mouazen et al. [30] compared NN with PCR and PLSR for predicting selected soil properties. They found that combined PLSR-NN models provided better predictions than PLSR and PCR. Viscarra Rossel and Behrens [26] compared PLSR with several data mining algorithms and feature selection techniques for predictions of clay, organic carbon, and soil acidity (pH). The comparison included MARS, random forests (RF), boosted trees (BT), support vector machines (SVM), NN, and wavelet transform. Their results suggest that data mining algorithms produced more accurate results than PLSR. Some of the algorithms also provide information on the importance of specific wavelengths, which can be used to interpret the models.
The objectives of the presented study were to evaluate the suitability of airborne hyperspectral imaging for determining the soil nitrogen content and to produce a pixel-wise soil nitrogen map usable for precision agriculture. The work emphasized the use of geoinformation technologies for processing hyperspectral data and performing spatial analyses in a GIS environment.

Study Site
The spatial variability of the soil nitrogen content was measured at two fields located at Rudice, northeast of Brno, Czech Republic (Figure 1). These sites were selected to provide soil samples from both agricultural land and grassland located near a forest. The region is characterized by a mean annual precipitation of 393-430 mm and a mean annual temperature of approximately 8.3 °C. Due to the different soil cultivation, the two test sites varied in surface roughness, soil structure, green coverage, and straw residue. At the agricultural test site, which has been under intensive cultivation, the soil surface was ploughed, with residue coverage of less than 10%. According to the World Reference Base for Soil Resources [31], Luvisols are the dominant soil type of the test site.
Soil sampling and data collection with a portable spectrometer were performed on 10 October 2016. Due to unfavorable weather conditions on that day, the aerial imaging campaign was postponed and completed about one month after the soil sampling. At the agricultural site, the soil was sampled from the top 5 cm of the plough horizon using core rings around randomly selected sampling points. At the grassland field, the top 5 cm of the soil horizon was discarded because of dense root growth. Spatial coordinates were measured using a handheld GPS receiver. A total of 22 topsoil samples were collected from both sites.

Conventional Soil Analysis
The soil sampling and laboratory analysis were coordinated with the Institute of Geology and Soil Science, Mendel University in Brno. The presented study uses the results of analyses of soil samples taken from the soil horizon A1 at a depth of 5 cm. Before analysis, all soil samples were air-dried, passed through a 2 mm sieve, and milled. The total nitrogen was determined by a modified Kjeldahl method [32], in which the soil samples are dissolved in a boiling mixture of sulfuric acid and catalyst additives. After alkalization, the resulting NH₄⁺ ions, together with the NH₄⁺ ions originally present in the sample, were distilled in the form of ammonia (NH₃) into a defined volume of a standard solution of sulfuric acid (H₂SO₄), as specified by the method. The captured NH₃ was then determined indirectly by titrating the excess of the standard acid solution with a standard NaOH solution.
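The back-titration arithmetic behind this determination can be illustrated with a short Python sketch. It is not the laboratory's protocol: the function name, the concentrations, the volumes, and the sample mass are hypothetical values chosen only to show how the titration result converts to a nitrogen percentage, and a blank correction is omitted for brevity.

```python
N_MOLAR_MASS = 14.007  # g/mol, atomic mass of nitrogen


def kjeldahl_nitrogen_percent(acid_molarity, acid_volume_ml,
                              naoh_molarity, naoh_volume_ml,
                              sample_mass_g):
    """Total nitrogen (% of dry weight) from a Kjeldahl back-titration."""
    # H2SO4 is diprotic, so each mole can neutralize two moles of NH3.
    acid_capacity_mmol = 2.0 * acid_molarity * acid_volume_ml
    # The NaOH consumed in the titration equals the acid left unreacted.
    excess_acid_mmol = naoh_molarity * naoh_volume_ml
    nitrogen_mmol = acid_capacity_mmol - excess_acid_mmol
    nitrogen_mg = nitrogen_mmol * N_MOLAR_MASS
    return 100.0 * nitrogen_mg / (sample_mass_g * 1000.0)


# Hypothetical example: 10 mL of 0.05 M H2SO4 in the receiver,
# 8 mL of 0.1 M NaOH used for the titration, 1 g of soil digested.
example_percent = kjeldahl_nitrogen_percent(0.05, 10.0, 0.1, 8.0, 1.0)
```

With these assumed inputs, 0.2 mmol of nitrogen is captured, which corresponds to roughly 0.28% of the dry sample weight.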

Field VIS/NIR Spectroscopy
The reflectance of the soil samples was measured in the field using the FieldSpec HandHeld2 portable spectrometer [33]. The HandHeld2 spectrometer offers a vis-NIR spectral range (325-1075 nm) with a resolution better than 3 nm at 700 nm and an accuracy of ±1 nm. A white Spectralon panel (approximately 10 cm in diameter) provided the absolute reflectance factor for field calibration of the device and was used before each sample measurement. The distance between the scanned surface and the instrument was chosen so that the scanned area was approximately 30 cm × 30 cm. Three repeat measurements of the soil spectrum were recorded per sample.

Hyperspectral Imaging in vis-NIR Range
The soil reflectance was recorded with the Flying Laboratory of Imaging Spectroscopy (FLIS). The flight campaign was conducted by CzechGlobe (Global Change Research Institute of the Czech Academy of Sciences, GCRI) on 8 November 2016. Weather conditions allowed imaging only in the morning between 9 a.m. and 10 a.m., when the temperature ranged between −3 and 2 °C, as reported by a nearby meteorological station located in Vranov. Due to the late data acquisition date, an increased noise level could be expected in the collected hyperspectral data, as reported by CzechGlobe. No rain occurred during the data collection, and the precipitation over the prior five days was minimal, not exceeding 1 mm.
The FLIS consists of an aircraft-mounted set of three hyperspectral sensors from ITRES, Ltd. [34], two of which (CASI-1500 and SASI-600) collected data for the study. The CASI-1500 is a visible near-infrared (vis-NIR) sensor that offers 1500 pixels across its field of view and a spectral range between 380 and 1050 nm with a resolution better than 3.2 nm. The hyperspectral SASI-600 sensor captures 100 spectral channels across the shortwave infrared (SWIR) spectral region between 950 and 2450 nm with 600 across-track imaging pixels. The spatial pixel resolution for the CASI-1500 and SASI-600 sensors is 1.5 m and 2.5 m, respectively, based on a 40-degree field of view and a flight altitude of 2060 m. Radiometric calibration, atmospheric correction, the exclusion of water absorption bands, and the georeferencing of the hyperspectral images were carried out by CzechGlobe to derive nadir-normalized ground reflectance. Furthermore, as part of the pre-processing steps, the data from the two sensors were joined into the final hyperspectral image.

Partial Least Squares Regression (PLSR) Method
A partial least squares regression (PLSR) was used to develop a model calibrating the data collected with the FieldSpec HandHeld2 portable spectrometer against the reference soil nitrogen data from the laboratory analysis. The developed model was then applied to the FLIS hyperspectral image to predict the total nitrogen content in the soils. PLSR is a multivariate linear regression technique that copes well with the often strongly correlated spectral predictor bands. The method reduces the dimensionality of the explanatory variables by projecting them onto a small number of new, orthogonal latent variables (LVs) [35]. PLSR thus transfers the information content of the predictors to a few uncorrelated synthetic variables referred to as latent vectors. The latent vectors are generated to maximize not only the information they retain from the spectra but also their explanatory power in the regression, so dimensionality reduction and regression are optimized simultaneously [36]. The developed PLSR models were calibrated by fitting them to the training data and validated by leave-one-out (LOO) cross-validation: one training point is omitted from the calibration, the dependent variable is estimated for that point from a model fitted to the remaining points, the point is returned to the dataset, and the procedure is repeated for the next point. The stability of the model was verified by predicting the total nitrogen content for data with known values (validation set). The ratio of calibration to validation samples was 75% to 25%. Figure 2 shows the schematic process for predicting total nitrogen from hyperspectral images using the PLSR method.
The performance of the developed model was described using the coefficient of determination (R²) between measured and predicted values and the root-mean-square error (RMSE) during calibration and validation [37]. The optimal number of LVs was selected experimentally so that the coefficient of determination during validation (R²val) is maximized rather than the one during calibration (R²cal), which avoids over-fitting the model [38].
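To make the method concrete, the following Python sketch implements single-response PLSR with the classical NIPALS iteration and evaluates it by leave-one-out cross-validation. It is an illustration under stated assumptions, not the code used in the study (which was carried out in R): the function names and the tiny two-band demonstration dataset are hypothetical, and only the Python standard library is used.

```python
from math import sqrt


def pls1_fit(X, y, n_components):
    """Fit a single-response PLSR model with the NIPALS algorithm."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered spectra
    f = [v - y_mean for v in y]                                 # centered response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in band space with maximal covariance with f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this latent variable explains from X and y.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one spectrum by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ev * wv for ev, wv in zip(e, w))
        y_hat += q * t
        e = [ev - t * lv for ev, lv in zip(e, load)]
    return y_hat


def loo_rmsep(X, y, n_components):
    """Leave-one-out cross-validated root-mean-square error of prediction."""
    squared_errors = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared_errors.append((y[i] - pls1_predict(model, X[i])) ** 2)
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical two-band dataset with an exactly linear response.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y_demo = [2.0 * a - b + 1.0 for a, b in X_demo]
demo_model = pls1_fit(X_demo, y_demo, 2)
```

On this noise-free toy dataset, two latent variables recover the linear relation, so `loo_rmsep(X_demo, y_demo, 2)` is numerically zero, while a single latent variable leaves a residual error. Real spectra behave differently: the number of bands is far larger than the number of samples, which is exactly the situation in which the component count has to be chosen by cross-validation.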

Spectrometer Data Processing
Reflectance data collected with the FieldSpec HandHeld2 portable spectrometer at the 22 sampling points were processed to remove the influence of noise on the developed model. The noise is concentrated at the margins of the spectral range, and the affected marginal ranges vary from device to device. For the FieldSpec HandHeld2, excessive noise was experimentally determined to affect the measured signal at wavelengths of less than 450 nm and greater than 950 nm, as shown in Figure 3. To reduce the influence of noise on measured data, Nawi et al. [39] likewise recommended removing reflectance values below 600 nm and above 1000 nm, and Grant [40] those below 450 nm and above 950 nm.
In the next step, the filtered spectral data were resampled to match the resolution and center wavelengths of the FLIS hyperspectral images recorded during the flight campaign. This step was accomplished using the freely available prospect package developed for the R environment [41]. The resampling changed the original bandwidth of the spectrometer's data to about 14.2 nm and 15 nm to match the bandwidths of the CASI and SASI sensors, respectively. This step further reduced the influence of noise in the spectral data used for model development. After noise removal and resampling, the reflectance data were transformed to absorbance using the following equation:
$$\text{absorbance} = \log_{10}\left(\frac{1}{\text{reflectance}}\right) \qquad (1)$$
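The noise trimming and the absorbance transformation of Equation (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the processing chain of the study: the function name and the sample values are hypothetical, the 450-950 nm limits are taken from the text, and the spectral resampling step is left out.

```python
from math import log10


def trim_and_convert(wavelengths_nm, reflectance, low_nm=450.0, high_nm=950.0):
    """Drop noisy marginal bands, then convert reflectance to absorbance."""
    kept = [(w, r) for w, r in zip(wavelengths_nm, reflectance)
            if low_nm <= w <= high_nm]
    kept_wavelengths = [w for w, _ in kept]
    # Equation (1): absorbance = log10(1 / reflectance); reflectance must be > 0.
    absorbance = [log10(1.0 / r) for _, r in kept]
    return kept_wavelengths, absorbance


# Hypothetical five-band spectrum (reflectance as a fraction between 0 and 1).
demo_wavelengths = [400.0, 450.0, 700.0, 950.0, 1000.0]
demo_reflectance = [0.05, 0.10, 0.50, 1.00, 0.20]
kept_wl, kept_abs = trim_and_convert(demo_wavelengths, demo_reflectance)
```

In this example, the bands at 400 and 1000 nm are discarded, and a reflectance of 0.10 maps to an absorbance of 1.0.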

Hyperspectral Data Processing
The FLIS hyperspectral image was imported into the R environment using the stack function of the raster package. The supplied hyperspectral image contained pixels with a value of 0 for the areas that were not imaged, shown in black in Figure 4. Therefore, the values of 385,595 pixels, representing 36% of the hyperspectral image, were set to NoData. After the FLIS hyperspectral image was loaded into the R environment, the spectral bands that were not available in the resampled spectral information from the handheld spectrometer were removed. This step was necessary because the hyperspectral image data on which the prediction is performed must have the same range and form as the data on which the model was built. It led to the removal of SASI data with wavelengths of more than 1062.5 nm.
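The two clean-up steps described here can be sketched as follows. The study performed them in R on a raster stack; the Python version below only mirrors the logic on plain lists, and the function names, band centers, and pixel values are hypothetical.

```python
def mask_unimaged(pixels):
    """Replace all-zero spectra (areas that were not imaged) with None (NoData)."""
    return [None if all(v == 0 for v in px) else px for px in pixels]


def keep_bands_up_to(wavelengths_nm, pixels, max_wavelength_nm):
    """Keep only the bands that the calibration spectra also cover."""
    keep = [i for i, w in enumerate(wavelengths_nm) if w <= max_wavelength_nm]
    kept_wavelengths = [wavelengths_nm[i] for i in keep]
    kept_pixels = [None if px is None else [px[i] for i in keep]
                   for px in pixels]
    return kept_wavelengths, kept_pixels


# Hypothetical image with four bands and three pixels; the second pixel
# lies outside the imaged area and therefore holds zeros in every band.
demo_band_centers = [450.0, 700.0, 1062.5, 1500.0]
demo_pixels = [[0.10, 0.20, 0.30, 0.40],
               [0, 0, 0, 0],
               [0.15, 0.25, 0.35, 0.45]]
masked_pixels = mask_unimaged(demo_pixels)
model_bands, model_pixels = keep_bands_up_to(demo_band_centers, masked_pixels, 1062.5)
```

Marking unimaged pixels as NoData before prediction keeps them from receiving a spurious nitrogen value computed from a spectrum of zeros.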

Model Development
The most critical step of the entire process was building the model, because the resulting values depend on its character. The precision and accuracy of the prediction depend on the quality of the input data, which often cannot be influenced, and on the quality of the developed model. Due to the relatively small size of the dataset (22 samples in total), the standard procedure of dividing the dataset into a training set and a validation set was not implemented, and the entire dataset was used to build the model. The developed model was therefore validated by cross-validation only, using the iterative leave-one-out procedure: one value is hidden from the dataset, its prediction from the remaining values is tested, the value is returned to the dataset, and another value is hidden. Since the dataset was not extremely large, the model was built with the maximum number of latent variables. The PLSR function then calculated the RMSEP for each number of latent variables, as seen in Figure 5. The accepted recommendation is to plot the RMSEP against the number of latent components and select the number with the lowest RMSEP, or the number beyond which the RMSEP no longer decreases significantly. Mevik and Wehrens [42] refined this recommendation by stating that it is preferable to select the number of components at the first local minimum of the RMSEP rather than at the overall minimum. Based on Figure 5, six components were chosen, as the RMSEP has its lowest value of 0.0178 there, while the model reached an R² of 0.45.
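The "first local minimum" rule can be expressed in a few lines of Python. The sketch below is only an illustration of the selection rule: the function name is hypothetical, and the RMSEP curve is invented for the example, except for the value 0.0178 at six components, which is the one reported in the text.

```python
def first_local_minimum(rmsep_by_components):
    """Return the number of components at the first local minimum of RMSEP.

    rmsep_by_components[k] is the cross-validated RMSEP of the model
    that uses k + 1 latent components.
    """
    for k in range(len(rmsep_by_components) - 1):
        if rmsep_by_components[k] <= rmsep_by_components[k + 1]:
            return k + 1
    return len(rmsep_by_components)  # the curve decreases monotonically


# Hypothetical RMSEP curve for one to eight components.
demo_rmsep = [0.0300, 0.0251, 0.0222, 0.0201, 0.0190, 0.0178, 0.0183, 0.0170]
chosen_components = first_local_minimum(demo_rmsep)
overall_minimum = demo_rmsep.index(min(demo_rmsep)) + 1
```

For this invented curve, the rule selects six components, although the overall minimum lies at eight. Preferring the first local minimum guards against adding components that merely fit noise in a small dataset.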

Prediction
For the prediction of soil nitrogen content, the PLSR model was applied to the FLIS hyperspectral image with the number of components set to 6. The resulting predicted pixel values were then inserted into the original raster to maintain the identical geospatial structure. Because every spectral band of this raster then contained the same predicted data, all but the first band were removed before export. In the final step, the raster of predicted total nitrogen content was exported in TIF format.
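Once the number of components is fixed, a PLSR model reduces to one intercept and one coefficient per band, so the pixel-wise prediction is a weighted sum over the spectrum of each pixel. The Python sketch below illustrates this under stated assumptions: the function name, the coefficients, and the 2 × 2 raster are hypothetical, and NoData pixels are represented by None.

```python
def predict_raster(raster, intercept, coefficients):
    """Apply a linear spectral model to every pixel, preserving the grid layout.

    raster is a list of rows; each cell holds a spectrum (list of band
    values) or None for NoData.
    """
    predicted = []
    for row in raster:
        predicted_row = []
        for spectrum in row:
            if spectrum is None:
                predicted_row.append(None)  # NoData stays NoData
            else:
                value = intercept + sum(c * v for c, v in zip(coefficients, spectrum))
                predicted_row.append(value)
        predicted.append(predicted_row)
    return predicted


# Hypothetical two-band raster and regression coefficients.
demo_raster = [[[1.0, 2.0], None],
               [[0.5, 0.5], [0.0, 1.0]]]
nitrogen_map = predict_raster(demo_raster, 0.1, [0.2, -0.05])
```

Because the output keeps the row and column order of the input, the georeferencing of the original image can be reused for the single-band nitrogen map.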

Establishment of Bare Soil
The last step involved removing pixels whose values do not correspond to physically possible soil properties and delineating the bare soil to which the resulting data should correspond. The raster of predicted total nitrogen content contained a certain number of negative values (Table 1). Negative values are impossible because the investigated quantity, a concentration, cannot be negative. Therefore, these pixels were also removed to preserve only the portion of the raster with relevant data. The mask distinguishing the bare-soil pixels in the raster was created using the normalized difference vegetation index (NDVI) and the cellulose absorption index (CAI). According to the authors of [43], this is one of the most reliable methods for determining bare soil. To calculate the NDVI from the hyperspectral images, wavelengths of 800 nm for the NIR band, where the reflectivity of vegetation is increased, and 670 nm for the red band were used [44]. The NDVI values in our area of interest ranged from −0.25 to 0.82. Negative values normally correspond to impermeable and artificial surfaces, such as roads or structures. Values approaching 1 represent surfaces with a high chlorophyll content, such as dense green vegetation. Two masks were used to remove both extremes. It was determined experimentally that the first extreme comprised pixels with NDVI values from −0.25 to 0.08 and the second those from 0.3 to 0.82.
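The NDVI-based part of the mask can be sketched in Python as follows. The standard definition NDVI = (NIR − RED) / (NIR + RED) is used with the 800 nm and 670 nm bands named in the text, and the thresholds 0.08 and 0.3 are the experimentally determined limits reported above. The function names, the handling of the exact boundary values, and the sample reflectances are assumptions made for this illustration.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR (800 nm) and red (670 nm)."""
    total = nir + red
    if total == 0:
        return None  # undefined for a pixel without signal
    return (nir - red) / total


def is_bare_soil(nir, red, low=0.08, high=0.30):
    """True when NDVI lies between the artificial-surface and vegetation extremes."""
    value = ndvi(nir, red)
    return value is not None and low < value < high


def keep_pixel(predicted_nitrogen, nir, red):
    """Keep a prediction only if it is non-negative and the pixel is bare soil."""
    return predicted_nitrogen >= 0 and is_bare_soil(nir, red)


# Hypothetical reflectance pairs (NIR, red) for three surface types.
soil_like = is_bare_soil(0.30, 0.20)        # NDVI of about 0.20
vegetation_like = is_bare_soil(0.50, 0.10)  # NDVI of about 0.67
road_like = is_bare_soil(0.10, 0.12)        # NDVI of about -0.09
```

Only the first of the three example pixels passes the mask; the second is rejected as vegetation and the third as an artificial surface.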
The CAI is used to determine the health status of vegetation based on its cellulose content. Gerighausen et al. [45] used the CAI to detect arable land that was covered with vegetation. The index was developed for hyperspectral data and works with wavelengths of 2000, 2100, and 2200 nm. Since the hyperspectral image used in the present work was taken in November, after the end of the agronomic season, most of the arable land was without visible vegetation cover. It was therefore possible to use the CAI to extract impermeable surfaces with zero vegetation content. The NDVI was thus supplemented with this index to reveal the remaining artificial surfaces, such as roads and buildings. The application of this index resulted in values ranging from −1.281 to an extreme of 13.348. In general, the values of this index rise linearly from surfaces with 0% vegetation cover to surfaces with 100% cover. We experimentally determined that CAI values from −1.281 to −0.02 identified impermeable surfaces.
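A commonly cited form of the cellulose absorption index compares the reflectance at 2100 nm with the mean of its two shoulders at 2000 and 2200 nm. The Python sketch below uses that form as an assumption, since the text does not state the exact formula or scaling that was applied; reflectance is assumed to be expressed in percent, the function names and sample values are hypothetical, and only the −0.02 threshold comes from the text.

```python
def cellulose_absorption_index(r2000, r2100, r2200):
    """CAI = 0.5 * (R2000 + R2200) - R2100, with reflectance in percent (assumed)."""
    return 0.5 * (r2000 + r2200) - r2100


def is_impermeable(r2000, r2100, r2200, threshold=-0.02):
    """Flag pixels at or below the experimentally determined CAI limit."""
    return cellulose_absorption_index(r2000, r2100, r2200) <= threshold


# Hypothetical reflectance triplets in percent.
residue_cai = cellulose_absorption_index(30.0, 25.0, 32.0)  # absorption dip at 2100 nm
road_flag = is_impermeable(20.0, 21.0, 20.0)                # no cellulose feature
```

A surface with a cellulose absorption feature yields a positive index (6.0 in the first example), whereas a surface without it yields a value near or below zero and is flagged.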

Results
Data containing information on the spectral signature of the soil samples were obtained in the form of reflectance over the spectral range of the scanning instrument. Each sampling site (or soil sample) was measured three times (Table 2). Based on notes on passing high clouds recorded during the measurements and a graphical visualization of the spectral curves, a single measurement was selected for further analysis. The processed data were then combined with the soil data (Table 3), so that all 22 sampling points had information about their location, the total nitrogen content measured in the laboratory from soil samples, and the spectral reflectance characteristics. Several pre-treatment methods were tested to ensure that the best results were achieved with the PLSR method. The results (Tables 4 and 5) differed quite significantly between the two variants: (a) without resampling the spectra, intended for possible new data from the spectrometer, and (b) with resampled spectra, for building a model for prediction based on the analyzed hyperspectral data. Table 4 shows the characteristics of the models fitted to the original spectral information from which the noise was removed, without any spectral resampling.
Abbreviations: Transformation, type of pre-treatment method; LVs, number of latent variables; R2, coefficient of determination; RMSEP, root mean square error of prediction; RPD, ratio of standard deviation to standard error of prediction; Abs., absorbance; Abs. + SG, absorbance with Savitzky-Golay smoothing; SG + 1.der, Savitzky-Golay smoothing with first derivative; SG, Savitzky-Golay smoothing; SNV, standard normal variate.
Table 5 presents the evaluation of the models built for predicting the total nitrogen content in the soils from a specific hyperspectral dataset. Their spectral information was therefore resampled to the resolution of the hyperspectral images.
Models for which the values are not available are those whose R² became negative after cross-validation. This statistical indicator describes how much of the variance in the dependent variable is explained by the model and how much remains unexplained [42].
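The three statistics reported in the tables can be computed as in the following Python sketch, which also shows how a cross-validated R² can become negative. The definitions are the conventional ones (R² as one minus the ratio of residual to total sum of squares, RPD as the standard deviation of the measured values divided by the RMSEP); the function name and the sample values are hypothetical.

```python
from math import sqrt


def prediction_metrics(measured, predicted):
    """Return (R2, RMSEP, RPD) for paired measured and predicted values."""
    n = len(measured)
    mean = sum(measured) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    rmsep = sqrt(ss_res / n)
    sd = sqrt(ss_tot / (n - 1))  # sample standard deviation of the measurements
    rpd = sd / rmsep if rmsep > 0 else float("inf")
    return r2, rmsep, rpd


# Hypothetical values: a reasonable prediction and a poor one.
demo_measured = [1.0, 2.0, 3.0, 4.0]
good_r2, good_rmsep, good_rpd = prediction_metrics(demo_measured, [1.5, 2.5, 2.5, 3.5])
poor_r2, _, _ = prediction_metrics(demo_measured, [4.0, 3.0, 2.0, 1.0])
```

The first prediction gives an R² of 0.8 and an RMSEP of 0.5. The second gives an R² of −3: the residual sum of squares exceeds the total sum of squares, meaning the model predicts worse than simply using the mean of the measured values.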
For determining the total soil nitrogen content from data collected with the spectrometer, the model using four latent components and reflectance without pre-treatment achieved the best results (R², 0.36; RMSEP, 0.0195; and RPD, 1.25). Other variants resulted in lower predictive power. When predicting the total soil nitrogen using hyperspectral data, the best predictive ability of all 48 built models was achieved with the absorbance pre-treatment (R², 0.44; RMSEP, 0.01; and RPD, 1.34). The developed application and the prediction procedure of the PLSR method were subsequently tested on this model.
Since the dataset included only 22 samples, the maximum number of components was about 20, depending on the applied pre-treatment method. This is important because the best model required six components, which is almost one-third of all available data. Figure 6 shows the basic predictive abilities of the model using six components, determined by cross-validation. The closer the individual values are to the line of best fit, the better the model performed.
Regression vectors in Figure 7 show that the use of seven components contributes to the prediction, but this contribution is not significant and, according to cross-validation (Figure 8), there is a slight deterioration in the prediction ability. Although the use of seven components would not be wrong, only six were used to avoid over-fitting the model and, as a result, increasing the noise. The decision to use six components is also supported by a score graph (Figure 8), which shows the variance of the data that make up the main components [46]. The variance of the data explained by the individual components can be read from this graph. The most important component forming the model was the second component, explaining 87% of the problem-relevant variance in the data. This is a rather surprising result because, for most models built this way, the most important component is the first one. The sixth component is the most important for the second part of modeling and prediction. The values described by the sixth component are the values with the least variance and, except for two outliers, describe the data very well.

The main result of the prediction is a map that displays the spatial distribution of total soil nitrogen. The spatial component of the map output enables its immediate application in precision agriculture for spatially variable nitrogen management. The prediction results were refined by removing several negative values, which could be caused by the low concentration or variability of the investigated element. Further refinement was made by removing pixels that represent surfaces (buildings and roads) for which the model was not built and for which the resulting values are therefore irrelevant. Table 6 shows that the predicted values were slightly higher than the values obtained by pedological measurements. This is due to the overall higher reflectance values in the hyperspectral image compared with the values measured by the handheld spectrometer. Figure 9 shows a map of the spatial distribution of nitrogen values. Higher N concentrations were observed in the northwest area.

Discussion
The determination of total nitrogen in the soil is a very complex matter and involves several processes. Ideally, all processes of the nitrogen cycle that help to explain the evolution of nitrogen should be included. However, such models will always be subject to a certain degree of error. Pedologists assess nitrogen mainly to improve the management of nitrogen fertilizers. It is in the general interest to ensure high yields with minimal soil degradation [8].
Soil properties are most often determined in the laboratory. Unfortunately, classical laboratory methods are often time consuming, expensive, and destructive. Therefore, new methods are being developed to eliminate these disadvantages [6]. One of these new methods is laboratory spectroscopy. Another option is to employ remote sensing, which is probably the least expensive path and non-destructive to the soil. Aerial and satellite multispectral and hyperspectral images can map a relatively large area in a short time. However, when determining soil properties using this method, it is necessary to eliminate the influence of the atmosphere to achieve pure reflectance [13]. For example, aerial photography with a hyperspectral camera has been used in the past to detect nitrogen [47]. Of the satellite systems, the Hyperion hyperspectral system [48] or the Landsat 5 and 7 multispectral systems [9,49] are widely used.
The wavelengths suitable for examining soil nitrogen content vary considerably among published studies. Dalal and Henry [22] determined the most suitable range of wavelengths to be 1100 to 2500 nm (more specifically 1702, 1870, and 2052 nm). On the other hand, Sterberg et al. [50] considered 1100, 1600, 1700, 1800, 2000, and 2200 nm to be the most appropriate wavelengths. Shi et al. [51] compared their results with previous works, which they incorporated into their own, and arrived at a series of wavebands at 1450, 1850, 2250, 2330, and 2400 nm. That work points out that the determination of suitable wavelengths depends mainly on the overall processing and the method used. The latter bands are especially suitable for nitrogen (although they can also be used for other soil properties) and for the popular PLSR method. In general, water absorption bands from 1300 to 3000 nm are the most suitable and most widely used for determining the soil nitrogen content [51,52]. This spectral region is associated with water content, which is in turn linked to the total carbon content of the soil. This suggests that the soil nitrogen content can be indirectly related to carbon content through these wavelengths [53]. Chang et al. [7] also pointed to a high correlation between the amounts of these two elements in the soil.
Several studies have further compared the vis-NIR and MIR portions of the spectrum. MIR performed better in most of these studies [52,54,55]. The PLSR method or its variations (e.g., CARS-PLSR) were mostly used when working with MIR. Wavenumbers suitable for measuring nitrogen in MIR were determined to be 1676 to 1672 cm⁻¹ and 1260 to 1036 cm⁻¹ [52].
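Because the MIR ranges above are given in wavenumbers (cm⁻¹) while the rest of the discussion uses wavelengths (nm), a small conversion helps in comparing them. The sketch below relies only on the standard relation wavelength [nm] = 10⁷ / wavenumber [cm⁻¹]; the function name is hypothetical.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm.

    1 cm equals 1e7 nm, so wavelength_nm = 1e7 / wavenumber_cm1.
    """
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm1


# Band limits quoted in the text, converted for comparison with nm values.
mir_limits_cm1 = [1676, 1672, 1260, 1036]
mir_limits_nm = [wavenumber_to_wavelength_nm(v) for v in mir_limits_cm1]
```

Under this relation the quoted ranges correspond to wavelengths of roughly 5970 to 5980 nm and 7940 to 9650 nm, well beyond the 1100 to 2500 nm region discussed for vis-NIR.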
In the present work, the PLSR method was used to develop a model for the prediction of total soil nitrogen for selected test locations. Unfortunately, the resulting model parameters and predictive performance were lower than those published in some previous studies reviewed by the authors, which have shown that this method can achieve an R² higher than 0.8 [9,43]. The coefficient of determination for the PLSR model presented in this paper reached an R² of 0.44, indicating that, after cross-validation, the model explains about 44% of the variance in the data from which it was created. Areas for improvement are listed below.
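To make the meaning of the reported statistic concrete, the sketch below computes a coefficient of determination from observed and predicted values, assuming the usual definition R² = 1 − SS_res / SS_tot. The function name and the sample values are hypothetical; the source does not specify the software or the exact formula used by the authors.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    observed   -- reference values (e.g., laboratory total nitrogen)
    predicted  -- model predictions for the same samples, e.g., from
                  cross-validation, where each sample is predicted by a
                  model fitted without it
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only.
observed_n = [0.10, 0.14, 0.18, 0.22]
predicted_n_cv = [0.12, 0.13, 0.19, 0.20]
score = r_squared(observed_n, predicted_n_cv)
```

The denominator is the spread of the observed values, so a dataset with low nitrogen variability makes a high R² harder to reach even when the absolute prediction errors are small.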
The primary limitation in developing the PLSR model concerns the spectral range of the handheld spectrometer used. The FieldSpec HandHeld2 spectrometer measures wavelengths in the range of 325-1075 nm, with excessive noise affecting the signal at wavelengths below 450 nm and above 950 nm. According to Vohland et al. [52], the critical wavelengths required to build a highly accurate model based on the spectral behavior of soil properties are in the NIR (near-infrared) and MIR (mid-infrared) bands starting at 750 nm. Using spectral information extending into the infrared region of the electromagnetic spectrum, Zornoza et al. [45] achieved results with an R² of up to 0.95 for total soil nitrogen content.
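One common way to handle such noisy band edges is to drop them before modeling. The sketch below keeps only readings between 450 and 950 nm, the limits stated above. The function name and sample spectrum are hypothetical, and the source does not say whether the authors trimmed the spectra in this way.

```python
def trim_spectrum(wavelengths_nm, reflectance, lower_nm=450.0, upper_nm=950.0):
    """Keep only readings whose wavelength lies within [lower_nm, upper_nm]."""
    if len(wavelengths_nm) != len(reflectance):
        raise ValueError("wavelengths and reflectance must have equal length")
    kept = [
        (w, r)
        for w, r in zip(wavelengths_nm, reflectance)
        if lower_nm <= w <= upper_nm
    ]
    trimmed_wavelengths = [w for w, _ in kept]
    trimmed_reflectance = [r for _, r in kept]
    return trimmed_wavelengths, trimmed_reflectance


# Hypothetical spectrum spanning the 325-1075 nm instrument range.
wavelengths = [325.0, 400.0, 450.0, 700.0, 950.0, 1075.0]
reflectance = [0.05, 0.07, 0.10, 0.22, 0.30, 0.33]
kept_wavelengths, kept_reflectance = trim_spectrum(wavelengths, reflectance)
```

Trimming leaves a usable window of only about 500 nm, which illustrates why the instrument cannot reach the nitrogen-sensitive wavelengths above 1100 nm cited earlier in the discussion.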
Another issue that possibly influenced the quality of the developed model was the different acquisition periods for the hyperspectral imaging (8 November 2016) and field data collection using a handheld spectrometer (7 October 2016), as the weather conditions for these two periods were different. Hyperspectral imaging was performed at lower temperatures, with possible ground frosts but minimal precipitation, while handheld spectrometer data were collected at higher temperatures with higher cloud coverage.
Finally, the small sample size affected the development of the model. The number of samples collected in our study (22 in total) was small and did not allow us to perform the second standard validation step. With so few samples, it was also impossible to carry out a more rigorous statistical evaluation of the dataset or to select optimal samples. The small number of samples, taken from only two sites, also resulted in low variability of the total soil nitrogen, which reduced the ability to capture the key relationships between the soil property and the relevant spectral information. Studies that used hundreds of samples [47,56] achieved results with an R² higher than 0.8. Some degree of uncertainty in the samples could also be due to the laboratory measurements.
The results show that close attention should be paid to the input datasets, as each can exhibit some degree of inaccuracy. In general, meteorological conditions, sample size, and the parameters of the spectrometer used for field data collection should be evaluated during test preparation. The problems encountered could be substantially reduced by using a larger set of samples and by implementing a public soil spectral library for the Czech region. Unfortunately, the only suitable soil spectral library, created at the Czech University of Life Sciences in Prague [57], was not publicly available. Therefore, field data collection was performed with the handheld spectrometer available to the authors.

Conclusions
The present work is a contribution to the wider use of remote sensing methods in precision agriculture. Nitrogen is a very complex component of soils that behaves unstably in time and space. Traditional methods for assessing the soil nitrogen content involve contact sampling and subsequent laboratory analysis. They achieve accurate results but have several practical limitations: they are time-consuming, contact-based, and destructive, and they require large numbers of samples to capture soil variability. With frequent and repeated sampling, the land and cultivated plants are also degraded. The use of remote sensing data in the extended spectrum minimizes the disadvantages of traditional methods. Hyperspectral data are more suitable for determining the nitrogen supply than multispectral data and are often used in conjunction with statistical predictive modeling. According to the reviewed literature, the best models are created with the PLSR method; its applicability was subsequently tested in our work, which confirmed this.
In our study, PLSR was applied to predict total soil nitrogen based on data measured by a handheld spectrometer and laboratory-determined total soil nitrogen content. The developed model was then applied to the hyperspectral image to determine the predicted total nitrogen content of the soil. PLSR was evaluated as a useful method for predicting soil properties in hyperspectral images. In our case, unfortunately, the method was not applicable for some soil properties, and even the best model achieved only moderate predictive power (R² of 0.44). The results of the study showed the key role of input data quality in predictive modeling. Two factors are particularly significant: the influence of atmospheric conditions during the collection of the individual datasets, and the need for spectrometers covering the entire part of the spectrum in which soil nitrogen can be identified.
Funding: This research was funded by the Internal Grant Agency of Palacký University Olomouc, grant number IGA_PrF_2021_020, "Advanced application of geospatial technologies for spatial analysis, modeling, and visualization of the phenomena of the real world", and by the Technology Agency of the Czech Republic, grant number TA04020888.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The datasets generated during analyses are available from the corresponding author on reasonable request.