Characterization of Vegetation Physiognomic Types Using Bidirectional Reflectance Data

Abstract: This paper presents an assessment of bidirectional reflectance features for the classification and characterization of vegetation physiognomic types at a national scale. The bidirectional reflectance data at multiple illumination and viewing geometries were generated from the Moderate Resolution Imaging Spectroradiometer (MODIS) Bidirectional Reflectance Distribution Function (BRDF) model parameters using the Ross-Thick Li-Sparse-Reciprocal (RT-LSR) kernels. This research dealt with the classification and characterization of six vegetation physiognomic types (evergreen coniferous forest, evergreen broadleaf forest, deciduous coniferous forest, deciduous broadleaf forest, shrubs, and herbaceous), which are distributed throughout Japan. A supervised classification approach was used, employing four machine learning classifiers, namely k-Nearest Neighbors (KNN), Random Forests (RF), Support Vector Machines (SVM), and Multilayer Perceptron Neural Networks (NN), with the support of ground truth data. The confusion matrix, overall accuracy, and kappa coefficient were calculated through 10-fold cross-validation and used as the metrics for quantitative evaluation. The accuracy metrics did not vary much among the classifiers tested; however, the Random Forests (RF; overall accuracy = 0.76, kappa coefficient = 0.72) and Support Vector Machines (SVM; overall accuracy = 0.76, kappa coefficient = 0.71) classifiers performed slightly better than the others. The bidirectional reflectance spectra not only varied with the vegetation physiognomic types but also showed a pronounced difference between the backward and forward scattering directions. Thus, the bidirectional reflectance data provide additional features for improving the classification and characterization of vegetation physiognomic types at a broad scale.


Introduction
Vegetation has been threatened by changes in species composition and the shifting of zones under the influence of climate change worldwide [1][2][3]. The mapping and characterization of vegetation physiognomic types (growth forms: tree, shrub, herbaceous; leaf characteristics: needle-leaved or broadleaved; and phenology: evergreen or deciduous [4]) is useful for a better understanding of vegetation dynamics.
The crown relative height ($h/b$) and relative shape ($b/r$) parameters are also included in $K_{geo}$. $f_{iso}$ is a constant called the isotropic scattering parameter, which describes the reflectance under nadir solar illumination and nadir viewing conditions, whereas $f_{vol}$ and $f_{geo}$ are the kernel weights for volumetric and geometric scattering, respectively. The Moderate Resolution Imaging Spectroradiometer (MODIS) BRDF/Albedo Model Parameters product (MCD43A1) delivers the BRDF parameters ($f_{iso}$, $f_{vol}$, and $f_{geo}$) in seven spectral bands at 500-m spatial resolution on an eight-day cycle by fitting daily atmospherically corrected surface reflectance data with the RT-LSR model [20]. Using the BRDF parameters ($f_{iso}$, $f_{vol}$, and $f_{geo}$) and the associated RT-LSR model kernels ($K_{vol}$ and $K_{geo}$), the bidirectional reflectance ($R$) at any illumination and viewing geometry can be generated. The MODIS BRDF/Albedo products provide high quality BRDF parameters [21][22][23][24][25].
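As a minimal sketch of how the reflectance can be generated from the three parameters, the linear kernel-driven form R = f_iso + f_vol K_vol + f_geo K_geo can be evaluated as shown below. The kernel expressions are the standard Ross-Thick and Li-Sparse-Reciprocal formulations from the BRDF literature and are not given in this text. The values h/b = 2 and b/r = 1 are those commonly reported for the MODIS product and are treated here as assumptions. The function names, the convention that a relative azimuth of 0° is the backward scattering direction, and the example weights are all hypothetical.

```python
import math


def rtlsr_kernels(sza_deg, vza_deg, raa_deg, h_b=2.0, b_r=1.0):
    """Return (K_vol, K_geo) for solar zenith, view zenith, and relative azimuth.

    h_b and b_r are the crown relative height and shape (assumed values).
    """
    ts, tv, phi = (math.radians(a) for a in (sza_deg, vza_deg, raa_deg))

    # Ross-Thick volumetric kernel, driven by the phase angle xi.
    cos_xi = math.cos(ts) * math.cos(tv) + math.sin(ts) * math.sin(tv) * math.cos(phi)
    cos_xi = max(-1.0, min(1.0, cos_xi))
    xi = math.acos(cos_xi)
    k_vol = ((math.pi / 2 - xi) * cos_xi + math.sin(xi)) / (
        math.cos(ts) + math.cos(tv)
    ) - math.pi / 4

    # Li-Sparse-Reciprocal geometric kernel, using angles scaled by b/r.
    tsp = math.atan(b_r * math.tan(ts))
    tvp = math.atan(b_r * math.tan(tv))
    cos_xip = math.cos(tsp) * math.cos(tvp) + math.sin(tsp) * math.sin(tvp) * math.cos(phi)
    sec_sum = 1.0 / math.cos(tsp) + 1.0 / math.cos(tvp)
    d2 = (
        math.tan(tsp) ** 2
        + math.tan(tvp) ** 2
        - 2.0 * math.tan(tsp) * math.tan(tvp) * math.cos(phi)
    )
    d2 = max(d2, 0.0)
    cos_t = h_b * math.sqrt(d2 + (math.tan(tsp) * math.tan(tvp) * math.sin(phi)) ** 2) / sec_sum
    cos_t = max(-1.0, min(1.0, cos_t))
    t = math.acos(cos_t)
    overlap = (t - math.sin(t) * math.cos(t)) * sec_sum / math.pi
    k_geo = overlap - sec_sum + 0.5 * (1.0 + cos_xip) / (math.cos(tsp) * math.cos(tvp))
    return k_vol, k_geo


def bidirectional_reflectance(f_iso, f_vol, f_geo, sza_deg, vza_deg, raa_deg):
    """Evaluate R = f_iso + f_vol * K_vol + f_geo * K_geo."""
    k_vol, k_geo = rtlsr_kernels(sza_deg, vza_deg, raa_deg)
    return f_iso + f_vol * k_vol + f_geo * k_geo


# Hypothetical parameter values for one pixel and one band.
f_iso, f_vol, f_geo = 0.30, 0.15, 0.03
nadir = bidirectional_reflectance(f_iso, f_vol, f_geo, 0, 0, 0)
backward = bidirectional_reflectance(f_iso, f_vol, f_geo, 30, 30, 0)
forward = bidirectional_reflectance(f_iso, f_vol, f_geo, 30, 30, 180)
print(nadir, backward, forward)
```

Both kernels vanish when the sun and the sensor are at nadir, so the sketch returns f_iso for that geometry, in line with the description of the isotropic parameter above.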
The classification of vegetation physiognomic types using satellite remote sensing data over a large region is a challenging field. For example, extant maps such as MODIS Land Cover Type Product (MCD12Q1, [26]) and Global Land Cover by National Mapping Organizations (GLCNMO, [27]), from which the vegetation physiognomic information can be obtained, have not correctly classified the vegetation physiognomic types over a region as large and diverse as all of Japan [28,29]. With a focus on ground truth data and mapping at the national scale, more accurate vegetation physiognomic maps have been produced in Japan [28,29]. The importance of input features and the size of ground truth data for the classification of vegetation physiognomic types have also been emphasized [30]. In our previous research [29], nadir BRDF-adjusted reflectance indicated a slightly better classification of vegetation physiognomic types than the conventional surface reflectance. The objective of this research was to further assess the potential of bidirectional reflectance data at multiple illumination and viewing geometries for improving the classification and characterization of vegetation physiognomic types at moderate spatial resolution.

Processing of Satellite Data
The MODIS BRDF/Albedo parameters product (MCD43A1), available from the United States Geological Survey (USGS) on an eight-day cycle at 500-m resolution, was processed for the year 2016. The BRDF model parameters ($f_{iso}$, $f_{vol}$, and $f_{geo}$) of six spectral bands (red, near-infrared, blue, green, mid-infrared, and short-wave infrared) were utilized. Bidirectional reflectance data at different illumination and viewing geometries (Table 1) were derived by simulating the RT-LSR model with the BRDF model parameters. Using all of the stacks of images available for Japan, the bidirectional reflectance data for each spectral band were composited by calculating eleven percentile values (0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100) pixel by pixel, following the methodology described by Sharma et al. [30]. In this manner, a total of 264 bidirectional reflectance features were prepared from the MCD43A1 product (Table 1).
For comparison with the bidirectional reflectance, the surface reflectance product (MOD09A1/MYD09A1), which provides an estimate of the surface spectral reflectance as it would be measured at ground level in the absence of atmospheric scattering or absorption, was also processed in a manner similar to the MCD43A1 product, and the annual minimum and maximum value composites for each spectral band were generated.
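The pixel-by-pixel percentile compositing can be illustrated with a short sketch. It assumes the annual time series of one band at one geometry is held in a NumPy array of shape (time, rows, columns) with NaN marking missing retrievals; the array, its size, and the function name are hypothetical, and the procedure of Sharma et al. [30] may differ in detail.

```python
import numpy as np


def percentile_composites(stack, percentiles=range(0, 101, 10)):
    """Composite a (time, rows, cols) stack into (n_percentiles, rows, cols).

    NaN values (missing retrievals) are ignored for each pixel.
    """
    return np.nanpercentile(stack, list(percentiles), axis=0)


# Synthetic stand-in for one year of eight-day data (46 periods) on a tiny grid.
rng = np.random.default_rng(0)
stack = rng.uniform(0.0, 0.6, size=(46, 4, 5))
stack[rng.random(stack.shape) < 0.1] = np.nan  # simulate data gaps

composites = percentile_composites(stack)
print(composites.shape)  # one layer per percentile: (11, 4, 5)
```

The 0th and 100th percentiles correspond to the annual minimum and maximum value composites. The reported total of 264 features is consistent with 6 bands × 11 percentiles × 4 geometries, although the geometries themselves are listed only in Table 1.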

Preparation of Ground Truth Data
This research deals with the classification and characterization of six vegetation physiognomic types: evergreen coniferous forest (ECF), evergreen broadleaf forest (EBF), deciduous coniferous forest (DCF), deciduous broadleaf forest (DBF), shrubs (Sh), and herbaceous (Hb). Ground truth data prepared in previous research studies [28][29][30] were further strengthened with reference to Google Earth imagery and used for this research. A total of 410 ground truth points per class, distributed throughout Japan, were used.

Machine Learning and Cross-Validation
Four machine learning classifiers described in previous research on the classification of vegetation physiognomic types [30], namely k-Nearest Neighbors (KNN), Random Forests (RF), Support Vector Machines (SVM), and Multilayer Perceptron Neural Networks (NN), were employed to evaluate the bidirectional reflectance features in this research.
The performance was evaluated by 10-fold cross-validation, following the method described by Sharma et al. [30]. In this method, the given feature data were shuffled and then split into 10 folds. Machine learning was carried out on nine folds only, whereas the remaining fold was used for validation. The features were standardized by removing the mean and scaling to unit variance. The features were then scored with an analysis of variance (ANOVA) test, and the best-scoring features were selected. For each set of best features, a machine learning model established with the learning folds was used to predict the physiognomic classes of the validation fold. Predictions were collected from the cross-validation loops, and the validation metrics (the confusion matrix, overall accuracy, and kappa coefficient) were calculated for each set of best features. The hyperparameters of each classifier were tuned by repeated trial and error with reference to the validation metrics. The optimum number of important features, that is, the lowest number of input features that yielded the highest kappa coefficient, was recorded. The same procedure was repeated for each machine learning classifier.
We also compared the spectral profiles of the vegetation physiognomic types between the surface and bidirectional reflectance features. For this comparison, the spectral profiles were extracted from both the surface and bidirectional reflectance features using the median values of all of the ground truth points prepared in this research.
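The evaluation loop described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins generated with `make_classification`, the candidate feature counts and Random Forests settings are arbitrary, and placing the standardization and ANOVA selection inside a pipeline (so that they are fitted on the learning folds only) is a design choice of this sketch.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature table: 6 classes, 40 features.
X, y = make_classification(
    n_samples=360, n_features=40, n_informative=10, n_redundant=5,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)


def evaluate(k, classifier):
    """10-fold cross-validation with the k best ANOVA-scored features."""
    model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=k), classifier)
    folds = KFold(n_splits=10, shuffle=True, random_state=0)
    # Each sample is predicted by a model trained on the other nine folds.
    y_pred = cross_val_predict(model, X, y, cv=folds)
    return (
        confusion_matrix(y, y_pred),
        accuracy_score(y, y_pred),
        cohen_kappa_score(y, y_pred),
    )


results = {}
for k in (5, 10, 20, 40):
    cm, overall_accuracy, kappa = evaluate(
        k, RandomForestClassifier(n_estimators=50, random_state=0)
    )
    results[k] = (overall_accuracy, kappa)

# Optimum: the smallest number of features that reaches the highest kappa.
best_kappa = max(kappa for _, kappa in results.values())
best_k = min(k for k, (_, kappa) in results.items() if kappa == best_kappa)
print(results, best_k)
```

Swapping `RandomForestClassifier` for `KNeighborsClassifier`, `SVC`, or `MLPClassifier` would repeat the same procedure for the other three classifier families.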
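Extracting a median spectral profile per class can be sketched as below, assuming the reflectance values sampled at the ground truth points are arranged as a (points, bands) array with a matching array of class labels. The function name and the sample values are hypothetical and serve only to show the grouping.

```python
import numpy as np


def median_profiles(features, labels):
    """Return {class label: per-band median reflectance} over the ground truth points."""
    return {
        str(cls): np.median(features[labels == cls], axis=0)
        for cls in np.unique(labels)
    }


# Hypothetical reflectance samples for two bands (red, near-infrared).
labels = np.array(["ECF", "ECF", "ECF", "Hb", "Hb", "Hb"])
features = np.array([
    [0.02, 0.30],
    [0.03, 0.35],
    [0.10, 0.31],
    [0.05, 0.45],
    [0.06, 0.50],
    [0.07, 0.40],
])

profiles = median_profiles(features, labels)
for cls, profile in profiles.items():
    print(cls, profile)
```

The median is less affected than the mean by a few outlying points, such as the third ECF sample in the red band above.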

Cross-Validation Results
The variation of the kappa coefficient with an increasing number of important features, obtained from the cross-validation method, is shown in Figure 1. The kappa coefficients increased with the number of important features up to a point, after which they started to saturate for all of the classifiers.
The confusion matrices are plotted in Figure 2. These matrices were computed by 10-fold cross-validation using the optimum set of features. Among the classifiers used, Random Forests (RF; overall accuracy = 0.76, kappa coefficient = 0.72) and Support Vector Machines (SVM; overall accuracy = 0.76, kappa coefficient = 0.71) performed slightly better than the others. Nevertheless, the accuracy metrics obtained from the bidirectional reflectance did not vary much among the classifiers, which was similar to the case of the surface reflectance [30].

Comparison of the Spectral Profiles
Figures 3 and 4 show the spectral profiles using annual minimum and maximum value composite images, respectively. There is a substantial difference in the magnitude of the reflectance values in all of the spectral regions between the surface and bidirectional reflectance products. The annual minimum values of the surface reflectance are lower than the corresponding isotropic reflectance (Figure 3), whereas the annual maximum values of the surface reflectance are higher than the corresponding isotropic reflectance (Figure 4). The isotropic reflectance (SZA = 0°, VZA = 0°, RAA = 0°) indicated a better discrimination of the vegetation physiognomic types in the near-infrared and shortwave infrared regions than the surface reflectance. Therefore, the bidirectional reflectance features may be more sensitive to the vegetation physiognomic types.
Figure 5 shows the spectral profiles in the backward and forward scattering directions. The herbs, shrubs, and deciduous conifer forests showed higher backward scattering in the red region than the deciduous broadleaf, evergreen broadleaf, and evergreen conifer forests. This may be because the exposure of the ground surface is more pronounced in the case of short and deciduous vegetation (herbs, shrubs, and deciduous forests) when viewed from off-nadir directions. Moreover, the forward reflectance values are much lower than the backward reflectance values, possibly due to the presence of shadows in the forward direction. Thus, the analyses of the spectral profiles indicated that bidirectional reflectance data provide additional features for improving the classification and characterization of vegetation physiognomic types.
Vegetation exhibits anisotropic reflectance, i.e., the reflectivity varies with the direction of observation [31,32]. Researchers have described the effects of viewing geometry and illumination conditions on images [33] and vegetation indices [34]. Multi-angular remote sensing has shown promise for the characterization of forests [35] and biomes [36], as well as the retrieval of canopy structural [37][38][39][40] and chemical characteristics [41,42]. Therefore, the mapping and characterization of vegetation physiognomic types using bidirectional reflectance data is an interesting topic for research.

Conclusions
In this research, we analyzed the variation of the spectral profiles between the surface and bidirectional reflectance data and assessed the potential of bidirectional reflectance features at multiple illumination and viewing geometries for improving the classification and characterization of vegetation physiognomic types. The results indicated that bidirectional reflectance provides effective information for the classification and characterization of vegetation physiognomic types. We hope that the mapping and monitoring of vegetation changes with bidirectional reflectance data, especially at higher spatial resolution in the future, will contribute greatly to land planning, nature conservation, and global biodiversity strategies. This research dealt only with the physiognomic characteristics of the vegetation; exploring the effects of leaf and vegetation structure (proportion of leaves/needles to woody parts, plant density, diversity, variation in plant height, edges, etc.) on the bidirectional reflectance and the classification of vegetation types is an important subject for future research.
