Article

Deep Learning-Based Prediction of Multi-Species Leaf Pigment Content Using Hyperspectral Reflectance

1 School of Urban Planning and Design, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
2 College of Environment and Resources, College of Carbon Neutrality, Zhejiang A&F University, Hangzhou 311300, China
3 Key Laboratory of Carbon Sequestration and Emission Reduction in Agriculture and Forestry of Zhejiang Province, Zhejiang A&F University, Hangzhou 311300, China
4 Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(19), 3293; https://doi.org/10.3390/rs17193293
Submission received: 9 August 2025 / Revised: 22 September 2025 / Accepted: 23 September 2025 / Published: 25 September 2025
(This article belongs to the Section Forest Remote Sensing)

Highlights

What are the main findings?
  • CNN models combined with genetic algorithm–based spectral band selection achieved high-accuracy estimation of leaf pigment content across tree species.
  • The 2D CNN outperformed the 1D CNN, with optimal results obtained using 3–4 convolutional layers.
What is the implication of the main finding?
  • The study provides a non-destructive and robust approach for monitoring leaf pigments across different tree species.
  • The CNN-based approach improved remote sensing applications in vegetation health assessment and forest ecosystem management.

Abstract

Leaf pigment composition and concentration are crucial indicators of plant physiological status, photosynthetic capacity, and overall ecosystem health. While spectroscopy techniques show promise for monitoring vegetation growth, phenology, and stress, accurately estimating leaf pigments remains challenging due to the complex reflectance properties of diverse tree species. This study introduces a novel approach using a two-dimensional convolutional neural network (2D-CNN) coupled with a genetic algorithm (GA) to predict leaf pigment content, including chlorophyll a and b content (Cab), carotenoid content (Car), and anthocyanin content (Canth). Leaf reflectance and biochemical content measurements from 28 tree species were used in this study. The reflectance spectra from 400 nm to 800 nm were encoded as 2D matrices of different sizes to train the 2D-CNN, which was compared with a one-dimensional convolutional neural network (1D-CNN). The results show that the 2D-CNN model (nRMSE = 11.71–31.58%) achieved higher accuracy than the 1D-CNN model (nRMSE = 12.79–55.34%) in predicting leaf pigment contents. For the 2D-CNN models, Cab achieved the best estimation accuracy with an nRMSE of 11.71% (R2 = 0.92, RMSE = 6.10 µg/cm2), followed by Car (R2 = 0.84, RMSE = 1.03 µg/cm2, nRMSE = 12.29%) and Canth (R2 = 0.89, RMSE = 0.35 µg/cm2, nRMSE = 31.58%). Both 1D-CNN and 2D-CNN models coupled with GA, using a subset of the spectrum, produced higher prediction accuracy for all pigments than their full-spectrum counterparts. Additionally, the 2D-CNN generalized better than the 1D-CNN. This study highlights the potential of 2D-CNN approaches for accurate prediction of leaf pigment content from spectral reflectance data, offering a promising tool for advanced vegetation monitoring.

1. Introduction

The production of oxygen and organic matter through photosynthesis is of tremendous significance for the global carbon cycle in the biosphere [1]. Photosynthesis is primarily driven by leaf pigments, which serve as excellent indicators of plant growth, stress, function, and phenology from local to global scales [2,3]. The three main families of pigments are chlorophylls, carotenoids, and anthocyanins. Chlorophylls, including chlorophyll-a and chlorophyll-b, are the fundamental light-absorbing pigments involved in photosynthetic reactions [4]. Carotenoids, composed of carotenes and xanthophylls, contribute to light harvesting and provide essential photoprotection, preventing excess energy from damaging the photosynthetic system [5]. Anthocyanins are strongly associated with foliage coloration and protect leaves from excess light and ultraviolet radiation [6]. Variation in pigment content thus provides information about leaf physiological state.
However, the ability to extract pigment content quickly and accurately is still limited. Leaf pigment content estimation is mainly based on chemical and remote sensing methods [7]. Traditional wet chemical pigment analysis requires the destruction of the detached leaves, followed by solvent extraction and spectrophotometric analysis according to a standard procedure [8,9]. These wet laboratory methods are time-consuming and labor-intensive. Furthermore, the destructiveness of traditional techniques limits the ability to monitor the change in pigments over time for the individual leaf [5,10]. In contrast, remote sensing-based approaches, which measure spectral reflectance, are nondestructive, rapid, and applicable across various spatiotemporal scales [11].
Leaf reflectance spectra are directly influenced by the spectral absorbance properties of pigments [10,12]. Therefore, measurements of leaf reflectance offer the opportunity to quantify pigment content [3]. Remote sensing-based pigment estimation generally follows two approaches: data-driven methods and radiative transfer model (RTM) inversion [13]. Among data-driven approaches, spectral index-based statistical models are widely used to predict leaf pigment content. Vegetation indices (VIs) are calculated from the reflectance at wavelengths related to pigment absorption features in the visible (VIS) region (400–800 nm) [14]. To date, most studies on leaf pigments have focused on chlorophylls [15,16,17], while fewer have addressed the estimation of carotenoids [18,19,20] and anthocyanins [21,22]. More specifically, Croft et al. (2014) comprehensively summarized 47 VIs that are sensitive to chlorophyll at both the leaf and canopy scales [23], concluding that leaf chlorophyll content can be estimated with comparable results using simple statistical models based on the linear relationship between chlorophyll and VIs. However, the overlapping absorption coefficients of pigments in the VIS region make it difficult to separate and quantify these pigments using VI-based linear models [12]. In fact, the relationships between reflectance in the VIS region and pigment contents are inherently nonlinear, so mathematical formulations derived from VIs can sometimes yield inaccurate pigment estimates [24]. Secondly, multivariate statistical models such as partial least squares regression (PLSR) [25,26] have also been extensively used to estimate leaf pigment content. In PLSR, the full reflectance range is transformed into several latent variables, which are then used to estimate leaf pigment content [27]. Furthermore, nonparametric machine learning (ML) algorithms, e.g., neural networks (NNs), random forest regression (RFR), and Gaussian process regression (GPR), perform better in pigment estimation [28,29,30]. However, the performance of these models depends mainly on the training spectral dataset and pre-processing techniques. ML models may lose generality when applied to a new dataset whose feature spaces and distributions vary with plant species and environmental conditions [31].
Another approach is based on physical principles through the inversion of RTMs. RTMs such as PROSPECT [32] and LIBERTY [33] are widely used to simulate directional–hemispherical reflectance across the 400–2500 nm wavelength range at the leaf scale. The LIBERTY model was developed to represent the optical properties of needles; however, its generalization is limited because measuring the spectrum of narrow needles is challenging. The PROSPECT model has evolved through several versions. PROSPECT-4 and 5 [34] introduced the separation of photosynthetic pigments (i.e., chlorophylls and carotenoids) in leaf optical property models. Subsequently, anthocyanin was added as a pigment in PROSPECT-D [13]. Recently, Féret et al. (2021) developed PROSPECT-PRO [35], which separates proteins from carbon-based constituents. This advancement brings leaf optical model simulations closer to natural leaf spectra. However, an increased number of model input parameters can lead to ill-posed inversion problems, because various combinations of model parameters can compensate for each other and produce leaf reflectance with very similar optical properties [36].
With the advancement of modern computational capabilities, deep learning-based algorithms have been successfully applied in the field of computer vision. These methods generally employ neural network architectures with multiple processing layers to extract higher-level features from input data. Convolutional neural networks (CNNs) are among the most widely used deep learning algorithms, achieving high performance in image classification and pattern recognition [37]. Recently, CNNs have also been proven effective in addressing regression problems in the remote sensing community, such as agricultural yield prediction [38] and the estimation of forest structural parameters [39]. Moreover, an emerging trend in remote sensing research highlights the integration of multi-sensor data (e.g., RGB cameras, hyperspectral, and LiDAR) with artificial intelligence methods (e.g., CNNs), which has shown great potential for non-destructive, precise, and efficient quantification of plant physiological and biochemical traits [40,41,42]. Traditionally, CNNs are applied to two-dimensional (2D) image data. By contrast, leaf reflectance spectra are often represented as one-dimensional (1D) sequential signals [43]. Several recent studies have therefore developed 1D-CNNs with leaf reflectance to estimate leaf pigments [44,45,46,47]. However, the contrast between absorption peaks and their surroundings tends to be smoothed when only sequential spectra are considered, particularly across different vegetation species, limiting the model’s ability to capture important spectral features.
This paper aims to develop a CNN-based approach for accurately predicting leaf pigment content across different tree species using reflectance spectra. We propose two CNN-based approaches to simultaneously estimate chlorophyll, carotenoid, and anthocyanin from the leaf reflectance spectra: (1) a 1D-CNN applied directly to spectral signals, and (2) a 2D-CNN applied to spectral signals transformed into 2D matrices. The specific objectives are to (1) investigate leaf spectral diversity along with the relationship between pigment content and reflectance spectra across various tree species; (2) compare the performance of 1D-CNN models and 2D-CNN models; (3) evaluate model accuracy and robustness across different tree species.
The structure of this paper is arranged as follows: leaf reflectance spectra and leaf pigment content measurements, as well as CNN-based methodologies, are described in Section 2; the comparison results between 1D- and 2D-CNN are described in Section 3; the discussion and conclusions are given in Section 4 and Section 5, respectively.

2. Materials and Methods

2.1. Study Area and Field Collection

The study site (see Figure 1) is located in Hangzhou (118°21′–120°30′E, 29°11′–30°33′N), Zhejiang Province, China. The region has a subtropical monsoon climate, with a mean annual temperature of 17.8 °C, a mean relative humidity of 70.3%, and annual precipitation of 1454 mm. The elevation ranges from 3 m to 1587 m.
This study was conducted between April and September from 2021 to 2023, during which 2094 leaf samples were collected from 28 broadleaf tree species (e.g., Diospyros oleifera, see Table 1 for the complete list). For each species, several leaf samples were obtained from different individual trees to ensure diversity. During the sample collection process, we only selected healthy leaves to represent the physiological characteristics of the growing season. The sampling of leaves followed consistent standards in different regions, with leaves collected from different heights and orientations within the tree crown. The collected samples were labeled and stored in an ice box to prevent discoloration before pigment analysis in the laboratory.

2.2. Spectra and Leaf Pigment Content Measurements

Leaf hemispherical reflectance spectra were measured using an ASD FieldSpec-4 Wide-Res (Analytical Spectral Devices, Inc., Boulder, CO, USA) handheld spectrometer equipped with an integrating sphere. This instrument measures spectra from 350 to 2500 nm with a spectral resolution of 1 nm. The radiometer was calibrated and optimized for dark current and system offsets using a white reference panel. To reduce instrument noise, each leaf sample was measured ten times, and the results were averaged to obtain the mean reflectance. Finally, the mean reflectance of each sample was calculated from all collected leaves within a tree crown.
After the spectral measurements, leaf pigments, including chlorophyll a and b content (Cab), carotenoid content (Car), and anthocyanin content (Canth), were measured immediately in the laboratory. Each sample was cut into eight small leaf disks (approximately 0.5 g) with a diameter of 1.2 cm. Two types of organic solvent were used for pigment extraction: 80% acetone for Cab and Car, and a mixture of methanol, concentrated hydrochloric acid, and distilled water for Canth. Leaf disks were placed in tubes containing organic solvent (25 mL) and kept in a dark environment for 48 h to dissolve the pigments completely. A dual-beam scanning UV-2100 spectrophotometer was used to measure the supernatant absorbance at wavelengths of 470 nm, 537 nm, 647 nm, and 663 nm. The pigments were derived using Equations (1)–(4). Finally, Cab, Car, and Canth, expressed in the same unit (µg/cm2), were determined using a multi-wavelength calculation according to [10,48].
$$C_a = 12.25A_{663} - 2.79A_{647} \quad (1)$$
$$C_b = 21.50A_{647} - 5.10A_{663} \quad (2)$$
$$C_{car} = \left(1000A_{470} - 1.82C_a - 85.02C_b\right)/198 \quad (3)$$
$$C_{anth} = 36.71A_{537} - 3.13A_{646} - A_{663} \quad (4)$$
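For readers implementing this step, a minimal Python sketch of Equations (1)–(4) follows. The function and argument names are ours, not from the paper, and the conversion from solution concentration to area-based content (µg/cm2, via extract volume and disk area) is omitted.

```python
def pigment_contents(a470, a537, a646, a647, a663):
    """Pigment contents from supernatant absorbance, Eqs. (1)-(4)."""
    ca = 12.25 * a663 - 2.79 * a647                     # Eq. (1), chlorophyll a
    cb = 21.50 * a647 - 5.10 * a663                     # Eq. (2), chlorophyll b
    car = (1000 * a470 - 1.82 * ca - 85.02 * cb) / 198  # Eq. (3), carotenoids
    canth = 36.71 * a537 - 3.13 * a646 - a663           # Eq. (4), anthocyanins
    return ca + cb, car, canth                          # Cab, Car, Canth
```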

2.3. Model Development

2.3.1. Data Preparation

The reflectance spectra were processed to reduce noise and enhance model performance. Wavelengths with a low signal-to-noise ratio, from 350 nm to 400 nm, were excluded. Then, the wavelength ranges from 400 nm to 800 nm, which are strongly related to pigment spectral absorption, were selected to estimate leaf pigment content.
In addition, all leaf pigment contents were normalized due to the significant differences in the amounts of three pigments, with Cab typically being more than ten times higher than Car and Canth. This normalization ensures that each neuron (filter) in the CNN has an equal opportunity to learn the pattern for each pigment [44]. The raw content of three pigments was converted into a 0–1 range by
$$P_i^* = \frac{P_i - \min(P_i)}{\max(P_i) - \min(P_i)} \quad (5)$$
where $P_i$ and $P_i^*$ are the original and normalized values of each leaf pigment, respectively.
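As an illustration, Equation (5) applied column-wise to the pigment matrix can be sketched as follows; the function name is ours, and the stored extrema are needed to map predictions back to physical units.

```python
import numpy as np

def minmax_scale(Y):
    """Scale each pigment column of Y (n_samples x 3) to [0, 1], Eq. (5)."""
    y_min, y_max = Y.min(axis=0), Y.max(axis=0)
    return (Y - y_min) / (y_max - y_min), y_min, y_max

# Back-transform of model outputs to physical units:
# Y_hat = Y_hat_scaled * (y_max - y_min) + y_min
```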

2.3.2. Design CNN Architecture

In this study, two CNN-based approaches were developed to estimate leaf pigment contents. CNN-based strategies have proven effective for processing both one-dimensional (1D) signals (e.g., time series) and two-dimensional (2D) images, providing accurate estimation through a hierarchical architecture [49,50]. Recent research has shown that AlexNet performs well in pigment prediction tasks [44,45,46,47]; therefore, the AlexNet architecture was used as a reference in this study. The original AlexNet architecture comprises five convolutional layers and three fully connected layers. It also employs Rectified Linear Unit (ReLU) activations, dropout regularization, and overlapping max-pooling to achieve high efficiency and good generalization ability.
(1) 1D-CNN-based approach
For the 1D-CNN, the reflectance spectra (wavelength range: 400–800 nm), with a size of 1 × 400, were treated as a waveform input, and the output layer, with a size of 1 × 3, provided the predictions of the three pigment contents (see Figure 2). In each convolution operation, a feature map of the spectral data was extracted by the convolutional filter, with the feature size determined by the filter (kernel) size, stride, and padding. Considering the narrow wavelength range and data length used in this study, the kernel size was set to 1 × 5 for the first convolutional layer and 1 × 3 for the others. To avoid reducing feature dimensions and to minimize network overfitting, dropout was used instead of max-pooling in the 1D-CNN architecture [43]. Each convolutional layer employed 'same' padding with a stride of 1. After all the convolution operations, the first two fully connected layers combined all the features, and the last fully connected layer performed multi-output regression with a linear activation function to predict the pigment variables. The number of filters and neurons in each layer was set according to AlexNet and similar research [39,43], following the general principles of convolutional neural network design [51]. The parameters of the 1D-CNN are summarized in Table 2.
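As a concrete illustration, the PyTorch sketch below builds a 1D-CNN under these design choices (1 × 5 then 1 × 3 kernels, 'same' padding and stride 1, dropout in place of pooling, a 3-neuron linear head). The filter and neuron counts are placeholders; the paper's exact values are those listed in its Table 2.

```python
import torch
import torch.nn as nn

class CNN1D(nn.Module):
    """Sketch of the 1D-CNN with four convolutional layers (strategy D)."""
    def __init__(self, n_bands=400, channels=(32, 64, 128, 256)):
        super().__init__()
        layers, in_ch = [], 1
        for i, out_ch in enumerate(channels):
            k = 5 if i == 0 else 3                 # 1x5 first layer, 1x3 after
            layers += [nn.Conv1d(in_ch, out_ch, k, stride=1, padding=k // 2),
                       nn.ReLU(),
                       nn.Dropout(0.2)]            # dropout replaces pooling
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels[-1] * n_bands, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 3))                      # linear multi-output head
    def forward(self, x):                          # x: (batch, 1, 400)
        return self.head(self.features(x))
```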
(2) 2D-CNN-based approach
To analyze the spectral data with a 2D-CNN, each waveform spectrum must first be encoded in a 2D matrix format before being fed to the network. Several studies have addressed the transformation of 1D reflectance spectra into 2D representations using spectrograms [50,52]. However, these transformations did not achieve better accuracy in parameter estimation, because the short-time Fourier transform alters the original spectral arrangement. Here, we propose a simple and effective 2D transformation for reflectance spectra.
The spectral reflectance was transformed into a 2D representation by rearranging the elements of the 1D reflectance spectrum into a 2D matrix. The 2D matrix dimensions were set to 8 × 50, 10 × 40, 20 × 20, 40 × 10, and 50 × 8. The rearrangement divides the reflectance data into equal-length segments and stacks them vertically, as sketched below. For comparison purposes, the 2D-CNN architecture (Figure 3) was designed to be similar to that of the 1D-CNN model. Batch normalization was applied to the convolutional layer features to accelerate convergence and improve model stability [53]. Additionally, the nonlinear ReLU activation function and the dropout method were used in the 2D-CNN. The first two fully connected layers flattened and combined all features extracted by the convolutional kernels. Finally, as in the 1D-CNN architecture, the last fully connected layer performed multi-output regression with three neurons representing Cab, Car, and Canth, respectively. The parameters of the 2D-CNN are summarized in Table 3.
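Because the rearrangement is a row-major reshape, it can be sketched in a few lines; the function name is ours.

```python
import numpy as np

def encode_2d(spectrum, shape=(20, 20)):
    """Cut a 400-band spectrum into equal-length rows and stack them
    vertically; (20, 20) performed best, but 8x50, 10x40, 40x10, and
    50x8 layouts were also tested."""
    assert spectrum.size == shape[0] * shape[1]
    return spectrum.reshape(shape)

image = encode_2d(np.linspace(0.0, 0.5, 400))  # a 20 x 20 input for the 2D-CNN
```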
(3) Optimization methods
In CNN-based methods, features are extracted from spectral reflectance using convolutional layers, which strongly influence leaf pigment estimation performance [46]. Therefore, the number of convolutional layers (Nconv) is an important factor in assessing prediction accuracy. Five strategies with an increasing number of convolutional layers were implemented in 1D- and 2D-CNN to optimize feature extraction for pigment prediction. The architectures of these five strategies are shown in Table 4.
We trained the aforementioned CNN architectures using stochastic gradient descent with momentum (SGDM), while internal neuron weights were updated with the Adam optimizer. During backpropagation, the mean absolute error (MAE) was used as the cost function. Therefore, the loss function was defined as follows:
$$\mathrm{Loss} = \frac{1}{N}\sum_{j=1}^{N}\frac{1}{R}\sum_{i=1}^{R}\left|y_i - \hat{y}_i\right| \quad (6)$$
where $y_i$ and $\hat{y}_i$ represent the measured and predicted pigment contents, respectively; $N$ is the number of observations, and $R$ is the number of responses.
In addition, standard hyperparameters such as the learning rate and batch size were set to 0.001 and 32, respectively. Both the 1D- and 2D-CNN models were trained for 2000 epochs.
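Putting these settings together, a minimal PyTorch training loop might look as follows, reusing the CNN1D sketch above with placeholder data. The text mentions both SGDM and Adam; this sketch uses Adam, and torch.nn.L1Loss corresponds to the MAE cost of Equation (6).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.rand(2094, 1, 400)  # placeholder spectra (batch, channel, bands)
Y = torch.rand(2094, 3)       # placeholder normalized Cab, Car, Canth
loader = DataLoader(TensorDataset(X, Y), batch_size=32, shuffle=True)

model = CNN1D()               # the 1D sketch above; the 2D variant is analogous
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = torch.nn.L1Loss()   # MAE cost of Eq. (6), averaged over outputs

for epoch in range(2000):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
```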

2.3.3. Spectral Band Selection Using a Genetic Algorithm

To reduce computational time and avoid under- and overfitting, a genetic algorithm (GA) was coupled with the CNN models for spectral feature selection [54,55]. We employed a pre-optimized 1D-CNN model for the GA fitness evaluation (denoted 1D-CNNGA) to select the spectral bands most important for predicting leaf pigment contents. In this study, the CNN models adopt multi-output regression for the three pigment estimates; therefore, the sensitive bands selected by the GA represent the combined influence of the three pigments on leaf spectral reflectance. For the GA coupled with a 2D-CNN model (denoted 2D-CNNGA), the 2D matrix rearrangement of the spectral reflectance was the same as in the original 2D-CNN with the full spectrum (referred to as 2D-CNNf). Pixel values corresponding to selected important bands were filled with the measured reflectance, while pixels for non-selected bands were set to zero.
The standard GA procedure included encoding (waveband selections were represented as binary strings, termed chromosomes), population initialization (n = 100), fitness calculation, selection, crossover, mutation, and fitness re-evaluation. The GA search was run for 500 iterations to obtain the best-performing chromosomes over successive generations.
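The numpy sketch below illustrates a GA of this form (binary chromosomes, population of 100, 500 generations). The selection, crossover, and mutation operators shown are common defaults rather than the paper's exact settings, and the fitness function is a runnable stand-in: in the study, fitness would come from training and validating the pre-optimized 1D-CNN on the bands where the chromosome equals 1.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BANDS, POP, GENS = 400, 100, 500

def fitness(mask):
    # Stand-in objective so the sketch runs; in the paper, fitness is the
    # validation accuracy of the pre-optimized 1D-CNN on the selected bands
    # (with non-selected bands zeroed in the 2D variant).
    return -abs(int(mask.sum()) - 218)

pop = rng.integers(0, 2, size=(POP, N_BANDS))        # binary chromosomes
for gen in range(GENS):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-POP // 2:]]    # keep the best half
    cuts = rng.integers(1, N_BANDS, size=POP // 2)   # one-point crossover
    children = np.array([
        np.concatenate((parents[i][:c], parents[(i + 1) % len(parents)][c:]))
        for i, c in enumerate(cuts)])
    flips = rng.random(children.shape) < 0.01        # bit-flip mutation
    children[flips] ^= 1
    pop = np.vstack((parents, children))

best = max(pop, key=fitness)                         # selected band mask
```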

2.4. Accuracy Assessment

In this study, the measured leaf dataset (n = 2094) was used to evaluate the performance of the four CNN variants described above. To assess the four CNN versions and five strategies, the dataset was divided into a training set (70%) and a validation set (30%) using random systematic sampling stratified by tree species and collection month to ensure model generalization. For comparison, three benchmark estimation models, PLSR, RFR, and GPR, were employed as standard competitors to further evaluate the effectiveness of the CNN models. During training, the optimal number of latent variables in PLSR was determined as the number that minimized the root mean squared error (RMSE) under 5-fold cross-validation. Hyperparameter optimization for RFR and GPR was implemented using a grid search.
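A species- and month-stratified 70/30 split of this kind can be approximated in a few lines of pandas; the DataFrame and its column names here are toy placeholders, not the study's actual data structure.

```python
import pandas as pd

# Toy stand-in for the measured dataset, one row per leaf sample.
df = pd.DataFrame({"species": ["A", "A", "B", "B"] * 25,
                   "month": [4, 5] * 50,
                   "Cab": range(100)})

# Sample 70% within each species-month stratum for training.
train = df.groupby(["species", "month"], group_keys=False).sample(
    frac=0.7, random_state=0)
val = df.drop(train.index)  # remaining 30% for validation
```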
All models (see Table 5 for details) were trained and tested using both full-spectrum reflectance and the subset of important bands selected by GA. The accuracy of leaf pigment content prediction was evaluated using the coefficient of determination (R2), RMSE, and normalized root mean square error percentage (nRMSE). Finally, a t-test was conducted to assess whether there were significant differences in prediction accuracy between the CNN models and the three benchmark models.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \quad (7)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \quad (8)$$
$$\mathrm{nRMSE} = 100 \times \frac{\mathrm{RMSE}}{\bar{y}} \quad (9)$$
where $\bar{y}$ is the mean of the $n$ measured pigment contents. The nRMSE was used to compare performance across the four CNN variants.
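For reference, a small helper implementing Equations (7)–(9); the function name is ours.

```python
import numpy as np

def scores(y, y_hat):
    """Return R2, RMSE, and nRMSE (%) as defined in Eqs. (7)-(9)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    r2 = 1 - ss_res / np.sum((y - y.mean()) ** 2)
    rmse = np.sqrt(ss_res / y.size)
    return r2, rmse, 100 * rmse / y.mean()
```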
For further validation of the effectiveness and generalizability of the optimal 1D- and 2D-CNN architectures (with the best number of convolutional layers), we incorporated the internationally available LOPEX93 leaf spectral dataset, which provides leaf optical properties for 331 samples from 45 species across wavelengths of 400–2500 nm [56]. Since Canth measurements are not included in this dataset, only Cab and Car were used to validate the optimal 1D- and 2D-CNN model. The LOPEX93 dataset was randomly divided by plant type into training (70%) and validation (30%) subsets. The LOPEX93 training subset was then combined with the training dataset from this study to train models, while the remaining LOPEX93 validation subsets, together with our study’s validation dataset, were used to evaluate model performance in estimating Cab and Car.

3. Results

3.1. The Distribution of Leaf Pigment Content and Spectral Variation

The statistics of the three pigment contents from the 2094 samples across 28 tree species are given in Table 6. Cab values spanned a wide range (15.50–97.73 µg/cm2), with a mean of 52.51 µg/cm2 and a standard deviation (SD) of 21.78 µg/cm2. In contrast, the variation in Car and Canth was relatively small, with ranges of 10.21 µg/cm2 (SD: 2.49 µg/cm2) and 7.16 µg/cm2 (SD: 1.15 µg/cm2), respectively. The frequency distributions of Cab, Car, and Canth are shown in Figure 4. The distributions of Cab and Car were approximately normal; for Canth, however, most samples fell in the low-value region (<2 µg/cm2).
Figure 5 shows the average reflectance of the 28 tree species. Owing to differences in tree species and pigment content, substantial spectral variation was observed across the 400–800 nm wavelength range. The largest differences in reflectance among tree species occurred in the blue (400–500 nm), green (500–600 nm), and near-infrared (760–800 nm) regions, whereas relatively low variation was found in the red valley (650–700 nm) and red-edge (700–760 nm) regions.

3.2. Important Bands for Estimating Leaf Pigments

The important bands (Figure 6) were derived using the 1D-CNN model coupled with the GA for leaf pigment content estimation. A total of 218 bands were selected by the GA from the original 400 bands across the full wavelength region (400–800 nm). Most of the sensitive spectral bands (accounting for 75%) fall within the absorption ranges of the three pigments, located in the blue (430–500 nm), green (520–600 nm), and red (640–680 nm) regions. Specifically, broad and continuous important bands were located around 430–440 nm, 465–485 nm, 510–515 nm, 525–540 nm, 545–555 nm, 575–580 nm, 620–630 nm, 655–660 nm, 680–700 nm, and 705–715 nm. Additionally, some important bands (about 10%) were found outside the pigments' absorption ranges, at wavelengths above 750 nm.

3.3. Model Performance for the Prediction of Leaf Pigment Content

The performance of 1D-CNN models for leaf pigment content prediction is summarized in Table 7. This table shows three assessment indicators, including R2, RMSE, and nRMSE of 1D-CNNf and 1D-CNNGA for the prediction of Cab, Car, and Canth. For 1D-CNNf, the prediction accuracy improved as the number of convolutional layers increased from 1 to 4. Specifically, Cab achieved nRMSE = 12.94–14.72% (R2 = 0.88–0.91, RMSE = 6.75–7.67 µg/cm2), Car showed nRMSE = 14.88–17.52% (R2 = 0.70–0.77, RMSE = 1.25–1.47 µg/cm2), and Canth achieved nRMSE = 56.13–93.51% (R2 = 0.05–0.68, RMSE = 0.63–1.05 µg/cm2). Compared with 1D-CNNf, 1D-CNNGA exhibited more stable prediction accuracy, particularly for Cab and Car. For both 1D-CNNf and 1D-CNNGA, strategy D with four convolutional layers provided the best estimate for all pigments. Figure 7 illustrates the 1:1 relationship between measured and predicted values of the three pigments using the optimal 1D-CNN models with four convolutional layers. In terms of nRMSE, the 1D-CNNGA model (using the subset spectrum) slightly outperformed the 1D-CNNf model (using the full spectrum), with reductions in nRMSE ranging from 0.15% to 1.35% for all pigments. For both 1D-CNNf and 1D-CNNGA models, Cab (nRMSE = 12.94% or 12.79%) was predicted more accurately than Car (nRMSE = 14.88% or 13.53%) and Canth (nRMSE = 56.13% or 55.34%), with Canth values above 2 µg/cm2 mostly underestimated. Additionally, the nRMSE (%) of pigment estimation using the optimal 1D-CNN models with four convolutional layers across tree species is presented in Table A1. Both 1D-CNNf and 1D-CNNGA exhibited good generalizability and robustness for Cab (SD = 2.53–2.54%) and Car (SD = 1.71–1.94%) estimation. However, the 1D-CNN models for Canth estimation showed relatively higher uncertainty (SD = 13.32–15.84%).
The prediction accuracy of leaf pigments using 2D-CNN models based on the five strategies and different input matrix dimensions is shown in Table 8. For both Cab and Car, all 2D-CNN models, regardless of strategy, achieved good and stable prediction accuracy. In particular, with 3 or 4 convolutional layers, Cab estimation achieved R2 > 0.92 and RMSE < 7 µg/cm2 for both 2D-CNNf and 2D-CNNGA. Similarly, across different input matrix sizes, the 2D-CNNGA model consistently yielded slightly higher estimation accuracy for all three pigments than the 2D-CNNf model. Figure 8 shows the best-performing 2D-CNN models for the different matrix dimensions, with the highest prediction accuracy obtained using a 20 × 20 matrix input under strategy D. Overall, the 2D-CNN-based approaches, including 2D-CNNf and 2D-CNNGA, produced higher prediction accuracy than the 1D-CNN approach, particularly for Canth (nRMSE reduced from 42.74% to 31.58%). Furthermore, the optimal 2D-CNN models with four convolutional layers exhibited better generalizability across tree species than the 1D-CNN models for all three pigments (see Table A1).
Three benchmark estimation models (i.e., PLSR, RFR, and GPR) were trained and tested using the same training and validation dataset as the CNN models. The results of the three pigment estimations are shown in Table 9. The t-test results indicate a significant difference (p < 0.05) in prediction accuracy between three benchmark estimation models and CNN models. Compared with the best 1D- and 2D-CNN models, three benchmark estimation models have higher nRMSE, with a range of 16.36–21.24% for Cab, 16.59–22.96% for Car, and 66.85–82.13% for Canth. Similarly, three benchmark estimation models coupled with GA have higher accuracy than those using full-spectrum reflectance, particularly for the PLSR model. Among the three benchmark models, GPR provided the best estimates for Cab and Car (nRMSE close to 16%), followed by RFR and PLSR. However, all models performed poorly in Canth estimation (nRMSE > 65%).
The prediction accuracy of Cab and Car using the optimal 1D- and 2D-CNN models with four convolutional layers is summarized in Table 10. As the number of tree species and sample sizes increased, both models maintained robust performance in estimating Cab and Car (nRMSE < 18%), particularly when coupled with GA.

4. Discussion

4.1. The CNN Model Performance on Leaf Pigment Content

In this study, we developed CNN-based approaches using spectral reflectance to predict leaf pigment content across 28 tree species. The models achieved acceptable accuracy for leaf pigment estimates, with nRMSE ranging over 11.71–12.98% for Cab, 12.29–14.88% for Car, and 31.58–56.13% for Canth. These differences in accuracy stem from the CNN variant used (1D-CNNf or 2D-CNNf trained with the full spectrum (400–800 nm), and 1D-CNNGA or 2D-CNNGA trained with the GA-selected subset). Different tree species differ in growth, phenological development, and plant composition, resulting in large differences in spectral characteristics. These differences reflect the complexity of spectral–pigment relationships, which are difficult to model with a small number of features. This study shows that the proposed CNNs can be successfully applied to predict the pigment content of leaves from different tree species. Compared with PLSR, RFR, and GPR (nRMSE for Cab and Car > 16%), our proposed CNN models with large filters can extract more comprehensive spectral features and achieve more accurate pigment prediction, particularly for Canth. Overall, these results confirm that CNN-based approaches can automatically extract photosynthetic pigment information with satisfactory accuracy, especially for Cab and Car.
The CNN architecture with four convolutional layers achieved the best performance in pigment prediction. Similarly, Prilianti et al. (2021) reported comparable accuracy using a network architecture with only three convolutional layers [44], showing that, beyond a certain point, adding more convolutional layers resulted in larger errors during training. It can therefore be concluded that a relatively simple architecture is suitable for leaf pigment prediction. Another factor that cannot be ignored is the chosen optimization method, which strongly influences training performance [57]. SGDM can reduce overfitting in CNN model training, as systematically demonstrated by Prilianti et al. (2021) [44].
Four CNN variants for the prediction of Cab, Car, and Canth were compared in this study, differing in the representation of the input spectrum. The 1D-CNNf and 1D-CNNGA models used 1D spectral representations, while the 2D-CNNf and 2D-CNNGA models used 2D encoded matrices. The results showed that the 2D-CNN-based approaches achieved higher prediction accuracy than the 1D-CNN models, particularly when the important bands selected by the GA were used to construct the 2D matrices. This improvement can be attributed to the reflectance spectrum being a stationary signal with similar peak–valley patterns: when transformed into a 2D matrix, these spectral features are emphasized, enabling convolutional filters to better capture the relationships between reflectance peaks and valleys. Furthermore, the size of the 2D input influenced performance, with the 20 × 20 format yielding the highest accuracy (Figure 8), likely because square matrices provide uniform coverage during convolution and preserve complete structural features. The GA-selected important bands (450 ± 20 nm, 510 ± 5 nm, 550 ± 20 nm, 700 ± 20 nm, and >750 nm) were consistent with previously reported sensitive ranges [7,21]. Among all models, 2D-CNNGA achieved the best performance, especially for Canth. The ability of the 2D-CNN to highlight reflectance characteristics within pigment absorption regions not only enhances interpretability but also improves model robustness and generalization.

4.2. Influence of Pigment Distribution

The distribution of the measured leaf pigment content is critical for developing a robust deep learning-based regression model for parameter estimation. The widely distributed Cab and Car were estimated with good accuracy (nRMSE below about 15%), while larger errors (nRMSE > 30%) occurred for Canth, which has a narrow range. Moreover, the approximately normally distributed Cab and Car were predicted better than the skewed Canth. These results indicate that non-uniform sample distributions, particularly extreme cases, hinder the model's ability to capture features across the full range, as neuron weights become biased toward densely sampled regions. Furthermore, based on the prediction accuracies in this work, the influence of the distribution shape on the 1D-CNN models was greater than on the 2D-CNN models, because the features of 2D-encoded reflectance matrices are easier to capture than those of 1D reflectance signals, which accelerates model convergence.

4.3. Limitations and Future Work

For Cab and Car, both the multi-output 1D-CNN and 2D-CNN models achieved good prediction accuracy. By contrast, Canth showed poorer performance, with nRMSE around 55% for the 1D-CNN-based approach and around 32% for the 2D-CNN-based approach, mainly because its estimation can be affected by high levels of Cab and Car [58]. Sensitive band selection was a key factor influencing the performance of the CNN models. Féret et al. (2017) noted that chlorophyll exhibits two major absorption peaks in the blue (near 430 nm) and red (660–680 nm) regions, carotenoids absorb strongly in the blue region (400–500 nm), and anthocyanins show an absorption peak mainly in the blue–green region (500–550 nm) [13]. These absorption features partly overlap in the VIS region, especially in the blue–green region, which is critical for decoupling the effects of these pigments on leaf reflectance [7,59]. However, some of the GA-selected bands in this study (Figure 6) were located in low-absorption regions, introducing spectral noise. To mitigate these effects, spectral processing methods should be considered to improve pigment prediction. For example, the widely used derivative transformation could help locate spectral absorption features and reduce finer-scale noise [60]. Another method worth considering in the future is wavelet analysis. Wavelet transformations decompose a complex signal into component sub-signals, and several studies have demonstrated that continuous wavelet decomposition can more effectively characterize spectral variability at appropriate frequencies, thereby improving pigment prediction accuracy [61,62].
The goal of model transferability is to develop a model that can be applied across different locations and times. Thus, training a pigment prediction model requires sufficient data covering diverse geographical areas, tree species, and seasons. In this experiment, we obtained a robust CNN model with a dataset of 2094 samples across 28 broadleaf tree species. Fayad et al. (2021) similarly suggested that more than 1500 training samples are needed for the retrieval of biochemical parameters [39]. However, since all samples were collected during the growing season (April–September), the models cannot capture the spectral characteristics of autumn leaves with high anthocyanin content, limiting their accuracy in anthocyanin prediction and year-round monitoring. Incorporating samples from different seasons will therefore improve model transferability. In addition, spectral interpretation is complex because diverse species and geographic areas have different leaf optical properties [63,64]. A promising alternative is a hybrid method that trains deep learning algorithms on RTM simulations, combining the physical description of pigment-driven spectral responses with the high-level learning ability of deep networks. Finally, leaf samples were collected destructively in this study to obtain pigment reference data. While this approach ensures accurate ground truth, it constrains long-term, large-scale, and non-destructive monitoring. Advances in remote sensing offer a way forward, particularly through integrating data from multiple platforms and sensors (e.g., ground-based and UAV hyperspectral, RGB, and LiDAR) with artificial intelligence [40,41,42]. Such strategies are key to developing highly precise and transferable models for predicting plant physiological and biochemical traits across scales and environments.

5. Conclusions

Accurate estimates of leaf pigments provide valuable information about plant physiology. This study developed two types of convolutional neural networks (1D and 2D) coupled with a GA to predict leaf pigment content across different tree species from leaf reflectance spectra. The results indicate that the 2D-CNN-based approaches outperformed the 1D-CNN-based approaches in predicting Cab, Car, and Canth, with nRMSE reduced by between 0.6% and 14.7%. The 2D-CNN approach also produced more robust and lightweight results than the 1D-CNN model in this study. Furthermore, selecting important bands with the GA improved the CNNs' pigment prediction accuracy, especially for Canth estimation with the 2D-CNNGA model (improved by 9.85%). Additionally, model transferability and generalization are important when predicting on new datasets. In the future, larger, more diverse, and better-distributed databases spanning different tree species, seasons, and geographical areas will be considered to further enhance model performance.

Author Contributions

Conceptualization, Z.W.; methodology, Z.W. and D.X.; software, Z.W.; validation, Z.W. and D.X.; formal analysis, Z.W.; investigation, Z.W. and D.X.; resources, Z.W. and D.X.; data curation, Z.W.; writing—original draft preparation, Z.W.; writing—review and editing, Z.W. and D.X.; visualization, Z.W.; supervision, D.X.; funding acquisition, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Research Development Fund of Zhejiang A & F University (2025LFR021).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

We would like to thank Yingxiang Chen, Lijuan Li, and Mengxiang Zheng for their valuable assistance with the leaf biochemical and spectral measurements.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. nRMSE (%) of pigment estimation using optimal 1D- and 2D-CNN models with four convolutional layers for each tree species.
| Tree Species | 1D-CNNf Cab | 1D-CNNf Car | 1D-CNNf Canth | 1D-CNNGA Cab | 1D-CNNGA Car | 1D-CNNGA Canth | 2D-CNNf Cab | 2D-CNNf Car | 2D-CNNf Canth | 2D-CNNGA Cab | 2D-CNNGA Car | 2D-CNNGA Canth |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Osmanthus fragrans | 9.77 | 12.83 | 56.55 | 12.71 | 12.21 | 55.95 | 9.71 | 13.56 | 36.43 | 11.67 | 11.89 | 26.42 |
| Diospyros oleifera | 13.91 | 15.00 | 77.57 | 11.37 | 14.71 | 71.85 | 14.53 | 11.10 | 69.20 | 11.96 | 11.11 | 50.70 |
| Sapindus mukorossi | 12.29 | 11.45 | 40.07 | 10.87 | 11.82 | 45.35 | 11.69 | 9.70 | 46.03 | 12.79 | 9.06 | 36.12 |
| Phoebe sheareri | 15.43 | 11.63 | 55.61 | 18.90 | 12.14 | 61.06 | 13.79 | 9.13 | 45.56 | 15.44 | 9.47 | 44.67 |
| Ginkgo biloba | 12.22 | 15.07 | 45.48 | 11.84 | 13.92 | 47.29 | 9.84 | 10.89 | 27.21 | 10.38 | 12.94 | 25.87 |
| Glochidion puberum | 10.79 | 18.10 | 47.67 | 12.45 | 15.90 | 59.52 | 11.02 | 15.05 | 41.07 | 11.92 | 13.65 | 28.81 |
| Ilex chinensis | 12.02 | 14.02 | 66.66 | 11.96 | 12.73 | 56.75 | 12.45 | 9.75 | 54.75 | 9.91 | 13.90 | 47.11 |
| Carya illinoinensis | 18.71 | 14.61 | 79.00 | 14.42 | 13.40 | 91.82 | 14.93 | 13.31 | 64.85 | 13.66 | 11.29 | 43.03 |
| Quercus glauca | 11.56 | 18.85 | 39.89 | 12.16 | 18.55 | 51.23 | 12.07 | 8.74 | 45.12 | 10.27 | 15.79 | 29.62 |
| Acer buergerianum Miq. | 11.11 | 16.36 | 52.12 | 10.23 | 15.97 | 52.79 | 9.88 | 14.32 | 35.03 | 8.88 | 14.15 | 24.80 |
| Ligustrum lucidum | 10.35 | 13.08 | 40.86 | 8.89 | 13.07 | 50.21 | 12.68 | 12.17 | 23.70 | 9.57 | 9.90 | 17.53 |
| Cinnamomum camphora | 11.79 | 14.45 | 53.66 | 12.75 | 14.18 | 52.80 | 10.65 | 14.85 | 70.31 | 10.64 | 9.44 | 52.19 |
| Ailanthus altissima | 13.44 | 16.00 | 63.12 | 12.97 | 14.64 | 65.46 | 15.68 | 18.19 | 48.20 | 12.99 | 11.42 | 48.58 |
| Viburnum macrocephalum | 11.00 | 17.94 | 83.19 | 9.77 | 15.62 | 52.77 | 12.50 | 11.29 | 29.35 | 9.89 | 12.20 | 26.78 |
| Liquidambar formosana | 12.20 | 18.10 | 66.15 | 14.42 | 16.86 | 72.38 | 10.64 | 16.82 | 56.04 | 9.97 | 13.77 | 33.91 |
| Acer cinnamomifolium | 9.84 | 12.46 | 50.17 | 11.76 | 12.28 | 52.72 | 10.72 | 13.85 | 87.68 | 9.58 | 10.89 | 39.35 |
| Eriobotrya japonica | 12.55 | 12.43 | 67.32 | 14.95 | 11.41 | 74.86 | 13.99 | 9.07 | 54.47 | 12.15 | 9.91 | 41.20 |
| Photinia serratifolia | 15.23 | 14.22 | 39.22 | 11.76 | 11.53 | 37.40 | 15.80 | 13.14 | 43.97 | 12.75 | 11.13 | 27.60 |
| Michelia figo | 13.51 | 14.27 | 86.07 | 14.52 | 13.24 | 45.56 | 18.05 | 11.12 | 37.84 | 11.70 | 13.78 | 35.73 |
| Lagerstroemia indica | 10.90 | 14.91 | 40.17 | 11.10 | 13.00 | 46.01 | 9.42 | 12.64 | 44.84 | 10.19 | 11.12 | 31.09 |
| Prunus serrulata var. lannesiana | 17.68 | 15.57 | 57.43 | 9.87 | 12.86 | 54.80 | 13.91 | 9.04 | 32.40 | 12.21 | 10.70 | 33.67 |
| Gardenia jasminoides | 10.11 | 13.45 | 36.39 | 10.50 | 13.56 | 42.89 | 9.93 | 13.68 | 36.94 | 10.44 | 10.16 | 21.91 |
| Firmiana simplex | 14.62 | 13.87 | 77.83 | 13.86 | 13.37 | 76.80 | 12.85 | 14.49 | 71.57 | 12.44 | 12.05 | 38.60 |
| Liriodendron chinense | 15.72 | 12.68 | 38.69 | 16.21 | 13.59 | 47.55 | 15.17 | 12.61 | 22.07 | 16.03 | 13.76 | 15.56 |
| Yulania denudata | 10.68 | 12.83 | 45.47 | 11.38 | 11.50 | 48.49 | 11.13 | 14.46 | 41.10 | 10.31 | 12.01 | 25.47 |
| Magnolia grandiflora | 12.91 | 16.17 | 29.67 | 14.29 | 15.14 | 28.52 | 10.71 | 13.95 | 31.20 | 12.46 | 12.96 | 26.88 |
| Albizia julibrissin | 16.43 | 14.38 | 42.27 | 18.03 | 12.76 | 50.23 | 12.21 | 11.46 | 55.66 | 14.71 | 19.09 | 21.07 |
| Styphnolobium japonicum | 9.95 | 14.04 | 35.31 | 7.50 | 12.56 | 36.87 | 10.51 | 12.80 | 38.27 | 8.98 | 11.68 | 32.15 |
| Mean | 12.74 | 14.60 | 54.06 | 12.55 | 13.66 | 54.64 | 12.37 | 12.54 | 46.10 | 11.57 | 12.12 | 33.09 |
| SD | 2.39 | 1.94 | 15.84 | 2.53 | 1.71 | 13.32 | 2.19 | 2.36 | 15.59 | 1.85 | 2.12 | 9.79 |

References

  1. Post, W.M.; Peng, T.; Emanuel, W.R.; King, A.W.; Dale, V.H.; DeAngelis, D.L. The global carbon cycle. Am. Sci. 1990, 78, 310–326. [Google Scholar]
  2. Nelson, N.; Yocum, C.F. Structure and function of photosystems I and II. Annu. Rev. Plant Biol. 2006, 57, 521–565. [Google Scholar] [CrossRef]
  3. Kattenborn, T.; Schiefer, F.; Zarco-Tejada, P.J.; Schmidtlein, S. Advantages of retrieving pigment content [μg/cm2] versus concentration [%] from canopy reflectance. Remote Sens. Environ. 2019, 230, 111195. [Google Scholar] [CrossRef]
  4. Cutolo, E.A.; Guardini, Z.; Dall’Osto, L.; Bassi, R. A paler shade of green: Engineering cellular chlorophyll content to enhance photosynthesis in crowded environments. New Phytol. 2023, 239, 1567–1583. [Google Scholar] [CrossRef]
  5. Lopatin, J. Estimation of foliar carotenoid content using spectroscopy wavelet-based vegetation indices. IEEE Geosci. Remote Sens. Lett. 2023, 20, 2500405. [Google Scholar] [CrossRef]
  6. Yu, Z.C.; Lin, W.; Zheng, X.T.; Chow, W.S.; Luo, Y.N.; Cai, M.L.; Peng, C.L. The relationship between anthocyanin accumulation and photoprotection in young leaves of two dominant tree species in subtropical forests in different seasons. Photosynth. Res. 2021, 149, 41–55. [Google Scholar] [CrossRef] [PubMed]
  7. Gitelson, A.A.; Keydan, G.P.; Merzlyak, M.N. Three-band model for noninvasive estimation of chlorophyll, carotenoids, and anthocyanin contents in higher plant leaves. Geophys. Res. Lett. 2006, 33, L11402. [Google Scholar] [CrossRef]
  8. Jacquemoud, S.; Ustin, S.L.; Verdebout, J.; Schmuck, G.; Andreoli, G.; Hosgood, B. Estimating leaf biochemistry using the PROSPECT leaf optical properties model. Remote Sens. Environ. 1996, 56, 194–202. [Google Scholar] [CrossRef]
  9. Coops, N.C.; Smith, M.; Martin, M.E.; Ollinger, S.V. Prediction of eucalypt foliage nitrogen content from satellite-derived hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2003, 41, 1338–1346. [Google Scholar] [CrossRef]
  10. Sims, D.A.; Gamon, J.A. Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sens. Environ. 2002, 81, 337–354. [Google Scholar] [CrossRef]
  11. Gamon, J.A.; Field, C.B.; Fredeen, A.L.; Thayer, S. Assessing photosynthetic downregulation in sunflower stands with an optically-based model. Photosynth. Res. 2001, 67, 113–125. [Google Scholar] [CrossRef]
  12. Ustin, S.L.; Gitelson, A.A.; Jacquemoud, S.; Schaepman, M.; Asner, G.P.; Gamon, J.A.; Zarco-Tejada, P. Retrieval of foliar information about plant pigment systems from high resolution spectroscopy. Remote Sens. Environ. 2009, 113, S67–S77. [Google Scholar] [CrossRef]
  13. Féret, J.B.; Gitelson, A.A.; Noble, S.D.; Jacquemoud, S. PROSPECT-D: Towards modeling leaf optical properties through a complete lifecycle. Remote Sens. Environ. 2017, 193, 204–215. [Google Scholar]
  14. Gitelson, A.A.; Merzlyak, M.N. Non-Destructive Assessment of Chlorophyll Carotenoid and Anthocyanin Content in Higher Plant Leaves: Principles and Algorithms; Peripheral Editions: Ella, Greece, 2004. [Google Scholar]
  15. Blackburn, G.A. Quantifying chlorophylls and carotenoids at leaf and canopy scales. Remote Sens. Environ. 1998, 66, 273–285. [Google Scholar]
  16. Croft, H.; Chen, J.M.; Zhang, Y.; Simic, A. Modelling leaf chlorophyll content in broadleaf and needle leaf canopies from ground, CASI, Landsat TM 5 and MERIS reflectance data. Remote Sens. Environ. 2013, 133, 128–140. [Google Scholar]
  17. Bhadra, S.; Sagan, V.; Sarkar, S.; Braud, M.; Mockler, T.C.; Eveland, A.L. PROSAIL-Net: A transfer learning-based dual stream neural network to estimate leaf chlorophyll and leaf angle of crops from UAV hyperspectral images. ISPRS J. Photogramm. Remote Sens. 2024, 210, 1–24. [Google Scholar]
  18. Garrity, S.R.; Eitel, J.U.; Vierling, L.A. Disentangling the relationships between plant pigments and the photochemical reflectance index reveals a new approach for remote estimation of carotenoid content. Remote Sens. Environ. 2011, 115, 628–635. [Google Scholar]
  19. Zhou, X.; Huang, W.; Kong, W.; Ye, H.; Dong, Y.; Casa, R. Assessment of leaf carotenoids content with a new carotenoid index: Development and validation on experimental and model data. Int. J. Appl. Earth Obs. Geoinf. 2017, 57, 24–35. [Google Scholar] [CrossRef]
  20. Gitelson, A. Towards a generic approach to remote non-invasive estimation of foliar carotenoid-to-chlorophyll ratio. J. Plant Physiol. 2020, 252, 153227. [Google Scholar] [PubMed]
  21. Li, X.; Wei, Z.; Peng, F.; Liu, J.; Han, G. Non-destructive prediction and visualization of anthocyanin content in mulberry fruits using hyperspectral imaging. Front. Plant Sci. 2023, 14, 1137198. [Google Scholar] [CrossRef]
  22. Piccolo, E.L.; Matteoli, S.; Landi, M.; Guidi, L.; Massai, R.; Remorini, D. Measurements of Anthocyanin Content of Prunus Leaves Using Proximal Sensing Spectroscopy and Statistical Machine Learning. IEEE Trans. Instrum. Meas. 2022, 71, 2508110. [Google Scholar] [CrossRef]
  23. Croft, H.; Chen, J.M.; Zhang, Y. The applicability of empirical vegetation indices for determining leaf chlorophyll content over different leaf and canopy structures. Ecol. Complex. 2014, 17, 119–130. [Google Scholar] [CrossRef]
  24. Gara, T.W.; Darvishzadeh, R.; Skidmore, A.K.; Wang, T. Impact of vertical canopy position on leaf spectral properties and traits across multiple species. Remote Sens. 2018, 10, 346. [Google Scholar] [CrossRef]
  25. Li, Y.; Huang, J. Leaf anthocyanin content retrieval with partial least squares and gaussian process regression from spectral reflectance data. Sensors 2021, 21, 3078. [Google Scholar] [CrossRef]
  26. Feilhauer, H.; Asner, G.P.; Martin, R.E. Multi-method ensemble selection of spectral bands related to leaf biochemistry. Remote Sens. Environ. 2015, 164, 57–65. [Google Scholar] [CrossRef]
  27. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemometr. Intell. Lab. 2001, 58, 109–130. [Google Scholar] [CrossRef]
  28. Caicedo, J.P.R.; Verrelst, J.; Muñoz-Marí, J.; Moreno, J.; Camps-Valls, G. Toward a semiautomatic machine learning retrieval of biophysical parameters. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1249–1259. [Google Scholar] [CrossRef]
  29. Van Wittenberghe, S.; Verrelst, J.; Rivera, J.P.; Alonso, L.; Moreno, J.; Samson, R. Gaussian processes retrieval of leaf parameters from a multi-species reflectance, absorbance and fluorescence dataset. J. Photochem. Photobiol. B Biol. 2014, 134, 37–48. [Google Scholar] [CrossRef] [PubMed]
  30. Verrelst, J.; Rivera, J.P.; Gitelson, A.; Delegido, J.; Moreno, J.; Camps-Valls, G. Spectral band selection for vegetation properties retrieval using Gaussian processes regression. Int. J. Appl. Earth Obs. Geoinf. 2016, 52, 554–567. [Google Scholar] [CrossRef]
  31. Wan, L.; Zhou, W.; He, Y.; Wanger, T.C.; Cen, H. Combining transfer learning and hyperspectral reflectance analysis to assess leaf nitrogen concentration across different plant species datasets. Remote Sens. Environ. 2022, 269, 112826. [Google Scholar] [CrossRef]
  32. Jacquemoud, S.; Baret, F. PROSPECT: A model of leaf optical properties spectra. Remote Sens. Environ. 1990, 34, 75–91. [Google Scholar] [CrossRef]
  33. Dawson, T.P.; Curran, P.J.; Plummer, S.E. LIBERTY—Modeling the effects of leaf biochemical concentration on reflectance spectra. Remote Sens. Environ. 1998, 65, 50–60. [Google Scholar] [CrossRef]
  34. Feret, J.; François, C.; Asner, G.P.; Gitelson, A.A.; Martin, R.E.; Bidel, L.P.; Ustin, S.L.; Le Maire, G.; Jacquemoud, S. PROSPECT-4 and 5: Advances in the leaf optical properties model separating photosynthetic pigments. Remote Sens. Environ. 2008, 112, 3030–3043. [Google Scholar] [CrossRef]
  35. Féret, J.; Berger, K.; de Boissieu, F.; Malenovský, Z. PROSPECT-PRO for estimating content of nitrogen-containing leaf proteins and other carbon-based constituents. Remote Sens. Environ. 2021, 252, 112173. [Google Scholar] [CrossRef]
  36. Li, P.; Wang, Q. Retrieval of leaf biochemical parameters using PROSPECT inversion: A new approach for alleviating ill-posed problems. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2499–2506. [Google Scholar]
  37. LeCun, Y.; Bengio, Y. Convolutional networks for images, speech, and time series. Handb. Brain Theory Neural Netw. 1995, 3361, 1995. [Google Scholar]
  38. Kim, N.; Ha, K.; Park, N.; Cho, J.; Hong, S.; Lee, Y. A comparison between major artificial intelligence models for crop yield prediction: Case study of the midwestern United States, 2006–2015. ISPRS Int. J. Geo-Inf. 2019, 8, 240. [Google Scholar] [CrossRef]
  39. Fayad, I.; Ienco, D.; Baghdadi, N.; Gaetano, R.; Alvares, C.A.; Stape, J.L.; Scolforo, H.F.; Le Maire, G. A CNN-based approach for the estimation of canopy heights and wood volume from GEDI waveforms. Remote Sens. Environ. 2021, 265, 112652. [Google Scholar] [CrossRef]
  40. Wang, R.; Qu, H.; Su, W. From sensors to insights: Technological trends in image-based high-throughput plant phenotyping. Smart Agric. Technol. 2025, 12, 101257. [Google Scholar] [CrossRef]
  41. Zhou, Z.; Zhang, H.; Bian, L.; Zhou, L.; Ge, Y. Integrating sensor fusion with machine learning for comprehensive assessment of phenotypic traits and drought response in poplar species. Plant Biotechnol. J. 2025, 23, 2464–2481. [Google Scholar] [CrossRef]
  42. Zhang, R.; Jin, S.; Wang, Y.; Zang, J.; Wang, Y.; Zhao, R.; Su, Y.; Wu, J.; Wang, X.; Jiang, D. PhenoSR: Enhancing organ-level phenotyping with super-resolution RGB UAV imagery for large-scale field experiments. ISPRS J. Photogramm. Remote Sens. 2025, 228, 582–602. [Google Scholar] [CrossRef]
  43. Pullanagari, R.R.; Dehghan-Shoar, M.; Yule, I.J.; Bhatia, N. Field spectroscopy of canopy nitrogen concentration in temperate grasslands using a convolutional neural network. Remote Sens. Environ. 2021, 257, 112353. [Google Scholar] [CrossRef]
  44. Prilianti, K.R.; Setiyono, E.; Kelana, O.H.; Brotosudarmo, T.H.P. Deep chemometrics for nondestructive photosynthetic pigments prediction using leaf reflectance spectra. Inf. Process. Agric. 2021, 8, 194–204. [Google Scholar] [CrossRef]
  45. Barman, U. Deep Convolutional neural network (CNN) in tea leaf chlorophyll estimation: A new direction of modern tea farming in Assam, India. J. Appl. Nat. Sci. 2021, 13, 1059–1064. [Google Scholar] [CrossRef]
  46. Shi, S.; Xu, L.; Gong, W.; Chen, B.; Chen, B.; Qu, F.; Tang, X.; Sun, J.; Yang, J. A convolution neural network for forest leaf chlorophyll and carotenoid estimation using hyperspectral reflectance. Int. J. Appl. Earth Obs. Geoinf. 2022, 108, 102719. [Google Scholar] [CrossRef]
  47. Annala, L.; Honkavaara, E.; Tuominen, S.; Pölönen, I. Chlorophyll concentration retrieval by training convolutional neural network for stochastic model of leaf optical properties (SLOP) inversion. Remote Sens. 2020, 12, 283. [Google Scholar] [CrossRef]
  48. Lichtenthaler, H.K. Chlorophylls and carotenoids: Pigments of photosynthetic biomembranes. Methods Enzymol. 1987, 148, 350–382. [Google Scholar]
  49. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  50. Ng, W.; Minasny, B.; Montazerolghaem, M.; Padarian, J.; Ferguson, R.; Bailey, S.; McBratney, A.B. Convolutional neural network for simultaneous prediction of several soil properties using visible/near-infrared, mid-infrared, and their combined spectra. Geoderma 2019, 352, 251–267. [Google Scholar] [CrossRef]
  51. Radosavovic, I.; Kosaraju, R.P.; Girshick, R.; He, K.; Dollár, P. Designing network design spaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10428–10436. [Google Scholar]
  52. Padarian, J.; Minasny, B.; McBratney, A.B. Using deep learning to predict soil properties from regional spectral data. Geoderma Reg. 2019, 16, e00198. [Google Scholar] [CrossRef]
  53. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning: PMLR, Lille, France, 7–9 July 2015; pp. 448–456. [Google Scholar]
  54. Holland, J.H. Genetic algorithms. Sci. Am. 1992, 267, 66–73. [Google Scholar] [CrossRef]
  55. Leardi, R.; Gonzalez, A.L. Genetic algorithms applied to feature selection in PLS regression: How and when to use them. Chemometr. Intell. Lab. 1998, 41, 195–207. [Google Scholar] [CrossRef]
  56. Hosgood, B.; Jacquemoud, S.; Andreoli, G.; Verdebout, J.; Pedrini, G.; Schmuck, G. Leaf Optical properties Experiment 93 (LOPEX93); Report EUR; European Commission: Brussels, Belgium, 1995; Volume 16095, pp. 1–46. [Google Scholar]
  57. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2016, arXiv:1609.04747. [Google Scholar]
  58. Huang, W.; Lin, K.; Hsu, M.; Huang, M.; Yang, Z.; Chao, P.; Yang, C. Eliminating interference by anthocyanin in chlorophyll estimation of sweet potato (Ipomoea batatas L.) leaves. Bot. Stud. 2014, 55, 11. [Google Scholar] [CrossRef]
  59. Xiao, Y.; Zhao, W.; Zhou, D.; Gong, H. Sensitivity analysis of vegetation reflectance to biochemical and biophysical variables at leaf, canopy, and regional scales. IEEE Trans. Geosci. Remote Sens. 2013, 52, 4014–4024. [Google Scholar] [CrossRef]
  60. Tsai, F.; Philpot, W. Derivative analysis of hyperspectral data. Remote Sens. Environ. 1998, 66, 41–51. [Google Scholar] [CrossRef]
  61. Blackburn, G.A.; Ferwerda, J.G. Retrieval of chlorophyll concentration from leaf reflectance spectra using wavelet analysis. Remote Sens. Environ. 2008, 112, 1614–1632. [Google Scholar] [CrossRef]
  62. Ali, A.M.; Skidmore, A.K.; Darvishzadeh, R.; van Duren, I.; Holzwarth, S.; Mueller, J. Retrieval of forest leaf functional traits from HySpex imagery using radiative transfer models and continuous wavelet analysis. ISPRS J. Photogramm. Remote Sens. 2016, 122, 68–80. [Google Scholar] [CrossRef]
  63. Féret, J.; François, C.; Gitelson, A.; Asner, G.P.; Barry, K.M.; Panigada, C.; Richardson, A.D.; Jacquemoud, S. Optimizing spectral indices and chemometric analysis of leaf chemical properties using radiative transfer modeling. Remote Sens. Environ. 2011, 115, 2742–2750. [Google Scholar] [CrossRef]
  64. Zhao, K.; Valle, D.; Popescu, S.; Zhang, X.; Mallick, B. Hyperspectral remote sensing of plant biochemistry using Bayesian model averaging with variable and band selection. Remote Sens. Environ. 2013, 132, 102–119. [Google Scholar] [CrossRef]
Figure 1. The geographical locations of the leaf sampling sites across Hangzhou City. EBF: Evergreen Broadleaf Forest, DNF: Deciduous Needleleaf Forest, DBF: Deciduous Broadleaf Forest, MF: Mixed Forest, ENF: Evergreen Needleleaf Forest, and SHR: Shrubland.
Figure 2. The architecture of the 1D-CNN.
Figure 3. The architecture of the 2D-CNN.
Figure 4. The distribution of pigment contents from the measured dataset: (a) leaf chlorophyll content (Cab), (b) leaf carotenoid content (Car), and (c) leaf anthocyanin content (Canth).
Figure 5. The average reflectance of the 28 tree species in the wavelength range between 400 and 800 nm.
Figure 6. The important bands selected for predicting pigment contents. The specific absorption coefficients of the three pigments were extracted from the PROSPECT-D model.
Figure 7. Scatterplots of measured vs. predicted values using the optimal 1D-CNN models with four convolutional layers, displayed in three columns: (a) leaf chlorophyll content (Cab), (b) leaf carotenoid content (Car), and (c) leaf anthocyanin content (Canth). The rows correspond to 1D-CNNf and 1D-CNNGA, respectively. The dashed line indicates the 1:1 relationship between observed and predicted values, the solid line shows the fitted regression curve, and the gray area denotes the 95% confidence interval.
Figure 8. Scatterplots of measured vs. predicted values using optimal 2D-CNN models with different dimensions of 2D matrix input, displayed in three columns: (a1,a2) leaf chlorophyll content (Cab), (b1,b2) leaf carotenoid content (Car), and (c1,c2) leaf anthocyanin content (Canth). The dashed line indicates the 1:1 relationship between observed and predicted values, the solid line shows the fitted regression curve, and the gray area denotes the 95% confidence interval.
Table 1. Tree species measured in this study (n = 2094).

| Tree Species | Number of Samples | Tree Species | Number of Samples |
|---|---|---|---|
| Osmanthus fragrans | 74 | Liquidambar formosana | 74 |
| Diospyros oleifera | 80 | Acer cinnamomifolium | 75 |
| Sapindus mukorossi | 68 | Eriobotrya japonica | 76 |
| Phoebe sheareri | 74 | Photinia serratifolia | 77 |
| Ginkgo biloba | 75 | Michelia figo | 74 |
| Glochidion puberum | 77 | Lagerstroemia indica | 77 |
| Ilex chinensis | 74 | Prunus serrulata var. lannesiana | 70 |
| Carya illinoinensis | 72 | Gardenia jasminoides | 68 |
| Quercus glauca | 78 | Firmiana simplex | 74 |
| Acer buergerianum Miq. | 79 | Liriodendron chinense | 76 |
| Ligustrum lucidum | 80 | Yulania denudata | 77 |
| Cinnamomum camphora | 73 | Magnolia grandiflora | 73 |
| Ailanthus altissima | 75 | Albizia julibrissin | 76 |
| Viburnum macrocephalum | 74 | Styphnolobium japonicum | 74 |
Table 2. The parameters of the one-dimensional convolutional neural network (1D-CNN).

| Layer Type | Size | Filters | Activation Function |
|---|---|---|---|
| Input layer | 1 × 400 | - | - |
| Conv1 | 1 × 5 | 96 | ReLU |
| Dropout (0.5) | - | - | - |
| Conv2 | 1 × 3 | 96 | ReLU |
| Dropout (0.5) | - | - | - |
| Conv3 | 1 × 3 | 192 | ReLU |
| Conv4 | 1 × 3 | 192 | ReLU |
| Conv5 | 1 × 3 | 256 | ReLU |
| Dropout (0.5) | - | - | - |
| Fully1 | 1 × 4096 | - | ReLU |
| Fully2 | 1 × 4096 | - | ReLU |
| Fully3 | 1 × 3 | - | Linear |
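For concreteness, a minimal sketch of the Table 2 stack is given below in Keras. The framework choice, "valid" padding, optimizer, and loss are assumptions; the table itself fixes only the layer types, kernel sizes, filter counts, dropout rates, and activations.

```python
# A minimal sketch of the 1D-CNN in Table 2 (framework and training settings assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_1d_cnn(n_bands: int = 400, n_outputs: int = 3) -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=(n_bands, 1)),              # 1 x 400 reflectance spectrum
        layers.Conv1D(96, 5, activation="relu"),       # Conv1: 1 x 5, 96 filters
        layers.Dropout(0.5),
        layers.Conv1D(96, 3, activation="relu"),       # Conv2: 1 x 3, 96 filters
        layers.Dropout(0.5),
        layers.Conv1D(192, 3, activation="relu"),      # Conv3: 1 x 3, 192 filters
        layers.Conv1D(192, 3, activation="relu"),      # Conv4: 1 x 3, 192 filters
        layers.Conv1D(256, 3, activation="relu"),      # Conv5: 1 x 3, 256 filters
        layers.Dropout(0.5),
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),         # Fully1
        layers.Dense(4096, activation="relu"),         # Fully2
        layers.Dense(n_outputs, activation="linear"),  # Fully3: Cab, Car, Canth
    ])
    model.compile(optimizer="adam", loss="mse")        # optimizer and loss assumed
    return model
```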
Table 3. The parameters of the two-dimensional convolutional neural network (2D-CNN) model.

| Layer Type | Size | Filters | Activation Function |
|---|---|---|---|
| Input layer | 20 × 20 | - | - |
| Conv1 | 5 × 5 | 96 | ReLU |
| BatchNormalization | - | - | - |
| Dropout (0.5) | - | - | - |
| Conv2 | 3 × 3 | 96 | ReLU |
| BatchNormalization | - | - | - |
| Dropout (0.5) | - | - | - |
| Conv3 | 3 × 3 | 192 | ReLU |
| BatchNormalization | - | - | - |
| Conv4 | 3 × 3 | 192 | ReLU |
| BatchNormalization | - | - | - |
| Conv5 | 3 × 3 | 256 | ReLU |
| BatchNormalization | - | - | - |
| Dropout (0.5) | - | - | - |
| Fully1 | 1 × 4096 | - | ReLU |
| Fully2 | 1 × 4096 | - | ReLU |
| Fully3 | 1 × 3 | - | Linear |
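A matching sketch for the Table 3 stack follows, including the step of folding a 400-band spectrum into a 2D matrix (20 × 20 here; 40 × 10, 10 × 40, 50 × 8, and 8 × 50 appear in Table 8). The row-major fill order, "same" padding, and training settings are assumptions not fixed by the table.

```python
# A minimal sketch of the 2D-CNN in Table 3 (fill order, padding, and training assumed).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def spectra_to_matrices(spectra: np.ndarray, shape=(20, 20)) -> np.ndarray:
    """Fold (n_samples, 400) spectra into (n_samples, rows, cols, 1) inputs."""
    return spectra.reshape(-1, shape[0], shape[1], 1)

def build_2d_cnn(input_shape=(20, 20, 1), n_outputs: int = 3) -> tf.keras.Model:
    conv = lambda f, k: layers.Conv2D(f, k, padding="same", activation="relu")
    model = models.Sequential([
        layers.Input(shape=input_shape),
        conv(96, (5, 5)),                              # Conv1
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        conv(96, (3, 3)),                              # Conv2
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        conv(192, (3, 3)),                             # Conv3
        layers.BatchNormalization(),
        conv(192, (3, 3)),                             # Conv4
        layers.BatchNormalization(),
        conv(256, (3, 3)),                             # Conv5
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),         # Fully1
        layers.Dense(4096, activation="relu"),         # Fully2
        layers.Dense(n_outputs, activation="linear"),  # Fully3: Cab, Car, Canth
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

The same builder covers the other input dimensions of Table 8 by changing `shape` and `input_shape`.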
Table 4. Five strategies of the CNN model with different numbers of convolutional layers for leaf pigment estimation.

| Strategies | A | B | C | D | E |
|---|---|---|---|---|---|
| Layers | Input layer | Input layer | Input layer | Input layer | Input layer |
| | Conv1 | Conv1 | Conv1 | Conv1 | Conv1 |
| | Dropout | Dropout | Dropout | Dropout | Dropout |
| | - | Conv2 | Conv2 | Conv2 | Conv2 |
| | - | Dropout | Dropout | Dropout | Dropout |
| | - | - | Conv3 | Conv3 | Conv3 |
| | - | - | - | Conv4 | Conv4 |
| | - | - | - | - | Conv5 |
| | - | - | - | - | Dropout |
| | Fully1 | Fully1 | Fully1 | Fully1 | Fully1 |
| | Fully2 | Fully2 | Fully2 | Fully2 | Fully2 |
| | Fully3 | Fully3 | Fully3 | Fully3 | Fully3 |
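Strategies A–E differ only in how many convolutional blocks are kept, so a single parameterized builder covers Table 4. This is a sketch under the same assumptions as the snippets above; the 1D variant is shown, with dropout after Conv1, Conv2, and Conv5 as listed in the table.

```python
# A sketch of strategies A-E in Table 4: keep the first n_conv (1-5) conv layers.
from tensorflow.keras import layers, models

def build_strategy(n_conv: int, n_bands: int = 400, n_outputs: int = 3):
    specs = [(96, 5), (96, 3), (192, 3), (192, 3), (256, 3)]  # (filters, kernel)
    dropout_after = {1, 2, 5}            # dropout positions listed in Table 4
    model = models.Sequential([layers.Input(shape=(n_bands, 1))])
    for i, (filters, kernel) in enumerate(specs[:n_conv], start=1):
        model.add(layers.Conv1D(filters, kernel, activation="relu"))
        if i in dropout_after:
            model.add(layers.Dropout(0.5))
    model.add(layers.Flatten())
    model.add(layers.Dense(4096, activation="relu"))          # Fully1
    model.add(layers.Dense(4096, activation="relu"))          # Fully2
    model.add(layers.Dense(n_outputs, activation="linear"))   # Fully3
    return model

model_d = build_strategy(n_conv=4)       # e.g., strategy D
```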
Table 5. List of tested models.

| Models | Data Used |
|---|---|
| 1D-CNNf | full spectrum (400–800 nm) |
| 1D-CNNGA | GA-selected subset of the spectrum (400–800 nm) |
| 2D-CNNf | full spectrum (400–800 nm) in 2D representation |
| 2D-CNNGA | GA-selected subset of the spectrum (400–800 nm) in 2D representation |
| PLSRf | full spectrum (400–800 nm) |
| PLSRGA | GA-selected subset of the spectrum (400–800 nm) |
| RFRf | full spectrum (400–800 nm) |
| RFRGA | GA-selected subset of the spectrum (400–800 nm) |
| GPRf | full spectrum (400–800 nm) |
| GPRGA | GA-selected subset of the spectrum (400–800 nm) |
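For the GA-based band selection used by the "GA" models above, the article's exact configuration is given in its Methods section rather than here. As an illustration only, the sketch below implements the classic wrapper scheme of refs. 54 and 55: a binary chromosome marks which of the 400 bands are used, and fitness is the cross-validated error of a simple regressor on the selected bands. The wrapped model (PLSR here), population size, generation count, and mutation rate are all assumptions.

```python
# An illustrative GA wrapper for spectral band selection (settings assumed).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """Negative cross-validated RMSE of PLSR on the selected bands."""
    if mask.sum() < 5:                     # require a minimal band subset
        return -np.inf
    pls = PLSRegression(n_components=min(10, int(mask.sum())))
    return cross_val_score(pls, X[:, mask], y, cv=5,
                           scoring="neg_root_mean_squared_error").mean()

def ga_select(X, y, n_pop=40, n_gen=50, p_mut=0.01):
    """Evolve a boolean mask over the spectral bands; return the best mask."""
    n_bands = X.shape[1]
    pop = rng.random((n_pop, n_bands)) < 0.3      # start with ~30% of bands on
    for _ in range(n_gen):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: n_pop // 2]]  # keep best half
        children = []
        for _ in range(n_pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_bands)                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child = child ^ (rng.random(n_bands) < p_mut)  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[np.argmax(scores)]

# Hypothetical usage (names illustrative): mask = ga_select(reflectance, cab)
```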
Table 6. Statistics of measured leaf pigment contents (n = 2094).

| Pigments | Mean | SD | Max | Min | Range |
|---|---|---|---|---|---|
| Cab (µg/cm2) | 52.51 | 21.78 | 97.73 | 15.50 | 82.23 |
| Car (µg/cm2) | 8.33 | 2.49 | 14.36 | 4.15 | 10.21 |
| Canth (µg/cm2) | 1.18 | 1.15 | 7.18 | 0.02 | 7.16 |

Note: SD = standard deviation; Max = maximum; Min = minimum.
Table 7. Accuracy of pigment estimation using 1D-CNN models with different numbers of convolutional layers. For each pigment, the columns give R2, RMSE (µg/cm2), and nRMSE (%).

| Models | Strategy (Nconv) | Cab R2 | Cab RMSE | Cab nRMSE | Car R2 | Car RMSE | Car nRMSE | Canth R2 | Canth RMSE | Canth nRMSE |
|---|---|---|---|---|---|---|---|---|---|---|
| 1D-CNNf | A (1) | 0.88 | 7.67 | 14.72 | 0.70 | 1.47 | 17.52 | 0.05 | 1.05 | 93.51 |
| | B (2) | 0.89 | 7.38 | 14.16 | 0.73 | 1.37 | 16.31 | 0.46 | 0.86 | 76.88 |
| | C (3) | 0.90 | 7.41 | 14.22 | 0.75 | 1.29 | 15.36 | 0.54 | 0.78 | 69.48 |
| | **D (4)** | 0.91 | 6.75 | 12.94 | 0.77 | 1.25 | 14.88 | 0.68 | 0.63 | 56.13 |
| | E (5) | 0.90 | 14.25 | 27.34 | 0.58 | 1.96 | 23.33 | 0.23 | 1.00 | 89.13 |
| 1D-CNNGA | A (1) | 0.91 | 6.82 | 13.08 | 0.78 | 1.28 | 15.28 | 0.11 | 1.02 | 90.87 |
| | B (2) | 0.91 | 6.67 | 12.81 | 0.78 | 1.25 | 14.89 | 0.55 | 0.79 | 71.22 |
| | C (3) | 0.91 | 6.74 | 12.93 | 0.80 | 1.17 | 14.02 | 0.59 | 0.72 | 64.53 |
| | **D (4)** | 0.92 | 6.66 | 12.79 | 0.81 | 1.13 | 13.53 | 0.72 | 0.62 | 55.34 |
| | E (5) | 0.91 | 6.70 | 12.85 | 0.80 | 1.19 | 14.26 | 0.51 | 0.78 | 69.63 |

Note: Bold indicates the best strategy for each model.
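A note on the metrics reported in Tables 7–10: R2, RMSE, and nRMSE are given for each pigment. The sketch below assumes nRMSE is RMSE normalized by the mean of the measured values and expressed in percent, which is approximately consistent with the reported Cab and Car values given the Table 6 means; if the article instead normalizes by the data range, only the denominator changes.

```python
# A sketch of the accuracy metrics (nRMSE normalization assumed to be by the mean).
import numpy as np
from sklearn.metrics import r2_score

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, nRMSE in %) for measured vs. predicted pigment content."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
    nrmse = 100.0 * rmse / y_true.mean()   # percent, mean-normalized (assumption)
    return r2_score(y_true, y_pred), rmse, nrmse
```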
Table 8. Accuracy of pigment estimation using 2D-CNN models with different numbers of convolutional layers. For each pigment, the columns give R2, RMSE (µg/cm2), and nRMSE (%).

| Models | Strategy (Nconv) | Cab R2 | Cab RMSE | Cab nRMSE | Car R2 | Car RMSE | Car nRMSE | Canth R2 | Canth RMSE | Canth nRMSE |
|---|---|---|---|---|---|---|---|---|---|---|
| 2D-CNNf (20 × 20) | A (1) | 0.91 | 6.92 | 13.28 | 0.78 | 1.07 | 12.70 | 0.82 | 0.49 | 43.72 |
| | B (2) | 0.92 | 7.08 | 13.58 | 0.78 | 1.10 | 13.04 | 0.76 | 0.56 | 49.90 |
| | C (3) | 0.92 | 6.52 | 12.50 | 0.80 | 1.12 | 13.34 | 0.82 | 0.47 | 41.89 |
| | **D (4)** | 0.92 | 6.43 | 12.35 | 0.81 | 1.07 | 12.77 | 0.84 | 0.46 | 41.43 |
| | E (5) | 0.92 | 7.30 | 14.01 | 0.80 | 1.12 | 13.38 | 0.82 | 0.53 | 46.90 |
| 2D-CNNGA (20 × 20) | A (1) | 0.91 | 6.98 | 13.40 | 0.83 | 1.05 | 12.53 | 0.88 | 0.37 | 32.99 |
| | B (2) | 0.92 | 6.38 | 12.24 | 0.82 | 1.09 | 12.98 | 0.87 | 0.40 | 35.29 |
| | C (3) | 0.92 | 6.23 | 11.95 | 0.83 | 1.16 | 13.80 | 0.89 | 0.37 | 33.14 |
| | **D (4)** | 0.92 | 6.10 | 11.71 | 0.84 | 1.03 | 12.29 | 0.89 | 0.35 | 31.58 |
| | E (5) | 0.92 | 6.81 | 13.07 | 0.82 | 1.11 | 13.29 | 0.89 | 0.36 | 31.84 |
| 2D-CNNf (40 × 10) | A (1) | 0.92 | 6.75 | 12.96 | 0.82 | 1.12 | 13.31 | 0.80 | 0.50 | 44.54 |
| | **B (2)** | 0.92 | 6.67 | 12.79 | 0.83 | 1.07 | 12.78 | 0.81 | 0.48 | 42.46 |
| | C (3) | 0.92 | 6.78 | 13.01 | 0.82 | 1.10 | 13.13 | 0.78 | 0.61 | 54.02 |
| | D (4) | 0.91 | 7.16 | 13.75 | 0.81 | 1.11 | 13.23 | 0.78 | 0.54 | 47.79 |
| | E (5) | 0.92 | 6.88 | 13.20 | 0.84 | 1.08 | 12.85 | 0.79 | 0.52 | 46.36 |
| 2D-CNNGA (40 × 10) | A (1) | 0.92 | 6.46 | 12.39 | 0.82 | 1.11 | 13.24 | 0.86 | 0.41 | 37.05 |
| | **B (2)** | 0.92 | 6.36 | 12.20 | 0.84 | 1.05 | 12.49 | 0.88 | 0.37 | 33.25 |
| | C (3) | 0.92 | 6.38 | 12.24 | 0.83 | 1.07 | 12.76 | 0.85 | 0.42 | 37.80 |
| | D (4) | 0.92 | 6.47 | 12.41 | 0.81 | 1.12 | 13.39 | 0.84 | 0.43 | 38.38 |
| | E (5) | 0.92 | 7.13 | 13.68 | 0.82 | 1.08 | 12.83 | 0.91 | 0.43 | 38.61 |
| 2D-CNNf (10 × 40) | A (1) | 0.91 | 6.84 | 13.12 | 0.81 | 1.13 | 13.42 | 0.82 | 0.52 | 46.06 |
| | B (2) | 0.91 | 7.31 | 14.03 | 0.81 | 1.13 | 13.45 | 0.86 | 0.53 | 47.75 |
| | **C (3)** | 0.92 | 6.63 | 12.72 | 0.82 | 1.10 | 13.14 | 0.83 | 0.47 | 41.82 |
| | D (4) | 0.91 | 6.79 | 13.04 | 0.81 | 1.16 | 13.78 | 0.78 | 0.51 | 45.26 |
| | E (5) | 0.91 | 6.71 | 12.88 | 0.82 | 1.13 | 13.43 | 0.83 | 0.54 | 48.36 |
| 2D-CNNGA (10 × 40) | A (1) | 0.91 | 6.65 | 12.76 | 0.80 | 1.18 | 14.02 | 0.89 | 0.42 | 37.65 |
| | B (2) | 0.92 | 6.75 | 12.96 | 0.81 | 1.15 | 13.68 | 0.88 | 0.40 | 35.48 |
| | **C (3)** | 0.92 | 6.49 | 12.45 | 0.82 | 1.13 | 13.39 | 0.88 | 0.38 | 33.98 |
| | D (4) | 0.92 | 6.68 | 12.81 | 0.79 | 1.20 | 14.23 | 0.84 | 0.43 | 38.30 |
| | E (5) | 0.92 | 6.66 | 12.77 | 0.80 | 1.14 | 13.54 | 0.86 | 0.40 | 35.90 |
| 2D-CNNf (50 × 8) | A (1) | 0.91 | 6.81 | 13.06 | 0.83 | 1.16 | 13.85 | 0.79 | 0.52 | 46.16 |
| | B (2) | 0.92 | 6.72 | 12.89 | 0.84 | 1.08 | 12.83 | 0.77 | 0.56 | 49.96 |
| | **C (3)** | 0.92 | 6.64 | 12.74 | 0.84 | 1.06 | 12.60 | 0.80 | 0.48 | 42.74 |
| | D (4) | 0.90 | 7.31 | 14.02 | 0.83 | 1.08 | 12.90 | 0.77 | 0.57 | 50.61 |
| | E (5) | 0.92 | 6.65 | 12.76 | 0.83 | 1.07 | 12.72 | 0.80 | 0.49 | 43.62 |
| 2D-CNNGA (50 × 8) | A (1) | 0.92 | 6.41 | 12.30 | 0.83 | 1.09 | 12.99 | 0.81 | 0.46 | 41.36 |
| | B (2) | 0.92 | 6.53 | 12.54 | 0.82 | 1.11 | 13.25 | 0.82 | 0.47 | 42.35 |
| | **C (3)** | 0.92 | 6.24 | 11.96 | 0.83 | 1.08 | 12.85 | 0.85 | 0.41 | 36.66 |
| | D (4) | 0.91 | 6.84 | 13.12 | 0.82 | 1.10 | 13.10 | 0.84 | 0.46 | 40.67 |
| | E (5) | 0.92 | 6.94 | 13.31 | 0.82 | 1.08 | 12.87 | 0.84 | 0.45 | 40.07 |
| 2D-CNNf (8 × 50) | A (1) | 0.91 | 7.11 | 13.64 | 0.81 | 1.23 | 14.70 | 0.83 | 0.52 | 45.99 |
| | B (2) | 0.91 | 6.82 | 13.09 | 0.82 | 1.13 | 13.46 | 0.81 | 0.47 | 41.76 |
| | **C (3)** | 0.92 | 6.77 | 12.98 | 0.82 | 1.12 | 13.30 | 0.86 | 0.46 | 41.22 |
| | D (4) | 0.90 | 7.04 | 13.51 | 0.80 | 1.21 | 14.43 | 0.81 | 0.46 | 41.27 |
| | E (5) | 0.90 | 7.11 | 13.63 | 0.86 | 1.28 | 15.28 | 0.86 | 0.48 | 42.43 |
| 2D-CNNGA (8 × 50) | A (1) | 0.91 | 6.92 | 13.28 | 0.82 | 1.12 | 13.29 | 0.86 | 0.48 | 42.67 |
| | B (2) | 0.92 | 6.49 | 12.45 | 0.82 | 1.09 | 12.97 | 0.85 | 0.43 | 38.62 |
| | **C (3)** | 0.92 | 6.41 | 12.29 | 0.84 | 1.08 | 12.91 | 0.87 | 0.42 | 38.22 |
| | D (4) | 0.91 | 6.95 | 13.33 | 0.81 | 1.11 | 13.24 | 0.83 | 0.46 | 40.97 |
| | E (5) | 0.92 | 7.30 | 14.00 | 0.80 | 1.20 | 14.26 | 0.83 | 0.44 | 39.57 |

Note: Bold indicates the best strategy for each model.
Table 9. Accuracy comparison of pigment estimation between the optimal CNN models with four convolutional layers and three benchmark models (PLSR, RFR, and GPR). For each pigment, the columns give R2, RMSE (µg/cm2), and nRMSE (%).

| Models | Cab R2 | Cab RMSE | Cab nRMSE | Car R2 | Car RMSE | Car nRMSE | Canth R2 | Canth RMSE | Canth nRMSE |
|---|---|---|---|---|---|---|---|---|---|
| PLSRf | 0.75 | 11.07 | 21.24 | 0.53 | 1.93 | 22.96 | 0.39 | 0.92 | 82.13 |
| PLSRGA | 0.80 | 9.91 | 19.01 | 0.61 | 1.67 | 19.84 | 0.44 | 0.86 | 77.03 |
| RFRf | 0.84 | 9.05 | 17.36 | 0.70 | 1.53 | 18.17 | 0.51 | 0.81 | 72.15 |
| RFRGA | 0.85 | 8.81 | 16.90 | 0.70 | 1.51 | 18.00 | 0.52 | 0.79 | 70.50 |
| GPRf | 0.86 | 8.92 | 17.12 | 0.72 | 1.48 | 17.67 | 0.62 | 0.78 | 69.75 |
| GPRGA | 0.87 | 8.53 | 16.36 | 0.74 | 1.39 | 16.59 | 0.63 | 0.75 | 66.85 |
Table 10. Estimation accuracy of Cab and Car from the combined dataset of this study and LOPEX93 using optimal 1D- and 2D-CNN models with four convolutional layers. For each pigment, the columns give R2, RMSE (µg/cm2), and nRMSE (%).

| Models | Cab R2 | Cab RMSE | Cab nRMSE | Car R2 | Car RMSE | Car nRMSE |
|---|---|---|---|---|---|---|
| 1D-CNNf | 0.86 | 8.33 | 17.65 | 0.70 | 1.54 | 17.89 |
| 1D-CNNGA | 0.87 | 7.94 | 16.83 | 0.74 | 1.47 | 17.15 |
| 2D-CNNf | 0.87 | 7.92 | 16.77 | 0.75 | 1.46 | 16.96 |
| 2D-CNNGA | 0.88 | 7.81 | 16.54 | 0.75 | 1.43 | 16.62 |