Application of Parallel Factor Analysis (PARAFAC) to the Regional Characterisation of Vineyard Blocks Using Remote Sensing Time Series

: Monitoring wine-growing regions and maximising the value of production based on their region/local speciﬁcities requires accurate spatial and temporal monitoring. The increasing amount and variability of information from remote sensing data is a potential tool to assess this challenge for the grape and wine industry. This article provides a ﬁrst insight into the capacity of a multiway analysis method applied to Sentinel-2 time series to assess the value of simultaneously considering spectral and temporal information to highlight site-speciﬁc canopy evolution in relation to environmental factors and management practices, which present a large diversity at this regional scale. Parallel Factor Analysis (PARAFAC) was used as an unsupervised technique to recover pure spectra and temporal signatures from multi-way spectral imagery of vineyards in the Languedoc-Roussillon region in the south of France. The model was developed using a time series of Sentinel-2 satellite imagery collected over 4978 vineyard blocks between May 2019 and August 2020. From the Sentinel-2 (spectral and temporal) signal, the PARAFAC analysis allowed the identiﬁcation of spectral and temporal proﬁles in the form of pure components, which corresponded to vegetation and soil. The PARAFAC analysis also identiﬁed that two of the pure spectra were strongly related to characteristics and dynamics of vineyard cultivation at a regional scale. A conceptual framework was proposed in order to simultaneously consider both vegetation and soil proﬁles and to summarise the mass of data accordingly. This methodology allowed the computation of a concentration index that characterised how close a ﬁeld was to a vegetation or a soil proﬁle over the season. The concentration indices were validated for the vegetation and the soil over two growing seasons (2019 and 2020) with geostatistical analysis. A non-random distribution of the concentration index at the regional scale was assumed to highlight a strongly spatially organised phenomenon related to spatially organised environmental factors (soil, climate, training system, etc.). In a second step, spatial patterns of indices were subjected to the expertise of a panel of advisors of the wine industry in order to validate them in relation to vine-growing conditions. Results showed that the introduction of the PARAFAC method opened up the possibility to identify relevant spectro-temporal proﬁles for vine monitoring purposes.


Introduction
The modern accessibility of remote sensing (RS) time series information for landscape monitoring allows the study of large-scale evolutionary phenomena to be considered [1]. It opens up the possibility of improving the understanding of dynamic processes, such as changes in forests, agricultural crops, and land use. In particular, crop monitoring may show temporal variability (different phenological growth patterns) due to either natural 1. to propose a multi-way approach aiming to identify spectro-temporal profiles from a time series of images, to assess the value of the approach for its potential to generate knowledge in a viticulture case study, and to determine whether the spatial patterns highlighted can be considered relevant with regard to the experts' observations, and; 2.
to address the possible limitations of the approach when dealing with large-scale time series of multispectral images without ground truth data.

Notations
For N-way arrays, capitalised bold and underlined characters will be used, e.g., X (I, J, K) indicates a 3-way array with I objects at J times described by K wavelengths. For matrices, bold and capitalised characters will be used, e.g., X, and for vectors, a lower-case bold character will be used, e.g., a. Upper-case and italics characters will be used for scalars, e.g., the number of wavelengths, K and lower-case characters will be used for running indices, e.g., a i is the ith element of the vector a.

PARAFAC Method
PARAFAC is used to decompose N-way arrays into distinct components. It is based on an alternating least square (ALS) algorithm where the data signal is decomposed into a set of trilinear terms and a residual array [13]. Following Ouertani (2014) [14], the PARAFAC decomposition of a three-way array X is the decomposition in the form of the sum of a minimum number of three-way arrays of rank one (Equation (1)): where X ijk is the reflectance value of the ith sample at the jth variable (temporal mode) and at the kth variable (spectral mode). Each f corresponds to a PARAFAC component and each such component has I a-values (scores); one for each sample. Each component also has J b-values as well as K c-values; one for each loading. The E ijk array is the residual data, representing the variability not accounted for by the model. In addition, a valid PARAFAC model provides an estimate of the pure spectra of the constituents of the sample under study. It will also provide an information of the concentration of the constituents over time, i.e., the components resulting from a valid PARAFAC model will allow a direct interpretation of spectral dynamics. For example, the PARAFAC method is widely used today in the analysis of multivariate data (structured in a three-dimensional excitation emission matrix (EEM)) to quantitatively compare the content of fluorescent organic compounds in samples [15]. The EMM-PARAFAC combination allows the division of fluorescent organic compounds into several independent components according to their unique properties and structures, which provides a basis for better understanding the dynamic changes in fluorescent compounds [15]. A schema of the PARAFAC decomposition of a three-way array is given in Figure 1.
where Xijk is the reflectance value of the ith sample at the jth variable (temporal mode) and at the kth variable (spectral mode). Each f corresponds to a PARAFAC component and each such component has I -values (scores); one for each sample. Each component also has J -values as well as K -values; one for each loading. The Eijk array is the residual data, representing the variability not accounted for by the model. In addition, a valid PARA-FAC model provides an estimate of the pure spectra of the constituents of the sample under study. It will also provide an information of the concentration of the constituents over time, i.e., the components resulting from a valid PARAFAC model will allow a direct interpretation of spectral dynamics. For example, the PARAFAC method is widely used today in the analysis of multivariate data (structured in a three-dimensional excitation emission matrix (EEM)) to quantitatively compare the content of fluorescent organic compounds in samples [15]. The EMM-PARAFAC combination allows the division of fluorescent organic compounds into several independent components according to their unique properties and structures, which provides a basis for better understanding the dynamic changes in fluorescent compounds [15]. A schema of the PARAFAC decomposition of a three-way array is given in Figure 1.
The PARAFAC model was run and validated following Anderson and Bro (2000) [16] using the N-way Toolbox in MATLAB version R2015a (The MathWorks Inc., Natick, MA, USA).

Case Study
Remote sensing data are characterised by several sets of information that are collected as a time series. This type of multi-way data provides valuable spectral-temporalspatial information to characterise crops. In order to have a large variability of conditions, which will be favourable for highlighting the traits of a particular crop, this study is focused at a regional scale. Research at the regional scale allows, among other issues, the monitoring of how different environmental and management conditions affect vegetation development [17]. It also may provide insights on how an extreme climate event may affect vegetation activity [18]. For this study, the region selected was the Languedoc-Roussillon region of southern France where the principal cropping system is for winegrape production (see the following section, study area).
For remote sensing applications in agriculture at the regional scale, recent studies have demonstrated the potential of cost-free satellite imagery, such as Sentinel-2 satellites [19]. In terms of revisiting time and spatial and temporal resolution, Sentinel-2 satellite imagery is well suited to the requirements of monitoring agricultural fields, with a high potential to extract relevant agronomic information. Therefore, in this study, the decomposition PARAFAC method was applied to vine cultivation at a regional scale using multispectral times series from the Sentinel-2 satellites. The PARAFAC model was run and validated following Anderson and Bro (2000) [16] using the N-way Toolbox in MATLAB version R2015a (The MathWorks Inc., Natick, MA, USA).

Case Study
Remote sensing data are characterised by several sets of information that are collected as a time series. This type of multi-way data provides valuable spectral-temporal-spatial information to characterise crops. In order to have a large variability of conditions, which will be favourable for highlighting the traits of a particular crop, this study is focused at a regional scale. Research at the regional scale allows, among other issues, the monitoring of how different environmental and management conditions affect vegetation development [17]. It also may provide insights on how an extreme climate event may affect vegetation activity [18]. For this study, the region selected was the Languedoc-Roussillon region of southern France where the principal cropping system is for winegrape production (see the following section, study area).
For remote sensing applications in agriculture at the regional scale, recent studies have demonstrated the potential of cost-free satellite imagery, such as Sentinel-2 satellites [19]. In terms of revisiting time and spatial and temporal resolution, Sentinel-2 satellite imagery is well suited to the requirements of monitoring agricultural fields, with a high potential to extract relevant agronomic information. Therefore, in this study, the decomposition PARAFAC method was applied to vine cultivation at a regional scale using multispectral times series from the Sentinel-2 satellites.

Study Area
The study area corresponded to 4978 samples (vineyard blocks) extracted from the graphical parcel register of France (RPG) from a large wine-growing region located in southern France, the Languedoc-Roussillon (LR). The LR vineyards extend over approximately 27,400 km 2 , covering four French administrative sectors: Gard (A), Hérault (B), Aude (C), and Pyrénées-Orientales (D) (Figure 2). The study area encompasses a great variability of pedo-climatic conditions and a great diversity of varieties, training systems, etc. [20]. It is assumed that changes in the reflectance values from Sentinel-2 satellites over the 4978 vineyard block samples over time will provide agronomic insights that arise from differences in the genotype-environment-management interactions between vineyards, locally and regionally. Assuming that a unique solution can be expected if the Sentinel-2 data are trilinear, i.e., that there is a relationship between the three-way structure of these data (samples × times × wavelengths) this would imply that the true pure spectra can be found if the correct number of components is used and the signal-to-noise ratio is appropriate [11].

Study Area
The study area corresponded to 4978 samples (vineyard blocks) extracted from the graphical parcel register of France (RPG) from a large wine-growing region located in southern France, the Languedoc-Roussillon (LR). The LR vineyards extend over approximately 27,400 km², covering four French administrative sectors: Gard (A), Hérault (B), Aude (C), and Pyrénées-Orientales (D) (Figure 2). The study area encompasses a great variability of pedo-climatic conditions and a great diversity of varieties, training systems, etc. [20]. It is assumed that changes in the reflectance values from Sentinel-2 satellites over the 4978 vineyard block samples over time will provide agronomic insights that arise from differences in the genotype-environment-management interactions between vineyards, locally and regionally. Assuming that a unique solution can be expected if the Sentinel-2 data are trilinear, i.e., that there is a relationship between the three-way structure of these data (samples × times × wavelengths) this would imply that the true pure spectra can be found if the correct number of components is used and the signal-to-noise ratio is appropriate [11].

Remote Sensing Data
The Sentinel-2 (A/B) satellites provide 13 spectral bands from the Visible (Vis) and Near InfraRed (NIR) to the shortwave infrared (SWIR) regions of the electromagnetic spectrum. They provide coverage of land surfaces on a global scale with a revisit frequency of 5 days in cloud-free conditions with a spatial resolution of 10, 20, or 60 m depending on the spectral band [21].
Sentinel-2 L2A data containing the study vineyards ( Figure 2) were selected and processed via the Google Earth Engine (GEE) platform. Prior to the calculation of the average pixel values for each vineyard block, each date and each waveband, a 10 m inner buffer was imposed over each block boundary ( Figure 2) that was extracted from the RPG. This 10 m inner buffer was used to prevent information from outside the block being integrated into the analysis [22]. Images from 1 May to 31 August in 2019 and 2020 were selected. This time period (from May to August) for both years of the study was considered as being the most relevant for monitoring vine vegetation in the study region [21]. Images contain-

Remote Sensing Data
The Sentinel-2 (A/B) satellites provide 13 spectral bands from the Visible (Vis) and Near InfraRed (NIR) to the shortwave infrared (SWIR) regions of the electromagnetic spectrum. They provide coverage of land surfaces on a global scale with a revisit frequency of 5 days in cloud-free conditions with a spatial resolution of 10, 20, or 60 m depending on the spectral band [21].
Sentinel-2 L2A data containing the study vineyards ( Figure 2) were selected and processed via the Google Earth Engine (GEE) platform. Prior to the calculation of the average pixel values for each vineyard block, each date and each waveband, a 10 m inner buffer was imposed over each block boundary ( Figure 2) that was extracted from the RPG. This 10 m inner buffer was used to prevent information from outside the block being integrated into the analysis [22]. Images from 1 May to 31 August in 2019 and 2020 were selected. This time period (from May to August) for both years of the study was considered as being the most relevant for monitoring vine vegetation in the study region [21]. Images containing clouds or shadows that altered the visibility of the blocks were removed from the database. For this purpose, the spectral band 10 at 1380 nm was used for the detection of visible and sub-visible cirrus clouds [23]. Approximately 25 images should have been available on each vineyard block over the two years; however, because of local atmospheric effects on the imagery, the final Sentinel-2 database consisted of 4978 vineyard blocks containing 12 spectral bands and an average of 11 images (dates) per block.

Data Array Construction
Two PARAFAC models were derived from the data set of 4978 samples (vineyard blocks), one for each year of the study, 2019 and 2020. To overcome the challenge of heterogeneity in the number of images per block, an interpolation was performed to obtain two continuous data cubes; X 19 and X 20 . Interpolation using the Gaussian filter by Alam et al. (2008) [24] was applied in order to have a consistent time step dimension (J) between 1 May and 31 August for both years. The parameters involved in the interpolation setting were fixed to the Gaussian filter width (P) = 30 days and date interval (N) = 5 days to simulate the revisit time of the Sentinel-2 satellites [22]. At the end of the interpolation step, the data set was meaningfully arranged in two three-way arrays X (X 19 and X 20 ) of dimensionality 4978 (samples, I) × 25 (times, J) × 12 (wavelengths, K).

PARAFAC Model
A separate decomposition of the two arrays, X 19 and X 20 , into trilinear components was performed using the PARAFAC method. The trilinear model was found to minimize the sum of squares of the residuals, E ijk in the model. Each component consisted of one score vector and two loading vectors, i.e., temporal and spectral loadings. A PARAFAC model of a three-way array, e.g., X 19 , was given by one score matrix A and two loading matrices, B and C, with its respective a i f , b j f , and c k f (Figure 3). The parameter a i f is the score of the ith sample of the f th component; b j f is the loading specific to reflectance intensity at the time j of the f th component; and c k f is the loading estimate of the reflectance spectrum k of the f th component. Essentially, the PARAFAC model provides an estimation of the relative concentration, i.e., the relative amount of the f th component in each sample (matrix A) and temporal and spectral properties of the components loadings matrices (B and C respectively), which can be used to interpret the spectral dynamics of the constituents of the samples. For 2020 (X 20 ), the same PARAFAC decomposition scheme was used as for 2019.
tabase. For this purpose, the spectral band 10 at 1380 nm was used for the detection of visible and sub-visible cirrus clouds [23]. Approximately 25 images should have been available on each vineyard block over the two years; however, because of local atmospheric effects on the imagery, the final Sentinel-2 database consisted of 4978 vineyard blocks containing 12 spectral bands and an average of 11 images (dates) per block.

Data Array Construction
Two PARAFAC models were derived from the data set of 4978 samples (vineyard blocks), one for each year of the study, 2019 and 2020. To overcome the challenge of heterogeneity in the number of images per block, an interpolation was performed to obtain two continuous data cubes; X19 and X20. Interpolation using the Gaussian filter by Alam et al. (2008) [24] was applied in order to have a consistent time step dimension (J) between 1 May and 31 August for both years. The parameters involved in the interpolation setting were fixed to the Gaussian filter width (P) = 30 days and date interval (N) = 5 days to simulate the revisit time of the Sentinel-2 satellites [22]. At the end of the interpolation step, the data set was meaningfully arranged in two three-way arrays X (X19 and X20) of dimensionality 4978 (samples, I) × 25 (times, J) × 12 (wavelengths, K).

PARAFAC Model
A separate decomposition of the two arrays, X19 and X20, into trilinear components was performed using the PARAFAC method. The trilinear model was found to minimize the sum of squares of the residuals, Eijk in the model. Each component consisted of one score vector and two loading vectors, i.e., temporal and spectral loadings. A PARAFAC model of a three-way array, e.g., X19, was given by one score matrix A and two loading matrices, B and C, with its respective , , and ( Figure 3). The parameter is the score of the ith sample of the fth component; is the loading specific to reflectance intensity at the time j of the fth component; and is the loading estimate of the reflectance spectrum k of the fth component. Essentially, the PARAFAC model provides an estimation of the relative concentration, i.e., the relative amount of the fth component in each sample (matrix A) and temporal and spectral properties of the components loadings matrices (B and C respectively), which can be used to interpret the spectral dynamics of the constituents of the samples. For 2020 (X20), the same PARAFAC decomposition scheme was used as for 2019. where K represents relevant wavelengths, J represents time, and I represents the objects of interest, vineyard blocks in this case. E corresponds to the residuals. where K represents relevant wavelengths, J represents time, and I represents the objects of interest, vineyard blocks in this case. E corresponds to the residuals.

Component Selection
Several PARAFAC models, with the number of components in the models ranging from one to five, were fitted to the data in order to determine the model with the optimal number of components. A F-component model was validated following the Core Consistency Diagnostic (CORCONDIA) of Bro (1997) [11]. The PARAFAC model is considered to be valid if the value of the CORCONDIA is close to 100%. If the CORCONDIA value is around 50%, the model is considered unstable, and if the value is close to 0 (even negative), then it is considered that the data cannot be described by the tri-linear model [14]. The CORCONDIA value will mostly decrease with the number of components. It decreases very sharply after the optimal number of components is exceeded. Therefore, the appro- priate number of components is the model with the highest number of components and a valid CORCONDIA value.

Validation of the Model
The PARAFAC models for both 2019 and 2020 were validated in two steps:

1.
A geostatistical analysis (see next section) to test whether the score values were spatially autocorrelated. The underlying hypothesis is that the environmental variables (soil, climate, and associated training practices, etc.) that are likely to explain the differences between blocks at the regional scale are spatially organised (not random). Consequently, if score values resulting from the PARAFAC models are not randomly distributed at the regional level, they are assumed to be related to environmental variables.

2.
An assessment of the coherence of the spatial distribution of the score values against expert knowledge of the diversity and spatial distribution of soil, climate, and vineyard cultural practices in the region. The objective of this second step was to identify the relevance of the relationship between the scores and agri-environmental information. In order to do so, an original approach based on placing the human, i.e., local viticultural experts, at the centre of the study was conducted. Following the methodological framework of Pichon et al. (2019) [12], a global approach, a so-called 'scenario simulation' [25], was used. In this method, external reliability can be considered valid if the conclusions reached by different experts (at least two experts) coincide [26,27].
In order to facilitate the expert's interpretation, maps of score values were produced at the regional level. These maps were created by interpolating (see next section) the scores observed on the 4978 vineyard blocks for each year (2019 and 2020) and for each component (F) when spatial autocorrelation was observed.

Spatial Analysis and Mapping
The spatial analysis was based on the modelling of semivariograms, which allows for a statistical assessment of how spatially autocorrelated the data (scores) are over space. It is expected that the (semi)variance of points closer together will be lower if the data exhibits spatial patterning. From the semivariogram modelling, the following parameters were derived: C 0 (nugget effect or semivariance at a distance approaching 0 m), C 1 (sill or the maximum semivariance observed), and A 1 (the range (distance) at which C 1 is reached). The C 0 and C 1 parameters were first used to calculate the Cambardella Index [28] to determine the spatial dependence and then all three parameters were used in a kriging process to obtain the score maps. The characterisation of the spatial structure via semivariograms, as well as the realisation of the score maps, was performed using the GeoFis 1.0 software (Open Source Software GeoFIS, http://www.geofis.org, accessed on 24 August 2022) [29].
Spatial auto-correlation of the score values of each component was assessed with the Cambardella Index (I c ) [28] (Equation (2)): where C 0 is the nugget effect and C 1 is the sill of the semivariogram model. According to Cambardella et al. (1994) [28], the following thresholds were used: 1. if I c is less than or equal to 25%, the distribution is considered strongly spatially organised (high auto correlation); 2.
if I c is between 25 and 75%, the distribution is considered moderately spatially organised; and, 3.
if I c is higher than 75%, the distribution is considered weakly spatially organised.
Score maps were obtained using point kriging interpolation [30]. The latter was performed on a grid of points regularly spaced 1000 m apart within the geographical boundary of the LR region [31].  [12], 6 experts were selected. These 6 experts were chosen in order to benefit from an integrated assessment of the results obtained over the whole region, taking into account simultaneously their knowledge of plant development, the cultural practices of the winegrowers in each part of the region, the soil and its variability, etc. (see Supplementary Materials section).
In order to validate the experts' consistency, they conducted a self-assessment of the criteria considered important by the authors for the study (Table 1). In this self-assessment, the experts ranked their perceived expertise on aspects of vineyard production, regional viticulture practices, regional pedo-climatic knowledge, and RS expertise on a scale from 0 (none) to 5 (excellent). This information validated the choice of experts and was also used to consider the importance of (i.e., weight) the different opinions expressed by different experts during the working session. For example, less importance was assigned to the opinion of expert A3 when assessing the scores in the Aude department due to A3 selfidentifying as having a poor knowledge of viticultural practices in this department. A 'scenario simulation' session was conducted with experts on 31 May 2022. The general workflow of the session is presented in Table 2. During the session, the score maps were hand-delivered to each expert and projected on a screen managed by an animator (zoom in and out) according to requests from the experts. These score maps were previously elaborated by adding characteristic elements (cities, watercourses, roads, etc.) that would allow the experts to orientate themselves on the maps. It should be noted that an explanation session of the meaning of the score values was necessary so that the experts could understand the maps and interpret them in the best possible way using a colour coding defined by the authors. Maps were displayed in QGIS 3.2.3 (Open Source Geospatial Foundation, http://qgis.osgeo.org, accessed on 24 August 2022). Experts were asked to first write down their opinions before oral group discussions. In the oral group discussion, the animator wrote down the experts' comments and reactions. Table 2. Description of the general workflow of the 'scenario simulation' session.
Step 1 Step 2 Step 3 Step 4 Step 5 It was agreed that the experts would react mainly to score maps derived from the 2020 data, because it was considered to be most representative of common crop-soil-climate interactions in the region. In contrast, 2019 was assumed to present a specific atypical behaviour (although interesting) as a heatwave strongly affected vine growth in the region at the end of June. Score maps from 2019 were then presented in a second step as a particular case study. Table 3 presents the results of fitting the PARAFAC model with different numbers of components (1)(2)(3)(4)(5) to X 19 and X 20 . With two components, the observed CORCONDIA and explained variance values were 100-99.62% and 100-99.57% for 2019 and 2020, respectively. For a higher number of components, the CORCONDIA values dropped significantly, showing that the two-component model was the best for both years, as the appropriate number of components corresponded to the model with the highest number of components and a valid CORCONDIA value. Since a two-component model was validated according to CORCONDIA, the results in Table 3 also justified the suitability of applying the PARAFAC methodology assuming a trilinear structure in the presented data set.

Data Signal Decomposition
A two-component PARAFAC corresponds to two pure spectra of constituents. For both PARAFAC models (2019 and 2020), the score in the matrix A (4978 × 2) contains estimated relative concentrations in relation with both loadings (temporal × spectral) of the two components in the 4978 samples. The matrix B (25 × 2) contains estimated temporal loadings and the matrix C (12 × 2) estimated spectral loadings.
The spectral loadings (C) and the temporal loadings (B) for the two components selected (Co1 and Co2) from the PARAFAC model of the years 2019 and 2020 are presented in Figures 4 and 5, respectively.
Regarding component 1 (Co1) for both years, it was observed that the loading intensity was characterised by an increase from the Visible (Vis) to the Near InfraRed region (NIR). It should be noted, though, that in 2019 there was a slight drop in intensity between the wavelengths ranging from 700 to 1000 nm. Concerning Co2 for both years, the spectral loading was characterised by: (i) a low intensity in red wavelengths (650-680 nm), (ii) the highest intensity in the NIR range (785-875 nm), and (iii) a decrease in intensity in the short-wave infrared region (1360-2200 nm). Taking into account that matrix C represents the pure spectra and considering the type of case study proposed, the components were identified as soil (Co1) and vegetation (Co2). Soil spectral reflectance is typically characterised by a steady increase with wavelengths from the Vis to the NIR, except at 950, 1200, and 1350 nm, where reflectance decreases [32]. As for the typical vegetation spectral reflectance, it exhibits the following characteristics: (i) a low reflection in the blue (458-523 nm) and red wavelengths (650-680 nm), (ii) relatively more reflection in green wavelengths (543-578 nm), (iii) reflectance in the NIR range (785-875 nm) is the highest, and iv) the short-wave infrared region (1360-2200 nm) is mainly characterised by less reflectance [33]. Given that grapevine cultivation is seasonal and is a row crop, the PARAFAC model clearly identifies the mixture of these two components. In fact, for both years, the Co2 spectral loadings represented a 'reference' vegetation profile (which refers to the notion of the pure spectra in chemometrics) that probably included both vines and interrow grass/crops reflecting the typical signature of healthy vegetation, while the Co1 spectral loadings represented a 'reference' soil profile at the region level. It actually summarised different soil types across the region, probably with different spectral characteristics. This may explain why the spectral loading identified as soil followed a slightly different general trend than that described as a typical soil profile by Khadse (2012) [32].  Regarding component 1 (Co1) for both years, it was observed that the loading intensity was characterised by an increase from the Visible (Vis) to the Near InfraRed region (NIR). It should be noted, though, that in 2019 there was a slight drop in intensity between the wavelengths ranging from 700 to 1000 nm. Concerning Co2 for both years, the spectral loading was characterised by: (i) a low intensity in red wavelengths (650-680 nm), (ii) the highest intensity in the NIR range (785-875 nm), and (iii) a decrease in intensity in the short-wave infrared region (1360-2200 nm). Taking into account that matrix C represents the pure spectra and considering the type of case study proposed, the components were identified as soil (Co1) and vegetation (Co2). Soil spectral reflectance is typically characterised by a steady increase with wavelengths from the Vis to the NIR, except at 950, 1200, and 1350 nm, where reflectance decreases [32]. As for the typical vegetation spectral reflectance, it exhibits the following characteristics: (i) a low reflection in the blue (458-523 nm) and red wavelengths (650-680 nm), (ii) relatively more reflection in green wavelengths (543-578 nm), (iii) reflectance in the NIR range (785-875 nm) is the highest, and iv) the short-wave infrared region (1360-2200 nm) is mainly characterised by less reflectance [33]. Given that grapevine cultivation is seasonal and is a row crop, the PARAFAC model clearly identifies the mixture of these two components. In fact, for both years, the Co2 spectral loadings represented a 'reference' vegetation profile (which refers to the notion of the pure spectra in chemometrics) that probably included both vines and interrow grass/crops reflecting the typical signature of healthy vegetation, while the Co1 spectral loadings represented a 'reference' soil profile at the region level. It actually summarised different soil types across the region, probably with different spectral characteristics. This may explain why the spectral loading identified as soil followed a slightly different general trend than that described as a typical soil profile by Khadse (2012) [32]. Figure 5 presents the temporal loadings (B) for both 2019 and 2020 and both components (Co1 and Co2). It was observed that the two components showed large differences between them in their temporal loadings in the two consecutive years. Regarding Co1, in 2019 the intensity in its time profile decreased over time, while in 2020 it remained more stable. Regarding Co2, its spectral profile in 2019 peaked between 17 and 27 of July, with a slow decrease afterward, while for 2020 the Co2 showed a maximum around 11 and 26 of June followed by a very steep decline later in the season. Indeed, the temporal loadings of both years seemed to show that the Co2 intensity (vegetation) increased with the vegetative development of the vine (June-July) and, conversely, Co1 intensity (soil) decreased   Given that the samples studied were vineyard blocks, the spectral and temporal profiles indicate a 'reference' behaviour for each component (soil and vegetation) at the regional scale, but each block may differ more or less from this 'reference' behaviour. Thus, once the models were validated, the data could be examined with respect to the variability of each component found, i.e., with respect to the values of the score matrix (A) for both years.

Spatial Analysis and Characterisation
The score value maps ( Figure 6) represent the spectro-temporal scores of each block in relation to each component and its corresponding year, e.g., for Figure 6a, the estimated concentration value of the Co1 (soil) component of each block is plotted in relation to its spectro-temporal profile in 2019 at the regional level. Therefore, the score map for Co1 was supposed to highlight blocks that were directly related (or counter-related) to the identified spectro-temporal signature for the soil in 2019.  Figure 5 presents the temporal loadings (B) for both 2019 and 2020 and both components (Co1 and Co2). It was observed that the two components showed large differences between them in their temporal loadings in the two consecutive years. Regarding Co1, in 2019 the intensity in its time profile decreased over time, while in 2020 it remained more stable. Regarding Co2, its spectral profile in 2019 peaked between 17 and 27 of July, with a slow decrease afterward, while for 2020 the Co2 showed a maximum around 11 and 26 of June followed by a very steep decline later in the season. Indeed, the temporal loadings of both years seemed to show that the Co2 intensity (vegetation) increased with the vegetative development of the vine (June-July) and, conversely, Co1 intensity (soil) decreased with the growth of the vine. In fact, the decreasing temporal profile of Co1 observed throughout 2019 reaffirms the hypothesis that it may be a compensatory profile of the vegetation at certain times during the time period considered. For the record, 2019 was marked by a heatwave in late June that affected a significant part of the region, which may explain a compensatory effect later in the season. With regard to Co2, the maximum peak of intensity shifted between 2019 and 2020, moving forward by almost one month in 2020. This could be explained if the time of vegetative development (phenological stages) of the vineyards was different between the two years studied.
Given that the samples studied were vineyard blocks, the spectral and temporal profiles indicate a 'reference' behaviour for each component (soil and vegetation) at the regional scale, but each block may differ more or less from this 'reference' behaviour. Thus, once the models were validated, the data could be examined with respect to the variability of each component found, i.e., with respect to the values of the score matrix (A) for both years.

Spatial Analysis and Characterisation
The score value maps ( Figure 6) represent the spectro-temporal scores of each block in relation to each component and its corresponding year, e.g., for Figure 6a, the estimated concentration value of the Co1 (soil) component of each block is plotted in relation to its spectro-temporal profile in 2019 at the regional level. Therefore, the score map for Co1 was supposed to highlight blocks that were directly related (or counter-related) to the identified spectro-temporal signature for the soil in 2019. The spatial organisation of the score maps for both years and for all the selected components was confirmed by the semivariogram modelling (Table 4), which showed that at least 40-50% of the variability was explained by a spatial phenomenon. The year and component with the strongest spatial structure was the soil component (Co1) for 2019 with an I c of 45%. These results support the assumption that scores values were not randomly organised, but spatially structured and likely to vary according to environmental variables.
The score maps obtained after kriging ( Figure 6) show the spatial organisation of the blocks that have: (i) a similar spectro-temporal profile (high scores) and (ii) a different spectro-temporal profile (low scores) to the 'reference' spectral profile for each component and year. Therefore, the greater the difference in the time spectral profiles, the lower the scores, which were negative if they were opposite to the 'reference' spectro-temporal profile. The spatial patterns for soil (Co1) between the two years were very similar, showing that the same vineyard blocks had both high (light colour) and low (dark colour) score values in the same areas for both years. Regarding the spatial patterns of vegetation (Co2) for both years, a general trend of higher score values (light colour) in the northern part of the region was highlighted. For other sectors, the spatial patterns were less similar between both years. It should be noted that, for Co2, the spectral loadings ( Figure 4) followed the same profile in both years but, in contrast, the temporal loadings ( Figure 5) showed a significant temporal shift. This implies that the temporal dimension may have a major role in the difference in score values for each vineyard block for both years, i.e., in the differences in spatial patterns for each year represented.  Table 4.
The spatial organisation of the score maps for both years and for all the selected components was confirmed by the semivariogram modelling (Table 4), which showed that at least 40-50% of the variability was explained by a spatial phenomenon. The year and component with the strongest spatial structure was the soil component (Co1) for 2019 with an Ic of 45%. These results support the assumption that scores values were not randomly organised, but spatially structured and likely to vary according to environmental variables. The score maps obtained after kriging ( Figure 6) show the spatial organisation of the blocks that have: (i) a similar spectro-temporal profile (high scores) and (ii) a different Interpolation was performed using ordinary kriging using the semivariogram information in Table 4.

Comparaison of Cross-Observation by Experts
Regarding the spatial patterns for Co1 between 2019 and 2020, these were almost identical (Figure 6a,b), and it was considered that the score map of Co1 for 2019 would not provide any relevant information to the study for the experts. Therefore, the workflow of cross-validation by experts focused first on the two 2020 components (Co1 and Co2), and then the vegetation component (Co2) from 2019 was presented only for comparison with 2020. Table 5 presents the extent to which each type of observation was made by the expert group for the 2020 year.  It clearly appears that a significant number of experts made similar observations on Sectors A and B of the LR region. Moreover, more expert observations were related to Co1 than to Co2. This result can be explained by the scale of the study, where vegetation heterogeneity may be more difficult to analyse properly by the experts given the large diversity of situations. Experts were obviously more comfortable with identifying and explaining differences in spatial patterns within sectors that they were familiar with. Figure 7 shows a summary map of each type of observation for 2020 made by the group of experts related to Co1 for each sector. The observations made by individual experts coincided in the vast majority of cases with at least one other expert. Therefore, if at least two experts were in agreement, the relevance of the identified area was considered valid, according to Wacheux (1996) [25]. Concerning the soil units (Figure 7a), experts identified that the spatial patterns were related to the depth of the soil. In terms of colour range, it was perceived that the darker areas could represent valley and coastal soils, e.g., fluviosols, as well as calcareous soils. The former are soils containing mostly coarse elements (gravel, pebbles, stones, etc.) with a thickness >0.50 m and the latter are defined as medium to thick soils (>0.35 m thick), developed from calcareous materials. By comparison, the lighter zones (less distinct from a visual point of view) could represent mineral soils, e.g., rankosols or lithosols, which are characterised by a thickness of <0.30 m. It should be noted that this spatialisation of the different geological units was directly related to the information on the soil available water capacity provided by four experts. Regarding Figure 7b, some major geological formations within the LR region were largely commented on by five of the experts. Three clear boundaries between the two colours were observed and are outlined in sectors A-B-C of the map.
Regarding Figure 8 (Co2), it should be noted that the colour coding does not always represent the same phenomenon, but rather differences when compared to the 'reference' spectro-temporal profile for 2020. That is, the yellow vineyard blocks represent a spectrotemporal profile very similar to the 'reference' vegetation profile extracted by the PARA-FAC method. In contrast, the blue vineyard blocks represent plots that differ from the 'reference' spectro-temporal profile of the vegetation. Therefore, as these colour variations may be due to differences in the spectral and/or temporal profiles in relation to the 'reference' profile, there were some inconsistencies in the experts' interpretation of the meaning of the colour codes. A clear example was the comments on the phenological state of the vineyards by sectors in Figure 8a. Each of the areas highlighted in Figure 8a was selected The observations made by individual experts coincided in the vast majority of cases with at least one other expert. Therefore, if at least two experts were in agreement, the relevance of the identified area was considered valid, according to Wacheux (1996) [25]. Concerning the soil units (Figure 7a), experts identified that the spatial patterns were related to the depth of the soil. In terms of colour range, it was perceived that the darker areas could represent valley and coastal soils, e.g., fluviosols, as well as calcareous soils. The former are soils containing mostly coarse elements (gravel, pebbles, stones, etc.) with a thickness >0.50 m and the latter are defined as medium to thick soils (>0.35 m thick), developed from calcareous materials. By comparison, the lighter zones (less distinct from a visual point of view) could represent mineral soils, e.g., rankosols or lithosols, which are characterised by a thickness of <0.30 m. It should be noted that this spatialisation of the different geological units was directly related to the information on the soil available water capacity provided by four experts. Regarding Figure 7b, some major geological formations within the LR region were largely commented on by five of the experts. Three clear boundaries between the two colours were observed and are outlined in sectors A-B-C of the map.
Regarding Figure 8 (Co2), it should be noted that the colour coding does not always represent the same phenomenon, but rather differences when compared to the 'reference' spectro-temporal profile for 2020. That is, the yellow vineyard blocks represent a spectrotemporal profile very similar to the 'reference' vegetation profile extracted by the PARAFAC method. In contrast, the blue vineyard blocks represent plots that differ from the 'reference' spectro-temporal profile of the vegetation. Therefore, as these colour variations may be due to differences in the spectral and/or temporal profiles in relation to the 'reference' profile, there were some inconsistencies in the experts' interpretation of the meaning of the colour codes. A clear example was the comments on the phenological state of the vineyards by sectors in Figure 8a. Each of the areas highlighted in Figure 8a was selected by the experts as characteristic of late growth vineyard development, but the colours were different in each area. These inconsistencies could be explained, not by the late phenological stage, but by the different grape varieties selected and adapted to this late growth in each sector, e.g., a predominance of Chardonnay in sector A and a predominance of Mourvèdre in sector C. Since the grape varieties were different, different management practices, different training systems, etc., could be involved, and thus explain why on some maps the observed trends and the resulting agronomic interpretations were difficult to interpret. practices, different training systems, etc., could be involved, and thus explain why on some maps the observed trends and the resulting agronomic interpretations were difficult to interpret. Concerning Figure 8b, certain spatial structures, which were characterised by high/low vine vigour, were commented on by a minority of the experts. A possible explanation given by one of the experts focusing on sector A was the presence of the distinct denomination areas that characterise the different vineyards. In particular, the yellow area corresponded to the Protected Geographical Indication (PGI) label and the blue area to the Protected Designation of Origin (PDO) label. Two different identifications imply different yield objectives, soil cultivations, grape varieties, and management practices. The latest observations made by the experts are shown in Figure 8c and concern zones where irrigation is available through regional facilities. The experts largely addressed these observations for sectors A and B. For the final step of the workflow ( Table 2) it was decided to present only the score map from 2019 for the vegetation component (Co2) as the score map of the Co1 confirmed the stability of this component from one year to another, and the temporal stable factors (mainly related to soil) that it is related to. Therefore, when comparing the two years for Concerning Figure 8b, certain spatial structures, which were characterised by high/low vine vigour, were commented on by a minority of the experts. A possible explanation given by one of the experts focusing on sector A was the presence of the distinct denomination areas that characterise the different vineyards. In particular, the yellow area corresponded to the Protected Geographical Indication (PGI) label and the blue area to the Protected Designation of Origin (PDO) label. Two different identifications imply different yield objectives, soil cultivations, grape varieties, and management practices. The latest observa-tions made by the experts are shown in Figure 8c and concern zones where irrigation is available through regional facilities. The experts largely addressed these observations for sectors A and B.
For the final step of the workflow ( Table 2) it was decided to present only the score map from 2019 for the vegetation component (Co2) as the score map of the Co1 confirmed the stability of this component from one year to another, and the temporal stable factors (mainly related to soil) that it is related to. Therefore, when comparing the two years for the Co2 component, the experts observed the same spatial patterns but noted that these were less pronounced in 2019 (Figure 9a). This attenuation of the yellow colour, associated with higher score values, could be justified by the heatwave episode from 23 June to 8 July in 2019 that was experienced in the LR wine-growing region. The heatwave would have homogenised the differences between vineyards and, therefore, the corresponding spectro-temporal profiles. the Co2 component, the experts observed the same spatial patterns but noted that these were less pronounced in 2019 (Figure 9a). This attenuation of the yellow colour, associated with higher score values, could be justified by the heatwave episode from 23 June to 8 July in 2019 that was experienced in the LR wine-growing region. The heatwave would have homogenised the differences between vineyards and, therefore, the corresponding spectro-temporal profiles.

Discussion
A remote sensing time series study for the regional characterisation of vineyard blocks was provided by the application of the PARAFAC algorithm. Results showed both the potential and the limitations provided by the application of an unsupervised complex data analysis method focusing on grapevine production at the regional scale. The validated application with a practical framework of expert winegrowers' opinions demonstrated both the complexity and the added value of considering the feature approach in the temporal and spectral dimensions for interpretation purposes to identify relevant region/local characteristics in a grapevine context.
In order to analyse a time series of multispectral images to assess the value of simultaneously considering spectral and temporal information over the LR wine-growing region, the PARAFAC method was used to address the issue of analysing the three-way Sentinel-2 data sets in two distinct years, 2019 and 2020. It should be noted that a trilinear structure was assumed a priori in the analysis of the data sets. This assumption could be partly accepted because the fit of the data obtained was satisfactory according to the COR-CONDIA criterion (100% for both years) and because the shape of the spectro-temporal profiles obtained was coherent from an agronomic point of view. Several authors, including De Juan and Tauler (2001) [34], have established that methods based on the trilinear structure assumption will fit the data less efficiently than others that do not, such as Tucker3 [35] or Multivariate Curve-Resolution Alternating Least Squares (MCR-ALS) [36]. Indeed, the more ambiguous solutions given by Tucker3 and MCR-ALS through bilinear decomposition are balanced by the greater flexibility in modelling the shape of the profiles, which also leads to a feasible physical and chemical interpretation of the results

Discussion
A remote sensing time series study for the regional characterisation of vineyard blocks was provided by the application of the PARAFAC algorithm. Results showed both the potential and the limitations provided by the application of an unsupervised complex data analysis method focusing on grapevine production at the regional scale. The validated application with a practical framework of expert winegrowers' opinions demonstrated both the complexity and the added value of considering the feature approach in the temporal and spectral dimensions for interpretation purposes to identify relevant region/local characteristics in a grapevine context.
In order to analyse a time series of multispectral images to assess the value of simultaneously considering spectral and temporal information over the LR wine-growing region, the PARAFAC method was used to address the issue of analysing the three-way Sentinel-2 data sets in two distinct years, 2019 and 2020. It should be noted that a trilinear structure was assumed a priori in the analysis of the data sets. This assumption could be partly accepted because the fit of the data obtained was satisfactory according to the CORCON-DIA criterion (100% for both years) and because the shape of the spectro-temporal profiles obtained was coherent from an agronomic point of view. Several authors, including De Juan and Tauler (2001) [34], have established that methods based on the trilinear structure assumption will fit the data less efficiently than others that do not, such as Tucker3 [35] or Multivariate Curve-Resolution Alternating Least Squares (MCR-ALS) [36]. Indeed, the more ambiguous solutions given by Tucker3 and MCR-ALS through bilinear decomposition are balanced by the greater flexibility in modelling the shape of the profiles, which also leads to a feasible physical and chemical interpretation of the results [37]. However, the applicability of these kind of methodologies, which assume linear relationships in the data, is basically unknown in remote sensing scenarios. This is because the definition of the component, i.e., soil, vegetation, etc., and the image acquisition conditions will sometimes justify the assumption of a certain variability in the spectral signatures of the components [38]. Therefore, the non-linear unmixing, which is essentially the usual approach used in remote sensing, is often solved using deep learning methods [38]. Regardless of this claim, the PARAFAC decomposition accurately modelled soil dynamics and vegetation dynamics for both 2019 and 2020. The interpretability of the spectro-temporal profiles was useful to understand the variation of the spectral response of crops (soil-vegetation dynamics) over time.
Since the PARAFAC decomposition allowed for unique solutions to be obtained, i.e., pure spectra, the modelling was considered relevant for the study of the temporal dynamics of vineyard blocks at the scale of the region. However, it is essential to place the results presented in this paper within the reality of large-scale unsupervised case study data. It was apparent that the study of such variable samples that are representative of a large geographical area has limitations related to the reliability of the validation. To overcome this challenge, Wacheux's (1996) [25] 'scenario simulation' was used, where if at least two experts confirmed that there is some logic in the observations, then these observations are considered valid. The proportion of different experts who made the same observation for each score map for the year 2020 can be considered a reliable indicator of the representativeness of the findings and may validate the relevance of the approach in highlighting relevant agronomic information. The observations were not made in the same proportion or in the same detail for the different sectors of the LR region. Table 5 clearly shows that almost all the observations in sector D were made by only one expert. This does not question the approach, but given the observations of the selected experts, it is to be expected that the conclusions from sectors A, B, and C were more consistent than those from sector D, where only one expert had in-depth knowledge. However, it should be highlighted that the 'external reliability' of this study refers to the LR region and that the same validation framework would have been approached differently, i.e., with experts from other regions or different countries, if a different scale of study had been defined.
Chemometric data analysis methods use very little a priori knowledge. Instead, they provide parameters that give insight into the system, such as its intrinsic dimension or the loadings that describe it. However, their relevance must be verified by numerical criteria and the user's expertise. Thus, for the PARAFAC method, the number of factors is a result of the decomposition. It must be chosen by successive trials of increasing size, examining criteria, such as CORCONDIA, and also examining the relevance of the loadings produced. For other factorial methods, such as PCA [39], PLS [40], or MCR-ALS [36], the explained variance, prediction error, or lack of fit will be examined respectively. In analytical chemistry, the systems are often simple. In these cases, some priors can be useful for the calculation of the model. In agriculture, situations are often very complex, and it is difficult to constrain the model. Therefore, the result of the model must be verified a posteriori.
The main limitation of the PARAFAC methodology in relation to its application with Sentinel-2 satellite data at a regional scale is that it required a temporal interpolation step prior to the analysis in order to create a continuous data cube. However, this step necessarily involved a smoothing of the spectral data, which could lead to the removal of relevant information for crop monitoring. In addition to the methodological limitation mentioned above, there was also an important limitation within the expert's validation framework used. Indeed, the spatial representation of a single score value summarising the whole growing season (the same colour code represents in reality a large variety of phenomena), probably added complexity to the interpretation by experts of areas related to 'real' soil or vegetation dynamics. However, according to the experts, the PARAFAC analysis highlighted different, relevant zones according to their knowledge. From the differences in the score value maps, different spatial patterns were visually highlighted that could potentially reveal some interpretative clues depending on the situation of the vineyards (geographical area, soil characteristics, and soil water capacity), the different phenological stages of the vineyards (different denominations area, grape variety, and management practices), and the impact of the use of irrigation in certain areas.
The validation of the results was only performed at the regional scale with a group of experts able to integrate a wide range of knowledge (soils, crops, climates) to analyse the relevance of the information obtained. It is likely that similar analyses conducted at finer scales will identify new information and perhaps even interesting links to new agronomic information. However, the availability of such experts, at both regional and local scales, to perform this validation is a potential limitation of the approach. In their absence it will be difficult to correctly extract and identify the various agronomic and pedo-climatic effects from the model outputs.
Concerning the year 2019, the validation by the expert group focused only on the vegetation component, highlighting its similarity, i.e., same spatial patterns with the subsequent year. The main observation was the attenuation of the high-value zones in 2019, which raised the hypothesis that the crop-heatwave interaction caused the homogenisation of the reflectance signal, creating a less differentiated spatial distribution of vine vigour at the regional scale in 2019, i.e., fewer strong concentration values (yellow colour). However, no sudden variations in the profiles were observed, as would be expected as a consequence of the impact of an extreme weather event. This could also be attributed to the fact that the temporal interpolation performed may have masked this punctual variability.
The unsupervised approach (PARAFAC) presented here represents a specific application case for the LR wine-growing region in 2019 and 2020. In view of the results, it is a type of approach that can be effective for spatialising and characterising phenomena with a temporal evolution, e.g., the spectral response of the vine canopy, providing spectro-temporal 'fingerprints' that are capable of highlighting differences in behaviour. However, in this particular case study, probably as a consequence of the resolution scale, its application is dependent on a posteriori expert knowledge of the observed phenomenon, thus limiting its applicability.

Conclusions
The work conducted in this study showed the potential of an appropriate three-way data resolution methodology, such as PARAFAC, in the analysis of remote sensing images time series. The results obtained from data collected over two cropping years (2019-2020), on 4978 vineyard blocks showed that the PARAFAC method provided relevant information on temporal spectral profiles, which, in turn, allowed the spatial characterisation of the LR wine region. The validation of the results was based on expert observations. Although this can be seen as a limitation, the practical approach of experts demonstrated that the application of specific methodologies for the resolution of complex three-way data can be optimised at various levels and be potentially useful for understanding and characterising viticulture at the regional scale. Elements extracted from the PARAFAC method, such as the relationship with the landscape, the irrigation areas in relation to the soil characteristics, and the spatial footprint of the soil water capacity seen as an indicator of biomass production over the season, have given the authors new insights into elements to be investigated in more detail. However, the requirement to have a continuous cube can be a limiting factor in characterising isolated episodes that affect the crop growth during the year.
The proposed methodology is potentially transferable to other resolution scales. In fact, for the effective characterisation of the specificity of the spectral temporal response of agricultural-related factors at other scales, the spatial segmentation elements (different spectral time zones) highlighted by the PARAFAC methodology and identified by expert observations would be appropriate as a starting point. It is assumed that the methodology is equally transferable to other regions and other cropping systems, although the resulting inferences will be very different.