An Innovative Slepian Approach to Invert GRACE KBRR for Localized Hydrological Information at the Sub-Basin Scale

: GRACE spherical harmonics are well-adapted for representation of hydrological signals in river drainage basins of large size such as the Amazon or Mississippi basins. However, when one needs to study smaller drainage basins, one comes up against the low spatial resolution of the solutions in spherical harmonics. To overcome this limitation, we propose a new approach based on Slepian functions which can reduce the energy loss by integrating information in the spatial, spectral and time domains. Another advantage of these regionally-deﬁned functions is the reduction of the problem dimensions compared to the spherical harmonic parameters. This also induces a drastic reduction of the computational time. These Slepian functions are used to invert the GRACE satellite data to restore the water mass ﬂuxes of different hydro-climatologic environments in Africa. We apply them to two African drainage basins chosen for their size of medium scale and their geometric speciﬁcities: the Congo river basin with a quasi-isotropic shape and the Nile river basin with an anisotropic and more complex shape. Time series of Slepian coefﬁcients have been estimated from real along-track GRACE geopotential differences for about ten years, and these coefﬁcients are in agreement with both the spherical harmonic solutions provided by the ofﬁcial centers CSR, GFZ, JPL and the GLDAS model used for validation. The Slepian function analysis highlights the water mass variations at sub-basin scales in both basins.


Introduction
Since 2002, the Gravity Recovery and Climate Experiment (GRACE) launched by NASA and DLR belongs to a new generation of Low-Earth Orbiter (LEO) satellites that improves the resolution of the gravity field to 300-400 km [1,2]. The specificity of the GRACE mission was to map weak changes of the gravity field associated with the moving water mass of the superficial fluid envelopes of the Earth, i.e., atmosphere [3], oceans [4,5], continental water storage [6,7].
Using Spherical Harmonics (SH) is somehow the ancestor of all GRACE signal representation methods and still remains the reference [8]. SH have been initially used to retrieve low degrees, which correspond to long wavelengths of the gravity field, by deriving precise satellite trajectories. The band-limited SH suffer from spatial leakage due to spectrum truncation. SH are a basis of orthogonal functions on the entire Earth sphere and are not adapted to study restricted areas or localized structures, e.g., medium-to small-scale watersheds that cause a loss of their orthogonality. Without this orthogonality condition, any regional decomposition is not unique and cannot be ensured. Improved approaches highlight better regional and time resolutions for GRACE, such as mascons [9], and consider networks of surface tiles for describing hydrological patterns [6]. Besides, mascon solutions require regularization by space and time smoothing [10]. A new representation derived from SH but using the Slepian function (SF) has the advantage to be orthogonal over the globe but also at medium-to small-scale watersheds [11].
A representation in SF that maximizes the information density in spatial and spectral domains is adopted in the present study. It consists of building a finite set of orthogonal functions in a given region, as earlier proposed by [12] to solve geodetic localization problems. To limit the spatial leakage, the SF coefficients of the ten major African basins ( Figure 1) are estimated jointly. Former studies on applications of SF concerned the characterization of deep deformation of important earthquakes by their signature in the gravity field measured by GRACE. The authors of [12] tried to isolate the coseismic gravity variation due to the 2004 Sumatra-Andaman earthquake, while [13] studied the Tohoku earthquake of 2011. In hydrology, the total mass loss of Greenland by melting has been quantified by [14] using GRACE data ranging from 2002 to 2011. The mass balance of the Tibetan Plateau and its surrounding glaciers that have been losing ice mass due to global warming was also quantified by [15]. The authors of [16] explored the possibility that coal mining could cause groundwater depletion consumption in the Loess Plateau using SF representation. A re-evaluation of the mass balance of the Iceland ice cap using SF has also been made by [17].
Remote Sens. 2021, 13, x FOR PEER REVIEW 2 of 16 representation derived from SH but using the Slepian function (SF) has the advantage to be orthogonal over the globe but also at medium-to small-scale watersheds [11]. A representation in SF that maximizes the information density in spatial and spectral domains is adopted in the present study. It consists of building a finite set of orthogonal functions in a given region, as earlier proposed by [12] to solve geodetic localization problems. To limit the spatial leakage, the SF coefficients of the ten major African basins (Figure 1) are estimated jointly. Former studies on applications of SF concerned the characterization of deep deformation of important earthquakes by their signature in the gravity field measured by GRACE. The authors of [12] tried to isolate the coseismic gravity variation due to the 2004 Sumatra-Andaman earthquake, while [13] studied the Tohoku earthquake of 2011. In hydrology, the total mass loss of Greenland by melting has been quantified by [14] using GRACE data ranging from 2002 to 2011. The mass balance of the Tibetan Plateau and its surrounding glaciers that have been losing ice mass due to global warming was also quantified by [15]. The authors of [16] explored the possibility that coal mining could cause groundwater depletion consumption in the Loess Plateau using SF representation. A re-evaluation of the mass balance of the Iceland ice cap using SF has also been made by [17]. In this study, we analyzed two specific basins: Congo/Zaire (white color), and Nile (blue) but SF inversion have been made on the ten major basins, i.e., Niger (red, ~2.26), Zambezi (dark green, ~1.33), Orange (orange, ~0.97), Okavango (purple, ~0.72), lake Tchad-Chari (light blue, ~0.55), Jubba-Shebelle (grey, ~0.49), Limpopo (light green, ~0.41), and Senegal (yellow, ~0.3). Source: http://hydro.iis.u-tokyo.ac.jp/~taikan/TRIPDATA/TRIPDATA.html (last accessed on 31 March 2020); (b) map of climatic zones using the Köppen climate classification, the main watersheds are delimited by a dark red line and the associated hydrographic network corresponds to a dark blue line for the main rivers and a white line is used for small tributaries or gullies; (c) example of the January 2003 GLDAS simulation map, the variations of the EWH are expressed in mm.
We propose a new approach based on a set of orthogonal SF as a regional representation basis to invert directly original KBRR-based geopotential differences to recover the hydrology of two African river basins that offer different watershed shapes and climatologies ( Figure 1b) with geographical details at an unprecedented sub-basin scale (i.e., ~250 km). Usually, the SF are constructed from the SH basis by imposing a bandwidth of L = 60 for GRACE data representation. They are commonly derived using the method of [14], coded in [18], to transform the GRACE Stokes coefficients, i.e., the dimensionless spherical harmonic coefficients of the geopotential into a regional Slepian basis. The lowest frequencies represented by the coefficients of degrees 1 and 2 need to be corrected using satellite ranging measurements [19,20] before being converted into a surface mass density by SH filtering [21]. While these previous works use SH of water mass to derive SF coefficients, we propose here, for the first time, to adjust SF coefficients of water mass directly from along-track GRACE geopotential differences measured and reduced at satellite altitude. The article is structured as follows. After exposing the principles of the SF decomposition In this study, we analyzed two specific basins: Congo/Zaire (white color), and Nile (blue) but SF inversion have been made on the ten major basins, i.e., Niger (red,~2.26), Zambezi (dark green,~1.33), Orange (orange,~0.97), Okavango (purple,~0.72), lake Tchad-Chari (light blue,~0.55), Jubba-Shebelle (grey,~0.49), Limpopo (light green,~0.41), and Senegal (yellow,~0.3). Source: http://hydro.iis.u-tokyo.ac.jp/~taikan/TRIPDATA/TRIPDATA.html (last accessed on 31 March 2020); (b) map of climatic zones using the Köppen climate classification, the main watersheds are delimited by a dark red line and the associated hydrographic network corresponds to a dark blue line for the main rivers and a white line is used for small tributaries or gullies; (c) example of the January 2003 GLDAS simulation map, the variations of the EWH are expressed in mm.
We propose a new approach based on a set of orthogonal SF as a regional representation basis to invert directly original KBRR-based geopotential differences to recover the hydrology of two African river basins that offer different watershed shapes and climatologies ( Figure 1b) with geographical details at an unprecedented sub-basin scale (i.e., 250 km). Usually, the SF are constructed from the SH basis by imposing a bandwidth of L = 60 for GRACE data representation. They are commonly derived using the method of [14], coded in [18], to transform the GRACE Stokes coefficients, i.e., the dimensionless spherical harmonic coefficients of the geopotential into a regional Slepian basis. The lowest frequencies represented by the coefficients of degrees 1 and 2 need to be corrected using satellite ranging measurements [19,20] before being converted into a surface mass density by SH filtering [21]. While these previous works use SH of water mass to derive SF coefficients, we propose here, for the first time, to adjust SF coefficients of water mass directly from along-track GRACE geopotential differences measured and reduced at satellite altitude. The article is structured as follows. After exposing the principles of the SF decomposition for describing regional surface water mass changes in the first part, an inversion of real GRACE geopotential differences to estimate SF coefficients related to the two largest African drainage basins, i.e., the Nile and Congo basins, over successive 10-day periods ranging from 2003 to 2011, is made in the second part of the article. Due the low quality of the GRACE accelerometer and thus the velocity vector measurements and the end-of-life battery lows creating very frequent interruptions in observation after 2011, the inversion of GRACE KBRR data has been converted into along-track potential differences and inverted for estimating the 10-day SF coefficients during the stable period of 2003-2011. The Slepian-based estimates are also confronted to other GRACE solutions and GLDAS model outputs for validation. In the third part, a discussion on the spectrum truncation of SF and the influence of the a priori bandwidth is made. At the end, the conclusion section recaps the main findings of the present study and then possible perspectives of our work are proposed.

(a) The tropical Congo river basin
The Congo River basin is notable for its large, almost circular, drainage shape of 4 million km 2 (see Table 1 and Figure 1). It is the second largest river in the world by discharge [22,23] of~40,000 m 3 /s, referred to as the "Cuvette Centrale" that slows the downstream flow [22,24], and the branching patterns formed by its tributaries indicate different hydro-climatic climatic provinces: the region of the lakes Victoria and Tanganyika ( Figure 1a) which corresponds to the border between the equatorial/monsoon climate and the tropical climate ( Figure 1b) in the East, the tropical forests which correspond to the transition between the tropical climate to the humid subtropical one in the South, the central depression which corresponds mainly to the equatorial/monsoon climate, and the mainly tropical climate with the semi-arid Sahelian influences in the North. The flow regime at the Kinshasa, near the estuary, represents 98% of the basin and shows two peaks of high streamflow each year denoting the bimodal flow regime of April-May (~50 m 3 /s for the period 1998-2010; [25]) and the tropical influence in November-December (120 m 3 /s for the same period). The average annual temperature has increased by~0.5 • C during the last two decades, with an average minimum of 11.7 • C in January and a maximum in March of 37.4 • C (see Table 1). The annual height of the rainfall has a decrease of -6.4 mm during this period [25]. Besides, seasonal variations in the wetland water storage using information of inundations have already been studied by satellite gravimetry [26], by comparison between SH, mascon and Slepian solutions [11], and by comparison between radar altimeter data and GRACE solutions [27]. The Nile river basin is notable for its length (~6650 km). In terms of catchment area (~3 million km 2 ), it is slightly smaller than that of Congo (~4 million km 2 ), with an average discharge of~2800 m 3 /s, and its drainage network extends northward. The Nile basin has a strong climatic diversity and it crosses seven different climatic zones: the desert near its mouth in the North, then it crosses the semi-arid zone and arrives in its central part in a tropical climate which leaves place to an area under marine influence more to the East, and to an equatorial climate in the South (Figure 1): the flow regime presents a multi-modality which increases towards the north as the tributaries converge towards the Nile [11].

Global Land Assimilation System (GLDAS)
The interest of the Global Land Data Assimilation System (GLDAS) is that it integrates satellite-and ground-based observational data products, using advanced land surface modeling and data assimilation techniques with the framework of water use and availability estimations [28] (see description on the site https://ldas.gsfc.nasa.gov/gldas accessed on 31 March 2020). The model allows the input of a huge quantity of observation-based data, executes globally at high resolutions (~27 km to 1 km), and produces EWH grids in near-real time. A vegetation-based tiling approach is used to simulate the sub-grid scale variability. Soil and elevation parameters are based on high-resolution global datasets. Observation-based precipitation and downward radiation products and the best available analyses from atmospheric data assimilation systems are employed to force the models. Data assimilation techniques for incorporating satellite-based hydrological products, including snow cover and water equivalent, soil moisture, surface temperature, and leaf area index, are now being implemented as part of a follow-on project funded by the NASA Energy and Water Cycle Study (NEWS) Initiative. The African climate (Figure 1b) derived from the well-known Köppen climate classification [29] and the modeling of EWH using GLDAS (Figure 1c) are well correlated even if the GLDAS simulation is more dependent on surface and atmosphere exchanges.

KBRR GRACE Data
The on-board K-Band Range Rate (KBRR) of 24-32 GHz is the key science instrument of the Gravity Recovery and Climate Experiment (GRACE) mission. It measures the dual one-way range change of the baseline between the co-planar low-altitude satellites. These measures have a precision of~1 µm/s on velocity difference; or expressed in distance, 10 µm after integration versus time [30]. The average distance between the GRACE vehicles is~220 km. In fact, KBRR is the most precise measure of gravity variation sensed by the GRACE satellite tandem and gives access to surface water mass transfers of the Earth. These measurements are reduced from the non-conservative forces detected by the 3-axis accelerometers (i.e., atmospheric drag and solar pressure), as well as from the contributions of atmosphere, oceanic masses, solid Earth/oceanic tides, polar tides and static field using a priori models. The obtained KBRR residuals are associated to the non-modeled gravitational effects that represent mainly the continental water storage. These potential difference residuals are converted into along-track potential differences between the twin GRACE satellites following the energy integral method, as earlier proposed by [31][32][33].

Construction of an Orthogonal Slepian Function Basis
The SF f are the eigenfunctions of an area-bandwidth limiting operator. They are given as the solutions of the eigenvalue equation: where the elements of the symmetric correlation matrix D are given by: Remote Sens. 2021, 13, 1824 where P n is the Legendre polynomial of degree n and ϕ ij is the angular distance between two points numbers i and j inside the geographical mask of the considered drainage basin. Equation (1) represents the main task for determining the SF for regions of arbitrary shapes [34]. For each eigenvalue, each SF is a separate solution of an eigenvalue equation that maximizes the concentration of information (or equivalently energy) inside a given region, so that the eigenvalue number k is a measure of density of the corresponding SF. Extracting information over the full bandwidth of the solution without filtering, the SF basis provides spatial sensitivity that is superior to the sensitivity of many other modeling methods [11].
Considering only the first eigenvalues of the highest energy, e.g., >0.5, drastically improves the signal-to-noise ratio [35] with very small impact, or leakage effect, outside the region of interest.
A set of SF for each river basin requires starting parameters: (1) the description of the studied region, i.e., the points in the geographical mask, (2) the bandwidth L taken as the spatial resolution of the data, i.e., usually 60 for the GRACE solutions based on SH, and the number of the SF needed for a true description of the hydrological variations. This latter parameter is usually taken as the Shannon number K, i.e., K = ∑ λ k λ max for the defined bandwidth L and catchment [36]. For each river basin, the elements of the matrix D of Equation (2) are produced from the M points inside the basin limited by its geographical boundaries. For example, the 0.5 • -sampled boundary gives M = 656 for the Congo basin. Then, Equation (1) is solved for the M eigenfunctions f k and eigenvalues λ k by considering this particular L. In practice, we check the orthogonality and it is not completely numerically realized for very large Slepian orders, e.g., k > 100. We show in Figure 2 the spectra for the Congo and Nile watersheds. The Shannon's truncation K is strong enough since a significant part (~20% for the Nile;~12% for the Congo) of the total energy is not taken into account in the signal reconstruction. To rectify this, in the Section 2.5, an optimized threshold method is proposed that conserves much of the energy of the SF spectrum, and consequently gives more details (~130% and~165% for Congo and Nile, respectively). (1) where the elements of the symmetric correlation matrix D are given by: where Pn is the Legendre polynomial of degree n and ϕij is the angular distance between two points numbers i and j inside the geographical mask of the considered drainage basin. Equation (1) represents the main task for determining the SF for regions of arbitrary shapes [34]. For each eigenvalue, each SF is a separate solution of an eigenvalue equation that maximizes the concentration of information (or equivalently energy) inside a given region, so that the eigenvalue number k is a measure of density of the corresponding SF. Extracting information over the full bandwidth of the solution without filtering, the SF basis provides spatial sensitivity that is superior to the sensitivity of many other modeling methods [11].
Considering only the first eigenvalues of the highest energy, e.g., >0.5, drastically improves the signal-to-noise ratio [35] with very small impact, or leakage effect, outside the region of interest.
A set of SF for each river basin requires starting parameters: (1) the description of the studied region, i.e., the points in the geographical mask, (2) the bandwidth L taken as the spatial resolution of the data, i.e., usually 60 for the GRACE solutions based on SH, and the number of the SF needed for a true description of the hydrological variations. This latter parameter is usually taken as the Shannon number K, i.e., ∑ for the defined bandwidth L and catchment [36]. For each river basin, the elements of the matrix D of Equation (2) are produced from the M points inside the basin limited by its geographical boundaries. For example, the 0.5°-sampled boundary gives M = 656 for the Congo basin. Then, Equation (1) is solved for the M eigenfunctions fk and eigenvalues λk by considering this particular L. In practice, we check the orthogonality and it is not completely numerically realized for very large Slepian orders, e.g., k > 100. We show in Figure 2 the spectra for the Congo and Nile watersheds. The Shannon's truncation K is strong enough since a significant part (~20% for the Nile; ~12% for the Congo) of the total energy is not taken into account in the signal reconstruction. To rectify this, in the Section 2.5, an optimized threshold method is proposed that conserves much of the energy of the SF spectrum, and consequently gives more details (~130% and ~165% for Congo and Nile, respectively).  Normalized eigenvalue spectra computed from the geographical masks of the Congo and Nile river drainage basins. Shannon numbers K are also indicated (blue line for Nile, grey one for Congo). For the Nile basin around 20% of the normalized energy is not taken into account if the K truncation is used by comparing to the K opt . The overall increase in the number of coefficients, which constitutes the basis of SF, corresponds to~165% (K = 29 versus K opt = 77) for the Nile, and the loss of information is~12% lower for the Congo and is again marked by a strong increase in the number of SF coefficients of about~130% (K = 34 versus K opt = 78). The choice of the maximum mode for constructing the solution from the GRACEderived SF coefficients is examined here. The classical Shannon number K [36] is usually used as the maximum number of terms in the SF development. K depends on the geographical mask of the basin as well as the bandwidth L considered in the SVD decomposition. Alternatively, by keeping 99.9% of the SVD spectrum energy to determine K opt for GRACE potential data inversion, it is possible to accumulate the information of more high-mode coefficients of short wavelengths to complete the surface signal reconstruction by more details as K opt > K (Figure 2), even if the very finer scales (from hundred to few tens of km) of hydrological variability are not reachable due to the upward continuation of very short-scale structure to the satellite altitude.

Bandwidth Analysis
According to Equation (2), the matrix D is determined by considering a certain bandwidth L, which corresponds to the maximum degree of the Legendre polynomial development. Since this parameter is usually taken to be 60 in all the previous GRACE-based studies on SF representation [14,17], it is considered as the most suitable value for representing GRACE data, and therefore this numerical value has been considered in our computation. Besides, possibilities of using optimized values for L are also explored in our study for generating an optimal SF basis that is well-adapted for regions of anisotropic/isotropic shape and/or of large, e.g., L = 30 and small, e.g., L = 120, dimensions.

Polar Reduction
The operator D in Equation (2) also depends on the cosines of the angular distances between the sampled points describing the geographical mask of each basin, the surface distances being greater at the Equator and decreasing with the latitude. To compensate for this distortion and obtain more homogeneity in the spatial sampling of the elementary surfaces versus the centers of each drainage basin, a "Polar reduction" is proposed by operating two successive angular rotations to displace the mask centers and consequently all their surrounding points to: (1) the longitude zero for a Greenwich meridian centering, and (2) the latitude of 90 degrees, i.e., to the North pole, Congo and Nile being in the Northern Hemisphere, before applying the eigenvalue decomposition of the matrix D. The inverse rotations are made on the SVD decomposition-based SF to set them back to the starting location of the center of the basin taken as a position reference. Consequently, the SF favors the description of the centering region of the basins instead of the highest latitude parts. To reduce the geometric distortions and to obtain homogeneous sampling, we have chosen to implement this pole reduction automatically in our processing chain, thus all the results presented in Section 3 will integrate this reduction.

SF Reduction of the Gravity Inverse Problem
The damped Moore-Penrose solution of the problem for estimating the Slepian coefficient vector C of dimension K total , i.e., the total number of Slepian coefficients of the ten African river basins, is: where Y is the N-element vector containing the GRACE-observed potential differences due to a surface distribution of sources for a given period of observation (see Section 2.4), and B * is the numerically stabilized version of the complete SF operator given by: The elements of the Newtonian potential matrix Γ of dimensions N × M are provided by [34], M (that is at least equal or lower than K total ) is the number of surface elements in the basin and F is the block-diagonal matrix gathering the SF f k . B * is built by keeping, e.g., 99.9% of the energy of the eigenvalue spectrum of the N × K total -sized matrix B for attenuating the ill-conditioning.

Results
The first two parts of the results obtained correspond to a sensitivity analysis of the key parameters, i.e., K versus K opt and L, which govern the basis of SF. This analysis was applied to two African basins, Congo and Nile, respectively, according to their surface area which are quite close to~4 and~3 million km 2 (see Table 1). Another key point is the shape of these two basins, the first one, the Congo basin, has an isotropic shape ratio (circular shape with SR~1, Table 1) while the second, the Nile basin, presents a strongly latitudinal anisotropic shape (diamond shape with SR~2.3, Table 1), which will also allow us to examine the impact of the shape and size on the parameterization of both K/K opt and L.

Analysis of the Slepian Coefficient Truncation
We have selected the year 2005 to evaluate the impact of the truncation of a circular SF base. This year is considered as "representative" of the whole time series in terms of rainfall and hydrology. The GLDAS model was used herein as reference and its variations are expressed in mm of Equivalent Water Height (EWH). It shows a clear latitudinal zonation of the Congo basin in two climatic zones: the desert climate in the northern part with a dry period visible in January and in April where the water deficit can reach-75 mm (Figure 3a). This dry period is followed by a wet period visible from June until October with a water supply of up to 100 mm in the western part where the Atlantic Ocean current brings a large part of this humidity.
To better understand the difference between K and K opt , we have computed a circular SF basis using K = 34 and K opt = 78 for the Congo basin. Figure 3a shows the results of this analysis. We compare both SF mode maps and the related GRACE-estimated time series, as well as the GLDAS maps for the same periods. The KBRR inversion using SF for both cases of the Congo and Nile basins, yield results that are very comparable with those of the GLDAS model for frequency (temporal variations) and latitudinal zonation (spatial variations). In terms of intensity, a relatively constant and clear bias is observed like many authors have observed before (in South America [37], Africa [38], Poland [39], and at global scale [40]) and it corresponds, in our case, to a factor~2. So, in order to be able to compare the estimated SF maps, we have chosen to divide the outputs of the GLDAS simulation by 2, at least to ease the graphical representation using the same color scale (Figure 3).
The patterns of the K (column 1, Figure 3a) and the K opt (column 2, Figure 3a) inversions are geographically very close. Some oscillations are clearly visible when we used Shannon criteria, e.g., April 2005, whereas the K opt inversion gives a better spatial continuity closer to those provided by the GLDAS model. The difference between both inversions underlines that the smoothness observed with the K opt threshold is provided by the high-order SF coefficients which are absent from the inversion using the Shannon number threshold (third column of Figure 3). In addition, the influence of these high-order coefficients of SF is all the more visible the greater the latitudinal difference relative to the reference model becomes.
The results for the Nile river basin also present a clear latitudinal zonation: the northern part has a desert climate followed by a thin semi-arid band which then gives way to the south to the dominant equatorial climate to end near Lake Victoria with an equatorial-monsoon climate (Figure 1b). The GRACE and GLDAS results are shown in Figure 3b. The threshold value used for the SF reconstruction are K = 29 and K opt = 77. Despite the fact that the Nile basin has a surface that is slightly smaller than that of the Congo basin (Table 1), its diamond shape poses a real challenge to retrieve latitudinal variations because, if its central part has a sufficiently large scale (diameter~1400 km), its Remote Sens. 2021, 13, 1824 8 of 16 estuary and the supply areas of the basin have smaller sizes (~200-300 km and~500-600 km, respectively).
x FOR PEER REVIEW 8 of 16 The patterns of the K (column 1, Figure 3a) and the Kopt (column 2, Figure 3a) inversions are geographically very close. Some oscillations are clearly visible when we used Shannon criteria, e.g., April 2005, whereas the Kopt inversion gives a better spatial continuity closer to those provided by the GLDAS model. The difference between both inversions underlines that the smoothness observed with the Kopt threshold is provided by the highorder SF coefficients which are absent from the inversion using the Shannon number threshold (third column of Figure 3). In addition, the influence of these high-order coefficients of SF is all the more visible the greater the latitudinal difference relative to the reference model becomes.
The results for the Nile river basin also present a clear latitudinal zonation: the northern part has a desert climate followed by a thin semi-arid band which then gives way to the south to the dominant equatorial climate to end near Lake Victoria with an equatorialmonsoon climate (Figure 1b). The GRACE and GLDAS results are shown in Figure 3b. The threshold value used for the SF reconstruction are K = 29 and Kopt = 77. Despite the fact that the Nile basin has a surface that is slightly smaller than that of the Congo basin (Table  1), its diamond shape poses a real challenge to retrieve latitudinal variations because, if its central part has a sufficiently large scale (diameter ~1400 km), its estuary and the sup- The GLDAS simulation model highlights the latitudinal zonation of the climate. The desert climate can be observed for latitudes ranging from~32 to 20 • with a mean EWH value close to 0 mm. The thin semiarid band corresponds to latitudes ranging from 20 • to 15 • with a weak deficit of water of −25 to 50 mm during the dry season (May), and a tiny excess of~20-25 mm during the wet season (September). The large tropical area ranges between 15 • until 0 • and presents a huge loss of water during the dry season of −25 mm to 150 mm, which transforms during the rainy season into an EWH excess of about 150 mm to 75 mm; and the microclimate around the Lake Victoria, that is impacted by a monsoon climate, presents intensities (excess or deficit) generally lower than the ones in the tropical climate zone.
For the GRACE SF reconstructions, the climate areas are well restored but with a more homogeneous pattern. Thus, the transition between desert, semi-arid and tropical climate (with an excess during the wet season of 125 mm to 75 mm for both K/K opt simulations and a loss of −150 mm (for K simulation) and −100 mm (for K opt ) to 50 mm (for K) to 75 mm (for K opt )) is analyzed gradually without limit, and is as clear as the GLDAS reconstruction permits. The transition to tropical climate and monsoon climate is clearer (100 mm in September and 50 mm in April). The difficulty in the North to find gradients to point out the climate variations are linked to the small extension of this zone which cannot be totally transcribed at the altitude of the GRACE satellite due to upward continuation. As Remote Sens. 2021, 13, 1824 9 of 16 for the Congo basin, the K opt solution is smoother than the K solution, and the pattern of the difference between both models underlines the gap of high-order SF coefficients of the K solution, one can see the orthogonal pattern due to these missing high-order coefficients on the difference map. This pattern is clearer for this diamond-shaped basin than that of the Congo basin, and it demonstrates the advantage of increasing the number of SF coefficients when the basin shape becomes complex. This also points out the artifacts of the K inversion where the gap of high SF coefficients generates a loss of ± 25 mm during the dry period and inversely during the wet period. This difference can reach ±100 mm in the border zone of the basin, a little bit more than in the Congo basin, where the same effects are observed. It is important to mention that these are not edge effects because the GRACE data inversion is done for estimating the SF coefficients of the ten largest basins of African continent, and it is not restricted to the Congo and Nile basins.

Analysis of the Bandwidth
Conventionally, [14] and many other authors who rely on the inversion of harmonic coefficients use a bandwidth L = 60 which allows them to find the maps and time series of solutions from the well-known official centers (JPL, GFZ, CSR, GRGS/CNES, etc.) with great reliability. Here, the philosophy is quite different, since we inverse the KBRR directly, so we must identify if it contains bandwidths more suitable for our approach. However, due to the extensive literature on using this L = 60 bandwidth [11,17,18], we choose to use the L = 60 results as a starting reference and compare them with the other selected bandwidths, i.e., L = 30 and L = 90.
Each SF mode is a combination of a limited band of spatial wavelengths. A spherical Fourier analysis is used to determine this range and the dominant wavelengths of the SF mode, or at least its so-called spatial "scale" that is identified by the largest amplitude (or energy) peak of its spectrum. Figure 4a shows results for the Congo basin. For the L = 30 bandwidth, a linear trend appears with a concentration of the normalized energy for spatial scales ranging from less than 500 km to more than 2500 km for coefficient numbers varying from 40 to 5, respectively. For the upper coefficients, no significant energy is registered. For the L = 60 bandwidth, the energy is concentrated on a noticeably clear linear trend which concentrated the main part of the energy. For this bandwidth, we have a coefficient number which corresponds to a unique spatial scale. In this area, the values of the normalized spectrum of amplitude range from 0.6 to 0.9 with a very low variability, unlike the L = 30 bandwidth. In the case of L = 90, all the energy is concentrated for the high coefficient numbers (47 to 75), but for this bandwidth the variability is twice than that observed for L = 60. For coefficient numbers lower than 47, the SF obtained with L = 90 have a clear loss of energy with values ranging from 0.2 to punctually 0.8. As it has been already demonstrated by previous studies, the option L = 60 gives the best results for the Congo basin, because this bandwidth offers the advantage of concentrating the energy on a single spatial scale for each SF coefficient. In addition, the linear correlation is the best of all the experiments, with several SF coefficient numbers greater than 30 and corresponding to normalized spectrum amplitudes close to 0.75-0.9.
For the Nile (Figure 4b) and for all the bandwidths, there is a much greater variability of spatial scales than those found in the case of the Congo basin. However, a more restricted range of SF orders has to be identified. In other words, for L = 30, the number of significant SF coefficients simply varies between 2 and 26, whereas for the bandwidth L = 60, the number of relevant coefficients varies between 4 and 47. The spectral energies are concentrated between 15 and 40. To explore larger SF orders (ranging from 43 to 76), a larger bandwidth L = 90 must be considered, and in that case, high frequency signals corresponding to scales ranging from 190 to~250 km are preserved.
The bandwidth L and the size of the basin drive the characteristics of the SF, large bandwidths, i.e., L = 90, produce SF of shorter wavelengths so that more Slepian modes are necessary to reconstruct the complete regional signals, increasing the Shannon number K, but these wavelengths are better adapted for describing smaller watersheds than the Congo and Nile ones. The parameter K increases with the size of the basin, e.g., from K~6 for the Orange basin (~1 million of km 2 ) to K = 32 for the Amazon basin (~6 million of km 2 ), for L = 60. The range of SF scales rises with the parameter L and the basin size. In the case of the Amazon basin or even greater surfaces, under the criterion of keeping the most compact basis of SF (the lesser K), the better solution is to consider L = 30. For small watersheds whose sizes are less than one million km 2 , ranges of wavelengths greater than 800 km (L = 30) do not ensure localization of patterns inside these basins.

1, 13, x FOR PEER REVIEW
10 of 16 several SF coefficient numbers greater than 30 and corresponding to normalized spectrum amplitudes close to 0.75-0.9. For L = 30, the maximum amplitudes are for a scale ranging from 1000 km to 500 km. For a coefficient number of ~30 and for L = 60, the amplitude spectrum is more linear with a strong amplitude for a scale ranging from 500 km to ~300 km and for a coefficient number ranging from 30 to 75; for L = 90, the spectrum shows for each coefficient a large range of scales, the linear part with more concentrated scales starts at ~300 km for a coefficient number around 47 to 200 km and for a coefficient number of 75; (b) for the Nile watershed, for L = 30, the maximum amplitudes are for a scale ranging from 1200 km to 500 km and for a coefficient number ~30. For L = 60, the spectrum is less linear than that of the Congo watershed ( Figure 3) but the energy is more dissipated between different scales ranging from 500 km to ~250 km for Slepian coefficient numbers ranging from 4 to 70; for L = 90, the spectrum shows for each SF coefficient, as for the Congo watershed, a broad range of spatial scales, the linear part starts at ~300 km for coefficients around 44 to 200 km and for an SF coefficient number of 75.
For the Nile (Figure 4b) and for all the bandwidths, there is a much greater variability of spatial scales than those found in the case of the Congo basin. However, a more restricted range of SF orders has to be identified. In other words, for L = 30, the number of significant SF coefficients simply varies between 2 and 26, whereas for the bandwidth L = 60, the number of relevant coefficients varies between 4 and 47. The spectral energies are concentrated between 15 and 40. To explore larger SF orders (ranging from 43 to 76), a larger bandwidth L = 90 must be considered, and in that case, high frequency signals corresponding to scales ranging from 190 to ~250 km are preserved.
The bandwidth L and the size of the basin drive the characteristics of the SF, large bandwidths, i.e., L = 90, produce SF of shorter wavelengths so that more Slepian modes For L = 30, the maximum amplitudes are for a scale ranging from 1000 km to 500 km. For a coefficient number of~30 and for L = 60, the amplitude spectrum is more linear with a strong amplitude for a scale ranging from 500 km to~300 km and for a coefficient number ranging from 30 to 75; for L = 90, the spectrum shows for each coefficient a large range of scales, the linear part with more concentrated scales starts at~300 km for a coefficient number around 47 to 200 km and for a coefficient number of 75; (b) for the Nile watershed, for L = 30, the maximum amplitudes are for a scale ranging from 1200 km to 500 km and for a coefficient number~30. For L = 60, the spectrum is less linear than that of the Congo watershed ( Figure 3) but the energy is more dissipated between different scales ranging from 500 km to~250 km for Slepian coefficient numbers ranging from 4 to 70; for L = 90, the spectrum shows for each SF coefficient, as for the Congo watershed, a broad range of spatial scales, the linear part starts at~300 km for coefficients around 44 to 200 km and for an SF coefficient number of 75. Figure 5 displays the spatial components of several SF modes for the Congo basin of quite isotropic/circular shape, as well as their associated time series estimated from the GRACE geopotential differences every 10 days from 2003 to 2011. At the end of 2007 until 2011, these 10-day time series are much more interrupted due to the increase of noise in the along-track geopotential measured by GRACE (outliers: light grey vertical bands in Figure 5). GRACE data after 2011 are much more interrupted.

Analysis of the Resulting Time Series
The first modes that are well-centered on this tropical basin exhibit a strong seasonal cycle and obvious inter-annual variations, e.g., much wetness due to precipitation in this basin starting at the end of 2006. The amplitudes of the last SF coefficients that are related to high-frequency spatial maps are lesser in time, i.e., close to 0 mm of EWH. These high-order SF modes are less localized in space, but bring details, especially at the edge of the Congo basin due to the fact that the SF coefficients are computed for the ten major African basins (no edge effects for the Congo and Nile watersheds). Figure 5 displays the spatial components of several SF modes for the Congo basin of quite isotropic/circular shape, as well as their associated time series estimated from the GRACE geopotential differences every 10 days from 2003 to 2011. At the end of 2007 until 2011, these 10-day time series are much more interrupted due to the increase of noise in the along-track geopotential measured by GRACE (outliers: light grey vertical bands in Figure 5). GRACE data after 2011 are much more interrupted. The first modes that are well-centered on this tropical basin exhibit a strong seasonal cycle and obvious inter-annual variations, e.g., much wetness due to precipitation in this basin starting at the end of 2006. The amplitudes of the last SF coefficients that are related to high-frequency spatial maps are lesser in time, i.e., close to 0 mm of EWH. These highorder SF modes are less localized in space, but bring details, especially at the edge of the Congo basin due to the fact that the SF coefficients are computed for the ten major African basins (no edge effects for the Congo and Nile watersheds).

Analysis of the Resulting Time Series
The total EWH signal over the Congo basin ( Figure 6) is reconstructed by summing all the SF from k = 1 to 78 and spatially-averaged over the basin. It is characterized by an important seasonal cycle of +/−50 mm that remains comparable in amplitude to the ones obtained using the SH solutions. Some 10-day SF solutions could not have been well-estimated due to the poor quality of the along-track real GRACE geopotential data. This is represented by outliers in the time series of each SF coefficient. For the first years of the time series (2003 until end of 2006), the k = 3 component points out a remarkable tendency to dry out the basin with a slope of 97 mm per year. The decreasing water storage occurring up to 2003-2006 can be explained by specific hydro-climatic conditions over the East African Rift, due to a severe drought reported for Equatorial East Africa [41], followed by the positive strong Indian Ocean dipole (IOD) in late 2006. This sudden rise of rainfall forced by the IOD was observed by other authors [42,43] using multi-technique satellite approach, mainly radar altimetry. Then, beginning at the end of 2006 to 2011, a sudden The total EWH signal over the Congo basin ( Figure 6) is reconstructed by summing all the SF from k = 1 to 78 and spatially-averaged over the basin. It is characterized by an important seasonal cycle of +/−50 mm that remains comparable in amplitude to the ones obtained using the SH solutions. Some 10-day SF solutions could not have been well-estimated due to the poor quality of the along-track real GRACE geopotential data. This is represented by outliers in the time series of each SF coefficient. For the first years of the time series (2003 until end of 2006), the k = 3 component points out a remarkable tendency to dry out the basin with a slope of 97 mm per year. The decreasing water storage occurring up to 2003-2006 can be explained by specific hydro-climatic conditions over the East African Rift, due to a severe drought reported for Equatorial East Africa [41], followed by the positive strong Indian Ocean dipole (IOD) in late 2006. This sudden rise of rainfall forced by the IOD was observed by other authors [42,43] using multi-technique satellite approach, mainly radar altimetry. Then, beginning at the end of 2006 to 2011, a sudden increase in the water content, i.e., + 94 mm, appears and remains at this almost constant level. In fact, a weak increase is observed with a trend of 3 mm per year until 2011.
The other SF components present an almost constant level that oscillates around 0 mm annually. This point is very interesting because, for a quasi-circular basin, the first k coefficients carry long-term trends and seasonal variations, meanwhile the highorder SF coefficients testify to local and short-term variations. The seasonal amplitudes of basin-average time series described by GLDAS are at least twice than those of the SF reconstruction, and sometimes slightly out-of-phase by 1-2 months. solutions are less than 2 cm of EWH (~17.5 mm for the tropical humid basin of the Congo and ~19.7 mm for the arid basin of the Nile, that are comparable to the values found by [11], and these differences are mainly due to measurement errors that vary from basin to basin and with size and shape). These RMS differences can be explained by the fact that Slepian functions ensure the concentration of information inside each basin, whereas the 1°-by −1° mascons still suffer from spatial leakage at the basin boundaries [11].  The Nile basin, with its irregular diamond-shape, has its first SF modes centered over the largest latitudinal part of the middle of the basin ( Figure 5). These coefficients concern the Ethiopian highlands in the East, where the precipitations are relatively important. Consequently, these very first SF modes show strong seasonal variations that reach +/−1 m of EWH. A general long-term trend is also visible for the complete time series, it shows a clear increase of the water content for the period 2003-2011. The first SF mode has a slight positive trend of~7 mm per year ( Figure 5) near the Ethiopian Highlands (red-white area of the SF map), as the long-term average annual rainfall is high in this region during the period 1981-2016, with a maximum rainfall record of nearly 2000 mm in the western part of Ethiopia [44]. As for Congo river basin, the coefficients of the highest orders show more local variations, more heterogeneous in space and time with a constant average around 0 mm. The time series of the last maximum mode of k = 77 is nearly flat. The amplitudes of the GLDAS time series for the Nile river basin are remarkably smaller than the SF one, by a factor 2.
As presented in Figure 6, the Root Mean Square (RMS) differences with CSR mascon solutions are less than 2 cm of EWH (~17.5 mm for the tropical humid basin of the Congo and~19.7 mm for the arid basin of the Nile, that are comparable to the values found by [11], and these differences are mainly due to measurement errors that vary from basin to basin and with size and shape). These RMS differences can be explained by the fact that Slepian functions ensure the concentration of information inside each basin, whereas the 1 • -by −1 • mascons still suffer from spatial leakage at the basin boundaries [11].

Discussion
The most famous and oldest data processing centers of GRACE data are without any doubts the Jet Propulsion Laboratory of NASA [21], the German GeoForschungsZentrum (GFZ, Potsdam, Germany) [45], the Center of Space Research (CSR, Austin, TX, USA) [46]. Our results are compared to the SH solutions computed and made available by these reference centers once these solutions are projected onto the regional orthogonal basis of SF.
The SF inversion results for Congo and Nile hydrographic basins presented in Section 3 correspond to the direct outputs of the inversion, and they are comparable to the aver-ages computed using the SH solutions of GFZ, JPL, CSR and the GLDAS simulation ( Figures 5 and 6). Differences between the SF and SH time series are typically less than 10 mm of EWH most of the time and for all the solutions, but for an individual SF coefficient this difference can be greater in space and time.
If we now look into the details for the Congo river basin, Figure 5 shows that the energy is distributed more over the whole spectrum of the harmonic solutions while for the Slepian solutions the energy is concentrated on the first coefficients and the variations of EWH are almost zero for the upper coefficient (K = 4 and 77). Due to the fact that the inversion concentrates the energy on the first order SF coefficients, it is possible to reduce drastically the computational time, especially for an isotropic basin.
The profile of reconstructed signal averaged over the Nile basin shows important differences of amplitude and phase with the ones of the GRACE-based SH, reaching more than 50 mm of EWH, especially during the last years. These important differences can be explained by: (1) the smaller size of the Nile basin that is limited by narrow parts (versus the intrinsic spatial resolution of GRACE) of the drainage area, especially near its delta in the North, and (2) the fact that SH coefficients are fitted globally and include low-degree components, whereas the SF are adjusted regionally, thus cannot contain these low spatial wavelength components. Once again, the GLDAS time series is remarkably lower than the SF and SH ones, by a factor 2.
Irregular-shaped basins, such as the Nile river basin, generate a broad SF coefficient spectrum for any bandwidth L, indicating that many more Slepian coefficients are required for a complete regional and temporal reconstruction of hydrological signals.
As a perspective of this first study, a combination of SF coefficients generated with different bandwidths could be considered to represent more optimally a basin of complex shape (see Figure 7), providing that the condition of their orthogonality remains checked for all SF of the combined bandwidths. Another perspective of this new approach is the global inversion of the Earth system. This kind of inversion will permit to first reduce the computation time but also reduce the leakage effects and better reconstruct high frequencies in terms of space and also time once we have done this inversion following the temporal dimension.

Conclusions
In this study, an innovative approach of a regional pole-centered Slepian function representation is proposed to estimate surface water mass variations in the largest African

Conclusions
In this study, an innovative approach of a regional pole-centered Slepian function representation is proposed to estimate surface water mass variations in the largest African river basins, i.e., Congo and Nile, directly from geopotential differences along GRACE satellite tracks, instead of projecting SH onto an SF basis as it is usually done. Using such a regional orthogonal basis strategy offers the advantages of decreasing the number of parameters (or Slepian coefficients), thus the dimension of the inverse problem, and consequently the computation time. The Slepian coefficients of ten African basins have been estimated every 10 days in the same process to limit spatial leakage. The seasonal cycle amplitudes of these GRACE-recovered signals reaching hundreds of mm of EWH are comparable to the SH ones provided by the CSR, GFZ and JPL, pointing out the influence of latitudinal climatic bands, in particular for the tropical Congo basin. Discrepancies with the GLDAS simulation can be explained by an incomplete description of deep-water reservoirs or lacks of surface information on human activities in GLDAS modeling. The choice of the SF truncation K versus K opt and the a priori bandwidth L for deriving an SF basis is examined. As K is usually taken as the Shannon number in most of the previous studies but a large amount of SF spectrum energy remains lost, so that a higher optimal value K opt , e.g., 99.9% of this spectrum is kept to recover more high-frequencies and ensure a more complete reconstruction of the hydrological signals. It should be noted that the construction of the base of SF and the inversion of the GRACE data are done more quickly than if one uses the SH for better spatially and temporally localized results for the SF. The spatial scales of each Slepian mode have also been characterized by a radial Fourier spectrum analysis, revealing its dependency on L in the decomposition. Besides, a fixed bandwidth of L = 60 keeps on being considered when dealing with GRACE data, increasing (decreasing) this parameter permits to have access to ranges of shorter (larger) spatial SF wavelengths versus the main influence of the size and shape anisotropy of the basin.

Data Availability Statement:
The data products presented in this study are available on request from the corresponding author.