Sub-Daily Natural CO 2 Flux Simulation Based on Satellite Data: Diurnal and Seasonal Pattern Comparisons to Anthropogenic CO 2 Emissions in the Greater Tokyo Area

: During the last decade, advances in the remote sensing of greenhouse gas (GHG) concentrations by the Greenhouse Gases Observing SATellite-1 (GOSAT-1), GOSAT-2, and Orbiting Carbon Observatory-2 (OCO-2) have produced ﬁner-resolution atmospheric carbon dioxide (CO 2 ) datasets. These data are applicable for a top-down approach towards the veriﬁcation of anthropogenic CO 2 emissions from megacities and updating of the inventory. However, great uncertainties regarding natural CO 2 ﬂux estimates remain when back-casting CO 2 emissions from concentration data, making accurate disaggregation of urban CO 2 sources difﬁcult. For this study, we used Moderate Resolution Imaging Spectroradiometer (MODIS) land products, meso-scale meteorological data, SoilGrids250 m soil proﬁle data, and sub-daily soil moisture datasets to calculate hourly photosynthetic CO 2 uptake and biogenic CO 2 emissions with 500 m resolution for the Kant¯o Plain, Japan, at the center of which is the Tokyo metropolis. Our hourly integrated modeling results obtained for the period 2010–2018 suggest that, collectively, the vegetated land within the Greater Tokyo Area served as a daytime carbon sink year-round, where the hourly integrated net atmospheric CO 2 removal was up to 14.15 ± 4.24% of hourly integrated anthropogenic emissions in winter and up to 55.42 ± 10.39% in summer. At night, plants and soil in the Greater Tokyo Area were natural carbon sources, with hourly integrated biogenic CO 2 emissions equivalent to 2.27 ± 0.11%–4.97 ± 1.17% of the anthropogenic emissions in winter and 13.71 ± 2.44%–23.62 ± 3.13% in summer. Between January and July, the hourly integrated biogenic CO 2 emissions of the Greater Tokyo Area increased sixfold, whereas the amplitude of the midday hourly integrated photosynthetic CO 2 uptake was enhanced by nearly ﬁve times and could offset up to 79.04 ± 12.31% of the hourly integrated anthropogenic CO 2 emissions in summer. The gridded hourly photosynthetic CO 2 uptake and biogenic respiration estimates not only provide reference data for the estimation of total natural CO 2 removal in our study area, but also supply prior input values for the disaggregation of anthropogenic CO 2 emissions and biogenic CO 2 ﬂuxes when applying top-down approaches to update the megacity’s CO 2 emissions inventory. The latter contribution allows unprecedented amounts of GOSAT and ground measurement data regarding CO 2 concentration to be analyzed in inverse modeling of anthropogenic CO 2 emissions from Tokyo and the Kant¯o Plain.


Introduction
To address climate change, greenhouse gas inventories including both source emissions and sink removals have been exceedingly important for policymaking and for scientific studies. The primary greenhouse gas in the atmosphere affected by human activities is carbon dioxide (CO 2 ), with a global averaged concentration of 407.8 ± 0.1 parts per million (ppm) in 2018 [1]. Scientific reports such as the Global Carbon Budget 2019 [2] estimated sources and sinks of CO 2 globally. National greenhouse gas inventories have included annual totals and trends of CO 2 emissions and removals by sector. In addition to global and national inventories, the Integrated Global Greenhouse Gas Information System (IG 3 IS) advocates the necessity of advancing estimation and attribution of megacity GHG emissions to support the implementation of the ambition cycle and global stocktake in the Paris Agreement [3]. Cities are the focal points of climate action for the following reasons: more than half of the human population resides in urban areas [4], where cities and their affiliated power plants, airports and other infrastructure constitute the largest human-induced CO 2 emission sources, accounting for 50%-80% of the global total [3,[5][6][7]; also, urban populations will continue growing to 68% of the global total by 2050 [4], with expected rapid rises occurring particularly in Africa, Asia, and Latin America, presumably increasing CO 2 emissions markedly [8,9].
Some city-scale anthropogenic CO 2 emission datasets already exist. Cities such as Los Angeles [10], Paris [11], and Tokyo [12], with abundant resources and the greatest commitments to addressing climate change, self-report their GHG annual emissions. Studies such as those by Gurney et al. [13] and Shan et al. [14] developed urban CO 2 emission inventory frameworks to estimate anthropogenic CO 2 emissions. Nangini et al. [15] synthesized anthropogenic CO 2 emissions in 343 cities from both developed and developing countries. However, a few aspects remain to be improved to acquire comprehensive and consistent urban CO 2 emission datasets. Because of variations in local resources and accounting capacities, global CO 2 emissions data at the city scale are incomplete. Moreover, the representativeness of cities is biased [7], and some cities in the global south are struggling with data scarcity challenges such as inadequate information collection [8,16,17]. Moreover, urban GHG inventory methods are yet to be standardized [7,16]. Because of inconsistent accounted emissions sources, inventory methods and incompatible boundaries [7,9], the contents and quality of city-scale CO 2 emission data are often incoherent, making city-to-city comparisons difficult.
Top-down approach might address the research challenges described above and update megacities' anthropogenic CO 2 emission datasets extensively. In the field of atmospheric sciences, the top-down approach refers to the back-casting of CO 2 flux based on atmospheric concentration data. In fact, verification of greenhouse gas inventories via the top-down approach (atmospheric measurement and inverse model) has been highlighted by the Intergovernmental Panel on Climate Change (IPCC)'s Task Force on National Greenhouse Gas Inventories (IPCC-TFI) in the latest refinement of its 2006 guidelines [18], because this method allows the exploitation of independent observation datasets. Topdown approach can be applied flexibly at various scales. In particular, advances taking place during the last decade of global atmospheric greenhouse gas concentration observations derived from satellite retrievals (GOSAT-1, OCO-2, TanSat, GOSAT-2, and OCO-3) generated CO 2 and methane (CH 4 ) datasets in fine resolution to support such application of inverse models [19][20][21].
Some of these studies accounted the net ecosystem exchange of CO 2 (NEE) in their study areas: Bréon et al. [33], Staufer et al. [34], Boon et al. [37], and Nickless et al. [38] used land surface models [44][45][46][47]; McKain et al. [22], Pillai et al. [35], and Sargent et al. [29] used a terrestrial biosphere model-the Vegetation Photosynthesis and Respiration Model (VPRM, [43,48]), or its urban version UrbanVPRM [49]; Hu et al. [30] used CarbonTracker results [50]. However, models such as VPRM, which calculate sub-daily photosynthetic CO 2 uptake based on simple empirical relations to air temperature and photosynthetically active radiation (PAR), neglect diurnal variation of vegetation-absorbed PAR attributable to cloud conditions and the vegetation canopy structure [51][52][53][54]. First, the flux density of diffuse and direct photosynthetically active photons in the shortwave spectrum of solar radiation varies considerably according to changes in cloud cover, atmospheric aerosols and ozone, solar elevation angle, surface elevation, and terrain [52][53][54] conditions. For instance, the presence of clouds reduces the total photosynthetically active photon flux density (PPFD) reaching the top of the vegetation canopy, whereas the fraction of diffuse photons increases. Compared to clear-sky noon, the fraction of diffuse photons reaching the ground is higher at times near sunrise and sunset. Second, when diffuse and direct photosynthetically active photons pass through a vegetation canopy, they present different absorption, scattering, and transmittance characteristics. Gu et al. [51] analyzed seasonal patterns of photosynthetic parameters based on flux tower measurements. They concluded that without separating diffuse photosynthetically active photons from the total (diffuse + direct), the photosynthetic CO 2 uptake tends to be underestimated. Third, de Pury and Farquhar [55], Gu et al. [51], Urban et al. [56], Chen et al. [57], Luo et al. [58], and Ryu et al. [59] discussed light saturation effects in sunlit leaves vs. light limitation effects on shaded leaves of a plant canopy. The results of those studies suggest that photosynthetic CO 2 uptake tends to be overestimated if leaves of the two types are not separated.
Tokyo, the capital of Japan, is the world's largest city [4]. Together with its neighboring prefectures, the Greater Tokyo Area, at the center of the Kantō Plain, is home to more than 37 million residents. Measurement, reporting, and verification [6] of anthropogenic CO 2 emissions in this megacity would help policymakers, local stakeholders, and the community to act against climate change. For example, Moriwaki and Kanda [39] conducted a one-year measurement of CO 2 flux in a residential area of Tokyo with a significant fraction of vegetation cover, and reported representative seasonal and diurnal variation patterns of anthropogenic CO 2 emission in the city. In addition, Hirano et al. [60] clarified the diurnal variation of CO 2 flux in a less vegetated residential area of Tokyo that was close to arterial roads. Imasu and Tanabe [61] summarized more than ten years of ground-based CO 2 concentration data measurements at five sites in the Greater Tokyo Area, finding different sub-daily and seasonal patterns of CO 2 concentrations from urban sites in Tokyo to rural ones in Saitama prefecture. In other efforts, GOSAT-1 and GOSAT-2 enhanced target mode observation, particularly addressing the Kantō Plain and provide an unprecedented amount of high-resolution atmospheric CO 2 concentration data to elucidate anthropogenic CO 2 emissions in Tokyo and to update its CO 2 inventory.
To support analyses of these data in a top-down approach (i.e., inversion modeling system) for the megacity Tokyo, disaggregating anthropogenic CO 2 emissions from biogenic CO 2 flux at the order of a half-hour to several hours is necessary. To reduce biogenic CO 2 flux interference for city-scale CO 2 emissions estimation in particular, both photosynthetic CO 2 uptake and CO 2 emissions attributable to vegetation respiration, along with soil microbial and passive organic carbon decomposition within urban areas and surrounding vegetated land, should be explicitly considered.
This study was conducted to develop a terrestrial biosphere model calculating subdaily variation of NEE based on satellite data and meteorological conditions, to be coupled with a regional CO 2 transport model for better estimation of anthropogenic CO 2 emissions from megacities, with Tokyo and the Kantō Plain ( target area of study. To set up boundary conditions of CO 2 in the transport model AIST-MM [62,63], the calculation of biogenic CO 2 fluxes was expanded further to the central area of Japan (134.5 • E-142.5 • E, 32.5 • N-38.5 • N). Moreover, the NEE results provide reference information that is useful to estimate net vegetation photosynthetic CO 2 removal from the atmosphere in the region, which itself is part of the CO 2 inventory.

Materials and Methods
The CO 2 flux from an ecosystem to the atmosphere is represented by NEE [64]. In terrestrial ecosystems, NEE is the difference between the gross primary production (GPP) and ecosystem respiration (R e ), where GPP represents the total photosynthetic CO 2 uptake at the ecosystem scale, whereas R e denotes the sum of respired CO 2 by vegetation (R autotrophic ) and CO 2 emitted from organic carbon decomposition processes at the soil surface and in shallow soil layers (R heterotrophic ). Table S1 in the Supplementary Materials presents a list of variables used for the calculation of NEE.

Calculation of Hourly GPP
One key step of GPP calculation is the quantification of the absorbed photosynthetically active photons flux density by the vegetation canopy at local time t (APPFD t ). By generalizing the sun/shade model described by de Pury and Farquhar [55] to various foliage clumping index conditions and types of leaf angle distribution, APPFD t is calculable once diffuse PPFD (PPFD t diffuse ) and direct PPFD (PPFD t direct ) at the vegetation canopy top are known. For this study, PPFD t diffuse was calculated based on diffuse shortwave solar radiation (SR t diffuse ), whereas PPFD t direct was based on direct shortwave solar radiation (SR t direct ) using reference parameter settings in Ito and Oikawa [65]. To estimate SR t diffuse under various sky conditions (sunny to different degree of cloudiness), an artificial neural network (ANN) linear regression [59,[66][67][68] method was developed using the following input variables: low cloud cover (LCDC), middle cloud cover (MCDC), high cloud cover (HCDC), surface atmospheric pressure (P ambient ), shortwave black-sky albedo (α black-sky ), shortwave white-sky albedo (α white-sky ), surface vapor pressure (VP) and cosine of solar zenith angle (θ t SZA ) at local time t. The details of the generalized APPFD t equations and ANN training processes are presented in the Supplementary Materials.
With solved hourly APPFD t , it is then feasible to apply a terrestrial biosphere model, the Biosphere model integrating Eco-physiological And Mechanistic approaches using Satellite data (BEAMS, [69,70]), to estimate hourly GPP. BEAMS adopted the Monteith equation [71,72] with improvement of light use efficiency (LUE) calculation, assuming that GPP is proportional to APPFD t and taking into account the photosynthetic Stress t (unitless) caused by light, soil moisture conditions, surface temperature and relative humidity.
Stress t = P t actual /P t optimal (4) where LUE max represents the maximal photosynthetic light use efficiency in each plant functional type (PFT), P t actual , denotes the actual photosynthetic assimilation rate at given conditions, and P t optimal stands for the optimal photosynthetic assimilation rate at optimal conditions (i.e., 100% surface relative humidity and no soil water stress). Additionally, APPFD t was assumed to be evenly distributed in a big-leaf model [69] to calculate P t actual and P t optimal , which were solved by coupling a simplified Farquhar-von Caemmerer-Berry model [69,[73][74][75][76] of a C3 plant to the stomatal conductance of CO 2 [73,[77][78][79][80] As reviewed in the Introduction, photosynthetic productivity at the canopy scale tends to be overestimated without differentiating sunlit leaves from shaded leaves, which is the case for a big-leaf model. Moreover, the vegetation density, represented by the Leaf Area Index (LAI), in our study area was often dense (LAI ≥ 5 m 2 m −2 ) during the growing season in summer, meaning that only a limited area (≤2 m 2 m −2 ) of leaves would be sunlit. To avoid GPP overestimation, a two-leaf (sunlit/shaded) up-scaling scheme [55,57,58,73,81] was also applied for this study to calculate hourly GPP, based on the average net photosynthetic assimilation rate in sunlit leaves (P n sunlit ) and the average net photosynthetic assimilation rate for shaded leaves (P n shaded ).
The equations used to calculate P n sunlit , P n shaded , LAI t sunlit , and LAI t shaded are given in the Supplementary Materials.

Caculation of Hourly R e
To maintain physiological activities, plants respire CO 2 (R maintenance ). Vegetation biomass pools [69,82,83] were simply divided into foliage (bio t foliage ), wood (bio t live-wood + bio t dead-wood ), and root biomass (bio t root ) for this study. Each was assigned a PFT-specific maintenance respiration rate that was proportional to the amount of biomass and ambient temperature. The R m foliage , R m live-wood , and R m root equations are provided in the Supplementary Materials.
To use the carbon captured during photosynthesis, a plant consumes a fraction of its GPP and emits CO 2 to grow. At annual-or daily-scale modeling, this growth respiration (R growth ) was assumed to be proportional to the difference between GPP and R maintenance [69,82,83]. A similar assumption was made for this study at a sub-daily scale: hourly GPP is almost instantaneously exported and used by foliage, live-wood, and root areas to apply the same calculation method of R growth .
where f Rg stands for the fraction of growth respiration ( [69,83]: 0.14 in deciduous broadleaf forest (DBF), 0.055 in evergreen broadleaf forest (EBF), 0.11 in evergreen needleleaf forest, and 0.25 in grassland (GRS) after calibrations). With solved R growth and R maintenance , autotrophic respiration R autotrophic and net primary production (NPP) were calculated as follows: By assuming that live-wood and root biomass are proportional to foliage biomass in evergreen forests, live-wood biomass proportional to foliage biomass in DBF and root biomass proportional to foliage biomass in GRS, the allocation of hourly NPP was simplified for this study. In DBF, considering that some live roots might remain active and respire CO 2 even if foliage biomass appears to be 0 in early spring or late autumn, the root biomass was assumed as the accumulated result of NPP allocated to roots based on the formula of Friedlingstein et al. [84] minus carbon loss because of maintenance respiration and root mortality. Although dead-wood biomass does not emit respired CO 2 , it contributes to the amount of litter fall, which eventually affects R heterotrophic . Therefore, a simple allocation scheme of NPP was applied to calculate dead-wood biomass in forests of the respective types. Additional details are presented in the Supplementary Materials.
Compared to photosynthetic assimilation, the CO 2 emission rate attributable to soil organic carbon (SOC) decomposition is often lower, but SOC is estimated to store 2/3 of the carbon in terrestrial ecosystems [85]. Consequently, the total amount of CO 2 emit-ted from soil carbon pools should not be neglected. In this study, three surface SOC pools (surface structural litter carbon (SOC surface structural ), surface metabolic litter carbon (SOC surface metabolic ), and surface microbe (SOC surface microbe )), and five soil SOC pools (root structural litter carbon (SOC root structural ), root metabolic litter carbon (SOC root metabolic ), soil microbe (SOC soil microbe ), soil slow carbon (SOC soil slow ), and soil passive carbon (SOC soil passive )) were identified from 0 to 28 cm-deep soil layers. Carbon flow among these SOC pools and emitted CO 2 via decomposition processes were simulated based on earlier studies of the CENTURY model [69,[86][87][88]. Followed the volcanic ash soil assumption of Japanese soil in Hashimoto et al. [89], the maximum decomposition rates [88] were adjusted by 1/2.3. The equations of carbon influx and efflux flows are presented in the Supplementary Materials, including CO 2 emission R h SOC in each SOC pool, as shown below:

Parameter Calibration and Model Validation at the Site Level
According to the register of JapanFlux [90], 39 Japanese sites have been monitoring mass and energy fluxes over terrestrial ecosystems, including CO 2 fluxes. However, only the GPP, R e , and NEE data of six sites are accessible in the Forestry and Forest Products Research Institute FluxNet Database (FFNet DB, [91]): Sapporo (SAP), Appi (API), Kawagoe, Fujiyoshida (FJY), Kahoku, and Yamashiro (YMS). Additionally, the data of the Moshiri Birch Forest site and the Seto Mixed Forest site are included in the FLUXNET2015 dataset [92]. Nevertheless, no data of Japanese EBF or GRS were available when we developed our model. To complete parameter calibration and model validation for different ecosystem types, some GPP and R e data from outside Japan (China and Malaysia) were used. To be more specific, half-hourly data of two FFNet DB sites (SAP [93,94] and FJY [95][96][97]) and two FLUXNET2015 sites (Dinghushan DIN [98]) and Changeling CNG [99]) were obtained to calibrate PFT-specific parameters in hourly GPP BEAMS and GPP TLM modules. To validate the calibrated modeling settings, half-hourly data of one FFNet DB site (API [100][101][102][103]) and three FLUXNET2015 sites (Qianyanzhou QIA [104], Pasoh forest reserve PSO [105], and Duolun DU2 [106]) were obtained. Table 1 presents site information such as the PFT type, coordinates, and period for which data were obtained. Half-hourly GPP data with air temperature, relative humidity, wind speed, downward shortwave solar radiation, and precipitation records were screened in each FFNet DB data subset. Only the best quality data (quality flag = 0) were retained in each FLUXNET2015 data subset. Post-screened data were then converted to the hourly average for model calibration and validation at the site level.
Because continuous LAI records were not measured in either FFNet DB or FLUXNET2015 sites, it was extracted for the grid cell where each site was located from the MODIS LAI product (MCD15A3H v006 [107], 4-day 500 m) as proxies of the site conditions. Bad quality LAI data were replaced by interpolated values, as described further in the Supplementary Materials. Similarly, the canopy clumping index (Ω) was extracted from the MODIS Bidirectional Reflectance Distribution Function (BRDF)-derived dataset [108] with 500 m resolution, and both α black-sky and α white-sky were based on the MODIS albedo product (MCD43A3 v006 [109], daily 500 m). Although some FFNet DB and FLUXNET2015 sites included 0-5 cm depth soil moisture data, this layer is too shallow to estimate the comprehensive effects of soil water availability on photosynthesis. Soil water content θ k (m 3 m −3 ) for the 0-7, 7-28, 28-100, and 100-289 cm soil layers was therefore extracted from the ERA-5 Land reanalysis dataset [110] (hourly 9 km), whereas the fractions of clay, sand, and silt and the amount of soil organic carbon for each site were obtained from the SoilGrids250m dataset [111]. In addition, cloud cover and atmospheric pressure conditions at each FFNet DB site were based on the hourly Meso-Spectral Model Grid Point Value (MSM-GPV) reanalysis dataset with 5 km resolution provided by the Japanese Meteorological Agency [112]. The ANN Remote Sens. 2021, 13, 2037 7 of 32 method developed for this study to estimate SR t diffuse was not applicable to FLUXNET2015 sites because of the lack of cloud cover data, and in such cases, the empirical method of Luo et al. [58] was adopted instead. During calibration, each PFT-specific parameter for DBF, EBF, ENF, and GRS was increased or decreased by 10% until: (1) the correlation coefficient between hourly flux tower estimate (GPP tower ) and modeled GPP was as close to 1 as possible and this change in parameter did not conflict with (2) or (3); (2) the root mean square error (RMSE) between hourly GPP tower and modeled GPP was as low as possible and this change in parameter did not conflict with (1) or (3); (3) the standard deviations of hourly GPP tower and modeled GPP were as similar as possible and this change in parameter did not conflict with (1) or (2). Table 2 presents the calibrated parameters used to calculate hourly GPP. After calibration, hourly GPP values at the validation sites were also calculated and compared with GPP tower .   [113][114][115] data were also added, and flux tower estimates of the mixed broadleaf forest with both deciduous and evergreen species were compared against DBF modeling results and EBF modeling results, respectively. Half-hourly GPP, R e , and NEE data with complete air temperature record were selected in each FFNet DB data subset. Only the best quality half-hourly GPP, R e , NEE, and air temperature data (quality flag = 0) were selected in each FLUXNET2015 data subset. Hourly averages were then calculated for R e modeling calibration and validation at the site level. Additional required variables such as LAI, θ k , soil temperature T k soil , and fractions of the respective soil components in the 0-7 and 7-28 cm soil layers were assumed with grid values of the MODIS LAI product (MCD15A3H [107]), the ERA-5 Land reanalysis dataset [110], and the SoilGrids250m dataset [111], respectively, same as the case of GPP. To set up initial SOC conditions, hourly R e simulations based on CENTURY [86][87][88] were first conducted in spin-up mode to accumulate organic carbon, until the carbon stock in the 0-28 cm soil layers approximately reached 6.95 ± 3.25 kg m −2 , a range estimated based on the National Forest Soil Carbon Inventory of Japan [116]. Then, hourly R e and NEE were modeled to calibrate the parameters and to validate the R e model. The details of the related equations and parameter settings are available in the Supplementary Materials.

Hourly GPP and R e Modeling at the Regional Scale
The Kantō Plain, circumjacent to Tokyo, is a region of heterogeneous land cover and complex terrain (Figure 1). To assign PFT-specific parameters to each forest and grassland grid cell, the MODIS land cover type dataset (MCD12Q1 v006 [117], annual 500 m) was used as a reference after quality control screening (only retaining classified land and forest type changed land). Because of insufficient access of Japanese urban vegetation CO 2 flux data, its land-cover-specific traits were based on grassland parameter values in non-tree vegetated parts. Because valid LAI values of partially vegetated urban land presented clear appearing-disappearing seasonal patterns in our study area, the parameters of urban trees were temporarily set to those of DBF. For each urban grid cell, the relative fractions of tree cover and non-tree vegetation cover were obtained from the MODIS vegetation continuous fields product (MOD44B v006 [118], annual 250 m) after quality control screening (count of bad quality period ≤ 1) to calculate the fraction of partial vegetation cover.
To calculate the hourly 500 m × 500 m GPP and R e in the study area during the period 2010-2018, the MODIS LAI dataset (MCD15A3H v006 [107], 4-day 500 m), Land Cover Dynamics dataset (MCD12Q2 v006 [119], annual 500 m), and BRDF-derived Clumping Index [108] were applied to quantify the regional vegetation canopy structure after quality control screening (details are presented in the Supplementary Materials). The surface albedo was set by the MODIS albedo product (MCD43A3 v006 [109]). The fractions of clay, sand, and silt and the amount of organic carbon in the 0-200 cm soil layers were converted from the SoilGrids250m dataset [111]. Before March 2015, soil moisture and temperature data were based on the ERA-5 Land reanalysis dataset [110]. After March 2015, root zone soil moisture data were also extracted from the Soil Moisture Active Passive (SMAP) Root Zone (0-100 cm) Soil Moisture Geophysical Dataset (SMAPL4SMGPH v5 [120], 3-h 9 km) to calculate the effects of soil water availability on photosynthesis and to calculate the hydraulic limit of transpiration. Hourly surface air temperature, relative humidity, atmospheric pressure, wind speed, precipitation, cloud covers, and downward shortwave solar radiation in the MSM-GPV reanalysis dataset [112] were also processed as input data for the calculations. Depending on the data file format, the resampling of the input data to geographic projection was realized using the HDF-EOS to GeoTIFF Conversion Tool [121], Geospatial Data Abstraction Library [122] or wgrib2 [123].
Smoothing iteration in gdal_fillnodata.py [122] was applied to fill missing edge data of soil moisture or temperature. Similar to the modeling at the site level, a spin-up simulation was conducted regionally to initialize SOC in the study area: input data of the period 2010-2018 were used repeatedly until the carbon stock in the 0-28 cm soil layers ( Figure S1 in the Supplementary Materials) reached approximately 6.95 ± 3.25 kg m −2 [116].
Furthermore, hourly modeling results were aggregated and compared to the MODIS Gross Primary Productivity (MOD/MYD17A2HGF v006 [124,125] [127,128], annual 500 m). By calculating the difference between the modeling output and existing satellite datasets, the performances of our GPP and R e model were analyzed further.

Estimation of Integrated Anthropogenic CO 2 Emissions in the Greater Tokyo Area
Finally, the calculation results in Section 2.4 were compared with anthropogenic CO 2 emissions data in the Greater Tokyo Area spatially and at various time scales. Annual anthropogenic CO 2 emission data by sector (TotalE sector 2018 , e.g., construction, industry, agricultural waste burning, vehicle, ship, aircraft) in Japan in 2018 were obtained at as high a rate as possible from the national greenhouse gas emissions report provided by the Ministry of the Environment of Japan [129]. For our study area specifically, hourly human-induced CO 2 emissions in January to December were estimated based on the East Asian Air Pollutant Emissions Grid Database (EAGrid2000Japan [130]), where the value of each grid cell by sector (GridCellE sector EAGrid2000Japan ) was adjusted using a constant ratio (f sector ) of total CO 2 emissions in 2000 to that in 2018: GridCellE sector In addition, integrated anthropogenic CO 2 emission of all source sectors, GPP, R e , and NEE of the Greater Tokyo Area were calculated to elucidate the patterns of sub-daily and seasonal variations: the mean and standard deviation of biogenic integrated fluxes were calculated based on hourly results of all the days in each month during the period 2010-2018.

Hourly GPP at the Site Level
Comparison between hourly modeled GPP and FFNet DB/FLUXNET2015 estimation provides reference information about the modeling uncertainty of CO 2 uptake via photosynthesis in DBF, EBF, ENF, and GRS. As presented in Figure 2a (the evaluation metrics values also detailed in Table S2), the degree of similarity [131] between hourly GPP BEAMS (Big-Leaf upscaling scheme) and GPP tower was greater than 0.6 in both calibration and validation groups. Stated more specifically, the hourly GPP BEAMS agreed closely with GPP tower in ENF (R 2 ≥ 0.75), followed by DBF (R 2 ≥ 0.63), EBF (R 2 ≥ 0.53), and GRS (R 2 ≥ 0.49). The difference between flux tower-estimated and modeled standard deviation of hourly GPP at the site level was always within 1.22 µmolCO 2 m −2 s −1 , and therefore was acceptable. The RMSE of the hourly GPP BEAMS was less than 6.1 µmolCO 2 m −2 s −1 or 0.26 gC m −2 h −1 , where the largest was found for DBF and the smallest was found for GRS.
Comparison between hourly modeled GPP and FFNet DB/FLUXNET2015 estimation provides reference information about the modeling uncertainty of CO2 uptake via photosynthesis in DBF, EBF, ENF, and GRS. As presented in Figure 2a (the evaluation metrics values also detailed in Table S2), the degree of similarity [131] between hourly GPPBEAMS (Big-Leaf upscaling scheme) and GPPtower was greater than 0.6 in both calibration and validation groups. Stated more specifically, the hourly GPPBEAMS agreed closely with GPPtower in ENF (R 2 ≥ 0.75), followed by DBF (R 2 ≥ 0.63), EBF (R 2 ≥ 0.53), and GRS (R 2 ≥ 0.49). The difference between flux tower-estimated and modeled standard deviation of hourly GPP at the site level was always within 1.22 μmolCO2 m -2 s -1 , and therefore was acceptable. The RMSE of the hourly GPPBEAMS was less than 6.1 μmolCO2 m -2 s -1 or 0.26 gC m -2 h -1 , where the largest was found for DBF and the smallest was found for GRS. The accuracy of the hourly modeled GPPTLM (two-leaf upscaling scheme) was better than that of the hourly modeled GPPBEAMS when compared to GPPtower, but with one exception in the validation group of ENF (QIA). As with GPPBEAMS, the similarity between hourly GPPTLM and GPPtower was greater than 0.6 in both the calibration and validation groups, according to Figure 2b and Table S2. Hourly GPPTLM also showed very good agreement with GPPtower in ENF (R 2 ≥ 0.74), DBF (R 2 ≥ 0.67), EBF (R 2 ≥ 0.61), and GRS (R 2 ≥ 0.50). The difference between the flux tower estimated and modeled standard deviation of hourly GPP in the two-leaf module was within 1.05 μmolCO2 m -2 s -1 , smaller than that in the case of GPPBEAMS. The RMSE of hourly GPPTLM was less than 5.55 μmolCO2 m -2 s -1 , or 0.24 gC m -2 h -1 , which was again smaller than the value of hourly GPPBEAMS.

Hourly Re and NEE at the Site Level
The comparison between hourly modeled Re and FFNet DB/FLUXNET2015 estimation provides reference information about the modeling uncertainty of CO2 emission via The accuracy of the hourly modeled GPP TLM (two-leaf upscaling scheme) was better than that of the hourly modeled GPP BEAMS when compared to GPP tower , but with one exception in the validation group of ENF (QIA). As with GPP BEAMS , the similarity between hourly GPP TLM and GPP tower was greater than 0.6 in both the calibration and validation groups, according to Figure 2b and Table S2. Hourly GPP TLM also showed very good agreement with GPP tower in ENF (R 2 ≥ 0.74), DBF (R 2 ≥ 0.67), EBF (R 2 ≥ 0.61), and GRS (R 2 ≥ 0.50). The difference between the flux tower estimated and modeled standard deviation of hourly GPP in the two-leaf module was within 1.05 µmolCO 2 m −2 s −1 , smaller than that in the case of GPP BEAMS . The RMSE of hourly GPP TLM was less tha 5.55 µmolCO 2 m −2 s −1 , or 0.24 gC m −2 h −1 , which was again smaller than the value of hourly GPP BEAMS .

Hourly R e and NEE at the Site Level
The comparison between hourly modeled R e and FFNet DB/FLUXNET2015 estimation provides reference information about the modeling uncertainty of CO 2 emission via autotrophic and heterotrophic respiration in DBF, EBF, ENF, and GRS, as detailed in Figure 3a and Table S2. In DBF, ENF, and GRS, the hourly modeled R e agreed well with the R e estimates based on flux tower monitoring data (R 2 ≥ 0.71, 0.52, and 0.66, respectively). In EBF, the calculated hourly R e agreed poorly with flux tower estimates at low latitude sites-DIN and PSO. At the mixed broadleaf forest site YMS in Japan, simulated hourly R e using both DBF and EBF parameters agreed well with flux tower-estimated R e (R 2 was 0.68 and 0.76). In ecosystems of all four types, hourly R e was overestimated. The standard deviation of hourly modeled R e was 0.70-1.24 times (absolute difference within 0.66 µmolCO 2 m −2 s −1 that of the flux tower estimated one. Consequently, the RMSE of hourly R e was up to 2.61 µmolCO 2 m −2 s −1 , or 0.11 gC m −2 h −1 in EBF, followed by ENF, DBF, and GRS, in descending order. Hourly R e was approximately one to two thirds of hourly GPP in terms of the absolute value, while the former often presented much less sub-daily variations than the latter. As a result, their difference, hourly modeled NEE, agreed well with flux tower estimates at the site level simulations (Figure 3b and Table S2), though hourly modeled R e tended to be overestimated. In DBF, the correlation coefficient of hourly NEE was equal to or larger than 0.95, followed by ENF (R 2 ≥ 0.79), EBF (R 2 ≥ 0.74), and GRS (R 2 ≥ 0.41). Contrary to the hourly modeled R e , the standard deviation of the hourly modeled NEE was only 0.80-0.98 times (the difference of the absolute values was within 1.33 µmolCO 2 m −2 s −1 ) that of the flux tower estimated one. The RMSE of the hourly modeled NEE was up to 5.43 µmolCO 2 m −2 s −1 , or 0.23 gC m −2 h −1 in EBF, followed by ENF, DBF, and GRS, in descending order. The standard deviation of the hourly modeled NEE at YMS was close to that of the flux tower-estimated NEE, when assuming with either DBF or EBF parameters; the RMSE of the hourly modeled NEE was 3.21 and 3.37 µmolCO 2 m −2 s −1 , while the correlation coefficient was 0.51 and 0.50, respectively. These results might be partially attributable to the overestimation of hourly R e : overestimated R e can engender a positive NEE value, representing a false net carbon release, when the actual R e was smaller than GPP and there was actually net carbon uptake.

Hourly GPP at the Regional Scale
Because point-scale modelling results suggest that hourly modeled GPP TLM was close to or more accurate than hourly modeled GPP BEAMS , only the GPP TLM estimates at the regional scale were presented in the following analyses for simplicity, although regional simulations of the latter were conducted and generated similar results. Figure 4a-g present hourly GPP results of the study area from dawn to dusk on a summer day as examples of diurnal variations of vegetation CO 2 uptake. Hourly GPP was generally low (<0.75 gC m −2 h −1 ) in the early morning because of the limited availability of solar radiation. By virtue of the favorable weather conditions (e.g., 20-34 • C air temperature, 0.5-0.9 soil moisture wetness) in the study area, as the level of shortwave solar radiation increased, the hourly GPP raised quickly (up to 1.5 gC m −2 h −1 ) and peaked before noon. Then, in the late afternoon, the hourly GPP dropped and became 0 after sunset. Spatially, the hourly GPP was up to 1.19 ± 0.27, 0.90 ± 0.18, and 0.93 ± 0.18 gC m −2 h −1 in ENF, EBF, and DBF, and 1.19 ± 0.39 gC m −2 h −1 in GRS, as the means and standard deviations of hourly GPP among all grid cells in Figure 6 show. In contrast, the hourly GPP was only up to 0.69 ± 0.19 gC m −2 h −1 in urban grid cells with partial vegetation cover. Depending on the distribution of cloud cover and vapor pressure, in cloudy summer periods with highly diffuse shortwave solar radiation ratios, plants in forests or grasslands effectively absorbed many of the available photosynthetically active photons, although the downward shortwave solar radiation was as low as 200-300 W m −2 . Consequently, the hourly GPP in such cloudy conditions could reach 0.6-1.2 gC m −2 h −1 . In raining areas, precipitation interfered with stomatal conductance of CO 2 in photosynthesis, thereby temporarily reducing the hourly modeled GPP to 0. Over seasons, the range and spatial distribution of the hourly GPP varied according to the changes in meteorological and ecological conditions. Because of limited space, more examples of the hourly modeling results are not presented here.
In addition, Figure 4 also includes diurnal variations of the modeled fraction of absorbed PPFD on the same summer day. By virtue of the density of the leaf area in summer, the fraction of absorbed PPFD by the vegetation canopy was high (>0.7) in most of the forests in the study area but lower (0.2-0.6) in less densely vegetated urban grid cells. In agreement with earlier studies [57,132], this variable also exhibited diurnal change, higher in the early morning and late afternoon of a sunny day, when the ratio of diffuse PPFD was relatively high, as Figure 4h,i,m,n show.

3.4.
Hourly R e at the Regional Scale   Hourly NEE (the difference of R e and GPP) in different types of ecosystem was also presented by the mean and standard deviation among all grid cells in Figure 6. At night, R e was the sole contributor of hourly NEE, and therefore the latter shared the same range of values described in the paragraph above. In the daytime, the strongest net carbon uptake occurred in ENF, up to -0.82 ± 0.22 gC m −2 h −1 , followed by DBF, EBF, urban land with partial vegetated cover, and GRS, up to −0.66 ± 0.13, −0.65 ± 0.16, −0.49 ± 0.15, and −0.35 ± 0.41 gC m −2 h −1 , respectively. Furthermore, Table 3 summarizes the annual averaged GPP and R e during the period 2010-2018 in different types of ecosystem within the study area. According to our calculation, the highest annual averaged GPP occurred in ENF, about 1834.27 ± 38.60 gC m −2 year −1 . By virtue of the longer photosynthetic active period in EBF than in DBF, the annual averaged GPP in EBF (1484.26 ± 22.22 gC m −2 year −1 ) was significantly higher than that in DBF (1254.42 ± 67.63 gC m −2 year −1 ), even though the hourly averaged GPP in these two types of broadleaf forest was similar from May to October. In GRS, the annual averaged GPP was 1409.33 ± 185.44 gC m −2 year −1 . The annual averaged GPP in partially vegetated urban areas was 1020.77 ± 93.59 gC m −2 year −1 , probably because of a less dense LAI level than that in forests or grassland. The strongest annual averaged R e occurred in GRS, about 1308.89 ± 280.83 gC m −2 year −1 , followed by ENF, EBF, and DBF in descending order. The annual averaged R e in partially vegetated urban area was only 393.85 ± 38.41 gC m −2 year −1 ; both the limited SOC-decomposition-induced CO 2 emissions and less dense LAI level probably caused this.

Comparison of Regional Scale Modeling Results and Satellite Data
Hourly modeled GPP at the regional scale also presented clear seasonal changes. The reason to calculate 8-day GPP of the study area was that few GPP datasets or reference values in earlier studies were available at the sub-daily scale. In fact, only Bodesheim et al. [133] has modeled global half-hourly biogenic CO 2 flux data, but the spatial resolution (0.5 • ) of this dataset was too coarse to validate our modeling results. Therefore, comparisons between our 8-day GPP results and MODIS GPP [124,125] or GOSIF GPP [126] (Figure 7e-p) were presented. As Figure 7e-h show, our model estimated higher (0.0-30.0 gC m −2 ) 8-day GPP than the MODIS one in evergreen forests in the northeast and west of the study area. In contrast, the modeled 8-day GPP was 0.0-20.0 gC m −2 lower compared to MODIS 8-day GPP in the western deciduous forests and southeastern evergreen forests; for the latter, the discrepancy was up to 30.0 gC m −2 in summertime (Figure 7g). Over seasons, the range of difference was larger in summer than in winter while the spatial patterns remained the same, as the average and standard deviation of the difference between the modeled and MODIS 8-day GPP during the period 2010-2018 in Figure 7m-o indicate.
In January (Figure 7i) and October (Figure 7l), the difference between the modeled and GOSIF 8-day GPP was similar to the case of the modeled and MODIS 8-day GPP comparisons. Interestingly, GOSIF 8-day GPP was lower (up to 40.0 gC m −2 ) than our modeled results or MODIS GPP in April (Figure 7j), and then became much larger than the latter two estimates in July (Figure 7k). The favorable conditions (e.g., warm temperature and sufficient solar radiation) in spring and frequent rain in summer shaped the seasonal patterns of our modeled GPP, which probably led to the discrepancy with GOSIF 8-day GPP. The average and standard deviation of the difference between the modeled and GOSIF 8-day GPP during the period 2010-2018 are presented in Figure 7n (Figure 8b,c) values, especially in evergreen forests in the northeast and west of the study area. This was probably because of lower estimates of autotrophic respiration and growth respiration rate of plants in particular, considering that the modeled GPP showed good agreement with MODIS GPP and that the algorithm of maintenance respiration in this study was similar to the MODIS method. Because of limited space, comparisons of 8-day GPP and annual NPP to satellite data in 2010-2017 are not presented, but were very similar to those obtained for cases of 2018.

Biogenic CO 2 Fluxes and Anthropogenic CO 2 Emissions in the Greater Tokyo Area
The spatial distribution of anthropogenic CO 2 emissions and biogenic CO 2 fluxes in the Greater Tokyo Area barely overlapped and exhibited some asynchronous characteristics, as the examples of our modeling results in Figures 4, 5 and 9 demonstrate. On one hand, Tokyo was surrounded by evergreen and deciduous forests in Ibaraki, Chiba, Saitama, and Kanagawa prefectures, which were sources of biogenic CO 2 . On the other hand, strong anthropogenic CO 2 emissions occurred from the Tokyo metropolis, as well as from the Keihin and Keiyou industrial areas near Tokyo and the Kashima seaside industrial area (marked by dashed cycles in Figure 1, respectively), where power plants and factories are concentrated. If the range of the biogenic CO 2 flux in our study area was much smaller than that of anthropogenic CO 2 emissions, then neglecting the former in a topdown approach modeling would not cause much error in back-casting results. Otherwise, accurate calculation of the biogenic CO 2 flux is essential for updating the anthropogenic CO 2 emission inventory in the megacity of Tokyo. The comparisons of hourly integrated GPP (photosynthetic CO 2 uptake), R e (biogenic CO 2 emission), and NEE (net biogenic CO 2 flux) to hourly anthropogenic CO 2 emissions in the Greater Tokyo Area (Figure 10) provide some reference to evaluate this question. For our study, "hourly integrated" refers to the sum of modeled results in all grid cells within 139 • E-141 • E, 35 • N-36.33 • N, averaged over all days in each month during the period 2010-2018. The mean and standard deviation of hourly integrated biogenic CO 2 fluxes are elaborated as below. To elaborate, first, hourly integrated anthropogenic CO 2 emissions of the Greater Tokyo Area presented similar diurnal variations year-round; the lowest level (2.37-2.64 × 10 4 ton CO 2 h −1 ) was found after midnight when human activities were least active. As traffic rush hours started in the morning, hourly integrated anthropogenic CO 2 emissions increased gradually and almost doubled by around 9 a.m. to 4.05-4.63 × 10 4 ton CO 2 h −1 . From then until late afternoon, the integrated anthropogenic CO 2 emissions entered a plateau phase and then started to drop at about 8 pm, after the evening traffic rush hours.
Secondly, the hourly integrated biogenic CO 2 emissions of the Greater Tokyo Area, the total of vegetation-respired CO 2 and CO 2 emitted via SOC decomposition processes, exhibited both sub-daily and seasonal variations. At night, the lowest hourly integrated biogenic CO 2 emissions (0.09 ± 0.01 × 10 4 ton CO 2 h −1 ), which was about 3.07 ± 0.51% of the hourly integrated anthropogenic CO 2 emission in the area, occurred at 4 a.m. in January. In the daytime, the hourly integrated biogenic CO 2 emissions became stronger and peaked in the early afternoon (around 3 p.m.). The peak biogenic CO 2 emission (0.16 ± 0.03 × 10 4 ton CO 2 h −1 ) was equivalent to about 3.55 ± 0.79% of the hourly integrated anthropogenic CO 2 emissions in January. Such a rise in integrated biogenic CO 2 emissions was probably attributable to increased temperatures and additional vegetation growth respiration when photosynthesis was active during the day.
As meteorological conditions became more favorable for SOC decomposition from winter to summer and vegetation became more active in terms of both maintenance and growing activities, hourly integrated biogenic CO 2 emissions of the Greater Tokyo Area increased considerably. In July, the lowest hourly integrated biogenic CO 2 emissions were at 5 a.m. and was estimated to be 0.56 ± 0.08 × 10 4 ton CO 2 h −1 , roughly 18.76 ± 2.80% of the hourly integrated anthropogenic CO 2 emissions in the Greater Tokyo Area and 6.35 times the biogenic value in January. The strongest hourly integrated biogenic CO 2 emissions were found at noon and were estimated to be 0.96 ± 0.13 × 10 4 ton CO 2 h −1 , which was 6.39 times that of January's biogenic level, and 23.62 ± 3.13% of the hourly integrated anthropogenic CO 2 emissions, as the latter also increased after several hour. Figure 10. (a-l) Hourly integrated anthropogenic CO 2 emissions (10 4 ton CO 2 h −1 ) in 2018, compared to hourly integrated biogenic CO 2 emissions, photosynthetic CO 2 uptake, and net biogenic CO 2 flux of the Greater Tokyo Area (10 4 ton CO 2 h −1 ) during the period 2010-2018. Shadows represent the variation (the minimum to the maximum) of hourly integrated biogenic CO 2 emission (red), photosynthetic CO 2 uptake (green), and net biogenic CO 2 flux (blue).
Thirdly, hourly integrated photosynthetic CO 2 uptake of the Greater Tokyo Area represents the amount of absorbed carbon by forests, grassland or urban vegetation, varying both diurnally and seasonally. Because the calculation of GPP in our model was under the assumption that the supply of CO 2 for photosynthetic reactions in C3 plants is controlled by stomata in daytime only, the hourly integrated photosynthetic CO 2 uptake at night was 0. In winter, the absolute value of hourly integrated photosynthetic CO 2 uptake in both the early morning and late afternoon was low and was the lowest (0.04 ± 0.06 × 10 4 ton CO 2 h −1 and 1.34 ± 2.03% of the hourly integrated anthropogenic CO 2 emissions in the Greater Tokyo Area) at 5 a.m. in January. Because photosynthetic enzymes became more active at higher winter temperatures and because more photosynthetically active radiation was available around midday, the hourly integrated photosynthetic CO 2 uptake was en- hanced. For January, the highest absolute value was obtained around 2 p.m., and that value (0.62 ± 0.17 × 10 4 ton CO 2 h −1 ) was 14.08 ± 3.89% of the hourly integrated anthropogenic CO 2 emissions. As the seasons changed, hourly integrated photosynthetic CO 2 uptake of the Greater Tokyo Area reached the highest level in July, with an absolute value of 3.22 ± 0.50 × 10 4 ton CO 2 h −1 at noon and 5.42 times the January value. This amount was 79.04 ± 12.31% of the hourly integrated anthropogenic CO 2 emissions.
Finally, the hourly integrated net biogenic CO 2 flux of the Greater Tokyo Area represents the amount of CO 2 flow from ecosystems to the atmosphere. At night, such flow was one-way, from forests, grasslands, and urban vegetated areas into the air. The nighttime vegetated land of the Greater Tokyo Area was a natural carbon source, collectively emitting 0.08 ± 0.01 × 10 4 to 0.09 ± 0.01 × 10 4 ton CO 2 h −1 in January and 0.56 ± 0.09 × 10 4 to 0.60 ± 0.10 × 10 4 ton CO 2 h −1 in July, which were 3.07 ± 0.51% to 2.34 ± 0.41% and 21.97 ± 3.38% to 17.80 ± 2.97% of the hourly integrated anthropogenic CO 2 emissions for January and July, respectively. Depending on whether the photosynthetic CO 2 uptake would completely compensate for the biogenic CO 2 emission in each grid cell or not, the daytime net biogenic CO 2 flux from ecosystems to the atmosphere was either negative or positive. Here, negative net biogenic CO 2 flux values mean that the direction of carbon flow is reversed (from atmosphere to the ecosystem): a natural carbon sink. The maximal negative hourly integrated net biogenic CO 2 flux of the Greater Tokyo Area often occurred around 12 a.m. to 2 p.m. in all seasons, −0.46 ± 0.14 × 10 4 ton CO 2 h −1 in January in winter to −2.26 ± 0.42 × 10 4 ton CO 2 h −1 in July in summer, would potentially offset 10.52 ± 3.17% to 55.42 ± 10.39% of the hourly anthropogenic emissions.
Overall, the sub-daily patterns of hourly integrated biogenic CO 2 emissions, photosynthetic CO 2 uptake, and net biogenic CO 2 flux of the Greater Tokyo Area were bell-shaped, whereas the ranges of hourly integrated values increased from winter to spring and from spring to summer, subsequently dropping in autumn. The intraday peaks almost always occurred around midday, but the highest hourly integrated biogenic CO 2 emissions often appeared several hours later than the photosynthetic CO 2 uptake. In each month, the variation of hourly integrated photosynthetic CO 2 uptake of the Greater Tokyo Area was much greater than that of biogenic CO 2 emissions. As a result, the integrated net biogenic CO 2 flux fluctuated between positive and negative figures within a day. Figure 11 presents the integrated anthropogenic CO 2 emissions, biogenic CO 2 emissions, photosynthetic CO 2 uptake, and net biogenic CO 2 flux of the Greater Tokyo Area at the monthly scale. In winter to early spring (December to March), the integrated biogenic CO 2 flux was lower than the anthropogenic one by more than one order of magnitude (the ratio of integrated biogenic CO 2 flux to anthropogenic emission was only 3.34 ± 0.54% in February). From April to November, both photosynthesis and biogenic respiration became more active, up to 34.29 ± 5.92% and 20.46 ± 1.96% of the integrated anthropogenic CO 2 emissions. Consequently, the integrated biogenic CO 2 flux was up to 17.06 ± 5.44% of the anthropogenic one in June. Figure 11. Monthly integrated anthropogenic CO 2 emissions (10 6 ton CO 2 month −1 ) in 2018, compared to biogenic CO 2 emissions, photosynthetic CO 2 uptake, and net biogenic CO 2 flux of the Greater Tokyo Area (10 6 ton CO 2 month −1 ) during the period 2010-2018. Shadows represent the variation (the minimum to the maximum) of monthly integrated biogenic CO 2 emission (red), photosynthetic CO 2 uptake (green), and net biogenic CO 2 flux (blue).

Discussion
The sub-daily scale terrestrial biosphere model developed for this study was based on earlier studies [55,69,88]. It required multiple satellite datasets as input. Our main motivation of GPP and R e simulation in such short time-steps was to disaggregate biogenic CO 2 fluxes from the anthropogenic CO 2 emissions of the Tokyo megacity and surrounding industrial areas in future inversion studies. Even though some driving forces of the model (i.e., temperature, solar radiation, and soil moisture data) are available hourly or every few hours, sub-daily scale modeling of biogenic CO 2 flux remained challenging.
In terms of GPP calculation, earlier satellite-based models often applied the light use efficiency concept [71,72], which requires APPFD by plant canopy or its equivalent in W m −2 as input. However, global daily satellite datasets to calculate APPFD, let alone sub-daily ones, are not available yet. Furthermore, after removing cloud-contaminated points, valid data obtained from existing dataset such as MODIS 4-day LAI/Fraction of Photosynthetically Active Radiation ( [107]) are incomplete, especially in regions such as our study area, where the rain period is long in summer. To prepare complete and continuous hourly APPFD estimates for GPP calculation, we introduced the satellitederived clumping index Ω and interpolated LAI as the input. An ANN was also trained to estimate hourly diffuse shortwave solar radiation under various weather conditions. This combination allowed the generalization of APPFD modeling method proposed in de Pury and Farquhar [55]. Because both diffuse and direct PPFD were accounted for in hourly APPFD modeling processes, underestimation of GPP [51] was alleviated in our model. It also means simulating more realistic diurnal variations of photosynthetic CO 2 uptake compared to existing terrestrial biosphere models, for instance, VPRM [43,48,49], which were applied in the field of top-down CO 2 emissions verification studies at a city scale.
Accurate temporal variation and spatial distribution of both LAI and the clumping index are the key to obtaining more realistic hourly APPFD estimates and subsequently more realistic hourly GPP. For this study, we set the clumping index of vegetated land to fixed values [108], assuming that the temporal variation of Ω was small within a growing season [57] and interannually (2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018). This assumption reflected a compromise because of dataset accessibility: it should be improved for future studies by introducing a clumping index dataset with finer temporal resolution, such as the Global Vegetation Clumping Index Products LIS-CI-A1 [134]. Until daily satellite LAI datasets become accessible, improvements to LAI data quality rely on better interpolation and smoothing algorithms. By introducing a satellite dataset of phenological metrics [119], we were able to smooth trends of 4-day LAI in deciduous forest and grassland areas. However, it is difficult to constrain satellite-observed LAI in evergreen forests by checking against date of vegetation green-up or dormancy. Our results of both hourly modeled GPP at the site level and comparison to existing GPP data at the regional scale suggest that temporal linear averaging interpolated LAI of evergreen forests was sufficient for accurate simulation of hourly GPP. However, frequently fluctuated LAI in evergreen forests can be expected to trigger abnormal amounts of leaf litter fall, which can lead to overestimation of SOC and risks of overestimating hourly R e . In this case, data assimilation of LAI [135][136][137][138][139] might provide an alternative for model improvement in future work.
Additionally, we recognized that in recent years, satellite-observed SIF has become a promising proxy for GPP because the two variables presented a strong linear relation [140][141][142]. Although we already used a SIF-based GPP dataset [126] to assess our modeled GPP results, it was difficult to apply this type of SIF-based GPP algorithm directly at sub-daily scale modeling because of similar challenges discussed in earlier paragraphs: temporally continuous, sub-daily SIF satellite datasets are not yet available. The ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) and the Orbiting Carbon Observatory-3 (OCO-3) aboard the International Space Station (ISS) have diurnal sampling capability to some extent. ECOSTRESS has been used to develop instantaneous GPP estimates for different times of day [143], but neither ECOSTRESS nor OCO-3 could provide temporally continuous observations throughout the day due to the orbit nature of the ISS. Consequently, the diurnal variability of GPP cannot be simulated accordingly [144,145].
Because GPP modeling in this study was process-based, our last discussion point about photosynthetic CO 2 removal estimation is about PFT-specific parameters. Following Wullschleger [76], we assumed that, in each PFT, the maximal electron transport rate J max in photosynthetic reactions was linearly related to the maximal carboxylation rate at 25 • C V cmax25 , where the latter was set as a constant. However, studies such as that by Kosugi et al. [146] observed a seasonal change in V cmax25 , which has been reflected in some global GPP models [147] and which should be included in our future model optimizations. Furthermore, the values of empirical parameters to calculate V cmax25 based J max reported by Medlyn et al. [148] and Walker et al. [149] differ from those reference settings [76] for this study, which should be compared in further sensitivity tests of our GPP modeling.
In terms of R e modeling, hourly R e calculations in this study considered the effects of biomass amount in all vegetation biomass pools and SOC pools [69,70,[86][87][88] instead of simple R e correlations to ambient temperature and soil moistures. This consideration is both a benefit and a challenge of our model because below-ground biomass was rarely obtained via satellite observations, if not impossible. Hourly modeled R e results at the site level suggest that our sub-daily model tended to overestimate biogenic CO 2 emissions, which led to an overestimated natural carbon source at night and underestimated negative biogenic CO 2 flux during the day.
Several parts of the hourly R e model might contribute to such results. To start with, we assumed that live-wood and root biomass were proportional to the foliage biomass in most cases (except for root biomass in DBF) for the allocation of hourly NPP, with a fixed specific leaf area (SLA) setting [82] in each PFT. Our forest NPP results fluctuate above or below the calculated values presented by Sasai et al. [69]: 1200 gC m −2 year −1 NPP in 1982-2000. These results also exceed the estimates of Ise et al. [150]. In that study, NPP of temperate forest was no more than 500 gC m −2 year −1 . Occasionally, because the change in biomass was only constrained using the difference of LAI over time, carbon flow in and out of the plants was not fully balanced. This imbalance can cause errors of maintenance respiration and then biogenic CO 2 emissions. To reduce modeling errors of this type, the data quality of the input LAI should be improved, as discussed in earlier paragraphs. Additionally, more representative values of PFT-specific parameters should be updated from the latest literature or from research datasets such as the TRY Plant Trait Database [151]. Another point to be addressed in future work is the assumption about growth respiration. Bruggemann et al. [152] reviewed existing isotope studies of plant and pointed out a time lag of transport and allocation of carbon from leaves to the root tissues belowground, suggesting that our assumption of almost instantaneous GPP export to root utilization should be converted to a lagged one once measurement data are available to set the delay time range.
In addition to temperature and moisture conditions, initial SOC amounts also greatly affected the heterotrophic part of hourly R e . On one hand, active soil biomass pools such as SOC surface metabolic , SOC surface microbe , SOC root metabolic , and SOC soil microbe reached equilibrium (annual carbon change remained several grams or even larger by our calculations) rapidly during the spin-up. On the other hand, it would take more than one thousand years to stabilize SOC soil passive . However, hourly R e estimates based on these fully stabilized initial SOC settings were almost twice those of observed ones. Thereby, the number of years for spin-up was shortened to reduce SOC and R heterotrophic overestimation.
To improve this aspect of modeling further, a few aspects must be addressed. As reported by Fierer et al. [153] and by Pugh et al. [154], both the quantity and quality of litter can be expected to affect the biomass amount in SOC and its decomposition rate. Estimating the hourly litter fall rate more accurately requires good estimation of plant biomass change in foliage, wood, and root parts, as described above. However, plant biomass loss attributable to biogenic volatile organic compound (VOC) emission [152], herbivore consumption [155], and litter loss because of dissolved carbon [64] should be updated or tuned to more accurately estimate SOC. From a quality perspective, PFTspecific parameters such as the lignin fraction and the lignin:nitrogen ratio can be further calibrated or adjusted with reference values from the TRY Plant Trait Database [151]. Our R e modeling only included carbon cycling, and coupling C and N might improve the simulations [85,87,88].
Because of the limited availability of GPP, R e , and NEE datasets for Japanese forests and grasslands, some (n = 5) flux tower sites' data outside our study area and the 33 • N-38 • N range were used in model calibration and validation experiments. To improve the representativeness of model calibration, future availability of data from more observational sites is expected to reduce site-specific errors [133]. This point also addresses future application of our sub-daily scale model to provide fine-resolution biogenic CO 2 flux data in an inversion analysis of anthropogenic CO 2 emissions of other megacities, which requires both globally available satellite input datasets and well-tuned model parameters.
Furthermore, although biogenic CO 2 fluxes induced by urban forest and greenspace were estimated in our sub-daily scale model, these calculations were based on natural vegetation parameters because of the scarcity of observational data. Such assumptions should be replaced in future work because urban vegetated land might present unique characteristics that are different from natural forests [49,156]. Although CO 2 emissions attributable to agricultural biomass waste burning were counted in the anthropogenic emission data [129,130] that we used, photosynthetic CO 2 uptake and crop-soil CO 2 emissions in agricultural systems distributed in our study area were not included and could potentially be calculable based on a model described by Sasai et al. [157].
According to our hourly integrated biogenic CO 2 emission, photosynthetic CO 2 uptake and net biogenic CO 2 flux results discussed in Section 3.6, the range of integrated hourly biogenic to anthropogenic CO 2 emissions ratio in winter (2.27 ± 0.11%-4.97 ± 1.17%) was similar to the value (4-5%) estimated in earlier urban CO 2 flux studies [24,42,158]. However, the modeled ratio range in summer (13.71 ± 2.44%-23.62 ± 3.13%) was much higher than reference values, probably because our study area included much more vegetated land than Los Angeles [24], or a higher vegetation fraction than the neighborhood-scale studies in Singapore and Mexico City [42,158].
In terms of CO 2 removal, the range of integrated hourly photosynthetic uptake to anthropogenic CO 2 emission ratio in winter (up to 19.07 ± 5.33%) was comparable to the series of 1.4% in Velasco et al. [158], 8% in Velasco et al. [42], 10% in Brioude et al. [24], 14% in Hardiman et al. [49], and 20-24% in Park et al. [159]. In summer, the ratio of photosynthetic uptake to anthropogenic emissions (79.04 ± 12.31%) was higher than the estimate (69%) for a July afternoon in Boston [29]. The summer daytime integrated hourly net biogenic to anthropogenic CO 2 flux ratio (up to 55.42 ± 10.39%) was higher than the 20% offset by daytime NEE in Paris [33] and up to 31% estimated by Menzer et al. [41] while our winter daytime estimates was only up to 14.15 ± 4.24%. Forests, grasslands, and urban greenspaces in the Greater Tokyo Area showed potential to offset Tokyo's anthropogenic CO 2 emissions, especially during the growing season.
According to Table 3, our model estimated 5.14 ± 0.47 × 10 6 tonCO 2 year −1 of photosynthetic uptake and 1.98 ± 0.19 × 10 6 tonCO 2 year −1 of biogenic emission in the partially vegetated urban land (1374.62 ± 134.05 km 2 ) in the Greater Tokyo Area. Another 1665.31 km 2 of urban land in the study area was also identified as having partial vegetation cover by the MODIS vegetation continuous fields product (MOD44B), but was without valid LAI data to proceed any simulation. Additionally, if the rest of the urban land (about 7065.07 km 2 ) in our study area was fully covered by DBF, EBF, ENF or GRS instead, 3.25 ± 0.18 × 10 7 , 3.84 ± 0.06 × 10 7 , 4.75 ± 0.10 × 10 7 or 3.64 ± 0.49 × 10 7 ton year −1 more CO 2 would be captured by photosynthesis, while 1.73 ± 0.19 × 10 7 , 2.08 ± 0.21 × 10 7 , 2.68 ± 0.27 × 10 7 or 3.37 ± 0.71 × 10 7 ton year −1 more CO 2 would be emitted via biogenic respiration and SOC decomposition. Nonetheless, one must bear in mind that once the spatial range of hourly integrated calculation changed, so would the results [49,160]. These values were the total of GPP, R e , and NEE in ecosystems of different types under various environmental conditions. It is too complicated to make an interpretation without discussing factors such as near-ground CO 2 storage [161][162][163], which affect the actual biogenic offset of anthropogenic CO 2 emissions. For that reason, the gridded estimates of sub-daily biogenic CO 2 flux are supplied to the transport model AIST-MM for inversion analysis of anthropogenic CO 2 emissions in cities such as Tokyo.

Conclusions
Sub-daily photosynthetic CO 2 uptake and biogenic CO 2 emissions in DBF, EBF, ENF, GRS, and urban vegetated lands located in Tokyo and its surrounding areas were estimated via a terrestrial biosphere model driven by multiple satellite inputs and meteorological datasets. After calibrations, our modeled results were found to be consistent with both point-scale observed values and existing satellite-based data. The order of hourly integrated photosynthetic CO 2 uptake and NEE values in the Greater Tokyo Area was up to about 80%, or approximately half of the total anthropogenic CO 2 emissions in summer, and lower than one-tenth of the human-induced values in winter. Such modeling results not only provide reference data for the estimation of total natural CO 2 removal in our study area but also provide prior input values for the disaggregation of anthropogenic CO 2 emission and biogenic CO 2 flux when applying top-down approaches to update the megacity's CO 2 emissions inventory. This allows for unprecedented amounts of GOSAT and ground measurement data of CO 2 concentrations to be analyzed using inverse modeling of anthropogenic CO 2 emissions from Tokyo and the Greater Tokyo Area. The updated megacity CO 2 emissions inventory based on observations of CO 2 concentration will serve to detect changes in emission and to improve the understanding of source patterns over space and time. The updating of the urban CO 2 emissions inventory will also support the assessment of emission reduction efforts pledged and implemented by cities such as Tokyo.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/rs13112037/s1, Figure S1: Accumulated carbon (sum of SOC root structural , SOC root metabolic , SOC soil microbe , SOC soil slow , and SOC soilpassive ) (kgC m −2 ) in the 0-28 cm soil layer of the study area after model spin-up; Figure S2: Structure, layers, and nodes of artificial neural network (ANN) to estimate K t diffuse ; Figure S3: Taylor diagram of hourly observed and modeled SR t diffuse (W m −2 ) at the World Radiation Data Centre Fukuoka, Sapporo, and Tateno sites. Table S1: List of variables and their acronyms used in the sub-daily GPP, R e , and NEE model; Table S2: Evaluation metrics for model calibration and validation of hourly gross primary production (GPP), ecosystem respiration (R e ), and net ecosystem exchange (NEE); Table S3: Parameters to activate the ANN and predict K t diffuse ; Table   S4: PFT-specific parameters used to calculate hourly R maintenance . References [164][165][166][167][168][169][170][171][172][173][174][175]  Funding: This research received no external funding.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.