Annual Cycle of East Asian Precipitation Simulated by CMIP6 Models

: Annual cycle is fundamental in the East Asian monsoon (EAM) systems, profoundly governing the spatiotemporal distribution of the East Asian rainfall. The present study identiﬁed the dominant modes of the annual cycle in the East Asian rainfall based on the Fourier harmonic analysis and the Empirical Orthogonal Function (EOF) decomposition. We evaluated the performance of the ﬁrst two leading modes (i.e., EOF-1 and EOF-2) in historical experiments (1979–2014) of the 21 released climate models of phase six of the Coupled Model Intercomparison Project (CMIP6). Comparing with the observation, although the CMIP6 models yield the essential ﬁdelity, they still show considerable systematic biases in the amplitude and phase of the annual cycle, especially in east and south China. Most models exhibit substantial phase delays in the EOF-2 mode of the annual cycle. Some speciﬁc models (BCC-ESM1, CanESM5, and GFDL-CM4) exhibiting better performance could capture the observed annual cycle and the underlying physics in climatology and interannual variability. The limited ﬁdelity of the EOF-2 mode of the EAM annual cycle primarily hinders the monsoon variability simulation and thus the reliable future projection. Therefore, the dominant modes of the EAM annual cycle act as the evaluate benchmark in the EAM modelling framework. Their improvement could be one possible bias correction strategy for decreasing the uncertainty in the CMIP6 simulation of the EAM.


Introduction
Asian monsoon precipitation feeds billions of people over the regions with the world's most dense population [1]. There is an urgent demand to address food security and agricultural practices caused by precipitation-related severe disasters, such as anomalous floods and droughts. Hence, precipitation is an essential factor in predicting climate variabilities given its tremendous impact on public health, socioeconomic risk, and ecological balance. However, Asian monsoon rainfall, characterized by the multiple spatiotemporal scales and multi-scale interactions, is still challenging to predict. Despite its extremely heterogeneous structure and diverse behavior, the remarkable annual cycle of the Asian monsoon depends on the large-scale underlying forcing and constraints, which provides higher predictability than the monsoon rainfall in a given period. The global monsoon, as a global-scale solstitial mode featuring the seasonal variation of monsoon precipitation, is forced by the annual cycle of solar radiation [2][3][4][5]. In particular, the subtropical East Asian (EA) and tropical western North Pacific (WNP) monsoonal circulations are driven by the solar radiation [6] and regulated by the sharp land-sea thermal contrast [6][7][8] and complex terrain (i.e., topographic uplift of the Tibet Plateau and various surrounding mountain chains [9][10][11][12][13]), among others.
Rainfall in the East Asia-western North Pacific (EA-WNP) region exhibits a distinctive and remarkable annual cycle. Traditionally, the rainfall distribution includes five rainy periods associated with the annual cycle over the EA-WNP domain [14][15][16][17]. The abrupt change one ensemble member for each model has been used and occupied equal weightage when calculating the multi-model ensemble (MME) mean. Given the various spatial resolutions of CMIP6 models, we first interpolated all participating model outputs into a regular grid of 2.5 • × 2.5 • using the bilinear interpolation methods. The period from 1979 to 2014 is adopted as the baseline of the climatology. The pentad-mean records were identified as the 5-day average of the daily outputs to access the annual cycle. If the model has 366-day in the leap years, we omit the 29 February manually and obtain a total of 73 pentads. Here we treated the precipitation from the Climate Precipitation Center Merged Analysis of Precipitation (CMAP) [54] and the circulation from the National Centers for Environmental Prediction-Department of Energy Reanalysis-2 datasets as the observation [55]. The observations were also turned into the pentad-mean data from 1979 to 2014. We used the Taylor diagram to quantitatively compare the model performance and the observations [56].

Definition of the Annual Cycle
The annual cycle of rainfall was conventionally represented by the monthly evolution of the rainfall amounts. Since the harmonic analysis was objective and relatively insensitive to noise [57], Horn and Byson [58] originally applied harmonic analysis to describe the climatic precipitation regimes, which has been widely used to quantify the annual cycle of rainfall in climatology afterward [59][60][61][62]. Following the previous works [58,63], we conducted a Fourier harmonic analysis on both simulations and observations to extract the climatological annual cycle component of the East Asian rainfall as follows: In Equation (1), the climatological precipitation P(t, x, y) at the grid point scale (x, y) can perform as the function of pentad t (T = 73 pentads, t = 1, 2, 3, . . . , 73), whereas m, A 0 , A m , ϕ m were the harmonic order, annual mean, amplitude, and phase of the m th harmonic, respectively. Here, we considered the first harmonic (m = 1) as the annual cycle component, which can be expressed as the combination of the A 1 and ϕ 1 using the trigonometric functions at each grid (i.e., . Its explained variance to the total variability quantified the contribution of the annual cycle at grid scale. We then conducted the multivariate empirical orthogonal function (MV-EOF) analysis on the observed and simulated annual cycle component of precipitation, low-level horizontal wind, and SLP over EA to extract two dominant modes, as some previous works [3,63]. The reason for the strategy of these multivariable is to reveal the physical mechanisms of the thermodynamic and dynamic effects. The corresponding principal components (hereafter PCs) of the EOF modes indicated the temporal evolution of the annual cycle modes. The maximum value of the PCs and its timing were considered the amplitude (abbreviated to A pc ) and phase (ϕ pc ) of the annual cycle modes.

Simulation of the Annual Cycle Component of East Asian Rainfall
We first evaluate the large-scale spatiotemporal fidelity of the climatological rainband in the EA-WNP sector. Figure 1a displays the seasonal migration of the CMAP rainfall in EA and WNP. In general, the MME could reproduce the observed rainy season (Figure 1b). The simulated seasonal cycle of the tropical precipitation over 0-20 • N varies consistently with the seasonality of the intertropical convergence zone (ITCZ) over WNP as the observation. In extratropic, the CMIP6 models can capture the observed northward migration of the subtropical monsoon precipitation, including the spring rainy season starting in south China, followed by two rainy periods (i.e., Meiyu/Baiu system and fall rainy season). The simulated seasonal evolution of the rainband by CMIP6 improves slightly than the CMIP3 and CMIP5 models [15,16,29]. However, some details vanish in the ensemble mean of the model simulations. For instance, high-frequency variations and some dry breaks in observation cannot be well reproduced. The precipitation amount is significantly weaker than the observation. It is one of the longstanding systematic biases in the current climate models. For the individual models, except for INM-CM4-8, others show high pattern correlation coefficients (PCCs) between 0.8 and 0.9 comparing with the observation (Figure 1c). The PCC of the ensemble result reaches 0.908 and is comparable to the best models, such as the NorESM2-LM (PCC = 0.914). The standardized deviation ratios (SDR) ranging from 0.75 to 1.2 indicate that the uniform spatial distribution is good in some models but bad in others.
The interannual variability is less skillful than the climatological seasonal cycle of the EA rainfall. Compared with the observation, although the simulated interannual standard deviation (SD) pattern is similar, the MME variabilities are much weaker over WP but stronger over EA (Figure 1d,e). Figure 1f shows the PCC score (0.88) of the interannual variability in MME is lower than the climatology in Figure 1c. In addition, the PCCs and the SDRs exhibit more vigorous spread amongst individual models. In general, the models with lower skill in climatology simulation also show lower interannual SD skills, such as the INM-CM4-8 and NorESM2-LM. The metrics encompass the standardized deviation ratios (SDR, referring to the ratio of model's pattern standard deviation against the observed pattern standard deviation, vertical axis), rootmean-square-error (RMSE, distance to the REF), and the spatial pattern correlation coefficients (PCC, azimuthal axis) among the 21 CMIP6 models and MME mean comparing with the observation. (d-f) are the same as (a-c), but for the year-to-year standard deviations of the zonal mean rainfall.
As to the annual cycle component, it is dominant in total rainfall over most EA subregions in either observation or CMIP6 models, particularly over tropical and subtropical monsoonal regions (Figure 2a,b). In observation, the annual cycle component explains more than 60% variance, except for the Tibetan Plateau and northwest China. Nevertheless, the CMIP6 MME result overestimates the contribution of the annual cycle (Figure 2b), implicating its overestimation of the annual cycle intensity. As such, the (c) Taylor diagram exhibits a statistical comparison of the spatial structure of (a,b). The metrics encompass the standardized deviation ratios (SDR, referring to the ratio of model's pattern standard deviation against the observed pattern standard deviation, vertical axis), root-mean-square-error (RMSE, distance to the REF), and the spatial pattern correlation coefficients (PCC, azimuthal axis) among the 21 CMIP6 models and MME mean comparing with the observation. (d-f) are the same as (a-c), but for the year-to-year standard deviations of the zonal mean rainfall.
As to the annual cycle component, it is dominant in total rainfall over most EA subregions in either observation or CMIP6 models, particularly over tropical and subtropical monsoonal regions (Figure 2a,b). In observation, the annual cycle component explains more than 60% variance, except for the Tibetan Plateau and northwest China. Nevertheless, the CMIP6 MME result overestimates the contribution of the annual cycle (Figure 2b), implicating its overestimation of the annual cycle intensity. As such, the interannual variability is underestimated, presenting the attenuation of the high-frequency variances in the MME (Figure 1b). Comparing with the observation, the maximum of the annual cycle's explained variances around the Yangtze River valley and the Bay of Bengal are different to some extent, resulting in a relatively lower PCC of 0.58. The individual models also cannot precisely reproduce the observed annual cycle domination, featured by the large diversity across them and the PCCs lower than 0.7 ( Figure 2c). The pattern standard deviation of the models is higher than the observation as the majority of points falling outside the quadrant of radius one (i.e., SDR > 1).
sphere 2021, 12, x FOR PEER REVIEW 6 of 16 interannual variability is underestimated, presenting the attenuation of the highfrequency variances in the MME (Figure 1b). Comparing with the observation, the maximum of the annual cycle's explained variances around the Yangtze River valley and the Bay of Bengal are different to some extent, resulting in a relatively lower PCC of 0.58. The individual models also cannot precisely reproduce the observed annual cycle domination, featured by the large diversity across them and the PCCs lower than 0.7 ( Figure 2c). The pattern standard deviation of the models is higher than the observation as the majority of points falling outside the quadrant of radius one (i.e., SDR > 1).  Figure 3 displays the horizontal distribution of the gridded phase ( ) and amplitude ( ) of the annual cycle components. In observation, over the monsoon domain represents the peak of the summer monsoon rainy season. For instance, around June-July in India denotes the formation of the Indian summer monsoon in early summer ( Figure 3a). The clockwise rotation of from April to late July over the South, Central, and North China in order indicates the seasonal migration of the EASM rainfall ( Figure  1a). is large in the tropics but becomes small in mid-latitudes, consistent with the strong tropical monsoon convection but relevant weak rainfall of the EA summer monsoon (EASM). The annual cycle is further attenuated over the non-monsoon regions deep in the continent, where is much weaker. In CMIP6 models, the MME results can generally capture the observed and over most areas, except for the dry inland region over Northwest China. In tropics, the CMIP6 models can basically reproduce the observed pattern of and , although the is larger in India but smaller over WNP. The difference of is considerable between simulation and observation, especially over the East and South China, and the South China Sea (SCS). The simulated is more identical than ( Figure 4b). Moreover, the SDRs and the RMSEs of vary greatly among different models to indicate great challenge in simulating the phase of the annual cycle component. Meanwhile, the simulation biases between the two metrics are independent across the CMIP6 models. It implicates that a better simulated cannot guarantee a better simulation.  Figure 3 displays the horizontal distribution of the gridded phase (ϕ 1 ) and amplitude (A 1 ) of the annual cycle components. In observation, ϕ 1 over the monsoon domain represents the peak of the summer monsoon rainy season. For instance, ϕ 1 around June-July in India denotes the formation of the Indian summer monsoon in early summer ( Figure 3a). The clockwise rotation of ϕ 1 from April to late July over the South, Central, and North China in order indicates the seasonal migration of the EASM rainfall ( Figure 1a). A 1 is large in the tropics but becomes small in mid-latitudes, consistent with the strong tropical monsoon convection but relevant weak rainfall of the EA summer monsoon (EASM). The annual cycle is further attenuated over the non-monsoon regions deep in the continent, where A 1 is much weaker.
In CMIP6 models, the MME results can generally capture the observed A 1 and ϕ 1 over most areas, except for the dry inland region over Northwest China. In tropics, the CMIP6 models can basically reproduce the observed pattern of A 1 and ϕ 1 , although the A 1 is larger in India but smaller over WNP. The difference of ϕ 1 is considerable between simulation and observation, especially over the East and South China, and the South China Sea (SCS). The simulated A 1 is more identical than ϕ 1 (Figure 4b). Moreover, the SDRs and the RMSEs of ϕ 1 vary greatly among different models to indicate great challenge in simulating the phase of the annual cycle component. Meanwhile, the simulation biases between the two metrics are independent across the CMIP6 models. It implicates that a better simulated A 1 cannot guarantee a better ϕ 1 simulation.       Figure 4 shows the first dominant mode of the EAM annual cycle (hereafter referred to as M1), with the 75.1% explained variance in observation. Spatially, M1 presents the mature stage of the Asian summer monsoon. An anomalous large-scale low-pressure is located over the East Asian continent (Figure 4a). Meanwhile, a remarkable anomalous high-pressure emerges over the western Pacific, with the significant positive SLP and anticyclonic anomalies centered around 180 • E. The continental low-and oceanic high-SLP constitutes an east-west seesaw pattern over the EA-WNP region. The SLP gradient maintains the large-scale monsoon rainfall over EA, which is uniformly enhanced over most moist regions with its maximum over the EA continent (Figure 4c). The PC1 exhibits the temporal evolution of M1, as shown by the black sinusoidal in Figure 4e. Thereafter, we calculate the amplitude (A pc1 ) and phase (ϕ pc1 ) of the PC1 series. The value of A pc1 is 68.6, and ϕ pc1 equals pentad 40.5, which around the summer solstice.

Modes of the Annual Cycle of East Asian Rainfall
In general, the CMIP6 MME can reproduce the major characteristics of M1. The simulated zonal SLP gradient between land and ocean and the cyclonic and anticyclonic circulations in the lower troposphere are comparable to the observation (Figure 4c). The positive precipitation anomalies over most regions are reasonably reproduced in simulations (Figure 4d), though some deficiencies exist at the high-latitude Pacific Ocean, South China, and tropical WNP. The amplitude and phase of the simulated PC1 are 68.7 and pentad 40.2, respectively, close to the observation. The M1 is consistently stable among the 21 individual models (Figure 4f). The explained variance of M1 in each model is around 75%, which is close to the observation and the MME result (75.3%). The relative high PCC score between the simulated and observed pattern of M1 (0.86 for MME) shows a small inter-model spread (ranging between 0.68 and 0.88, and 0.049 for the inter-model standard deviation). Figure 5 depicts the situation of the second annual cycle mode (M2) of the EA rainfall. M2 explained the variance of 24.9% in observation. A strong east-west SLP gradient exists in the low-latitudes, whose positive (negative) center is located over WNP (the Southeast of Tibetan Plateau). A relatively weak cyclone maintains on the ocean to the east of Japan. The precipitation in M2 is above-normal from South China to the adjacent oceans south of Japan beneath a couple of anticyclonic and cyclonic. In comparison, the rainfall is below-normal over most of the oceans (Figure 5c). The mature phase of M2 (ϕ pc2 = pentad 22.3) is close to the spring equinox, exactly corresponding to the spring rainy season in South China (Figure 5e).
The M2 in the MME simulations has an explained variance of 24.7%. The positive (negative) SLP anomalies and anomalous anticyclone (cyclone) in tropics resemble the observations, although the cyclone over the ocean to the east of Japan is slightly stronger. The spatial distribution of the precipitation anomalies is reasonably reproduced, except that the above-normal anomalies of precipitation are overestimated in South China but underestimated in the southeast of Japan. The mature phase of the simulated PC2 emerges in pentad 22.0 and closes to the observation (Figure 5e). For the individual models, the performance of M2 is ragged among them. The PCCs ranging from 0.64 to 0.80 vary largely among the models. The inter-model standard deviation of the PCC is 0.057, suggesting a broad simulation diversity in M2. To examine the temporal stability of the simulated annual cycle modes, we first divide the HIST (1850-2014) experiment into 14 groups. Each group has a period of 30 years starting from 1850 to 1979 with an interval of 10 years. Afterward, the MV-EOF is conducted repeatedly on each group to check their similarity. The results show that the relative contribution of M1 and M2 to the annual cycle component is constant, and the proportion does not broadly vary with the sliding of the modelled climatology. The explained variance ratios of M1 to M2 stay at ~3.1 in the MME simulation (Figure 6a), as well as the ensemble members, albeit the existence of the inter-model spread (blue shading). The PCCs of M1 and M2 pattern are sufficiently high between each episode and the standard period (1979-2014), especially the MME result (Figure 6b). The comparative results show that M1 is more stable than M2. Thereby, the annual cycle modes are temporal stable in the longtime historical experiments. The reproducibility of the M1 is mainly ascribed to A 1 simulation among the models (Figure 3b). Models that better reproduce A 1 could simulate a more realistic M1 (e.g., the GFDL-CM4, MPI-ESM1-2-HR, and NorESM2-LM). However, both M1 and M2 are profoundly related to the ϕ 1 simulation performance (Figure not shown), resulting in the considerable challenge of the simulation on the annual cycle phase. Spatially, the sub-regions with larger biases of A 1 and ϕ 1 also show lower skills in the pattern of M1 and M2 (e.g., South China and SCS).
To examine the temporal stability of the simulated annual cycle modes, we first divide the HIST (1850-2014) experiment into 14 groups. Each group has a period of 30 years starting from 1850 to 1979 with an interval of 10 years. Afterward, the MV-EOF is conducted repeatedly on each group to check their similarity. The results show that the relative contribution of M1 and M2 to the annual cycle component is constant, and the proportion does not broadly vary with the sliding of the modelled climatology. The explained variance ratios of M1 to M2 stay at~3.1 in the MME simulation (Figure 6a), as well as the ensemble members, albeit the existence of the inter-model spread (blue shading). The PCCs of M1 and M2 pattern are sufficiently high between each episode and the standard period , especially the MME result (Figure 6b). The comparative results show that M1 is more stable than M2. Thereby, the annual cycle modes are temporal stable in the longtime historical experiments. Since the M1 and M2 are stable among individual CMIP6 models, resemble the observation, and not sensitive to the temporal variation, we can project the simulated results onto the observed modes to reconstruct the PCs (i.e., projected PC1 and PC2) and make a unified comparison. This is a commonly used strategy under the precondition of finding a stable projection basis (e.g., the MJO [64,65]). The amplitudes of the projected PC1 (abbreviated to _ ) share significant common biases across the models, consistent with the systematic bias in the monsoon intensity (Figure 1b). All the simulations are weaker than the observations, though the MME result performs better (Figure 7a). The projected phase in simulation ( _ ) is either slightly lead or lag the observations (Figure 7c). Most models can reproduce the timing of the mature phase with the _ biases less than 1.5 pentads, except for the CESM2-FV2 (1.81 pentads phase delay) and CESM2-WACCM (1.58 pentads phase delay). Some models (CNRM-CM6-1, CNRM-CM6-1-HR, CNRM-ESM2-1, INM-CM5-0, and MRI-ESM2-0) can accurately simulate the _ , which is beyond the MME simulation (0.27 pentads phase delay). The inter-model standard deviation of _ biases is 0.82 pentads. Since the M1 and M2 are stable among individual CMIP6 models, resemble the observation, and not sensitive to the temporal variation, we can project the simulated results onto the observed modes to reconstruct the PCs (i.e., projected PC1 and PC2) and make a unified comparison. This is a commonly used strategy under the precondition of finding a stable projection basis (e.g., the MJO [64,65]). The amplitudes of the projected PC1 (abbreviated to A proj_1 ) share significant common biases across the models, consistent with the systematic bias in the monsoon intensity (Figure 1b). All the simulations are weaker than the observations, though the MME result performs better (Figure 7a). The projected phase in simulation (ϕ proj_1 ) is either slightly lead or lag the observations ( Figure  7c). Most models can reproduce the timing of the mature phase with the ϕ proj_1 biases less than 1.5 pentads, except for the CESM2-FV2 (1.81 pentads phase delay) and CESM2-WACCM (1.58 pentads phase delay). Some models (CNRM-CM6-1, CNRM-CM6-1-HR, CNRM-ESM2-1, INM-CM5-0, and MRI-ESM2-0) can accurately simulate the ϕ proj_1 , which is beyond the MME simulation (0.27 pentads phase delay). The inter-model standard deviation of ϕ proj_1 biases is 0.82 pentads.
The amplitudes of projected PC2 (A proj_2 ) are also weaker in all the simulations, as shown in Figure 7b. Although the A proj_2 value is relatively low, the inter-model diversity is comparable with the A proj_1 . The MME result is outperforming than most individual models. The differences between the projected PC2 phase (ϕ proj_2 ) and the observation suggest a great spread among the ensemble models (Figure 7d). The intermodel standard deviation of the phase biases reaches 2.10 pentads, evidently larger than the counterpart in projected PC1 (0.82). Nearly all the models (except for the BCC-CSM2-MR and NESM3) present a tremendous lagging bias, indicating that the temporal evolution is systematically delayed compared to the observations. Some models even suffer more than four pentads (beyond double M1) phase delay, alongside a maximum up to 6.99 pentads in the MIROC6. Fortunately, some models (BCC-CSM2-MR, BCC-ESM1, CanESM5, GFDL-CM4, INM-CM4-8, MPI-ESM1-2-HR, and MRI-ESM2-0) feature the phase delay less than two pentads and the MME result (2.03 pentad delay), especially the BCC-ESM1 and GFDL-CM4 with only 0.35 and 0.33, respectively.
( Figure 7a). The projected phase in simulation ( _ ) is either slightly lead or lag the observations (Figure 7c). Most models can reproduce the timing of the mature phase with the _ biases less than 1.5 pentads, except for the CESM2-FV2 (1.81 pentads phase delay) and CESM2-WACCM (1.58 pentads phase delay). Some models (CNRM-CM6-1, CNRM-CM6-1-HR, CNRM-ESM2-1, INM-CM5-0, and MRI-ESM2-0) can accurately simulate the _ , which is beyond the MME simulation (0.27 pentads phase delay). The inter-model standard deviation of _ biases is 0.82 pentads. Considering the underlying physical mechanism of the two modes represents the dominant features of the EA annual cycle. The phase biases are essential for the mode accuracy. It is vital that idealized models reasonably reproduce the phases in both of them. Thereby, we pick out the optimized subset of good performing models (GMs, include the BCC-ESM1, CanESM5, and GFDL-CM4) in annual cycle modes according to the ranking of the sum of phase biases in Figure 7c,d (sum < 1 pentad, choosing one model from the same host center). These models share relatively small phase biases of M1(<1 pentad) and M2 (<2 pentads). In contrast, the MIROC6, CESM2-FV2, and Fgoals-g3 are categorized as the poor performance models (PMs) with the most significant phase biases, especially the phase delay on M2.
More challenging in climate models, the annual cycle varies substantially year by year and exhibits prominent interannual variations. In the following, we project the annual cycle component of every year onto the observed M1 and M2. Every single year has its projected amplitude (A proj ) and phase (ϕ proj ) of the annual cycle modes. Hence, we obtain the interannual standard deviations (ISDs) of the A proj and ϕ proj in individual models, as shown in Figure 8. In general, the simulated interannual variations of the amplitude are relatively weak compared to the observation, as the scaled ISDs below the reference line, especially the amplitude of M2. On the other hand, the simulations of the phase ISD are mostly higher than the observations. Nearly all the models overestimate the phase ISD of M1, and half models overestimate the phase ISD of M2. Some reproduce all four ISDs relatively well. Examples are the BCC-ESM1, GFDL-CM4, BCC-CSM2-MR, and INM-CM5-0, which also considerably capture the climatological annual cycle mode (Figure 7). In contrast, models with the low accuracy of the climatology (e.g., CESM2-FV2, CNRM-ESM2-1, and NorESM2-LM) tend to exhibit notable biases in ISD. Atmosphere 2021, 12, x FOR PEER REVIEW 12 of 16 In summary, the GMs well reproduce the annual cycle modes in climatology and interannual variations, with evident lower phase biases in M1 and M2 than the PMs, thus reasonably capturing the underlying physics related to the key process in the EAM evolution. Moreover, as shown in Figure 3b, the ensemble of the GMs (GME) simulates the continental-scale annual cycle components (i.e., higher skill in and ) more reliable than the PMs (PME) and on par with the MME. These results verify the benchmark that models with fewer phase errors in annual cycle modes tend to represent more realistic annual cycle features over East Asia.

Summary
This study documents evaluation against observations of the simulation of the annual cycle over the EAM domain, using the new simulations of the 21 released CMIP6 CGCMs HIST. We propose a robust quantification and benchmark of the annual cycle of East Asian rainfall in climate models. The main conclusions are as follows: Initially, CMIP6 models yield the basic veracity while still sharing considerable limitations in representing the climatological annual cycle's unique features in East Asia, especially the annual cycle phase. The notable discrepancies mainly exist in East and South China, and the South China Sea.
We propose a new assessing benchmark by utilizing the annual cycle's EOF modes (M1 and M2), which embed the underlying physics associated with the EAM annual evolution. The MME simulations can generally reproduce the main characteristics of the annual cycle modes, while still suffering some drawbacks. First, the amplitudes of both annual cycle modes exhibit significant common biases across the models. Second, although most models can reproduce the M1 phase's timing with the phase biases less than 1.5 pentads, nearly all the models represent a prominent delayed phase of M2, some even beyond four pentads.
Finally, according to the combined phase biases of M1 and M2, the BCC-ESM1, CanESM5, and GFDL-CM4 are defined as the best performing models (GMs). The GMs' ensemble can capture the phase of the annual cycle modes that link to the critical process In summary, the GMs well reproduce the annual cycle modes in climatology and interannual variations, with evident lower phase biases in M1 and M2 than the PMs, thus reasonably capturing the underlying physics related to the key process in the EAM evolution. Moreover, as shown in Figure 3b, the ensemble of the GMs (GME) simulates the continental-scale annual cycle components (i.e., higher skill in A 1 and ϕ 1 ) more reliable than the PMs (PME) and on par with the MME. These results verify the benchmark that models with fewer phase errors in annual cycle modes tend to represent more realistic annual cycle features over East Asia.

Summary
This study documents evaluation against observations of the simulation of the annual cycle over the EAM domain, using the new simulations of the 21 released CMIP6 CGCMs HIST. We propose a robust quantification and benchmark of the annual cycle of East Asian rainfall in climate models. The main conclusions are as follows: Initially, CMIP6 models yield the basic veracity while still sharing considerable limitations in representing the climatological annual cycle's unique features in East Asia, especially the annual cycle phase. The notable discrepancies mainly exist in East and South China, and the South China Sea.
We propose a new assessing benchmark by utilizing the annual cycle's EOF modes (M1 and M2), which embed the underlying physics associated with the EAM annual evolution. The MME simulations can generally reproduce the main characteristics of the annual cycle modes, while still suffering some drawbacks. First, the amplitudes of both annual cycle modes exhibit significant common biases across the models. Second, although most models can reproduce the M1 phase's timing with the phase biases less than 1.5 pentads, nearly all the models represent a prominent delayed phase of M2, some even beyond four pentads.
Finally, according to the combined phase biases of M1 and M2, the BCC-ESM1, CanESM5, and GFDL-CM4 are defined as the best performing models (GMs). The GMs' ensemble can capture the phase of the annual cycle modes that link to the critical process of the EAM evolution driving by the external forcing. They can also reproduce the spatiotem-poral structures of the annual cycle reasonably well. Moreover, the models capture the annual cycle modes in climatology commonly perform better interannual variabilities and vice versa. These results suggest the necessity of reasonable annual cycle mode simulation. Therefore, the dominant modes of the EAM annual cycle can act as the evaluate benchmark in the EAM modelling framework, providing valuable insight into the bias correction strategies for decreasing the uncertainty in the simulation of the EAM.

Discussion
What causes the limits and deficiencies in representing the annual cycle modes and the excellent model spread calls for further investigation and beyond our study. The multistage evolution and annual cycle of rainfall over East Asia correspond with the surface boundary forcing (e.g., SST, snow cover, and soil moisture) and internal atmospheric variabilities in observations. It is foreseeable that annual cycle of the monsoon system links with complicated and various modelling factors, including the mean state and cycle of the sea surface temperature (SST), mid-latitude processes, physical parameterizations, dynamic core, among others. For instance, previous works have suggested that both the mean state and the cyclic progression of the SST warming profoundly affects the seasonal delay of the tropical rainfall [66][67][68]. In climate models, the annual cycle SST contributes to the annual cycle rainfall bias in some regions [69]. In addition, SST anomalies associated with the El Niño-Southern Oscillation (ENSO) considerably impact the East Asian monsoon. Therefore, the realistic simulation of these SST anomalies, including the absolute magnitude and temporal behavior, might play a crucial role in this issue.
As shown in Figure 7, comparing a subset of models (CNRM-CM6-1, CNRM-CM6-1-HR, and CNRM-ESM2-1) with different atmospheric horizontal resolutions, the highresolution one shows better performance. In addition, most of the Earth system models we used also exhibit relatively low phase biases. The extent to which the model resolution and complexity can influence the annual cycle over East Asia requires further explorations. In our work, we only use one ensemble member for each model, ignoring the atmospheric internal variabilities. How does the internal variability impact the simulation of the East Asian annual cycle?
In any event, many future avenues deserve in-depth research. Despite all puzzles and challenges, it is crucial to identify the possible vital factors for modelling community. Given the annual cycle is the fundamental feature of the EAM system. Reliable simulation is crucial for society and adaptation strategies under climate change. Our analysis highlights a crucial model bias to keep in mind for future work on EAM modelling. The capacity to accurately simulate the distribution and timing of East Asian rainfall would appear to be a necessary precursor to making reliable predictions for the watershed, agricultural, and hydrological purposes.  Data Availability Statement: Publicly available datasets were analyzed in this study. These datasets are freely available through the Earth System Grid Federation: https://esgf-node.llnl.gov/search/ cmip6.