Integration of Site Effects into Probabilistic Seismic Hazard Assessment (PSHA): A Comparison between Two Fully Probabilistic Methods on the Euroseistest Site

The integration of site effects into Probabilistic Seismic Hazard Assessment (PSHA) is still an open issue within the seismic hazard community. Several approaches have been proposed varying from deterministic to fully probabilistic, through hybrid (probabilistic-deterministic) approaches. The present study compares the hazard curves that have been obtained for a thick, soft non-linear site with two different fully probabilistic, site-specific seismic hazard methods: (1) The analytical approximation of the full convolution method (AM) proposed by Bazzurro and Cornell 2004a,b and (2) what we call the Full Probabilistic Stochastic Method (SM). The AM computes the site-specific hazard curve on soil, HC(Sa( f )), by convolving for each oscillator frequency the bedrock hazard curve, HC(Sa( f )), with a simplified representation of the probability distribution of the amplification function, AF( f ), at the considered site The SM hazard curve is built from stochastic time histories on soil or rock corresponding to a representative, long enough synthetic catalog of seismic events. This comparison is performed for the example case of the Euroseistest site near Thessaloniki (Greece). For this purpose, we generate a long synthetic earthquake catalog, we calculate synthetic time histories on rock with the stochastic point source approach, and then scale them using an adhoc frequency-dependent correction factor to fit the specific rock target hazard. We then propagate the rock stochastic time histories, from depth to surface using two different one-dimensional (1D) numerical site response analyses, while using an equivalent-linear (EL) and a non-linear (NL) code to account for code-to-code variability. Lastly, we compute the probability distribution of the non-linear site amplification function, AF( f ), for both site response analyses, and derive the site-specific hazard curve with both AM and SM methods, to account for method-to-method variability. The code-to-code variability (EL and NL) is found to be significant, providing a much larger contribution to the uncertainty in hazard estimates, than the method-to-method variability: AM and SM results are found comparable whenever simultaneously applicable. However, the AM method is also shown to exhibit severe limitations in the case of strong non-linearity, leading to ground motion “saturation”, so that finally the SM method is to be preferred, despite its much higher computational price. Finally, we encourage the use of ground-motion simulations to integrate site effects into PSHA, since models with different levels of complexity can be included (e.g., point source, extended source, 1D, two-dimensional (2D), and three-dimensional (3D) site response analysis, kappa effect, hard rock . . . ), and the corresponding variability of the site response can be quantified.


Introduction
The integration of site effects into Probabilistic Seismic Hazard Assessment (PSHA) is a constant subject of discussion within the seismic hazard community due to its high impact on hazard estimates. Preliminary versions on how to integrate them have been presented in [1][2][3]. Furthermore, [4][5][6][7][8][9] have also published similar studies on this subject. A relevant overview with enlightening examples can be found, for instance in [10,11] and [12,13]. However, research on this topic is still needed since the integration of site effects is still treated in a rather crude way in most engineering studies, as the variability associated to the non-linear behavior of soft soils and its ground-motion frequency dependence is generally neglected. For instance, the various approaches that are discussed in [14,15] correspond to hybrid (deterministic-probabilistic) approaches [16], where the (probabilistic) uniform hazard spectrum on standard rock at a given site is simply multiplied by the linear or nonlinear median site-specific amplification function.
The main drawback of those different hybrid-based approaches (currently the most common way to include site effects into PSHA) is that once the hazard curve on rock is multiplied with the local non-linear frequency-dependent amplification function, the exceedance probability at the soil surface is no longer the one that is defined initially for the bedrock hazard. At a given frequency, the same soil surface ground motion can be reached with different reference bedrock motion (corresponding to different return periods and/or different earthquake contributions) with different non-linear site responses. Ref. [10,11] concluded from their studies at two different sites (clay and sand) that the hybrid-based method tends to be non-conservative at all frequencies and at all mean return periods with respect to their approximation of the fully probabilistic method (AM) that is considered here.
The present research work constitutes a further step along the same direction, trying to overcome the limitation of hybrid-based approaches for strongly non-linear sites by providing a fully probabilistic description of the site-specific hazard curve. The aim of this work is to present a comparison exercise between hazard curves that are obtained with two different, fully probabilistic site-specific seismic hazard approaches: (1) The analytical approximation of the full convolution method (AM) proposed by [10,11] and (2) what we call the Full Probabilistic Stochastic Method (SM).
The AM computes the site-specific hazard on soil by convolving the hazard curve at the bedrock level, with a simplified representation of the probability distribution of the site-specific amplification function. The SM hazard curve is directly derived from a set of synthetic time histories at the site surface, combining point-source stochastic time histories for bedrock that is inferred from a long enough virtual earthquake catalog, and non-linear site response. This comparative exercise has been implemented for the Euroseistest site, a multidisciplinary European experimental site for integrated studies in earthquake engineering, engineering seismology, seismology, and soil dynamics [17]. To perform PSHA calculations, the area source model developed in the SHARE project is used [18].

Study Area: The Euroseistest Site
As stated in [18], the Euroseistest site is a multidisciplinary experimental site for integrated studies in earthquake engineering, engineering seismology, seismology, and soil dynamics. It has been established thanks to European funding in Mygdonia valley, the epicentral area of the 1978, M6.4 earthquake, located about 30 km to the NE of the city of Thessaloniki in northern Greece, Figure 1a [19][20][21]. The ground motion instrumentation consists of a three-dimensional (3D) strong motion array, which has been operational for slightly more than two decades. This site was selected as an appropriate site to perform this exercise, because of the availability of extensive geological, geotechnical, and seismological surveys. The velocity model of the Euroseistest basin has been published by several authors [22][23][24][25], varying from one-dimensional (1D) up to two-dimensional (2D) and 3D models (Figure 1b). These models were used to define the 1D soil column for the present exercise ( Figure 2a and Table 1). Degradation curves are also available to characterize each soil layer, (Figure 2b,c) [26], as well as local earthquake recordings to calibrate the model in the linear (weak motion) domain. For the present exercise, in the absence of reliable (1D) up to two-dimensional (2D) and 3D models (Figure 1b). These models were used to define the 1D soil column for the present exercise ( Figure 2a and Table 1). Degradation curves are also available to characterize each soil layer, (Figure 2b,c) [26], as well as local earthquake recordings to calibrate the model in the linear (weak motion) domain. For the present exercise, in the absence of reliable measurements along the whole profile, the cohesion was assumed to be zero in all layers (C = 0 kPa). As non-zero cohesion is generally associated with non-zero plasticity index, and a shift of non-linear degradation curves towards higher strains, this zero-cohesion assumption is likely to result in a stronger impact of the NL site response, so that the present case is to be considered as an example case for a highly non-linear, deep soft site.
Several studies performed at the Euroseistest, both instrumental and numerical, have consistently shown a fundamental frequency (f0) around 0.6-0.7 Hz [23,27,28,29,30] with an average shear wave velocity over the top 30 m (VS30) of 186 m/s. The 1D simulations that are performed in this study are based on the parameters listed in Table 1, which have been shown to be consistent with the observed instrumental amplification functions, ( ), at least in the linear domain [23,27,28,29,30]. Epicenter and focal mechanism [19] of the mainshock (red) and epicenters of the largest events of the sequence (green, yellow) [20] are also depicted. Dotted lines mark the Mygdonian graben and surface ruptures after the 1978 earthquake [21] are shown as red lines. (b) Three-dimensional (3D) model of the Mygdonian basin geological structure [25].   [19] of the mainshock (red) and epicenters of the largest events of the sequence (green, yellow) [20] are also depicted. Dotted lines mark the Mygdonian graben and surface ruptures after the 1978 earthquake [21] are shown as red lines. (b) Three-dimensional (3D) model of the Mygdonian basin geological structure [25]. (Figures from http://euroseisdb.civil.auth.gr/geotec).

Methods
The methods that are followed in this work correspond to two different, fully probabilistic procedures to account for highly non-linear soil response in PSHA, a Full Probabilistic Stochastic Method (SM) that is developed for the present work, and the analytical approximation of the full convolution method (AM), as proposed by [10,11].
For both approaches, the first step consists in deriving the hazard curve for the specific bedrock properties of the considered site. As shown in Figure 2a and Table 1, this bedrock is characterized by a high S-wave velocity (2600 m/s), and is therefore much harder than "standard rock" conditions corresponding to V S30 = 800 m/s. Hence, host-to-target adjustments are required [31][32][33][34] to adjust the ground motion estimates that are provided by Ground Motion Prediction Equations (GMPEs) for the larger shear wave velocity and also possibly for the different high-frequency attenuation of the reference rock (such adjustments are commonly known as Vs-kappa effect [35]). For the sake of simplicity in the present exercise, the hazard curve has been derived with only one GMPE (Akkar et al., 2014) (AA14) [36], which is satisfactorily representative of the median hazard curve on rock obtained with seven other GMPEs deemed to be relevant for the European area (see [15]). The host-to-target (HTT) adjustments have been performed here on the basis of simple velocity adjustments calibrated on KiKnet data using the Laurendeau et al., 2107 GMPE [37], since the detailed knowledge of the site characteristics does not compensate the very poor information on the "host" characteristics of any GMPE (in particular those from European data), which results in very large uncertainty levels in classical HTT approaches [38].
For the SM approach, the next step consists in generating a synthetic earthquake catalog by sampling the earthquake recurrence model. The area source model that was developed in SHARE [18] is sampled with the Monte Carlo method to generate the earthquake catalog (OpenQuake engine, Stochastic Event Set Calculator tool [39]). The Boore (2003) Stochastic Method [40,41] is then applied to generate synthetic time histories at the bedrock site corresponding to all earthquakes in the catalog. Some specific adjustments and corrections were required to obtain ground motions compatible with the ground-motion model (AA14) used in the classical PSHA calculation. This step, which is not trivial, is described in detail in the hard rock section.
The hard rock, correctly scaled time histories are then propagated to the site surface using two different 1D non-linear site response codes. One set of time histories on soil is generated that is based on the SHAKE91 equivalent-linear (EL) code [42][43][44], while the second set was derived using the NOAH [45] fully non-linear code (NL). Both of the codes were used with the objective of incorporating somehow the code-to-code variability, which has been shown in all recent benchmarking exercises to be very important [46,47]. The final step in the SM method consists in deriving the site-specific hazard curve by simply calculating the annual rate of exceedance from the set of surface synthetic time series.
The AM approach, as proposed by [11], requires, for each frequency, f, a statistical description of the response spectrum amplification function AF( f , S r a ) as a function of input ground motion on rock, S r a . This is done, for each used site response code, with a piecewise linear function describing the dependence of the median AF( f ) with loading level S r a ( f ), together with an appropriate lognormal distribution, accounting for its variability as a function of input ground motion on rock, S r a . The large number of synthetic time histories at site surface were used to derive the piecewise linear function of the AM (AF( f ) vs. S r a ( f )). Finally, the bedrock hazard curve (including host-to-target adjustments) is then convolved analytically with the simplified amplification function following [11], to obtain an estimate of the site surface hazard curve, to be compared with the SM estimate.

Results for Hard Rock Hazard
The site-specific soil hazard estimates at the Euroseistest using the two different fully probabilistic methods are presented below, with step-by-step explanations of the hypotheses that are considered and the obtained results.

Rock Target Hazard
The first step consists in defining the target hazard on a standard rock at the Euroseistest, to be able to generate a set of consistent synthetic time histories. Following the methodology that is described in [14,15], the hazard curve calculated using the AA14 GMPE was selected as the target hazard curve for EC8 standard rock (V S30 = 800 m/s) for this exercise. Figure 3 displays the AA14 hazard curves at three different spectral periods (PGA, 0.2 s and 1.0 s, respectively) using 5% spectral damping ratio (ξ = 5%). The hazard is calculated while using the area source model that was developed in the European SHARE project [18] (xml input files available on the efher.org website) and the Openquake Engine [39,48].

Rock Target Hazard
The first step consists in defining the target hazard on a standard rock at the Euroseistest, to be able to generate a set of consistent synthetic time histories. Following the methodology that is described in [14,15], the hazard curve calculated using the AA14 GMPE was selected as the target hazard curve for EC8 standard rock (VS30 = 800 m/s) for this exercise. Figure 3 displays the AA14 hazard curves at three different spectral periods (PGA, 0.2 s and 1.0 s, respectively) using 5% spectral damping ratio (ξ = 5%). The hazard is calculated while using the area source model that was developed in the European SHARE project [18] (xml input files available on the efher.org website) and the Openquake Engine [39,48].

Host-to-Target Adjustments
As presented in more detail in [21,22], the bedrock beneath the Euroseistest basin (~196 m depth) has a shear wave velocity of about 2600 m/s, which is much larger than the validity range of most GMPEs. It is therefore mandatory to perform what is known in literature as "host-to-target adjustment" corrections (HTT [31,32,33,34]). As discussed in [14,15], the high level of complexity and uncertainty associated to the current HTT adjustments procedures [16,37,49,50,51], have motivated us to use here another, simpler, straightforward way to account for rock to hard rock correction for probabilistic seismic hazard purposes.
This simpler approach is based on the work by Laurendeau et al., 2017 (LA17) [37], who proposed a new rock GMPE that is valid for surface velocities ranging from 500 m/s to 3000 m/s and derived on the basis of the Japanese KiK-net dataset [52]. This GMPE presents the advantage to rely only on the value of rock velocity, without requiring the κ0 values either for the host region or for the target site, and therefore avoiding the associated uncertainties. The adjustments are thus performed on the basis on one single proxy for rock, as it is usually done for all soil deposits. It may account for the variability of some other mechanical characteristics (such as κ0, or the fundamental frequency f0), but only through their potential correlations with VS30. We consider, as discussed in [37,38], that such an approach is as reliable as the conventional HTT approach due to the huge uncertainties and possible bias linked with the measurement of the physical parameter κ0. It provides rather robust predictions of rock motion whatever the approach used for their derivation: deconvolution of surface recordings to outcropping bedrock with the known 1D profile, or the correction of down-hole recordings, see [37] for the corresponding discussion.
The LA17 GMPE is based on a combined dataset including surface recordings from stiff sites and hard rock motion estimated through deconvolution of the same recordings by the corresponding 1D linear site response. The site term in the proposed (simple) GMPE equation is given by a simple dependence on rock S-wave velocity value VS30 in the form ( ). ( /1000), where is tuned from actual recordings for each spectral oscillator period ( ). The HTT adjustment factor can thus be computed, as follows in Equation (1)

Host-to-Target Adjustments
As presented in more detail in [21,22], the bedrock beneath the Euroseistest basin (~196 m depth) has a shear wave velocity of about 2600 m/s, which is much larger than the validity range of most GMPEs. It is therefore mandatory to perform what is known in literature as "host-to-target adjustment" corrections (HTT [31][32][33][34]). As discussed in [14,15], the high level of complexity and uncertainty associated to the current HTT adjustments procedures [16,37,[49][50][51], have motivated us to use here another, simpler, straightforward way to account for rock to hard rock correction for probabilistic seismic hazard purposes.
This simpler approach is based on the work by Laurendeau et al., 2017 (LA17) [37], who proposed a new rock GMPE that is valid for surface velocities ranging from 500 m/s to 3000 m/s and derived on the basis of the Japanese KiK-net dataset [52]. This GMPE presents the advantage to rely only on the value of rock velocity, without requiring the κ 0 values either for the host region or for the target site, and therefore avoiding the associated uncertainties. The adjustments are thus performed on the basis on one single proxy for rock, as it is usually done for all soil deposits. It may account for the variability of some other mechanical characteristics (such as κ 0 , or the fundamental frequency f 0 ), but only through their potential correlations with V S30 . We consider, as discussed in [37,38], that such an approach is as reliable as the conventional HTT approach due to the huge uncertainties and possible bias linked with the measurement of the physical parameter κ 0 . It provides rather robust predictions of rock motion whatever the approach used for their derivation: deconvolution of surface recordings to outcropping bedrock with the known 1D profile, or the correction of down-hole recordings, see [37] for the corresponding discussion.
The LA17 GMPE is based on a combined dataset including surface recordings from stiff sites and hard rock motion estimated through deconvolution of the same recordings by the corresponding 1D linear site response. The site term in the proposed (simple) GMPE equation is given by a simple dependence on rock S-wave velocity value V S30 in the form S 1 (T). ln(V S30 /1000), where S 1 is tuned from actual recordings for each spectral oscillator period (T). The HTT adjustment factor can thus be computed, as follows in Equation (1):

HTT Adjutment Factors(T) =
Hard Rock HC (Vs = 2600 m/s) Standard Rock HC (Vs = 800 m/s) = S 1 (T).ln (2600/800) (1) Since the site terms in LA17, for such hard and standard rock, do not include any non-linearity, the resulting HTT adjustment factors are ground-motion independent, and thus return period (Tr) independent. The correction procedure consists simply in computing the hazard curves for standard rock (V S30 = 800 m/s) using AA14 GMPE, and then correcting it with the so defined, frequency-dependent LA17 HTT factor. The values listed in Table 2 indicate that hard rock motion is systematically smaller than the standard rock motion at all of the spectral periods, with larger reduction factors at short periods. Of course, these corrections are, in principle, only valid for Japanese rock sites, but on the other hand most of HTT methods are implicitly using, in practice, typical United States-California (US-California) velocity rock profiles at least for the host side. There is no obvious reason why such US profiles would be more acceptable for Europe than Japanese results. Ultimately, the hard-rock hazard curve is then derived by simply scaling the AA14 hazard curve with the HTT adjustments factors for each period (Figure 4).
Since the site terms in LA17, for such hard and standard rock, do not include any non-linearity, the resulting HTT adjustment factors are ground-motion independent, and thus return period (Tr) independent. The correction procedure consists simply in computing the hazard curves for standard rock (VS30 = 800 m/s) using AA14 GMPE, and then correcting it with the so defined, frequencydependent LA17 HTT factor. The values listed in Table 2 indicate that hard rock motion is systematically smaller than the standard rock motion at all of the spectral periods, with larger reduction factors at short periods. Of course, these corrections are, in principle, only valid for Japanese rock sites, but on the other hand most of HTT methods are implicitly using, in practice, typical United States-California (US-California) velocity rock profiles at least for the host side. There is no obvious reason why such US profiles would be more acceptable for Europe than Japanese results. Ultimately, the hard-rock hazard curve is then derived by simply scaling the AA14 hazard curve with the HTT adjustments factors for each period (Figure 4).   One might wonder why scaling AA14 curve instead of using directly LA17 in the calculations. The main reason was that AA14 (VS30 = 800 m/s) provides a good approximation to the median hazard estimate at the Euroseistest on standard rock from seven explored GMPES [15], while the estimates with LA17 (VS30 = 800 m/s) is located outside the uncertainty range of the same selected representative GMPEs. This is probably due to the very simple functional form that is used in LA17, and to the fact that it was elaborated only from Japanese data. Its main interest however is to provide a rock adjustment factor calibrated on actual data from a large number of rock and hard rock sites, without any assumptions regarding non-measured parameters, such as κ0. Once the approach that is proposed here and initially based in LA17 is tested on other data sets, the corresponding results can be used to provide alternative scaling factors between hard rock and standard rock. Note, however, that this modulation approach through another kind of GMPE is exactly similar to what has been proposed in [16], where stochastic simulations are used for the VS-κ corrections without being used as GMPEs.

Synthetic Earthquake Catalog
Once defined the target hazard on hard rock, we proceed to generate a long enough seismic catalog to build synthetic hazard curves both on rock and soil at the selected Euroseistest site. The  One might wonder why scaling AA14 curve instead of using directly LA17 in the calculations. The main reason was that AA14 (V S30 = 800 m/s) provides a good approximation to the median hazard estimate at the Euroseistest on standard rock from seven explored GMPES [15], while the estimates with LA17 (V S30 = 800 m/s) is located outside the uncertainty range of the same selected representative GMPEs. This is probably due to the very simple functional form that is used in LA17, and to the fact that it was elaborated only from Japanese data. Its main interest however is to provide a rock adjustment factor calibrated on actual data from a large number of rock and hard rock sites, without any assumptions regarding non-measured parameters, such as κ 0 . Once the approach that is proposed here and initially based in LA17 is tested on other data sets, the corresponding results can be used to provide alternative scaling factors between hard rock and standard rock. Note, however, that this modulation approach through another kind of GMPE is exactly similar to what has been proposed in [16], where stochastic simulations are used for the V S -κ corrections without being used as GMPEs.

Synthetic Earthquake Catalog
Once defined the target hazard on hard rock, we proceed to generate a long enough seismic catalog to build synthetic hazard curves both on rock and soil at the selected Euroseistest site. The earthquake catalog is obtained by sampling the area source model of SHARE. To handle a reasonable amount of events, only the sources that are contributing to the hazard must be sampled. To identify which area sources contribute to the hazard at the Euroseistest site, the individual contributions from the crustal sources in the neighborhood of the site were calculated. The individual hazard curves corresponding to each source were compared to the total hazard curve using all sources, showing that the area source enclosing the Euroseistest site (GRAS390) fully controls the hazard at all of the considered periods, so that we can neglect the other sources in our calculations.
The stochastic earthquake catalog was generated by sampling the magnitude-frequency distribution (truncated Gutenberg-Richter model) of the area source zone, and earthquakes are distributed homogeneously within the area source zone. For this purpose, we use the Stochastic Event Set Calculator tool from the Openquake Engine [39,48]. The generated catalog must be long enough to estimate correctly the hazard for the return period of interest, 5000 years (equivalent to a probability of exceedance of 1.0% in 50 years). The length of the catalog is obviously a key issue when calculating seismic hazard with the Monte Carlo technique.
To identify which minimum length is required, a sensitivity analysis is led consisting in comparing the different magnitude-frequency distribution curves and hazard curves that were obtained with different catalog lengths. Figure 5 displays the magnitude-frequency distribution curves built using increasing catalog lengths (500, 5000, 25,000, 35,500, 50,000 years), when compared to the original SHARE model (black solid line). For short catalog lengths (here 500 years), the recurrence model is not correctly sampled in the upper magnitude range. The catalog length is too short with respect to the recurrence times of large events. When considering longer catalog lengths (from 5000 years on), the full magnitude range is sampled. However, the correct annual rates for the upper magnitude range are obtained with a reasonable error (~5%) only for very long earthquake catalog lengths (larger or equal to 50,000 years). Most important is that the magnitude range contributing to the hazard at the return periods of interest are correctly sampled. earthquake catalog is obtained by sampling the area source model of SHARE. To handle a reasonable amount of events, only the sources that are contributing to the hazard must be sampled. To identify which area sources contribute to the hazard at the Euroseistest site, the individual contributions from the crustal sources in the neighborhood of the site were calculated. The individual hazard curves corresponding to each source were compared to the total hazard curve using all sources, showing that the area source enclosing the Euroseistest site (GRAS390) fully controls the hazard at all of the considered periods, so that we can neglect the other sources in our calculations. The stochastic earthquake catalog was generated by sampling the magnitude-frequency distribution (truncated Gutenberg-Richter model) of the area source zone, and earthquakes are distributed homogeneously within the area source zone. For this purpose, we use the Stochastic Event Set Calculator tool from the Openquake Engine [39,48]. The generated catalog must be long enough to estimate correctly the hazard for the return period of interest, 5000 years (equivalent to a probability of exceedance of 1.0% in 50 years). The length of the catalog is obviously a key issue when calculating seismic hazard with the Monte Carlo technique.
To identify which minimum length is required, a sensitivity analysis is led consisting in comparing the different magnitude-frequency distribution curves and hazard curves that were obtained with different catalog lengths. Figure 5 displays the magnitude-frequency distribution curves built using increasing catalog lengths (500, 5000, 25,000, 35,500, 50,000 years), when compared to the original SHARE model (black solid line). For short catalog lengths (here 500 years), the recurrence model is not correctly sampled in the upper magnitude range. The catalog length is too short with respect to the recurrence times of large events. When considering longer catalog lengths (from 5000 years on), the full magnitude range is sampled. However, the correct annual rates for the upper magnitude range are obtained with a reasonable error (~5%) only for very long earthquake catalog lengths (larger or equal to 50,000 years). Most important is that the magnitude range contributing to the hazard at the return periods of interest are correctly sampled.   Figure 6a display the stochastic hazards curves on rock obtained from the five catalogs with increasing lengths, as well as the hazard curve calculated with the classical PSHA calculation (using AA14 ground-motion model). Figure 6b illustrates the ratio between the PGAs relying on the Monte Carlo method and the PGAs relying on the classical approach, interpolated for different annual rates. Based on these results, we decide to use a 50,000 years earthquake catalog to estimate the hazard, since the difference between PGAs is lower than 5% for all of the annual rates down to 10 −4 (or return periods up to 10,000 years). The final selected catalog of 5000 years consists of~21,800 events ranging from the minimum considered magnitude M w = 4.5 to the maximum magnitude M w = 7.8, with a distance range (Joyner and Boore) from 0 to 150 km and a depth range from 0 to 30 km. model of the source zone GRAS390 in the SHARE model. (f) The 5 recurrence curves obtained are superimposed to the original recurrence model. Figure 6a display the stochastic hazards curves on rock obtained from the five catalogs with increasing lengths, as well as the hazard curve calculated with the classical PSHA calculation (using AA14 ground-motion model). Figure 6b illustrates the ratio between the PGAs relying on the Monte Carlo method and the PGAs relying on the classical approach, interpolated for different annual rates. Based on these results, we decide to use a 50,000 years earthquake catalog to estimate the hazard, since the difference between PGAs is lower than 5% for all of the annual rates down to 10 −4 (or return periods up to 10,000 years). The final selected catalog of 5000 years consists of ~21,800 events ranging from the minimum considered magnitude Mw = 4.5 to the maximum magnitude Mw = 7.8, with a distance range (Joyner and Boore) from 0 to 150 km and a depth range from 0 to 30 km.

Synthetic Time Histories on Rock
Now, we can proceed to generate synthetic time histories on hard rock. We propose to define the input motions combining a simple approach to generate synthetic time histories with the AA14 GMPE. An arbitrary seismological model is used to generate the waveforms that are then scaled to match the target accelerations, using a continuous function. We consider that it is reasonable: (1) to define the input motions using the European AA14 GMPE and (2) to generate synthetic time histories that are compatible to the selected GMPE on the basis of a simple, point source seismological model with plausible but arbitrary source and path parameters, providing waveforms that are later scaled using a continuous function to match the target accelerations at different spectral periods.
We use the well-known stochastic method that was proposed by D. Boore [40] for the generation of ground-motion time histories, and its corresponding Fortran code SMSIM [41], which is freely provided by the author on his website. Any other suitable method to generate synthetic time histories could be used. This step is one of the most time consuming of the entire process, since the length of the earthquake catalog implies a very large number (~21,800) of events and thus the generation of numerous time histories at the site. The main input parameters are [40]: the source characteristics (seismic moment M0 and stress drop Δσ or corner frequency fc), the path terms with geometrical spreading G(R), anelastic attenuation Q(f), and duration terms Tgm(f) related to hypocentral distance Dhyp, the frequency , and the site terms including the crustal amplification factor CAF(f) derived from the site velocity profile VS(z), together with the site specific shallow attenuation factor characterized by the κ0 value. When considering the published, locally available data at the

Synthetic Time Histories on Rock
Now, we can proceed to generate synthetic time histories on hard rock. We propose to define the input motions combining a simple approach to generate synthetic time histories with the AA14 GMPE. An arbitrary seismological model is used to generate the waveforms that are then scaled to match the target accelerations, using a continuous function. We consider that it is reasonable: (1) to define the input motions using the European AA14 GMPE and (2) to generate synthetic time histories that are compatible to the selected GMPE on the basis of a simple, point source seismological model with plausible but arbitrary source and path parameters, providing waveforms that are later scaled using a continuous function to match the target accelerations at different spectral periods.
We use the well-known stochastic method that was proposed by D. Boore [40] for the generation of ground-motion time histories, and its corresponding Fortran code SMSIM [41], which is freely provided by the author on his website. Any other suitable method to generate synthetic time histories could be used. This step is one of the most time consuming of the entire process, since the length of the earthquake catalog implies a very large number (~21,800) of events and thus the generation of numerous time histories at the site. The main input parameters are [40]: the source characteristics (seismic moment M 0 and stress drop ∆σ or corner frequency f c ), the path terms with geometrical spreading G(R), anelastic attenuation Q(f), and duration terms Tgm(f) related to hypocentral distance D hyp , the frequency f , and the site terms including the crustal amplification factor CAF(f) derived from the site velocity profile V S (z), together with the site specific shallow attenuation factor characterized by the κ 0 value. When considering the published, locally available data at the Euroseistest site, the Boore stochastic method has been applied with the crustal velocity structure detailed in [26], a κ 0 value of 0.024 s [53], and a lognormal distribution of the stress drop with a median value of 30 bars and standard deviation of 0.68 (natural log). The ground-motion duration (Tgm) used here corresponds to the sum of the source duration, which is related to the inverse of the corner frequency (1/fc) and a path-dependent duration (0.05 R follows [54] model).
We calculate the hazard curves on rock based on the synthetic time histories generated using SMSIM (Figure 7, PGA, 0.2 s and 1.0 s). The annual rates of the generated time histories are significantly lower than the rates of the target hard-rock hazard curve relying on AA14 GMPE (green curve on Figure 7). Such a discrepancy is related to the lack of consistency between the theoretical stochastic approach (source, path, and site term) and the empirical AA14 GMPE. A correction should thus be applied to the results of the stochastic simulation, which can be done in several possible ways. The first way consists in tuning the SMSIM ground-motion model, by performing a sensitivity analysis on the input parameters (geometrical spreading term, crustal attenuation, ∆σ, κo) until a hazard curve close to the target hazard curve is obtained. We tried this approach on a trial-and-error basis, but we found it to be very time-consuming and rather inefficient in obtaining time histories simultaneously compatible at all spectral periods. Actually, such an approach would require a full inversion of the "free" parameters mentioned above, as performed recently, for instance by [55]. Hence, we opt for a much faster, easier, and entirely pragmatic scaling technique, as described below. Euroseistest site, the Boore stochastic method has been applied with the crustal velocity structure detailed in [26], a κ0 value of 0.024 s [53], and a lognormal distribution of the stress drop with a median value of 30 bars and standard deviation of 0.68 (natural log). The ground-motion duration (Tgm) used here corresponds to the sum of the source duration, which is related to the inverse of the corner frequency (1/fc) and a path-dependent duration (0.05 R follows [54] model).
We calculate the hazard curves on rock based on the synthetic time histories generated using SMSIM (Figure 7, PGA, 0.2 s and 1.0 s). The annual rates of the generated time histories are significantly lower than the rates of the target hard-rock hazard curve relying on AA14 GMPE (green curve on Figure 7). Such a discrepancy is related to the lack of consistency between the theoretical stochastic approach (source, path, and site term) and the empirical AA14 GMPE. A correction should thus be applied to the results of the stochastic simulation, which can be done in several possible ways. The first way consists in tuning the SMSIM ground-motion model, by performing a sensitivity analysis on the input parameters (geometrical spreading term, crustal attenuation, Δσ, κo) until a hazard curve close to the target hazard curve is obtained. We tried this approach on a trial-and-error basis, but we found it to be very time-consuming and rather inefficient in obtaining time histories simultaneously compatible at all spectral periods. Actually, such an approach would require a full inversion of the "free" parameters mentioned above, as performed recently, for instance by [55]. Hence, we opt for a much faster, easier, and entirely pragmatic scaling technique, as described below. .0] s. These AA14 predictions were corrected while using the site term of the LA17 GMPE to derive the expected motion for a hard rock with Vs = 2600 m/s (see HTT, Section 4.2). Then, for each event (Mw, Dhyp), we calculate the ratios between the synthetic spectral acceleration and the corrected AA14 acceleration (VS30 = 2600 m/s). The ratios that were obtained for different spectral periods are displayed in Figure 8a. The error bar represents the median ± one standard deviation calculated over all the ratios available for a given magnitude. The variability comes from the inconsistency between the distance term in the AA14 GMPE and its equivalent contribution in the SMSIM approach (i.e., the combination of the geometrical spreading term and the crustal/site attenuation terms).
The horizontal black lines in Figure 8a represent the average correction factors when all magnitudes and distances are considered and the associated standard deviation (dashed lines). These values are displayed in Figure 8b as a function of the spectral period, showing that the largest correction corresponds to rather long periods (around 2.0 s), suggesting that part of this discrepancy could be due to the fact that only point sources were considered. Indeed, the correction factors in Figure 8a increase with magnitude, which suggests that the use of a code considering extended source should be preferred. .0] s. These AA14 predictions were corrected while using the site term of the LA17 GMPE to derive the expected motion for a hard rock with Vs = 2600 m/s (see HTT, Section 4.2). Then, for each event (M w , D hyp ), we calculate the ratios between the synthetic spectral acceleration and the corrected AA14 acceleration (V S30 = 2600 m/s). The ratios that were obtained for different spectral periods are displayed in Figure 8a. The error bar represents the median ± one standard deviation calculated over all the ratios available for a given magnitude. The variability comes from the inconsistency between the distance term in the AA14 GMPE and its equivalent contribution in the SMSIM approach (i.e., the combination of the geometrical spreading term and the crustal/site attenuation terms).
The horizontal black lines in Figure 8a represent the average correction factors when all magnitudes and distances are considered and the associated standard deviation (dashed lines). These values are displayed in Figure 8b as a function of the spectral period, showing that the largest correction corresponds to rather long periods (around 2.0 s), suggesting that part of this discrepancy could be due to the fact that only point sources were considered. Indeed, the correction factors in Figure 8a increase with magnitude, which suggests that the use of a code considering extended source should be preferred. Figure 8b also displays the values of the median correction factor + one and two standard deviations, and their interpolation from period to period to derive a continuous, period-dependent average correction factor, required to scale the generated stochastic time histories to match the target hazard at all spectral periods.
The next step consisted in scaling the~21,800 stochastic time histories by applying the continuous correction factors (Figure 8b) derived in the response spectra domain to each one of the time histories in the Fourier domain, and posteriorly retrieving the new scaled time histories by inverse Fourier transform, as shown in Figure 8c. After the scaling process is performed, we calculate the new hazard curves based on the scaled accelerograms. Figure 9a shows the newly scaled hazard curve, this time being closer to the target hazard curve (AA14 2600 m/s) than the original one, but still with lower annual rates. We applied the same correction procedure using the median + one standard deviation (CF+1σ) and median + two standard deviations (CF+2σ) correction factors. The resulting hazard curves are displayed in Figure 9b,c. standard deviations, and their interpolation from period to period to derive a continuous, perioddependent average correction factor, required to scale the generated stochastic time histories to match the target hazard at all spectral periods. The next step consisted in scaling the ~21,800 stochastic time histories by applying the continuous correction factors (Figure 8b) derived in the response spectra domain to each one of the time histories in the Fourier domain, and posteriorly retrieving the new scaled time histories by inverse Fourier transform, as shown in Figure 8c. After the scaling process is performed, we calculate the new hazard curves based on the scaled accelerograms. Figure 9a shows the newly scaled hazard curve, this time being closer to the target hazard curve (AA14 2600 m/s) than the original one, but still with lower annual rates. We applied the same correction procedure using the median + one standard deviation (CF+1σ) and median + two standard deviations (CF+2σ) correction factors. The resulting hazard curves are displayed in Figure 9b,c. an Mw = 7.9 accelerogram generated using SMSIM (red) and then scaled by the interpolated median + two standard deviations correction factor (black). The ratios between the target AA14 accelerations (VS30 = 2600 m/s) and the scaled stochastic accelerations, interpolated for a series of annual exceedance rates, are shown in Figure 10. The best agreement is obtained by applying the (CF+2σ) scaling factor; the resulting ratio is close to 1.0 for most annual rates of exceedance (and especially the smallest) at the three displayed spectral periods. It is important to mention that the (CF+2σ) correction factor is a pragmatic non-physical parameter that we used to scale the accelerograms in order to fit a certain target hazard level. The ratios between the target AA14 accelerations (V S30 = 2600 m/s) and the scaled stochastic accelerations, interpolated for a series of annual exceedance rates, are shown in Figure 10. The best agreement is obtained by applying the (CF+2σ) scaling factor; the resulting ratio is close to 1.0 for most annual rates of exceedance (and especially the smallest) at the three displayed spectral periods. It is important to mention that the (CF+2σ) correction factor is a pragmatic non-physical parameter that we used to scale the accelerograms in order to fit a certain target hazard level.  Figure 11 displays the synthetic mean spectral acceleration values before and after the scaling process and is compared to Akkar et al., 2013 [36] scaled while using the HTT factors ( Figure 11). This plot is shown for three different spectral periods (PGA, 0.2 s, 1.0 s) and for events of Mw = [4.7, 6.5, 7.5]. Events in the catalog were generated for different hypocentral distances (Dhyp) and a scaling factor at +2σ was applied to generate compatible hazard curves as explained in the previous step. As mentioned before, this correction is only a pragmatic way of scaling accelerograms and it does not rely on a physics-based scaling strategy. Nonetheless, it does provide a very satisfactory fit between the stochastic hazard curves and the target hazard curves at all spectral periods, and it upholds the original variability of the accelerograms. Future applications should improve this step to find a physics-based optimal fitting procedure (such as higher, possibly magnitude-dependent, stress drops, frequency-dependent geometrical spreading, and crustal quality factor).  Figure 11 displays the synthetic mean spectral acceleration values before and after the scaling process and is compared to Akkar et al., 2013 [36] scaled while using the HTT factors ( Figure 11). This plot is shown for three different spectral periods (PGA, 0.2 s, 1.0 s) and for events of M w = [4.7, 6.5, 7.5]. Events in the catalog were generated for different hypocentral distances (D hyp ) and a scaling factor at +2σ was applied to generate compatible hazard curves as explained in the previous step. As mentioned before, this correction is only a pragmatic way of scaling accelerograms and it does not rely on a physics-based scaling strategy. Nonetheless, it does provide a very satisfactory fit between the stochastic hazard curves and the target hazard curves at all spectral periods, and it upholds the original variability of the accelerograms. Future applications should improve this step to find a physics-based optimal fitting procedure (such as higher, possibly magnitude-dependent, stress drops, frequency-dependent geometrical spreading, and crustal quality factor).  Figure 11 displays the synthetic mean spectral acceleration values before and after the scaling process and is compared to Akkar et al., 2013 [36] scaled while using the HTT factors ( Figure 11). This plot is shown for three different spectral periods (PGA, 0.2 s, 1.0 s) and for events of Mw = [4.7, 6.5, 7.5]. Events in the catalog were generated for different hypocentral distances (Dhyp) and a scaling factor at +2σ was applied to generate compatible hazard curves as explained in the previous step. As mentioned before, this correction is only a pragmatic way of scaling accelerograms and it does not rely on a physics-based scaling strategy. Nonetheless, it does provide a very satisfactory fit between the stochastic hazard curves and the target hazard curves at all spectral periods, and it upholds the original variability of the accelerograms. Future applications should improve this step to find a physics-based optimal fitting procedure (such as higher, possibly magnitude-dependent, stress drops, frequency-dependent geometrical spreading, and crustal quality factor).

Synthetic Time Histories on Soil
Based on the (CF+2σ) scaled synthetic time histories on hard rock, we can now generate the corresponding time histories on soil by performing a 1D equivalent-linear (EL, SHAKE91 [42,43,44]) and 1D non-linear (NL, NOAH [45]) site response analyses.
The soil profile (Table 1) and the degradation curves for all layers (Figure 2b,c) were used to perform the 1D site response calculations. SHAKE91 requires only degradation curves, while NOAH requires the maximum stresses, related to the cohesion, friction angle φ, and the coefficient of earth at rest K0. For the present exercise, the cohesion was assumed to be zero in all layers since no sufficient information was available being related to this parameter, especially at large depths. It is important to highlight that the (CF+2σ) scaled synthetic time histories on hard rock are representative of the AA14 (VS30 = 2600 m/s) ground motion predictions for outcropping conditions, and this should be specified inside the site response input files in order to appropriately remove the free field surface effect (about a factor of ~2.0).

Focus on Nonlinear Ground Motion Saturation
An interesting way to analyze the surface synthetics is to compare the spectral acceleration on soil, with respect to the corresponding input ground motion on rock. Figure 12a shows the results at three spectral periods (PGA, 0.2 s and 1.0 s) for the SHAKE91 computations and Figure 12b for the NOAH computations.
In both cases, the soil presents significant nonlinear behavior, even at weak/intermediate motion levels (0.01 g-0.1 g), and de-amplification is observed on soil with respect to the rock motion at ground motion levels around (0.1 g-0.3 g). Moreover, a saturation level of the acceleration is always reached in both soil cases (SHAKE91 and NOAH) and at all spectral periods. The point where the saturation limit is reached depends strongly on the spectral period (at least for this example), and less on the propagation code that is used for the site response analysis. The saturation limit appears faster at high frequency than at low frequency for both codes. These results also indicate a larger variability of the data for a given rock input level when using SHAKE91 when compared to the NOAH results. Using SHAKE91, the dispersion of calculated accelerations increases with the spectral period (this is not the case using NOAH). Figure 11. Mean spectral acceleration decay with respect to distance (km) predicted by AA14·HTT ± 2σ (corrected for hard rock using HTT factors) for three different spectral periods (PGA, 0.2 s, 1.0 s), superimposed to the original SMSIM accelerations for hard rock (blue dots) and the scaled accelerations after applying a correction factor of CF+2σ (red dots) for events of M w = [4.7, 6.5, 7.5].

Synthetic Time Histories on Soil
Based on the (CF+2σ) scaled synthetic time histories on hard rock, we can now generate the corresponding time histories on soil by performing a 1D equivalent-linear (EL, SHAKE91 [42][43][44]) and 1D non-linear (NL, NOAH [45]) site response analyses.
The soil profile (Table 1) and the degradation curves for all layers (Figure 2b,c) were used to perform the 1D site response calculations. SHAKE91 requires only degradation curves, while NOAH requires the maximum stresses, related to the cohesion, friction angle ϕ, and the coefficient of earth at rest K 0 . For the present exercise, the cohesion was assumed to be zero in all layers since no sufficient information was available being related to this parameter, especially at large depths. It is important to highlight that the (CF+2σ) scaled synthetic time histories on hard rock are representative of the AA14 (V S30 = 2600 m/s) ground motion predictions for outcropping conditions, and this should be specified inside the site response input files in order to appropriately remove the free field surface effect (about a factor of~2.0).

Focus on Nonlinear Ground Motion Saturation
An interesting way to analyze the surface synthetics is to compare the spectral acceleration on soil, with respect to the corresponding input ground motion on rock. Figure 12a shows the results at three spectral periods (PGA, 0.2 s and 1.0 s) for the SHAKE91 computations and Figure 12b for the NOAH computations.
In both cases, the soil presents significant nonlinear behavior, even at weak/intermediate motion levels (0.01 g-0.1 g), and de-amplification is observed on soil with respect to the rock motion at ground motion levels around (0.1 g-0.3 g). Moreover, a saturation level of the acceleration is always reached in both soil cases (SHAKE91 and NOAH) and at all spectral periods. The point where the saturation limit is reached depends strongly on the spectral period (at least for this example), and less on the propagation code that is used for the site response analysis. The saturation limit appears faster at high frequency than at low frequency for both codes. These results also indicate a larger variability of the data for a given rock input level when using SHAKE91 when compared to the NOAH results. Using SHAKE91, the dispersion of calculated accelerations increases with the spectral period (this is not the case using NOAH).

Ground Motion Variability
It is also of interest to better comprehend what happens with the ground motion variability after performing the 1D site response analysis with the two selected codes. For this purpose, we compared the probability density function (PDF) of the ground motion acceleration (PGA) on hard rock at depth with respect to the PDF on soil, for different magnitude and distance ranges ( Figure 13).
A systematic amplification of the median ground motion at soil surface (SHAKE91 and NOAH) with respect to hard rock at depth is observed for small magnitudes M = [4.7-5.1] and all distances (Figure 13a). Non-linearity of the soil is not observed, indicating mostly linear behavior of the soil. Similarly, Figure 13b shows the ground motion variability at the intermediate magnitude range Mw = [6.1-6.5] at different distances (Dhyp) ranges. Amplification of the ground motion with respect to hard rock at depth is again observed. However, this time a clear saturation of the ground motion occurs at all distances due to the nonlinear effect of the soil (PDF truncated on the strong motion side, with more pronounced effect at short distances). This saturation is stronger using NOAH rather than SHAKE91, since NOAH is a fully nonlinear code. However, part of the difference can be also explained by intrinsic code differences and different degradation curves definition. At short distances, a smaller amplification and even deamplification on soil with respect to rock is observed.
Finally, for large magnitudes Mw = [7.1-7.9], a strong saturation of ground motion can be observed (Figure 13c), especially when using NOAH. It is also interesting to notice that the variability of the soil ground motion, σS ( ), is smaller than the variability of the hard rock ground motion, σS ( ). [2] also discussed this phenomenon, which has also been observed from real recordings on rock and soil conditions (see [56,57,58,59]).

Ground Motion Variability
It is also of interest to better comprehend what happens with the ground motion variability after performing the 1D site response analysis with the two selected codes. For this purpose, we compared the probability density function (PDF) of the ground motion acceleration (PGA) on hard rock at depth with respect to the PDF on soil, for different magnitude and distance ranges ( Figure 13).
A systematic amplification of the median ground motion at soil surface (SHAKE91 and NOAH) with respect to hard rock at depth is observed for small magnitudes M = [4.7-5.1] and all distances (Figure 13a). Non-linearity of the soil is not observed, indicating mostly linear behavior of the soil. Similarly, Figure 13b shows the ground motion variability at the intermediate magnitude range M w = [6.1-6.5] at different distances (D hyp ) ranges. Amplification of the ground motion with respect to hard rock at depth is again observed. However, this time a clear saturation of the ground motion occurs at all distances due to the nonlinear effect of the soil (PDF truncated on the strong motion side, with more pronounced effect at short distances). This saturation is stronger using NOAH rather than SHAKE91, since NOAH is a fully nonlinear code. However, part of the difference can be also explained by intrinsic code differences and different degradation curves definition. At short distances, a smaller amplification and even deamplification on soil with respect to rock is observed.
Finally, for large magnitudes M w = [7.1-7.9], a strong saturation of ground motion can be observed (Figure 13c), especially when using NOAH. It is also interesting to notice that the variability of the soil ground motion, σS s a ( f ), is smaller than the variability of the hard rock ground motion, σS r a ( f ). [2] also discussed this phenomenon, which has also been observed from real recordings on rock and soil conditions (see [56][57][58][59]).

Full Probabilistic Stochastic Method (SM)
What we call here the full probabilistic stochastic method is nothing else than the hazard curve built from stochastic time histories (~21,800 accelerograms), corresponding to the strong-motion history at the site during 50,000 years.
To build the stochastic hazard curves, we calculate the annual rate of exceedance, λ(x ≥ X), of a series of ground motion levels (X), for a given spectral period (T), by counting the number of occurrences of ground-motions (N) exceeding a certain ground motion level x, as follows:

SM Results for Hard Rock
The stochastic hazard curve on hard rock, ( ) , is thus built from the annual rate of exceedance of the scaled synthetic time histories set, as expressed in Equation (2), for three different spectral periods (PGA, 0.2 s and 1.0 s), Figure 14 (black).

SM Results for Soft Soil Using SHAKE91
Similarly, we calculate the hazard curves on soil at the site for different spectral periods, while using the synthetic time histories on soil that was obtained with the SHAKE91 equivalent-linear site response analysis code. The results displayed in Figure 14 exhibit two characteristic features when compared to their rock equivalent. For all spectral periods, the soil hazard curves exhibit a much stronger convexity than for the rock hazard curves: this is related to the non-linear behavior of soil,

Hard Rock
Soil (SHAKE) Soil (NOAH) Figure 13. Probability density function (PDF) of the ground motion acceleration (PGA) on hard rock (blue) and on soil after performing 1D site response analysis using SHAKE91 (red) and NOAH (green) for different ranges of magnitudes (columns) and distances (rows).

Full Probabilistic Stochastic Method (SM)
What we call here the full probabilistic stochastic method is nothing else than the hazard curve built from stochastic time histories (~21,800 accelerograms), corresponding to the strong-motion history at the site during 50,000 years.
To build the stochastic hazard curves, we calculate the annual rate of exceedance, λ(x ≥ X), of a series of ground motion levels (X), for a given spectral period (T), by counting the number of occurrences of ground-motions (N) exceeding a certain ground motion level x, as follows:

SM Results for Hard Rock
The stochastic hazard curve on hard rock, HC( f ), is thus built from the annual rate of exceedance of the scaled synthetic time histories set, as expressed in Equation (2), for three different spectral periods (PGA, 0.2 s and 1.0 s), Figure 14 (black).

SM Results for Soft Soil Using SHAKE91
Similarly, we calculate the hazard curves on soil at the site for different spectral periods, while using the synthetic time histories on soil that was obtained with the SHAKE91 equivalent-linear site response analysis code. The results displayed in Figure 14 exhibit two characteristic features when compared to their rock equivalent. For all spectral periods, the soil hazard curves exhibit a much stronger convexity than for the rock hazard curves: this is related to the non-linear behavior of soil, which saturates the ground motion for the upper acceleration range at long return periods. This is also true for the largest spectral period considered here (1.0 s), because it corresponds to a shorter period than the site fundamental period (around~1.5 s), and therefore it is affected by the non-linear behavior over the whole soil column.

SM Results for Soft Soil Using NOAH
Lastly, we calculate the hazard curves on soil at the site based on the synthetics that were obtained with NOAH (see Section 5.1). Results displayed in Figure 14, exhibit the same overall features than the SHAKE91 results: a pronounced convexity related with the existence of an upper bound for the surface ground motion due to the non-linear behavior. As the saturation is better accounted for in fully NL codes than in equivalent-linear approaches, the hazard curve that was predicted using NOAH is saturated at acceleration levels lower than in the SHAKE91 case. which saturates the ground motion for the upper acceleration range at long return periods. This is also true for the largest spectral period considered here (1.0 s), because it corresponds to a shorter period than the site fundamental period (around ~1.5 s), and therefore it is affected by the non-linear behavior over the whole soil column.

SM Results for Soft Soil Using NOAH
Lastly, we calculate the hazard curves on soil at the site based on the synthetics that were obtained with NOAH (see Section 5.1). Results displayed in Figure 14, exhibit the same overall features than the SHAKE91 results: a pronounced convexity related with the existence of an upper bound for the surface ground motion due to the non-linear behavior. As the saturation is better accounted for in fully NL codes than in equivalent-linear approaches, the hazard curve that was predicted using NOAH is saturated at acceleration levels lower than in the SHAKE91 case.

Analytical Approximation of the Full Convolution Method (AM)
In order to calculate a fully probabilistic hazard curve on soil at the surface (or at any desired depth), [10,11] proposed three different approaches to calculate hazard on soil, consisting in the convolution of the site-specific hazard curve at the bedrock level with simple to more complex descriptions of the site amplification function. These approaches provide more precise surface ground-motion hazard estimates than those found by means of standard GMPEs for generic soil conditions. These three different methods are: (1) Full Convolution of the reference rock hazard curve with the NL amplification function, AF( , ), and its variability. (2) PSHA with a Site-Specific GMPE (i.e., with the generic site term of the original GMPE being replaced by a site-specific term, S, derived from the site-specific amplification functions, ( , ), and the original aleatory variability being also modified to account for the actual variability in the site-specific amplification function). (3) Analytical estimate of the soil hazard convolving the probabilistic rock hazard curve with a statistical description of the non-linear site response and its variability.
In the present work, we decided to use the third approach, which is the simplest one to implement. The "Analytical Estimate of the Soil Hazard" method (AM) provides a simple, closedform solution that appropriately modifies the hazard results at the rock level to obtain the hazard curve on soil. Under several assumptions, as discussed in their paper, [10,11] stated that the acceleration on soil, ( ), can be predicted from the acceleration on rock, ( ), as follows: The amplification function is:

Analytical Approximation of the Full Convolution Method (AM)
In order to calculate a fully probabilistic hazard curve on soil at the surface (or at any desired depth), [10,11] proposed three different approaches to calculate hazard on soil, consisting in the convolution of the site-specific hazard curve at the bedrock level with simple to more complex descriptions of the site amplification function. These approaches provide more precise surface ground-motion hazard estimates than those found by means of standard GMPEs for generic soil conditions. These three different methods are: (1) Full Convolution of the reference rock hazard curve with the NL amplification function, AF( f , S r a ), and its variability. (2) PSHA with a Site-Specific GMPE (i.e., with the generic site term of the original GMPE being replaced by a site-specific term, S, derived from the site-specific amplification functions, AF( f , S r a ), and the original aleatory variability being also modified to account for the actual variability in the site-specific amplification function). (3) Analytical estimate of the soil hazard convolving the probabilistic rock hazard curve with a statistical description of the non-linear site response and its variability.
In the present work, we decided to use the third approach, which is the simplest one to implement. The "Analytical Estimate of the Soil Hazard" method (AM) provides a simple, closed-form solution that appropriately modifies the hazard results at the rock level to obtain the hazard curve on soil. Under several assumptions, as discussed in their paper, [10,11] stated that the acceleration on soil, S s a ( f ), can be predicted from the acceleration on rock, S r a ( f ), as follows: The amplification function is: Equation 4 simply corresponds to a log-linear dependence of the amplification factor AF( f ) in Equation (4) and the loading level S r a ( f ) in the form ln(AF( f )) = C 0 + C1 ln(S r a ( f )), with a variability described by a standard deviation σln(AF( f )), as in Figure 15.
They also assumed that the rock hazard curve HC(S r a ( f )) may be locally described with a power law dependence on the spectral acceleration, as follows: By convolving the rock hazard curve with the lognormal distribution of the site amplification factor and its linear dependence on the loading level, they could then derive a closed-form expression for the hazard curve at soil site, as follows: where σ corresponds to the standard deviation of the amplification function in log-normal scale, σln(AF( f )), and the spectral acceleration on rock, S r a ( f ), is related to the spectral acceleration on soil, S s a ( f ), through the corresponding median amplification curve AF( f , S r a ). In the case where the uncertainty can be associated to the rock hazard curve (e.g., when a logic tree is used with different GMPEs), the resulting uncertainty in the soil site hazard curve may be written, as proposed by [11]: The first step in the proposed procedure consists in calculating the amplification functions, AF( f ), in the response spectra domain for all input ground motions and for each site response approach (SHAKE91 and NOAH), and then proceed to fit piecewise linear regressions to obtain the values of the coefficients C 0 and C 1 , together with the associated variability, σln(AF( f )). The results are illustrated in Figure 15 for the two site-response codes and for the three considered spectral periods. It is interesting to notice that non-linear behavior is observed even at relatively large spectral periods (~1.0 s)-for large loading levels-and also at short periods (PGA and 0.2 s) from rather low input acceleration levels (blue regression lines in Figure 15a,b). Equation (4) and the loading level ( ) in the form ( ( )) = + 1 ( ( )) , with a variability described by a standard deviation σln( ( )), as in Figure 15. They also assumed that the rock hazard curve ( ( )) may be locally described with a power law dependence on the spectral acceleration, as follows: By convolving the rock hazard curve with the lognormal distribution of the site amplification factor and its linear dependence on the loading level, they could then derive a closed-form expression for the hazard curve at soil site, as follows: where σ corresponds to the standard deviation of the amplification function in log-normal scale, ( ( )), and the spectral acceleration on rock, ( ), is related to the spectral acceleration on soil, ( ), through the corresponding median amplification curve ( , ) . In the case where the uncertainty can be associated to the rock hazard curve (e.g., when a logic tree is used with different GMPEs), the resulting uncertainty in the soil site hazard curve may be written, as proposed by [11]: The first step in the proposed procedure consists in calculating the amplification functions, ( ), in the response spectra domain for all input ground motions and for each site response approach (SHAKE91 and NOAH), and then proceed to fit piecewise linear regressions to obtain the values of the coefficients C0 and C1, together with the associated variability, σ ( ( )). The results are illustrated in Figure 15 for the two site-response codes and for the three considered spectral periods. It is interesting to notice that non-linear behavior is observed even at relatively large spectral periods (~1.0 s)-for large loading levels-and also at short periods (PGA and 0.2 s) from rather low input acceleration levels (blue regression lines in Figure 15a,b).  Table 3.
The explanation for the former (long period non-linear behavior) can be related to the low fundamental frequency of the site (~[0.6-0.7] Hz): everything beyond that frequency is primarily controlled by the site response, and the large velocity contrast at ~200 m depth is likely to generate significant strains over the whole soil column. The latter observation, however, seems more likely to be related to the dependence of amplification factor on the response spectra on the actual frequency contents of the input waveform, leading to the magnitude dependence of the response spectral amplification, as reported in [58], than to a true NL behavior at very low strains, which cannot be identified in the corresponding Fourier domain amplifications [60].
The regression coefficients are listed in Table 3 together with their standard deviations. The hazard curves on soil can now be estimated according to Equation (6). It is worth discussing the limitations of this simplified analytical method (AM), as already acknowledged by [11]. The functional form provided in Equation (6) presents numerical limitations when the argument of the exponential is too large, i.e., when C1 is close to −1.0, and/or when the variability of the amplification factor (σ) is large. The former situation does occur at Euroseistest for high acceleration levels (green lines in Figure 15a,b), which corresponds to the saturation of surface ground motion displayed in Figure 12.
The hazard curves obtained are presented in Figure 16a (SHAKE91) Figure 16b (NOAH). In most cases of this particular example (all except in Figure 16b, i.e., for the NL code NOAH at PGA level), the nonlinearity is so important for both site response models, that the AM approach is not even capable to provide hazard curves results for the most common 475 years return period: C1 values are getting to close to −1.0, where Equation (6) diverges. This limitation is actually one of the main drawbacks of the AM approach, which cannot be used in the case of strong non-linearity. Also, as acknowledged by [11], the AM should be used with caution when the correction factor in Equation (6) takes on values that are greater than 10 (which may also result from a large variability on the amplification factor for different waveforms).  Table 3.
The explanation for the former (long period non-linear behavior) can be related to the low fundamental frequency of the site (~[0.6-0.7] Hz): everything beyond that frequency is primarily controlled by the site response, and the large velocity contrast at~200 m depth is likely to generate significant strains over the whole soil column. The latter observation, however, seems more likely to be related to the dependence of amplification factor on the response spectra on the actual frequency contents of the input waveform, leading to the magnitude dependence of the response spectral amplification, as reported in [58], than to a true NL behavior at very low strains, which cannot be identified in the corresponding Fourier domain amplifications [60].
The regression coefficients are listed in Table 3 together with their standard deviations. The hazard curves on soil can now be estimated according to Equation (6). It is worth discussing the limitations of this simplified analytical method (AM), as already acknowledged by [11]. The functional form provided in Equation (6) presents numerical limitations when the argument of the exponential is too large, i.e., when C 1 is close to −1.0, and/or when the variability of the amplification factor (σ) is large. The former situation does occur at Euroseistest for high acceleration levels (green lines in Figure 15a,b), which corresponds to the saturation of surface ground motion displayed in Figure 12.
The hazard curves obtained are presented in Figure 16a (SHAKE91) Figure 16b (NOAH). In most cases of this particular example (all except in Figure 16b, i.e., for the NL code NOAH at PGA level), the nonlinearity is so important for both site response models, that the AM approach is not even capable to provide hazard curves results for the most common 475 years return period: C 1 values are getting to close to −1.0, where Equation (6) diverges. This limitation is actually one of the main drawbacks of the AM approach, which cannot be used in the case of strong non-linearity. Also, as acknowledged by [11], the AM should be used with caution when the correction factor in Equation (6) takes on values that are greater than 10 (which may also result from a large variability on the amplification factor for different waveforms). Table 3. Intercept (C 0 ), slope (C 1 ), and standard deviation (σ) of the piecewise-linear regression models (PLM) for AF( f ) vs. S r a ( f ).

Robustness of the ( ) Prediction
Another issue that may be discussed that is related to the accuracy and performance of the model, is the required amount of accelerograms to robustly predict the median of the amplification function ( ). Bazzurro, P. [10] proposed that "the number of records, n, needed to keep the standard error of the median of ( ( )) at any spectral acceleration level within a specified fractional accuracy, ζ, is approximately (Strictly, Equation (9) is valid for the median ( ) at a lognormal spectral-acceleration level corresponding to the sample median of the spectral-acceleration values used to produce the piece-wise regression. The standard error of the median of ( ) grows at values that are away from the average. So, to maximize accuracy, the record intensities should be chosen in the appropriate ( ) neighborhood (see [10]).) given by" Equation (8): The standard deviation of the median of ( ) is in both cases (SHAKE91 and NOAH) and at all spectral periods significantly lower than 0.3 (Figure 15a,b and Table 3, most values between 0.01 and 0.2). The criterion defined by [10] in page 2099 is therefore fulfilled: "Given that the residual standard deviation of the quadratic model in ( ) is 0.3 or less across the entire frequency range, the median value of ( ) associated with a given ( ) on rock can be estimated within ±10% (i.e., ζ = 0.1) by using not more than 10 records".

Robustness of the AF( f ) Prediction
Another issue that may be discussed that is related to the accuracy and performance of the model, is the required amount of accelerograms to robustly predict the median of the amplification function ln(AF). Bazzurro, P. [10] proposed that "the number of records, n, needed to keep the standard error of the median of ln(AF( f )) at any spectral acceleration level within a specified fractional accuracy, ζ, is approximately (Strictly, Equation (9) is valid for the median ln AF( f ) at a lognormal spectral-acceleration level corresponding to the sample median of the spectral-acceleration values used to produce the piece-wise regression. The standard error of the median of ln(AF) grows at values that are away from the average. So, to maximize accuracy, the record intensities should be chosen in the appropriate ln S r a ( f ) neighborhood (see [10]).) given by" Equation (8): The standard deviation of the median of ln(AF) is in both cases (SHAKE91 and NOAH) and at all spectral periods significantly lower than 0.3 (Figure 15a,b and Table 3, most values between 0.01 and 0.2). The criterion defined by [10] in page 2099 is therefore fulfilled: "Given that the residual standard deviation of the quadratic model in ln S r a ( f ) is 0.3 or less across the entire frequency range, the median value of AF( f ) associated with a given S r a ( f ) on rock can be estimated within ±10% (i.e., ζ = 0.1) by using not more than 10 records".
To test this assertion, we randomly selected from 2 to 200 accelerograms (1000 times each) in our set of stochastic time histories and for different S r a ( f ) bins, S r a ( f ) bins = [0.001:0.001:0.01; 0.01:0.01:0.1; 0.1:0.1 :2]. Posteriorly, we calculated efficiency curves showing the percentage of random samples where the number of accelerograms, n, predicted a median ln AF( f ) within a 10% error margin (with respect to median ln AF( f ), estimated from the full set of stochastic time series; Figure 17 and Table 4). One can observe that the number of accelerograms that is required is larger using SHAKE91 than using NOAH. This is due to the larger variability of SHAKE91 data with respect to NOAH data. Hence, more accelerograms are needed for a robust estimation. From this particular example, we determine that an amount of 60 accelerograms per acceleration bin S r a ( f ) are required to robustly predict the median of the amplification function ln(AF). The different centers of the S r a ( f ) bins that were used are displayed in Figures 18 and 19.
To test this assertion, we randomly selected from 2 to 200 accelerograms (1000 times each) in our set of stochastic time histories and for different ( ) bins, ( )bins = [0.001:0.001:0.01; 0.01:0.01:0.1; 0.1:0.1 :2]. Posteriorly, we calculated efficiency curves showing the percentage of random samples where the number of accelerograms, , predicted a median ( ) within a 10% error margin (with respect to median ( ), estimated from the full set of stochastic time series; Figure 17 and Table 4). One can observe that the number of accelerograms that is required is larger using SHAKE91 than using NOAH. This is due to the larger variability of SHAKE91 data with respect to NOAH data. Hence, more accelerograms are needed for a robust estimation. From this particular example, we determine that an amount of 60 accelerograms per acceleration bin ( ) are required to robustly predict the median of the amplification function ( ). The different centers of the ( ) bins that were used are displayed in Figures 18 and 19.  We consider that the differences between [10,11] work and this study, might be associated to the following reasons: (1) The authors in [10,11] calculated the "true" reference median ( ) based on only 78 ground motion records along all of the acceleration levels, ( ), while in our case, we calculated the "true" reference median, ( ) , with more than ~21,800 ground motion time histories. Authors in [10,11] might have been calculating statistics on a sample too small. (2) Since both studies are site-specific studies, it is very likely that the minimum number of records to estimate the median ( ) with a given accuracy is expected to be different. However, we found that, in our case, the expression in Equation (8) 50 30 We consider that the differences between [10,11] work and this study, might be associated to the following reasons: (1) The authors in [10,11] calculated the "true" reference median ln(AF) based on only 78 ground motion records along all of the acceleration levels, S r a ( f ), while in our case, we calculated the "true" reference median, ln(AF), with more than~21,800 ground motion time histories. Authors in [10,11] might have been calculating statistics on a sample too small. (2) Since both studies are site-specific studies, it is very likely that the minimum number of records to estimate the median ln(AF) with a given accuracy is expected to be different. However, we found that, in our case, the expression in Equation (8)  With respect to the robustness of the standard deviation of AF( f , S r a ), nothing was said by the authors. We decided to explore the behavior of the standard deviation while considering 30, 50, and then whole set of accelerograms. Figures 18 and 19 display the amplification factor, AF( f , S r a ), (top) and its standard deviation (bottom), versus the input hazard on hard rock, S r a ( f ), using SHAKE91 and NOAH results, respectively. The median standard deviation as a function of the input hazard on hard rock, S r a ( f ), does not vary significantly when increasing the number of accelerograms, for both SHAKE91 and NOAH results. The apparent larger variation at high input acceleration levels is due to the lack of data on such acceleration levels. With respect to the robustness of the standard deviation of AF( , ), nothing was said by the authors. We decided to explore the behavior of the standard deviation while considering 30, 50, and then whole set of accelerograms. Figures 18 and 19 display the amplification factor, AF( , ), (top) and its standard deviation (bottom), versus the input hazard on hard rock, ( ), using SHAKE91 and NOAH results, respectively. The median standard deviation as a function of the input hazard on hard rock, ( ), does not vary significantly when increasing the number of accelerograms, for both SHAKE91 and NOAH results. The apparent larger variation at high input acceleration levels is due to the lack of data on such acceleration levels.   With respect to the robustness of the standard deviation of AF( , ), nothing was said by the authors. We decided to explore the behavior of the standard deviation while considering 30, 50, and then whole set of accelerograms. Figures 18 and 19 display the amplification factor, AF( , ), (top) and its standard deviation (bottom), versus the input hazard on hard rock, ( ), using SHAKE91 and NOAH results, respectively. The median standard deviation as a function of the input hazard on hard rock, ( ), does not vary significantly when increasing the number of accelerograms, for both SHAKE91 and NOAH results. The apparent larger variation at high input acceleration levels is due to the lack of data on such acceleration levels.

Discussion
The direct comparison of both approaches in Figure 20 shows that the AM and SM hazard curves present a good fit at all available return periods for the three studied spectral periods. Nevertheless, despite this good fit, the above-mentioned numerical limitation of the AM method prevents this study from performing such comparison even at the most common 475 years return period, which raises questions about the effectiveness of the AM approach for the incorporation of site effects in site-specific PSHA. It is important to emphasize that this limitation will only affect sites with strong nonlinear and/or high seismic demands such as the Euroseistest site. The vertical asymptotes of the hazard curves on soil in Figure 20 correspond, for each spectral period, to the slope (green) in Figure 12, and it therefore corresponds to a non-linear saturation effect of the ground motion at the surface under strong shaking, as it can be better appreciated in Figure 21.

Discussion
The direct comparison of both approaches in Figure 20 shows that the AM and SM hazard curves present a good fit at all available return periods for the three studied spectral periods. Nevertheless, despite this good fit, the above-mentioned numerical limitation of the AM method prevents this study from performing such comparison even at the most common 475 years return period, which raises questions about the effectiveness of the AM approach for the incorporation of site effects in site-specific PSHA. It is important to emphasize that this limitation will only affect sites with strong nonlinear and/or high seismic demands such as the Euroseistest site. The vertical asymptotes of the hazard curves on soil in Figure 20 correspond, for each spectral period, to the slope (green) in Figure  12, and it therefore corresponds to a non-linear saturation effect of the ground motion at the surface under strong shaking, as it can be better appreciated in Figure 21. These results also show that the code-to-code variability introduces much more uncertainty on the hazard estimates than the method-to-method (AM, SM) variability. The use of SHAKE91 leads to larger PGA values for return periods that are larger than 100 years, while the use of NOAH leads to larger spectral accelerations for all of the return periods at intermediate and long spectral periods (0.2 s and 1.0 s). It is interesting to notice that nonlinear behavior is observed, even at relatively large spectral periods (1.0 s) and low return periods. However, the deamplification behavior with respect to rock occurs only at short periods (PGA, 0.2 s), while at longer period (1.0 s), amplification can be observed for all explored return periods. This is simply because the (1.0 s) period is close to the fundamental period of the site (~1.5 s), where linear amplification is strong (close to a factor of 10). Hence, the nonlinearity due to the soil shear modulus degradation and larger damping (induced under strong strain levels) is not strong enough to generate deamplification with respect to the ground motion on rock.
Concerning the ground motion variability analysis, the standard deviation of the ground motion on soil, σS ( ), remains in both cases (SHAKE91 and NOAH) of the same order of magnitude at low ground motion levels (Figure 13a), and it is larger for SHAKE91 than for NOAH at strong ground motion levels (Figure 13c). This can be partially explained by the larger variability of amplification ( ) using SHAKE than NOAH ( Figure 15, Table 3, σ( )). Additionally, our numerical results are in qualitative agreement with observations from real recordings on soil and rock [57,58], where the standard deviation on soil, σS ( ), is smaller than the standard deviation on rock, σS ( ), for all ground motion levels (see Figure 13). Currently, this phenomenon is presumed to be due to nonlinearity of the soil response during severe shaking [11], and also the effect of a lower damping at low strain levels on rock that makes the ground motion more sensitive to the input motion. These results also show that the code-to-code variability introduces much more uncertainty on the hazard estimates than the method-to-method (AM, SM) variability. The use of SHAKE91 leads to larger PGA values for return periods that are larger than 100 years, while the use of NOAH leads to larger spectral accelerations for all of the return periods at intermediate and long spectral periods (0.2 s and 1.0 s). It is interesting to notice that nonlinear behavior is observed, even at relatively large spectral periods (1.0 s) and low return periods. However, the deamplification behavior with respect to rock occurs only at short periods (PGA, 0.2 s), while at longer period (1.0 s), amplification can be observed for all explored return periods. This is simply because the (1.0 s) period is close to the fundamental period of the site (~1.5 s), where linear amplification is strong (close to a factor of 10). Hence, the nonlinearity due to the soil shear modulus degradation and larger damping (induced under strong strain levels) is not strong enough to generate deamplification with respect to the ground motion on rock.
Concerning the ground motion variability analysis, the standard deviation of the ground motion on soil, σS s a ( f ), remains in both cases (SHAKE91 and NOAH) of the same order of magnitude at low ground motion levels (Figure 13a), and it is larger for SHAKE91 than for NOAH at strong ground motion levels (Figure 13c). This can be partially explained by the larger variability of amplification AF( f ) using SHAKE than NOAH ( Figure 15, Table 3, σ(AF)). Additionally, our numerical results are in qualitative agreement with observations from real recordings on soil and rock [57,58], where the standard deviation on soil, σS s a ( f ), is smaller than the standard deviation on rock, σS r a ( f ), for all ground motion levels (see Figure 13). Currently, this phenomenon is presumed to be due to nonlinearity of the soil response during severe shaking [11], and also the effect of a lower damping at low strain levels on rock that makes the ground motion more sensitive to the input motion. Furthermore, we compared the predictions with current available data at the site. Figure 21 plots the spectral acceleration on soil, as a function of the corresponding input spectral acceleration on rock, for both nonlinear codes tested, together with recordings from the Euroseistest database [17]. The same amount of observed earthquakes recordings (28 records) are used at the three different spectral periods, but accelerations on rock below 0.001g are not displayed.
From visual inspection, the few observations that are available at low/intermediate ground motion levels are roughly within the range predicted by the two models. We can thus expect the actual hazard curve of the site to be comprised between the two proposed hazard models. SHAKE91 might better predict site surface motion for low input ground motion levels [S ( ) < 0.01 g], while NOAH predictions might be better for intermediate input ground motion levels [0.01 g ≤ S ( ) ≤ 0.1 g]. However, many more recordings would be necessary to derive sound conclusions. For higher input acceleration levels [S ( ) > 0.1 g], nothing can be said, as they are no recordings. As some of the accelerations recorded are located outside the scatter range, it seems that the simulations somewhat underestimate the actual variability.

Conclusions
A comparison between two fully probabilistic methods to integrate site effects into PSHA has been presented along this work. We found that the analytical approximation of the full convolution method, AM, is computationally less expensive, but it breaks down in the case of strong non-linearity due to a numerical limitation (as for the Euroseistest site). On the other hand, the full probabilistic stochastic method, SM, requires much heavier computations, but it provides results over the full acceleration range, even when strong non-linearity occurs. Both of the methods provide comparable annual exceedance rates over their common validity range.
To build the hazard curves on soil, two different propagation codes were used to calculate the time histories on soil at the surface, EL-SHAKE91 [42,43,44] and NL-NOAH [45]. In the Euroseistest case study, the code-to-code variability is found to control the uncertainty on the hazard estimates, rather than the method-to-method variability (AM or SM). We found that the soil variability that was obtained as a result after performing the propagation is smaller than the input rock variability. This result is in qualitative agreement with observations from real recordings on soil and rock [57,58]. Currently, this phenomenon is presumed to be due to nonlinearity of the soil response during severe shaking [11]. We also found that the expression describing the minimum required amount of records proposed by [11] falls rather short in our case, being insufficient and under conservative, especially for capturing the whole variability of the site response. For this particular example, 8 to 60 records per acceleration bin on rock were needed to predict the amplification within a 10% error margin (with respect to the true amplification based on the full stochastic sample), for both propagation codes and at the three studies spectral periods (PGA, 0.2 s and 1.0 s). Furthermore, we compared the predictions with current available data at the site. Figure 21 plots the spectral acceleration on soil, as a function of the corresponding input spectral acceleration on rock, for both nonlinear codes tested, together with recordings from the Euroseistest database [17]. The same amount of observed earthquakes recordings (28 records) are used at the three different spectral periods, but accelerations on rock below 0.001g are not displayed.
From visual inspection, the few observations that are available at low/intermediate ground motion levels are roughly within the range predicted by the two models. We can thus expect the actual hazard curve of the site to be comprised between the two proposed hazard models. SHAKE91 might better predict site surface motion for low input ground motion levels [S r a ( f ) < 0.01 g], while NOAH predictions might be better for intermediate input ground motion levels [0.01 g ≤ S r a ( f ) ≤ 0.1 g]. However, many more recordings would be necessary to derive sound conclusions. For higher input acceleration levels [S r a ( f ) > 0.1 g], nothing can be said, as they are no recordings. As some of the accelerations recorded are located outside the scatter range, it seems that the simulations somewhat underestimate the actual variability.

Conclusions
A comparison between two fully probabilistic methods to integrate site effects into PSHA has been presented along this work. We found that the analytical approximation of the full convolution method, AM, is computationally less expensive, but it breaks down in the case of strong non-linearity due to a numerical limitation (as for the Euroseistest site). On the other hand, the full probabilistic stochastic method, SM, requires much heavier computations, but it provides results over the full acceleration range, even when strong non-linearity occurs. Both of the methods provide comparable annual exceedance rates over their common validity range.
To build the hazard curves on soil, two different propagation codes were used to calculate the time histories on soil at the surface, EL-SHAKE91 [42][43][44] and NL-NOAH [45]. In the Euroseistest case study, the code-to-code variability is found to control the uncertainty on the hazard estimates, rather than the method-to-method variability (AM or SM). We found that the soil variability that was obtained as a result after performing the propagation is smaller than the input rock variability. This result is in qualitative agreement with observations from real recordings on soil and rock [57,58]. Currently, this phenomenon is presumed to be due to nonlinearity of the soil response during severe shaking [11]. We also found that the expression describing the minimum required amount of records proposed by [11] falls rather short in our case, being insufficient and under conservative, especially for capturing the whole variability of the site response. For this particular example, 8 to 60 records per acceleration bin on rock were needed to predict the amplification within a 10% error margin (with respect to the true amplification based on the full stochastic sample), for both propagation codes and at the three studies spectral periods (PGA, 0.2 s and 1.0 s).
By comparing the two models with real in-situ data, we observed that the few available recordings at the Euroseistest are within the acceleration range that was predicted by both site response models (EL and NL). However, more data might be needed to validate this observation.
We are aware that the selection of the synthetic time histories and the scaling procedure that we followed here might raise several questions that are related to the 'true ground motion' at the Euroseistest. However, the selection of the input time histories on rock is simply an intermediate step that, in our opinion, should not impact significantly the conclusions reached here. The accelerogram selection and generation process would impact the site nonlinear response in case of a deterministic study with one or only a few times series. We believe that the large number of computations performed here allows for considering that the results are statistically significant and quite robust with respect to the actual phase contents of the input rock time series. It must nevertheless be emphasized that the selection of accelerograms compatible to a given GMPE is still an open issue, which needs to be better addressed in the future. Recent works, such as those proposed by [55,60], allow for a more physical tuning of the source and crustal propagation parameters, while not warranting however to get the actual regional site and source parameters, neither avoiding the potential trade-off between some of them [38].
We are also aware that the nonlinear parameters that are assumed for the soil column (in particular the zero-cohesion) results in over-emphasizing the impact of the non-linearity of site response on the hazard curve, and thus limiting the validity domain of the AM approach. The present results, especially in terms of AM/SM comparison, are thus to be considered as an 'extreme' example for highly non-linear, thick soft site. It is likely that the validity domain of the AM approach would be much larger for shallow, high-plasticity clayey deposits.
Finally, we encourage the use of ground-motion simulations calibrated with real data to integrate site effects into PSHA, since models with different levels of complexity can be included (e.g., point source, extended source, 1D, 2D, and 3D site response analysis, kappa, hard rock . . . ), and the corresponding variability of the site response can be quantified. In most practical cases, this is presently not possible with real data because of their scarcity at high acceleration levels. Funding: This work has been supported by a grant from Labex OSUG@2020 (Investissements d'avenir-ANR10 LABX56-http://www.osug.fr/labex-osug-2020/) and by the European STREST project (Harmonized approach to stress tests for critical infrastructures against natural hazards-http://www.strest-eu.org).