Efficiency of Intensity Measures Considering Near- and Far-Fault Ground Motion Records

Abstract: This paper focuses on the identification of high-efficiency intensity measures to predict the seismic response of buildings affected by near- and far-fault ground motion records. Near-fault ground motion has received special attention, as it tends to increase the expected damage to civil structures compared to that from ruptures originating further afield. In order to verify this tendency, the nonlinear dynamic response of 3D multi-degree-of-freedom models is estimated by using a subset of records whose distance to the epicenter is lower than 10 km. In addition, to quantify how much the expected demand may increase because of the proximity to the fault, another subset of records, whose distance to the epicenter is in the range between 10 and 30 km, has been analyzed. Then, spectral and energy-based intensity measures as well as those obtained from specific computations of the ground motion record are calculated and correlated to several engineering demand parameters. From these analyses, fragility curves are derived and compared for both subsets of records. It has been observed that the subset of records nearer to the fault tends to produce fragility functions with higher probabilities of exceedance than the ones derived for far-fault records. Results also show that the efficiency of the intensity measures is similar for both subsets of records, but it varies depending on the engineering demand parameter to be predicted.


Introduction
Every so often, parts of the Earth are shaken by sudden energy releases associated with relative movement at plate boundaries. If the release occurs near human settlements, civil structures can be heavily affected if they are not designed to withstand dynamic horizontal loads. Due to the growth of the world population, there will soon hardly be a place where a moderate-to-high earthquake can occur without affecting us. In addition, increasing globalization has redistributed risks across society at large: a couple of decades ago, the negative consequences of catastrophes barely affected areas other than the stricken one. This is why seismic risk should be a concern for society as a whole and not only for areas with high seismic hazard.
Seismic risk mainly stems from damage to civil structures. Its most extreme consequence is structural collapse, which leads to loss of life; in addition, economic losses can be large enough to affect the sustainable development of entire countries. Such socio-economic setbacks trigger poverty, inequality, and casualties, amongst many other negative consequences.
According to Coburn and Spence [1], earthquakes up to magnitudes of about 5.5 can occur almost anywhere in the world. These earthquakes can cause damage if they are shallow and trigger significant intensities in areas containing vulnerable structures [2]. However, most of the current seismic hazard maps worldwide have been estimated based on statistical projections, which mostly consider the seismicity recorded over only the last few decades. Hence, several urban environments have been planned without any concern regarding seismic hazard. This has meant that some earthquakes have excessively affected regions rated as relatively low-risk by seismic hazard estimations [3,4].
Geosciences 2021, 11, 234
Trifunac [5] reviewed the history of recording earthquake motions up to the early 1990s. According to this paper, the strong motion instrumentation program in the United States started in 1931 (see also [6]). The main purpose was to learn more about the response of structures to seismic actions. An example of the usefulness of such data is the paper by Hudson and Housner (1958) [7], who performed a detailed spectrum analysis of the accelerograms corresponding to the 22 March 1957 San Francisco earthquake. In spite of the low Gutenberg-Richter magnitude (5.3), the shaking produced structural damage in the downtown area because of the proximity of the epicenter. This indicated that the proximity to the rupture plays a paramount role when quantifying seismic risk.
With time, the analysis of ground motion records has also revealed that near-fault events contain severe long-duration acceleration pulses, which result in unusually large ground velocity increments [8]. The presence of these pulses can be considered an important factor in causing damage, due to the transmission of a large amount of energy to the structure in a very short time [9]. Thus, the structural response to this type of record has received special attention in recent years; the very large deformation demand on buildings has been of special concern [10][11][12].
This article focuses on two main aspects. The first is to find intensity measures (IMs) that exhibit high efficiency in predicting engineering demand parameters (EDPs) when near- and far-fault records are considered. The second is to quantify the consequences of the increased demand when near-fault records are considered in deriving fragility functions. To do so, the probabilistic relationship between seismic hazard, in terms of IMs, and the structural response, in terms of EDPs, is studied. This relationship has been an important topic of study in earthquake engineering, since several design and assessment methodologies for civil infrastructure are mainly based on the statistical analysis of IM-EDP pairs [13][14][15]. For instance, advanced methodologies to quantify the probability of occurrence of a certain seismic damage state are based on the analysis of these pairs. Therefore, finding which IM is best able to predict a specific EDP is a relevant objective. Identifying this IM will contribute to reducing uncertainties when quantifying the seismic risk of the building stock in human settlements. This reduction is key to mitigating seismic risk worldwide, since stakeholders as well as policy- and decision-makers could prioritize actions to eradicate this risk based on more reliable estimations.
Regarding the physical quantities to be represented by IMs, displacement, velocity, acceleration, and energy are expected to be highly correlated with the ground motion-induced deformation field of the structure.
Three types of IMs are analyzed herein. The first one is related to the dynamic response of a set of single-degree-of-freedom (SDoF) systems to a certain ground motion record. These IMs are estimated from the spectral response in acceleration, velocity, or displacement.
The second type are IMs obtained from the elastic input energy equivalent velocity spectrum, which is related to the energy input to the structure. This spectrum has recently received research attention since several studies have demonstrated its high effectiveness in the prediction of the seismic structural response [16][17][18][19].
The third type are IMs that can be calculated directly from the ground motion record. This category comprises IMs obtained from computations on the earthquake record itself, such as peak parameters. The appeal of these IMs is that, unlike the preceding two categories, they do not depend on the dynamic properties of the structures.
When analyzing the seismic response of civil infrastructure, the most desirable feature for EDPs is that they quantify the expected damage as accurately as possible. There are many statistical approaches to extract information from IM-EDP pairs on the expected damage to buildings [20][21][22][23][24]. Roughly, the main differences between them lie in the sampling strategy considered to obtain these pairs, the selection procedure of the earthquake records, and the IM and EDP variables selected. The main results of these studies suggest that the most suitable IMs to analyze the probability distribution of an EDP depend on the structural type and the seismogenic environment under consideration.
IMs and EDPs are random variables with a high dispersion. In the case of IMs, this variability is due to the fact that an earthquake generates several types of waves that travel through a very heterogeneous medium at different velocities. The resulting chaotic waves simultaneously influence the position of points located at ground level. It is worth noting that local effects, as well as interaction with civil structures, can also increase the variability observed in IMs.
Regarding EDPs, the randomness may come from the dynamic properties of the structure, which are mainly determined by the type of resistant system against horizontal loads, mechanical properties of the materials, geometric characteristics, acting loads, amongst other aspects [25].
IM-EDP pairs exhibiting high levels of correlation are suitable candidates to analyze the expected damage of buildings subject to ground motions. This is because fragility functions obtained through these highly correlated pairs will exhibit less variability when determining the probability of exceeding a certain damage level assumed as a limit state. IMs can also be used to predict expected values of EDPs [26]. The higher the correlation between an IM and an EDP, the more reliable the estimation of the EDP through an IM. In other words, the most efficient IM to predict an EDP is the one maximizing the correlation between them.
The efficiency of IMs can be measured by considering several sources of uncertainty. For instance, if a specific structure is analyzed, uncertainties related to the mechanical properties of the materials and to the seismic action are the ones that carry the most variability into the response [22,24,27]. For the case of urban environments, information related to the geometrical distribution of the buildings at the site is also required [28]. When this geometrical distribution is considered, new properties between variables characterizing hazard and exposure can be analyzed. Within the framework of the KaIROS project (https://cordis.europa.eu/project/id/799553, accessed on 27 May 2021), several numerical tools to deal with such uncertainties have been developed to date. These tools are used herein to analyze the bivariate correlation between a group of IMs and EDPs. To do so, a set of probabilistic numerical 3D models representing the behavior of reinforced concrete (RC) frame buildings is used as a test bed [26].
In order to characterize the seismic action, two subsets of ground motion records are selected. The main difference between them is the distance to the epicenter. These records are scaled with respect to a specific IM so that both subsets have similar intensity values. This is achieved by defining the same intervals to bound the IM considered in both subsets. This scaling procedure is intended to produce a building response within and beyond the elastic limit while highlighting the influence of each type of record on the structural performance. Note that every interval contains the same number of records.
Once building models and seismic action have both been characterized, several EDPs are obtained through nonlinear dynamic analysis (NLDA). From each subset of ground motion records, a group of IMs is obtained and correlated with the EDPs. Two main aspects will be analyzed: (i) the efficiency of each IM to predict EDPs and (ii) how much the structural response increases, in terms of some EDP, due to the proximity of the records to the source. The consequences of this increased demand are analyzed through the derivation of fragility functions.

Probabilistic Characterization of the Structural Models
In the design of new structures or in the assessment of existing ones, two key issues are the characterization of the seismic hazard and the quantification of the structural response. Both are highly random and should be analyzed from a probabilistic perspective.
Developing proper building models mainly depends on the available information regarding the mechanical and geometrical properties of their structural elements. If buildings of a specific urban environment require analysis, it is also important to know the distribution of as many physical properties as possible (number of structures belonging to each structural type, distribution of the number of stories and spans, acting loads, variability of the mechanical and geometrical properties of the elements, the azimuthal position, amongst many other aspects). In the following, consideration of several of these random variables is explained.

Epistemic Uncertainties in Structural Input Variables
Several random variables should be considered when modelling the seismic behavior of a group of buildings. Herein, gravity loads and the mechanical properties of the materials are treated as random variables: live loads, LL; permanent loads, DL; the concrete compressive strength, fc; the yield strength of the steel, fy; the elastic modulus of the concrete, Ec; and the elastic modulus of the steel, Es. A continuous Gaussian distribution is assumed for all of them. The coefficients of variation (COV) for DL and LL have been set to 0.12 and 0.18, respectively [29]. The COV of the concrete compressive strength, fc, may vary from building to building in the range 0.07-0.2, with an average value of about 0.15. However, this value depends on the quality control carried out during construction. For instance, a highly controlled concrete tends to exhibit a COV value close to 0.1 [29]. In this study, the COV for the concrete properties has been assumed to be 0.1. Note that the uncertainty level related to the steel is lower than the one considered for the concrete, because the quality control of this material is generally superior to that of concrete. Thus, the COV of the steel properties has been set to 0.07.
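As an illustration of the sampling described above, the following minimal sketch draws independent Gaussian samples using the COV values quoted in the text. The mean values in `ASSUMED_MEANS` are hypothetical placeholders (the paper takes its values from Table 1, which is not reproduced here), and the function name, the units, and the assumption of independence between variables are ours, not the authors'.

```python
import numpy as np

# COV values quoted in the text.
COV = {"DL": 0.12, "LL": 0.18, "fc": 0.10, "Ec": 0.10, "fy": 0.07, "Es": 0.07}

# HYPOTHETICAL mean values, for illustration only (loads in kN/m2, materials in MPa).
ASSUMED_MEANS = {"DL": 5.0, "LL": 2.0, "fc": 25.0, "Ec": 30000.0,
                 "fy": 420.0, "Es": 200000.0}

def sample_inputs(n_models, means=ASSUMED_MEANS, cov=COV, seed=0):
    """Draw one Gaussian sample per building model for every input variable.

    Variables are sampled independently, which is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    return {name: rng.normal(mu, cov[name] * mu, size=n_models)
            for name, mu in means.items()}

samples = sample_inputs(444)
print({name: round(float(values.mean()), 2) for name, values in samples.items()})
```

Because a Gaussian distribution is unbounded, a practical implementation would probably also guard against non-physical (negative) samples, although with the COV values above such samples are extremely unlikely.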

Geometrical Properties
Vargas-Alzate et al. [26] developed an algorithm that can be used to generate probabilistic MDoF systems representing the behavior of 2D RC frame buildings. This algorithm can be easily adapted to analyze the seismic response of other structural types located in other seismic environments [30]. In this research, this computational tool has been adapted to simulate 3D RC frame buildings having similar characteristics to those projected to meet the requirements of a moderate-to-high seismic area.
The algorithm allows several geometrical properties to be considered as random variables. Specifically, the number of stories, $N_{st}$, the number of spans, $N_{sp}$, the story height, $H_{st}$, and the span length, $S_l$, can be considered as random variables. For the hypothetical case of study, $N_{st}$ and $N_{sp}$ follow a uniform discrete distribution in the intervals (3, 13) and (3, 6), respectively; $H_{st}$ and $S_l$ are distributed uniformly in the intervals (2.8, 3.2) m and (4, 6) m, respectively. The aforementioned properties are assumed to be the same for both main directions.
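The geometric sampling just described can be sketched as follows. This is only an illustration: the function name is hypothetical, and the discrete intervals are assumed to include both end points.

```python
import numpy as np

def sample_geometry(n_models, seed=0):
    """Sample the global geometry of each building model.

    The same values are used for both main directions, as stated in the text.
    """
    rng = np.random.default_rng(seed)
    n_st = rng.integers(3, 14, size=n_models)      # stories, 3..13 (ends assumed inclusive)
    n_sp = rng.integers(3, 7, size=n_models)       # spans, 3..6 (ends assumed inclusive)
    h_st = rng.uniform(2.8, 3.2, size=n_models)    # story height (m)
    s_l = rng.uniform(4.0, 6.0, size=n_models)     # span length (m)
    return {"N_st": n_st, "N_sp": n_sp, "H_st": h_st, "S_l": s_l,
            "H": n_st * h_st}                      # total building height (m)

geometry = sample_geometry(5)
print(geometry["N_st"], geometry["H"].round(2))
```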
In order to assign the cross-sectional properties of the first-story columns of each model, the following equation has been used:
where $c_i$ are coefficients that may be adjusted depending on the data distribution of the analyzed area. For this study, $c_1 = 0.4$, $c_2 = 0.05$, $c_3 = 0.01$ and $c_4 = -0.35$. $\Phi_{1,0}$ is the standard normal distribution. Note that the columns are not necessarily square; that is, one random sample is generated for the width, $W_c$, and one for the depth, $D_c$, of the columns (Equation (1)). For upper stories, the column size decreases systematically by 5 cm every 3 stories. Values generated according to Equation (1) are rounded to the nearest multiple of 5 cm to be consistent with real dimensions of RC elements.
The width of the beams, $W_b$, depends on the number of stories of the building model and on the span length, and is calculated using the following equation:
where $b_n$ are again coefficients that depend on the characteristics of the urban area. For the hypothetical case of study, $b_1 = 0.01$, $b_2 = 0.02$ and $b_3 = 0.19$. Analogously, the depth of the beams, $D_b$, is obtained using the following equation:
where $g_1 = 0.01$, $g_2 = 0.05$ and $g_3 = 0.17$. Notice that no random term was considered for the beams.
According to the approach described above, building models have been generated to analyze the relationship between IMs and EDPs from a probabilistic perspective. For the case of near-fault records, 444 building models have been generated, while for far-fault records 492 have been generated. These numbers are in accordance with the availability of records within the analyzed database, as will be shown below. Figure 1 shows 100 simulated building models according to the description presented above.
As commented above, the coefficients considered in Equations (1) to (3) as well as the mechanical property values (Table 1) have been selected to approximate the expected behavior of RC structures located in earthquake-prone areas. In order to verify this, the fundamental period of the generated models is compared with the approximation given by the European seismic design regulations [31]:
$$T_f = 0.075\,H^{0.75} \qquad (4)$$
where $T_f$ represents the fundamental period of the structure and $H$ is the height of the building. Figure 2a shows the comparison between the fundamental period of the generated models for the near-fault case and the expression provided by this regulation. It can be seen that the values agree with those expected for the typology analyzed. In order to quantify the dissimilarity between the horizontal stiffness of the 3D models, the following dissimilarity ratio (see Figure 2b) is calculated:

Seismic Hazard
One of the main sources of uncertainty in estimations of the seismic response of structures is the random variability of the ground motions. There are several methodologies to properly select ground motion records from a database that are consistent with the site-dependent spectral shape [32]. For the purpose of this study, the most important requirement for selecting both subsets of as-recorded accelerograms (near- and far-fault records) is the distance to the epicenter. These records have been selected from the Engineering Strong Motion Database compiled by the 'Istituto Nazionale di Geofisica e Vulcanologia', INGV [33]. The first subset is composed of 444 records whose distance to the epicenter is lower than 10 km, while the second subset contains 492 records whose distance to the epicenter is within the interval 10-30 km.
The selected records are scaled so that the structural models reach different performance levels. It is also a requirement that both subsets be scaled to have similar IM values. However, the limited availability of strong ground motion records covering high-intensity intervals at specific periods is a common restriction in current databases. This lack of strong ground motion records can lead to excessive scaling in order to fit within high-intensity intervals. It is worth mentioning that, for the purpose of this study, scaling the same record to different intensity levels should be avoided, since this would introduce spurious correlation between IMs and EDPs. In order to address these issues, the following procedure has been used for selecting and scaling (where necessary) each subset of records:

1. Classify the ground motion records according to the distance to the epicenter.
2. Identify the IM for selecting ground motion records.
3. Define the IM intervals.
4. Calculate the selected IM for the entire set of records that belong to the database.
5. Sort the ground motion records in descending order as a function of the IM value calculated in step 4.
6. The ground motion record with the highest IM is scaled so that its new IM value belongs to the highest interval. If the IM naturally fulfils the interval condition, no scale factor is considered. This step is repeated with the subsequent IMs, according to the sorted list, until the desired number of records belonging to the highest interval is obtained.
7. Step 6 is repeated for all intervals. The scale factor in step 6 is calculated bearing in mind that the IM values are uniformly distributed within each interval.
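Steps 4 to 7 of the procedure above can be sketched as follows for a single subset of records. This is an illustrative reading of the procedure, not the authors' implementation: the function name and arguments are hypothetical, the target IM of a scaled record is drawn at random from a uniform distribution inside its interval, and the IM is assumed to scale linearly with the amplitude of the record (which holds for spectral ordinates of linear oscillators).

```python
import numpy as np

def select_and_scale(im_values, edges, n_per_interval, seed=0):
    """Assign records to IM intervals (highest first) and compute scale factors.

    im_values      : unscaled IM of every record in one subset (step 4), all > 0
    edges          : increasing interval limits, e.g. [0.0, 0.1, 0.2] (step 3)
    n_per_interval : number of records wanted in each interval
    Returns (scale, target), aligned with im_values; unused records hold NaN.
    """
    im_values = np.asarray(im_values, dtype=float)
    n_intervals = len(edges) - 1
    if n_intervals * n_per_interval > im_values.size:
        raise ValueError("not enough records to fill every interval")
    rng = np.random.default_rng(seed)
    order = np.argsort(im_values)[::-1]            # step 5: descending IM
    scale = np.full(im_values.size, np.nan)
    target = np.full(im_values.size, np.nan)
    cursor = 0
    for k in range(n_intervals - 1, -1, -1):       # steps 6-7: highest interval first
        lo, hi = edges[k], edges[k + 1]
        for _ in range(n_per_interval):
            idx = order[cursor]                    # each record is used only once
            cursor += 1
            if lo <= im_values[idx] <= hi:
                target[idx] = im_values[idx]       # already inside: factor of 1
            else:
                target[idx] = rng.uniform(lo, hi)  # uniform spread inside the interval
            scale[idx] = target[idx] / im_values[idx]
    return scale, target

scale, target = select_and_scale([0.15, 0.30, 0.05, 0.01], [0.0, 0.1, 0.2], 2)
print(scale, target)
```

Consuming the sorted list with a single cursor guarantees that no record is scaled to more than one intensity level, which is the property the procedure is designed to preserve.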
It is important to select an IM (Step 2) that is highly correlated with the structural response; that is, an IM (or a combination of them) capable of predicting specific EDPs. Hence, it is more appropriate to look into IMs calculated from the spectral response of single-degree-of-freedom (SDoF) systems, as they have proven to be highly correlated with the structural response in terms of EDPs [34]. In this respect, several researchers have shown that IMs based on average spectral values around the fundamental period of the structure are more efficient than the spectral value at that period alone [35][36][37][38][39]. However, because of the probabilistic approach of this study, there is not a single structural model but a group of them which, in addition, have highly variable fundamental periods (see Figure 2a). Therefore, the period range for averaging spectral ordinates is established from the dynamic properties of the entire population of buildings [26]. In this way, earthquake records within each preselected subset are scaled so that their mean spectral accelerations, in the interval (0.2-1.6) s, meet the scaling criteria presented above. The intensity levels defining the upper and lower limits of each band range from 0.035 to 0.7 g at intervals of 0.035 g. Figure 3a shows the geometric mean spectra of the horizontal components of the 444 scaled near-fault records, while Figure 3b shows the same spectra for the 492 scaled far-fault records.
Figure 3c shows a comparison between the mean spectra of the two subsets. It can be seen that, for periods lower than approximately 0.45 s, the mean spectrum of the near-fault records is higher than that of the far-fault records, and that the opposite happens for periods higher than 0.45 s (see Figure 3d).

Intensity Measures
IMs range from those based on the subjective opinion of people who felt the shaking produced by the ground motion to those calculated from signals recorded by sophisticated devices capable of measuring acceleration amplitudes that humans cannot detect. Prior to the development of seismological networks worldwide, information on the characteristics of earthquake ground motion was scarce. Nowadays, these complex and interactive networks provide valuable datasets that improve the interpretation of the effect of seismic waves in urban environments. These networks continuously record three orthogonal acceleration components at ground level at many sites across the planet. Such records implicitly contain information about the interaction of highly random physical processes. With a basic mathematical background and proper numerical tools, variables representing the seismicity of an area can be extracted from these records. In the context of earthquake engineering, such variables are known as instrumental IMs. Ideally, an IM should contain enough information about the earthquake so that the structural response can be predicted from it with confidence [40]. Notice that an IM can depend either on the earthquake properties alone or on both the earthquake and structural features [38].
From the statistical point of view, one of the most desirable features that an IM should exhibit is efficiency [41][42][43]. An IM is considered efficient if it reduces the dispersion in the estimated parameter representing the structural response. An IM exhibiting efficiency could potentially reduce the number of structural calculations in estimations of seismic risk at the urban level.
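One common way to quantify efficiency, shown here only as a sketch, is to regress the EDP on the IM and inspect the residual dispersion and the coefficient of determination. The choice of a linear fit in log-log space and the function name are assumptions of this example rather than a description of the procedure followed in this study.

```python
import numpy as np

def efficiency(im, edp):
    """Fit ln(EDP) = a + b ln(IM); return slope b, R^2 and residual dispersion."""
    x = np.log(np.asarray(im, dtype=float))
    y = np.log(np.asarray(edp, dtype=float))
    b, a = np.polyfit(x, y, 1)            # polyfit returns [slope, intercept]
    resid = y - (a + b * x)
    sigma = resid.std(ddof=2)             # two fitted parameters
    r2 = 1.0 - resid.var() / y.var()
    return b, r2, sigma

# Synthetic IM-EDP cloud, for illustration only.
rng = np.random.default_rng(0)
im = rng.uniform(0.05, 1.0, 200)
edp = 0.01 * im**1.2 * np.exp(0.1 * rng.standard_normal(200))
print(efficiency(im, edp))
```

Under this reading, the more efficient of two candidate IMs is the one that yields the smaller residual dispersion (equivalently, the higher R^2) for the same EDP.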

Spectral-Based Intensity Measures
The dynamic response of structures subject to earthquakes has been correlated to the peak response of an equivalent SDoF system. The study of this simplified model gives rise not only to the response spectra but also to spectral IMs. It has been recognized that efficient IMs can be defined from response spectral ordinates [26]. This is why response spectral ordinates are extensively used to quantify the seismic hazard at a site. Such ordinates are obtained from the dynamic equilibrium equation for SDoF systems:
$$m\,\ddot{u}_n(t) + c\,\dot{u}_n(t) + k\,u_n(t) = -m\,\ddot{u}_{g,n}(t) \qquad (6)$$
where $\ddot{u}_n(t)$, $\dot{u}_n(t)$, and $u_n(t)$ are the acceleration, velocity, and displacement time-history responses of the SDoF system in the $n$ direction, respectively; $\ddot{u}_{g,n}(t)$ is the ground motion acceleration; and $m$, $c$, and $k$ represent the mass, damping, and stiffness of the system, respectively. IMs 1 to 6 described in Table 2 belong to this category.
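As a minimal sketch of how spectral ordinates can be obtained from the equilibrium equation of a linear SDoF system, the function below integrates a unit-mass oscillator with the Newmark average-acceleration method. The integration scheme, the 5% damping ratio, the function name, and the use of the total (relative plus ground) acceleration for the spectral acceleration are assumptions of this example; the source does not state which choices were made in the study.

```python
import numpy as np

def sdof_peak_response(ag, dt, period, damping=0.05):
    """Peak responses (Sa, Sv, Sd) of a unit-mass linear SDoF system.

    ag : ground acceleration history sampled every dt seconds.
    Sv and Sd are peaks of the relative velocity and displacement; Sa is the
    peak of the total acceleration (an assumption of this sketch).
    """
    ag = np.asarray(ag, dtype=float)
    omega = 2.0 * np.pi / period
    k, c = omega**2, 2.0 * damping * omega       # stiffness and damping per unit mass
    u, v = 0.0, 0.0                              # at-rest initial conditions
    a = -ag[0] - c * v - k * u                   # initial relative acceleration
    k_hat = k + 2.0 * c / dt + 4.0 / dt**2       # effective stiffness (gamma=1/2, beta=1/4)
    sa, sv, sd = abs(a + ag[0]), 0.0, 0.0
    for i in range(ag.size - 1):
        dp = -(ag[i + 1] - ag[i]) + (4.0 / dt + 2.0 * c) * v + 2.0 * a
        du = dp / k_hat
        dv = 2.0 * du / dt - 2.0 * v
        da = 4.0 * du / dt**2 - 4.0 * v / dt - 2.0 * a
        u, v, a = u + du, v + dv, a + da
        sd = max(sd, abs(u))
        sv = max(sv, abs(v))
        sa = max(sa, abs(a + ag[i + 1]))
    return sa, sv, sd

dt = 0.005
t = np.arange(0.0, 10.0, dt)
ag = np.sin(2.0 * np.pi * t)                     # synthetic 1 Hz input, 1 m/s2 amplitude
print(sdof_peak_response(ag, dt, period=0.5))
```

Calling the function for a vector of periods yields a response spectrum, from which ordinates at, or averaged around, a given period can be read.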

Energy-Based Intensity Measures
For both the design and the performance assessment of civil structures, the peak response of physical quantities such as displacement, velocity, or acceleration has been widely used. However, several researchers have found that the expected damage of structures can be strongly tied to the amount of energy introduced to the system [16][17][18][19]. In this respect, the equivalent velocity spectrum represents the amount of energy introduced to a set of SDoF systems. This energy can be calculated by rewriting Equation (6) in terms of energy. That is, each term of this equation is multiplied by the differential increment of displacement ($\dot{u}_n\,dt$) and then integrated over the time interval (0, t), as follows [44,45]:
$$\int_0^t m\,\ddot{u}_n\,\dot{u}_n\,dt + \int_0^t c\,\dot{u}_n^2\,dt + \int_0^t k\,u_n\,\dot{u}_n\,dt = -\int_0^t m\,\ddot{u}_{g,n}\,\dot{u}_n\,dt \qquad (7)$$
where the first term, $E_{k,n} = \int_0^t m\,\ddot{u}_n\,\dot{u}_n\,dt$, is the kinetic energy and the right-hand side, $E_{I,n} = -\int_0^t m\,\ddot{u}_{g,n}\,\dot{u}_n\,dt$, is the energy introduced into the system by the ground motion. The latter term is commonly expressed in terms of equivalent velocity, VE, and is normalized with respect to the mass of the structure:
$$VE_n = \sqrt{2E_{I,n}(T_{n,j})/m} \qquad (8)$$

Table 2.

| Intensity measure | No. | 1D ¹ | 2D ² |
| Spectral acceleration at $T_{n,j}$ | 1 | $Sa_n(T_{n,j}) = \max\lvert\ddot{u}_n(T_{n,j})\rvert$ | $Sa = Sa_x(T_{x,j}) * Sa_y(T_{y,j})$ |
| Spectral velocity at $T_{n,j}$ | 2 | $Sv_n(T_{n,j}) = \max\lvert\dot{u}_n(T_{n,j})\rvert$ | $Sv = Sv_x(T_{x,j}) * Sv_y(T_{y,j})$ |
| Spectral displacement at $T_{n,j}$ | 3 | $Sd_n(T_{n,j}) = \max\lvert u_n(T_{n,j})\rvert$ | $Sd = Sd_x(T_{x,j}) * Sd_y(T_{y,j})$ |
| Average spectral acceleration ³,⁴ | 4 | | |
| Equivalent velocity at $T_{n,j}$ | 7 | $VE_n(T_{n,j}) = \sqrt{2E_{I,n}(T_{n,j})/m}$ | $VE = VE_x(T_{x,j}) * VE_y(T_{y,j})$ |
| Average equivalent velocity | 8 | | |
| Peak ground acceleration | 9 | $PGA_n = \max\lvert\ddot{u}_{g,n}\rvert$ | |
| Fajfar intensity [52] | 18 | $I_{F,n} = PGV_n\,\Delta_n^{\beta}$ | $I_F = I_{F,x} * I_{F,y}$ |

¹ 1D stands for 1 horizontal dimension. ² 2D stands for 2 horizontal dimensions. ³ $T_i$ represents a vector of periods around the fundamental period of the structure. ⁴ $n_T$ is the length of $T_i$. ⁵ $\Delta_n$ is the significant duration of the record in the $n$ direction.
Calculating VE n for several elastic oscillators will produce the equivalent velocity spectrum. IMs 7 and 8 described in Table 2 are obtained from the energy introduced into the system.
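The step from input energy to equivalent velocity can be sketched numerically as follows. This is a hedged example: it assumes the relative velocity history of the oscillator is already available (for instance from a time-stepping solver), it evaluates the input energy at the end of the record, and the function names `trapezoid` and `equivalent_velocity` are hypothetical.

```python
import numpy as np

def trapezoid(y, dt):
    """Trapezoidal rule for uniformly sampled data."""
    y = np.asarray(y, dtype=float)
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

def equivalent_velocity(ag, v_rel, dt, mass=1.0):
    """Input energy E_I = -m * integral(ag * v_rel dt) and the
    mass-normalized equivalent velocity VE = sqrt(2 * E_I / m).
    Assumption: E_I is taken at the end of the record and clipped at
    zero so that the square root is always defined."""
    product = np.asarray(ag, dtype=float) * np.asarray(v_rel, dtype=float)
    e_input = -mass * trapezoid(product, dt)
    ve = np.sqrt(2.0 * max(e_input, 0.0) / mass)
    return ve, e_input
```

Evaluating `equivalent_velocity` for a set of oscillator periods would give one ordinate of the equivalent velocity spectrum per period.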

IMs Based on Direct Computations of the Ground Motion Record
The IMs presented above use the spectral response of SDoF systems. However, several IMs are obtained from direct computations of the ground motion record. The appeal of this type is that they do not depend on the dynamic properties of the structures. In this way, a generic fragility function could be developed to estimate the seismic performance of buildings with very different structural properties. However, this independence reduces the efficiency for predicting EDPs, as shown later on. IMs 9 to 18 described in Table 2 belong to this category. Note that these IMs are generally computed as the sum of the IM values calculated for each component of the record.
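A few of the record-based IMs named in this paper (PGA, PGV, and the Fajfar intensity) can be computed for one component as in the sketch below. Two points are assumptions rather than the paper's stated procedure: the significant duration is taken as the 5% to 95% interval of the cumulative squared acceleration, and velocity is obtained by plain trapezoidal integration without baseline correction. The function names are illustrative.

```python
import numpy as np

def cumulative_trapezoid(y, dt):
    """Running trapezoidal integral of uniformly sampled data."""
    y = np.asarray(y, dtype=float)
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)))

def significant_duration(ag, dt, lower=0.05, upper=0.95):
    """Assumed definition: time between 5% and 95% of the cumulative
    integral of the squared acceleration."""
    energy = cumulative_trapezoid(np.asarray(ag, dtype=float) ** 2, dt)
    husid = energy / energy[-1]
    i_lo = np.searchsorted(husid, lower)
    i_hi = np.searchsorted(husid, upper)
    return (i_hi - i_lo) * dt

def record_ims(ag, dt, beta=0.25):
    """PGA, PGV, significant duration and Fajfar intensity PGV * Delta**beta."""
    ag = np.asarray(ag, dtype=float)
    vg = cumulative_trapezoid(ag, dt)      # ground velocity history
    pga = np.max(np.abs(ag))
    pgv = np.max(np.abs(vg))
    delta = significant_duration(ag, dt)
    return {"PGA": pga, "PGV": pgv, "Delta": delta, "I_F": pgv * delta**beta}
```

None of these quantities requires a structural model, which is why the same values can be reused for buildings with different dynamic properties.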

Engineering Demand Parameters
Engineering demand parameters are used to design or assess the expected behavior of buildings. The most desirable feature of an EDP is that it quantifies as well as possible the performance level of structures subject to ground motions. One of the most used EDPs for estimating this level is the maximum inter-storey drift ratio, MIDR [53]. Many other variables, like the maximum displacement of the roof, the maximum global drift ratio, or the base shear coefficient, can be considered as EDPs. Herein, EDPs have been calculated as the combined response in the main directions of the structure. In the following section, several EDPs are briefly described.

Maximum Displacement at the Roof (δr)
The maximum displacement at the roof, δr, is an EDP widely used in the capacity spectrum method [54]. In the case of NLDA, this EDP is estimated as the maximum over time of the combined response of $\delta r_x(t)$ and $\delta r_y(t)$, the displacements at the roof level in each principal direction.

Maximum Global Drift Ratio (MGDR)
The maximum global drift ratio is mostly used as a representative quantity for determining the general performance of structures subjected to earthquakes [55]. This EDP is obtained by dividing δr by the height of the building, $H$:

$$MGDR = \frac{\delta r}{H} \tag{10}$$

Maximum Inter-Storey Drift Ratio (MIDR)
As stated above, the MIDR is probably the most used EDP when analyzing buildings subject to horizontal ground motions, both for designing them and for assessing their vulnerability. For storey $i$, the evolution of the inter-storey drift ratio in the $n$ direction is given by:

$$IDR_{i,n}(t) = \frac{\delta_{i,n}(t) - \delta_{i-1,n}(t)}{h_i} \tag{11}$$

where $\delta_{i,n}(t)$ is the displacement at floor $i$ of the structure and $h_i$ represents the height of storey $i$. The maximum inter-storey drift ratio at storey $i$, $MIDR_i$, is calculated as the maximum over time of the combined drift ratio in both directions. In order to estimate the damage level of the most affected storey, the maximum inter-storey drift ratio observed in the building, MIDR, is given by:

$$MIDR = \max_i \left( MIDR_i \right) \tag{13}$$

Base Shear Coefficient (CS)
In order to quantify the amount of force acting on the columns located at the ground level of the structure, the base shear coefficient, CS (see Equation (14)), is calculated as the ratio between the combination of the maximum base shear in each direction and the total weight of the building, $W$, where $V_x(t)$ and $V_y(t)$ are the base shears in each principal direction. This coefficient is of paramount importance when designing civil structures to withstand seismic-induced effects.
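The four EDPs described in this section can be extracted from NLDA output along the lines of the sketch below. The way the two horizontal directions are combined is an assumption made for illustration: the vector resultant (square root of the sum of squares) is taken at each time step before maximizing over time. The function name, the array layout (rows are time steps, columns are floors, displacements relative to the ground), and the argument names are likewise hypothetical.

```python
import numpy as np

def edps_from_histories(dx, dy, storey_heights, vx, vy, weight):
    """Return roof displacement, MGDR, MIDR and CS.
    dx, dy: floor displacement histories, shape (n_steps, n_floors).
    vx, vy: base shear histories, shape (n_steps,).
    Assumption: directions are combined as the vector resultant."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    h = np.asarray(storey_heights, dtype=float)

    # Maximum roof displacement (last column is the roof)
    delta_r = np.hypot(dx[:, -1], dy[:, -1]).max()

    # Maximum global drift ratio: roof displacement over total height
    mgdr = delta_r / h.sum()

    # Inter-storey drift ratios; the ground level has zero displacement
    ground = np.zeros((dx.shape[0], 1))
    idr_x = np.diff(np.hstack((ground, dx)), axis=1) / h
    idr_y = np.diff(np.hstack((ground, dy)), axis=1) / h
    midr_per_storey = np.hypot(idr_x, idr_y).max(axis=0)
    midr = midr_per_storey.max()

    # Base shear coefficient: peak combined base shear over total weight
    cs = np.hypot(np.asarray(vx, float), np.asarray(vy, float)).max() / weight
    return {"delta_r": delta_r, "MGDR": mgdr, "MIDR": midr, "CS": cs}
```

A different combination rule could be swapped in by replacing the `np.hypot` calls, leaving the rest of the function unchanged.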

Statistical Analysis of IM-EDP Pairs
Seismic risk stems mostly from the damage that earthquakes cause to civil structures. This damage appears when the elastic limit of the structural and nonstructural elements is exceeded. There are many numerical methods to estimate the resulting nonlinear dynamic response. They range from adapted linear-static-based methods to more advanced ones that consider the nonlinear static (pushover-based methods) or dynamic response of a structure (NLDA). The latter is the most reliable numerical tool to simulate the nonlinear dynamic response of buildings.
Two sets of NLDAs are performed in this research. The first one uses the selected near-fault records as seismic input, whilst the second one uses the far-fault records. The Ruaumoko software has been used to perform the structural analyses [56]. From these calculations, the IMs and EDPs presented in Sections 4 and 5, respectively, are obtained. The correlation coefficient, $R^2$, of the resulting cloud of IM-EDP points is estimated so that the most efficient IM can be identified.

Analysis of IMs Efficiency
In this study, IM-EDP relationships are characterized by performing a nonlinear (quadratic) regression analysis in the log-log space. As with a straight-line fit, this regression is based on determining the values of the parameters that minimize the sum of the squares of the residuals. In this sense, the following general linear least-square model allows several types of regression:

$$y = \alpha_0 z_0 + \alpha_1 z_1 + \dots + \alpha_m z_m + \varepsilon \tag{15}$$

where $\alpha_0, \alpha_1, \dots, \alpha_m$ are the coefficients providing the best fit between model and data; $z_0, z_1, \dots, z_m$ are $m+1$ basis functions; and $\varepsilon$ represents the residuals. It can easily be seen how polynomial regression falls within this model, that is, $z_0 = 1$, $z_1 = x$, ..., $z_m = x^m$. Substituting $y = \ln EDP$ and $z_i = (\ln IM)^i$ in Equation (15), the linear least-square model using polynomial functions can be used to extract statistical information from IM-EDP pairs according to the following expression:

$$\ln EDP = \alpha_0 + \alpha_1 \ln IM + \alpha_2 (\ln IM)^2 + \dots + \alpha_m (\ln IM)^m + \varepsilon \tag{16}$$

For $m = 2$, this equation adopts the following quadratic form:

$$\ln EDP = \alpha_0 + \alpha_1 \ln IM + \alpha_2 (\ln IM)^2 + \varepsilon \tag{17}$$

where $\alpha_0$, $\alpha_1$, and $\alpha_2$ are scalars maximizing $R^2$ for IM-EDP pairs. Further information regarding the development and implementation of this type of polynomial model in the log-log space can be found in [57]. For a perfect fit, $R^2 = 1$, signifying that the quadratic function explains 100% of the variability of the data. $R^2$ is used herein to provide an estimation of the variability when analyzing IM-EDP pairs. That is, the higher $R^2$, the lower the variability when predicting some EDP given an IM. Consequently, the IM providing the highest $R^2$ will be the most efficient one. Note that the quadratic arrangement described above can also fit linear functions if that is what the data suggest. For these cases, $\alpha_2$ tends to zero.
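The quadratic fit in the log-log space and its $R^2$ can be reproduced with a few lines of NumPy. The sketch below is a minimal example under the assumption that `im` and `edp` are arrays of strictly positive IM-EDP realizations; the function name is hypothetical and `numpy.polyfit` is used as a generic least-squares solver rather than the tool used in this study.

```python
import numpy as np

def fit_loglog_quadratic(im, edp):
    """Least-squares fit of ln(EDP) = a0 + a1*ln(IM) + a2*ln(IM)**2.
    Returns the coefficients (a0, a1, a2) and R^2."""
    x = np.log(np.asarray(im, dtype=float))
    y = np.log(np.asarray(edp, dtype=float))
    a2, a1, a0 = np.polyfit(x, y, 2)        # highest power first
    y_hat = a0 + a1 * x + a2 * x**2
    ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return (a0, a1, a2), 1.0 - ss_res / ss_tot
```

Ranking candidate IMs then amounts to calling this function once per IM for a given EDP and sorting by the returned $R^2$.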
In the following, the variables described in Sections 4 and 5 are used as IMs and EDPs, respectively. Note that most of the simulations performed in this study fell within the nonlinear range. This is not a coincidence but a consequence of the scaling values of the ground motion records. Figures 4-7 show the relationships in the log-log space between EDPs and IMs considering near-fault records. Figures 8-11 show analogous data for far-fault records.

Near-Fault Ground Motion Records
Figure 4 shows that AvSd exhibits the highest ability to predict δr for near-fault ground motion records. It is worth noting that the relationship between both variables in the log-log space seems linear. In general, IMs based on displacement are the ones most correlated to δr. Regarding the group of IMs independent of structural properties, SED is the one with the highest efficiency to predict δr.
VE is the IM that minimizes the dispersion when predicting MGDR (see Figure 5). It is worth mentioning the high efficiency exhibited by the classic Sa, which is even higher than that of AvSa. The latter has been shown to have superior ability to predict MIDR, at least in terms of acceleration-based IMs. Regarding IMs independent of structural properties, PGD is the one with the highest efficiency. Figure 6 shows that AvSv is the IM exhibiting the highest efficiency for predicting MIDR. In general, it can be seen that velocity-based IMs are the ones most correlated with MIDR. This also occurs when analyzing IMs independent of the structural properties, e.g., PGV, SED, and $vel_{RMS}$. Note that the arrangement originally proposed by Fajfar et al. [50] (i.e., β = 0.25) tends to show less correlation than PGV, at least when MIDR plays the role of EDP. It has been found that using β = 0.07 provides a better correlation value for this IM ($R^2$ = 0.949).
For the case of CS, Figure 7 shows that the most correlated IM is Sa. Note that the correlation index obtained when relating this IM with CS is significantly higher than that of the rest of the IMs.

Far-Fault Ground Motion Records
Figure 8 shows that AvSd is again the IM with the highest ability to predict δr. It is also observed that the relationship between both variables in the log-log space seems linear. IMs based on displacement are the ones that exhibit the highest efficiency with respect to δr. From the group of IMs independent of structural properties, SED is again the IM minimizing the dispersion when predicting δr. For the case of MGDR, Sa is the IM that exhibits the highest efficiency (see Figure 9). Regarding IMs independent of structural properties, SED is the one most correlated to MGDR. AvSv is again the IM exhibiting the highest efficiency for predicting MIDR (see Figure 10). In general, it can be seen again that velocity-based IMs are the ones most correlated with MIDR. This also occurs when analyzing IMs independent of the structural properties, e.g., PGV, SED, and $vel_{RMS}$. When MIDR plays the role of EDP, the arrangement originally proposed by Fajfar et al. [50] (i.e., β = 0.25) tends to show less correlation than PGV. It has been found that using β = 0.07 provides a better correlation value for this IM ($R^2$ = 0.949). For the case of CS, Figure 11 shows that the most correlated IM is Sa. Note again that the correlation index obtained when using this IM is significantly higher than that of the rest of the IMs.

Figure 11. Correlation coefficient between IMs and CS in the log-log space for far-fault records.

Spearman Coefficient
It can be argued that $R^2$ is not adequate to explain the causality between IMs and EDPs, because of the lack of linearity (exhibited between some sets of IM-EDP pairs), normality, and homoscedasticity of these variables. Regarding linearity, the quadratic model used to perform the regression analysis overcomes this issue. Regarding normality and homoscedasticity, it has been observed that both types of variables, i.e., IMs and EDPs, tend to be lognormal in the log-log space. This calls into question the use of $R^2$. Therefore, the Spearman coefficient, ρ, which is an alternative index less sensitive to the lack of such properties, has also been analyzed. It has been observed that the correlation values are similar between this coefficient and the one emerging from the quadratic regression model. Most importantly, both approaches identify the same IM as the one maximizing the correlation with the specific EDP, as shown in Figure 12 (see blue circles).

In general, slightly higher $R^2$ values are observed for far-fault records. This could be because near-fault records introduce larger demands to the analyzed structures. That is, the higher the nonlinearity, the higher the dispersion of the data. In order to analyze the implications of this increased demand, fragility functions may provide valuable information. These curves allow estimating the conditional probability of exceeding a certain damage threshold given an IM value. Note that since both subsets of records were scaled to meet similar IMs, the fragility functions that provide the highest probability of exceedance are those that come from the highest demands. In the following section, fragility functions derived from near- and far-fault records are compared.
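The rank-based check discussed in this section can be sketched as follows. The Spearman coefficient is the Pearson correlation of the ranks, so it equals 1 for any strictly increasing IM-EDP relationship regardless of its shape. This minimal version assumes there are no tied values in either variable; the function name is hypothetical.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation (assumes no ties in x or y)."""
    rank_x = np.argsort(np.argsort(x)).astype(float)  # 0-based ranks
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return np.corrcoef(rank_x, rank_y)[0, 1]
```

Because only the ordering of the data enters the calculation, the result does not depend on whether the variables are analyzed in natural or logarithmic scale.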

Fragility Functions
From the IM-EDP relationships shown in Figures 4-11, one can derive fragility functions according to the so-called 'cloud analysis' approach [19]. This methodology requires calculating the best-fit curve between a set of IM and EDP realizations in the log-log space. The resultant curve is used to calculate the mean of a normal distribution, which depends on the IM value. The variability of this distribution is estimated as the standard deviation of the IM-EDP residuals with respect to the fitted curve. Hence, when varying the IM value, the probability of exceeding a specific damage threshold can be estimated. It is worth recalling that these thresholds are particular realizations of the EDP under consideration, $EDP_C$.
According to the above, the variability of the fragility functions is directly related to the dispersion of the IM-EDP points. The higher this dispersion, the more uncertainty there is when defining whether the structure presents one specific damage state or another. That is why it is important to identify efficient IMs.
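The cloud analysis steps described above can be sketched as follows: fit the quadratic curve in the log-log space, take the residual standard deviation, and evaluate the normal exceedance probability at each IM value. This is a minimal illustration, not the implementation used in this study; the function name is hypothetical, and dividing the residual sum of squares by the number of points minus three (the number of fitted coefficients) is an assumption.

```python
import numpy as np
from math import erf, sqrt

def cloud_fragility(im, edp, edp_c, im_grid):
    """P(EDP > edp_c | IM) for each value in im_grid, from a quadratic
    fit of ln(EDP) on ln(IM) with constant residual dispersion."""
    x = np.log(np.asarray(im, dtype=float))
    y = np.log(np.asarray(edp, dtype=float))
    coeffs = np.polyfit(x, y, 2)                   # best-fit curve
    resid = y - np.polyval(coeffs, x)
    # Assumed estimator: residual std with n - 3 degrees of freedom
    sigma = np.sqrt(np.sum(resid**2) / (len(x) - 3))
    mu = np.polyval(coeffs, np.log(np.atleast_1d(im_grid).astype(float)))
    z = (np.log(edp_c) - mu) / sigma
    # Standard normal CDF written with the error function
    cdf = np.array([0.5 * (1.0 + erf(zi / sqrt(2.0))) for zi in z])
    return 1.0 - cdf
```

A larger `sigma` flattens the resulting curve, which is the numerical counterpart of the statement that inefficient IMs blur the boundary between damage states.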
In the present study, fragility functions are derived from a regression analysis that accounts for the nonlinear relationship between random variables. This approach renders the estimation of the probability of exceedance more reliable. However, it should be borne in mind that, in low-correlation cases, it is not recommended to derive fragility functions using the quadratic approximation, since the fitted curve tends to be highly nonlinear, thus compromising the quality of the corresponding function.
As discussed above, fragility curves can be used to analyze the consequences of one subset of records placing higher demands on similar structures than another. In other words, they reveal whether the first subset contains more damaging ground motion records than the second one.
Considering the cloud analysis approach described above, fragility functions for MGDR, MIDR, and CS considering near- and far-fault records are derived (see Figure 13). Fragility functions for δr have not been calculated due to the high dependency of this EDP on the building's height. The IM considered to derive each curve is the one exhibiting the highest correlation with the EDP. In this way, VE, AvSv, and Sa have been used to calculate fragility functions for MGDR, MIDR, and CS, respectively. The damage thresholds were established at $MGDR_C$ = 0.02, $MIDR_C$ = 0.02, and $CS_C$ = 0.3.

In Figure 13, it can be seen that, for all the fragility curves, near-fault records tend to induce larger demands on the analyzed buildings than the far-fault ones. This effect is more conspicuous when analyzing MGDR, reaching differences in probabilities on the order of 0.5. For MIDR, these differences may reach values on the order of 0.18. For CS, the observed differences are significantly lower (0.06).

Concluding Remarks
It has been observed that a single IM is not able to predict, with the highest efficiency, all the EDPs considered in this research. This means that, depending on the EDP to be studied, it is more convenient to use one IM or another. However, a general efficiency can be estimated by averaging all the $R^2$ values associated with each IM-EDP set of points. This average coefficient of correlation has been computed (see Table 3) for each subset of analysis ($R^2_{NF}$ and $R^2_{FF}$ for near- and far-fault records, respectively). From these averages, it can be concluded that the seismic response and damage of structures can be better related to the energy entering into them than to the forces exerted upon them. It is important to note that this energy can be regarded as a function of the velocity of the system. Thus, the results presented in this article indicate that the accuracy of seismic risk estimations can be improved if variables related to the velocity, energy, or even displacement at the ground level are used as IMs. This improvement in accuracy will aid decision-makers in prioritizing actions oriented to the reduction of seismic risk. In the case of design, this will allow a better quantification of the safety factors used in current guidelines to cover uncertainties related to the seismic forces.
Regarding IMs obtained from direct computations of the earthquake record, it was also shown that those related to the velocity are the most effective for predicting EDPs. In addition, these IMs do not depend on the dynamic properties of the analyzed structures. This implies that buildings with very different structural properties can be analyzed using the same fragility function. This is positive only if the IM exhibits an adequate level of correlation (e.g., PGV and SED). Nonetheless, this independence comes at the cost of a reduction in efficiency compared to IMs that consider the dynamic properties of the structures.
It has been confirmed that the seismic response of structures to near-fault ground motion is substantially different from the response to far-field earthquake records. More precisely, for the same intensity intervals, near-fault ground motions impose a larger demand in terms of EDPs than far-fault motions. This verification has been carried out through comprehensive statistical calculations.
The increased demand could be because, in the directivity zone, near-fault ground motions may contain large-amplitude velocity pulses of long duration. These characteristics affect the response of both short- and long-period structures [58].
Within this research, the as-recorded components of the ground motions have been used as input to perform the probabilistic calculations. That is, structures have not been consistently rotated with respect to the geometry of the fault. Nevertheless, directivity effects may be present in many of the selected records. It is quite likely that if the structures were rotated so that their main axes consider the position of the forward directivity zone, the increase in demand would be even higher. In a way, in this research only the effect of being close to the epicentre is responsible for increasing the expected demand. Further analysis should be aimed at analyzing this increase by consistently rotating the structures with respect to the fault; this would allow their azimuthal position to be included when designing or assessing them.
The importance of classifying the ground motion records according to the distance to the rupture when deriving fragility functions has been observed. It has been detected that, in general, the efficiency of the IMs increases if the records are grouped according to this variable. That is, if the IM-EDP pairs are analyzed without considering the distance classification, the $R^2$ coefficients decrease significantly. It is also important to note the ability of certain IMs to maintain high levels of correlation with EDPs regardless of the dynamic properties of the structure. This finding will make it possible to diminish the number of structural calculations when assessing seismic risk, since the number of fragility functions needed to account for the entire building stock of cities can be significantly reduced.
This article has analyzed EDPs widely used both in the design and in the assessment of civil infrastructure. However, the study of other EDPs related to damage measurements based on dissipated energy or plastic rotations in critical sections could provide more information on the behavior of the analyzed structures. Further research should be geared towards this end.
In general, it can be concluded that velocity-based IMs have higher predictive power for the structural response (efficiency) than acceleration-based IMs. Note that several researchers have demonstrated the enhanced capabilities of velocity-based IMs [59,60]. Despite this, acceleration-based IMs still prevail. Consequently, the entire framework for designing new structures (or assessing the performance of existing ones) has been built using acceleration as the physical magnitude. In the author's view, this is due to the lack of a probabilistic framework that could prove, in a generalized way, the enhanced capabilities of IMs based on velocity. This article provides extensive evidence aimed at achieving a paradigm shift in the way in which the seismic problem is currently being faced.
Finally, it has been observed that the effectiveness of two basic IMs ($acc_{RMS}$ and PGV) increases when they are combined as power functions with ∆. This gave rise to two enhanced IMs named $I_c$ and $I_F$. It will be of interest to analyze whether new improved IMs can be developed using the IMs analyzed herein. Indeed, the concept of improved IMs, that is, the linear combination of basic IMs and other types of variables like ∆ in the log-log space, allows $R^2$ to be increased. New types of variables like the ones presented in Table 1 or those described in Section 2.2 can also be used to enhance the predictability of IMs. In this respect, polynomial regression analysis allows a better approximation of the structural response than linear regression models. Note that this improvement in the predictive potential of the model can also be measured through $R^2$.
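The idea of an improved IM as a linear combination in the log-log space can be illustrated with a two-variable least-squares fit. In the sketch below, ln(EDP) is regressed on ln(PGV) and ln(∆); the ratio of the two slopes then plays the role of the exponent β in an IM of the form PGV·∆^β. The function name and the returned tuple are hypothetical, and any data passed to it in a demonstration would be synthetic, not results from this study.

```python
import numpy as np

def fit_improved_im(pgv, delta, edp):
    """Fit ln(EDP) = a0 + a1*ln(PGV) + a2*ln(Delta) by least squares.
    Returns (a0, a1, a2, beta, r2), where beta = a2 / a1 is the exponent
    of Delta in the equivalent combined IM, PGV * Delta**beta."""
    pgv = np.asarray(pgv, dtype=float)
    delta = np.asarray(delta, dtype=float)
    y = np.log(np.asarray(edp, dtype=float))
    design = np.column_stack((np.ones(len(pgv)), np.log(pgv), np.log(delta)))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    a0, a1, a2 = coef
    resid = y - design @ coef
    r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)
    return a0, a1, a2, a2 / a1, r2
```

Comparing the returned $R^2$ with that of a fit on ln(PGV) alone would indicate how much the additional variable improves the prediction.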
The results presented in this article meet the FAIR data principles [61]. Thus, potential users can find, access, interoperate with, and reuse the data presented herein. A text file containing the main results, as well as a document describing the data arrangement, can be downloaded from the following website: http://kairoseq.upc.edu (accessed on 27 May 2021). This text file can be easily managed within several computer programs, allowing potential users to develop further applications.