Determining the Most Sensitive Socioeconomic Parameters for Quantitative Risk Assessment

Risk assessment of climatic events and climate change is a globally challenging issue. For risk as well as vulnerability assessment, there can be a large number of socioeconomic indicators, from which it is difficult to identify the most sensitive ones. Many researchers have studied risk and vulnerability assessment through a specific set of indicators. The set of selected indicators varies from expert to expert, which inherently results in a biased output. To avoid biased results in this study, the most sensitive indicators are selected through sensitivity analysis performed by applying a non-linear programming system, which is solved using the Karush-Kuhn-Tucker conditions. Here, risk is assessed as a function of exposure, hazard, and vulnerability, as defined in the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5), where exposure and vulnerability are described via socioeconomic indicators. The Kolmogorov-Smirnov statistical test is applied to select the set of indicators that are the most sensitive for the system to assess risk. The method is applied to the Bangladesh coast to determine the most sensitive socioeconomic indicators in addition to assessing different climatic and climate change hazard risks. The methodology developed in this study can be a useful tool for risk-based planning.

Sensitivity analysis provides a powerful means of learning about the degree of sensitivity of all the socioeconomic parameters of the system. It primarily studies the degree of impact of each indicator on the composition of the indices [19]. This technique may be used to come up with a functional relationship between a small group of parameters and the response of the system output [20][21][22]. Baker et al. identified the sensitivity analysis method as one of the key quantitative analyses for risk management, as it can provide the groundwork for planning adaptation operations to minimize the risks associated with climate change [23]. The tool can also be used for the identification of uncertainties to prioritize additional research or data collection [24].
Various methods such as the differential method, analysis of variance (ANOVA), linear regression analysis (RA), response surface method (RSM), mutual information index (MII), Fourier amplitude sensitivity test (FAST), Sobol's method [25], and non-linear programming [26] can be used to perform sensitivity analysis.
In this study, non-linear programming is applied to the sensitivity analysis, which can be viewed as a part of mathematical optimization. Non-linear programming deals with optimization problems where the objective function or some of the constraints are non-linear. This method covers the whole system where the risk is assessed. In this study, the selected system is the coastal zone of Bangladesh.
This study focuses on answering the question, "How sensitive are single indicators, and how does single-indicator sensitivity influence the overall risk of the system?" In this study, risk is defined according to IPCC AR5 [27] to formulate the non-linear programming and to select the most sensitive indicators. This gives the ranking of the set of indicators. Within this ranking, a statistical test is performed to select the most sensitive indicators, which can be used to assess the risk in the selected system.
The methodology of this study includes study area selection, selection of hazard and socioeconomic indicators in the study area, formulation of the non-linear programming, solution of the non-linear programming, and statistical analysis to detect significant change.

Study Area Selection
To apply the methodology developed in this paper, a study area is needed where the required data on socioeconomic indicators are available. The coastal area of Bangladesh (Figure 1) is found to be suitable for this purpose. This coast extends a long distance inland [28], and about one-fourth of the country's total population lives there [29]. This population may almost double by mid-century [30]. The coexistence of available resources [31], a high poverty rate [32], and natural hazards [33] makes the selection of socioeconomic parameters complicated in this area. Data from the available socioeconomic indicators in the study area are used to determine the most sensitive ones to assess risk.

Hazards in the Study Area
The study area is vulnerable to different types of natural hazards. The dominant hazards in this area are storm surges, floods, salinity, and river erosion (Figure 2). The data needed for hazard assessment are extracted from model simulation results [34]. Storm surge is the result of cyclonic events that occur in the pre-monsoon (April-May) and post-monsoon (October-November) seasons. The main impact zone of storm surge is confined to the landfall location along the exterior coast [35]. Storm surge hazard (Figure 2a) is assessed by combining surge depth and cyclonic wind speed. Flooding is caused by the combined action of fluvial flow from upstream rivers and tide from the sea, and occurs mainly during the monsoon (June-September). The northern parts of the coast, which are not protected by polders (encircling embankments), are mainly affected by flood [36]. The parameter used for flood hazard assessment (Figure 2b) is flood depth. Salinity in the study area is represented by the river salinity. Salinity intrusion in the region occurs during the dry season (December-March). Salinity magnitude is highest in the western and eastern regions, whereas the central region has the lowest salinity [37]. Salinity hazard is assessed (Figure 2c) by using the salinity magnitude in the rivers and estuaries. Erosion in the study area is mainly confined to the river banks [38]. To assess erosion hazard (Figure 2d), the total eroded area along the river bank is used as the hazard parameter.

Available Socioeconomic Indicators to Assess Risk in the Study Area
From the available data sources, 23 socioeconomic indicators are preliminarily selected that can be used to assess risk in the area due to the four hazard indicators mentioned in Section 2.2. All of these 27 indicators (4 hazard and 23 socioeconomic) are divided into different domains. The methodology developed in this study is applied to determine the most sensitive indicators from this preliminary list to assess risk in the study area. Table 1 shows the preliminarily selected indicators in different domains, their role in assessing risk, sources of data, and data units.

Domain | Indicators | Impact on risk | Data source and data unit

Exposure
Cropped Area
Negative impact on risk due to its exposure to hazard [39,40].
Data source: [41]. Data unit: Percentage of Cropped Area per unit of administrative area.

Number of Household
Increased number of households causes increased risk [42,43].
Data source: [29]. Data unit: Percentage of Number of Household per unit of administrative area.

Population Density
Increased population density increases exposed population to risk [43,44].
Data source: [29]. Data unit: Total number of population per unit of administrative area.

Sensitivity
Female to Male Ratio
The female population is more sensitive to risk than the male population. An increased female population increases risk.
Data source: [29]. Data unit: Ratio of female to male population.

Poverty Rate
Poor people are sensitive to hazards, so higher poverty rates indicate higher risk from the same hazard.
Data source: [29]. Data unit: Percentage of extreme poor living below the poverty line.

Dependent Population
The dependent population in an area comprises women, children, and elderly people. These groups are considered less able to adapt to risk [42,44,45].
Data source: [29]. Data unit: Percentage of summation of women, children and elderly population to the total population of an administrative unit.

Disabled People
Physically and mentally disabled people are more sensitive to hazard because of their inability and slow response during a hazard event [46].
Data source: [29]. Data unit: Percentage of total disabled people to total number of population in an administrative unit.

Unemployed Population
Unemployment decreases the coping capacity and increases the sensitivity and susceptibility to risk [47].
Data source: [29]. Data unit: Percentage of total unemployed population to total population in an administrative unit.

Adaptive Capacity
Growth center
Growth center is an economic indicator. A higher number of growth centers indicates better economic strength and better adaptive capacity against vulnerability [48].
Data source: [29]. Data unit: Number of growth center per 5000 of population in an administrative unit.

Plantation
Plantation is considered as a buffer against storm surge hazard that reduces the initial thrust of the hazard. Reduction of hazard means reduction of risk [49].
Data source: [34]. Data unit: Forest area (natural and artificial) per unit of administrative area.

Aquaculture
Shrimp cultivation is the dominant aquaculture in the study area. Aquaculture is considered as an alternative livelihood to adapt against salinity hazard.
Data source: [34]. Data unit: Shrimp cultivated area per unit of administrative area.

Cyclone shelter
Cyclone shelter is a structural adaptive measure against storm surge hazard. An increased number of cyclone shelters reduces the number of human casualties and thus reduces storm surge risk [43,50].
Data source: [51]. Data unit: Number of cyclone shelter per unit of administrative area.

Cropping intensity
Cropping intensity is an indicator of agricultural activity. Increased cropping intensity means increased adaptive capacity that reduces risk against hazard [39,40,52].
Data source: [41]. Data unit: Percentage of gross cropped area per net cropped area in an administrative unit.

GDP
Gross Domestic Product (GDP) is an economic indicator. Higher GDP means better ability to recover from loss and reduce risk from hazard [53].

Irrigation Equipment
Shallow tube-well (Stw), Deep tube-well (Dtw), and Low Lift Pump (LLP) are the known irrigation equipment in the study area. An increased number of irrigation equipment enables a farmer to better adapt to the hazard and thus reduces risk.
Data source: [29]. Data unit: Number of irrigation equipment per unit of cropped land area.

Polder Area
Polder is an encircled embankment constructed to prevent flood in the study area. An increased number of polders reduces flood and thus reduces flood risk [43,54].

Presence of Lifeline
Lifeline is represented by water supply, sanitation, and electricity. A higher number of lifeline utilities is considered to increase adaptive capacity against vulnerability and thus reduce risk [43][44][45].
Data source: [29]. Data unit: Percentage of tap water and other pond types surface water sources and percentage of connected sanitary and electricity lines per unit area of an administrative unit.

Loan
Loan is considered as the credit facility by co-operative society and banks, particularly to recover from loss due to hazard. Increased loan facilities thus reduce risk due to hazard [55].
Data source: [29]. Data unit: Percentage of total account holder per total number of populations in an administrative unit.

Literacy Rate
Literate people know better how to adapt to the vulnerability and reduce risk [42,44,45].
Data source: [29]. Data unit: Percentage of number of literate people per unit of administrative area.

Number of Health care Provider
Health care providers play an important role to reduce human casualty during a hazard, which acts to reduce vulnerability and risk of the community [56].
Data Source: [29]. Data unit: Percentage of health care provider compared to total population in an administrative unit.

Paka and Semi-paka house
Paka and Semi-paka houses represent households which are structurally strong to resist impacts of hazard. Presence of these housing types reduces risk [43,45].
Data source: [29]. Data unit: Percentage of Paka and Semi-paka houses compared to total number of households in an administrative unit.

Communication Infrastructure
Communication infrastructure is represented by all types of structural measures related to communication. It acts as an adaptive capacity for a community and reduces vulnerability and risk during a hazard [43,44,57].
Data source: [29]. Data unit: Weighted sum of length of different types of structural measures used for communication purpose in an administrative unit.

Road Density
Increased road density in an area increases mobility during a hazard. This makes it possible to utilize other adaptive measures that reduce risk.
Data source: [29]. Data unit: Total road length in an administrative unit.

Non-Linear Programming to Determine the Most Sensitive Indicators
As mentioned earlier, the central research question of this paper is: 'What are the most sensitive indicators that determine risk in a system?' To answer this question, non-linear programming is applied to determine the most sensitive indicators in the study area from the list described in Section 2.3.
A non-linear programming system is a system that solves an optimization problem where some of the constraints are non-linear or the objective function is non-linear. A general constrained or unconstrained non-linear programming problem is to select n decision variables $x_1, x_2, \ldots, x_n$ from a given feasible region in such a way as to optimize (minimize or maximize) a given objective function $f(x_1, x_2, \ldots, x_n)$ of the decision variables [58]. The feasible region is the set of all points that satisfy the problem's constraints [58]. A flow chart describing non-linear programming is shown in Figure 3.

Unconstrained Non-Linear Programming
The simplest non-linear programming problem is that of minimizing or maximizing a function [58]. An unconstrained non-linear programming problem is given by

$$\min_{x \in \mathbb{R}^{N_x}} f(x) \tag{1}$$

where $f : \mathbb{R}^{N_x} \to \mathbb{R}$ is a smooth, real-valued objective function of the vector $x \in \mathbb{R}^{N_x}$.

Constrained Non-Linear Programming
The problem is called a non-linear programming problem (NLP) if the objective function is non-linear and/or the feasible region is determined by non-linear constraints [58]. Thus, in minimization form, the general non-linear programming problem is stated as:

$$\min f(x_1, x_2, \ldots, x_n)$$

subject to:

$$g_1(x_1, x_2, \ldots, x_n) \le b_1, \quad \ldots, \quad g_m(x_1, x_2, \ldots, x_n) \le b_m$$

where each of the constraint functions $g_1$ through $g_m$ is given and $b_1, b_2, \ldots, b_m$ are constants.

Development of Non-Linear Programming System
Some 'disturbance parameters' are responsible for creating ambiguous results in risk as well as vulnerability assessment. To avoid such ambiguity in risk and vulnerability analysis, a non-linear programming system is incorporated. For the development of this system, risk, hazard, exposure, and vulnerability must be used as the constraints and the objective function of the system. For this purpose, 23 possible socioeconomic indicators (out of a total of 27, of which 4 are hazard parameters; Section 2.3) are selected in this study area [27]. Risk is a multiplicative function (non-linear combination) of hazard, exposure, and vulnerability, where vulnerability is a simple linear combination of sensitivity and adaptive capacity [27].
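The composition of risk described above can be illustrated with a minimal sketch. It assumes domain scores normalized to [0, 1] and assumes that vulnerability is sensitivity minus adaptive capacity, rescaled to [0, 1]; the text states only that vulnerability is a linear combination of the two, so the coefficients and the function names below are hypothetical.

```python
def vulnerability(sensitivity: float, adaptive_capacity: float) -> float:
    """Linear combination of sensitivity and adaptive capacity.

    Assumption: equal weights with opposite signs, rescaled to [0, 1].
    """
    return 0.5 * (sensitivity - adaptive_capacity + 1.0)


def risk(hazard: float, exposure: float, sensitivity: float,
         adaptive_capacity: float) -> float:
    """Risk as the product of hazard, exposure, and vulnerability."""
    return hazard * exposure * vulnerability(sensitivity, adaptive_capacity)


# Hypothetical normalized scores for one spatial unit.
example = risk(hazard=0.8, exposure=0.5, sensitivity=0.6, adaptive_capacity=0.2)
```

Because the three factors are multiplied, a zero score in any one of them gives zero risk, which is what makes the objective function non-linear in the decision variables.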
In this research, the non-linear programming systems are developed from the linear and non-linear combinations of the parameters (Section 2.3), which are called decision variables. Here, risk is considered as the objective function, and the constraints are developed from the weighted scores of the parameters for each spatial unit of the study area (a total of 139 administrative units, named 'upazila'). The relative weighted scores are calculated using principal component analysis (PCA) [59]. PCA gives a correlation matrix that identifies the principal components of a system [59]. The Pearson correlation coefficient is used to find the weights of the parameters, which describe how much an indicator can explain a component vector.
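As an illustration of the weighting step, the sketch below computes the Pearson correlation between each indicator and a component score, then normalizes the absolute correlations into weights that sum to one. The component scores, the indicator values, and the normalization rule are assumptions made for this example; the text states only that the Pearson correlation with the component vector is used.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_weights(indicators: dict[str, list[float]],
                        component: list[float]) -> dict[str, float]:
    """Weights proportional to |r| with the component (assumed rule)."""
    r = {name: abs(pearson(values, component))
         for name, values in indicators.items()}
    total = sum(r.values())
    return {name: value / total for name, value in r.items()}


# Hypothetical normalized scores for four spatial units.
indicators = {
    "poverty_rate": [0.1, 0.4, 0.6, 0.9],
    "literacy_rate": [0.9, 0.5, 0.6, 0.2],
}
component = [0.2, 0.4, 0.7, 0.8]  # hypothetical principal component scores
weights = correlation_weights(indicators, component)
```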

Solution of Non-Linear Programming System with Karush-Kuhn-Tucker (KKT) Conditions
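In modern non-linear programming algorithms, the Fritz John and the Karush-Kuhn-Tucker conditions are directly applicable to practical optimization problems [60]. In this research, the Karush-Kuhn-Tucker conditions [60] are used to solve the non-linear programming system. A prerequisite for stating the KKT conditions is the Lagrangian function [60,61]. Consider the non-linear programming problem in general:

$$\min_{y} f(y) \quad \text{subject to} \quad g_i(y) \le 0,\ i = 1, \ldots, m, \qquad h_j(y) = 0,\ j = 1, \ldots, p$$

where $g(y)$ is a function of the inequality constraints and $h(y)$ is a function of the equality constraints. The Lagrangian function is defined by:

$$L(y, \lambda_0, \lambda, \mu) = \lambda_0 f(y) + \sum_{i=1}^{m} \lambda_i g_i(y) + \sum_{j=1}^{p} \mu_j h_j(y)$$

where $\lambda_0$, $\lambda_i$, and $\mu_j$ are the multipliers. Taking $\lambda_0 = 1$ for the KKT conditions,

$$L(y, \lambda, \mu) = f(y) + \sum_{i=1}^{m} \lambda_i g_i(y) + \sum_{j=1}^{p} \mu_j h_j(y)$$

The necessary conditions for KKT [60] are:
(i) $\nabla_y L(y, \lambda, \mu) = 0$
(ii) $\lambda_i g_i(y) = 0,\ i = 1, \ldots, m$
(iii) $\lambda_i \ge 0,\ i = 1, \ldots, m$
(iv) $g_i(y) \le 0,\ i = 1, \ldots, m$
(v) $h_j(y) = 0,\ j = 1, \ldots, p$
The point $y$ must be feasible, that is, it must satisfy constraints (iv) and (v).
Using these conditions, a set of equations is formulated for the required values of the variables (decision parameters) to minimize the objective function.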
Applying the above methodology, a MATLAB code to solve the non-linear programming problem (Equation 1) is developed.
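To make the KKT conditions concrete, the following sketch checks them on a toy problem with one inequality constraint: minimize f(y) = (y1 - 2)^2 + (y2 - 1)^2 subject to g(y) = y1 + y2 - 2 <= 0. The toy problem, the function names, and the use of Python instead of MATLAB are choices made for illustration only; this is not the risk model or the code used in the study.

```python
def grad_f(y):
    """Gradient of f(y) = (y1 - 2)^2 + (y2 - 1)^2."""
    return [2.0 * (y[0] - 2.0), 2.0 * (y[1] - 1.0)]


def g(y):
    """Inequality constraint, required to satisfy g(y) <= 0."""
    return y[0] + y[1] - 2.0


GRAD_G = [1.0, 1.0]  # the gradient of g is constant


def kkt_satisfied(y, lam, tol=1e-9):
    """Check stationarity, complementary slackness, and feasibility."""
    stationarity = all(abs(df + lam * dg) <= tol
                       for df, dg in zip(grad_f(y), GRAD_G))
    complementary = abs(lam * g(y)) <= tol
    dual_feasible = lam >= -tol
    primal_feasible = g(y) <= tol
    return stationarity and complementary and dual_feasible and primal_feasible


# Candidate obtained by hand: the constraint is active, so y1 + y2 = 2.
candidate, multiplier = [1.5, 0.5], 1.0
is_kkt_point = kkt_satisfied(candidate, multiplier)
```

The unconstrained minimizer (2, 1) fails the check because it violates the constraint, which is why the multiplier at the solution is strictly positive.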

Statistical Analysis to Detect Significant Change
Two risk levels are developed for all spatial units (in this case the spatial unit is the 'upazila', which is an administrative unit one step below the district level). One risk level is based on the base risk scores of each upazila, considering all 23 socioeconomic indicators. The other risk level is developed by using the risk scores calculated after eliminating indicators one at a time. Statistical tests are performed to examine whether there are significant changes between these two risk levels. To detect a statistically significant change between the two risk levels, the Kolmogorov-Smirnov test [62][63][64][65] is performed. The test is performed by calculating the difference between the two risk levels $c_1$ and $c_2$. The following hypotheses are applied, where $C_{1i}$ is the base risk score considering 23 indicators and $C_{2i}$ is the risk score after elimination of indicators one at a time:

$$H_0: C_{1i} = C_{2i}; \text{ there is no significant change between the two curves} \tag{12}$$

$$H_1: C_{1i} \neq C_{2i}; \text{ there is a significant change between the two curves} \tag{13}$$

The cumulative distributions of the $c_1$ and $c_2$ curves, $C^{f}_{1i}$ and $C^{f}_{2i}$, are calculated over all upazilas, where N is the number of upazila.
The distance (D) between the cumulative distributions is calculated as the maximum distance:

$$D = \max_{1 \le i \le 138} \left| C^{f}_{1i} - C^{f}_{2i} \right|$$

Here, 138 represents the total number of spatial units (in this case upazila). The distance (D) indicates the dissimilarity between the two curves.
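The distance D can be sketched with the textbook two-sample Kolmogorov-Smirnov statistic, which takes the largest gap between two empirical cumulative distribution functions. This is an assumption about how the cumulative distributions are built; the study's own formula may differ, and the risk scores below are hypothetical.

```python
from bisect import bisect_right


def ks_distance(sample_1, sample_2):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the largest absolute gap between the two empirical cumulative
    distribution functions, evaluated at every observed value.
    """
    sorted_1, sorted_2 = sorted(sample_1), sorted(sample_2)
    n_1, n_2 = len(sorted_1), len(sorted_2)
    distance = 0.0
    for value in sorted_1 + sorted_2:
        cdf_1 = bisect_right(sorted_1, value) / n_1
        cdf_2 = bisect_right(sorted_2, value) / n_2
        distance = max(distance, abs(cdf_1 - cdf_2))
    return distance


# Hypothetical risk scores for five spatial units.
base_scores = [0.10, 0.25, 0.40, 0.55, 0.70]
reduced_scores = [0.12, 0.27, 0.38, 0.58, 0.90]
d = ks_distance(base_scores, reduced_scores)
```

With only five units D can change only in steps of 0.2, so a threshold such as 10% becomes meaningful only with a larger number of spatial units, as in the study area.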

Results and Discussion
A non-linear programming system with inequality and equality constraints is designed from 27 possible parameters (decision variables; 23 socioeconomic and 4 hazard) by using the normalized scores of each parameter. This non-linear programming problem is solved using the KKT conditions, and the solution gives the rank of the indicators during the risk minimization process (which is the ultimate target). Variation of the coefficients of the variables in the objective function describes the sensitivity of the parameters (decision variables). A MATLAB code is developed for the solution of the non-linear programming problem, in which a function 'lambda', representing the Lagrangian multiplier, is defined. This 'lambda' gives the lower limit and the upper limit of the coefficients of the variables of the objective function. If the coefficients of the variables vary within the range between the lower limit and the upper limit, the risk (the objective function) remains optimized (minimized); otherwise the risk varies. This implies that the range between the lower and upper limits gives the rank among the indicators, where a lower deviation indicates the most sensitive parameter and a higher deviation indicates the least sensitive parameter (Table 2). Based on this ranking (Table 2), domain-specific indicators are arranged according to their ranking (Table 3).
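The ranking rule can be sketched as a sort on the width of each coefficient range, with the narrowest range ranked as most sensitive. The limits below are hypothetical values chosen for illustration, not the values reported in Table 2, and the function name is an assumption.

```python
def rank_by_range(limits):
    """Order indicators from most to least sensitive.

    `limits` maps an indicator name to (lower, upper) bounds on its
    objective-function coefficient; a narrower range ranks as more sensitive.
    """
    return sorted(limits, key=lambda name: limits[name][1] - limits[name][0])


# Hypothetical coefficient ranges, not values from Table 2.
limits = {
    "Road Density": (0.10, 0.90),
    "Poverty Rate": (0.40, 0.45),
    "Literacy Rate": (0.30, 0.50),
}
ranking = rank_by_range(limits)
```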

Selection of the Most Significant Indicators
The most significant indicators are selected by eliminating indicators from the 'ranked indicators', which are determined by applying non-linear programming (Table 2). The elimination criterion is based on measuring a 'statistically significant' change (10% dissimilarity between the c1 and c2 levels) when one indicator is 'eliminated' from the system. To evaluate how much effect a single parameter can have on the overall analysis, the analysis is performed repeatedly by excluding the parameters one at a time, while keeping the total number of 'used' parameters the same for each case. The elimination process starts by excluding the least significant indicator, and the process is repeated by eliminating each indicator one at a time. To measure a statistically significant change of the risk score (when one indicator is eliminated from the system) from the base risk score, the Kolmogorov-Smirnov test is applied to the difference between the two risk scores. Both risk scores are cumulatively distributed, and the maximum difference between the two cumulative risk scores gives the percentage of dissimilarity. Table 4 presents the elimination list of indicators, and Figure 4 shows the insignificant change due to the elimination process.
For the fifth elimination, when the second least sensitive indicator from the sensitivity domain (Disabled People) is excluded, the dissimilarity is found to be above 10% of the base risk score, as shown in Figure 5. Similarly, when the fourth least sensitive indicator from the adaptive capacity domain (Literacy Rate) is eliminated, more than 10% dissimilarity is calculated (Figure 6). Thus, these two parameters show significant dissimilarity from the base risk score, which entitles them to be included among the most sensitive indicators.
Completion of the statistical analysis gives the final set of indicators (19 socioeconomic indicators are finally selected from the list of 23 described in Section 2.3), which are the most sensitive socioeconomic indicators for risk assessment in the study area (Table 5).
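The one-at-a-time elimination can be sketched as a loop that drops each indicator in turn from the full set, recomputes the risk scores, and keeps the indicator only if the Kolmogorov-Smirnov distance from the base scores exceeds the 10% criterion. The aggregation used here (a plain mean of indicator scores), the indicator names, and the data are hypothetical stand-ins; the study computes risk from its non-linear programming system.

```python
from bisect import bisect_right

THRESHOLD = 0.10  # 10% dissimilarity criterion stated in the text


def ks_distance(sample_1, sample_2):
    """Largest gap between the empirical cumulative distributions."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    return max(abs(bisect_right(s1, v) / len(s1) - bisect_right(s2, v) / len(s2))
               for v in s1 + s2)


def risk_scores(rows, names):
    """Hypothetical aggregation: mean of the chosen indicators per unit."""
    return [sum(row[name] for name in names) / len(names) for row in rows]


def most_sensitive(rows, ranked_least_first):
    """Keep indicators whose removal changes the scores by more than THRESHOLD."""
    base = risk_scores(rows, ranked_least_first)
    kept = []
    for name in ranked_least_first:
        # Each trial removes exactly one indicator from the full set.
        remaining = [other for other in ranked_least_first if other != name]
        if ks_distance(base, risk_scores(rows, remaining)) > THRESHOLD:
            kept.append(name)
    return kept


# Hypothetical scores for four spatial units; "c" is redundant by construction.
rows = [
    {"a": 0.25, "b": 0.75, "c": 0.5},
    {"a": 0.5, "b": 0.0, "c": 0.25},
    {"a": 1.0, "b": 0.5, "c": 0.75},
    {"a": 0.125, "b": 0.125, "c": 0.125},
]
selected = most_sensitive(rows, ["c", "b", "a"])
```

In this toy data, indicator "c" always equals the mean of "a" and "b", so removing it leaves the scores unchanged and it is eliminated, while "a" and "b" are retained.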

Implication of the Most Sensitive Indicators
The most sensitive domain-specific indicators selected in Table 5 are applied to assess storm surge risk in the study area (Figure 7). The risk map shows risk zones varying from very high to very low (Figure 7). The risk map generated in this way by using the most sensitive indicators can be used in risk-based planning. If we consider 'the most sensitive indicators' as the 'most sensitive sectors' that generate high risks due to 'inadequate investment', then policy makers can decide in which sector they will invest to minimize risk in a location. As the sensitivity of these indicators is determined from a system approach, policy makers can view the system response from the generated risk map after an investment is made in a specific sector that is considered the most sensitive for the system. In this way, investment in a less sensitive sector can be avoided in risk-based planning. The method can be made dynamic by re-computing the most sensitive indicators under changed biophysical and socioeconomic settings.

Conclusions and Recommendations
The sensitivity analysis made in this study shows that non-linear programming is effective in selecting the most sensitive indicators for risk assessment. It helps avoid multi-collinearity and disturbance among the parameters. It provides appropriate insight into the problems associated with the system under constraints. Using this method, it is possible to assess how sensitive a solution is to any change in one or more parameters. In this research, for sensitivity analysis, risk is considered as an objective function under some inequality and equality constraints that are developed with the values of each parameter (indicator) for each spatial unit for different domains, i.e., exposure, sensitivity, adaptive capacity, and hazards. This system is solved using the KKT conditions. This creates a ranking among the indicators. For statistical analysis, a Kolmogorov-Smirnov test is performed to select the set of indicators that are most sensitive for the system. This test is performed by eliminating indicators one at a time from the ranked indicators. By repeating the process for each indicator, the most sensitive indicators for quantitative risk assessment are found. This method was applied to the Bangladesh coast to determine the most sensitive domain-specific socioeconomic indicators for assessing risk due to the four dominant hazards in the region. The methodology developed in this study can be a useful tool in risk-based planning. Policy makers can decide in which sector to invest to effectively minimize risk in a specific location and can avoid investing in less sensitive sectors. At the same time, they will be able to visualize the system response to investment in the most sensitive sector. The method has limited applicability in regions where data availability is low or data accuracy is questionable.