A Nonparametric Economic Analysis of the US Natural Gas Transmission Infrastructure: Efficiency, Trade-Offs and Emerging Industry Configurations

Abstract: This paper presents a study aimed at measuring the efficiency of the transmission segment of the US natural gas industry from an economic perspective. The gas transmission infrastructure is modeled as an economic production function, and a multi-stage modeling approach based on Data Envelopment Analysis is employed to obtain an efficiency measure in a two-dimensional performance space, i.e., cost and revenue efficiency. This approach allows conflicting business goals to be taken into account. The study also performs cluster analysis to uncover homogeneous efficiency profiles of the gas transmission systems and to explore determinants of efficiency rates and trade-off situations. A sample of 80 US gas transmission systems is used in the analysis. Results indicate that the transmission segment of the US gas industry has considerable inefficiencies: average cost and revenue-efficiency scores are 0.324 and 0.301, respectively, and only three transmission systems achieve high scores on both efficiency dimensions. Cluster analysis identified seven configurations. In three of them there are no trade-offs between cost and revenue efficiency; however, in only one of them do gas transmission systems have high efficiencies. The remaining four configurations exhibit trade-offs of differing intensity. Such trade-offs can be determined by the size of the gas transmission infrastructure.


Introduction
About one quarter of the United States' energy needs depend on natural gas supply. According to estimates from the US Energy Information Administration, total natural gas consumption in 2016 was 27,485,517 million cubic feet [1]. Natural gas is generally used as fuel in natural gas processing plants, as vehicle fuel, and in private dwellings, including apartments, for heating, air-conditioning, cooking, water heating, and other household uses [2].
Since the beginning of the 2000s, the supply of natural gas has played an important role in the US energy strategy, ensuring that the country's economy relies increasingly on a diversified mix of energy sources. Indeed, between 2001 and 2015, US natural gas production increased by more than 40%, whereas its city-gate price fell by about 25%, making natural gas a more competitive source in the energy market [3,4].
As in other countries, the transmission segment of the natural gas industry is regulated at both the federal and local levels. At the federal level, the regulating entities are the Federal Energy Regulatory Commission (FERC), the Pipeline and Hazardous Materials Safety Administration (PHMSA), the Occupational Safety and Health Administration (OSHA), the Transportation Safety Administration (TSA), and the Environmental Protection Agency (EPA), whereas at the local level there are a number of public service or public utility commissions whose main task is to control that the local distributors choose […]
• 210 interstate and intrastate gathering and transmission pipelines that extend for more than 320,000 miles around the country (including about 20,000 miles of gathering pipelines);
• more than 1400 compression stations, 11,000 delivery points, 1400 interconnection points, 5000 receiving points, 24 hubs, 400 natural gas storage facilities, and eight liquefied natural gas (LNG) facilities.
On average, the transmission infrastructure moves 70 billion cubic feet of natural gas to 1300 local distribution companies that sell this commodity to more than 71 million customers, e.g., households, commercial and industrial firms.
According to forecasts by the US Energy Information Administration, demand for natural gas will double by the end of 2030. To support such growth in gas consumption, the gas transmission infrastructure has developed by a factor of 100 in the last 50 years, and the positive trend will continue at a rate of about 7% per year over the next two decades. As about half of the natural gas transmission network was built between the 1950s and 1960s, additional investment is needed to keep the infrastructure in operation.
Network efficiency is an important goal both in the planning of new transmission infrastructure and in the management of the existing one. However, while engineers are generally more interested in increasing the thermal, compression and hydraulic efficiencies of pipelines, improving the overall economic efficiency of the natural gas network should be a major concern of policy makers, as higher economic efficiency is usually related to lower costs and lower prices to customers [7].
This paper presents an efficiency study of the transmission segment of the US natural gas industry from an economic perspective. The gas transmission infrastructure is modeled as an economic production function and its economic efficiency is defined as the ratio of weighted outputs to weighted inputs. The resources necessary to operate the transmission network (i.e., pipelines, compressor stations, people, etc.) should be combined in the most efficient way to provide competitive and cost-effective movement of natural gas from one location to another. Hence, more output per unit of input indicates greater efficiency.
The study implements Data Envelopment Analysis (DEA) and adopts a multi-stage modeling approach to generate an efficiency measurement in a two-dimensional space for a sample of US natural gas transmission systems, taking into account conflicting business goals. Furthermore, the study performs cluster analysis to uncover homogeneous efficiency profiles of the gas transmission systems, which help explore determinants of high efficiency and trade-off situations. The paper is organized as follows: Section 2 presents a literature review focused on efficiency measurement in the gas industry by using DEA. Section 3 introduces the multi-stage modeling approach, while Section 4 presents the DEA method. Section 5 provides information on the sample, the data and the variables of the DEA model specifications adopted to conduct the efficiency analysis. Results of the study are reported in Section 6. Finally, conclusions and limitations are discussed in the last section.
Energies 2018, 11, 519

Literature Review
A large number of papers consider the measurement of economic efficiency in the energy industry. In particular, some scholars performed extensive literature reviews of papers that used DEA to deal with different efficiency measurement issues in the field of energy generation and management [8][9][10][11]. However, relatively few papers provide DEA-based efficiency measurements for the gas industry specifically, and most of these efficiency studies focus on the distribution segment of the industry. Hollas et al. [12] investigated the impact on industry efficiency of the Natural Gas Policy Act of 1978 and of the policies of the US Federal Energy Regulatory Commission (FERC) that increased the level of competition. The authors employed DEA to examine the economic efficiency of gas distribution companies between 1975 and 1994. Results showed that the reduction of scale economies did not modify the operators' economic efficiency. Hawdon [13] employed bootstrapped DEA to estimate the efficiency of the natural gas industry in 33 countries. The author found that national industry efficiencies are significantly affected by rising or falling gas sales. Moreover, results support the assumption that the energy market reforms that occurred in some countries (for instance, the UK) positively affected the efficiency of the gas industries and improved the utilization of labor and capital. Jamasb et al. [14] used non-parametric DEA and regression analysis to study the impact of US gas industry regulatory reform on static efficiency and productivity for a panel of US interstate companies. The sample contains 39 companies observed from 1997 to 2004. Results indicate that regulation stimulated efficiency increases. Erbetta and Rappuoli [15] examined the nature of returns to scale in the Italian natural gas distribution industry by using DEA. 
Results show that scale inefficiencies negatively affect the overall efficiency of gas operators, whereas technology shows increasing returns only for the smallest operators, suggesting that efficiency improvement can be achieved by intensifying the merging and concentration process that characterized the early 2000s. Goncharuk [16] developed three DEA models to calculate efficiency in the distribution segment of the gas industry in Ukraine and the US, comparing 54 Ukrainian and 20 US operators. The author analyzed factors that have an impact on efficiency, e.g., scale, regional location, ownership, etc. The benchmarking study found that Ukrainian gas distribution companies are generally scarcely efficient and that resource consumption could potentially be reduced by 10% to achieve industry efficiency in Ukraine. Zorić et al. [17] conducted a cross-country benchmarking study on a sample of gas distribution utilities in Slovenia, the Netherlands and the UK. This study showed that UK utilities perform better than Dutch and Slovenian utilities, and that the latter are less efficient than Dutch utilities, even though they operate at optimal scale. According to the authors, this efficiency difference might be explained by the more extensive regulation of the UK gas market. Amirteimoori and Kordrostami [18] proposed a Euclidean distance-based measure of efficiency to develop a DEA super-efficiency score that discriminates better between units. The super-efficiency model is used to estimate the efficiency of 25 Iranian gas companies. Nieswand et al. [19] employed PCA-DEA to measure the efficiency of 37 US natural gas transmission companies in 2007. In particular, they implemented two model settings which include the same cost measurement but differ in the number of cost drivers, under the assumption of a variable returns-to-scale technology. 
The adoption of PCA-DEA reduces the number of efficient companies in the sample in comparison to conventional DEA. Ertürk and Türüt-Aşık [20] adopted DEA to evaluate the efficiency of 38 gas management companies in the distribution segment of the Turkish natural gas industry. Technical, allocative and cost efficiencies under the assumptions of both constant and variable returns to scale were calculated to identify reasons for low performance and trajectories of improvement. The authors found that high investment costs are a major cause of inefficiency. Additionally, publicly owned and large operators utilize resources more efficiently. Sadjadi et al. [21] utilized a stochastic super-efficiency DEA model to assess the efficiency of a sample of 27 Iranian provincial gas companies and to rank them. The model is based on a robust optimization technique that may be an alternative to sensitivity analysis and stochastic programming. Marques et al. [22] implemented DEA under the assumption of variable returns to scale to evaluate the efficiency of the Portuguese gas distributors and to identify targets for the 2010-2013 regulatory period. To avoid sample misspecifications, distributors were divided into three groups with different scale factors, and exogenous factors were also included in the study. The cross-section analysis was combined with a dynamic one using panel data methodology. Results suggest that the factors influencing the efficiency of Portuguese gas distributors can differ and depend on the characteristics of the company and the operating context. lo Storto [23] evaluated the operational, density and scale efficiencies of the natural gas distribution industry in Italy by implementing DEA. The empirical analysis considered a sample of 32 natural gas distribution companies. 
Results indicate that the average operational inefficiency in the sample is about 25%, and that scale inefficiency is a major cause of poor performance. Yardimci and Karan [24] measured the efficiency and service quality of a sample of Turkish natural gas distribution companies. Both DEA and statistical methods were used to calculate efficiency, whereas the quality of service was used to rank companies by analyzing the correlation between efficiency and service quality. Results provide insights into the effectiveness of market regulation and the adoption of reward/penalty schemes for setting industry tariffs. Goncharuk and lo Storto [25] performed a cross-country benchmarking study on a mixed sample of natural gas distribution companies in Italy and Ukraine. They used a two-stage DEA procedure to estimate the efficiency of gas providers and to find the critical context factors and policy issues that affect it. Results show that both countries perform poorly with respect to concessionaire operational efficiency and size. However, while increasing efficiency is necessary to reduce cost and improve quality of service, experience indicates that other goals may be critical at different stages of the reform of the industry in both countries.
Scholars who used DEA to measure efficiency in the gas industry generally adopted a "black-box" modeling approach. However, the black-box approach is unable to provide robust efficiency measurements when the units to be evaluated are complex systems and there are conflicting business goals that influence the management decision-making and have an impact on the system performance. The aim of this study is to fill these gaps.

The Measurement of the Economic Efficiency of Gas Transmission Infrastructure
Implementing DEA to measure the efficiency of the gas transmission infrastructure requires that the corresponding production technology be modeled in terms of inputs and outputs. To obtain a more accurate efficiency measurement and to account for both financial and operational issues, the gas transmission production process was split into the following three stages: (a) cost generation; (b) operations management; (c) revenue generation (Figure 1). The outputs of the first production stage (cost generation) are used as inputs of the second production stage (operations management), whereas the outputs of this stage are used as inputs of the third production stage (revenue generation).
Hence, this multi-stage production model of the gas transmission infrastructure is more effective than a conventional one-stage or black-box model to understand how different types of resources (i.e., financial and physical) are sequentially utilized and transformed to produce necessary outputs. Indeed, by adopting this approach the underlying production function is decomposed into three interlinked sub-production functions that capture the same number of different efficiency components, i.e., cost generation-efficiency (CE), operations management-efficiency (OE) and revenue-generation efficiency (RE). At the first stage of this model (cost generation), costs are incurred to carry on the gas transmission business operations, preventive and unplanned maintenance when the infrastructure is utilized. The gas transmission infrastructure is efficient from the cost-generation view if it can be operated and maintained spending the minimum cost. At the next stage (operations management), the model is focused on the service provided by the infrastructure (i.e., gas transmission from one point to another). Efficiency increases when the same volume of natural gas can be delivered by utilizing a physical infrastructure having reduced capability (i.e., length, number of compression facilities). Hence, for a given volume of gas transmitted from one point to another, the lower the infrastructure capability, the higher the operations-management efficiency. The companies operating the transmission infrastructure are generally reluctant to make additional investment to avoid reducing profits. Finally, at the last stage (revenue generation), revenues are generated selling the gas transmission service to the distributing companies. Accordingly, the revenue-generation efficiency is measured as the ratio of the revenues generated selling the gas transmission service to the distributing companies to the volume of gas transmitted. 
The higher the revenues are for a given volume of gas, the higher the revenue-generation efficiency. Whereas the first two efficiency components are measured in terms of input reduction, the third one is measured in terms of output increase. 
Therefore, to implement Data Envelopment Analysis (DEA) and compute efficiency consistently, the multi-stage model was reorganized into two main parts: one having an input orientation (production-oriented segment), which includes the cost-generation and operations-management stages, and one having an output orientation (market-oriented segment), which includes the revenue-generation stage. In particular, Network DEA (NDEA) was used to calculate efficiency in the production-oriented segment of the model and conventional DEA to calculate efficiency in the market-oriented segment. In the next section, both methods are illustrated.

Method
Data Envelopment Analysis (DEA) is a nonparametric method based on the adoption of linear programming techniques commonly used to evaluate the efficiencies of a set of units denominated DMUs (i.e., decision-making units). The efficient DMUs are identified from this set and combined to construct an efficient frontier used as a benchmark to measure the efficiency of inefficient units [26]. Efficiency is measured as the ratio of the weighted sum of output variables to the weighted sum of input variables. The method does not require any assumption about the functional form of the relationship necessary to convert inputs into outputs and the weights utilized to combine them. Hence, the production technology that transforms inputs into outputs is generally considered as a black-box [27].
In this study, gas infrastructure systems are considered as DMUs in the DEA model formulation. We assume there are $n$ DMUs ($j = 1, \ldots, n$) corresponding to the same number of gas transmission systems that should be evaluated. Every DMU consumes varying amounts of $m$ different inputs to produce $r$ different outputs.
In particular, the generic transmission system DMU$_j$ consumes amounts $x_j$ of inputs ($x_{ij}$ with $i = 1, \ldots, m$) and produces amounts $y_j$ of outputs ($y_{kj}$ with $k = 1, \ldots, r$). $x_o \equiv (x_{1o}, \ldots, x_{mo})$ and $y_o \equiv (y_{1o}, \ldots, y_{ro})$ indicate the amounts of inputs and outputs of the gas transmission system identified by DMU$_o$, the unit under evaluation. $X = (x_{ij}) \in \mathbb{R}^{m \times n}$ and $Y = (y_{kj}) \in \mathbb{R}^{r \times n}$, with $X > 0$ and $Y > 0$, respectively denote the $m \times n$ input and the $r \times n$ output matrices for the $n$ systems.
The slacks-based measure (SBM) efficiency index is used in the specification of the DEA model because it does not assume proportional changes of inputs or outputs and, consequently, provides more realistic efficiency measurements [28]. Inputs and outputs of DMU$_o$ $(x_o, y_o)$ can be described as follows:

$$x_o = X\lambda + s^-, \qquad y_o = Y\lambda - s^+, \qquad \lambda \ge 0,\; s^- \ge 0,\; s^+ \ge 0,$$

where $s^-$ and $s^+$ are respectively the input and output slack variables, and $\lambda$ is a nonnegative vector in $\mathbb{R}^n$. When output is increased by $s^+$ and/or input is decreased by $s^-$, DMU$_o$ can achieve full efficiency. Further, we assume that the production function of the gas transmission infrastructure exhibits variable returns to scale (VRS) in all production segments because of the great variance in infrastructure size across the sample.
Two DEA models are specified, one for the input-oriented segment and one for the output-oriented segment of the production model of the transmission infrastructure.
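The slack representation used by the SBM index can be illustrated with a minimal sketch. The function names and the three-unit data set below are hypothetical and are not taken from the study; the code only evaluates the slacks and the resulting SBM ratio for a given intensity vector, whereas the actual efficiency score would be obtained by optimizing over all feasible intensity vectors.

```python
def slacks(X, Y, o, lam):
    """Input and output slacks of DMU `o` for a given intensity vector `lam`.

    X is an m x n list of input rows, Y an r x n list of output rows
    (one column per DMU), following x_o = X*lam + s_minus and
    y_o = Y*lam - s_plus.
    """
    n = len(lam)
    s_minus = [X[i][o] - sum(X[i][j] * lam[j] for j in range(n))
               for i in range(len(X))]
    s_plus = [sum(Y[k][j] * lam[j] for j in range(n)) - Y[k][o]
              for k in range(len(Y))]
    return s_minus, s_plus


def sbm_ratio(X, Y, o, lam):
    """SBM ratio at a feasible point (not the optimized efficiency score)."""
    s_minus, s_plus = slacks(X, Y, o, lam)
    if min(lam) < 0 or min(s_minus) < -1e-12 or min(s_plus) < -1e-12:
        raise ValueError("lam does not yield nonnegative slacks for this DMU")
    m, r = len(X), len(Y)
    numerator = 1 - sum(s_minus[i] / X[i][o] for i in range(m)) / m
    denominator = 1 + sum(s_plus[k] / Y[k][o] for k in range(r)) / r
    return numerator / denominator


# Hypothetical data: one input and one output observed for three DMUs.
X = [[2.0, 4.0, 4.0]]
Y = [[2.0, 4.0, 2.0]]

# DMU 2 compared with DMU 1 (lam sums to one, as required under VRS).
s_minus, s_plus = slacks(X, Y, 2, [0.0, 1.0, 0.0])
ratio = sbm_ratio(X, Y, 2, [0.0, 1.0, 0.0])
```

In this made-up case the third unit uses the same input as the second but delivers half of its output, so the whole inefficiency appears as an output slack.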

Production Oriented Segment
The weighted network slacks-based measure (NSBM) model proposed in the literature is used to evaluate the efficiencies of gas transmission systems in the production-oriented segment [29]. We consider a multiple-stage production process consisting of $T$ production stages ($t = 1, \ldots, T$) and assume there are $m_t$ inputs and $r_t$ outputs at stage $t$. The link from stage $t$ to stage $h$ and the set of links are denoted by $(t, h)$ and $L$, respectively. The observed inputs to DMU$_j$ at stage $t$ are $\{x_j^t \in \mathbb{R}_+^{m_t}\}$ ($j = 1, \ldots, n$; $t = 1, \ldots, T$) and the observed outputs from DMU$_j$ at stage $t$ are $\{y_j^t \in \mathbb{R}_+^{r_t}\}$ ($j = 1, \ldots, n$; $t = 1, \ldots, T$). The observed data that measure the linking intermediate products from stage $t$ to stage $h$ are $\{z_j^{(t,h)}\}$ ($j = 1, \ldots, n$; $(t, h) \in L$), nonnegative vectors whose dimension is the number of items in link $(t, h)$. In addition, we assume that intermediate links are freely determined.
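The input-oriented efficiency $\theta_o^*$ of DMU$_o$ can be evaluated by solving the following linear program:

$$
\begin{aligned}
\theta_o^* = \min \quad & \sum_{t=1}^{T} w_t \left[ 1 - \frac{1}{m_t} \sum_{i=1}^{m_t} \frac{s_i^{t-}}{x_{io}^{t}} \right] \\
\text{s.t.} \quad & x_o^t = X^t \lambda^t + s^{t-} \quad (t = 1, \ldots, T), \\
& y_o^t = Y^t \lambda^t - s^{t+} \quad (t = 1, \ldots, T), \\
& e \lambda^t = 1 \quad (t = 1, \ldots, T), \\
& Z^{(t,h)} \lambda^h = Z^{(t,h)} \lambda^t \quad (\forall (t, h) \in L), \\
& \lambda^t \ge 0, \; s^{t-} \ge 0, \; s^{t+} \ge 0 \quad (\forall t),
\end{aligned}
$$

where $w_t$ is the relative weight of production stage $t$, with $\sum_{t=1}^{T} w_t = 1$ and $w_t \ge 0$ ($\forall t$); $\lambda^t \in \mathbb{R}_+^n$ is the intensity vector related to production stage $t$ ($t = 1, \ldots, T$); $X^t$, $Y^t$ and $Z^{(t,h)}$ collect the stage inputs, stage outputs and link data of the $n$ DMUs; $s^{t-}$ and $s^{t+}$ are the input and output slacks at stage $t$; and $e$ is a row vector of ones.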
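To make the role of the stage weights concrete, the following minimal sketch evaluates the standard input-oriented NSBM objective, i.e., the weighted sum over stages of one minus the average relative input slack. It assumes that the stage slacks have already been obtained from an LP solver; the function name and all numbers are hypothetical and do not come from the sample used in this study.

```python
def nsbm_input_objective(weights, inputs, slacks):
    """Weighted input-oriented NSBM objective for one DMU.

    weights: stage weights w_t (nonnegative, summing to one)
    inputs:  inputs[t][i] is input i consumed at stage t
    slacks:  slacks[t][i] is the corresponding input slack at stage t
    """
    if abs(sum(weights) - 1.0) > 1e-9 or min(weights) < 0:
        raise ValueError("stage weights must be nonnegative and sum to one")
    total = 0.0
    for w_t, x_t, s_t in zip(weights, inputs, slacks):
        # Average relative slack over the inputs of this stage.
        mean_relative_slack = sum(s / x for s, x in zip(s_t, x_t)) / len(x_t)
        total += w_t * (1.0 - mean_relative_slack)
    return total


# Hypothetical two-stage unit with one input per stage and equal weights.
theta = nsbm_input_objective(
    weights=[0.5, 0.5],
    inputs=[[2.0], [4.0]],
    slacks=[[0.5], [2.0]],
)
```

With these made-up values the first stage could save 25% of its input and the second 50%, so the weighted score is 0.5 × 0.75 + 0.5 × 0.5 = 0.625.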

Market Oriented Segment
Efficiencies of gas transmission systems in the output-oriented segment of the multi-stage production model are calculated by implementing a conventional DEA model, under the assumption of output orientation (maximization) [30].
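Under the assumption of output orientation and variable returns to scale, in the SBM model the efficiency of DMU$_o$ $(x_o, y_o)$ can be measured by solving the following fractional program [28]:

$$
\rho_o^* = \min_{\lambda,\, s^-,\, s^+} \; \frac{1}{1 + \dfrac{1}{r} \sum_{k=1}^{r} \dfrac{s_k^+}{y_{ko}}}
\quad \text{s.t.} \quad x_o = X\lambda + s^-, \quad y_o = Y\lambda - s^+, \quad e\lambda = 1, \quad \lambda \ge 0,\; s^- \ge 0,\; s^+ \ge 0.
$$

Variables $s^-$ and $s^+$ measure the distance of the DMU's inputs and outputs from the inputs $X\lambda$ and outputs $Y\lambda$ of a virtual unit. When $s_o^+ = s_o^- = 0$, $1/\rho = 1$ and DMU$_o$ is efficient.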
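For the special case of a single input and a single output, the output-oriented SBM score under VRS can be sketched without an LP solver. The code below is an illustration under that assumption, not the procedure used in the study: it enumerates the vertices of the feasible region (single units and convex combinations of two units), which is sufficient because the problem has only two constraints besides nonnegativity. The function name and the four-unit data set are hypothetical.

```python
def output_sbm_vrs_1x1(x, y, o):
    """Output-oriented SBM score under VRS with one input and one output.

    The score equals y[o] divided by the largest output attainable by a
    convex combination of observed units using at most x[o] of input.
    """
    n = len(x)
    best = y[o]
    # Vertices with one active unit: any unit using no more input than DMU o.
    for j in range(n):
        if x[j] <= x[o]:
            best = max(best, y[j])
    # Vertices with two active units whose combination uses exactly x[o].
    for j in range(n):
        for k in range(n):
            if x[j] < x[o] < x[k]:
                share = (x[o] - x[j]) / (x[k] - x[j])
                best = max(best, (1 - share) * y[j] + share * y[k])
    return y[o] / best


# Hypothetical normalized data: gas volume (input) and revenue (output).
volume = [1.0, 2.0, 4.0, 3.0]
revenue = [1.0, 3.0, 4.0, 2.0]
scores = [output_sbm_vrs_1x1(volume, revenue, o) for o in range(len(volume))]
```

In this made-up example the first three units lie on the frontier, while the fourth is benchmarked against an equal mix of the second and third units, which would yield 3.5 units of revenue from the same volume.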

Data, Variables and DEA Model Specifications
This study utilized data on US natural gas transmission systems in 2012 that are available in the Oil & Gas Journal [31,32]. The research sample includes 80 gas transmission systems, chosen on the basis of data availability for all variables used in the efficiency analysis, which is about 49% of the total number of transmission systems reported by [31]. Table 1 displays the main statistics of the variables for the production-oriented segment (model 1) and the market-oriented segment (model 2) of the multi-stage production model of the transmission infrastructure. The figures show that there is considerable variance among the transmission systems in the sample. For instance, the length of the transmission system varies from 28 miles to 14,807 miles, while the number of compression stations ranges between 0 and 121. Three types of variables were used: "pure" inputs, "pure" outputs and "mixed" (or intermediate) variables. The first two types include variables which are used only as inputs or only as outputs, while the third type includes variables used either as inputs or as outputs, depending on the DEA model specification and the stage considered.
Legend: model 1 = production-oriented segment; model 2 = market-oriented segment; MMcf = million cubic feet.
In model 1 there are one input, one output and two mixed variables shared between stage 1 and stage 2. In particular, the amount of operating and maintenance expenses ($X_1$) and the gas volume transmitted ($Z_3$) were introduced in the analysis respectively as input in stage 1 and output in stage 2 of model 1, while the transmission system length ($Z_1$) and the number of compression stations ($Z_2$) were used as outputs in stage 1 and inputs in stage 2 of the same model. Model 2 contains one input and one output, respectively the gas volume transmitted ($Z_3$) and the operating revenue ($Y_1$). Observed data for each variable were normalized by dividing them by their means. Additionally, as variables $Z_1$ and $Z_2$ showed high correlation, the method proposed by [33] was adopted to merge them by performing Principal Component Analysis (PCA). The new composite variable was used as a proxy of the infrastructure capacity [34]. Table 2 summarizes information on the two DEA models adopted in the study. Variable $Z_{[1,2]}$ was constructed as the weighted average of $Z_1$ and $Z_2$ by utilizing the PCA eigenvectors as weights.
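The two preprocessing steps described above, normalization by the mean and the merging of two correlated variables through PCA, can be sketched as follows. This is a minimal illustration under stated assumptions: the data are invented, the function names are hypothetical, and rescaling the leading eigenvector so that its components sum to one is an assumed way of obtaining averaging weights, since the exact scheme of [33] is not reproduced here.

```python
import math


def normalize_by_mean(values):
    """Divide every observation by the sample mean of the variable."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]


def first_pc_weights(a, b):
    """Leading eigenvector of the 2 x 2 covariance matrix of (a, b),
    rescaled so that the two weights sum to one (an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((u - mean_a) ** 2 for u in a) / (n - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (n - 1)
    cov_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)
    if cov_ab <= 0:
        raise ValueError("variables must be positively correlated to be merged")
    # Closed-form orientation of the principal axis of a symmetric 2 x 2 matrix.
    angle = 0.5 * math.atan2(2 * cov_ab, var_a - var_b)
    v_a, v_b = math.cos(angle), math.sin(angle)
    return v_a / (v_a + v_b), v_b / (v_a + v_b)


# Hypothetical observations for four transmission systems.
length_miles = [100.0, 200.0, 300.0, 400.0]
compression_stations = [2.0, 3.0, 7.0, 8.0]

z1 = normalize_by_mean(length_miles)
z2 = normalize_by_mean(compression_stations)
w1, w2 = first_pc_weights(z1, z2)
capacity = [w1 * u + w2 * v for u, v in zip(z1, z2)]
```

Because both normalized variables have mean one and the weights sum to one, the composite capacity proxy also has mean one, which keeps it on the same scale as the other normalized variables.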

Efficiency Measurement
The efficiencies of the gas transmission infrastructure systems calculated with model 1 and model 2 are displayed in Table 3. The efficiency scores obtained from model 1 are reported in the column named "costEff" and those obtained from model 2 in the column "revEff". In particular, the costEff index provides an aggregate measurement of the efficiencies at the "cost generation" and "operations management" stages of the multi-stage model (i.e., of the overall production-oriented segment).
Natural gas infrastructure systems that are on the efficiency frontier enveloped by model 1 are considered 100% cost-efficient, while those that are on the frontier enveloped by model 2 are identified as 100% revenue-efficient. Both DEA models have high discriminating capability as a relatively small number of gas transmission systems are fully efficient.
Six systems are 100% cost-efficient and four systems are 100% revenue-efficient. Particularly, Transcontinental Gas Pipe Line Co. LLC (Houston, TX, USA) achieves the maximum efficiency in both models, while Tennessee Gas Pipeline Co. is placed on the efficient frontier in model 1 and is very close to the efficient frontier in model 2. These results suggest that cost-efficiency and revenue-efficiency can be two compatible business goals to achieve at the same time.
In both segments of the multi-stage production model of the gas infrastructure systems, efficiency scores are extremely variable. Indeed, the standard deviations displayed in Table 3 are relatively high and very close to the means. Moreover, average efficiencies are very low, 0.324 (cost efficiency) and 0.301 (revenue efficiency) respectively, whereas minimum efficiencies are 0.018 and 0.007, emphasizing the low efficiency of the transmission segment of the US natural gas industry. Even though the cost and revenue efficiencies have similar means and standard deviations, the two indexes behave very differently and there is no correlation between them, as their plot in Figure 2 shows. Indeed, this plot supports what emerged from Table 3. Three gas infrastructure systems having both high cost and revenue efficiencies can be easily identified, including Transcontinental Gas Pipe Line Co. LLC and Tennessee Gas Pipeline Co. However, the remaining transmission systems are scattered in the plane and there is no evident association between the efficiency indexes.

Relationship between Cost and Revenue-Efficiencies and Trade-Off Analysis
An inductive approach was implemented to conduct a more in-depth data analysis [35]. Particularly, the clustering method was used as a tool to explore the relationship between the cost and revenue efficiency scores by finding similarities between the natural gas transmission systems regarding their efficiency measurements and finally extracting homogeneous configurations from the sample. In this study, configurations are defined as groups of gas transmission systems that share a common bi-dimensional efficiency profile. The analysis of configurations will help identifying the characteristics of the transmission infrastructure that affect efficiency performance and, particularly, efficiency trade-offs.
The cost-efficiency and revenue-efficiency scores were used as clustering variables, and the k-means clustering algorithm was chosen to identify groups because it is very efficient and easy to implement [36]. However, because the algorithm converges to an arbitrary local optimum and gives no indication of the correct number of groups, the following "robust" clustering procedure was adopted. Several clustering strategies were employed to evaluate the stability of the results. Firstly, for every clustering iteration k, the k-means procedure was run three times with different choices of initial centroids, i.e., the first k, the last k, and k randomly selected observations in the sample. Secondly, the order of the gas transmission systems in the dataset was changed randomly and the clustering procedure was re-run to identify potential outliers that might influence the results. Finally, the VRC index proposed by Calinski and Harabasz [37] was adopted to select the correct number of groups (Appendix A). To this aim, the k-means algorithm was performed iteratively nine times (k = 2, ..., 10) to obtain a wide range of clustering solutions from which to choose the best one.
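The first stability check described above can be sketched as follows. This is a minimal illustration that assumes plain Lloyd-style k-means on the two efficiency scores; the function names and the toy scores are hypothetical and are not taken from the study's dataset or software.

```python
import random


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd-style k-means; returns (labels, centroids)."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centroids[j])))
            for pt in points
        ]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        # Move each centroid to the mean of its members; keep it if the group is empty.
        for j in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


def initial_centroids(points, k, strategy, seed=0):
    """The three seeding strategies named in the text."""
    if strategy == "first":
        return points[:k]
    if strategy == "last":
        return points[-k:]
    if strategy == "random":
        return random.Random(seed).sample(points, k)
    raise ValueError(strategy)


def partition(labels):
    """Label-independent view of a clustering: a set of frozensets of indices."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(i)
    return {frozenset(g) for g in groups.values()}


# Hypothetical (costEff, revEff) scores, NOT the study's data.
scores = [(0.10, 0.10), (0.90, 0.90), (0.10, 0.90),
          (0.12, 0.08), (0.88, 0.92), (0.12, 0.88),
          (0.08, 0.12), (0.92, 0.88), (0.08, 0.92)]

solutions = {
    s: partition(kmeans(scores, initial_centroids(scores, 3, s))[0])
    for s in ("first", "last", "random")
}
# The solution is considered stable if all three seedings give the same groups.
stable = len(set(map(frozenset, solutions.values()))) == 1
```

Comparing partitions rather than raw labels matters because k-means numbers its groups arbitrarily. The second check in the text, re-running after shuffling the observations, could reuse the same functions on a permuted copy of `scores`.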
The clustering procedure identified seven configurations. Table A1 in Appendix B lists the gas transmission systems classified by group. Figure 3 illustrates the output of the VRC analysis. At k = 7, VRC is 121.17 and ω is −57.39, which are the highest VRC and the lowest ω values obtained by iterating the clustering procedure for different k. Table 4 provides summary statistics for all cases, while Table 5 reports the main statistics for each configuration. The large F statistics in the analysis of variance show that the two clustering variables are statistically significant and relevant to identifying homogeneous groups of gas transmission systems. Configurations differ with respect to the number of components and the efficiency profile. The smallest configuration, E, includes only 3 gas transmission systems, which achieve high efficiency scores in both the production and market orientation perspectives. That is consistent with the plot in Figure 2 and supports the idea that "no trade-off situations", in which different performance goals are compatible, can be found in the US gas transmission market, although they are rare. This group includes only 3.75% of the sample systems. These gas transmission systems can be considered excellent. Nonetheless, important "trade-off situations" are not frequent either. Indeed, the figures in Table 5 indicate that only in configurations C and G are there critical trade-offs between cost and revenue-efficiency scores. These groups contain 8 and 7 gas transmission systems, respectively 10% and 8.75% of the sample. The largest share of gas transmission systems in the sample achieves low efficiency rates in both segments of the multi-stage production model, as in the case of configurations A and B, which together include half of the sample.
As these transmission systems are low performing in comparison to other systems in the sample, the operating companies should carry out a more in-depth technical and business analysis to identify the determinants of their low efficiency. Finally, there are two configurations, D and F, in which a partial trade-off exists between the two efficiency measurements. In Figure 4, the plot of the costEff and revEff score centroids for each configuration clearly emphasizes the different situations.
Table 5. Statistics of clustering variables relative to pipeline infrastructure configurations.
In order to understand why some configurations are more efficient than others and are not related to trade-off situations, additional investigation was carried out taking into account the DEA model variables and a set of key structure and performance indicators (KSPIs). Specifically, KSPIs were obtained as ratios by dividing the following model variables by the transmission system length: no. of compression stations, gas volume trans. for others, operating & maintenance expenses, and operating revenue. Measurements relative to each KSPI were normalized by dividing them by the maximum to obtain scores in the range 0-1. Such indicators provide useful information about the structural and performance characteristics of the gas transmission infrastructure independently of the transmission network size. Table 6 indicates that, on average, the gas transmission systems belonging to configuration E have a larger structural and operational size. The average infrastructure length is 10,363 miles and the average number of compression stations is 66, whereas the average volume of gas transmitted is 2,885,485 MMcf. Table 7 shows that gas transmission systems belonging to configuration E have the lowest number of compression stations per mile. However, the other KSPI values, as a whole, do not seem to indicate that infrastructure systems in this configuration achieve better operational performance than systems belonging to the other configurations.
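The KSPI construction described above (a per-mile ratio, then division by the sample maximum) can be illustrated with a short sketch. The field names and the three systems below are hypothetical placeholders, not values from Tables 6 or 7, and the sketch assumes strictly positive lengths and at least one positive ratio per variable.

```python
def kspi_scores(systems, variables, length_key="length_miles"):
    """Per-mile ratios for each variable, rescaled by the sample maximum to lie in 0-1."""
    scores = {}
    for var in variables:
        # Step 1: divide the model variable by the transmission system length.
        ratios = {name: rec[var] / rec[length_key] for name, rec in systems.items()}
        # Step 2: normalize by the largest ratio observed in the sample.
        top = max(ratios.values())
        scores[var] = {name: r / top for name, r in ratios.items()}
    return scores


# Hypothetical systems with made-up figures, for illustration only.
systems = {
    "S1": {"length_miles": 1000.0, "compression_stations": 10, "om_expenses": 5000.0},
    "S2": {"length_miles": 500.0, "compression_stations": 10, "om_expenses": 1000.0},
    "S3": {"length_miles": 2000.0, "compression_stations": 10, "om_expenses": 20000.0},
}

kspis = kspi_scores(systems, ["compression_stations", "om_expenses"])
```

In this toy sample all three systems have the same number of compression stations, yet the shortest one (S2) gets the top stations-per-mile score, which is the size-independence the indicators are meant to provide.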

As discussed above, configurations A and B exhibit low efficiency measurements. The two groups include gas transmission systems that, on average, have different sizes (2346 and 712 miles, respectively), both smaller than the systems in group E. Nevertheless, they have a high number of compression stations per mile along the transmission network and greater operations and maintenance expenses per mile than systems belonging to other configurations. Configurations C and G, for which Figure 4 clearly highlighted an important trade-off between the two efficiency dimensions, also contain infrastructure systems that on average have rather different lengths (483 and 3335 miles). However, both configurations have a similar ratio of number of compression stations to gas transmission network length (0.65 vs. 0.73). Furthermore, the figures in Table 7 emphasize the considerable difference in the remaining KSPIs, consistently with the configurations' efficiency indexes and trade-off typologies. Configurations D and F differ significantly with respect to the average length of their transmission systems, 562 and 6174 miles respectively. However, the KSPI measurements in configuration F are on average lower than those in configuration D, especially the first two. While in these configurations there is a partial efficiency trade-off, on average infrastructure systems in configuration F achieve higher efficiency scores and the trade-off between the two efficiency dimensions is decidedly smaller. Figure A1 in Appendix C summarizes information relative to individual configurations. The previous analysis suggests that the transmission network size may have an important weight in the occurrence of trade-off situations and, in particular, that increasing length may reduce potential trade-offs and improve performance at the same time.

Conclusions
The primary objective of the study presented in this paper was to measure the efficiency of the transmission segment of the US natural gas industry. To address this research objective, a multi-stage modeling approach based on the implementation of DEA was adopted, and the efficiency of a sample containing 80 US gas transmission systems was measured in a two-dimension performance space, i.e., cost-efficiency and revenue-efficiency. The proposed multi-stage model of the gas transmission system is more effective than the traditional black-box approach because it clarifies how different types of inputs are sequentially utilized and converted into outputs, and it accounts for the different technical and business goals, often conflicting, that the operating companies have to deal with. Hence, the approach is especially apt for benchmarking and efficiency studies, as it allows analyzing the relationship between several efficiency measurements.
Using cost and revenue efficiencies as grouping variables, cluster analysis was employed to identify homogeneous configurations of gas transmission systems having similar efficiency profiles and to investigate the existence of efficiency trade-offs. Furthermore, key structure and performance indicators (KSPIs) were measured to gain insights into the configurations extracted from the sample.
Findings indicate that the transmission segment of the US natural gas industry has considerable inefficiencies, with average cost and revenue-efficiency scores of 0.324 and 0.301, respectively. Only 3 systems achieved high scores on both efficiency dimensions. Cluster analysis uncovered a total of 7 configurations, in 3 of which there are no trade-off situations between cost and revenue efficiencies. However, in only one of these configurations do gas transmission systems achieve high efficiency rates. The remaining four configurations exhibit trade-off situations of different intensity. Finally, the analysis of KSPIs suggests that trade-offs can be determined by the transmission infrastructure size.
Although this study is explorative in nature, it makes important contributions to the literature on benchmarking and efficiency analysis in the energy industry, offers insights concerning performance trade-offs to the managers of the gas transmission operating companies, and provides useful information for policy and industry regulation. In particular, the empirical results have shown that the use of individual indicators rather than a comprehensive efficiency measurement may be misleading, as it provides inconsistent indications. At the same time, the adoption of a multiple-dimension efficiency modeling framework capable of dealing with conflicting technical and business goals may help to identify trade-off situations.
The production model developed to measure the efficiency of the gas transmission infrastructure might benefit from the inclusion of additional variables. The efficiency score in the market-oriented segment of the gas transmission infrastructure production model is influenced by the natural gas price. Indeed, the revenues of the companies that operate the gas transmission infrastructure systems depend considerably on gas prices, which can vary greatly across the US territory. Such differences are determined by the interaction of several factors: the distance from the area where the gas is produced to the area where it is utilized, the availability and capacity of the transmission pipeline to move gas from the producing areas, the physical storage capacity and trading hubs, state regulations, the level of direct and indirect competition, and the volatility of gas consumption. Storage can play an important role in mitigating price volatility, together with the adoption of contractual agreements and financial hedging instruments. In the evaluation of the infrastructure system efficiency, the physical storage of gas is as relevant as the pipeline and the compression facilities because it plays a key role as a mechanism for providing flexibility in the market and, ultimately, influences short-term gas price volatility [38]. Both interstate and intrastate gas transmission companies rely on gas storage to maintain contractual balance, perform load balancing in order to preserve the operational integrity of the pipeline, and regulate the level of gas supply over periods of fluctuating demand. Increasing storage capacity usually requires large investments, and storing gas may be relatively expensive and risky because of gas price volatility. Hence, the storage capacity of the gas transmission infrastructure system is an important variable that should be included in the efficiency analysis.
Additional research might consider how the efficiency of the gas transmission infrastructure system is influenced by the interaction between price volatility and storage level. This study has not taken into account the storage capacity provided by operators of the gas infrastructure systems or by independent companies, as data on this variable were not available for most of the gas transmission systems included in the sample.
Future research should account for heterogeneities in the sample. This study neither distinguished between "intra-state" and "inter-state" gas transmission systems, nor considered the ownership structure of the operating companies. These heterogeneities can be important moderating factors of the efficiency-profile configuration-structure relationship. To deal with such heterogeneities, a meta-frontier SBM DEA modeling approach might be adopted [39]. The efficiency analysis used input and output data relative to fiscal year 2012. A dataset covering a wider time interval is necessary to confirm the results that emerged from the configuration analysis, because changes in the industry structure and in energy market demand affect how inputs and outputs can be efficiently combined and, ultimately, influence the performance of the gas transmission infrastructure. Dynamic analysis also helps account for the influence of technical progress on efficiency [40].

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A
The VRC index was introduced by Calinski and Harabasz in 1974 to determine the correct number of groups in cluster analysis [37]. Milligan and Cooper [41] showed that using the VRC index to choose the optimal clustering solution generally works well.
If $n$ is the number of data objects to be clustered and $k$ is the number of groups obtained, the $\mathrm{VRC}_k$ index is given by:
$$\mathrm{VRC}_k = \frac{SS_B / (k - 1)}{SS_W / (n - k)}$$
where $SS_B$ and $SS_W$ are the overall between-group sum of squares and within-group sum of squares.
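To determine the correct number of groups, for each clustering solution $k$ the additional index $\omega_k$ is computed as follows:
$$\omega_k = (\mathrm{VRC}_{k+1} - \mathrm{VRC}_k) - (\mathrm{VRC}_k - \mathrm{VRC}_{k-1})$$
The optimal number of groups is chosen by finding a value of $k$ that maximizes $\mathrm{VRC}_k$ and minimizes $\omega_k$.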
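The two indices can be computed with a few lines of code. The sketch below assumes the standard Calinski-Harabasz ratio, VRC_k = (SS_B / (k - 1)) / (SS_W / (n - k)), and the usual second-difference form omega_k = (VRC_{k+1} - VRC_k) - (VRC_k - VRC_{k-1}). The function names and the sample VRC values are illustrative and are not the values reported in Figure 3.

```python
def vrc(points, labels):
    """Variance ratio criterion: (SS_B / (k - 1)) / (SS_W / (n - k))."""
    n = len(points)
    groups = {}
    for pt, lab in zip(points, labels):
        groups.setdefault(lab, []).append(pt)
    k = len(groups)
    dims = range(len(points[0]))
    grand = [sum(pt[d] for pt in points) / n for d in dims]
    ss_b = ss_w = 0.0
    for members in groups.values():
        centre = [sum(pt[d] for pt in members) / len(members) for d in dims]
        # Between-group: size-weighted squared distance of the centre to the grand mean.
        ss_b += len(members) * sum((centre[d] - grand[d]) ** 2 for d in dims)
        # Within-group: squared distance of each member to its own centre.
        ss_w += sum((pt[d] - centre[d]) ** 2 for pt in members for d in dims)
    return (ss_b / (k - 1)) / (ss_w / (n - k))


def omega(vrc_by_k):
    """omega_k for every k whose neighbours k - 1 and k + 1 were also clustered."""
    return {
        k: (vrc_by_k[k + 1] - vrc_by_k[k]) - (vrc_by_k[k] - vrc_by_k[k - 1])
        for k in vrc_by_k
        if k - 1 in vrc_by_k and k + 1 in vrc_by_k
    }


# Toy check of vrc(): two tight groups that lie far apart.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_vrc = vrc(toy_points, [0, 0, 1, 1])

# Made-up VRC values for k = 2..5 (assumption, for illustration only).
vrc_by_k = {2: 40.0, 3: 90.0, 4: 70.0, 5: 65.0}
omegas = omega(vrc_by_k)
best_k = min(omegas, key=omegas.get)
```

Because omega_k needs both neighbours, it is undefined at the ends of the range; with k = 2, ..., 10 as in the study, it can be evaluated only for k = 3, ..., 9.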