Estimating the Renovation Cost of Water, Sewage, and Gas Pipeline Networks: Multiple Regression Analysis to the Appraisal of a Reliable Cost Estimator for Urban Regeneration Works

Abstract: Water, sewerage, and gas infrastructures play a crucial role in optimising the housing quality of buildings and cities. On the other hand, water, sewer, and gas pipelines constantly need maintenance, checks, and repairs. These interventions require large budgets, and therefore scrupulous investment planning is necessary. In this study, Multiple Regression Analysis (MRA) is applied to estimate the urban renovation costs related to the works on water, sewage, and gas networks. The goal is to build a reliable cost estimator that is easy to apply and has a minimum number of explanatory variables. Four regressive models are tested: linear, linear-logarithmic, logarithmic-linear, and exponential. The analysis is implemented on two datasets of projects carried out in Italy: the first collects the data of 19 projects made in historical centres, while the second collects the data of 20 projects made in the peripheries. The variables that impact costs the most are selected. In terms of results, the estimated functions return an average error of 1.25% for historical centres and 1.00% for peripheral areas. The application shows that a differentiation of cost functions based on the urban context is relevant. Specifically, two different functions are detected: exponential for historical centres and linear for peripheral areas. In conclusion, we interpret that the exponential growth of costs in historical centres depends on a series of critical issues (logistical, architectural, etc.), present to a lesser extent in the peripheries, which complicate the execution of the interventions. The approach adopted, which led to the detection of cost functions differentiated based on the urban context, allows us to benefit from more accurate modelling that considers the places' specificities.


Introduction
Cities are composed of a labyrinth of underground pipelines [1]. In particular, water, sewerage, and gas infrastructures play a crucial role in optimising the environmental and housing quality of cities, with particular relevance for residential buildings [2]. These infrastructures have the function of satisfying the fundamental needs of citizens, such as the supply of drinking water to homes, the correct disposal of wastewater and domestic liquid waste, as well as the supply of energy for heating and other household activities, resulting in a significant increase in living comfort [3,4]. A properly functioning urban water system is not only able to guarantee a constant flow of water inside homes, facilitating daily domestic activities, but can also effectively reduce the risk of contamination of drinking water, ensuring the safeguarding of the health of the inhabitants [5]. Adequate sewerage infrastructure is necessary to avoid the accumulation of wastewater, limit any risks of flooding, improve air quality, and reduce potential sanitation problems [6]. Furthermore, the correct design of gas pipelines is essential to guaranteeing the adequate thermal comfort of buildings and the safety of structures through technological solutions capable of avoiding potentially dangerous leaks [7].
The topic of sustainable urban regeneration is inextricably linked to the importance of water, sewerage, and gas infrastructures [8]. The objective of sustainable urban regeneration consists of transforming pre-existing urban areas into environmentally sustainable and resilient contexts to improve the quality of life of the inhabitants and reduce the environmental impact [9,10]. In this process, attention is placed on updating and strengthening existing infrastructure, including water, sewerage, and gas systems. Urban regeneration can promote the sustainable use of water and gas resources through the adoption of technologies aimed at reducing waste and the use of ecologically sustainable technological solutions, thus contributing to the preservation of natural resources. The accurate planning of the renovation and maintenance activities of water, sewerage, and gas infrastructures represents a crucial factor for the notable improvement of the living quality of residential buildings [11]. The correct design of such infrastructures can contribute to improving the health and safety of inhabitants and the overall quality of life [12].
To avoid the deterioration of underground pipelines, municipalities and utilities are forced to bear large costs [13]. Water, gas, and sewage pipelines are constantly in need of maintenance, control, and repairs. These interventions require large budgets, and therefore scrupulous investment planning is necessary [14]. For this reason, it is essential to have reliable and easy-to-interpret cost functions during the programming phase [15]. Regression analysis is one of the most widely used modelling techniques in the construction industry to develop fast and sufficiently reliable cost estimation models [16]. Even in the case of underground infrastructure networks, this technique is often used to estimate the costs of renovation work.
This research examines the potential for the application of Multiple Regression Analysis (MRA) to estimate the costs of urban renovation related to interventions on water, sewerage, and gas pipelines, starting from the limited information available in the planning phase. In addition to the interventions on the networks, the application also includes works involving road and masonry infrastructure. The main goal is to build a simple and reliable cost estimator that works with a minimum number of input variables. The analysis was conducted on projects carried out in Italy. The work aims in part to overcome some of the limitations found in the scientific literature, including the paucity of detailed analyses on the issues related to the simultaneous presence of multiple network infrastructures and the failure to differentiate cost functions according to the relevant urban context (historical centre or peripheral area). Please refer to Section 2 for an in-depth analysis of the reference literature, the limits identified therein, and the criteria adopted for the selection of the independent variables starting from the studies analysed.
In summary, the questions that we intend to answer in this work are the following:
1. What are the minimum explanatory variables needed to make a sufficiently accurate estimate of urban regeneration costs involving multiple infrastructure networks simultaneously?
2. Does it make sense to differentiate cost functions according to the different urban contexts of historical centres and peripheral areas, and what factors eventually influence this differentiation?
The main contributions of the work are as follows:
• Build a unique cost function for those urban restructuring interventions that involve several types of infrastructure (roads and masonries; sewer, water, and gas networks);
• Establish whether it is necessary to differentiate cost functions according to the urban context, adopting different functions for historical centres and peripheral areas;
• Determine, by testing various models (linear, linear-logarithmic, logarithmic-linear, and exponential), the one that returns the most accurate estimate of costs;
• Identify the minimum number of explanatory variables that can explain the model;
• Add to the case studies conducted in Italy on the application of MRA to estimate the renovation costs of sewer, water, and gas infrastructures [17]. This objective is of particular interest given that the execution of works in Italian historical centres (and in those countries characterised by similar urban structures) is not always easy.
The analyses are carried out on two different datasets concerning projects carried out in Italy. The first dataset, already used in a previous study [18], collects data on 19 infrastructure renovations in historical centres carried out between 1996 and 1999. In this work, the cost items for each project were updated through the regional price lists in force. The second dataset, specially built for the present study, records data on 20 infrastructure renovations carried out in peripheral areas between 2012 and 2021. Also, in this case, the cost items have been updated through the regional price lists in force.
The document is structured as follows: Section 2 is devoted to literature studies on the application of MRA for estimating the costs of projects concerning road infrastructure and sewer, water, and gas networks; Section 3 describes the methodological approach followed; Section 4 presents the main results of this study; Section 5 is dedicated to the discussion of the results; and Section 6 shows the main conclusions and future research ideas.

Literature Analysis
For the estimation of the costs of civil and building projects, three families of models are generally identified: models based on Multiple Regression Analysis (MRA), models using Artificial Neural Networks (ANNs), and Case-Based Reasoning models (CBR) [19].
Regression is one of the tasks of Supervised Learning, the branch of Machine Learning that learns the relationship between inputs and outputs so that future outputs can be predicted from the inputs alone. As a predictive technique, MRA has often proven to be a powerful tool, although it is less reliable in describing nonlinear relationships [20][21][22][23]. ANNs are estimators suitable for simulating the behaviour of complicated non-linear relationships. In them, the weights of the connections between the processing elements can be calibrated based on a presented data set. Changing the weights also changes the state of the network, so that the system appears to be "learning". ANNs do not require the specification of a mathematical form [24]. Finally, CBR is a rule-based reasoning technique that consists of solving new problems by adopting solutions used to solve old problems [25,26]. The approach is similar to expert judgments that rely on using experience to solve problems.
In construction cost estimation, MRA and ANN models are among the most widely used and compared with each other. Specifically, it turns out that regression models have a higher rate of convergence than ANNs. In addition, they require less memory space, are easy to use, and are generally less expensive. On the other hand, ANN-based models generally have a better level of accuracy in both regression and classification problems [27]. In addition, ANNs are preferable when the data does not adhere to low-order polynomial forms [28]. For these reasons, recent cost estimation approaches tend to use mainly ANNs as they allow for the investigation of the multiple and non-linear relationships between final costs and independent variables [29,30]. ANNs are commonly used to estimate construction costs, as they are capable of providing precise estimates and can detect complex relationships between costs and the factors involved. However, ANNs are often seen as "black boxes", as they produce results without explaining exactly how they reach these conclusions [24,31]. Since our goal is to understand the different factors that influence costs in historic centres and peripheries, we want to establish a clear relationship between the variables that influence costs in these two contexts. To do this, we chose to use MRA instead of ANNs in the present study. MRA offers greater transparency, as it allows us to examine in detail how independent variables influence costs in historic centres and peripheral areas.
In the estimation of the costs of road works using regression models, many research studies have been reported. In the study by Mahamid (2011), 11 cost estimation models were developed for road construction projects and tested on 131 data sets collected in Palestine [28]. In the work of Mahamid and Bruland (2010), multiple linear regression models were developed to estimate the costs of building roads as a function of the physical characteristics of the project [32]. Han et al. (2008) focused on the budgeting phase in highway construction projects in Korea, also developing cost estimation models [33]. Sodikov (2005) focuses on estimating the costs of highway projects in Poland and Thailand [34]. In the work of Bell and Bozai (1987), multiple linear regression models were developed for estimating long-term costs on behalf of the Alabama Highway Department (AHD) [35].
Also, for estimating the costs related to interventions on the water networks, numerous studies have been reported that use regressive models. In the study by Marchionni et al. (2016), multiple linear regression was implemented to estimate cost functions from a dataset on various Portuguese water supply systems' costs and characteristics (hydraulic and physical) [36]. In the work of Kasaplı (2014), 73 datasets from domestic water projects approved by İlbank Inc. between 2013 and 2017 were employed [37]. Walski (2012) presents a summary of the cost equations for water and wastewater pumping stations [38]. Still in the field of water supply, Fuchs-Hanusch et al. (2012) address two challenges: selecting pipelines for rehabilitation with a long-term cost perspective and predicting pipeline failures and maintenance costs. Statistical analyses and a hazard proportionality model are used [39]. Swamee and Sharma (2008) have developed cost functions for various assets of water distribution systems (pumping stations, piping, tanks) [40]. Clark et al. (2002) propose equations to be used for estimating the costs of construction, expansion, rehabilitation, and repair of the individual components of the water supply distribution system [41].
Regarding the estimation of costs using regressive models for interventions on sewerage networks, many works are also present in the literature. In recent research, Sueri and Erdal (2022) developed regression models for the preliminary cost estimation of sewage networks, starting from data extrapolated from 182 projects [42]. In another work, Marchionni et al. (2014) estimated cost functions for various elements of sewage systems via MRA [15].
Finally, regression models are also widely used for estimating the costs related to interventions in gas networks. In the study by Rui et al. (2011), 412 gas pipeline projects were collected between 1992 and 2008, and statistical methods were used to estimate the distribution of cost overruns [43]. In the study by Rui et al. (2012), the variation in the cost of building pipelines according to specific parameters (diameter, length, and position of pipelines, etc.) is analysed [44]. Kaiser and Liu (2021) analysed statistics on pipeline construction costs made between 2014 and 2019 to define regression models for cost estimation, identifying the minimum number of variables needed to develop robust linear models (path length and line diameter) [45].
From these literature examples, it was possible to identify the main independent variables that can influence the cost of the construction/renovation of network infrastructure. A summary of the above contributions is shown in Table 1, highlighting the type of infrastructure studied, the purpose of the research, the methods used, and the independent variables identified.
For the case study presented in this paper, the independent variables capable of influencing infrastructure cost were selected based on their relevance and recurrence found in the relevant literature but also based on available data extrapolated from the projects analysed. For further discussion of the independent variable selection process, please refer to Section 3.1.3.
As mentioned in the introduction, from the analysis of the literature, at least three limits emerge. First, as shown in the second column of Table 1, regression models are mostly applied to estimate the costs of specific work, while there are rare cases of applications developed for urban renovation interventions that involve simultaneous works on several infrastructure networks (roads, water supplies, sewerages, and gas pipelines). In addition to those in Table 1, additional regression models aimed at predicting the construction/renovation costs of infrastructure types other than water, sewer, and gas networks have been examined, including public-interest facilities and buildings. However, the distinguishing feature of these regression models is their specificity, since they lack an integrated approach that can simultaneously involve multiple infrastructure or building categories [46][47][48][49][50]. In general, works that investigate in detail the issues inherent to multiple network infrastructures considered simultaneously are quite rare. Among the few exceptions is the work of Nuti et al. (2009), who focus on the realistic assessment of the structural safety of complex networks such as electric power, water supply, and roadways following seismic events. This paper highlights commonalities in the modelling and analysis of different infrastructures, showing the feasibility of the approach for analysing complex networks subject to structural hazards. However, economic-financial issues are only hinted at [51]. As a second limit, in most of the applications examined, the same estimated cost function is applied to projects located in different urban contexts [52]. By contrast, in the present study, two different cost functions are tested, one for historical centres and the other for peripheral areas. Finally, the last issue concerns the limited number of case studies carried out in Italy on MRA for estimating the costs of restructuring sewerage, water, and gas infrastructures. These three limitations point to a research gap with respect to the current planning and budgeting requirements of urban regeneration interventions, which are now oriented toward an integrated perspective and strongly contextualised with respect to different urban areas [53]. The present research is intended to bridge this gap.
In the next paragraph, the methodological approach adopted is described.

Methods
The methodological approach is summarised in the following flow chart (Figure 1):
Buildings 2023, 13, x FOR PEER REVIEW

The main steps are described in detail below. The phase of collecting projects for the renovation of service infrastructure in historical centres and peripheral areas was a long-term operation that took several years (from 1999 to 2021). Obtaining the documentation of these renovation projects was a process that involved several stages and actors. In particular, it involved the cooperation of local authorities and public administrations (mainly Italian municipalities and provinces), construction companies, technical consultants, universities, and research institutions. Some of the projects in question have been used as case studies for numerous research papers dealing with issues other than the cost estimation of interventions. Overall, more than 70 projects were identified in the initial phase. However, many of them had data gaps, flat-rate estimates of some parameters, and large documentary deficiencies. Since these were most often pre-feasibility or preliminary studies, the accessible documentation did not always prove to be exhaustive or complete with all the necessary technical and accounting data. It was therefore necessary to select those projects deemed most reliable and for which sufficient meaningful data were available. This selection was made only after carefully analysing all the documents related to each project.

Step 2: Analysis of Project Technical-Accounting Documents and Data Organization
This phase represents a crucial part of the research process. For each project, project plans, technical reports, financial plans, budgets, cost estimates, graphical drawings, and any other relevant documents were carefully analysed. It was then possible to compile a list of the documents available for each project. Next, the key technical, economic, and financial variables found in each document were extracted and recorded in an Excel database. At this stage, projects were also categorised according to urban context.

Step 3: Selection of Explanatory Variables Based on Available Data and Theoretical Relevance Reported in the Literature
In the next step, the most recurrent parameters in most projects were identified. Then, from the reference literature on the topic of cost estimation of underground utility infrastructure, summarised in Section 2 (and in Table 1), it was possible to select the variables that most influence renovation costs. These include information on the size of the projects, the complexity of the infrastructure, and the type of materials used. This step is crucial for understanding which factors significantly influence renovation costs and for selecting the key variables that will be used in the regression analysis. The variables will be presented and discussed in more detail in Section 3.2 below.

Step 4: Selection of Projects for Which the Explanatory Variables Previously Identified Are Present and Measurable
Having identified the variables that were most recurrent and, at the same time, most influential on the cost formation mechanism, it was possible to proceed with the final selection of the projects used for the case study. In essence, the two datasets mentioned in the introduction were constructed. Nineteen infrastructural renovation projects carried out between 1996 and 1999 were chosen for historical centres. These projects, also used for a previous study [18], represent the only complete data source for historical centres, including all variables considered significant. Cost items were updated for each project through the current regional price lists. In Italy, construction techniques have remained virtually unchanged over the past two decades, so the error made by updating work prices by regional price lists appears negligible. For suburban areas, a second dataset was constructed, including 20 renovation projects carried out between 2012 and 2021. Again, cost items were updated through the current regional price lists. The total of 39 projects from which the data underlying this study were collected were carried out in the municipalities of Ferrara, La Spezia, Lecce, Matera, Modena, Reggio Calabria, Reggio Emilia, Rimini, Salerno, and Todi.

Step 5: Classification of Variables for Each Project by Work Categories
In most of the cases analysed, the regeneration interventions of the urban territory are made up of a long series of operations: dismantling of paving elements, possible recovery of reusable materials, dismantling of dilapidated networks, consolidation of tunnels and disposal dens, construction of concrete beds, laying and testing of pipes, creation of covering and protection layers, possible installation of warning tapes, creation of load distribution structures (reinforced concrete screed with electro-welded mesh), creation of waterproofing layers, creation of the bedding layer for the flooring elements (sand and cement cushion), and laying and finishing of the flooring.
For these operations, four categories of work are considered: road and masonry works, sewerage, water supply, and gas network.

Step 6: Implementation of Multiple Regression Analysis
At this point, all that remains is to define for each work category the algebraic form capable of best relating the independent variables to the dependent variable. To this end, an iterative procedure was used based on the analysis and statistical-estimative verification of four regressive models (linear model, linear-logarithmic model, logarithmic-linear model, multiplicative exponential model). The models in question, for each work category, were tested separately on the two reference datasets.

Step 7: Selection of Significant Variables and the Optimal Functional Model for Each Work Category
From the results of the statistical tests, it is possible to determine the significant variables (that satisfy the t-test) for each functional model for the two groups of projects and the four categories of work. Furthermore, it is possible to identify the best functional model (the model that has the highest adjusted R² value).
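Steps 6 and 7 can be illustrated with a minimal sketch that fits the four functional forms by ordinary least squares and ranks them by adjusted R². Several things here are assumptions rather than the authors' procedure: the exact algebraic forms (in particular, the multiplicative exponential model is taken to be log-log), the helper names `fit_ols` and `compare_models`, and the synthetic data. Adjusted R² is computed on the transformed scale, so the comparison between models with different dependent variables (cost versus log cost) is only indicative.

```python
import numpy as np

def fit_ols(X, y):
    """OLS with intercept; returns coefficients, adjusted R^2, and t-statistics."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])            # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    sigma2 = ss_res / (n - k - 1)                   # residual variance
    cov = sigma2 * np.linalg.inv(A.T @ A)           # coefficient covariance
    t_stats = beta / np.sqrt(np.diag(cov))
    return beta, adj_r2, t_stats

# Assumed forms: (transform of regressors, transform of cost).
FORMS = {
    "linear": (lambda X: X, lambda y: y),            # y = b0 + sum(b_i * x_i)
    "linear-logarithmic": (np.log, lambda y: y),     # y = b0 + sum(b_i * ln x_i)
    "logarithmic-linear": (lambda X: X, np.log),     # ln y = b0 + sum(b_i * x_i)
    "exponential": (np.log, np.log),                 # ln y = b0 + sum(b_i * ln x_i)
}

def compare_models(X, y):
    """Fit every form and return the name with the highest adjusted R^2."""
    results = {}
    for name, (fx, fy) in FORMS.items():
        beta, adj_r2, t_stats = fit_ols(fx(X), fy(y))
        results[name] = {"beta": beta, "adj_r2": adj_r2, "t": t_stats}
    best = max(results, key=lambda m: results[m]["adj_r2"])
    return best, results

# Synthetic data (not from the study): 20 projects, two positive regressors.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(20, 2))
y = 5.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(0.0, 0.1, size=20)

best, results = compare_models(X, y)
print(best, round(results[best]["adj_r2"], 4))
```

The t-statistics returned for each form would then be compared with a critical value to decide which variables to retain; that threshold is left out of the sketch.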

Step 8: Construction of the Total Cost Function
Once the optimal functional relationship has been defined for each of the four work categories, the overall cost function for each dataset is obtained by summing the four category functions.
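As a rough illustration of Step 8, the overall cost can be sketched as the sum of the four category-level functions. The function name `total_cost`, the dictionary layout, and every coefficient below are hypothetical placeholders chosen only to make the example run; they are not estimates from the study.

```python
from typing import Callable, Dict

Variables = Dict[str, float]
CostFunction = Callable[[Variables], float]

def total_cost(models: Dict[str, CostFunction], v: Variables) -> float:
    """Sum the category-level cost functions evaluated on one project."""
    return sum(f(v) for f in models.values())

# Hypothetical placeholder functions, one per work category.
models: Dict[str, CostFunction] = {
    "road_and_masonry": lambda v: 10.0 * v["L"] + 5.0 * v["S"],
    "sewerage": lambda v: 20.0 * v["L"],
    "water_supply": lambda v: 15.0 * v["L"],
    "gas_network": lambda v: 12.0 * v["L"],
}

project = {"L": 100.0, "S": 400.0}   # made-up length (m) and surface (m²)
print(total_cost(models, project))   # 7700.0
```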
The explanatory variables selected according to the criteria described in Section 3.1.3 are presented below.

The Variables That Influence the Renovation Cost
Based on the reference literature and the information available on the projects analysed, the following representative variables are identified that may affect the costs of the operations:
• L = length of the networks built (expressed in linear metres);
• S = renewed pavement surface (expressed in square metres);
• E = difficulty in the execution of the work (measured in dimensionless units);
• W = average width of the road section for intervention sites (expressed in linear metres);
• D = average diameter of the pipelines weighed on the length of the sections made and corrected with numerical coefficients proportional to the different costs of the materials used (expressed in millimetres);
• P = number of special pieces per pipeline linear metre (measured in dimensionless units).
How these variables affect the four categories of work (road and masonry works, sewerage, water supply, and gas network) is analysed in the following subparagraphs.

Variables That Influence the Cost of Road and Masonry Works
The cost function of road and masonry works depends on the length of the networks built (L), as a considerable part of the cost of this category of work is linked to the size of the excavation to be carried out.
The cost function of road and masonry work also depends on the renewed pavement surface (S). In fact, for some works (such as the removal of the flooring elements and the demolition of pre-existing masonry works, the construction of any consolidation structures of the support surfaces, and the construction of reinforced concrete slabs with electro-welded mesh and waterproofing), the quantities carried out are directly proportional to the overall area affected by the intervention. The variable S is then used as a proxy for the effects of these processes on total costs.

The cost function of road and masonry work also depends on the difficulty of the execution of the works (E). This variable is defined as follows:
• Optimal working conditions: soil of contained consistency, clayey nature, absence of water. The excavation has an open section. It is easy to use mechanical means. There are no archaeological finds, and the underlying networks are easy to spot. It is therefore easy to carry out the new networks. This level of working conditions is given a value of 1.
• Good working conditions: the soil has a significant consistency and is of a soft calcarenite nature. There is no water. The excavation has a narrow section. The use of mechanical means is possible, but problems occur due to the presence of archaeological finds. This level of working conditions is given a value of 3.
• Difficult working conditions: the soil has a considerable consistency and is of a rocky or calcarenite nature. Inconsistent soil and water are present, so support and containment cages, as well as hydraulic pumps, are needed. The use of mechanical means is possible, but it is made difficult by the morphological conformation of the site. Difficulties may occur due to outcropping archaeological finds and obsolete pre-existing networks. There are problems with mechanically induced vibrations in surrounding buildings. This level of working conditions is given a value of 4.7.
• Extreme working conditions: soil of remarkable consistency and calcarenite or rocky nature. The use of mechanical means is very difficult. Due to the site morphology, it is recommended to carry out the work manually. The excavation is obligatory. Problems may occur due to archaeological finds or the presence of cavities under the road. This level of working conditions is given a value of 7.7.
The four possible values that can be associated with the E variable have been obtained by developing a detailed analysis of the work items that most influence the costs of road and masonry work. The element that has the greatest impact on costs is excavation. From the study of the accounting documents, it was possible to divide the excavation costs into four classes with average unit costs of 11 €/m³, 33 €/m³, 52 €/m³, and 85 €/m³, respectively. The four classes are representative of the different working conditions previously described. Dividing each of these costs by the lowest one (11 €/m³) and rounding to one decimal place gives the values 1, 3, 4.7, and 7.7.
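The derivation of the four E values can be checked with a few lines of arithmetic. The sketch below uses the average excavation costs quoted above; the dictionary keys and variable names are illustrative labels, not terms from the study.

```python
# Average excavation costs per class, in €/m³, as reported in the text.
excavation_cost = {
    "optimal": 11.0,
    "good": 33.0,
    "difficult": 52.0,
    "extreme": 85.0,
}

# Each difficulty value is the ratio to the cheapest class, rounded to one decimal.
base = excavation_cost["optimal"]
E = {level: round(cost / base, 1) for level, cost in excavation_cost.items()}

print(E)  # {'optimal': 1.0, 'good': 3.0, 'difficult': 4.7, 'extreme': 7.7}
```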
The last independent variable that is assumed to affect the cost of road and masonry works is the average width of the road section (W).The following operations depend on this variable: storage of materials, use of mechanical means of transport and intervention, and carrying out excavations of functional width for the convenient arrangement of the pipelines (avoiding excessive depths of earth movement).
The numerical indicator is obtained by calculating the average width (weighed against the length of the renovated urban routes) of the road site for each intervention.
For an intervention with stretches of different widths, the following elementary formula was applied:

$$ W_j = \frac{\sum_{i} W_{i,j} \, L_{i,j}}{L_j} \qquad (1) $$

where:
• $W_j$ = average width of the road for the jth intervention;
• $W_{i,j}$ = width of the road for the ith section of the jth intervention;
• $L_{i,j}$ = length of the ith section of the jth intervention;
• $L_j$ = total length of the jth intervention.
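The length-weighted average width can be sketched as follows. The function name and the section data are hypothetical, and the total length is assumed to equal the sum of the section lengths.

```python
def length_weighted_mean(values, lengths):
    """Return sum(v_i * l_i) / sum(l_i) over the sections of one intervention."""
    if len(values) != len(lengths):
        raise ValueError("values and lengths must have the same number of sections")
    total_length = sum(lengths)          # assumed to be the intervention's total length
    if total_length <= 0:
        raise ValueError("total length must be positive")
    return sum(v * l for v, l in zip(values, lengths)) / total_length

# Hypothetical intervention with three road sections.
section_widths = [4.0, 6.0, 5.0]         # metres
section_lengths = [100.0, 50.0, 50.0]    # metres

W_j = length_weighted_mean(section_widths, section_lengths)
print(W_j)  # 4.75
```

Longer sections pull the average toward their own width, which is why the 100 m section at 4 m dominates the result.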

Variables That Influence the Cost of Sewerage, Water Supply, and Gas Networks
Unlike the cost function of road and masonry works, which is assumed to depend on four variables (L, S, E, and W), the cost functions of sewerage, water supply, and gas networks are assumed to depend on five variables. The first three are the length of the network built (L), the difficulty of carrying out the works (E), and the average width of the road section (W), which have already been introduced. The remaining two are the weighted average diameter of the pipelines used in the intervention (D) and the number of special pieces per linear metre of pipeline (P). As in the case of road and masonry works, the coherence of the choice of the aforementioned variables is corroborated by the literature examples in Table 1. The small number of selected variables is also confirmed in the literature. In particular, one of the studies examined shows that a few geometric parameters (length and diameter of the pipelines) are enough to obtain sufficiently reliable cost estimates [45]. Beyond this, during the statistical analysis phase (see Section 4), steps were taken to identify and eliminate the less significant variables among those initially chosen.
As with the weighted average width, each partial diameter is weighted by the length of its segment relative to the total length. The formula used is:

$$ D_j = \frac{\sum_{i} D_{i,j} \, L_{i,j}}{L_j} \qquad (2) $$

where:
• $D_j$ = average diameter of the pipeline of the jth intervention;
• $D_{i,j}$ = diameter of the ith segment of the pipeline of the jth intervention;
• $L_{i,j}$ = length of the ith segment of the jth intervention;
• $L_j$ = total length of the jth intervention.
With reference to the 19 projects implemented in historic centres, Table 2 shows the values assumed by the parameters $D_{i,j}$, $L_{i,j}$, and $L_j$ necessary to estimate the parameter $D_j$ for each category of network infrastructure (sewerage, water supply, and gas network).
Similarly, with reference to the 20 projects implemented in the suburbs, Table 3 shows the values assumed by the $D_{i,j}$, $L_{i,j}$, and $L_j$ parameters for each category of network infrastructure.
Once $D_j$ has been estimated using Equation (2) for each project, the result is multiplied by a coefficient that takes into account the influence on costs exerted by the material of which the pipeline is composed. The coefficients are obtained from the ratios between the unit costs of the various materials used (we noticed that, as the diameters vary, the ratios between the costs of pipelines made of different materials remain almost constant). The data were obtained from the analysis of the prices and the final costs of building the networks. Setting the cost of PVC and polyethylene (materials with similar and lower costs) equal to 1, the ratio between the unit costs results in a coefficient of 1.2 for turbovibrocompressed concrete. With the same logic, synthetic resin has a coefficient of 1.15, ductile iron a coefficient of 1.3, and steel a coefficient of 1.25. Again, from the analysis of the accounting documents attached to the projects, it was found that the cost of the pipe increases by approximately 40% when it is housed in a reinforced concrete tunnel. Accordingly, coefficients of 1.6 and 1.8 were obtained, respectively, for synthetic resin pipelines and for ductile iron pipelines inserted in a reinforced concrete tunnel (1.15 × 1.4 ≈ 1.6 and 1.3 × 1.4 ≈ 1.8). Table 4 shows the materials of the pipelines and the corrective coefficients used for the 19 projects located in historical centres.
Similarly, Table 5 shows the materials of the pipelines and the corrective factors used for the 20 projects located in peripheral areas.
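The correction of the weighted average diameter by the material coefficient can be sketched as follows. The coefficient values are those reported in the text; the dictionary keys, the function name, and the two-segment example are hypothetical, and the sketch assumes a single material per intervention.

```python
# Material coefficients as reported in the text (PVC and polyethylene set to 1).
# The dictionary keys are illustrative labels, not identifiers from the study.
MATERIAL_COEFFICIENT = {
    "pvc": 1.0,
    "polyethylene": 1.0,
    "synthetic_resin": 1.15,
    "turbovibrocompressed_concrete": 1.2,
    "steel": 1.25,
    "ductile_iron": 1.3,
    "synthetic_resin_in_rc_tunnel": 1.6,
    "ductile_iron_in_rc_tunnel": 1.8,
}


def corrected_diameter(diameters_mm, lengths_m, material):
    """Length-weighted average diameter multiplied by the material coefficient."""
    total_length = sum(lengths_m)
    if total_length <= 0:
        raise ValueError("total length must be positive")
    d_j = sum(d * l for d, l in zip(diameters_mm, lengths_m)) / total_length
    return d_j * MATERIAL_COEFFICIENT[material]


# Hypothetical ductile iron pipeline: 150 m at 200 mm and 50 m at 300 mm.
d_corrected = corrected_diameter([200.0, 300.0], [150.0, 50.0], "ductile_iron")
print(f"Corrected diameter: {d_corrected:.1f} mm")
```

Here the weighted average diameter is 225 mm, which the ductile iron coefficient raises to 292.5 mm.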
Finally, the P variable must be considered; it is indicative of the tortuosity of the path and therefore of the need to use special pieces. The P variable is calculated from the overall network path. On the intervention plan, primary nodes must be identified and numbered, that is, the points where at least three different branches of the same network converge. Secondary nodes should also be numbered, i.e., those relating to the terminal ends of pipes that do not converge in any primary node. In this way, each branch to be built is uniquely determined by its end nodes, so that, for instance, branch 3-4 lies between nodes 3 and 4. For each branch, the angular deviations are measured between the ith element and the (i − 1)th element of the broken line that makes up the branch under examination. The angles formed by the segments that converge at the nodes are not computed. It follows that a branch formed by a single segment has no angular data. Among the angular data, only those relating to deviations exceeding 19 degrees, which are significant for the use of special pieces (for instance, curved elements), should be selected [18,54–57]. It should be emphasised that the connection joints between the pipes can absorb a deviation of the path, compared to the straight course, of 5-10 degrees. The threshold is set at about double this tolerance because the cost of the special pieces, and in particular of those of the networks under pressure, increases with the angular deviation owing to the consequent increase in hydrostatic thrust on the walls of the pipeline. Therefore, as can be seen from technical practice and consultation with industry experts, the number of deviations with an angle greater than 19 degrees is indicative of the number of special pieces used per linear metre of pipeline. The P values obtained were finally approximated, as reported in Table 6. Situations with P equal to zero were eliminated, as these limit conditions are unlikely [18,54–57].
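The counting procedure for the P variable can be sketched as follows, assuming that each branch is available as a list of planar coordinates in metres. The function names, the coordinates, and the normalisation (number of significant deviations divided by total length) are illustrative assumptions; the study additionally approximates the resulting values into classes, which is not reproduced here.

```python
import math


def deviation_angles(points):
    """Angular deviation (degrees) between consecutive segments of a polyline."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        diff = abs(math.degrees(heading_out - heading_in)) % 360.0
        angles.append(min(diff, 360.0 - diff))
    return angles


def significant_deviations_per_metre(branches, threshold_deg=19.0):
    """Deviations above the threshold, per linear metre of pipeline."""
    count = 0
    length = 0.0
    for points in branches:
        count += sum(1 for a in deviation_angles(points) if a > threshold_deg)
        length += sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    if length <= 0:
        raise ValueError("total length must be positive")
    return count / length


# Hypothetical network: angles at the end nodes of each branch are not computed.
branches = [
    [(0, 0), (100, 0), (100, 100)],  # one 90-degree deviation (counted)
    [(0, 0), (50, 0)],               # single segment: no angular data
    [(0, 0), (100, 0), (200, 10)],   # about 5.7 degrees (absorbed by the joints)
]
p_value = significant_deviations_per_metre(branches)
print(f"Significant deviations per metre: {p_value:.5f}")
```

Only interior vertices of each branch contribute, so a branch formed by a single segment adds length but no angular data, in line with the description above.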

The Total Cost Function
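A general function of the cost, which adequately interprets the observed phenomenon, can be written as follows:

$$C_T = C_R + C_S + C_W + C_G \qquad (3)$$

where $C_T$ is the total cost of the four categories of work, and $C_R$, $C_S$, $C_W$, and $C_G$ are the costs of the road and masonry works and of the renovation of the sewer, water, and gas networks, respectively. Each cost category can be made explicit through its functional relationship that takes into account the main influences exercised by the variables presented in Section 3.2. It is therefore possible to rewrite Equation (3) as follows:

$$C_T = f_R(L, S, E, W) + f_S(L, E, W, D, P) + f_W(L, E, W, D, P) + f_G(L, E, W, D, P) \qquad (4)$$

The following section presents the results of this study.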

Results
In Section 4.1, the results obtained from the first dataset (interventions in historical centres) are analysed, while in Section 4.2, the results related to the second dataset (interventions in peripheral areas) are presented. The main project variables were analysed and disaggregated considering the four categories of works introduced in Section 3.1 (road and masonry works, sewerage, water supply, and gas network).

Application and Results in the Case of Interventions in Historical Centres
Appendix A (Table A1) shows the costs and the values assumed by the independent variables for each of the four categories of works carried out on the 19 projects selected in the historical centres. Below, we analyse the results of statistical tests for each of the four categories of work performed in historical centres.

Road and Masonry Works
Table 7 shows the results of statistical tests for road and masonry works in historical centres. Of the four regressive models considered, the best proved to be the multiplicative exponential one, in line with what has been demonstrated using similar models aimed at estimating the costs of civil and building works. It has the highest adjusted R² value and the lowest standard error. The analysed models all pass the F-test. The multiplicative exponential model has the highest number of independent variables that pass the t-test. Low significance is reported only for the W variable. The multiplicative exponential function was therefore chosen as representative of the costs of road and masonry works. To further improve the exponential function, it was decided to eliminate the W variable from the model. The last column of Table 7 shows the results of the statistical tests for the multiplicative exponential model corrected following the elimination of the variable W.
For the adjusted exponential model adopted to estimate the costs of road and masonry works carried out in historical centres, the graphs in Figure 2 are shown. The residual plots show that no correlation can be found between the independent variables and the residuals (the latter are not predictable from the explanatory variables). The Q-Q plot shows that the errors have an almost normal distribution (the regression is sufficiently robust). The last graph shows that the residuals appear to be randomly distributed around 0 (the assumption of linearity is acceptable).
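The comparison of the four functional forms can be sketched with a small least-squares routine. This is only an illustration under stated assumptions: the data are synthetic (generated from a power law, so the multiplicative exponential form fits them by construction), the mapping of "linear-logarithmic" and "logarithmic-linear" to transformations of the predictors and of the cost is assumed, and the adjusted R² is computed on the original cost scale so that the four forms are comparable. None of these choices is confirmed as the authors' exact procedure.

```python
import math


def ols(rows, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def identity(v):
    return v


# (transform of predictors, transform of cost, inverse transform of cost)
FORMS = {
    "linear": (identity, identity, identity),
    "linear-logarithmic": (math.log, identity, identity),
    "logarithmic-linear": (identity, math.log, math.exp),
    "exponential": (math.log, math.log, math.exp),  # C = a * x1^b1 * x2^b2
}


def adjusted_r2(rows, cost, form):
    """Adjusted R^2 of one functional form, evaluated on the original cost scale."""
    fx, fy, inverse = FORMS[form]
    tx = [[fx(v) for v in r] for r in rows]
    beta = ols(tx, [fy(c) for c in cost])
    pred = [inverse(beta[0] + sum(b * v for b, v in zip(beta[1:], r))) for r in tx]
    n, k = len(cost), len(rows[0])
    mean = sum(cost) / n
    ss_res = sum((c - p) ** 2 for c, p in zip(cost, pred))
    ss_tot = sum((c - mean) ** 2 for c in cost)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - k - 1)


# Synthetic, noise-free observations: two hypothetical predictors per project.
rows = [(10, 3), (20, 1), (30, 4), (40, 1.5), (50, 5), (60, 9), (70, 2), (80, 6)]
cost = [2.0 * a ** 0.5 * b ** 0.8 for a, b in rows]

scores = {name: adjusted_r2(rows, cost, name) for name in FORMS}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name:20s} adjusted R2 = {score:.4f}")
print("Best form:", best)
```

With real project data the ranking would of course depend on the observations; the sketch only shows how the four candidate forms can be fitted and ranked with the same criterion.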

Sewerage
Table 8 shows the results of statistical tests for sewerage systems in historical centres. Of the four regressive models, the one considered the best is the multiplicative exponential one. It has the highest adjusted R² value, although the lowest standard error is found for the linear model. The analysed models all pass the F-test. Again, the multiplicative exponential model has the largest number of independent variables that satisfy the t-test, although, for the W and E variables, the significance is low. As can be seen from the last column of Table 8, the exponential model improves by eliminating these variables.
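The elimination of weakly significant variables can be sketched as a backward elimination driven by t-statistics. The sketch below makes several assumptions that are not taken from the study: the rule-of-thumb threshold |t| ≥ 2 replaces an exact critical value of the t-distribution, the variable names "L" and "W" are used only as labels, and the data are synthetic (the cost is built so that it depends on "L" alone).

```python
import math


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    for c in range(n):
        p = max(range(c, n), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]


def t_statistics(columns, y):
    """t-statistic of each predictor in an OLS fit with intercept."""
    names = list(columns)
    n = len(y)
    X = [[1.0] + [columns[name][i] for name in names] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    return {name: beta[i + 1] / math.sqrt(s2 * inv[i + 1][i + 1])
            for i, name in enumerate(names)}


def backward_eliminate(columns, y, threshold=2.0):
    """Drop the weakest predictor until all remaining ones reach the threshold."""
    cols = dict(columns)
    while cols:
        t = t_statistics(cols, y)
        weakest = min(t, key=lambda name: abs(t[name]))
        if abs(t[weakest]) >= threshold:
            break
        del cols[weakest]
    return list(cols)


# Synthetic data: the cost depends on "L" only; "W" carries no information.
columns = {
    "L": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "W": [2.0, 2.0, 1.0, 1.0, 3.0, 3.0, 5.0, 5.0],
}
noise = [1, -1, -1, 1, 1, -1, -1, 1]
cost = [5.0 + 2.0 * l + 0.5 * e for l, e in zip(columns["L"], noise)]

kept = backward_eliminate(columns, cost)
print("Variables kept:", kept)
```

In practice the threshold would be replaced by the critical value for the chosen significance level and the available degrees of freedom, which matters with samples as small as 19 or 20 projects.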
For the adjusted exponential model adopted to estimate the costs of the sewage networks in historical centres, the graphs in Figure 3 are shown.
From the graphs of the residuals, it can be seen that no relationship can be identified between the independent variables and the error. From the Q-Q plot, it can be deduced that the errors have an almost normal distribution. The last graph shows that the residuals appear to be randomly distributed around 0.
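The reading of a Q-Q plot can be complemented by a numerical check. The sketch below pairs the ordered residuals with theoretical normal quantiles and computes their correlation; a value close to 1 is consistent with approximately normal errors. The residuals are hypothetical, and the plotting positions (i + 0.5)/n are an assumed convention, not one stated in the study.

```python
from statistics import NormalDist


def qq_points(residuals):
    """Pairs (theoretical normal quantile, ordered residual) for a Q-Q plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    normal = NormalDist()
    return [(normal.inv_cdf((i + 0.5) / n), r) for i, r in enumerate(ordered)]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical regression residuals for eight projects.
residuals = [0.4, -1.2, 0.1, 0.7, -0.3, 1.1, -0.6, 0.0]
points = qq_points(residuals)
qq_correlation = pearson(points)
print(f"Q-Q correlation: {qq_correlation:.3f}")
```

The points returned by `qq_points` are the same ones that would be drawn in the Q-Q plot, so the check and the figure rely on identical information.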

Water Supply
Table 9 shows the results of statistical tests for water supply in historical centres. Of the four regressive models considered, the best is again the multiplicative exponential one. In this case, for the W variable, there is low significance. As can be seen from the last column of Table 9, the exponential model improves following the elimination of the variable W. For the adjusted exponential model adopted to estimate the costs of water supply in historical centres, the graphs in Figure 4 are shown.
From Figure 4, it is possible to draw the same conclusions valid for road and masonry works and the sewerages in historical centres.

Gas Network
Table 10 shows the results of statistical tests for gas networks in historical centres. Of the four regressive models considered, the best is the multiplicative exponential one, despite the low significance of the P and E variables. As can be seen from the last column of Table 10, the exponential model improves by eliminating the P and E variables. For the adjusted exponential model adopted to estimate the costs of gas networks in historical centres, the graphs in Figure 5 are shown.
From the graphs, it is possible to draw the same conclusions valid for the road and masonry works, for the sewerage, and for the water supply in the historical centres.
Please refer to Appendix A (Table A2) for a summary of the real cost values, those estimated through the adjusted exponential function, the residuals, and the percentage residuals obtained for road and masonry works, sewerage, water supply, and gas networks in historical centres. The identified function returns an average percentage error of 1% in estimating the costs of road and masonry works and of interventions on water supply and gas networks. For sewerage, the average percentage error is 2%.
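The average percentage error can be sketched as the mean of the absolute percentage residuals. The cost figures below are hypothetical and the function name is illustrative; whether the study averages absolute or signed residuals is an assumption. The last lines check that the overall 1.25% reported for historical centres is consistent with the mean of the four category errors quoted above (1%, 2%, 1%, and 1%).

```python
def mean_absolute_percentage_error(actual, estimated):
    """Mean of |actual - estimated| / |actual|, expressed as a percentage."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and equally long")
    ratios = [abs(a - e) / abs(a) for a, e in zip(actual, estimated)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical real and estimated costs (euros) for three projects.
actual = [100_000.0, 250_000.0, 400_000.0]
estimated = [101_000.0, 247_500.0, 404_000.0]
mape = mean_absolute_percentage_error(actual, estimated)
print(f"Average percentage error: {mape:.2f}%")

# Category errors quoted in the text for historical centres:
# road and masonry works, sewerage, water supply, gas networks.
category_errors = [1.0, 2.0, 1.0, 1.0]
overall = sum(category_errors) / len(category_errors)
print(f"Mean of the category errors: {overall:.2f}%")
```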

Application and Results in the Case of Interventions in Peripheral Areas
Appendix B (Table A3) shows the costs and the values assumed by the independent variables for each of the four categories of work carried out on the 20 projects selected in the peripheral areas.

Road and Masonry Works
Table 11 shows the results of statistical tests for road and masonry works in peripheral areas. Of the four regressive models considered, the best is the linear one. It has the highest adjusted R² value and the lowest standard error. Therefore, the linear function is chosen as representative of the costs of road and masonry works. For this function, the variables S and E do not pass the t-test. To further improve the linear function, it was decided to eliminate the S and E variables from the model. The last column of Table 11 shows the results of the statistical tests for the linear model following the elimination of the S and E variables. For the adjusted linear model adopted to estimate the costs of road and masonry works carried out in peripheral areas, the graphs in Figure 6 are shown.
Residual plots show that no relationship can be identified between the independent variables and the error (the error cannot be predicted using the explanatory variables). The Q-Q plot shows that the errors have an almost normal distribution. From the last graph, it can be deduced that, since the residuals are scattered randomly around 0, linear regression is appropriate to describe the data.

Sewerage
Table 12 shows the results of the statistical tests for the sewerage in the peripheral areas. Of the four regressive models considered, the best is the linear one. As can be seen from the last column of Table 12, the linear model improves following the elimination of the variable W. For the adjusted linear model adopted to estimate the costs of the sewerage networks built in peripheral areas, the graphs in Figure 7 are shown.
From the graphs of the residuals, it can be seen that no relationship can be identified between the independent variables and the error. The Q-Q plot shows that the errors have an almost normal distribution. From the last graph, it can be deduced that, since the residuals are scattered randomly around 0, linear regression is appropriate to describe the data.

Water Supply
Table 13 shows the results of the statistical tests for the water supply in the peripheral areas. Of the four regressive models considered, the best is the linear one. As can be seen from the last column of Table 13, the linear model improves following the elimination of the W and D variables. For the adjusted linear model adopted to estimate the costs of the water supply networks built in peripheral areas, the graphs in Figure 8 are shown.
From the graphs, it is possible to draw the same conclusions valid for the road and masonry works and for the sewage networks in the peripheral areas.

Gas Network
Table 14 shows the results of the statistical tests for gas networks in peripheral areas. Of the four regressive models considered, the best is the linear one. As can be seen from the last column of Table 14, the linear model improves following the elimination of the E and P variables. For the adjusted linear model adopted to estimate the costs of gas networks built in peripheral areas, the graphs in Figure 9 are shown.
From the graphs, it is possible to draw the same conclusions valid for the road and masonry works, for the sewerage, and for the water supply in the peripheral areas.
Please refer to Appendix B (Table A4) for a summary of the actual cost values, those estimated through the adjusted linear function, and the residuals and percentage residuals obtained for road and masonry works, sewerage, water supply, and gas networks in peripheral areas. The identified function returns an average percentage error of 1% for road and masonry works, 3% for sewerage, and close to zero for water supply and gas networks.

The Total Cost Function
In summary, in the case of historical centres the following cost function is obtained:

$$C_T = L^{0.41} E^{0.17}\, 0.00001\, E^{0.15} S^{0.60} + 0.000003\, L^{0.51} W^{0.92} P^{0.16} + D^{0.36}\, 0.000008\, L^{0.62} W^{0.25} + 0.00002\, L^{0.52} D^{0.39} P^{0.51} \qquad (5)$$

Otherwise, in the case of peripheral areas, the following cost function is obtained:

A discussion of the results is reported in the next section.

Discussion
The results obtained are interpreted here in a critical way. Below, we analyse, for each of the two urban contexts, the differences found in terms of selected variables and their possible causes.

Historical Centres
In the case of historical centres, for road and masonry works, the variable W (average width of the road section) is not very significant. The information on the average width is already contained in the variable S (renewed pavement surface), so W can be redundant. In addition, in the Italian historical centres, the road sections are for the most part narrow, and therefore the W parameter shows little variability within the selected sample. The cost function adopted is multiplicative exponential, so the costs grow more than proportionally as the length and surface of the road to be repaved increase, as well as the level of difficulty of the work. The exponential trend of costs is probably due to a series of critical issues that usually characterise the Italian historical centres (high population density, compact urban tissue, narrow spaces, tortuous road paths, the presence of stone pavements, the presence of archaeological finds, etc.).
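The link between a multiplicative function and the way costs scale can be made explicit with a short derivation. The symbols $a$, $b_L$, $b_S$, and $b_E$ below are generic placeholders for the estimated coefficient and exponents, not values taken from the tables; the derivation is a standard property of power functions rather than a result of this study.

```latex
% Generic multiplicative cost function and its log-linear form (the form fitted by MRA)
C = a \, L^{b_L} S^{b_S} E^{b_E}
\quad\Longleftrightarrow\quad
\ln C = \ln a + b_L \ln L + b_S \ln S + b_E \ln E

% Scaling length and surface by the same factor \lambda > 0
C(\lambda L, \lambda S, E) = a \, (\lambda L)^{b_L} (\lambda S)^{b_S} E^{b_E}
                           = \lambda^{\, b_L + b_S} \, C(L, S, E)
```

Under this form, costs grow more than proportionally with the joint size of the intervention when $b_L + b_S > 1$, proportionally when the sum equals 1, and less than proportionally when it is below 1, so the sum of the estimated exponents indicates whether diseconomies of scale are present.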
The low significance of the W variable is also reflected in the work carried out on sewerage and water networks. The average road sections in Italian historical centres do not vary much. The dataset shows a high average road width in only 2 cases out of 19. Therefore, almost all the locations offer the same possibilities in terms of storage of materials, use of mechanical means, and execution of excavations. These possibilities are often severely limited by the critical issues previously mentioned. This affects the costs of interventions in terms of economies of scale. Therefore, an increase in the significant variables corresponds to an exponential increase in costs.
The discussion on the exponential growth of the cost function can also be extended to interventions on the gas networks. In this case, the P and E variables, as they were defined, lose significance. Due to the limited information available, these variables refer to the overall path deduced from the general plans of the interventions, so they are not differentiated for each category of work. However, for the projects analysed, gas networks are much shorter than water and sewerage networks. From this, it can be deduced that, since the P variable refers to the linear metres of piping, it has little influence in the case of gas networks (a negligible total number of special pieces is obtained). The issue relating to the smaller extension of gas networks and the resulting consequences deserves greater investigation in future research. The E variable, in turn, depends on the cost per cubic metre obtained under different working conditions. Probably, as the prevailing size (length) decreases, the incidence of the E variable on overall costs is also significantly reduced. The reasoning made for the E variable can be extended to sewerage networks, which also have a low average length.

Peripheral Areas
In the case of peripheral areas, for road and masonry works, the S and E variables have been eliminated from the cost function as they are not very significant. Probably, the S variable has little influence on costs in the case of peripheral areas due to economies of scale. The average area of intervention in the suburbs is 12,383 m² greater than that of the historical centres. The unit costs of excavation and repaving operations tend to decrease as the area concerned increases. The strong variability of the S parameter within the reference sample suggests a sort of internal compensation, and therefore the variable does not affect the total costs. As for the E variable, it remains mostly constant in the case of peripheral areas, taking its lowest value, equal to 1. This is because the works are more easily achievable than in historical centres. For this reason, the E variable can also be overlooked. In the case of peripheral areas, the W variable acquires relevance, as the road sections can vary significantly from one context to another, affecting costs to varying degrees. The function that best approximates the cost trend as the significant parameters change is, this time, linear. The almost total absence of the critical elements typical of historical centres is reflected in the growth trend of costs, which grow in direct proportion to the L and W variables. The presence of less pedestrian traffic, larger spaces, fewer winding paths, and mainly bitumen pavements greatly simplifies the work in almost all the contexts analysed, and this has a clear impact on the cost function.
However, the W variable does not have a significant influence on the work carried out on sewerage and water networks. In fact, in almost all cases, the average width is sufficient to easily allow the storage of materials, the use of mechanical means, and the execution of excavations. In addition, it is possible to note a certain uniformity in the average diameters of water pipes in the suburbs, which stand at around 200 mm. From this, it can be seen that the contribution of the D variable in the case of water networks can also be considered negligible. Even for sewerage and water networks, the cost function adopted is linear due to the almost total absence of diseconomies of scale.
The reasoning on the linear growth of the cost function can also be extended to interventions on gas networks. For the same reasons analysed in the case of historical centres, it is possible to neglect the contributions of the variables E and P.

Implications of the Research
This study presents significant implications for researchers, institutions, and professionals operating in the field of urban regeneration, with particular reference to the renovation and maintenance works of underground pipelines. The document concretely addresses the gap between theory and practice, providing an MRA-based methodology for accurately estimating the costs associated with such interventions. This represents a significant contribution for professionals in the sector, offering them concrete tools and approaches to evaluate costs and plan targeted interventions. From the point of view of economic and commercial impact, the results of this research can be used to optimise the allocation of financial resources intended for urban restructuring. The differentiation of cost functions based on the urban context allows a more accurate evaluation of costs, allowing targeted strategic planning. This can result in more efficient management of financial resources and a reduction in overall costs, with clear economic benefits for local authorities, public service companies, and other interested parties. The proposed research also has a significant impact on technical policy in the urban regeneration sector. The results and conclusions of this study can provide an empirical basis for the definition of guidelines that aim to improve the efficiency and effectiveness of interventions on water, sewerage, and gas networks. Understanding cost dynamics in different urban contexts can contribute to the definition of policies that consider local specificities, promoting sustainable and optimised management of urban infrastructure. From a research perspective, this study contributes to the development of the body of knowledge by providing a specific methodology and interpretation for urban renovation cost modelling. The insights obtained can stimulate future research on the economic evaluation of urban interventions and on the analysis of other factors and predictors that influence costs. This would allow us to deepen our understanding of the economic dynamics of these interventions and provide more sophisticated tools for evaluation and planning.

Limitations of Research and Future Prospects
The research presents some limitations that must be taken into consideration to evaluate the results adequately. First of all, the two datasets used for the analysis were created from a limited number of projects (19 for the historic centres and 20 for the peripheral areas). This could partly influence the representativeness and generalizability of the results. Therefore, it is necessary to interpret the results with caution and consider the possibility that a greater number of projects with varied characteristics could lead to different conclusions. The selection of a limited number of projects depended both on the restrictions on access to technical and accounting documents and on the difficulties encountered in finding complete, quality data. Therefore, the selection of the independent variables taken into consideration for this study was also influenced by the availability of the data. The limited number of variables partially restricts this study's ability to capture the complexity and multidimensionality of the factors that influence urban renovation costs. Nonetheless, the variables taken into consideration have strong theoretical relevance for the phenomenon studied, having a significant impact on costs [43].
To address these limitations, the proposed operational protocol will be tested on a greater number of projects in the future, evaluating the possibility of introducing new explanatory variables. Further implementation of the databases is currently underway. Another future research perspective could consist in the application of ANNs for the clustering of projects within the two databases (for example, identifying those particular projects carried out in historical centres that should be classified in the database on peripheral areas, and vice versa).

Conclusions
Urban regeneration has emerged as a widely explored field of research on a global scale in response to the complex challenges that cities must confront, including the decline of urban functions, the deterioration of urban structures, the energy transition, and the well-being of residents [58–61]. In this context, the renovation of deteriorating underground pipelines plays a crucial role, as it is instrumental in improving the overall urban environment and ensuring the sustainable development of cities [62].
This study aims to construct a model for the rapid estimation of costs in the programmatic phase in the case of interventions that involve a complex series of works on several types of infrastructure (road and masonry, sewerage, water supply, gas networks). MRA was used to estimate costs. The fundamental phases of the implementation of the model are the identification of the main variables that influence the cost of intervention and the definition of the most suitable functional form to represent the causal relationship.
This study made it possible to define a unique cost function for the four categories of work. First, the minimum explanatory variables necessary to carry out a sufficiently precise estimate of the urban regeneration costs that simultaneously involve multiple infrastructure networks were identified. In addition, it is shown that it is correct to differentiate the functional form as the urban context changes. In particular, for historical centres, it is correct to adopt multiplicative exponential functions, while for peripheral areas, the cost estimate is more reliable if linear functions are considered. The motivation is to be found in the presence of diseconomies of scale in the case of interventions in historical centres.
Diseconomies depend on the characteristics of Italian historical centres (narrow alleys, winding paths, stone pavements, high population density, presence of archaeological finds, etc.). The estimated functions return an average error of 1.25% for historical centres and 1.00% for peripheral areas.
In conclusion, a study that estimates the costs of restructuring a city's underground pipelines can help improve the population's quality of life. Correct economic-financial planning of maintenance and renovation interventions on service infrastructure contributes to ensuring a reliable water supply and adequate waste disposal, improving the health and well-being of building occupants. Likewise, the optimization of gas distribution networks ensures efficient use of resources and greater safety in buildings. Therefore, the analysis of the renovation costs of these infrastructures provides valuable information to guide investment decisions and promote a sustainable and quality living environment.

Figure 1 .
Figure 1.Flow chart of the logical steps followed in the research work.

3. 1 . 1 .
Step 1: Collection of Projects for the Renovation of Service Infrastructures

Figure 1 .
Figure 1.Flow chart of the logical steps followed in the research work.

The cost components of the four categories of work are denoted as follows:
• $C_R$ = cost of road and masonry works;
• $C_S$ = cost of renovation of the sewer network;
• $C_W$ = cost of renovation of the water network;
• $C_G$ = cost of renovation of the gas network.

Figure 2. Road and masonry works in historical centres: residuals, Q-Q plot, and residuals vs. fitted.

Figure 4. Water supply in historical centres: residuals, Q-Q plot, and residuals vs. fitted.

Figure 5. Gas networks in historical centres: residuals, Q-Q plot, and residuals vs. fitted.

Figure 6. Road and masonry works in peripheral areas: residuals, Q-Q plot, and residuals vs. fitted.

Figure 8. Water supplies in peripheral areas: residuals, Q-Q plot, and residuals vs. fitted.

Figure 9. Gas networks in peripheral areas: residuals, Q-Q plot, and residuals vs. fitted.

Table 1. Summary of key references on estimating the construction and renovation costs of network infrastructures.

Table 1. Cont. Variables extrapolated from the literature and taken into consideration in this study. *

Table 2. Values assumed by the parameters $D_{i,j}$, $L_{i,j}$, and $L_j$ for each of the 19 projects carried out in the historical centres and differentiated by the type of network infrastructure (sewerage, water supply, and gas network).

Table 3. Values assumed by the parameters $D_{i,j}$, $L_{i,j}$, and $L_j$ for each of the 20 projects carried out in the suburbs and differentiated by the type of network infrastructure (sewerage, water supply, and gas network).

Table 4. Pipeline materials and correction coefficients used to estimate the $D_j$ parameter for the 19 projects located in historical centres.

Table 5. Pipeline materials and correction coefficients used to estimate the $D_j$ parameter for the 20 projects located in peripheral areas.

Table 6. Angular deviations on the overall network path, expressed as the number of special pieces per linear metre of pipeline: actual measured values and approximate values, both for projects completed in historical centres and for those completed in peripheral areas.

Table 7. Road and masonry works in historical centres: results of statistical tests and values of regression coefficients.

Table 8. Sewerage in historical centres: results of statistical tests and values of regression coefficients.

Table 9. Water supply in historical centres: results of statistical tests and values of regression coefficients.

Table 10. Gas networks in historical centres: results of statistical tests and values of regression coefficients.

Table 11. Road and masonry works in peripheral areas: results of statistical tests and values of regression coefficients.

Table 12. Sewerage networks in peripheral areas: results of statistical tests and values of regression coefficients.

Table 13. Water supply in peripheral areas: results of statistical tests and values of regression coefficients.

Table 14. Gas networks in peripheral areas: results of statistical tests and values of regression coefficients.