Spatial Model of Optimization Applied in the Distributed Generation Photovoltaic to Adjust Voltage Levels

The main objective of this work is to develop a methodology for analyzing the quality of the voltage level in the distribution power grid in order to identify and reduce violations of voltage limits by proposing optimal points for the allocation of photovoltaic distributed generation. The methodology uses the geographic location of the power grid and its consumers to group and classify them in spatial grids of 100 × 100 m using the average annual consumption profile. The generated profiles, together with the grid information, are sent to the photovoltaic distributed generation allocation algorithm, which uses an optimization process to identify the geographic location, the required installed capacity, and the minimum number of photovoltaic generation units that must be inserted to minimize the violations of voltage limits while respecting the necessary restrictions. The entire proposal is applied to a real feeder with thousands of bars, whose model is validated with field measurements. Different scenarios of voltage limit violations are used to validate the methodology, and grids with better voltage quality are obtained after the optimized allocation of photovoltaic distributed generation. The proposal is a new tool for adapting the voltage of the distribution power grid using photovoltaic distributed generation.


Introduction
The world is facing an energy crisis aggravated by the growth in demand. As a result, electric power grids require more intensive maintenance, which burdens the energy distribution companies [1][2][3]. Thus, ensuring proper voltage levels for consumers becomes increasingly challenging. Poor quality in the distributed electricity causes losses to the companies due to punitive administrative actions applied by regulatory agencies [4][5][6][7][8][9]. Sustainable energy sources are necessary in this scenario, and using them has become the best option [10][11][12].
The development of current electrical grids is directed towards smart grids, which integrate information technology, communication, and automation, being applied in transient analysis, failures, among others [13][14][15]. In addition to the application in a centralized generation, smart grids are mainly applied in a sustainable distributed generation, such as (i) photovoltaic, (ii) wind, and (iii) small hydroelectric plants, among others [14,16].
Distributed Generation (DG) refers to generation connected directly to the distribution power grid or to the consumer. In Distributed Generation Photovoltaic (DGPV), the allocation location in the distribution power grid affects the adequacy of the voltage, even when the installed capacity is the same for the different allocation geometries [17]. As an example, Acharya et al. [18] propose an analytical approach using geographic location and optimized installed capacity based on losses for the allocation of a single DG unit, concluding that it is not always possible to find the best DG allocation site due to restrictions of the problem.
Hung et al. [19] propose a method that calculates the installed capacity and power factor of four types of DG to reduce losses, taking into account the allocation of a single DG unit. Subsequently, Hung et al. [20] use the same methodology to obtain maximum loss reduction in large grids with multiple DG. The allocation of DG with optimized parameters has the following advantages: (i) reduction of losses, (ii) improvement in the voltage profile, (iii) expansion of the load capacity, (iv) increased reliability, stability, and security of the grid, and (v) higher quality in the energy produced, among many others [21][22][23][24][25].
There are several DG allocation techniques in the literature, characterized in five groups: (i) Analytical techniques, (ii) classical optimization techniques, (iii) artificial intelligence techniques, (iv) diverse and empirical techniques, and (v) prediction techniques for future applications [21,26,27]. Between 2010 and 2020, the different techniques gave rise to different mathematical models for the allocation of distributed generation, such as (i) improvement of voltage stability, (ii) improvement of the voltage profile, and (iii) reduction of electrical losses. Each model presents different constraints on the optimization problem: (i) Bus voltage limit, (ii) maximum DG capacity, (iii) current limits, and (iv) reactive power limits of DG. Different optimization parameters are used for each DG, such as (i) installed capacity, (ii) geographical location, (iii) quantity, and (iv) DG types [28]. Regarding the best methodology to be applied to problems related to electricity distribution planning, it is still an open question among researchers in the field [28][29][30][31][32][33][34].
There are several effectual actions to improve the efficiency of the distribution power grid. One of them is the optimal allocation of distributed generation, protecting the grid against unforeseen events, and allowing work in a decentralized manner [28]. In several countries, there are still high rates of transmission and distribution losses compared to China, which is 5.81%, and Germany, which is 3.94% [35]. DG is advantageous over a centralized generation, mainly regarding: (i) reduction of losses and environmental impact and (ii) expansion of system loading maintaining adequate voltage levels, among other advantages [36][37][38][39][40].
There are several surveys to present the gains from the application of DG. Hassan et al. [41] developed a methodology for optimal DG location and sizing to minimize losses in radial grids. For this, the Augmented Lagrangian Genetic Algorithm (ALGA) is used, applied in three IEEE test feeders with 33 bars, 69 bars, and 119 bars. In this work, the authors insert a maximum of four DG, and the results show the optimal locations and installed capacity with improved voltage profiles and reduced losses. The authors argue that the proper location and sizing of DG are essential for the efficiency of electrical power systems. Moradi et al. [42] develop a methodology to minimize losses and improve voltage regulation and stability in electrical grids. For this, the authors use the combination of Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) to find optimal locations and installed capacity of DG. The methods are applied separately and combined, and the results are compared. The methodology is applied to test feeders with 33 and 69 bars. The combined methods generated the best DG location and installed capacity results, providing more stable and adequate voltage levels in the grid. However, the combination of methods demonstrated the long simulation time as a disadvantage.
Ganguly et al. [43] propose a methodology for the allocation of DG to minimize electrical losses in the grid and adjust voltage levels, taking into account the uncertainties of load demand and generation. For this, IEEE test feeders with 33 and 52 nodes are used in heuristic and deterministic approaches. The generation and demand uncertainties are modeled with fuzzy logic. The solution strategy used is the Adaptive GA, which generates better results for location and installed capacity of DG than the basic GA. However, the variability of the results is high and needs to be reduced. AlHajri et al. [44] developed a methodology for the allocation and dimensioning of one or multiple DG with specific and non-specific power factors. Location is assessed using stability and sensitivity analysis. The optimization problem is modeled with non-linear equality and inequality constraints. For the DG dimensioning, the hybrid method entitled Fast Sequential Quadratic Programming (FSQP) is compared to the conventional Sequential Quadratic Programming (SQP) method, with favorable results for hybridization with a reduction of execution time by up to 1/3. The simulations are carried out in a 69-bus test feeder. The optimization of DG location and installed capacity was shown to significantly affect the electrical losses of the grid and the voltage profile, reducing the power required from the substation and allowing the planning of expansions in the electrical system.
In the literature, several methodologies/techniques are used to improve the electrical parameters of the distribution power grid, maintaining the quality of the distributed electrical energy. Several researchers are working to present the best methodology/technique to relieve the actions to be implemented in the power grid financially. One of the techniques used is clustering, which identifies and gathers the types of consumers, for example. Angelos et al. [45] propose a two-step methodology to carry out the classification of electrical consumption profiles. In the first step, the Fuzzy C-Means algorithm is used to find consumers with similar consumption profiles. Subsequently, the fuzzy classification is performed using the fuzzy membership matrix and the Euclidean distance from the cluster centers. Measures of distance from potential fraudsters to cluster centers are normalized and ordered, producing unit index scores, where consumers with irregular consumption patterns obtain higher scores. The approach is tested and validated in a real database, showing satisfactory performance in detecting fraud and measurement defects.
Zamora et al. [46] propose a method for long-term space load forecasting for application in distribution system planning. The authors use aspects such as load curve aggregation, small areas, land use, database integration, artificial intelligence techniques, especially self-organizing neural networks (Kohonen), and multivariate statistical techniques for cluster analysis. The future load forecast is initially carried out considering predefined geographic spaces and a global load model based on time series. Cluster analysis allows the recognition of patterns and loads allocation related to small areas. The global load model allows following the evolution of land use in terms of load. The GIS platform is suitable for visualizing the geographic space and its integration with the database.
Several methodologies are proposed in an attempt to improve the quality of distributed energy, such as those by Hassan et al. [41], Moradi et al. [42], Ganguly et al. [43], and AlHajri et al. [44], which optimize the location and installed capacity of unitary or multiple DG. However, there is a gap related to the number of optimized variables and the feeders used for simulation. For example, they do not optimize the number of DG units needed to correct the problems in the electrical power distribution, they do not optimize the installed capacity of each DG separately, and most of the works use test feeders with a reduced number of buses, without applying the methods to real feeders that have thousands of bars [47,48]. The main contribution of this work is a computational tool for optimally allocating DGPV in order to maximize the voltage adequacy in electric power distribution. The tool models and simulates the electric distribution system using the georeferenced registration information of the assets, currently present in the databases of electricity concessionaires, and generates georeferenced maps with separation of consumers by consumption range, indication and qualification of the violations of voltage limits, and the optimized geometry for the allocation of DGPV.
The methodology proposed in this work performs photovoltaic distributed generation allocation optimizing: (i) The geographic allocation location, (ii) the installed capacity of each DG, (iii) the identification and location of the consumer per consumption range, and (iv) the number of generation units required for the suitability of voltage in the distribution power grid. For the mapping of consumers by consumption band, the georeferenced cadastral information of the grid assets, currently present in the databases of the electric power concessionaires, is used. Thus, in addition to the optimization process of the optimal point of allocation of DGPV, considering all buses of the actual circuit, the proposed methodology uses the geographical profile of consumers grouped in grids of 100×100 m, which through clustering process are divided into consumption classes.
The application of clustering makes it possible to limit the optimization process to the buses present in the geographic areas of a specific consumer class. In this way, it is possible to carry out studies considering the allocation of DGPV in areas with greater feasibility of installation through the use of existing roofs of larger homes, companies, or industries, benefiting both the customer with the reduction of the invoice value, as well as the concessionaire which will improve the quality of the supply voltage. The multiobjective optimization of the four parameters (without limiting the amount of DGPV) together with the clustering process is the originality of this work.
This work is organized as follows. Section 2 presents the international regulations that govern the electrical system, with emphasis on energy distribution, presents the model of the electrical power system, and briefly describes the optimization and clustering processes. Section 3 presents the proposed methodology, and Section 4 displays the results obtained from its application. Sections 5 and 6 present the discussions and conclusion of this work, respectively.

Theoretical Background
In this section, some international technical standards that deal with the distribution of electric energy and distributed generation are discussed. A comparison is made between norms from some countries, emphasizing the norms that regulate the violations of voltage limits. The theoretical bases of some elements that are part of the electric power system are discussed, with emphasis on the model of the electric energy distribution network and on the photovoltaic generation model. Finally, some parameters that integrate the optimization process are presented, in addition to the main characteristics of the deterministic, heuristic, hybrid, and clustering optimization methods.
Several countries have their own regulatory agencies, such as Brazil, which follows the National Electric Energy Agency (ANEEL) regulations, with emphasis on the Normative Resolution ANEEL 482/2012 regarding distributed generation [70,71]. ANEEL operates in the distribution of electricity through the Distribution Procedures (PRODIST), with emphasis on Module 3 and Module 8 [4,72,73], in which Module 3 covers the conditions of access to the distribution system and Module 8 covers the quality of the product and the service offered. In addition, in Brazil, the Brazilian Association of Technical Standards (ABNT) acts as the national representative of the IEC [72,74]. Table 1 compares the power quality indicators considered in different countries [4,5,8,9,58,64,75].
In Table 1, it is observed that indicators that are important in a given country are disregarded in others. For example, Japan uses only harmonic distortion as a power quality criterion, disregarding the voltage fluctuation analyzed in all other countries. It is also observed that in the United States, the steady state voltage, which is verified in all countries except Japan, is not evaluated.

Model and Parameters of the Electric Power System
The power flow study is used to determine the electrical voltage of the power bus nodes and the active and reactive powers of the energy distribution lines, where nodes are the connection points on Phase 1, Phase 2, or Phase 3 bars [76,77]. Through this study, it is possible to plan the operation of the electric sector [78]. The electrical system buses are associated with: (i) nodal voltage magnitude, (ii) nodal voltage angle, (iii) net active power, and (iv) net reactive power [79]. Furthermore, the electric bars can be categorized as (i) load bars, (ii) generation bars, and (iii) swing bar [76].
The distribution power grid can be modeled by the elements: (i) Power supply, (ii) power overhead lines, (iii) distribution transformers, and (iv) loads. Regarding the power supply, this is modeled using the Thevenin equivalent. The modeling can be performed in four different ways: (i) By impedances, (ii) by three-phase and single-phase short-circuit currents, (iii) by three-phase and single-phase short circuit powers, and (iv) by infinite bus. The definition by impedances uses the symmetrical components method to obtain the impedances, currents, and voltages in symmetrical components. In the definition through the three-phase and single phase short-circuit currents, the three-phase short circuit is first analyzed, and then the single-phase short circuit. The definition is carried out using three-phase and single-phase short-circuit powers from the three-phase and single-phase short circuit currents. Finally, the power grid modeled using an infinite bus defines the point in the electrical system where voltage and frequency are fixed, regardless of the power supplied [76,78].
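As an illustration of the definition by short-circuit powers, the magnitude of the positive-sequence Thevenin impedance can be estimated from the three-phase short-circuit power as |Z1| = V_LL² / S_sc3. The sketch below assumes this standard relation and a user-supplied X/R ratio to split the magnitude into resistance and reactance. The function name and the numeric values are hypothetical and are not taken from the feeder studied in this work.

```python
import math

def source_impedance(kv_ll, mva_sc3, x_over_r):
    """Positive-sequence Thevenin impedance (ohms) from the three-phase
    short-circuit power. kv_ll is the line-to-line voltage in kV and
    mva_sc3 the three-phase short-circuit power in MVA."""
    z_mag = kv_ll ** 2 / mva_sc3                  # kV^2 / MVA gives ohms
    r1 = z_mag / math.sqrt(1.0 + x_over_r ** 2)   # resistance part
    x1 = r1 * x_over_r                            # reactance part
    return r1, x1

# Hypothetical example: 13.8 kV source, 200 MVA short-circuit power, X/R = 4.
r1, x1 = source_impedance(kv_ll=13.8, mva_sc3=200.0, x_over_r=4.0)
```

The single-phase short-circuit power would be needed in addition to obtain the zero-sequence impedance, which this sketch does not cover.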
The other necessary element for modeling distribution power grids is overhead distribution lines with various line configurations. The representation by the model π of the lines uses matrices of series impedance and nodal capacitance [76]. The study of currents returning through the earth and their influence on the impedance parameters are carried out by the modified Carson correction [80]. Through the image method, the nodal capacitance matrix is obtained, calculated using the inverse of the matrix of Maxwell's potential coefficients [76,79]. For the modeling of power distribution transformers, the primitive admittance matrix can be used in any type of transformer, regardless of the connection, the number of windings, the number of phases, among other parameters [81]. Short circuit impedances and open circuit impedances are used by the model, both for three-phase and for single-phase transformers.
For the load model, active power and reactive power are functions of the frequency and of the voltage magnitude. Depending on the applied voltage, the power absorbed by the load varies. This variation also depends on the nature of the load. Several models characterize the behavior of the load as a function of voltage: (i) constant power model, (ii) constant current model, (iii) constant impedance model, and (iv) ZIP model, which is a composition of the previous models [76,78,79]. Photovoltaic generation is an external element to the electricity grid and considers standard conditions such as (i) solar irradiation, (ii) cell temperature, and (iii) air mass. The parameters of interest for modeling the photovoltaic generator are: (i) open circuit voltage, (ii) short-circuit current, (iii) maximum generation power, (iv) maximum power voltage, and (v) maximum power current [82].
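The voltage dependence of the load models listed above can be sketched with the usual polynomial form of the ZIP model, P(V) = P0 (z·v² + i·v + p), where v is the voltage in per unit of the nominal value and the three shares sum to one. The function and parameter names below are illustrative assumptions and are not taken from the cited references.

```python
def zip_load_power(p0, v_pu, z_share, i_share, p_share):
    """Active power drawn by a ZIP load at voltage v_pu (per unit).

    z_share, i_share, and p_share are the constant-impedance,
    constant-current, and constant-power fractions of the load."""
    if abs(z_share + i_share + p_share - 1.0) > 1e-9:
        raise ValueError("ZIP shares must sum to 1")
    return p0 * (z_share * v_pu ** 2 + i_share * v_pu + p_share)

# The three pure models are special cases of the ZIP composition.
const_power = zip_load_power(100.0, 0.95, 0.0, 0.0, 1.0)      # unaffected by voltage
const_current = zip_load_power(100.0, 0.95, 0.0, 1.0, 0.0)    # scales with v
const_impedance = zip_load_power(100.0, 0.95, 1.0, 0.0, 0.0)  # scales with v^2
```

The same expression applies to reactive power with its own set of shares.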

System, Model, and Optimization Process
Systems are sets of distinct elements that, when joined together, generate results unattainable by isolated elements [83]. Systems can contain and be contained in other systems, a characteristic known as systemity [84]. The model is the representation of the system and mimics the behavior of the real system [84]. For Eykhoff [85], models are defined as representations of the essential aspects of the system through expressions and are used to: (i) Carry out forecasts, (ii) analyze performance and costs, and (iii) direct projects, among others [83].
The optimization process is the systematic search for optimal or optimized values f(x*) of system parameters. It consists of finding the best solution within the set of viable values of the problem variables, respecting the constraints [86]. The basic elements of the optimization process are illustrated in Figure 1. The optimization methods can be divided into methods: (i) Deterministic, (ii) heuristic, (iii) stochastic, and (iv) inferential [86]. When different optimization methods work together in order to improve performance in finding solutions, hybridization occurs. Thus, the strengths of each type of method are used to obtain better solutions than those obtained by the methods separately [86]. The advantages of using hybrid methods are: (i) Improving the performance of known methods, and (ii) partitioning high computational cost problems, among others [87]. With the model in hand, the simulator can be created, which checks the model's behavior and compares it with the behavior of the real system, making systematic changes to the input parameters [84].

Clustering for Mapping Consumers by Consumption Range
The use of data mining techniques is essential for the analysis of large amounts of information, intending to obtain additional relevant information, not yet observed [88]. The clustering technique is one of the most used in data mining in order to identify hidden patterns in the database, forming similar groups [89]. Thus, the clustering technique aims to partition the input dataset into similar clusters, considering specific pre-established criteria. In the cluster, data have a high degree of similarity to each other compared to other groups [90].
When there are no pre-established classes or examples to guide the types of desirable relationships, the algorithm used is classified as unsupervised, which can generate different groups based on established criteria [91]. Thus, the task of data pre-processing is necessary, which depends on the steps: (i) Assertively select the characteristics that make up the grouping process, including as much relevant information as possible, (ii) perform the selection of the clustering algorithm to be used, which quantifies the similarities, and (iii) build the objective function or rule type [92]. The validation of results verifies assertiveness through the use of appropriate criteria and techniques [92]. The interpretation of the data, in several cases, uses experts to cross the information obtained in the grouping process with other experimental evidence in order to obtain coherent results [92].

Methodology
In this section, the methodology proposed in this work is presented to adapt the voltage of the distribution power grid by inserting photovoltaic distributed generation. For the allocation of distributed generation, spatial model applications are used to analyze and optimize the distribution system in an attempt to reduce the violations of voltage limits, building a tool with layers of solutions, highlighting the role of each step in the process and the correlation between them. For that, the allocation algorithm of distributed generation photovoltaic, optimization method, and the modeling of the elements of the energy distribution grid to be used are presented.

Adequacy of Voltage in the Distribution Power Grid
Distributed Generation Photovoltaic (DGPV) can be inserted to adjust the voltages of the electricity grid, as the injection of active power carried out by the distributed generation units changes the voltage levels of the feeder. However, for the violations of voltage limits to be remedied, it is necessary to know some variables, such as: (i) the geographical location where violations of voltage limits occur, for the allocation of DGPV, (ii) the minimum installed capacity of DGPV at the points of voltage violations needed to remedy the problem, and (iii) the number of DGPV units needed to adjust the voltage levels in the distribution grid. For this, a method is developed to classify the electrical voltages of the feeder as: (i) adequate, (ii) precarious, and (iii) critical, according to the regulations of each country.
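The three-level voltage classification can be sketched as a comparison between the ratio of the measured voltage to the reference voltage and a set of regulatory bands. In the sketch below, the function name and the default band limits are illustrative placeholders and must be replaced by the limits of the applicable regulation. The assumption that any voltage outside the precarious and adequate bands is critical is also a simplification.

```python
def classify_voltage(v_read, v_ref, adequate=(0.93, 1.05), precarious_low=0.90):
    """Classify a node voltage as adequate, precarious, or critical.

    The band limits are placeholders (assumed values), expressed as a
    fraction of the reference voltage v_ref."""
    ratio = v_read / v_ref
    low, high = adequate
    if low <= ratio <= high:
        return "adequate"
    if precarious_low <= ratio < low:
        return "precarious"
    return "critical"

# Hypothetical readings on a 13.8 kV feeder.
levels = [classify_voltage(v, 13.8) for v in (13.8, 12.6, 12.0)]
```

Counting the nodes that fall outside the adequate band, per phase, gives the kind of violation total that an allocation algorithm can then try to reduce.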
Thus, a method of inserting DGPV is proposed, carried out by division into monthly consumption groups and using real data from the public utilities. The proposed methodology uses the following flow: (i) geographical information system, which is the environment for accessing and presenting information from the distribution power grid database and for the spatial analysis of the average consumption of customers in grids, (ii) distribution system simulator, which is responsible for the electrical calculations of the grid in order to identify the voltage violation points, (iii) grouping and classification, which is the application of the Fuzzy C-Means algorithm for clustering the grids into a predefined number of clusters, (iv) DGPV allocation algorithm, which indicates the parameters installed capacity, quantity, and locations, among others, and (v) optimization process, which uses the genetic algorithm to minimize the level of violations of voltage limits in the distribution power grid.

Spatialization of Information, Grouping, and Classification in the Distribution Power Grid
The first step to enable the implementation of the methodology is to perform the spatialization of information on the grid. This process is initiated by selecting the circuit that makes up the region to be studied. In the process of spatialization of information, the circuit under study is divided into squares of 100 m × 100 m, in which, through spatial layer crossings, the consumers geographically present in each grid are identified. A search is carried out in the database for each identified consumer to obtain their classification (residential, commercial, industrial, rural, and others) and the average consumption in the last 12 months.
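The assignment of consumers to 100 m × 100 m squares can be sketched as an integer division of projected coordinates by the cell size. This is a minimal sketch that assumes the consumer coordinates are already in a metric projection (for example, UTM eastings and northings). The function names and the sample points are hypothetical, and a production GIS would perform the spatial layer crossing with its own tools.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 100.0  # side of each square, in meters

def cell_index(x_m, y_m, x0=0.0, y0=0.0, size=CELL_SIZE_M):
    """Return the (column, row) of the square containing point (x_m, y_m),
    counted from the grid origin (x0, y0)."""
    return (math.floor((x_m - x0) / size), math.floor((y_m - y0) / size))

def group_by_cell(consumers):
    """Group (consumer_id, x_m, y_m) records by the square they fall in."""
    cells = defaultdict(list)
    for consumer_id, x_m, y_m in consumers:
        cells[cell_index(x_m, y_m)].append(consumer_id)
    return dict(cells)

# Hypothetical consumers with coordinates in meters.
cells = group_by_cell([("c1", 10.0, 10.0), ("c2", 90.0, 20.0), ("c3", 150.0, 10.0)])
```

Each resulting list of consumer identifiers can then be used to query the database for class and average consumption.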
Subsequently, consumers are subclassified by the average consumption range for the residential, commercial, and industrial classes using the division criteria: (i) Residential from 0 to 50 kWh, (ii) residential from 50 kWh to 200 kWh, (iii) residential from 200 kWh to 400 kWh, (iv) residential above 400 kWh, (v) commercial from 0 to 200 kWh, (vi) commercial above 200 kWh, (vii) industrial below 1000 kWh, (viii) industrial above 1000 kWh, (ix) rural, and (x) others. The result of this process is the creation of the consumption curve by grid, containing information on 10 consumption classes. To avoid possible magnitude variability in the consumption curves, a normalization process is performed using the interval [−1,1].
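The subclassification and normalization steps above can be sketched as follows. Several details are assumptions made only for illustration: the class labels, the handling of boundary values (a value equal to an upper limit stays in the lower class), the use of total consumption per class as the curve value, and min-max scaling as the normalization to [−1, 1].

```python
CLASSES = [
    "res_0_50", "res_50_200", "res_200_400", "res_400_plus",
    "com_0_200", "com_200_plus", "ind_0_1000", "ind_1000_plus",
    "rural", "others",
]

def consumption_class(category, kwh):
    """Map a consumer category and average monthly kWh to one of 10 classes."""
    if category == "residential":
        if kwh <= 50:
            return "res_0_50"
        if kwh <= 200:
            return "res_50_200"
        if kwh <= 400:
            return "res_200_400"
        return "res_400_plus"
    if category == "commercial":
        return "com_0_200" if kwh <= 200 else "com_200_plus"
    if category == "industrial":
        return "ind_0_1000" if kwh <= 1000 else "ind_1000_plus"
    if category == "rural":
        return "rural"
    return "others"

def grid_curve(consumers):
    """Build the 10-value curve of one square from (category, kwh) pairs,
    using the total consumption per class (an assumed definition)."""
    totals = dict.fromkeys(CLASSES, 0.0)
    for category, kwh in consumers:
        totals[consumption_class(category, kwh)] += kwh
    return [totals[name] for name in CLASSES]

def normalize(curve):
    """Min-max scale a curve to the interval [-1, 1]."""
    low, high = min(curve), max(curve)
    if high == low:
        return [0.0] * len(curve)  # flat curve: no spread to scale
    return [2.0 * (value - low) / (high - low) - 1.0 for value in curve]

curve = grid_curve([("residential", 120.0), ("rural", 80.0)])
```

Scaling every curve to the same interval keeps squares with very different total consumption comparable during grouping.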
There are no pre-established classes or even examples that guide the types of desirable relationships for the grouping problem. Thus, the algorithm used in this process is classified as unsupervised, generating different groups based on the established criteria. After the consumption curves of the grids of the feeder selected for study are created, the grid grouping and classification process is carried out using the Fuzzy C-Means technique. For this, it is necessary to inform the number of groups to be generated, the value of the stop criterion (ε), which measures the difference between the membership degrees of the current iteration and those of the previous iteration, and the fuzzification index, which controls how sharply the membership degree of a curve decays with its distance from the center of each group.
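The role of the three inputs described above (number of groups, stop criterion ε, and fuzzification index) can be seen in a minimal sketch of the standard Fuzzy C-Means iteration. This is a plain textbook version written with the Python standard library only, not the implementation used in this work. The function name, the default values, and the sample data are assumptions.

```python
import math
import random

def fuzzy_c_means(data, n_clusters, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Standard Fuzzy C-Means. Returns (centers, memberships, labels)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    for _ in range(max_iter):
        # Centers: mean of the data weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            weight_sum = sum(weights)
            centers.append([
                sum(w * data[i][d] for i, w in enumerate(weights)) / weight_sum
                for d in range(dim)
            ])
        # Memberships: inverse relative distance to every center.
        new_u = []
        for i in range(n):
            dists = [max(math.dist(data[i], c), 1e-12) for c in centers]
            new_u.append([
                1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                for dj in dists
            ])
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(n_clusters))
        u = new_u
        if change < eps:  # stop criterion on the membership difference
            break
    # Each curve is assigned to the group with the highest membership.
    labels = [row.index(max(row)) for row in u]
    return centers, u, labels

# Hypothetical normalized curves with two obvious groups.
data = [[-1.0, -1.0], [-0.9, -1.0], [1.0, 1.0], [0.9, 1.0]]
centers, u, labels = fuzzy_c_means(data, n_clusters=2)
```

Larger values of m produce softer memberships, and a smaller ε makes the iteration run longer before stopping.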
In addition to considering the 10 consumption ranges for each curve, the classification can be carried out considering specific consumption ranges. Once the grouping process is finished, each square is classified in the same group where its consumption curve was closer to the centroid curve. Each group formed is assigned a color, which is replicated to all the squares of the same group. Thus, after the grouping process, squares that receive the same color belong to the same class and have a similar average consumption profile. In this way, it is possible to choose the region and type of consumer to install the DGPV.

Loading, Data Export, and Simulation of the Distribution Power Grid
To start the analysis process in a given circuit it is necessary to load the data from this circuit from the Distributor's Databases (DB) to the Database (DaB) to be used in the proposed tool. The load action provides a specific location for selecting one or more circuits for study. Once the circuit data has been loaded, it is possible to carry out the simulation. The simulator used is the Open Distribution System Simulator (OpenDSS), software under open source license, which uses command lines to build the model of the electrical grid [93].
In the simulation, power flow calculations are performed in order to identify the existence of grid segments with voltage level violations and present the results in the form of a vectorized circuit drawing on the map, highlighting the segments with violations of voltage limits and generating files with *.dss format instructions for the grid elements, following specific models, such as: (i) Feeder, (ii) Medium Voltage segment (MV) and Low Voltage segment (LV), (iii) description of conductors, (iv) branch, (v) transformer, (vi) load curve, (vii) LV and MV consumers, (viii) voltage regulator, (ix) capacitor, and (x) coordinates of the busbars. All elements of the distribution power grid are grouped in a single file for simultaneous execution.
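As a hedged illustration of how such *.dss instructions might be generated, the sketch below builds the text of an OpenDSS circuit definition and a short master script. It relies only on the `Clear`, `New Circuit`, `Redirect`, and `Solve` commands and on the `basekv`, `bus1`, `pu`, `phases`, `R1`, and `X1` properties of the circuit source. The Python function names, element names, file names, and numeric values are hypothetical, and splitting the elements into separate redirected files is an assumption rather than the layout used in this work.

```python
def circuit_command(name, base_kv, bus, pu, phases, r1, x1):
    """Return the OpenDSS command that defines the feeder source."""
    return (
        f"New Circuit.{name} basekv={base_kv} bus1={bus} pu={pu} "
        f"phases={phases} R1={r1} X1={x1}"
    )

def master_script(circuit_line, element_files):
    """Assemble a master script: clear, define the circuit, load the
    element files, and solve the power flow."""
    lines = ["Clear", circuit_line]
    lines += [f"Redirect {path}" for path in element_files]
    lines.append("Solve")
    return "\n".join(lines)

# Hypothetical feeder and file names.
script = master_script(
    circuit_command("feeder_demo", 13.8, "source_bus", 1.0, 3, 0.23, 0.92),
    ["lines.dss", "transformers.dss", "loads.dss"],
)
```

The resulting text can be written to a file and compiled by the simulator.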
The feeder output bus uses the parameters: (i) nominal line base voltage [kV], (ii) connection bus of the element, (iii) bus voltage [pu], (iv) number of phases, and (v) positive-sequence resistance and reactance of the source. The conductor description contains the cable impedances, and the parameters that must be configured in the simulator are:

Optimization Process
The optimization process aims to minimize violations of voltage limits in the Consumer Units (CU) connected to the buses of the distribution grid by allocating DGPV in geographic locations pos, with installed capacity pot and minimum quantity qtd of DGPV. Two optimization processes are carried out for the allocation of DGPV and compared with each other. The first considers all buses in the feeder as candidates for the allocation of DGPV, and the second considers only the buses contained in the grids of a specific group of consumers. The evaluation function used in the optimization process is given by:
In (1), the sums are the total values of the violations of voltage limits by phase, in which trgA(x) are the violations of voltage limits in phase A, trgB(x) are the violations of voltage limits in phase B, and trgC(x) are the violations of voltage limits in phase C, x is the vector with the parameters to be optimized, n_a is the number of nodes in phase A, n_b is the number of nodes in phase B, and n_c is the number of nodes in phase C. In this way, the evaluation function in (1) can be rewritten as [94]:
The identification of the bar where the DGPV will be connected is represented by its geographic location pos. The optimization problem has constraints of: (i) voltage, (ii) current, (iii) geographic location of DGPV, (iv) installed capacity of DGPV, and (v) quantity of DGPV. These constraints are given by:
where V_min is the minimum voltage limit and V_max is the maximum voltage limit [4], I_max,cond is the maximum current supported by the conductor, pot_max is the maximum allowable installed capacity for the DGPV, and qtd_max is the maximum amount of DGPV that can be inserted. For the DGPV installed capacity, the maximum value is defined in order to avoid the allocation of distributed generation units with large physical dimensions. The proposed optimization process is illustrated in Figure 2.
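Based only on the symbol definitions in this section, the evaluation function (1) and the constraints plausibly take the form below. This is a hedged reconstruction and not the authors' typeset equations. The summation indices, the node voltage symbol V_i, the branch current symbol I_l, the candidate set B of bars, and the lower bounds on pot and qtd are assumptions introduced for illustration.

```latex
% Assumed form of the evaluation function (1): total violations per phase.
\begin{equation}
f(x) = \sum_{i=1}^{n_a} \mathrm{trgA}_i(x)
     + \sum_{j=1}^{n_b} \mathrm{trgB}_j(x)
     + \sum_{k=1}^{n_c} \mathrm{trgC}_k(x)
\end{equation}

% Assumed form of the five constraints (voltage, current, location,
% installed capacity, quantity); lower bounds are illustrative.
\begin{align}
V_{min} \le V_i \le V_{max}, \qquad
I_l \le I_{max}^{cond}, \qquad
\mathrm{pos} \in \mathcal{B}, \qquad
0 < \mathrm{pot} \le \mathrm{pot}_{max}, \qquad
1 \le \mathrm{qtd} \le \mathrm{qtd}_{max}
\end{align}
```

Under this reading, the optimization searches for the vector x that minimizes f(x) subject to the listed bounds.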

Algorithms for Inserting Distributed Photovoltaic Generation in Distribution Power Grid
The DGPV allocation algorithm is used to optimize the variables location for allocation pos, installed capacity pot, and quantity of DGPV qtd. The maximum amount qtd_max of DGPV that can be inserted into the grid is equal to the total amount of bars in the system, qtd_bars. Thus, to define the vector x it is necessary to define pos and pot, given by:
where n = 2 × qtd. Since for each inserted DG unit there is a location and an installed capacity, the vector x is given by:
In the proposed methodology, different technologies and stages of analysis, treatment, and integration of data are involved in order to obtain optimized scenarios according to the proposed objectives, using the georeferenced database of the distribution power grid. Figure 3 illustrates the steps involved in this process, which are divided into groups based on the arrangement of roles within the methodology flow: (i) geographic information system: environment for accessing and presenting information on the distribution grid and for the spatial analysis of the average consumption of customers in grids, and (ii) distribution system simulator: responsible for the electrical calculations of the grid in order to identify the voltage violation points. With the roles defined within the steps of the methodology, we have: (i) grouping and classification: application of the Fuzzy C-Means algorithm for clustering the grids into a predefined number of clusters, (ii) photovoltaic distributed generation (DGPV) allocation algorithm: definition of the DGPV allocation parameters (installed capacity, quantity, location, and others) for the optimization process, and (iii) optimization method: genetic algorithm intended to minimize violations of voltage limits.
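The encoding of the decision vector and its use by a genetic algorithm can be sketched as follows. The sketch assumes that x concatenates the qtd locations followed by the qtd installed capacities, so that its length is n = 2 × qtd. The evaluation function `toy_violations` is a hypothetical stand-in for the power flow simulation, the quantity qtd is held fixed for simplicity, and the operators (elitist selection, one-point crossover, single-gene mutation) are generic choices rather than the configuration used in this work.

```python
import random

def encode(pos, pot):
    """Concatenate locations and capacities into x, with len(x) = 2 * qtd."""
    if len(pos) != len(pot):
        raise ValueError("each DGPV unit needs one location and one capacity")
    return list(pos) + list(pot)

def decode(x):
    """Split x back into (pos, pot)."""
    qtd = len(x) // 2
    return x[:qtd], x[qtd:]

def toy_violations(x, target_bar=7, target_kw=30.0):
    """Hypothetical evaluation: zero when some unit of target_kw sits on
    target_bar. A real evaluation would run the power flow instead."""
    pos, pot = decode(x)
    return min(abs(p - target_bar) + abs(c - target_kw) / 10.0
               for p, c in zip(pos, pot))

def genetic_search(fitness, qtd, n_bars, pot_max,
                   pop_size=30, generations=60, seed=1):
    """Minimize fitness over bar indices in [0, n_bars) and capacities
    in [0, pot_max] with a simple elitist genetic algorithm."""
    rng = random.Random(seed)

    def random_individual():
        return encode([rng.randrange(n_bars) for _ in range(qtd)],
                      [rng.uniform(0.0, pot_max) for _ in range(qtd)])

    def crossover(a, b):
        cut = rng.randrange(1, len(a))  # one-point crossover
        return a[:cut] + b[cut:]

    def mutate(x):
        child = list(x)
        gene = rng.randrange(len(child))
        if gene < qtd:  # location gene: pick another bar
            child[gene] = rng.randrange(n_bars)
        else:           # capacity gene: perturb and clip to the bounds
            step = rng.gauss(0.0, 0.1 * pot_max)
            child[gene] = min(pot_max, max(0.0, child[gene] + step))
        return child

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[: pop_size // 2]  # best half survives unchanged
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return min(population, key=fitness)

best = genetic_search(toy_violations, qtd=2, n_bars=20, pot_max=75.0)
best_pos, best_pot = decode(best)
```

Restricting the candidate bar indices to those inside the squares of one consumer group would correspond to the second optimization process described in the methodology.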

Results
This section presents the results obtained from the application of the proposed methodology. The application is performed through a case study that includes: (i) Simulation of the feeders, (ii) voltage classification, (iii) clustering of the consumption profiles, (iv) analysis of the geographic location of the Distributed Generation Photovoltaic (DGPV), (v) validation of the proposed methodology, and (vi) comparison between the results obtained considering the allocation of DGPV freely in the entire feeder or only in clusters of a certain consumption profile.

Data for the Case Study
The data of the distribution power grid used in this work are real, coming from the databases of the concessionaire Enel Distribuição Goias/Brazil (EDG). EDG is part of the Enel Group, which is currently the largest electricity distribution company in Brazil, operating in the states of Sao Paulo, Ceara, Rio de Janeiro, and Goias, totaling more than 17 million consumers served. Brazilian standards define steady-state voltage limits as: (i) adequate, (ii) precarious, and (iii) critical. The values of these limits consider the relationship between the reading voltage V r and the reference voltage V R or nominal voltage V N. Tables 2 and 3 show how each service voltage profile V S is defined, according to the voltage range of the feeder [4].

Table 2. Nominal voltage limits greater than 1 kV and less than 69 kV.
For precarious and critical voltages, several actions can be taken to improve the performance of the feeder, such as (i) adjusting the transformer winding derivation (TAP), (ii) closing the LV loop circuit, (iii) complementing the phases, (iv) replacing transformers with others of a higher power, (v) dismembering the circuit, (vi) installing voltage regulators, (vii) installing a capacitor bank, (viii) installing a distributed generation, and (ix) various others [95,96]. These and other actions are analyzed by EDG, which serves 237 municipalities in Goias, totaling approximately 3.1 million consumers, through more than 32,000 km of LV distribution power grid and more than 178,000 km of MV distribution systems in the State of Goias [96]. The EDG contains several Databases (DB) and to carry out the simulations, different DB are used, both conventional DB and Spatial DB (SDB).
To make the simulation process feasible, it is necessary to obtain specific information from the source databases (conventional and spatial) and load this information into the Database (DaB) of the proposed computational tool. This load is performed through the Extract, Transform, Load (ETL) interface. For spatial data, it is necessary to use the Geographic Information System (GIS) to manipulate the information, given the non-conventional data types (point, line, and polygon). The database of spatial assets is mainly used to model the distribution power grid, as are the databases of the commercial system and of equipment management. Figure 4 shows the DB of the electricity distributor and the flow of these data to the DaB, which occurs using the proposed computational tool. In this way, the DaB stores updated information on the distribution power grid at each new simulation of a given circuit, thus considering the real dynamics of changes in the distribution system. Therefore, if the database is up to date, the simulation represents the real updated system.

The load curve of each type of consumer and the circuit impedances are considered as system operating conditions. The system operates with a constant load, constant generation, and invariant topology. The reference voltage for the MV circuit is 13.8 kV and for the LV circuit it is 380 V phase-phase. The power flow calculation used the Newton-Raphson iterative method [97]. Feeders from the Goiania Leste substation, which has a voltage level of 230/13.8 kV and an installed capacity of 136 MVA, were simulated.
A total of 17 feeders from the Goiania Leste substation are simulated. Table 4 displays the characteristics of each feeder, with the maximum and minimum values for each characteristic highlighted. In Table 4 it is possible to observe that Feeder 14 has the smallest number of branches and Consumer Units (CU) in LV and Feeder 20 has the largest number of branches and CU in LV. Figure 5a,b show the distribution lines of Feeder 14 and Feeder 20, respectively, in the coordinates x, y, in which the thicker the lines, the greater the power flow.

Table 4. Characteristics of the Goiania Leste substation feeders.

Model Validation and Simulation
Quarterly, ANEEL requests information from Brazilian energy concessionaires on the voltage levels of the CU. For this, the concessionaires carry out measurements at the transformation stations that serve the indicated CU, a process known as the Sampling Campaign. Voltage is measured at each pole for seven days, totaling 168 h with 1008 valid readings. As EDG has more than three million CU, information is collected from 330 poles, with the right to purge 10% of these records. After that, the 300 samples/quarter are sent to ANEEL.

In this work, to validate the model of the distribution power grid, data collected by EDG in a sampling campaign were used. The validation compared the simulated data with data measured in the real feeder, at a given transformation station, for 24 h [95,96]. Figure 6 shows the measured current and voltage data in one phase and Figure 7 illustrates the simulated current and voltage data for the three phases. Tables 5 and 6 provide the comparison between measured and simulated current and voltage data, respectively. Although the deviation between the measured and simulated current values is high, for both the maximum and the minimum value, the measured data were obtained for only one phase and the simulated data for the three phases. Moreover, the maximum value of the simulated current does not exceed the maximum value of the measured current, implying a model that does not generate false overcurrent alarms [95]. Deviations between measured and simulated voltage data were less than 1%, validating the grid model for the object of study of this work, voltage adequacy.

Among all the simulated feeders, Feeder 14 was chosen for the voltage analysis and presentation of results, as it has the lowest number of CU in the LV. In the simulation, the instantaneous snapshot solution mode was used, which provides the boundary conditions of the system at a specific point in time.
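The two numerical checks described in the validation can be sketched in Python. The 10 min reading interval is an inference from 1008 readings in 168 h (six per hour), and the voltage values in the usage note are hypothetical; neither is stated in the measurement data.

```python
def readings_per_campaign(days: int = 7, interval_min: int = 10) -> int:
    """Number of readings in a campaign at a fixed sampling interval."""
    return days * 24 * 60 // interval_min


def percent_deviation(measured: float, simulated: float) -> float:
    """Absolute deviation of the simulated value relative to the measurement."""
    return abs(simulated - measured) / abs(measured) * 100.0
```

With these helpers, a hypothetical measured voltage of 220.0 V against a simulated 218.9 V gives a deviation of about 0.5%, which would satisfy the 1% criterion used to accept the grid model.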
Figure 8a presents the voltage classification performed under the boundary conditions, in which each point corresponds to a pole with several nodes. When a pole is classified as having a violation of voltage limits, it has at least one node with inadequate voltage, either precarious or critical (see Tables 2 and 3). In this way, even a pole with several nodes of which only one violates the voltage limits is classified as a pole with inadequate voltage. Therefore, Figure 8 indicates the geographic location of the violations of voltage limits.
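The pole-level rule described above can be sketched as follows. Labeling the pole with the worst class among its nodes is an assumption that goes slightly beyond the text, which only states that one inadequate node is enough to mark the pole; the class names and function names are illustrative.

```python
from typing import Sequence

# Assumed ordering of classes from best to worst.
SEVERITY = {"adequate": 0, "precarious": 1, "critical": 2}


def classify_pole(node_classes: Sequence[str]) -> str:
    """Return the worst voltage class found among the nodes of a pole."""
    if not node_classes:
        raise ValueError("a pole must have at least one node")
    return max(node_classes, key=lambda c: SEVERITY[c])


def has_violation(node_classes: Sequence[str]) -> bool:
    """A pole violates the limits if any node is precarious or critical."""
    return classify_pole(node_classes) != "adequate"
```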
Briefly, Table 7 provides the voltage classification at each node of the Feeder 14, in which: (i) CV is the number of nodes with critical voltage, (ii) PV is the number of nodes with precarious voltage, and (iii) AV is the number of nodes with adequate voltage. Figure 8b shows the voltage profile of the feeder, in which: (i) The phases are in red, black, and blue colors, (ii) the continuous lines show the MV, (iii) the dotted lines show the LV, and (iv) the green horizontal lines show the lower limit (0.92 pu) and the upper limit (1.05 pu) of voltage, as required by [4]. The electrical losses in the feeder are 13.08%, and the adequate voltage level is 56.79%.

Clustering for Mapping Consumers by Consumption Range
In the clustering process, consumption curves are generated for squares with dimensions of 100 × 100 m. To cover all of Feeder 14, 73 geographically distributed grid cells were needed. For each grid cell, the average consumption curve for the last 12 months is created and divided by consumption profile. The Fuzzy C-Means algorithm was used with a fuzzification index of 1.25, a stopping criterion less than or equal to 0.01, and the number of groups to be generated equal to 3. Figure 9 presents the curve of average consumption in kWh for the centroids of the three groups formed.

Due to the urban characteristic of Feeder 14, groups with predominantly residential and commercial consumption were generated. Group 1 has squares with a predominance of commercial consumption above 200 kWh, and industrial consumption below 1000 kWh with low representation. Group 2 has squares with predominantly residential consumption (approximately 90%), with greater representation in the profiles above 400 kWh and from 50 kWh to 200 kWh, and commercial consumption (approximately 10%), with greater representation in the profile above 200 kWh. Finally, Group 3 has squares with predominantly residential consumption (approximately 65%), with a similar distribution among the profiles from 50 kWh to 200 kWh, from 200 kWh to 400 kWh, and above 400 kWh, but with a greater share of commercial consumption (35%) when compared to Group 2.
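For readers unfamiliar with Fuzzy C-Means, the following is a minimal, generic Python sketch of the standard algorithm using the parameters quoted above (fuzzification index 1.25, tolerance 0.01, three groups). It is not the implementation used in this work: seeding the centroids with the first c curves and interpreting the stopping criterion as the largest change in any membership value are both assumptions.

```python
import math
from typing import List, Sequence, Tuple


def _dist(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _memberships(data, centroids, m: float) -> List[List[float]]:
    """Standard FCM membership update for every curve."""
    power = 2.0 / (m - 1.0)
    u = []
    for point in data:
        d = [_dist(point, c) for c in centroids]
        zeros = [j for j, dj in enumerate(d) if dj == 0.0]
        if zeros:  # curve coincides with one or more centroids
            row = [1.0 / len(zeros) if j in zeros else 0.0
                   for j in range(len(d))]
        else:
            row = [1.0 / sum((dj / dk) ** power for dk in d) for dj in d]
        u.append(row)
    return u


def fuzzy_c_means(data: List[List[float]], c: int = 3, m: float = 1.25,
                  tol: float = 0.01, max_iter: int = 100
                  ) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (centroids, memberships) for the given curves."""
    centroids = [list(p) for p in data[:c]]  # assumed deterministic seeding
    u = _memberships(data, centroids, m)
    dims = len(data[0])
    for _ in range(max_iter):
        centroids = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            centroids.append([
                sum(wi * p[k] for wi, p in zip(w, data)) / total
                for k in range(dims)
            ])
        new_u = _memberships(data, centroids, m)
        change = max(abs(a - b)
                     for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if change <= tol:  # assumed reading of the 0.01 stopping criterion
            break
    return centroids, u
```

Each row of the returned membership matrix sums to one, so a grid cell can be assigned to the group with the largest membership. A fuzzification index as low as 1.25 makes the memberships close to a hard partition.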
Once the clustering process is completed, each square is assigned to the group whose centroid curve has the smallest Euclidean distance to the average consumption curve of the square. A color is then applied to the squares of each group, so that squares of the same color belong to the same group (Figure 10).
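The assignment rule by smallest Euclidean distance can be sketched in a few lines of Python. The function name and the 0-based group index are illustrative choices, and the curves in the usage note are hypothetical.

```python
import math
from typing import List, Sequence


def nearest_group(curve: Sequence[float],
                  centroids: List[Sequence[float]]) -> int:
    """Return the 0-based index of the centroid closest to the curve."""
    distances = [math.dist(curve, c) for c in centroids]
    return distances.index(min(distances))
```

For instance, with hypothetical centroid curves `[[300, 320, 310], [120, 130, 125], [180, 200, 190]]`, a square whose average curve is `[125, 128, 130]` is assigned to index 1, the second group.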

Validation of the Allocation of Distributed Generation Photovoltaic and Use of the Optimization Process
The validation of the proposed Distributed Generation Photovoltaic (DGPV) allocation algorithm is divided into two parts: (i) Optimization of the location of the generation pos and the installed capacity pot, in which the user informs the number of DGPV to be allocated into the feeder and limit of installed capacity per connection point and the algorithm determines the allocation locations and supply power of the DGPV and (ii) optimization of the amount of DGPV qtd, pos, and pot, in which the user only informs the limit installed capacity per connection point and the algorithm determines the quantity, allocation locations, and installed capacity of the DGPV. To validate the DGPV allocation algorithm, Feeder 14 was used, connecting and disconnecting loads and changing the power of some Consumer Units (CU).
In the process of optimizing the variables pos and pot, four scenarios are used with the presence of violations of voltage limits in different regions of the distribution power grid, selected as: (i) Scenario 1: violations of voltage limits in only one region and DGPV allocation algorithm used for the allocation of one generation source, (ii) Scenario 2: violations of voltage limits in two regions and DGPV allocation algorithm used for the allocation of two generation sources, (iii) Scenario 3: violations of voltage limits in three regions and DGPV allocation algorithm used for the allocation of three generation sources, and (iv) Scenario 4: violations of voltage limits in three regions and DGPV allocation algorithm used for the allocation of a single generation source.
The optimization method used was the genetic algorithm with the following configuration: (i) tournament selection method, (ii) adaptive mutation operator, (iii) heuristic crossover operator, (iv) stopping criterion of a maximum number of generations g max = 50 or evaluation function f(x) = 0, (v) population of 50 individuals, except for Scenario 3, which uses a population of 200 individuals because it needs to optimize more variables than the other scenarios, and (vi) evaluation function f(x) based on the violations of voltage limits. In these evaluations, the installed capacity of the DGPV is limited to pot max = 1000 kVA.

Figure 11a shows the voltage classification of Scenario 1 before the DGPV allocation, indicating violations of voltage limits in a single region, and Figure 11b shows the location of the DGPV allocation in the distribution power grid after applying the optimization process. Figure 12a shows the voltage profile of the distribution power grid before the allocation of the DGPV and the optimization process, indicating the violations of voltage limits at the end of the grid. Figure 12b presents the voltage profile after the optimization process, indicating the adequacy of the grid regarding the voltage level. The initial evaluation function was f(x) = 10.9508 and the final evaluation function was f(x*) = 0, obtained after three generations. After optimization, the electrical losses of the feeder were reduced from 5.57% to 4.70%, a reduction of 0.87 percentage points.

In Scenario 2, violations of voltage limits occur in two regions before the DGPV allocation, as shown in Figure 13a. The voltage profile of the distribution power grid in this scenario is shown in Figure 14a, in which it is observed that before the allocation of the DGPV, there is a violation of voltage limits. The purpose of analyzing Scenario 2 is to optimize the allocation of two DGPV in an attempt to adjust the voltage. The optimization starts with f(x) = 42.7708 and ends after six generations with f(x*) = 0, indicating no violation of voltage limits.
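Two elements of this configuration, tournament selection and the stopping criterion, can be sketched in Python. The tournament size k is not given in the text and is an assumption, as are the function names; the sketch treats lower values of the evaluation function as better, consistent with minimizing violations.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def tournament_select(population: Sequence[T], fitness: Sequence[float],
                      k: int = 2, rng: random.Random = random.Random()) -> T:
    """Draw k distinct individuals and return the one with the lowest f(x)."""
    contenders: List[int] = rng.sample(range(len(population)), k)
    best = min(contenders, key=lambda i: fitness[i])
    return population[best]


def should_stop(generation: int, best_fitness: float, g_max: int = 50) -> bool:
    """Stop at g_max generations or when no violation remains (f(x) = 0)."""
    return generation >= g_max or best_fitness == 0
```

With this criterion, a run such as Scenario 1, which reaches f(x*) = 0 in three generations, ends long before the limit of 50 generations.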
Figure 13b presents the voltage classification in the distribution power grid with the locations of the DGPV after optimization, and Figure 14b presents the voltage profile after optimization. For Scenario 2, the optimization resulted in the geographic locations at bars pos 1 = 163 and pos 2 = 366, and the installed capacities pot 1 = 423.84 kVA and pot 2 = 302.77 kVA for DGPV 1 and DGPV 2, respectively. The active power injected into the feeder was pot total = 726.61 kVA. The electrical loss was 6.49% before DGPV allocation and 7.20% after allocation, an increase of 0.71 percentage points caused by the high power injected into the grid.
In Scenario 3, before the DGPV allocations, violations of voltage limits occur in three regions, as shown in Figure 15a, with the voltage profile of the distribution power grid shown in Figure 16a. In this scenario, the intention is to optimize the allocation of three DGPV and obtain a distribution system without violations of voltage limits. In the optimization process, the initial evaluation function was f(x) = 50.9316 and the final one was f(x*) = 0, with the entire process completed in seven generations. Figure 15b presents the voltage classification and the place of allocation of the DGPV in the feeder, and Figure 16b presents the voltage profile, both after the allocation of the DGPV. For Scenario 3, the heuristic optimization process (genetic algorithm) resulted in: (i) locations at bars pos 1 = 119, pos 2 = 114, and pos 3 = 174 and (ii) installed capacities pot 1 = 221.2 kVA, pot 2 = 221.5 kVA, and pot 3 = 100.7 kVA, respectively for DGPV 1, DGPV 2, and DGPV 3. The active power injected into the grid was pot total = 543.40 kVA, and the electrical losses were 6.59% before the allocation of the DGPV and 4.18% after optimization, a reduction of 2.41 percentage points.

Scenario 4 uses the same distribution power grid configuration as Scenario 3, shown in Figure 15a, and the same voltage profile, shown in Figure 16a. The purpose of the Scenario 4 analysis is to optimize the allocation of only one DGPV, observing the location and the installed capacity necessary for the distribution power grid to have no violations of voltage limits. In the optimization, initially f(x) = 50.9316 and at the end f(x*) = 0, after five generations. The geographic location at bar pos = 183 and the installed capacity pot = 821.60 kVA were obtained.
The installed capacity value is relatively high since a single DGPV must adapt the voltages of the entire grid, while in Scenario 3, three DGPV were inserted with pot total = 543.40 kVA. Thus, the DGPV locations of Scenario 3 required 278.20 kVA less total installed capacity than Scenario 4. The voltage classification with the location of the DGPV allocation is shown in Figure 17a, and the voltage profile is shown in Figure 17b, both after optimization. The electrical losses of the distribution power grid were 6.59% before the allocation of the DGPV and 6.82% after it, an increase of 0.23 percentage points.

Scenarios 1 to 4 were simulated and optimized using the parameters pos and pot, with the amount of DGPV qtd fixed. In order to analyze the optimization process further, the variable qtd becomes part of the variables to be optimized. Therefore, in the new case study, the following variables are optimized: (i) pos, (ii) pot, and (iii) qtd. The new case study is composed of two new scenarios: (i) Scenario 5, which uses the Scenario 1 distribution power grid and (ii) Scenario 6, which uses the Scenario 3 distribution system.
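The totals and differences quoted for Scenarios 2 to 4 follow from simple arithmetic, which the short Python sketch below reproduces from the reported values. The helper names are illustrative.

```python
from typing import Sequence


def total_capacity(units_kva: Sequence[float]) -> float:
    """Sum of the installed capacities of all DGPV units, in kVA."""
    return round(sum(units_kva), 2)


def change(before: float, after: float) -> float:
    """Signed difference after - before, rounded to two decimals."""
    return round(after - before, 2)
```

Applied to the reported figures, `total_capacity([423.84, 302.77])` reproduces the 726.61 kVA of Scenario 2, `total_capacity([221.2, 221.5, 100.7])` the 543.40 kVA of Scenario 3, and `change(821.60, 543.40)` the 278.20 kVA saved by using three units instead of one. Loss variations are differences between percentages, so they are expressed in percentage points.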
In Scenario 5, the same distribution system as Scenario 1 was used, presented in Figures 11a and 12a. In this feeder, even when optimizing the qtd variable, only one DGPV was needed to adjust the voltage across the entire distribution power grid (it is possible to configure the maximum admissible voltage level for the DGPV in the algorithm). The inserted DGPV is located at bar pos = 26 and has an installed capacity pot = 245.93 kVA. The optimization process ended after three generations, and the electrical losses were reduced from 5.57% before the optimization to 4.82% after the allocation of the DGPV, a reduction of 0.75 percentage points. Comparing Scenario 1 with Scenario 5, the location of the DGPV changed but the two locations remain close to each other: pos = 35 and pos = 26, respectively. Regarding the generated power, Scenario 1 obtained pot = 254.42 kVA and Scenario 5 obtained pot = 245.93 kVA. The results are similar, and the small difference between the values of the variables pos and pot is related to the heuristic optimization method.
In Scenario 6, the same distribution system as Scenario 3 was used, presented in Figures 15a and 16a. It took 27 generations to obtain f(x*) = 0, and the optimization process obtained qtd = 1, with geographic location at bar pos = 159 and installed capacity pot = 675 kVA. The entire distribution power grid had adequate voltage, and the electrical losses of the grid were reduced from 6.59% before allocation to 5.06% after allocation of the DGPV, a reduction of 1.53 percentage points. Figure 18a presents the new voltage classification after optimization, indicating the allocation location of the DGPV at the end of the distribution power grid, and Figure 18b presents the voltage profile after optimization.

Table 8 displays the results obtained in the processes of inserting DGPV and optimizing the parameters pos, pot, and qtd for all analyzed scenarios. In Table 8, the last item, (11), indicates whether the voltage was adequate in the entire electrical system after applying the methodology (yes/no). Analyzing the data in Table 8, it is observed that in all scenarios the voltages of the electrical grid were adequate after the allocation of the DGPV with optimized parameters.

Allocation and Analysis of Distributed Generation Photovoltaic in Different Geographic Locations
Smith, Dugan, and Sunderman [19] state that different allocation placements of the DGPV impact the grid voltage adequacy in different ways, even for identical total installed capacities. To analyze this statement, tests were carried out changing the geographic location of the DGPV allocation in Feeder 14, shown in Figure 8a,b, with the following characteristics: (i) Analysis 1: the impact of allocating a single DGPV at the beginning of the grid, close to the substation, with 1000 kVA of installed capacity, (ii) Analysis 2: the impact of allocating a single DGPV at the end of the distribution power grid, with 1000 kVA of installed capacity, and (iii) Analysis 3: the impact of inserting ten DGPV with 100 kVA of installed capacity each, at random points in the grid, totaling 1000 kVA of installed capacity.

The results are presented in Figures 19-21, and Table 9 provides the voltage classification for all nodes of Feeder 14 for the three analyses performed, in which the percentages of critical, precarious, and adequate voltage are highlighted in bold. Each analysis resulted in different voltage classifications and different voltage profiles for the same installed capacity in the grid. The electrical losses were 13.57% for Analysis 1, 14.30% for Analysis 2, and 38.14% for Analysis 3. The highest percentage of nodes with adequate voltage was 67.01%, obtained in Analysis 2. In this way, the same total installed capacity allocated at different geographic locations causes different impacts on the grid.

Optimization Process Applied in the Allocation of DGPV in the Real Feeder
The analyses carried out previously used the adapted real electrical distribution power grid. The next analyses are carried out using the allocation and optimization algorithm in the complete real distribution system, as it is in the utility's databases. To carry out this methodology, the DGPV allocation algorithm will be used in two ways: (i) Applied freely, using all feeder bars as possible DGPV allocation points and (ii) applied in a restricted way using the result of clustering, inserting the DGPV only in the bars contained in a predefined group between Group 1, Group 2, and Group 3. In this methodology, Feeder 14 presented in Figure 8a,b was used.
Several simulation runs were carried out with g max = 1000; however, due to the high number of feeder bars, problems occurred that stopped the process before the desired end. Due to limitations of the machine used, the optimizations returned out-of-memory errors from g ≥ 500, which occurred randomly. The machine used in this work has a 64-bit Windows 10 Enterprise operating system, a 1.9-GHz Intel Core i5-8365U processor, and 16 GB of DDR4 2400 MHz RAM, on which the average optimization time was ≈35 h for g max = 500. Testing was also carried out on a Virtual Machine with a 64-bit Windows Server 2021 R2 operating system, 8 Intel Xeon E5-2650 processors of 2 GHz, and 32 GB of DDR3 1600 MHz RAM, on which it was possible to verify that the optimization process with g max = 1000 would result in an improvement of only 1.27% in f(x), with a duration of ≈96 h. Thus, the limit of g max = 500 is defined for all case studies in this work.
To apply the DGPV allocation and optimization algorithms in the real grid, the variables to be optimized are pos and pot, with qtd = 10 fixed and pot max = 100 kVA per allocation point. The optimization method used was the genetic algorithm with some differences from the one used in Section 4.4, namely: (i) stochastic and uniform selection methods, (ii) fitness measure by ranking, (iii) stopping criterion of a maximum number of generations g max = 500 or evaluation function f(x*) = 0, and (iv) population of 200 individuals. The evaluation function used to measure the fitness of each possible solution is given by (2).

Figure 22 shows the average fitness over the 500 generations. It is observed that at generation g = 170, a change was made from the stochastic selection method to the uniform selection method. This was necessary to increase genetic diversity, since from g = 150 there is stagnation and loss of diversity, leaving the optimization process at a possible local optimum. The initial evaluation function is f(x) = 1046.20 and the final evaluation function is f(x*) = 483.58, obtained after 500 generations. There was an improvement in the voltage level of ≈53.77%, obtained by analyzing the value of the evaluation function. This improvement, when compared to the data shown in Table 7, reflects an increase of 43.65% in nodes with adequate voltage, a reduction of 29.21% in nodes with precarious voltage, and a reduction of 65% in nodes with critical voltage. The 10 DGPV inserted in the optimization process add up to pot total = 298.55 kVA of power injected into the grid, and the electrical losses were reduced from 13.08% to 11.77%. Figure 23 shows the voltage classification of Feeder 14 before and after the allocation of DGPV, with the optimization process, without bar restriction, and with the ten inserted DGPV indicated.
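The improvement figures quoted for the real feeder are relative reductions of the evaluation function, which can be reproduced with the small Python helper below. The function name is illustrative.

```python
def relative_improvement(f_initial: float, f_final: float) -> float:
    """Percentage reduction of the evaluation function."""
    return (f_initial - f_final) / f_initial * 100.0
```

With f(x) = 1046.20 and f(x*) = 483.58, the helper returns roughly 53.8%, in line with the ≈53.77% reported. The same function applies to the other real-feeder runs, all of which start from f(x) = 1046.20.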
Figure 23b shows the reduction of points with critical and precarious voltage when compared to the situation of the real grid in Figure 23a. Figure 24 shows the voltage profile before (Figure 24a) and after (Figure 24b) the allocation of DGPV at optimized locations and without bar restriction, confirming the reduction in the number of nodes with precarious and critical voltage and the increase in nodes with adequate voltage.

Still as a test without bar restriction, the optimization process was used for the three variables qtd, pos, and pot, with the only restriction that pot max = 100 kVA. Figure 25 shows the average fitness during the optimization, which reached the stopping criterion g max = 500. It is observed in Figure 25 that from g = 80 there is stagnation in the value of the objective function, and at g = 205 the change from the stochastic selection method to the uniform selection method is performed.
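The stagnation that motivates the change of selection method can be detected with a simple rule, sketched below in Python. The window length, the minimum gain, and the function name are hypothetical; the text reports only the generations at which stagnation was observed and at which the selection method was changed, not how the decision was made.

```python
from typing import Sequence


def stagnated(history: Sequence[float], window: int = 20,
              min_gain: float = 1e-6) -> bool:
    """True if f(x) improved by less than min_gain over the last window
    generations (history holds one value per generation, lower is better)."""
    if len(history) <= window:
        return False
    return history[-window - 1] - history[-1] < min_gain
```

A driver loop could call this check once per generation and switch from the stochastic to the uniform selection method the first time it returns true.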
The optimization obtained the value qtd = 14. If the stopping criterion g max = 500 had not been met, the optimization process would have continued and would only stop at a lower value of f(x). However, running the optimization in conjunction with the simulation slows down the entire process, which is time-consuming. In this new analysis with the allocation of 14 DGPV, the value of the initial evaluation function is f(x) = 1046.20, which decreases to f(x*) = 324.47, producing a reduction of ≈70% in the violations of voltage limits in the grid. This result causes an increase of 55.99% in the nodes with adequate voltage, a reduction of 44.66% in the nodes with precarious voltage, and a reduction of 80.22% in the nodes with critical voltage. The value of pot total = 374.48 kVA of power injected into the grid produces a reduction in electrical losses from 13.08% to 12.22%. Figure 26 shows the voltage classification of Feeder 14 before (Figure 26a) and after (Figure 26b) the allocation of the DGPV with the optimized variables qtd, pos, and pot, in which there is a reduction in points with critical and precarious voltage in the grid. Figure 27a,b show the voltage profile before and after allocation of the DGPV, respectively, showing the improvement in the distribution power grid in relation to the original (Figure 8 and Table 7).

With the results obtained with the optimization process in the real distribution system, the next objective is to use the same optimization process together with the clustering process. In this new process, for each formed group, the list of buses located inside the squares of the group is sent to the optimization algorithm. In this way, it is possible to compare the optimization without bus restriction with the optimization by group of average consumption profile per grid. Three groups are obtained in Feeder 14, as shown in Figure 10.
In the application of this optimization process, which considers the allocation of DGPV in a given consumption profile group, two steps are performed: (i) the first considers two variables to be optimized, pot and pos, and (ii) the second considers three variables to be optimized, pot, pos, and qtd. The DGPV allocation algorithm has the additional restriction that DGPV may only be inserted at the eligible bars of a given group. In this application, Group 2 (Figure 9) is selected because it has a predominantly residential profile, with several consumption ranges (green squares in Figure 10).
The same parameters used previously in the optimization without bus restrictions are considered in this optimization process, with the quantity fixed at qtd = 10 DGPV and pot max = 100 kVA per DGPV. The initial configuration of the genetic algorithm remains the same. Figure 28 presents the average fitness in the optimization process for the first step of this analysis, optimizing only the variables pos and pot with bar constraints. In this analysis, g max = 500 was considered, and at g = 162 the stochastic selection method was changed to the uniform selection method, since from g = 105 there was stagnation in the optimization process. Initially f(x) = 1046.20 and at the end f(x*) = 532.57, obtained with g max = 500, providing an improvement of ≈50% in f(x). This produces an increase of ≈40% in the nodes with adequate voltage, a reduction of ≈30% in the nodes with precarious voltage, and a reduction of ≈60% in the nodes with critical voltage, when compared to the actual data in Table 7. The 10 DGPV included in the optimization process totaled 380.75 kVA of installed capacity in the grid, with a reduction in electrical losses from 13.08% to 10.72%. Figure 29 shows the voltage classification of the feeder before and after allocation of the DGPV. Figure 29b shows the reduction of points with critical and precarious voltage in the distribution power grid, and Figure 30 shows the voltage profile before and after the allocation of the DGPV. When comparing the results of the optimization of the variables pot and pos without and with bus restrictions, a significant difference is observed, with the exception of the installed capacity, which increased by ≈5% in the procedure with bar restriction.
Regardless of the increase in the installed capacity, this result demonstrates the feasibility of applying the proposed methodology to the real feeder, since the injection of 380.75 kVA of installed capacity in a residential region spanning 29 squares of 100 × 100 m, using the roofs of the houses, is shown to be possible. For the process of optimizing the variables pos, pot, and qtd with bar restriction using the real grid, the same values of pot max and g max are kept, with the same settings for the genetic algorithm. Figure 31 presents the average fitness in the optimization process for the second stage of this analysis, optimizing the variables pos, pot, and qtd with bar constraints. It is observed that at g = 175, the stochastic selection method was changed to the uniform selection method in an attempt to increase the genetic diversity in the population, since from g = 145 there was stagnation in the value of f(x). In this process, coincidentally, the value found was again qtd = 14. The initial evaluation function is f(x) = 1046.20 and the final evaluation function was f(x*) = 526.63, an improvement of ≈50% in the evaluation function, which increased the nodes with adequate voltage by ≈40% and reduced the nodes with precarious voltage by 18.54% and the nodes with critical voltage by 64.24%. The 14 DGPV inserted in the optimization process totaled 549.40 kVA of installed capacity, reducing the electrical losses in the distribution power grid from 13.08% to 11.06%. Figure 32 shows the voltage classification of the feeder before and after allocation of the DGPV, indicating the location of the 14 DGPV in the grid, in which it is possible to observe the reduction of points with critical and precarious voltage. Figure 33 shows the voltage profile before and after allocation of DGPV.
Comparing the results of the optimization process of the variables pos, pot, and qtd with and without bus restrictions, it is observed that there was an increase of 44.29% in the installed capacity with a small reduction in violations of voltage limits. This situation occurs due to the restriction of the DGPV allocation area, which hampered the algorithm by not allowing the DGPV to be inserted at strategic points in the grid. In this case, the scenario with the allocation of the 10 DGPV becomes more viable, however, this is not a rule, varying according to the feeder, the chosen consumer group, and the optimization method.

Discussion
Some difficulties were found in the development of this research. The first concerns the collection of data for the simulation of distribution power grid, a task that burdened the research, as the data needed for the simulation of the grid were found in different databases of the electricity concessionaire. In addition, some data were not available in the databases, requiring political strategies to access them, which made the development of the work difficult. Regarding optimization, finding which penalty level would be the most appropriate for the problem so that the various constraints were met without compromising the optimization process was a challenging activity.
Regarding clustering, determining the most adequate fuzzification index, as well as the minimum number of groups that would provide a more homogeneous division of the consumption curves of the squares, demanded a study of the utility's consumer dynamics. As heuristic methods were used for both clustering and optimization, several simulations were performed for verification and validation. Each simulation took ≈17 h. For the concessionaire, the proposed methodology serves both to help adjust the voltage level in the distribution power grid and to indicate the locations that need quick maintenance action (corrective and/or preventive). In the real system, maintenance and improvements in the distribution power grid are carried out little by little, and the optimization process indicates assertively where to make the first adjustments.
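To illustrate why the choice of fuzzification index matters, the sketch below evaluates the standard Fuzzy C-Means membership formula, u_i = 1 / Σ_k (d_i / d_k)^(2/(m−1)), for a single point and fixed cluster centers. It is a minimal illustration under assumed data: the function name `fcm_memberships`, the two-dimensional point, and the centers are hypothetical and do not come from the utility's consumption curves.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster for fuzzification index m > 1."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs only to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


# Hypothetical example: one point lying between two cluster centers.
point = (1.0, 0.0)
centers = [(0.0, 0.0), (4.0, 0.0)]

crisp = fcm_memberships(point, centers, m=2.0)   # memberships close to 0.9 and 0.1
soft = fcm_memberships(point, centers, m=3.0)    # memberships close to 0.75 and 0.25
print(crisp, soft)
```

A larger index spreads the membership more evenly across groups, which is one reason the index has to be tuned together with the number of groups.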
Also in the optimization process, several simulations were carried out seeking to avoid stagnation at local minima. Several parameters were changed and, after several attempts, it was necessary to implement two mutation methods, which injected diversity into the optimization process during its iterations. The initial objective of this work also included the optimization of the allocation of DGPV taking into account its penetration in the grid. However, this analysis would demand more time and computational effort than was spent in this work, since it is carried out over several hours of the day and a new simulation would be necessary for each hour. For this reason, we chose to use the OpenDSS snapshot solution mode, which evaluates the specific point in time at which the distribution power grid is in its most extreme condition compared to the other hours of the day. In this way, it was possible to significantly reduce the violations of voltage limits in the conditions most harmful to the electrical system.
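The stagnation handling mentioned above can be sketched as a simple check on the history of the best evaluation value. The sketch below is an illustration only: the window of 30 generations, the tolerance, the function names, and the synthetic history are assumptions chosen to mirror the reported case, in which f(x) stagnated from g = 145 and the selection method was changed at g = 175. It is not the authors' implementation.

```python
def is_stagnant(best_history, window=30, tol=1e-6):
    """True if the best (minimized) value improved by at most `tol`
    over the last `window` generations."""
    if len(best_history) <= window:
        return False
    return best_history[-window - 1] - best_history[-1] <= tol


def first_switch_generation(best_history, window=30, tol=1e-6):
    """Return the first generation at which stagnation is detected, or None."""
    for g in range(len(best_history)):
        if is_stagnant(best_history[: g + 1], window, tol):
            return g
    return None


# Synthetic history (assumption): steady improvement until g = 145, then flat.
history = [1000.0 - 3.0 * min(g, 145) for g in range(200)]

switch_at = first_switch_generation(history)
selection = "uniform" if switch_at is not None else "stochastic"
print(switch_at, selection)
```

In a full genetic algorithm, the same check could also trigger a second mutation operator rather than a change of selection method; which response works better depends on the problem.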
In addition to the main objective of the methodology, the clustering results made it possible to analyze the horizontal and vertical growth of the distribution power grid by comparing simulations at specific time intervals (monthly, yearly, among others). The horizontal growth analysis is performed by comparing the grid cells that exist at different times (in different years, for example), thus detecting the grid cells that are new since the last simulation. In this way, it is possible to verify not only the growth areas of the grid but also the consumption profile of this growth. For the analysis of vertical growth, the grid cells common to the two simulation scenarios are observed. Thus, it is possible to identify where there was an increase or decrease in demand and, consequently, whether this shift changed the grid's consumption profile. Such analyses provide important additional information for the grid planning area of energy distributors.
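The horizontal and vertical growth comparison described above reduces to set operations on the grid cells of two simulations. The sketch below assumes each simulation is summarized as a dictionary from a cell index (column, row) to an average demand; this data layout, the function name `growth_analysis`, and the sample values are hypothetical.

```python
def growth_analysis(earlier, later):
    """Compare two simulations given as {cell: average demand} dictionaries.

    Returns the cells that are new in `later` (horizontal growth) and the
    demand change of the cells present in both (vertical growth).
    """
    new_cells = sorted(set(later) - set(earlier))
    common_cells = sorted(set(earlier) & set(later))
    demand_change = {cell: later[cell] - earlier[cell] for cell in common_cells}
    return new_cells, demand_change


# Hypothetical average demand (kW) per 100 x 100 m cell in two different years.
year_1 = {(0, 0): 12.0, (1, 0): 8.0}
year_2 = {(0, 0): 15.0, (1, 0): 7.0, (2, 0): 4.0}

new_cells, demand_change = growth_analysis(year_1, year_2)
print("New cells (horizontal growth):", new_cells)
print("Demand change in common cells (vertical growth):", demand_change)
```

Here the cell (2, 0) appears only in the later simulation, while the two common cells show an increase and a decrease in demand, respectively.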
This methodology can be used by any utility whose databases contain data capable of reliably modeling the distribution power grid. The cost of applying this methodology is related to the maintenance and updating of the concessionaire's databases. Outdated or inaccurate distribution system data impair the application of this methodology. With the proposed methodology, it is possible to improve the quality of the distributed electric energy, decreasing service discontinuity and violations of voltage limits. This reduces the financial losses caused by disciplinary penalties for energy distribution concessionaires and improves their image among consumers.
Parts of the methodology can still be improved, such as: (i) optimizing the allocation of DGPV taking into account its penetration into the distribution power grid, (ii) including new variables to be optimized to improve the quality of distributed energy, for example, electrical losses, (iii) applying other heuristic techniques to try to obtain better results, (iv) applying a prediction technique to determine the growth profile of the distribution power grid from the consumption profile, considering the history of simulations, and (v) including additional variables in the clustering process, such as area of residence, estimated household income, and territorial classification, among others. The future scope of this work is based on the construction of new computational blocks that consider the inclusion of additional variables in the optimization and clustering processes and on the inclusion of a predictive methodology using artificial intelligence to improve the quality of distributed electricity.

Conclusions
The objective of this work was to develop a methodology capable of optimizing the allocation of DGPV in a real feeder, with or without restriction of the bars available for the allocation of DGPV, in an attempt to adapt the electrical voltage. The genetic algorithm heuristic method was used to provide: (i) the geographic locations of the DGPV (pos), (ii) the installed capacity of the DGPV (pot), and (iii) the amount of DGPV (qtd). For the restriction of the feeder bars, the Fuzzy C-Means heuristic algorithm was used to cluster the average consumption curves of the squares of 100 × 100 [m] that geographically overlap the feeder, thus forming groups of grid cells with similar consumption profiles.
The methodology was used as a tool for analyzing the voltage in the electrical grid before and after the process of allocation and optimization of DGPV. The proposed methodology was verified and validated using several scenarios with real data from the utility. The studies showed that the best DGPV allocation location is not always close to the region with violations, especially in cases where the number of regions with violations of voltage limits is greater than the amount of DGPV available to be inserted. The validation also demonstrated that it is possible to optimize the allocation of DGPV while restricting it to specific areas of the feeder, without substantially affecting the results obtained. Thus, through clustering, it is possible to define the optimization of the allocation of DGPV for a specific consumption group, in necessarily inhabited areas of the feeder. This action enables the creation of incentive projects for the allocation of DGPV and benefits both the consumer, by reducing the amount paid for electricity, and the concessionaire, by improving the quality of distributed electricity and avoiding possible sanctions from the regulatory agency.
Regarding the optimization and allocation of DGPV, if the minimum optimized amount of DGPV necessary to minimize the violations of voltage limits in the grid is incompatible with the concessionaire's budget plan, the methodology allows lower quantities to be tested until reaching a number that is compatible with the budget. For the grid analysis process carried out in the methodology, the additional gain obtained is the measurement and location of points with violations of voltage limits in the feeder, enabling the maintenance team to analyze the grid parameters and verify the need to (i) adjust the TAP of transformers, (ii) update conductors, (iii) update the consumer database, and (iv) change the feeder topology, among other important information for maintaining the quality of electricity in the grid.
Thus, it is concluded that the proposed methodology is a practical and functional tool for the optimized allocation of DGPV for any concessionaire that has technical and financial data on its electrical grids and its customers. The results obtained for the allocation of DGPV are considered satisfactory for voltage adequacy, and the results obtained from the optimization process provided information that partially relieves the utility, as it identifies the minimum amount of DGPV necessary for maximum voltage adequacy in the distribution power grid.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: