# An Artificial Neural Network Approach to Forecast the Environmental Impact of Data Centers


## Abstract


[…] CO_{2} emissions, according to the utility power source adopted. In Brazil, almost 70% of electrical power is derived from clean electricity generation, whereas in China 65% of generated electricity comes from coal. In addition, the price per kWh in the US is much lower than in the other countries surveyed. In the present work, we conducted an integrated evaluation of the costs and CO_{2} emissions of the electrical infrastructure in data centers, considering the different energy sources adopted by each country. We used a multi-layered artificial neural network to forecast consumption over the following months, based on the energy consumption history of the data center. All these features are supported by a tool, the applicability of which was demonstrated through a case study that computed the CO_{2} emissions and operational costs of a data center using the energy mixes adopted in Brazil, China, Germany and the US. China presented the highest CO_{2} emissions, with 41,445 tons per year in 2014, followed by the US and Germany, with 37,177 and 35,883 tons, respectively. Brazil, with 8459 tons, proved to be the cleanest. This study also estimated the operational costs assuming that the same data center consumes energy as if it were located in China, Germany and Brazil. China presented the highest operating cost per kWh. Therefore, the best choice according to operational costs, considering the price of energy per kWh, is the US and the worst is China. Considering both operational costs and CO_{2} emissions, Brazil would be the best option.

## 1. Introduction

[…] CO_{2}) released by the use of coal, petroleum, natural gas and other similar energy sources contributes significantly to global warming. According to estimates, CO_{2} emissions may increase by between 9% and 27% by 2030, depending on which policies are enacted [4].

[…] CO_{2} emissions) of data centers.

[…] CO_{2} emissions originates from the production of energy. In China, burning coal generates 65% of the electricity consumed [5], which is a very high level of non-green power generation compared with Brazil, where the level is 3%. In Germany, 14.7% of the energy produced comes from nuclear fission [6], whereas in China this figure is only 1% [7]. Many countries now require that polluting energy sources be replaced by cleaner alternatives, such as solar, wind or hydro plants.

[…] CO_{2} emissions, which depends on the utility power source adopted. For example, in Brazil, 73% of electrical power is derived from clean electricity generation [8], whereas in the US 82.1% of generated electricity comes from petroleum, coal or gas [9].

## 2. Related Works

[…] CO_{2} emissions.

## 3. Basic Concepts

#### 3.1. Sustainability

#### 3.2. Artificial Neural Networks

#### Perceptron

- Works fine in the case of incomplete information
- Does not require knowledge of the algorithm solving the problem (automatic learning)
- Processes information in a highly parallel way
- Can generalize to unknown cases
- Resistant to partial damage
- Performs associative memory (similar to working memory in humans), as opposed to the addressable memory typical of classical computers.

#### 3.3. ARIMA

## 4. Methodology

[…] CO_{2} emissions). If the designer has no energy history for the environment, the results are displayed and the data may be analyzed. Otherwise, the forecasting option is chosen and the data center designer should provide the energy history.

## 5. Energy Flow Model (EFM)

- $N = N_s \cup N_i \cup N_t$ represents the set of nodes (i.e., the components), in which $N_s$ is the set of source nodes, $N_t$ is the set of target nodes and $N_i$ denotes the set of internal nodes, with $N_s \cap N_i = N_s \cap N_t = N_i \cap N_t = \emptyset$.
- $A \subseteq (N_s \times N_i) \cup (N_i \times N_t) \cup (N_i \times N_i) = \{(a,b) \mid a \neq b\}$ denotes the set of edges (i.e., the component connections).
- $w: A \to \mathbf{R}^{+}$ is a function that assigns weights to the edges (the value assigned to the edge $(j,k)$ is adopted for distributing the energy assigned to the node $j$ to the node $k$, according to the ratio $w(j,k) / \sum_{i \in j^{\bullet}} w(j,i)$, where $j^{\bullet}$ is the set of output nodes of $j$).
- $f_d: N \to \begin{cases} \mathbf{R}^{+} & \text{if } n \in N_s \cup N_t, \\ 0 & \text{otherwise;} \end{cases}$ is a function that assigns to each node the heat to be extracted (considering cooling models) or the energy to be supplied (regarding power models).
- $f_c: N \to \begin{cases} 0 & \text{if } n \in N_s \cup N_t, \\ \mathbf{R}^{+} & \text{otherwise;} \end{cases}$ is a function that assigns to each node its maximum energy capacity.
- $f_p: N \to \begin{cases} 0 & \text{if } n \in N_s \cup N_t, \\ \mathbf{R}^{+} & \text{otherwise;} \end{cases}$ is a function that assigns to each node (a node represents a component) its retail price.
- $f_{\eta}: N \to \begin{cases} 1 & \text{if } n \in N_s \cup N_t, \\ 0 \le k \le 1,\ k \in \mathbf{R} & \text{otherwise;} \end{cases}$ is a function that assigns to each node its energy efficiency.
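The weight-based distribution rule in the definition of $w$ can be illustrated with a short Python sketch. It is not the Mercury implementation: the class name `EnergyFlowModel`, its methods and the node labels `ups`, `pdu_a` and `pdu_b` are hypothetical, and capacities, prices and efficiencies are left out for brevity.

```python
class EnergyFlowModel:
    """Illustrative container for the edge weights w of an EFM (hypothetical names)."""

    def __init__(self, weights):
        # weights maps an edge (j, k) to its positive weight w(j, k)
        self.weights = dict(weights)

    def output_nodes(self, j):
        """Return the output nodes of j (the set written j^bullet in the text)."""
        return [k for (src, k) in self.weights if src == j]

    def distribute(self, j, energy):
        """Split the energy assigned to node j among its output nodes,
        each receiving the fraction w(j, k) / sum_i w(j, i)."""
        outs = self.output_nodes(j)
        total = sum(self.weights[(j, k)] for k in outs)
        return {k: energy * self.weights[(j, k)] / total for k in outs}


# Hypothetical example: one UPS feeding two power distribution units.
efm = EnergyFlowModel({("ups", "pdu_a"): 1.0, ("ups", "pdu_b"): 3.0})
shares = efm.distribute("ups", 100.0)
```

With weights 1 and 3, one quarter of the energy assigned to `ups` goes to `pdu_a` and three quarters to `pdu_b`.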

## 6. Considering Energy Mix in the EFM

## 7. Applying ANNs to the EFM

- A set of synapses, each characterized by a weight. Specifically, a signal $X_j$ at the input of synapse $j$ connected to neuron $k$ is multiplied by the synaptic weight $W_{kj}$. Note the order of the indices of $W_{kj}$: the first index refers to the neuron under analysis and the second refers to the input terminal of the synapse to which the weight applies.
- An adder to sum the input signals, weighted by the respective synapses of the neuron. These operations were implemented in the Mercury tool and constitute a linear combiner.
- An activation function to limit the output range of a neuron. Typically, the normalized output range of a neuron is written as the closed unit interval [0, 1] or, alternatively, [−1, 1].
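The three elements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, no bias term is modeled because the text does not mention one, and the logistic and hyperbolic tangent functions are common choices of activation rather than the ones confirmed for Mercury. Strictly speaking they map into the open intervals (0, 1) and (−1, 1).

```python
import math


def linear_combiner(weights_k, inputs):
    """Adder: v_k = sum_j W_kj * X_j for a single neuron k."""
    return sum(w * x for w, x in zip(weights_k, inputs))


def logistic(v):
    """Activation that limits the output to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron_output(weights_k, inputs, activation=logistic):
    """Apply the activation function to the linear combiner output."""
    return activation(linear_combiner(weights_k, inputs))


# Hypothetical weights W_k1, W_k2 and input signals X_1, X_2.
y_logistic = neuron_output([0.5, -0.5], [1.0, 1.0])
y_tanh = neuron_output([0.5, -0.5], [1.0, 1.0], activation=math.tanh)
```

Here the weighted inputs cancel, so the linear combiner yields zero and the two activations return the midpoints of their respective ranges.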

**Load and normalize data**: Mercury has been configured to accept spreadsheets in odt, xls and csv formats, with three columns (year, month and power consumption). Thus, users may upload a file with the monthly levels of power consumption from the previous years of a data center. These data are read and stored by the engine to use during the following phases.
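As a rough illustration of this phase, the sketch below reads a three-column CSV file and rescales the consumption values. The column names `year`, `month` and `power`, the sample values and the use of min-max scaling are assumptions made for the example; the text does not state which normalization Mercury applies.

```python
import csv
import io


def load_consumption(csv_text):
    """Parse rows of (year, month, power consumption) from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(int(r["year"]), int(r["month"]), float(r["power"])) for r in reader]


def min_max_normalize(values):
    """Rescale values linearly to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical monthly history; the header names are an assumption.
sample = "year,month,power\n2014,1,60.0\n2014,2,62.0\n2014,3,64.0\n"
history = load_consumption(sample)
normalized = min_max_normalize([power for _, _, power in history])
```

Keeping the minimum and maximum of the original series would allow forecasts to be mapped back to the original units.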

**Create an artificial neural network**: In this phase, the basic parameters for creating the artificial neural network are set (e.g., number of neurons in the input layer, number of neurons in the first hidden layer and the number of neurons in the output layer). Empirical testing with the MLP backpropagation neural network [31] does not demonstrate a significant advantage in the use of two hidden layers rather than one for small problems. Therefore, most problems consider only one hidden layer.

**ANN training**: The most important property of neural networks is their ability to learn from their environment and thereby improve their performance. This is done through training, an iterative process of adjustments applied to their weights. Backpropagation is the most popular algorithm for training multi-layer ANNs. The algorithm consists of two steps: propagation and backpropagation. In the first step, an input vector is applied to the input layer and its effect propagates across the network, producing a set of outputs. The response obtained by the network is subtracted from the desired response to produce an error signal. The second step propagates this error signal backwards through the synaptic connections, adjusting the weights so that the network outputs approach the desired responses. Additionally, using the EFM with the ANN, it is possible to set the training stop criterion to a specific error rate or a fixed number of iterations.
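The two steps and the two stop criteria can be sketched as follows. This is a minimal pure-Python illustration, not the Mercury code: it assumes one hidden layer, a single output neuron, logistic activations, online weight updates and a mean squared error threshold, and all names, the learning rate and the sample data are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_mlp(n_in, n_hidden, seed=0):
    """Create small random weights; the last entry of each row is a bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    """Propagation step: return the hidden activations and the output."""
    hidden, output = net
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in hidden]
    y = sigmoid(sum(w * hj for w, hj in zip(output, h)) + output[-1])
    return h, y


def train(net, samples, lr=0.5, target_mse=1e-3, max_epochs=5000):
    """Stop when the epoch MSE reaches target_mse or after max_epochs."""
    hidden, output = net
    for epoch in range(1, max_epochs + 1):
        sq_err = 0.0
        for x, desired in samples:
            h, y = forward(net, x)
            err = desired - y                      # error signal
            sq_err += err * err
            delta_out = err * y * (1.0 - y)        # output local gradient
            # Backpropagation step: hidden gradients use the output weights
            # as they were before this update.
            delta_hid = [delta_out * output[j] * h[j] * (1.0 - h[j])
                         for j in range(len(h))]
            for j in range(len(h)):
                output[j] += lr * delta_out * h[j]
            output[-1] += lr * delta_out
            for j, row in enumerate(hidden):
                for i in range(len(x)):
                    row[i] += lr * delta_hid[j] * x[i]
                row[-1] += lr * delta_hid[j]
        mse = sq_err / len(samples)
        if mse <= target_mse:
            break
    return epoch, mse


# Hypothetical normalized samples: (input vector, desired output).
samples = [([0.0], 0.1), ([0.5], 0.5), ([1.0], 0.9)]
net = init_mlp(n_in=1, n_hidden=3)
epochs_run, final_mse = train(net, samples)
```

The function returns the number of epochs actually run and the last epoch error, so a caller can tell which stop criterion ended the training.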

**Prediction**: This option produces forecasts related to the energy consumption of the environment over the next twelve months. At the end of the forecasts, the mean absolute percentage error is displayed.
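For reference, the mean absolute percentage error (MAPE) is commonly computed as the average of $|a_t - f_t| / |a_t|$ over all periods, expressed as a percentage. The sketch below follows that standard definition; the function name and the sample values are hypothetical, and the exact variant used by Mercury is not stated in the text.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and no actual value is zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical values: each forecast is off by 10% of the actual value.
error_percent = mape([100.0, 200.0], [110.0, 180.0])
```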

**Graph**: This button graphically displays a comparison between the measurements, with a blue line for the actual data and a red line for the expected monthly consumption values.

## 8. Case Study

[…] CO_{2} emissions of a data center in the US, taking into account the energy mix adopted over a period of 15 years. Moreover, this paper provides the EFM models for a Tier III data center and estimates the environmental impact of the energy consumption of this data center for locations in different countries (with different energy mixes), namely Germany, China and Brazil. A neural network (multi-layer perceptron (MLP)) was used to forecast the energy consumption for the following 12 months, according to the consumption history of the data center. In this context, this work applied ANNs to approximate the functions and to forecast future energy consumption based on the previous history.

#### 8.1. Models

**Data Center Tier III (Simultaneous Maintenance and Operation)**

#### 8.2. Energy Mixes, Energy Cost and CO_{2} Emissions

[…] CO_{2} emissions and cost, obtained from [5,6,7,33].

[…] CO_{2} emissions according to the energy mix adopted by each country (see Equation (3)), considering the same demand per year. China presented the highest CO_{2} emissions, with 41,445 tons per year in 2014, followed by the US and Germany, with 37,177 and 35,883 tons, respectively. Brazil, with 8459 tons, proved to be the cleanest. Owing to its many rivers and its topography, Brazil stands out in the generation of clean energy, which may make it an interesting option for building a data center when considering only CO_{2} emissions.
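To make the role of the energy mix concrete, the sketch below combines the tabulated mix percentages with the per-source emission factors (g/kWh) as a share-weighted sum. This is an assumed simplification, not the paper's Equation (3): the "Others" share has no emission factor in the table and is ignored here, so the results will not reproduce the totals reported above. The function names and the annual demand are hypothetical.

```python
# Emission factors in g of CO2 per kWh, taken from the tables in this article.
EMISSION_G_PER_KWH = {"wind": 10, "coal": 950, "hydro": 20, "nuclear": 150, "oil": 510}

# Energy mix in percent per country, taken from the tables in this article.
MIX_PERCENT = {
    "GER": {"wind": 14.3, "coal": 42.9, "hydro": 4, "nuclear": 14.7, "oil": 0.94},
    "BRA": {"wind": 1.44, "coal": 1.5, "hydro": 69.76, "nuclear": 1.68, "oil": 6},
    "CHN": {"wind": 6, "coal": 63, "hydro": 22, "nuclear": 1, "oil": 2},
    "USA": {"wind": 4.7, "coal": 33, "hydro": 6, "nuclear": 20, "oil": 1},
}


def mix_emission_factor(mix):
    """Share-weighted emission factor in g/kWh (assumed formula; 'Others' ignored)."""
    return sum(share / 100.0 * EMISSION_G_PER_KWH[src] for src, share in mix.items())


def annual_emissions_tons(kwh_per_year, mix):
    """Convert grams to metric tons for a given annual demand."""
    return kwh_per_year * mix_emission_factor(mix) / 1_000_000.0


factors = {country: mix_emission_factor(mix) for country, mix in MIX_PERCENT.items()}
highest = max(factors, key=factors.get)
lowest = min(factors, key=factors.get)

# Hypothetical annual demand of 1 GWh, for illustration only.
tons_chn = annual_emissions_tons(1_000_000, MIX_PERCENT["CHN"])
```

Even under this simplification, the coal-heavy Chinese mix yields the largest factor and the hydro-dominated Brazilian mix the smallest, which matches the extremes of the ranking reported above.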

[…] CO_{2} emissions produced by the US, which continue to be high despite these concerns. Currently, the US government shows little concern regarding such emissions, and a large increase in these levels was therefore forecast.

[…] CO_{2} emission levels, as well as the highest operating cost per kWh. Therefore, considering only power consumption in China, the corresponding cost was 3.5 times higher than if the data center were located in the US. Thus, the best choice according to operational costs, considering the price of energy per kWh, is the US and the worst is China. Considering both operational costs and CO_{2} emissions, Brazil would be the best option.
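The cost comparison reduces to multiplying the annual demand by the price per kWh. The sketch below uses the per-kWh prices tabulated in this article; the function name and the annual demand are hypothetical, and the same demand is assumed for every country.

```python
# Price per kWh in USD, taken from the table in this article.
PRICE_USD_PER_KWH = {"GER": 0.25, "BRA": 0.18, "CHN": 0.43, "USA": 0.12}


def annual_cost_usd(kwh_per_year, country):
    """Operational energy cost for one year at the country's price per kWh."""
    return kwh_per_year * PRICE_USD_PER_KWH[country]


cheapest = min(PRICE_USD_PER_KWH, key=PRICE_USD_PER_KWH.get)
most_expensive = max(PRICE_USD_PER_KWH, key=PRICE_USD_PER_KWH.get)

# Hypothetical annual demand; the ratio does not depend on it.
demand_kwh = 1_000_000
ratio_chn_to_usa = annual_cost_usd(demand_kwh, "CHN") / annual_cost_usd(demand_kwh, "USA")
```

With these prices the ratio between China and the US is roughly 3.6, in line with the factor of about 3.5 quoted above.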

#### 8.3. ANN Forecast

[…]^{−4}. This error was verified through the graph presented in the figure, which shows a very small difference; the ANN may therefore be considered to have learned well.

[…] CO_{2} emissions, which might cause health problems for the world population in a few decades. Table 3 presents the ARIMA and MLP forecasts for the following 12 months, starting from January 2015.
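One simple way to read the forecast table is to compare the width of the interval (upper minus lower bound) that each method reports per month. The sketch below does this with the tabulated values; the variable and function names are hypothetical, and the unit of the values is not stated in the extracted text.

```python
# (month, ARIMA lower, ARIMA upper, MLP lower, MLP upper), from the forecast table.
FORECASTS = [
    ("January", 64.07, 64.59, 63.63, 63.77),
    ("February", 63.95, 64.75, 63.23, 63.37),
    ("March", 63.82, 64.92, 63.43, 63.57),
    ("April", 63.69, 65.10, 63.20, 63.34),
    ("May", 63.56, 65.28, 63.20, 63.34),
    ("June", 63.42, 65.47, 63.33, 63.47),
    ("July", 63.27, 65.66, 63.03, 63.17),
    ("August", 63.12, 65.85, 63.53, 63.67),
    ("September", 62.97, 66.05, 63.13, 63.27),
    ("October", 62.81, 66.26, 63.53, 63.67),
    ("November", 62.65, 66.47, 63.23, 63.37),
    ("December", 62.49, 66.68, 63.83, 63.97),
]


def interval_width(lower, upper):
    """Width of a forecast interval, rounded to the table's two decimals."""
    return round(upper - lower, 2)


arima_widths = [interval_width(row[1], row[2]) for row in FORECASTS]
mlp_widths = [interval_width(row[3], row[4]) for row in FORECASTS]
```

The ARIMA interval widens every month over the horizon, whereas the MLP interval keeps the same width throughout.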

#### 8.4. Considerations

## 9. Conclusions

[…] CO_{2} emissions and data center energy consumption considering the variation of the generating source, as well as forecasting the energy consumption over the next months.

[…] CO_{2} emissions associated with the Brazilian, Chinese, German and US energy mixes was performed, including the proposal of formal models for a Tier III data center. In addition, this approach enabled the use of an artificial neural network to forecast the values of these metrics. We evaluated the power consumption of data centers located in the US over the past 15 years, considering the energy mix (wind, coal, hydroelectric, nuclear and oil). The results reveal that China’s energy mix presents the highest levels of CO_{2} emissions and Brazil’s the lowest. The US presented the lowest operating costs per kWh/year, while China presented the highest. Furthermore, we adopted the MLP to forecast energy consumption over the next 12 months and the increase was around 1.25–1.04%. These features are available for academic use through the Mercury tool.

#### Main Contributions and Future Work

- Global benefits by reducing environmental impact through reduced energy consumption
- Energy efficient architectures
- Accurate modeling of the electrical infrastructure of data centers that use variation of energy sources
- Possibility of predictions based on artificial neural network for cost and environmental impact of electric flow models

**Consider the LCA of the equipment**: The sustainability impact was estimated considering the energy consumption during the operational phase of the data center. One possible extension is to consider the sustainability impact throughout the whole life cycle of the equipment (life cycle assessment (LCA)).

**Consider cooling infrastructures**: This study considered only the components of the electrical infrastructure; it would be interesting to consider the cooling infrastructure, which is responsible for almost 50% of data centers’ energy consumption.

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Hallahan, R. Technical information from Cummins Power Generation. In Data Center Design Decisions and Their Impact on Power System Infrastructure; Power Generator: Plymouth, MN, USA, 2011; Available online: https://power.cummins.com/system/files/literature/brochures/PT-9020-Data-Ctr-Design-Decisions.pdf (accessed on 15 July 2017).
- Environmental Protection Agency. Report to Congress on Server and Data Center Energy Efficiency. Available online: http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf (accessed on 23 May 2016).
- Delforge, P. America’s Data Centers Consuming and Wasting Growing Amounts of Energy. Available online: http://switchboard.nrdc.org (accessed on 1 May 2017).
- McGrath, M. Climate Change: CO_{2} Emissions Rising for First Time in Four Years. Available online: https://www.bbc.com/news/science-environment-46347453 (accessed on 24 December 2018).
- Shehabi, A.; Smith, S.; Sartor, D.; Brown, R.; Herrlin, M.; Koomey, J.; Masanet, E.; Horner, N.; Azevedo, I.; Lintner, W. United States Data Center Energy Usage Report; Lawrence Berkeley National Laboratory: Berkeley, CA, USA, 2016.
- Willkommen bei den Energy Charts. Available online: https://www.energy-charts.de/ (accessed on 23 April 2017).
- U.S. Energy Information Administration (EIA). Available online: https://www.eia.gov/beta/international/analysis.cfm?iso=CHN (accessed on 13 February 2017).
- Callou, G.; Maciel, P.; Tutsch, D.; Ferreira, J.; Araújo, J.; Souza, R. Estimating Sustainability Impact of High Dependable Data Centers: A Comparative Study between Brazilian and US Energy Mixes; Springer: Vienna, Austria, 2013.
- Institute for Energy Research. Energy Encyclopedia. Available online: http://instituteforenergyresearch.org/topics/encyclopedia/ (accessed on 29 June 2016).
- Ferreira, J.; Callou, G.; Maciel, P. A power load distribution algorithm to optimize data center electrical flow. Energies **2013**, 6, 3422–3443.
- Silva, B.; Matos, R.; Callou, G.; Figueiredo, J.; Oliveira, D.; Ferreira, J.; Dantas, J.; Junior, A.L.; Alves, V.; Maciel, P. Mercury: An Integrated Environment for Performance and Dependability Evaluation of General Systems. In Proceedings of the IEEE 45th Dependable Systems and Networks Conference (DSN-2015), Rio de Janeiro, Brazil, 22–25 June 2015.
- Kuo, W.; Zuo, M.J. Optimal Reliability Modeling: Principles and Applications; John Wiley and Sons: New York, NY, USA, 2003.
- Molloy, M.K. Performance analysis using stochastic Petri nets. IEEE Trans. Comput. **1982**, 9, 913–1007.
- Zeng, Y.-R.; Zeng, Y.; Choi, B.; Wang, L. Multifactor-influenced energy consumption forecasting using enhanced back-propagation neural network. Energy **2017**, 127, 381–396.
- Wang, L.; Hu, H.; Ai, X.-Y.; Liu, H. Effective electricity energy consumption forecasting using echo state network improved by differential evolution algorithm. Energy **2018**, 153, 801–815.
- Wang, L.; Lv, S.-X.; Zeng, Y.-R. Effective sparse adaboost method with ESN and FOA for industrial electricity consumption forecasting in China. Energy **2018**, 155, 1013–1031.
- He, Y.; Qin, Y.; Wang, S.; Wang, X.; Wang, C. Electricity consumption probability density forecasting method based on LASSO-Quantile Regression Neural Network. Appl. Energy **2019**, 233, 565–575.
- Reddy, V.D.; Setz, B.; Rao, G.S.V.; Gangadharan, G.R.; Aiello, M. Metrics for sustainable data centers. IEEE Trans. Sustain. Comput. **2017**, 2, 290–303.
- Dandres, T.; Moghaddam, R.F.; Nguyen, K.K.; Lemieux, Y.; Samson, R.; Cheriet, M. Consideration of marginal electricity in real-time minimization of distributed data centre emissions. J. Clean. Prod. **2017**, 143, 116–124.
- Helm, J.L. Energy: Production, Consumption, and Consequences; National Academy Press: Washington, DC, USA, 1990.
- Kammen, D.M.; Pacca, S. Assessing the costs of electricity. Annu. Rev. Environ. Resour. **2004**, 29, 301–344.
- Haykin, S. Neural Networks: A Comprehensive Foundation; Prentice Hall PTR: Upper Saddle River, NJ, USA, 1994.
- McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. **1943**, 5, 115–133.
- De Pádua Braga, A.; de Leon Ferreira, A.C.P.; Ludermir, T.B. Redes Neurais Artificiais: Teoria e Aplicações; LTC Editora: Rio de Janeiro, Brazil, 2007.
- Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; John Wiley and Sons: Hoboken, NJ, USA, 2015.
- Pindyck, R.S.; Rubinfeld, D.L. Econometria: Modelos & Previsões; Elsevier: Amsterdam, The Netherlands, 2004.
- Ferreira, J.; Callou, G.; Dantas, J.; Souza, R.; Maciel, P. An algorithm to optimize electrical flows. In Proceedings of the 2013 IEEE International Conference on Systems, Man, and Cybernetics, Manchester, UK, 13–16 October 2013; pp. 109–114.
- De Gooijer, J.G.; Hyndman, R.J. 25 years of time series forecasting. Int. J. Forecast. **2006**, 22, 443–473.
- Gardner, M.W.; Dorling, S.R. Artificial neural networks (the multilayer perceptron): A review of applications in the atmospheric sciences. Atmos. Environ. **1998**, 32, 2627–2636.
- Haykin, S. Neural Networks and Learning Machines; Pearson Prentice Hall: Upper Saddle River, NJ, USA, 2009.
- Rumelhart, D.E.; Hinton, G.E.; McClelland, J.L. A general framework for parallel distributed processing. Parallel Distrib. Process. Explor. Microstruct. Cognit. **1986**, 1, 45–76.
- The Up Time Institute. Available online: https://uptimeinstitute.com/ (accessed on 9 July 2016).
- Neoenergia Group. Available online: http://www.neoenergia.com/ (accessed on 28 April 2017).

**Figure 3.** Nonlinear model of a neuron k [30].

Energy Source | CO_{2} (g/kWh) |
---|---|
Wind | 10 |
Coal | 950 |
Hydroelectric | 20 |
Nuclear | 150 |
Oil | 510 |

Energy Source | GER | BRA | CHN | USA | CO_{2} (g/kWh) |
---|---|---|---|---|---|
Cost per kWh (USD) | 0.25 | 0.18 | 0.43 | 0.12 | - |
Wind (%) | 14.3 | 1.44 | 6 | 4.7 | 10 |
Coal (%) | 42.9 | 1.5 | 63 | 33 | 950 |
Hydroelectric (%) | 4 | 69.76 | 22 | 6 | 20 |
Nuclear (%) | 14.7 | 1.68 | 1 | 20 | 150 |
Oil (%) | 0.94 | 6 | 2 | 1 | 510 |
Others (%) | 23.16 | 19.62 | 6 | 35.3 | - |

Month | ARIMA Lower | ARIMA Upper | MLP Lower | MLP Upper |
---|---|---|---|---|
January | 64.07 | 64.59 | 63.63 | 63.77 |
February | 63.95 | 64.75 | 63.23 | 63.37 |
March | 63.82 | 64.92 | 63.43 | 63.57 |
April | 63.69 | 65.10 | 63.20 | 63.34 |
May | 63.56 | 65.28 | 63.20 | 63.34 |
June | 63.42 | 65.47 | 63.33 | 63.47 |
July | 63.27 | 65.66 | 63.03 | 63.17 |
August | 63.12 | 65.85 | 63.53 | 63.67 |
September | 62.97 | 66.05 | 63.13 | 63.27 |
October | 62.81 | 66.26 | 63.53 | 63.67 |
November | 62.65 | 66.47 | 63.23 | 63.37 |
December | 62.49 | 66.68 | 63.83 | 63.97 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ferreira, J.; Callou, G.; Josua, A.; Tutsch, D.; Maciel, P.
An Artificial Neural Network Approach to Forecast the Environmental Impact of Data Centers. *Information* **2019**, *10*, 113.
https://doi.org/10.3390/info10030113
