In recent years, renewable energy sources have gained attention as a sustainable alternative to fossil fuels. Driven by ecological-transition goals such as those outlined in the European Agenda 2030 [1], there is a growing focus on renewable resources. The rise of communities dedicated to sustainable energy, often portrayed as agents of change with significant benefits for participants [2], has prompted the need for a carefully structured framework to integrate these sources into the power grid. As delineated in [3], renewable energy communities (RECs) are coalitions of individuals and entities that collaborate to promote and employ renewable energy sources such as solar, wind, and hydroelectric power. These entities exist in various forms, from small clusters of residents collectively funding solar panel installations to large-scale organizations driving communal renewable energy initiatives. The primary goal of RECs is to maximize the overall social welfare of the community (see the formal definition below), which encompasses the expenses and income associated with energy transactions within the community and with the broader grid. A crucial component of the necessary infrastructure is the battery energy storage system (BESS), which plays a vital role in balancing energy supply and demand.
This challenge has been addressed with diverse techniques, including traditional optimization methods, heuristic algorithms, and rule-based controllers [4,5]. Notably, mixed-integer linear programming (MILP) has demonstrated considerable success in energy management applications. In [6], the authors addressed the problem of managing an energy community hosting a fleet of electric vehicles for rent, where the request-to-vehicle assignment requires the solution of a mixed-integer linear program. In [7], the authors presented an MILP approach specifically aimed at maximizing the social welfare of a REC, a quantity that comprises revenues from energy sold to the grid, costs of energy bought from the grid, costs of battery usage, and potential incentives for self-consumption. However, effectively scheduling BESS charging/discharging policies faces a critical hurdle: the pronounced intermittency and stochastic nature of renewable generation and electricity demand [8]. Accurately predicting these uncertain variables is a formidable task, and this is where machine learning-based strategies come into play [9,10]. Alternatively, uncertainty-aware optimization techniques have been developed, such as stochastic programming (SP) and robust optimization (RO) [11]. SP employs a probabilistic framework but requires a priori knowledge of the probability distribution of the uncertainties, whereas RO focuses on the worst-case scenario and therefore requires known bounds on the uncertainties, leading to far more conservative performance. A further strategy for real-time scheduling is model predictive control (MPC), introduced in [12], which continuously recalibrates its solution in a rolling-horizon fashion. Although beneficial, the effectiveness of MPC is contingent on the precision of the forecasts [13]. Moreover, the substantial online computational load of long-horizon MPC in large systems poses a potential hurdle for real-time execution. Hence, there is a notable shift in focus toward methodologies rooted in deep reinforcement learning (DRL). In recent years, DRL has gained substantial relevance in diverse energy management applications [14,15,16]. Handling the distinctive constraints inherent in power scheduling poses a significant challenge for DRL approaches; a simple strategy is to integrate them into the reward function as soft constraints [17]. To tackle this issue, approaches based on imitation learning (IL) have also been suggested. In [18], the agent learns directly from the trajectories of an expert (specifically, an MILP solver). Despite the commendable outcomes, these approaches often overlook the incentive for virtual self-consumption proposed in the Italian framework. In [19], an interactive framework for benchmarking DRL algorithms was presented; however, it does not reflect REC behavior and incentive schemes. Furthermore, the primary objective typically revolves around flattening consumption curves (benefiting energy suppliers), as observed in [14], rather than maximizing the social welfare of the community.
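For concreteness, the social-welfare objective discussed above can be sketched as follows. The notation is purely illustrative (it is not taken from [7]): $p^{\mathrm{sell}}_t$ and $p^{\mathrm{buy}}_t$ denote grid prices, the $E$ terms denote energy flows over time step $t$, $c^{\mathrm{b}}$ a battery-usage cost, and $\pi^{\mathrm{sc}}$ the self-consumption incentive:

```latex
\max \; \mathrm{SW} \;=\; \sum_{t=1}^{T}\Big(
      p^{\mathrm{sell}}_t\,E^{\mathrm{sell}}_t
    \;-\; p^{\mathrm{buy}}_t\,E^{\mathrm{buy}}_t
    \;-\; c^{\mathrm{b}}\big(E^{\mathrm{ch}}_t + E^{\mathrm{dis}}_t\big)
    \;+\; \pi^{\mathrm{sc}}\,E^{\mathrm{sc}}_t \Big)
```

Under a virtual self-consumption scheme such as the Italian one, $E^{\mathrm{sc}}_t$ would represent the energy virtually shared within the community in time step $t$, e.g., the minimum of the energy injected into and withdrawn from the grid by community members.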
In our previous work [20], we introduced a novel DRL strategy for managing energy in renewable energy communities. This strategy introduces an intelligent agent designed to maximize social welfare through real-time decision making, relying only on currently available data and thus eliminating the need for generation and demand forecasts. To enhance the agent’s training effectiveness, we leverage the MILP approach of [7] by directly incorporating optimal control policies within an asymmetric actor–critic framework. Through diverse simulations across various REC setups, we demonstrate that our methodology surpasses a state-of-the-art rule-based controller and yields BESS control policies whose performance closely matches the optimal ones computed with MILP. In this study, we expand our research scope to a broader framework that addresses the entire European landscape. This extended perspective allows us to offer insights into the applicability and benefits of our DRL approach not only within individual RECs but also on a larger scale, contributing to a more sustainable and efficient energy landscape across Europe.
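To give a flavor of expert-guided training (without reproducing the actual architecture of [20]), the following minimal sketch trains a linear policy to imitate precomputed expert actions. All names and dimensions are hypothetical, and a toy linear map stands in for the MILP expert; the real method instead embeds the expert’s optimal policies in an asymmetric actor–critic setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observation: only currently available data (e.g., load,
# generation, state of charge, price) -- no forecasts.
OBS_DIM = 4

# Hypothetical stand-in for an MILP expert: a fixed linear map
# from observation to optimal charge/discharge power.
W_expert = rng.normal(size=OBS_DIM)

def expert_action(obs):
    return W_expert @ obs

def mse_loss(W, batch):
    # Mean squared error between the policy and the expert actions.
    return np.mean([(W @ o - expert_action(o)) ** 2 for o in batch])

# Linear policy trained by gradient descent to imitate the expert.
W = np.zeros(OBS_DIM)
batch = [rng.normal(size=OBS_DIM) for _ in range(64)]
lr = 0.05
losses = []
for _ in range(200):
    grad = np.mean([2 * (W @ o - expert_action(o)) * o for o in batch], axis=0)
    W -= lr * grad
    losses.append(mse_loss(W, batch))

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.2e}")
```

The imitation loss shrinks toward zero as the policy recovers the expert mapping; in practice, the expert signal is available only at training time, while at deployment the agent acts from current observations alone.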
1.3. Regional Variations in Energy Dynamics: Insights from Italian Regions and Beyond
Understanding REC behavior requires analyzing regional variations in energy dynamics. The inclusion of various regions within Italy is a crucial aspect of our study, providing insights into the specific challenges and opportunities of the Italian energy landscape. Italy’s diverse geographical and socio-economic characteristics necessitate a tailored approach to energy management, making it an ideal testing ground for our proposed reinforcement learning (RL) controller. In addition, extending our analysis to countries beyond Italy, namely France, Switzerland, Slovenia, and Greece, adds a transnational dimension to our research. The selection of these countries is grounded in their diverse energy market structures, regulatory frameworks, and renewable energy adoption rates. France, with its nuclear-heavy energy mix, offers a contrasting scenario to Italy’s reliance on renewables. Switzerland, known for its hydropower capacity, provides insights into decentralized energy systems. Slovenia and Greece, with their distinctive geographical characteristics, contribute to a more comprehensive understanding of REC dynamics in southern Europe. By incorporating this diverse set of countries, we aim to capture a broad spectrum of the challenges and opportunities faced by different European regions. This comparative analysis enhances the generalizability of our findings, allowing us to extrapolate insights that are relevant not only to Italy but to the broader European context. The careful selection of these countries aligns with our goal of providing a holistic perspective on the application of RL controllers for optimizing energy management across diverse geographical and regulatory landscapes.
The remainder of this work proceeds as follows. Section 2 reviews the relevant literature. Section 3 formalizes the task and introduces our methodology. Section 4 describes the simulations and presents the numerical results. Finally, Section 5 concludes the discussion and outlines potential future developments.