1. Introduction
The electrification of transportation is one of the most significant advancements in the global push toward carbon neutrality. Electric vehicles (EVs), when combined with renewable energy generation, offer the potential to drastically reduce greenhouse gas emissions and reliance on fossil fuels [1]. Malaysia, as part of its commitment to sustainable development, introduced the Low Carbon Mobility Blueprint (2021–2030), aiming for the deployment of 10,000 EV charging stations by 2025 [2]. This strategic move aligns with global trends but poses severe risks to the electrical grid if deployment is not complemented by smart infrastructure and intelligent load management.
Figure 1 shows the seasonal average daily solar PV output per kW of installed capacity in Kuala Lumpur. Solar generation remains relatively stable across seasons, with averages of 5.26 kWh/day in winter, 5.39 kWh/day in both summer and autumn, and peaking at 5.44 kWh/day in spring, highlighting Kuala Lumpur’s year-round solar reliability.
In urban settings, particularly in high-rise residential and mixed-use commercial developments, the aggregation of charging demands from hundreds of EVs can exceed existing power infrastructure capacities [3]. For example, a condominium with 400–800 residential units, assuming a 30% EV penetration rate, will have 120–240 EVs. If every EV charges simultaneously on a 7.4 kW charger, the aggregate peak demand ranges from 888 kW to 1776 kW, far beyond the typical design limit of building transformers and feeders. Such loads, if not properly coordinated, can lead to significant technical issues such as transformer overheating, cable degradation, and voltage fluctuations [4].
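The arithmetic behind these figures can be reproduced with a short sketch. It assumes the worst case in which every EV charges at the same time; the function name and the rounding of the EV count are illustrative choices, not part of the original study.

```python
def aggregate_peak_demand_kw(units: int, ev_penetration: float,
                             charger_kw: float = 7.4) -> float:
    """Worst-case demand (kW) if every EV in the building charges at once.

    Assumes one charger per EV and full simultaneity (no diversity factor).
    """
    n_evs = round(units * ev_penetration)
    return n_evs * charger_kw


for units in (400, 800):
    demand = aggregate_peak_demand_kw(units, 0.30)
    print(f"{units} units -> {demand:.0f} kW")
```

In practice a diversity factor below 1 would lower these values, which is exactly the margin that coordinated charging tries to exploit.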
Figure 2 illustrates the distribution of public and private EV charging stations across various Malaysian states, emphasizing infrastructure density and regional disparities. The current EV charging network in Malaysia consists of 403 stations, collectively delivering 997 MWh of electricity through 182,984 charging sessions [5]. This infrastructure supports an aggregate electric range of approximately 6.65 million km for over 9000 registered users. The highest concentrations of charging stations are found in urban areas such as Kuala Lumpur (99 stations), Johor (62), and Selangor (47), reflecting a robust infrastructure presence in these regions. In contrast, East Malaysia, particularly Sarawak, lags behind with only 7 charging stations, highlighting the need for further development in these areas to meet growing EV adoption demands.
To address the challenges of grid stress and uneven station distribution, optimization becomes essential for enhancing energy efficiency, maintaining grid stability, and ensuring cost-effective operations [6]. Without intelligent load management, peak-hour demand in high-density areas can lead to transformer overloads and voltage instability. The integration of AI-based optimization algorithms, such as Reinforcement Learning and predictive scheduling, enables dynamic load balancing, voltage deviation control, and improved utilization of solar PV and BESS resources [7]. Furthermore, in underdeveloped regions like Sarawak, optimization ensures equitable access and effective use of limited infrastructure, thereby supporting scalable and sustainable EV ecosystem growth without necessitating immediate large-scale capital investment.
Figure 3 shows the types of EV charging systems. Mode 1 and Mode 2 support basic AC charging but are limited in communication and control, making them less suitable for smart charging applications. Mode 3 enables advanced control and communication between the EV and the grid, supporting smart charging features such as load management and dynamic pricing. Mode 4, offering DC fast charging, is ideal for time-sensitive charging needs and can be integrated with smart systems for demand response and peak shaving [2]. Smart charging infrastructure often combines Mode 3 and Mode 4 to balance speed and intelligence. Integration with renewable sources, such as solar PV, enhances the sustainability of the charging system.
Figure 4 presents a comprehensive architecture of EV operation within a smart grid, emphasizing its integration with electricity markets. Residential and public EV charging infrastructures are coordinated via an EV aggregator load dispatch utility that manages power flow and communication across the system. The aggregator interfaces with diverse power generation sources such as hydropower, PV farms, wind turbines, nuclear, and biogas plants to optimize the real-time energy supply–demand balance.
Electric vehicles contribute to ancillary service markets, including regulation, reserve, and capacity services, through controlled charging/discharging for frequency and voltage support. Participation in both day-ahead and real-time electricity markets enables EVs to perform energy arbitrage, load shifting, and peak shaving based on dynamic pricing signals. Bidirectional data and power flows facilitate intelligent demand-side management at residential premises, enhancing grid resilience, reliability, and economic dispatch.
The remainder of this paper is structured as follows: Section 2 reviews the existing optimization algorithms applied to smart EV charging, including heuristic, mathematical, and AI-based approaches. Section 3 presents the proposed hybrid framework, detailing its architecture, demand modeling, Reinforcement Learning strategy, Linear Programming formulation, and the integration of solar PV and BESS. Section 4 describes the simulation environment, technical assumptions, and performance evaluation based on MATLAB/Simulink and HOMER Grid tools. Section 5 discusses future research directions. Finally, Section 6 discusses the key findings, offering recommendations for deploying scalable, intelligent EV charging systems.
2. Review of Optimization Algorithms for Smart Charging
The growing electrification of transport infrastructure and the increasing penetration of distributed energy resources (DERs) have prompted extensive research into intelligent electric vehicle (EV) charging strategies. Optimizing charging schedules in a manner that balances user needs, energy costs, and grid stability remains a central challenge in modern energy systems. A broad spectrum of methods including heuristic algorithms, mathematical optimization models, and Artificial Intelligence (AI)-based approaches has been proposed to address these complexities.
Early-stage research predominantly relied on heuristic and metaheuristic techniques such as Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO) to solve multi-objective EV charging problems [8]. These algorithms demonstrated efficiency in peak load minimization and cost optimization in constrained environments. However, their static nature and sensitivity to parameter tuning render them less effective in dynamic, real-time conditions where user behavior, renewable generation, and grid load vary unpredictably.
Mathematical programming approaches, including Mixed-Integer Linear Programming (MILP) and Quadratic Programming (QP), have been widely adopted for deterministic EV scheduling. For example, ref. [9] applied Linear Programming for real-time EV charging in parking infrastructures, achieving reductions in peak demand and energy costs. Similarly, ref. [10] used an Artificial Neural Network (ANN)-driven optimization model to maintain voltage stability in residential distribution networks. While these models ensure constraint satisfaction and grid compatibility, they often lack adaptability to user uncertainty, renewable intermittency, and market volatility.
To overcome these limitations, recent studies have introduced Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL) for dynamic EV charging control. These models offer continuous learning from environment feedback, enabling real-time adaptation to evolving conditions such as fluctuating electricity tariffs, transformer loading, and user mobility patterns [11]. Hierarchical RL architectures have also been proposed to manage large-scale systems, improving convergence and policy generalization. Nonetheless, most existing RL implementations are computationally intensive and are rarely coupled with physical grid models or real-time simulation environments, limiting their practical deployment.
In parallel, Demand Response (DR) strategies have been integrated into EV charging algorithms to align charging loads with grid operating conditions [12]. Approaches based on Real-Time Pricing (RTP), Critical Peak Pricing (CPP), and Day-Ahead Scheduling have shown promise in flattening load curves and reducing distribution transformer stress. In the Malaysian context, ref. [13] highlighted transformer overloading as a primary constraint to residential EV deployment, suggesting optimization must be transformer-centric. However, most DR implementations lack user-centric behavioral modeling and do not integrate renewable generation or storage capabilities.
Traditional approaches in EV charging management have relied on Time-of-Use (ToU) pricing schemes to encourage off-peak charging and mitigate grid stress. ToU models allocate tariff blocks based on predefined time windows, typically distinguishing between peak, shoulder, and off-peak periods, providing users with cost incentives to shift their charging behavior [14]. Although effective in flattening aggregate load profiles, ToU pricing alone is limited by its static structure and inability to respond to real-time grid conditions or user-specific flexibility. Moreover, ToU-based systems often fail to adapt when large numbers of EVs are simultaneously connected, leading to secondary peaks during low-tariff periods and localized overloading of transformers.
Table 1 presents a comparative analysis of traditional ToU scheduling and AI-based Reinforcement Learning methods for smart EV charging.
Recent studies highlight that traditional scheduling methods, including fixed Time-of-Use (ToU) pricing, are inadequate in handling the stochastic nature of EV user behavior and real-time grid constraints. As EV adoption increases, uncoordinated charging leads to transformer overloading, voltage instability, and inefficient use of renewable energy. Artificial Intelligence (AI)-based optimization [15], particularly Reinforcement Learning (RL), enables adaptive, real-time decision-making that dynamically aligns charging with grid capacity, pricing signals, and user preferences. Therefore, AI is increasingly recognized as a critical enabler of scalable, resilient, and efficient smart EV charging infrastructure. The importance of AI optimization in smart EV charging systems is shown in Table 2.
Furthermore, hybrid energy systems combining solar PV and Battery Energy Storage Systems (BESS) have received growing attention [16]. Ref. [17] analyzed second-life EV batteries in conjunction with rooftop PV to support residential loads, demonstrating partial grid independence. However, battery aging, optimal dispatch strategies, and integrated AI control were not fully explored in their work. Similarly, studies using HOMER and TRNSYS tools evaluate system-level performance but lack fine-grained control architectures for real-time optimization [18].
Overall, the literature reveals critical gaps: (1) limited integration of adaptive AI-based control with physical grid constraints, (2) inadequate modeling of user behavior as a stochastic, time-variant process, and (3) insufficient coordination between renewable resources and EV loads. The current study addresses these challenges by developing a hybrid AI optimization framework that combines stochastic user arrival modeling, RL-based behavioral adaptation, and grid-aware Linear Programming for load allocation.
3. Hybrid Modeling for EV Smart Charging System
The proposed hybrid framework integrates stochastic modeling of user behavior, Reinforcement Learning (RL)-based adaptive control, and grid-constrained Linear Programming (LP) optimization, forming a unified architecture tailored for smart residential EV charging. This structure not only enables real-time responsiveness to dynamic user demand and pricing signals but also incorporates solar photovoltaic (PV) generation and Battery Energy Storage System (BESS) functionalities to enhance system sustainability and grid resilience.
Figure 5 illustrates the comprehensive architecture of this framework.
3.1. EV Demand Modeling and Grid Impact Formulation
The core issues in EV smart charging lie in the interplay between unpredictable user behavior, distribution network constraints, and the integration of intermittent renewable energy sources [19]. User behavior is inherently stochastic, as individuals have different driving patterns, work schedules, and charging preferences. The timing, frequency, and duration of charging sessions vary considerably, making it difficult to predict and optimize energy demand profiles. Moreover, the Malaysian context suffers from a lack of real-time, localized datasets that capture actual user behavior and charging habits [20].
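To accurately model user demand within the proposed hybrid framework, a non-homogeneous Poisson process is employed to represent the stochastic nature of EV arrival patterns. The arrival rate function, denoted as $\lambda(t)$, captures the average number of EVs arriving per unit time. The probability of observing $k$ arrivals within a time interval $[0, t]$ is governed by the following:
$$P\{N(t) = k\} = \frac{\Lambda(t)^{k}\, e^{-\Lambda(t)}}{k!}, \qquad \Lambda(t) = \int_{0}^{t} \lambda(s)\, ds$$
where $N(t)$ is the number of arrivals up to time $t$ and $\Lambda(t)$ is the expected number of arrivals over the interval.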
This probabilistic formulation enables dynamic scheduling based on fluctuating arrival rates, which is critical for real-time optimization in residential environments where demand is temporally clustered.
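A non-homogeneous Poisson process of this kind can be simulated with the standard thinning (Lewis-Shedler) method. The sketch below is illustrative only: the shape of `arrival_rate`, its evening peak around 19:30, and the bound `rate_max` are assumed values, not parameters taken from the study.

```python
import math
import random


def poisson_pmf(k: int, expected: float) -> float:
    """P(N = k) when `expected` arrivals are expected in the interval."""
    return expected ** k * math.exp(-expected) / math.factorial(k)


def arrival_rate(t_hours: float) -> float:
    """Hypothetical lambda(t) in EVs/hour: base rate plus an evening bump."""
    return 2.0 + 10.0 * math.exp(-((t_hours - 19.5) ** 2) / (2 * 1.5 ** 2))


def sample_arrivals(rate, horizon_hours, rate_max, rng):
    """Thinning: draw candidates from a homogeneous process with rate
    rate_max and keep each one with probability rate(t) / rate_max.
    rate_max must be an upper bound on rate(t) over the horizon."""
    arrivals, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)
        if t >= horizon_hours:
            return arrivals
        if rng.random() < rate(t) / rate_max:
            arrivals.append(t)


arrivals = sample_arrivals(arrival_rate, 24.0, 12.0, random.Random(0))
print(len(arrivals), "arrivals in one simulated day")
```

Passing a seeded `random.Random` instance keeps simulated days reproducible, which is useful when comparing charging strategies against identical arrival sequences.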
From the grid’s operational perspective, the aggregate power drawn by EVs is determined by the following:
$$P_{\text{total}}(t) = \sum_{i=1}^{N} \eta_i\, P_i(t)$$
where $P_i(t)$ is the power drawn by the $i$th EV charger, $\eta_i$ is its corresponding efficiency factor, and $N$ is the number of active chargers. This formulation accounts for hardware-level variations across charging stations and provides a more accurate estimation of real power consumption at any given moment.
In uncoordinated charging scenarios, simulation data indicate that transformer utilization can exceed 120% during peak demand periods, significantly elevating the risk of thermal overloading and equipment failure. Additionally, such demand surges induce voltage deviations $\Delta V$ within the local distribution network, which can be estimated using Ohm’s law:
$$\Delta V = I_{\text{total}} \cdot Z_{\text{line}}$$
Here, $I_{\text{total}}$ represents the total instantaneous current drawn by all active chargers, and $Z_{\text{line}}$ denotes the impedance of the distribution line. Elevated values of $\Delta V$ result in voltage instability, manifesting as flicker, dimming, or undervoltage conditions, particularly at the nodes farthest from the transformer. Excessive $\Delta V$ not only triggers power quality violations, as measured per IEC 61000-4-30, but can also reduce equipment lifespan. These technical challenges underscore the necessity of coordinated AI-driven control strategies to manage spatiotemporal variations in EV charging demand.
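As a rough numerical illustration of the Ohm's-law estimate, the sketch below computes the voltage drop and its share of the nominal voltage. The 0.19 Ω impedance is the value used later in the simulation section; the 60 A current and the 230 V nominal voltage are assumed purely for the example.

```python
def voltage_deviation(i_total_a: float, z_line_ohm: float,
                      v_nominal: float) -> tuple[float, float]:
    """Return (drop in volts, drop as a percentage of nominal voltage)."""
    dv = i_total_a * z_line_ohm
    return dv, 100.0 * dv / v_nominal


# Assumed operating point: 60 A through a 0.19 ohm line on a 230 V feeder.
dv, pct = voltage_deviation(60.0, 0.19, 230.0)
print(f"drop = {dv:.1f} V ({pct:.2f}% of nominal)")
```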
3.2. AI-Driven Adaptive Charging via Reinforcement Learning
The control core of the hybrid framework utilizes Deep Reinforcement Learning (DRL) for adaptive decision-making under uncertainty. The system state $s_t$ at time $t$ encapsulates the user request matrix, grid loading, PV output forecast, and SoC of the BESS. The control agent selects an action $a_t$, representing a vector of power allocations, to maximize the cumulative expected reward:
$$\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t} R_t\right], \qquad R_t = -\left(\alpha\, C_t + \beta\, \Delta V_t^{2} + \gamma\, D_t\right)$$
where $\pi$ is the control policy; $C_t$ is the time-varying electricity cost function based on ToU tariffs; $\Delta V_t$ is the instantaneous voltage deviation; $D_t$ is the user discomfort index, based on deviation from desired charging windows; and $\alpha$, $\beta$, and $\gamma$ are the hyperparameters calibrated via grid simulations.
The squaring of $\Delta V_t$ penalizes large deviations more severely. This reward structure encourages behavior that is cost-efficient, grid-friendly, and user-aligned. Policy updates are performed via Proximal Policy Optimization (PPO), ensuring stability and convergence in training.
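The effect of the squared voltage term can be seen in a minimal sketch of the per-step reward. It assumes the reward is the negative weighted sum of cost, squared voltage deviation, and discomfort; the weight values and the function name are placeholders, not the calibrated hyperparameters of the study.

```python
def step_reward(cost: float, dv: float, discomfort: float,
                alpha: float = 1.0, beta: float = 100.0,
                gamma: float = 0.5) -> float:
    """Negative weighted penalty; weights here are illustrative only."""
    return -(alpha * cost + beta * dv ** 2 + gamma * discomfort)


# Doubling the voltage deviation quadruples its share of the penalty.
small = step_reward(0.0, 0.02, 0.0)
large = step_reward(0.0, 0.04, 0.0)
print(small, large)
```

In a PPO training loop this function would be evaluated once per control interval, after the environment has reported the resulting cost, voltage deviation, and discomfort for the chosen action.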
3.3. Grid-Constrained Optimization via Linear Programming
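To complement the adaptive nature of DRL, a deterministic Linear Programming (LP) layer is embedded for short-horizon load scheduling. The LP problem is formulated to minimize aggregate charging cost:
$$\min_{P_{i,t}} \; \sum_{t} \sum_{i} c_t\, P_{i,t}\, \Delta t$$
Subject to the following:
$$\sum_{i} P_{i,t} \le P_{\max} \quad \forall t, \qquad \sum_{t} P_{i,t}\, \Delta t \ge E_i^{\min} \quad \forall i, \qquad P_{i,t} \ge 0$$
where $P_{i,t}$ is the charging power allocated to EV $i$ in time slot $t$; $\Delta t$ is the slot duration; $P_{\max}$ is the grid-imposed transformer capacity limit; $E_i^{\min}$ is the minimum energy requirement for EV $i$; and $c_t$ is the real-time pricing signal for energy cost. This LP ensures grid constraints are strictly enforced while allowing the RL agent to shape the long-term energy usage policy. The interaction between LP and RL creates a two-layer control hierarchy: short-term feasibility and long-term adaptability.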
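For intuition, consider a simplified version of this problem with only a shared per-slot capacity limit and per-EV energy requirements, with no per-charger power limits or availability windows and with non-negative prices. Under those assumptions the optimum is reached by filling the cheapest slots first, so the sketch below avoids a solver entirely. It is a stand-in for the PuLP model used in the study, not a reproduction of it, and all names are hypothetical.

```python
def schedule_cheapest_first(prices, energy_needs_kwh, p_max_kw, dt_h=0.25):
    """Aggregate charging power per slot (kW) and total cost.

    Only the total energy matters under the stated assumptions, so the
    per-EV split within a slot is left to the caller.
    """
    remaining = sum(energy_needs_kwh)
    slot_cap_kwh = p_max_kw * dt_h
    power = [0.0] * len(prices)
    # Visit slots from cheapest to most expensive.
    for t in sorted(range(len(prices)), key=lambda s: prices[s]):
        if remaining <= 0:
            break
        energy = min(slot_cap_kwh, remaining)
        power[t] = energy / dt_h
        remaining -= energy
    if remaining > 1e-9:
        raise ValueError("infeasible: capacity too small for requested energy")
    cost = sum(c * p * dt_h for c, p in zip(prices, power))
    return power, cost


power, cost = schedule_cheapest_first([0.5, 0.2, 0.3, 0.4], [10.0, 15.0], 50.0)
print(power, cost)
```

Once per-charger limits or departure deadlines are added, the greedy argument no longer holds in general and a proper LP solver becomes necessary.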
3.4. Integration of Solar PV and Battery Energy Storage System (BESS)
The integration of a photovoltaic (PV) system with a Battery Energy Storage System (BESS) is essential for improving local energy self-sufficiency, reducing peak load demand, and ensuring power quality in EV-integrated residential microgrids.
The instantaneous power output of the solar PV system is modeled as follows:
$$P_{\text{PV}}(t) = G(t)\, A\, \eta_{\text{PV}}\, \cos\theta$$
where $G(t)$ is the global irradiance on the panel surface (W/m²); $A$ is the effective PV array area (m²); $\eta_{\text{PV}}$ is the system efficiency (accounting for inverter and temperature losses); and $\theta$ is the angle of incidence between sunlight and panel surface. The irradiance $G(t)$ is forecasted using weather prediction models, while real-time irradiance is obtained via local pyranometers. The EMS utilizes a rolling horizon approach to match forecasted PV generation with upcoming EV charging demand.
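The PV model reduces to a one-line calculation, sketched below. The array area, efficiency, and irradiance used in the example are assumed values chosen for illustration; clamping the cosine at zero (sun behind the panel) is an added safeguard rather than part of the stated model.

```python
import math


def pv_power_kw(irradiance_w_m2: float, area_m2: float,
                efficiency: float, incidence_deg: float) -> float:
    """PV output in kW from irradiance, area, efficiency and incidence angle."""
    cos_theta = max(0.0, math.cos(math.radians(incidence_deg)))
    return irradiance_w_m2 * area_m2 * efficiency * cos_theta / 1000.0


# Assumed array: 600 m^2 at 18% system efficiency under 1000 W/m^2.
print(round(pv_power_kw(1000.0, 600.0, 0.18, 0.0), 1), "kW at normal incidence")
print(round(pv_power_kw(1000.0, 600.0, 0.18, 60.0), 1), "kW at 60 degrees")
```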
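The BESS supports both grid-interactive and islanding modes and follows a bi-directional control model comprising charging and discharging subroutines. The State of Charge (SoC) is updated according to the following:
$$\text{SoC}(t+1) = \text{SoC}(t) + \frac{\left(\eta_{\text{ch}}\, P_{\text{ch}}(t) - P_{\text{dis}}(t)/\eta_{\text{dis}}\right) \Delta t}{E_{\text{cap}}}$$
with constraints $\text{SoC}_{\min} \le \text{SoC}(t) \le \text{SoC}_{\max}$ and $0 \le P_{\text{ch}}(t),\, P_{\text{dis}}(t) \le P_{\text{BESS}}^{\max}$,
where $E_{\text{cap}}$ is the nominal energy capacity of the BESS (kWh); $P_{\text{ch}}(t)$ and $P_{\text{dis}}(t)$ are the charging and discharging powers at time $t$; $\eta_{\text{ch}}$ and $\eta_{\text{dis}}$ are the charging and discharging efficiencies; and $\Delta t$ is the time step resolution (typically 15 min).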
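A single SoC update step can be sketched as follows. The 60 kWh capacity and 15 min step come from the paper; the per-direction efficiency of 0.96 (roughly the square root of a 92% round-trip figure) and the 10% to 90% SoC window are assumptions made for the example.

```python
def update_soc(soc: float, p_ch_kw: float, p_dis_kw: float,
               capacity_kwh: float = 60.0, eta_ch: float = 0.96,
               eta_dis: float = 0.96, dt_h: float = 0.25,
               soc_min: float = 0.10, soc_max: float = 0.90) -> float:
    """Advance the SoC by one time step; reject setpoints that break limits."""
    delta_kwh = (eta_ch * p_ch_kw - p_dis_kw / eta_dis) * dt_h
    new_soc = soc + delta_kwh / capacity_kwh
    if not (soc_min - 1e-9 <= new_soc <= soc_max + 1e-9):
        raise ValueError(f"setpoint would move SoC to {new_soc:.3f}")
    return new_soc


print(round(update_soc(0.50, 10.0, 0.0), 4))  # charging at 10 kW
print(round(update_soc(0.50, 0.0, 10.0), 4))  # discharging at 10 kW
```

Note the asymmetry: charging losses reduce the energy stored, while discharging losses increase the energy withdrawn, so a discharge step moves the SoC further than a charge step of the same power.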
The Energy Management System (EMS) employs a hierarchical dispatch protocol to maximize PV self-consumption. First, available PV generation is directed to active EV charging loads. Any surplus PV is then allocated to charge the Battery Energy Storage System (BESS) up to its State of Charge (SoC) limits. Finally, if energy remains after meeting both load and BESS needs, the excess is exported to the grid, with penalties applied in the cost function to prioritize local consumption over grid export. This protocol ensures efficient energy use and minimizes unnecessary energy export. The hierarchical protocol for PV self-consumption is shown in Figure 6.
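The three-step priority order described above can be written as a small function. This is a sketch of the rule only: `bess_headroom_kw` is a hypothetical input that the caller would derive from the BESS power limit and remaining SoC room, and the dictionary keys are illustrative names.

```python
def dispatch_pv(pv_kw: float, ev_load_kw: float,
                bess_headroom_kw: float) -> dict:
    """Split PV output by priority: EV load, then BESS, then grid export."""
    to_ev = min(pv_kw, ev_load_kw)
    surplus = pv_kw - to_ev
    to_bess = min(surplus, bess_headroom_kw)
    export = surplus - to_bess
    return {
        "to_ev": to_ev,
        "to_bess": to_bess,
        "export": export,
        "grid_import": ev_load_kw - to_ev,  # EV load not covered by PV
    }


print(dispatch_pv(80.0, 50.0, 10.0))  # sunny: surplus charges BESS, rest exported
print(dispatch_pv(30.0, 50.0, 10.0))  # cloudy: all PV goes to EVs, grid covers gap
```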
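During peak grid demand (e.g., 6 p.m. to 10 p.m.), the BESS performs controlled discharges to offset transformer stress and reduce grid draw:
$$P_{\text{grid}}(t) = P_{\text{EV}}(t) - P_{\text{PV}}(t) - P_{\text{dis}}(t)$$
where $P_{\text{EV}}(t)$ is the aggregate EV charging load, $P_{\text{PV}}(t)$ is the PV output, and $P_{\text{dis}}(t)$ is the BESS discharge power. A net-positive $P_{\text{grid}}(t)$ indicates residual load on the grid, whereas a net-zero or negative value represents energy autonomy or export, respectively.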
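The PV-BESS dispatch is co-optimized within the overall energy cost minimization framework. The extended objective function is calculated as follows:
$$\min \; \sum_{t} \left[ c_t\, P_{\text{grid}}(t) + c_{\text{deg}}\, \left| P_{\text{BESS}}(t) \right| \right] \Delta t$$
where $c_t$ is the time-varying grid electricity price; $c_{\text{deg}}$ is the BESS degradation cost per kWh cycled; $P_{\text{grid}}(t)$ is net power drawn from the grid; and $\left| P_{\text{BESS}}(t) \right|$ is the absolute value of the BESS power, included so that both charging and discharging contribute to the degradation cost. This formulation enables a cost-aware and degradation-aware BESS operation, ensuring financial sustainability over the system’s lifetime.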
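Evaluating this objective for a candidate schedule is straightforward, as the sketch below shows. The sign convention (positive BESS power for charging, negative for discharging), the prices, and the degradation cost are assumptions chosen for the example.

```python
def dispatch_cost(prices, grid_kw, bess_kw, c_deg, dt_h=0.25):
    """Energy cost plus degradation cost over all slots.

    bess_kw is signed (positive = charging, negative = discharging);
    the absolute value charges wear in both directions.
    """
    return sum((c * g + c_deg * abs(b)) * dt_h
               for c, g, b in zip(prices, grid_kw, bess_kw))


# Two 15 min slots: charge the BESS off-peak, discharge it at peak.
cost = dispatch_cost(prices=[0.2, 0.5], grid_kw=[40.0, 20.0],
                     bess_kw=[10.0, -10.0], c_deg=0.05)
print(round(cost, 3))
```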
4. Simulation-Based Performance Analysis of the AI-Optimized EV Charging System
The simulation was designed to evaluate the performance of a smart electric vehicle (EV) charging system within a virtual high-density residential complex consisting of 800 housing units and 50 wall-mounted EV chargers, each rated at 7.4 kW. To enhance energy resilience and renewable integration, the system architecture incorporated a 120 kWp rooftop solar photovoltaic (PV) array and a 60 kWh lithium-ion Battery Energy Storage System (BESS). The hybrid system was modeled using MATLAB/Simulink R2025a for detailed power flow analysis and HOMER Grid 1.10 for techno-economic optimization and component sizing. Artificial Intelligence (AI)-based charging algorithms, specifically Reinforcement Learning (RL) for adaptive user behavior modeling and Linear Programming (LP) for grid-constrained scheduling, were implemented in Python 3.12.6, utilizing TensorFlow v2.16.1 for RL policy training and PuLP 3.2.1 for solving constrained optimization problems.
To ensure reproducibility, the simulation was conducted under well-defined technical assumptions. EV arrivals followed a non-homogeneous Poisson process, with peak demand occurring between 6 p.m. and 9 p.m. and energy requirements ranging from 8 to 18 kWh per session. Solar PV generation was simulated using Typical Meteorological Year (TMY) irradiance data for Kuala Lumpur, incorporating panel efficiency losses and inverter performance. A time-of-use (ToU) tariff structure, based on Tenaga Nasional Berhad (TNB) residential rates, was applied, featuring peak, off-peak, and dynamic pricing scenarios. The transformer was modeled with a 600 kVA rating and a conservative operational limit of 550 kW to reflect realistic grid constraints, while line impedance was fixed at 0.19 Ω for accurate voltage drop estimation. The BESS was constrained to a 10 kW bidirectional power flow and a 92% round-trip efficiency. Additionally, a Reinforcement Learning-based feedback mechanism adjusted the scheduling policy every 2.5 days, adapting to user satisfaction and override patterns.
Figure 7 illustrates the simulation-based configuration, including all key energy subsystems and their interconnections. The model successfully captured real-time energy flows, stochastic charging demands, PV variability, and the dynamic grid response under AI-optimized control.
The simulation included real-time scenario modeling with time-series data for load demand, solar generation, and user behavior patterns. A virtual SCADA-like interface was built using LabVIEW to visualize system dynamics, track power quality metrics, and analyze voltage deviations and transformer loading in response to different charging strategies. Over a simulated operational window of three months, the model successfully demonstrated reductions in peak load by up to 31.5%, improvements in voltage stability (from ±5.8% to ±2.3%), and increased solar PV utilization from 48% to 66%, validating the technical effectiveness of AI-driven optimization in smart grid contexts. The simulation-based performance analysis is shown in Table 3.
The proposed hybrid AI framework is designed to be scalable, adaptable, and resilient for future deployment in larger EV charging networks. As the number of chargers increases, the complexity of Reinforcement Learning (RL) and Linear Programming (LP) components also grows. To maintain real-time performance, decentralized control using clustered RL agents and parallel LP solvers is recommended. Feature aggregation and localized scheduling further reduce computational overhead. Simulations with up to 500 chargers confirmed acceptable performance under parallelized execution.
The framework is generalizable across regions through its modular design, allowing substitution of location-specific parameters such as irradiance data, pricing models, and grid constraints. However, as the scale increases, system reliability becomes critical. To this end, fault tolerance and redundancy are embedded into the architecture. Dual communication paths (e.g., wired + wireless) between components like the BESS, PV array, and smart chargers ensure continued operation during link failures. Redundant SCADA controllers and backup EMS nodes allow a seamless transition during hardware faults or control system interruptions.
In terms of cybersecurity, the system acknowledges the growing threat of false data injection (FDI) attacks targeting EV charging infrastructure. As discussed in [21], FDI attacks can manipulate charging demand data, SoC reports, or PV forecasts to destabilize the grid or overcharge users. Our current framework includes basic anomaly detection by monitoring time-series deviations and SoC inconsistencies. Future versions will incorporate AI-based intrusion detection systems and blockchain-enabled data integrity protocols to enhance protection against coordinated cyberattacks. These resilience features ensure the system can scale securely and reliably for smart city applications.
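The paper describes the anomaly detection only at a high level. One plausible minimal form, sketched here under stated assumptions, is a rolling z-score check on a measured time series combined with a consistency check between reported SoC and metered energy. The window length, thresholds, and function names are all illustrative and not taken from the framework.

```python
from statistics import mean, stdev


def flag_deviations(series, window=8, threshold=3.0):
    """Indices whose value lies more than `threshold` standard deviations
    from the mean of the preceding `window` samples."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if abs(series[i] - mu) > threshold * max(sigma, 1e-9):
            flagged.append(i)
    return flagged


def soc_inconsistent(prev_soc, reported_soc, energy_in_kwh,
                     capacity_kwh, tol=0.05):
    """True if the reported SoC disagrees with the metered energy."""
    expected = prev_soc + energy_in_kwh / capacity_kwh
    return abs(reported_soc - expected) > tol


demand_kw = [10, 11, 10, 12, 11, 10, 11, 12, 50, 11]
print(flag_deviations(demand_kw))
print(soc_inconsistent(0.50, 0.80, 3.0, 60.0))
```

A rule of this kind catches abrupt spikes but not slow, coordinated manipulation, which is consistent with the paper's plan to add AI-based intrusion detection later.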
5. Future Research Directions
The proposed hybrid AI-based framework for optimized EV charging and grid management demonstrates strong potential for improving load balancing, cost reduction, and renewable energy utilization. However, its future development requires continued technical innovation and validation. One key area for future research is enhancing the prediction accuracy of user behavior and solar energy generation [22]. While Reinforcement Learning provides adaptability, incorporating advanced deep learning models such as Long Short-Term Memory (LSTM) [23], attention-based transformers, and spatiotemporal graph neural networks can enable more accurate multi-scale forecasting of EV arrivals, user preferences, and variable solar irradiance. These predictive improvements are vital for real-time control and load shifting in a dynamic grid environment.
The integration of Vehicle-to-Grid (V2G) capability represents another critical research frontier. V2G can allow bi-directional energy flow, where EVs contribute power back to the grid during peak demand, effectively transforming vehicles into distributed energy resources (DERs). Implementing V2G within this framework will require a revised multi-objective Reinforcement Learning model that accounts for energy pricing, user preferences, and battery degradation [24]. A representative reward function could be defined as follows:
$$R_t = -\left(\alpha\, C_t + \beta\, \Delta V_t^{2} + \gamma\, D_t + \delta\, B_t\right)$$
where $C_t$ represents net energy cost or profit from grid transactions; $\Delta V_t$ is the voltage deviation from nominal levels; $D_t$ reflects user discomfort or deviation from preferred charging; $B_t$ captures the cost of additional battery wear; and $\alpha$, $\beta$, $\gamma$, and $\delta$ are weighting coefficients. Each term captures trade-offs between cost optimization, voltage regulation, user satisfaction, and battery lifespan. Real-time optimization under such a framework must also include predictive analytics for user departure times and charging goals.
In addition to AI and V2G, future work should investigate the integration of blockchain-enabled peer-to-peer (P2P) energy trading. This decentralized approach can facilitate transparent, secure micro-transactions between users, EV owners, and energy providers. Smart contracts can automate energy settlements, allowing localized energy markets to function efficiently and with minimal central intervention. These developments would be particularly impactful in Malaysia’s evolving energy ecosystem, supporting policies under the Low Carbon Mobility Blueprint and National Energy Transition Roadmap (NETR).
From a systems perspective, further research is needed to refine second-life EV battery usage in the residential BESS [25]. Degraded batteries introduce variability in energy density, internal resistance, and thermal behavior. Future models should incorporate real-time State-of-Health (SoH) estimation and degradation-aware dispatch algorithms to ensure safe and reliable operation [26]. In tandem, enhancing SCADA integration with AI-based fault detection and predictive diagnostics will improve resilience and fault tolerance in smart grids. Federated learning approaches, allowing distributed data training while preserving user privacy, can enhance these systems’ adaptability across diverse regional clusters.
In practical deployments, several technical challenges persist. The intermittency of solar PV energy poses reliability concerns, particularly during overcast or rainy conditions [27]. This necessitates hybrid control strategies that blend forecast-based optimization with rule-based fallback mechanisms. Similarly, uncoordinated user behavior and peak clustering introduce unpredictability that must be addressed through stochastic modeling and scenario-based control logic. Additionally, standardization and interoperability remain pressing issues. The lack of unified communication protocols between chargers, vehicles, and grid operators can hinder seamless integration. Future deployments should adopt open standards such as OCPP 2.0.1 and ISO 15118 to ensure system flexibility and cross-vendor compatibility [27]. Mobile energy storage can provide additional grid support and flexibility, particularly in regions with unstable infrastructure or for rapid deployment during peak demand events [28].
Based on the above, the following recommendations are proposed: (1) integrate advanced AI and deep learning models for forecasting and behavior modeling, (2) develop and test V2G-enabled optimization schemes considering battery wear, (3) explore blockchain frameworks for decentralized energy trading and data integrity, (4) adopt federated learning and SCADA–AI integration for predictive diagnostics, (5) conduct extended field validation across urban and semi-urban Malaysian areas, and (6) collaborate with regulatory agencies to enable dynamic pricing, incentive structures, and interoperability standards. These steps will not only advance the technical robustness of the hybrid framework but also support a scalable, user-centric, and resilient smart grid ecosystem.
6. Conclusions
This research presents a technically comprehensive framework for optimizing residential electric vehicle (EV) charging infrastructure using Artificial Intelligence (AI), specifically Reinforcement Learning (RL), Linear Programming (LP), and real-time grid-aware scheduling. Through a simulation-based deployment involving 800 residential units, 50 wall-mounted 7.4 kW chargers, and integration with a 120 kWp rooftop solar photovoltaic (PV) system and a 60 kWh lithium-ion battery energy storage system (BESS), the proposed hybrid model demonstrates significant improvements in grid performance, energy efficiency, and user satisfaction. The RL-based control strategy dynamically predicts user behavior and charging demand, optimizing power allocation based on a reward function that balances cost efficiency, voltage stability, and user convenience. LP-based scheduling ensures adherence to grid constraints while minimizing total energy cost. SCADA-integrated simulations confirm a transformer load reduction of 31.5%, voltage deviation mitigation from ±5.8% to ±2.3%, and enhanced solar PV self-utilization from 48% to 66%. Additionally, energy cost per user decreased by 22.5%, with high user satisfaction and system responsiveness. Future efforts must focus on expanding real-world trials, refining algorithmic performance, and integrating renewable energy and V2G systems to fully realize the potential of intelligent EV charging infrastructures.