Unveiling Trends in Machine Learning for Smart Grids: A Comprehensive Bibliometric and Science Mapping Approach

Zaidi, Abdelhamid; Ajibade, Samuel-Soma M.; Adediran, Anthonia Oluwatosin; Jasser, Muhammed Basheer

doi:10.3390/en19133007

by Abdelhamid Zaidi ¹, Samuel-Soma M. Ajibade ²,³,⁴,*, Anthonia Oluwatosin Adediran ⁵ and Muhammed Basheer Jasser ²,⁴,⁶

¹ Department of Mathematics, College of Science, Qassim University, P.O. Box 6644, Buraydah 51452, Saudi Arabia

² School of Computing and Artificial Intelligence, Faculty of Engineering and Technology, Sunway University, Bandar Sunway 47500, Selangor Darul Ehsan, Malaysia

³ Research Centre for Nanomaterials and Energy Technology (RCNMET), Sunway University, Bandar Sunway 47500, Selangor Darul Ehsan, Malaysia

⁴ School of Computing and Communications, Lancaster University, Lancaster LA1 4WA, UK

⁵ Department of Real Estate, Faculty of Built Environment, Universiti Malaya, Kuala Lumpur 50603, Malaysia

⁶ Research Centre for Human-Machine Collaboration (HUMAC), Faculty of Engineering and Technology, Sunway University, Petaling Jaya 47500, Selangor Darul Ehsan, Malaysia

* Author to whom correspondence should be addressed.

Energies 2026, 19(13), 3007; https://doi.org/10.3390/en19133007 (registering DOI)

Submission received: 3 March 2026 / Revised: 13 June 2026 / Accepted: 15 June 2026 / Published: 25 June 2026

Abstract

The exponential growth of machine learning (ML) applications in smart grid (SG) research over the past decade has generated a vast and fragmented body of literature that lacks systematic synthesis. This study addresses that gap by presenting a comprehensive bibliometric and science mapping analysis of the ML–smart grid (MLSG) research landscape to date, drawing on 4156 peer-reviewed publications indexed in the Elsevier Scopus database from 2009 to 2025. The principal contributions of this study are fourfold. First, it provides a rigorous quantitative mapping of MLSG publication growth from one document in 2009 to 1163 publications in 2025, representing a growth rate of 116,200%, thereby establishing a definitive baseline for tracking future scholarly development in the field. Second, it identifies the key actors driving MLSG research, including the most prolific authors (Nadeem Javaid, Alsabaan M.), leading institutions (King Saud University, Tennessee Technological University), and dominant nations (India, China, United States), which offers researchers and funding bodies actionable intelligence on collaboration opportunities and research leadership. Third, through keyword co-occurrence and cluster analysis, the study maps the three dominant thematic hotspots structuring current MLSG research—Smart Grid Security, Power Load Forecasting, and Advanced Energy Management—providing a structured intellectual framework that can guide future research prioritization. Fourth, the study delivers a critical thematic literature review of these three hotspots, synthesizing the most impactful ML methodologies and applications reported across 4156 publications, including deep learning-based intrusion detection, ensemble forecasting models, and reinforcement learning-driven energy management. 
Collectively, these contributions offer a robust evidence base for researchers, policymakers, and industry practitioners seeking to navigate, benchmark, and advance the field of ML-enabled smart grid systems.

Keywords:

machine learning; artificial intelligence; smart grids; smart grid security; power load forecasting; energy management; smart energy; bibliometric analysis

1. Introduction

The urgent need for sustainable energy and its efficient distribution to address the rising global energy demand has prompted the modernization of power networks [1,2]. The traditional or conventional power grid was initially built for a unidirectional flow of electricity from large, centralized power plants to users [3,4]. However, such power grids face myriad challenges, particularly in integrating renewable energy sources such as wind, solar, and geothermal [5,6]. Other notable challenges include controlling peak demand and maintaining stability and reliability when supply depends on such sources. Consequently, smart grid technologies have been developed as a practical response to these difficulties.

In principle, smart grids (SGs) have the potential to completely transform the production, distribution, and consumption of electricity globally [7,8]. SGs use cutting-edge technologies to create an energy ecosystem that is dynamic, self-optimizing, and interactive. As a result, smart grid development has steadily come to dominate the global power industry in recent years [9,10]. However, its growing proliferation has also presented issues, including the security of such networks, which have become prone to physical and cyberattacks [11,12,13]. Therefore, the security issues associated with SGs have become an important area of study among scientists and researchers around the globe. In response, innovative strategies and state-of-the-art technologies such as data encryption, physical control, and authentication have been implemented in smart grids to enhance their security and prevent breaches.

However, with the increasing sophistication of malicious intrusions into smart grid networks, researchers are constantly exploring rapid and reliable detection techniques to curb or mitigate such attacks. One such method is deploying machine learning (ML) algorithms in smart grid networks. Beyond smart grids, machine learning has emerged as a transformative tool across a broad spectrum of electrical systems applications. In power systems, ML techniques, including support vector machines, deep neural networks, convolutional neural networks, and ensemble methods, have been extensively deployed for fault detection and classification, equipment condition monitoring, and stability assessment [14,15]. For instance, ML-based fault diagnosis in transmission lines and power transformers enables early detection of insulation degradation, winding faults, and thermal anomalies, significantly reducing unplanned outages and maintenance costs [16,17]. Similarly, ML has been applied to power quality analysis, identifying and classifying disturbances such as voltage sags, harmonics, and transients, which are critical concerns in both industrial and residential electrical installations [18]. Furthermore, in the domain of electric motor drives, ML models have been used for predictive maintenance, rotor fault diagnosis, and efficiency optimization, demonstrating broad applicability across industrial electrical infrastructure [19]. In renewable energy systems, which form a critical component of modern electrical networks, ML algorithms have been widely used for solar irradiance prediction, wind speed forecasting, and maximum power point tracking optimization [20]. These applications collectively demonstrate that while smart grids represent one of the most prominent and data-rich environments for ML deployment, the underlying algorithmic foundations and methodological approaches are relevant across the full spectrum of electrical engineering applications. 
The present study specifically focuses on smart grids as a well-defined and rapidly growing sub-domain within this broader electrical systems landscape, motivated by the exponential growth of publications identified in the Scopus database and the critical societal importance of intelligent grid management.

ML stands out among the various transformative technologies that support SGs [21]. ML is a powerful tool that enables data-driven decision-making, predictive analytics, and effective management of SGs [22,23]. ML-based algorithms can process enormous amounts of data by utilizing the power of artificial intelligence (AI) [24]. The insights extracted from these data have been reported to improve grid stability, enable smarter energy use, and raise overall efficiency [25,26,27].

Over the years, researchers have developed various ML-based algorithms and models to detect smart grid intrusions such as denial-of-service (DoS) attacks [28,29,30]. In a typical workflow, the model first gathers network data, selects features, and applies principal component analysis (PCA) to reduce the dimensionality of the data. Next, a support vector machine (SVM) classifier is used to detect anomalies [31,32]. The literature search on ML applications in SGs revealed over 2000 publications. These publications on ML in SGs, or MLSG research, comprise reviews, articles, conference papers, book chapters, books, and editorials.
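The detection workflow described above (feature preparation, PCA, then an SVM) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration and is not taken from the cited studies: the synthetic four-feature "traffic" records, the choice of a single principal component found by power iteration, and a one-dimensional linear SVM trained by subgradient descent on the hinge loss. Real deployments would normally rely on an established ML library and real network data.

```python
import random

def fit_standardizer(rows):
    """Return per-feature means and standard deviations (std falls back to 1.0 if zero)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [((sum((x - m) ** 2 for x in c) / len(c)) ** 0.5) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def standardize(rows, means, stds):
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def first_component(rows, iters=100):
    """Leading principal axis of already-centred rows, found by power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def project(rows, axis):
    """Reduce each row to one score along the principal axis."""
    return [sum(x * a for x, a in zip(r, axis)) for r in rows]

def train_linear_svm(xs, ys, lr=0.01, lam=0.01, epochs=50, seed=0):
    """1-D soft-margin linear SVM trained by subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = ys[i] * (w * xs[i] + b) < 1
            w -= lr * lam * w            # gradient of the L2 regulariser
            if violated:                 # hinge term is active only inside the margin
                w += lr * ys[i] * xs[i]
                b += lr * ys[i]
    return w, b

def make_traffic(n, rng):
    """Hypothetical synthetic records: label -1 = normal, +1 = attack."""
    rows, labels = [], []
    for k in range(n):
        label = 1 if k % 2 else -1
        centre = 3.0 if label == 1 else 0.0
        rows.append([rng.gauss(centre, 1.0) for _ in range(4)])
        labels.append(label)
    return rows, labels

rng = random.Random(42)
train_rows, train_y = make_traffic(200, rng)
test_rows, test_y = make_traffic(100, rng)

# Fit the scaler and the principal axis on training data only, then reuse them.
means, stds = fit_standardizer(train_rows)
axis = first_component(standardize(train_rows, means, stds))
w, b = train_linear_svm(project(standardize(train_rows, means, stds), axis), train_y)

test_scores = project(standardize(test_rows, means, stds), axis)
predictions = [1 if w * s + b >= 0 else -1 for s in test_scores]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The scaler and the principal axis are fitted on the training split only so that the held-out score is not contaminated by test data. The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about performance on real smart grid traffic.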

The literature review reveals publications that examine the application of big data and machine learning in the context of smart grids [33,34]. The studies also highlight the contemporary strategies proposed and implemented for effectively managing the power distribution and energy consumption of SGs [35]. In addition, the publications examine many issues and difficulties related to applying machine learning methods in SGs [36,37]. For example, SGs are prone to challenges such as security, the detection of false data attacks [38], data-driven probabilistic machine learning [23], and cybersecurity [39,40]. These studies demonstrate the tremendous influence and possible advantages of the data-driven and ML-based approach in SGs.

Although prior studies have explored machine learning applications in smart grids, a detailed comparison reveals several limitations in scope, methodology, and analytical depth. For instance, the bibliometric study by Purna Prakash et al. [41] focuses broadly on smart grid research, providing insights into publication trends, authors, and collaboration networks; however, it does not specifically isolate or deeply analyze the machine learning smart grid (MLSG) intersection, nor does it integrate thematic synthesis beyond descriptive statistics [41]. Similarly, Gao et al. [42] conducted a bibliometric analysis of artificial intelligence in renewable energy, mapping research trends and knowledge structures using approximately 1054 publications, but the study emphasizes renewable energy transition rather than the smart grid domain specifically, limiting its applicability to MLSG research.

More specialized bibliometric works, such as Rasoulnia et al. [43], examine smart grid technologies and measurement systems using combined bibliometric and systematic review approaches. However, their focus is largely on hardware, monitoring technologies, and grid infrastructure, rather than the algorithmic and methodological evolution of machine learning within smart grids. Likewise, Jaramillo et al. [44] provide a bibliometric assessment of digital technologies (AI, IoT, blockchain) in energy systems, but the analysis is multi-domain and technology-aggregated, lacking a focused and in-depth exploration of MLSG as a distinct research field. In addition, earlier domain-specific bibliometric surveys, such as the work by Sakhnini et al. [45], concentrate narrowly on smart grid cybersecurity, offering valuable insights into threat detection and defense mechanisms but failing to capture the broader MLSG landscape, including forecasting, optimization, and energy management applications. These studies are therefore thematically constrained and do not provide a holistic structural mapping of the field. Furthermore, several recent studies, such as Banad et al. [46], combine bibliometric and narrative reviews and identify key clusters such as forecasting, control, and security; however, these analyses are typically based on relatively small datasets (e.g., ~123 studies) and do not offer large-scale, longitudinal mapping or integrated thematic synthesis across multiple dimensions of the field.

Despite the growing number of studies examining the use of ML in SG networks, a thorough review that methodically compiles and synthesizes the information in this field is conspicuously lacking. Therefore, a comprehensive literature search was carried out using variants of the terms “deep learning,” “smart grids,” “machine learning,” and “smart grid networks.” The related publications on the topic were retrieved from the Elsevier Scopus database to examine and highlight the publication trends, bibliometric analysis, and the scientific literature on the application of machine learning (ML) in smart grids (SGs). Bibliometric analysis (BA) is a quantitative research method used to investigate and quantify many elements of scholarly publications, including citation, authorship, and collaboration trends [47,48]. It aids in assessing the significance and influence of the academic literature, locating important authors and journals, and spotting trends in certain research areas [30,49]. Therefore, it is envisaged that the paper will provide current and future researchers with comprehensive insights into the topic’s scientific and technological growth trends over the years.

While numerous studies have reviewed machine learning applications in smart grids, most adopt narrative or systematic review approaches, which primarily focus on synthesizing findings within predefined thematic boundaries. However, given the rapid expansion, fragmentation, and interdisciplinary nature of the MLSG literature (4156 publications), a conventional systematic review alone is insufficient to capture the global research structure, collaboration networks, and thematic evolution of the field. Therefore, a bibliometric and science mapping approach is adopted in this study to provide a quantitative, large-scale, and objective analysis of the MLSG research landscape. This approach enables the identification of hidden patterns, influential contributors, and emerging research directions that cannot be systematically revealed through traditional review methods.

The purpose of this study is to systematically map and critically synthesize the global research landscape on the application of machine learning in smart grids (MLSG), using bibliometric analysis and science mapping as the primary methodological framework. Building on earlier contributions that have examined individual dimensions of the field, this study advances the discourse by systematically integrating publication trend analysis, social network mapping, thematic hotspot identification, and key actor profiling across the Scopus-indexed MLSG corpus from 2009 to 2025, offering a more holistic and current view of the field’s intellectual structure and development. Specifically, the study is guided by five research objectives: (1) to quantify and characterize the growth trajectory of MLSG publications from 2009 to 2025, including document types, source titles, and subject area distributions; (2) to identify the most prolific and influential authors, institutions, and countries contributing to MLSG research and map the collaboration networks connecting them; (3) to determine the leading funding organizations supporting MLSG research globally; (4) to uncover the dominant thematic hotspots and emerging research themes through keyword co-occurrence and cluster analysis; and (5) to critically review the most significant ML methodologies and findings within the three identified thematic hotspots: Smart Grid Security, Power Load Forecasting, and Advanced Energy Management.

In pursuing these objectives, this study makes five original contributions to knowledge. First, it provides a comprehensive bibliometric baseline by analyzing 4156 Scopus-indexed MLSG publications from 2009 to 2025, establishing a current and detailed quantitative reference point for the field that can inform future longitudinal studies. Second, it maps the global distribution of MLSG research productivity, identifying the leading authors, institutions, countries, and funding bodies that are directly actionable for collaboration-seeking researchers and strategic funding decisions. Third, through systematic keyword co-occurrence analysis, the study constructs a thematic map of the MLSG landscape, identifying three coherent research clusters, which are Smart Grid Security, Power Load Forecasting, and Advanced Energy Management, that structure the field’s intellectual architecture. Fourth, beyond bibliometric mapping, the study delivers a critical synthesis of ML applications within each identified hotspot, summarizing methodological advances, key findings, and open research challenges in a manner that directly informs future research directions. Fifth, the study situates MLSG research within the broader context of ML applications across electrical systems including power system fault detection, power quality analysis, transformer diagnostics, and renewable energy integration, thereby demonstrating the wider relevance and transferability of smart grid ML methodologies beyond the smart grid domain.

This study is guided by three research questions formulated prior to data collection and addressed systematically across the results and discussion sections. (1) What is the overall growth trajectory of MLSG research between 2009 and 2025, and which journals, institutions, countries, and funding organizations have been most productive and influential in shaping the field? This question establishes the quantitative and structural baseline of the study, capturing publication trends, dominant dissemination outlets, leading stakeholders, and the collaboration networks connecting them. (2) What are the dominant thematic hotspots and emerging research themes within the MLSG corpus, as revealed through keyword co-occurrence and cluster analysis? This question employs science mapping techniques to uncover the intellectual structure of the field, identifying the principal thematic clusters around which research has coalesced and how those themes have evolved over time. (3) What are the most significant ML methodologies, application domains, persistent challenges, and current trends shaping the development of machine learning in smart grid systems? This question moves beyond bibliometric mapping to engage critically with the substantive content of the literature, examining the specific models most widely deployed, the barriers limiting operational translation, and the emerging directions expected to define the next phase of MLSG development. Table 1 summarizes the key characteristics of representative prior bibliometric studies on related domains, demonstrating how the present study advances beyond each in scope, dataset size, methodological depth, and thematic coverage.

As shown in Table 1, none of the prior studies simultaneously analyzed the full MLSG corpus at this scale (4156 publications), applied integrated bibliometric and science mapping techniques, and delivered a critical thematic synthesis across multiple research hotspots, confirming the distinct contribution of the present study.

The remainder of this paper is structured as follows: Section 2 describes the methodology; Section 3 presents and discusses the results; and Section 4 presents the conclusions and directions for future research.

2. Methodology

This study examines the current research landscape on applying machine learning (ML) in smart grids (SGs) through publication trends, bibliometric analysis, and a literature review. The study is based on selecting related publications on the topic indexed in the Elsevier Scopus database from 2009 to 2025. Consequently, the PRISMA methodology was adopted to identify, screen, and analyze related documents on machine learning (ML) in smart grids (SGs), hereafter termed MLSG research. The Scopus database was selected due to its broad coverage of peer-reviewed journals, conference proceedings, and interdisciplinary research, particularly in engineering, energy, and computer science domains. Prior studies have demonstrated that Scopus provides comparable or broader coverage than Web of Science in these fields [50,51]. The first step involved the identification of related publications on the topic in the Scopus database using the designed keywords on the topic.

To enhance methodological transparency and reproducibility, the search query was refined using explicit Boolean grouping in Scopus as follows: TITLE-ABS-KEY (“machine learning” AND (“smart grid” OR “smart energy”)). Furthermore, to evaluate robustness, a sensitivity analysis was conducted using alternative queries: (i) “Artificial Intelligence AND Smart Grid”, (ii) “Deep Learning AND Smart Grid”, and (iii) “Machine Learning AND Energy Systems”. The results revealed that (i) the core dataset (4156 documents) shared 72–81% overlap with alternative queries, (ii) the three dominant thematic clusters (Smart Grid Security, Load Forecasting, Energy Management) remained structurally stable across all queries, and (iii) top 10 authors, institutions, and countries exhibited >85% consistency across query variants.
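The overlap and consistency figures from the sensitivity analysis can be computed along the following lines. This is a minimal sketch under stated assumptions: overlap is taken to mean the share of core-query records that also appear in an alternative query, consistency is the share of the core top-k entries that reappear in the alternative top-k, and the record identifiers are hypothetical placeholders. The text does not state which denominators were used, so these definitions are illustrative only.

```python
def overlap_share(core_ids, alt_ids):
    """Share of core-query records that also appear in the alternative query."""
    core, alt = set(core_ids), set(alt_ids)
    if not core:
        raise ValueError("core_ids must not be empty")
    return len(core & alt) / len(core)

def top_k_consistency(core_ranking, alt_ranking, k=10):
    """Share of the core top-k entries that also rank in the alternative top-k."""
    core_top = set(core_ranking[:k])
    alt_top = set(alt_ranking[:k])
    if not core_top:
        raise ValueError("core_ranking must not be empty")
    return len(core_top & alt_top) / len(core_top)

# Hypothetical record identifiers, for illustration only.
core = ["r1", "r2", "r3", "r4", "r5"]
alternative = ["r2", "r3", "r4", "r5", "r9", "r10"]
print(f"overlap: {overlap_share(core, alternative):.0%}")  # prints "overlap: 80%"
```

Applied to exported record identifiers and to ranked lists of authors, institutions, or countries, the same two functions would give one overlap value and one consistency value per alternative query.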

The primary search query TITLE-ABS-KEY (“machine learning” AND (“smart grid” OR “smart energy”)) was selected to maximize precision by ensuring retrieved documents explicitly address the ML–smart grid intersection, thereby reducing noise from broader AI or energy literature. While narrower than queries using umbrella terms such as “artificial intelligence” or “data-driven methods,” this specificity was intentional: the aim was to study the MLSG intersection as a distinct, well-defined sub-field rather than the broader AI–energy domain already covered by prior bibliometric studies (e.g., Gao et al. [42]). The sensitivity analyses confirmed that the 72–81% overlap between the core dataset and alternative queries, combined with >85% consistency in top actors and dominant themes, validates the representativeness of the core query. However, it is acknowledged that a fraction of high-impact work, particularly studies employing reinforcement learning for energy dispatch, graph neural networks for grid topology inference, or federated learning for distributed smart meter analytics, may not use “machine learning” as an explicit keyword and could therefore fall outside the core dataset.

The search string (“machine learning” AND “smart grid” OR “Smart Energy”) was executed in the database on 3 May 2026 to identify related documents on MLSG research based on the TITLE-ABS-KEY criteria. The search returned 5138 documents based on the search string “TITLE-ABS-KEY (“machine learning” AND “smart grid” OR “Smart Energy”) AND PUBYEAR > 2008 AND PUBYEAR < 2026”. While the PUBYEAR < 2026 filter captures all records indexed by Scopus for the 2025 publication year at the time of retrieval, a subset of 2025 papers indexed after this date is absent from the dataset. Specifically, journals with longer post-acceptance indexing delays may contribute late-indexed 2025 articles that fall outside the present corpus. Consequently, the 2025 publication counts reported herein should be interpreted as a lower-bound estimate for that year, and any year-on-year growth comparisons involving 2025 should be read with this caveat in mind. This limitation does not affect the structural findings of the study, including thematic cluster composition, author rankings, or country-level productivity patterns, which are stable across the 2009–2024 sub-period. Next, the results were screened using the LIMIT-TO function of the Scopus database to retain articles, conference papers, and reviews published in English in journals or conference proceedings. The final number of documents after screening was 4156, based on the screening string “TITLE-ABS-KEY (“machine learning” AND “smart grid” OR “Smart Energy”) AND PUBYEAR > 2008 AND PUBYEAR < 2026 AND (LIMIT-TO (DOCTYPE, “cp”) OR LIMIT-TO (DOCTYPE, “ar”) OR LIMIT-TO (DOCTYPE, “re”)) AND (LIMIT-TO (LANGUAGE, “English”) OR EXCLUDE (LANGUAGE, “Polish”) OR EXCLUDE (LANGUAGE, “Spanish”)) AND (LIMIT-TO (SRCTYPE, “p”) OR LIMIT-TO (SRCTYPE, “j”))”.
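The screening logic encoded in the query string above can be restated as a filter over exported records. The sketch below is illustrative only: the `Record` class, its field names, and the three sample records are hypothetical, while the year bounds and the document-type, language, and source-type codes mirror those in the screening string.

```python
from dataclasses import dataclass

@dataclass
class Record:
    title: str
    year: int
    doctype: str   # "ar" article, "cp" conference paper, "re" review
    language: str
    srctype: str   # "j" journal, "p" conference proceeding

ALLOWED_DOCTYPES = {"ar", "cp", "re"}
ALLOWED_SRCTYPES = {"j", "p"}

def passes_screening(rec):
    """Apply the year, document-type, language, and source-type limits."""
    return (2008 < rec.year < 2026
            and rec.doctype in ALLOWED_DOCTYPES
            and rec.language == "English"
            and rec.srctype in ALLOWED_SRCTYPES)

# Hypothetical records, for illustration only.
records = [
    Record("Load forecasting with deep learning", 2021, "ar", "English", "j"),
    Record("Smart energy book chapter", 2019, "ch", "English", "b"),
    Record("Intrusion detection in smart grids", 2007, "cp", "English", "p"),
]
screened = [r for r in records if passes_screening(r)]
print(len(screened))  # prints 1
```

Only the first record survives: the second is a book chapter from a book source, and the third predates the 2009 lower bound.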

Lastly, the final search results were recovered from the Scopus database as CSV files for publication trends, bibliometric analysis, and a literature review of the research landscape on MLSG research. The publication trends analysis examined the growth trajectory of the published documents, document types, sources, and subject areas on the topic.

Two independent reviewers evaluated the titles and abstracts of the documents retrieved through the database search and categorized each as “yes,” “no,” or “maybe” for inclusion. The inclusion criteria were: (i) studies explicitly addressing ML/AI applications in smart grids, (ii) peer-reviewed journal articles, conference papers, and reviews, and (iii) English-language publications. The exclusion criteria were: (i) studies unrelated to smart grid systems, (ii) papers focusing solely on energy policy without ML methods, and (iii) duplicate or incomplete records. To reduce errors and omissions, each review was completed independently, and the results were compared to determine which documents should be included. The documents were then categorized according to their research aim, including whether they focus on artificial intelligence and the Internet of Things. Next, bibliometric analysis was performed using the VOSviewer software and RStudio 4.4 (Biblioshiny). The clustering technique in VOSviewer [52] was used because one of our objectives is to understand the co-authorship behavior of authors and countries. The clustering technique is precise in determining the relationships of publications based on direct citation relations [53]. Bibliographic data were fed into VOSviewer (Version 1.6.18) to derive the networks of co-authors, keyword occurrences, and citation trends on MLSG from 2009 to 2025. From the raw data, the following parameters were obtained: document type, language, chronological growth, journal, author, country, organization, and keyword. VOSviewer and RStudio, software tools for creating maps of scientific publications, journals, research institutions, countries, and keywords, were primarily used for the analysis of bibliometric networks. 
To ensure full reproducibility, the following VOSviewer parameters were applied: for co-authorship analysis, a minimum of 5 documents per author and 50 citations per author were set using the full counting method; for keyword co-occurrence analysis, a minimum occurrence threshold of 100 was used, resulting in 83 keywords meeting the threshold; for country collaboration analysis, a minimum of 10 publications and 50 citations per country were required; additionally, the clustering resolution parameter was set to 1.00, and the visualization type employed was network visualization. These tools have been widely used in scientometric studies, for example in analyzing scientific collaboration in a given discipline [54], social networking and academic performance [55], bibliometric visualization and mapping of knowledge domains [56], and bibliographic coupling and co-citation analyses [57]. Inter-rater reliability between the two independent reviewers, who each screened the titles and abstracts of the retrieved documents and classified them into inclusion categories, was assessed in two ways. The raw percentage agreement, calculated as the proportion of documents with consistent classification decisions, was approximately 84%, which is considered acceptable based on established thresholds (>80%) in systematic review studies. Cohen’s Kappa coefficient yielded a value of κ = 0.79, indicating substantial agreement. Discrepancies were resolved through discussion and consensus, ensuring consistency in the final dataset. The publication trends analysis aimed to examine the research impact and global interest in the topic over the timespan examined [58]. 
On the other hand, the bibliometric analysis aimed to identify and examine the social networks and collaborative links between the major stakeholders (authors, affiliations, and countries) actively involved in research on the topic [59,60]. The literature review was then conducted to examine the current developments and potential research directions on the topic in the coming years [61].
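The two agreement statistics reported for the screening stage, raw percentage agreement and Cohen’s Kappa, can be computed as in the following sketch. The reviewer labels are hypothetical and are not the study’s screening data, so the printed values are not expected to match the reported 84% and κ = 0.79.

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of items on which both raters chose the same category."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's marginal proportions.
    p_e = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions from two reviewers.
reviewer_1 = ["yes", "yes", "no", "maybe", "no", "yes", "no", "yes", "maybe", "no"]
reviewer_2 = ["yes", "yes", "no", "no", "no", "yes", "maybe", "yes", "maybe", "no"]

print(percent_agreement(reviewer_1, reviewer_2))  # prints 0.8
print(cohens_kappa(reviewer_1, reviewer_2))
```

Kappa is lower than raw agreement because it discounts the agreement that two raters with these marginal category frequencies would reach by chance.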

Furthermore, to ensure the robustness and reliability of the bibliometric findings, a sensitivity analysis was conducted by varying key parameters, including search keywords and time range selection. The primary search query combined terms related to machine learning and smart grids; to assess the stability of the results, alternative keyword combinations such as “Artificial Intelligence AND Smart Grid,” “Deep Learning AND Smart Grid,” and “Machine Learning AND Energy Systems” were tested. Although the total number of retrieved publications varied across these queries, the overall research trends, leading countries, key authors, and dominant themes remained consistent, with core clusters such as Smart Grid Security, Load Forecasting, and Energy Management appearing across all variations. This indicates that the findings are robust to changes in keyword selection and not overly sensitive to specific query formulations. In addition, temporal sensitivity was evaluated by comparing two time periods: 2009–2018 (early development phase) and 2019–2025 (recent growth phase). The results show that earlier studies focused mainly on foundational machine learning applications and smart grid infrastructure, whereas recent studies demonstrate a shift toward deep learning, cybersecurity, and advanced energy management systems. Despite these temporal differences, the overall upward publication trend and thematic structure remained stable. Collectively, these results confirm that the study’s findings are methodologically robust, not dependent on a single query configuration, and therefore support the validity and generalizability of the identified research trends and thematic clusters.

Figure 1 shows the PRISMA framework displaying the four-phase search and selection procedure organized to include publications that might be utilized to perform a reliable scientometric analysis. This same procedure was used in the study of [52].

3. Results and Discussion

3.1. Published Documents Analysis

Figure 2 shows the growth in MLSG publications between 2009 and 2025. As observed in Figure 2, MLSG publications grew from one document in 2009 to 1163 in 2025, yielding a cumulative growth of 116,200% over the 16-year period. While this cumulative percentage is mathematically precise, it is naturally amplified by the extremely low baseline of the field’s early years and should be interpreted accordingly. A more informative measure of sustained growth momentum is the Compound Annual Growth Rate (CAGR), which for the MLSG corpus stands at approximately 56.4% per annum across the 2009–2025 period. This figure substantially exceeds estimated CAGR benchmarks reported for the broader energy research literature (~8–12% per annum) and general AI/machine learning publications (~25–30% per annum), confirming that MLSG represents one of the fastest-growing sub-fields at the intersection of electrical engineering and data science. The observed growth trajectory reflects both the increasing availability of large-scale smart meter datasets and the broader institutionalization of data science methods across engineering disciplines.
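Both growth measures can be checked from the two endpoint counts given above. The sketch assumes the standard endpoint definition of CAGR, (last / first)^(1 / periods) − 1, with 16 annual periods between 2009 and 2025. Under that assumption the result is a little over 55% per annum, slightly below the approximately 56.4% quoted, so the reported figure may rest on a different period convention or estimation method that the text does not specify.

```python
def cumulative_growth_pct(first, last):
    """Total percentage change from the first to the last value."""
    return (last - first) / first * 100

def cagr(first, last, periods):
    """Compound annual growth rate using the endpoint formula."""
    return (last / first) ** (1 / periods) - 1

first_year, last_year = 2009, 2025
first_count, last_count = 1, 1163      # endpoint publication counts from the text
periods = last_year - first_year       # 16 annual periods (an assumption)

print(cumulative_growth_pct(first_count, last_count))  # prints 116200.0
print(f"{cagr(first_count, last_count, periods):.2%}")
```

Because the endpoint formula depends only on the first and last values, it is sensitive to the single-document 2009 baseline. A log-linear fit over all yearly counts would be a common alternative estimator.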

The findings indicate an enormous increase in interest among major stakeholders in the field, as evident in the 4156 publications comprising 2040 (49.1%) conference papers, 1852 (44.6%) articles, and 264 (6.4%) review papers. The large number of publications suggests that extensive research has been advancing knowledge on the topic [62]. Such a volume can encourage networking, collaboration, and access to many viewpoints. However, it may also result in duplication and variance in quality, calling for careful assessment, and it makes it difficult for researchers to stay current without efficient filtering and analysis methods. To address quality control, journals in various fields have set up rigorous mechanisms such as peer review, which aims to admit only high-quality papers into the literature. Section 3.2 provides insights into the top journals on the topic of MLSG over the years.

3.2. Source Titles and Subject Area

Table 2 and Figure 3 present the source titles that have published on the MLSG topic in the Scopus database, generated using the RStudio tool (Biblioshiny). The Scopus data indicates that 85 source titles have published one or more publications on the topic, garnering 26,630 citations.

The top 10 source titles have published 731 documents, or 17.6% of the TP on the topic. Based on the findings, Energies, IEEE Access, and Applied Energy are the top journals in the field of MLSG. The productivity of these journals may be due to their impact and reputation. Another critical factor may be the subject area and scope of the journals, which attract researchers in the field. To examine this further, the subject areas of the MLSG research landscape were analyzed using data from Scopus. Table 3 presents the key subject areas. The data shows that MLSG publications in the Scopus database are classified into 22 subject areas. The top 10 (which account for 95.94% of total publications) presented in the table show that the MLSG research landscape is broadly themed, with disciplines ranging from STEM (science, technology, engineering, and mathematics) to MASH (management, arts, social sciences, and humanities). Bradford’s Law, in its classical formulation, applies to the scattering of literature across journal sources rather than subject area classifications. However, in the bibliometric literature, the underlying principle that a small number of core categories account for a disproportionately large share of the literature has been applied analogously to subject area distributions to identify dominant disciplinary orientations [63,64]. Following this convention, and noting that engineering (25.6%), computer science (24.9%), and energy (16.0%) collectively account for 66.5% of all subject area classifications, substantially exceeding the 33.33% core zone threshold, we designate these three as the dominant disciplinary anchors of MLSG research. We acknowledge that this represents an analogical extension of Bradford’s original formulation and note this limitation for transparency.

3.3. Authors

Figure 4 shows the top five most prolific researchers on the MLSG research landscape. The results show that 160 authors have published one or more publications over this study’s time frame. As observed, Nadeem Javaid, based at the COMSATS University Islamabad in Pakistan, is the most prolific researcher, with 23 publications cited 1256 times over the years. The researcher’s most notable publication is “A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids,” with 599 citations to date [65]. Published in the open-access journal “Renewable and Sustainable Energy Reviews,” the paper proposed an FA-XGBoost classifier for detecting electricity theft. The classifier achieved an F1-score of ~94%, a precision of ~93%, and a recall of 97%. The integrated VGG-16 module was observed to have higher generalized performance for training and testing data at precision values of ~87% and ~84%, respectively. Lastly, the authors reported that the suggested FA-XGBoost accurately recognized real electricity thieves with a 97% recall value. In another study, Nadeem Javaid and co-workers examined two proposed MIMO DRNN models (ESAENARX and DE-RELM) for estimating the prices and loads of electricity in a smart city [66]. The findings revealed that the ESAENARX and DE-RELM models outperformed a benchmark model and their sub-models. Lastly, the study reported that the refined and informative characteristics obtained from ESAE enhanced the prediction precision of ESAENARX, whereas optimization enhanced the accuracy of DE-RELM. Another notable study by Javaid and co-workers is “Towards Efficient Energy Utilization Using Big Data Analytics in Smart Cities for Electricity Theft Detection,” which examined the use of big data and machine learning in the detection of electricity theft in smart cities [67].

Another significant contributor to the field of MLSG is Alsabaan, M of King Saud University in Saudi Arabia. The group of Alsabaan, M has published 18 papers on MLSG, which have been cited 234 times over the years. The most notable publication by the researcher is “Electricity Theft Detection Using Deep Reinforcement Learning in Smart Power Grids,” which has been cited 55 times to date [68]. The study, published in the journal “IEEE Access”, highlighted that fraudulent customers can launch electricity theft cyberattacks by compromising their smart meters (SMs) to report false readings and pay less for their electricity usage. In addition, the study noted that these attacks harm the power sector since they cause substantial financial loss and degrade grid performance, because the readings are used for energy management. However, the authors warn that using ML and BD from smart grids will also present challenges in the near future. Another notable publication by Alsabaan, M and co-authors is “Clustering and Ensemble Based Approach for Securing Electricity Theft Detectors Against Evasion Attacks,” cited 32 times [69]. The IEEE Access publication highlighted that electricity theft in smart power grids causes huge economic losses to electrical utility companies. Machine learning (ML) models, especially deep neural networks (DNNs), achieve state-of-the-art performance in detecting electricity theft cyberattacks. The findings indicate that the cluster-based detector is not only more robust against evasion attacks but also enhances normal classification accuracy, because its training data has more consumption pattern similarity than the training data of the global detector, which requires a higher level of regularization.

Other notable researchers in the top five of the field include Badr M.M. (SUNY Polytechnic Institute, Utica, United States), Muhammad Ismail (Tennessee Technological University, United States), and Fouda M.M. (Idaho State University, Pocatello, United States), with 17, 17, and 16 publications, respectively. Overall, the findings indicate high productivity, scientific impact, and technological development in the field, which could be ascribed to numerous factors. For example, collaboration between scientific stakeholders (e.g., authors/researchers, affiliations, and countries) is considered an integral part of research productivity and scholarly advancement in many fields. This research examined the extent of collaboration between researchers through bibliometric analysis (BA). Figure 5 shows the network visualization map (NVM) of co-authorship on the MLSG research landscape, produced using the VOSviewer software.

In Figure 5, based on a minimum of five publications with at least 50 citations, the NVM displays seven clusters of collaboration among 30 of the top 141 researchers in MLSG research. Further analysis shows that Alsabaan, M, Badr M.M. and Refaat Shady have the highest total link strength (TLS) values of 62, 54, and 46, respectively. The findings indicate that they have the strongest collaboration links among the major researchers on the topic. Furthermore, the top 10 corresponding authors’ countries in MLSG research are displayed in Figure 6 and listed in Table 4, both produced using the RStudio tool (Biblioshiny).
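To make the TLS measure concrete, the sketch below computes link strength and total link strength from a small, entirely hypothetical set of author lists. It assumes full counting, in which each co-authored publication adds one to the link between two authors; the counting option used for Figure 5 is not stated in the text, so this is an illustration of the measure rather than a reproduction of the VOSviewer output.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author lists, one per publication (not taken from the corpus).
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author B", "Author D"],
    ["Author A", "Author C"],
]

# Link strength: number of publications two authors share (full counting assumed).
link_strength = Counter()
for authors in papers:
    for pair in combinations(sorted(set(authors)), 2):
        link_strength[pair] += 1

# Total link strength: sum of an author's link strengths over all co-authors.
total_link_strength = Counter()
for (a, b), weight in link_strength.items():
    total_link_strength[a] += weight
    total_link_strength[b] += weight

for author, tls in total_link_strength.most_common():
    print(author, tls)
```

A high TLS therefore reflects both the number of distinct co-authors and the frequency of repeated collaboration with them.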

Researchers and institutions in China have produced 157 published documents, corresponding to a frequency of 0.107 in the corresponding-author data. This indicates that China is the prime mover in MLSG, and its corresponding authors are the most relevant and prolific, as shown in Figure 6. In second place is the United States, which has a corresponding authorship frequency of 0.067 and has been a long-term front-runner in the research and development of machine learning worldwide. The impact of US companies such as Samsara (San Francisco, CA, USA), Vates (Atlanta, GA, USA), Andersen Inc. (Bayport, MN, USA), and Oxagile (New York City, NY, USA) is widely reported in the media in the ML world. Other top nations such as India, South Korea, and Pakistan are also important players in the MLSG field, with frequencies of 0.056, 0.026, and 0.016, respectively, on the topic in Scopus. The dominance of the top five nations in MLSG research could be attributed to numerous factors, including the quest for improved living standards as well as increased efficiency, cost savings, and the environmental friendliness of processes, products, goods, and services worldwide.

3.4. Affiliations

Figure 7 shows the top five most productive affiliations in MLSG research. The data from Scopus indicates that 167 affiliations worldwide have published one or more publications on the topic. As shown in Figure 7, the most prolific affiliations on the topic have published 257 documents or ~7.06% of the total publications. The most prolific is King Saud University (KSU), based in Saudi Arabia, with 57 publications garnering 1600 citations. The productivity of KSU is largely due to Alsabaan, M, who has collectively produced 18 publications to date. The researchers at Tennessee Technological University have produced 51 publications (2154 citations), which places the affiliation as the second most productive in the field. Other noteworthy contributors to the field include Saveetha Institute of Medical and Technical Sciences, India (51 publications, 593 citations); College of Engineering, Ashraf Islam Engineering Building, Tennessee Technological University, United States (50 publications, 2136 citations); and Saveetha School of Engineering, India (48 publications, 442 citations). The findings reveal that institutions play a role in the productivity rate, particularly via research collaborations. Further analysis reveals that while intra-organizational collaboration is prevalent among top MLSG researchers, inter-organizational collaboration appears comparatively limited relative to other well-established interdisciplinary fields. The co-authorship network in Figure 5, based on 30 of the top researchers meeting the minimum threshold criteria, does reveal some inter-institutional links; however, the density of these connections, as reflected in the total link strength distribution, suggests that most high-productivity researchers collaborate primarily within their own institutional clusters rather than across organizational boundaries. This pattern may partly reflect the early-to-mid developmental stage of MLSG as a formalized sub-field.
The findings nonetheless point to a meaningful opportunity for enhanced cross-institutional and cross-sectoral collaboration to accelerate methodological progress.

3.5. Countries/Regions

Figure 8 presents the top five most prolific countries on the MLSG research landscape. In total, 95 countries have published one or more publications on the topic. However, the top five nations have produced 2860 publications or 68.82% of the total publications on the topic. Based on Bradford’s rule, these nations are the core nations on the topic. The findings also suggest that India, China, the United States, Saudi Arabia, and the United Kingdom are the prime movers on the topic worldwide. In technology development, prime movers are the key individuals, factors, or events that have significantly influenced the initiation, development or growth trajectory of any given idea, project, or initiative with societal or historical impacts.

As observed in Figure 8, the most prolific nation in MLSG research is India, with 963 publications cited 14,473 times over the years. The most notable publications on the MLSG landscape include “Machine Learning Paradigms for Next-Generation Wireless Networks” [70], “Machine Learning Methods for Attack Detection in the Smart Grid” [71], and “Detecting stealthy false data injection using machine learning in smart grid” [72]. The findings of such studies have demonstrated that ML is one of the most promising AI tools for addressing security, privacy, and integrity issues related to smart grids and next-generation wireless networks. ML can support device-to-device communications, smart radio terminals, cognitive radios, massive MIMOs, femto/small cells, heterogeneous networks, and smart grids. Likewise, ML can be utilized to enhance the quality of service, spectral effectiveness, and energy efficiency in 5G networks [70]. Furthermore, measurements in the SG can be classified as secure or under attack using ML methods. The attack detection problem can be modeled using decision- and feature-level fusion. Compared to state vector estimation techniques, ML algorithms are more effective in detecting attacks [71].

Researchers in China have also significantly contributed to the research landscape on MLSG. The nation has produced 762 publications and gained 25,130 citations on the topic. Examples of notable works from China include “Intelligent Edge Computing for IoT-Based Energy Management in Smart Cities” [73], “Real-time energy management of a microgrid using deep reinforcement learning” [73], and “Artificial Intelligence techniques for stability analysis and control in smart grids: Methodologies, applications, challenges and future directions” [74]. Studies focus on the various facets and methods of AI, real-time energy management, and intelligent edge computing in IoT-based energy management in smart cities and microgrids. The publications have highlighted the importance of AI-driven techniques, such as deep reinforcement learning and other AI techniques, in enhancing energy management and stability in smart cities, microgrids, and smart grid contexts. Additionally, they have demonstrated how intelligent edge computing and AI have the potential to revolutionize how energy is used, distributed, and managed, resulting in more effective, robust, and sustainable energy systems in cities.

Researchers and affiliations based in the United States (695), Saudi Arabia (262), and the United Kingdom (214) have also greatly contributed to the scientific growth and technological development of the research landscape on MLSG. Overall, the landscape has experienced high output in publications and citations, possibly due to numerous factors. Notably, collaboration and funding have been key to the changes observed in the landscape. To examine the effects, the co-authorship feature of VOSviewer was adopted to examine the level of collaboration between nations. Figure 9 presents the network visualization map on the level of collaboration between nations actively involved in the MLSG research landscape. The analysis is based on the set criteria of a minimum of 10 publications cited at least 50 times. Based on the analysis, 61 countries fulfilled the set criteria resulting in eight clusters with 2 to 12 nations each. The results reveal that the US is the most influential in collaboration based on its high TLS of 458 compared to 410 and 324 for China and India, respectively. The influence of the US is further emphasized by the high number of publications and citations on MLSG, as earlier surmised. Overall, the results indicate that the research landscape on MLSG is characterized by 694 Links and 2553 TLS. Hence, it could be reasonably surmised that it is an impactful area of research with the potential for even more output in the coming years.

3.6. Funding Organizations

Figure 10 presents the top five most influential funders of MLSG research globally. Scopus data on funders indicates that 159 organizations have funded one or more publications on the topic. This observation further buttresses the point that MLSG is an important area of research worldwide. The top funder of MLSG research is the NSFC (National Natural Science Foundation of China), with 190 publications cited 9876 times. Funding from the NSFC has been critical to research among researchers in China and their collaboration with peers based in the US, among other nations. The second largest funder is the National Science Foundation (NSF), with 134 publications cited 5625 times. The NSF has been crucial to the success of researchers (such as Aljohani N, Parvez I, and Alden RE, among others) with various affiliations (such as Florida International University, University of Florida, and the University of Kentucky, among others) in the US. Other notable research funders are the European Commission, Horizon 2020 Framework Programme (European Union) and the Department of Energy (US). Overall, funding from these organizations has provided critical support for publications, resources, remuneration, collaborations, travel, facilities, and other expenses related to research in the landscape on MLSG.

Table 5 shows the most highly cited publications on MLSG research in the Scopus database. The findings indicate that the top 12 most-cited publications have been cited between 88 and 840 times (or an average of 120 times), of which 60% are “articles” whereas 40% are “review” papers. With 1200 citations, these publications are considered benchmark publications on the topic. The most highly cited publication is “Optimal deep learning LSTM model for electric load forecasting using feature selection and genetic algorithm: Comparison with machine learning approaches” by [75], published in Energies, with 840 citations. Another highly cited article on MLSG research is “A Survey on the Detection Algorithms for False Data Injection Attacks in Smart Grids” by [76], published in IEEE Transactions on Smart Grid, with 672 citations to date. Overall, the papers critically address security issues and explore potential future study areas while providing a thorough overview of the various applications of MLSG, from attack detection to energy management and forecasting. Furthermore, the data shows that the scientific growth and technological development of the research landscape on MLSG has occurred along various themes and clusters. Section 3.7 presents the keyword co-occurrence analysis, which examines these research themes and clusters.

3.7. Keyword Co-Occurrence Analysis

Figure 11 presents the network visualization and overlay visualization maps for keyword co-occurrence in MLSG. Keyword co-occurrence (KCO) analysis is a bibliometric method that examines connections between terms in scholarly publications, revealing primary subjects and themes, mapping knowledge structures, finding research trends, and supporting literature reviews [82]. This study developed the research landscape maps with a minimum of 100 keyword occurrences.

The results showed that 83 keywords fulfilled the set criteria, which resulted in four clusters, 3288 links, and 99,674 TLS. Each cluster contained 8 to 29 words, as depicted in Figure 11. The most prevalent keywords are smart grid, smart power grids, machine learning, and electrical power transmission networks, which occurred 2307, 2300, 2209 and 1227 times, respectively. Based on the clusters, the hotspot analysis was performed to elucidate the current themes and future directions of the topic in the literature. The three identified clusters in Figure 11a,b indicate that the scientific growth and technological development of the research landscape on MLSG is based on three themes, namely:

(i): Cluster 1—“Smart Grid Security”;
(ii): Cluster 2—“Power Load Forecasting”;
(iii): Cluster 3—“Advanced Energy Management.”
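The counting behind a keyword co-occurrence map can be sketched as follows. The keyword lists are hypothetical and the occurrence threshold is lowered from the 100 used in this study to 2 so that the toy data passes it; the clustering step performed by VOSviewer is not reproduced here. The sketch only shows how a minimum-occurrence filter, the number of links, and the summed link strength relate to one another.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per document (illustrative only).
docs = [
    ["smart grid", "machine learning", "load forecasting"],
    ["smart grid", "machine learning", "intrusion detection"],
    ["smart grid", "load forecasting"],
    ["machine learning", "intrusion detection"],
]
MIN_OCCURRENCES = 2  # the study uses 100; lowered here for the toy data

# Count how many documents each keyword appears in, then apply the threshold.
occurrences = Counter(kw for kws in docs for kw in set(kws))
kept = {kw for kw, n in occurrences.items() if n >= MIN_OCCURRENCES}

# A link joins two retained keywords; its strength is the number of shared documents.
cooccurrence = Counter()
for kws in docs:
    for pair in combinations(sorted(set(kws) & kept), 2):
        cooccurrence[pair] += 1

n_links = len(cooccurrence)
summed_link_strength = sum(cooccurrence.values())
print(f"{len(kept)} keywords, {n_links} links, summed link strength {summed_link_strength}")
```

Raising the threshold shrinks the set of retained keywords and, with it, the number of links available for clustering.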

3.8. Most Frequent Words and Word Cloud

Figure 12 presents the most frequent words and word cloud for the keyword occurrences in MLSG. Word clouds are graphical representations of word frequency that give greater prominence to words that appear more frequently in a source text [82].

As shown in Figure 12, “smart power grids” is the most frequent keyword with 1029 occurrences, followed by “electric power transmission network” with 926 occurrences, while “smart grid” and “machine learning” are the third and fourth most frequent keywords with 700 and 578 occurrences, respectively. These results suggest that research on smart grid and machine learning applications is active and trending, hence the need for this study.

Table 6 shows the top 10 most frequent author keywords in the MLSG corpus, while Table 7 shows the top 10 most relevant keywords in the MLSG corpus ranked by Biblioshiny relevance score.

3.9. Review of the MLSG Literature

The papers discussed in the following thematic review were selected from the 4156-document MLSG corpus using a two-stage process. In the first stage, citation-ranked lists were generated for each of the three identified thematic clusters (Smart Grid Security, Power Load Forecasting, and Advanced Energy Management) using the VOSviewer-derived cluster assignments. The top 30 most-cited publications within each cluster were identified as the primary candidate pool. In the second stage, the research team independently reviewed the titles, abstracts, and full texts of candidate papers, applying the same inclusion criteria used for the broader bibliometric analysis. Papers were included in the narrative review if they: (i) explicitly addressed ML applications within the cluster theme, (ii) contributed a methodologically distinct approach or finding not already represented, and (iii) were published in peer-reviewed venues. This process yielded 12–18 anchor papers per cluster, supplemented by additional studies referenced in the highest-cited works. The selection prioritizes citation impact and methodological diversity rather than exhaustive coverage, consistent with the aim of synthesizing key developments rather than cataloging all contributions. The review of the scientific literature on the scientific growth and technological development of the MLSG research landscape was carried out based on the technique described in the literature [48,83]. Therefore, the studies on the three identified themes or hotspots in MLSG were critically reviewed.
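The two-stage selection described above can be summarized in a short sketch. The `Paper` record, its field names, and the sample entries are hypothetical; the three Boolean fields stand in for the reviewers’ manual judgments on inclusion criteria (i) to (iii), which in the study were made by reading titles, abstracts, and full texts rather than computed.

```python
from dataclasses import dataclass

@dataclass
class Paper:
    title: str
    cluster: str
    citations: int
    addresses_ml_in_theme: bool      # criterion (i), judged by reviewers
    methodologically_distinct: bool  # criterion (ii), judged by reviewers
    peer_reviewed: bool              # criterion (iii)

def select_anchor_papers(papers, cluster, pool_size=30):
    """Stage 1: citation-ranked pool per cluster. Stage 2: inclusion criteria."""
    in_cluster = [p for p in papers if p.cluster == cluster]
    pool = sorted(in_cluster, key=lambda p: p.citations, reverse=True)[:pool_size]
    return [
        p for p in pool
        if p.addresses_ml_in_theme and p.methodologically_distinct and p.peer_reviewed
    ]

# Hypothetical records; pool_size is reduced to 2 to keep the example small.
papers = [
    Paper("Hypothetical paper A", "Smart Grid Security", 300, True, True, True),
    Paper("Hypothetical paper B", "Smart Grid Security", 250, True, False, True),
    Paper("Hypothetical paper C", "Smart Grid Security", 40, True, True, True),
    Paper("Hypothetical paper D", "Power Load Forecasting", 500, True, True, True),
]
anchors = select_anchor_papers(papers, "Smart Grid Security", pool_size=2)
print([p.title for p in anchors])
```

Note that a paper outside the citation-ranked pool is never considered in stage two, which is why the procedure favors citation impact over exhaustive coverage.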

(i): Smart Grid Security (SGS)

The term “smart grid security” or SGS refers to the policies, procedures, and tools used to defend and safeguard the smart grid infrastructure against potential physical and cyberattacks [84]. With the help of contemporary information and communication technology, the smart grid effectively manages power generation, transmission, distribution, and consumption. However, incorporating these technologies also brings new risks and weaknesses [85,86]. The significance of SGS increases as the grid becomes increasingly digitally dependent and networked. Addressing potential security vulnerabilities is essential to maintain the smart grid infrastructure’s integrity, stability, and reliability and ensure a secure and sustainable energy future.

Security for the smart grid includes several elements, such as authentication and access control, resilience and dependability, monitoring and incident response, data privacy, physical security, cybersecurity, and standards and compliance [87,88,89]. Due to its significance, smart grid security has been extensively investigated in the literature. Ozay et al. [71] showed that, within their proposed attack detection framework, ML algorithms are more effective in detecting attacks than detection algorithms based on state vector estimation. The study also showed that measurements in the smart grid can be classified as secure or under attack using ML methods, and that the attack detection problem can be modeled using decision- and feature-level fusion. In another study, Ahmed et al. [80] proposed a supervised ML-based method to identify covert cyber-deception attacks in SG communication networks. The performance evaluation demonstrated that feature selection based on genetic algorithms significantly improved the accuracy of covert cyber-deception attack detection compared to conventionally used ML-based techniques [80]. In a separate study, Ahmed et al. [77] showed that a covert data integrity attack on the smart grid communications network might seriously jeopardize its security and dependability. To address this problem, the authors proposed an unsupervised ML-based strategy that employs the isolation forest algorithm to identify covert data integrity attacks in smart grid communication networks. The strategy assumes that attack samples have the smallest average path length across the trees of the generated isolation forest. Tests on the standard IEEE 14-bus, 39-bus, 57-bus, and 118-bus systems indicated that the proposed method greatly improves attack detection efficiency [77].

Wang et al. [31] proposed an ML-based methodology for detecting DoS attacks in smart grids. In the proposed model, features are selected and PCA is used to reduce the number of dimensions. On the KDD99 dataset, SVM performed exceptionally well and outperformed Decision Tree and Naive Bayesian Network [31]. Panthi [90] adopted ML techniques to detect anomalies in smart grids and evaluated several state-of-the-art ML approaches. The findings showed that ML techniques can identify and distinguish between natural and artificial disruptions in power systems and can reliably identify cyberattacks, including those that use deceptive methods to obscure their tracks. Guihai and Sikdar [91] examined false data injection (FDI) attack detection under adversarial machine learning (AML) for smart grid demand response (DR). The authors showed that AML attacks can target deep learning-based FDI attack detectors in distributed DR settings. They presented a new black-box FDI attack methodology that generates power demands in distributed DR scenarios to deceive deep learning-based FDI attack detection. The suggested AML framework surpasses existing AML methods in the literature and can drastically reduce the accuracy of FDI detection models.

More recently, Aziz et al. [92] explored the use of effective and unique machine learning models to protect a smart grid by detecting cyber-malware attacks. Six alternative boosting and feature selection (FS) strategies were used to analyze six supervised learning (SVM-FS) hybrid techniques for detecting FDI attacks. The results of the simulated exercise indicated that supervised learning and hybrid techniques enhance the performance of the classification algorithms used to identify FDI attacks. Other studies have also highlighted the importance of ML methods in protecting SGs. Themes including cyber-physical attack development, false data injection, attack detection, and anomaly detection have been critically examined over the years. Furthermore, the effectiveness of numerous algorithms, such as SVM, Decision Tree, Naive Bayesian Network, and convolutional approaches, has been explored in detail. The studies suggest that performance can be enhanced by employing feature selection and PCA to reduce the dimensionality of the data. The emphasis has also been on promptly identifying cyberthreats and preserving the integrity of SG communication networks, and on the value of unsupervised learning with isolation forests for detecting covert data integrity attacks. Overall, the objective is to utilize effective, state-of-the-art ML-based algorithms to protect smart grids from cyberattacks.
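The path-length idea behind isolation forests, as used for covert data integrity attacks in [77], can be illustrated with a minimal sketch. This is not the implementation from [77]: the data are hypothetical two-feature measurements, each tree is built on the full toy dataset rather than a random subsample, and the path-length normalization of the standard algorithm is omitted. It only shows why a sample that is easy to separate ends up with the shortest average path.

```python
import random

def build_tree(points, rng, depth=0, max_depth=10):
    """Isolate points with random axis-aligned splits; leaves record their size."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # identical values cannot be split further
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }

def path_length(tree, point, depth=0):
    """Number of splits a point passes through before reaching a leaf."""
    if "size" in tree:
        return depth
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)

rng = random.Random(0)
# Hypothetical data: 200 normal readings near (1.0, 1.0) plus one injected outlier.
data = [(rng.gauss(1.0, 0.05), rng.gauss(1.0, 0.05)) for _ in range(200)]
data.append((3.0, -1.0))
outlier_index = len(data) - 1

trees = [build_tree(data, rng) for _ in range(100)]
avg_path = [sum(path_length(t, p) for t in trees) / len(trees) for p in data]

most_anomalous = min(range(len(data)), key=avg_path.__getitem__)
print("Most anomalous index:", most_anomalous)
print("Its average path length:", round(avg_path[most_anomalous], 2))
```

No attack labels are needed at any point, which is the practical advantage of the unsupervised approach; the cost is that an attack crafted to resemble normal readings would not be isolated quickly.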

Critically, the SGS literature reveals important methodological trade-offs that merit explicit discussion. Supervised ML approaches, such as SVM and random forests, consistently achieve high detection accuracy on benchmark datasets (e.g., IEEE 14-bus, KDD99) but are vulnerable to adversarial perturbations, as demonstrated by Guihai and Sikdar [91], who showed that deep learning-based FDI detectors can be compromised by carefully crafted adversarial inputs. More recent approaches employing adversarial machine learning (relevance score: 1.54; Table 7) attempt to address this limitation by incorporating adversarial training, though at the cost of increased computational overhead. Meanwhile, unsupervised approaches using isolation forests [77] offer the practical advantage of requiring no labeled attack data, but their detection sensitivity may be lower for novel, previously unseen attack patterns. A further emerging tension exists between centralized and federated ML architectures: while centralized models benefit from richer training datasets, federated learning approaches (relevance score: 1.84; Table 7) address critical data privacy concerns in smart grid deployments, particularly in contexts where utility companies are unwilling to share raw consumer data. The application of graph neural networks (GNNs) (relevance score: 1.72; Table 7) to grid topology-aware intrusion detection represents a promising frontier but remains under-studied relative to its potential, with fewer than 100 publications identified in the sensitivity analysis.

(ii): Power Load Forecasting (PLF)

The term “power load forecasting” or PLF refers to estimating future electricity demand or consumption for a certain area or power system over a specified time frame. The objective of PLF is to effectively plan and manage resources, ensure grid stability, and prevent shortages or overproduction of energy. Due to its critical importance, energy suppliers, utility companies, and grid operators strongly depend on PLF for smooth operations. Typically, PLF is performed using various methods, including statistics and time-series analysis. Recently, machine/deep learning algorithms and artificial intelligence models have been proposed to carry out PLF. A major advantage of these approaches is that they can consider past load data, weather patterns, seasonal trends, economics, and other pertinent variables. With accurate PLF, utilities can effectively execute demand response programs, manage maintenance schedules, and optimize electricity generation and distribution. Various studies have examined the use of ML to execute PLF. Ahmad and Chen [81] examined the potential of three different ML models for predicting district-level long-term and medium-term energy demand in SGs. The study showed that district-level medium- and long-term energy demand can be accurately and precisely forecast using ML. The ML-based models examined in the study were artificial neural networks with nonlinear autoregressive exogenous multivariable inputs, multivariate linear regression, and adaptive boosting models. Furthermore, the accuracy of the models was enhanced using various data tests together with feature extraction, data modification, and outlier detection. The study showed that the models offer suitable forecasting intervals and support the consolidation of district-level variances and spatiotemporal energy usage inconsistencies in a smart grid setting. Ungureanu et al. [93] examined the use of ML for industrial load forecasting in an SG.
The findings showed that although forecasting the energy behavior in the industry is challenging, using ML can assist particularly with forecasting and optimizing the loads. In addition, the authors showed that ML could help to reduce balancing costs and foresee network issues to improve load forecasting for industrial consumers. The forecasts of detailed energy behavior can enhance the integration of industrial users into SGs. For example, data on high-frequency recording intervals and real-time processing can be used to justify investments in the SG [93]. Syed et al. [94] evaluated the performance of distributed ML for load forecasting in a smart grid. The study employed big data platforms such as Apache Spark and Apache Hadoop for PLF. The findings showed that Spark produced excellent accuracy while parallelizing the load forecasting process [94]. Bahaghighat et al. [95] employed ML and computer vision to remotely estimate the angular velocity of wind turbines in an SG. The authors demonstrated that the angular velocity of wind turbines in SGs can be reliably predicted using ML techniques and vision sensor networks with 95.4% accuracy. Furthermore, the study showed that computer vision algorithms (such as FAST, SIFT, SURF, BF, FLANN, AE, and SVM) can be used to precisely pinpoint the hub and track the existence of the blade in successive frames of a video stream. Another earlier study, Bahaghighat et al. [96], showed that convolutional neural networks and video mining could be utilized to remotely evaluate wind turbines’ angular velocity. In Tiwari et al. [25], the authors opined that using ML-based models for predicting energy use is a clever step toward creating a smart city. In addition, the study observed that the support vector machine (SVM) method produced the most accurate results when used with the dataset for Smart Grid Stability. 
Overall, the authors reported that transforming traditional grids into smart grids using sensors and ML algorithms can assist in smart city creation. Cebekhulu et al. [97] also demonstrated the potential of ML algorithms for predicting energy consumption by evaluating the performance of ML algorithms for SG energy demand–supply prediction. Taken together, the studies on PLF in the literature have focused on using ML in SGs to classify energy imbalances, identify energy consumers, estimate wind turbine velocities, and forecast load. ML-based algorithms have also been used to maximize energy efficiency and enable smart city applications. The topics typically range from energy demand–supply forecasting to short-term load forecasting. The ultimate objective is to create effective models and algorithms that promote smart grid technology and energy optimization.
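
As a minimal illustration of the forecasting task summarized above, the following sketch builds a seasonal-naive baseline (the load at a given hour tomorrow is assumed equal to the load at the same hour today) and scores it with the mean absolute percentage error (MAPE). The synthetic hourly series, the function names, and the 24 h period are illustrative assumptions and are not taken from any of the reviewed studies; a baseline of this kind is simply a common reference point against which ML forecasters can be compared.

```python
import math

def synthetic_load(hours, base=100.0, swing=20.0, trend=0.0):
    """Hypothetical hourly load (MW): daily sinusoid plus an optional linear trend."""
    return [base + swing * math.sin(2 * math.pi * (h % 24) / 24) + trend * h
            for h in range(hours)]

def seasonal_naive(history, horizon, period=24):
    """Forecast each future step with the value observed one period earlier."""
    extended = list(history)
    for _ in range(horizon):
        extended.append(extended[-period])
    return extended[len(history):]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Eight days of assumed data: seven days of history, one day held out.
series = synthetic_load(24 * 8, trend=0.05)
train, test = series[:-24], series[-24:]
forecast = seasonal_naive(train, horizon=24)
error = mape(test, forecast)
print(f"Seasonal-naive MAPE: {error:.2f}%")
```

Because the assumed series drifts upward by a fixed amount each hour, the baseline under-forecasts every hour by the same margin; a learned model is useful to the extent that it reduces this kind of systematic error.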

Across the PLF literature, a persistent tension exists between model accuracy and interpretability. Deep learning architectures, particularly LSTM networks [75] and transformer models (relevance score: 1.68; Table 7), consistently outperform classical ML methods on benchmark load forecasting tasks but operate as black boxes, limiting their practical adoption in regulated utility environments where decision auditability is required. Explainable AI (XAI) techniques (relevance score: 1.64; Table 7) have begun to bridge this gap, though their application in PLF remains nascent. Federated learning approaches to load forecasting (relevance score: 1.84; Table 7) offer an attractive solution to the data-sharing barrier among competing utilities, but existing results suggest a measurable accuracy penalty relative to centralized models trained on pooled data. Physics-informed neural networks (PINNs) (relevance score: 1.61; Table 7) represent an emerging paradigm that constrains model outputs to comply with known physical laws of power systems, potentially improving generalization to out-of-distribution load conditions, an important practical consideration given the increasing penetration of electric vehicles and distributed energy resources. These trade-offs suggest that future PLF research should move beyond benchmark accuracy comparisons toward multi-criteria evaluation frameworks that explicitly account for interpretability, communication efficiency, and physical plausibility.
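
The aggregation step behind federated load forecasting can be sketched in a few lines. The example below assumes the widely used federated averaging scheme, in which a coordinator combines locally trained parameter vectors weighted by each participant's sample count; the reviewed literature is not stated to use this exact scheme, and the utilities, coefficients, and sample counts are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors (aggregation step only)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# Hypothetical linear-model coefficients fitted locally by three utilities.
utility_models = [[0.80, 12.0], [0.90, 10.0], [0.70, 15.0]]
utility_samples = [1000, 3000, 1000]

global_model = federated_average(utility_models, utility_samples)
print(global_model)
```

Only the parameter vectors and sample counts leave each utility in this sketch; the raw load data never do, which is the property that makes the approach attractive when data sharing is restricted.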

(iii): Advanced Energy Management

The concept of “Advanced Energy Management,” or AEM, involves using cutting-edge technologies to maximize the efficient and sustainable production, delivery, and use of energy. AEM integrates advanced analytics, energy storage systems, renewable energy sources, and smart grid technology. Selected components of AEM include demand response initiatives, the incorporation of renewable energy sources, energy storage options, energy analytics, and grid optimization. The AEM strategy aims to promote a cleaner, greener energy future through reduced reliance on fossil fuels, fewer greenhouse gas emissions, grid stability, and increased overall energy efficiency. Due to its importance to global sustainability, researchers have extensively examined AEM through empirical and numerical investigations reported in the literature. Li et al. [98] examined the potential of ML in predicting the comfort level of users in smart grid environments using three widely used supervised learning algorithms. The findings revealed that ML algorithms can forecast consumer comfort levels for novel gadget usage patterns, and that the prediction accuracy of the algorithms was influenced by the number of training samples [98]. The study by Azad et al. [99] highlighted the potential of ML in transforming smart grids. The study highlighted that ML can help SGs intelligently adapt to unexpected changes, for instance changes in consumer demand, power outages, unexpected decreases/increases, intermittencies in the output of renewable energy sources, or catastrophic events. In addition, reinforcement learning can help with energy dispatch decisions and trigger demand management signals, which could balance the supply and demand for electricity. Other ML applications include data authentication and identifying and preventing aberrant behavior, intrusion, cyberattacks, and criminal actions. Babar et al. 
[78] proposed an ML-based, secure, and robust engine for demand-side management (DSM) of an Internet of Things (IoT)-enabled smart grid. Additionally, the study showed that an ML classifier could predict dishonest entities in a resilient model, which could help manage intrusions into the smart grid. Hence, the authors showed that advanced energy management and interface-regulating agents can ensure the best possible energy utilization. Ahmed et al. [79] proposed an ML-based energy management strategy for renewable energy districts and smart grids. The findings showed that an effective energy management model (EMM) that incorporates renewable energy sources with smart grids can be created using ML and Gaussian process regression (GPR). Energy consumers and the grid both gain from the proposed adaptive service level agreement (SLA) between these two parties. To demonstrate the proposed model’s validity, its outcomes were compared with those of traditional optimization-based EMMs using the genetic algorithm (GA) and particle swarm optimization (PSO) [79]. Min et al. [100] proposed an innovative technique for enhancing the observability of automated SGs based on stochastic ML. Simulations and numerical results on a real system confirmed that the suggested method ensured the distribution network’s visibility before and during reconfiguration in the planning time frame. Krč et al. [101] employed ML-based node characterization to assess an SG’s flexibility in demand responses. The study findings showed that the network flexibility potential could be measured using ML-based node characterization [102]. In addition, artificial neural networks could categorize historical demand data from network substations. 
Other studies have demonstrated that ML techniques are useful for uncertainty quantification in SG applications, and ML algorithms for stability prediction, signal processing, condition monitoring, and observability enhancement have also been extensively reported.

Although this study focuses on smart grid applications, it is important to contextualize MLSG research within the wider electrical systems domain to appreciate the breadth of ML’s impact. Machine learning has been extensively applied across power system components beyond smart grids, including high-voltage transmission infrastructure, distribution networks, power electronics, and rotating electrical machinery.

In the area of power system fault detection and protection, ML classifiers, particularly random forests, support vector machines, and deep learning architectures, have demonstrated high accuracy in detecting and classifying faults in transmission lines, underground cables, and power transformers [103]. These models analyze current, voltage, and impedance waveforms in real-time, enabling faster protective relay coordination and reducing fault clearance times compared to conventional threshold-based schemes [104]. Transformer condition monitoring represents another critical ML application in electrical systems. Dissolved gas analysis (DGA) combined with ML classifiers has enabled early identification of incipient faults including partial discharge, arcing, and overheating in power transformers, which are among the most critical and expensive assets in electrical networks [105]. ML-enhanced DGA overcomes limitations of traditional IEC ratio methods by learning complex, nonlinear relationships between dissolved gas concentrations and fault types.

In power quality monitoring, ML models have been applied to automatically detect, classify, and localize power quality disturbances, including voltage sags, swells, interruptions, harmonics, and flicker phenomena that affect both grid-connected and off-grid electrical installations [106]. These capabilities are particularly relevant in the context of increasing penetration of nonlinear loads and distributed generation, which intensify power quality challenges across distribution networks. The application of ML to electric motor drives and industrial systems has focused on induction motor fault diagnosis, bearing defect detection, and predictive maintenance scheduling [107]. Vibration signals, stator current spectra, and thermal imaging data have been combined with ML algorithms including convolutional neural networks and long short-term memory networks to identify mechanical and electrical faults at early stages, supporting condition-based maintenance strategies in industrial facilities.

The bibliometric findings not only reveal the evolution of research trends but also provide important insights into the technological trajectory of smart grid systems. The dominance of research clusters such as Smart Grid Security, Load Forecasting, and Energy Management reflects the critical technological priorities required for next-generation grid infrastructures. The strong emphasis on machine learning-based cybersecurity, particularly in areas such as false data injection and intrusion detection, highlights the increasing vulnerability of smart grids to cyberthreats. This trend suggests that future smart grid architectures must integrate real-time, intelligent security frameworks capable of adaptive threat detection. The growing use of deep learning models further implies a shift toward automated, data-driven security systems, although challenges related to interpretability and deployment in real-time environments remain.

Similarly, the prominence of load forecasting research, particularly using deep learning models such as LSTM and hybrid architectures, indicates a technological transition toward predictive and proactive grid management. Accurate forecasting enables better demand-response strategies, renewable energy integration, and operational efficiency. However, the reliance on large datasets and computational resources suggests that future systems must incorporate edge computing and scalable data infrastructures to support real-time forecasting capabilities. In the domain of advanced energy management, the increasing adoption of reinforcement learning reflects a shift toward autonomous and self-optimizing smart grid systems. These approaches enable dynamic decision-making in energy distribution, storage, and consumption. Nevertheless, practical deployment remains constrained by issues such as training instability, model reliability, and integration with existing grid infrastructure.

Furthermore, the geographical distribution of publications, with leading contributions from countries such as China and the United States, indicates that technological advancements in smart grids are closely tied to national investments in digital infrastructure and energy innovation. This highlights the need for global collaboration and standardization to ensure interoperability and scalability of smart grid technologies. Overall, the bibliometric analysis suggests that the future of smart grid development will be characterized by the convergence of artificial intelligence, big data analytics, and distributed computing, enabling more resilient, efficient, and intelligent energy systems. However, addressing challenges related to data quality, model interpretability, cybersecurity risks, and real-time implementation remains essential for translating these research advancements into practical, large-scale deployments.

The AEM literature similarly exhibits important trade-offs between solution optimality and real-time feasibility. Reinforcement learning, particularly deep reinforcement learning (relevance score: 1.76; Table 7), has demonstrated superior long-term reward optimization for energy dispatch and demand response tasks [73], but convergence times and sample complexity remain practical barriers for real-time grid management. Model-based approaches incorporating digital twin frameworks (relevance score: 1.58; Table 7) offer a promising avenue for accelerating RL training in simulated environments before deployment, though the fidelity of digital twin representations to real grid dynamics is an open challenge. A key unresolved tension in the AEM literature concerns the centralized versus decentralized architecture debate: centralized optimization achieves global optimality but is computationally intractable at scale, while multi-agent reinforcement learning and federated approaches enable scalability at the cost of sub-optimal coordination. Explainability also emerges as a critical gap: while recent XAI methods have been applied to fault detection, their use in AEM decision-support systems, particularly in regulatory or consumer-facing applications, remains limited. Future research should prioritize hybrid approaches that combine the optimality of model-based methods with the adaptability of data-driven techniques.

3.10. Critical Thematic Synthesis of the MLSG Research Landscape

The keyword co-occurrence analysis presented in Section 3.7 identified three structurally coherent thematic clusters within the MLSG corpus, each characterized by a distinct set of high-frequency keywords, dominant citation networks, and concentrated authorship activity. The present section synthesizes the most significant literature within each cluster, explicitly anchoring the review to the bibliometric structure of the field rather than providing a general narrative account. For each cluster, the dominant papers are identified by citation count within the 4156-document corpus, the core keywords defining the cluster boundary are noted, and cross-cluster relationships are highlighted where they reflect genuine intellectual interdependence in the literature.

(i): Cluster 1—Smart Grid Security (SGS)

Bibliometric cluster profile: Cluster 1 is the largest and most densely connected thematic cluster in the MLSG keyword co-occurrence map, anchored by the keywords smart grid security, false data injection, cyberattack, intrusion detection, anomaly detection, support vector machine, and deep learning. The cluster contains 29 keywords and accounts for the highest total link strength (TLS) among the three clusters, reflecting the dense co-citation relationships between security-focused studies. Within the 4156-document corpus, the most highly cited publications in this cluster include Ozay et al. [71] with 596 citations, Ahmed et al. [77] with 264 citations, Zhang et al. [38] with 179 citations, and Ahmed et al. [80] with 95 citations. These four papers collectively account for a substantial proportion of the cluster’s citation mass and serve as the intellectual anchors for the security sub-domain.
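
The two quantities that define a cluster profile, pairwise keyword co-occurrence and total link strength, can be computed directly from per-document keyword lists. The sketch below assumes the definition commonly used by science mapping tools, in which a keyword's TLS is the sum of the weights of all links incident to it; the three keyword lists are hypothetical and do not reproduce the study's corpus.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """Count how many documents each unordered keyword pair appears in together."""
    links = Counter()
    for keywords in docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            links[(a, b)] += 1
    return links

def total_link_strength(links):
    """Sum the weights of all links incident to each keyword."""
    tls = Counter()
    for (a, b), weight in links.items():
        tls[a] += weight
        tls[b] += weight
    return tls

# Hypothetical author keywords for three documents.
docs = [
    ["smart grid security", "false data injection", "deep learning"],
    ["false data injection", "support vector machine"],
    ["smart grid security", "false data injection", "intrusion detection"],
]

links = cooccurrence(docs)
tls = total_link_strength(links)
print(tls.most_common(2))
```

In this toy corpus, false data injection has the highest TLS because it co-occurs with every other keyword, which is the same structural property that marks an anchor keyword in a real co-occurrence map.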

Synthesis of cluster themes: The dominant research question across Cluster 1 concerns the detection and classification of malicious intrusions into smart grid communication and measurement infrastructure, particularly false data injection attacks (FDIAs) and denial-of-service (DoS) attacks. The most-cited work in this cluster, Ozay et al. [71], is the bibliometrically defining study, establishing that ML algorithms, particularly SVM and feature-level fusion approaches, outperform conventional state vector estimation techniques in detecting coordinated attacks on smart grid measurement systems. This finding has been replicated and extended across the cluster, confirming SVM’s dominance as the baseline detection model in the pre-deep-learning era of the literature. The cluster’s citation network shows a clear temporal transition: studies published before 2018 are predominantly SVM and tree-based, while post-2018 publications shift toward deep learning architectures, particularly autoencoders and CNN-based anomaly detection, consistent with the overlay visualization of keyword emergence shown in Figure 11b.

Ahmed et al. [77], the second most-cited paper in this cluster, advanced the field by proposing unsupervised ML using isolation forests for detecting covert data integrity attacks, moving beyond the supervised paradigm and addressing the critical challenge of operating without labeled attack data, a practical constraint in real-world grid deployments. This methodological contribution is reflected in the bibliometric co-occurrence map, where the keyword isolation forest forms a secondary bridge node linking Cluster 1 to Cluster 3 (Advanced Energy Management), suggesting that anomaly detection techniques developed in the security domain have been transferred to energy management applications. Ahmed et al. [80] further demonstrated that genetic algorithm-based feature selection substantially improves detection accuracy over standard ML pipelines, a finding that has been cited across multiple sub-clusters as evidence for the importance of dimensionality reduction in high-dimensional smart grid data. Zhang et al. [38], a review paper with 179 citations, serves as the cluster’s most-cited synthesis work, mapping the landscape of ML-based FDIA detection and establishing the taxonomy of attack types and corresponding ML countermeasures that subsequent empirical studies have used as a reference framework.

The cluster also contains several studies examining adversarial robustness, a theme that represents the most recent growth frontier within SGS research. Guihai and Sikdar [91] demonstrated that deep learning-based FDIA detectors are themselves vulnerable to adversarial ML attacks in distributed demand response settings, a finding that introduces a reflexive vulnerability into the ML security paradigm and explains the emergence of the keyword adversarial machine learning as a high-growth term in the overlay visualization. More recently, Aziz et al. [92] systematically compared six supervised-learning and hybrid feature-selection combinations for FDIA detection, reporting that ensemble and hybrid approaches consistently outperform single-classifier models, a result consistent with the broader bibliometric trend toward ensemble methods observed across all three clusters.

Cross-cluster linkage: Cluster 1 shares boundary keywords with Cluster 2 (neural networks, deep learning) and with Cluster 3 (smart meters, demand response), reflecting the fact that security-focused studies increasingly incorporate forecasting components and that electricity theft detection, a security problem, relies on the same consumption pattern modeling techniques used in load forecasting.

(ii): Cluster 2—Power Load Forecasting (PLF)

Bibliometric cluster profile: Cluster 2 is anchored by the keywords load forecasting, electricity demand, renewable energy, LSTM, neural networks, deep learning, time series, short-term forecasting, and wind power. It contains 23 keywords and exhibits the highest average citation count per node among the three thematic clusters, reflecting the maturity and citation concentration of PLF research. The dominant papers within this cluster in the 4156-document corpus are Hafeez et al. [75] with 840 citations, which is the single most-cited paper in the entire MLSG corpus; Ahmad and Chen [81] with 88 citations; and Syed et al. [94] with moderate citation counts, along with the review by Ahmad et al. [23] with 618 citations, which bridges Clusters 2 and 3. These papers define the intellectual center of gravity for the forecasting sub-domain.

Synthesis of cluster themes: The bibliometric structure of Cluster 2 reveals a clear methodological hierarchy: the most-cited work in the cluster, and in the entire corpus, is Hafeez et al. [75], which proposed an optimized deep learning LSTM model for electric load forecasting using genetic algorithm-based feature selection. Its position as the most-cited paper across all 4156 documents confirms that LSTM-based architectures, particularly when combined with evolutionary feature selection, represent the current methodological consensus for PLF in the MLSG domain, a finding that is directly corroborated by the high co-occurrence frequency of the keywords LSTM and feature selection in Cluster 2 of Figure 11a.

The second most-cited work bridging this cluster, Ahmad et al. [23] with 618 citations, is a review of data-driven probabilistic ML in smart energy systems that serves as the theoretical anchor for the cluster’s shift from deterministic to probabilistic forecasting approaches. Its high citation count reflects the field’s recognition that capturing forecast uncertainty, not merely point estimates, is critical for operational grid management under increasing renewable penetration. This probabilistic turn is also reflected in the co-occurrence map, where keywords such as uncertainty quantification, probabilistic forecasting, and Monte Carlo appear as mid-density nodes in the outer ring of Cluster 2.
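
The difference between scoring a point forecast and scoring a probabilistic one can be made concrete with the quantile (pinball) loss, a standard metric for quantile forecasts. The sketch below is illustrative only: the metric is not stated to be the one used in [23], and the observed loads and quantile forecasts are hypothetical values in MW.

```python
def pinball_loss(actual, quantile_forecast, q):
    """Average pinball loss for forecasts of the q-th quantile (0 < q < 1)."""
    total = 0.0
    for y, f in zip(actual, quantile_forecast):
        diff = y - f
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)

observed = [100.0, 110.0, 90.0, 105.0]   # hypothetical loads (MW)
p90 = [112.0, 112.0, 112.0, 112.0]        # assumed 90th-percentile forecast
p50 = [101.0, 101.0, 101.0, 101.0]        # assumed median forecast

print(pinball_loss(observed, p90, 0.9))
print(pinball_loss(observed, p50, 0.5))
```

At q = 0.9 the loss penalizes an under-forecast nine times more heavily than an over-forecast of the same size, which matches the operational asymmetry between a supply shortfall and excess reserve.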

Ahmad and Chen [81] contributed a systematic comparison of three ML model families (ANN-NARX, multivariate linear regression, and adaptive boosting) for district-level medium- and long-term energy demand forecasting in smart grids, demonstrating that ensemble methods incorporating feature extraction and outlier detection produce superior spatiotemporal forecasting accuracy. This paper is representative of a broader sub-theme within Cluster 2 that focuses on the scalability of ML forecasting from building-level to district-level granularity, a research direction reflected in the keyword co-occurrence of smart city, energy consumption, and smart meter within the cluster boundary. Ungureanu et al. [93] and Cebekhulu et al. [97] both confirmed that ML-based industrial and system-level load forecasting substantially reduces balancing costs and network stress, particularly in settings with high industrial demand variability, themes that link Cluster 2 to Cluster 3 through the shared keyword demand response.

Syed et al. [94] demonstrated that distributed ML using Apache Spark and Hadoop platforms can parallelize load forecasting at scale without sacrificing accuracy, addressing one of the practical deployment barriers identified in Cluster 2: the computational cost of deep learning models at grid-scale inference. This finding is consistent with the emergence of the keyword edge computing in the high-growth zone of the overlay visualization (Figure 11b), suggesting that scalable, edge-deployable forecasting architectures represent the next methodological frontier for this cluster.

Cross-cluster linkage: Cluster 2 is most strongly linked to Cluster 3 through the shared keyword demand response, reflecting the operational relationship between accurate load forecasts and energy management decisions. The keyword renewable energy also bridges Clusters 2 and 3, confirming that renewable integration is simultaneously a forecasting problem and an energy management optimization challenge in the MLSG literature.

(iii): Cluster 3—Advanced Energy Management (AEM)

Bibliometric cluster profile: Cluster 3 is anchored by the keywords energy management, demand response, reinforcement learning, microgrid, smart meters, optimization, IoT, electricity theft, and energy storage. It contains 31 keywords, the largest cluster vocabulary, and represents the broadest thematic scope among the three identified hotspots, encompassing both consumer-side and grid-side energy optimization applications. The dominant papers within this cluster in the 4156-document corpus are Babar et al. [78] with 159 citations, Ahmed et al. [79] with 124 citations, Ji et al. [73] with a high citation count on real-time DRL-based energy management, Kotsiopoulos et al. [33] with 320 citations, and the review by Ahmad et al. [36] with 461 citations. The latter two are the most-cited review papers in the cluster and serve as the principal reference frameworks for the AEM sub-domain.

Synthesis of cluster themes: The bibliometric structure of Cluster 3 reflects a field in active methodological transition. The most-cited review in the cluster, Ahmad et al. [36] with 461 citations, provides the foundational taxonomy of ML and deep learning applications in smart manufacturing and smart grid systems, establishing the range of tasks from anomaly detection to energy scheduling that fall within the AEM domain. Its high citation count relative to empirical papers suggests that the field is still consolidating its conceptual framework, with review papers serving a disproportionate scaffolding function, a pattern consistent with a rapidly expanding research area that has not yet fully standardized its methodological vocabulary. The second most-cited work anchoring this cluster, Kotsiopoulos et al. [33] with 320 citations, is a review examining ML and deep learning in smart manufacturing in the context of the smart grid paradigm, providing a cross-domain perspective that links industrial energy management to grid optimization. Its high citation count within Cluster 3 reflects the keyword co-occurrence of manufacturing, industry 4.0, and IoT within the cluster boundary, confirming that AEM research is increasingly drawing on industrial data science methodologies and that the smart grid is conceptualized as part of a broader industrial digitization ecosystem rather than as an isolated power system component.

Among empirical papers, Babar et al. [78] with 159 citations is the most-cited study in Cluster 3, proposing a secure ML-based demand-side management engine for IoT-enabled smart grids that simultaneously addresses energy optimization and intrusion prevention—a dual contribution that explains the paper’s bridging position between Clusters 1 and 3 in the co-occurrence map. The co-occurrence of IoT and demand response as cluster-defining keywords confirms that consumer-side energy management in connected environments is the primary application context within this cluster. Ahmed et al. [79] advanced this direction by proposing an ML-based energy management model incorporating Gaussian process regression and adaptive service level agreements between energy consumers and smart grid operators, demonstrating that ML-driven AEM can achieve performance comparable to traditional optimization approaches such as genetic algorithms and PSO while offering substantially greater adaptability.

Ji et al. [73], representing the reinforcement learning sub-theme within Cluster 3, demonstrated real-time DRL-based energy management in a microgrid environment using Deep Q-Networks, establishing RL as the most promising control paradigm for autonomous grid operation under uncertainty. This finding is directly corroborated by the high co-occurrence frequency of reinforcement learning and deep reinforcement learning as growth keywords in the overlay visualization, where they occupy the high-density recent emergence zone of Figure 11b, confirming that RL-based AEM represents both the current frontier and the fastest-growing sub-theme within the cluster. Li et al. [98] and Azad et al. [99] further contextualized ML within Cluster 3, demonstrating its capacity to predict consumer comfort and to enable adaptive grid responses to demand volatility and renewable intermittency, respectively, both themes reflected in the cluster’s keyword co-occurrence of consumer behavior, renewable energy, and energy storage.

Cross-cluster linkage: Cluster 3’s breadth and its high number of bridge keywords to both Cluster 1 (smart meters, anomaly detection) and Cluster 2 (demand response, renewable energy) confirm its integrative role in the MLSG landscape. The cluster effectively constitutes the applied synthesis domain of the field where the security intelligence developed in Cluster 1 and the forecasting accuracy achieved in Cluster 2 are operationalized into real-time grid control, consumer management, and energy optimization systems.

3.11. Overview of ML Models, Application Areas, Challenges, and Current Trends in MLSG Research

(i): Machine learning models used in MLSG research

The bibliometric analysis reveals that a diverse range of ML models has been deployed across the MLSG research landscape, each offering distinct strengths suited to specific smart grid tasks. Based on the reviewed literature, the principal ML models applied in smart grid contexts can be broadly categorized into supervised learning, deep learning, reinforcement learning, and unsupervised approaches [6,108].

Supervised learning models continue to dominate the MLSG corpus. Support Vector Machines (SVMs) have been widely applied to intrusion detection, electricity theft classification, and fault identification tasks due to their strong generalization capability in high-dimensional feature spaces [39]. Decision Trees and their ensemble variants, particularly random forests and Gradient Boosted Trees including XGBoost, have been extensively used for load classification, demand forecasting, and anomaly detection owing to their interpretability and robustness to noisy data [65]. Logistic Regression has served as a reliable baseline classifier in binary attack detection and demand response prediction tasks across numerous benchmarking studies [109].

Deep learning models have gained significant traction in the MLSG landscape over the past decade. Long short-term memory (LSTM) networks and Gated Recurrent Units (GRUs) have demonstrated strong performance in sequential time-series tasks, particularly short-term and medium-term load forecasting, where capturing temporal dependencies is critical [69,110]. Convolutional neural networks (CNNs) have been applied to power quality disturbance classification and smart meter data analysis, effectively extracting spatial and local features from signal data [35,111]. Hybrid CNN-LSTM architectures have emerged as a leading approach for combined spatial–temporal modeling in energy consumption prediction and grid stability monitoring [34].

Reinforcement learning (RL), and particularly deep reinforcement learning (DRL), has become a prominent paradigm for real-time energy management and demand response optimization in microgrids and grid-connected storage systems. DRL agents, including Deep Q-Networks (DQN) and Proximal Policy Optimization (PPO), learn optimal control strategies through interaction with simulated grid environments, enabling adaptive decision-making under conditions of uncertainty [73]. Unsupervised and semi-supervised models, including autoencoders, isolation forests, and clustering algorithms such as K-Means and DBSCAN, have been applied to anomaly detection, false data injection identification, and consumer behavior segmentation, particularly in settings where labeled training data is scarce or costly to obtain [77]. Together, these model families reflect the breadth and methodological maturity of ML deployment across the MLSG research landscape.
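
The idea of an agent learning a control strategy through interaction with a simulated environment can be illustrated with a deliberately small example. The sketch below substitutes tabular Q-learning for the DQN and PPO agents named above, and the environment (a one-unit battery facing a price that alternates between low and high), the prices, and all hyperparameters are hypothetical assumptions chosen for readability rather than realism.

```python
import random

PRICES = {0: 1.0, 1: 3.0}                 # assumed low / high price per unit of energy
ACTIONS = ("idle", "charge", "discharge")

def step(price_level, soc, action):
    """One transition of a toy one-unit battery; infeasible actions behave as idle."""
    reward = 0.0
    if action == "charge" and soc == 0:
        reward, soc = -PRICES[price_level], 1      # pay to fill the battery
    elif action == "discharge" and soc == 1:
        reward, soc = PRICES[price_level], 0       # earn by emptying it
    return (1 - price_level, soc), reward          # price alternates every step

def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over states (price_level, state_of_charge)."""
    rng = random.Random(seed)
    q = {(p, s): [0.0] * len(ACTIONS) for p in (0, 1) for s in (0, 1)}
    state = (0, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))                            # explore
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[state][i])    # exploit
        next_state, reward = step(state[0], state[1], ACTIONS[a])
        target = reward + gamma * max(q[next_state])
        q[state][a] += alpha * (target - q[state][a])
        state = next_state
    return q

q_table = train()
policy = {s: ACTIONS[max(range(len(ACTIONS)), key=lambda i: v[i])]
          for s, v in q_table.items()}
print(policy)
```

In this toy setting the agent learns to charge an empty battery when the price is low and to discharge a full one when the price is high. Deep RL methods replace the lookup table with a neural network so that the same principle can be applied to state spaces far too large to enumerate.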

Based on the systematic review of the MLSG literature, ten principal application areas have been identified, reflecting the full operational scope of smart grid systems. They are: (1) intrusion and cyberattack detection, where SVM and deep autoencoders identify false data injection and DoS attacks in real time; (2) electricity theft detection using supervised classifiers on smart meter consumption data; (3) short- and long-term load forecasting using LSTM and ensemble regressors; (4) renewable energy output forecasting to manage variability from solar and wind sources; (5) demand response and energy management via reinforcement learning agents; (6) fault detection and grid stability monitoring from voltage and frequency measurements; (7) power quality disturbance classification using CNN and wavelet-based models; (8) smart meter analytics and consumer behavior profiling through clustering algorithms; (9) predictive maintenance of transformers and other grid assets using sensor data; and (10) electric vehicle charging optimization and vehicle-to-grid scheduling as EV penetration continues to rise globally.

(ii): Challenges of applying ML in smart grids

Despite the significant progress documented in the MLSG literature, the application of ML in smart grid environments faces a number of persistent and interrelated challenges that continue to temper the translation of research advances into operational deployment.

Firstly, smart grid ML models depend heavily on large volumes of high-quality, labeled operational data. In practice, real-world grid datasets are frequently incomplete, imbalanced, noisy, or corrupted by measurement errors and sensor faults. Furthermore, smart meter and operational data are subject to stringent privacy regulations, including GDPR and equivalent national frameworks, which restrict data sharing between utilities, researchers, and technology developers, thereby limiting the size and diversity of available training datasets [32].

Secondly, many critical smart grid ML tasks, including fault detection, cyberattack identification, and electricity theft classification, are characterized by severely imbalanced class distributions in which anomalous events represent a small minority of all observations. Standard ML models trained on imbalanced data tend to be biased toward majority classes, producing high overall accuracy but poor recall for the rare events of greatest operational importance. Addressing this challenge typically requires specialized resampling strategies, cost-sensitive learning, or purpose-built anomaly detection frameworks [112].
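The gap between overall accuracy and recall on rare events can be made concrete with a small sketch. The counts below (1000 readings, 10 theft events, and the two hypothetical classifiers) are invented for illustration and do not come from any study in the corpus.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall), treating class 1 as the rare anomalous event."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    return correct / len(y_true), true_pos / positives


# Hypothetical data: 1000 meter readings, of which 10 (1%) are theft events.
y_true = [1] * 10 + [0] * 990

# Classifier A always predicts the majority class ("normal").
always_normal = [0] * 1000

# Classifier B flags 8 of the 10 events but raises 30 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 30 + [0] * 960

acc_a, rec_a = accuracy_and_recall(y_true, always_normal)
acc_b, rec_b = accuracy_and_recall(y_true, detector)
print(f"always-normal: accuracy={acc_a:.3f}, recall={rec_a:.1f}")  # 0.990, 0.0
print(f"detector:      accuracy={acc_b:.3f}, recall={rec_b:.1f}")  # 0.968, 0.8
```

The trivial classifier scores higher on accuracy while detecting no events at all, which is why recall, precision, or cost-weighted metrics are usually preferred for imbalanced smart grid tasks.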

Thirdly, the increasing adoption of complex deep learning architectures, while delivering state-of-the-art predictive performance, introduces significant challenges around model transparency and interpretability. Grid operators, regulators, and utility engineers require meaningful explanations for ML-driven decisions, particularly in safety-critical contexts such as fault protection and demand response. The opacity of black-box models represents a substantial barrier to regulatory approval and operational trust, driving growing interest in explainable AI (XAI) frameworks such as SHAP and LIME for smart grid applications [113].

Lastly, smart grid systems operate at considerable scale, encompassing millions of smart meters, thousands of substations, and complex multi-layered communication networks, and they require ML inference at timescales ranging from milliseconds in protective relaying to seconds in real-time energy management. Many computationally intensive deep learning models cannot currently satisfy these latency and throughput requirements without significant hardware investment or model compression, which limits their applicability in resource-constrained edge computing environments [29].
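One common form of the model compression mentioned above is weight quantization. The following sketch shows symmetric post-training quantization of floating-point weights to 8-bit integers. It is an assumption-laden illustration rather than a method taken from the cited work: the weight values and function names are hypothetical, and a non-zero weight vector is assumed.

```python
def quantize_int8(weights):
    """Symmetric post-training quantization of floats to the int8 range.

    Assumes at least one non-zero weight (otherwise the scale would be zero).
    """
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [int(round(w / scale)) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 codes back to approximate floating-point weights."""
    return [q * scale for q in quantized]


weights = [0.50, -1.27, 0.031, 1.00]      # hypothetical layer weights
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, scale, max_error)
```

Each weight is stored in one byte instead of four or eight, and the reconstruction error is bounded by half the quantization step. Whether that error is acceptable for a given protection or control task would have to be validated case by case.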

(iii): Current trends and emerging directions in MLSG research

Several converging trends are reshaping the MLSG research frontier. Federated learning has emerged as a practical response to privacy constraints, allowing utilities to train shared models on decentralized data without exposing raw consumption records [37]. Transformer-based architectures, originally developed for language tasks, are demonstrating strong performance in load and renewable generation forecasting by capturing long-range temporal dependencies more effectively than LSTM models [102]. Digital twin technology is being paired with ML to generate synthetic training data for rare fault scenarios and cyberattack simulations, addressing the persistent scarcity of labeled real-world data. Graph neural networks are gaining traction for topology-aware grid modeling, capturing the structural dependencies of bus-line networks that standard architectures cannot represent. Finally, the integration of ML with physics-based models, including physics-informed neural networks that embed Kirchhoff’s laws as training constraints, is improving prediction reliability and physical plausibility, while explainable AI tools such as SHAP and LIME are increasingly being embedded into MLSG pipelines as a prerequisite for responsible operational deployment [92]. A comparative analysis of ML approaches in smart grid applications is shown in Table 8.
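The federated learning idea described above, in which model parameters are shared while raw records stay local, can be sketched as a FedAvg-style loop. The example below is a minimal illustration under stated assumptions: the one-parameter linear model, the two hypothetical utilities, their data, and all function names are invented here and do not reproduce any specific system from the literature.

```python
def local_update(w, xs, ys, lr=0.05, steps=5):
    """A few gradient-descent steps on mean squared error for y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


def federated_average(clients, rounds=20, w0=0.0):
    """FedAvg-style training: only model parameters leave each client."""
    w = w0
    total = sum(len(xs) for xs, _ in clients)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [(local_update(w, xs, ys), len(xs)) for xs, ys in clients]
        # The server averages the local models, weighted by local sample count.
        w = sum(w_k * n_k for w_k, n_k in updates) / total
    return w


clients = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),   # hypothetical utility A (private data)
    ([1.0, 2.0], [2.0, 4.0]),             # hypothetical utility B (private data)
]
w_global = federated_average(clients)
print(round(w_global, 6))                 # converges to the shared slope 2.0
```

The server never sees the `xs` or `ys` lists, only the locally trained parameter and the sample count. Practical deployments typically add secure aggregation or differential privacy on top of this basic loop.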

Recent advances published since 2022 have further refined the MLSG methodological frontier across all three thematic clusters. In Smart Grid Security, large-scale intrusion detection systems incorporating multi-head attention mechanisms and contrastive learning have demonstrated improved generalization across diverse attack types, including zero-day FDIA variants for which labeled training data does not yet exist. Ensemble-based anomaly detectors combining isolation forests with gradient boosted classifiers have shown statistically significant improvements over single-model baselines when evaluated on heterogeneous smart meter datasets from multiple grid operators, suggesting that model diversity is a more reliable predictor of out-of-distribution performance than raw model complexity. In Power Load Forecasting, temporal fusion transformers and patch-based time-series foundation models pre-trained on large energy corpora have achieved state-of-the-art accuracy on multiple benchmark datasets including GEFCOM and PecanStreet, outperforming LSTM baselines by margins of 8–14% on RMSE metrics while requiring substantially less task-specific fine-tuning data, a practically significant finding for utilities operating in data-scarce environments. In Advanced Energy Management, multi-agent deep reinforcement learning frameworks incorporating communication protocols between prosumer nodes have demonstrated robust demand-response coordination in simulated distribution networks with high renewable penetration, reducing peak load variance by up to 23% compared to centralized optimization baselines. However, these advances also bring into sharp relief several persistent gaps and challenges that constrain the translation of research results into operational deployment. 
First, the vast majority of studies in all three clusters continue to rely on simulated or semi-synthetic grid environments, with fewer than 12% of empirical papers in the 4156-document corpus reporting results from field trials or live grid deployments; this evaluation gap represents the most structurally significant barrier between MLSG research and practical adoption. Second, model interoperability and standardization remain unaddressed: there is currently no widely adopted benchmark dataset or evaluation protocol shared across MLSG sub-domains, making cross-study comparison methodologically unreliable and impeding reproducibility. Third, the computational overhead of state-of-the-art deep learning and RL architectures frequently exceeds the inference latency budgets of real-time grid control applications, and lightweight model compression techniques such as quantization-aware training and structured pruning have received disproportionately little attention in the MLSG literature relative to their practical importance. Fourth, adversarial robustness testing against adaptive and coordinated multi-point attacks, as opposed to single-point perturbations, remains an underdeveloped area despite its direct relevance to operational grid security. Addressing these gaps through dedicated benchmark development, real-world pilot studies, and methodologically rigorous adversarial evaluation protocols represents the most critical near-term research priority for the MLSG community.
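Forecasting gains such as the RMSE margins quoted above are relative reductions in error against a baseline. The short sketch below shows how such a margin is computed. The load values and both forecasts are invented for illustration and are not drawn from GEFCOM, PecanStreet, or any cited study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, candidate_rmse):
    """Percentage reduction in RMSE achieved by the candidate model."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


actual = [100.0, 120.0, 130.0, 110.0]               # hypothetical hourly load (kW)
baseline_forecast = [110.0, 110.0, 140.0, 100.0]    # e.g., an LSTM baseline
candidate_forecast = [109.0, 111.0, 139.0, 101.0]   # e.g., a transformer model

base = rmse(actual, baseline_forecast)
cand = rmse(actual, candidate_forecast)
print(base, cand, relative_improvement(base, cand))  # 10.0 9.0 10.0
```

Because the figure is a ratio, it is only comparable across studies when the baseline, the dataset, and the forecast horizon match, which is one reason the lack of shared benchmarks noted above matters.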

The bibliometric findings presented in this study, while scoped to the smart grid domain, carry meaningful implications for the broader electrical power engineering field. The three thematic clusters identified in the keyword co-occurrence analysis, namely Smart Grid Security, Power Load Forecasting, and Advanced Energy Management, each correspond to a class of ML methodology whose underlying algorithmic logic is not intrinsically bound to the smart grid context. The discussion of transferability is therefore not a peripheral observation but a direct corollary of the bibliometric structure of the field. As MLSG research matures, its methodological outputs are progressively diffusing into adjacent electrical systems domains, a process already visible in the cross-domain citations of several high-impact papers within the corpus.

The ML classifiers that dominate Cluster 1, including SVM, random forest, and deep autoencoders developed for anomaly and attack detection, are methodologically equivalent to the classifiers applied in power system fault detection and protection. Studies operating outside the smart grid context have demonstrated that these same architectures achieve high accuracy in detecting and classifying faults in transmission lines, underground cables, and power transformers [103], analyzing current, voltage, and impedance waveforms in real time to enable faster protective relay coordination compared to conventional threshold-based schemes [104]. The intellectual transfer between MLSG security research and the broader fault detection literature is therefore methodologically direct, and the citation base of papers bridging both domains, such as Ozay et al. [71] and Aziz et al. [92], confirms that cross-pollination is already occurring within the corpus examined in this study.

The time-series forecasting architectures that define Cluster 2, particularly LSTM and hybrid CNN-LSTM models, are structurally equivalent to those applied in transformer condition monitoring, where dissolved gas analysis (DGA) data is modeled as a sequential signal to detect incipient faults including partial discharge, arcing, and thermal overheating [105]. The MLSG forecasting literature’s emphasis on temporal feature extraction and probabilistic uncertainty quantification provides methodological tools that are directly applicable to prognostic health management of power transformers and other critical grid assets. Similarly, the power quality monitoring domain, which encompasses the detection and classification of voltage sags, swells, harmonics, and flicker, relies on the same CNN and wavelet-based signal classification architectures that appear as high-growth keywords in Cluster 2 of the overlay visualization [106]. This convergence confirms that Power Load Forecasting (PLF) research and power quality analysis are increasingly drawing on the same deep learning architectures, a development that reflects the maturation and broadening applicability of the methods first systematically developed and validated in the MLSG context.

The reinforcement learning and IoT-driven energy management approaches that anchor Cluster 3 have direct analogs in the domain of electric motor drives and industrial automation, where DRL agents are increasingly being applied to predictive maintenance scheduling, rotor fault diagnosis, and efficiency optimization [107]. The shared methodological foundation in this case involves RL agents operating on sensor streams from physical systems with complex and high-dimensional state spaces, meaning that the algorithmic advances documented in Cluster 3 of the MLSG corpus are simultaneously advancing the state of the art in industrial electrical systems, even when the two sets of literature develop in parallel rather than through direct citation exchange.

These observations collectively suggest that the MLSG research landscape, as documented in this bibliometric study, functions not as a self-contained domain but as a methodological incubator whose outputs are progressively transferring across the full spectrum of electrical engineering applications. Future bibliometric studies should examine the citation flows between MLSG publications and adjacent domains, including power system protection, transformer diagnostics, motor control, and power electronics, to quantify the rate and directionality of this methodological transfer. Such an analysis would provide a more complete picture of ML’s cumulative contribution to electrical engineering as a discipline and would complement the within-domain mapping provided by the present study.

4. Conclusions

This study makes several original and substantive contributions to the understanding of how machine learning is being applied within smart grid research. By analyzing 4156 MLSG publications indexed in Scopus between 2009 and 2025 through bibliometric and science mapping techniques, the study advances knowledge of the field in the following specific ways. First, this study contributes a definitive longitudinal mapping of MLSG research growth, demonstrating that publications increased from one in 2009 to 1163 in 2025, an increase of 116,200%. This trajectory confirms that MLSG has transitioned from an emerging niche into a mature and highly productive research domain and provides the first comprehensive quantitative baseline against which future growth can be benchmarked. Second, the study contributes original insight into the global distribution of MLSG research productivity and impact. The findings establish that India (963 publications, 14,473 citations), China (762 publications, 25,130 citations), and the United States (695 publications) are the three most prolific nations, and that King Saud University and Tennessee Technological University are the most productive institutional contributors. This mapping of the field’s intellectual geography is a direct contribution to the global understanding of where MLSG expertise is concentrated and where capacity gaps exist. Third, the keyword co-occurrence and cluster analysis applied systematically to the MLSG corpus at this scale reveals three thematically coherent hotspots: (i) Smart Grid Security, (ii) Power Load Forecasting, and (iii) Advanced Energy Management. This thematic cartography is an original contribution that structures the intellectual landscape of the field and provides researchers with a clear framework for positioning new work within the existing knowledge base.
Fourth, the critical literature review of the three identified hotspots synthesizes the most significant ML methodologies and findings across the corpus. In Smart Grid Security, the review demonstrates that deep learning and ensemble classifiers have achieved state-of-the-art performance in detecting false data injection, DoS attacks, and electricity theft, while highlighting persistent challenges around adversarial robustness and real-world deployment. In Power Load Forecasting, the review shows that LSTM-based and hybrid deep learning models have consistently outperformed classical statistical approaches, particularly for short-term and probabilistic forecasting under renewable energy uncertainty. In Advanced Energy Management, the review confirms that reinforcement learning, particularly deep reinforcement learning, has emerged as the leading paradigm for real-time, adaptive energy optimization in microgrids and demand response systems. Lastly, this study contextualizes MLSG research within the broader landscape of ML applications across electrical systems, including power system fault detection, transformer condition monitoring, power quality analysis, and motor drive diagnostics, establishing that the methodological advances demonstrated in the smart grid domain have substantial transferability to the wider electrical engineering field. Collectively, these contributions provide a robust, evidence-based foundation for researchers entering the MLSG field, for policymakers designing research investment strategies, and for practitioners seeking to identify and adopt the most effective ML tools for smart grid applications. Future work should extend this analysis to include multi-database comparisons (Web of Science, IEEE Xplore), apply citation network analysis to trace the intellectual lineage of key innovations, and examine the specific transferability of MLSG methodologies to broader electrical systems domains through targeted systematic reviews.

5. Limitations

While this study provides a comprehensive bibliometric and thematic synthesis of MLSG research, several limitations must be explicitly acknowledged to guide appropriate interpretation of the findings. First, the study relies exclusively on the Elsevier Scopus database. Although Scopus offers broad coverage of peer-reviewed engineering and computer science literature, publications indexed solely in Web of Science, IEEE Xplore, or domain-specific repositories may not be captured. Future studies should consider multi-database searches to further enhance coverage. Second, the English-language restriction applied during document screening introduces a language bias: significant contributions published in Chinese, Arabic, German, or other languages are systematically excluded. This limitation is particularly relevant given that China is one of the most productive countries in MLSG research, and a portion of its output may appear in domestic Chinese-language venues. Third, the citation-based selection process used to identify anchor papers for the thematic review introduces a bias against recent publications (2022–2025), which are structurally disadvantaged in citation accumulation relative to older works. The supplementary integration of papers from Table 7 (ranked by bibliometric relevance score) partially mitigates this limitation but does not fully resolve it. Fourth, the search was conducted on 3 May 2026, and late-indexed 2025 publications may be absent, as discussed in Section 2. Fifth, as is inherent in bibliometric studies generally, this analysis maps the intellectual structure of a research field as represented in publication metadata; it does not evaluate the practical implementation fidelity, real-world performance, or scalability of the ML methods discussed. The gap between promising laboratory-scale ML results and validated, deployed smart grid applications is itself an important area of future inquiry.

6. Future Research Directions

The bibliometric and thematic findings of this study collectively point to several specific and underexplored research directions. First, the under-representation of federated learning, graph neural networks, and physics-informed neural networks in the current MLSG corpus, despite their high bibliometric relevance scores (Table 7), indicates that these methodologies are at an early adoption stage and warrant dedicated empirical investigation, particularly for privacy-preserving anomaly detection and topology-aware load forecasting. Second, the accuracy–interpretability trade-off identified across all three thematic clusters suggests an urgent need for XAI frameworks tailored specifically to smart grid operational contexts, where regulatory compliance and operator trust are paramount. Third, the dominance of benchmark dataset evaluations (e.g., IEEE bus systems, KDD99) in the SGS literature, combined with limited real-world deployment evidence, highlights the need for large-scale, real-world validation studies conducted in partnership with utility operators. Fourth, the comparative scarcity of multi-country and multi-institutional collaborative publications, particularly in the Global South, suggests that targeted funding mechanisms and research consortia could substantially broaden the geographic diversity and generalizability of MLSG findings. Fifth, longitudinal tracking of the MLSG corpus, replicating the current study at five-year intervals, would enable systematic monitoring of paradigm shifts, emerging clusters, and the citation lifecycle of key methodologies. Finally, the integration of digital twin frameworks with federated ML offers a particularly promising avenue for real-time energy management optimization, combining the simulation fidelity of digital twins with the privacy preservation of federated architectures.

7. Practical Implications

The findings of this study have important implications for researchers, policymakers, and smart grid developers.

For researchers, the identification of dominant research themes such as cybersecurity, load forecasting, and energy management provides a clear roadmap for future investigations. Researchers are encouraged to focus on interdisciplinary approaches, combining machine learning with domain-specific knowledge in energy systems. Additionally, the gaps identified in areas such as model interpretability and benchmarking highlight opportunities for impactful contributions.

For policymakers, the increasing research focus on Smart Grid Security underscores the need for strong regulatory frameworks and cybersecurity standards. Policymakers should support the development of data-sharing policies, funding for AI-driven energy research, and international collaboration initiatives to accelerate smart grid innovation. Furthermore, policies promoting renewable energy integration and digital infrastructure are critical for advancing intelligent energy systems.

Finally, for industry practitioners, the findings highlight the growing importance of AI-driven solutions for predictive maintenance, demand forecasting, and real-time energy management. Developers should prioritize the integration of scalable ML models, real-time analytics systems, and cybersecurity-aware architectures. However, attention must also be given to deployment challenges, including computational constraints, system interoperability, and model reliability in real-world environments.

Funding

The researchers would like to thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2026).

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

Aguero, J.R.; Takayesu, E.; Novosel, D.; Masiello, R. Modernizing the Grid: Challenges and Opportunities for a Sustainable Future. IEEE Power Energy Mag. 2017, 15, 74–83.
Bhat, S.M.; Venkitaraman, A. Strategic integration of predictive maintenance plans to improve operational efficiency of smart grids. In Proceedings of the 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS); IEEE: New York, NY, USA, 2024; pp. 1–5.
Haas, R.; Auer, H.; Resch, G. Heading towards democratic and sustainable electricity systems—The example of Austria. Renew. Energy Environ. Sustain. 2022, 7, 20.
Ohanu, C.P.; Rufai, S.A.; Oluchi, U.C. A comprehensive review of recent developments in smart grid through renewable energy resources integration. Heliyon 2024, 10, e25705.
Elshiekh, M.; Elwakeel, A.; Venuturumilli, S.; Alafnan, H.; Pei, X.; Zhang, M.; Yuan, W. Utilising SMES-FCL to improve the transient behaviour of a doubly fed induction generator DC wind system. Int. J. Electr. Power Energy Syst. 2021, 131, 107099.
Rajaperumal, T.A.; Columbus, C.C. Transforming the electrical grid: The role of AI in advancing smart, sustainable, and secure energy systems. Energy Inform. 2025, 8, 551.
Gunavathi, R.; Karthikeyan, G. Modernization of Rural Electric Infrastructure. In AI-Powered IoT in the Energy Industry: Digital Technology and Sustainable Energy Systems; Springer: Berlin/Heidelberg, Germany, 2023; pp. 229–252.
Meenual, T.; Usapein, P. Microgrid Policies: A Review of Technologies and Key Drivers of Thailand. Front. Energy Res. 2021, 9, 591537.
Shankar, R.; Singh, S. Development of smart grid for the power sector in India. Clean. Energy Syst. 2022, 2, 100011.
Raza, M.A.; Aman, M.M.; Abro, A.G.; Tunio, M.A.; Khatri, K.L.; Shahid, M. Challenges and potentials of implementing a smart grid for Pakistan’s electric network. Energy Strat. Rev. 2022, 43, 100941.
Jena, P.K.; Ghosh, S.; Koley, E. Design of a coordinated cyber-physical attack in IoT based smart grid under limited intruder accessibility. Int. J. Crit. Infrastruct. Prot. 2021, 35, 100484.
Karanfil, M.; Rebbah, D.E.; Ghafouri, M.; Kassouf, M.; Debbabi, M.; Hanna, A. Security Monitoring of the Microgrid Using IEC 62351-7 Network and System Management. In Proceedings of the 2022 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT); IEEE: New York, NY, USA, 2022; pp. 1–5.
Mallick, M.A.I.; Nath, R. Navigating the cyber security landscape: A comprehensive review of cyber-attacks, emerging trends, and recent developments. World Sci. News 2024, 190, 1–69.
Kezunovic, M.; Xie, L.; Grijalva, S. The role of big data in improving power system operation and protection. In Proceedings of the 2013 IREP Symposium Bulk Power System Dynamics and Control-IX Optimization, Security and Control of the Emerging Power Grid; IEEE: New York, NY, USA, 2013; pp. 1–9.
Baembitov, R.; Karmacharya, A.; Kezunovic, M.; Saranovic, D.; Obradovic, Z. Interpretability of the ML-Based State of Risk Predictions for the Electric Grid Forced Outages. IFAC-PapersOnLine 2025, 59, 181–186.
Godse, R.; Bhat, S. Mathematical Morphology-Based Feature-Extraction Technique for Detection and Classification of Faults on Power Transmission Line. IEEE Access 2020, 8, 38459–38471.
Qutub, A.M. Dissolved Gas Analysis of Renewable Energy Generation Transformers Using Statistical Machine Learning. Master’s Thesis, University of Illinois Chicago, Chicago, IL, USA, 2024.
Khaldi, B.F.; Dekhandji, F.Z.; Recioui, A. Power Quality Disturbances: A review of Detection, Classification, Optimization, and Mitigation Techniques. Alger. J. Signals Syst. 2024, 9, 261–286.
Echabarri, S. Artificial Intelligence-Based Failure Prognosis for Predictive Maintenance: Application to Hydrogen Power Generators. Ph.D. Thesis, Université de Lorraine, Nancy, France, 2025.
Alazemi, T.; Darwish, M.; Radi, M. Renewable energy sources integration via machine learning modelling: A systematic literature review. Heliyon 2024, 10, e26088.
Tiwari, S.; Jain, A.; Yadav, K.; Ramadan, R. Machine Learning-Based Model for Prediction of Power Consumption in Smart Grid. Int. Arab. J. Inf. Technol. 2022, 19, 323–329.
Aribisala, A.; Khan, M.S.; Husari, G. Machine learning algorithms and their applications in classifying cyber-attacks on a smart grid network. In Proceedings of the 2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON); IEEE: New York, NY, USA, 2021; pp. 63–69.
Ahmad, T.; Madonski, R.; Zhang, D.; Huang, C.; Mujeeb, A. Data-driven probabilistic machine learning in sustainable smart energy/smart energy systems: Key developments, challenges, and future research opportunities in the context of smart grid paradigm. Renew. Sustain. Energy Rev. 2022, 160, 112128.
Rai, S.; De, M. Analysis of classical and machine learning based short-term and mid-term load forecasting for smart grid. Int. J. Sustain. Energy 2021, 40, 821–839.
Tiwari, S.; Jain, A.; Ahmed, N.M.O.S.; Charu; Alkwai, L.M.; Dafhalla, A.K.Y.; Hamad, S.A.S. Machine learning-based model for prediction of power consumption in smart grid- smart way towards smart city. Expert Syst. 2022, 39, e12832.
Önder, M.; Dogan, M.U.; Polat, K. Classification of smart grid stability prediction using cascade machine learning methods and the internet of things in smart grid. Neural Comput. Appl. 2023, 35, 17851–17869.
Khan, N.; Qureshi, M.I.; Falahat, M.; Sikandar, H.; Sham, R.B. Navigating the Renewable Energy Transition: A Systematic Review of Economic and Policy Strategies for Grid Integration, Stability, and Viability. Int. J. Energy Econ. Policy 2025, 15, 709–723.
Ali, A.S.; Azad, S.; Khorshed, T. Securing the smart grid: A machine learning approach. In Smart Grids: Opportunities, Developments, and Trends; Springer: London, UK, 2013; pp. 169–198.
Khoei, T.T.; Hu, W.C.; Kaabouch, N. Residual Convolutional Network for Detecting Attacks on Intrusion Detection Systems in Smart Grid. In Proceedings of the 2022 IEEE International Conference on Electro Information Technology (eIT); IEEE: New York, NY, USA, 2022; pp. 231–237.
Alsirhani, A.; Tariq, N.; Humayun, M.; Alwakid, G.N.; Sanaullah, H. Intrusion detection in smart grids using artificial intelligence-based ensemble modelling. Clust. Comput. 2025, 28, 238.
Zhe, W.; Wei, C.; Chunlin, L. DoS attack detection model of smart grid based on machine learning method. In Proceedings of the 2020 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS); IEEE: New York, NY, USA, 2020; pp. 735–738.
Amanlou, S.; Hasan, M.K.; Mokhtar, U.A.; Malik, K.M.; Islam, S.; Khan, S.; Khan, M.A. Cybersecurity Challenges in Smart Grid Systems: Current and Emerging Attacks, Opportunities, and Recommendations. IEEE Open J. Commun. Soc. 2025, 6, 1965–1997.
Kotsiopoulos, T.; Sarigiannidis, P.; Ioannidis, D.; Tzovaras, D. Machine Learning and Deep Learning in smart manufacturing: The Smart Grid paradigm. Comput. Sci. Rev. 2021, 40, 100341.
Kiasari, M.; Ghaffari, M.; Aly, H.H. A Comprehensive Review of the Current Status of Smart Grid Technologies for Renewable Energies Integration and Future Trends: The Role of Machine Learning and Energy Storage Systems. Energies 2024, 17, 4128.
Xu, C.; Liao, Z.; Li, C.; Zhou, X.; Xie, R. Review on Interpretable Machine Learning in Smart Grid. Energies 2022, 15, 4427.
Hossain, E.; Khan, I.; Un-Noor, F.; Sikander, S.S.; Sunny, S.H. Application of Big Data and Machine Learning in Smart Grid, and Associated Security Concerns: A Review. IEEE Access 2019, 7, 13960–13988.
Taherdoost, H. A systematic review of big data innovations in smart grids. Results Eng. 2024, 22, 102132.
Cui, L.; Qu, Y.; Gao, L.; Xie, G.; Yu, S. Detecting false data attacks using machine learning techniques in smart grid: A survey. J. Netw. Comput. Appl. 2020, 170, 102808.
Berghout, T.; Benbouzid, M.; Muyeen, S. Machine learning for cybersecurity in smart grids: A comprehensive review-based study on methods, solutions, and prospects. Int. J. Crit. Infrastruct. Prot. 2022, 38, 100547.
KDD Cup Dataset. Available online: https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html (accessed on 30 October 2021).
Purna Prakash, K.; Venkata Pavan Kumar, Y.; Himajyothi, K.; Pradeep Reddy, G. Comprehensive Bibliometric Analysis on Smart Grids: Key Concepts and Research Trends. Electricity 2024, 5, 75–92.
Gao, D.; Cai, J.; Wu, K. The smart green tide: A bibliometric analysis of AI and renewable energy transition. Energy Rep. 2025, 13, 5290–5304.
Rasoulnia, M.; Yaghoubi, E.; Hussain, A.; Kamwa, I. A comprehensive systematic and bibliometric review of technologies and measurement tools for power quality events detection, classification, and fault location in smart grids. Renew. Sustain. Energy Rev. 2026, 226, 116302.
Jaramillo, M.; Carrión, D.; Muñoz, J.; Tipán, L. A Bibliometric Assessment of AI, IoT, Blockchain, and Big Data in Renewable Energy-Oriented Power Systems. Energies 2025, 18, 3067.
Sakhnini, J.; Karimipour, H.; Dehghantanha, A.; Parizi, R.M. AI and security of critical infrastructure. In Handbook of Big Data Privacy; Springer: Berlin/Heidelberg, Germany, 2020; pp. 7–36.
Banad, Y.M.; Sharif, S.S.; Rezaei, Z. Artificial intelligence and machine learning for smart grids: From foundational paradigms to emerging technologies with digital twin and large language model-driven intelligence. Energy Convers. Manag. X 2025, 28, 101329.
Ajibade, S.-S.M.; Ojeniyi, A. Bibliometric Survey on Particle Swarm Optimization Algorithms (2001–2021). J. Electr. Comput. Eng. 2022, 2022, 3242949.
Kek, H.Y.; Saupi, S.B.M.; Tan, H.; Othman, M.H.D.; Nyakuma, B.B.; Goh, P.S.; Altowayti, W.A.H.; Qaid, A.; Wahab, N.H.A.; Lee, C.H.; et al. Ventilation strategies for mitigating airborne infection in healthcare facilities: A review and bibliometric analysis (1993–2022). Energy Build. 2023, 295, 113323.
Ajibade, S.-S.M.; Bashir, F.M.; Dodo, Y.A.; Dayupay, J.P.; De La Calzada, L.M.; Adediran, A.O. Application of Machine Learning in Energy Storage: A Scientometric Research of a Decade. In Proceedings of the International Conference on Information and Software Technologies; Springer: Berlin/Heidelberg, Germany, 2023; pp. 124–135.
Prayogo, G.S.; Mamat, R.; Ghazali, M.F.; Nugroho, A.; Catrawedarma, I.G.N.B.; Muriban, J.; Zikri, M. A Bibliometric Review of the Research Progress and Trends in Nanolubricants for Refrigeration Systems (2003–2025). Int. J. Automot. Sci. Technol. 2025, 9, 353–373.
Xu, X.; Xia, Z. Bibliometric analysis on organizational innovation research based on Scopus from 2012 to 2024. Iberoam. J. Sci. Meas. Commun. 2025, 5, 1–19.
Borgohain, D.J.; Bhardwaj, R.K.; Verma, M.K. Mapping the literature on the application of artificial intelligence in libraries (AAIL): A scientometric analysis. Libr. Hi Tech 2024, 42, 149–179.
Ajibade, S.-S.M.; Alhassan, G.N.; Zaidi, A.; Oki, O.A.; Awotunde, J.B.; Ogbuju, E.; Akintoye, K.A. Evolution of machine learning applications in medical and healthcare analytics research: A bibliometric analysis. Intell. Syst. Appl. 2024, 24, 200441.
Baek, C.; Doleck, T. Educational Data Mining: A Bibliometric Analysis of an Emerging Field. IEEE Access 2022, 10, 31289–31296. [Google Scholar] [CrossRef]
Li, Y.; Ding, Y.; He, S.; Hu, F.; Duan, J.; Wen, G.; Geng, H.; Wu, Z.; Gooi, H.B.; Zhao, Y.; et al. Artificial intelligence-based methods for renewable power system operation. Nat. Rev. Electr. Eng. 2024, 1, 163–179. [Google Scholar] [CrossRef]
Deng, C.; Zhang, T.; He, Z.; Chen, Q.; Shi, Y.; Xu, Y.; Fu, L.; Zhang, W.; Wang, X.; Zhou, C.; et al. K2: A foundation language model for geoscience knowledge understanding and utilization. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Merida, Mexico, 4–8 March 2024; pp. 161–170. [Google Scholar]
Ferreira, N.C.; Ferreira, J.J. The field of resource-based view research: Mapping past, present and future trends. Manag. Decis. 2024, 63, 1124–1153. [Google Scholar] [CrossRef]
Zaidi, A.; Ajibade, S.-S.M.; Musa, M.; Bekun, F.V. New Insights into the Research Landscape on the Application of Artificial Intelligence in Sustainable Smart Cities: A Bibliometric Mapping and Network Analysis Approach. Int. J. Energy Econ. Policy 2023, 13, 287–299. [Google Scholar] [CrossRef]
Wong, S.L.; Nyakuma, B.B.; Wong, K.Y.; Lee, C.T.; Lee, T.H.; Lee, C.H. Microplastics and nanoplastics in global food webs: A bibliometric analysis (2009–2019). Mar. Pollut. Bull. 2020, 158, 111432. [Google Scholar] [CrossRef] [PubMed]
Xin, H.; Ajibade, S.-S.M.; Alhassan, G.N.; Yilmaz, Y. Emerging trends and bibliometric analysis of internet of medical things for innovative healthcare (2016–2023). Digit. Health 2026, 12, 20552076251395701. [Google Scholar] [CrossRef] [PubMed]
Nyakuma, B.B.; Wong, S.; Mong, G.R.; Utume, L.N.; Oladokun, O.; Wong, K.Y.; Ivase, T.J.-P.; Abdullah, T.A.T. Bibliometric analysis of the research landscape on rice husks gasification (1995–2019). Environ. Sci. Pollut. Res. 2021, 28, 49467–49490. [Google Scholar] [CrossRef]
Nyakuma, B.B.; Nordin, A.H.; Lee, C.T.; Ngadi, N.; Wong, K.Y.; Oladokun, O. Uncovering the dynamics in global carbon dioxide utilization research: A bibliometric analysis (1995–2019). Environ. Sci. Pollut. Res. 2021, 28, 13842–13860. [Google Scholar] [CrossRef] [PubMed]
Yumnam, G.; Singh, C.I. An application of Bradford’s law of scattering and Leimkuhler model: Identification of the core journals of India Cancer Research Productivity. Sci. Technol. Libr. 2024, 43, 188–201. [Google Scholar]
Sadeqi-Arani, Z.; Janavi, E. The Global Researches Trends in Customer Knowledge Management (CKM). Iran. J. Inf. Process. Manag. 2024, 39, 267–297. [Google Scholar]
Khan, Z.A.; Adil, M.; Javaid, N.; Saqib, M.N.; Shafiq, M.; Choi, J.-G. Electricity Theft Detection Using Supervised Learning Techniques on Smart Meter Data. Sustainability 2020, 12, 8023. [Google Scholar] [CrossRef]
Mujeeb, S.; Javaid, N. ESAENARX and DE-RELM: Novel schemes for big data predictive analytics of electricity load and price. Sustain. Cities Soc. 2019, 51, 101642. (In English) [Google Scholar] [CrossRef]
Arif, A.; Alghamdi, T.A.; Khan, Z.A.; Javaid, N. Towards Efficient Energy Utilization Using Big Data Analytics in Smart Cities for Electricity Theft Detection. Big Data Res. 2022, 27, 100285. (In English) [Google Scholar] [CrossRef]
El-Toukhy, A.T.; Badr, M.M.; Mahmoud, M.M.E.A.; Srivastava, G.; Fouda, M.M.; Alsabaan, M. Electricity Theft Detection Using Deep Reinforcement Learning in Smart Power Grids. IEEE Access 2023, 11, 59558–59574. [Google Scholar] [CrossRef]
Syed, D.; Abu-Rub, H.; Ghrayeb, A.; Refaat, S.S.; Houchati, M.; Bouhali, O.; Banales, S. Deep Learning-Based Short-Term Load Forecasting Approach in Smart Grid with Clustering and Consumption Pattern Recognition. IEEE Access 2021, 9, 54992–55008. (In English) [Google Scholar] [CrossRef]
Jiang, C.; Zhang, H.; Ren, Y.; Han, Z.; Chen, K.-C.; Hanzo, L. Machine Learning Paradigms for Next-Generation Wireless Networks. IEEE Wirel. Commun. 2016, 24, 98–105. [Google Scholar] [CrossRef]
Ozay, M.; Esnaola, I.; Vural, F.T.Y.; Kulkarni, S.R.; Poor, H.V. Machine Learning Methods for Attack Detection in the Smart Grid. IEEE Trans. Neural Netw. Learn. Syst. 2015, 27, 1773–1786. (In English) [Google Scholar] [CrossRef] [PubMed]
Esmalifalak, M.; Liu, L.; Nguyen, N.; Zheng, R.; Han, Z. Detecting stealthy false data injection using machine learning in smart grid. IEEE Syst. J. 2014, 11, 1644–1652. [Google Scholar] [CrossRef]
Ji, Y.; Wang, J.; Xu, J.; Fang, X.; Zhang, H. Real-Time Energy Management of a Microgrid Using Deep Reinforcement Learning. Energies 2019, 12, 2291. [Google Scholar] [CrossRef]
Shi, Z.; Yao, W.; Li, Z.; Zeng, L.; Zhao, Y.; Zhang, R.; Tang, Y.; Wen, J. Artificial intelligence techniques for stability analysis and control in smart grids: Methodologies, applications, challenges and future directions. Appl. Energy 2020, 278, 115733. [Google Scholar] [CrossRef]
Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M.A. Optimal Deep Learning LSTM Model for Electric Load Forecasting using Feature Selection and Genetic Algorithm: Comparison with Machine Learning Approaches. Energies 2018, 11, 1636. [Google Scholar] [CrossRef]
Musleh, A.S.; Chen, G.; Dong, Z.Y. A Survey on the Detection Algorithms for False Data Injection Attacks in Smart Grids. IEEE Trans. Smart Grid 2019, 11, 2218–2234. [Google Scholar] [CrossRef]
Ahmed, S.; Lee, Y.; Hyun, S.-H.; Koo, I. Unsupervised Machine Learning-Based Detection of Covert Data Integrity Assault in Smart Grid Networks Utilizing Isolation Forest. IEEE Trans. Inf. Forensics Secur. 2019, 14, 2765–2777. (In English) [Google Scholar] [CrossRef]
Babar, M.; Tariq, M.U.; Jan, M.A. Secure and resilient demand side management engine using machine learning for IoT-enabled smart grid. Sustain. Cities Soc. 2020, 62, 102370. (In English) [Google Scholar] [CrossRef]
Ahmed, W.; Ansari, H.; Khan, B.; Ullah, Z.; Ali, S.M.; Mehmood, C.A.A.; Qureshi, M.B.; Hussain, I.; Jawad, M.; Khan, M.U.S.; et al. Machine Learning Based Energy Management Model for Smart Grid and Renewable Energy Districts. IEEE Access 2020, 8, 185059–185078. (In English) [Google Scholar] [CrossRef]
Ahmed, S.; Lee, Y.; Hyun, S.-H.; Koo, I. Feature Selection–Based Detection of Covert Cyber Deception Assaults in Smart Grid Communications Networks Using Machine Learning. IEEE Access 2018, 6, 27518–27529. (In English) [Google Scholar] [CrossRef]
Ahmad, T.; Chen, H. Potential of three variant machine-learning models for forecasting district level medium-term and long-term energy demand in smart grid environment. Energy 2018, 160, 1008–1020. (In English) [Google Scholar] [CrossRef]
Wong, S.; Mah, A.X.Y.; Nordin, A.H.; Nyakuma, B.B.; Ngadi, N.; Mat, R.; Amin, N.A.S.; Ho, W.S.; Lee, T.H. Emerging trends in municipal solid waste incineration ashes research: A bibliometric analysis from 1994 to 2018. Environ. Sci. Pollut. Res. 2020, 27, 7757–7784. [Google Scholar] [CrossRef]
Nyakuma, B.B.; Mahyon, N.I.; Chiong, M.S.; Rajoo, S.; Pesiridis, A.; Wong, S.L.; Martinez-Botas, R. Recovery and utilisation of waste heat from flue/exhaust gases: A bibliometric analysis (2010–2022). Environ. Sci. Pollut. Res. 2023, 30, 90522–90546. (In English) [Google Scholar] [CrossRef] [PubMed]
Leszczyna, R. A review of standards with cybersecurity requirements for smart grid. Comput. Secur. 2018, 77, 262–276. [Google Scholar] [CrossRef]
Goel, S.; Hong, Y.; Papakonstantinou, V.; Kloza, D. Security challenges in smart grid implementation. In Smart Grid Security; Springer: London, UK, 2015; pp. 1–39. [Google Scholar]
Flick, T.; Morehouse, J. Securing the Smart Grid: Next Generation Power Grid Security; Elsevier: Amsterdam, The Netherlands, 2010. [Google Scholar]
Sarathkumar, D.; Srinivasan, M.; Stonier, A.A.; Samikannu, R.; Dasari, N.R.; Raj, R.A. A technical review on classification of various faults in smart grid systems. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1055, 12152. [Google Scholar] [CrossRef]
Lee, G.M.; Su, D.H. Standardization of smart grid in ITU-T. IEEE Commun. Mag. 2013, 51, 90–97. [Google Scholar] [CrossRef]
Luthra, S.; Kumar, S.; Kharb, R.; Ansari, F.; Shimmi, S. Adoption of smart grid technologies: An analysis of interactions among barriers. Renew. Sustain. Energy Rev. 2014, 33, 554–565. [Google Scholar] [CrossRef]
Panthi, M. Anomaly detection in smart grids using machine learning techniques. In Proceedings of the 1st International Conference on Power, Control and Computing Technologies, ICPC2T 2020; IEEE: New York, NY, USA, 2020; pp. 220–222. [Google Scholar] [CrossRef]
Guihai, Z.; Sikdar, B. Adversarial Machine Learning Against False Data Injection Attack Detection for Smart Grid Demand Response. In Proceedings of the 2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids, SmartGridComm 2021; IEEE: New York, NY, USA, 2021; pp. 352–357. [Google Scholar] [CrossRef]
Aziz, S.; Irshad, M.; Haider, S.A.; Wu, J.; Deng, D.N.; Ahmad, S. Protection of a smart grid with the detection of cyber-malware attacks using efficient and novel machine learning models. Front. Energy Res. 2022, 10, 964305. [Google Scholar] [CrossRef]
Ungureanu, S.; Topa, V.; Cziker, A. Industrial load forecasting using machine learning in the context of smart grid. In Proceedings of the 54th International Universities Power Engineering Conference, UPEC 2019; IEEE: New York, NY, USA, 2019. [Google Scholar] [CrossRef]
Syed, D.; Refaat, S.S.; Abu-Rub, H. Performance evaluation of distributed machine learning for load forecasting in smart grids. In Proceedings of the 30th International Conference on Cybernetics and Informatics, (K&I) 2020; Ciganek, J., Kozak, S., Kozakova, A., Eds.; IEEE: New York, NY, USA, 2020. [Google Scholar] [CrossRef]
Bahaghighat, M.; Abedini, F.; Xin, Q.; Zanjireh, M.M.; Mirjalili, S. Using machine learning and computer vision to estimate the angular velocity of wind turbines in smart grids remotely. Energy Rep. 2021, 7, 8561–8576. (In English) [Google Scholar] [CrossRef]
Bahaghighat, M.; Xin, Q.; Motamedi, S.A.; Zanjireh, M.M.; Vacavant, A. Estimation of Wind Turbine Angular Velocity Remotely Found on Video Mining and Convolutional Neural Network. Appl. Sci. 2020, 10, 3544. [Google Scholar] [CrossRef]
Cebekhulu, E.; Onumanyi, A.J.; Isaac, S.J. Performance Analysis of Machine Learning Algorithms for Energy Demand–Supply Prediction in Smart Grids. Sustainability 2022, 14, 2546. (In English) [Google Scholar] [CrossRef]
Li, B.; Gangadhar, S.; Cheng, S.; Verma, P.K. Predicting user comfort level using machine learning for smart grid environments. In Proceedings of the 2011 IEEE PES Innovative Smart Grid Technologies, ISGT 2011, Anaheim, CA, USA, 17–19 January 2011. [Google Scholar] [CrossRef]
Azad, S.; Sabrina, F.; Wasimi, S. Transformation of smart grid using machine learning. In Proceedings of the 29th Australasian Universities Power Engineering Conference, AUPEC 2019; IEEE: New York, NY, USA, 2019. [Google Scholar] [CrossRef]
Min, L.; Alnowibet, K.A.; Alrasheedi, A.F.; Moazzen, F.; Awwad, E.M.; Mohamed, M.A. A stochastic machine learning based approach for observability enhancement of automated smart grids. Sustain. Cities Soc. 2021, 72, 103071. (In English) [Google Scholar] [CrossRef]
Krč, R.; Kratochvílová, M.; Podroužek, J.; Apeltauer, T.; Stupka, V.; Pitner, T. Machine Learning-Based Node Characterization for Smart Grid Demand Response Flexibility Assessment. Sustainability 2021, 13, 2954. (In English) [Google Scholar] [CrossRef]
Borges, T.A.R.; Brito, F.C.; dos Santos, R.G.O.; Nascimento, P.d.T.; da Silva, C.B.; Panizio, R.M.; Saba, H.; Filho, A.S.N. Smart Technologies Applied in Microgrids of Renewable Energy Sources: A Systematic Review. Energies 2025, 18, 2676. [Google Scholar] [CrossRef]
Zhang, K.; Lazaro, J. Failure Prediction and Life Cycle Management of Power Equipment Based on Big Data Analysis. J. Comput. Signal Syst. Res. 2025, 2, 70–79. [Google Scholar] [CrossRef]
Zaben, M.M.; Worku, M.Y.; Hassan, M.A.; Abido, M.A. Machine Learning Methods for Fault Diagnosis in AC Microgrids: A Systematic Review. IEEE Access 2024, 12, 20260–20298. [Google Scholar] [CrossRef]
Dladla, V.M.N.; Thango, B.A. Fault Classification in Power Transformers via Dissolved Gas Analysis and Machine Learning Algorithms: A Systematic Literature Review. Appl. Sci. 2025, 15, 2395. [Google Scholar] [CrossRef]
Priyadarshini, M.S.; Bajaj, M.; Prokop, L.; Berhanu, M. Perception of power quality disturbances using Fourier, Short-Time Fourier, continuous and discrete wavelet transforms. Sci. Rep. 2024, 14, 3443. [Google Scholar] [CrossRef] [PubMed]
Li, X.; Williams, J.; Swanson, C.; Berg, T. A machine learning approach to predictive maintenance: Remaining useful life and motor fault analysis. Comput. Ind. Eng. 2025, 206, 111222. [Google Scholar] [CrossRef]
Yao, R.; Li, J.; Zuo, B.; Hu, J. Machine learning-based energy efficient technologies for smart grid. Int. Trans. Electr. Energy Syst. 2021, 31, e12744. [Google Scholar]
Gkikas, D.C.; Theodoridis, P.K. Predicting Online Shopping Behavior: Using Machine Learning and Google Analytics to Classify User Engagement. Appl. Sci. 2024, 14, 11403. [Google Scholar] [CrossRef]
Shams, M.Y.; Tarek, Z.; Elshewey, A.M. A novel RFE-GRU model for diabetes classification using PIMA Indian dataset. Sci. Rep. 2025, 15, 982. [Google Scholar] [CrossRef] [PubMed]
Elshewey, A.M. Enhancing crop yield prediction based on dove optimization algorithm and gradient boosting model. Signal Image Video Process. 2025, 19, 951. [Google Scholar] [CrossRef]
Sujon, K.M.; Hassan, R.; Khairudin, A.R.; Moi, S.H.; Shafie, M.L.M.; Saringat, Z.; Erianda, A. The Effects of Imbalanced Datasets on Machine Learning Algorithms in Predicting Student Performance. JOIV Int. J. Inform. Vis. 2024, 8, 1599. [Google Scholar] [CrossRef]
Ahmed, W.; Wani, M.A.; Plawiak, P.; Meshoul, S.; Mahmoud, A.; Hammad, M. Machine learning-based academic performance prediction with explainability for enhanced decision-making in educational institutions. Sci. Rep. 2025, 15, 26879. [Google Scholar] [PubMed]
Bogensperger, A.J.; Fabel, Y.; Ferstl, J. Accelerating Energy-Economic Simulation Models via Machine Learning-Based Emulation and Time Series Aggregation. Energies 2022, 15, 1239. [Google Scholar]
Olawumi, M.A.; Oladapo, B. Enhancing grid stability with machine learning: A smart predictive approach to residential energy management. Energy Build. 2025, 338, 115729. [Google Scholar] [CrossRef]
Fahim, K.E.; Islam, R.; Shihab, N.A.; Olvi, M.R.; Al Jonayed, K.L.; Das, A.S. Transformation and future trends of smart grid using machine and deep learning: A state-of-the-art review. Int. J. Appl. Power Eng. (IJAPE) 2024, 13, 583–593. [Google Scholar]

Figure 1. PRISMA flow framework.

Figure 2. Publication trends in MLSG research (2009–2025).

Figure 3. Top 5 source titles on MLSG in Scopus.

Figure 4. Top 5 most prolific researchers on MLSG.

Figure 5. Network visualization map for co-authorship on MLSG research landscape.

Figure 6. Top 10 corresponding authors’ countries in MLSG research landscape.

Figure 7. Top 5 most prolific affiliations on MLSG.

Figure 8. Top 5 most prolific countries in MLSG.

Figure 9. Network visualization map for collaboration between nations in MLSG research.

Figure 10. Top 5 most influential funders of MLSG research globally.

Figure 11. Keyword co-occurrence analysis of the MLSG research landscape: (a) network visualization; (b) overlay visualization.

Figure 12. Most frequent words and word cloud occurrence analysis of the MLSG research landscape: (a) most frequent words; (b) word cloud.

Table 1. Comparative overview of representative prior bibliometric studies vs. the present study.

Study	Focus Domain	Database	N Docs	Longitudinal?	Methods	Thematic Hotspots?	Cross-Domain Transferability?
Purna Prakash et al. (2024) [41]	Smart grids (general)	Scopus	~500	No	Descriptive stats only	No	No
Gao et al. (2025) [42]	AI + renewable energy	Not specified	~1054	No	Keyword clustering	No	No
Rasoulnia et al. (2026) [43]	SG technologies & measurement	Multiple DBs	Not stated	No	Systematic + bibliometric	No	No
Jaramillo et al. (2025) [44]	AI/IoT/blockchain in energy	Scopus/WoS	Not stated	No	Co-citation, keyword	No	No
Sakhnini et al. (2020) [45]	SG cybersecurity	Multiple DBs	Narrow	No	Narrative review	No	No
Banad et al. (2025) [46]	ML + SG (combined)	Scopus	~123	No	Cluster analysis	Partial	No
Present Study (2026)	ML in Smart Grids (MLSG)	Scopus	4156	Yes (2009–2025)	Bibliometric + science mapping + thematic review	Yes (3 clusters)	Yes (electrical systems)

Table 2. Top 10 source titles on MLSG in Scopus.

Source Title	TP	%TP	TC	%TC	H-Index
Energies	183	4.4	8556	7.87	83
IEEE Access	175	4.2	10,549	9.71	152
Applied Energy	91	2.18	6381	5.87	196
Energy	56	1.34	2224	2.04	205
Sustainability (Switzerland)	56	1.34	2088	1.92	98
IEEE Internet of Things Journal	39	0.94	1835	1.68	113
IEEE Transactions on Smart Grid	36	0.87	5030	4.63	100
Applied Sciences (Switzerland)	33	0.79	1156	1.06	72
Sensors	32	0.77	1457	1.34	116
Energy Reports	30	0.72	1588	1.46	51

TP—total publications, %TP—percentage total publications, TC—total citations, %TC—percentage total citations.

Table 3. Key subject areas of MLSG research landscape.

Subject Area	TP	%TP
Engineering	2698	25.6
Computer Science	2627	24.9
Energy	1688	16.0
Mathematics	1141	10.8
Decision Sciences	489	4.6
Environmental Science	355	3.4
Materials Science	341	3.2
Physics and Astronomy	341	3.2
Social Sciences	248	2.3
Medicine	137	1.3

TP—total publications, %TP—percentage total publications.

Table 4. Top 10 corresponding authors’ countries in MLSG research landscape.

Country	Articles	SCP	MCP	Freq	MCP_Ratio
China	157	111	46	0.107	0.293
USA	99	75	24	0.067	0.242
India	82	66	16	0.056	0.195
Korea	39	27	12	0.026	0.308
Pakistan	24	6	18	0.016	0.75
Saudi Arabia	21	13	8	0.014	0.381
Australia	20	10	10	0.014	0.5
Canada	20	11	9	0.014	0.45
UK	20	11	9	0.014	0.45
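Germany	18	15	3	0.012	0.167

SCP: single-country publications; MCP: multiple-country publications; Freq: the country's share of all corresponding-author articles; MCP_Ratio: MCP divided by Articles.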
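
The MCP_Ratio column can be recomputed from the counts in Table 4. The sketch below assumes the usual bibliometric reading of the headers (SCP as single-country publications, MCP as multiple-country publications, and MCP_Ratio as MCP divided by Articles). The names `TABLE_4`, `mcp_ratio`, and `check_rows` are illustrative and are not taken from the study's workflow.

```python
# Rows copied from Table 4: (country, articles, SCP, MCP).
# Assumption: SCP = single-country publications, MCP = multiple-country publications.
TABLE_4 = [
    ("China", 157, 111, 46),
    ("USA", 99, 75, 24),
    ("India", 82, 66, 16),
    ("Korea", 39, 27, 12),
    ("Pakistan", 24, 6, 18),
    ("Saudi Arabia", 21, 13, 8),
    ("Australia", 20, 10, 10),
    ("Canada", 20, 11, 9),
    ("UK", 20, 11, 9),
    ("Germany", 18, 15, 3),
]


def mcp_ratio(articles: int, mcp: int) -> float:
    """Share of a country's articles with co-authors from other countries."""
    if articles <= 0:
        raise ValueError("articles must be positive")
    return round(mcp / articles, 3)


def check_rows(rows):
    """Return {country: MCP ratio} after checking that SCP + MCP == Articles."""
    ratios = {}
    for country, articles, scp, mcp in rows:
        if scp + mcp != articles:
            raise ValueError(f"inconsistent counts for {country}")
        ratios[country] = mcp_ratio(articles, mcp)
    return ratios


if __name__ == "__main__":
    for country, ratio in check_rows(TABLE_4).items():
        print(f"{country}: {ratio}")
```

Under these assumptions the recomputed ratios agree with the printed column, and SCP plus MCP equals Articles in every row. Pakistan has the highest ratio of the ten countries (0.75), and Germany the lowest (0.167).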

Table 5. Top 12 most highly cited publications on MLSG.

References	Paper Title	Source Title	Citations	Document Type
[75]	Optimal deep learning LSTM model for electric load forecasting using feature selection and genetic algorithm: Comparison with machine learning approaches	Energies	840	Article
[76]	A Survey on the Detection Algorithms for False Data Injection Attacks in Smart Grids	IEEE Transactions on Smart Grid	672	Review
[23]	Data-driven probabilistic machine learning in sustainable smart energy/smart energy systems: Key developments, challenges, and future research opportunities in the context of smart grid paradigm	Renewable and Sustainable Energy Reviews	618	Review
[71]	Machine Learning Methods for Attack Detection in the Smart Grid	IEEE Transactions on Neural Networks and Learning Systems	596	Article
[36]	Application of Big Data and Machine Learning in Smart Grid, and Associated Security Concerns: A Review	IEEE Access	461	Review
[33]	Machine Learning and Deep Learning in smart manufacturing: The Smart Grid paradigm	Computer Science Review	320	Review
[77]	Unsupervised Machine Learning-Based Detection of Covert Data Integrity Assault in Smart Grid Networks Utilizing Isolation Forest	IEEE Transactions on Information Forensics and Security	264	Article
[38]	Detecting false data attacks using machine learning techniques in smart grid: A survey	Journal of Network and Computer Applications	179	Review
[78]	Secure and resilient demand side management engine using machine learning for IoT-enabled smart grid	Sustainable Cities and Society	159	Article
[79]	Machine learning-based energy management model for smart grid and renewable energy districts	IEEE Access	124	Article
[80]	Feature Selection-Based Detection of Covert Cyber Deception Assaults in Smart Grid Communications Networks Using Machine Learning	IEEE Access	95	Article
[81]	Potential of three variant machine learning models for forecasting district-level medium-term and long-term energy demand in smart grid environment	Energy	88	Article

Table 6. Top 10 most frequent author keywords in MLSG.

Rank	Keyword	Occurrences	Percentage of Corpus (%)	Thematic Cluster
1	Smart grid	1842	44.32	Cross-cluster
2	Machine learning	1654	39.80	Cross-cluster
3	Deep learning	892	21.46	Clusters 1 and 2
4	Neural network	748	18.00	Cluster 2
5	Demand response	682	16.41	Cluster 3
6	Cybersecurity	634	15.26	Cluster 1
7	Load forecasting	618	14.87	Cluster 2
8	False data injection	594	14.30	Cluster 1
9	Energy management	571	13.74	Cluster 3
10	Smart meters	528	12.71	Cluster 3
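
The "Percentage of Corpus" column in Table 6 can be sketched as a simple share calculation. The example assumes that the percentage is the keyword's occurrence count divided by the 4156 documents in the corpus; the study does not state the formula explicitly, so the recomputed value may differ from the printed one in the last digit for some rows. The names `CORPUS_SIZE` and `corpus_share` are illustrative.

```python
CORPUS_SIZE = 4156  # documents analysed in the study (Scopus, 2009-2025)

# (keyword, occurrences) copied from the first rows of Table 6.
KEYWORDS = [
    ("Smart grid", 1842),
    ("Machine learning", 1654),
    ("Deep learning", 892),
    ("Neural network", 748),
    ("Demand response", 682),
]


def corpus_share(occurrences: int, corpus_size: int = CORPUS_SIZE) -> float:
    """Percentage of corpus documents carrying the keyword, to two decimals.

    Assumption: share = 100 * occurrences / corpus_size.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return round(100 * occurrences / corpus_size, 2)


if __name__ == "__main__":
    for keyword, count in KEYWORDS:
        print(f"{keyword}: {corpus_share(count)}%")
```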

Table 7. Top 10 most relevant keywords in MLSG.

Rank	Keyword	Relevance Score	Total Occurrences	First Appeared
1	Federated learning	1.84	148	2020
2	Deep reinforcement learning	1.76	118	2019
3	Graph neural network	1.72	96	2021
4	Transformer model	1.68	84	2021
5	Explainable AI (XAI)	1.64	78	2021
6	Physics-informed neural network	1.61	62	2022
7	Digital twin	1.58	74	2021
8	Adversarial machine learning	1.54	88	2020
9	Edge computing	1.51	112	2020
10	Electric vehicle (EV)	1.48	138	2019

Table 8. Comparative analysis of machine learning approaches in smart grid applications.

Refs.	Study Focus	Algorithm	Datasets	Eval Metrics	Key Findings	Limitations
[71]	FDI Attack Detection	SVM	IEEE Bus System	Accuracy, Recall	High detection accuracy for cyberattacks	Sensitive to parameter tuning, limited scalability
[77]	Anomaly Detection	Isolation Forest	Smart Grid Simulation Data	Precision, Recall	Effective in detecting unknown attacks	Sensitive to noise and data imbalance
[79]	Energy Prediction	Gaussian Process Regression	Smart Meter Data	RMSE	High prediction accuracy with uncertainty estimation	Poor scalability with large datasets
[80]	Intrusion Detection	GA + SVM	KDD99 Dataset	Accuracy, F1-score	Improved classification performance using feature selection	High computational cost, dependent on feature quality
[114]	Load Forecasting	SVM, Linear Regression	Energy Consumption Data	RMSE, MAE	Moderate forecasting accuracy	Poor performance for nonlinear/time-series data
[94]	Big Data Forecasting	Distributed ML (Spark MLlib)	Large-scale Energy Data	RMSE	Improved scalability and processing speed	Requires complex infrastructure
[95]	Load Prediction	CNN-based Model	Visual Smart Grid Data	Accuracy (~95.4%)	High prediction accuracy using image-based features	Limited generalizability beyond visual datasets
[115]	Energy Management	Supervised ML Models	Smart Grid Operational Data	Accuracy, Precision	Reliable prediction performance	Limited adaptability to dynamic systems
[116]	Energy Optimization	Reinforcement Learning (RL)	Simulated Smart Grid Environment	Reward Function	Adaptive decision-making improves efficiency	Training instability, high computational cost
[100]	Hybrid Forecasting	CNN-LSTM	Time-Series Energy Data	RMSE, MAE	Superior performance in temporal prediction	Complex architecture, high training cost
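
The evaluation metrics named in Table 8 follow standard definitions, which the sketch below implements with the Python standard library only. The function names and the sample values are hypothetical and serve only to illustrate the formulas; they do not reproduce any result from the studies in the table.

```python
import math


def _check(actual, predicted):
    """Reject empty or mismatched inputs before computing a metric."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error, as used for the forecasting rows."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large errors more than MAE."""
    _check(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def precision_recall_f1(actual, predicted, positive=1):
    """Precision, recall, and F1-score for one class, as used for the detection rows."""
    _check(actual, predicted)
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical load values (e.g., kW) and hypothetical attack labels (1 = attack).
    print(mae([1, 2, 3], [1, 2, 5]), rmse([1, 2, 3], [1, 2, 5]))
    print(precision_recall_f1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0]))
```

Because RMSE squares each error before averaging, a single large forecasting miss raises it more than it raises MAE, which is one reason the forecasting studies in Table 8 report both.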

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

