1. Introduction
An integrated electronic health record (EHR) system is an extensive, real-time, patient-centered digital health record that is accessible from many interoperable automated systems and available instantly and securely to authorized users through a standardized health information data format that supports system functions [
1]. Healthcare facilities using EHR systems face large-scale and persistent cybersecurity attacks that challenge the integrity of critical EHR infrastructure, with dire consequences for patient privacy, patient safety, and an organization’s finances or reputation. As such, the confidentiality, integrity, and availability of the EHR system are crucial, because health providers make life-or-death decisions based on accurately recorded patient activities, including but not limited to diagnoses, personally identifying information (PII), and demographic information [
1]. The 2019 National Electronic Record survey shows that approximately 89% of USA office-based physicians use EHRs [
2]. In addition, over 90% of large, medium, small rural, and critical access hospitals use some form of EHRs [
2]. EHRs have four core uses, with a growing number of sub-uses as research and development in technology continue. The four uses are: providing healthcare practitioners with a history of, and a potential projected view on, patients’ health; helping healthcare practitioners enhance the quality and efficiency of patient care by providing access, from a central location, to the current health state concerning disease, medication history, and medical exam records; reducing the cost of care by removing redundancy in procedures and reducing errors (e.g., wrong prescriptions and drug interactions); and serving as a memory bank for practitioners and patients in understanding previous ailments and care [
3].
Such core functionalities make EHR systems an essential part of any healthcare information technology infrastructure, requiring every measure to guarantee that sensitive patient information such as PII, medical history, diagnoses, medications, treatment plans, immunization dates, allergies, radiology images, and laboratory and test results is protected against any adverse threat (whether internal or external). For example, PII collected by a health custodian during a patient visit, if not safeguarded and then exposed in a data breach, can result in identity theft with severe consequences (e.g., impersonation attacks and fraud). Although there are many definitions of what constitutes a data breach, for the purpose of this work, a data breach is limited to any unauthorized access to patients’ PII, demographic data, diagnosis data, or other EHR system data in a way that compromises the confidentiality of patient or system information.
Unfortunately, there are documented challenges [
1,
4] in designing and securing EHR systems, including how to adequately address security and privacy control requirements for the secure collection, retention, and use of available data. Other difficulties include protecting data in multiple states (in transit, in storage, or in process); protecting the infrastructure that supports the EHR; provisioning access control for online EHR resources to prevent data breaches; verifying the authenticity of an individual during enrollment into the EHR before granting access, privileges, credentials, and services; securing the access that other stakeholders need to connect to the EHR while protecting stakeholders’ sensitive data; and educating consumers, providers, and employees on the importance of protecting data, together with introducing incentives to do so [
5].
In the past, such challenges have resulted in data breaches of EHRs at several key organizations. As documented in Table 1, several healthcare facilities across the globe have suffered data breaches. Such cyberattacks indicate that the security measures employed to secure EHRs in most jurisdictions might be subpar, and that measured security controls and aggressive solutions are required to address the vulnerabilities that can lead to a successful EHR data breach.
As healthcare data breaches become omnipresent, as depicted in Table 1, patients continue to lose confidence in the security and protection of their health records [4]. Therefore, they are uncomfortable providing information or fully participating in the EHR system [12]. Patients’ trust and confidence that healthcare providers are protecting their private and sensitive information at all costs have dwindled. In a recent global survey, approximately 80% of Americans, 81% of Britons, and 83% of Australians had strong reservations about allowing their paper health records to be migrated into an EHR system because of the risk of identity theft, the possibility of privacy breaches, and intrusive privacy violations by nosy healthcare workers or other employers [
12]. Participants from the survey acknowledge a high risk of exposure to privacy threats while their medical records are managed by healthcare organizations [
12]. Keeping EHRs secure is a challenge that governments and healthcare providers around the globe are only beginning to grasp [
13].
This work focuses on integrated EHR systems, which have revolutionized healthcare delivery by enabling real-time, patient-centered, and data-driven decision-making across interoperable platforms. These systems serve not only as comprehensive repositories for patient health data, including diagnoses, treatments, medications, and imaging, but also as critical enablers of cost-effective, accurate, and timely healthcare services. As more healthcare institutions adopt EHRs, their role in ensuring continuity of care, reducing medical errors, and improving patient outcomes becomes increasingly indispensable.
However, this growing reliance on EHRs has also made them a prime target for cybersecurity threats. Given the volume and sensitivity of information stored—particularly personally identifiable information (PII) and diagnostic data—any breach can result in severe consequences, including identity theft, medical fraud, and erosion of public trust. The escalating frequency and sophistication of cyberattacks, as evidenced by global incidents involving millions of compromised records, underscores the urgent need for stronger data protection mechanisms in EHR systems.
Despite growing awareness, there remain significant gaps in how EHR systems are secured, particularly within integrated healthcare environments. Existing security frameworks often fail to address the full spectrum of privacy and protection requirements, especially those involving data in various states (in transit, at rest, or in use). Furthermore, current systems lack robust mechanisms for secure identity verification, access provisioning, and stakeholder protection across distributed networks.
This study is an exploratory analysis of current integrated EHR cybersecurity attacks using privacy and security breach data reported under the United States Health Insurance Portability and Accountability Act (HIPAA). It investigates whether current EHR implementations lack the requisite security controls to prevent a cyber breach and protect user privacy. A descriptive and trend analysis is conducted to describe and summarize the data and to project direction based on current and historical data by covered entity, type of breach, and point of breach (examining attack methods, patterns, and the location of breached information). An Autoregressive Integrated Moving Average (ARIMA) model is used to provide a detailed analysis of the breach data.
To address the research question, “Do current Electronic Health Record (EHR) implementations lack the requisite security controls to prevent cyber breaches and adequately protect patient data privacy?”, and based on current literature and preliminary work, we hypothesize that:
H1: Most successful EHR cybersecurity breaches exploit similar attack vectors and stem from common security vulnerabilities, indicating that current EHR implementations lack sufficient security controls to prevent unauthorized access and protect patient privacy.
In addressing our stated research question and testing our hypothesis, we assess the current solutions in the literature and conduct an exploratory study on existing HIPAA data breaches between 2010 and 2025. Based on our findings, this work makes two key contributions to the field of health informatics and cybersecurity:
This study adopts a mixed-methods approach, including a comprehensive literature review, analysis of major healthcare cyberattacks from 2010 to 2024, and the design of a tailored security framework. The proposed solution integrates encryption, identity verification, anomaly detection, and stakeholder-specific access controls. Its effectiveness is evaluated through theoretical modeling and risk assessment simulations, benchmarked against current industry standards.
6. Trend Analysis of Data Breaches by Type and Point of Breach
This section examines data breach trends across different types and points of breach from 2010 to 2025, focusing on both the frequency of incidents and the scale of personal records affected. We extended the analysis window from 2016–2025 to 2010–2025 to obtain a longer time series. The findings reveal significant shifts in the data breach landscape, with hacking and IT-related incidents showing the most dramatic increases in both frequency and impact.
6.1. Type of Breach Analysis
Our examination of breach incident trends by type reveals distinct patterns across the 15-year analysis period.
Figure 8 illustrates the monthly number of breach incidents across different breach categories, with LOESS smoothing curves highlighting underlying trends. The most striking finding is the consistent and substantial increase in “Hacking/IT” incidents throughout the analysis period. Beginning with relatively low numbers in 2010, hacking-related breaches experienced exponential growth, becoming the dominant breach type by 2025. This trend reflects the increasing digitization of healthcare systems and the corresponding growth in sophisticated cyberattacks targeting these environments.
In contrast, traditional breach types, including improper use of devices, loss of data or devices, theft, and unauthorized access, have remained relatively stable throughout the analysis period. These patterns suggest that while organizational security practices for physical assets and access controls have matured, cybersecurity defenses have struggled to keep pace with evolving digital threats.
Analyzing the average number of personal records breached (number of affected individuals) provides a better view of the trends.
Figure 9 illustrates the monthly average number of personal records reported in the dataset, grouped by type of breach. The logged total number of affected individuals is relatively low and stays constant during the analysis period for all groups except incidents caused by hacking. For hacking incidents, the average number of affected individuals has grown from 20,000 to 160,000, while for the other groups it remains around 3000. For a more detailed analysis, we fit ARIMA models to the data and report the coefficients and their significance in Table 3 and Table 4. The ARIMA model was employed because of its effectiveness in modeling and forecasting univariate time series data. Given the chronological structure of HIPAA-reported EHR data breaches, ARIMA is well suited to capture underlying trends, account for non-stationarity, and project future breach occurrences. The model’s interpretability and established use in healthcare analytics make it appropriate for analyzing breach frequencies and identifying evolving threat patterns.
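One common way to write a trend model of this kind is a regression on time with ARIMA errors, sketched below. The linear form of the trend, the orders p, d, and q, and the notation are assumptions made for illustration; the text does not report the exact specification behind the reported coefficients.

```latex
% y_t      : monthly breach count (or log breach size) in month t  (assumed notation)
% \beta_1  : trend coefficient whose sign and significance are tested
% B        : backshift operator, B y_t = y_{t-1}
\begin{align*}
y_t &= \beta_0 + \beta_1 t + \eta_t, \\
\phi(B)\,(1 - B)^{d}\,\eta_t &= \theta(B)\,\varepsilon_t,
\qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2),
\end{align*}
% with \phi(B)   = 1 - \phi_1 B - \dots - \phi_p B^p   (autoregressive part)
% and  \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q (moving-average part).
```

Under this formulation, a positive and statistically significant estimate of the trend coefficient corresponds to an increasing series once serial correlation in the errors has been accounted for.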
6.2. Scale of Impact Analysis
The analysis of average personal records affected provides crucial insights into the severity trends across breach types.
Figure 9 presents the monthly average of personal records compromised, grouped by breach type, revealing significant disparities in the impact scale. Hacking incidents demonstrate not only an increasing frequency but also a dramatically expanding scope of impact. The average number of individuals affected by hacking incidents has grown from approximately 20,000 in 2010 to over 160,000 by 2025—an eight-fold increase. This trend indicates that successful cyberattacks are becoming increasingly sophisticated and capable of accessing larger data repositories. Other breach types have maintained relatively constant impact scales, with average affected individuals remaining around 3000 throughout the analysis period. This stability suggests that the scope of physical breaches (theft, loss, unauthorized access) is naturally limited by the physical constraints of the compromised media or access points.
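The eight-fold figure follows directly from the two endpoint averages quoted above. The short sketch below reproduces that arithmetic and, as an added illustration only, converts it into the constant annual growth rate that would connect the two endpoints; the assumption of smooth compound growth is ours and is not a claim about the observed series.

```python
# Figures quoted in the text: average individuals affected per hacking incident.
avg_2010 = 20_000
avg_2025 = 160_000
years = 2025 - 2010  # 15 annual steps between the two endpoints

fold_increase = avg_2025 / avg_2010

# Constant yearly growth rate that would turn avg_2010 into avg_2025
# (illustrative assumption; the observed series need not grow smoothly).
implied_annual_growth = fold_increase ** (1 / years) - 1

print(f"fold increase: {fold_increase:.1f}x")
print(f"implied annual growth: {implied_annual_growth:.1%}")
```

Under that assumption, the implied rate is roughly 15% per year.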
Consistent with the visualization, breaches caused by hacking and IT incidents show a significant trend (coefficient 0.84, p-value < 2.2 × 10^−16 ***). Interestingly, the Theft and Unauthorized types are also significant and increasing, although with much smaller coefficients. Unlike the visualization, the ARIMA results for the median size of breaches show no significant trend for any breach type. This indicates a high amount of noise in the breach-size data, which could originate from measurement errors, inconsistent reports to Health and Human Services, and misattribution of records. These results partially support our hypothesis (H1): they indicate a significant increasing trend in the number of incidents but provide inadequate evidence of an increase in the number of individual records lost per incident. In other words, although the median size of data breach incidents remained unchanged, the frequency of those breaches has increased significantly. These trends suggest that current EHR implementations lack sufficient security controls, thus compromising patient privacy, safety, and hospital operation continuity during a cyberattack.
6.3. Statistical Model Results
To quantify these trends more precisely, we fitted ARIMA models to the data and analyzed the statistical significance of trend coefficients. The statistical analysis confirms our visual observations with high precision. Hacking/IT incidents show the strongest significant upward trend (coefficient 0.84, p-value < 2.2 × 10^−16 ***), indicating robust statistical evidence for the increasing frequency of cyberattacks. Notably, theft and unauthorized access also demonstrate statistically significant increasing trends, though with smaller coefficients (0.633 and 0.492, respectively), suggesting these traditional breach types are also experiencing growth, albeit at lower rates.
Interestingly, the analysis of median breach sizes as shown in
Figure 10 reveals no statistically significant trends across any breach type. This finding contrasts with the clear trends observed in average breach sizes and suggests high variability in breach impact within each category. The lack of significant trends in median values indicates substantial noise in breach size data, which may originate from several sources including measurement errors, inconsistent reporting practices to regulatory bodies, and potential misattribution of affected records.
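The contrast between averages and medians can be illustrated with a small sketch. The breach sizes below are hypothetical values chosen for illustration, not figures from the dataset; they show how a single very large incident moves the mean while leaving the median untouched.

```python
from statistics import mean, median

# Hypothetical breach sizes (individuals affected) for one breach type.
# Most incidents are small; the later period adds a single very large one.
early_period = [500, 800, 1_200, 2_000, 3_500]
late_period = [500, 800, 1_200, 2_000, 3_500_000]

for label, sizes in (("early", early_period), ("late", late_period)):
    # The mean reacts to the extreme value; the median does not.
    print(f"{label}: mean={mean(sizes):,.0f}  median={median(sizes):,.0f}")
```

In this toy case the median is identical in both periods while the mean grows by more than two orders of magnitude, which is the kind of pattern that can produce a visible trend in averages without a corresponding trend in medians.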
6.4. Key Findings and Implications
Cyber threat dominance emerges as the most significant pattern, with hacking/IT incidents becoming the predominant breach type while showing both the highest frequency growth and largest impact scale increases. Traditional breach stability characterizes physical security breaches such as loss and improper disposal, which have remained relatively constant, suggesting effective traditional security controls have been implemented and maintained. Emerging patterns include theft and unauthorized access, showing statistically significant increases, potentially reflecting new attack vectors or improved detection capabilities within organizations. Impact variance reveals high variability in breach sizes within categories, suggesting inconsistent reporting standards and diverse attack sophistication levels across the healthcare sector.
6.5. Data Quality Consideration
The analysis reveals important data quality challenges that affect trend interpretation. The significant noise in breach size measurements, as evidenced by the lack of trends in median values despite clear trends in averages, indicates several potential issues affecting data reliability and interpretation. Reporting inconsistencies manifest through variations in how organizations count and report affected individuals, creating challenges for accurate trend analysis. Detection delays between breach occurrence and discovery may affect size estimations, while attribution challenges create difficulty in accurately attributing records to specific incidents in complex breaches involving multiple systems or attack vectors.
These findings have significant implications for healthcare data security strategies across multiple dimensions. Resource allocation decisions should reflect the dominance of hacking/IT trends, suggesting organizations should prioritize cybersecurity investments over traditional physical security measures while maintaining baseline physical protection. Preparedness planning must account for the increasing scale of cyber incidents, requiring enhanced incident response capabilities and larger-scale breach notification processes to handle the growing impact of successful attacks. Regulatory focus appears warranted given the trend data supporting increased regulatory attention on cybersecurity standards and requirements for healthcare organizations. Industry collaboration becomes increasingly important as the sophisticated nature of increasing cyber threats suggests a need for enhanced information sharing and coordinated defense strategies among healthcare organizations and with government agencies.
6.6. Methodology Notes
This analysis employs LOESS (locally estimated scatterplot smoothing) for trend visualization and ARIMA (Autoregressive Integrated Moving Average) models for statistical trend analysis. The combination of visual and statistical approaches provides both an intuitive understanding and rigorous quantification of observed trends while accounting for the time series nature of the data. The significance levels reported follow standard statistical conventions, with three asterisks indicating p-values less than 0.001, representing extremely strong evidence for the reported trends. This analytical framework ensures both accessibility for stakeholders and statistical rigor for research and policy applications.
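The asterisk notation can be expressed as a small helper, sketched below. Only the three-asterisk threshold (p < 0.001) is stated in the text; the two- and one-asterisk thresholds are assumptions that follow the convention commonly used in statistical software, and the function name is hypothetical.

```python
def significance_stars(p_value: float) -> str:
    """Map a p-value to asterisk notation for reporting trend coefficients."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < 0.001:
        return "***"  # the only threshold stated in the text
    if p_value < 0.01:
        return "**"   # assumed, following common convention
    if p_value < 0.05:
        return "*"    # assumed, following common convention
    return ""         # not significant at the 5% level


print(significance_stars(2.2e-16))
print(significance_stars(0.2))
```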
6.7. Point of Breach Analysis
The analysis of trends for groups of data breaches based on the point of breach provides deeper insights into recent developments in health records security. Understanding where breaches originate within healthcare systems is crucial for developing targeted security strategies and allocating resources effectively to protect patient information.
Figure 11 illustrates the monthly number of data breach incidents during the analysis period for each category of incidents based on the point of breach. The visualization reveals significant patterns in how breach points have evolved over the study period, reflecting the changing landscape of healthcare technology infrastructure and attack methodologies. Note that for this section, we integrated Desktop and Laptop into one category because of the small number of incidents and similarity between them.
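The aggregation behind a monthly incident series, including the merging of the two endpoint categories, might look like the sketch below. The record layout, the category labels, and the sample rows are hypothetical and serve only to illustrate the grouping step.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (report date, point of breach).
# The field layout and category labels are assumptions for illustration.
records = [
    (date(2024, 1, 5), "Network Server"),
    (date(2024, 1, 19), "Email"),
    (date(2024, 1, 23), "Desktop Computer"),
    (date(2024, 2, 2), "Laptop"),
    (date(2024, 2, 11), "Laptop"),
    (date(2024, 2, 27), "Network Server"),
]

# Fold the two small endpoint categories into a single group.
MERGED = {"Desktop Computer": "Desktop/Laptop", "Laptop": "Desktop/Laptop"}


def monthly_counts(rows):
    """Count incidents per (year, month, category) after merging categories."""
    counts = Counter()
    for when, point in rows:
        category = MERGED.get(point, point)
        counts[(when.year, when.month, category)] += 1
    return counts


counts = monthly_counts(records)
for key in sorted(counts):
    print(key, counts[key])
```

Each (year, month, category) count then becomes one observation in the time series for that point of breach.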
The number of breaches that occurred via network servers, email, and electronic health record management systems shows increasing trends. For further investigation, we ran an ARIMA model to test whether the trends were statistically significant. The results are shown in Table 5. Consistent with the visuals, the ARIMA coefficients are statistically significant for all points of breach except the Desktop and Other groups. The largest coefficients belong to the Network Server and Email groups, indicating the increasing usage of these platforms for communication and inappropriate access to health records. Changes in the median size of breach incidents in terms of the number of personal health records are illustrated in
Figure 12. In line with our discussion in the previous section, due to the large noise in the report of the size of data breaches, we cannot identify any meaningful trend in this variable for any point of the breach.
Table 4 provides further evidence of this issue. The results show that, historically, the most prevalent points of vulnerability have been email, network servers, paper/films, and laptops. Among these points of breach, however, the frequency of incidents has increased significantly for email, electronic medical records, network servers, and laptops, but not for the other groups. The median size of breach does not show any significant trends across the different points of breach. These results support H1, indicating that most EHR cybersecurity attacks rely on similar attack methodologies and exploit common vulnerabilities.
The number of breaches that occurred via network servers, email, and electronic health record management systems shows increasing trends throughout the analysis period. Network servers demonstrate the most pronounced upward trajectory, reflecting the increasing centralization of healthcare data storage and the corresponding expansion of attack surfaces as healthcare organizations migrate to digital systems. Email-based breaches also exhibit substantial growth, indicating that email remains a primary vector for both targeted attacks and inadvertent data exposure despite widespread awareness of email security risks.
Electronic health record management systems show a concerning upward trend in breach incidents, which is particularly significant given the central role these systems play in modern healthcare delivery. This trend suggests that while EHR adoption has improved care coordination and efficiency, it has also created new vulnerabilities that attackers are increasingly exploiting.
6.8. Statistical Significance Analysis
To validate these visual observations and quantify the trends more precisely, we applied ARIMA modeling to assess the statistical significance of observed patterns.
Table 5 presents the comprehensive results of this analysis, revealing which trend coefficients represent statistically significant changes rather than random variation. The ARIMA analysis confirms that coefficients for most types of breach points are statistically significant, with notable exceptions being Desktop and Other categories. Network servers exhibit the highest coefficient (0.797, p < 2.2 × 10^−16 ***), indicating the strongest upward trend and highlighting the critical importance of server security in modern healthcare environments. This finding aligns with broader cybersecurity research indicating that centralized data repositories have become primary targets for sophisticated attackers seeking to maximize the impact of successful breaches [
46,
47].
Email breaches show the second-highest coefficient (0.724, p < 2.2 × 10^−16), reflecting the persistent vulnerability of email systems to both technical attacks and social engineering. This trend is consistent with industry reports indicating that email remains one of the most common initial attack vectors in healthcare breaches [
48]. The statistical significance of this trend underscores the need for enhanced email security measures, including advanced threat protection, user training, and secure communication alternatives.
Laptop-related breaches demonstrate a substantial and statistically significant upward trend (coefficient 0.548, p < 2.2 × 10^−16), reflecting the increasing mobility of healthcare workers and the corresponding challenges of securing mobile endpoints. This finding is particularly relevant in the context of increased remote work patterns accelerated by the COVID-19 pandemic, which expanded the attack surface for healthcare organizations significantly.
Electronic Medical Records systems show a moderate but statistically significant increasing trend (coefficient 0.297, p < 0.001), indicating growing targeting of these critical systems. While the coefficient is smaller than those for network servers or email, the statistical significance suggests a consistent pattern of increasing EHR-focused attacks, which is concerning given the centrality of these systems to healthcare operations. Interestingly, Paper/Films breaches also show statistical significance (coefficient 0.268, p < 0.001), suggesting that traditional physical security challenges persist even as organizations digitize their operations. This finding indicates that comprehensive security strategies must continue to address both digital and physical threat vectors.
6.9. Breach Size Analysis
Figure 12 above presents changes in the median size of breach incidents measured by the number of personal health records affected, displayed on a logarithmic scale to accommodate the wide range of breach sizes across different points of breach. The logarithmic transformation helps reveal patterns that might be obscured by the extreme values that characterize large-scale cyber incidents. Consistent with our previous analysis of breach types, the examination of breach sizes by point of breach reveals significant data quality challenges that limit our ability to identify meaningful trends. The high variability in reported breach sizes creates substantial noise that obscures underlying patterns, reflecting the complex challenges organizations face in accurately quantifying the scope of data breaches.
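The effect of the logarithmic scale can be seen in a small sketch. The monthly breach sizes are hypothetical, and the choice of base 10 is an assumption, since the text does not state which base was used.

```python
import math
from statistics import median

# Hypothetical breach sizes (individuals affected) for two months.
sizes_by_month = {
    "2024-01": [650, 1_200, 48_000],
    "2024-02": [900, 5_500, 2_300_000],
}

log_medians = {}
for month, sizes in sizes_by_month.items():
    med = median(sizes)
    # Base-10 log compresses sizes that span several orders of magnitude.
    log_medians[month] = round(math.log10(med), 2)
    print(month, med, log_medians[month])
```

On this scale, sizes ranging from hundreds to millions of records fall within a few units of each other, so extreme incidents no longer dominate the plot.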
Table 6 presents the estimated trend coefficients for the log median size of data breaches for each point of breach. The statistical analysis of median breach sizes confirms the limited presence of significant trends, with only laptop-related breaches showing statistical significance (coefficient −0.203,
p = 0.007). Interestingly, this coefficient is negative, suggesting that while laptop breaches are becoming more frequent, their median size may be decreasing. This pattern could reflect improved detection capabilities leading to earlier discovery of laptop-based breaches, or it might indicate that laptop breaches tend to involve more limited datasets compared to server-based incidents. The lack of significant trends in breach sizes for most categories provides further evidence of the substantial measurement challenges in breach size reporting. These challenges likely stem from several factors, including inconsistent methodology for counting affected individuals, variations in breach discovery timing, and the complex technical challenges of determining the full scope of sophisticated cyberattacks [
4].
6.10. Implications for Healthcare Security Strategy
The analysis reveals critical insights into healthcare security strategy development and resource allocation. The dominance of network server breaches in both frequency and statistical significance indicates that healthcare organizations must prioritize server security infrastructure including robust access controls, network segmentation, and advanced threat detection capabilities.
The persistent growth in email-based breaches suggests that current email security measures are insufficient to address evolving threats. Healthcare organizations should consider implementing advanced email security solutions, including zero-trust architectures, enhanced user authentication, and comprehensive security awareness training programs that specifically address healthcare-relevant attack scenarios. The significant trend in laptop breaches highlights the ongoing challenges of mobile security in healthcare environments. This finding suggests that organizations need robust mobile device management solutions, enhanced endpoint protection, and clear policies governing the use of mobile devices for accessing patient data.
The continued significance of EHR breaches indicates that these critical systems require enhanced security attention despite their central role in care delivery. Healthcare organizations should prioritize EHR security through regular security assessments, robust access controls, and integration with broader security monitoring systems. Even the persistence of paper/film breaches underscores the importance of maintaining comprehensive security programs that address both digital and physical threats. Healthcare organizations cannot focus exclusively on cybersecurity while neglecting traditional physical security measures.
6.11. Data Quality Considerations
The analysis reveals significant data quality challenges that affect our understanding of breach impact patterns. The high noise levels in breach size data suggest several areas where the healthcare industry could improve breach reporting and analysis capabilities. Standardized reporting methodologies would improve the quality and comparability of breach data across organizations and time periods. Currently, variations in how organizations count affected individuals and attribute records to specific incidents create substantial noise in trend analysis.
Enhanced detection and forensic capabilities could improve the accuracy of breach size estimates by providing better tools for determining the actual scope of data compromise. Investment in these capabilities would benefit both individual organizations and industry-wide understanding of breach patterns. Improved incident attribution methods would help distinguish between different types of breaches and improve the accuracy of trend analysis by breach point. Current challenges in definitively attributing breaches to specific systems or attack vectors limit the precision of analytical insights.
The point of breach analysis reveals a healthcare security landscape increasingly dominated by digital threats, with network servers and email emerging as the most significant and rapidly growing attack vectors. While traditional physical security challenges persist, the statistical evidence clearly indicates that healthcare organizations must prioritize digital security infrastructure to address the most pressing and rapidly evolving threats to patient data protection. The persistence of measurement challenges in breach size reporting highlights the need for industry-wide improvements in incident response and forensic capabilities. Enhanced standardization and improved technical capabilities for breach assessment would significantly improve the healthcare industry’s ability to understand and respond to evolving security threats.
7. Discussion
To identify avenues for addressing data security issues within EHRs, it must first be established, understood, and agreed that EHR data must be treated differently and that protecting it must be a priority. EHR data is about people, usually their health, and it therefore calls for dedicated methods, tools, and methodologies to prevent it from reaching the wrong hands or being used for unintended purposes. In addressing the inherent problem of data breaches, the crucial point is that once the confidentiality of patient data is breached and the data is in the public sphere, it cannot be retracted. The effects can be more significant and far-reaching than anticipated. This, again, makes EHR data unique and requires very stringent mechanisms and rules to protect it within the EHR.
This study aimed to investigate the trends and characteristics of data breaches in the U.S. healthcare system, with a specific focus on breach frequency, size, type, and point of compromise. Through a combination of descriptive statistics and time-series modeling, our analysis offers several important insights into the evolving cybersecurity landscape of electronic health records (EHRs). The descriptive analysis presented in Section 5 and Section 6 serves a crucial foundational role in informing the statistical inference and modeling efforts of this study. By visualizing the distribution, frequency, and trends of breach incidents across covered entities, breach types, and points of entry, we identify underlying patterns, outliers, and data characteristics such as skewness and variability. These insights are not merely illustrative but essential in guiding the subsequent use of inferential techniques such as ARIMA modeling. For instance, the consistently increasing frequency of breaches in specific categories, such as Hacking/IT and Network Server incidents, highlighted in the descriptive figures, provided the rationale for modeling time-dependent trends in breach frequency. Additionally, the observed data skewness and variability across groups justify the need for log transformation and trend decomposition in the inferential phase. Thus, the descriptive statistics do not stand alone; they lay the groundwork for robust statistical inference by validating assumptions, informing model selection, and contextualizing the significance of the estimated trends.
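To illustrate why skewed breach sizes motivate a log transformation, the following minimal sketch compares a moment-based skewness measure before and after taking logarithms. The breach sizes are hypothetical values chosen for illustration and are not drawn from the study's data; the helper `sample_skewness` is likewise an assumption of this sketch, not a function used by the authors.

```python
import math
import statistics


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical breach sizes (individuals affected per incident); illustrative only.
breach_sizes = [520, 610, 750, 900, 1200, 1500, 2300, 4800, 15000, 250000, 3100000]

# Log-transform the sizes to compress the long right tail.
log_sizes = [math.log10(size) for size in breach_sizes]

raw_skew = sample_skewness(breach_sizes)
log_skew = sample_skewness(log_sizes)

# The median is far less sensitive to the few very large incidents than the mean.
median_size = statistics.median(breach_sizes)
mean_size = statistics.mean(breach_sizes)

print(f"skewness (raw):   {raw_skew:.2f}")
print(f"skewness (log10): {log_skew:.2f}")
print(f"median size: {median_size}, mean size: {mean_size:.0f}")
```

In this toy sample a handful of very large incidents pulls the mean far above the median, and the log-transformed values are noticeably less skewed, which is the property that makes them better suited to trend modeling.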
The descriptive findings revealed that most breach incidents involved a relatively small number of individual records, with distributions heavily skewed toward zero across most categories. However, exceptions were noted in the Hacking/IT incident type and in breaches involving network servers and email, categories that showed more frequent and higher-volume breaches. These results suggest that while most breaches may be minor in scale, a small but growing subset poses significant risk due to the large number of patient records compromised.
Trend analysis using ARIMA modeling confirmed that the frequency of Hacking/IT-related breaches has significantly increased over the past decade, with the average number of individuals affected by such incidents growing substantially. This trend highlights a clear shift in the cybersecurity threat landscape, where attackers are increasingly targeting large-scale systems such as hospital servers and email platforms. Similarly, breaches through network servers and emails have shown statistically significant upward trends, indicating a growing vulnerability in these critical points of EHR infrastructure.
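As a rough illustration of how a time-dependent trend in breach frequency can be estimated, the sketch below fits the simplest member of the ARIMA family, ARIMA(0,1,0) with drift (a random walk with drift), to log-transformed annual counts. The model order actually used in the study is not stated in this section, so this choice is an assumption, and the annual counts are hypothetical. In practice, a statistical package would be used to select the order and to compute proper inference.

```python
import math
import statistics

# Hypothetical annual counts of Hacking/IT breach incidents; illustrative only.
annual_counts = {
    2016: 110, 2017: 145, 2018: 160, 2019: 270, 2020: 400,
    2021: 520, 2022: 555, 2023: 540, 2024: 600,
}

years = sorted(annual_counts)
log_counts = [math.log(annual_counts[year]) for year in years]

# Differencing step (the "I" in ARIMA with d = 1): year-over-year change in log counts.
diffs = [later - earlier for earlier, later in zip(log_counts, log_counts[1:])]

# For ARIMA(0,1,0) with drift, the drift estimate is the mean of the first differences.
drift = statistics.mean(diffs)

# Rough t-statistic for the drift; with so few points it is only indicative.
std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_stat = drift / std_err

# One-step-ahead forecast, back-transformed from the log scale.
forecast_next_year = math.exp(log_counts[-1] + drift)

print(f"estimated drift (log scale): {drift:.3f}")
print(f"approximate annual growth:   {math.expm1(drift):.1%}")
print(f"t-statistic for drift:       {t_stat:.2f}")
print(f"forecast for {years[-1] + 1}: {forecast_next_year:.0f} incidents")
```

A positive drift on the log scale corresponds to multiplicative year-over-year growth in incident counts, which is the kind of upward trend described above.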
Interestingly, while the frequency of breaches has increased across several categories, the median size of breach incidents has not shown a significant upward trend. This divergence suggests that although breaches are becoming more frequent, the number of records affected in each incident remains relatively stable, possibly due to reporting inconsistencies, measurement errors, or mitigation efforts that limit breach scope. This finding partially supports our hypothesis (H0), indicating a significant increase in the number of breach incidents, but not in their median size.
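The distinction between breach frequency and median breach size can be made concrete with a small per-year summary. The sketch below uses hypothetical (year, individuals affected) records, not the study's data, to show how incident counts can rise while the median size stays roughly flat and the mean is driven by a few very large incidents.

```python
from collections import defaultdict
import statistics

# Hypothetical (year, individuals_affected) breach records; illustrative only.
incidents = [
    (2019, 800), (2019, 1500), (2019, 40000),
    (2020, 700), (2020, 1200), (2020, 2500), (2020, 900000),
    (2021, 600), (2021, 1100), (2021, 1400), (2021, 3000), (2021, 2500000),
]

# Group breach sizes by reporting year.
sizes_by_year = defaultdict(list)
for year, size in incidents:
    sizes_by_year[year].append(size)

# Per-year frequency (count), median size, and mean size.
summary = {
    year: {
        "count": len(sizes),
        "median": statistics.median(sizes),
        "mean": statistics.mean(sizes),
    }
    for year, sizes in sorted(sizes_by_year.items())
}

for year, stats in summary.items():
    print(f"{year}: count={stats['count']}, "
          f"median={stats['median']:.0f}, mean={stats['mean']:.0f}")
```

Reporting the median alongside the count keeps a single very large incident from dominating the picture of typical breach size.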
The implications of these findings are substantial. First, the increasing trend in hacking and network-based breaches signals the need for healthcare organizations to prioritize investments in cybersecurity, particularly in email security, server protections, and intrusion detection systems. Second, the lack of growth in breach size may reflect improvements in containment practices or reporting inconsistencies that warrant further investigation. Third, the concentration of breaches among certain covered entities, particularly healthcare providers and business associates, underscores potential policy gaps in vendor and third-party risk management.
Taken together, our results provide empirical evidence that EHR data breaches are not only becoming more frequent but are increasingly associated with digital attack vectors. These patterns raise critical concerns about the adequacy of current security protocols and call for a reevaluation of regulatory standards, staff training, and IT infrastructure in the healthcare sector.
The first contribution of this work is a descriptive analysis of PHI breach data that emphasizes the individual covered entities and the impact of cyberattack breaches. Such information helps other researchers understand the data breach risks associated with each covered entity and the targeted solutions that can be applied. Similarly, these entities can use this work to understand where within their infrastructure they should spend their limited security budget to address risks. Overall, the detailed analysis of current health data breaches demonstrates the common modes of attack and the most frequently breached assets within the EHR infrastructure, allowing health entities to invest in solutions that focus on the identified areas.
Second, the analysis of the frequency of breach types and points of breach is important for understanding the most common breach types and the methods used by adversaries. This contribution allows stakeholders within the healthcare domain to understand the controls needed to address the most frequent breach types with the greatest impact. Such information allows organizations to prioritize risks and the effort required to address them. Descriptive and trend analysis is used to describe, demonstrate, and summarize data points, and also to predict the direction of EHR data breaches based on current and historical data from covered entities, allowing other researchers to build on our work.
8. Conclusions
In this work, we demonstrated that electronic health record (EHR) data breaches raise severe concerns about patients’ privacy and safety, as well as the risk of loss for healthcare entities responsible for managing patient health records. Using reported United States Health Insurance Portability and Accountability Act (HIPAA) privacy and security breach data, this exploratory study of cybersecurity attacks on integrated EHR systems shows, through descriptive and trend analysis, that breaches caused by hacking and IT incidents follow a significant trend (coefficient 0.84, p-value < 2.2 × 10^−16 ***) over the data collection period. The findings indicate that the number of individual records per breach incident is skewed toward zero across all categories of covered entities, and that healthcare providers consistently report the highest number of breaches. Further, the trend is increasing, with the number of breach incidents attributed to “Hacking/IT” increasing consistently from 2010 to 2025. The analysis indicates that some EHR implementations lack sufficient security controls to guarantee patient privacy, safety, and hospital operation continuity during a cyberattack. It also indicates that attacks on integrated EHR systems rely on similar attack methodologies and that these systems face common vulnerabilities. The reliability of this exploratory work was confirmed by retesting and reanalyzing the HIPAA breach data, which produced results consistent with the initial analysis. Notably, across all categories the number of incidents peaked in 2022, declined in 2023, and then began rising again, as shown in Figure 4 and Figure 6.
Based on the findings of this study, there are several important implications for healthcare organizations and policymakers. First, the analysis underscores the urgent need to treat electronic health record (EHR) data as a uniquely sensitive and high-risk asset. Unlike other forms of data, once personal health information is breached and exposed to the public, the consequences are irreversible and potentially far-reaching, affecting not just individual privacy but also public trust in healthcare systems. As such, healthcare organizations must prioritize the implementation of more stringent, proactive security measures to prevent unauthorized access and mitigate the risk of cyberattacks. The study’s descriptive and trend analysis of HIPAA-reported breaches reveals that most incidents stem from consistent and predictable attack methods, particularly hacking and IT incidents, suggesting that many healthcare entities face common vulnerabilities. This insight provides a roadmap for organizations to make data-driven, risk-based decisions in allocating limited cybersecurity resources toward the most vulnerable areas of their EHR infrastructure.
For policymakers, the findings emphasize the need to strengthen regulatory oversight and enforce standardized security controls that address the unique challenges of integrated EHR systems. Additionally, given the study’s limitation regarding the completeness of breach reporting, there is a clear need for the development of automated and mandatory reporting mechanisms to ensure accurate national breach data. Ultimately, both healthcare leaders and regulators must work collaboratively to adopt targeted solutions, enhance breach reporting transparency, and implement adaptive security frameworks that evolve alongside technological advancements in healthcare.
The main limitation of this work relates to the authors’ inability to validate whether organizations are reporting all data breaches to the U.S. Department of Health and Human Services. As such, future work should evaluate and explore automated breach reporting options to ensure a level of accurate data reporting.