Enhancing Microsoft 365 Security: Integrating Digital Forensics Analysis to Detect and Mitigate Adversarial Behavior Patterns
Abstract
1. Introduction
1.1. Problem Definition
1.2. Terms and Definitions
1.2.1. Exposure to Data Breaches
1.2.2. Known Compromised Data Breaches
1.2.3. Compromised Email Addresses across All Known Data Breaches
1.3. Scope
1.4. Research Question and Hypothesis
- (RQ1). How effective are DFA techniques in identifying patterns and trends in malicious failed login attempts in M365 environments?
1.5. Significance of the Research
2. Materials and Methods
2.1. Methodology
2.2. Data Collection
2.3. Data Preprocessing Methods and Procedures
2.3.1. Elastic Extraction of M365 Login Information
- Timestamp;
- User ID;
- Source IP address (attacker);
- Action (login attempt);
- Result (outcome of the login attempt).
2.3.2. Windows PowerShell Extraction of M365 Account Information
- User ID;
- Account enabled (Y/N);
- Blocked login (Y/N);
- User type (List);
- Licensed (Y/N);
- Mailbox type (List);
- MFA enabled (Y/N).
2.3.3. Exploration and Identification of Public Data Breaches
Have I Been Pwned
BreachAlarm
Firefox Monitor
Identity Leak Checker
DeHashed
2.3.4. Consolidation of Public Data Breach Information
2.3.5. Anonymization and Transforming of Data
- Anonymizing all personal data that could identify a user or an organization by assigning a randomly generated four-digit number.
- Removing irrelevant entries.
- Normalizing timestamp formats.
- Breaking out timestamp information and converting it to a numerical format.
- Converting the source IP address to a numerical format.
- Transforming any remaining categorical data (Action and Result) into numerical representations.
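The steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the record field names, the sample values, the `preprocess` helper, the choice of timestamp components, and the integer coding of Action and Result are all hypothetical.

```python
import ipaddress
import random
from datetime import datetime, timezone

# Hypothetical raw records; field names and values are illustrative only.
RAW = [
    {"timestamp": "2023-01-05T13:45:10Z", "user_id": "alice@example.org",
     "source_ip": "203.0.113.7", "action": "Login", "result": "UserLoginFailed"},
    {"timestamp": "2023-01-05T13:46:02Z", "user_id": "bob@example.org",
     "source_ip": "198.51.100.20", "action": "Login", "result": "UserLoggedIn"},
    {"timestamp": "2023-01-06T02:10:00Z", "user_id": "alice@example.org",
     "source_ip": "203.0.113.7", "action": "Login", "result": "UserLoginFailed"},
    {"timestamp": "2023-01-06T02:11:00Z", "user_id": "",
     "source_ip": "203.0.113.7", "action": "Login", "result": "UserLoginFailed"},
]
REQUIRED = ("timestamp", "user_id", "source_ip", "action", "result")


def preprocess(records, seed=0):
    """Anonymize user IDs and convert every field to a numeric value."""
    rng = random.Random(seed)
    pseudonyms = {}                       # user ID -> random four-digit number
    used = set()
    codes = {"action": {}, "result": {}}  # category label -> integer code
    rows = []
    for rec in records:
        # Remove entries that lack any required field.
        if any(not rec.get(field) for field in REQUIRED):
            continue
        uid = rec["user_id"]
        if uid not in pseudonyms:
            candidate = rng.randint(1000, 9999)
            while candidate in used:      # keep pseudonyms unique
                candidate = rng.randint(1000, 9999)
            used.add(candidate)
            pseudonyms[uid] = candidate
        # Normalize the timestamp to UTC, then break it into numeric parts.
        ts = datetime.fromisoformat(rec["timestamp"].replace("Z", "+00:00"))
        ts = ts.astimezone(timezone.utc)
        row = {
            "user": pseudonyms[uid],
            "year": ts.year, "month": ts.month, "day": ts.day,
            "hour": ts.hour, "minute": ts.minute,
            # An IPv4 address maps to one 32-bit integer (IPv6 to 128 bits).
            "ip": int(ipaddress.ip_address(rec["source_ip"])),
        }
        for field in ("action", "result"):
            mapping = codes[field]
            row[field] = mapping.setdefault(rec[field], len(mapping))
        rows.append(row)
    return rows, codes


rows, codes = preprocess(RAW)
```

Keeping the pseudonym table in memory only, and discarding it after the run, is one way to ensure that the numeric IDs cannot be mapped back to real accounts.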
2.3.6. Final Preprocessing
2.4. Pattern Recognition Techniques
2.4.1. Correlation Analysis
2.4.2. Clustering Analysis
2.4.3. Association Rule Mining
2.5. Validation
3. Results
3.1. Introduction to Results
3.2. Data Collection and Preprocessing Results
3.2.1. Demographic Distribution Analysis of Known Data Breaches
- The organizations’ email domains were involved in sixty-nine known compromised data breaches.
- The organizations involved in the study have 2968 valid email addresses, which were used to determine their exposure to data breaches.
- Out of valid email addresses, 1530 unique accounts were found to have compromised email addresses across known data breaches.
- 485 (16.341%) matching email addresses were found in both the list of valid organizational email addresses and the list of known compromised data breaches, suggesting a significant security concern for the organizations. These identified accounts were utilized for the study.
- 956 (32.210%) fake or spoofed email addresses were identified in the breaches. Although these email addresses were not valid organizational email addresses, they represent potential threats to the organization’s email security. These identified accounts were excluded from the study.
- 89 (2.999%) user IDs were excluded for not having complete or enough valid organizational information or email addresses. These identified accounts were also excluded from the study.
- A total of 3925 compromised email addresses were used in the data breaches, indicating that some individuals experienced multiple breaches.
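The percentages in this list are consistent with a denominator of 2968 valid email addresses, and the three groups sum to the 1530 unique accounts. A short check of that arithmetic follows; the variable names are illustrative.

```python
valid_addresses = 2968            # valid organizational email addresses
groups = {
    "matched": 485,               # used for the study
    "spoofed": 956,               # fake or spoofed, excluded
    "incomplete": 89,             # incomplete information, excluded
}


def share(count, total=valid_addresses):
    """Percentage of `total`, rounded to three decimals as in the list."""
    return round(100 * count / total, 3)


unique_compromised = sum(groups.values())
shares = {name: share(count) for name, count in groups.items()}
```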
3.2.2. Retrospective Analysis
- Historical learning: Analyzing previous data breaches yields insights into the methodologies employed by cybercriminals. This knowledge helps identify patterns and trends, which can then be used to strengthen current and future cybersecurity strategies.
- Vulnerability identification: The examination of specifics from past breaches, such as the types of data compromised, enables common vulnerabilities exploited by attackers to be pinpointed. This information can guide organizations in directing their resources and efforts toward protecting against similar vulnerabilities.
- Relationship establishment: Computing correlation coefficients between pairs of datasets where a user’s ID was compromised facilitated the understanding of the relationships between these breaches. This process is key in determining whether these breaches are isolated incidents or parts of broader, interconnected cyberattack patterns.
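The pairwise correlation step can be illustrated with a small sketch. It assumes that each breach is represented as a binary vector over user IDs (1 if the user appears in the breach) and that the Pearson coefficient is used; the outline does not state which coefficient the study applied, and the breach names and user IDs below are made up.

```python
from itertools import combinations
from math import sqrt

# Hypothetical exposure data: breach name -> set of anonymized user IDs.
exposure = {
    "BreachA": {1001, 1002, 1003, 1004},
    "BreachB": {1001, 1002, 1003, 1005},
    "BreachC": {1005, 1006},
}
users = sorted(set().union(*exposure.values()))


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0                # a constant vector carries no signal
    return cov / sqrt(var_x * var_y)


# One 0/1 indicator vector per breach, aligned on the same user order.
vectors = {name: [1 if u in ids else 0 for u in users]
           for name, ids in exposure.items()}

# Every breach pair, strongest positive correlation first.
pairs = sorted(
    ((pearson(vectors[a], vectors[b]), a, b)
     for a, b in combinations(vectors, 2)),
    reverse=True,
)
```

A strongly positive coefficient means the two breaches tend to expose the same users, while a strongly negative one means their user populations barely overlap.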
3.2.3. Data Preprocessing of Datasets
- Malicious failed login attempts (Dataset 1):
  - Dataset 1 (D1) consisted of 2,025,493 failed login attempts from 60,209 unique source IPs across 176 countries.
  - In this dataset, the most frequent outcome of the login action was “UserLoginFailed”, as expected.
  - 449 unique user IDs were identified for the study.
- Successful, legitimate logins (Dataset 2):
  - Dataset 2 (D2) contained 253,148 successful login attempts that originated from 8990 unique source IPs across 99 countries.
  - In this dataset, the most frequent outcome of the login action was “UserLoggedIn”, as expected.
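Summary figures of this kind (attempt counts, unique source IPs, unique user IDs, most frequent outcome) can be derived with a few set and counter operations. The sketch below uses made-up events and a hypothetical `summarize` helper; the country counts are omitted because they would require an IP geolocation source that the outline does not name.

```python
from collections import Counter

# Hypothetical (user, source_ip, result) tuples standing in for log rows.
events = [
    (1001, "203.0.113.7", "UserLoginFailed"),
    (1001, "203.0.113.7", "UserLoginFailed"),
    (1002, "198.51.100.20", "UserLoginFailed"),
    (1003, "192.0.2.44", "UserLoggedIn"),
]


def summarize(rows):
    """Return basic descriptive statistics for a list of login events."""
    results = Counter(result for _, _, result in rows)
    return {
        "attempts": len(rows),
        "unique_ips": len({ip for _, ip, _ in rows}),
        "unique_users": len({user for user, _, _ in rows}),
        "top_result": results.most_common(1)[0][0],
    }


# Split the events the way D1 and D2 are split, by login outcome.
failed = [e for e in events if e[2] == "UserLoginFailed"]
succeeded = [e for e in events if e[2] == "UserLoggedIn"]
d1_summary = summarize(failed)
d2_summary = summarize(succeeded)
```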
3.3. Pattern Recognition Results
3.3.1. Correlation Analysis Results
3.3.2. Clustering Analysis Results
Descriptive Statistics of the Clusters Showed
- Cluster 1 had the largest number of user IDs, with 215.
- Cluster 2 had a moderate number of user IDs, with 117.
- Clusters 3, 4, and 5 had the fewest user IDs, with 52, 60, and 56, respectively.
Analysis of the Combined Cluster Matrix Revealed Several Key Findings
- A significant proportion of user IDs were associated with multiple data breaches, indicating that users are often exposed to multiple threats.
- Some data breaches were more prevalent across user IDs, suggesting that certain breaches have a wider-reaching impact on user exposure.
- The distribution of user IDs among clusters varied, with some clusters having a higher concentration of users exposed to specific data breaches.
- Relationships between clusters and data breaches were observed, with certain clusters being more strongly associated with specific data breaches.
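Grouping user IDs by breach exposure can be sketched as follows. The outline does not name the clustering algorithm, so this example assumes plain k-means (Lloyd's algorithm) over binary user-by-breach vectors; the data, the value of k, and the deterministic initialization are illustrative choices only.

```python
from collections import Counter

# Hypothetical user-by-breach matrix: one row per user, one column per breach.
user_ids = [1001, 1002, 1003, 1004, 1005]
matrix = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Lloyd's k-means; initial centroids are the first k distinct points."""
    centroids = []
    for p in points:
        if list(p) not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break                     # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(matrix, k=2)
cluster_sizes = Counter(labels)
```

Each centroid column then reads as the fraction of the cluster's users exposed to that breach, which is one way to relate clusters to specific breaches.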
Appendix A and Appendix B Analysis of the Cluster Matrix
- Cluster 1 contained most of the dataset and likely represented those users who have experienced the most severe security incidents or breaches.
- Cluster 2 represents users who have experienced more significant security incidents or breaches.
- Cluster 3 represents users who have experienced security incidents or breaches related to specific industries or regions.
- Cluster 4 represents users who have experienced some security incidents but are not as significant as those in other clusters.
- Cluster 5 represents users who have not experienced any significant data breaches or security incidents.
3.3.3. Association Rule Mining Results
- Rules 2, 3, and 5 have similar confidence, lift, and Zhang’s Metric values, suggesting that these rules also have a strong positive association between the antecedents and consequents.
- Rule 4 has slightly lower confidence but still presents a high lift, and Zhang’s Metric also indicates a strong positive association.
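For reference, the three metrics cited for these rules can be computed from itemset supports. The sketch below uses the common definitions: confidence = sup(A∪C)/sup(A), lift = confidence/sup(C), and Zhang's metric = (sup(A∪C) − sup(A)·sup(C)) / max(sup(A∪C)·(1 − sup(A)), sup(A)·(sup(C) − sup(A∪C))). The transactions and TTP labels are hypothetical and do not reproduce the study's rules.

```python
# Hypothetical transactions: the set of TTPs observed for each user ID.
transactions = [
    {"phishing", "credential_stuffing"},
    {"phishing", "credential_stuffing"},
    {"phishing", "credential_stuffing", "password_spraying"},
    {"credential_stuffing"},
    {"brute_force"},
]


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift, and Zhang's metric for antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    sup_a = sum(a <= t for t in transactions) / n
    sup_c = sum(c <= t for t in transactions) / n
    sup_ac = sum((a | c) <= t for t in transactions) / n
    if sup_a == 0 or sup_c == 0:
        raise ValueError("antecedent and consequent must both occur")
    confidence = sup_ac / sup_a
    lift = confidence / sup_c
    denom = max(sup_ac * (1 - sup_a), sup_a * (sup_c - sup_ac))
    zhang = (sup_ac - sup_a * sup_c) / denom if denom else 0.0
    return {"support": sup_ac, "confidence": confidence,
            "lift": lift, "zhang": zhang}


metrics = rule_metrics(transactions, {"phishing"}, {"credential_stuffing"})
```

Zhang's metric ranges from −1 to 1; values near 1 indicate a strong positive association, values near −1 indicate dissociation, and 0 indicates independence.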
3.3.4. APT Groups and Data Breaches Results
- APT28 (Fancy Bear) was linked to the LinkedIn breach, using spear-phishing and exploiting software vulnerabilities to compromise millions of user accounts.
- The Syrian Electronic Army (SEA) was suspected of being behind the Twitter breach, leveraging social engineering tactics and stolen credentials to gain unauthorized access.
- APT29 (Cozy Bear) was connected to the Dropbox breach, using advanced malware and lateral movement techniques to maintain persistence and exfiltrate data.
3.3.5. Proposed Future Exploratory Analysis
3.4. Validation Results
3.5. Summary of Results
- Correlation analysis results: 98 meaningful correlations were identified. The ten most strongly correlated breach pairs suggest shared characteristics, patterns, or vulnerabilities between the breaches.
- Clustering analysis results: The analysis grouped user IDs based on their similarity in breach characteristics, revealing differing risks of compromise. It also showed relationships between clusters and data breaches, providing insights into specific threats and vulnerabilities.
- Association rule mining results: The analysis identified relationships between breach pairs and TTPs, uncovering patterns within security logs and helping to better understand the tactics employed by malicious actors.
3.5.1. Combination Results
Pattern Recognition Results
Demographic Distribution Summary Results
APT Groups and Data Breaches Results
3.5.2. Validation Results
4. Discussion Section
4.1. Introduction
4.2. Interpretation of Results
4.2.1. Pattern Recognition Results Interpretation
Brute Force Attacks and Credential Stuffing
Targeted Accounts and High-Value Users
Inactive and Disabled Accounts
4.2.2. Interpretation of Results for RQ1 and H1
Enhancement of Threat Intelligence
Prioritization of Vulnerability Management
Development of Incident Response Playbooks
Augmentation of User Awareness
Sharing of Findings and Collaboration
4.3. Comparison to Previous Research
4.4. Practical Implications and Recommendations
4.4.1. Implications and Impact
Develop More Proactive and Targeted Cybersecurity Strategies
Enhance Threat Detection and Response Capabilities
Strengthen Overall Cybersecurity Posture
Foster Collaboration and Information Sharing
4.4.2. Limitations and Future Research
- Scope of data: The research focused on malicious failed login attempts and their connections to public data breaches and TTPs in M365 tenants. Consequently, the findings may not be generalizable to other cloud-based platforms or cybersecurity contexts.
- Data collection period: The data analyzed in this study were collected over a specific time frame. As cyber threats continuously evolve, further research should be conducted periodically to ensure the relevance and effectiveness of the proposed strategies.
- Human factors: While the importance of addressing human factors in cybersecurity was discussed, the research did not thoroughly explore the psychological, social, and organizational aspects that may contribute to the observed patterns of malicious failed login attempts. Future research could investigate these dimensions more comprehensively.
- Insider threats: The study primarily focused on external threats associated with malicious failed login attempts. Future research could expand the scope to include insider threats and investigate potential links between internal and external threat actors.
- Causality: The relationships observed in the study are correlational and do not necessarily imply causality. Future research could employ experimental or longitudinal designs to better understand the causal relationships between malicious failed login attempts, public data breaches, and TTPs.
- Mitigation strategies: The research focused on analyzing and understanding the relationships between malicious failed login attempts, public data breaches, and TTPs, rather than proposing specific mitigation strategies. Future research could build on these findings to develop more targeted and effective cybersecurity defenses.
5. Conclusions
5.1. Summary of Main Findings
- A significant relationship exists between malicious failed login attempts in M365 tenants and known public data breaches or compromised email addresses.
- Digital forensics techniques effectively analyze M365 security logs, identifying patterns and trends in failed malicious login attempts linked to public data breaches or compromised email addresses.
- APT data integration enhances the detection of potential sources of failed malicious logins in M365 tenants and informs the development of proactive cybersecurity strategies.
- The study used association rule mining to reveal patterns within the security logs, highlighting the frequent co-occurrence of specific TTPs employed by malicious actors.
- Top association rules revealed in the study show strong relationships between multiple combinations of the identified TTPs. Security teams can use this information to identify patterns and trends in malicious login attempts and develop targeted mitigation strategies.
- Correlation analysis demonstrated the potential of using breach and APT data to detect potential sources of failed malicious logins and inform proactive cybersecurity strategy development. Significant correlations were found between different breaches.
- Cluster analysis identified distinct user ID clusters with varying risk levels, helping organizations prioritize defenses and allocate resources against relevant threats.
5.2. Contributions to the Field
- Determining that a significant relationship exists between malicious failed login attempts in M365 tenants and known public data breaches or compromised email addresses.
- Demonstrating the effectiveness of digital forensics techniques in analyzing M365 security logs and identifying malicious login attempt patterns.
- Providing insights into the TTPs employed by threat actors in M365 cyberattacks.
5.3. Practical Implications
- Enhanced detection and mitigation of malicious failed login attempts by leveraging digital forensics techniques.
- Improved understanding of the threat landscape, enabling organizations to adopt a proactive stance toward cybersecurity.
- Targeted allocation of resources and prioritization of defenses against the most relevant threats based on the identified patterns and trends.
5.4. Potential Areas for Future Research Include
- Examining the role of artificial intelligence and automation in enhancing the analysis of M365 security logs.
- Exploring the impact of new cybersecurity policies, regulations, or industry standards on mitigating M365 cyberattacks and developing proactive cybersecurity strategies.
5.5. Regarding Future Research Directions
- Cybercriminal psychology: A more profound investigation into the psyche of cybercriminals could reveal their motivations, decision-making patterns, and behavioral tendencies. Understanding these psychological aspects could potentially improve predictive capabilities and inform more effective preventative measures against future attacks.
- Broadening the analysis scope: This includes extending the exploration to other cloud services and platforms, aiming to gather a more comprehensive understanding of adversarial behavior patterns and TTPs in various digital environments.
- Longitudinal data analysis: This approach involves analyzing data across extended periods to uncover evolving trends and shifts in threat actor tactics. The insights gathered would enrich our understanding of the constantly transforming cyber threat landscape.
- Delving into mitigation strategies: A deeper investigation into the efficacy of mitigation strategies against the identified TTPs can provide actionable recommendations for organizations, helping to fortify their cybersecurity defenses.
- Experimental research: Research involving controlled experiments or simulations could be beneficial to evaluate the effectiveness of diverse countermeasures and their impact on diminishing the risk of successful cyberattacks.
- Process modeling: Efforts could be directed towards creating a more systematic and replicable description of the process that leads to data breaches. The challenge in this endeavor arises from the varying methodologies employed by cybercriminals. However, developing such a process model could offer valuable insights into the dynamics of these cyberattacks, subject to the data’s limitations.
5.6. Final Thoughts
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
A.1. Demographic Distribution Summary of Known Data Breaches
- Finance and Insurance: 35%;
- Healthcare: 22%;
- Technology: 16%;
- Retail: 12%;
- Manufacturing: 10%;
- Other industries: 5%.
A.1.1. Distribution of Breaches across Industries
- Business and Data Services: Adapt, Apollo, B2B USA Businesses, Data-Leads, Elasticsearch Instance of Sales Leads on AWS, Exactis, Factual, Lead Hunter, NetProspex, Verifications.io, Whitepages;
- Technology Platforms: Adobe, Animoto, Bitly, Canva, Chegg, Disqus, Dropbox, Edmodo, Emotet, Epik, LinkedIn, LiveAuctioneers, LiveJournal, Modern Business Solutions, mSpy, MyFitnessPal, Nitro, QuestionPro, ShareThis, SlideTeam, Stratfor, Ticketfly, Twitter, Zomato, Zynga;
- Retail and E-commerce: Bonobos, CafePress, Covve, Drizly, EatStreet, Evite, Fling, Gaadi, Gravatar, HauteLook, Houzz, Justdate.com, Minted, MMG Fusion, River City Media Spam List;
- Automotive: Audi;
- Gaming and Entertainment: ArmorGames, Digimon, Forbes, Straffic, Zynga;
- Adult Content: Fling;
- Health and Fitness: MyFitnessPal;
- Education: Chegg, Edmodo;
- Social Media and Networking: Disqus, Gravatar, LinkedIn, LiveJournal, Twitter;
- Food and Beverage: Drizly, EatStreet, Zomato.
A.1.2. Timeline of Data Breaches
- 2011: Fling, Stratfor, LinkedIn;
- 2012: Adobe, Dropbox, Disqus, LinkedIn, Twitter;
- 2013: Adobe, Gravatar;
- 2014: Bitly, Digimon, Forbes, LiveJournal;
- 2015: Gaadi, mSpy;
- 2016: Anti Public Combo List, Exploit.In, NetProspex, Lead Hunter, Modern Business Solutions;
- 2017: Edmodo, Factual, Onliner Spambot, River City Media Spam List, Trik Spam Botnet, Zomato;
- 2018: Adapt, Animoto, Apollo, Bitly, Chegg, Covve, Dropbox, HauteLook, Houzz, MyFitnessPal, Straffic;
- 2019: CafePress, Canva, EatStreet, Elasticsearch Instance of Sales Leads on AWS, Evite, Exactis, LiveAuctioneers, River City Media Spam List, ShareThis, Verifications.io, Whitepages;
- 2020: Audi, Bonobos, Covve, Data Enrichment Exposure from PDL Customer, Drizly, HauteLook, LiveAuctioneers, Nitro, ParkMobile;
- 2021: B2B USA Businesses, Data-Leads, Epik, MeetMindful, Minted, MMG Fusion, QuestionPro, SlideTeam.
- Largest breaches (over 100 million records): Adobe, Canva, Collection #1, Evite, Exactis, LinkedIn, River City Media Spam List, Verifications.io;
- Medium-sized breaches (10 million–100 million records): Adapt, Apollo, Bitly, CafePress, Chegg, Disqus, Dropbox, Edmodo, Houzz, MyFitnessPal, NetProspex, Nitro, ParkMobile, ShareThis, Zomato, Zynga;
- Smaller breaches (1 million–10 million records): Animoto, Audi, Bonobos, Covve, Data-Leads, Drizly, EatStreet, Emotet, Epik, Factual, Fling, Forbes, Gaadi, Gravatar, HauteLook, Lead Hunter, LiveAuctioneers, LiveJournal, mSpy, Minted, MMG Fusion, Modern Business Solutions, QuestionPro, SlideTeam, Stratfor, Ticketfly, Twitter;
- Minor breaches (under 1 million records): ArmorGames, B2B USA Businesses, Digimon, Elasticsearch Instance of Sales Leads on AWS, Onliner Spambot, Trik Spam Botnet, Whitepages.
A.1.3. Types of Data Compromised
- Email addresses;
- Hashed and plaintext passwords;
- Usernames;
- Names;
- Physical addresses;
- Phone numbers;
- Date of birth;
- Social media profiles;
- Personal preferences;
- Payment information;
- Health and fitness data.
A.1.4. Potential TTPs (Tactics, Techniques, and Procedures) Employed in Attacks
- Phishing campaigns;
- Social engineering;
- Exploiting unpatched vulnerabilities;
- Credential stuffing;
- Password spraying;
- Brute force attacks;
- SQL injection;
- Malware infections;
- Third-party service compromises;
- Insider threats;
- Advanced Persistent Threats (APTs);
- Supply chain attacks.
Appendix B
- Adapt: A business data provider suffered a data breach in 2018, exposing approximately 9.3 million records, including email addresses, personal information, and business data.
- Adobe: In 2013, Adobe experienced a significant data breach, compromising about 153 million user records, including email addresses, passwords, and password hints.
- Animoto: A video creation platform breached in 2018, compromising 25 million user records, including email addresses and hashed passwords.
- Anti Public Combo List: A compilation of data breaches discovered in 2016, compromising over 458 million email addresses, usernames, and plaintext passwords.
- Apollo: A sales engagement platform suffered a data breach in 2018, compromising 125 million records, including email addresses, names, and job titles.
- Audi: A 2020 data breach at Audi and Volkswagen impacted 3.3 million customers, exposing email addresses, phone numbers, and vehicle identification numbers (VINs).
- B2B USA Businesses: In 2021, a database containing 63 million records from various B2B USA companies was leaked, including email addresses and other personal information.
- Bitly: The URL shortening service experienced a data breach in 2014, leading to the compromise of email addresses, encrypted passwords, and API keys.
- Bonobos: The men’s clothing retailer suffered a data breach in 2020, exposing approximately 70 GB of data, including 7 million email addresses and other personal information.
- CafePress: In 2019, CafePress experienced a data breach, compromising 23 million user records, including email addresses and password hashes.
- Canva: A graphic design platform breached in 2019, exposing 137 million user records, including email addresses and bcrypt-hashed passwords.
- Chegg: An education technology company suffered a data breach in 2018, compromising 40 million records, including email addresses, usernames, and hashed passwords.
- Cit0day: In 2020, a collection of 23,000 breached databases was leaked, containing billions of records, including email addresses, usernames, and plaintext passwords.
- Collection #1: A massive data breach compilation discovered in 2019, consisting of over 770 million unique email addresses and over 21 million unique passwords.
- CouponMom-ArmorGames: A data breach in 2020 affected both CouponMom and ArmorGames, compromising 11 million records, including email addresses and plaintext passwords.
- Covve: A data breach in 2020 exposed the records of 22 million users, including email addresses, names, phone numbers, and LinkedIn profiles.
- Data Enrichment Exposure From PDL Customer: In 2019, a security lapse at People Data Labs (PDL) exposed 622 million records, including email addresses and other personal information.
- Data-Leads: In 2021, a data breach compromised 63 million records from various B2B companies, including email addresses, names, and phone numbers.
- Digimon: An unofficial forum for Digimon fans was hacked in 2014, compromising 4.9 million records, including email addresses, usernames, and IP addresses.
- Disqus: A blog comment hosting service breached in 2012, resulting in the exposure of 17.5 million user records, including email addresses, usernames, and hashed passwords.
- Drizly: An alcohol delivery platform experienced a data breach in 2020, compromising 2.5 million user records, including email addresses, hashed passwords, and personal information.
- Dropbox: In 2012, Dropbox suffered a data breach, resulting in the exposure of 68 million user records, including email addresses and hashed passwords.
- EatStreet: A food delivery platform breached in 2019, compromising 6 million user records, including email addresses, hashed passwords, and personal information.
- Edmodo: An educational platform experienced a data breach in 2017, exposing 77 million user records, including email addresses, usernames, and bcrypt-hashed passwords.
- Elasticsearch Instance of Sales Leads on AWS: In 2019, an unprotected Elasticsearch instance exposed 60 million sales leads, including email addresses and other personal information.
- Emotet: A notorious botnet and malware family involved in multiple phishing campaigns targeting email addresses, banking credentials, and other personal information.
- Epik: A domain registrar and web hosting company suffered a data breach in 2021, compromising email addresses, account credentials, and customer records.
- Evite: A social planning and invitation platform breached in 2019, leading to the exposure of 101 million user records, including email addresses, plaintext passwords, and personal information.
- Exactis: A data aggregator experienced a data breach in 2018, compromising 340 million records, including email addresses, phone numbers, and other personal information.
- Exploit.In: A forum for hackers, which in 2016 released a database containing 593 million email addresses and plaintext passwords from multiple data breaches.
- Factual: A location data company suffered a data breach in 2017, compromising 2.5 million user records, including email addresses, hashed passwords, and personal information.
- Fling: An adult dating website experienced a data breach in 2011, exposing 40 million user records, including email addresses, usernames, and plaintext passwords.
- Forbes: The media company suffered a data breach in 2014, compromising 1 million user records, including email addresses, usernames, and hashed passwords.
- Gaadi: An Indian car research platform experienced a data breach in 2015, exposing 2.2 million user records, including email addresses, usernames, and hashed passwords.
- Gravatar: In 2013, a security researcher discovered a vulnerability in Gravatar that could potentially expose user email addresses, but no data breach was reported.
- HauteLook: A fashion retailer suffered a data breach in 2018, compromising 28 million user records, including email addresses, bcrypt-hashed passwords, and personal information.
- Houzz: A home design platform experienced a data breach in 2018, exposing 48 million user records, including email addresses, usernames, and hashed passwords.
- Justdate.com: A dating platform suffered a data breach in 2017, compromising 1.7 million user records, including email addresses, bcrypt-hashed passwords, and personal information.
- Kayo.moe Credential Stuffing List: In 2018, a collection of 42.5 million email addresses and plaintext passwords from various sources was discovered, potentially used for credential stuffing attacks.
- Lead Hunter: A data breach in 2016 affected the sales lead generation platform, compromising 68 million user records, including email addresses, hashed passwords, and personal information.
- LinkedIn: In 2012, LinkedIn experienced a data breach, compromising 165 million user records, including email addresses and hashed passwords. A separate incident in 2021 involved scraped data from around 500 million LinkedIn users, including email addresses, though this was not a direct breach of their systems.
- LiveAuctioneers: An online auction platform breached in 2020, leading to the exposure of 3.4 million user records, including email addresses, hashed passwords, and personal information.
- LiveJournal: A blogging platform experienced a data breach in 2014, compromising 26 million user records, including email addresses, plaintext passwords, and usernames.
- MeetMindful: A dating platform suffered a data breach in 2021, exposing 2.3 million user records, including email addresses, names, and location data.
- Minted: An online marketplace for independent artists experienced a data breach in 2020, compromising 5 million user records, including email addresses, hashed passwords, and personal information.
- MMG Fusion: A dental marketing software provider suffered a data breach in 2021, exposing 2.6 million user records, including email addresses and other personal information.
- Modern Business Solutions: A data management and monetization company experienced a data breach in 2016, compromising 58 million user records, including email addresses, IP addresses, and personal information.
- mSpy: A mobile monitoring and parental control software provider suffered a data breach in 2015, exposing 4 million user records, including email addresses, encrypted passwords, and payment details.
- MyFitnessPal: A fitness and nutrition app experienced a data breach in 2018, compromising 150 million user records, including email addresses, hashed passwords, and usernames.
- NetGalley: An online book review platform suffered a data breach in 2020, exposing email addresses, names, usernames, and hashed passwords.
- NetProspex: A sales lead generation company experienced a data breach in 2016, compromising 33 million user records, including email addresses, names, job titles, and company information.
- Nitro: A document management and productivity software provider suffered a data breach in 2020, exposing 70 million user records, including email addresses, names, and hashed passwords.
- Onliner Spambot: In 2017, a spambot campaign known as Onliner Spambot was discovered, compromising 711 million email addresses, along with usernames and passwords, used for sending spam and infecting systems with malware.
- ParkMobile: A parking app experienced a data breach in 2021, compromising 21 million user records, including email addresses, names, and hashed passwords.
- QuestionPro: An online survey platform suffered a data breach in 2021, exposing 198 million user records, including email addresses, names, and hashed passwords.
- River City Media Spam List: In 2017, a data breach involving River City Media, a spamming organization, exposed 1.34 billion email addresses, names, and other personal information.
- ShareThis: A social sharing platform experienced a data breach in 2018, compromising 41 million user records, including email addresses, hashed passwords, and usernames.
- SlideTeam: A presentation template provider suffered a data breach in 2021, exposing 1.4 million user records, including email addresses, names, and bcrypt-hashed passwords.
- Straffic: A botnet involved in various phishing campaigns was discovered in 2021, potentially compromising millions of email addresses, banking credentials, and other personal information.
- Stratfor: A global intelligence company experienced a data breach in 2011, compromising 860,000 user records, including email addresses, usernames, and hashed passwords.
- Ticketfly: An event ticketing platform suffered a data breach in 2018, exposing 27 million user records, including email addresses, names, and phone numbers.
- Trik Spam Botnet: A malware botnet discovered in 2017, compromising 43 million email addresses and plaintext passwords, used for sending spam and infecting systems with additional malware.
- Twitter: In 2018, Twitter advised its 330 million users to change their passwords due to a bug that stored plaintext passwords in an internal log. However, there was no confirmed data breach or unauthorized access.
- Unverified Data Source: A collection of compromised records discovered in 2019 containing over 62 million email addresses and plaintext passwords from various sources, with no specific attribution to a single breach.
- Verifications.io: A data validation service experienced a data breach in 2019, exposing 763 million records, including email addresses, phone numbers, and other personal information.
- Whitepages: In 2019, an unprotected Elasticsearch database exposed 22 million Whitepages records, including email addresses, names, and phone numbers. However, this was not a direct breach of Whitepages systems.
- Youve Been Scraped/You’ve Been Scraped: These incidents refer to data scraping, where publicly available information is collected from websites without authorization. Email addresses are often a target in these situations, but specific breaches are difficult to pinpoint.
- Zomato: An Indian food delivery platform suffered a data breach in 2017, compromising 17 million user records, including email addresses and hashed passwords.
- Zynga: A mobile gaming company experienced a data breach in 2019, exposing 218 million user records, including email addresses, usernames, and hashed passwords.
Appendix C
Advanced Persistent Threats (APTs) Groups Associated with the Known Public Data Breaches
- LinkedIn (2012): The LinkedIn data breach, where approximately 165 million user accounts were compromised, has been attributed to a Russia-based hacker group known as APT28 or Fancy Bear. They are believed to have ties to the Russian government.
- Twitter (2013): The Twitter breach, in which around 45,000 accounts were compromised, has been suspected to be the work of the Syrian Electronic Army (SEA), an APT group with connections to the Syrian government.
- Dropbox (2012): The Dropbox breach, which affected nearly 68 million users, has been attributed to a group known as APT29 or Cozy Bear. This group is also believed to have ties to the Russian government.
- Emotet (2014–present): Emotet is a sophisticated malware strain and botnet known for distributing banking Trojans and ransomware. Although not directly attributed to a specific nation-state APT, it has been linked to various cybercrime groups and is considered an advanced threat due to its persistence and evolving nature.
- Stratfor (2011): The breach of the global intelligence company Stratfor, where around 860,000 users’ data was compromised, was claimed by the hacktivist group Anonymous. However, some cybersecurity researchers have suggested that the attack might have been supported by a nation-state APT group due to the level of sophistication.
- Collection #1 (2019): While direct attribution is not available, the sheer scale of this massive data breach compilation suggests the involvement of advanced threat actors. It is possible that multiple APT groups and cybercriminal organizations contributed to or took advantage of the compromised data.
- Adobe (2013): The breach is suspected to be the work of an APT group called “PawnStorm” (also known as APT28 or Fancy Bear), which has been linked to Russian intelligence agencies. This group is notorious for targeting high-profile organizations and using spear-phishing campaigns to infiltrate networks.
- mSpy (2015): The breach was initially attributed to an unknown hacking group. However, further analysis linked the breach to the Chinese APT group called “APT3” or “Buckeye”. This group is known for targeting high-profile organizations in various industries, primarily to gain intellectual property and sensitive information.
- Onliner Spambot (2017): While not directly linked to a specific APT group, it can be associated with advanced persistent cybercriminal campaigns. These campaigns often involve the use of large-scale spamming operations and the distribution of sophisticated malware such as banking Trojans and ransomware.
- Cit0day (2020): A collection of 23,000 breached databases containing billions of records. Although difficult to attribute to a specific APT group, the scale implies the involvement of multiple hacking groups. Various TTPs, such as phishing, credential stuffing, and exploiting web vulnerabilities, were likely employed in the breaches.
- SolarWinds (2020): This high-profile supply chain attack compromised numerous government and private organizations. The breach has been attributed to a Russian APT group known as APT29, also referred to as Cozy Bear or The Dukes. They are believed to have ties to Russia’s foreign intelligence service, the SVR.
- Equifax (2017): The massive breach of the credit reporting agency, which affected around 147 million users, has been attributed to the Chinese APT group called APT10 or Menupass. The group is known for targeting large organizations and is believed to have ties to China’s Ministry of State Security.
- WannaCry (2017): This widespread ransomware attack affected organizations and users globally. The attack has been attributed to the North Korean APT group known as Lazarus Group or Hidden Cobra. They are believed to be linked to the North Korean government and have been involved in several high-profile cyberattacks.
- NotPetya (2017): This destructive malware attack targeted organizations primarily in Ukraine but also affected global businesses. The NotPetya attack has been attributed to the Russian APT group Sandworm Team, also known as Voodoo Bear or TeleBots. They are believed to be connected to Russia’s military intelligence agency, the GRU.
- Zomato (2017): Although direct attribution is not available, the scale and nature of the attack suggest that an advanced cybercriminal organization or APT group may have been involved. The breach resulted in the compromise of 17 million user records, including email addresses and hashed passwords.
- Zynga (2019): The breach affecting 218 million user records has been attributed to a well-known cybercriminal known as Gnosticplayers. While not an APT group, Gnosticplayers is responsible for a series of large-scale data breaches, indicating a high level of sophistication and persistence in their operations.
- MyFitnessPal (2018): The breach of MyFitnessPal, which compromised 150 million user records, was attributed to a group of prolific hackers known as “Magecart.” Although typically known for its attacks on e-commerce sites, the group’s scale and sophistication suggest it might operate at a level comparable to a nation-state APT.
- Houzz (2018): Houzz’s data breach exposed 48 million user records. While no specific APT group has been linked to this incident, the scale and nature of the compromised data suggest the involvement of a highly organized and possibly state-sponsored group.
- Verifications.io (2019): This incident exposed 763 million records, making it one of the largest publicly disclosed data exposures. While the breach has not been linked to a specific APT group, the scale and type of data suggest the involvement of advanced and persistent threat actors.
- Ticketfly (2018): While no specific APT group was attributed to the breach, the nature of the attack (a defacement of the website coupled with data exfiltration) suggests the involvement of a sophisticated threat actor, possibly with the characteristics of an APT.
Appendix D
Different Types of Microsoft 365 Accounts Observed in the Study
- UserMailbox: Yes, users can log in directly to their UserMailbox. This is the primary account type used by individuals to access their emails, calendar, contacts, and other Microsoft 365 services.
- SharedMailbox: No, users cannot log in directly to a shared mailbox. They need to have their own individual UserMailbox and be granted access to the shared mailbox. They can then access it via their own account.
- GAL Contact: No, users cannot log in directly to a GAL (Global Address List) Contact. These are just contact entries in the address book and do not have any login credentials associated with them.
- Room Mailbox: No, users cannot log in directly to a Room Mailbox. A room mailbox is a resource mailbox that represents a meeting space, like a conference room. Users can book the room through their own UserMailbox but cannot access the room mailbox itself.
- Health Mailbox: No, users cannot log in directly to a Health Mailbox. These mailboxes are used by Microsoft Exchange Server to monitor and test the health of the server. They are not meant for direct user access.
- Team Mailbox: No, users cannot log in directly to a Team Mailbox. A team mailbox is associated with a Microsoft Teams team and its channels. Users need to have their own individual UserMailbox and be a member of the relevant team to access the team mailbox.
- Equipment Mailbox: No, users cannot log in directly to an Equipment Mailbox. An equipment mailbox is a resource mailbox that represents a piece of equipment, like a projector or a company car. Users can book the equipment through their own UserMailbox but cannot access the equipment mailbox itself.
- System.Object: This is not a type of Microsoft 365 mailbox account. It appears to be a generic object reference in a programming language or script, and therefore cannot be logged into directly.
- No Mailbox: No, users cannot log in directly to a “No Mailbox” account, as it indicates that there is no mailbox associated with the user or object in question. Without a mailbox, there is no account for a user to log into.
- NoUser: No, users cannot log in directly to a “NoUser” account. This term typically refers to an account or object that has not been assigned a user or that does not have a mailbox associated with it. There is no account to log into in this case.
- Sync: This term is not a specific type of Microsoft 365 mailbox account. It might refer to the synchronization process between on-premises Active Directory and Azure Active Directory, or other data synchronization scenarios. As such, users cannot log into a “Sync” account, as it does not represent a mailbox or user account.
- Alias: No, users cannot log in directly to an Alias. An alias is an additional email address associated with a UserMailbox that can be used to send and receive email. It is not a separate account and cannot be accessed independently. Users need to log in to their primary UserMailbox to access emails sent to their alias.
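One possible use of the account types listed above is to flag login attempts aimed at accounts that cannot be signed into directly. The sketch below is illustrative only: the type names come from this appendix, while the record fields `user_id` and `account_type` and both function names are hypothetical and not taken from the study's tooling.

```python
# Lookup built from the appendix: True means users can log in directly.
# Unknown types are treated as non-interactive (an assumption).
DIRECT_LOGIN = {
    "UserMailbox": True,
    "SharedMailbox": False,
    "GAL Contact": False,
    "Room Mailbox": False,
    "Health Mailbox": False,
    "Team Mailbox": False,
    "Alias": False,
    "Equipment Mailbox": False,
    "System.Object": False,
    "No Mailbox": False,
    "NoUser": False,
    "Sync": False,
}


def can_log_in_directly(account_type):
    """Return True only for account types that support direct sign-in."""
    return DIRECT_LOGIN.get(account_type, False)


def attempts_on_non_interactive_accounts(attempts):
    """Keep login attempts whose target account cannot be logged into directly."""
    return [a for a in attempts if not can_log_in_directly(a["account_type"])]


# Hypothetical, anonymized login records (four-digit IDs as in the study).
attempts = [
    {"user_id": 1042, "account_type": "UserMailbox"},
    {"user_id": 2217, "account_type": "SharedMailbox"},
    {"user_id": 3308, "account_type": "Room Mailbox"},
]
flagged = attempts_on_non_interactive_accounts(attempts)
print([a["user_id"] for a in flagged])
```

Treating unknown types as non-interactive is a conservative default chosen for this sketch; a real pipeline might prefer to report them separately.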
Industry | Percentage |
---|---|
Finance and insurance | 35% |
Healthcare | 22% |
Technology | 16% |
Retail | 12% |
Manufacturing | 10% |
Other industries | 5% |
Pair | Correlation | p-Value |
---|---|---|
LiveAuctioneers and Eye4Fraud | 1 | 0 |
LiveAuctioneers and Drizly | 1 | 0 |
Eye4Fraud and Drizly | 1 | 0 |
MeetMindful and Houzz | 0.989842782 | 0 |
LiveAuctioneers and EatStreet | 0.978510047 | 0 |
Eye4Fraud and EatStreet | 0.978510047 | 0 |
EatStreet and Drizly | 0.978510047 | 0 |
NetGalley and LeadHunter | 0.893865598 | 0 |
DataEnrichmentExposureFromPDLCustomer and Exactis | 0.805917369 | 0 |
Verificationsio and Exactis | 0.804184683 | 0 |
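The pairwise correlations in the table above can be illustrated with a small sketch. It assumes a rank correlation such as Spearman's and assumes that each breach is encoded as a 0/1 indicator per account; the vectors `breach_a`, `breach_b`, and `breach_c` are invented for illustration and are not data from the study. The p-value is omitted here.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical exposure indicators for eight accounts (1 = account in breach).
breach_a = [1, 0, 0, 1, 0, 1, 0, 0]
breach_b = [1, 0, 0, 1, 0, 1, 0, 0]  # same accounts as breach_a
breach_c = [1, 0, 0, 1, 0, 0, 0, 0]  # one exposed account fewer

print(spearman(breach_a, breach_b))  # identical exposure sets give rho of 1
print(spearman(breach_a, breach_c))  # partial overlap gives a value below 1
```

Under this encoding, a correlation of 1 means that two breaches exposed exactly the same set of accounts in the sample, which is one way to read the top rows of the table.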
Rank | Antecedents | Consequents | Confidence | Lift | Leverage | Zhang’s Metric |
---|---|---|---|---|---|---|
1 | {Exploit.In, Verifications.io} | {Data_Enrichment_Exposure_From_PDL_Customer, Anti_Public_Combo_List} | 0.857143 | 34.675325 | 0.013094 | 0.986682 |
2 | {Exploit.In, Data_Enrichment_Exposure_From_PDL_Customer, Verifications.io} | {Anti_Public_Combo_List} | 0.857143 | 31.785714 | 0.013059 | 0.984018 |
3 | {Exploit.In, Data_Enrichment_Exposure_From_PDL_Customer} | {Anti_Public_Combo_List, Verifications.io} | 0.857143 | 31.785714 | 0.013059 | 0.984018 |
4 | {Data_Enrichment_Exposure_From_PDL_Customer, Anti_Public_Combo_List} | {Exploit.In, Verifications.io} | 0.545455 | 34.675325 | 0.013094 | 0.995776 |
5 | {Anti_Public_Combo_List} | {Exploit.In, Data_Enrichment_Exposure_From_PDL_Customer, Verifications.io} | 0.5 | 31.785714 | 0.013059 | 0.995381 |
Parameter | Value |
---|---|
Antecedents | ‘Exploit.In’, ‘Verifications.io’ |
Consequents | ‘Data_Enrichment_Exposure_From_PDL_Customer’, ‘Anti_Public_Combo_List’ |
Confidence | 0.857143 |
Lift | 34.675325 |
Leverage | 0.013094 |
Zhang’s metric | 0.986682 |
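The four metrics in the table above follow from three support values, and a short sketch may make the relationship concrete. The formulas are the standard definitions of confidence, lift, leverage, and Zhang's metric. The counts passed to `rule_metrics` (445 transactions, 7 containing the antecedents, 11 containing the consequents, 6 containing both) are hypothetical: the source does not report the underlying counts, and these were chosen only because they reproduce the tabulated values.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Compute association rule metrics for A -> C from raw counts."""
    supp_a = n_antecedent / n_total   # support of the antecedent set A
    supp_c = n_consequent / n_total   # support of the consequent set C
    supp_ac = n_both / n_total        # support of A and C together

    confidence = supp_ac / supp_a             # estimate of P(C | A)
    lift = confidence / supp_c                # ratio to independence
    leverage = supp_ac - supp_a * supp_c      # difference from independence
    zhang = leverage / max(
        supp_ac * (1 - supp_a),
        supp_a * (supp_c - supp_ac),
    )
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "zhang": zhang,
    }


# Hypothetical counts (assumption, not reported in the source).
metrics = rule_metrics(n_total=445, n_antecedent=7, n_consequent=11, n_both=6)
for name, value in metrics.items():
    print(name, round(value, 6))
# confidence 0.857143
# lift 34.675325
# leverage 0.013094
# zhang 0.986682
```

Swapping the antecedent and consequent counts leaves lift and leverage unchanged but changes confidence and Zhang's metric, which is why a rule and its reverse can share a lift value while differing in confidence.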
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rich, M.S. Enhancing Microsoft 365 Security: Integrating Digital Forensics Analysis to Detect and Mitigate Adversarial Behavior Patterns. Forensic Sci. 2023, 3, 394-425. https://doi.org/10.3390/forensicsci3030030