Figure 1.
Temporal distribution of AI incidents by year (2014–2024).
Figure 2.
Distribution of incidents by harm category.
Figure 3.
Harm category trends over time.
Figure 4.
Distribution of incidents by actor type.
Figure 5.
Actor type trends over time.
Figure 6.
Geographic distribution of AI incidents.
Figure 7.
Overall word cloud showing dominant vocabulary in AI incident reports.
Figure 8.
Filtered word cloud with platform names removed.
Figure 9.
Pre-ChatGPT vs. Post-ChatGPT word cloud comparison.
Figure 10.
Silhouette score analysis for K-Means (k = 2 to 10).
Figure 11.
K-Means cluster size distribution.
Figure 12.
BERTopic topic distribution (25 topics).
Figure 13.
LDA model selection: perplexity and log-likelihood by number of topics.
Figure 14.
LDA topic distribution (15 topics).
Figure 15.
Word clouds for all 15 LDA topics.
Figure 16.
NMF reconstruction error by number of topics.
Figure 17.
NMF topic distribution (15 topics).
Figure 18.
Word clouds for all 15 NMF topics.
Figure 19.
NMF vs. LDA topic correspondence heatmap.
Figure 20.
(a) Hierarchical clustering dendrogram. (b) Topic similarity matrix. (c) Topic co-occurrence correlation matrix. (d) Identified topic clusters.
Figure 21.
Overall sentiment distribution.
Figure 22.
VADER component scores.
Figure 23.
VADER sentiment distribution by K-Means cluster.
Figure 24.
VADER sentiment by BERTopic topic (sorted).
Figure 25.
LIWC psycholinguistic profiles by harm category.
Figure 26.
LIWC psycholinguistic profiles by actor type.
Figure 27.
Multi-method comparison: (A) Topics by method, (B) Largest topic size, (C) Cross-method alignment heatmap, (D) Domain-specific alignment.
Figure 28.
Sentiment range comparison across methods.
Table 1.
Dataset Overview.
| Attribute | Value |
|---|---|
| Total Incidents | 3494 |
| Time Period | January 2014–October 2024 |
| Pre-ChatGPT (before 30 November 2022) | 1288 (36.9%) |
| Post-ChatGPT (after 30 November 2022) | 2206 (63.1%) |
| Mean Text Length | ~150 words per incident |
Table 2.
Analytical Methods Mapped to Research Objectives.
| Method | Research Objective | Unique Contribution |
|---|---|---|
| LDA Topic Modeling | Thematic structure | Probabilistic topic mixtures |
| NMF Topic Modeling | Thematic structure | Parts-based decomposition |
| K-Means Clustering | High-level categorization | Discrete assignments |
| BERTopic Clustering | Fine-grained discovery | Automatic topic selection |
| VADER Sentiment | Emotional framing | Valence scores (−1 to +1) |
| LIWC Analysis | Psycholinguistic profiling | Cognitive style dimensions |
Table 3.
Mathematical Comparison of Analytical Methods.
| Method | Type | Objective Function | Output |
|---|---|---|---|
| LDA | Probabilistic generative | $\max P(W \mid \alpha, \beta)$ via variational EM | Topic distributions $\theta_d$, $\phi_k$ |
| NMF | Matrix factorization | $\min \lVert X - WH \rVert_F^2$ subject to $W, H \ge 0$ | Document-topic $W$, topic-term $H$ |
| K-Means | Partitional clustering | $\min \sum \lVert x - \mu_k \rVert^2$ | Cluster assignments |
| BERTopic | Density-based clustering | HDBSCAN on UMAP embeddings | Topics + outliers |
| VADER | Lexicon-based | compound $= \sum v_i / \sqrt{(\sum v_i)^2 + \alpha}$ | Sentiment scores [−1, +1] |
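The closed-form objectives in Table 3 can be evaluated directly on small inputs. The following is a minimal pure-Python sketch, not the pipeline used in the study. The function names are hypothetical. The default `alpha=15.0` and the squaring of the summed valence as a whole follow the reference VADER implementation, and are assumptions here because the table does not state them.

```python
import math


def vader_compound(valences, alpha=15.0):
    """Normalize summed valence scores x into (-1, 1) as x / sqrt(x^2 + alpha).

    alpha=15.0 is an assumed default (the reference VADER value);
    Table 3 does not give a number for it.
    """
    x = sum(valences)
    return x / math.sqrt(x * x + alpha)


def kmeans_objective(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, k in zip(points, labels):
        total += sum((p - c) ** 2 for p, c in zip(point, centroids[k]))
    return total


def nmf_objective(X, W, H):
    """Squared Frobenius norm of X - WH, with matrices given as nested lists."""
    n_topics = len(H)
    total = 0.0
    for i, row in enumerate(X):
        for j, x_ij in enumerate(row):
            approx = sum(W[i][t] * H[t][j] for t in range(n_topics))
            total += (x_ij - approx) ** 2
    return total
```

An exact factorization gives `nmf_objective(X, W, H) == 0.0`, and `vader_compound` approaches ±1 only as the summed valence grows large, which is why compound scores stay inside the [−1, +1] range listed in the table.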
Table 4.
Incidents by Year.
| Year | Count | Year | Count |
|---|---|---|---|
| 2014 | 12 | 2020 | 423 |
| 2015 | 28 | 2021 | 512 |
| 2016 | 67 | 2022 | 634 |
| 2017 | 134 | 2023 | 892 |
| 2018 | 245 | 2024 | 198* |
| 2019 | 349 | | |

\* Partial year (data through October 2024).
Table 5.
Primary Harm Category Counts.
| Harm Category | Count | Percent |
|---|---|---|
| AI system safety and reliability | 1017 | 29.1% |
| Physical and psychological harms | 721 | 20.6% |
| Privacy and data protection | 534 | 15.3% |
| Discrimination and bias | 456 | 13.1% |
| Misinformation and manipulation | 398 | 11.4% |
| Other harms | 368 | 10.5% |
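The percentages in Table 5 appear to be each category count divided by the 3494 incidents reported in Table 1, rounded to one decimal place. The sketch below reproduces that arithmetic as a consistency check; the helper name `percent_shares` is hypothetical, and the rounding rule is an assumption inferred from the table rather than stated by the authors.

```python
TOTAL_INCIDENTS = 3494  # total from Table 1

# Counts copied from Table 5.
harm_counts = {
    "AI system safety and reliability": 1017,
    "Physical and psychological harms": 721,
    "Privacy and data protection": 534,
    "Discrimination and bias": 456,
    "Misinformation and manipulation": 398,
    "Other harms": 368,
}


def percent_shares(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares = percent_shares(harm_counts, TOTAL_INCIDENTS)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The same helper applies unchanged to the count columns of Tables 6 and 7, since those tables also partition the full set of incidents.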
Table 6.
Actor Type Counts.
| Actor Type | Count | Percent |
|---|---|---|
| Private sector/Industry | 2456 | 70.3% |
| Government/Public sector | 534 | 15.3% |
| Research/Academic | 287 | 8.2% |
| Individual actors | 156 | 4.5% |
| Other/Unknown | 61 | 1.7% |
Table 7.
Top Countries/Regions.
| Country/Region | Count | Percent |
|---|---|---|
| United States | 1678 | 48.0% |
| China | 423 | 12.1% |
| United Kingdom | 312 | 8.9% |
| European Union | 287 | 8.2% |
| Australia | 134 | 3.8% |
| Other | 660 | 18.9% |
Table 8.
K-Means Cluster Characteristics.
| Cluster | Label | Count (%) | Top Terms |
|---|---|---|---|
| 1 | General AI & Platform Risks | 2808 (80.4%) | ai, google, data, facebook, algorithm |
| 2 | Autonomous Vehicles | 205 (5.9%) | tesla, car, driving, autopilot, crash |
| 3 | Facial Recognition & Surveillance | 481 (13.8%) | facial, recognition, police, surveillance |
Table 9.
BERTopic Top Topics.
| Topic | Label | Count | Top Terms |
|---|---|---|---|
| 7 | Facial Recognition | 362 | facial, recognition, police |
| 23 | Facebook/Social Media | 324 | facebook, meta, instagram |
| 15 | AI Bias | 255 | bias, algorithm, discrimination |
| 24 | ChatGPT | 200 | chatgpt, openai, gpt |
| 1 | Self-Driving | 198 | tesla, autopilot, driving |
| 22 | Deepfake Porn | 185 | deepfake, porn, video |
Table 10.
LDA Topics (15 Topics).
| Topic | Label | N (%) | Top Terms |
|---|---|---|---|
| 10 | AI/ML & Algorithmic Bias | 1035 (29.6%) | ai, intelligence, learning, machine, bias |
| 1 | Deepfakes & Synthetic Media | 502 (14.4%) | deepfake, video, fake, media, generated |
| 4 | Social Media Platforms | 425 (12.2%) | facebook, social, media, instagram, users |
| 9 | ChatGPT & Generative AI | 375 (10.7%) | chatgpt, openai, generative, chatbot |
| 11 | Facial Recognition | 363 (10.4%) | facial, recognition, police, surveillance |
| 13 | Tesla & Self-Driving | 201 (5.8%) | tesla, driving, self, autopilot, car |
Table 11.
NMF Topics (15 Topics).
| Topic | Label | N (%) | Top Terms |
|---|---|---|---|
| 7 | Deepfakes & Synthetic Media | 417 (11.9%) | deepfake, videos, fake, video, media |
| 10 | US Government & Regulation | 362 (10.4%) | united, states, federal, government |
| 2 | Facial Recognition | 353 (10.1%) | facial, recognition, police, surveillance |
| 14 | Machine Learning & Bias | 352 (10.1%) | learning, machine, bias, algorithm |
| 4 | Facebook & Social Media | 263 (7.5%) | facebook, social, media, instagram |
| 6 | ChatGPT & Generative AI | 237 (6.8%) | chatgpt, openai, chatbot, generative |
| 3 | Tesla & Self-Driving | 193 (5.5%) | tesla, driving, car, self, autopilot |
Table 12.
NMF-LDA Topic Correspondence.
| Domain | NMF Topic | LDA Topic | Overlap% |
|---|---|---|---|
| Tesla/Self-Driving | Topic 3 | Topic 13 | 85.0% |
| Facial Recognition | Topic 2 | Topic 11 | 73.1% |
| Deepfakes | Topic 7 | Topic 1 | 71.9% |
| ChatGPT | Topic 6 | Topic 9 | 67.1% |
Table 13.
Overall Sentiment Statistics.
| Classification | Count | Percentage |
|---|---|---|
| Positive (compound ≥ 0.05) | 2138 | 61.2% |
| Neutral (−0.05 < compound < 0.05) | 74 | 2.1% |
| Negative (compound ≤ −0.05) | 1282 | 36.7% |
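The three classes in Table 13 follow from thresholding the VADER compound score at ±0.05. A minimal sketch of that rule is shown below. The function names are hypothetical, and treating every score strictly between the two thresholds as neutral is an assumption inferred from the positive and negative bounds given in the table.

```python
from collections import Counter


def classify_compound(compound, threshold=0.05):
    """Map a compound score to a class using the +/-0.05 cutoffs from Table 13."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"  # assumed: everything strictly between the cutoffs


def sentiment_shares(compounds):
    """Return the fraction of scores falling in each sentiment class."""
    labels = ("positive", "neutral", "negative")
    n = len(compounds)
    if n == 0:
        return {label: 0.0 for label in labels}
    counts = Counter(classify_compound(c) for c in compounds)
    return {label: counts[label] / n for label in labels}
```

Applied to one compound score per incident, `sentiment_shares` would yield the kind of breakdown reported in the table. With a neutral band this narrow, almost every document of any length lands in the positive or negative class.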
Table 14.
Sentiment by Topic Category.
| Category | Mean Compound | % Negative | % Positive |
|---|---|---|---|
| Child Safety/YouTube | −0.326 | 67.6% | — |
| Self-Driving/Tesla | −0.140 | 57.6% | — |
| US Government AI | +0.633 | — | 87.2% |
| ChatGPT/GenAI | +0.586 | — | 85.0% |
Table 15.
LIWC Profiles by Harm Category.
| Harm Category | Analytic | Tone | Cognition | Affect | Social |
|---|---|---|---|---|---|
| Physical harms | 78.2 | 32.1 | 12.4 | 4.8 | 8.2 |
| Privacy | 82.4 | 45.6 | 14.2 | 3.2 | 7.8 |
| Discrimination | 76.8 | 28.4 | 11.8 | 5.6 | 9.4 |
| Misinformation | 74.2 | 38.2 | 13.6 | 4.2 | 10.8 |
Table 16.
Cross-Method Maximum Alignment.
| Method Pair | Max Alignment | Interpretation |
|---|---|---|
| BERTopic ↔ NMF | 96.4% | Strongest |
| K-Means ↔ BERTopic | 91.0% | Strong |
| BERTopic ↔ LDA | 86.5% | Good |
| K-Means ↔ LDA | 84.9% | Good |
| K-Means ↔ NMF | 84.4% | Good |
| LDA ↔ NMF | 81.6% | Moderate |
Table 17.
Domain-Specific Cross-Method Alignment.
| Domain | K-Means | BERTopic | LDA | NMF | Alignment |
|---|---|---|---|---|---|
| Autonomous Vehicles | Cluster 2 | Topic 1 | Topic 13 | Topic 3 | 84–89% |
| Facial Recognition | Cluster 3 | Topic 7 | Topic 11 | Topic 2 | 66–68% |
Table 18.
Comprehensive Multi-Method Comparison Summary.
| Aspect | K-Means | BERTopic | LDA | NMF |
|---|---|---|---|---|
| Type | Centroid Clustering | Density Clustering | Probabilistic (Dirichlet) | Matrix Factorization |
| Topics/Clusters | 3 | 25 | 15 | 15 |
| Outlier Handling | None (forced) | 377 outliers (10.8%) | None (forced) | None (forced) |
| Largest Group | 80.4% | 10.4% | 29.6% | 11.9% |
| Distribution | Highly Imbalanced | Balanced | Moderately Imbalanced | Balanced |
| Granularity | Coarse | Fine-grained | Medium | Medium |
| K-Means Align | — | 91.0% | 84.9% | 84.4% |
| Interpretability | High (simple) | High (detailed) | Good | Excellent |
| Best For | High-level taxonomy | Detailed discovery | Probabilistic modeling | Clean categorization |