# A New Benford Test for Clustered Data with Applications to American Elections

## Abstract

## 1. Introduction

## 2. Benford’s Law and Generalizations

## 3. Empirical Estimates Using 2004 Election Data

#### 3.1. Modeling with the Discrete Weibull

#### 3.2. Data and Model Estimation

## 4. Comparisons of 1-BL 3 and 1-BL 10 to Potential Vote Distributions

#### 4.1. Simulations Using Chi-Squared Testing

#### 4.2. Simulations Using Binomial Testing

## 5. Applications of 1-BL 3 to the 2004 US Presidential Election

#### 5.1. Chi-Square Test Results

#### 5.2. Binomial Test Results

## 6. Discussion

## 7. Conclusions and Future Research

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

**Figure 1.** Vote Distributions Across Precincts in US 2004 Presidential Election for Selected Counties.

**Figure 4.** Discrete Weibull Parameter Estimation and Kolmogorov-Smirnov Goodness of Fit Test Results for Ohio, Colorado, and Wisconsin, US 2004 Presidential Election. p-values above 0.05 indicate conformance to discrete Weibull distribution.

**Figure 5.** First Digit Base 10 Monte Carlo Analysis of Simulated Discrete Weibull Distributions with Various Choices of Parameterizations and Precinct Sizes Using Chi-Squared Testing. p-values above 0.05 (plotted in blue) indicate conformance to 1-BL 10.

**Figure 6.** First Digit Base 3 Monte Carlo Analysis of Simulated Discrete Weibull Distributions with Various Choices of Parameterizations and Precinct Sizes Using Chi-Squared Testing. p-values above 0.05 (plotted in blue) indicate conformance to 1-BL 3.

**Figure 7.** First Digit Base 3 Monte Carlo Analysis of Simulated Discrete Weibull Distributions with Various Choices of Parameterizations and Precinct Sizes Using Binomial Testing. p-values above 0.05 (plotted in blue) indicate conformance to 1-BL 3.
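The Monte Carlo captions above describe simulating discrete Weibull precinct counts and checking their first digits against Benford's law in base 3. A minimal sketch of one such draw follows. It rests on several assumptions that are not confirmed by the text here: the discrete Weibull is approximated by flooring a continuous Weibull variate with scale `alpha` and shape `beta`, zero counts are dropped, only digits 1 and 2 are treated as categories, and the helper names are hypothetical. It computes only the Pearson statistic, since the degrees of freedom depend on how the categories are defined.

```python
import math
import random

def first_digit(n, base):
    """Leading digit of a positive integer n written in the given base."""
    while n >= base:
        n //= base
    return n

def discrete_weibull_sample(alpha, beta, size, rng):
    """Assumed construction: floor of a continuous Weibull(scale=alpha, shape=beta) draw."""
    return [math.floor(rng.weibullvariate(alpha, beta)) for _ in range(size)]

def chi_squared_statistic(counts, base):
    """Pearson statistic comparing observed first digits with Benford's law in the given base."""
    # Zero counts have no leading digit, so they are excluded (an assumption).
    digits = [first_digit(n, base) for n in counts if n > 0]
    total = len(digits)
    stat = 0.0
    for d in range(1, base):
        expected = total * math.log(1 + 1 / d, base)
        observed = sum(1 for x in digits if x == d)
        stat += (observed - expected) ** 2 / expected
    return stat

# Illustrative inputs only: the Table 3 estimates and precinct count for Stark County, OH.
rng = random.Random(2004)
votes = discrete_weibull_sample(alpha=287.717, beta=2.289, size=364, rng=rng)
print(chi_squared_statistic(votes, base=3))
```

Repeating the draw many times over a grid of `alpha`, `beta`, and precinct counts would give a rough analogue of the simulations summarized in the figures, under the assumptions stated above.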

d | Probability First Digit d | Probability Second Digit d |
---|---|---|
0 | | 0.1197 |
1 | 0.3010 | 0.1139 |
2 | 0.1761 | 0.1088 |
3 | 0.1249 | 0.1043 |
4 | 0.0969 | 0.1003 |
5 | 0.0792 | 0.0967 |
6 | 0.0669 | 0.0934 |
7 | 0.0580 | 0.0904 |
8 | 0.0512 | 0.0876 |
9 | 0.0458 | 0.0850 |

d | Probability First Digit d | Probability Second Digit d |
---|---|---|
0 | | 0.4022 |
1 | 0.6309 | 0.3247 |
2 | 0.3691 | 0.2732 |
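The digit probabilities tabulated above for base 10 and base 3 can be reproduced from the standard Benford formulas, where the first digit has probability $\log_b(1 + 1/d)$ and the second digit sums $\log_b(1 + 1/(bk + d))$ over every possible first digit $k$. The sketch below is illustrative; the function names are hypothetical and not taken from the paper.

```python
import math

def benford_first_digit(base):
    """P(first digit = d) for d = 1..base-1 under Benford's law in the given base."""
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}

def benford_second_digit(base):
    """P(second digit = d) for d = 0..base-1, summed over every possible first digit k."""
    return {
        d: sum(math.log(1 + 1 / (base * k + d), base) for k in range(1, base))
        for d in range(base)
    }

for base in (10, 3):
    first = benford_first_digit(base)
    second = benford_second_digit(base)
    print(f"base {base}")
    for d in range(base):
        # A leading digit cannot be 0, so that cell is left blank.
        p1 = f"{first[d]:.4f}" if d in first else ""
        print(f"  d={d}  first={p1:>6}  second={second[d]:.4f}")
```

In base 3 only the digits 1 and 2 can lead, which is why the first-digit column has two entries that sum to one.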

**Table 3.** Maximum Likelihood Estimates and Goodness of Fit Test Results for Vote Counts for Selected Counties in 2004 US Presidential Election: President George W. Bush. p-values above 0.05 (plotted in blue) indicate conformance to discrete Weibull distribution.

County | ${\widehat{\mathit{\alpha}}}_{\mathit{Bush},\mathit{i},\mathit{j}}$ | ${\widehat{\mathit{\beta}}}_{\mathit{Bush},\mathit{i},\mathit{j}}$ | p-Value | Number of Precincts |
---|---|---|---|---|
Milwaukee County, WI | 351.457 | 1.410 | 0.321 | 560 |
Stark County, OH | 287.717 | 2.289 | 0.472 | 364 |
Grand County, CO | 205.389 | 2.706 | 0.726 | 12 |

**Table 4.** Maximum Likelihood Estimates and Goodness of Fit Test Results for Vote Counts for Selected Counties in 2004 US Presidential Election: Senator John F. Kerry. p-values above 0.05 indicate conformance to discrete Weibull distribution.

County | ${\widehat{\mathit{\alpha}}}_{\mathit{Kerry},\mathit{i},\mathit{j}}$ | ${\widehat{\mathit{\beta}}}_{\mathit{Kerry},\mathit{i},\mathit{j}}$ | p-Value | Number of Precincts |
---|---|---|---|---|
Milwaukee County, WI | 598.1 | 2.2 | 0.223 | 560 |
Stark County, OH | 289.057 | 4.085 | 0.304 | 364 |
Grand County, CO | 167.35 | 3.188 | 0.614 | 12 |

State | George W. Bush (R) | John F. Kerry (D) | Number of Counties |
---|---|---|---|
North Carolina | 56.02% (1,961,166) | 43.58% (1,525,849) | 100 |
Vermont | 38.80% (121,180) | 58.94% (184,067) | 16 |
Wisconsin | 49.32% (1,478,120) | 49.70% (1,489,504) | 72 |
Ohio | 50.81% (2,859,768) | 48.71% (2,741,167) | 88 |
Colorado | 51.69% (1,101,255) | 47.02% (1,001,732) | 64 |

**Table 6.** First Digit Base 3 Benford’s Analysis on Selected Battleground and Non-Battleground States in the US 2004 Presidential Election.

State | County | Candidate | ${\mathit{\chi}}^{2}$ Stat | p-Value | Number of Precincts | Adjusted p-Value |
---|---|---|---|---|---|---|
Colorado | Arapahoe | John F. Kerry | 21.152 | <0.001 | 364 | 0.001 |
Colorado | El Paso | John F. Kerry | 20.513 | <0.001 | 378 | 0.001 |
Colorado | Jefferson | George W. Bush | 38.050 | <0.001 | 324 | <0.001 |
Colorado | Jefferson | John F. Kerry | 108.747 | <0.001 | 324 | <0.001 |
North Carolina | No Anomalies Detected | | | | | |
Wisconsin | No Anomalies Detected | | | | | |
Ohio | Ashtabula | John F. Kerry | 39.385 | <0.001 | 127 | <0.001 |
Ohio | Butler | John F. Kerry | 14.594 | 0.001 | 289 | 0.013 |
Ohio | Geauga | George W. Bush | 15.196 | <0.001 | 96 | 0.013 |
Ohio | Geauga | John F. Kerry | 20.812 | <0.001 | 96 | 0.001 |
Ohio | Greene | George W. Bush | 13.860 | 0.001 | 142 | 0.016 |
Ohio | Greene | John F. Kerry | 36.189 | <0.001 | 142 | <0.001 |
Ohio | Lorain | John F. Kerry | 43.509 | <0.001 | 239 | <0.001 |
Ohio | Miami | John F. Kerry | 14.669 | 0.001 | 82 | 0.013 |
Ohio | Muskingum | John F. Kerry | 13.971 | 0.001 | 85 | 0.016 |
Ohio | Portage | John F. Kerry | 27.249 | <0.001 | 129 | <0.001 |
Ohio | Summit | John F. Kerry | 68.917 | <0.001 | 475 | <0.001 |
Vermont | No Anomalies Detected | | | | | |

**Table 7.** Analysis of Whether Anomalous Counties from Table 6 Deviate from Benford’s Law.

County | Candidate | ${\mathit{\alpha}}_{\mathit{x},\mathit{i},\mathit{t}}$ | ${\mathit{\beta}}_{\mathit{x},\mathit{i},\mathit{t}}$ | Number of Precincts | p-Value (KS-Test) | Should Conform to BL but Fails? |
---|---|---|---|---|---|---|
Arapahoe | John F. Kerry | 326.634 | 2.973 | 364 | 0.003 | |
El Paso | John F. Kerry | 231.420 | 2.499 | 378 | 0.507 | * |
Jefferson | George W. Bush | 400.330 | 3.650 | 324 | 0.001 | |
Jefferson | John F. Kerry | 352.800 | 4.717 | 324 | <0.001 | |
Ashtabula | John F. Kerry | 208.527 | 4.538 | 127 | <0.001 | |
Butler | John F. Kerry | 218.783 | 2.869 | 289 | 0.037 | |
Geauga | George W. Bush | 348.690 | 4.200 | 96 | 0.003 | |
Geauga | John F. Kerry | 228.818 | 3.869 | 96 | 0.099 | * |
Greene | George W. Bush | 383.344 | 2.628 | 142 | 0.395 | * |
Greene | John F. Kerry | 244.027 | 2.272 | 142 | 0.494 | * |
Lorain | John F. Kerry | 367.292 | 2.964 | 239 | 0.074 | * |
Miami | John F. Kerry | 235.479 | 4.963 | 82 | 0.010 | |
Muskingum | John F. Kerry | 215.593 | 3.613 | 85 | 0.054 | * |
Portage | John F. Kerry | 347.487 | 4.109 | 129 | <0.001 | |
Summit | John F. Kerry | 365.481 | 3.712 | 475 | <0.001 | |

