Throughout most of human history, electronic amplification of the spoken word was unavailable; therefore, all human gatherings were effectively limited in size by the acoustic range over which the person speaking could be heard intelligibly. Many of the largest reported crowds in history are records of generals delivering a speech or “harangue” to an army [1], since the army’s numbers were counted and thus represent a relatively rare instance of a counted crowd in ancient history (although there is certainly a large range of error in all crowd-counting methods, ancient and modern [2]). These accounts were generally accepted by modern historians (e.g., [4]) until a landmark paper by Mogens Herman Hansen in 1993, which cast doubt upon the entire historicity of a general’s speech to the army [6]. Hansen made many historical and textual arguments that need not be reproduced here in detail, but among these he made acoustical claims concerning, for example, the area over which an ancient phalanx stretched or the noise from rattling hoplite armor, leading him to conclude that
Under such circumstances it must have been impossible for a general, even if he had had the voice of a Stentor, to deliver a speech that could be heard by all the soldiers simultaneously.
Though propagation distance and background noise are certainly valid acoustical criteria to examine, the sentence above contains all the acoustical analysis the author thought necessary to include in his paradigm-shattering thesis. Of course, the problem with vague allusions to scientific truth to prove a point is that the other side can allude to science just as vaguely, which is indeed what happened in this case: a year later, the historian W. Kendrick Pritchett replied with a 100-page rejoinder defending the authenticity of the general’s harangue, listing in detail different accounts of speeches to large crowds from ancient Greece and Rome, and also including more recent accounts, from Henry V at Agincourt to George Washington [7]. Pritchett mentioned these military examples as well as another from the preacher John Wesley, arguing that it is acoustically possible for a single speaker to reach such a large assembly.
This tendency to enlist science as an ally without careful numerical consideration often leads to an abuse or neglect of the specific issues in question. For example, Hansen’s example of rattling hoplite armor, to which he returns several times in his essay, is based on a single reference to Alexander the Great in the midst of a great battle, rather than hoplites standing at attention before
], 4.13.37). In turn, Pritchett’s reference to John Wesley uncritically accepted Wesley’s estimate using an assumed crowd density of five people per square yard [9], while modern crowd estimation methods show that in large gatherings the highest densities achieved are less than half that value [10]. In such cases, quantitative references are somewhat superfluous; they are enlisted not to elucidate the past, but because the historians have already made up their minds and want to give their argument a veneer of scientific credibility (as some have asserted, “There are no statistics in ancient sources, just rhetorical flourishes made with numbers” [11]).
Humanities disciplines (including History, although it sometimes presents itself as Social Science instead) primarily interpret empirical facts which in themselves are not contested. The arguments employed in this process of interpretation are not (and in a sense cannot be) quantitative. Different historians come to different conclusions on the basis of the same collection of facts. The field which primarily seeks to collect empirical facts to be interpreted is not history but archaeology. In the past, archaeology focused on the excavation of physical artifacts, but it has recently embraced computational methods to expand the range of acceptable empirical data [12]. This is a specific application of the movement generally known as the digital humanities, which seeks to use computational methods to shed light on uncontroversial but not immediately obvious facts, which may themselves be the basis for further humanistic interpretation and inquiry. The sub-field variously known as acoustic archaeology or archaeoacoustics uses computational acoustic simulation to uncover facts about the nature of sound in history, which is necessary to address the issue raised by Hansen and Pritchett’s disagreement.
Both papers make reference to Julius Caesar’s speeches to his army, both during the Gallic Wars and the Roman Civil War. The speeches during the Gallic Wars are smaller and less contested, but during the Civil War Caesar records giving two major speeches to large gatherings of soldiers, one following his army’s defeat at the battle of Dyrrachium, and one immediately before the battle of Pharsalus, both in 48 BC [13]. This paper examines the historicity of these accounts through archaeoacoustic simulations of the speeches as Caesar describes them. Using this framework, it can be shown that, even without every historical detail preserved exactly, the plausibility or implausibility of certain historical speeches may be known with a high degree of certainty once we examine the acoustical evidence.
After the controversy between Hansen and Pritchett, other historians writing on these speeches referenced the debate, but generally avoided making strong claims about whether these speeches actually occurred [14]. Again, this reticence may reflect both discomfort with quantitative acoustics and a general feeling that sound, as a transient phenomenon, is more or less lost to history once it is silenced. As it happens, there is a long tradition of working backwards through known physical laws to study sounds of the past, motivated by this same question: How many listeners can hear a single human voice?
2.1. Benjamin Franklin’s Experiment
In 1739, the Methodist revivalist preacher (and friend of John Wesley) George Whitefield drew large outdoor crowds in London that were estimated as high as 80,000 people [17]. Across the Atlantic, Benjamin Franklin, who was at that time the publisher of The Pennsylvania Gazette in Philadelphia, had stopped printing the estimates of Whitefield’s crowds because he thought they were exaggerated. Franklin described in his autobiography how he carried out an experiment to measure Whitefield’s intelligible distance when the preacher came to Philadelphia. Using a semicircular approximation for the crowd shape, Franklin calculated that
⋯[Whitefield] might well be heard by more than Thirty Thousand. This reconciled me to the newspaper accounts of his having preached to twenty five thousand people in the fields and to the ancient histories of generals haranguing whole armies of which I had sometimes doubted.
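Franklin's semicircular estimate can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the allowance of two square feet per listener is the figure Franklin reports in his autobiography rather than one given in this section, and the function names are hypothetical. Since the measured distance is not stated here, the code works backwards from his figure of thirty thousand to the radius it implies.

```python
import math

def semicircle_crowd(radius_ft: float, area_per_person_sqft: float = 2.0) -> float:
    """Listeners that fit in a semicircle of the given radius at uniform density."""
    area_sqft = 0.5 * math.pi * radius_ft ** 2
    return area_sqft / area_per_person_sqft

def radius_for_crowd(crowd: float, area_per_person_sqft: float = 2.0) -> float:
    """Semicircle radius (ft) needed to hold `crowd` listeners."""
    return math.sqrt(2.0 * crowd * area_per_person_sqft / math.pi)

# Assumption: 2 sq ft per person, Franklin's own allowance.
r_ft = radius_for_crowd(30_000)
print(f"implied radius: {r_ft:.1f} ft ({r_ft * 0.3048:.1f} m)")

# Same radius at a sparser, assumed density of 3.6 sq ft per person
# (2.5 people per square yard), to show how strongly the count depends on density.
print(f"crowd at 3.6 sq ft/person: {semicircle_crowd(r_ft, 3.6):.0f}")
```

The crowd size scales inversely with the area allowed per listener, so the density assumption matters as much as the measured distance.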
While Franklin used Whitefield’s example as a conduit to explore the vocal ranges of ancient generals, the data from his experiment also provide information about Whitefield’s vocal level, as the preacher was known for having one of the loudest voices of his generation [19]. Data from Franklin’s experiment have been used to infer noise characteristics based on known noise sources and site geometry [20], to study the maximum sound pressure [21] and directivity patterns for trained vocalists [22], and to simulate Whitefield’s own maximum SPL and crowd size [23]. This work estimated that Whitefield’s time-averaged on-axis level L could be as great as 90 dB, about 16 dB greater than the ANSI standard [24] for “loud speech” (about 74 dB when the ANSI spectrum is A-weighted [25]), and still significantly higher than that for “shouted speech” (about 82 dB). The corresponding ISO standard uses more discrete vocal levels, but similarly has its highest value of “Very Loud” speech as 78 dB after A-weighting is applied [25].
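To give a sense of what these level differences mean physically, the following sketch converts a difference in decibels into the equivalent ratios of sound pressure and sound power, using the standard definitions (20 log10 for pressure, 10 log10 for power). The level values are those quoted above; the function names and dictionary labels are illustrative.

```python
def pressure_ratio(delta_db: float) -> float:
    """Sound-pressure ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 20)

def power_ratio(delta_db: float) -> float:
    """Sound-power (or intensity) ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 10)

# Levels in dB as quoted in the text; "loud speech" is the reference.
levels_db = {
    "loud speech (ANSI)": 74,
    "very loud speech (ISO)": 78,
    "shouted speech (ANSI)": 82,
    "Whitefield (estimated maximum)": 90,
}
reference_db = levels_db["loud speech (ANSI)"]

for label, level in levels_db.items():
    delta = level - reference_db
    print(f"{label}: +{delta} dB, "
          f"pressure x{pressure_ratio(delta):.2f}, power x{power_ratio(delta):.1f}")
```

On these definitions, the 16 dB gap between Whitefield's estimated level and standard loud speech corresponds to roughly six times the sound pressure and about forty times the sound power.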
High vocal levels allow for animal communication at very long distances, producing an evolutionary sexual selection effect [27], which may vestigially influence human vocal capacity today. The computer simulations predicted that Whitefield could be heard intelligibly by a crowd of over 20,000 people without assuming overly optimistic acoustic conditions, although the reported crowd of 80,000 is acoustically implausible even under very favorable conditions [23].
This example shows that there exist regions of plausibility between being naively accepting or close-mindedly skeptical of all historical accounts of speeches to large crowds. Using what is already known about the human voice, sound propagation, and speech intelligibility, we can give a good approximation of how many people could hear a speaker on a specific occasion. While we cannot affix a precise crowd size to every historical account, we can shed a good deal of light on the historical account, which may inform the way we interpret the original text.
Even in situations where we do not have data as convenient as that which Franklin recorded, we can investigate a range of possible acoustic conditions (e.g., background noise). We can simulate the extreme “optimistic” and “pessimistic” ends of this range of conditions to better understand the historical situation. If, even under pessimistic conditions, the intelligibility (measured by the speech transmission index (STI)) is still acceptable throughout the crowd, we may consider the historical account plausible even if we cannot know the precise noise condition without the benefit of a time machine. Conversely, if under optimistic conditions the intelligibility is still too low throughout the simulated crowd, we may consider the account acoustically implausible. To apply this method to Caesar’s speeches, we need to first consider his voice, the sites of his speeches, and the background noise present.
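The bracketing logic described above can be written as a small decision rule. This is a minimal sketch, not the simulation procedure itself: the function name, the input format (one STI value per simulated listener position), and the acceptability threshold of 0.45 are all assumptions chosen for illustration.

```python
from typing import Sequence

def classify_account(
    sti_pessimistic: Sequence[float],
    sti_optimistic: Sequence[float],
    threshold: float = 0.45,  # assumed acceptability cutoff, not taken from the text
) -> str:
    """Classify a reported speech from STI values simulated across the crowd.

    sti_pessimistic: STI at each listener position under the worst-case conditions.
    sti_optimistic:  STI at each listener position under the best-case conditions.
    """
    # Acceptable everywhere even in the worst case: the account is plausible.
    if min(sti_pessimistic) >= threshold:
        return "plausible"
    # Some listeners still cannot follow the speech in the best case: implausible.
    if min(sti_optimistic) < threshold:
        return "implausible"
    # The outcome depends on conditions we cannot recover.
    return "inconclusive"

print(classify_account([0.50, 0.62, 0.71], [0.66, 0.75, 0.83]))
print(classify_account([0.10, 0.25, 0.60], [0.20, 0.40, 0.80]))
print(classify_account([0.30, 0.50, 0.70], [0.50, 0.65, 0.85]))
```

Only the third case requires narrowing down the actual historical conditions; the first two can be decided from the extremes alone.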
2.2. Caesar the Orator
Vocal training is an important factor in the maximum pressure achievable by a speaker [28], and Caesar received extensive training from childhood on. Though most famous for his achievements as a general and later as dictator, Julius Caesar was born in 100 BC into a high-ranking family in the Roman Republic and was trained from childhood to perform the public ceremonies required of the family’s inherited priesthood. He would later assume the title of pontifex maximus, the high priest of all Rome (the Latin title was later also given to the Bishop of Rome, i.e., the Pope) [29]. In 70 BC, Caesar traveled to the island of Rhodes to study oratory with the noted rhetorician Apollonius Molon, who also trained Cicero in oratorical delivery [30]. In addition to his speeches to his centurions or smaller military gatherings, Caesar also gave a noted speech in the Roman Senate advocating mercy for the Catiline conspirators in 63 BC [31].
In addition to the many examples of Caesar’s experience with oratory, no less an orator than Cicero himself (who also wrote an entire text on the subject [32]) testified to the quality of Caesar’s delivery. In a letter to Cornelius Nepos, Cicero wrote
Do you know any man who can speak better than Caesar, even if he has concentrated on the art of oratory to the exclusion of all else?
Despite his natural talent for oratory, Caesar chose to pursue the military instead of oratory as his chief vocation, and thus might not be expected to have perfected his delivery to the extent reported for speakers such as Demosthenes, who is reported to have undertaken specific exercises to perfect his elocution, lung control, and overall level [34]. In addition to vocal training, youth is also correlated with maximum vocal output, the level decreasing with increasing age [35]. In this regard, Caesar was no longer so young (52) at the time of the civil war in 48 BC, and thus simulations of his battle speeches cannot assume that he was near the highest vocal levels possible (90 dB), as in the case of Whitefield, who was only 24 at the time of Franklin’s experiment. Because of this, a more moderate averaged SPL range of 74–80 dB is assumed for Caesar in the computer models.
Based on these simulations, we may shed new light on the acoustical plausibility of these two speeches, which will affect our interpretation of Caesar’s account. At Dyrrachium, we simulated rather pessimistic conditions, giving Caesar only a 6 dB boost above normal loud speech while assuming background noise of 45 dB, over 10 dB greater than that measured at some ancient Greek amphitheaters [47]. It can be seen from the simulations that, even under these relatively pessimistic conditions, Caesar could have plausibly spoken to all his 14,000 soldiers at once (leaving room for more!), as long as he did not speak with a “normal” vocal level of 74 dB. Since the pessimistic case is plausible, we do not need to stretch the bounds of the simulation by probing how loud Caesar might have been, how quiet his men might have been, or assuming the existence of atypical environmental propagation, such as refraction patterns via a temperature inversion that would have carried his voice farther. The historical details must be sorted out by historians, but acoustically it must be said that this speech seems physically valid based on the descriptions we have of the event.
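For readers who want a rough feel for the numbers, the sketch below applies simple free-field spherical spreading (a 6 dB drop per doubling of distance) to the Dyrrachium levels. It is a back-of-the-envelope illustration, not the simulation used in this study: it assumes the 80 dB source level is referenced to 1 m, ignores directivity, ground effects, atmospheric absorption, and crowd absorption, and reports signal-to-noise ratio rather than STI. The function names are hypothetical.

```python
import math

def snr_at_distance(source_db_1m: float, noise_db: float, distance_m: float) -> float:
    """Signal-to-noise ratio (dB) at a distance, assuming free-field spherical spreading.

    Assumption: source_db_1m is the speech level at 1 m from the talker.
    """
    speech_db = source_db_1m - 20 * math.log10(distance_m)
    return speech_db - noise_db

def distance_for_snr(source_db_1m: float, noise_db: float, target_snr_db: float) -> float:
    """Distance (m) at which the SNR has fallen to target_snr_db."""
    return 10 ** ((source_db_1m - noise_db - target_snr_db) / 20)

SOURCE_DB = 74 + 6   # loud speech plus the 6 dB boost described in the text
NOISE_DB = 45        # pessimistic background noise described in the text

for d in (10, 25, 50, 100):
    print(f"{d:>4} m: SNR = {snr_at_distance(SOURCE_DB, NOISE_DB, d):5.1f} dB")

print(f"SNR reaches 0 dB at about {distance_for_snr(SOURCE_DB, NOISE_DB, 0):.0f} m")
```

The full simulations account for the factors omitted here, so this sketch should be read only as an indication of how quickly the available signal-to-noise ratio shrinks with distance.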
In contrast, before Pharsalus, even assuming the louder value for Caesar and a relatively quiet noise value for his army, his MIA value decreases by an order of magnitude. The maximum width of his intelligible area is less than 20 m, while the battlefront stretched up to 3 km once the armies were engaged [44]. Even assuming his army was somewhat more compact before he spoke to them, there does not appear to be any acoustically plausible scenario in which he could be heard intelligibly by his entire force if they were all in front of him, arranged less densely, and on the move. Since the optimistic case does not appear plausible, there is no need to simulate greater background noise values. In both cases, there is clearly some uncertainty about Caesar’s exact STI value, vocal level, and background noise. However, by considering an implausible optimistic scenario or a plausible pessimistic scenario, it is possible to speak confidently about the general plausibility of an account even if we cannot know the precise intelligibility value at each point in the crowd.
Having made this acoustical point, however, it should not be inferred that Caesar’s speech as written did not happen; it is quite possible that he did speak something similar to the soldiers and officers nearby when he realized Pompey’s forces were extended far enough to engage them. The signal to engage could then have been conveyed by messengers, trumpet signal, or flag signals, all of which Caesar used during the Civil War [13]. In fact, as Pompey’s cavalry attempted to flank his right side, Caesar reacted as follows:
However, once [Caesar] had a look at the enemy formation described above, he feared a flanking attack by the mass of enemy cavalry circling around his right wing; he therefore rapidly drew individual cohorts out of the third line of his formation, placed them as a fourth line to oppose Pompey’s cavalry, and explained to them what he wanted them to do, making it clear that this day’s victory would depend on the bravery of these cohorts. At the same time he commanded the third line not to move forward and engage with the enemy without explicit orders from himself: when he wanted this to happen, he would give the signal with a flag.
This action seems to have been largely improvised, as Caesar recorded himself verbally explaining to his soldiers what they were about to do. Assuming that he could make himself heard by a single cohort of 500 men in formation, this seems reasonably in line with him riding behind his third line and speaking multiple times to multiple cohorts as he formed them into the fourth line which would counterattack Pompey’s cavalry, resulting in Caesar’s successful flank of Pompey, the rout of Pompey’s forces, and the end of the Civil War. The fact that Caesar ordered them all to coordinate this action by a flag signal again shows that he did not expect to be able to verbally communicate with his entire army, or even a portion of it, during the battle itself. However, his reported oratorical skill would have been useful in this and many other noisy battles where Caesar’s instincts and improvisational abilities led his forces to many such incredible victories and very few defeats.
Although this discussion has focused mainly on Caesar, clearly many such pitched battles between ancient armies occurred on fairly flat terrain without many significant reflecting surfaces, so these simulations may be seen as a proxy for any male speaker with the level and background noise conditions assumed here. This may be less useful for modern assemblies, which generally have electrical amplification, but it can be viewed as a lens through which to consider other ancient reported speeches before armies. However, for each situation, the orator’s vocal ability, training, and age should be considered, along with the likely background noise of the crowd listening.