In the debate about whether to use technology in sports officiating (and if so, how much), it is widely agreed that adding technology would improve accuracy relative to unaided human officiating, but at the loss of some sort(s) of value. Collins, Evans, and Higgins [1] worry about the degradation of the officials’ authority. Ryall [2] considers the loss of entertainment value that would follow the reduction in controversial missed calls. Johnson and Taylor [3] value the lessons in humanity and humility provided by having to suffer bad calls.
I disagree that such values, either separately or together, count strongly enough against the value of accuracy that we should forgo more accurate (technologically assisted) officiating, where and when it’s available, in favor of (less accurate) human officiating. But setting this matter aside, the current debate is premised on a false dilemma. Human officiating need not be less accurate than technology; thus, the choice between maximal accuracy and human officiating is not an exclusive one. The way through the horns of the dilemma lies in appreciating that accuracy is relative to criteria that we are free to specify. We may thus freely specify them in ways that unaided human beings can reliably judge.
In what follows, I argue that sports—much like science—rely on classification under concepts, the criteria for which we choose based on our values and interests. Accuracy, in turn, requires correctly applying the criteria. This task is made more or less difficult for human beings to accomplish depending on how fine- or coarse-grained the criteria are. So, in effect, we choose which facts will matter, as well as how to track them. If we insist on tracking them with unaided human officials, then we should choose facts that such people can accurately track. I also suggest some possible changes that might be made to existing sports if we take seriously the idea of tracking those facts that human beings can reliably judge. Whether these suggestions are welcome or repugnant, they can serve either as guides for how to change our sports or as an indictment of the desire for human officiating at the expense of accuracy.
2. A Lesson from Astronomy
When Clyde Tombaugh discovered Pluto in 1930 it was, by the definition of ‘planet’ in use at the time, the ninth known planet: it was a massive, spherical, non-luminous body that orbited a star, our Sun, like the other known planets, albeit on an unusually inclined orbit. Although smaller than our Moon, Pluto was a significant body orbiting our Sun, and so for 75 years it remained the ninth of nine planets, the smallest of what astronomers considered the most basic components of our solar system.
All of that changed in 2005 when Caltech astronomer Mike Brown discovered Eris. Initial measurements indicated Eris was both bigger and more massive than Pluto—surely, the tenth planet!1
What followed was not celebration, however, but crisis. On the one hand, the case for counting Eris as a planet was clear. Pluto was a planet, and Eris was at least as big as Pluto; ergo, Eris was a planet. But at the same time, Eris’ discovery undermined the case for counting Pluto as a planet in the first place. Until then, Pluto was unique: small, yes, but notably bigger than any of the thousands of other bodies in the Kuiper belt beyond Neptune. But Eris showed that Pluto was not all that special after all, in either size or location.
Eris’ discovery threw the astronomical community into dispute over one of its most basic concepts: what is a ‘planet’? The dispute was ultimately settled in 2006 by the International Astronomical Union’s adoption of a new definition of ‘planet’. To the old definition was added the criterion that a planet must have sufficient mass to have “cleared the neighborhood” of its orbit: roughly, that any other object in its orbit will orbit it. Since Pluto, as was long known, shared its orbit with many other objects of comparable size to its own satellite Charon, it failed to meet this new criterion and was defined out of existence as a planet. Pluto was no longer a planet; Eris would never be one. And instead of being credited with the discovery of the tenth planet, Brown is known instead as the man who killed Pluto.
Pluto’s undoing was probably inevitable, but it was prompted by advances in technology. Brown’s discovery of Eris (and dozens of other nearly-Pluto-sized bodies) was made possible by improvements in telescopic design, image capturing, and data analysis. These improvements in studying the stars ultimately showed that how scientists classified the objects orbiting our Sun was, for astronomical purposes, wrong. But the error here is instructive. By the time Pluto’s planetary status came into question, its size, shape, mass, and orbit were long known. The question was not whether Pluto’s physical parameters met the criteria for planetary status, but rather what the criteria should be. Crucially, what the criteria should be is not an empirical question but a normative one, one that bears on the interests and aims of scientists rather than measurable features of the world.
Revising the definition of ‘planet’ was fraught and controversial. In doing so, astronomers had to navigate competing interests of scientific utility, conceptual naturalness, and cultural familiarity. One of the biggest objections to any official revision was that it would do violence to the colloquial use of the term. Schoolchildren since the 1930s learned the names of nine planets, and some worried the familiarity with astronomy that this promoted would be undermined by a revision. But how a scientific concept is employed by non-scientists does not necessarily track the sorts of natural properties that scientists want to study and which constitute the reason for considering objects as belonging to the same kind in the first place. So, there is the opposing concern that technical concepts that also have use in ordinary language need to remain tethered to their scientific origins or risk further confusing a public that already struggles to understand scientific theories.2
Classification under concepts is a fundamental scientific practice and problem. Ever since Galileo first looked through his telescope to discover the moons of Jupiter, however, advances in our knowledge and technological capabilities have sometimes put pressure on those concepts and the roles they play in scientific practice. The wealth of information provided by newer, better technology poses problems as well as opportunities for scientists and their conceptual scheme. Accommodating this information into theories is how science (eventually) progresses.
In Pluto’s demise, there are lessons to be learned for sports theorists as well. Sports, as well as science, depend on classification under concepts. Often these concepts are put under pressure by new information discovered through advances in technology. As in the case of Pluto and Eris, sometimes sports face difficulties concerning what the criteria should be. This is particularly the case in sports officiating, where advances in technology have shown us the sometimes-wide gap between how sports define the criteria and how officials apply them. The job of sporting officials is not unlike that of scientists: they are, as it were, observing, measuring, recording, certifying, and curating the “data” produced by the athletes.3 And it is in this respect that innovations and improvements in technology raise both problems for sports as they are officiated and opportunities for how officiating can be improved.
Thus far, the debate about what role technology should play in sports officiating has largely centered around one question: should sporting officials be given use of (or even replaced by) technology to improve officiating accuracy, or not?4 More technology, or not? Previously, I have argued for more, maybe much more, as long as it improves accuracy [6]. The counterpoint has been, not so much that more technology would not be more accurate, but that accuracy itself is overrated [1], or that the costs of introducing technology may outweigh the benefits [10], or even that the inaccuracy of human officials improves sports in a way [3].
There has been much less discussion about what accuracy in officiating means. Largely, it has been assumed an official’s call is accurate to the extent that it correctly applies the criteria: accuracy means “getting it right.”6 This takes for granted that there is a right way to get it: that there are objective facts about the players or pieces of the game that can be correctly or incorrectly judged by officials. And indeed, there are often such facts and officials often do judge them incorrectly. A baseball pitch that passes through no part of the strike zone and is not swung at by the batter has not met the criteria to be called a strike. It ought not to be called a strike. An umpire who calls it a strike anyway would be wrong to do so; they would be applying the criteria incorrectly. But even in the relatively well-defined case of balls and strikes, the criteria can be problematic in at least two ways. First, some degree of vagueness is likely ineliminable in defining any criteria. Often, it will be unclear whether the criteria apply in a case or not; in baseball, for example, the criteria for balls and strikes are not exhaustive: some pitches are technically neither balls nor strikes.7 Second, it can be practically impossible for officials to apply the criteria to the events as they happen. Primarily, this will be due to the simple difficulties of seeing all of what one is supposed to judge or seeing it clearly enough. The human being as a measuring device is rather limited under the best of conditions, and conditions are not always best. Moreover, humans have cognitive biases in addition to their perceptual limitations that further complicate the already-difficult task facing officials [11].
Technology can at best help address the second problem by seeing what the human eye cannot see or seeing it free from biases. This is not an insignificant benefit, but as is well known, it comes at some costs. Largely unexplored, however, is the role that poorly defined criteria play in making officiating too difficult for human beings to do well. It is partly because of how the strike zone is defined—“that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow below the kneecap, […] determined from the batter’s stance as the batter is prepared to swing at a pitched ball”—that calling balls and strikes is as challenging as it is. Tracking a baseball with the naked eye is not by itself too much for the human perceptual system to handle; catchers almost always catch the ball, after all. What is not so easy is telling whether any part of a spinning, speeding ball passes through any part of a vaguely defined, invisible, three-dimensional region of space. Human eyes are not built to see invisible lines. Cameras can be and have been; it is in virtue of them that we now know just how wide the gap is between where the strike zone is supposed to be and where umpires think it is.
The problem can be even worse for officials tasked with judgments of a player’s intention, such as whether a foul was flagrant or whether a player’s conduct was unsporting. In such cases it is not so much that we lack the perceptual abilities to determine the facts; rather, what counts as the relevant fact is poorly or vaguely defined. It is not clear what the criteria even are, so questions of whether an official’s call is accurate are unanswerable.
Everyone agrees that we can do better; in particular, (almost) everyone agrees that using technology would improve accuracy. But, recall, the debate has largely taken for granted that “accuracy” means accuracy with respect to the current criteria. It is in this context that the desire for more accurate officiating pushes in favor of advanced technology and against human officials.
Acknowledging the roles that definitional vagueness and ambiguity play in making the job of officiating more difficult, however, presents us with a unique opportunity: it allows us to consider how our sports, like our sciences, might be improved by clarifying or refining the criteria we use to measure and judge them. If we take seriously the costs of officiating technology (both the financial costs and the potential negative effects on entertainment value), it is worth exploring ways in which we might still aid the goal of maximal accuracy in officiating by refining our criteria to make them simpler and/or easier for human officials to adjudicate. In much the same way that the astronomical community made studying astronomy easier by redefining ‘planet’ with clearer, more easily measurable criteria, sports might likewise improve accuracy and make the jobs of officials easier by redefining the phenomena they are tasked with judging.
3. The Challenge of Sports Officiating
Sporting officials have both epistemic and judicial duties. As Royce aptly states, “[r]eferees have to decide rather than discover what to do, and not merely what happened. But we expect them to decide what to do partly on the basis of discovering what happened, and to interpret and apply rules in relation to what happened when arriving at their decision” [9] (p. 62). The principal difficulty facing officials is not deciding what to do, as such, though certainly applying the rules can be challenging even when it is clearly known what happened.8 Rather, officials often don’t know what happened, either because they did not or could not see the relevant facts, and so cannot decide what to do appropriately within the rules.
But what exactly counts as what happened? Not every fact is relevant to the athlete or the official. No player or official is concerned to know or measure the average number of kernels in boxes of popcorn sold during the game, for instance. Even many facts that affect or partially constitute the relevant events are not relevant per se. The exit velocity of a batted ball is in part determinative of whether it leaves the yard for a home run. But the speed itself is irrelevant; what matters to players and officials alike is only where the ball ends up. The simple reason for this is that the rules of baseball nowhere make reference to exit velocity. In short, the rules specify both which sorts of facts are relevant to the game and what officials are supposed to decide given what happens in the set of relevant facts.
Not only do the rules specify which facts matter, they often specify what counts as a fact. Whether a batted ball leaves the field of play is an empirically discoverable fact, but it is not a natural fact.9 That is, the concepts of a “batted ball” or the “field of play” are not to be found in any natural scientific theory; they are artificial inventions of the rules of the game. Our best physical theories recognize hard bits like protons and electrons, but not hardball; they acknowledge electromagnetic fields, but not ball fields. Sporting facts are social facts, as Borge argues, in that “the attitude we take towards the phenomenon … is partly constitutive of the phenomena” [14] (p. 356).
So, in deciding what the rules of the game should be, we also decide what the relevant sorts of facts will be that we want officials to pay attention to, as well as what officials should do if certain facts obtain or certain conditions are met. That is, we decide what the criteria will be.
Our decisions are, in a sense, arbitrary and unnecessary [15]. There is no moral or metaphysical requirement that we have foot races with hurdles, for example.10 To justify having hurdles in a race, we need only insist that we want to see a race like that. We are not wrong to want such races, nor are we wrong to prefer races without hurdles. Choosing what games to play is exercising our autonomy.
That said, we might have reasons to prefer some rules and some criteria over others. Once decisions are made about what the relevant sorts of facts are, there will be better and worse ways of organizing, playing, and officiating games. That is, once it is decided that the first person to reach the line is the winner, it will be true that running will be a better tactic than walking, ceteris paribus. Likewise, once it is decided that the official’s job is to determine which person reaches the line first, it will be true that marking the line clearly and visibly with chalk or paint (and conducting the race in the daylight) will be a better choice than, say, using invisible ink or a single thread of spider silk (or running the race in total darkness). That is, once we decide what the relevant facts are, we have decided that those facts matter to the game. If we then adopt unreliable or inconsistent methods for tracking those facts (when more reliable or consistent methods are available), we make bad choices given our prior decision that those facts matter.11
The point here is that, since we not only define the game but also the criteria by which it will be measured and judged, we are free to set those criteria wherever (and define them however) we like. So, the central problem for officiating in sports (deciding what to do on the basis of what happened) can be recast in this light: what sorts of happenings should we keep track of, and how should we track them? What should the relevant criteria be? As with the question of what the criteria for ‘planet’ should be, this is not primarily an empirical question; answering it requires consulting our values.
In considering potential rules or rule changes in sports, we might discover what our values are through reflective equilibrium. But we might also insist on values and let these dictate how we craft the rules. One significant value is justice or fairness.12 Another is accuracy. Together, these push strongly in favor of using the most accurate means of officiating without regard to concerns about the impact on gameplay or spectator enjoyment.13 We might also value elegance, the “flow” of the game, a sense of drama, or more to the present concern, the role of human officials. We need not have an argument for insisting on human officials, any more than we need an argument for why the 110-m high hurdles should have hurdles. Indeed, I think no such argument is possible.14 We need only insist that we want it that way. As things stand, valuing human officiating cuts against the interest in “getting it right”, in large part because getting it right given the current criteria is often too difficult for unaided human officials to accomplish on their own and requires technological intervention.15 Thus, the central question of the debate: more officiating technology or not?
This question, I contend, presents a false dilemma between accuracy, on the one hand, and a human-officiated game, on the other. Human officials could be as accurate as we liked if only we decided that the relevant facts were ones that human beings are well-suited to measure. However, we have not; instead, our sports have largely been defined in quasi-scientific terms, as if the playing field were a Cartesian plane and the players and equipment idealized solid bodies with discrete outer envelopes. Thus, sports have become scientized experiments studying the movement of bodies in bounded but infinitely divisible spaces across infinitely divisible times where the margins between success and failure might resolve down to the millimeter or the thousandth of a second. These are the decisions that have been made about what count as the relevant facts in sports. Given this, the choice to use human beings as measuring devices is a bad one (human beings are not well-suited to resolve the position of bodies in motion down to the millimeter or the thousandth of a second), and the choice to continue doing so only gets worse as technology gets better.
But the sports we play and the facts we decide are relevant are not forced on us. We do not have to insist on both using criteria that cannot be judged accurately by human officials and using human officials. If we want to insist on human officials, then, it is worth considering how sports might be designed or modified not just for humans to play, but also for humans to judge excellently.
4. What Sports Designed for Human Officials Might Look Like
In general, the criteria used in sports are too fine-grained for humans to judge nearly as accurately as technology can. Baseball umpires are, by the most optimistic estimates, only 90% accurate in calling balls and strikes.16 While this might sound impressive, we only know this because of the already-99%-accurate PITCHf/x system Major League Baseball uses in every ballpark to evaluate its umpires. Moreover, the nine-percentage-point difference in accuracy between human umpires and PITCHf/x is enormous: for context, MLB umpires missed more than 34,000 ball-strike calls in 2018 alone [17]. And as determinations of sporting facts go, ball-strike calls are relatively simple: the umpire, from a static position, need only track one object moving directly toward him/her with the plate and the batter’s stance to help indicate the strike zone. Things are much more complicated and difficult in other sports such as (American rules) football, basketball, or soccer/football, where officials themselves are in motion along with 10–22 players and a ball, and where the sorts of determinations that need to be made concern spaces and times measured in fractions of inches or seconds. In general, we should expect human officials to perform relatively poorly in judging these events.
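To give a rough sense of scale for the gap between 90% and 99% accuracy, the arithmetic can be sketched as follows. This is an illustration only, resting on an assumption the text does not make: that the 34,000 missed calls correspond to the optimistic 90% rate. Since the missed-call figure is a lower bound and the accuracy figure an upper bound, the real numbers would differ. The function names are hypothetical.

```python
def implied_total_calls(missed_calls: int, accuracy_pct: int) -> int:
    """Total calls implied by a miss count and an accuracy rate in whole percent.

    ASSUMPTION (illustrative only): the miss count and the accuracy rate
    describe the same set of calls.
    """
    return missed_calls * 100 // (100 - accuracy_pct)


def expected_misses(total_calls: int, accuracy_pct: int) -> int:
    """Misses expected over total_calls at the given accuracy rate."""
    return total_calls * (100 - accuracy_pct) // 100


human_misses = 34_000                           # figure cited in the text for 2018
total = implied_total_calls(human_misses, 90)   # calls implied at 90% accuracy
system_misses = expected_misses(total, 99)      # misses over the same calls at 99%

print(total, system_misses, human_misses - system_misses)
```

Under these assumptions, a difference of nine percentage points translates into a tenfold difference in the number of missed calls, which is the sense in which the gap is enormous.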
If we define the relevant facts more coarsely, however, we can reasonably expect that human beings will be correspondingly more accurate at determining them. For example, rather than making relevant precisely where a ball leaves the field of play, or precisely where it was positioned when a player was downed, we could make relevant only whether the ball (or player) crossed a certain clearly marked threshold.
Consider: gridiron football is often called “a game of inches”, but the field is measured in yards, lines are drawn at intervals of whole yards, and statistics are recorded only in whole yards. Teams have four downs to advance the ball ten yards. If they succeed, they earn another four downs. Yet despite this, the ball can be “spotted” (placed) at any point on the field, meaning that if the ball carrier is tackled between yard lines, the ball will be placed there to begin the next play; territory can be gained or lost in any fraction of a yard (or foot, or inch). There are no markings between the yard lines, though, and moreover only the lines marking five-yard intervals are marked all the way across the field. When the ball carrier is tackled outside of the hash marks, the ball is subsequently moved (by being tossed from the official who picks it up to an official standing at the nearest hash mark) and placed at the same yardage (which can be anywhere between two yard lines). And when there is a close call as to whether the ball has been advanced 10 yards, since there is often no yard line to go by, a comically low-tech ritual occurs. The chain crew (a pair of officials bearing poles with a 10-yard-long chain between them) marches out from the visiting team sideline to the hash line where the ball is spotted, which can be more than halfway across the field. One pole is supposed to mark the previous spot of the ball and the other the line to gain a first down. A marker is attached to the chain denoting the point at which the chain crosses a five-yard line, so as to calibrate its placement upon moving it across the field. All of this takes time, and every step of the process is an opportunity for introducing error; the initial spot of the ball could be off, the poles could be misplaced, the chain could kink or stretch, etc. Each movement of the poles and chain is based solely on an individual’s estimate of where the ball (or the pole) was relative to the sideline. But, recall, these are estimates of where things are supposed to stand precisely.
The farcical nature of this has not gone unnoticed and regularly prompts calls for advanced technology to more accurately track the exact position of the ball [18]. But the technology isn’t ready yet, nor is it necessarily needed. Nearly all such measurements could be eliminated by redefining ‘the spot’ as the last yard line the ball carrier contacts in the direction of travel from the previous spot.17 That is, in a game measured only in whole yards, we could stipulate that movement of the spot can occur only in whole yards, whether forward or backward.18 As a result, the precise location of the ball would seldom be of concern; it would only matter whether the ball carrier makes contact with the next yard line. If he comes up short, clearly landing somewhere between yard lines (which will be true in the vast majority of cases), there will be no question as to where the ball should be spotted (namely, the last yard line contacted). It is very easy for a human official to be mistaken by an inch or two about a ball’s location; it is much harder to be mistaken by three feet.
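The proposed redefinition of ‘the spot’ can be stated as a simple rule, sketched here under stated assumptions. The function name `coarse_spot` is hypothetical, positions are modeled as yards along one axis of the field, the previous spot is assumed to lie on a whole yard line, and “contact” with a yard line is taken to mean reaching or crossing it. None of these details come from the proposal itself; they are one way of making it concrete.

```python
import math


def coarse_spot(previous_spot: int, carrier_position: float) -> int:
    """Last whole yard line reached, moving from previous_spot toward carrier_position.

    ASSUMPTIONS (illustrative only): previous_spot is a whole yard line, and a
    yard line counts as contacted once the ball carrier reaches or crosses it.
    """
    if carrier_position >= previous_spot:
        # Moving forward: the last line reached is at or below the position.
        return math.floor(carrier_position)
    # Moving backward: the last line reached is at or above the position.
    return math.ceil(carrier_position)


two_inches = 2 / 36  # two inches expressed in yards

# A forward gain that ends between the 26 and the 27 is spotted at the 26.
print(coarse_spot(20, 26.7))
# The same play judged two inches long or short still yields the same spot.
print(coarse_spot(20, 26.7 + two_inches), coarse_spot(20, 26.7 - two_inches))
# A loss that ends between the 17 and the 18 is spotted at the 18.
print(coarse_spot(20, 17.4))
```

On this sketch, an official’s error of an inch or two changes the result only when the ball carrier goes down right at a yard line, which is the rare case; everywhere else the coarse criterion absorbs the error.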
There are other changes we might effect to make officiating easier for humans. Many of these will involve physically altering the playing space to make boundaries more salient. Here I have in mind what Collins, Evans, and Higgins [1] call “level 1” devices for “capturing” events, such as the bails in cricket, the bar in high jump, or the cup in golf. The falling of the bails constitutes the breaking of the wicket; the falling of the ball into the cup constitutes completing the hole. It is normally a simple matter for a human official to judge whether the bails have fallen or the ball is in the cup; indeed, it is hard to even imagine a scenario where human perception alone would be insufficient to decide whether the bails or a high-jump bar have fallen and where some more advanced technology would do better. In such a case, unaided perception would be no worse than more advanced technology.
We could introduce similar methods for capturing events in other sports. For instance, in professional tennis, line calls are still made by humans but challenges are settled by consulting the Hawk-Eye track estimating system. The speed of the game makes it quite difficult for the human eye to tell whether the ball bounces in (which includes on the line) or out; if any part of the ball touches any part of the line, the ball is in. But such calls would be much easier to make with the eye alone if tennis were more like squash. In squash, lines are out and the bottom half-meter of the front wall is covered by “the tin”, the upper edge of which is beveled. The beveled edge of the tin is also the line above which shots must hit the front wall. Thus, when shots hit on the bottom line (out), they hit the bevel and ricochet at an unmistakably different angle than they would off a flat wall; the carom and the sound of the ball striking the tin clearly indicate the shot was out. Tennis could be modified analogously, by changing the criteria to make lines out of bounds and physically modifying the lines to cause the ball to rebound differently. The resulting sport would hardly ever require a system as sophisticated as Hawk-Eye to settle in/out questions. Whether such a sport is one we want to play or watch is another question.
Many will react to these proposals or others in the same vein with incredulity or revulsion. So be it. (Indeed, I think tennis modified as suggested would be ridiculous, though I am quite partial to the idea of quantum football.) These are not proposals to make sports better tout court. Rather, I offer these as potential examples of what sports would have to look like if we view both accuracy and human officiating as desiderata for sports. It is a false dilemma that we necessarily need to choose one or the other, but this is not to say there aren’t tradeoffs. To have both would require making changes to how our sports are defined and played, to make the relevant facts coarse enough to be reliably judged by unaided human perception. Those who do find such changes repugnant should consider these proposals as the sort of absurdity to which a commitment to both desiderata reduces. For my part, insofar as I take the principal value in sports to be the display of excellence by the players, I would prefer even more fine-grained criteria to allow for even finer margins of measurement and comparison. At the same time, I think the arguments in favor of maximal accuracy in officiating are compelling and conclusive. As a result, I place no value whatsoever in human officiating as such: if it is the best we can do given our resources, so be it; if we can do better, let us do so.
Others may prefer the aesthetics of sports officiated by all and only humans. As I have said, it is not wrong to want it so in and of itself. Indeed, if given the choice between an impersonal technological system and an equally accurate human official, I have no objection to those who would prefer the human. I wonder, though, what choice the proponents of human officiating would make between an often-inaccurate technology and a nearly perfect human official. I have a suspicion that the supposed good-making features of error-prone officiating would disappear if it were fallible robots in contrast to nearly-infallible humans.
In any case, this is not a choice we are likely to ever face. Our sports are not likely to be revised radically so as to bring the criteria for judging them to well within the limits of human perception. I would urge those who admire the undeniable skills of human officials, and who appreciate the excellence that they, too, often display, to consider incorporating the fact-determining tasks of human officials into a sport of its own. In so doing, the inherent perceptual limitations that human officials face would become a feature, not a bug, of the contest, and the resistance to allowing them the use of technology would be better motivated on the grounds that doing so would make the game too easy. After all, choosing what games to play is exercising our autonomy.