This paper draws on an earlier book (with Evans and Higgins) entitled Bad Call: Technology’s Attack on Referees and Umpires and How to Fix It (hereafter Bad Call) and its various precursor papers [2]. These show why current match officiating aids are unable to provide the kind of accuracy that is claimed for them and that sports aficionados have been led to expect from them. Accuracy is improving all the time but the notion of perfect accuracy is a myth. It is a myth in science, where measurement of any kind is always associated with a statement of possible error, and the amount of error is itself uncertain, as the so-called ‘replication crisis’ in medicine and psychology shows. It is clearly a myth in sport because, to take obvious examples, the lines drawn on sports fields and the edges of balls are not perfectly defined. The devices meant to report the exact position of a ball (for instance ‘in’ or ‘out’ at tennis) work with the mathematically perfect world of virtual reality, not the actuality of an imperfect physical world. Even if ball-trackers could overcome the sort of inaccuracies related to fast ball speeds and slow camera frame-rates, the goal of complete accuracy will always be beyond reach. Here it is suggested that the purpose of technological aids to umpires and referees be looked at in a new way that takes the viewers into account.
2. Justice Not Accuracy. Also Continuity
In Bad Call, we argue that match officiating has always been flawed and will always be flawed if the standard is mathematical purity. Traditionally, however, this kind of exactness was not the aim; the aim was to run the game in a way that kept everyone reasonably satisfied that justice was being done. It was accepted that, since the match official nearly always had a better view than anyone else, and given the match official’s training and experience, no-one could do a better job of providing acceptable judgements. We argue that the match officials’ privilege has been eroded with the advent of television replays, especially slow-motion replays; these put the TV viewer in a better position than the match official to make a fair judgement, so long as the official has no access to the replays. Thus the fragility of match officials’ real-time judgements, and their occasional gross mistakes, have been revealed to a wide audience, causing a sense of injustice in sports fans and spoiling the games. Therefore, we (the authors of the earlier books and articles) recommended the introduction of video-referees and umpires with access to TV replays as aids to the on-field officials.
The crucial thing is that this is not, and should not be seen as, a technological fix for bringing about exact accuracy (inaccuracy will always be there) but a fix for obvious injustices. This philosophical distinction makes a huge difference to the way technological aids are applied to sports-officiating but it is a distinction that does not seem to be widely understood. A second, related philosophical principle is that the technologically assisted game should be as like the technologically unassisted game as possible; we should not be playing a completely different game when we move from the lower reaches of amateur sport to the highest level of professional sport, at least not in terms of the rules. These two principles can be summed up as Justice and Continuity (JAC).
One can see how JAC works with a single example: the skidding ball in tennis. It is (or at least, was) claimed that Hawkeye is more accurate than the human eye in certain circumstances because it takes into account the fact that a hard-driven tennis ball skids when it hits the ground. If the ball is very close to the back edge of the baseline, the human eye (and TV replays will make no difference), sees the ball bounce up from the end-point of the skid and, projecting backwards, humans see the ball as being ‘out’. Hawkeye, however, which projects the ball track forward, can show that the ball actually made contact with the baseline before skidding and was therefore ‘in’. JAC says that if the human eye and TV always see such a ball as ‘out’ then there is no felt injustice and the ball simply is ‘out’ for all practical purposes; it is only the shibboleth of accuracy that would cause us to want to say it was ‘in’. If in every game from the Sunday morning romp in the park all the way to non-technically assisted professional games such a skidding ball has always been counted as ‘out’ with no objection, then the technically assisted game should count it as ‘out’ too so as to maintain continuity with the rest of the sport.
A more recent event illustrates the same point. (It is proper to point out in reference to this incident that the author is a Liverpool fan and his analysis of the incident coincides with his loyalties; readers should, therefore, assure themselves that the argument is not a product of bias.) The event in question concerns the ‘non-goal’ that occurred about 20 minutes into the crucial Premiership game between Manchester City and Liverpool on 3rd January, 2019. The result was a win for City but, other things being equal, it would have been a draw if the disallowed goal had been allowed. It could well be (at the time of the final revision of this paper, undertaken with only one match left to play in the season) that this non-goal will mean that City rather than Liverpool win the Premiership this year.
The goal was disallowed after the application of ‘goal-line technology’—a technology which was criticised in the book, Bad Call, long before this incident, on the grounds that it is expensive and unnecessary, relevant to a vanishingly small number of cases compared to other refereeing errors, and brought in only because TV replays had highlighted goal-line mistakes and so, in all these high-profile cases, TV replays could have corrected them. In this instance the use of goal-line technology has driven a further wedge between the technologically assisted game and the traditional game.
The dispute concerned a clearance by a Manchester City player. Figure 1 shows a TV replay of the clearance and Figure 2 the graphic generated by goal-line technology.
It was claimed that goal-line technology showed that the ball had failed to clear the line by 11.2 millimetres subject to a 3.6 millimetre average accuracy. Here we are not questioning the accuracy of the judgement even though we do not know the extent of the scatter around the average error. Still, the presentation of the measurement to one tenth of a millimetre is bizarre given that we are dealing with painted lines on grass and a goal frame that would have to be a fine piece of engineering to preserve a front-to-back plane to within even a centimetre over the course of a game. It is also regrettable that the fact that this must be a virtual reality reconstruction is obscured by the ‘realistic’ presentation of the grass and the ball; the ball should be presented as a plain disk with fuzzy edges to represent measurement errors while the grass and line should be presented as geometrical blocks without texture, with the line having a fuzzy edge to express its real-world inexactness. Something more in the spirit of Figure 3 would be a more revealing way to present the outcome of the estimate made by goal-line technology. As it is, the way goal-line technology is presented misleads the public.
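To see why the unknown scatter matters, the following is a minimal sketch, not a description of how any goal-line system actually computes its output. It assumes, purely for illustration, that the measurement error is zero-mean and Gaussian, and it treats several hypothetical standard deviations (including one equal to the quoted 3.6 millimetre figure, which is an average and not a published standard deviation). The function name `prob_crossed` is invented here.

```python
import math


def prob_crossed(shortfall_mm: float, sd_mm: float) -> float:
    """Probability that the ball really did cross the line, given that it was
    estimated to fall short by `shortfall_mm`.

    ASSUMPTION: zero-mean Gaussian error with standard deviation `sd_mm`.
    The real error distribution of goal-line technology is not public.
    """
    z = shortfall_mm / sd_mm
    # Upper tail of the standard normal distribution: P(error > shortfall).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# The same 11.2 mm reading under three hypothetical amounts of scatter.
for sd in (3.6, 10.0, 20.0):
    print(f"assumed sd = {sd:4.1f} mm -> P(crossed) = {prob_crossed(11.2, sd):.3f}")
```

Under the first assumption the chance that the ball crossed the line is well under one percent; if the true scatter were closer to a centimetre, it would be above ten percent. A single "11.2 millimetres" figure says little until the scatter is stated.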
The main complaint, however, is that the use of goal-line technology here offends against JAC. It is hard to imagine that any TV viewer or video assistant referee would not award a goal after seeing the replay shown in Figure 1. It looks like a goal and in all games that do not use the technology but use TV replays instead, it would be a goal. Given the TV replay, the incident offends against both justice and continuity. In this case, Liverpool fans watching the TV replay will feel they have been robbed by the technology rather than seen an injustice remedied, while Manchester City fans will feel they have been extremely lucky not to be a goal down.
If sports administrators, commentators, and the viewing public could train themselves to understand officiating technology as aiming at justice not accuracy and at maintaining games’ traditions as far as possible, the technology could be employed in a very different and very much more efficient way; cricket and some other sports already come close. Under these circumstances on-field match officials would continue to make decisions in real time just as they still do in every game that is not technologically assisted. Then, under circumstances which might vary from sport to sport, the video-assistant would offer a judgement. The on-field official’s real-time decision would be taken as right unless the video-assistant could, quickly, show it was unambiguously wrong. This situation is signified by the acronym ‘RINOWN’, which stands for ‘Right If NOt WroNg’. Continuity with the non-technologically assisted game would be preserved because where there was no technological assistance the default position would be that the on-field official was right as is traditionally the case. Given no TV-replays, everything would be just as it has always been through the centuries and a difference with the technologically-assisted game would occur only where the on-field official made the kind of mistake that is obvious to the TV viewer.
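The RINOWN rule described above can be written down as a very small decision procedure. The following is an illustrative sketch only; the `Review` record and the `rinown` function are hypothetical names introduced here, and the judgement of what counts as ‘obviously wrong’ remains a human one that the code simply takes as input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Review:
    """Hypothetical record of what the video-assistant concluded."""
    verdict: str            # the call that the replay appears to support
    obviously_wrong: bool   # True only if the on-field call is unambiguously wrong


def rinown(on_field_call: str, review: Optional[Review] = None) -> str:
    """Right If NOt WroNg: the on-field call stands unless clearly overturned."""
    # No technological assistance: the official is right by default,
    # exactly as in the traditional game (continuity).
    if review is None:
        return on_field_call
    # A replay that is ambiguous, or merely suggests a different call,
    # does not overturn anything.
    if not review.obviously_wrong:
        return on_field_call
    # Only an obvious mistake is corrected (justice).
    return review.verdict
```

For example, `rinown("goal")` and `rinown("goal", Review("no goal", obviously_wrong=False))` both return `"goal"`, while `rinown("goal", Review("no goal", obviously_wrong=True))` returns `"no goal"`.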
What are the circumstances under which the video-assistant’s help would be invoked? It could be decided to invoke it only when the players challenged the on-field decision, or when the on-field official decided to ask for help, or when the video-assistant, operating autonomously and monitoring the game continuously, decided to tell the on-field official that they were wrong and have them change their call. Other things being equal, autonomous video-assistants seem best because they have the same view as the TV watcher and will be alert to exactly the same injustices, caused by the same obvious mistakes, as are experienced by the TV watcher; the on-field official will, by definition, miss these, and even the players will not always be aware of injustices that TV viewers may spot.
4. Some Examples
Let us offer some examples of how things could be done differently and better under the principles set out here. We’ll start with one that occurred only a couple of weeks before the time of writing. In one of the cricket test matches between Sri Lanka and England in November 2018, a diving catch was taken low to the ground towards the edge of the field. At that distance, no-one near the middle of the field, including the batsman and the umpires, could be absolutely sure that the catch was taken fairly and had not bounced into the fielder’s hands; sometimes even fielders are not quite sure. Therefore, the umpires decided to ask the video-assistant to run through some TV replays. Cricket uses its technology properly most of the time and umpires had to give what is called a ‘soft signal’ before the TV replays were examined. The soft signal represents their unassisted decision—‘out’ or ‘not out’—leaving the video-assistant to decide only if they were obviously wrong, the default being that they were right (RINOWN).
The interesting thing here is the remarks of one of the commentators. He argued that since the umpires were so far from the point at which the catch was taken they could not see whether it bounced or not, so they should not be asked to provide a ‘soft signal’ and the matter should be left entirely to the video-assistant. But here the commentator is being misled by the desire for accuracy rather than justice and continuity. Had there been no TV cameras the umpires, badly sighted though they were, would still have had to make a judgement (incidentally, it would almost certainly have included an element based on the trajectory of the ball, the fielder’s dive and the demeanour of the fielder, these things, like so many decisions, having nothing to do with exact accuracy). Furthermore, it is quite possible that the TV replays would themselves be indecisive but some decision would still have to be made, preferably as speedily as possible, and, under these circumstances, if it was made by the umpires, no injustice would have been done.
Incidentally, there are two kinds of technologically assisted decisions in cricket (and in some other sports). What we might call Type 1 examples are when the technology is simple, such as TV replay. In this case it is practical for the video-assistant to monitor the game continuously and warn the on-field official when they have made a mistake: ‘autonomous assistance’. Type 2 is when the technology is inherently slow, as in the case of ball-tracking and ‘ultra-edge’, which generates an oscilloscope trace of the sounds made as the ball passes the bat, such as would indicate an ‘edge’ that might lead to a close catch decision (the ball hit the bat and did not just bounce off the pad or miss everything). Type 2 uses of technology invite the ‘player-challenge’ approach to video-assistance rather than autonomous video-assistance simply because the time taken to generate the images means that play cannot be monitored continuously. It may be that as technology improves, Type 2 will turn into Type 1 and make it possible to have autonomous video-assistance in every case.
In cricket the difference is clear. Challenges are used for ‘lbw’ (this stands for ‘leg before wicket’ and requires an estimate of whether it was the player’s leg rather than the bat that prevented the ball from colliding with the target that the ‘bowler’ aims for, the three vertical ‘stumps’) and for close catches. Other decisions often involve the umpire asking for help, which an autonomous video-assistant could provide without being asked: more distant catches, run-outs (similar to a runner not making his or her ground at baseball), stumpings (where the batter accidentally steps ‘out of his ground’ and allows the wicket to be broken with the ball), boundary saves, and boundary-crossing by the ball ‘on the full’ (in cricket the ‘boundary’ is marked by a rope or similar; if the ball rolls across it then four runs are scored, but if it flies across it ‘on the full’, without hitting the ground, then six runs are scored). At the time of writing there is a debate in cricket about whether the calling of ‘no-balls’ (when the bowler’s foot crosses the legal line) should be passed to an autonomous video-assistant, and it seems likely that this will happen.
4.2. Ball Trackers and Tennis
A crucial lesson that has already been mentioned and needs to be widely learned is that technological devices such as ball-trackers do not show what actually happened but only a statistical estimate of what might have happened. That estimate is subject to unknown errors, often quite large, which are concealed by the exact-looking virtual reality of the reconstruction. This is the argument set out and illustrated in Section 2, above. Thus one may hear cricket commentators complaining that there is something wrong when two lbw challenges, decided on the basis of almost identical ball tracks, result in two different decisions, one ‘out’ and one ‘not out’, depending on what the umpire had decided in the first place. In cricket this happens because there is, sensibly, a recognised margin of error when it comes to ball-trackers, inside which the umpire’s initial decision over-rules the technology. Therefore, the same ball track can lead to different outcomes depending on what the umpire thought in the first place. Certain cricket commentators complain that this is unjust because they do not seem to understand that the same ball-track reconstruction can mask two different actual ball-trajectories. We have to get rid of the idea in people’s heads that technological reconstructions present an exactly accurate account of what happened rather than an envelope of possibilities.
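The logic of a recognised margin of error can be sketched in a few lines. This is an illustration under stated assumptions and not cricket’s actual review protocol: the function name, the single signed offset, and the margin value used in the example are all hypothetical.

```python
def lbw_review(predicted_offset_mm: float, umpire_call: str, margin_mm: float) -> str:
    """Resolve an lbw challenge from a ball-tracker estimate.

    predicted_offset_mm: hypothetical signed estimate of where the ball would
        have passed relative to the edge of the stumps; positive means it is
        projected to hit, negative means it is projected to miss.
    umpire_call: the on-field decision, 'out' or 'not out'.
    margin_mm: width of the zone in which the tracker cannot be trusted
        (an assumed value; the real figure depends on undisclosed error data).
    """
    # Inside the margin the reconstruction cannot separate 'hit' from 'miss',
    # so the umpire's original decision stands.
    if abs(predicted_offset_mm) <= margin_mm:
        return umpire_call
    # Outside the margin the tracker shows a clear result either way.
    return "out" if predicted_offset_mm > 0 else "not out"


# The same reconstructed track, two different on-field calls.
same_track = 5.0
print(lbw_review(same_track, "out", margin_mm=20.0))
print(lbw_review(same_track, "not out", margin_mm=20.0))
```

The two calls print different results for an identical track. That is not an inconsistency: within the margin, the reconstruction is compatible with both a ball that hit and a ball that missed.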
Tennis seems to be the sport where ball-tracking is most misused even though the public like the current system and the players have accepted it, some very reluctantly. In tennis the outcome of ball tracking is presented as though it can adjudicate ‘in’ and ‘out’ to an indefinitely fine margin, with everyone reading the virtual reality reconstructions as reality itself. This, while it might seem like a ‘bit of fun’, is dangerously misleading in an era where fake news stories promulgated in social media are hard to distinguish from news from trustworthy sources; here tennis is encouraging the public to accept fake news instead of honing their ability to separate the credible from the incredible. Once more, JAC and RINOWN provide an easy solution. It is that the umpire’s initial decision should stand unless the ball-tracker shows that there was a clear mistake. In this case ‘clearness’ is a technical matter which depends on the horizontal component of ball speed and the frame rate of the ball-tracking cameras. When we, working with almost no data, tried to estimate the right kind of error margins in tennis, we thought that nothing within about three millimetres of the edge of the line was secure and that within this margin the umpire’s decision should be final. But this ‘three millimetres’ depends on many factors that, currently, only the ball-tracking companies know and which will be changing all the time as the technology changes. Unfortunately, this material is kept from the public on grounds of commercial secrecy. This deliberate misleading of the public sets a bad precedent.
The same considerations apply to rugby. Unlike American football, the rule for scoring a ‘try’ in rugby is that the ball has to be carried over the line and placed on the ground with momentary downward pressure. An opposition team can stop the ball touching the ground even though it is over the line; this is referred to as the ball being ‘held up’. Sometimes there is a huge pile of bodies over or around the edge of the line with the ball invisibly buried somewhere toward the base of it. In the technologically assisted game the referee nowadays calls for the video-assistant to try to untangle the decision. Under JAC the referee would have to make a decision, as they have to when there is no technological assistance, and the video assistant’s job would be to say whether that decision was obviously and visibly wrong; if not, the decision would stand. This would remove the grounds for argument about the true outcome and the sense of injustice, while markedly speeding up the decision-making process.
What is known as VAR, or the ‘video assistant referee’, is currently being brought into football (known in the USA as soccer) at an ever-increasing rate. We like to think that this is partly a result of our arguments and analysis of three seasons of English Premier League football in our book, Bad Call, but it is hard to find any acknowledgement of this work. In Chapter 7 of our book we put forward a scheme for introducing TV replays into football while minimising delays in the game. The central principle is, once more, RINOWN—the referee makes the decisions just as now and that decision is the default unless it is obviously wrong to the TV viewer; this, of course, also satisfies JAC. Unfortunately, VAR seems to be being introduced into football in a variety of different ways and its future promises to be attended by a lot of confusion.
5. Goal-Line Technology
As an example, consider, once more, goal-line technology. This is a technological device that tells a referee via a wristwatch-type indicator whether the ball has fully crossed the goal-line in case of a disputed goal. As explained in Section 2, exactly what is meant by ‘fully crossed the goal line’ is difficult to say given that goals and goal lines are not exact, yet no comprehensive estimates of possible error, which would include the scatter rather than just an average, are provided. To repeat, the clamour over goal-line technology was caused by well-publicised mistakes in important televised matches, notably an England World Cup knockout match against Germany, but in all these cases the mistake was obvious on the TV replay. This makes it difficult to see why any more advanced technology was called for unless the pressure emerged from the idea that technology could provide exactness. In our analysis we showed that over three seasons of English Premiership football the number of disputed goals that might be affected by goal-line technology was about 11 whereas the number of potential goal-related mistakes arising from flawed penalty, offside and red card decisions was well over 300. Of these 11, the number which would not have been obvious to an appropriately located TV replay camera would have been a very small subset.
It is obvious that the large majority of these ‘well over 300’ mistakes could not possibly have been settled by an exactly accurate technology, since such mistakes often depend on judgement of intention, in the case of penalties and red cards, and of ‘interference with play’, in the case of offside, and there is no foreseeable technology that can measure these things. It is obvious, then, that starting with the notion of accuracy as the foundation of the introduction of VAR in football is going to lead to confusion. Starting with the notions of justice, continuity, and RINOWN will resolve it.
Our recommendation is the same as for other sports. On-field referees make decisions and video-assistants monitor the game in the same way as TV viewers monitor it—with the same or better access to replays. Video assistants call play back when, and only when, an obvious mistake has been made—that is, a mistake that is obvious to anyone with the benefit of TV replays including viewers at home and video assistant referees. Given that the mistake must be obvious, this kind of judgment is quick and aligns with the TV viewer at home, thus eliminating injustice. We suggest various ways of halting and restarting the game under different circumstances in Table 7.2 (p124) of Bad Call but this is something that can only be refined with experience. The crucial arguments here are about resolving confusion about what decision-aid technology in sport is for, and the arguments are necessarily philosophical, at least in part.