1. Introduction
Content transfer is the established paradigm for the preservation of audio recordings today—as opposed to the preservation of the ‘original’, that is, a magnetic tape, a wax cylinder, or any analog or digital audio carrier. The fast and irreversible decay that characterizes audio carriers makes the ‘preservation of the original’ a hopeless solution. Transferring the content from the original carrier to other carriers eliminates the contingent problem of carrier decay. The process of content transfer involves the extraction of the audio signal from the original (or source) carrier. For a complete overview of the audio preservation process, see [
1]. In order to perform signal extraction, it is crucial to have:
This article focuses on magnetic tapes. On a magnetic tape, acoustic information is organized in ‘tracks.’ There are several possible track configurations (
Figure 2). In [
2], Bressan and Hess present an overview of possible standard and non-standard configurations. They also discuss the importance of determining the correct configuration before signal extraction, in order to choose and set the equipment well. With the exception of one-sided tapes (e.g.,
Figure 2a), where all the tracks are meant to be read in the same direction, most tapes will show two sides (e.g., standard track configuration in
Figure 2b, non-standard track configuration in
Figure 2c). The playback equipment must be compatible with the tracks on the tape. For example,
Figure 1b shows the detail of double stereo tape heads. These heads are compatible with double stereo tapes (
Figure 2b).
When we work with a digital signal, a number of signal manipulations become possible–whether the signal was
digitised (analog-to-digital transfer) or whether it was extracted from a digital carrier and transferred to a digital workstation (digital-to-digital transfer). Easy manipulations include normalization, equalization, trimming, or the reversal of a full track. In this article, we focus on the last one, that is, reversing a signal head to tail, equivalent to reading it “backwards” (
Figure 3). Because a digital track can be reversed in this way, all tracks on a two-sided tape could be extracted in one pass, regardless of their intended direction, and the tracks read backwards could subsequently be reversed in the workstation. This is an interesting option, because the total digitization time for each tape is reduced by 50%. In small and, especially, in large digitization projects, this means hundreds of hours saved, which also means money saved. Audio preservation is “a costly and time-consuming affair” [
3] with highly repetitive routines [
4], and “reading [some tracks on] a tape backwards” is a practical solution that has occurred, at one time or another, to anyone involved in a digitization project. In fact, a number of projects have resorted to this ‘trick’ to save time. The question is: does this solution alter the audio signal? Is the signal extracted in the orthodox way equivalent to that extracted “backwards” and then reversed? How do we define ‘equivalent’? The ultimate open question is whether this time-saving procedure is compatible with a sound methodological approach to audio preservation, both from a theoretical and empirical point of view.
In order to start answering this question, this article presents a data set that was built and studied to understand whether it is possible to define and measure the differences between audio signals extracted from a magnetic tape in the intended direction or ‘backwards.’ In particular, we present an experiment where seven different signals were digitized five times forward and five times backwards each. The digitized signals were analyzed to detect significant variations.
The value of this work lies in that:
it identifies an effective feature to measure the variations between different digitizations;
it is the first systematic study of the question of whether reading tapes backwards is a legitimate strategy in the context of digitization projects;
the complete data set is open and accessible, as well as the Matlab algorithms used for the data analysis (available on Zenodo at:
doi.org/10.5281/zenodo.4430360; accessed on 20 June 2021).
This study contributes to the long-term goal of building a structured body of knowledge on magnetic tapes in the context of their preservation. Knowledge can be expressed in the form of scientific literature [2,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] and applications [21,22,23,24] to (i) contribute to the normalisation of audio preservation practices and (ii) to support the systematic accumulation of existing and new knowledge, accelerating its capitalization. The authors have more than a decade of experience with audio preservation, as summarized in the overview articles [18,25].
2. Background
There are several reasons why it can be advantageous to read a tape backwards in the context of audio preservation. At the project level: 1. to save time; 2. to save money. While these are desirable reasons to opt for this solution under most circumstances, they are not the only ones. There are other reasons at the preservation level:
the tape only needs to be played once instead of twice or more: historical tapes are vulnerable and each pass is a stress for the tape, so one pass is less invasive than multiple passes;
saving time is not only interesting for the stakeholders, but it benefits the ultimate goal of preserving the audio heritage. Reducing the time necessary to complete one preservation project allows the team to focus on other preservation projects. The wealth of historical audio materials at risk of obsolescence and of irreversible loss stretches far beyond the resources that the audio community can currently deploy. Preserving the world’s audio heritage is both a race against time and a goal that can only be achieved asymptotically. A reduction of time costs for each project increases our chance to preserve more of the collective audio heritage.
These reasons make it appealing to adopt the ‘reverse’ strategy in order to achieve a project’s objectives more efficiently and not less effectively. However, there are also some disadvantages to consider at the methodological level:
limitations in the monitoring of the signal transfer: digitizing multiple tracks on a two-sided tape at once means that monitoring can only be performed for some tracks at a time—depending on the flexibility of the setup. Isolating and monitoring only the track that is being read backwards may or may not allow the operator to make a good evaluation of the quality of the signal transfer;
the digital processing required to separate and reverse the tracks after signal extraction comes with its own time cost, requires specific tools and competences, and must be performed correctly (there is a non-zero chance of introducing errors into the workflow).
In general, the choice to read a tape backwards should be exercised with caution. This study aims to provide better insight into the caveats of this decision and to help practitioners make informed choices. In any event, when this strategy is adopted, it should be explicitly reported in the preservation notes, including the operations that were subsequently applied to the signal in the digital workstation.
When the tracks on a tape are picked up at once, they are processed in the digital workstation. It is important that the tracks be separated and, in the case of stereo signals, linked correctly, and that the tracks that were read backwards: 1. be reversed, head to tail; and 2. have their polarity (phase) inverted as well.
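The two operations listed above can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the backward read is modelled as an idealized, noise-free copy of the track that is time-reversed and polarity-inverted. The function name restore_backward_track and the sample values are hypothetical and are not taken from the authors' Matlab code.

```python
def restore_backward_track(samples):
    """Undo a backward read: reverse head to tail, then invert polarity."""
    return [-s for s in reversed(samples)]


# Hypothetical short track, as it would be read in the intended direction.
forward = [0.0, 0.25, 0.5, -0.125, -1.0]

# Assumption: an idealized backward read yields the time-reversed,
# polarity-inverted signal, with no contribution from the tape machine.
ideal_backward = [-s for s in reversed(forward)]

restored = restore_backward_track(ideal_backward)
print(restored == forward)
```

Under this idealization the restored track is sample-identical to the forward one, and applying the function twice returns its input. Any difference measured on real transfers therefore has to come from the analog playback chain rather than from the digital processing.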
Reversal and polarity inversion are linear operations that do not introduce unwanted alterations to the signal. The problem lies at the level of the tape head: how does a head behave when picking up a signal backwards, in the light of the ‘analog behavior’ that characterizes it? The next subsection provides a brief overview of magnetic recording to explain why, according to the theory, reading tracks backwards should not yield the same result as reading them in the intended direction.
Electro-Magnetic Recording
Magnetic tapes can be recorded and played by means of an electromechanical machine called a reel-to-reel tape recorder (see a simplified schema in
Figure 4). During playback, the tape moves from left to right; from the source reel to the receiving reel. The advancement of the tape takes place through the capstan, a rotating cylinder operated by an electric motor, against which the tape is made to adhere by means of a special roller. The angular velocity of the capstan is controlled in such a way as to ensure that the tape runs with the most constant possible speed.
To read the recorded magnetic signal, the tape is passed in front of the playback head on which the tape induces a variable magnetic field proportional to the recorded signal. This determines the generation of an induced electromotive force at the ends of the head winding. Following the block diagram in
Figure 4, this signal passes through various amplification stages and a post equalization circuit, which has the purpose of compensating partly for the amplitude losses at various frequencies that occur in the playback and recording process.
The complexity of the magnetic recording system just described has some consequences. The friction of the tape and the dynamics of the traction system, consisting of reels, motor, and rollers, cause the speed of the tape and its position relative to the playback head to be subject to continuous small variations that are difficult to control. The result is that a signal read from the same magnetic tape will be slightly different each time it is read. In other words, there is some ‘tolerance’ in analogue playback: only after a certain threshold could it be hypothesized that the equipment is defective. Moreover, the whole electromechanical system presents some non-linear characteristics and a dynamic behavior that would support the hypothesis that reading tapes “backwards” introduces measurable differences with respect to the standard alternative.
3. Experimental Setup: Signal Flow and Data Set
The experiment presented in this paper investigates a procedure that, if validated, could become a practice within the complex process of digitizing audio documents on open-reel magnetic tapes. In other words, it aims to help establish the correct method for reading magnetic tapes even when part of the content was written in the direction opposite to that of the first-choice reading. Because the experiment is part of the investigation of the digitization process, it could not disregard the rules and good practices that govern the choice of the analogue-to-digital transfer system. The choice of this system is therefore an integral part of the experimental setup. In line with the ethics and principles of preservation strategy described in [
27], the selected experimental setup reflects what is stated in [
28] regarding the reproduction of analogue magnetic tapes in chapter 5.4 [
29].
As a professional reel machine usually used for audio master recording, the Studer A810 (2-tracks, 1/4 inch), owned by the Centro di Sonologia Computazionale laboratory of the University of Padua, was picked for the experiment (CSC—
http://csc.dei.unipd.it (accessed on 20 June 2021)—has been a leading research center in the field of computer music, creation and preservation over the last 50 years [
25]). For the occasion, the machine was suitably overhauled with regards to both the electrical components and the mechanical ones, depending on the type of magnetic tape used for the experiment, a 1/4 inch RTM SM911. The Studer A810 has four recording and reading speeds, but only the speed of 15 ips (inch-per-second) was used for the experiment. It is well known, as reported in [
29], that the higher the writing speed, the higher the frequency response of the machine and the higher the quality of the audio recorded. Consequently, the speed of 15 ips was preferred, although the highest is 30 ips, as a compromise for all those open reel tape audio documents, which are mostly at speeds of 7 1/2 ips, if not lower. For some estimates, [
18] shows an analysis of the last seven years of work at the CSC laboratory of the University of Padua on national and international archival materials and, specifically, on the ratio of the number of tapes for each replay speed used to the total amount of tapes. The attention paid to the choice of the replay speed is due to the fact that the experiment involved both writing operations on tape to prepare the test material and reading, subject to the actual experiment in this paper. For completeness of information, CCIR equalization was set on the reel machine.
The samples recorded on the magnetic test tape were selected from different audio materials and last about 15 s each. The chosen samples first had to cover the entire frequency response spectrum of the Studer A810. They also had to present features useful for the subsequent computer analysis, for example very evident transients, as in the case of a snare drum, and had to come from sources that are heterogeneous in timbre and other sound qualities. For completeness of information, the samples used by the authors for the experiment were born-digital audio files, stereo interleaved, normalized to an average RMS around −18 dB, with .wav extension and 24-bit 48 kHz resolution.
Below is a list of the selected samples:
WHITE NOISE;
SWEEP 20 Hz–20 kHz;
IMPULSES: a file with a sequence of 25 square impulses of increasing intensity and with a distance of 0.5 s;
DRUM: a live recording of a drum set;
SPEECH: a live recording of a human voice storytelling;
SONG: a live recording of a singer with instrumental accompaniment, in particular a violin and a percussion instrument;
CLASSICAL: a live recording of a brass ensemble, chosen for the harmonic richness of the wind instruments and the four-voice writing of the extract interpreted that reaches up to three octaves.
The samples selected for the experiment, described in the list above, were recorded on RTM SM911 magnetic tape using the Studer A810 with the previously described settings. As a last step, essential for the experiment, the resulting tape was then subjected to multiple readings, again on the Studer A810. The tape was read in the direction in which it was written, and then backwards. This operation was repeated five times in order to qualitatively monitor the response of the reel machine. The digitizations obtained from these readings formed the data set of the experiment. The setup was completed by the PrismSound sound card (Lyra2 model), an Apple iMac computer (late 2012 Intel Core i5 2.9 GHz 8 GB DDR3 RAM), on which the digital audio workstation Adobe Audition CS6 was installed, and two Genelec 1037B monitors, all connected by balanced cables.
4. Results and Discussion
After the digitization process, the audio files were analysed in order to verify whether, and in what circumstances, differences could be measured between the forward and backward reading of the same magnetic tape. Many metrics can be used to capture differences among signals (see, e.g., [
30,
31]), both in the time and frequency domains. In general, metrics in the frequency domain are considered more suitable for capturing differences in timbre nuances, or in wide-band noises. On the contrary, metrics in the time domain are more directly related to small local changes such as impulsive noises or transient responses. Given the electromechanical processes involved in the reading of a magnetic tape and the trivial observation that stationary signals are invariant to transformations along the time dimension, it is reasonable to expect that the main differences due to the reverse reading will be, if any, local changes, probably related to transient responses. On the basis of this assumption, we focused on a metric in the time domain and, in particular, the main feature assessed in the following analysis is the Euclidean distance between signals in the time domain.
The Euclidean distance, however, cannot be directly used to compare the digital signals coming from the multiple readings. Indeed, it is well known that multiple readings of a tape using an analog recorder do not allow for a precise homogeneous alignment between the recordings, given the unavoidable and unpredictable mechanical effects while the tape is running. For example, as a result of these effects, the duration of a tape is slightly different each time it is read. This has led to a manual check of the audio samples and subsequent realignment in order to minimise the differences at the beginning of the tracks. Moreover, a Dynamic Time Warping algorithm was applied to guarantee a good alignment along the entire track.
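The alignment step can be illustrated with a textbook Dynamic Time Warping routine. The text above does not specify which DTW variant, local cost, or constraints were used, so the sketch below is an assumption: it uses the absolute difference as local cost and no windowing, and the function name dtw_path and the two toy sequences are hypothetical. A quadratic pure-Python implementation like this one is only suitable for short sequences, not for full-rate audio.

```python
def dtw_path(a, b):
    """Return (total_cost, path) of the classic DTW alignment of a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy example: the second reading starts slightly "late" (one extra sample).
reading_1 = [0.0, 1.0, 2.0, 1.0, 0.0]
reading_2 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
total_cost, path = dtw_path(reading_1, reading_2)
print(total_cost, path)
```

In this toy case the warping path absorbs the extra sample at the start, so the accumulated cost is zero even though the two sequences differ in length.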
Secondly, the tracks were grouped into blocks according to their content (impulses, drum, speech, and so forth) and were then divided between forward and backward readings. The Normalized Euclidean Distance (NED) for each audio frame was computed between two tracks within the same block in order to highlight the differences between the time signals. Afterwards, the NEDs obtained were labeled as congruent or non-congruent. The first label refers to comparisons between two forward readings or between two backward readings (same direction); the second refers to pairs of readings in opposite directions.
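The comparison scheme can be sketched as follows. The NED formula is not spelled out in this section, so the sketch assumes one common normalization, namely the Euclidean distance between two frames divided by the sum of their norms, which keeps the value between 0 and 1. The frame length and the function names (ned, frame_neds, label_pairs) are likewise hypothetical and may differ from the authors' Matlab implementation.

```python
import math
from itertools import combinations


def ned(x, y):
    """Assumed NED: ||x - y|| / (||x|| + ||y||), a value in [0, 1]."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    den = math.sqrt(sum(a * a for a in x)) + math.sqrt(sum(b * b for b in y))
    return num / den if den > 0 else 0.0


def frame_neds(sig_a, sig_b, frame_len):
    """NED of each pair of corresponding non-overlapping frames."""
    n_frames = min(len(sig_a), len(sig_b)) // frame_len
    return [
        ned(sig_a[k * frame_len:(k + 1) * frame_len],
            sig_b[k * frame_len:(k + 1) * frame_len])
        for k in range(n_frames)
    ]


def label_pairs(directions):
    """Split all pairs of readings into congruent and non-congruent."""
    labels = {"congruent": [], "non-congruent": []}
    for i, j in combinations(range(len(directions)), 2):
        same = directions[i] == directions[j]
        labels["congruent" if same else "non-congruent"].append((i, j))
    return labels


# Five forward ("F") and five backward ("B") readings of one block.
pairs = label_pairs(["F"] * 5 + ["B"] * 5)
print(len(pairs["congruent"]), len(pairs["non-congruent"]))
```

With five readings per direction, each block yields 20 congruent pairs (10 forward-forward and 10 backward-backward) and 25 non-congruent pairs, provided that all pairs are compared.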
The distribution of the data was analysed in order to ensure the requirements for ANOVA. Indeed, the Kolmogorov-Smirnov test highlighted that the data is non-normal, since it showed a natural limit shrinking to the left. Therefore, the power transformation , to ensure the Gaussian distribution of the data, was applied. A value of provided the necessary scaling factor.
The first analysis of variance (ANOVA) showed a correlation between NED and the increasing value of the frame. In
Figure 5 and
Figure 6, the boxplots of the first 100 of 233 frames of the audio files are shown for the congruent and non-congruent readings, respectively. The average NED increases with the frame number. This result can be explained by the nature of an analog recorder, which is not able to provide a constant speed of tape reading. Indeed, the differences become more evident for longer readings than for shorter ones. Therefore, the dataset was reduced to the first 100 frames, for which the analysis of variance showed a low correlation between NED and the frame number, as shown in
Table 1.
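The growth of the NED with the frame number can be made plausible with a small worked example. It rests on a deliberate simplification: the speed error is taken as constant, whereas real transports fluctuate. The relative error of 0.01% is a purely hypothetical figure, not a measurement from this experiment, and the 48 kHz rate is borrowed from the source files described in Section 3.

```python
def drift_in_samples(elapsed_s, sample_rate_hz, relative_speed_error):
    """Misalignment accumulated between two readings, in samples."""
    return elapsed_s * sample_rate_hz * relative_speed_error


# Hypothetical constant speed mismatch of 0.01% between two readings.
for elapsed_s in (1.0, 5.0, 15.0):
    drift = drift_in_samples(elapsed_s, 48000, 1e-4)
    print(f"{elapsed_s:4.1f} s -> {drift:.1f} samples")
```

Under these assumptions the misalignment grows linearly with time and reaches about 72 samples at the end of a 15 s excerpt, which illustrates why later frames tend to show larger distances than earlier ones.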
Figure 7 and
Figure 8 show the boxplots for each block for the congruent and non-congruent comparisons, respectively. The most evident results are the differences of the average NED values that increase for the non-congruent comparisons. The biggest differences are shown in complex signals, that is, they are less constant in time and have a wide band spectrum. For instance, the average value increases more for the drum, song and white noise with respect to the train of impulses or the sweep. The average values are shown in
Table 2 together with the ratio between the two values for the congruent and non-congruent readings of the same block. The ANOVA performed on the new dataset highlighted the correlation between NED and the congruent and non-congruent recordings, as shown in
Table 3.
A Welch two-sample t-test was performed between the two types of comparisons within the same block. The results show that the distributions are not the same and highlight the non-negligible difference between them (see Table 2). Furthermore, the analysis of NED differences between congruent forward readings and congruent backward readings shows that the distributions have non-negligible differences, and for all the blocks the resulting p-value is always <0.001 (see Table 4), even if the average values have a lower ratio with respect to the forward and backward readings.
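For readers unfamiliar with the test, the Welch statistic and its approximate degrees of freedom (Welch-Satterthwaite) can be computed with the Python standard library as sketched below. The function name welch_t and the two samples are hypothetical placeholders, not values from the data set. The p-value is not computed here because it requires the cumulative distribution of Student's t, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors, using the unbiased sample variance.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical NED-like values for two groups with unequal variances.
group_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group_2 = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_1, group_2)
print(t_stat, dof)
```

Unlike the classic Student test, this formulation does not assume that the two groups share the same variance, which makes it a reasonable choice when the spread of the two groups may differ.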
These results suggest that, according to the NED parameter, opposite readings do not ensure the same results and may provide non-negligible differences in the audio file. Furthermore, even if the analog process of recording does not provide constant results, multiple forward readings can be compared to multiple backward readings, since the results are not the same but can be easily compared.
Audio production and preservation are two important areas in the non-academic world, audio being a huge driver in the creative economy around the globe. Especially in the past decades, practices that have been shared among the expert community have not been formalized in the scientific literature. As a result, much knowledge and useful information, relevant to academic studies including the present one, resides in the heads of audio veterans or in their written communications (mostly mail threads, private and public). It is very hard to extract and collect that information and to integrate it into a scientific study today. However, it is not acceptable to ignore it, considering its magnitude, impact and value. For this reason, the next subsection presents additional observations about phase delay in historical audio equipment and recording techniques, based on the reality of the audio manufacturing industry and the shared knowledge of the veterans working in the field.
4.1. Contextual Notes on Phase Delay
Phase delay in audio magnetic tape recording has been a well-known phenomenon in the expert community since before copying audio signals on tape was a necessity for preservation purposes. Phase delay caused by multiple sources is likely a major source of the mismatch between forward and reverse playback of the signals, based on the information presented in the next paragraphs.
One of the first studies of this was conducted by Johnson and Gregg in 1965 [
32], where there was an attempt to maintain accurate transient response in a tape duplication system. Additional insight was offered by [
33]. Joel Tall responded in 1968, citing an experiment he ran “in 1951 or 1952”: “After the original recording had been edited, a reverse re-recorded tape [copy] was made…During playback to air, I ran both tapes in synchronism and switched from “A” tape to “B” (reverse re-recorded) tape…Subsequent reports showed that in all cases the reverse-re-recorded tape was the most natural…The most noticeable improvement was in voice” [
34].
It was not until the 1970s and into the 1980s that the record phase distortion was addressed. In 1976, Heaselett published a long tutorial and description of how the record circuits were developed for Ampex’s final flagship audio recorder, the ATR-100. This offered a reduction in phase delay on the record side [
35].
Studer followed suit and, in 1982, issued a Product Information white paper describing modifications to the record amplifier for the A810 (which was also used in the A820) [
36]. This was followed by “Phase Compensation Design Considerations” in their company journal,
SwissSound [
37]. In this paper, different types of record phase delay reduction are discussed, and they claim that what they are doing for the A810 is the finest yet.
Finnberg pointed out that the “2nd Version” in the
SwissSound article [
37] is the Ampex ATR-100 design, as shown on pages 6–43/44 in the ATR-100 service manual: “IC A5, LM318 opamp, is the differentiator that gives the phase EQ that is summed into opamp A6 by way of rec EQ trim potentiometer R15/R18, high/low speed” [
38]. He also confirmed that the Sony APR-5000 uses the same record topology as the ATR-100. In addition, Otari did include record phase delay compensation in their MX80 machines (at least), and the 3M M23 had phase compensation on the playback side [
39].
The original circuit design in most of the earlier tape machines created a situation where, when a tape was played backwards on a similar machine, the phase delay of the reversed playback would partially mitigate the recording phase delays. Reducing the phase delay of the record process, as exemplified in the Ampex ATR-100, Studer A810/A820 and Sony APR-5000, could conceivably create an overshoot of correction with reverse playback, but this would only pertain to tapes made on these advanced machines.
Tapes recorded in both directions are not professional master recordings, although undoubtedly important recordings that need to be preserved have been made in these formats. Bidirectional recordings were unlikely to be made on the late model recorders, which provided the reduced recording phase delay. Additionally, the recording polarity (as referenced to the acoustic polarity of the original signal) has not been rigorously standardized. Therefore, an argument can be made that, for important tapes where the utmost sound quality is paramount, one could make both forward and reverse copies, then make a reversed polarity copy of each (note that standard practice in making a reversed copy is to reverse both the direction and the polarity) and select the best sound from among the four copies.
It is worth noting that complementary, compander-type noise reduction systems (Dolby, dbx, Telcom, and others) will not respond well to reverse playback, since they all rely on closely controlled time constants for proper operation and will not properly decode a signal that is reversed. Once the reverse-recorded file is returned to normal playback (and the polarity inverted) in the computer, then it is fine to run that file through the noise reduction decoding process, making certain that the levels are properly calibrated.
5. Conclusions
This article was motivated by the open question of whether digitizing tracks in opposite directions on the same magnetic tape is a viable solution for audio preservation projects, and in particular whether it is compatible with a sound theoretical and empirical methodological approach to preservation. The first step to answer this question was to build a data set that would help determine whether it is possible to identify and measure the differences between signals that were picked up in the intended direction (‘forwards’) and in the opposite direction (‘backwards’).
Seven audio excerpts (including music, voice and reference signals) were recorded on a magnetic tape. Then they were played back and digitized ten times: five forwards and five backwards.
The obtained 70 audio files were compared by means of the Normalized Euclidean Distance, a feature specifically defined to quantify the differences between audio signals.
The results show that the distance between the files obtained by reading the tape in the same direction (congruent) is significantly smaller than the distance between files obtained by reading the tape in opposite directions and reversing them in the digital domain (non-congruent).
These results support the idea that the practice of reading tapes backwards introduces measurable alterations in the signal. These alterations are greater than the variability that is inherent in the playback process with analog reel-to-reel tape recorders.
It is important to observe that the introduction of said alterations does not imply that reading tapes backwards (in the context of digitisation projects) is wrong or should be avoided. It means that we should be aware that the two approaches (forwards and backwards) do not yield the same results, at least at a signal level. This means that the decision about whether it is convenient and safe to save time by reading tapes backwards should be evaluated case by case.
The analyses conducted on the audio signal in this study do not inform us about the subjective perception of the alterations, that is, whether a human listener could tell the difference between the takes. This aspect was not included in the scope of this study, but we plan to make it the center of a future study. The goal of audio preservation is not only to support future signal analyses, but also to prepare audio material for humans to listen to. Therefore, the listening experience is a relevant aspect in the broad picture of this study, of which the next steps are:
the reproduction of the experiment presented in this article on different equipment, to assess the influence of the playback machine on the differences observed in the signal;
a perception test where human listeners evaluate the signals digitized forward and backwards, in order to evaluate the differences found in the signal against the impact they have on the actual listening experience.
On a final note, audio preservation is a well established field. However, with the ongoing research presented in this article, and the authors’ other scientific productions, we show that this is a field with many open questions, which greatly impact the theory and practice of preservation, and which need to be addressed with the scientific method and small, incremental steps forward. Of equal importance is to “bring the science” to a field that has existed outside of academia and in tight connection with industry for many decades before becoming of scholarly interest, formalizing the questions and exploring them with a scientific approach—while at the same time not ignoring the information and knowledge present within the expert community of audio veterans and industry people, however hard (and ever harder) it is to retrieve, gather, verify and interpret.