Development and Validation of an Algorithm for the Digitization of ECG Paper Images

The electrocardiogram (ECG) signal describes the heart's electrical activity and makes it possible to detect several health conditions, including cardiac system abnormalities and dysfunctions. Nowadays, most patient medical records are still paper-based, especially those produced in past decades. The importance of collecting digitized ECGs is twofold: firstly, all medical applications can be easily implemented with an engineering approach if the ECGs are treated as signals; secondly, paper ECGs can deteriorate over time, so a correct evaluation of the patient's clinical evolution is not always guaranteed. The goal of this paper is the realization of an automatic conversion algorithm from paper-based ECGs (images) to digital ECG signals. The algorithm involves a digitization process tested on an image set from 16 subjects, including pathological cases. The quantitative analysis of the digitization method is carried out by evaluating the repeatability and reproducibility of the algorithm. The digitization accuracy is evaluated both on the entire signal and on six ECG time parameters (R-R peak distance, QRS complex duration, QT interval, PQ interval, P-wave duration, and heart rate). The results demonstrate the algorithm's effectiveness: the average Pearson correlation coefficient is 0.94, and the measurement errors of the ECG time parameters are always less than 1 mm. Given these promising experimental results, the algorithm could be embedded into a graphical interface, becoming a measurement and collection tool for cardiologists.


Introduction
Digitization is an essential process for the analysis and processing of signals. In recent decades, the in-depth study of medical signals has been made possible by their digital nature.
The fundamental advantages of digital signals are well known in terms of security, storage, and the absence of information corruption due to paper deterioration. Furthermore, storing the patient's history makes it easy to follow their clinical evolution. Finally, knowing the entire time series of each signal, it is possible to implement algorithms for the automatic detection of pathologies [1][2][3].
The ECG signal is an electrical signal that describes cardiac activity; its graph represents the trend of the heart's potential over time. Nowadays, using the latest generation of electrocardiographs, digital signals can be collected and stored in the cloud. However, to know a patient's medical history in depth and to build automatic algorithms, it is essential to have access to past ECG signals, which, in most cases, are paper-based. A digitization tool, integrated into modern electrocardiographs, can then be used to carry out retrospective analyses of patients and to examine ECG characteristics for making timely diagnoses [4], even in the case of rare diseases/syndromes, e.g., Short-QT [5,6]. In this sense, in order to preserve the patient history, it is essential to use a digitization method, i.e., a conversion of a paper image to digital data, to extract the ECG signals. In [7], an entropy-based

Materials and Methods
The proposed data acquisition system is shown in Figure 1. Two medical instruments were used in cascade to build the database for our study: the Fluke ProSim 4 Vital Signs Simulator [10] and the GE MAC 2000 electrocardiograph [11]. The ProSim 4 patient simulator allows the simulation of several cardiac conditions, while the GE MAC 2000 electrocardiograph, in addition to displaying the simulated conditions on its monitor, acquires 12 signals (leads) simultaneously, both in digital format (*.xml and PDF files) and by printing the ECG on graph paper [12] at 25 mm/s and 10 mm/mV. The *.xml file is automatically produced by the GE electrocardiograph; it contains the ECG digital signals as vectors of amplitudes over time, one per ECG lead, together with the recording metadata (e.g., the sampling frequency) and the ECG parameters (e.g., HR), computed for each lead. Since the sampling frequency is 500 Hz, a 5 s signal has 2500 data points. The graph paper is thermal paper, printed about one hour before the scan; the ECG tracings resist fading for roughly 5 years [13]. The simulated electrocardiographic conditions are as follows:
• Normal sinus rhythm: the rhythm of a healthy heart, in which the electrical impulse from the sinus node is properly transmitted [14].
• Bradycardia: a slow or irregular heartbeat, less than 60 beats per minute [15].
• Tachycardia: an increased number of heartbeats per minute (heart rate) under resting conditions (more than 100 beats per minute) [16].
• Acute Pericarditis: inflammation of the pericardium characterized by an accumulation of fluid in the pericardial space [17].
• Atrial Fibrillation: rapid and disorganized atrial activation leading to impaired atrial function [18].
• Atrial Flutter: an arrhythmia in which the electrical activity in the atria is coordinated but the atria contract at a much-increased rate (more than 240 beats per minute) [19].
• Muscle tremor artifact: a type of movement artifact, usually caused by patient trembling.
• Breath artifact: a typical artifact caused by patient breathing [20].
• Premature Atrial Contractions (PACs): a common cardiac dysrhythmia characterized by premature heartbeats originating in the atria [21].
• Premature Ventricular Contractions (PVCs): single ventricular impulses caused by abnormal automatism of the ventricular cells or by re-entry circuits in the ventricle [22].
• Supraventricular Tachycardia: a high-rate heart rhythm originating above the ventricles [23].
• Ventricular Tachycardia: a hyperkinetic arrhythmia characterized by a high ventricular rate [24].
Table 1 summarizes the database artificially created with the Fluke ProSim 4 simulator; 16 records were simulated, and the printed portion of each signal is 5 s long (see Figure 2).
Figure 2. ECG of normal sinus rhythm (60 bpm); the green rectangle represents the printed portion of each lead (duration of 5 s). The sensitivity/gain is 10 mm/mV and the paper speed is 25 mm/s.
The images printed with the GE MAC 2000 electrocardiograph were scanned with a Kyocera TASKalfa 5053ci scanner, with a scanning speed of 220 ipm and a resolution of 600 dpi × 600 dpi, with 256 levels of gray per color. Finally, to make the images more suitable for processing, the contrast and sharpness were increased by 70% using the scanner software.
The purpose of the digitization algorithm is to transform the signal printed on graph paper into a digital signal that respects the millivolt (ordinate axis) and millisecond (abscissa axis) scales. However, in addition to the conversion error introduced by the digitization process, there is the error introduced by the electrocardiograph when printing the signal on graph paper. In this study, the two errors are not analyzed individually, but as a combination.

The Digitization Algorithm
To extract signal data from the image, a novel algorithm was developed on the MATLAB® platform. The algorithm digitizes the image and separates the signals from the background while respecting the time and voltage proportions of the ECGs. It is based on step-by-step automatic processing involving the operations summarized in Figure 3.
A. Image crops. To obtain the signals of all the ECG leads, 12 crops (one per lead) are made on the image by framing the image patch of the corresponding lead. For each lead, the user manually makes the crop by clicking and dragging a rectangle over the portion of interest, using the "imcrop" MATLAB function (see Figure 4). Since the purpose of the proposed algorithm is to reconstruct the ECG signal respecting its morphology, independently of the number of samples, the crops do not need to have strictly the same size.
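As an illustration, a minimal MATLAB sketch of this cropping step could look as follows (the file name, variable names, and lead order are illustrative, not taken from the implementation):

```matlab
% Step A sketch: interactively crop the 12 lead patches from the scanned page.
img = imread('ecg_scan.jpg');   % scanned ECG page (illustrative file name)
leadNames = {'I','II','III','aVR','aVL','aVF','V1','V2','V3','V4','V5','V6'};
crops = cell(1, numel(leadNames));
for k = 1:numel(leadNames)
    fprintf('Crop lead %s: click and drag, then double-click inside the box.\n', ...
        leadNames{k});
    crops{k} = imcrop(img);    % returns the selected image patch
end
```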
At this point, steps B and C are performed in parallel.
B. Binary mask. The extraction of the binary masks was inspired by the MathWorks Community [25]. Firstly, the image is converted from RGB to HSV. Then, in order to separate the signal from the rest of the image, a range of values (from 0 to 1) was chosen for each HSV channel: 0.000 ≤ p_H ≤ 0.997, 0.000 ≤ p_S ≤ 0.659, and 0.647 ≤ p_V ≤ 1.000, where p_H, p_S, and p_V are the pixel values in the three channels of the HSV space; pixels falling within all three ranges are considered part of the signal. The result is a black and white image (one per crop), as shown in Figure 5. Afterwards, the signal is thinned (see Figure 6) using the MATLAB function "bwmorph" with the operation "shrink", which replaces groups of neighboring pixels with a single pixel. Experimentally, it was observed that, in order to reduce noise, the best result was obtained with the repetition parameter (i.e., the number of times the "shrink" operation is performed) set to 2.5.
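A minimal sketch of this masking step, using the HSV ranges reported above (an integer repetition count stands in for the paper's 2.5; the crop comes from the previous sketch):

```matlab
% Step B sketch: HSV thresholding of one crop followed by thinning.
hsv = rgb2hsv(crops{k});                 % convert the RGB crop to HSV
mask = hsv(:,:,1) >= 0.000 & hsv(:,:,1) <= 0.997 & ...   % p_H range
       hsv(:,:,2) >= 0.000 & hsv(:,:,2) <= 0.659 & ...   % p_S range
       hsv(:,:,3) >= 0.647 & hsv(:,:,3) <= 1.000;        % p_V range
% Thin the trace; 'shrink' merges groups of neighboring pixels.
maskThin = bwmorph(mask, 'shrink', 2);   % the paper reports 2.5 repetitions
```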
C. Scale Factor (SF) calculation. The standard ECG leads are printed on graph paper (see Figure 7). When the image is scanned, the correspondence between pixels and millimeters is not always the same; it depends on several factors, e.g., the printer resolution and the available type of image (scanned or PDF). Since the ECG is printed on graph paper, the grid size is fixed and known a priori. Therefore, in order to find out how many millimeters each pixel is worth in a given image, a specific function was created that, starting from the twelve crops, isolates the grid and derives the scale factor.
Each crop, which is an RGB image, is transformed into a grayscale image using the "im2gray" MATLAB function, and the signal is extracted, as in the previous paragraph, and removed from the image. Two thresholds were then chosen fairly close to the grayscale extremes in order to isolate the grid. Nevertheless, it is not certain that the remaining black points and the white background have shades of gray exactly corresponding to the extremes. Therefore, the thresholds were chosen to be not too high (220 out of 255 for white) or too low (100 out of 255 for black); in this way, both the black points (and thus the signal) and the white background are excluded, leaving only the grid.
In the proposed data set, images have two grids (see Figure 8), one less dense (with larger squares) and one denser. The first is composed of dots, very close to each other, which form the perimeter of squares with a 5 mm side. In the second, dots are further apart and delimit squares with a 1 mm side.
The algorithm first searches for the positions of the signal points in the binary mask and turns them white in the original image (see Figure 9a). Then, the function "imclose" performs a morphological closing on the image in order to join the nearest points (see Figure 9b). The structuring element is obtained by the function "strel" with the parameters "square" (i.e., the shape of the structuring element) and "16" as the square width in pixels. To delete the furthest points, the function "regionprops" is used, returning in a "struct" the researched property ("Area") together with the linear indices of the pixels in each region ("PixelIdxList"). A threshold equal to 1000 pixels was set experimentally, and the areas below it were eliminated (see Figure 9c). In this way, we obtained a binary image with a grid composed of horizontal and vertical lines that form 5 mm side squares (see Figure 9d). The square area (in pixels) is calculated as the mean of all the square areas and, by taking the square root, the inner side of the mean square is obtained. By summing it and the width of one line, the length L in pixels of the square side is found. Knowing that it should be 5 mm, the scale factor SF is obtained as

SF = 5/L

where SF indicates how many millimeters a pixel corresponds to.
Since the leads have different shapes, when the signal is removed the grids show different discontinuities, which can alter the automatic recognition of the squares: identifying bigger or smaller areas, the computed SF is not always the same. Therefore, the final SF is the mean of the SFs calculated over the 12 image crops; this value is used in the next steps of the algorithm.
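Under the thresholds and parameters reported above, a sketch of the grid isolation and SF estimation for a single crop could be as follows (the line-width estimate "lineWidthPx" and the use of "imclearborder" to keep only the inner squares are our illustrative choices, not the paper's exact implementation):

```matlab
% Step C sketch: isolate the grid, rebuild the 5 mm squares, estimate SF.
gray = im2gray(crops{k});
gray(maskThin) = 255;                           % blank out the signal pixels
gridBW = gray > 100 & gray < 220;               % keep only the mid-gray grid dots
gridBW = imclose(gridBW, strel('square', 16));  % join nearby dots into lines
stats = regionprops(gridBW, 'Area', 'PixelIdxList');
for r = find([stats.Area] < 1000)               % drop regions below 1000 px
    gridBW(stats(r).PixelIdxList) = false;
end
inner = regionprops(imclearborder(~gridBW), 'Area');  % inner 5 mm squares
L  = sqrt(mean([inner.Area])) + lineWidthPx;    % lineWidthPx: measured line width
SF = 5 / L;                                     % scale factor [mm/pixel]
```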
D. Final reconstruction of the signal. Once the binary images are obtained, the algorithm reconstructs each lead using the y-positions of the signal pixels as the vector of amplitudes and joining the points progressively along the x-axis (see Figure 10). The amplitudes are then converted from pixels to millimeters by multiplying them by SF. In order to align the isoelectric line with the abscissa axis (i.e., y = 0), the most frequent amplitude value of each lead (i.e., the mode of the signal) is calculated and subtracted from the signal itself. A portion of the reconstructed signal is shown in Figure 10.
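A sketch of this reconstruction, assuming one amplitude per image column (the column-wise averaging and the linear gap interpolation are our simplifications of the step described above):

```matlab
% Step D sketch: build the amplitude vector from the thinned binary mask.
nCols = size(maskThin, 2);
amp = nan(1, nCols);
for x = 1:nCols
    y = find(maskThin(:, x));
    if ~isempty(y), amp(x) = mean(y); end        % trace row in this column
end
amp = fillmissing(amp, 'linear');                % join the points progressively
amp = -(amp - mode(round(amp))) * SF;            % flip (image rows grow downward),
                                                 % zero the isoelectric line, to mm
```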
E. Amplitude correction. The reconstruction by pixels leads to a pixel reduction; as a result, the reconstructed amplitude is sometimes lower than the real one. This happens where more black pixels are concentrated, especially close to the R-peaks, because the trace is generally narrower there. Thus, the algorithm automatically detects the R-peak locations (using the "findpeaks" MATLAB function and taking the five points with the highest absolute value) and measures the peak amplitude, which is an underestimation of the real R-peak amplitude because the image was previously processed with the "bwmorph" (shrink) operation, which improves the image quality but also introduces an error in amplitude estimation due to pixel removal. The algorithm then adjusts the amplitude, adding 1 mm to those points (when the peak is positive) or subtracting 1 mm (when it is negative). This quantity was chosen experimentally by averaging the differences between the reconstructed leads and the reference digital signal, both for images provided by the scanner and for those obtained by converting the PDF format to JPEG (as the one used in Section 2.2.3). Figure 11 shows the signal before and after the correction.
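A sketch of the peak correction (the peak-search parameters mirror the description above; "amp" is the millimeter-scaled lead from the previous sketch):

```matlab
% Step E sketch: restore 1 mm at the five largest |R| peaks.
[~, locs] = findpeaks(abs(amp), 'SortStr', 'descend', 'NPeaks', 5);
amp(locs) = amp(locs) + sign(amp(locs)) * 1;   % +1 mm if positive, -1 mm if negative
```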
Figure 11. A portion of the reconstructed ECG signal (normal sinus rhythm, 60 bpm, lead II) with SF applied: (a) before the amplitude correction; (b) after the amplitude correction.
F. Image plot. Since 10 mm correspond to 1 mV, the signal amplitude is converted from millimeters to voltage. Regarding the time scale, each lead has a number of samples equal to its number of pixels (voltage values). To build the time scale, the samples are first converted into millimeters through SF and then into milliseconds, knowing that the paper speed is 25 mm/s. Each lead is plotted on a pink grid background which reproduces the graph paper; the x-axis is time (ms) and the y-axis is voltage (mV). An example is shown in Figure 12.
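A sketch of the final unit conversion and plot (a plain grid stands in for the pink graph-paper background):

```matlab
% Step F sketch: convert mm to mV (10 mm/mV) and pixels to ms (25 mm/s).
mV = amp / 10;                            % millimeters -> millivolts
t  = (0:numel(mV)-1) * SF * 1000 / 25;    % pixels -> mm -> milliseconds
plot(t, mV, 'k');
xlabel('Time (ms)'); ylabel('Voltage (mV)');
grid on;                                  % stand-in for the graph-paper grid
```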
Lastly, the algorithm saves the final version of the twelve images individually (one image for each digitized lead) and as a global image with all the signals. Furthermore, it saves the voltage data of the 12 leads with the corresponding time samples.

Algorithm Validation Technique
The algorithm created to digitize and save each patient's ECG signal in digital format must be validated. As the algorithm is intended to be a tool for clinical support, it must be rigorously tested. The algorithm was evaluated in terms of the similarity of the entire signals, using the Pearson coefficient [26], and in terms of repeatability and reproducibility, using the mean, standard deviation, range, and absolute error of the time parameters extracted from the ECG signals. The digital signal and the time parameters obtained from the *.xml file were used as references (true values). The time parameters considered, which are important because they describe the punctual behavior of the heart, are the R-R peak distance, the QRS complex duration, the QT interval, the PQ interval, the P-wave duration, and the heart rate [32].
To assess the quality of the algorithm in terms of repeatability and reproducibility, both the time differences and the corresponding differences in millimeters were considered. The former measures the relative distance from the ground truth, while the latter is more specific to the digitization of paper-based signals. In this sense, the acceptable upper bound is 1 mm, which is the paper grid resolution. Figure 13 shows the algorithm validation scheme.

The digitized signal has a different number of samples than the digital one. Thus, to evaluate similarity, repeatability, and reproducibility in reconstructing the shape of the signal point by point, the reconstructed leads must be resampled. Two points (two R-peaks) were manually identified in the reconstructed signal, and the same points were taken from the digital one. The program then calculated the number of samples between the two points in the digital signal; the digitized signal must have the same number of samples between the same two points. Finally, the distances between the points in the two signals were used as the parameters of the "resample" MATLAB function, which resamples the whole reconstructed lead proportionally, using an FIR antialiasing lowpass filter. The resampling rate was 500 Hz, as in the digital signal.
Finally, by observing the position of the first considered point and the corresponding one in the other signal, the two leads were aligned and overlapped. In this way, a fair signal analysis can be carried out.
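A sketch of the resampling and alignment, assuming the two R-peak indices have already been located in both leads (rDig1, rDig2, rRec1, rRec2, and rec are illustrative names):

```matlab
% Resampling sketch: match the sample count between two R-peaks, then align.
dDig = rDig2 - rDig1;                % samples between the R-peaks (digital lead)
dRec = rRec2 - rRec1;                % samples between the R-peaks (digitized lead)
recRes = resample(rec, dDig, dRec);  % FIR antialiasing lowpass resampling
shift  = rDig1 - round(rRec1 * dDig / dRec);   % align the first R-peak
recAli = circshift(recRes(:), shift);
```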

Similarity
In this work, similarity indicates how close the result of the measurement of the digitized signal is to the true value, i.e., the reference samples (digital signal). To assess the validity of the algorithm in terms of similarity, the Pearson correlation coefficient (r) was used [33], examining the entire sequence of the digital signal and the entire sequence of the digitized one. Pearson's coefficient measures the statistical relationship between two continuous variables, using the covariance method [15]. It is defined as follows:

r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\tilde{y}_i - \bar{\tilde{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2} \sqrt{\sum_{i=1}^{n} (\tilde{y}_i - \bar{\tilde{y}})^2}}

where y is the desired output (target), ỹ is the vector of predicted values, and n is the total number of data points. It ranges over [−1, 1]: r = 1 indicates a perfect positive correlation between y and ỹ; r = −1 a perfect negative correlation; and r = 0 no correlation. In our case, y is the reference signal from the *.xml file, while ỹ is the digitized signal.
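In practice, r can be obtained directly from MATLAB's built-in covariance-based routine; a minimal sketch (yRef and yDig are illustrative names for the aligned leads):

```matlab
% Similarity sketch: Pearson correlation between reference and digitized leads.
n = min(numel(yRef), numel(yDig));   % compare the overlapping samples only
R = corrcoef(yRef(1:n), yDig(1:n));  % 2x2 correlation matrix
r = R(1, 2);                         % Pearson correlation coefficient
```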

Repeatability
Repeatability indicates the agreement between repeated tests performed under similar measurement conditions [34][35][36]. In order to evaluate it, an image of the data set (in particular, the patient with normal sinus rhythm and 60 bpm) was chosen, and the 12 lead crops were made 10 times on the same image. Since SF slightly depends on the crop made, as explained in Section 2.1.C, SF was collected each time and the ten values were compared. Keeping the crop shapes as similar as possible at each repetition, the variation of the most important time parameters (caused by SF variability) was also analyzed by calculating the mean, standard deviation, and range of each one.
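A sketch of the statistics collected over the ten repetitions (vals is an illustrative 10 × 1 vector holding one parameter, e.g., the ten SF values):

```matlab
% Repeatability sketch: mean, standard deviation, and range over ten runs.
m  = mean(vals);
sd = std(vals);
rg = max(vals) - min(vals);   % range of the ten measurements
```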

Reproducibility
Reproducibility is defined as the agreement between two measurements made under different circumstances [37][38][39]. In this case, the test was performed by using the same image (the chosen patient is the same as in the repeatability test) in two different formats: the first is the JPEG produced by scanning the printed electrocardiogram (see Figure 7); the second is the PDF saved by the GE MAC 2000 and then converted to JPEG, with a different graph paper structure in which the grid is made of solid lines rather than dots (see Figure 14). The two formats have different resolutions: the first is 7014 × 4160 pixels, while the second is 1755 × 1240 pixels. After digitizing the two images, SF and the time parameters were extracted from both and compared one by one with the parameters of the digital signal by calculating the absolute error (aE), which measures the difference between the measured value and the true value and is computed as follows:

aE = |X_experimental − X_true|

where X_experimental is the time parameter from the digitized signal and X_true is the value of the digital one extracted from the *.xml file.
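A sketch of the error computation, including the conversion from milliseconds to millimeters implied by the 25 mm/s paper speed (1 mm = 40 ms):

```matlab
% Reproducibility sketch: absolute error of one time parameter.
aE_ms = abs(X_experimental - X_true);   % absolute error in milliseconds
aE_mm = aE_ms * 25 / 1000;              % the same error in millimeters (25 mm/s)
```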

Similarity
Figure 15 shows the similarity between the signals for the normal sinus rhythm, 60 bpm case (in particular, for lead II). Notably, the morphology is respected in comparison with the digital signal. Table 2 reports the Pearson coefficient for each pathology, i.e., for each simulated record of the dataset. If the signals are highly correlated and superimposable, the Pearson coefficient is close to 1. In the best case (normal sinus rhythm, 60 bpm), the Pearson coefficient equals 0.9821; in the worst case (bradycardia, 30 bpm), it equals 0.8798.
The ranges for each parameter are 17.93 ms (QRS complex); 29.36 ms (QT interval); 24.70 ms (PQ interval); 4.18 ms (P-wave duration); and 1.98 bpm (heart rate). These variations correspond to differences of 0.45 mm (QRS complex); 0.73 mm (QT interval); 0.62 mm (PQ interval); and 0.10 mm (P-wave duration). The highest variation (34.04 ms for the R-R peak distance) corresponds to a difference of 0.85 mm between the two tests.
Comparing the means with the true values, the worst case is for the P-wave duration, where the difference is 22.75 ms, which implies a variation of 0.59 mm. This difference derives from the manual identification of the P, Q, R, S, and T points, which is, of course, prone to a higher rate of error. For the other time parameters, the differences amount to: 14.98 ms, i.e., 0.37 mm (QRS complex); 1.65 ms, i.e., 0.04 mm (QT interval); 12.38 ms, i.e., 0.31 mm (PQ interval); 12.18 ms, i.e., 0.30 mm (R-R peak distance); and 0.70 bpm (heart rate).
All the time parameters are shown in Table 3.
Table 3. Variation of the scale factor and time parameters obtained by cropping the same ECG image 10 times (normal sinus rhythm, 60 bpm). The first 10 rows report the values of the ten tests; the 11th, 12th, and 13th rows show, respectively, the mean, standard deviation (SD), and range of the 10 tests. The last row reports the values of the ProSim simulator setting (True Value).
It is also noteworthy that the second JPEG shows closer agreement with the true value; this probably stems from the fact that, although its resolution is lower, the square recognition works better since the grid is made of solid lines rather than dots.
Table 4. Comparison of the scale factor and the time parameters obtained by cropping two versions of the same ECG image (normal sinus rhythm, 60 bpm). The second column shows the values of the ProSim simulator (Ground Truth). The fourth column reports the absolute errors between the first image and the true value, while the sixth column shows the absolute errors between the second image and the true value.


Discussion
In the previous paragraphs, the algorithm was validated in terms of similarity, repeatability, and reproducibility.
For similarity, reconstructed signals were compared with the original digital ones by calculating the Pearson coefficient, which was always close to 1 (the perfect similarity).
Regarding repeatability, the algorithm was applied ten times to the same image and the most important time parameters (QRS complex duration, QT interval, PQ interval, P-wave duration, R-R peak distance, and heart rate) were extracted. The means were close to the ground truth (differences below 1 mm), and the range was less than 1 mm even in the worst case.
Considering reproducibility, two digitized versions of the same image, with different resolutions and grid structures, were compared. The time parameters extracted from the two images were compared with the original values by calculating the absolute error, which was less than 1 mm in all cases.
In summary, the similarity test shows that the reconstructed signal has a shape very close to the original one, while the repeatability and reproducibility tests show that the difference is always lower than 1 mm. This is acceptable considering that the measured error combines the conversion error of the digitization process with the error introduced by the electrocardiograph when printing the signal on graph paper. In addition, standard ECG recording paper has a distance of about 1 mm between thin lines, so these errors are close to the paper's resolution.
Future work will deal with the application of this algorithm to create a training set for a machine learning system that detects cardiac pathologies. In addition, the algorithm will undergo an improvement phase so that it can be applied to low-resolution images and to binary (black/white) images, where the distinction between signal and grid is more difficult. Finally, the amplitude correction is currently based on a static offset experimentally derived from the dataset; to better generalize this phase, a more sophisticated way of performing the peak correction will be investigated in the future.

Conclusions
This study presents a novel MATLAB-based tool for digitizing paper-based ECGs. The Fluke ProSim 4 simulator connected to the GE MAC 2000 electrocardiograph was used to generate 16 images related to different cardiac conditions. They were scanned and then digitized by the algorithm with an automatic conversion from pixels to millimeters and, thus, to milliseconds. The proposed approach reconstructs ECGs with a high correlation (Pearson coefficient close to 1) with respect to the original digital signal, also presenting promising results in terms of repeatability and reproducibility (measurement errors of the main parameters always less than 1 mm). Finally, the correct extraction of the temporal parameters from the digitized ECG could help doctors detect pathologies, thereby supporting a correct diagnosis.
In light of the carried-out analyses, the presented algorithm can be considered a good support tool for cardiologists when the ECG paper images are available without the corresponding digital data.