1. Introduction
Electrocardiography (ECG) is a representative biosensing technique that noninvasively measures weak bioelectrical signals, and improving the accuracy of these measurements is essential for reliable applications. ECG is widely used as a measurement tool across various medical and physiological fields, including the diagnosis of cardiovascular diseases, assessment of autonomic nervous system activity, and exercise physiology research [1, 2, 3, 4, 5]. Since ECG records the electrical activity of the heart over time, analysis of the rhythm and morphology of the waveform enables the evaluation of both normal and abnormal cardiac events, as well as subtle changes in cardiac function. In particular, the R-R interval (RRI), defined as the interval between consecutive R-peaks, serves as a fundamental metric in heart rate variability (HRV) analysis and is broadly applied in studies assessing autonomic function, monitoring stress load, analyzing sleep stages, and evaluating exercise responses [
6,
7,
8]. Such applications of ECG highlight its central role in biosignal processing, where cardiac dynamics are analyzed alongside other physiological signals to assess systemic conditions. Furthermore, RRI time-series data, when combined with techniques such as power spectral analysis, time-series analysis, and nonlinear analysis, have been reported to detect subtle changes in cardiac activity that cannot be captured by mean heart rate alone [
9,
10,
11,
12,
13,
14]. Therefore, accurate detection of R-peaks and precise calculation of RRIs are critically important in ECG analysis research.
Automated R-peak detection algorithms, including classical approaches such as the Pan–Tompkins method [15] and subsequent improvements [16, 17], as well as recent deep learning-based techniques [18, 19, 20], are widely used in both research and clinical applications to support efficient large-scale ECG analysis.
However, baseline variations caused by experimental conditions or subject movement, cases where T-waves exceed R-waves [
21], and the presence of noise can result in inaccurate R-peak identification. Such errors reduce the precision of RRI calculation and may introduce inaccuracies in subsequent analyses, such as HRV or stress evaluation. Manual correction of misdetections is burdensome even in short recordings, and the workload increases dramatically in long-term recordings such as 24 h Holter ECG. Therefore, an efficient, user-friendly tool enabling analysts to intuitively correct R-peak positions is essential. In this study, we focus on short-term experimental recordings and partial corrections within longer or clinical recordings, rather than full manual annotation of long-duration datasets. Traditional approaches attempt to correct misdetections by adjusting filters or thresholds. However, such adjustments may introduce new errors outside the target region. This limitation highlights the need for a manual yet efficient correction method that avoids these trade-offs.
Previously developed software tools include Kubios HRV (version 2.1), widely used for HRV analysis [
22]; RR-APET, a Python-based graphical user interface (GUI) tool allowing R-peak editing and batch processing [
23]; ECGdeli (version 1.0.2), a MATLAB-based ECG visualization tool [
24]; as well as CEPS [
25] and PhysioZoo [
26]. These tools provide valuable functionalities for HRV and ECG research, offering extensive metrics and analysis capabilities. However, their support for interactive correction of single R-peak misdetections remains limited. For example, Kubios HRV provides comprehensive HRV metrics and some manual editing options but does not prioritize propagation of edits across all derived analyses. RR-APET allows GUI-based editing, but corrections are applied in batch rather than supporting fine-grained, single-beat adjustments with synchronized updates. ECGdeli is powerful for ECG delineation but does not emphasize interactive correction or recalculation of RRIs, while CEPS and PhysioZoo focus on complexity analysis or mammalian HRV, without a strong emphasis on GUI-based correction. In contrast, the proposed system was specifically designed to address this gap by enabling intuitive, single-event correction. Users can visually identify and directly adjust an R-peak with a single mouse click, after which the RRI time series and related analyses are updated. This unique combination of fine-grained manual correction and rapid feedback conceptually distinguishes our system in terms of its interaction design.
This study is intended as a proof-of-concept demonstration of an interactive GUI-based post-detection correction framework, rather than a comprehensive benchmarking or superiority study against existing ECG analysis software.
To address these challenges, this study proposes an ECG viewer equipped with GUI-based correction functionality. The system allows simple manual correction of failed R-peak detection or T-wave misidentification and updating of the RRI time series. By complementing the limitations of automated detection algorithms, the proposed system aims to improve both the efficiency and accuracy of ECG analysis.
It is emphasized that the proposed system is not intended to replace or compete with existing automated R-peak detection algorithms. Rather, it is designed as a post-detection correction framework that operates independently of the initial R-peak detection method. In practice, R-peaks may be detected using any approach, including classical signal-processing techniques, machine learning-based detectors, existing software tools, or manual annotation. The role of the proposed GUI is to allow intuitive correction and validation of these detected R-peak positions and to update the resulting RRI time series and related analyses.
For evaluation purposes, two types of gold standards were used in this study. In the simulation experiments, synthetic ECG signals are generated using a dynamic ECG model in which the exact R-peak timings are analytically defined in advance, enabling precise and unambiguous accuracy assessment. In the real-data experiments, reference R-peak annotations provided by the MIT-BIH Arrhythmia Database are used as the gold standard. With this design, we evaluate the performance of the proposed GUI-based correction method independently of the performance of any specific R-peak detection algorithm.
2. Design
2.1. Programming Language and Drawing Library
The computational core of the system was implemented in Fortran (FTN95 compiler, Silverfrost Ltd., Manchester, UK), and the DISLIN graphics library was used for drawing. Fortran was chosen because of its high numerical computation performance and compatibility with existing code assets. DISLIN provides versatile two- and three-dimensional graphics functions, as well as bitmap output, making it suitable for waveform visualization and interactive drawing.
For the development environment, the Fortran editor on Windows (Plato IDE, version 4.3.0, Silverfrost Ltd., Manchester, UK) was used to implement both the numerical processing and GUI logic. DISLIN (version 10.0, Dislin Software, Göttingen, Germany) was employed for graphics, and the linkage to DISLIN was established through dynamic linking. In Plato IDE, diss32.lib was added as the link library, enabling calls to DISLIN functions. At runtime, disdll.dll, diss32.dll, and the FTN95 runtime library salflibc.dll were placed in the same directory, allowing the graphics functions to be accessed via dynamic linking. The specifications of the PC and display used in this study are summarized in
Table 1.
2.2. Viewer Design and Operation
The viewer was built on the DISLIN plotting routines CURVE and BARS, as illustrated in Figure 1. First, CALL METAFL('XWIN') was used to select a screen window as the graphical output, with the window size set to approximately 465 mm (H) × 310 mm (V). The screen layout was organized as follows:
- (a)
Top panel: ECG waveform (cyan) and peak flags (magenta).
- (b)
Middle panel: RRI time series (cyan).
- (c)
Bottom panel: From left to right, the power spectral density function of the RRI time series (yellow), histogram (orange), Lorenz plot (green), and polar plot (peach, circular waveform of RRI).
The four analyses were mathematically defined as follows:
Power spectral density (PSD): estimated using the fast Fourier transform (FFT) of the RRI time series r(n).
Histogram: distribution of RRI values r(n), represented as frequency counts within bins.
Lorenz plot: scatter plot of successive RRIs, (r_n, r_{n+1}).
Polar plot: circular mapping of RRIs.
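As an illustration only, the four RRI-based analyses could be sketched as follows. This is not the Fortran/DISLIN implementation described in the paper: the function names are hypothetical, the spectrum is computed with a naive discrete Fourier transform over the beat index instead of an FFT, the normalization |X(k)|^2 / N is an assumption, and the polar mapping (angle 2πn/N, radius r(n)) is one plausible reading of "circular mapping".

```python
import cmath
import math


def psd_via_dft(rri):
    """Power spectrum |X(k)|^2 / N of the mean-removed RRI series (naive DFT)."""
    n = len(rri)
    mean = sum(rri) / n
    centered = [r - mean for r in rri]
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                  for i, c in enumerate(centered))
        spectrum.append(abs(x_k) ** 2 / n)
    return spectrum


def histogram(rri, bin_width):
    """Frequency counts of RRI values in fixed-width bins, keyed by bin index."""
    counts = {}
    for r in rri:
        b = math.floor(r / bin_width)
        counts[b] = counts.get(b, 0) + 1
    return counts


def lorenz_pairs(rri):
    """Successive pairs (r_n, r_{n+1}) for the Lorenz plot."""
    return list(zip(rri[:-1], rri[1:]))


def polar_points(rri):
    """Assumed mapping: beat n is drawn at angle 2*pi*n/N with radius r(n)."""
    n = len(rri)
    return [(r * math.cos(2 * math.pi * i / n), r * math.sin(2 * math.pi * i / n))
            for i, r in enumerate(rri)]
```

For example, `lorenz_pairs([800, 820, 810])` returns `[(800, 820), (820, 810)]`, with RRIs given in milliseconds.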
In addition, a toolbar was placed at the top of the screen. The toolbar contained the following buttons: Insert, Delete, Resolution: up, Resolution: down, Reset, and Quit, which could be operated using the left mouse button. Right clicking was disabled.
2.2.1. Time Scale Adjustment
Clicking the Resolution: up button shortened the time scale of the ECG display (e.g., 60 s→30 s→10 s), down to a minimum of 1 s. Conversely, the Resolution: down button extended the scale (e.g., 120 s→180 s→300 s), up to a maximum of approximately 1 h at a sampling frequency of 1000 Hz.
These operations affected only the time scale of the ECG graph, without influencing other plots. Clicking outside the toolbar advanced the time axis by one time scale unit. Similarly, pressing the numeric keypad “6” or the right arrow key advanced the display by half a time scale unit, while the Backspace key moved back one unit, and the numeric keypad “4” or the left arrow key moved back by half a unit. Clicking Reset restored the full-length ECG display.
2.2.2. Insert Function
Clicking the Insert button activated Insert mode, indicated by the button turning yellow. In this mode, clicking within the ECG graph frame or pressing the Enter key displayed a vertical line (flag), and the corresponding time was recorded. Clicks or key presses outside the frame were ignored. This function allowed the user to add flags at arbitrary time points simply by positioning the mouse cursor and clicking. Clicking Insert again terminated the mode.
2.2.3. Delete Function
The Delete function removed a flag and its associated time, using an operation similar to the Insert function. By clicking or pressing Enter just before the target flag, the flag and its recorded time were deleted. The deletion interval was fixed at 0.2 s, independent of the sampling frequency. This feature was useful for removing misdetected peaks caused by P- or T-waves.
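A minimal sketch of the Delete behavior, assuming that "just before the target flag" means the first flag found within the 0.2 s that follow the click time is cleared. The function name and the list-based data layout are hypothetical and chosen only for illustration.

```python
def delete_flag_after(flags, times, t_click, window=0.2):
    """Clear the first peak flag found in [t_click, t_click + window] seconds.

    Returns the index of the cleared sample, or None if no flag is in the window.
    """
    for i, t in enumerate(times):
        if t < t_click:
            continue  # sample lies before the click
        if t > t_click + window:
            break  # past the deletion window; nothing to delete
        if flags[i] == 1:
            flags[i] = 0
            return i
    return None
```

Because the window is expressed in seconds, the same 0.2 s interval applies at any sampling frequency, as stated above.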
2.3. Operational Flowchart and Algorithm
The processing flow of the proposed GUI-based ECG correction system is illustrated in Figure 2. After data acquisition, the ECG signal is converted into a three-column, two-dimensional array, array(3, i), which stores measurement time, ECG amplitude, and peak information as follows:
- (a)
array(1, i): measurement time (in seconds or sampling index). For example, at a sampling frequency of 1000 Hz, values such as 0.001 s, 0.002 s, 0.003 s, … are stored.
- (b)
array(2, i): sampled ECG waveform (raw data).
- (c)
array(3, i): binary peak flag indicating whether an R-peak is present (1 for peak, 0 otherwise).
The system operates in an event-driven loop, remaining in a waiting state until a user action occurs. When a mouse click or key operation is detected in either Insert or Delete mode, an update cycle is triggered.
First, the horizontal screen coordinate of the mouse click is converted into the corresponding time position (Tpos) using the current display time scale. This coordinate-to-time transformation is performed using Equation (4), which maps the pixel position on the ECG display to the time axis of the signal:
$$T_{\mathrm{pos}} = x_t + \mathrm{Time\ scale} \times \frac{NX - X_{\mathrm{left}}}{X_{\mathrm{right}} - X_{\mathrm{left}}} \quad (4)$$
Here, xt is the starting time of the currently displayed ECG segment (s), Time scale is the display length (s), NX is the x-coordinate of the mouse click (in pixels), and Xleft and Xright denote the x-coordinates of the left and right ECG axes, respectively. The index corresponding to Tpos is then identified in the time array. In Insert mode, an R-peak is added by assigning array(3, i) = 1, whereas in Delete mode, the corresponding peak is removed by assigning array(3, i) = 0. Importantly, the raw ECG waveform stored in array(2, i) is never modified.
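The click handling described above can be sketched as a linear pixel-to-time mapping followed by a flag update. This is an illustrative Python analogue, not the Fortran source: the function names are hypothetical, and the sketch assumes 0-based indexing with sample i located at time i / fs.

```python
def pixel_to_time(nx, x_left, x_right, x_t, time_scale):
    """Linearly map a click at pixel nx to a time within the displayed segment."""
    return x_t + time_scale * (nx - x_left) / (x_right - x_left)


def nearest_index(t_pos, fs, n_samples):
    """Index of the sample closest to t_pos (assumes sample i is at time i / fs)."""
    i = round(t_pos * fs)
    return min(max(i, 0), n_samples - 1)


def apply_click(flags, nx, x_left, x_right, x_t, time_scale, fs, mode):
    """Set (insert) or clear (delete) the peak flag at the clicked sample."""
    if not (x_left <= nx <= x_right):
        return None  # clicks outside the ECG frame are ignored
    t_pos = pixel_to_time(nx, x_left, x_right, x_t, time_scale)
    i = nearest_index(t_pos, fs, len(flags))
    flags[i] = 1 if mode == "insert" else 0
    return i
```

With axes spanning pixels 100 to 500, a 10 s display starting at 120 s, and a click at pixel 300, `pixel_to_time(300, 100, 500, 120.0, 10.0)` gives 125.0 s. Only the flag column is touched, which mirrors the non-destructive handling of the raw waveform.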
After updating the binary peak flag array, the R-peak time series is reconstructed by extracting all indices for which array(3, i) = 1. The RRI time series is then recalculated from the time differences between consecutive R-peaks. Subsequently, all dependent visualizations—including the RRI time series, power spectral density, Lorenz plot, histogram, and polar plot—are refreshed simultaneously. After completion of the update, the system returns to the waiting state for the next user input. Spline interpolation is applied solely for visualization purposes to display the RRI time series as a continuous curve. All quantitative accuracy evaluations and error analyses are performed using the original, non-interpolated RRI values.
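The reconstruction step can be illustrated with a short sketch that rebuilds the R-peak times and the RRI series from the flag column after every edit. The function name is hypothetical, and plain Python lists stand in for the rows of the Fortran array.

```python
def rri_from_flags(times, flags):
    """Return (peak_times, rri) rebuilt from the binary peak-flag column."""
    peak_times = [t for t, f in zip(times, flags) if f == 1]
    rri = [b - a for a, b in zip(peak_times[:-1], peak_times[1:])]
    return peak_times, rri
```

Recomputing the whole series after each flag change keeps the logic simple: deleting a flag merges two adjacent intervals into one, and inserting a flag splits one interval into two, without any special-case bookkeeping.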
Boundary and critical cases are handled conservatively to ensure stable operation. Mouse clicks occurring outside the ECG display region are ignored, and deletion is performed only when a peak flag exists within a predefined temporal window. These constraints prevent unintended modifications and preserve the integrity of the raw ECG data. This non-destructive design—based on simple toggling of binary peak flags—minimizes computational cost and enables recalculation of RRIs and rapid updating of all dependent analyses.
Figure 3 compares the ECG before and after editing for a 5 s segment around 2 min. In Insert mode (Figure 3, middle), flags were inserted at the P- and T-waves of this segment to extract their times, whereas in Delete mode (Figure 3, lower), two R-peaks in this segment were removed. Compared with the unedited version, the RRI time series and other graphs changed accordingly.
2.4. ECG Simulation Data for Accuracy Evaluation
Because R-peak correction in the GUI was performed by converting screen coordinates to time, errors could arise depending on the accuracy of this transformation. Such errors were expected to depend on the ECG sampling frequency and the time scale used for display. To evaluate the accuracy of the system, a synthetic ECG dynamic model was employed (Equation (5)), in which the P-, Q-, R-, S-, and T-waves were approximated using Gaussian functions with specified amplitudes (A_k), time offsets (t_k), and standard deviations (σ_k). Each parameter was randomly varied within ±10% of its reference value to reproduce beat-to-beat morphological variability (Table 2).
The reference values were derived from the widely used model of McSharry et al. [
27], which has become a standard for generating physiologically plausible ECG signals under normal sinus rhythm. Using this model ensured that the simulated ECG retained realistic morphology while providing precise gold-standard R-peak timings for accuracy evaluation. Although the primary goal of this experiment was to evaluate the accuracy of the coordinate-to-time transformation—which could theoretically be tested with any waveform containing known peak positions—we selected a synthetic ECG model so that the validation process remained aligned with typical ECG analysis scenarios.
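To make the idea concrete, a simplified generator is sketched below under stated assumptions. It uses a static sum of five Gaussians per beat rather than the full dynamical formulation of McSharry et al., and the reference parameters are placeholders chosen for illustration, not the values in Table 2. The function and variable names are hypothetical.

```python
import math
import random

# Hypothetical reference parameters: (amplitude in mV, offset in s relative to
# the R-peak, width in s). Placeholders only, not the values of Table 2.
REFERENCE = {
    "P": (0.15, -0.20, 0.025),
    "Q": (-0.10, -0.03, 0.010),
    "R": (1.00, 0.00, 0.010),
    "S": (-0.20, 0.03, 0.010),
    "T": (0.30, 0.25, 0.040),
}


def jitter(value, rng, fraction=0.10):
    """Scale value by a random factor in [1 - fraction, 1 + fraction]."""
    return value * (1.0 + rng.uniform(-fraction, fraction))


def synth_ecg(fs, duration, cycle=1.0, seed=0):
    """Return (signal, r_times): a sum-of-Gaussians ECG and its known R-peak times."""
    rng = random.Random(seed)
    n = int(round(fs * duration))
    signal = [0.0] * n
    r_times = []
    beat_center = 0.5 * cycle
    while beat_center < duration:
        for name, (a, t_off, sigma) in REFERENCE.items():
            a_k, t_k, s_k = jitter(a, rng), jitter(t_off, rng), jitter(sigma, rng)
            center = beat_center + t_k
            if name == "R":
                r_times.append(center)  # exact peak time, known in advance
            lo = max(0, int((center - 5 * s_k) * fs))
            hi = min(n, int((center + 5 * s_k) * fs) + 1)
            for i in range(lo, hi):
                t = i / fs
                signal[i] += a_k * math.exp(-((t - center) ** 2) / (2 * s_k ** 2))
        beat_center += cycle
    return signal, r_times
```

Because the R-wave centers are recorded while the signal is built, the reference R-peak times are available without running any detector, which is the property the evaluation relies on.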
Simulation data were generated at sampling frequencies of 125 Hz, 250 Hz, 500 Hz, and 1000 Hz, with cycle lengths set to approximately 1 s, yielding 180 s of synthetic ECG signals.
Figure 4 shows an example of a synthetic ECG signal at a sampling frequency of 1000 Hz. RRI values were calculated from each signal and defined as the gold-standard reference. Then, across 12 conditions (four sampling frequencies × three time scales: 2 s, 5 s, and 10 s), the author manually clicked on R-peaks using the mouse cursor, and relative errors between the GUI-derived RRI and the gold standard were computed using Equation (6). In this validation, the 180 s ECG contained approximately 180 R-peaks. All peaks were manually clicked to detect RRIs, and 180 relative errors were calculated for each condition. The mean and standard deviation of these errors were then obtained.
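The error computation can be sketched as follows. The percent form with an absolute value is an assumption made for this illustration, since the exact form of Equation (6) is not reproduced here, and the function names are hypothetical.

```python
import statistics


def relative_errors_percent(gui_rri, gold_rri):
    """Per-beat relative error in percent; the absolute value is an assumption."""
    return [abs(g - s) / s * 100.0 for g, s in zip(gui_rri, gold_rri)]


def mean_sd(values):
    """Sample mean and sample standard deviation (n - 1 denominator)."""
    return statistics.mean(values), statistics.stdev(values)
```

For a 1 s reference interval, a 2 ms deviation in the GUI-derived RRI corresponds to a relative error of 0.2%.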
Additionally, to evaluate system responsiveness, the processing time from an Insert or Delete operation to the completed update of the graphical display (see the flowchart in Figure 2) was measured with the Fortran intrinsic subroutine CPU_TIME. For each condition (four sampling frequencies × two modes), 30 clicks were performed, and the mean and standard deviation of the response time were calculated.
Furthermore, to evaluate the accuracy of the GUI-derived RRIs, statistical analyses were performed by comparing GUI-derived values with the simulated gold standard. For each condition (four sampling frequencies × three time scales), the following indices were calculated: the mean ± standard deviation (SD) and 95% confidence interval (CI) of both GUI-derived and simulated RRIs; the mean ± SD of relative errors; the Shapiro–Wilk test (W and p values) to assess the normality of error distributions; and the number of outliers (|z| > 3). In the Shapiro–Wilk test, the null hypothesis was that the error distributions follow a normal distribution, with a significance level of α = 0.05. Thus, p > 0.05 indicated no significant deviation from normality.
All statistical analyses were performed using Python (version 3.12.7). Data handling was conducted with pandas, descriptive statistics with numpy, and normality testing with the Shapiro–Wilk test implemented in scipy.stats.
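A dependency-free sketch of the descriptive part of this analysis is given below. The normal-approximation confidence interval (z = 1.96) and the function name are assumptions made for illustration. The Shapiro–Wilk step would be carried out with scipy.stats.shapiro and is omitted so that the example runs with the standard library alone.

```python
import math
import statistics


def summarize(values, z_crit=1.96, z_outlier=3.0):
    """Mean, sample SD, normal-approximation 95% CI, and count of |z| > 3 outliers."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    half_width = z_crit * sd / math.sqrt(n)
    outliers = sum(1 for v in values if abs((v - mean) / sd) > z_outlier)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "ci95": (mean - half_width, mean + half_width),
        "outliers": outliers,
    }
```

The same function can be applied to the GUI-derived RRIs, the simulated RRIs, and the relative errors of each condition.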
2.5. Application to Real ECG Data
To verify the applicability of the proposed GUI-based correction method to real ECG data, ECG waveforms from the MIT-BIH Arrhythmia Database (PhysioNet) were analyzed [28, 29, 30, 31]. Three records were selected: two normal sinus rhythm recordings (records 115 and 122) and one arrhythmic recording (record 209), all of which contain two ECG leads (MLII and V1). The MLII lead was used for all analyses because it provides clearer R-peak morphology and a higher signal-to-noise ratio than V1 and is therefore commonly adopted in QRS detection benchmarks. All records have a duration of 30 min 06 s, and the sampling frequency is 360 Hz.
Record 115 corresponds to a 39-year-old female participant and includes 1953 R-peaks and 1952 RRIs, with heart rates of 50–84 bpm. Record 122 corresponds to a 51-year-old male participant and contains 2476 R-peaks and 2475 RRIs, with heart rates of 67–97 bpm. Record 209 corresponds to a 62-year-old male participant and exhibits mixed normal and supraventricular ectopic activity, consisting of 2621 normal beats, 383 atrial premature contractions (APCs), and 1 premature ventricular contraction (PVC), for a total of 3005 R-peaks and 3004 RRIs. Heart rates ranged from 82 to 116 bpm during normal rhythm and from 106 to 171 bpm during supraventricular tachyarrhythmia (SVTA) episodes. For all three records, the R-peak timestamps provided in the annotation files (.atr) were used as the gold standard for accuracy evaluation.
The author manually clicked all R-peaks using the proposed GUI at three time scale settings (2 s, 5 s, and 10 s) to extract the corresponding RRIs (hereafter referred to as GUI-derived RRIs). The GUI-derived RRIs were computed directly from the differences between consecutive clicked R-peak timestamps, without any resampling or interpolation, because temporal resampling can introduce phase shifts and distort millisecond-level comparisons between the GUI-derived and gold-standard data. For accuracy evaluation, the GUI-derived RRIs were directly compared with those obtained from the gold-standard annotations. A GUI-derived RRI was regarded as correct if its duration differed from the corresponding gold-standard RRI by less than ±10, ±20, ±30, or ±40 ms. Since all R-peaks were visually confirmed, no missed beats occurred. Accordingly, accuracy was defined as the ratio of RRIs within the specified tolerance to the total number of RRIs in each record, as expressed by Equation (7):
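$$\mathrm{Accuracy} = \frac{N_{\mathrm{within}}}{N_{\mathrm{total}}} \quad (7)$$
where $N_{\mathrm{within}}$ denotes the number of RRIs whose differences from the gold standard were within ±10, ±20, ±30, or ±40 ms, and $N_{\mathrm{total}}$ represents the total number of RRIs in the record.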
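The tolerance-based accuracy can be sketched in a few lines. The function name is hypothetical, and the strict inequality follows the "less than" wording used above.

```python
def tolerance_accuracy(gui_rri_ms, gold_rri_ms, tolerance_ms):
    """Fraction of RRIs whose |GUI - gold| difference is below tolerance_ms."""
    if len(gui_rri_ms) != len(gold_rri_ms):
        raise ValueError("series must be paired beat by beat")
    within = sum(1 for g, s in zip(gui_rri_ms, gold_rri_ms)
                 if abs(g - s) < tolerance_ms)
    return within / len(gold_rri_ms)
```

For four paired RRIs with differences of 0, 12, 5, and 30 ms, the accuracy is 0.5 at ±10 ms, 0.75 at ±20 ms, and 1.0 at ±40 ms.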
In addition to the tolerance-based accuracy evaluation, Bland–Altman analysis was performed to assess the agreement between the GUI-derived RRIs and the gold-standard RRIs. The differences were defined as (GUI − Gold) for each RRI pair. This analysis was conducted using the 10 s display time scale, which yields the largest expected annotation error among the tested conditions and therefore represents a conservative (worst-case) evaluation of GUI performance.
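A minimal sketch of the Bland–Altman summary statistics follows, assuming the conventional 95% limits of agreement of bias ± 1.96 SD. The function name is hypothetical.

```python
import statistics


def bland_altman(gui_rri_ms, gold_rri_ms):
    """Return (bias, sd, lower LOA, upper LOA) for the differences GUI - Gold."""
    diffs = [g - s for g, s in zip(gui_rri_ms, gold_rri_ms)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, sd, bias - 1.96 * sd, bias + 1.96 * sd
```

The 1.96 multiplier is consistent with the values reported for record 209: a mean difference of −0.48 ms and a standard deviation of 7.07 ms give limits of −0.48 ± 13.86 ms, that is, −14.34 to 13.38 ms.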
3. Results
Table 3 presents the mean ± SD and 95% CI of simulated RRIs (gold standard reference) and GUI-derived RRIs at different sampling frequencies and time scales. GUI-derived values were obtained from manual peak selections at 2, 5, and 10 s time scales. Across all conditions, GUI-derived RRIs showed close agreement with the simulated reference values, indicating consistency between manual corrections and the gold standard.
Table 4 summarizes the relative errors of GUI-derived RRIs compared with the simulated RRIs. The smallest relative error was 0.146 ± 0.264% at 125 Hz with a time scale of 2 s, whereas the largest error was 0.694 ± 0.601% at 125 Hz with a time scale of 10 s. The smallest standard deviation was ±0.132% at 1000 Hz with a time scale of 10 s. Shapiro–Wilk tests indicated that the error distributions did not significantly deviate from normality (all W > 0.98,
p > 0.19), and the number of outliers (|z| > 3) was negligible (≤2 cases per condition). These results confirm that GUI-induced errors were small, statistically well-behaved, and robust across different sampling frequencies and time scales.
Table 5 presents the mean and standard deviation of operation response times for Insert and Delete modes across different sampling frequencies. The mean response time for Insert ranged from 25 to 29 ms, while that for Delete ranged from 24 to 30 ms. The standard deviation of response times ranged from 8 to 17 ms for Insert and from 8 to 14 ms for Delete.
Table 6 summarizes the accuracy of manual R-peak clicking evaluated in terms of the resulting GUI-derived RRIs for MIT-BIH records 115, 122, 209, and their combined dataset, under different error tolerances (±10–±40 ms) and GUI display time scales (2, 5, and 10 s). For all three records, the accuracy at the 2 s display scale reached 1.000 across all tolerance levels, demonstrating that high temporal precision can be achieved when the waveform is sufficiently magnified. At the 5 s scale, accuracy at ±10 ms remained high (0.985–0.994 for the normal recordings and 0.987 for record 209), and reached 1.000 for tolerances of ±20 ms or greater. At the 10 s scale, accuracy at ±10 ms decreased due to waveform compression, which makes precise mouse clicking more difficult; however, accuracies at ±20 ms or wider tolerances remained ≥0.992 for all records, including record 209, which contains frequent ectopic beats. These results indicate that the GUI-based clicking mechanism maintains high precision not only for normal sinus rhythm but also for arrhythmic waveforms, provided that an appropriate display scale is selected.
Across all tested sampling frequencies from 125 Hz to 1000 Hz, the relative errors of GUI-derived RRIs remained below 0.7% (
Table 4). In particular, the error levels observed at 250 Hz and 500 Hz were highly similar, indicating that the sampling-frequency dependency of the GUI-based correction method was modest within this range. Because the MIT-BIH Arrhythmia Database uses a sampling frequency of 360 Hz, which lies between 250 Hz and 500 Hz, the GUI accuracy at 360 Hz is expected to fall within the same low-error regime validated by the simulation experiments.
Figure 5 compares the GUI-derived RRIs and the gold-standard RRIs obtained from MIT-BIH records 115, 122, and 209 at the 10 s display time scale. Across all three recordings—including record 209, which contains a substantial number of APCs—the GUI-derived RRIs closely overlap with the reference series. Even under the most error-prone viewing condition (10 s scale), the visual agreement indicates that beat-to-beat variability is well preserved, without apparent temporal distortion. These observations support the applicability of the proposed GUI-based correction method to both normal and arrhythmic ECG waveforms.
Figure 6 presents Bland–Altman plots of the differences between GUI-derived RRIs and gold-standard RRIs (GUI − Gold) for three MIT-BIH records (115, 122, and 209) at the 10 s display time scale. For all records, the mean differences were extremely small (−0.48 to −0.02 ms), indicating the absence of directional systematic bias. For record 115, the mean difference was −0.02 ms with a standard deviation of 6.31 ms, and the 95% limits of agreement (LOAs) ranged from −12.39 to 12.35 ms. Record 122 showed a mean difference of −0.02 ms with a standard deviation of 7.20 ms and LOAs from −14.13 to 14.09 ms, while record 209 exhibited a mean difference of −0.48 ms with a standard deviation of 7.07 ms and LOAs from −14.34 to 13.38 ms. In all cases, the differences were symmetrically distributed around 0 ms, indicating that the operator did not systematically click on either the left or right side of the R-peak. Furthermore, no RRI-dependent widening of the differences was observed, and the differences were evenly scattered across the RRI range for all records. These results indicate that the R-peak correction errors reflected in the GUI-derived RRIs predominantly consist of random errors, without waveform-dependent or time-dependent systematic deviations.
4. Discussion
The GUI-based ECG viewer developed in this study is characterized by its ability to intuitively correct R-peak misdetections, providing a design focus that differs from that of conventional ECG analysis software. For example, Kubios HRV covers a wide range of indices required for heart rate variability analysis and is widely used in both clinical and research settings; however, its R-peak correction functionality is limited [
22]. RR-APET, a Python-based tool, allows GUI-based R-peak editing and batch processing, but it is not fully optimized for rapid RRI updates [
23]. ECGdeli is a powerful MATLAB-based tool for ECG visualization and fiducial point detection [
24], and tools such as CEPS [
25] and PhysioZoo [
26] provide specialized functions for complexity analysis and mammalian HRV analysis, respectively. However, all of these tools primarily focus on analysis capabilities, and their GUIs provide only partial support for interactive, on-click correction of misdetections. More recently, attempts have been made to benchmark HRV analysis toolboxes [
32,
33], and LightWAVE [
34] allows waveform and annotation editing in a browser environment. Yet, the functionality required in short-term experimental settings—namely simple correction of misdetections with simultaneous updates of the RRI time series and analysis results—remains insufficient. The proposed algorithm offers a distinctive design in this regard, as it allows the correction of R-peak misdetections through the GUI while updating the RRI time series, making it potentially applicable to short-term experimental settings. A qualitative comparison of the proposed system and representative existing ECG analysis tools is summarized in
Table 7.
From an engineering perspective, the system achieves both high accuracy and strong responsiveness. The relative error of RRIs calculated after GUI-based correction was found to be less than 0.7%, demonstrating that manual corrections can be incorporated without compromising reliability. As shown in Table 4, the relative error is minimally affected by the sampling frequency but is sensitive to the time scale setting. This is because a longer time scale compresses each beat on screen, making it difficult to click the R-peak accurately with a mouse, whereas a shorter time scale enlarges the waveform, facilitating visual peak identification and reducing the relative error. Therefore, appropriate time scale settings are important for achieving high correction accuracy via GUI operation. Notably, at 125 Hz with a 2 s time scale, the relative error was smallest on average (0.146%), but the standard deviation was relatively large (±0.264%). This suggests that with coarse sampling, even a slight deviation in mouse clicking leads to substantial variability, since each sample point corresponds to a relatively large time step (8 ms). Conversely, at 1000 Hz with a 2 s time scale, the relative error showed the smallest standard deviation (±0.132%). This is because the temporal resolution is high (1 ms per sample), so even if the mouse click deviates by one sample, the resulting error remains small. The Shapiro–Wilk test showed no significant deviation of the GUI-induced errors from a normal distribution (all W > 0.98, p > 0.19). This finding is consistent with errors that primarily reflect random fluctuations in mouse positioning rather than systematic bias in the GUI itself. The number of outliers (|z| > 3) was negligible (≤2 per condition), supporting the stability of the error distribution. Taken together, these results indicate that the errors introduced by the system are small and statistically well behaved.
Furthermore, as shown in Table 5, the mean time required to update the display after a correction was approximately 24–30 ms and remained stable across the tested sampling frequencies. This update time includes not only RRI recalculation but also updates of four graphs based on the RRI time series: power spectral density, the Lorenz plot, the histogram, and the polar plot. Despite the relatively high computational load of these analyses, the mean processing time remains within about 30 ms, highlighting the system’s high responsiveness, which enables rapid feedback during interactive correction.
To further evaluate its performance, the proposed GUI was validated using real ECG recordings. In addition to the two normal sinus rhythm records (MIT-BIH records 115 and 122, MLII lead), we newly included record 209, which contains substantial arrhythmic activity with frequent APCs and episodes of SVTA. This expanded validation allows assessment of the method under more diverse and physiologically challenging conditions.
Across the three records, a total of 7434 R-peaks (corresponding to 7431 RRIs) were analyzed. All R-peaks were manually clicked under three display time scale settings (2 s, 5 s, and 10 s), resulting in 22,302 manual clicking operations. Despite the irregular morphology and variable cycle lengths in record 209, the GUI maintained high accuracy: as summarized in Table 6, accuracies were at least 0.985 at the ±10 ms tolerance for the 2 s and 5 s scales and reached 1.000 at tolerances of ±20 ms or wider at these scales for all three records. These results indicate that the GUI achieves a precision of a few samples at 360 Hz (2.78 ms per sample) not only under normal sinus rhythm but also under arrhythmic conditions. Moreover, this large number of repeated manual operations demonstrates the operational stability and practical robustness of the proposed interaction framework.
Moreover, the comparison of the RRI time series (
Figure 5) indicates close overlap between GUI-derived and gold-standard RRIs, even at the 10 s display scale. This observation provides intuitive visual support that beat-to-beat variability is preserved under this viewing condition. The Bland–Altman analysis further supported the reliability of the GUI-derived RRIs (
Figure 6). Across the three representative records (115, 122, and 209), the mean differences in RRI (GUI − Gold) were extremely small (−0.48 to −0.02 ms), indicating the absence of systematic bias in the resulting RRI measurements. The LOAs were approximately ±12 to ±14 ms, which corresponds to, at most, about ±5 samples when considering the sampling frequency of 360 Hz (2.78 ms per sample). This level of discrepancy falls within the expected range of human visual point-selection variability and is unlikely to affect subsequent HRV analysis. In all records, the differences were symmetrically distributed around 0 ms, indicating no tendency for the operator to consistently click on either the left or right side of the R-wave peak. Moreover, no RRI-dependent increase in error magnitude was observed, suggesting that the majority of residual discrepancies consisted of random rather than systematic errors. Notably, this level of accuracy was achieved even under the most challenging display condition with a 10 s time scale. Therefore, even higher accuracy can be expected in typical operational settings using finer time scales (e.g., 2–5 s per screen). In record 209, the differences formed a bimodal distribution; however, this was not attributable to the GUI itself but rather to the physiological characteristics of the recording, which included two distinct phases: sinus rhythm and APCs. Because APCs are associated with shorter RRIs, the formation of a separate cluster at lower mean RRI values is physiologically natural. Importantly, even during the APC-dominant phase, the differences (GUI − Gold) remained symmetrically distributed around zero, indicating no systematic bias in click positioning. These findings demonstrate that the accuracy of GUI-based R-wave correction is preserved even in the presence of arrhythmic events such as APCs and that performance does not degrade due to rhythm irregularity. 
Consequently, the proposed GUI tool can be considered robust not only for sinus rhythm data but also for real-world ECG recordings that include arrhythmias. As discussed above, the present study focused on validating the core correction framework using controlled ECG data, and extending the evaluation to noisier signals (e.g., wearable or ambulatory recordings) and multi-lead ECG data remains an important topic for future work.
Taken together, the findings support the robustness and practicality of the proposed algorithm across the ECG conditions examined. A key feature of the proposed algorithm is its simplicity: by maintaining a dedicated flag array that records the positions of detected peaks, the system can recalculate the RRI time series whenever peaks are added or removed, without altering the raw ECG data. Because this non-destructive structure requires only the flag values to be updated, computational cost is low and the RRI and associated analyses can be updated rapidly. The implementation of the system in Fortran is also a strength in terms of processing speed. Fortran is a compiled language specialized for numerical computation and typically executes faster than interpreted languages such as Python or MATLAB. It is widely used in fields such as fluid dynamics, high-speed mathematical model development, and aerospace engineering [
35,
36,
37], and its long-standing track record in high-performance computing (HPC) and numerical analysis is well recognized in academia. For example, its compiled array operations and explicit memory-access directives have been reported to yield speedups ranging from several-fold to a hundred-fold over Python [
38]. Moreover, High Performance Fortran (HPF), an extension of Fortran developed for HPC, enables efficient parallelization using FORALL constructs and PURE procedures [
39]. In this system, the GUI and computational modules are closely integrated, and although the entire waveform is recalculated each time a peak is added or removed, high responsiveness is still achieved. Thus, the Fortran implementation represents a rational design choice emphasizing both performance and responsiveness.
In addition to evaluating performance, the software architecture of the present prototype requires consideration. The system was implemented in Fortran with the DISLIN graphics library, a practical choice for achieving fast numerical operations and stable GUI responsiveness necessary for validating the proposed algorithm. Since the purpose of this study was to present the algorithm and demonstrate a working prototype, the implementation prioritized feasibility rather than software generality. We recognize, however, that the Fortran + DISLIN framework may pose limitations in terms of portability and broader adoption. As described in
Section 2.3, the underlying algorithm itself is simple, relying on only three arrays (time, ECG amplitude, and binary peak flags). Although such operations could, in principle, be reproduced in other environments, the present study did not implement or evaluate alternative platforms. Therefore, the applicability of the method beyond the current prototype remains a conceptual possibility rather than an established feature, and further software development lies outside the scope defined by this prototype-oriented study.
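To make the three-array structure concrete, the following Python sketch illustrates the idea under stated assumptions. It is not the Fortran + DISLIN prototype and was not implemented or evaluated in this study; the function names `rri_from_flags` and `toggle_peak`, the placeholder ECG array, and the peak indices are hypothetical.

```python
import numpy as np

def rri_from_flags(time_s, flags):
    """Recompute peak times (s) and RRIs (ms) from the binary flag array."""
    peak_times = time_s[flags == 1]
    return peak_times, np.diff(peak_times) * 1000.0

def toggle_peak(flags, index):
    """Add a peak if absent, remove it if present; only the flags change."""
    flags[index] ^= 1

fs = 360.0                                      # sampling frequency (Hz)
time_s = np.arange(3600) / fs                   # 10 s of time stamps
ecg = np.zeros(time_s.size)                     # placeholder for the raw ECG
flags = np.zeros(time_s.size, dtype=np.uint8)   # 1 marks a detected R-peak
flags[[180, 468, 756]] = 1                      # hypothetical detected peaks

_, rri = rri_from_flags(time_s, flags)
print("initial RRIs (ms):", rri)

toggle_peak(flags, 300)                         # operator adds a (false) peak
_, rri = rri_from_flags(time_s, flags)
print("after adding a peak:", rri)

toggle_peak(flags, 300)                         # operator removes it again
_, rri = rri_from_flags(time_s, flags)
print("after removing it:", rri)
```

In this sketch, each correction modifies a single flag value, and the RRI series is recomputed from the full flag array, while the `ecg` array is never written to. This mirrors the non-destructive design described above, although performance characteristics in other environments remain untested.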
The present study also did not include quantitative benchmarking against existing tools such as Kubios HRV, RR-APET, or ECGdeli. These software packages are primarily optimized for automated or offline analysis, whereas the present system focuses on user-driven correction with on-click updating of the RRI time series. Accordingly, the present work should be interpreted as a prototype-oriented, proof-of-concept study focusing on interaction design and immediate RRI updating, rather than as evidence of superiority over existing software. A systematic comparison of latency, correction accuracy, and user-action load remains an important direction for future work. Within the present prototype, system responsiveness was evaluated under varying computational loads by changing the ECG sampling frequency and display time scale, and update latency remained stable across these conditions.
Moreover, formal usability evaluation involving multiple users was not conducted in this study, as the primary objective was to propose the algorithm and demonstrate a working prototype. Accordingly, user-centered usability assessment and inter-operator variability remain important topics for future work.
Overall, the results indicate that the proposed system combines low error, low latency, and high efficiency, demonstrating feasibility for rapid updating after R-peak correction in short-term experimental settings. The present study primarily focused on the development and validation of the underlying algorithm, and the GUI was verified not only with simulated data but also with real ECG data from the MIT-BIH Database, demonstrating consistent accuracy under physiological conditions. Broader aspects of usability, such as interface layout, toolbar design, and user experience, were not considered in this study and remain subjects for future improvement. Further benchmarking against other available tools and user studies involving multiple operators should be conducted to establish practical applicability and assess inter-rater reliability. Although the current validation mainly targeted R-peak correction, the same algorithmic structure may be extended to other fiducial points (e.g., P-, Q-, S-, and T-waves) or to other biosignals, such as respiration and electromyography (EMG), when displayed as time series.
Supplementary Materials (Video S1, File S2, and File S3) include the synthetic ECG signals, GUI-derived RRI data, and a demonstration video to ensure transparency and reproducibility.