A modified Kwee van Woerden algorithm with reliable error-estimates

The Kwee van Woerden (KvW) method for the determination of eclipse minimum times has been a staple in eclipsing binary research for decades, due its simplicity and independence of external input parameters. However, its estimates of the timing error have been known to be of low reliability. During the analysis of very precise photometry of CM Draconis eclipses from TESS space mission data, KvW’s original equation for the timing error estimate produced numerical errors, which evidenced a fundamental problem in this equation. This contribution introduces an improved way to calculate the timing error with the KvW method. A code that implements this improved method, together with several further updates over the original method is presented as well. An example application on the CM Draconis light curves from TESS is given, where we show that its timing error estimates of about 1 second are in excellent agreement with error estimates obtained by other means.


Introduction
The Kwee van Woerden method, henceforth KvW, has been very popular for eclipse minimum time determination since its publication in 1956 [1]. This is due to its computational simplicity as well as due to its independence from assumptions about the data that are being analyzed, beyond the assumption of data-points spaced equally in time and a symmetric eclipse shape. However, practicioners have long been aware of the unreliability in the algorithm's error estimates, being typically considered as too optimistic [2,3,4]. During the analysis of highly precise eclipse time-series from the TESS space mission, KvW's equation for the timing error estimate went however from unreliable to becoming unsolvable, which motivated the modification to the error estimate described here.
The KvW method, in brief, assumes a light curve of an eclipse of N equidistant points separated by t, in which a given data point at time T1 represents a preliminary time of minimum. Using T1 as reflection axis, the differences in magnitudes or fluxes between the paired points mk (k=1...n) are taken, and their squared sum is calculated: S(T1) ≡ ∑ ( mk) 2 . The symmetry axis is then shifted to (T1 -½ t) and (T1 + ½ t), and the corresponding sums S(T1 -½ t) and S(T1 + ½ t) are calculated while keeping the number of pairings, n, the same in all reflections. The values of S against time are then fit by a parabola of the form Sfit(T) = a T 2 + bT +c . ( This parabola has a minimum value of Sfit(T0) given by which is the sought-after time of minimum. For the error of the minimum time, σT0, KvW give the following equation: where Z is the maximum number of independent flux pairings, with Z=N/2 in the case of equidistant points. We note that the original KvW uses just three reflections for the calculation of S; later implementations may also use five or seven reflections spaced by further ½ t-steps away from T1; we call them 3, 5 and 7-fold implementations of the KvW. the folding was performed. The round points are S values from five foldings at the given fold-ID, while the solid line is the second-order polynomial that is fitted through these five points. We note that the minimum of the fitted curve is slightly below zero.

Identification of the problem
The failure of Eq. (4) became apparent when we employed the original 3 or 5-fold KvW to determine T0 on individual eclipses of the well characterized M4-M4 binary CM Dra in very precise light curves from the TESS space mission [6]. Individual TESS light curves have lengths of about 28 days, and therefore contain 17-19 primary and secondary eclipses of CM Dra, which has a period of 1.268 days, with primary and secondary eclipses of rather similar depths of 47.5% and 44.5%, respectively.
In several individual eclipses, a computational error arose when attempting to solve Eq. 4, caused by a negative value of the term 4ac -b 2 . This condition of 4ac -b 2 < 0 is equivalent to the minimum value of Sfit (Eq. 2) becoming negative (see also Fig. 1).
While the derivation of σT0 from Eq. 4 was plainly failing in some cases, we also note that σT0 may approach zero -and potentially be underestimated by orders of magnitudes -whenever a positive term 4ac -b 2 approaches zero.
Tests were then performed with TESS light curves with artificially added noise. In these cases, the minimum values Sfit(T0) increase and the numerical problems in the determination of σT0 vanish. Hence, problems in the determination of σT0 , as well as serious underestimations of σT0 from very small values of 4ac -b have been a consequence of the significant increase in photometric precision in the more than 60 years since the algorithm's publication.

A revised determination of the timing error
Considering the average noise of the flux-measurements to be µ = <µi>, with µi being the observational error of an individual measurement, the average noise in the difference of two fluxes mk = m+k -m-k is given by µ√2.
Instead of using the sum of the squared residuals S(Tj), with Tj being the epoch of the reflection axis, we may as well calculate the usual 2 statistical values for these pairings, which are then If we assume that the flux from the observed binary is perfectly symmetric around the minimum time T0, then the observed value of 2 (T) at the time of the minimum T=T0 is only due to the average noise µ. Hence, 2 (T0) would be determined by the number of degrees of freedom Z, and be given by: Following Eq. 5, the equivalent expression for the value S (T0) is then: We note that KvW's Eq. 13 also indicates that S(T0) should have this value, except for a factor Z/(Z-1) that is close to unity. In our modified KvW algorithm, after an initial determination of the coefficients a, b and c as usual, we therefore offset S(T0) so that S(T0) is defined by Eq. (7). Using Eq. (2), the coefficient c is then recalculated so that S(T0) complies with Eq. (7), with c now given by: The average flux error µ should preferentially be supplied as an external parameter, from a measurement of a time-series' noise outside of the eclipses. If this is not possible, as an alternative we may assume that the lowest S(T) obtained from foldings on or between data points is dominated by the noise µ, while contributions to that S(T) from remaining imperfections in the folding's symmetry are relatively small. µ can then be derived from an inversion of eq. (7). In practical cases, if at least two data points are in an eclipse's central flat part, this method leads to values of µ that are within 50% of a µ measured from the off-eclipse noise.
Inserting the revised value of c from Eq. (8) into KvW's original error determination, Eq. (4) simplifies to: We note that a conversion to 2 statistics by dividing Eq. (9) with 2µ 2 leads to σT0 = 1/√a . This corresponds to the usual definition of the 1-σ region of confidence for one free parameter, where a quadratic function describing 2 (T) increases at T = T0 ± σT0 by 1 over the minimum value 2 (T0).

Code implementation of the Kwee van Woerden method with improved error estimate
A code kvw.pro has been written in the IDL language that implements the KvW method with the timing error estimate as described above (see the supplementary materials for a link to the code). It is not overly complex and is heavily commented, which should facilitate its implementation in other languages. The code provides also several further improvements over a basic implementation of the KvW method, which are itemized here as follows: -Test for equidistance of input flux points: Similar to the original KvW method, the code requires data points that are equally spaced in time. The code tests if variations in temporal spacing that larger then 1% of the median spacing occur, and if so, does halt further processing. If this is considered too stringent, the rejection value can be modified. The currently code does not facilize the processing of unequally spaced data. If needed, input data to kvw.pro should be converted into equidistant flux-points through prior linear (as proposed in the original paper by KvW) or higher-order interpolations or fits.
-Selection of data points: While the user has to take care that an input time-series contains only data taken during an eclipse (usually requiring some minimum flux-drop against the off-eclipse flux, see Fig. 2), asymmetric data-coverage around the center of an eclipse is recognized by the code. The code always selects the maximum available number of data points that are available for pairings on either branch of an eclipse, and hence balances the coverage of in-and of egress (Fig. 3).
-Employment of more than 3 foldings around the initial minimum time estimate: The number of foldings needs to be odd and the use of 5 or 7 foldings is recommended.
-For the initial minimum time estimate, the algorithm uses by default the central point of the supplied light curve, but the user may also select to use the point with the lowest flux. The central value is the better choice unless there are considerable asymmetries in the eclipse light curve. The point of least flux should only be used in low-noise data, when this point is well defined against the noise of the curve.
-Symmetrizing the fit to S(T): If the lowest value of S(T) does not correspond to a folding that is close to the initial estimate of the minimum time, an asymmetric fitted curve Sfit will be obtained, with the longer branch either to the left or right of the lowest S(T) having a larger weight onto the coefficients a, b and c (see also Fig. 4). To avoid this, the outer values of S(T) in the longer branch are cropped, so that this branch is at most one point larger than the shorter one. The fit for Sfit is then performed on the reduced set of values S(T). This cropping can only be performed if S(T) has been obtained at more than 3 foldings.

Example application to TESS data and verification of the error estimates
In the following we show the application of the modified KvW to the aforementioned TESS data of CM Dra. The analyzed data are the first ones obtained by TESS on CM Dra, in its Sector 16, which was acquired from 2019 Nov. 11 to 2019 Oct. 7. Besides Sector 16, CM Dra was also observed in TESS sectors 19 and 22 to 26; their analysis is however beyond the scope of this publication. The light curve was downloaded from NASA's MAST archive, and had been processed with its pipeline version spoc-4.0.28. (We note that light-curves available on MAST until Spring 2020, which had been processed by versions prior to spoc-4.0.26, have time-stamps that are too large by 2 seconds [7]). From this dataset we used the 'PDCSAP_FLUX' values, which are fluxes that underwent a 'Pre-search Data Conditioning' procedure to remove common instrumental effects [8]. Around each individual eclipse, a time-series covering about 3 times the eclipse-duration of 0.050d was extracted; see Fig. 2 for an example of an extracted section. For each eclipse snippet, a second order polynomial was then fitted to the off-eclipse sections before and after the eclipse. The fluxes were then divided by the polynomial fit, resulting in an eclipse light curve whose off-eclipse flux is normalized to 1, and which is free of gradients and other signals on time-scales larger than a few hours, be they from CM Dra or from instrumental effects (see again Fig. 2).
The flux error µ was determined from the off-eclipse point-to-point rms of these normalized light curves. Among individual eclipses, it varied between 0.98 · 10 -3 and 2.45 · 10 -3 in normalized flux units. Since the higher of these rms values were dominated by individual flux-peaks in the off-eclipse data, for the further analysis we used a noise that was averaged over all eclipses, which was µ = 1.38 · 10 -3 or 1380 ppm. The modified KvW was then executed separately on each of the primary and secondary eclipses, using only data-points with relative fluxes of less than 0.95 (red points in Fig. 2).
Resulting minimum times of all complete eclipses with their errors are shown in Table 1, while Fig.  5 shows their O-C values against the ephemeris given by [5] (We note that in [5], the epochs of reference are given in BJD_TAI. These have been converted to BJD_TDB by adding 32.184 seconds).
Based on Table 1, the average of the individual timing errors is 1.25 · 10 -5 d for primary minima, and 1.34 · 10 -5 d for secondaries (or 1.08 and 1.16 sec), with individual errors varying only very slightly around these averages.
An independent estimate of the timing error can be obtained from the standard deviation of the O-C values against the mean O-C value (dashed lines in Fig 5). This value is 1.18 · 10 -5 d for primary and 1.48 · 10 -5 d for secondary eclipses. This is in very good agreement to above mentioned errors, and shows that the timing errors from individual eclipse timings are reliable.
A further independent verification can be obtained from equations for the expected timing precision in eclipse or transit light curves [9]. That work provides several equivalent formulae for the timing error, based on a light curve's noise, and on the eclipse depth and duration. Here we use their Eq. 7, given as: where F the relative depth of the eclipse, T∇ is the combined in-and egress duration, and n∇ is the number of data-points within T∇. For the photometric noise we use again µ = 1.38 · 10 -3 and for F the above mentioned eclipse depths of 0.475 and 0.445. The duration we determine as T∇ = 0.050d, which corresponds to n∇ = 36 data points, given the 2-minute cadence of TESS. With Eq. (10) we obtain then timing errors of 1.21 · 10 -5 d for primary and 1.29 · 10 -5 d for secondary eclipses -which is again in excellent agreement with the results obtained by the modified KvW.  Table 1 and the calculated times and epoch numbers are from the ephemeris of [5]. The dashed line is the average O-C value of these eclipeses.