Introduction
Eye movements can be measured with different technical systems (for a review see Collewijn, 1999) that all need a calibration procedure to provide the angular position of the eyes. Only the search-coil technique can be calibrated objectively (i.e. physically); all other techniques (e.g. limbus tracking, Purkinje image tracking, video systems) require a subjective calibration, i.e. a recording during steady fixation of single targets at known angular positions. Usually, a linear regression is calculated between the spatially defined calibration points $x_i$ (deg) and the corresponding raw data $y_i$ (arbitrary units) measured during fixation of the calibration points. Least-squares (LS) fits determine the coefficients $b_0$ and $b_1$, i.e. the y-intercept and the slope, respectively:
$$\hat{y}_i = b_0 + b_1 x_i \qquad (1)$$
For any measured raw data $Y_m$ (arbitrary units) within the calibration range, the corresponding eye position $X_m$ (deg) can be calculated by:
$$X_m = \frac{Y_m - b_0}{b_1} \qquad (2)$$
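The two calibration steps described above (fitting the line, then inverting it) can be sketched as follows. This is a minimal illustration, not the authors' software: the function names and the 7-point data set are hypothetical, and the raw signal is simulated without noise.

```python
import numpy as np

def fit_calibration(x_deg, y_raw):
    """Least-squares fit of raw data y on target position x; returns (b0, b1)."""
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept]
    b1, b0 = np.polyfit(x_deg, y_raw, 1)
    return b0, b1

def raw_to_deg(y_m, b0, b1):
    """Invert the calibration line: eye position (deg) for raw value(s) y_m."""
    return (np.asarray(y_m, dtype=float) - b0) / b1

# Hypothetical 7-point calibration (positions in deg, raw data in arbitrary units)
x_deg = np.array([-4.5, -3.0, -1.5, 0.0, 1.5, 3.0, 4.5])
y_raw = 120.0 + 35.0 * x_deg  # assumed noise-free raw signal, for illustration only

b0, b1 = fit_calibration(x_deg, y_raw)
print(raw_to_deg(190.0, b0, b1))  # eye position in deg for a raw value of 190
```

The inversion is only meaningful for raw values within the calibrated range, as stated in the text.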
Figure 1 shows examples of typical calibration curves for the two eyes (relative to the x-axis, the position of the two eyes is indicated at the bottom). Both curves have been recorded separately for each eye with 7 calibration targets that have been presented monocularly.
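Usually, the measured data points do not lie on a perfect line. Consequently, the measured eye position is subject to an uncertainty that can be described by a standard deviation (SD) given by the following equation (Fogt & Jones, 1998a; Fogt & Jones, 1998b; Neter, Wasserman & Kutner, 1990):
$$SD = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2} \cdot \frac{1}{b_1^2} \left[ 1 + \frac{1}{n} + \frac{(X_m - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right]} \qquad (3)$$
with
$y_i$: the measured value
$\hat{y}_i$: the calculated value
$\bar{x}$: the mean of the calibration positions $x_i$ (central fixation)
$n$: number of calibration points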
Considering the mathematical characteristics, the standard deviation SD depends on the following four aspects (see equation 3):
(1) The actual angular position of the eye (relative to the calibration centre) is important: the more the eye position ($X_m$) deviates from central fixation ($\bar{x}$), the larger the SD.
(2) Increasing the number of calibration points $n$ decreases the SD, at least at points far from central fixation.
(3) The separation between the calibration points ($x_i - \bar{x}$) contributes to the SD, depending on the eccentricity.
(4) Generally, outliers contribute to the SD because of the squared influence of the residuals on the mean square error.
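The four dependencies listed above can be explored numerically. The sketch below assumes that equation (3) has the standard inverse-prediction form of Neter, Wasserman & Kutner (1990), i.e. SD = sqrt(MSE / b1^2 x [1 + 1/n + (Xm - mean(x))^2 / sum((xi - mean(x))^2)]) with MSE = sum of squared residuals / (n - 2). The function name and the data are hypothetical.

```python
import numpy as np

def calibration_sd(x, y, x_m):
    """SD of an eye position x_m estimated from a linear calibration.

    Assumed form of equation (3): inverse prediction in simple linear regression.
    x: calibration positions, y: raw data at those positions, x_m: position(s) of interest.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n < 3:
        raise ValueError("at least 3 calibration points are needed (MSE divides by n - 2)")
    b1, b0 = np.polyfit(x, y, 1)                 # slope, intercept
    residuals = y - (b0 + b1 * x)
    mse = np.sum(residuals ** 2) / (n - 2)       # mean square error of the regression
    sxx = np.sum((x - x.mean()) ** 2)            # spread of the calibration points
    x_m = np.asarray(x_m, dtype=float)
    return np.sqrt(mse / b1 ** 2 * (1.0 + 1.0 / n + (x_m - x.mean()) ** 2 / sxx))

# Hypothetical 3-point calibration (min arc) with small deviations from the line
x = np.array([-90.0, 0.0, 90.0])
y = x + np.array([1.0, -2.0, 1.0])
print(calibration_sd(x, y, 0.0), calibration_sd(x, y, 90.0))
```

With these data the SD is larger at the eccentric position than at the centre, which illustrates aspect (1).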
Figure 1.
Example of typical calibration curves for the two eyes. Relative to the x-axis, the position of the two eyes is indicated at the bottom.
For accurate eye movement recordings, one wishes to indicate a confidence interval (CI) for all angular positions that occur in a particular experiment. In general, it is desirable to calibrate the eye movements so that a small CI (or a small SD, as described so far) results, reflecting only minor uncertainties attributable to the calibration process. Obviously, the SD will be small if the mean square error (MSE; see the first part of equation 3) of the calibration is small. However, the SD also depends on design parameters of the calibration procedure itself, i.e. the number of calibration points and their separation, as mentioned above.
In the literature on one-dimensional, horizontal eye movements, sometimes only two calibration points were used (see, for example, Semmlow & Yuan, 2002; Semmlow, Chen, Pedrone & Alvarez, 2008); these authors argue that a straight line can be determined with two points, assuming linearity of the recording system. For a small SD, this requires a highly reliable measurement of the two points. Other strategies include more calibration points to reach a good approximation of the calibration function. This procedure should result in a small SD, but is time-consuming. Thus, the question arises how the calibration procedure should be designed to achieve a small SD within an appropriate period of time.
We investigated the effect of the number and the angular separation of the calibration points on the SD in two ways: (1) we performed simulations according to equation (3), and (2) we compared the simulations with empirical data measured under experimental variations of the calibration procedure. This study was made to show that the calculation of the SD may be a useful procedure to specify the quality of eye movement recordings with respect to the calibration; this is still uncommon, despite the previous contributions of Fogt and Jones (1998a, 1998b).
Experimental variation of the number and separation of calibration points
For comparison with the simulations reported above, we empirically investigated the effect on the SD of measured calibrations in which the number and separation of the calibration points were varied.
Method
We used a mirror stereoscope (Howard, 2002) with two mirrors at right angles and two VDU screens (CRT Sony F500 T9). In order to minimize head movements, we used a chin and forehead rest including a narrow temporal rest, which was adjusted to the size of the subject’s head. The eye movements were recorded with the video-based EyeLink II®, which tracks the centre of the pupil by an algorithm similar to a centroid calculation. The EyeLink II system has a linear horizontal tracking range of ±30° and a spatial resolution of 0.6 min arc (more details are provided by SR Research Ltd, Osgoode ON, Canada). The EyeLink cameras were attached to the head rest. We used neither the head tracking system nor the calibration procedure of the original EyeLink II system; rather, we recorded the raw data with a sampling rate of 500 Hz and used the following calibration procedure.
Subjects were requested to carefully fixate calibration targets that appeared (for 1400 ms each) in random order at different screen positions, separated by temporal gaps of 100 ms; monocular presentations to the right and left eye were randomly interleaved. Two of these calibration series were run directly one after the other and the results were averaged. In order to draw attention to the calibration points and to facilitate exact fixation, the calibration spot initially subtended 1 deg in diameter and shrank over the first 1000 ms to a remaining cross of 8.1 x 8.1 min arc (stroke width: 2.7 min arc); the remaining cross was visible for an additional 400 ms, during which calibration data were stored. The whole calibration range subtended 720 min arc (12 deg) at a viewing distance of 60 cm.
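The timing above implies that, at the 500 Hz sampling rate, each target presentation yields 700 samples, of which the last 200 (the final 400 ms) are stored. A minimal sketch of extracting one raw calibration value per target follows. How the stored samples were reduced to a single value is not stated in the text, so the use of the mean is an assumption, as are the function name and the requirement that the sample array starts at target onset.

```python
import numpy as np

def fixation_value(samples, rate_hz=500, target_ms=1400, store_ms=400):
    """Mean raw value over the final store_ms of one target presentation.

    Assumptions: samples[0] coincides with target onset, and the stored
    samples are summarised by their mean (not specified in the source).
    """
    samples = np.asarray(samples, dtype=float)
    n_total = int(round(target_ms * rate_hz / 1000))  # 700 samples per target
    n_store = int(round(store_ms * rate_hz / 1000))   # 200 samples in the final 400 ms
    if samples.size < n_total:
        raise ValueError("recording is shorter than one target presentation")
    window = samples[n_total - n_store:n_total]
    return float(window.mean())

# Hypothetical trace: the raw signal settles at 7.0 once the saccade has landed
trace = np.concatenate([np.zeros(500), np.full(200, 7.0)])
print(fixation_value(trace))
```

Averaging the values of the two repeated calibration series, as described in the text, would then be a second step applied to these per-target values.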
This procedure was chosen since it represents a fixation task that is not difficult for the subject to perform: it includes a very small final target of only 8 min arc, which requires central foveal fixation and thus stimulates an eye position corresponding to a very precise spatial location, as required for calibration. However, this small target was presented only for a short 400 ms interval; for comparison, fixation durations of about 220 ms are typical during reading. Longer periods of steady fixation would be rather unnatural and give rise to drifts and micro-saccades. The saccades from one calibration point to the next were stimulated by targets that initially had a large diameter of 1 deg in order to be easily perceived in peripheral vision and to draw attention to the next calibration point; the latter feature resembles the one used in Tobii® eye movement recording systems.
Generally, eye movement recordings and calibrations are more accurate and stable if a bite-bar is used. Although no bite-bar was used in the present study, for the convenience of the subjects, the resulting standard deviations were of the same order of magnitude as in the studies of Fogt & Jones (1998a, 1998b), who used a bite-bar and a search-coil recording system. Probably, our short recording period of less than 45 seconds reduced the risk of artifacts due to small head movements.
To test the calibration procedure, we ran calibrations for each eye in a sample of 16 subjects: in a first run, we used separate calibrations with 3, 5 or 7 calibration points at constant inter-point separations of 90 min arc. In a second run, we used 3 calibration points with inter-point separations of 60, 180, and 360 min arc.
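To make the tested designs concrete, the sketch below generates the calibration positions of both runs and evaluates the purely design-dependent part of the SD, i.e. the bracketed term of the assumed inverse-prediction form of equation (3), sqrt(1 + 1/n + (Xm - mean(x))^2 / sum((xi - mean(x))^2)). Two points are assumptions: that the positions are centred on straight ahead (0 min arc), and that equation (3) has this form. The function names are hypothetical.

```python
import numpy as np

def design_positions(n_points, separation):
    """Equally spaced calibration positions (min arc), assumed centred on 0."""
    return (np.arange(n_points) - (n_points - 1) / 2.0) * separation

def design_factor(x, x_m):
    """Design-dependent multiplier of sqrt(MSE)/|b1| in the assumed equation (3)."""
    x = np.asarray(x, dtype=float)
    sxx = np.sum((x - x.mean()) ** 2)
    return float(np.sqrt(1.0 + 1.0 / x.size + (x_m - x.mean()) ** 2 / sxx))

# First run: 3, 5 or 7 points at 90 min arc separation
for n in (3, 5, 7):
    x = design_positions(n, 90.0)
    print(n, x.min(), x.max(), design_factor(x, 270.0))

# Second run: 3 points at separations of 60, 180 and 360 min arc
for sep in (60.0, 180.0, 360.0):
    x = design_positions(3, sep)
    print(sep, x.min(), x.max(), design_factor(x, 270.0))
```

The factor isolates the contribution of the number and separation of points; the measured SD additionally scales with the square root of the MSE, which differs between real calibrations.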
Results
First of all, in our experiments we reached average standard deviations (SD) of less than 20 min arc. Varying the number of calibration points from 3 to 7 resulted in the mean SDs of the two eyes shown by the distributions in Figure 5. No significant difference between the average SD for the 3-, 5- or 7-point calibration was observed. Nevertheless, as seen in Figure 5, using more calibration points reduces the occurrence of large outliers.
In a similar way, the average SD was not significantly different when comparing the separations of 60, 180, and 360 min arc using 3 calibration points (not shown graphically).
Figure 5.
SD distribution for empirical calibrations with 3, 5 and 7 calibration points at constant inter-point separations of 90 min arc.
Discussion
The accuracy of a measured eye position can be described by a standard deviation that depends on the quality of the calibration measurement (i.e., the mean square error of the calibration regression) and on the design of the calibration procedure, i.e. the number of calibration points and the separations between them. Our simulation of equation (3) suggested that the SD depends on the number and the separation in the way described in Figure 2, Figure 3 and Figure 4. Our experimental data, however, showed only small, non-significant effects on the SD, which was, for example, 11.8, 10.5, and 10.2 min arc with 3, 5, and 7 calibration points, respectively. This suggests that one should be careful in using equation (3) and the resulting simulation as a guideline for designing the calibration procedure, since the systematic variation of the SD could not be validated by our empirical data set. The most convincing reason for this discrepancy between simulation and experimental data is the following: for our simulations we kept the $R^2$ of the calibration regression constant by definition (at a value of 0.975). Such an assumption is necessary in order to make the simulations comparable. However, for the empirical data the assumption did not hold; we calculated the $R^2$ for our last sample of 32 calibrations (16 subjects x 2 eyes) and observed a decrease of $R^2$ with the reduction of the number of calibration points (see Figure 6).
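The coefficient of determination referred to above can be computed per calibration as sketched below. This uses the usual definition for a simple linear regression (1 minus the ratio of residual to total sum of squares); the function name and the example data are hypothetical.

```python
import numpy as np

def r_squared(x, y):
    """Coefficient of determination of the LS calibration line of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, 1)                  # slope, intercept
    ss_res = np.sum((y - (b0 + b1 * x)) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)          # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Hypothetical 3-point calibration (min arc) with small deviations from the line
x3 = np.array([-90.0, 0.0, 90.0])
y3 = x3 + np.array([1.0, -2.0, 1.0])
print(r_squared(x3, y3))
```

Applied to each of the 32 calibrations, such a function would give the kind of distribution summarised in Figure 6.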
Figure 6.
$R^2$ distribution for empirical calibrations with 3, 5 and 7 calibration points at constant inter-point separations of 90 min arc.
The larger the number of calibration points, the smaller the effect of single outliers on the standard deviation. More specifically, it can be seen from equation (3) that increasing the number of calibration points n from 3 to 5 raises the divisor n − 2 of the mean square error from 1 to 3; for a given sum of squared residuals, this reduces the mean square error by a factor of three (and thus the SD by a factor of about 1.7), in spite of the squared influence of the residuals of single outliers.
In sum, even though the simulation with constant $R^2$ shows that the SD depends on the design of the calibration procedure, the empirical SDs can be expected to remain stable. Nevertheless, with large eye movements, a three-point calibration results in a large SD for eccentric eye positions, and more calibration points are required to reduce the SD. Thus, a 5-point calibration is still a good choice, since the regression is less affected by extreme outliers and it is possible to calculate a robust regression, which can reduce the mean square error (see Appendix).
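The robust regression of the Appendix is not reproduced in this section. Purely as an illustration of why a 5-point calibration tolerates a single outlier, the sketch below uses the Theil-Sen estimator (median of pairwise slopes) as a stand-in; it is not claimed to be the method of the Appendix, and the function name and data are hypothetical.

```python
import numpy as np
from itertools import combinations

def theil_sen(x, y):
    """Theil-Sen line: slope = median of pairwise slopes, intercept = median of y - slope*x.

    Illustrative stand-in for a robust regression; returns (b0, b1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(x.size), 2)
              if x[j] != x[i]]
    b1 = float(np.median(slopes))
    b0 = float(np.median(y - b1 * x))
    return b0, b1

# Hypothetical 5-point calibration: true line y = 10 + 2x, one outlier at the last point
x5 = np.array([-180.0, -90.0, 0.0, 90.0, 180.0])
y5 = 10.0 + 2.0 * x5
y5[4] += 100.0

ls_slope, ls_intercept = np.polyfit(x5, y5, 1)   # ordinary least squares for comparison
print(theil_sen(x5, y5), (ls_intercept, ls_slope))
```

In this example the least-squares slope is pulled towards the outlier, whereas the median-based slope is not. With only three points, a single outlier affects two of the three pairwise slopes, so the same protection is not available.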
Although the present study was made with horizontal calibration positions, the principal results can be transferred to the vertical direction. The next step of research could be to calculate the regression coefficients (horizontal and vertical) in a multivariate design and to estimate confidence ellipses instead of confidence intervals for eye positions.
In conclusion, for quantifying the uncertainty of the measured eye position due to calibration errors we suggest that equation (3) is a useful tool for the calculation of the standard deviation based on the actually recorded calibration and the chosen positions of calibration points.
The practical procedure for designing the calibration might be to define the calibration range so that it covers the angular extent of the eye movements to be recorded. The calibration points are then spaced equally across the calibration range. Although the number of calibration points and their separation did not have much effect on the standard deviation, the effect of outliers can be reduced by increasing the number of calibration points (particularly if robust regression analysis is used).