Results and Discussion for Central Tendency, Variability, and Reliability
The tables of results presented below are structured in two levels: the top part presents results for 
single-value features. The bottom part presents results for 
multi-value features (six feature subtypes are presented in corresponding columns). As already described, these multiple values are extracted by calculating descriptive statistics (columns 
Mn, 
Md, 
Sd, 
Iq, 
Sk, 
Ku) on values from multiple feature instances in a recording. The tables present values only for the independent components of eye movement (horizontal 
H, and vertical 
V), except for the features extracted only from the radial component or from the trajectory in the 2-D plane. In 
Table 1,
Table 3, and 
Table 5 (fixations, saccades, and post-saccadic oscillations, respectively) we present the values of central tendency (median, denoted MD) and overall variability (inter-quartile range, denoted IQ) for feature values across the subject population. In 
Table 2, 
Table 4, and 
Table 6 we present the corresponding measures from the assessment of normality and reliability of the features. For each feature, one column gives the maximum p value (p) obtained from the described normality-assessment procedures, and the adjacent column gives either the ICC, when the p value indicates a normally distributed feature (p ≥ 0.05), or Kendall’s W, when the p value indicates a non-normally distributed feature (p < 0.05). To make the results easier to scan, the cells that correspond to non-normal features are highlighted with light grey shading. Although the two reliability measures (ICC/W) share a column for simplicity, we emphasize once more that their values should not be compared directly.
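The selection rule for the reliability column can be sketched as follows. This is a minimal illustration under the thresholds stated above, not the authors' implementation; the function name `choose_reliability_measure` and the `alpha` parameter are hypothetical.

```python
def choose_reliability_measure(max_p, alpha=0.05):
    """Pick the reliability measure reported for one feature subtype.

    max_p is the maximum p value from the normality assessment.
    Returns the measure name and whether the table cell would be shaded.
    """
    if max_p >= alpha:
        # Normality not rejected: report the intraclass correlation.
        return "ICC", False
    # Normality rejected: report Kendall's W and shade the cell.
    return "Kendall's W", True


for p in (0.31, 0.05, 0.004):
    measure, shaded = choose_reliability_measure(p)
    print(f"p = {p:<5} -> {measure} (shaded: {shaded})")
```

Because the two measures are computed on different scales (values versus ranks), the returned label should be kept next to every reported number so that ICC and W values are not mixed in later comparisons.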
In 
Table 1, we give an overview of the typical values of fixation features calculated over the experimental population. The median fixation duration was about 200 ms (
F02), which corresponds to an average rate of about 3–4 fixations per second (
F01). This duration is within the expected range for fixations during reading, and similar values have been reported in previous research studies (
Nyström & Holmqvist, 2010; 
Rayner, 1998). Since the fixation centroid (
F03) is a direct measure of position, the extracted values for this feature are heavily affected by the positioning and centering of the stimulus. However, when a common stimulus is used for all subjects (as in our experiments) the median and inter-quartile range can provide clues about the existence of systematic error and its variability, either system-related or subject-related (unique error signature) (
Hornof & Halverson, 2002). As revealed from the values of 
F05-
F06, the drift during fixation affects both components of eye movement in a similar way. Furthermore, the drift speeds of the two components (
F06) seem to be very close to previously reported values of 0.5°/s (
Poletti et al., 2010). The values of the drift linear-fit slope feature (
F07) reveal a positive tendency for the horizontal component and a negative tendency for the vertical. Another important observation is that the values of the quadratic-fit R² feature (
F09) are larger than those of the linear-fit R² feature (
F08), which seems to indicate the occasional appearance of non-linearity (curvature) in fixation drifts (see 
Figure 1), a phenomenon previously reported in (Cherici, Kuang, Poletti, & Rucci, 2012). Finally, the calculated values for velocity and acceleration (
F14 to F25) demonstrate the relatively low levels of eye mobility during fixations, compared to the corresponding levels for saccades and post-saccadic oscillations (see corresponding tables).
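The comparison between the linear-fit and quadratic-fit R² features can be illustrated with a small sketch. It assumes that a fixation drift is available as time and position samples and that both fits are ordinary least-squares polynomial fits; the function `polyfit_r2` and the synthetic drift are hypothetical and are not taken from the original processing pipeline. Because the quadratic model nests the linear one, its R² can never be lower on the same data, so it is the size of the gap that hints at curvature.

```python
def polyfit_r2(t, x, degree):
    """Least-squares polynomial fit of x(t); returns the R^2 of the fit."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(t^(i+j)), b[i] = sum(x * t^i).
    A = [[sum(ti ** (i + j) for ti in t) for j in range(n)] for i in range(n)]
    b = [sum(xi * ti ** i for ti, xi in zip(t, x)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution for the polynomial coefficients (lowest order first).
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - s) / A[i][i]
    fitted = [sum(c * ti ** k for k, c in enumerate(coef)) for ti in t]
    mean_x = sum(x) / len(x)
    ss_res = sum((xi - fi) ** 2 for xi, fi in zip(x, fitted))
    ss_tot = sum((xi - mean_x) ** 2 for xi in x)  # zero for a constant signal
    return 1.0 - ss_res / ss_tot


# Synthetic 200 ms drift sampled at 1000 Hz with a curved trajectory (degrees).
t = [i * 0.001 for i in range(200)]
drift = [0.5 * ti + 20.0 * ti * ti for ti in t]
print("linear R^2:   ", polyfit_r2(t, drift, 1))
print("quadratic R^2:", polyfit_r2(t, drift, 2))
```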
The examination of 
Table 2 allows for an assessment of the normality and reliability of fixation features. Overall, 50.7% of fixation features (feature subtypes) are found to be normally distributed, and the rest are non-normally distributed. An examination of the shaded parts of the table (non-normal features) reveals a general tendency towards non-normality in the acceleration feature categories, and when using the kurtosis (
Ku) descriptive statistic irrespective of feature category. For fixations, the calculated ICC values for assessing reliability range from 0.06 to 0.92. Following the categorization suggested in (
Cicchetti, 1994) we can see that 32.5% of them are in the region of ‘excellent’ reliability, 20.5% in the region of ‘good’ reliability, 23.9% in the region of ‘fair’ reliability, and 23.1% in the region of ‘poor’ reliability. The top performing fixation features in terms of reliability are 
F14, 
F15, 
F16 (modeling of fixation velocity profile with mean, median, and standard deviation), 
F09 (R² when modeling fixation drift with a quadratic fit), and 
F02 (fixation duration). For the non-normal features, the calculated W values range from 0.52 to 0.98. The difference between the ranges of ICC and W (values of W seem to be compressed into the upper half of the range [0.00, 1.00]) illustrates the risk of attempting to directly compare the values of the two measures.
From the respective values we can see that the top-performing fixation feature categories based on Kendall’s W are F21, F22, F23 (modeling of fixation acceleration profile with mean, median, and standard deviation), and F05 (travelled distance during fixation drift). It is interesting to observe that, although eye mobility is relatively limited during fixations, the dynamic features (based on velocity and acceleration) seem to provide the best test-retest measurement agreement for both normal and non-normal features.
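The grouping of ICC values into reliability regions can be sketched as below. The cut-offs (below 0.40 ‘poor’, 0.40–0.59 ‘fair’, 0.60–0.74 ‘good’, 0.75 and above ‘excellent’) follow the commonly cited guidelines of Cicchetti (1994); the function names and the example values are hypothetical and do not reproduce the values in Table 2.

```python
from collections import Counter


def cicchetti_category(icc):
    """Map an ICC value to a reliability region (Cicchetti, 1994 cut-offs)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


def category_shares(icc_values):
    """Percentage of ICC values that fall in each reliability region."""
    counts = Counter(cicchetti_category(v) for v in icc_values)
    total = len(icc_values)
    return {name: 100.0 * counts[name] / total
            for name in ("excellent", "good", "fair", "poor")}


# Made-up ICC values, used only to show the bookkeeping.
example = [0.92, 0.81, 0.66, 0.06]
print(category_shares(example))
```

Only ICC values should be passed to this categorization; Kendall’s W values are on a different scale and are reported separately.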
In 
Table 3, we present the values for the features extracted from saccades. The median duration (
S02) over the experimental population was calculated to be about 28 ms. This duration seems to be justified given the relatively small amplitude of the saccades performed during reading, and it is within the range reported in other studies that employed the reading paradigm (Abrams, Meyer, & Kornblum, 1989; 
Nyström & Holmqvist, 2010). The median rate of saccades (
S01) is similar to, but slightly lower than, the rate of fixations, possibly due to the post-filtering of large saccadic events. A very interesting group of features are those that model the curvature of the saccadic trajectory. As explained, the feature of saccade efficiency (
S05) models the difference between the amplitude and the actual travelled distance during a saccade. The smaller values of saccade tail efficiency (
S06) (efficiency at the ending part of saccade) when compared to overall saccade efficiency (
S05) indicate the appearance of ‘hooks’ in the saccade trajectory towards its ending part (when the post-saccadic oscillation phase begins). Qualitative observations of such phenomena have been reported in previous studies (
Bahill & Stark, 1975a). The calculated value for the point of maximum raw deviation (
S11) shows that, in general, the maximum raw deviation can be expected to occur around the middle (54%) of the saccadic trajectory. Since the horizontal component of eye movement is typically more active during the reading task, its values for the dynamic features are much larger than those of the vertical component. The median horizontal peak velocity (
S21) was calculated to be about 170°/s, and the relatively large values of the 
Sd and 
Iq feature subtypes reveal a considerable variability of the peak velocity over the course of a recording. The values of peak acceleration and deceleration (
S27, 
S28) are both close to 13000°/s². Similar values, but for a much smaller population, are reported in (
Abrams et al., 1989). The median peak acceleration appears to be slightly larger overall than the peak deceleration; however, the reported variability does not support the generality of this phenomenon. The calculated values for the features of acceleration-deceleration duration ratio (
S39) and peak acceleration-peak deceleration ratio (
S40) also suggest the volatility of this difference. The median acceleration-deceleration duration ratio seems to be slightly over one, although larger values of peak acceleration (compared to peak deceleration) would be expected to correspond to smaller values of duration. An explanation for this discrepancy is that, in general, it is harder to accurately estimate the exact durations of the acceleration-deceleration phases (atypical profiles, multiple zero-crossings, etc.) than to estimate the peak values. The overview of the features of saccadic reading behavior further clarifies the previously discussed difference in fixation and saccade rates (features 
F01, 
S01). Specifically, by adding the rates of ‘large’ saccades (
S49, 
S50) and ‘small’ saccades (
S47, 
S48) we get a value that is much closer to the fixation rate. The rate of leftward large saccades (
S50) is 0.4 (about one such saccade per two seconds), and seems to be consistent with the expected rate of line changes during normal reading. The calculated value for the rate of leftward small saccades (
S48) is 0.8 (about one such saccade per second), a value that seems too large to represent only word regressions. This value can be attributed to small corrective saccades performed during reading, e.g., for correcting undershoots during line changes (
Rayner, 1998).
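The idea behind the saccade efficiency and tail efficiency features discussed in this paragraph can be sketched as follows. The exact definitions are not restated in this section, so the sketch assumes that efficiency is the ratio of the straight-line amplitude to the travelled path length, and that the ‘tail’ is the final quarter of the samples; both choices, the function names, and the synthetic trajectory are assumptions made only for illustration.

```python
import math


def path_length(points):
    """Total distance travelled along a sampled 2-D trajectory."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def efficiency(points):
    """Straight-line amplitude divided by travelled distance (assumed form)."""
    travelled = path_length(points)
    if travelled == 0:
        return float("nan")  # no movement, ratio undefined
    return math.dist(points[0], points[-1]) / travelled


def tail_efficiency(points, tail_fraction=0.25):
    """Efficiency of the final part of the trajectory (assumed tail size)."""
    start = int(len(points) * (1 - tail_fraction))
    return efficiency(points[start:])


# Synthetic saccade: straight horizontal movement ending in a small 'hook'.
hooked = [(float(x), 0.0) for x in range(10)] + [(9.3, 0.4), (9.0, 0.6)]
print("overall efficiency:", round(efficiency(hooked), 3))
print("tail efficiency:   ", round(tail_efficiency(hooked), 3))
```

Under these assumptions, a perfectly straight trajectory has an efficiency of 1, and a hook near the end lowers the tail efficiency more than the overall efficiency, which is the pattern described for S05 and S06.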
The overview of the results from assessing the normality and reliability of saccade features is provided in 
Table 4. An initial observation is that the percentage of saccade features that are normal (or can be normalized) is much larger (74%) than for fixations. A prominent clustering of non-normal features seems to occur for some of the skewness (
Sk) and kurtosis (
Ku) feature subtypes. Also, a considerable clustering of non-normal features can be observed in feature categories 
S05, 
S06, 
S07 (saccade efficiency, tail efficiency, tail inconsistency). For saccade features, the calculated ICC values range from 0.00 to 0.96, with a relatively larger percentage (42.1%) showing ‘excellent’ reliability, 19.9% showing ‘good’ reliability, 16.9% showing ‘fair’ reliability, and 21.1% showing ‘poor’ reliability. The saccade feature categories with the top values of ICC are 
S36 (the ratio of saccade peak velocity to saccade duration), 
S29, 
S30, 
S31 (modeling of saccade acceleration profile with mean, median, and standard deviation), and 
S06 (saccade tail efficiency). The top values refer to the horizontal (or radial) components, since they are more reliable than the vertical components. There are also several other feature categories with exceptional reliability (ICC > 0.9), for example 
S02 (saccade duration) and 
S27-
S28 (peak acceleration and peak deceleration). As with fixations, the calculated Kendall’s W values for the non-normal features seem to be compressed into the upper half of the range, varying from 0.44 to 0.98. The excellent reliability of feature 
S36 (ratio of saccade peak velocity to saccade duration) is further solidified by the higher Kendall’s W measure calculated for the 
Mn subtype of this feature, which was designated as non-normal (the remaining subtypes were designated as normal). The same holds for features 
S06 (saccade tail efficiency) and 
S02 (saccade duration) for subtype 
Md. These and other similar cases (where some feature subtypes are designated as normal and some as non-normal) seem to imply that, although the values of ICC and Kendall’s W cannot be directly compared, there is a certain degree of correspondence in their relative assessments of which feature categories are more reliable than others. Finally, another saccade feature category whose non-normal members have very high W values is 
S20 (number of local minima in velocity profile).
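For readers who want to reproduce the rank-based measure, Kendall’s W can be sketched as below. The sketch uses the standard formula W = 12S / (m²(n³ − n)) for m sessions and n subjects, where S is the sum of squared deviations of the per-subject rank sums from their mean. It omits the correction for ties, and the function names and input layout (one list of feature values per session, subjects in the same order) are assumptions rather than the authors' procedure. If only two sessions are compared (the number of sessions is not stated in this section), W relates to the Spearman correlation as r = 2W − 1, which is one reason why W values tend to sit in the upper half of [0.00, 1.00] whenever the two sessions agree at all.

```python
def ranks(values):
    """Rank values from 1 to n, assigning the average rank to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def kendalls_w(sessions):
    """Kendall's coefficient of concordance, without tie correction."""
    m = len(sessions)        # number of sessions (raters)
    n = len(sessions[0])     # number of subjects
    rank_sums = [sum(col) for col in zip(*(ranks(s) for s in sessions))]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Made-up feature values for five subjects in two sessions.
session_1 = [0.91, 0.55, 0.73, 0.40, 0.62]
session_2 = [0.88, 0.60, 0.70, 0.45, 0.52]
print("Kendall's W:", kendalls_w([session_1, session_2]))
```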
In 
Table 5, we show the values for the post-saccadic oscillation features. The median duration (
P01) was calculated to be about 14 ms, and the median interval between post-saccadic oscillations (
P02) was found to be about 400 ms (though with high variability). Overall, post-saccadic oscillations seem to occur after more than half of the saccades (
P03). This finding agrees with previous observations (
Nyström & Holmqvist, 2010) and further justifies the necessity for modeling the characteristics of post-saccadic oscillations. The vast majority of post-saccadic oscillations (76.5%) are ‘slow’ (
P04) (peak velocities between 20°/s and 45°/s), whereas the percentages of ‘moderate’ (
P05) (peak velocities between 45°/s and 55°/s) and ‘fast’ (
P06) (peak velocities larger than 55°/s) post-saccadic oscillations are about 11–12% each. The velocity and acceleration profile-modeling features (
P10 to 
P20) demonstrate the intermediate levels of eye mobility compared to saccades and fixations. Also, the examination of the ratio features shows that the post-saccadic oscillations have durations about 2–3 times shorter (
P21) compared to the preceding saccades, whereas their peak velocities are about 5–6 times smaller (
P24) than those of saccades (for the horizontal component).
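The three velocity classes used for features P04 to P06 can be written as a simple threshold rule. The thresholds (20°/s, 45°/s, 55°/s) are those quoted above; the treatment of values exactly on a boundary, the handling of peaks below 20°/s, and the function name are assumptions, since the text does not specify them.

```python
def classify_pso(peak_velocity):
    """Classify a post-saccadic oscillation by peak velocity in deg/s.

    Boundary handling is an assumption: 45 deg/s counts as 'moderate',
    55 deg/s counts as 'moderate', and peaks below 20 deg/s are unclassified.
    """
    if peak_velocity < 20.0:
        return None
    if peak_velocity < 45.0:
        return "slow"
    if peak_velocity <= 55.0:
        return "moderate"
    return "fast"


# Made-up peak velocities, used only to show the classification.
for v in (18.0, 30.0, 50.0, 70.0):
    print(v, "->", classify_pso(v))
```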
An overview of 
Table 6 summarizes the normality and reliability characteristics of post-saccadic oscillation features. The percentage of normal (or normalized) post-saccadic oscillation features is slightly higher than that for fixations, at 54.9%.
The ICC values for post-saccadic oscillations range from 0.00 to 0.92, and the corresponding levels of reliability for the post-saccadic oscillation features are 35.5% ‘excellent’, 23.4% ‘good’, 12.1% ‘fair’, and 29.0% ‘poor’. Among the most reliable categories of post-saccadic oscillation features are P03 (percentage of saccades followed by a glissade), and P16 and P18 (modeling of post-saccadic oscillation acceleration profile with mean and standard deviation). The values of Kendall’s W vary from 0.14 to 0.94, with the most reliable features appearing in categories P02 (interval between post-saccadic oscillations), P10 (peak velocity of post-saccadic oscillations), P13 (modeling of post-saccadic oscillation velocity profile with standard deviation), P21 (ratio of durations of saccades and adjacent post-saccadic oscillations), and P22 (ratio of amplitudes of saccades and durations of adjacent post-saccadic oscillations).