Instrumental Activities of Daily Living Tools in Very-Low Vision: Ready for Use in Trials?

Traditional endpoints assessing visual function are limited in their responsiveness to interventions restoring or maintaining vision. An alternative concept is assessing instrumental activities of daily living (IADL). Herein, we review all available vision-specific IADL instruments relevant for vision restoration trials and report data for the most promising instrument. Six relevant instruments exist: the Low Vision Functional Status Evaluation (LVFSE), Timed IADL (TIADL), Melbourne Low-Vision Activities of Daily Living Index (MLVAI), Assessment of Disability Related to Vision (ADREV), Functional Low-Vision Observer Rated Assessment (FLORA), and Very Low Vision IADL (IADL-VLV). Both internal consistency and test-retest data were available for the LVFSE, MLVAI, and IADL-VLV. In a sample from a low-vision clinic (n = 51; age 57 ± 16 years), we report additional validation data on the IADL-VLV including test-retest reliability (intraclass correlation coefficient 0.981 [0.961; 0.991]). The LVFSE was noticeably less reliable than the MLVAI and the IADL-VLV. Content and construct validity data were available for the LVFSE, TIADL, MLVAI, ADREV, and IADL-VLV, but only the MLVAI and IADL-VLV were developed for an ultra-low vision context. Ceiling effects were present across instruments. Thus, of all appropriate IADL instruments related to vision, the IADL-VLV and MLVAI best meet existing requirements for use in vision restoration trials, e.g., in gene therapies or visual prostheses in inherited retinal diseases, but require further validation.


Introduction
About 80% of sensory information processed by human brains is acquired by the visual system [1]. Visual impairment and blindness thus affect people in numerous activities during their everyday lives. Retinal diseases are among the most common causes of irreversible visual disability worldwide and account for more than 25% of irreversible visual impairments and more than 15% of cases of irreversible blindness globally [2]. Inherited retinal diseases are particularly devastating since their onset is typically at a younger age than that of acquired retinal diseases, and many of those diseases still have a poor prognosis and currently lack causal therapeutic options [3]. However, with ongoing technological advances, the number of innovative therapeutic approaches is increasing, including gene- and cell-based therapies as well as visual prostheses. These interventions primarily target patients with very- or ultra-low vision. Traditional ophthalmic endpoints including conventional functional tests, such as visual acuity, do not capture overall functional abilities well in individuals with little residual vision and are unresponsive to the small changes in functional performance achieved with many novel sight-restoring interventions. At the same time, regulatory authorities prefer endpoints that reflect individuals' functional difficulties over purely structural metrics. Patient-reported outcome and performance-based tests may bridge this gap. For example, Activities of Daily Living (ADL) related to vision reflect global visual function in very- and ultra-low vision patients better than do conventional functional assessments [4][5][6][7]. Thus, we provide an overview of existing vision-specific ADL instruments and their status of validation in the context of vision restoration trial endpoints.

Activities of Daily Living
The first performance-based tests (PBTs) were used to measure independence and functional capabilities in older people in the 1960s, when Sidney Katz and colleagues validated a standardized set of tasks designed to measure the newly defined concept of ADL [8]. These cover basic skills necessary to maintain self-care, self-feeding, or mobility. Only a few years later, the concept of ADL was extended to more complex activities such as handling personal finances, meal preparation, shopping, taking medication, or doing housework, named instrumental activities of daily living (IADLs) [9]. Since then, a variety of IADL tools have been developed, including instruments specifically assessing IADL related to vision.

IADL Instruments in Low Vision
Vision researchers extended the concept of IADL over time, adding novel domains such as picture recognition and reading tests to the instruments. As a consequence, vision-specific ADL and IADL instruments are noticeably heterogeneous [4]. This contributes to reliability and validity issues seen in many of the longer-existing instruments, which has also been highlighted by a panel of experts from academia, industry, and the United States Food and Drug Administration (FDA) [10].

Search Strategy
For this reason, we have focused this review on instruments that meet a more traditional definition of IADL, understand IADL assessments as multidimensional, and have undergone at least the first steps of validation. For this purpose, we have searched the medical literature for IADL tools used in the context of ophthalmology, using PubMed. The search terms were the Medical Subject Headings "Activities of Daily Living" and "Low Vision". We included only publications available in English. Non-vision-specific instruments as well as instruments assessing one domain only (e.g., reading performance or mobility only) were excluded.

Results
We describe the identified instruments below; further information on the validation status can be found in Table 1.

Low Vision Functional Status Evaluation (1999)
The Low Vision Functional Status Evaluation (LVFSE) was developed to measure the effectiveness of vision rehabilitation programs. It includes 27 items that cover both IADL tasks and reading tasks [11]. All items were selected to specifically target vision-related disabilities and cover areas such as meal preparation, shopping, taking medicine, communication, handling finances, and home maintenance. Although most participants of the validation study underwent low vision rehabilitation, the cohort also included participants with mild visual limitations.

Timed Instrumental Activities of Daily Living (2001)
Owsley et al. developed the timed IADL tool (TIADL) to assess everyday performance in elderly people related to vision and memory with the goal of identifying items suitable for use in low vision trials [12]. The final version of the instrument consists of 5 timed tasks that are relevant ADLs. Accuracy is estimated based on the summary of individual scores. Time penalties are added when the performance of individual items is successful but inaccurate [12,13]. The authors especially used the TIADL to examine the relationship between vision and cognition.

Melbourne Low-Vision ADL-Index (2001)
The Melbourne Low-Vision ADL index (MLVAI) was specifically designed to assess visual performance in a low-vision population for visual rehabilitation [14]. The MLVAI contains two parts: 18 observed IADL items from areas such as meal preparation, shopping, taking medicine, communication, handling finances, and home maintenance, as well as 9 self-reported items concerning self-care ADL. All items are rated according to observed or self-perceived speed, accuracy, and level of independence on a five-step Likert scale [14][15][16].

Assessment of Disability Related to Vision (2009)
The Assessment of Disability Related to Vision (ADREV) contains 9 items and was originally developed based on existing work with two other instruments, the Assessment of Function Related to Vision (AFREV) and the Task Performance Test (TPT) [17]. It was developed for clinical research and ophthalmic practice settings. Each item is rated using seven difficulty levels. Its performance has been evaluated in different populations with age-related macular degeneration, glaucoma, and diabetic retinopathy [18,19].

Functional Low-Vision Observer Rated Assessment (2014)
The Functional Low-Vision Observer Rated Assessment (FLORA) was developed in the context of a trial for a retinal prosthesis device (Argus II) and is thus intended for people with ultra-low vision [20]. It has three parts, including a PBM, a self-reported part, and a qualitative assessment. The FLORA has only been validated in a small group of 26 individuals. Performance is assessed by the evaluator on a 4-step Likert scale but without timing.

Very Low Vision Instrumental Activities of Daily Living (2014)
The IADL-VLV was specifically designed to assess performance in IADLs in individuals with very-low vision and potentially serve as a vision restoration trial endpoint, within the Bionic Vision Australia Retinal Prosthesis Project [21]. Its development included a noticeable proportion of qualitative steps such as a Delphi consensus procedure. The IADL-VLV assesses performance based on accuracy and time and has not been widely validated to date. For example, no repeatability data have been published for the IADL-VLV, which is why we assessed its psychometric performance including its test-retest repeatability in a cohort of patients (see below).

IADL-VLV Study Methods
Participants were recruited from a low-vision clinic at the outpatient department of the University Hospital Bonn, Germany and provided written, informed consent prior to participating. The institution's ethics committee approved the study before it was started, and all procedures were in accordance with the tenets of the Declaration of Helsinki. The inclusion criterion was at least moderate visual impairment based on reduced visual acuity or visual fields. Exclusion criteria were cognitive impairment, a recent (within 3 months) change in visual impairment or change in visual impairment between repeated assessments, and any ocular surgery within three months prior to study start. All participants were tested with the IADL-VLV, following a method described previously [21]. The assessments were performed by one staff member specifically trained for this purpose. A Rasch model was fit to the dataset, the psychometric performance was assessed, and the item set of the IADL-VLV was subsequently reduced based on the item fit statistics. Intraclass correlation coefficients between the test and retest assessments were calculated based on person measures from the final Rasch model. As an additional outcome measure beyond the existing scale, we assessed the repeatability of item timing based on ICCs. Statistical analyses were conducted with Winsteps (version 3.92.1; Portland, OR, USA) and IBM SPSS Statistics (version 27; Armonk, NY, USA). p-values < 0.05 were considered statistically significant.
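The test-retest ICC described above can be illustrated with a short sketch. The ICC form computed in SPSS is not stated here, so the choice of a two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)) is an assumption, as are the function name `icc_2_1` and the example person measures, which are invented for illustration only.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per participant, each holding that
    participant's value at every session (e.g. [test, retest]).
    """
    n = len(scores)          # participants
    k = len(scores[0])       # sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical Rasch person measures (logits) at test and retest.
person_measures = [
    [0.8, 1.0],
    [2.9, 3.1],
    [5.2, 5.0],
    [1.7, 1.6],
    [4.1, 4.4],
]
print(round(icc_2_1(person_measures), 3))
```

The same function applies to item completion times, with one row per participant and one column per session.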

IADL-VLV Study Results
We recruited 51 individuals (women 37.3%; men 62.7%) with a mean better-eye visual acuity of 1.04 ± 0.69 logMAR. The diagnoses of participants included retinitis pigmentosa (27.5%), age-related macular degeneration (19.6%), cone dystrophies (13.7%), Stargardt disease (15.6%), previous retinal detachments (5.9%), and other congenital or hereditary conditions impairing vision (17.6%). Test-retest data were available for 31 participants. Several Rasch model requirements were violated with the original version of the IADL-VLV, and we thus modified the instrument based on our data (Table 2). First, we collapsed two of the four categories (task completed inaccurately in 1-2 attempts and task completed inaccurately in more than 2 attempts) due to disordered thresholds in two items. This led to all thresholds being ordered, but several items showed misfit. We first removed six items with a distorting level of misfit (mean-square values > 2.0). After omitting a misfitting response to one item, twelve further misfitting items (outside the infit mean-square value corridor 0.5-1.5 [21]) were dropped from the scale. The resulting 11-item version of the IADL-VLV had no misfitting items and an internal consistency within recommended limits but showed evidence of multidimensionality (Table 2). However, splitting the scale into two separate subscales led to an instability in the scale, and we thus proceeded with one global scale. The baseline scores of the 11-item version of the IADL-VLV were significantly associated with the visual acuity in the better-seeing eye, with a correlation coefficient (Pearson) of −0.673.
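The two-stage item screening described above can be sketched as follows. The function name `screen_items`, the item labels, and the mean-square values are hypothetical. The sketch applies both cut-offs to a single set of infit mean-squares and does not re-estimate the Rasch model between stages, which an actual analysis would normally do.

```python
def screen_items(infit_mnsq, distort_above=2.0, corridor=(0.5, 1.5)):
    """Sort items into kept, distorting, and misfitting by mean-square fit.

    infit_mnsq: dict mapping item label -> mean-square fit statistic.
    distort_above: values above this are treated as distorting (stage 1).
    corridor: acceptable (low, high) range for the remaining items (stage 2).
    """
    low, high = corridor
    kept, distorting, misfitting = [], [], []
    for item, value in sorted(infit_mnsq.items()):
        if value > distort_above:
            distorting.append(item)      # stage 1: removed first
        elif low <= value <= high:
            kept.append(item)            # inside the corridor: retained
        else:
            misfitting.append(item)      # stage 2: outside the corridor
    return kept, distorting, misfitting


# Hypothetical fit statistics, not taken from the study.
example = {
    "item_a": 0.92,
    "item_b": 2.40,
    "item_c": 1.70,
    "item_d": 0.41,
    "item_e": 1.10,
}
kept, distorting, misfitting = screen_items(example)
print(kept, distorting, misfitting)
```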
The average score on the 11-item IADL-VLV was 2.9 ± 2.2, the mean absolute difference between test and retest assessments was 0.25 ± 0.45, and the revised instrument had an ICC of 0.981 [95% confidence interval 0.961; 0.991]. However, ceiling effects were observed, which is why we subsequently investigated the timing of the individual items as an additional outcome measure (Table 3). Of the eleven items included in the revised version of the IADL-VLV, nine showed positive intraclass correlations of timing data that were significantly different from 0 (Table 3). Moreover, the timing of the two remaining items as well as that of three additional items ("Coffee mug", "Bowl", "Hankies colored") was not significantly correlated with the better-eye visual acuity (other items: Pearson r coefficients 0.30 to 0.48, p ≤ 0.038). In order to improve the 11-item instrument, we considered dropping the items "Dinner plate" and "Dinner spoon" from the Rasch model, but this resulted in a decline of the person separation index to 1.95. For this reason, we consider the 11-item IADL-VLV the revised version of the instrument and recommend it for use in future studies instead of the longer version.
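To put the person separation index of 1.95 into perspective, the standard Rasch relationship between separation G and person reliability R can be applied. This worked conversion is added for illustration and is not a value reported in the study. The benchmark of a separation of about 2.0 (reliability of about 0.80) is a commonly cited rule of thumb and is assumed here rather than taken from the text.

```latex
R = \frac{G^{2}}{1 + G^{2}}
\qquad\Longrightarrow\qquad
R = \frac{1.95^{2}}{1 + 1.95^{2}} = \frac{3.8025}{4.8025} \approx 0.79
```

A separation of 1.95 thus corresponds to a person reliability just below 0.80.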

Discussion
A number of vision-related IADL instruments exist, none of which has been validated to an extent required for clinical trial endpoints. Providing additional validation steps and assessing the psychometric performance of one of the better-suited IADL instruments available, we found ceiling effects for the majority of items as well as several psychometric issues with the IADL-VLV. Removing items, restructuring its rating scales, and employing additional outcome ratings such as time improved its psychometric performance to acceptable levels. Both the literature review and the psychometric evaluation of the IADL-VLV highlight that the measurement of vision-related IADLs comes with considerable challenges.
First and foremost, the initial design, including the selection of appropriate items, is crucial for IADL tools. Only a proportion of the mentioned instruments (LVFSE, ADREV, FLORA, IADL-VLV) reported including structured feedback from experts and/or patients in the selection of tasks, while the other instruments were constructed based on literature review only, which may lead to such instruments not being patient-relevant. The IADL-VLV was the only instrument which qualitatively implemented patient feedback during its development phase, using Delphi methodology. The level of reliability testing varied noticeably between instruments. Both internal consistency and test-retest reliability were reported only for the LVFSE, MLVAI, and IADL-VLV. While the data reported for the MLVAI and the IADL-VLV suggested that these instruments can be considered reliable, the test-retest correlation of the LVFSE was only 0.5, and this instrument cannot be recommended for further use on this basis. Only some of the identified IADL instruments (LVFSE, MLVAI, FLORA, IADL-VLV) were specifically developed for use in a very- or ultra-low vision population, which may limit the content validity of the remaining tests. Five of the instruments were reported to be construct valid based on associations with visual function tests or patient-reported outcome measures, i.e., the LVFSE, TIADL, MLVAI, ADREV, and IADL-VLV. First responsiveness data have been published for the TIADL and the MLVAI, as well as for the FLORA, supporting these instruments' responsiveness to an intervention.
Based on our evaluation of the scientific quality criteria of IADL instruments in the context of clinical trials in very- or ultra-low vision, only the MLVAI and the IADL-VLV have passed important validation steps so far and were constructed for use in a low-vision population, making these instruments particularly promising for further evaluation and responsiveness testing in a clinical trial context. While the validation cohorts of the MLVAI were larger (overall n = 241; IADL-VLV: overall n = 91), the IADL-VLV has been designed based on more qualitative work and comes with two types of outcome assessments (accuracy and time), whereas the MLVAI generates only a single outcome measure (accuracy). This makes the IADL-VLV match the recommendations for the development and properties of tools assessing ADL-based vision-related performance testing better than does the MLVAI since these recommendations include speed and accuracy in the scoring strategy [23].
The overall range of activity categories assessed by the different instruments is relatively wide. Nevertheless, most tasks fall into the categories of "searching tasks", "fine motor tasks", "reading and identification tasks", or "navigation tasks". All listed instruments include the searching, fine motor, and reading task categories while the navigation category is only included in the ADREV and the FLORA. Even though this may increase an instrument's validity, navigation-specific tools that are conducted using obstacle courses under prespecified laboratory conditions seem more appropriate to quantify individuals' difficulties navigating in a clinical trial setting than do the tests included in the ADREV, which includes one navigation item only, or the FLORA, which is home-based. Tasks related to reading words or texts are included in the TIADL, LVFSE, MLVAI, and ADREV, which may not be feasible for many individuals with very- or ultra-low vision, depending on the individual visual function. We thus recommend multidimensional tools for future vision restoration trials. The reduced, 11-item version of the IADL-VLV meets these criteria and includes tasks from the searching, fine motor, and reading/identification domains. Against this background, we recommend the IADL-VLV with 11 items for further validation in other low-vision cohorts including testing of responsiveness to effective therapeutic interventions.
The first IADL instruments described in this article were developed more than 20 years ago, but their use in interventional trials is still relatively sparse, which is partly due to only a handful of sight-restoring or similar interventions having been tested in clinical trials. A strong argument for implementing more IADL endpoints is their content validity (direct relation to patients' daily lives). Despite this, we have outlined that only a minority of the existing instruments meet other basic psychometric quality criteria and that the remaining instruments have not been extensively studied in interventional settings thus far, which comes with risks for the sponsors of large-scale, costly trials selecting their endpoints. This is also reflected by the recommendations of the Harmonization of Outcomes and Vision Endpoints in Vision Restoration Trials (HOVER) taskforce [23]. To improve this evidence and add IADL instruments to the toolbox, clinical research with approved therapies and in the field of visual rehabilitation is required.
Patient-reported outcome measures (PROMs) designed specifically to be used in a low-vision population are an alternative to performance-based tests (PBTs). The Impact of Vision Impairment-Very Low Vision (IVI-VLV) questionnaire and the Veterans Affairs Low-Vision Visual Functioning Questionnaire (VA LV-VFQ) are two examples of instruments that have already passed initial validation steps [24,25]. When compared to PBTs, PROMs have the advantage of being self-administered and therefore require fewer resources from clinical site staff. However, adequately designed PBTs may be less variable and have better repeatability than PROMs since they have been shown to be less affected by, for example, neuro-psychiatric comorbidities [6][26][27][28]. In addition, the development of compensatory mechanisms for ADL tasks is associated with the onset of visual impairment. This suggests that PBTs might be particularly useful in rapidly progressive diseases for which such strategies have not yet been developed [29]. Depending on the intended use, PROMs and PBTs may each have advantages over the other or may yield complementary information. The choice of endpoints should, as always, be informed by the research question posed.
Our work has several strengths and limitations. We performed a focused literature review concentrating on PBTs of vision-related IADLs and performed additional evaluation of one of the best-suited PBTs. It is unlikely that we missed relevant PBTs or any studies evaluating these, as the field is small and we have worked with these PBTs for many years. We assessed the repeatability of the IADL-VLV in a relatively small and heterogeneous population with different levels of visual impairment. A large proportion of the participants (49%) met the very low-vision criterion in both eyes, which makes the sample comparable to those studies in the reviewed literature. Lastly, we were not able to provide responsiveness data for the IADL-VLV to an intervention, which we consider future work.

Conclusions
In conclusion, only a few multidimensional PBTs under ongoing validation are available for use in future vision restoration trials. Among low-vision PBTs assessing IADLs, the IADL-VLV and the MLVAI have the most advanced validation status and show promising results, but longitudinal studies are currently lacking. Such studies are needed to assess, for example, responsiveness to change over time or the minimally important difference, both of which are required for clinical trial endpoints.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The original data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.