An Overview of the Genetics of ABCA4 Retinopathies, an Evolving Story

Stargardt disease (STGD1) and ABCA4 retinopathies (ABCA4R) are caused by pathogenic variants in the ABCA4 gene inherited in an autosomal recessive manner. The gene encodes an importer flippase protein that prevents the build-up of vitamin A derivatives that are toxic to the RPE. Diagnosing ABCA4R is complex due to its phenotypic variability and the presence of other inherited retinal dystrophy phenocopies. ABCA4 is a large gene, comprising 50 exons; to date > 2000 variants have been described. These include missense, nonsense, splicing, structural, and deep intronic variants. Missense variants account for the majority of variants in ABCA4. However, in a significant proportion of patients with an ABCA4R phenotype, a second variant in ABCA4 is not identified. This could be due to the presence of yet unknown variants, or hypomorphic alleles being incorrectly classified as benign, or the possibility that the disease is caused by a variant in another gene. This underlines the importance of accurate genetic testing. The pathogenicity of novel variants can be predicted using in silico programs, but these rely on databases that are not ethnically diverse, thus highlighting the need for studies in differing populations. Functional studies in vitro are useful for assessing protein function but do not directly measure the flippase activity. Obtaining an accurate molecular diagnosis is becoming increasingly important as targeted therapeutic options become available; these include pharmacological, gene-based, and cell replacement-based therapies. The aim of this review is to provide an update on the current status of genotyping in ABCA4 and the status of the therapeutic approaches being investigated.


Introduction
Stargardt disease (STGD1, OMIM# 248200), otherwise known as Stargardt macular dystrophy, juvenile macular degeneration, or fundus flavimaculatus, is one of the most common inherited retinal diseases. Its incidence is estimated to be between 1 in 8000 and 1 in 10,000 [1]. STGD1 is normally detected in late childhood or early adulthood; it is progressive and varies in severity, and in some patients vision loss may not be noticed until later in adulthood. It is inherited in an autosomal recessive manner and is caused by variants in the ABCA4 gene [2], with a carrier frequency reported to be as high as 1 in 20 (depending on the population) [3,4].
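As an illustration of how incidence and carrier frequency relate for a recessive disease, the following minimal sketch applies the Hardy-Weinberg relationship (incidence = q², carrier frequency = 2pq) to the incidence range quoted above. The Hardy-Weinberg assumption and the function name are ours and are not taken from the cited studies; the simple model ignores hypomorphic alleles and reduced penetrance, so it should not be expected to reproduce the reported figure of 1 in 20.

```python
import math

def carrier_frequency(incidence: float) -> float:
    """Carrier frequency (2pq) under Hardy-Weinberg equilibrium.

    Assumes a fully penetrant recessive disease whose incidence equals q^2.
    This is an idealised model used purely for illustration.
    """
    q = math.sqrt(incidence)   # combined frequency of disease alleles
    p = 1.0 - q                # frequency of the normal allele
    return 2.0 * p * q

# Incidence range quoted in the text: 1 in 8000 to 1 in 10,000
for denominator in (8000, 10000):
    freq = carrier_frequency(1 / denominator)
    print(f"incidence 1 in {denominator}: carriers roughly 1 in {1 / freq:.0f}")
```

Under these idealised assumptions the estimated carrier frequency is lower than the highest reported population figures, which is one reason population-specific data remain necessary.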
The disease was first described by Karl Stargardt in 1909 in two families with macular dystrophy associated with yellow-white pisciform flecks [5]. In 1965, Franceschetti used the term fundus flavimaculatus (FFM) to describe the widespread presence of flecks (Figure 1) [6,7]. Genetic linkage in the early 1990s localised STGD1 and FFM to the same locus on the short arm of chromosome 1, 1p13-p21 [8], thus confirming that the two conditions were part of the same disease spectrum [9][10][11]. This genetic locus was subsequently refined to a 2-cM interval on 1p21-22.1 [12]. In 1997, Allikmets et al. demonstrated that the disease was caused by variants in the ABCA4 (ATP-binding cassette, subfamily A, member 4) gene, originally called ABCR [2]. The ABCA4 transcript was first detected in rod photoreceptors [2,13,14] and then in cones [15]. The ABCA4 gene encodes a transmembrane protein that is localised to the rim of the disc membranes in the outer segments of rod and cone photoreceptor cells and plays an essential role in retinoid recycling in the visual cycle [13,15] (see Sections 2.1 and 2.2 and Figures 2 and 3).
Stargardt disease has a highly variable phenotype but there are three characteristic features: the presence of flecks, macular atrophy, and sparing of the peripapillary region, which if seen together are indicative of a retinal disorder associated with variants in ABCA4 [16][17][18]. Cideciyan et al. used the term ABCA4-associated retinal degenerations in 2005 [17] and since then other authors have used the term ABCA4-associated retinopathies [19][20][21]; for the purposes of this review, ABCA4R will be used to denote ABCA4 retinopathies. The progression of the disease is also variable but patients with early childhood onset typically have a more severe phenotype and more rapid disease progression [22][23][24][25]. In contrast, patients with late-onset disease (>45 years of age) usually have a milder phenotype and slower progression [26][27][28][29]. Information regarding the imaging and characterisation of the different ABCA4R is provided in a review [30].
Retinal disorders with clinical phenotypes resembling STGD1 but with a dominant pattern of inheritance are referred to as "Stargardt like" and have been assigned to STGD2-4 [31,32]. There are other retinal conditions associated with different gene variants that mimic the STGD1 phenotype, known as phenocopies, which can complicate a diagnosis and underline the need for accurate genetic testing (see Section 3).
The aim of this review article is to give an overview of the current status of genotyping in ABCA4, an update on missing heritability in ABCA4, phenocopies, the effect of genotype on the severity of the phenotype, and assessment techniques to predict the functional consequences of the variants. We will also provide an update on the current therapeutic approaches that are being investigated.

Function and Role within the Visual Cycle
In the visual cycle, light isomerises the chromophore 11-cis-retinal to all-trans-retinal, which is then released from the binding pocket of the rod or cone opsin. The majority of all-trans-retinal is reduced to all-trans-retinol by retinol dehydrogenase 8 (RDH8) [39] and transferred to the retinal pigment epithelium (RPE) by interphotoreceptor retinoid binding protein (IRBP). It is then esterified by lecithin retinol acyltransferase (LRAT) and converted to 11-cis-retinol by RPE65 isomerohydrolase. The 11-cis-retinol is oxidised to 11-cis-retinal by a retinol dehydrogenase (RDH) in the RPE and transported back to the photoreceptors by IRBP, where it rebinds to the rod and cone opsin to be re-used in the visual cycle [35,37,40,41] (Figure 3A,B). However, some of the all-trans-retinal reversibly reacts with phosphatidylethanolamine (PE) to form N-retinylidene phosphatidylethanolamine (NrPE) in the photoreceptor disc membrane [42,43]. In the normal visual cycle, the function of ABCA4 is to import its substrates NrPE [44,45] and PE [45]. This is unique to ABCA4, as it is the only known ABC protein that functions as an importer in mammals [45]. The ABCA4 protein flips the NrPE from the luminal side to the cytoplasmic side of the photoreceptor disc membrane (Figure 4), where the NrPE is subsequently hydrolysed to PE and all-trans-retinal. The all-trans-retinal is then reduced to all-trans-retinol by RDH8 in order to resynthesize 11-cis-retinal, which then re-enters the visual cycle [33,35,37,46].
Variants in ABCA4 can affect the structure and function of the ABCA4 protein. Dysfunction of the ABCA4 protein leads to an accumulation of NrPE and all-trans-retinal within the photoreceptor disc membrane, which condense to form phosphatidyl-pyridinium bisretinoid (A2PE) [47]. The photoreceptor outer segments are then shed and phagocytosed by the RPE, and the A2PE is hydrolysed by lysosomal enzymes to the bisretinoid N-retinyl-N-retinylidene ethanolamine (A2E) [48], which cannot be metabolised further [49,50]. As a result, the A2E accumulates within the RPE and forms a major component of lipofuscin, which is toxic to the RPE, thus leading to degeneration of the RPE and subsequently loss of the photoreceptor cells [33,35,46,50,51] (Figure 3B). Recently, ABCA4 has also been linked with decreasing excess 11-cis-retinal, which also reversibly reacts with PE to form N-11-cis-retinylidene-phosphatidylethanolamine. This compound is, in turn, also transported by the ABCA4 protein, meaning that accumulation of 11-cis-retinal might also be linked with STGD1 [52].
The genotype-phenotype correlation in ABCA4R is generally thought to reflect the remaining function of ABCA4, meaning that more severe combinations, such as two null variants, result in a severe phenotype with an early disease onset, whilst milder variants are linked with later onset [53][54][55]. In this review, we will describe the effects of these different variants and the up-to-date methods used to investigate a variant's severity.
Figure 3. (A) Light photobleaches the opsin and isomerises the 11-cis-retinal to ATR. Some ATR reversibly reacts with PE to form NrPE, which is flipped onto the cytoplasmic side by the ABCA4 protein. The NrPE is then hydrolysed to PE and ATR, thus preventing the accumulation of ATR on the luminal side. The ATR is then reduced to all-trans-retinol by RDH8 and then transported to the RPE cell by IRBP. In the RPE, the all-trans-retinol is esterified to all-trans-retinyl esters by LRAT, which are then converted to 11-cis-retinol by RPE65 isomerohydrolase; this is then oxidized to 11-cis-retinal by RDH and transported back to the photoreceptors by IRBP. (B) Schematic diagram illustrating the visual cycle in the presence of ABCA4 dysfunction. Dysfunction of the ABCA4 protein prevents the flipping of the NrPE from the luminal side to the cytoplasmic side of the photoreceptor outer segments, meaning that the NrPE accumulates and condenses with all-trans-retinal into A2PE. The photoreceptor outer segments are then shed and phagocytosed by the RPE cell, which then hydrolyses the A2PE to A2E [56]. Created with BioRender.com (accessed on 1 June 2021). Abbreviations: ATR: all-trans-retinal; A2E: N-retinyl-N-retinylidene ethanolamine; A2PE: phosphatidyl-pyridinium bisretinoid; IRBP: interphotoreceptor retinoid binding protein; LRAT: lecithin retinol acyltransferase; NrPE: N-retinylidene phosphatidylethanolamine; PE: phosphatidylethanolamine; RDH8: retinol dehydrogenase 8; RPE: retinal pigment epithelium.
Figure 4. Schematic diagram of the ABCA4 protein actively transporting NrPE from the luminal side of the photoreceptor disc membrane to the cytoplasmic side. ADP is initially bound to the NBDs and ABCA4 binds the NrPE on the luminal side of the photoreceptor disc membrane. This is followed by binding of ATP to the NBDs, leading to a conformational change that creates a low-affinity binding site on the cytoplasmic side and results in dissociation of the NrPE from the ABCA4 protein. Hydrolysis of the ATP then returns ABCA4 to its primary conformation. Adapted from Molday et al. [37]. Created with BioRender.com (accessed on 1 June 2021). Abbreviations: NBD: nucleotide-binding domain; NrPE: N-retinylidene phosphatidylethanolamine.
Diagnosing ABCA4R is further complicated by phenotypic overlap with other disorders that share similar features (see Table 1). These may be referred to using terms such as Stargardt like or pseudo-fundus flavimaculatus. In 1994, the term "Stargardt like" was used to describe a dominantly inherited macular dystrophy located on chromosome 6q [31]. The STGD3 (OMIM# 600110) type was subsequently shown to be caused by variants in ELOVL4 [64]. The STGD4 (OMIM# 603786) locus was mapped to chromosome 4 [32] and is linked to dominant PROM1 variants [65]. STGD2, which was originally mapped to a locus on chromosome 13q34 [66], was subsequently shown to be caused by ELOVL4 variants, as in STGD3 [64]. The phenotype in STGD3 includes early-onset disease with pigmentary changes and flecks within the macular region [67]. For STGD4, changes in PROM1 have been associated with variable phenotypes, which include cone rod dystrophy, macular dystrophy, retinitis pigmentosa [68], bull's eye maculopathy (BEM) [69], and the presence of flecks [32].
Table 1 (fragment). Juvenile-onset macular dystrophy with associated hypotrichosis of scalp hair [76,77]; hydroxychloroquine retinopathy, bull's eye maculopathy; drug toxicity, bull's eye maculopathy [58].
The term "phenocopies" is used for IRDs that manifest with a similar phenotype but are caused by variants in different genes. For ABCA4R, the commonest phenocopy is the pattern dystrophy caused by variants in PRPH2 inherited in an autosomal dominant manner [70]. Indeed, Ibanez et al. recently identified a PRPH2 variant in 10% of their misdiagnosed STGD1 patients [71]. The typical phenotype linked to PRPH2 is a late-onset pattern dystrophy with variable penetrance within families. However, variants in PRPH2 have also been associated with the characteristic features of STGD1, such as flecks, macular atrophy, and changes in a full-field electroretinogram (ffERG) [70]. Occasionally, the peripapillary region can be spared, as in STGD1 [78].
Heterozygous variants in CRX can result in a phenotype that mimics STGD1 and is associated with late-onset disease with a bull's eye maculopathy (BEM) [72], and cone and rod dysfunction detected on electrodiagnostic testing [73]. In a recent study, Wolock et al. found that CRX and PRPH2 variants accounted for ~10% of their "STGD1" patients not found to carry ABCA4 variants. CDH3-associated disease can also be misdiagnosed as STGD1 due to the juvenile-onset macular dystrophy, but these patients can be distinguished by the presence of hypotrichosis of scalp hair [76,77].
BEST1-associated disease can also be misdiagnosed as STGD1, particularly autosomal recessive bestrophinopathy in patients with compound heterozygous variants, because the widespread vitelliform deposits can be confused with flecks [74]. Distinguishing autosomal dominant Best disease (BEST1) is usually easy, as it is characterised in most cases by a solitary yellow deposit, usually at the macula, which is round and far larger than a fleck. In addition, the presence of an abnormal electrooculogram (EOG) and a normal electroretinogram (ERG) is discriminatory. In autosomal recessive bestrophinopathy, the shape and distribution of the multiple vitelliform deposits differ from those of the flecks in STGD1, and macular schitic changes are often visible [74]. Genetic testing can be used to differentiate the two conditions [75]. Cone rod dystrophy (CRD) is another condition that can be confused with STGD1.
Toxicity from drugs may cause a BEM phenotype. BEM is observed in susceptible patients who develop a toxic retinopathy in response to 4-aminoquinolines, such as hydroxychloroquine and chloroquine [79]. Other drugs have been associated with BEM, but less frequently; an overview of these can be found in a review by Mitra et al. [80].
Another compounding factor is the high carrier frequency of ABCA4 variants in the general population. This can result in a pseudo-dominant inheritance pattern: when a child and one parent are affected but the other parent is an unaffected carrier, the disease appears to be inherited dominantly. Variability is also observed within families as a result of individuals carrying different combinations of variants. Of significant interest is the observation that siblings carrying the same variants may have different phenotypes. It has been hypothesised that this could be due to genetic modifiers or environmental factors, which could affect the severity and penetrance of the disease [81][82][83][84]. Indeed, Runhart et al. proposed the presence of modifiers when they reported non-penetrance of null/severe variants when in trans with the p.(Asn1868Ile) variant [85] and also proposed a female sex bias for the p.(Asn1868Ile) and the c.5882G>A p.(Gly1961Glu) variants [86]. However, Lee et al. did not find a significant sex bias for either of these variants when they investigated their cohort of patients [87].
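The pseudo-dominant pattern described above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes full penetrance, that the partner of an affected individual is an unaffected person drawn at random from the general population, and it uses the upper carrier frequency quoted earlier in this review (1 in 20). The function name is hypothetical.

```python
def pseudo_dominant_risk(carrier_frequency: float) -> float:
    """Per-child risk of disease when one parent is affected.

    The affected parent (two pathogenic alleles) always transmits one.
    The partner is a carrier with probability `carrier_frequency` and,
    if so, transmits the pathogenic allele with probability 1/2.
    Assumes full penetrance (an idealisation).
    """
    return carrier_frequency * 0.5

risk = pseudo_dominant_risk(1 / 20)
print(f"Risk per child: {risk:.3f} (about 1 in {1 / risk:.0f})")
```

With a rarer carrier frequency the same calculation gives a proportionally lower risk, which is why pseudo-dominance is mainly a concern for genes with frequent pathogenic or hypomorphic alleles.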
The variability in the phenotypes seen in ABCA4R and similarities in the phenotypes caused by other genes can sometimes make it difficult to make a definitive clinical diagnosis, even for an experienced clinician. This highlights the importance of genetic testing to provide a molecular diagnosis. The importance of this is twofold for patients with a Stargardt-like phenotype but with variants in genes other than ABCA4: (1) for correct genetic counselling regarding inheritance patterns; and (2) in order to prevent incorrect recruitment into therapeutic trials, with the risk of invasive procedures that could lead to harm and no therapeutic benefit.

Genetics of ABCA4R
Since the gene was first identified by Allikmets et al. in 1997 [2], with a subsequent report of the full-length gene by Gerber et al. in 1998, genetic testing has been undertaken in many patient cohorts [88]. To date, more than 2000 ABCA4 variants have been reported in the literature (www.lovd.nl/ABCA4 accessed on 1 June 2021).

Genetic Testing in ABCA4
Initially, single-strand conformation polymorphism (SSCP) analysis was used to screen the gene [2,53] but a poor detection rate led to the use of a variety of techniques, each of which had advantages and disadvantages. The development of an arrayed primer extension (APEX) array by Asper Biotech increased the detection rate, but only previously reported variants could be detected because only these were included in the initial probe design [4]; novel variants were therefore missed. Other techniques that can be used in conjunction with panels to improve the detection rates include denaturing gradient gel electrophoresis (DGGE), denaturing high-performance liquid chromatography (dHPLC), and high-resolution melting (HRM) [89][90][91]. Multiplex ligation-dependent probe amplification (MLPA) has also been used to detect deletions or duplications in the ABCA4 gene [90]. Using Sanger sequencing for all 50 exons and at least 10 bp of flanking introns achieves a detection rate of up to 80% of alleles [3,92]. However, it is expensive and time consuming, which limits its use in large cohorts of patients.
More recently, next generation sequencing (NGS) has revolutionized genetic testing. Targeted exome, whole exome (WES), and whole genome sequencing (WGS) are highly automatable techniques that can be used to sequence large cohorts of patients and enable the detection of variants in both known IRD and novel genes [93,94]. WGS can detect both exonic and intronic variants [95]. However, the downside of these techniques is that they require large amounts of space for data storage and require expertise in bioinformatics (for a review, see Broadgate et al. (2017) [96]). A recent report using single-molecule molecular inversion probe (smMIP)-based sequencing has demonstrated a coverage of 97.4% of the 128 kb ABCA4 gene, including all 50 exons and splice sites, and can also identify copy number variants [97]. The main benefits of smMIP-based sequencing are that it is cost effective and can be used to sequence large cohorts of patients [97], which makes it a valuable sequencing technique for conditions with a characteristic phenotype such as STGD1.
These advances in sequencing techniques have improved our ability to detect variants but ABCA4 remains a complex gene due to its size, highly polymorphic nature, and wide spectrum of variants identified.

Spectrum of Pathogenic Variants in ABCA4
Sequencing carried out in large cohorts of patients has shown that ABCA4 variants include missense, nonsense, frameshift, splice site, and structural variants. Several studies have shown that the majority of ABCA4 variants reported to date are missense variants [98][99][100][101][102][103][104]. Missense variants are usually caused by a single nucleotide change that alters the encoded amino acid. The consequences of this are difficult to predict, as not all amino acid changes lead to changes in the protein structure that result in a change in function [54,105,106]. Many of these missense variants are rare, which further complicates determining their severity and whether they are truly pathogenic [20,107,108]. The effect of deleterious variants, such as nonsense or frameshift variants, is easier to predict, as these variants create truncated proteins with no ABCA4 activity. The term structural variant refers to copy number variants, which encompass deletions, insertions, and duplications, as well as to inversions and translocations [109]. These structural variants can lead to a significant alteration in protein structure [110]. Non-canonical splice site (NCSS) variants occur in intronic regions and at the first and last nucleotide positions of exons and may alter cryptic splice sites, thus activating them. These variants may result in abnormal splicing efficiency or alter the order of the splicing steps. Variants in these cryptic splice sites can produce a pseudoexon that frequently contains premature termination codons, thus resulting in nonsense-mediated decay (NMD) [111].

Variability in Variants between Populations
The majority of studies on ABCA4 have been conducted on patients of European descent. As a result, the variant databases have an inherent bias towards this population. Indeed, Chinese and African-American patients have been found to have a higher prevalence of variants not seen in European patients and are also less likely to carry the variants frequently detected in European patients [103,112]. Moreover, founder variants are often found in isolated populations and those originating from settlers. In these populations, the founder variants can explain a significant proportion of STGD1, with a high molecular diagnosis rate despite a small number of variants being detected in the population [113][114][115]. Detection rates for ABCA4 variants appear to be similar across populations (Table 2), although it is important to note that the detection rate is mostly influenced by clinical expertise and by how well defined the phenotype of the patient is. However, it is of note that the detection rate in French Canadian patients was very low at 33%, which suggests the presence of yet undiscovered founder variants in this population [116]. The differences in the frequencies of the variants seen in different populations highlight the importance of accurate genetic testing to identify the causative variants and the limitations of using panel-based technologies that might not detect variants that are commonly seen in a specific population.
Table 2 (fragment). Detection rate of ABCA4 variants by population: South Africa, 62% [123]; Canadian, 59% [116]; French Canadian, 33% [116].
The complexity of genetic testing in ABCA4 is due to the large number of variant types that are present across the large ABCA4 gene, and there are no specific mutation "hot spots" that can be targeted.
A significant proportion of STGD1 patients are not found to carry two ABCA4 variants in trans. This will be referred to as missing heritability in this review. Solving this missing heritability is important for understanding the natural history of the disease and its variability in expression, and will help to identify suitable patients for recruitment to therapeutic trials. The next section will focus on solving this missing heritability and on determining the pathogenicity of variants in ABCA4.

Missing Heritability
Despite the advances in genetic sequencing, genetic testing (see Section 4.1) in ABCA4 remains difficult and six years ago it was shown that only 65-79% of patients tested are biallelic for the ABCA4 variants [124][125][126]; 20-25% of patients are monoallelic and in 15% no ABCA4 variant was identified [127]. This missing heritability could be explained by undetected variants, such as deep intronic variants (Section 4.4.1), structural variants (Section 4.4.2), variants that are incorrectly classified as benign due to their relatively high carrier frequency (hypomorphic alleles) (Section 4.4.3), phenocopies (Section 3), and very rare cases of uniparental isodisomy [97,128,129]. In the following sections, we will discuss and explore the role that previously undetected variants have played towards solving the missing heritability in ABCA4.

Deep Intronic Variants
In an effort to identify the missing variants in ABCA4, Braun et al. sequenced RNA obtained from normal human retina to investigate whether sequences near the splice sites of pseudoexons that are present at very low levels in retinal mRNA were more susceptible to pathogenic variants. Indeed, this identified five minor splice-site variants (15 alternate exons that each accounted for less than 1% of the total RNA) in the donor retinas, which were all subsequently shown to be pathogenic [130].
Deep intronic variants have since been identified and shown to account for a significant amount of the missing heritability in monoallelic patients. Indeed, deep intronic variants have been identified in approximately 40% of STGD1 patients that were initially found to carry a single ABCA4 variant [126,130,131], and are more frequently detected in cohorts of monoallelic patients [132]. Moreover, Khan et al. have recently detected deep intronic variants in 15% of their monoallelic patients using single-molecule molecular inversion probes (smMIPs) [97]. SmMIPs are a cost-effective method of detecting genetic variation [95,97].
Currently, the most frequently identified deep intronic variant is c.4253+43G>A [125,131], which has been proposed to be a hypomorphic variant. It has been found to be associated with late-onset disease and a mild phenotype [132][133][134]. The most prevalent deep intronic variant in a study of Belgian patients was c.4539+2001G>A, which was found as a second variant in 26% of monoallelic patients [135]. Overall, there are 35 reported deep intronic variants and all but two have been shown to cause splice defects in in vitro splice assays [54].
Based on these studies, deep intronic variants will be important in solving the missing heritability in STGD1. However, the highly polymorphic nature of the ABCA4 gene also means that caution is advised when assigning pathogenicity to deep intronic variants. For example, Zernant et al. found that the c.6006-609T>A variant was not actually pathogenic and was in fact always found in cis with the pathogenic c.4253+43G>A variant, as a complex allele [132].
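The intronic variants named in this section follow the HGVS cDNA convention, in which the number after "+" or "-" gives the distance from the nearest exon boundary. As an illustrative sketch only, the hypothetical helper below extracts that distance for simple variant names so that variants can be ranked by how far into the intron they lie. It deliberately does not apply a cut-off for "deep intronic", because this review does not define one, and it does not handle ranges (such as deletions spanning two positions) or UTR positions.

```python
import re
from typing import Optional

# Matches simple intronic cDNA positions such as c.4253+43 or c.6006-609.
_INTRONIC_POSITION = re.compile(r"^c\.\d+([+-])(\d+)")

def intronic_offset(hgvs_c: str) -> Optional[int]:
    """Return the distance (in nucleotides) from the nearest exon boundary,
    or None if the name is not a simple intronic position."""
    match = _INTRONIC_POSITION.match(hgvs_c)
    return int(match.group(2)) if match else None

# Variant names taken from this review
variants = ["c.4253+43G>A", "c.4539+2001G>A", "c.6006-609T>A",
            "c.5461-10T>C", "c.5714+5G>A", "c.5603A>T"]

for name in sorted(variants, key=lambda v: intronic_offset(v) or 0):
    offset = intronic_offset(name)
    label = f"{offset} nt from exon" if offset is not None else "not intronic (or not parsed)"
    print(f"{name:18s} {label}")
```

A production pipeline would use a dedicated HGVS parser and transcript annotation rather than a regular expression; the sketch only shows how the notation encodes intronic depth.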

Structural Variants
The first description of a copy number variant (CNV) in ABCA4 was of the c.2654-905_2743+35del (IVS17-905_IVS18+35del) variant reported by Yatsenko et al. [110]. Since then, CNVs and other structural variants have been infrequently reported in ABCA4 [126,136,137]. Indeed, in one of the most comprehensive studies of structural variants, which used smMIP-based sequencing, Khan et al. identified 11 novel CNVs in 16 alleles in 448 bi-allelic STGD1 cases (carrying 896 alleles) [97]. This would suggest that structural variants are unlikely to be a significant explanation for missing heritability in STGD1/ABCA4R. Cremers et al. predict that CNVs only account for approximately 1% of pathogenic ABCA4 variants [54].
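The proportion implied by the Khan et al. figures quoted above can be worked out directly. The short sketch below only restates the arithmetic (16 CNV-carrying alleles out of 896 alleles); the variable names are ours.

```python
# Figures quoted in the text from Khan et al. [97]
cnv_alleles = 16
biallelic_cases = 448
total_alleles = biallelic_cases * 2  # two alleles per bi-allelic case = 896

cnv_fraction = cnv_alleles / total_alleles
print(f"CNVs account for {cnv_fraction:.1%} of alleles in this cohort")
```

This is of the same order as the approximately 1% predicted by Cremers et al. [54].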

Hypomorphic Alleles and Modifiers
As discussed above, ABCA4 has a high variant carrier frequency, which means that variants with a relatively high allele frequency are typically assigned as benign. It is now thought that some of these variants are hypomorphic alleles and are only pathogenic when certain conditions are met, such as being in trans with a severe variant. They are typically associated with a later onset of disease and a milder phenotype [85,127,132]. The c.5603A>T p.(Asn1868Ile) was recently reclassified as hypomorphic despite having a minor allele frequency of 7% in the European population [127] and has been reported in 348 homozygous unaffected individuals [138]. This variant has been found to be the second variant in 40-50% of monoallelic patients [85,127], 80% of monoallelic patients with late-onset disease [127], and has an allele frequency 3-4 times higher in STGD1 patients compared to the general population [85]. The c.5603A>T p.(Asn1868Ile) variant is typically found in trans with a severe variant whereby its predicted penetrance is 2.4% [85,139].
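To show why a 7% allele frequency sits uneasily with a fully penetrant recessive allele, the sketch below computes the proportion of homozygotes expected under Hardy-Weinberg equilibrium. The equilibrium assumption and the cohort size of 10,000 are hypothetical choices for illustration and are not taken from the cited studies.

```python
def expected_homozygotes(allele_frequency: float, cohort_size: int) -> float:
    """Expected number of homozygotes (q^2 * N) under Hardy-Weinberg equilibrium."""
    return allele_frequency ** 2 * cohort_size

maf = 0.07            # minor allele frequency quoted in the text for p.(Asn1868Ile)
cohort_size = 10_000  # hypothetical cohort size, for illustration only

print(f"Expected homozygote proportion: {maf ** 2:.2%}")
print(f"Expected homozygotes in {cohort_size} people: "
      f"{expected_homozygotes(maf, cohort_size):.0f}")
```

Roughly 1 in 200 people would be expected to be homozygous, far more than the incidence of STGD1, which is in keeping with the report of unaffected homozygous individuals and with the low predicted penetrance discussed above.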
The c.5603A>T p.(Asn1868Ile) has been reported to form complex alleles with c.2588G>C p.(Gly863Ala) [85,127,140], c.5461-10T>C [127,141], c.4496G>A [142], c.2564G>A [115], and c.769-784C>T [133]. Zernant et al. proposed that this complex allele has a variable phenotype depending on the severity of the variant in trans and that both variants in the complex allele could have a synergistic effect due to both variants being predicted to affect protein folding and exerting their effect on the ABCA4 protein on the same side of the photoreceptor outer segment [127]. It has also been suggested that c.2588G>C p.(Gly863Ala) is a modifier that only causes disease when it is in cis with c.5603A>T p.(Asn1868Ile). The c.2588G>C p.(Gly863Ala) variant has also been detected in cis in 2/29 alleles with the c.5714+5G>A variant [115,143]. Moreover, the c.5714+5G>A variant was first suggested to have a moderately severe effect [144] and has been observed in mildly affected homozygous STGD1 patients [115].
It is clear that identifying hypomorphic alleles will be an important step towards solving the missing heritability in STGD1. However, deciding whether a variant is truly hypomorphic or a benign polymorphic variant poses a significant interpretation task for a laboratory. Indeed, caution is advised when assigning a variant as hypomorphic in order to avoid incorrect diagnosis. This highlights the importance of developing a pipeline using various bioinformatic algorithms as well as laboratory-based techniques to predict the pathogenicity of variants. These will be outlined in the following section. This approach is underpinned by a need for accurate phenotyping along with family segregation studies.

Pathogenicity and Severity of ABCA4 Variants
ABCA4 is a large polymorphic gene, meaning that "benign" missense variants are frequently identified. This complicates knowing which variants are truly pathogenic, as many ABCA4 variants are "private" variants [20]. Segregation studies within families are not always able to elucidate the pathogenicity of the identified variant [126]. Genotype-phenotype correlation studies in STGD1 are difficult, as patients rarely have the same combination of ABCA4 variants or are homozygous for the same variant due to the large number of variants seen in ABCA4 [20,114]. This means that many novel variants are classified as variants of unknown significance (VUS) [103]. Indeed, Braun et al. found that 65% of the ABCA4 variants in their cohort were only identified once [130].
Early studies suggested that the phenotypic severity was inversely proportional to the amount of functional ABCA4 protein (see Figure 5). Severe variants were associated with severe phenotypes, such as cone-rod dystrophy [61], retinitis pigmentosa [59,60], choriocapillaris dystrophy [101], and rapid-onset chorioretinopathy [63], whilst missense variants were thought to lead to a milder phenotype because they produce some functioning protein [53,145,146]. However, some missense variants have been reported to cause more severe disease with a significantly earlier age of onset compared to other missense variants [117] and more severe phenotypes, such as cone rod dystrophy and RP [20]. Similarly, the c.768G>T p.(Leu257Valfs*17) variant results in a severe phenotype and has been shown to affect splicing, resulting in a 35 bp exon 6 elongation, despite it being a synonymous change [147]. Deep intronic variants similarly have a variable severity, as some with a partial effect are mild [125,133] whilst those with full effect are null alleles and can be considered as severe variants [125,126].
The age of onset has been used to predict the severity of variants [114,148] as more severe variants are identified in patients with an early age of onset [90,117]. However, this observation may be affected by recall bias. Indeed, some patients with foveal sparing disease may not present despite having atrophic macular lesions [149]. Other methods to predict the severity of variants include predicting the severity based on the amino acid change, using prediction software, and in vitro functional studies.

Predicting Pathogenicity
A number of techniques are used to assess variant pathogenicity. Commonly used approaches include grading based on the American College of Medical Genetics criteria [150], allele frequency within populations, in silico analysis programmes, conservation of the region across different species, and calculating the combined annotation-dependent depletion (CADD) score. Pathogenic variants are expected to have a low allele frequency within the population; however, this method will miss the hypomorphic alleles, which can have a relatively high frequency in the general population. In silico analysis programmes are used to predict the pathogenicity of variants; however, these programmes can give differing predictions and historically could only correctly predict splicing defects in 70-80% of variants [151]. More recently, predictions of splicing defects have improved with the introduction of newer software, such as SpliceAI, which utilises deep-learning techniques [152]. Moreover, using strict screening criteria could mean some causal variants are missed. Genetic regions that are highly conserved across different species are expected to be important regions, and variants occurring within these regions are assumed to be more likely to affect the function of proteins [96]. A CADD score > 10 means that the variant would be predicted to be in the top 10% of most deleterious substitutions in the human genome [153].
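The relationship between a scaled CADD score and the "top 10%" threshold quoted above can be made explicit with a short sketch. It assumes the PHRED-like scaling used for scaled CADD scores, in which the score equals -10 × log10 of the fraction of all possible substitutions ranked as at least as deleterious. The function names are illustrative only and are not part of any CADD software.

```python
import math


def top_fraction_from_scaled_cadd(scaled_score: float) -> float:
    """Fraction of substitutions ranked at least as deleterious as this score.

    Assumes PHRED-like scaling: scaled_score = -10 * log10(fraction).
    """
    return 10 ** (-scaled_score / 10)


def scaled_cadd_from_rank(rank: int, total: int) -> float:
    """Scaled score for a variant ranked `rank` (1 = most deleterious) of `total`."""
    return -10 * math.log10(rank / total)


if __name__ == "__main__":
    for score in (10, 20, 30):
        fraction = top_fraction_from_scaled_cadd(score)
        print(f"scaled CADD {score}: top {fraction:.1%} of substitutions")
```

Under this scaling, each additional 10 points corresponds to a ten-fold smaller fraction, so a threshold of 10 is permissive and is usually combined with the other lines of evidence listed above.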
Fujinami et al. proposed three genotype severity groups (see Table 3) [155] and this classification was used to classify the genotypes of patients recruited to the ProgStar study, which is the largest natural history study on STGD1 [104,156]. Interestingly, Fujinami et al. found that a higher proportion of patients with childhood-onset disease had more severe variants compared to patients with adult onset [154]. Another severity classification (see Table 4) was proposed by Cornelis et al. following a meta-analysis on all the variants on the LOVD database [148].

Functional Analysis of ABCA4 Variants
Functional assessment of the ABCA4 variants is important to determine whether they affect protein function. In silico analyses are used to predict the effect of variants on the protein; however, these are not always accurate [105]. In vitro analysis can have an important role in deciding whether rare ABCA4 variants can be classified as pathogenic. As ABCA4 is only highly expressed in the retina, obtaining patient tissue to be used for functional assessment of variants is very difficult, meaning that in vitro assays have to be used [157]. Early in vitro assay studies used lymphoblastoid cells, which were of limited use due to the low expression of ABCA4 in blood [141,158]. Keratinocytes have since been shown to express ABCA4 at low levels [130], and reverse transcription polymerase chain reaction (RT-PCR) on photoreceptor progenitor cells derived from induced pluripotent stem cells (iPSC) has also been used to assess variants [106,159]. More recently, an in vitro assay utilising a midigene system has been developed to investigate the functional effects of NCSS variants and aberrant splicing [147]. This protocol enables any ABCA4 NCSS variants identified to be tested in their "natural context" so that the functional consequence can be determined [147].
The value of in vitro functional assessment of variants is demonstrated by the analysis of the c.5461-10T>C variant that is present in up to 7% of European STGD1 patients, making it the third most common ABCA4 variant in European patients [25,104,113,137]. This variant is associated with early-onset disease and cone-rod dystrophy in homozygous patients [106]. However, the pathogenicity of this variant was debated at the time of its discovery by Maugeri et al. in 1999 [53]. It was suggested that it was in linkage disequilibrium with an unknown pathogenic variant [53,118] because it was not predicted to cause a splicing defect using in silico analysis and splicing defects were not identified in early functional studies that used COS7 and lymphoblastoid cells [118]. However, c.5461-10T>C has now been shown to result in a splicing defect that creates shortened mRNAs through skipping of exon 39 or of exons 39-40, producing the truncated proteins p.(Thr1821Valfs*13) and p.(Thr1821Aspfs*6), respectively [106,141], consistent with it being a severe variant.
In vitro protein assays can also be used to assess both the binding of NrPE in the absence of ATP and the NrPE-dependent ATPase activity [160]. Curtis et al. recently classified missense variants into three classes based on functional assessment studies of expression of ABCA4, basal ATPase activity, and stimulation by N-Ret-PE (Table 5) [161]. Variants assigned to Class 1 were associated with an early age of onset (≤13 years of age) whilst Class 3 variants were associated with late-onset disease (>40 years of age). The authors proposed that Class 3 encompassed hypomorphic alleles as the c.5603A>T p.(Asn1868Ile) variant was found to cause a small but significant reduction in ABCA4 expression and ATPase activity [161]. Similarly, Garces et al. classified missense variants affecting the TMDs and found that these also resulted in protein misfolding, decreased substrate binding, and reduced ATPase activity [162]. However, a main limitation of in vitro protein assays is that the detergent solubilisation step can denature variants and result in a severe loss of function, as demonstrated by studies on the mild p.(Gly1961Glu) variant [160].
Table 5. A summary of the severity classification of the variants described by Curtis et al. [161].
These studies highlight the important role that in vitro analysis plays in assessing the pathogenicity of ABCA4 variants. Improvements in the ability to analyse the effects of the variants will be important for assigning pathogenicity to rare missense variants and hypomorphic alleles. This will aid in confirming a molecular diagnosis in ABCA4R patients and is of relevance for recruitment to trials of the therapeutic approaches discussed in the next section.

Therapies
Currently there are no commercially available treatments for ABCA4R/STGD1. Patients are advised to avoid supplements containing vitamin A, because lipofuscin accumulation was seen in Abca4 knockout mice that were given vitamin A [163]. Wearing protective, dark-tinted glasses in bright conditions is recommended to reduce short wavelength light reaching the retina, thus reducing the risk of light toxicity [164]. Potential treatments currently being investigated include pharmacological interventions, gene therapy, and stem cell-based therapy approaches (see Table 6). Novel therapies are initially investigated in animal models followed by trials in human subjects. Human trials are divided into four phases: Phase I assesses the safety of the therapy in a small number of subjects; Phase II assesses efficacy, with patients randomly placed in treatment and placebo arms; Phase III similarly assesses efficacy but uses a larger cohort of randomized patients; and Phase IV monitors the therapy once it becomes available.
Pharmacological therapies for ABCA4R are mainly based on targeting aspects of the visual cycle in order to reduce the accumulation of lipofuscin deposits. Table 6 details the effect of the compound and trial results if published. The main advantage of these potential therapies is that they can be taken orally, meaning they are less invasive.
Gene therapy approaches employ viral or non-viral vectors to introduce genetic material into cells in order to either replace an abnormal gene, modify a gene, or silence a dominant gene [165][166][167][168][169][170]. Recently, gene therapy using an adeno-associated virus (AAV) vector has been shown to be effective in the treatment of a different IRD, Leber congenital amaurosis (LCA) caused by RPE65 variants. It has been licensed for use by the US Food and Drug Administration (FDA) [171] and was approved by NICE in the UK in 2019, with treatment already started on the NHS for this specific disease [172]. However, using this approach in ABCA4 is complicated as the capacity of the AAV vector is approximately 4.8 kb and the ABCA4 coding sequence is approximately 6.8 kb [173,174]. Attempts to overcome this limitation have included the use of larger vectors, such as lentiviruses [175,176], a dual AAV approach [173,177-179], and the use of nanoparticles [174,180-182]. These different approaches will be briefly covered in this review and we direct readers to other review articles that cover gene therapy in more depth [180,183,184].
Lentiviruses are retroviruses that can be used to package ABCA4 due to their carrying capacity of 8 kb [185,186]. The Equine Infectious Anemia Virus (EIAV) lentivirus containing the human ABCA4 gene has been shown to significantly reduce the accumulation of A2E in Abca4 KO mice [175] and was found to be safe in rabbit and macaque retinas [176]. The only human gene therapy trial in ABCA4 was carried out by Oxford Biomedica (SAR422459, NCT01367444) using the EIAV vector, but the results were not published at the time of this review.
The dual AAV approach splits the ABCA4 gene between two AAVs that can combine to form the full-length ABCA4 gene in the host [173,177-179]. A study by Dyka et al. showed detectable levels of full-length protein in vitro [177]. Treatment of Abca4 KO mice resulted in expression of ABCA4 within the photoreceptor outer segments [173,177], expression of ABCA4 beyond the treated region [173], a significant reduction in bisretinoid levels [173,177], and a reduction in AF signal [173]. However, this approach is limited by the dual vector being less able to transduce photoreceptors compared to single AAV vectors [178,179]. In addition, there is reduced promoter activity of the inverted terminal repeats (ITR), resulting in the production of truncated proteins [178,187]. More recently, Trapani et al. found that the inclusion of the CL1 degron led to a significant reduction in truncated proteins [188].
Nanoparticles are synthetic vectors that range in size from 10 to 500 nm [180] and have a capacity of between 5.3 and 20.2 kb [181], meaning that they can be used to package the wild-type ABCA4 gene. A review of nanoparticles and other non-viral vectors is beyond the scope of this review and we direct the readers to a review by Charbel Issa and MacLaren for more detailed information [189]. Treatment in Abca4 KO mice has been shown to result in expression of transgene mRNA, expression of ABCA4 within the outer segments of rod and cone photoreceptors, a significant reduction in lipofuscin levels to levels similar to those in WT mice, a significantly thicker outer nuclear layer similar to that in WT mice, reduction in A2E, improved dark adaptation on electrodiagnostic testing (EDT), delayed progression of disease, and the absence of the white spots seen on the fundus of Abca4 KO mice [174,182].
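The packaging constraint that motivates these three strategies is simple arithmetic, and the following sketch makes it explicit using only the capacities quoted in the text. It is a deliberate simplification: it ignores the additional space required for promoters, polyadenylation signals, and the overlapping or splicing elements needed for dual-vector reconstitution, and it takes the upper bound of the reported nanoparticle range. The names used are illustrative.

```python
import math

# Sizes in kilobases, as quoted in the text above.
CARGO_KB = 6.8  # ABCA4 sequence to be delivered
CAPACITY_KB = {
    "AAV": 4.8,
    "lentivirus (EIAV)": 8.0,
    "nanoparticle (upper bound)": 20.2,
}


def vectors_needed(cargo_kb: float, capacity_kb: float) -> int:
    """Minimum number of vectors whose combined capacity covers the cargo.

    Simplifying assumption: regulatory and recombination elements are ignored.
    """
    return math.ceil(cargo_kb / capacity_kb)


if __name__ == "__main__":
    for name, capacity in CAPACITY_KB.items():
        n = vectors_needed(CARGO_KB, capacity)
        print(f"{name}: capacity {capacity} kb, vectors needed: {n}")
```

On these figures a single AAV cannot carry the sequence, which is why the cargo is split across two AAVs, whereas a lentiviral vector or a nanoparticle can in principle carry it whole.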
Stem cell-based therapies in STGD1 aim to replace the diseased or atrophic RPE cells with RPE made from either human embryonic stem cells (hESCs) or iPSCs. Three Phase I/II trials in humans have all found the procedure to be safe [198][199][200][201]. Schwartz et al. found that 8/10 patients had a significant gain in VA but no significant changes to the visual field, electrodiagnostic test results, or reading speeds were detected, despite patients reporting a median of 8-20 points improvement in their National Eye Institute Visual Function Questionnaire [198]. Another approach, by Oner et al., was to use suprachoroidal transplantation of adipose tissue-derived mesenchymal stem cells (ADMSC) in patients with AMD and STGD1. No serious adverse events were reported, and all eight treated patients had significant improvements in VA, visual field, and mfERG results [202].
A number of different therapeutic approaches could potentially become available in the future for ABCA4R, highlighting the importance of early identification of patients so that they can receive treatment at an earlier stage of disease when therapies become available. Detection of at least two pathogenic variants in trans will be mandatory in order to ensure patients do not undergo therapies that could cause harm without a likely benefit.

Discussion
STGD1/ABCA4R disease is one of the most common causes of IRD but its true prevalence is unknown. The frequently quoted prevalence of 1/10,000 was estimated by Blacharski et al. in 1988, but was not based on a prevalence study [1]. Recently, the British Ophthalmic Surveillance Unit (BOSU) reported the incidence of STGD1 to range between 0.11 and 0.13 per 100,000 individuals per year [222]. However, the carrier frequency is reported to be as high as 1 in 20 in some populations [3,4].
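The tension between the quoted prevalence and the quoted carrier frequency can be illustrated with a Hardy-Weinberg calculation. The sketch below is an illustration only and rests on assumptions that do not strictly hold for ABCA4: random mating, all pathogenic alleles pooled into a single allele of frequency q, and full penetrance of every biallelic genotype. Hypomorphic alleles, which are common and cause disease only in certain combinations, violate the last assumption in particular. The function names are illustrative.

```python
import math


def carrier_frequency_from_prevalence(prevalence: float) -> float:
    """Expected carrier frequency 2q(1 - q), taking prevalence as q**2."""
    q = math.sqrt(prevalence)
    return 2 * q * (1 - q)


def prevalence_from_carrier_frequency(carrier_frequency: float) -> float:
    """Expected prevalence q**2, solving 2q(1 - q) = carrier frequency for q."""
    q = (1 - math.sqrt(1 - 2 * carrier_frequency)) / 2
    return q * q


if __name__ == "__main__":
    carriers = carrier_frequency_from_prevalence(1 / 10_000)
    print(f"prevalence 1/10,000 implies carriers of about 1 in {1 / carriers:.0f}")

    prevalence = prevalence_from_carrier_frequency(1 / 20)
    print(f"carriers of 1 in 20 imply prevalence of about 1 in {1 / prevalence:.0f}")
```

Under these assumptions, a prevalence of 1/10,000 corresponds to a carrier frequency of roughly 1 in 50, whereas a carrier frequency of 1 in 20 would correspond to a prevalence of roughly 1 in 1,500. The gap is consistent with many carried alleles being mild or hypomorphic, with under-diagnosis, or with both.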
Stargardt/ABCA4R is considered to be an autosomal recessive disease [2] but Runhart et al. have recently proposed that STGD1 should be considered a polygenic or multifactorial disease based on a reported female sex bias in 25% of their STGD1 cases that had a combination of a mild and null allele [86]. However, Lee et al. did not find a sex bias in their STGD1 cohort and proposed that STGD1 follows a Mendelian inheritance pattern [87]. The phenotypic appearance in ABCA4R is also highly variable, making identifying and diagnosing patients difficult, and this is explored in more detail in our review of multi-modal imaging in ABCA4R [30]. Indeed, patients usually present in early childhood [1,22,23], but onset in adulthood and late adulthood (>45 years of age), with some cases having an age of onset at 80, is being increasingly recognised [26][27][28][29]. Patients with late-onset disease may be misdiagnosed as having geographic atrophy/dry age-related macular degeneration [27], and limited availability of genetic testing may mean that not all cases are identified. Thus, ABCA4R may be more prevalent than current estimates suggest. This is further complicated by the existence of phenocopies, which must be considered especially when only one variant is identified or in families with a pseudo-dominant inheritance pattern. This is because the IRD could be due to variants in other genes, such as PRPH2 [70] and CRX [72], which are associated with an autosomal dominant inheritance pattern and have variable penetrance. There may be other clinical features in these other disorders that could help distinguish them [30]. Correctly identifying these patients will be important in order to ensure adequate counselling regarding the prognosis, the risk to offspring, and the risk of late-onset disease in unaffected relatives, and will become highly relevant when therapeutic options become available.
However, this is not always straightforward. The complexity stems from a number of factors related to the ABCA4 gene, including its large size, highly polymorphic nature, and the large number of variants occurring in both the coding and non-coding regions. Patients are also frequently found to have single or no variants in ABCA4. Undetected variants, such as deep intronic variants, structural variants, and hypomorphic alleles (incorrectly classified as "benign"), together with disorders caused by variants in another gene (phenocopies), all contribute to the missing heritability in ABCA4R.
Advances in genetic sequencing, such as NGS and smMIPs, have enabled the detection of deep intronic variants, which have now been shown to be pathogenic [97,126,130,131] and are currently thought to account for 10% of all pathogenic variants in ABCA4 [54]. The recent classification of c.5603A>T as a hypomorphic allele has highlighted that hypomorphic alleles can explain a significant proportion of monoallelic cases. However, care will be needed when classifying variants as hypomorphic in order to avoid incorrectly diagnosing patients with STGD1. Structural variants [110,126,136,137] and uniparental isodisomy [97,128,129] are rarely found in the ABCA4 gene, which suggests that these changes do not account for a significant proportion of the missing heritability.
However, the possibility that the detected variants are part of a complex allele rather than in trans must be investigated. Complex alleles are thought to be present in approximately 10% of patients [92], which highlights the importance of segregation studies to confirm that the identified variants are in trans. Such studies are particularly indicated when variants in more than one IRD gene are present. However, segregation analysis is not always possible.
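The logic of a segregation study for two heterozygous variants in a proband can be sketched as follows. This is a simplified illustration rather than a clinical algorithm: it assumes both parents have been genotyped, that neither variant arose de novo, and that recombination within the gene can be ignored. The function name and the example genotypes are hypothetical.

```python
def infer_phase(mother: set[str], father: set[str], var_a: str, var_b: str) -> str:
    """Infer whether two heterozygous proband variants are in cis or in trans.

    `mother` and `father` are the sets of variants each parent carries.
    Assumes no de novo variants and no intragenic recombination.
    """
    a_mother, a_father = var_a in mother, var_a in father
    b_mother, b_father = var_b in mother, var_b in father

    # Each variant is traceable to a different parent: one on each allele.
    if (a_mother and not a_father and b_father and not b_mother) or (
        a_father and not a_mother and b_mother and not b_father
    ):
        return "trans"

    # Both variants come from one parent only, so they share one allele.
    if (a_mother and b_mother and not a_father and not b_father) or (
        a_father and b_father and not a_mother and not b_mother
    ):
        return "cis (complex allele)"

    # For example, one parent carries both variants and the other carries one.
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical family: both variants are inherited from the mother.
    print(infer_phase({"c.2588G>C", "c.5603A>T"}, set(), "c.2588G>C", "c.5603A>T"))
    # Hypothetical family: one variant is inherited from each parent.
    print(infer_phase({"c.5461-10T>C"}, {"c.5603A>T"}, "c.5461-10T>C", "c.5603A>T"))
```

The "undetermined" outcome reflects the situations noted above in which segregation analysis cannot resolve phase, for example when parental samples are unavailable or uninformative.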
Research into the genotype-phenotype correlation and genetic testing in ABCA4R is of significant importance. It was previously thought that missense variants result in a milder phenotype and that deleterious variants that result in truncated proteins are more severe [53]. However, this has not always been found to be the case [117,223]. The large number of variants complicates genotype-phenotype correlation in ABCA4 as it is rare for patients to have the same combination of variants or be homozygous unless there is a history of consanguinity.
The majority of variants are missense variants, whose severity is difficult to predict, meaning that many novel missense variants are classified as VUS. The ProgStar study [104] and Cornelis et al. [148] have both devised methods to classify the severity of different combinations of ABCA4 variants in patients. However, it is still unknown whether these predictions accurately correlate with the phenotypic appearance on multi-modal imaging and with photoreceptor function on electrodiagnostic testing.
Predicting the severity of ABCA4 variants is also difficult; for example, the c.768G>T variant, originally thought to be a synonymous change, was subsequently found to affect splicing and result in a truncated protein, p.(Leu257Valfs*17) [147]. This also highlights the limitation of using the Fujinami genotype prediction model [104], which would have classified this variant as mild. The pathogenicity of known variants such as c.2588G>C, p.(Gly863Ala) has also been re-evaluated: this variant was initially thought to be mild and only pathogenic when in trans with a severe variant, but it is now considered to be benign on its own and pathogenic only when in cis with c.5603A>T, p.(Asn1868Ile) [115,143]. Indeed, the variable phenotype observed in siblings carrying the same variants supports the presence of modifier genes/variants and environmental factors influencing the phenotype [81][82][83][84]. The further role of ABCA4 variants or other genetic factors that act as modifiers is of great interest but will be difficult to evaluate.
Currently, our ability to assess the effects of specific ABCA4 variants is limited due to the lack of a crystal structure of the ABCA4 protein [224]. Prediction of the pathogenicity of variants is mostly based on prediction programs, which are not always accurate [151]. Filtering on the frequency of variants within the general population may lead to hypomorphic alleles being inadvertently removed from the analysis. Moreover, the majority of studies on ABCA4 have involved patients of European descent, and recent studies have shown that the frequencies of different variants differ between countries [104] and ethnicities [103,112]. Further studies in different populations are required, highlighting the importance of ethnicity-specific genome databases to help ascertain whether a variant is a benign polymorphism or pathogenic. Differences in variants between ethnicities can also influence the phenotype. For example, African Americans have been reported to have milder phenotypes and later-onset disease compared to patients of European descent [112]. This variability will be important to consider when counselling patients regarding prognosis and also when screening for variants in the ABCA4 gene. Moreover, further research will be needed to accurately characterise the disease within different ethnicities to help define its natural history. This will be important in the design of and recruitment into therapeutic trials.
In vitro functional assays can be used to determine the functional effects of variants, and recent advances with the creation of patient-derived iPSCs mean that investigations are now more accurate, relevant, and reliable [106,159]. However, caution is still required when interpreting the results of these studies as some variants can be denatured by the detergents, thus affecting the outcomes of the studies [160]; these functional assays should be considered "surrogate" studies, as they are unable to measure the flippase activity. Furthermore, the lower amounts of retina-specific factors in iPSCs mean that they will not have the same splicing as that in retinal tissue [97]. Nevertheless, these in vitro studies have a useful role in assessing whether rare variants and VUS are truly pathogenic. Functional studies are important in assigning pathogenicity to missense variants and hypomorphic alleles [161]. This information can be used to determine whether patients carrying these variants are eligible to be recruited to trials. Moreover, it will help identify patients with late-onset disease, who are sometimes misdiagnosed with age-related macular degeneration and are therefore at risk of being advised to take supplements containing high doses of vitamin A, which can have detrimental effects in STGD1/ABCA4R.

Conclusions
In conclusion, STGD1/ABCA4R is one of the most common IRDs. It is associated with central visual loss and, in some patients, more global retinal disease. Accurate diagnosis is complicated by the high variability in both the genotype and phenotype in ABCA4. Genotyping in ABCA4 is difficult due to the highly polymorphic nature of the gene and the presence of many "private variants", deep intronic variants, complex alleles, hypomorphic alleles, and "phenocopies". An accurate genotype-phenotype correlation is important to understand and document the natural history of the disease, providing valuable information when assessing the efficacy of therapies. It is still unknown whether and to what extent the phenotype is also affected by modifiers and environmental factors; these may also play a role in the response to treatment. NGS-based sequencing enables the detection of variants in phenocopy genes that can mimic the appearance of STGD1/ABCA4R, and recent advances in genome sequencing are improving the detection rates of ABCA4 variants. Developments in functional studies are shedding light on the effect of variants on the ABCA4 protein and providing insights into treatment targets. Therapeutic approaches currently under investigation for ABCA4R include pharmacological, gene, gene-targeted, and stem cell-based therapies. Molecular confirmation is key for a diagnosis of STGD1 and for recruitment into appropriate therapeutic trials.

Institutional Review Board Statement:
This study was conducted in accordance with the Declaration of Helsinki with Ethics approval obtained from the local research ethics committee (reference 08/H0302/96) and informed written consent was obtained from all patients.

Informed Consent Statement: Informed written consent was obtained from all patients.
Data Availability Statement: Data sharing not applicable.