The Role of Knockout Olfactory Receptor Genes in Odor Discrimination

To date, little is known about the role of olfactory receptor (OR) genes in smell performance. Thanks to the availability of whole-genome sequencing data of 802 samples, we identified 41 knockout (KO) OR genes (i.e., genes with carriers of loss-of-function (LoF) variants) and evaluated their effect on odor discrimination in 218 Italian individuals through recursive partitioning analysis. Furthermore, we checked the expression of these genes in human and mouse tissues using publicly available data and, for ORs expressed in non-olfactory tissues, tested for the presence of organ-related diseases in human KO (HKO) individuals (Fisher test). The recursive partitioning analysis showed that age and a high number (burden) of OR-KO genes worsen odor discrimination (p-value < 0.05). Human expression data showed that 33/41 OR genes are expressed in the olfactory system (OS) and 27 in other tissues. Sixty putative mouse homologs of the 41 human ORs were identified, 58 of which are expressed in the OS and 37 in other tissues. No association between OR-KO individuals and pathologies was detected. In conclusion, our work highlights the role of the burden of OR-KO genes in worse odor discrimination.


Introduction
Animals, including humans, perceive themselves and everything surrounding them thanks to their senses, and only sensory coding allows species to make crucial decisions that lead to a specific behavioral response [1]. Among the sensory systems, the sense of smell is the most ancient and gives us the ability to perceive odorants, which are mixtures of different chemical molecules. This ability is present in micro-organisms as well as in complex species such as mammals. However, during evolution, human beings' increasing reliance on other senses, such as vision, has decreased our sense of smell [2]. Nevertheless, the OS is the designated machinery for recognizing and elaborating conscious olfactory stimuli, allowing humans to discriminate more than a trillion odorant stimuli [3,4]. Anatomically, the OS extends from the superior part of the nose to the higher structures of the brain. The crucial component is the olfactory epithelium (OE), which is highly specific for each species and is deeply connected to its reliance on the sense of smell. The OE is characterized by several cell types, the most important of which are the olfactory sensory neurons (OSNs), bipolar neurons capable of regeneration [3]. The precise mechanism of OSN regeneration, maturation, and the subsequent axonal connection is still unknown, but this turnover mechanism decreases progressively over time, leading to age-related olfactory function loss [5]. On each cilium, OSNs express one OR gene, which allows the interaction with different odorants [3]. The exact mechanism of odor coding is still undeciphered, but odorant identification seems to work as a "combinatorial code" in which one OR can identify several odorants while different odorants are recognized by multiple combinations of receptors [6].
Olfactory receptors (ORs) belong to the superfamily of G-protein coupled receptors (GPCR). The number of OR genes and pseudogenes in the genome varies significantly between different species [7][8][9]. It does not always correlate with their smell ability, suggesting that other factors may be involved (e.g., larger surfaces of OE in dogs, a high number of glomeruli in humans, etc.) [10][11][12]. Mammals have about 1000 olfactory genes, while other living organisms such as fish have no more than 100 [9]. Humans have 851 OR genes, but only 45% of them are functional [13]. In humans, the OR genes are distributed in clusters located on all chromosomes, except for chromosomes 20 and Y [14]. Recent evidence suggests that, apart from the OE, ORs are widely expressed in several other tissues such as the brain, tongue, testis and liver [15]. The OR genes are intron-less and, despite not being individually expressed in each OSN, they are expressed by a single allele [13]. Inter-individual phenotypic variation in olfactory function within members of the same species suggests a different pattern of genetic variants in ORs and an influence of both environmental and demographic factors [15][16][17].
Among non-genetic components, it is well established that aging is a driving factor in olfactory decay [17]. Moreover, other conditions such as neurodegenerative diseases [18], head trauma [19], brain tumors [20], brain surgery [21], and infections [22] have been shown to play a role in olfactory dysfunction.
As for OR genetic variations, they probably contribute to the diversity of odorant-specific sensitivity phenotypes. For example, the role of two variants (rs61729907 or R88W, and rs5020278 or T133M) within the OR7D4 gene, which impair the individual ability to perceive androstenone (5α-androst-16-en-3-one), is well known [23]. Recently, Gisladottir and colleagues [24], through a whole-genome sequencing analysis, discovered a common variant in OR6C70 associated with a higher intensity and naming of licorice odor (trans-anethole). Other studies highlighted the role of OR variants on specific odorants [25][26][27][28][29][30][31]. However, there is still a lack of data regarding the interactions of the hundreds of receptors with the multitude of odorous molecules. Therefore, more efforts are needed to increase our knowledge of the genetic basis of this sense. In this light, the possibility of studying individuals defined as human knockout (HKO) (i.e., carriers of biallelic loss of function (LoF) variants) offers an unprecedented opportunity to further explore the role and the function of OR genes.
In this study, we hypothesized that the number of knockout OR genes (OR-KO) could impact the individual general smell ability, without focusing on a single OR or single odorant. Analyzing data from two Italian genetic isolates, we identified carriers of biallelic LoF variants in OR genes (i.e., OR-human knockout (HKO)), and investigated their relationship with odor discrimination data measured through the Sniffin' Sticks test. The main aim was to better understand the role of these genes in smell ability by investigating the possible correlation between the burden of OR-KO genes and smell ability. As secondary objectives, we studied the expression pattern of the OR-KO genes in the OS and other tissues of both humans and mice, and the possible development of organ-related diseases in individuals who are OR-KO for proteins expressed in non-olfactory tissues.

Results
Figure 1 shows the workflow of the study.

Figure 1. Workflow of the study. The picture summarizes the general workflow applied in the present study. Briefly, whole genome sequencing (WGS) data of 802 samples have been checked, searching for olfactory receptor (OR) genes carrying loss of function (LoF) variants. Data analysis led to the identification of 41 OR-KO (Olfactory Receptor knockout) genes in 782 subjects. Among those individuals, Sniffin' Sticks test data were available for a total of 218 persons. The association between the burden of OR-KO genes and odor discrimination was tested, together with the analysis of OR-KO genes' expression in human and mouse tissues and the correlation between OR-KO genes and specific diseases.

Dataset Overview and Characterization of Olfactory Receptor Knockout (OR-KO) Variants
Briefly, as reported in our previous work [32], low coverage whole genome sequencing (WGS) data from Italian individuals were analyzed with an in-house bioinformatics pipeline based on GATK (Genome Analysis Toolkit) best practices [33] to identify common and rare genetic variants. Eight hundred and two individuals belonging to two geographically distinct Italian areas (n = 378 for the Friuli Venezia-Giulia (FVG) cohort, n = 424 for the Val Borbera (VBI) cohort) were selected and investigated for homozygous LoF variants involving ORs. This search resulted in a list of 42 LoF variants in 41 OR genes and a total of 782 HKO (372 in FVG and 410 in VBI), defined as individuals carrying at least one homozygous LoF variant. Among these 42 variants, 14 (33.3%) were classified as stop gain and 28 (66.7%) as frameshift. The frequency of the alternative allele ranged from 0.004 (rs564566592) to 0.77 (rs10838851), and two LoFs were not present in the FVG cohort (11_5080307_AT_A and rs147062602). The comparison with data from the 1000 Genomes Project phase 3 [34] and gnomAD v.2.1.1 [35] showed that the allele frequency distribution of the variants we selected was consistent with the allele frequency spectrum of the general European population. The complete characteristics of the identified variants are detailed in Table 1 and Supplementary Tables S1-S3. By comparison with the hORMdb database [36], we found information on 39 variants belonging to genes comprising 13 out of 18 OR families. All but one (14_20666175_C_CA/rs55781225) were annotated as affecting all gnomAD populations. Out of 39 putative LoF variants, seven are annotated as affecting the functional core, and two as affecting the corresponding OR's binding cavity (as defined in [35]). A negative amino acid substitution score was assigned to 16 variants (two of them also affected the binding cavity and one the functional core). Therefore, we concluded that at least 22 variants could impact the binding of odorant molecules or the receptor structural integrity (Table S4).
Table 1. Characteristics of the homozygous LoF variants in OR genes identified in our Italian cohorts. All data are aligned to the human genome reference build 37 (GRCh37), and VEP (Variant Effect Predictor, https://www.ensembl.org/info/docs/tools/vep/index.html) version 90 was used to determine the variant consequence. Chr = chromosome, Pos = position, Ref = reference allele, Alt = alternative allele, Freq = frequency of the reference allele, KO = knockout, N = number, FVG = Friuli-Venezia Giulia, VBI = Val Borbera. The last two columns refer to the number of KO individuals in FVG and VBI ("KO FVG/VBI") and the number of human knockout (HKO) individuals with Sniffin' Sticks test information who were used in the regression tree analysis ("N smell FVG/VBI").
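The count of at least 22 potentially impactful variants can be reproduced from the hORMdb annotation counts quoted in the text by inclusion-exclusion. The short Python sketch below is only an illustration: the function name is hypothetical, and it assumes (the text does not say so explicitly) that no variant is annotated as affecting both the functional core and the binding cavity.

```python
def count_impacting_variants(core, cavity, negative_score,
                             negative_and_cavity, negative_and_core):
    """Inclusion-exclusion over three annotation classes.

    Assumption (not stated in the text): the functional-core and
    binding-cavity classes do not overlap with each other.
    """
    return (core + cavity + negative_score
            - negative_and_cavity - negative_and_core)


# Counts quoted in the text: 7 functional core, 2 binding cavity,
# 16 with a negative substitution score, of which 2 also affect the
# binding cavity and 1 also affects the functional core.
total = count_impacting_variants(core=7, cavity=2, negative_score=16,
                                 negative_and_cavity=2, negative_and_core=1)
print(total)
```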

Relationship between OR-KO Genes' Burden and Smell Performance
After applying the exclusion criteria detailed in the Methods section (e.g., previous neurodegenerative disease diagnosis), 218 subjects with Sniffin' Sticks test data (93 belonging to the VBI cohort and 125 to the FVG cohort) were included in the study. Their features are summarized in Table 2. The hypothesis that an increasing number of OR-KO genes carried by an individual could impact the sense of smell (evaluated as the number of mistakes made in the odor discrimination test) was investigated using conditional inference tree analysis. As reported in Figure 2, this analysis showed that age and the OR-KO burden significantly influenced the number of errors, while the model was not influenced by sex or population (adjusted p-value > 0.05). In particular, the first variable affecting smell was age (node 1: 73 years cutoff, p-value < 0.001; node 2: 57 years cutoff, p-value < 0.001), while the second one was the OR-KO gene burden (node 3: cutoff 4 OR-KO, p-value 0.038). This partition led to four final subgroups (indicated as the terminal nodes labeled 4, 5, 6 and 7 in Figure 2), showing that, from node 4 to node 7, the number of errors increased with both a high burden of OR-KO genes and aging.
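To make the partition concrete, the following Python sketch assigns an individual to one of the four terminal nodes using the cutoffs reported above. It is a reading aid rather than the fitted model: the function name is hypothetical, and the direction of each split (values less than or equal to the cutoff go to the lower-numbered branch) is an assumption based on the usual conditional inference tree convention and on the statement that errors increase from node 4 to node 7.

```python
def terminal_node(age, or_ko_burden):
    """Return the terminal node (4, 5, 6 or 7) for one individual.

    Cutoffs come from the text (age 73, age 57, burden 4); treating
    "<= cutoff" as the left branch is an assumption.
    """
    if age > 73:            # node 1 split on age
        return 7
    if age > 57:            # node 2 split on age
        return 6
    if or_ko_burden > 4:    # node 3 split on OR-KO burden
        return 5
    return 4


# Hypothetical individuals, for illustration only.
for age, burden in [(40, 2), (40, 6), (65, 6), (80, 1)]:
    print(age, burden, terminal_node(age, burden))
```

Under this reading, the OR-KO burden only changes the assignment for individuals aged 57 or younger, which matches the interpretation given in the Discussion.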

Expression Patterns of OR-KO Genes
To investigate OR-KO genes' expression, we used publicly available data on human and mouse expression in multiple cell lines and tissues. The results are reported in Table 3.
Human RNA-seq data extracted from Saraiva et al. [37] revealed that 33 out of 41 OR genes (80.5%) had detectable expression in human olfactory tissue, with expression spanning from 0.35 to 160.36 normalized counts (NCs). In particular, 28 showed evidence of robust expression (>1 NCs). Moreover, according to the Human Protein Atlas (HPA) database, 27 genes (65.9%) were expressed in at least one other tissue.
From the list of 41 human ORs, we identified 60 putative mouse homologs through the Mouse Genome Informatics (MGI) resource. OS expression data [32] showed robust expression (>1 NCs) for 51 out of the 60 identified mouse homologs (85%). The MGI database confirmed expression in the OS for 58 of 60 mouse homologs, with 37 of them (63.8%) being expressed in tissues other than OE.

Relationship with Pathologies
Given the expression of the investigated genes in tissues other than the OS, the presence of pathologies in HKO individuals was investigated. We focused on the FVG cohort since this was the subset of individuals with the most curated pathology data available. After Bonferroni correction for multiple testing, the analysis did not identify any pathology that was significantly more frequent in HKO subjects than in the remainder of the population.

Discussion
Although the molecular bases of the olfactory sense are relatively well understood, there is still a considerable lack of knowledge of the contribution of the specific genes involved. Therefore, it is vital to explore this sense further, considering that smell ability deficits are critical signs for the early diagnosis of neurodegenerative disorders [18,38,39]. Several works have already highlighted the effect of variants in OR genes on the perception of smell [23][24][25][26][27][28][29][30][31], but, to our knowledge, no study has evaluated the effect of the burden of OR-KO genes on smell ability.
In this light, we combined WGS data of a large cohort of samples with detailed phenotypic data to address this unsolved issue. In particular, thanks to the availability of WGS of 802 Italian samples, we identified 41 OR-KO genes (i.e., genes for which we identified individuals carrying LoF variants in the homozygous state). We evaluated their effect on the smell capacity in 218 individuals, for whom odor discrimination was assessed through the Sniffin' Sticks test. For the first time, we demonstrated that the burden of OR-KO genes was significantly associated with a worse smell performance in young subjects (i.e., aged ≤57 years). More precisely, the younger individuals carrying more than 4 OR-KO genes showed a worse performance in the odor discrimination test. Interestingly, although there are 41 OR-KO genes, 4 is the median number of OR-KO genes per individual. This result might be related to the cumulative effects of these mutational events (which simultaneously turn off the expression of a series of OR genes), as also hypothesized for other conditions [40,41]. Moreover, the data made available by the recently published work of Jimenez et al. [35] allowed us to conclude that at least 22 HKO variants could impact the binding of odorant molecules or the receptor structural integrity. This last information suggested that an approach based on the burden test can help determine whether multiple homozygous LoF variants influence the ability to recognize odors. Our data agreed with previous findings showing that age was a major player in the progressive worsening of the sense of smell, outweighing the genetic factors in older individuals (i.e., aged >57 years) [25].
Regarding the OR-KO expression patterns, it has been highlighted that many OR genes are expressed in several structures other than the OS in both humans and mice, thus suggesting that they may exert a role in non-chemosensory tissues.
We looked for any relationship between OR-KO genes and specific pathologies, but we did not find any disorders significantly more frequent in OR-KO subjects than in the rest of the population. Several possible explanations could justify this lack of association, including the small number of cases and the lack or incompleteness of data on tissue-specific OR gene expression in public databases. Information about tissue-specific expression was not available for many ORs, and therefore, in these cases, it was not possible to speculate on any association with a particular disease. On the other hand, for ORs whose pattern of expression was publicly available, it could be argued that the data were still largely incomplete. Most ORs were apparently over-expressed in the male or female reproductive system, in bone-marrow-derived cells, and in the brain, with a relative absence of expression in all other tissues.
In general, our study, for the first time, reported WGS data combined with the smell phenotype of a selected cohort from Italian genetic isolates. Our results allowed us to identify an interesting association between the burden of OR-KO genes and worse smell performance in younger people, suggesting the importance of the genetic background in determining human olfactory capability. The present data also corroborated the hypothesis that aging processes are more relevant than the individual genetic background in impairing smell ability. Further studies on larger datasets, including other population cohorts, are needed, although data from individuals with both WGS and information on the sense of smell are relatively limited.

Identification of OR-KO Genes and Comparison with External Databases
A subset of HKO variants involving OR genes was selected from the data generated in [30] for further analysis. HKO variants were defined as LoF variants with a CADD (Combined Annotation Dependent Depletion) score ≥ 20 that were present in the homozygous state in at least one individual of at least one population. We defined the "burden of OR-KO genes" as the total number of KO OR genes per individual and compared alternative allele (ALT) frequencies of HKO variants with data from the 1000 Genomes Project phase 3 [34] and gnomAD v.2.1.1 [35] using the R implementation of the Chi-squared test. We extracted information about topological annotations from the Human Olfactory Receptor Mutation Database (hORMdb) [36].
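The two quantities defined above can be sketched in a few lines of Python (standard library only). This is an illustrative re-implementation, not the pipeline used in the study, which relied on R. The function names, the input layout (a per-individual mapping from gene to homozygous-LoF status) and the gene labels are hypothetical. The test is a plain Pearson chi-squared on a 2x2 table of allele counts without continuity correction, whereas R's `chisq.test` applies Yates' correction to 2x2 tables by default, so the numbers would differ slightly from an R run.

```python
import math


def or_ko_burden(homozygous_lof_by_gene):
    """Number of OR genes knocked out in one individual.

    homozygous_lof_by_gene: dict mapping gene name -> True if the
    individual carries a homozygous LoF variant in that gene.
    """
    return sum(1 for is_ko in homozygous_lof_by_gene.values() if is_ko)


def chi_squared_allele_test(alt_a, total_a, alt_b, total_b):
    """Pearson chi-squared test (1 df, no continuity correction).

    Compares ALT allele counts between two populations. Every margin
    of the 2x2 table must be non-zero, otherwise an expected count is
    zero and the statistic is undefined.
    """
    table = [[alt_a, total_a - alt_a], [alt_b, total_b - alt_b]]
    n = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [alt_a + alt_b, n - alt_a - alt_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical individual and hypothetical allele counts.
print(or_ko_burden({"OR_gene_A": True, "OR_gene_B": False, "OR_gene_C": True}))
print(chi_squared_allele_test(30, 100, 10, 100))
```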

Clinical Evaluation
The clinical evaluation of all subjects enrolled in the study covered hundreds of functional parameters, including clinical and biochemical data and bone densitometry. We performed a sensory evaluation focused on the analysis of the senses (hearing, taste, smell, and vision; for details on the smell functionality assessment, see the next section), a cardiovascular, neurological, and orthodontic evaluation, and a detailed personal and familial history with more than 200 questions asked to each subject. All parameters were systematically collected by professional and trained staff according to standardized protocols; participants were also required to fill in a questionnaire on health-related topics, including diet, lifestyle, and physical activity.

Smell Functionality Assessment
Smell functionality of each subject was assessed through the "Sniffin' Sticks test" (Screening 12 test, Burghardt Messtechnik GmbH, Wedel, Germany), a smell discrimination test which contains 12 "Sniffin' sticks", felt-tip pens with precise odorants to be recognized [42]. The test is based on the discrimination of everyday odors (i.e., peppermint, fish, coffee, banana, orange, rose, lemon, pineapple, cinnamon, cloves, leather and licorice) through a "multiple-forced-choice" method. Individuals with incomplete data on sex, age, or answers to the 12 sticks were excluded from the analyses. Furthermore, individuals with conditions that could affect smell performance, such as respiratory (asthma, sinusitis, septal surgery, etc.) or neurological diseases [43,44], were excluded.
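The response variable used later (number of errors out of 12) and the exclusion rules can be sketched as follows. This Python snippet is a minimal illustration under stated assumptions: the record layout and its keys (`sex`, `age`, `answers`, `excluding_condition`) are hypothetical, and each answer is assumed to be stored as the chosen odor label so that a correct answer equals the name of the odor in the pen.

```python
ODORS = ["peppermint", "fish", "coffee", "banana", "orange", "rose",
         "lemon", "pineapple", "cinnamon", "cloves", "leather", "licorice"]


def count_errors(answers):
    """Number of wrong answers over the 12 sticks.

    answers: dict mapping each odor to the label chosen by the
    participant. Returns None when any of the 12 answers is missing.
    """
    if any(answers.get(odor) is None for odor in ODORS):
        return None
    return sum(1 for odor in ODORS if answers[odor] != odor)


def is_eligible(record):
    """Apply the exclusion rules described in the text to one record.

    record: dict with hypothetical keys 'sex', 'age', 'answers' and
    'excluding_condition' (True for respiratory or neurological
    conditions that could affect smell performance).
    """
    if record.get("sex") is None or record.get("age") is None:
        return False
    if record.get("excluding_condition", False):
        return False
    return count_errors(record.get("answers", {})) is not None


# Hypothetical participant who confuses two odors.
answers = {odor: odor for odor in ODORS}
answers["rose"] = "lemon"
answers["fish"] = "leather"
print(count_errors(answers))
```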

Relationship between Smell Performance and the Burden of OR-KO Genes
Conditional inference tree analysis (R "party" package) was used to test the influence of the burden of OR-KO genes (in addition to age, sex, and population) on smell functionality (number of errors in the Sniffin' Sticks test) [45,46]. This statistical method is effective in studies in which there are subgroups with different levels of response to the explanatory variables. Briefly, the following algorithm was applied [47]: (1) the global null hypothesis of independence between any of the explanatory variables and the response is tested; the procedure stops if this hypothesis cannot be rejected based on a Bonferroni correction (α = 0.05), otherwise the explanatory variable with the strongest association with the response is selected; (2) a binary split is implemented in the selected explanatory variable; (3) steps (1) and (2) are repeated recursively.
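The three steps above can be illustrated with a deliberately simplified Python sketch (standard library only). It is not the `party` implementation used in the study. As stand-ins chosen for brevity, the association test is a Monte Carlo permutation test on the absolute Pearson correlation, and the binary split maximises the between-group sum of squares. All function names, the `min_size` stopping rule, the number of permutations and the toy data are assumptions.

```python
import random


def association_stat(x, y):
    """Absolute Pearson correlation between a predictor and the response."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_p(x, y, rng, n_perm=500):
    """Monte Carlo permutation p-value for independence of x and y."""
    observed = association_stat(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if association_stat(x, y_perm) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def best_split(x, y, min_size):
    """Cutpoint maximising the between-group sum of squares of y."""
    best_cut, best_score = None, -1.0
    for cut in sorted(set(x))[:-1]:
        left = [b for a, b in zip(x, y) if a <= cut]
        right = [b for a, b in zip(x, y) if a > cut]
        if len(left) < min_size or len(right) < min_size:
            continue
        gap = sum(left) / len(left) - sum(right) / len(right)
        score = len(left) * len(right) / len(y) * gap ** 2
        if score > best_score:
            best_cut, best_score = cut, score
    return best_cut


def grow_tree(rows, response, predictors, rng, alpha=0.05, min_size=10):
    """Recursive partitioning with Bonferroni-adjusted stopping."""
    y = [r[response] for r in rows]
    leaf = {"n": len(rows), "mean": sum(y) / len(y)}
    if len(rows) < 2 * min_size:
        return leaf
    # Step 1: test each predictor and keep the strongest association.
    p_values = {v: permutation_p([r[v] for r in rows], y, rng)
                for v in predictors}
    best_var = min(p_values, key=p_values.get)
    adjusted = min(1.0, p_values[best_var] * len(predictors))
    if adjusted > alpha:
        return leaf  # global null hypothesis not rejected: stop here
    # Step 2: binary split in the selected predictor.
    cut = best_split([r[best_var] for r in rows], y, min_size)
    if cut is None:
        return leaf
    left = [r for r in rows if r[best_var] <= cut]
    right = [r for r in rows if r[best_var] > cut]
    # Step 3: repeat steps 1 and 2 on each side.
    return {"var": best_var, "cut": cut, "p": adjusted,
            "left": grow_tree(left, response, predictors, rng, alpha, min_size),
            "right": grow_tree(right, response, predictors, rng, alpha, min_size)}


# Toy data (made up): errors jump after age 50 and ignore sex.
toy_rows = [{"age": 20 + i, "sex": i % 2, "errors": 1 if 20 + i <= 50 else 5}
            for i in range(60)]
toy_tree = grow_tree(toy_rows, "errors", ["age", "sex"], random.Random(0))
print(toy_tree["var"], toy_tree["cut"])
```

On the toy data the sketch splits once on age and then stops, because the response is constant within each side. Real data would call for the tested implementation in `party`, which handles ordinal and categorical predictors and uses a different test statistic.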

Expression of ORs in Human and Mouse
Human and mouse normalized expression data were downloaded from the supplementary materials of the mammalian olfactory mucosae transcriptomic atlas [37]. The data included normalized expression averages across three human and three mouse OE samples. The Human Protein Atlas (HPA) [48] was interrogated to verify the evidence of OR gene expression in non-OE tissues, and genes with expression below 1 normalized count were considered not expressed. The Mouse Genome Informatics (MGI) resource [49] was used to identify mouse homologs/orthologues and to assess the expression patterns of the detected homologs in the OS and other tissues.

Relationship with Pathologies
We asked whether any pathology was significantly over-represented in individuals carrying the KO genes compared with the rest of the sequenced population. The analysis focused on the sequenced individuals from the FVG cohort, for whom detailed and curated anamnestic information was available (pathologies classified according to the International Classification of Diseases, ICD-10). For each OR gene, we extracted the pathologies observed in the group of KO individuals. For each disease/phenotype, a case-control study was carried out comparing its recurrence in HKO cases versus the group of non-HKO individuals (R implementation of the Fisher exact test, significance threshold set at Bonferroni corrected p-value < 0.001).
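The case-control comparison described above can be sketched in Python with the standard library alone. The snippet is an illustration, not the code used in the study, which relied on R. The two-sided definition (summing the probabilities of all tables no more likely than the observed one) is the one R's `fisher.test` uses by default for 2x2 tables, but whether the study used a one- or two-sided alternative is not stated, so that choice is an assumption. The function names and the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows: HKO / non-HKO individuals; columns: with / without the
    pathology. The p-value sums the hypergeometric probabilities of
    all tables with the same margins that are no more likely than
    the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts: 3 of 4 HKO individuals and 1 of 4 non-HKO
# individuals report a given pathology.
print(fisher_exact_two_sided(3, 1, 1, 3))
```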
Supplementary Materials: Supplementary materials can be found at https://www.mdpi.com/article/10.3390/genes12050631/s1. Table S1: 1000 Genomes Project allele frequencies for each LoF variant considered in this study. Table S2: gnomAD dataset allele frequencies for each LoF variant considered in this study. Table S3: Comparison of the allele frequency of each LoF in our populations (FVG and VBI) to the corresponding allele frequency reported in both 1000 Genomes and gnomAD populations through a Chi-squared test. Table S4: Information retrieved from hORMdb to assess the likelihood of a functional impact on the corresponding OR.