Allergenicity and Conformational Diversity of Allergens

Allergens are substances that cause abnormal immune responses and can originate from various sources. IgE-mediated allergies are one of the most common and severe types of allergies, affecting more than 20% of the population in Western countries. Allergens can be subdivided into a limited number of families based on their structure, but this does not necessarily indicate the origin or the route of administration of the allergen, nor is the molecular basis of allergenicity clearly understood. This review examines how understanding the allergenicity of proteins involves their structural characterization and elucidates the study of conformational diversity by nuclear magnetic resonance spectroscopy. This article also discusses allergen cross-reactivity and the mechanisms by which IgE antibodies recognize and bind to allergens based on their conformational and linear epitopes. In addition, we outline how the pH, the proteolytic susceptibility and the endosomal degradation affect the outcome of allergic reactions, and how this is correlated with conformational changes and secondary structure rearrangement events. We want to emphasize the importance of considering structural diversity and dynamics, proteolytic susceptibility and pH-dependent factors to fully comprehend allergenicity.


Introduction and Background to IgE Mediated Allergic Reaction
Allergic diseases, affecting millions of people, are the most common immune disorders worldwide and pose a significant health challenge. The number of affected individuals is continuously increasing, particularly in rapidly developing countries, whereas the number of affected people in industrialized countries has been historically high [1][2][3]. The uneven geographical distribution of allergies is caused by cultural factors, general lifestyle and different exposures to allergenic sources such as pollution and parasitic infection [1,4,5]. Recent proposals suggest that these factors weaken the epithelial barriers of the airways, skin and gut, thereby increasing susceptibility to allergies [6]. Following the hygiene hypothesis, the higher susceptibility to allergic sensitization in "westernized countries" has often been explained by a lower rate of infection during early childhood [7][8][9][10]. The higher tolerance to potentially allergenic sources in healthy individuals is reflected in the expression products of naïve TH-cells, such as interleukin (IL)-2, instead of the TH2-cytokines IL-4, IL-5 or IL-13, which are expressed in allergic individuals upon contact with the allergen. Furthermore, in healthy individuals, most T-cells differentiate into Treg memory cells, which are believed to regulate T-cell activation and TH2 cell differentiation [11].
A key question in the field of allergenicity is "what makes a protein an allergen?". To answer this question, it is essential to characterize allergenic proteins both structurally and in terms of their biophysical properties.
Allergens are named systematically, and the World Health Organization (WHO)/International Union of Immunological Societies (IUIS) website provides a regularly updated database of all recognized allergens [12]. For many allergens, structural information, sequences and the intrinsic protein structure play a major role in the determination of allergenicity, which has led to numerous databases grouping allergens according to structural similarities. All known protein allergens can be classified into 151 allergen families based on their structures, according to the AllFam database. This database relies on the definitions of the Pfam database [12][13][14]. Although proteins with high structural similarity to allergens but no allergenic potential have been found, cross-reactions can often be observed with other allergens of the same structural family. Despite the fact that structural data are available for numerous allergens, the exact molecular basis and the biophysical determinants of allergenicity are currently under debate. For ease of reading, protein allergens are referred to simply as allergens in the following sections.
For immunoglobulin E (IgE)-mediated allergies to proteins, the allergenicity of a protein is typically defined by its ability to elicit IgE antibodies and to bind to them [15]. Due to the cascade of reactions and consequences involved in an allergic reaction, there have been multiple attempts to measure the allergenicity of a substance [16]. The key factor in determining an allergy is the overreaction of the immune system, which has multiple implications such as TH2 responses, increased levels of cytokines, for example, IL-4, and the secretion of IgE antibodies [17]. Methods which aim to determine allergenicity in vitro are typically designed to identify the specific IgE antibodies that are produced in response to the respective allergen. Several tools have been developed which aim to predict the allergenicity of a protein, usually based on two main approaches. The first approach is sequence-based and considers the sequence identity and similarity to known allergens. The second approach is based on the recognition of allergen-specific motifs. More recent methods attempt to make these predictions using machine learning. Some of the state-of-the-art allergenicity predictors are alignment-independent and descriptor-based. However, none of the methods reliably differentiates between allergenic and non-allergenic substances [18][19][20][21][22][23].
Interestingly, all known allergens belong to a very limited number of protein families, and newly discovered allergens typically belong to known classes [14]. Among the members of a protein family, allergen-specific IgE antibodies are often unable to distinguish between the original sensitizing allergen and similar allergens of the same family, which results in cross-reactivity.
The state-of-the-art method to estimate the cross-reactivity of two allergens is the determination of the A-RISC (Allergens'-Relative Identity, Similarity and Cross-Reactivity) index, which yields a single number that describes the risk of a cross-reaction. To determine this index, proteins known to belong to a certain family are first subjected to a multiple sequence alignment. In the next step, the sequences are compared by calculating similarities and identities. For this purpose, the amino acids are grouped into aromatic, aliphatic, positively charged, negatively charged, small with hydroxyl groups and neutral polar; the remaining amino acids do not belong to any of these groups. The A-RISC index is then calculated by averaging the proteins' similarity and identity.
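The averaging step described above can be illustrated with a minimal Python sketch. It is not the published A-RISC implementation: the exact residue membership of each group, the skipping of gap columns, the counting of identical residues as similar, and the function names are all assumptions made for illustration, and the input is expected to be two rows taken from an existing multiple sequence alignment.

```python
# Residue classes named in the text; the exact membership of each class
# is an assumption made for this sketch.
GROUPS = {
    "aromatic": "FWY",
    "aliphatic": "ILV",
    "positively_charged": "KRH",
    "negatively_charged": "DE",
    "small_hydroxyl": "ST",
    "neutral_polar": "NQ",
}
# Residues not listed above (e.g. G, P, C, A, M) belong to no group.
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) of two aligned rows.

    Columns containing a gap ('-') in either row are skipped (assumption).
    Identical residues also count as similar (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif a in RESIDUE_TO_GROUP and RESIDUE_TO_GROUP[a] == RESIDUE_TO_GROUP.get(b):
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


def a_risc_like_index(aligned_a, aligned_b):
    """Average of percent identity and percent similarity, as described in the text."""
    identity, similarity = identity_and_similarity(aligned_a, aligned_b)
    return (identity + similarity) / 2.0


if __name__ == "__main__":
    # Hypothetical toy alignment rows, not real allergen sequences.
    print(a_risc_like_index("FKDSA-", "YKETAG"))
```

Treating identical residues as a subset of similar residues keeps the similarity value at or above the identity value, so the averaged index always lies between the two.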
This method assumes that members of a protein family typically share the same fold. Since the recognition and binding of epitope and paratope rely on hydrophobic interactions, hydrogen bonds and shape complementarity, a high similarity or identity of the sequence may lead to similar interactions and thereby to cross-reactions [24].
To trigger an IgE-mediated immune response, also referred to as type I hypersensitivity, an individual must come into contact with the allergenic substance twice. During the first exposure, the "sensitization phase", the naïve CD4+ T-cell differentiates into a TH2-cell, which is able to produce the cytokines IL-4, IL-5 and IL-13, thereby stimulating a B-cell to produce the allergen-specific IgE antibodies. These antibodies bind to FcεRI high-affinity receptors located on mast cells and basophils, enabling them to recognize the antigen. At this point, the mast cells and basophils are termed sensitized. The "sensitization phase" is visualized schematically in Figure 1A. During the second contact, the "elicitation phase" shown in Figure 1B, mast cells and basophils are immediately activated when the antigen cross-links the IgE antibodies bound to their cell surface. They release cytokines, histamine and other anaphylactogenic mediators via degranulation, which cause the symptoms of the anaphylactic reaction or localized hypersensitivity. Plasma cells that differentiated previously are capable of producing high amounts of allergen-specific IgE antibodies. This allows the immune system to efficiently and immediately recognize and bind the foreign antigens. Once the plasma cells have differentiated, mast cells and basophils release their products upon each re-exposure to the same allergen, or upon exposure to cross-reactive proteins, leading to the development of atopic disease [25][26][27].
Numerous populations of T lymphocytes have been identified that regulate tolerance to allergenic antigens. If the tolerance of these TH-cells is disturbed, this can lead to the development of allergies and a misdirected reaction that triggers the production of IgE antibodies. The subgroups can be divided into type 2a T helper cells (TH2a), follicular T helper cells (Tfh), regulatory T-cells (Treg), type 1 regulatory T-cells (Tr1) and follicular regulatory T-cells (Tfr). The following paragraph briefly explains each subgroup.
TH2 cells trigger the immune response by secreting IL-4. In contrast, TH2a cells are terminally differentiated memory T-cells. Tfh13 cells are found in allergy sufferers and support B cells in the production of IgE with increased affinity. The suppressive counterparts to these B cell-stimulating cells are Tfr cells. Tr1 cells are also suppressive T-cells that release IL-10, and Treg cells are suppressive T-cells that control TH2 cell differentiation and suppress the immune response to harmless substances in healthy individuals. The balance and the interplay between these different T-cell subtypes are necessary to ensure tolerance against innocuous antigens [11].
Offending substances can originate from various sources, including plants, animals, foods or drugs [28]. Sensitization to an allergen can occur via various routes of exposure, namely via ingestion through the gastrointestinal tract, via inhalation through the respiratory tract or via skin contact, sting or bite. Depending on the route of exposure and consequently on the body location of the immune insult, different subsets of dendritic cells are responsible for the immune response [29].
Orally ingested allergens, i.e., food allergens in most cases, are decomposed in the stomach due to its highly acidic environment and the presence of digestive enzymes. The resulting peptides are transported through gut epithelial cells to the mucosa, thereby crossing the epithelial barrier and coming into contact with the immune system. The mucosal dendritic cells process the proteins and peptides and present the antigens on MHC II molecules to T-cells [30][31][32].
Besides sensitization via one organ, such as the gastrointestinal tract, sensitization can also occur in a distant organ, such as the skin, or via the respiratory tract [29,33].
Recent studies suggest that environmental exposures may also contribute to an increasing risk of sensitization and thus to the development of food allergies. One prominent example is the peanut allergy, which affects an increasing number of people. The likelihood of developing this allergy increases when there are defects in the epithelial barrier [34]. Interestingly, allergic reactions to food allergens have often been observed to occur at the first exposure. This observation led to the dual allergen exposure hypothesis, which assumes that the first exposure has already occurred through cutaneous sensitization. Novel cell-based studies have additionally proposed the airway as a route of sensitization [35].

Figure 1. Main steps of type I hypersensitivity reaction. (A) shows the sensitization phase, which corresponds to the first exposure of the immune system to a certain allergen. The allergen is presented to T-cells on the surface of an antigen-presenting cell in the form of small peptides. The activated T-cell divides to form TH2-cells and via the produced substances such as IL-4, B-cells become activated to produce specific antibodies. On the right of the picture, next to the magnifying glasses, sections of the interactions between the cell types are displayed in detail. ICOS: inducible co-stimulatory antigen; CD: cluster of differentiation; B7: CD80/CD86. In panel (B), the elicitation phase is shown, which represents the reaction at each re-exposure to the same allergen, or to cross-reactive allergens. The antibodies can now be quickly produced, and bind to mast cells and basophils. When an allergen binds to the antibodies anchored on mast cells and basophils, degranulation occurs: the cell releases cytokines, histamines and other toxic mediators which cause the immune response [36,37].

Structural Characterization and Cross-Reactivities-What Makes a Protein an Allergen?
Allergens do not have a single "allergen-specific fold", but rather exhibit various folds and tertiary structures that contribute to their allergenic activity. Typically, allergens are relatively small proteins, ranging in size from 5 to 100 kDa [38]. These sizes may be subject to artefacts due to multimer formation: some protein allergens form dimers, trimers or other large multimers in their natural state. The most prominent examples are the major peanut allergens Ara h 1 and Ara h 3, which naturally form trimers of bicupins and hexamers of bicupins, respectively. During characterization by SDS-PAGE, these multimers are often monomerized, which can lead to artefacts in determining allergen size [39].
Attempts have been made to group allergens into families. AllFam is a database that aims to classify allergens into families based on common structural and functional properties [14]. The classification is supported by the definitions of Pfam. Currently, 959 out of 1042 known allergens are classified into a total of 151 allergen families, of which only 23 are populated with at least 10 members (https://www.meduniwien.ac.at/allfam/browse.php, accessed on 17 December 2023) [13,40]. Prominent examples of such families include the pathogenesis-related (PR) protein class 10, which contains the major birch pollen allergen Bet v 1, profilins, which include the ragweed pollen allergen Amb a 8, or the Group 5/6 grass pollen allergens, with the timothy grass pollen allergen Phl p 6 as a member [41][42][43]. Proteins within the same family are known to frequently cause cross-reactions. However, structural similarity does not necessarily correlate with a common immune response. There are proteins that are structurally similar to known allergens but do not trigger allergic reactions.
Cross-reactivity can occur when specific IgE antibodies are able to bind not only to the original epitope but also to structurally highly similar elements of other allergens. Importantly, not the entire polypeptide chain of the allergen is involved in antigen binding, since mainly the surface residues are fundamental for antibody recognition.
The concept of cross-reactivity describes the relationship of more than one allergen to an IgE antibody. Sensitization occurs when the immune system first encounters an allergenic substance, leading to the production of allergen-specific antibodies. If these antibodies can also recognize a secondary allergen through cross-reactive epitopes and link the protein to mast cells, cross-reactivity may occur. An allergen very often has homologous proteins originating from very distinct sources, and the route of exposure may also differ between the primary and the secondary cross-reactive allergen. It is very common for patients with a pollen allergy to be cross-reactively allergic to certain foods. For instance, patients may be allergic to inhaled birch pollen as well as to apples when consumed orally [44,45]. This condition is also referred to as oral allergy syndrome (OAS) or pollen-food syndrome. It is triggered by cross-reactivity due to structural similarities. For instance, birch pollen-apple allergy is caused by PR10 family allergens that have a highly conserved structural fold. In addition, recent studies show that the airway may be an alternative sensitization pathway for food allergies, which is consistent with the alternative sensitization routes discussed earlier [35]. Another crucial factor in OAS is the impact of food processing. Some allergens are sensitive to physical processes such as heating during cooking. These processes can affect the structure and conformational epitopes of protein allergens, which in turn affects their allergenicity. For example, patients who are allergic to raw apples may no longer be affected once the apples are cooked. Identifying the conformational epitopes involved in OAS, and how they are modified by processing, could be crucial in understanding OAS and developing strategies to prevent allergic reactions in patients [46]. One further significant consideration in food processing is the formation of neo-allergens, which are new allergenic compounds formed through the Maillard reaction when different food ingredients interact during cooking. Heating processes can accelerate this reaction, which can alter allergen epitopes through glycation reactions [47].
The cross-reactivity discussed earlier pertains to IgE interactions. When a T-cell is reactive to more than one peptide-MHC ligand, stimulation of the T-cells is possible, which can further induce the production of IgE antibodies. This is known as T-cell cross-reactivity. In both cases of cross-reactivity, sequence and structural homology play important roles, as does physicochemical stability [45].
To distinguish allergenic proteins from their non-allergenic counterparts, several hypotheses have been proposed. Two factors may play a role in determining allergenicity: the abundance and the stability of a particular protein.
Abundance is related to the probability of contact, thereby modulating the likelihood of allergenic sensitization, while stability correlates with the ability to resist premature degradation in the gastric system or in the endosome [48]. This second property is particularly crucial for allergens that are ingested orally, such as food allergens, but also matters for other routes of administration.
To assess allergenicity, a series of tools have been developed. These tools focus either on the sequence similarity to known allergens, on the recognition of known motifs of allergens, or on the biophysical properties of surface patches or residues [18]. The latter are mostly machine learning methods based on molecular descriptors. The sequence of a potential allergen is characterized by the biophysical properties of the individual residues, such as size, hydrophobicity, abundance, or the ability to form certain secondary structures. These properties are then transformed into binary fingerprints, which can be compared by means of similarity coefficients.
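The descriptor-to-fingerprint idea can be sketched in a few lines of Python. This is a toy illustration only and does not reproduce any published predictor: the residue classes, the thresholds, the bit layout and the use of the Tanimoto coefficient as the comparison measure are all assumptions chosen for this example.

```python
# Hypothetical residue classes used as crude per-residue descriptors
# (assumptions for this sketch, not taken from a specific tool).
HYDROPHOBIC = set("AILMFVWY")
SMALL = set("AGSCTPDN")
HELIX_FORMERS = set("AELMQKR")

# A composition fraction is turned into bits by asking whether it
# reaches each threshold in turn (assumed values).
THRESHOLDS = (0.1, 0.2, 0.3, 0.4)


def composition_bits(sequence, residue_class):
    """Binarize the fraction of residues in `residue_class` against THRESHOLDS."""
    fraction = sum(res in residue_class for res in sequence.upper()) / len(sequence)
    return [int(fraction >= threshold) for threshold in THRESHOLDS]


def fingerprint(sequence):
    """Concatenate the bits of all descriptor classes into one binary fingerprint."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    bits = []
    for residue_class in (HYDROPHOBIC, SMALL, HELIX_FORMERS):
        bits.extend(composition_bits(sequence, residue_class))
    return bits


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared set bits divided by bits set in either fingerprint."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 1.0


if __name__ == "__main__":
    # Hypothetical toy peptides, not real allergen sequences.
    print(tanimoto(fingerprint("AAAA"), fingerprint("GGGG")))
```

Real descriptor-based predictors use far richer per-residue descriptors and a trained classifier; the sketch only shows how continuous properties can become bit vectors that are comparable through a single coefficient.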
Protein allergens are complex molecules. Their conformational and structural arrangements play a central role in determining allergenicity and immunogenicity, i.e., whether a protein is recognized as foreign. The exposure of epitopes is influenced by conformational rearrangements and flexibility, which in turn are properties that co-determine allergenicity. Conformational rearrangements can affect the susceptibility to proteases responsible for the digestion and processing of allergens, thus directly affecting the immune response. This also applies to non-protein allergens such as drugs, where the triggering substance is often a small peptide or molecule. In most cases, the drug is too small to be presented directly on the surface of antigen-presenting cells, so complexation with carrier molecules or polypeptides is critical, as small molecules are sometimes recognized only in the form of a drug-protein conjugate [49].
To cross-link the receptor-bound antibodies on the surface of mast cells or basophils, an allergen must possess at least two epitopes [38]. Identifying these epitopes is a crucial step in allergen research. In the past, technical limitations restricted the identification of epitopes to those of linear nature, which consist of a contiguous stretch of amino acids. Investigating amino acids involved in binding in a folded protein, where residues distant in sequence may be in close spatial proximity due to folding, is experimentally much more challenging. However, allergens have well-defined and conserved structures, and the epitopes recognized by antibodies are, in most cases, of a conformational nature and therefore non-contiguous [50]. Conformational epitopes are more likely to be destroyed during allergen uptake and processing, but they seem to appear more frequently due to the folded, three-dimensional structures of protein allergens. Identifying and characterizing conformational epitopes is essential for understanding protein-protein interactions and structural determinants that cause cross-reactions and allergic immune responses [51]. Additionally, the allergenicity of an allergen is co-determined by its fold stability. Fold stability depends on the pH and, in turn, affects endosomal degradation. Furthermore, digestion by proteases depends on the variability of the allergen's conformational ensemble: a higher variability raises the probability of unfolding and thereby the chance that a protease digests the allergen. In previous works, this has been shown to correlate with thermal stability [52,53].

Conformational Diversity and NMR
As previously discussed, the conformational diversity of allergens is a key feature that influences their allergenic properties. Nuclear magnetic resonance (NMR) spectroscopy provides an excellent technique to characterize the conformational diversity of allergens experimentally. Structure determination by NMR spectroscopy yields bundles of three-dimensional structures, all of which agree equally well with the available experimental data, thereby yielding a direct representation of the conformational diversity that is present in allergens [54]. Moreover, a variety of experimental tools are available to characterize conformational diversity in biomolecules in detail and in a quantitative manner. NMR spin relaxation techniques have been developed which enable structural biologists to monitor conformational diversity at atomic resolution [55]. On the one hand, these experiments provide information about populations, revealing which structures within the bundle are present to a higher degree than others. On the other hand, information about the time scale of transitions between these structures is obtained. In the past decade, the application of these techniques to allergens has provided in-depth information regarding their conformational diversity [41]. Notably, NMR spectroscopy can also determine how interactions with low-molecular-weight molecules or other binding partners affect the conformational variability of allergens [56].

Epitope Conservation and Structural Motifs
Epitope regions on the surface of the allergen can be either linear (continuous) or conformational (discontinuous). A linear epitope consists of consecutive amino acid residues that can be predicted as peptides. These peptides typically range from 5 to 30 residues but are often less than 20 residues in length [45]. To identify linear epitopes, epitope prediction tools such as AlgPred2.0 compare the amino acid sequence of a protein with all known linear epitopes stored in databases, based on the primary sequence. Additional machine learning methods are sometimes used to improve predictions [57]. This approach is effective for certain protein families, such as the Bet v 1 group. However, these databases contain many proteins for which IgE binding has been observed only in the sera of a small number of patients, such as carrot cyclophilin and salmon enolase, which consequently leads to wrong predictions [58,59]. It is important to note that the presence of IgE binding does not necessarily indicate allergenicity. Some individuals may have IgE antibodies to a protein without experiencing an allergic reaction. Although the sequence identity can be quite high in some cases, neither cyclophilin nor enolase is a known allergen in most organisms. If enolase were a significant allergen, it would be expected that individuals sensitized to enolase would react to a wide range of organisms with glycolytic pathways. However, this is not observed, and the details related to allergen structure, binding to IgE, and functionality remain unknown.
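The database comparison for linear epitopes can be pictured as a substring search over the primary sequence. The following Python sketch shows only this basic idea under simplifying assumptions: it performs exact matching, whereas real tools such as AlgPred2.0 combine several methods whose internals are not described here, and the function name and all sequences are hypothetical.

```python
def find_linear_epitopes(sequence, known_epitopes):
    """Return (epitope, start index) pairs for every exact occurrence of a
    known linear epitope in `sequence`, sorted by position.

    Exact matching is a simplification made for this sketch.
    """
    sequence = sequence.upper()
    hits = []
    for epitope in known_epitopes:
        epitope = epitope.upper()
        if not epitope:
            continue  # ignore empty database entries
        start = sequence.find(epitope)
        while start != -1:
            hits.append((epitope, start))
            # continue one position further so overlapping hits are found too
            start = sequence.find(epitope, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical query sequence and epitope list, not real allergen data.
    query = "MKTAYIAKQRTAYIA"
    database = ["TAYIA", "QQQ"]
    for epitope, position in find_linear_epitopes(query, database):
        print(epitope, position)
```

As the surrounding text points out, a hit of this kind only indicates that a peptide with reported IgE binding is present; it does not by itself establish allergenicity.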
Comparing conformational epitopes to known allergens can be challenging due to the requirement of three-dimensional structure information. The Immune Epitope Database (www.iedb.org) provides a list of 204 experimentally determined B-cell and T-cell conformational epitopes of allergens and 5722 linear epitopes (accessed on 14 December 2023) that have already been published [60]. Table S1 in the Supplementary Materials lists the conformational and linear epitopes of selected members of three prominent allergen families, the Bet v 1 family (also known as the PR10 allergen family), the profilin family and the lipocalins, as well as those of some other selected members of additional families, as determined by an IEDB search. The table includes information on the different routes of exposure and the origin of the respective allergens. Pollen-food syndrome, also known as oral allergy syndrome (OAS), is a common example of cross-reactions in individuals who are sensitized to certain pollen and also react to certain fruits, vegetables, or nuts. This occurs because some pollen proteins share similar epitopes due to structural homology. For instance, individuals who are allergic to birch pollen (e.g., Bet v 1) may also react to fruits of the rose family (Rosaceae) such as apples (Mal d 1) or sweet cherries (Pru av 1) [61][62][63]. From known epitopes, we know that these three allergens share at least the conformational epitope given by the serine residue at position 113/112. Another known shared epitope is given by residues T11, F31(I31), S57(T58), S113 and I114. Mal d 1 and Bet v 1 share a sequence identity of 61%, although it seems to be more relevant to search for discrete areas of similarity and common structural motifs [64]. All individuals sensitized to Mal d 1 are also sensitized to Bet v 1, and conversely, 94% of individuals sensitized to Bet v 1 are also sensitized to Mal d 1. Additionally, 100% of people with Api g 1 allergy are sensitized to Bet v 1, while 37% of people sensitized to Bet v 1 are also sensitized to Api g 1 (celery) [65]. These examples demonstrate cross-reactivity within the Bet v 1 protein family. The cross-reactivity of this group is thought to result from a high sequence homology within the group. However, cross-reactivity can also occur between different families and across different plant sources.
The cross-reactivity between mugwort and celery exemplifies epitope conservation across two distinct protein families. The mugwort pollen allergen Art v 1 seems to share epitopes with certain vegetable proteins. Individuals who are allergic to Art v 1, which has a defensin-like fold associated with a polyproline-rich region, may exhibit cross-reactivity with celery (Api g 1), a member of the PR10 protein family. The presence of structurally similar epitopes in Art v 1 from mugwort pollen and Api g 1 in celery is assumed; however, no B cell epitopes are published and listed in the IEDB [62]. Another well-known example is the latex-fruit syndrome, which demonstrates how a contact allergy to latex from para rubber trees can lead to allergic sensitization to fruits. Latex-sensitive individuals may experience allergic reactions to certain foods, such as bananas (Mus a 5, beta-1,3-glucanase) and avocados (Pers a 1, class I chitinase), due to the shared epitopes of the major latex allergens Hev b 6.02 and Hev b 5. This cross-reactivity is attributed to structural similarities between latex allergens and certain fruit proteins, despite differences in their structural folds [66]. More than 100 epitopes are predicted for both of the mentioned latex allergens, but no conformational B cell epitopes have been published. The lipocalin protein family, which includes prominent airway allergens from domestic cats (Fel d 7), dogs (Can f 2), and cockroaches (Bla g 4), shares sequence motifs that are conserved among most members of the family. Three such motifs are known: two loop regions and the N-terminal end [67]. These regions are thought to be important for cross-reactive binding.
Understanding the conservation of epitopes is crucial for predicting cross-reactivity and treating allergic disease. Identifying common epitopes aids in developing hypoallergenic variants of allergens and strategies for allergen-specific immunotherapy.

Allergen Proteolytic Susceptibility and pH-Dependency
During sensitization, an allergen undergoes endosomal degradation, which cleaves it into small peptides. These protein fragments are then presented on the cell surface by the major histocompatibility complex II (MHC II), facilitating their recognition by CD4+ T-cells, which then induce an immune response in allergic individuals. The process of protein degradation is depicted in Figure 2A: first, the allergen enters the cell via endocytosis. Proteases cleave the allergen into small fragments, and these fragments are then presented on the cell surface by the MHC II. The naïve CD4+ T-cells recognize and bind the peptides, leading to the differentiation into active T-helper cells. Polarization into TH2-cells results in the release of cytokines that stimulate B-cells to produce IgE antibodies, causing an allergic response [68]. Since the presentation of the fragments directly depends on the ability of proteases to cleave the allergenic proteins, proteolytic susceptibility can be directly correlated to the allergenicity of a protein [69,70]. In addition, the density of allergen fragments presented on the surface of antigen-presenting cells (APC) influences the fate of the T-cell contacting the peptides. High concentrations of antigens promote TH1-cell differentiation, whereas moderate concentrations lead to TH2-cell differentiation [68].
Proteolytic cleavage occurs in the endosome of an antigen-presenting cell.Processing and loading onto MHC II take place in slightly acidic environments, where the acidity of the endosome increases during its maturation.The endosomal degradation of an allergen is thereby also dependent on the degree of maturation of the endosome.During the maturation process, the endosome undergoes an alteration of the pH, from initially 6.8-5.5 in the early endosome to 5.5-5 in the late endosome.The final stage of maturation involves the transformation of the fully matured endosome into a lysosome, which has an even lower pH of approximately 5 to 4 [30,33,71].
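The pH windows quoted above can be written down as a small lookup, which makes the overlap at the boundaries explicit. The following is only an illustrative sketch: the function name `compartment_for_ph`, the first-match handling of boundary values and the fallback label are assumptions of this example and are not taken from the cited studies.

```python
# Approximate pH windows as quoted in the text: (name, upper pH, lower pH).
# Listed in order of endosomal maturation.
COMPARTMENT_PH = [
    ("early endosome", 6.8, 5.5),
    ("late endosome", 5.5, 5.0),
    ("lysosome", 5.0, 4.0),
]


def compartment_for_ph(ph: float) -> str:
    """Return the first compartment whose quoted pH window contains `ph`.

    Boundary values (5.5 and 5.0) belong to two windows in the text; this
    sketch assigns them to the earlier compartment (an arbitrary assumption).
    """
    for name, upper, lower in COMPARTMENT_PH:
        if lower <= ph <= upper:
            return name
    return "outside quoted range"


if __name__ == "__main__":
    for ph in (6.5, 5.2, 4.5, 7.4):
        print(f"pH {ph}: {compartment_for_ph(ph)}")
```

Such a lookup is only a bookkeeping aid; the actual pH of a compartment changes continuously during maturation.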

Previous studies have demonstrated that proteolytic susceptibility depends on the structural stability of an allergen. Since proteolytic cleavage sites are frequently hidden within secondary and tertiary structure elements, the key factor that facilitates proteolysis is the inherent ability of proteins to undergo local unfolding. Unstable proteins are more likely to unfold, resulting in increased susceptibility to cleavage. In contrast, highly stable proteins are less prone to unfolding and therefore exhibit reduced susceptibility to cleavage [69,72].
The acidity of the environment is a factor that affects the stability of an allergen. Proteins are more likely to undergo local or global unfolding in an acidic environment [42,69,73,74], as shown in Figure 2C. The figure illustrates the broader conformational ensemble at low pH values, where most of the titratable residues are protonated. As the pH increases, the percentage of deprotonated residues rises. In Figure 2C, the residues highlighted in cyan indicate the higher fraction of deprotonated residues. Simultaneously, the conformational ensemble narrows, and the allergen becomes more rigid. This correlation has been observed for several respiratory allergens in recent studies that combined computational methods, namely constant pH-MD, with NMR experiments [73]. Examples are the major timothy grass pollen allergen Phl p 6 together with a stabilized and a destabilized variant [43,73]. For the profilin allergens Amb a 8, Art v 4 and Bet v 2, changes in the pH have been shown to affect thermal stability and flexibility [42]. The investigated profilins show consistent protonation patterns at each pH range. Table 1 summarizes the examples of allergens that have demonstrated a pH-dependent effect on enzyme digestion and binding epitopes.
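The qualitative trend described above, a rising fraction of deprotonated residues with increasing pH, can be illustrated with the Henderson-Hasselbalch relation for a single, isolated titratable site. This is a simplified sketch and not the method of the cited studies: it ignores the coupling between sites and conformation that constant pH-MD is designed to capture, and the pKa value of 4.0 used below is a hypothetical example.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch fraction of an isolated site that is deprotonated.

    Assumes an ideal, non-interacting titratable group with a fixed pKa.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    hypothetical_pka = 4.0  # illustrative value only, not a measured pKa
    for ph in (4.0, 5.0, 5.5, 6.8):
        frac = fraction_deprotonated(ph, hypothetical_pka)
        print(f"pH {ph}: {frac:.1%} deprotonated")
```

In a folded protein, the effective pKa of a residue can shift considerably with its local environment, so the curve for a real allergen may deviate from this idealized picture.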
The digestion of an allergen is determined by its stability: unstable proteins are digested in the early endosome, whereas stable allergens are digested at a later stage, as depicted in Figure 2B. Endosomal acidification has been shown to play a key role in the induced allergic reaction, since T-cell polarization, which leads to sensitization to a specific allergen, can only occur at a certain maturation level. Epitope fragments that are generated in the early endosome are more likely to yield TH2 responses since the density of the presented peptides is moderate. In the late endosome, MHC II loading is more efficient, and the density of the presented peptides is therefore higher, generating a TH1 response. In the lysosome, where the density of peptides loaded on the MHC II is lower, the probability of allergic sensitization is higher. Thus, proteolytic degradation must occur within specific margins of the endosomal maturation to trigger an allergic response, as illustrated in Figure 2B [48,75]. This underlines the intricate interplay between endosomal acidification and the immune system in the context of allergic reactions. Understanding this relationship clarifies the process of T-cell polarization, a critical step in the development of allergies that relies on endosome maturation, and improves our understanding of sensitization to specific allergens. Unraveling the complexities of this interplay offers potential avenues for investigating allergic sensitization and its associated health impacts [48].

Discussion and Future Directions
Numerous protein groups with distinct biophysical properties can induce allergy-specific T-cell polarization. To determine allergenicity, it is important to understand the biophysical characteristics that depend on the sequence and structure of a protein. Both linear and conformational epitopes enable recognition and processing by dendritic cells, as well as binding to T-cell receptors, followed by activation mechanisms and reaction cascades. Various studies have demonstrated that taking into account the dynamic properties of allergens is crucial, as their biological functions are directly linked to conformational variability [56,69]. Inherent properties, such as stability, determine the folding and unfolding of secondary and tertiary structure elements and other structural rearrangements, which further influence function, immune recognition, and subsequent processing [38,42,76,77]. The conformational diversity of proteins and their structural ensembles are determined by intrinsic biophysical properties such as thermal stability, resistance to alterations in the environment's acidity and hydrophobicity. All of these indicators affect protein recognition, binding behavior, and allergenicity [78].
A crucial step in any allergic response is the interaction between the epitope of an allergen and the paratope of an antibody. Recently, the first structure of an IgE-allergen interaction was reported, which provides molecular details of the specific amino acids involved in allergen recognition [79]. Characterizing the binding interface is crucial for understanding the binding mechanisms and for developing therapeutics that can inhibit cross-linking between allergen and effector cells via IgE antibodies. This process is known to cause the symptoms of an allergic reaction [80]. The ability of an antibody to bind a specific target with a certain affinity and to recognize multiple epitopes is determined by the conformational diversity of its paratope [81]. The large functional diversity of antibodies and their capability to recognize different epitopes can be explained by the concept of conformational variability, which is characterized by the surface plasticity of the interfaces [82][83][84][85]. The ability to bind more than one epitope is a key factor in the cross-reactivity of an allergen.
Upon allergen uptake, the allergens enter the endosome of a dendritic cell, where they are further digested. A prerequisite for digestion by proteases is a partial unfolding of the tertiary and secondary structure elements of the allergen. This destabilization causes local unfolding events and structural rearrangements, resulting in higher cleavage rates and ultimately leading to allergic sensitization [69].
The categorization of allergens is heavily reliant on their three-dimensional structures and may be assigned based on their conformational epitopes. These structural features are determined by the protein's fold and shape. Conformational epitopes are the primary means by which IgE antibodies recognize most major allergens due to their three-dimensional protein structures. Linear epitopes, on the other hand, are responsible for sensitization by food allergens because they are digested in the gastrointestinal tract [86].
Processing techniques have also been shown to impact allergenicity by modifying or masking IgE-binding epitopes in the case of food allergens. For instance, the Bet v 1-like hazelnut allergens Cor a 1 and Cor a 2 showed reduced allergenicity when roasted at temperatures above 140 °C [87]. Conversely, roasting has been shown to increase IgE binding for the major peanut allergen, Ara h 2. Roasting causes significant structural changes that affect digestibility, which is believed to further impact allergenicity. However, these findings cannot directly reflect the change in allergenicity because side effects such as monomerization or aggregation must be considered [88]. Additionally, other processing mechanisms, such as high-pressure treatment, irradiation or microbial fermentation, may affect the allergenicity of protein allergens by destroying binding regions. In particular, high-pressure treatments followed by enzymatic hydrolysis, or enzymatic hydrolysis in combination with fermentation, seem to minimize allergenicity due to the cleavage of linear epitopes for some allergenic proteins, as has been shown for cow milk [89]. However, processing procedures can increase aggregation propensity and cause chemical modifications such as the Maillard reaction. It is important to note that these changes do not necessarily result in the loss of allergenicity. Nevertheless, aggregation and insolubility should be considered, as well as their effects on a molecular level, which are not yet fully understood [46].
Characterization of conformational epitopes requires the preservation of the three-dimensional structure of the allergen-antibody complex for X-ray crystallography, and also for NMR structure determination [51,79,90,91]. The paratope of antibodies determines antigen recognition and binding, but predicting this region accurately is challenging [92]. A study characterizing several antibody-allergen interfaces revealed high variability in the antibody paratopes, reflected in higher surface plasticity compared to the respective epitopes. Additionally, it has been reported in a limited number of cases that IgE epitopes may differ from epitopes of other isotypes, indicating that some of the few known IgE epitopes appear to be more planar than other known epitopes [80,85,93].
Obtaining IgE antibodies poses difficulties due to the scarcity of B-cells that produce these antibodies. Experimental structure determination requires high amounts of pure and homogeneous antibodies, and characterizing paratopes and epitopes is also difficult. Despite these challenges, recent advances in antibody sequencing have significantly contributed to our understanding of the interplay between antibodies and allergens. Advanced technologies such as B-cell antibody sequencing and human hybridomas have enabled the successful solving and deposition of an IgE structure complexed with the mite allergen Der p 2 in the Protein Data Bank (PDB) [51,79,94]. In addition, structural data are available for numerous allergens either in a complex with murine IgG monoclonal antibodies (mAbs) or with IgE constructs from phage display libraries [95][96][97][98]. These recent achievements provide molecular details of the specific amino acids involved in allergen recognition [79,99].
When investigating allergens, it is crucial to consider the acidity of the environment and take into account different protonation states, as they influence stability and other biophysical properties [42,43,73]. This aspect becomes particularly important when experimentally determined structures (via X-ray diffraction or NMR) are considered, and also when utilizing computational approaches such as structure prediction methods or biophysical property predictions of proteins. Some examples of computational approaches are interaction analysis, prediction of stabilizing or destabilizing mutations, and docking or hydrophobicity calculations [51].
The same considerations should also be applied when performing molecular dynamics (MD) simulations. One promising example is the implementation of constant pH molecular dynamics simulations (cpH-MD). Molecular dynamics simulations are typically performed at fixed protonations, which are selected for the starting conformation of the simulation. Since protonation states are highly dependent on the structure and conformational changes of a protein, cpH-MD methods have been developed to capture the interplay between changes in pH and changes in structure. These simulations enable the calculation of the probabilities for different protonation states and thereby predict the most probable protonation at a certain pH range. The approach to reliably access these changes is to explicitly define titratable protons. A list of states containing all titratable groups is used to determine whether a proton is "active" or "inactive". The simulation is run at a fixed protonation state, but at regular intervals, an attempt is made to change the protonation state, and the Metropolis criterion is used to decide whether the new state is accepted or not. If a change is accepted, the solvent is allowed to relax around the fixed protein for a certain number of steps [100,101].
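The acceptance step described above can be sketched for a single, isolated titratable site. The sketch below is a toy model under strong assumptions and not an implementation of any cpH-MD package: there is no MD propagation, no solvent relaxation and no structure-dependent energy term, and the free-energy change of a protonation flip is taken to be ln(10)·(pKa − pH) in units of kT. All function names are hypothetical.

```python
import math
import random

LN10 = math.log(10.0)


def flip_cost_in_kt(protonated: bool, ph: float, pka: float) -> float:
    """Free-energy change (in kT) for flipping one isolated site.

    Assumption: deprotonation costs ln(10) * (pKa - pH); protonation is the
    reverse. A real cpH-MD step would add structure-dependent energy terms.
    """
    cost_deprotonation = LN10 * (pka - ph)
    return cost_deprotonation if protonated else -cost_deprotonation


def metropolis_accept(cost_kt: float, rng: random.Random) -> bool:
    """Metropolis criterion: always accept downhill, else with exp(-cost)."""
    return cost_kt <= 0.0 or rng.random() < math.exp(-cost_kt)


def sample_protonated_fraction(ph: float, pka: float,
                               attempts: int = 20000, seed: int = 0) -> float:
    """Fraction of attempts after which the site is protonated."""
    rng = random.Random(seed)
    protonated = True  # fixed protonation chosen for the starting state
    protonated_count = 0
    for _ in range(attempts):
        # In cpH-MD, MD steps at the current fixed protonation would run here.
        if metropolis_accept(flip_cost_in_kt(protonated, ph, pka), rng):
            protonated = not protonated
        protonated_count += protonated
    return protonated_count / attempts


if __name__ == "__main__":
    for ph in (3.0, 4.0, 5.0):
        print(ph, round(sample_protonated_fraction(ph, pka=4.0), 3))
```

For this idealized site, the sampled fraction should approach the Henderson-Hasselbalch value 1 / (1 + 10^(pH − pKa)); deviations from that curve in a full simulation are what reflect the coupling between protonation and conformation.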
Considering the impact of environmental acidity and the resulting protonation states is essential for future research in the field of allergens. It facilitates a more thorough comprehension of allergen behavior and interactions, laying the groundwork for antibody engineering in the field of allergenicity and advancing the development of new therapeutics.
There is currently no definitive answer to the question of what exactly determines whether a protein is an allergen or not. However, recent developments go beyond merely categorizing proteins as allergens or non-allergens and instead suggest a continuous differentiation ranging from "highly allergenic" to "not allergenic at all", based on many intrinsic biophysical and environmental factors, which are discussed in this review [48].

Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/allergies4010001/s1, Table S1: List of selected members of different allergen families with their origin, route of exposure, predicted linear epitopes using the AlgPred 2.0 online tool and known conformational and linear B-cell epitopes determined using the IEDB.

Figure 1. Main steps of type I hypersensitivity reaction. (A) shows the sensitization phase, which corresponds to the first exposure of the immune system to a certain allergen. The allergen is presented to T-cells on the surface of an antigen-presenting cell in the form of small peptides. The activated T-cell divides to form TH2-cells and, via the produced substances such as IL-4, B-cells become activated to produce specific antibodies. On the right of the picture, next to the magnifying glasses, sections of the interactions between the cell types are displayed in detail. ICOS: inducible co-stimulatory antigen; CD: cluster of differentiation; B7: CD80/CD86. In panel (B), the elicitation phase is shown, which represents the reaction at each re-exposure to the same allergen, or to cross-reactive allergens. The antibodies can now be quickly produced, and bind to mast cells and basophils. When an allergen binds to the antibodies anchored on mast cells and basophils, degranulation occurs: the cell releases cytokines, histamines and other toxic mediators which cause the immune response [36,37].

Figure 2. (A) The uptake of the allergen into the endosome of an antigen-presenting cell: the allergen enters the cell via endocytosis. In the endosome, proteases cut the allergen into small peptides, which are then loaded onto the MHC II and presented on the cell surface. (B) The acidification of the endosome in relation to different allergen stabilities. Depending on the stability of the allergen, the proteins are digested earlier or later in the phase of the maturation of the endosome. (C) Schematic representation of the structural behavior of an allergen as a function of varying pH. At low pH, a small fraction of the allergen is deprotonated and the allergen is structurally variable. The conformational ensemble is large. With increasing pH, the conformational ensemble becomes smaller, and a higher fraction of the allergen is deprotonated.

Author Contributions: The research was performed by C.A.S., R.Z. and M.L.F.-Q. under the supervision of K.R.L. and M.T. C.A.S. drafted the manuscript. All authors contributed to the paper by critically reviewing it. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the Austrian Science Fund (FWF) via grant P34518. This work was supported by the Austrian Academy of Sciences APART-MINT postdoctoral fellowship to M.L.F.-Q.

Table 1. Allergens for which pH has been shown to alter protease digestion or conformational epitopes, thereby influencing allergenicity.
