Article

Genome-Wide In Silico Analysis Expanding the Potential Allergen Repertoire of Mango (Mangifera indica L.)

1 Department of Chemistry and Biochemistry, The University of Oklahoma, Norman, OK 73019, USA
2 Mercy School Institute, Edmond, OK 73013, USA
3 Dodge Family College of Arts and Sciences, University of Oklahoma, Norman, OK 73019, USA
4 Department of Biochemistry and Physiology, The University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA
5 Mass Spectrometry, Proteomics and Metabolomics Core Facility, Stephenson Life Sciences Research Center, The University of Oklahoma, Norman, OK 73019, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(15), 8375; https://doi.org/10.3390/app15158375
Submission received: 25 May 2025 / Revised: 24 June 2025 / Accepted: 27 June 2025 / Published: 28 July 2025
(This article belongs to the Special Issue New Diagnostic and Therapeutic Approaches in Food Allergy)

Abstract

The potential of a protein to cause an allergic reaction is often assessed using a variety of computational techniques. High-throughput protein sequence data, coupled with in silico methods, can be used to systematically analyze large proteomes for allergenic potential. Despite mango’s widespread consumption and growing clinical reports of hypersensitivity, the full extent of its allergenicity is not yet known. In this study, for the first time, we conducted a genome-wide in silico analysis of 54,010 protein sequences to identify the complete spectrum of potential mango allergens. These proteins were analyzed using various bioinformatics tools to predict their allergenic potential based on sequence similarity, structural features, and known allergen databases. In addition to the known mango allergens, including Man i 1, Man i 2, and Man i 3, our findings demonstrated that several isoforms of cysteine protease, non-specific lipid-transfer protein (LTP), legumin B-like, 11S globulin, vicilin, thaumatin-like protein, and ervatamin-B family proteins exhibited strong allergenic potential, with >80% 3D epitope identity, >70% identity over a linear 80 aa window, and matches to >80 known allergens. Thus, this genome-wide in silico study provides a comprehensive profile of the possible mango allergome, which could help identify low-allergen mango cultivars and aid in the development of accurate assays for variety-specific allergic reactions.

1. Introduction

The identification and characterization of allergens and allergen-like proteins from food sources is a central concern in allergy research and is of growing importance to the food industry, where safety and risk assessment are crucial for regulatory approval, product development, and consumer protection [1,2,3]. However, identifying the complete allergen profile of any food species is challenging and time-consuming because of the complexity of the proteins involved and the individual variability in immune responses. Computational methods and the availability of complete genome sequences can streamline this process by leveraging bioinformatics tools to analyze protein structures and identify potential allergenic components more efficiently. These computational approaches hinge on key features, such as sequence similarity, conserved structural motifs, and immunologically relevant domains, that are commonly associated with known allergenic proteins [4].
Although in silico and computational tools cannot distinguish between the sensitization and elicitation phases of allergy development, international guidelines, including those from the FAO/WHO and the European Food Safety Authority (EFSA) [5], endorse their use, defining a protein as potentially allergenic if it shares more than 35% identity over 80 amino acids, or contains exact matches of 6–8 contiguous amino acids with a known allergen [5,6]. These thresholds recognize in silico predictions as a first-pass assessment in allergen risk evaluation, form the basis of many allergenicity prediction platforms, and are widely adopted to streamline regulatory evaluations in both academic and industry contexts [4].
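The two sequence-based criteria above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: regulatory pipelines typically compute the 80-residue identity from gapped local alignments, whereas this sketch compares ungapped windows only, and the function names (`window_identity`, `shares_exact_kmer`, `flagged_by_thresholds`) are hypothetical rather than part of any published tool.

```python
def window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Best ungapped percent identity between any `window`-residue stretch of
    `query` and any equally long stretch of `allergen` (0.0 if either is shorter)."""
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(allergen) - window + 1):
            a = allergen[j:j + window]
            matches = sum(1 for x, y in zip(q, a) if x == y)
            best = max(best, 100.0 * matches / window)
    return best


def shares_exact_kmer(query: str, allergen: str, k: int = 6) -> bool:
    """True if the two sequences share at least one identical run of k residues."""
    kmers = {allergen[j:j + k] for j in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))


def flagged_by_thresholds(query: str, allergen: str,
                          identity_cutoff: float = 35.0,
                          window: int = 80, k: int = 6) -> bool:
    """Apply both criteria: >35% identity over 80 residues, or an exact k-mer match."""
    return (window_identity(query, allergen, window) > identity_cutoff
            or shares_exact_kmer(query, allergen, k))
```

The nested loop is quadratic in sequence length, which is acceptable for illustration but would need an alignment tool or index for proteome-scale screening.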
Fruit allergies, which affect 0.03% to 8% of the global population, are a growing health concern [2,7]. Despite this prevalence, many fruits remain underrepresented in allergen databases. In particular, tropical fruits have seen a surge in global consumption yet remain poorly characterized with respect to their allergenic profiles [4,8]. Mango (Mangifera indica L.), a widely consumed tropical fruit valued for its flavor and nutritional benefits, exemplifies the widening gap between global dietary expansion and allergen knowledge. In recent decades, mango has become markedly more popular beyond its native growing regions. In the United States, for example, mango imports have quadrupled over the past two decades to meet consumer demand for products ranging from smoothies to skincare, contributing to broader and more frequent exposure across all age groups [9,10]. A recent meta-analysis of studies on fruit allergies found that mango is one of the top five tropical fruits that trigger fruit allergies [7]. Multiple studies have implicated mango as a significant allergen, with reported prevalence ranging from 0.3% in Switzerland to 16% in Thailand, and skin test positivity rates as high as 42.3% in some Chinese cohorts [4,11].
Mango allergy manifests primarily in two immunological forms: immediate (Type I) and delayed (Type IV) hypersensitivity [11]. These reactions often present as contact dermatitis, rash, eczema, and blistering. As documented by Ukleja-Sokołowska et al. [12], a 30-year-old woman developed generalized urticaria, facial edema, strong stomach pain, and watery diarrhea within several minutes of eating a mango. Despite its clinical significance, the allergenic profile of mango remains incompletely characterized. According to the WHO/IUIS Allergen Nomenclature Sub-committee (https://allergen.org/, accessed on 24 May 2025), only three allergenic proteins have been formally identified in mango: chitinase (Man i 1), pathogenesis-related (PR) protein (Man i 2), and profilin (Man i 4), representing only a fraction of the mango’s total proteome. A recent study by Guo and Cong (2024) listed several additional proteins [4], including glyceraldehyde-3-phosphate dehydrogenase (GAPDH), lipoxygenase, and glucanase, which showed cross-reactivity with other plant species such as wheat, peanut, and banana [13,14]. Furthermore, mango is not only antigenic on its own but also exhibits extensive cross-reactivity with other food and inhalant allergens [4]. These findings underscore the broader immunological risk that mango poses to individuals with existing sensitizations, emphasizing the need for a deeper and wider understanding of mango’s allergenic potential.
In silico platforms, which allow researchers to analyze protein structures and predict allergenicity without extensive laboratory testing, have been successfully used to identify potential allergen proteins in numerous food species [15,16,17,18,19,20,21,22]. However, to our knowledge, no genome-wide analysis has been conducted to profile the complete allergome of any tropical fruit, including mango. With the majority of the proteome remaining uncharacterized for allergenicity, a genome-wide evaluation is both timely and necessary. In this study, for the first time, we conducted a genome-wide in silico screening of 54,010 mango protein sequences extracted from the NCBI Protein database. We sought to systematically discover and catalog potential allergens in mango using a variety of allergenicity prediction platforms. By expanding the known allergen repertoire of mango, our analysis provides a valuable foundation for future research in allergy diagnostics, immunotherapy design, and food safety assessment, which is especially pertinent as tropical fruits like mango become increasingly integrated into global diets and diverse consumer products [11].

2. Materials and Methods

2.1. Collection of Mango Protein Sequences

The mango reference genome (CATAS_Mindica_2.1; RefSeq accession GCF_011075055.1), assembled in 2020 by the Beijing Institute of Genomics, Chinese Academy of Sciences, from the widely cultivated Alphonso variety, currently comprises 54,010 protein entries, which are publicly accessible in the NCBI Protein database (Taxonomy ID: 29780). To assess the allergenic potential of mango proteins, these protein sequences were downloaded in FASTA format and curated to include essential metadata such as NCBI accession numbers, protein names, sequence lengths, and source information. Figure 1 outlines the stepwise bioinformatics pipeline used to assess the allergenic potential of Mangifera indica proteins.
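As an illustration of the curation step, the following sketch parses FASTA text and extracts the metadata fields named above. It assumes NCBI-style headers of the form `>ACCESSION description [Organism]`; the `ProteinRecord` class, the function names, and the accession in the usage example are hypothetical and do not reproduce the scripts used in this study.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class ProteinRecord:
    accession: str
    name: str
    source: str
    sequence: str

    @property
    def length(self) -> int:
        return len(self.sequence)


def _make_record(header: str, chunks: List[str]) -> ProteinRecord:
    # Header layout assumed: "ACCESSION protein name [Organism]".
    accession, _, description = header.partition(" ")
    name, source = description, ""
    if description.endswith("]") and "[" in description:
        name, _, source = description.rpartition("[")
        name, source = name.strip(), source[:-1]
    return ProteinRecord(accession, name, source, "".join(chunks))


def parse_fasta(lines: Iterable[str]) -> Iterator[ProteinRecord]:
    """Yield one record per '>' header; sequence lines are concatenated."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)
```

For example, `list(parse_fasta(open("mango_proteins.fasta")))` (file name hypothetical) would return one record per protein, from which a metadata table can be written.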

2.2. In Silico Platform for Identification of Potential Mango Allergens

The curated protein sequence dataset served as the foundation for our genome-wide allergen prediction screening and was submitted to AllerCatPro 2.0, a well-established prediction tool that integrates sequence similarity, epitope analysis, and structural comparisons [23,24]. Since AllerCatPro can only analyze 50 protein sequences at a time, we split the mango proteome dataset (54,010 protein sequences) into 1081 FASTA files and submitted each one for analysis. We additionally implemented a filter with two criteria: (i) ≥70% sequence identity over an 80 amino acid linear window, and/or (ii) best match with at least 80 known allergen entries, to improve the consistency and specificity of the predictions in both the strong- and weak-evidence categories.
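The batch arithmetic can be checked with a short sketch: 54,010 sequences in batches of 50 give 1080 full batches plus one final batch of 10, i.e., 1081 files. The helper names below are hypothetical, and the sketch covers only the splitting step, not submission to the web server.

```python
import math
from typing import Iterator, Sequence


def batch_count(n_sequences: int, batch_size: int = 50) -> int:
    """Number of batches needed when each holds at most `batch_size` sequences."""
    return math.ceil(n_sequences / batch_size)


def batches(records: Sequence, batch_size: int = 50) -> Iterator[Sequence]:
    """Yield consecutive slices of `records`, each at most `batch_size` long."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


print(batch_count(54_010))  # 1081
```

Each yielded slice could then be written to its own FASTA file before upload.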

2.3. Bioinformatics Tools for Data Processing

Scatter and bubble plots were generated using Python (version 3.12.7). Multiple amino acid sequence alignment was performed, and a phylogenetic tree was generated using an open-source sequence alignment platform, Clustal Omega (https://www.ebi.ac.uk, accessed on 24 May 2025). Heatmaps on percentage (%) identity were generated using Excel. B-cell epitope prediction was performed using the Immune Epitope Database (IEDB) (http://tools.iedb.org/bcell/, accessed on 24 May 2025) following the Kolaskar and Tongaonkar method [25].

3. Results and Discussion

The identification of allergenic proteins in food sources is a growing priority for researchers and regulatory agencies alike, especially as global diets diversify to include a wider range of under-characterized foods. In this regard, in silico approaches combined with high-quality genome sequence data enable scalable, rapid evaluation of structural motifs, epitope similarity, and sequence homology to clinically verified allergens.

3.1. Identification of Known Mango Allergens

Our results indicated that five of the six proteins and/or protein families previously identified as mango allergens [4] were successfully identified and classified as having substantial allergenic potential, demonstrating the effectiveness of AllerCatPro as an allergen prediction tool. These proteins include glyceraldehyde 3-phosphate dehydrogenase (GAPDH), profilin, β-1,3-glucanase, Bet v 1-like homologous protein/pathogenesis-related proteins (PR proteins), and Type I chitinases (Figure 2, Table S1).
It is important to note that members of these families were identified in both the strong and weak allergen evidence categories, indicating that they may have clinical significance and that additional experimental validation is necessary. For example, of the 39 chitinase isoforms found, 3 had strong evidence and 36 had weak evidence. Similarly, PR proteins have 25 isoforms, 8 of which are strong and 17 of which are weak; GAPDH has 19 isoforms, 9 of which are strong and 10 of which are weak; and profilins have 13 isoforms, all of which exhibit strong evidence. These findings highlight that the level of evidence for allergenicity varies among isoforms of the same protein family. Further research is essential to clarify the functions of the weak-evidence isoforms and to explore their potential as mango allergens. Importantly, GAPDH was recognized as Man i 1 in a recent publication by Guo and Cong [4], although chitinase proteins are listed as Man i 1 in the WHO/IUIS Allergen Nomenclature Sub-committee database. Similarly, chitinases were named as Man i 2 and Man i chitinase, profilin as Man i 3.01 and Man i 3.02, and Bet v 1-like homologous protein/pathogenesis-related proteins (PR proteins) as Man i 14 kDa. We therefore follow the mango allergen names used in the most recent publication [4] throughout this investigation.
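The per-family isoform counts quoted above can be tallied with a few lines of Python. The counts are taken from the text; the `(family, evidence)` input layout and the function name `tally_evidence` are assumptions made for illustration and do not reflect the actual column names of Table S1.

```python
from collections import Counter, defaultdict


def tally_evidence(calls):
    """Count strong/weak calls per protein family from (family, evidence) pairs."""
    counts = defaultdict(Counter)
    for family, evidence in calls:
        counts[family][evidence] += 1
    return counts


# (strong, weak) isoform counts per family, as reported in the text.
reported = {
    "chitinase": (3, 36),
    "PR protein": (8, 17),
    "GAPDH": (9, 10),
    "profilin": (13, 0),
}

# Expand the reported counts into per-isoform calls, then tally them again.
calls = []
for family, (n_strong, n_weak) in reported.items():
    calls += [(family, "strong")] * n_strong + [(family, "weak")] * n_weak

for family, counter in tally_evidence(calls).items():
    total = counter["strong"] + counter["weak"]
    print(f"{family}: {counter['strong']} strong, {counter['weak']} weak, {total} total")
```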

3.2. Potential Allergens in Mango Genome

In addition to the recognized mango allergens, AllerCatPro identified hundreds of additional proteins, belonging to different protein families, that have a high probability of being allergens. In particular, 1489 (3%) and 5277 (10%) of the 54,010 mango protein sequences had strong and weak allergenicity potential, respectively (Figure 3A, Table S1). The discovery of multiple allergenic protein families that are not yet considered mango allergens but closely resemble known allergens from other fruit species further supports the efficacy of this in silico workflow. For instance, proteins from the cysteine protease, legumin, non-specific lipid transfer protein (LTP), thaumatin-like protein, vicilin, ervatamin-like protein, and globulin families showed strong allergenic potential (Figure 3B). These findings expand the known allergome of mango well beyond what has been previously reported in allergen databases. In addition, our results underscore the potential of in silico approaches to uncover novel candidate allergens based on structural and sequence homology with clinically established allergens from other sources (Figure 3B).
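The rounded percentages follow directly from the reported counts, as the short calculation below shows; the remainder with no allergen evidence (47,244 proteins) is derived here by subtraction rather than quoted from the text.

```python
TOTAL = 54_010
strong, weak = 1_489, 5_277
no_evidence = TOTAL - strong - weak  # derived by subtraction: 47,244


def percent(count: int, total: int = TOTAL) -> float:
    """Share of `total` represented by `count`, in percent."""
    return 100.0 * count / total


for label, count in [("strong", strong), ("weak", weak), ("none", no_evidence)]:
    print(f"{label}: {count} proteins, {percent(count):.2f}%")
```

To two decimal places these are 2.76%, 9.77%, and 87.47%, which round to the 3%, 10%, and 87% shown in Figure 3A.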

3.3. Identification of High-Confidence Allergen Protein Families in Mango

While approximately 1500 proteins demonstrated strong evidence of allergenicity and over 5200 proteins were considered to pose a lower allergenic risk, we applied an additional filter (≥70% identity across an 80 amino acid linear window and ≥80 known allergen hits) to refine the candidate list and better categorize protein families. This approach reduced the dataset to 63 high-confidence allergens and 185 proteins with moderate evidence (Figure 4, Table S2).
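A minimal sketch of this refinement filter is given below. The `Prediction` fields are hypothetical stand-ins for the corresponding AllerCatPro output columns, and the `require_both` flag is an assumption that lets the two criteria be combined with either "and" or "or". Only the first record uses values reported in this article (XP_044503745.1: 71.2% window identity, 192 allergen hits, weak evidence); the other two are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    accession: str
    evidence: str           # "strong" or "weak" (AllerCatPro evidence class)
    window_identity: float  # % identity over the 80 aa linear window
    allergen_hits: int      # number of known allergens matched


def passes_filter(p: Prediction, min_identity: float = 70.0,
                  min_hits: int = 80, require_both: bool = True) -> bool:
    """Apply the identity and hit-count thresholds to one prediction."""
    identity_ok = p.window_identity >= min_identity
    hits_ok = p.allergen_hits >= min_hits
    return (identity_ok and hits_ok) if require_both else (identity_ok or hits_ok)


def refine(predictions: List[Prediction], **kwargs) -> List[Prediction]:
    return [p for p in predictions if passes_filter(p, **kwargs)]


predictions = [
    Prediction("XP_044503745.1", "weak", 71.2, 192),  # values reported in the text
    Prediction("HYPOTHETICAL_1", "strong", 65.0, 120),  # invented example
    Prediction("HYPOTHETICAL_2", "weak", 90.0, 12),     # invented example
]
print([p.accession for p in refine(predictions)])  # ['XP_044503745.1']
```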
Our results showed that proteins belonging to the cysteine protease family formed the second-largest category in the mango genome, indicating a considerable allergy risk (Figure 3B). Several cysteine protease isoforms closely matched the kiwi (Actinidia deliciosa) allergen Act d 1 (Figure 4A). The top three cysteine proteases (accessions XP_044488586.1, XP_044511549.1, and XP_044509143.1) were aligned with Act d 1 using multiple sequence alignment (Figure S1) and showed >82% amino acid sequence identity with Act d 1 (Figure 5A).
Furthermore, amino acid sequence alignment revealed a highly conserved B-cell epitope region (Figure S2), and the epitope scores of these cysteine proteases were higher than those of the known allergen Act d 1 (Figure S2). A recently developed AI-based allergen prediction tool, pLM4Alg [26], further cross-validated the allergenic potential of these cysteine proteases and indicated allergenicity comparable to that of Act d 1.
Cysteine proteases are increasingly recognized as clinically important plant food allergens, with several members of this protein family known to trigger IgE-mediated hypersensitivity reactions [27,28]. These enzymes, which play key roles in plant defense and ripening, are structurally stable and resistant to gastrointestinal digestion—features that contribute to their heightened allergenic potential [27,29]. Well-characterized examples include Act d 1 from kiwi, Ana c 2 from pineapple, and Cari p 1 from papaya, all of which have been associated with severe systemic allergic responses [27]. Their conserved structure and proteolytic function are believed to support both direct immune sensitization and disruption of epithelial barriers, increasing the likelihood of allergen exposure [30,31]. The high sequence identity and epitope similarity observed between mango cysteine proteases and Act d 1 point to a significant risk of cross-reactivity in sensitized individuals, aligning with the broader clinical significance of this allergen family.
Lipid transfer proteins (LTPs), which are primarily low-molecular-weight, heat-resistant proteins, are among the most well-characterized plant allergens and can be found in substantial quantities in fruits [32]. Interestingly, our findings demonstrated that only two non-specific lipid-transfer proteins (LTPs), corresponding to accession numbers XP_044477456.1 and XP_044465367.1 in the mango genome, were identified as allergens with strong evidence. Despite being matched to the known allergens Jug r 8 and Cry j LTP, these LTPs did not meet our high-confidence criteria. In contrast, among the weak-evidence allergens, a total of 24 LTPs met our high-confidence criteria (Table S2). Among these, a non-specific lipid-transfer protein 1-like protein (XP_044503745.1) matched 192 known LTPs, showing 71.2% identity in a linear 80 aa window and 85.7% identity in a 3D epitope. Notably, this mango LTP showed the greatest match with the chestnut (Castanea sativa) allergen Cas s 8 (Table S2, Figure 4B).
To further demonstrate the similarities among the LTP isoforms in the mango genome, we selected two more LTPs (XP_044473132.1 and XP_044473056.1) and performed multiple sequence alignment in addition to a pairwise comparison with the known LTP allergens listed in the WHO/IUIS Allergen Nomenclature Sub-committee database. Our findings demonstrated that mango LTPs exhibited a conserved epitope-binding region shared among plant LTPs and shared >50% amino acid sequence identity with all well-characterized fruit LTPs (Figure 5B). Cross-reactivity among LTPs from botanically distant fruits is well documented, with shared IgE-binding epitopes and overlapping T-cell reactivity contributing to the clinical phenomenon of LTP syndrome, where sensitization to one LTP (e.g., Pru p 3 from peach) can trigger allergic responses to multiple plant foods [33,34,35].
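Pairwise identity values of the kind summarized in the heatmaps (Figure 5) can be derived from an existing multiple sequence alignment, as sketched below. The sketch assumes that identity is the number of matching residues divided by the number of alignment columns in which both sequences have a residue; alignment tools may use a different denominator, so the values need not reproduce those reported here. Function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Percent identity for every unordered pair of sequences in the alignment."""
    return {
        (a, b): percent_identity(alignment[a], alignment[b])
        for a, b in combinations(alignment, 2)
    }


# Toy alignment (invented sequences, for illustration only).
toy = {"seq1": "MKT-AY", "seq2": "MKTGAF", "seq3": "MKTGAY"}
for pair, value in identity_matrix(toy).items():
    print(pair, f"{value:.1f}%")
```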

4. Conclusions

In conclusion, our study highlights the strength of genome-wide in silico approaches in identifying potential allergens in food sources that have not been fully explored before. Using the mango genome as a case study, we examined over 54,000 protein sequences, revealing a more comprehensive allergen profile than has been previously documented. In addition to confirming the known mango allergens, we also found new protein families, such as lipid transfer proteins and cysteine proteases, that have substantial similarities to recognized allergens from other plant sources. These proteins showed good matches in allergen databases, conserved epitope regions, and high sequence identity, suggesting a high chance of cross-reactivity in sensitized individuals. This improved understanding of the mango proteome may help explain the increasing number of clinical reports that associate mango with allergic reactions. Our findings also raise the possibility that a number of additional proteins with allergenic potential have gone unidentified because of the limited scope of previous research on mango allergy. Using thorough in silico methods, researchers can now screen entire genomes efficiently and affordably to determine which proteins require further investigation in clinical or experimental settings.
Overall, this work contributes to filling the gap in allergen knowledge for tropical fruits like mango and sets the stage for future research in allergy diagnostics and food safety. These findings may also help breeders and food scientists develop low-allergen cultivars and safer food products, ultimately benefiting individuals with food sensitivities around the world.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/app15158375/s1, Figure S1: Multiple sequence alignment of mango cysteine protease and known kiwi allergen Act d 1. Figure S2: B-cell epitope region of three potential mango cysteine proteases and known kiwi allergen Act d 1 identified by IEDB (http://tools.iedb.org/bcell/) following the Kolaskar and Tongaonkar method. Table S1: AllerCatPro database results of 54000 mango protein sequences. Table S2: List of high and low confidence potential mango allergens.

Author Contributions

Conceptualization, N.A.; Data curation and analysis, A.Z., A.S., A.N.H., and N.A.; Bioinformatics, N.A., A.S., A.Z., and Z.Y.; Primary draft of the manuscript, N.A., and A.S.; writing, review, and editing, N.A., A.S., A.N.H., A.Z., and Z.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data are presented in the main text, as well as the Supplementary Materials.

Acknowledgments

N.A. gratefully acknowledges the initial funding support from the OU VPRP Office for the establishment of the Proteomics Core Facility.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LTP: Lipid Transfer Protein
PR: Pathogenesis-Related
GPI: Glycosylphosphatidylinositol
EFSA: European Food Safety Authority
IEDB: Immune Epitope Database
pLM4Alg: Protein Language Model-Based Predictors for Allergenic Proteins and Peptides
CATAS: Chinese Academy of Tropical Agricultural Sciences

References

  1. Liu, W.; Yang, Q.; Wang, Z.; Wang, J.; Min, F.; Yuan, J.; Tong, P.; Li, X.; Wu, Y.; Gao, J.; et al. Quantitative food allergen risk assessment: Evolving concepts, modern approaches, and industry implications. Compr. Rev. Food Sci. Food Saf. 2025, 24, e70132.
  2. Präger, L.; Simon, J.C.; Treudler, R. Food allergy—New risks through vegan diet? Overview of new allergen sources and current data on the potential risk of anaphylaxis. J. Der Dtsch. Dermatol. Ges. 2023, 21, 1308–1313.
  3. Shaheen, N.; Hossen, M.d.S.; Akhter, K.T.; Halima, O.; Hasan, M.d.K.; Wahab, A.; Gamagedara, S.; Bhargava, K.; Holmes, T.; Najar, F.Z.; et al. Comparative Seed Proteome Profile Reveals No Alternation of Major Allergens in High-Yielding Mung Bean Cultivars. J. Agric. Food Chem. 2024, 72, 13957–13965.
  4. Guo, H.; Cong, Y. Recent advances in the study of epitopes, allergens and immunologic cross-reactivity of edible mango. Food Sci. Hum. Wellness 2024, 13, 1186–1194.
  5. Maurer-Stroh, S.; Krutz, N.L.; Kern, P.S.; Gunalan, V.; Nguyen, M.N.; Limviphuvadh, V.; Eisenhaber, F.; Gerberick, G.F. AllerCatPro-prediction of protein allergenicity potential from the protein sequence. Bioinformatics 2019, 35, 3020–3027.
  6. EFSA Panel on Genetically Modified Organisms (GMO Panel). Scientific Opinion on the assessment of allergenicity of GM plants and microorganisms and derived food and feed. EFSA J. 2010, 8, 1700.
  7. Krikeerati, T.; Rodsaward, P.; Nawiboonwong, J.; Pinyopornpanish, K.; Phusawang, S.; Sompornrattanaphan, M. Revisiting Fruit Allergy: Prevalence across the Globe, Diagnosis, and Current Management. Foods 2023, 12, 4083.
  8. Jeong, K.Y.; Lopata, A.L. Editorial: Spotlight on allergy research in Asia. Front. Allergy 2024, 5, 1371795.
  9. García-Villegas, A.; Fernández-Ochoa, Á.; Rojas-García, A.; Alañón, M.E.; Arráez-Román, D.; Cádiz-Gurrea, M.d.l.L.; Segura-Carretero, A. The Potential of Mangifera indica L. Peel Extract to Be Revalued in Cosmetic Applications. Antioxidants 2023, 12, 1892.
  10. Weber, C.; Simnitt, S.; Wechsler, S.J.; Wakefield, H. Fruit and Tree Nuts Outlook: September 2023|Economic Research Service. Available online: https://www.ers.usda.gov/publications/pub-details?pubid=107539 (accessed on 24 June 2025).
  11. Berghea, E.C.; Craiu, M.; Ali, S.; Corcea, S.L.; Bumbacea, R.S. Contact Allergy Induced by Mango (Mangifera indica): A Relevant Topic? Medicina 2021, 57, 1240.
  12. Ukleja-Sokołowska, N.; Gawrońska-Ukleja, E.; Lis, K.; Żbikowska-Gotz, M.; Sokołowski, Ł.; Bartuzi, Z. Anaphylactic reaction in patient allergic to mango. Allergy Asthma Clin. Immunol. 2018, 14, 78.
  13. Paschke, A.; Kinder, H.; Zunker, K.; Wigotzki, M.; Weßbecher, R.; Vieluf, D.; Steinhart, H. Characterization of Allergens in Mango Fruit and Ripening Dependence of the Allergenic Potency. Food Agric Immunol. 2001, 13, 51–61.
  14. Cardona, E.E.G.; Heathcote, K.; Teran, L.M.; Righetti, P.G.; Boschetti, E.; D’Amato, A. Novel low-abundance allergens from mango via combinatorial peptide libraries treatment: A proteomics study. Food Chem. 2018, 269, 652–660.
  15. Garino, C.; Coïsson, J.D.; Arlorio, M. In silico allergenicity prediction of several lipid transfer proteins. Comput. Biol. Chem. 2016, 60, 32–42.
  16. Kulkarni, A.; Ananthanarayan, L.; Raman, K. Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. Comput. Biol. Chem. 2013, 47, 149–155.
  17. Halima, O.; Najar, F.Z.; Wahab, A.; Gamagedara, S.; Chowdhury, A.I.; Foster, S.B.; Shaheen, N.; Ahsan, N. Lentil allergens identification and quantification: An update from omics perspective. Food Chem. 2022, 4, 100109.
  18. Bastiaan-Net, S.; Pina-Pérez, M.C.; Dekkers, B.J.W.; Westphal, A.H.; America, A.H.P.; Ariëns, R.M.C.; de Jong, N.W.; Wichers, H.J.; Mes, J.J. Identification and in silico bioinformatics analysis of PR10 proteins in cashew nut. Protein Sci. 2020, 29, 1581–1595.
  19. Jamakhani, M.; Lele, S.S.; Rekadwad, B. In silico assessment data of allergenicity and cross-reactivity of NP24 epitopes from Solanum lycopersicum (Tomato) fruit. Data Brief 2018, 21, 660–674.
  20. Zhou, W.; Bias, K.; Lenczewski-Jowers, D.; Henderson, J.; Cupp, V.; Ananga, A.; Ochieng, J.W.; Tsolova, V. Analysis of Protein Sequence Identity, Binding Sites, and 3D Structures Identifies Eight Pollen Species and Ten Fruit Species with High Risk of Cross-Reactive Allergies. Genes 2022, 13, 1464.
  21. Hernández-Lao, T.; Rodríguez-Pérez, R.; Labella-Ortega, M.; Muñoz Triviño, M.; Pedrosa, M.; Rey, M.-D.; Jorrín-Novo, J.V.; Castillejo-Sánchez, M.Á. Proteomic identification of allergenic proteins in holm oak (Quercus ilex) seeds. Food Chem. 2025, 464 Pt 1, 141667.
  22. Savić, A.; Mitić, D.; Smiljanić, K.; Radosavljević, J.; Stanić-Vučinić, D.; Ćirković Veličković, T. Ensemble-based in silico allergenicity prediction of black tiger shrimp Penaeus monodon. In Proceedings of the XVIII International Italian Proteomics Association Annual Meeting in partnership with the Hellenic Proteomics Society and Serbian Proteomics Association, Roma, Italy, 27–29 November 2024; p. 78.
  23. Nguyen, M.N.; Krutz, N.L.; Limviphuvadh, V.; Lopata, A.L.; Gerberick, G.F.; Maurer-Stroh, S. AllerCatPro 2.0: A web server for predicting protein allergenicity potential. Nucleic Acids Res. 2022, 50, W36–W43.
  24. Krutz, N.L.; Kimber, I.; Winget, J.; Nguyen, M.N.; Limviphuvadh, V.; Maurer-Stroh, S.; Mahony, C.; Gerberick, G.F. Application of AllerCatPro 2.0 for protein safety assessments of consumer products. Front. Allergy 2023, 4, 1209495.
  25. Kolaskar, A.S.; Tongaonkar, P.C. A semi-empirical method for prediction of antigenic determinants on protein antigens. FEBS Lett. 1990, 276, 172–174.
  26. Du, Z.; Xu, Y.; Liu, C.; Li, Y. pLM4Alg: Protein Language Model-Based Predictors for Allergenic Proteins and Peptides. J. Agric. Food Chem. 2024, 72, 752–760.
  27. Giangrieco, I.; Ciardiello, M.A.; Tamburrini, M.; Tuppo, L.; Rafaiani, C.; Mari, A.; Alessandri, C. Comparative Analysis of the Immune Response and the Clinical Allergic Reaction to Papain-like Cysteine Proteases from Fig, Kiwifruit, Papaya, Pineapple and Mites in an Italian Population. Foods 2023, 12, 2852.
  28. Suzuki, M.; Itoh, M.; Ohta, N.; Nakamura, Y.; Moriyama, A.; Matsumoto, T.; Ohashi, T.; Murakami, S. Blocking of protease allergens with inhibitors reduces allergic responses in allergic rhinitis and other allergic diseases. Acta Oto-Laryngol. 2006, 126, 746–751.
  29. Szewińska, J.; Simińska, J.; Bielawski, W. The roles of cysteine proteases and phytocystatins in development and germination of cereal seeds. J. Plant Physiol. 2016, 207, 10–21.
  30. Sharma, A.; Vashisht, S.; Mishra, R.; Gaur, S.N.; Prasad, N.; Lavasa, S.; Batra, J.K.; Arora, N. Molecular and immunological characterization of cysteine protease from Phaseolus vulgaris and evolutionary cross-reactivity. J. Food Biochem. 2022, 46, e14232.
  31. Soh, W.T.; Zhang, J.; Hollenberg, M.D.; Vliagoftis, H.; Rothenberg, M.E.; Sokol, C.L.; Robinson, C.; Jacquet, A. Protease allergens as initiators–regulators of allergic inflammation. Allergy 2023, 78, 1148–1168.
  32. Anagnostou, A. Lipid transfer protein allergy: An emerging allergy and a diagnostic challenge. Ann. Allergy Asthma Immunol. 2023, 130, 413–414.
  33. Andersen, M.-B.S.; Hall, S.; Dragsted, L.O. Identification of European allergy patterns to the allergen families PR-10, LTP, and profilin from Rosaceae fruits. Clin. Rev. Allergy Immunol. 2011, 41, 4–19.
  34. Costa, J.; Mafra, I. Rosaceae food allergy: A review. Crit. Rev. Food Sci. Nutr. 2023, 63, 7423–7460.
  35. Skypala, I.J.; Asero, R.; Barber, D.; Cecchi, L.; Diaz Perales, A.; Hoffmann-Sommergruber, K.; Pastorello, E.A.; Swoboda, I.; Bartra, J.; Ebo, D.G.; et al. Non-specific lipid-transfer proteins: Allergen structure and function, cross-reactivity, sensitization, and epidemiology. Clin. Transl. Allergy 2021, 11, e12010.
Figure 1. Bioinformatics pipeline for allergen prediction in Mangifera indica proteins using database screening, data visualization, and AI-based hypersensitivity analysis.
Figure 2. Scatter plot analysis of mango proteins with predicted allergenicity by AllerCatPro 2.0. (A) Proteins with strong allergen evidence, plotted by % identity over an 80 amino acid window (x-axis) and number of known allergen hits (y-axis). The dot color indicates 3D epitope identity (yellow to red gradient). (B) Proteins with weak allergen evidence, shown using the same axes; dot color represents 3D epitope identity (green to orange gradient).
Figure 3. In silico allergenicity classification of 54,010 mango proteins. (A) Pie chart showing proteins with strong (n = 1489; 3%), weak (n = 5277; 10%), and no (87%) allergen evidence. (B) Bar graph showing the number of mango proteins in the strong and weak categories for each known allergen protein family.
Figure 4. High-confidence mango allergen candidates identified using refined thresholds. (A) Proteins predicted as strong allergens (n = 63). Dot size indicates number of allergen hits; color shows % identity to 3D epitope. Several cysteine protease isoforms showed strong matches with kiwi allergen Act d 1. (B) Proteins predicted as weak allergens (n = 185) using the same thresholds.
Figure 5. Heatmap showing sequence identity between mango proteins and known allergens. (A) Cysteine protease proteins XP_044488586.1, XP_044511549.1, and XP_044509143.1 were aligned with cysteine protease allergens from various species. (B) Non-specific lipid-transfer proteins (LTPs) XP_044503745.1, XP_044473132.1, and XP_044473056.1 were aligned with allergens across multiple plant species. Color scale reflects percentage identity: red = high, yellow = moderate, green = low.

Singh, A.; Zarif, A.; Huynh, A.N.; Yang, Z.; Ahsan, N. Genome-Wide In Silico Analysis Expanding the Potential Allergen Repertoire of Mango (Mangifera indica L.). Appl. Sci. 2025, 15, 8375. https://doi.org/10.3390/app15158375
