This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

CrossPhire: Benefiting Multimodality for Robust Phishing Web Page Identification

by Ahmad Hani Abdalla Almakhamreh 1,† and Ahmet Selman Bozkir 2,*,†
1 Institute of Graduate School, Hacettepe University, Ankara 06800, Turkey
2 Department of Computer Engineering, Hacettepe University, Ankara 06800, Turkey
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2026, 16(2), 751; https://doi.org/10.3390/app16020751
Submission received: 27 November 2025 / Revised: 4 January 2026 / Accepted: 9 January 2026 / Published: 11 January 2026
(This article belongs to the Special Issue AI-Driven Image and Signal Processing)

Abstract

Phishing attacks continue to evolve and exploit fundamental human impulses, such as trust and the need for a rapid response, as well as emotional triggers. This makes the human mind both a valuable asset and a significant vulnerability. The proliferation of zero-day vulnerabilities further exacerbates this threat landscape. To address these evolving challenges, we introduce CrossPhire, a multimodal deep learning framework with an end-to-end architecture that captures semantic and visual cues from multiple data modalities, while also providing methodological insights for anti-phishing multimodal learning. First, we demonstrate that markup-free semantic text encoding, in which sentence transformers are applied to HTML text stripped of structural markup, captures linguistic deception patterns more effectively than DOM-based approaches. It achieves 96–97% accuracy using textual content alone and provides the strongest single-modality signal. Second, through a controlled comparison of fusion strategies, we show that simple concatenation outperforms a more sophisticated gating mechanism, the so-called Mixture-of-Experts, by 0.5–10% when modalities provide complementary, non-redundant security evidence. We validate these insights through rigorous experimentation on five datasets, achieving competitive same-dataset performance (97.96–100%) while demonstrating promising cross-dataset generalization (85–96% accuracy under distribution shift). Additionally, we contribute Phish360, a rigorously curated multimodal benchmark with 10,748 samples that addresses quality issues in existing datasets (96.63% unique phishing HTML vs. 16–61% in prior benchmarks), and provide LIME-based explainability tools that decompose predictions into modality-specific contributions. The rapid inference time (0.08 s) and high accuracy position CrossPhire as a promising solution in the fight against phishing attacks.
Keywords: information security; computer vision; cybersecurity; machine learning; multimodality; phishing detection
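
The two methodological points in the abstract, markup-free text extraction and concatenation versus gated fusion, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `strip_markup`, `concat_fusion`, and `gated_fusion` are hypothetical, the embeddings are plain Python lists standing in for sentence-transformer and image-encoder outputs, and the gate logits are passed in by hand, whereas a Mixture-of-Experts gate would presumably be learned.

```python
from html.parser import HTMLParser
import math


class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def strip_markup(html):
    """Return the page text with all structural markup removed (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def concat_fusion(text_vec, image_vec):
    """Simple concatenation: both modality vectors are kept side by side."""
    return list(text_vec) + list(image_vec)


def gated_fusion(text_vec, image_vec, gate_logits):
    """Softmax-weighted sum of two equal-length vectors.

    gate_logits is a pair of scores (text, image); here it is an assumed
    input, standing in for the output of a learned gating network.
    """
    if len(text_vec) != len(image_vec):
        raise ValueError("gated fusion needs vectors of equal length")
    peak = max(gate_logits)
    exps = [math.exp(g - peak) for g in gate_logits]  # numerically stable softmax
    total = sum(exps)
    w_text, w_image = (e / total for e in exps)
    return [w_text * t + w_image * v for t, v in zip(text_vec, image_vec)]
```

In this sketch, concatenation preserves every dimension of each modality for the downstream classifier, while the gated variant blends the modalities into a single vector and can therefore down-weight one of them. That difference is one plausible reading of why concatenation fares better when the modalities carry complementary evidence.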
