Towards Detecting Chinese Harmful Memes with Fine-Grained Explanatory Augmentation
Abstract
1. Introduction
- We propose a novel explanation-enhanced detection framework built around a two-stage "explain first, then judge" mechanism. The framework first generates a high-quality, human-readable structured reasoning process (i.e., an explanation), and then leverages this explanation to strengthen the final classification decision. This design significantly improves the robustness and accuracy of harmful meme detection.
- To the best of our knowledge, we are the first to systematically integrate Chinese cultural background knowledge into a harmful meme detection model in an explicit and structured way, covering both the understanding and the generation of such knowledge. A carefully crafted "culture-aware prompt" activates and guides the MLLM to draw on the extensive knowledge acquired during pre-training, enabling it to decode the cultural connotations embedded in memes.
- We conducted extensive experiments on ToxiCN MM [10], the largest fine-grained annotated dataset of Chinese harmful memes, and compared our method against a range of strong baselines. The results demonstrate the effectiveness of the proposed method.
2. Related Works
2.1. Multimodal Harmful Meme Detection
2.2. Explainable Multimodal Learning
2.3. Chinese Content Security and Cultural Particularity
3. Materials and Methods
3.1. Multimodal Feature Alignment
3.2. Culture-Aware Explanation Generation
3.3. Explanation-Enhanced Decisions
4. Experimental Results and Analysis
4.1. Datasets and Baselines
4.2. Evaluation Metrics
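The reported metrics are precision (P), recall (R), F1, and the F1 of the harmful class (F1harm). A minimal sketch of how these could be computed with scikit-learn; macro-averaging over the two classes and the label convention 1 = harmful are our assumptions, not stated in this section:

```python
from sklearn.metrics import f1_score, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    # Macro-averaged precision, recall, and F1 over both classes (assumed averaging).
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    # F1 of the harmful class alone, assuming label 1 = harmful.
    f1_harm = f1_score(y_true, y_pred, pos_label=1, zero_division=0)
    # Scores are reported as percentages in the paper's tables.
    return {"P": 100 * p, "R": 100 * r, "F1": 100 * f1, "F1harm": 100 * f1_harm}
```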
4.3. Parameter Setting
4.4. Quantitative Analysis
4.5. Ablation Study
- "Structured output" provides stable gains. Removing the requirement for structured output causes a slight decline in performance (2.54%/1.81% relative drop in F1/F1harm). Forcing the model to reason and express itself along the logical chain of "cultural background → metaphor → harmfulness → target of attack" yields a more organized and focused reasoning process, and hence a stable improvement.
- "Cultural awareness" is the core. Replacing the specially designed culture-aware prompt with a generic prompt causes a significant drop in performance (3.39%/2.78%). Explicitly bringing the cultural background that a meme may imply into the model's analysis process therefore benefits harmful meme detection substantially.
- "Explanation-enhanced decision-making" is of paramount importance. Removing the explanation-enhanced decision-making module results in the largest single-module performance drop (9.65%/9.23%). Generating an explanation alone is insufficient; the high-quality explanation features must be fed back into the decision-making process to optimize the model's final performance.
- Structured representation and the culture-aware prompt have complementary gains. Removing the culture-aware prompt alone causes a 3.39% relative drop in F1; removing the structured representation as well (row 6 of the ablation table) deepens the drop to 10.3%, noticeably more than the sum of the two individual drops (3.39% + 2.54% = 5.93%). This super-additive degradation indicates that structured output provides controllable semantic slots for the cultural prompt, while the cultural prompt in turn improves how well the structured representation fits the Chinese-language context, forming a positive coupling between the two.
- The complete framework achieves robust detection through "divide and conquer". When all three modules are removed (last row), the model degrades into an ordinary black-box classifier, and F1 drops to 72.33, a 13.18% relative drop from the complete framework. This verifies the design hypothesis of FG-E2HMD:
- Structured representation is responsible for the controllable output at the “form” level.
- The culture-aware prompt is responsible for the alignment of cultural context at the “content” level.
- Explanation enhancement is responsible for the “logic” layer of verifiable reasoning.
- With the three modules combined, the model achieves the best F1 of 83.39 on Chinese harmful meme detection and shows notable tolerance to the local failure of individual modules.
Method | F1 | F1harm | Relative Drop (F1/F1harm) | Interpretation |
---|---|---|---|---|
Full FG-E2HMD | 83.39 | 74.43 | — | Complete framework containing all modules. |
W/O structured representation | 81.27 | 73.08 | 2.54%/1.81% | Do not force structured output; generate free-form text explanations. |
W/O culture-aware prompt | 80.56 | 72.36 | 3.39%/2.78% | Use a generic prompt (directly ask the model to "explain why it is harmful"). |
W/O explanation enhancement | 75.34 | 67.56 | 9.65%/9.23% | Remove the explanation-enhanced decision-making module and classify directly from the multimodal feature Fmm. |
W/O culture-aware prompt and explanation enhancement | 77.65 | 68.93 | 6.88%/7.39% | Jointly remove the culture-aware prompt and explanation enhancement. |
W/O structured representation and culture-aware prompt | 74.82 | 67.02 | 10.3%/9.9% | Jointly remove the culture-aware prompt and structured representation. |
W/O explanation enhancement and structured representation | 76.03 | 68.11 | 8.83%/8.50% | Jointly remove explanation enhancement and structured representation. |
W/O all three modules | 72.33 | 64.81 | 13.18%/12.92% | Remove all three modules. |
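To make the ablation variants concrete, here is a hypothetical sketch of how they map onto the inference pipeline from Appendix A.1; the flag and helper names (`build_prompt`, `CLASSIFIER_MM_ONLY`) are illustrative, not from the paper:

```python
def ablation_forward(I, T, structured=True, culture_aware=True, explanation=True):
    Fmm = multimodal_feature_alignment(I, T)
    if not explanation:
        # "W/O explanation enhancement": classify directly from Fmm,
        # skipping explanation generation and fusion entirely.
        return softmax(CLASSIFIER_MM_ONLY(Fmm))
    # Choose the prompt: culture-aware and/or structured; with both flags off,
    # this degenerates to a generic "explain why it is harmful" instruction.
    prompt = build_prompt(culture_aware=culture_aware, structured=structured)
    Texp = PRETRAINED_MLLM.generate(fill_template(prompt, Fmm))
    # Full pipeline: fuse Fmm with the generated explanation before deciding.
    return explanation_enhanced_decision(Fmm, Texp)
```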
5. Discussion
5.1. Case Study
- The model may occasionally misclassify a meme because of the benign surface information it contains.
- The model often fails to grasp the intricate cultural context embedded in Chinese memes, leading to erroneous judgments.
5.2. Limitations and Ethical Considerations
5.2.1. Dependence on Underlying MLLM Capabilities
5.2.2. Adaptation to New Memes
5.2.3. Potential Misuse Risks
6. Conclusions
Future Work
- Automatic knowledge updating mechanism: we will investigate methods to dynamically integrate external knowledge bases such as online encyclopedias, modern language dictionaries, and real-time news feeds into the model to further improve its ability to learn continuously and adapt to evolving meme trends.
- Cross-cultural multilingual meme detection: the proposed framework will be extended to other languages and cultural contexts (e.g., Japanese, Korean) to explore the development of a flexible, configurable content moderation system capable of operating effectively in multicultural environments.
- Zero-shot and few-shot learning: we will examine how the model can leverage its strong inference capabilities to achieve efficient, explanation-enhanced harmful content detection under data-scarce or zero-shot conditions, thereby broadening its applicability to emerging and low-resource risk scenarios.
- Lightweight explanation model: we will design a knowledge distillation framework to transfer knowledge from large explanation models to small ones, and introduce model pruning and quantization to reduce parameter count and computational complexity. We will also explore a simplified attention mechanism that retains key explanatory capability at lower computational overhead, and plan hierarchical explanation strategies that generate explanations in stages according to the harmfulness level of a meme (a generic sketch of the distillation loss follows below).
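As a generic illustration of the distillation direction mentioned in the last item, here is a standard temperature-scaled knowledge distillation loss in PyTorch; this is textbook KD, not the paper's method, and the temperature and mixing weight are placeholders:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: match the teacher's tempered output distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy on the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```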
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
MLLMs | Multimodal Large Language Models |
LLMs | Large Language Models |
RAG | Retrieval-Augmented Generation |
SOTA | State of the Art |
RoBERTa | Robustly Optimized BERT Pre-training Approach |
ViT | Vision Transformer |
BERT | Bidirectional Encoder Representations from Transformers |
Appendix A
Appendix A.1
```python
# -----------------------------------------------------------------------------
# Pseudocode: Towards Detecting Chinese Harmful Memes with Fine-Grained
# Explanatory Augmentation
# -----------------------------------------------------------------------------

# 1. Global constants and pre-trained models
PRETRAINED_MLLM = load_model("Qwen2.5-VL-7B")   # frozen weights
TEXT_ENCODER = load_model("RoBERTa")            # trainable weights
CLASSIFIER = MLP(input_dim=2 * hidden_dim, output_dim=2)  # harmful/harmless
CULTURE_AWARE_PROMPT = read_prompt_template("culture_aware_template.txt")


# 2. Multimodal feature alignment module
def multimodal_feature_alignment(I, T):
    """
    Input:  original image I and text T
    Output: unified multimodal representation Fmm extracted by the frozen MLLM
    """
    Fmm = PRETRAINED_MLLM.encode([I, T])  # [hidden_dim]
    return Fmm                            # shared by subsequent modules


# 3. Culture-aware explanation generation module
def culture_aware_explanation_generation(Fmm):
    """
    Input:  multimodal representation Fmm
    Output: structured explanation text Texp
    """
    prompt = fill_template(CULTURE_AWARE_PROMPT, Fmm)
    Texp = PRETRAINED_MLLM.generate(prompt)  # output after instruction fine-tuning
    return Texp


# 4. Explanation-enhanced decision-making module
def explanation_enhanced_decision(Fmm, Texp):
    """
    Input:  original multimodal feature Fmm and explanation text Texp
    Output: harmful probability distribution y_hat
    """
    # 4.1 Explanation text encoding
    Fexp = TEXT_ENCODER.encode(Texp)       # [hidden_dim]
    # 4.2 Feature fusion
    Ffinal = concatenate([Fmm, Fexp])      # [2 * hidden_dim]
    # 4.3 Classification
    logits = CLASSIFIER(Ffinal)            # dim = 2
    y_hat = softmax(logits)                # probability vector
    return y_hat


# 5. End-to-end inference process
def main(image_path, text_content):
    I = load_image(image_path)
    T = text_content
    Fmm = multimodal_feature_alignment(I, T)
    Texp = culture_aware_explanation_generation(Fmm)
    y_hat = explanation_enhanced_decision(Fmm, Texp)
    predicted_label = argmax(y_hat)  # 0: harmless, 1: harmful
    return {
        "label": predicted_label,
        "probability": y_hat,
        "explanation": Texp,
    }


# 6. Training loop (optional)
def train_step(batch):
    for (I, T, y) in batch:
        Fmm = multimodal_feature_alignment(I, T)
        Texp = culture_aware_explanation_generation(Fmm)
        y_hat = explanation_enhanced_decision(Fmm, Texp)
        loss = cross_entropy(y_hat, y)
        backpropagate(loss, parameters=[TEXT_ENCODER, CLASSIFIER])
        optimizer_step()
```
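For readers who want a concrete rendering of module 4, here is a minimal PyTorch sketch of the explanation-enhanced decision head; the hidden size of 768 and the two-layer MLP are our assumptions:

```python
import torch
import torch.nn as nn

class ExplanationEnhancedHead(nn.Module):
    """Fuses the frozen-MLLM feature Fmm with the RoBERTa-encoded explanation
    Fexp by concatenation, then classifies into harmful/harmless."""

    def __init__(self, hidden_dim=768, num_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, f_mm, f_exp):
        f_final = torch.cat([f_mm, f_exp], dim=-1)  # [batch, 2 * hidden_dim]
        logits = self.mlp(f_final)
        return torch.softmax(logits, dim=-1)        # harmful probability
```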
Appendix A.2
Parameter Category | Setting |
---|---|
Optimizer | AdamW |
Batch Size | 32 |
Learning Rate | 1 × 10⁻⁵ |
Training Epochs | 10 |
β1 | 0.9 |
β2 | 0.95 |
Weight Decay | 0.02 |
Models | Qwen2.5-VL-7B; RoBERTa (chinese-roberta-wwm-ext-base) |
GPUs | 2 × NVIDIA RTX A6000 |
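For reference, these hyperparameters correspond to the following optimizer construction (a sketch; `trainable_params` stands for the RoBERTa encoder and classifier parameters, the trainable components per Appendix A.1):

```python
from torch.optim import AdamW

optimizer = AdamW(
    trainable_params,     # RoBERTa text encoder + MLP classifier
    lr=1e-5,              # learning rate
    betas=(0.9, 0.95),    # β1, β2
    weight_decay=0.02,
)
```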
Appendix A.3
Model | P | R | F1 | F1harm |
---|---|---|---|---|
FG-E2HMD | 75.62 | 73.54 | 74.57 | 68.59 |
FG-E2HMD after 5 epochs of fine-tuning | 80.25 | 81.36 | 80.80 | 71.35 |
Appendix A.4
References
1. Zhuang, Y.; Guo, K.; Wang, J.; Jing, Y.; Xu, X.; Yi, W.; Yang, M.; Zhao, B.; Hu, H. I Know What You MEME! Understanding and Detecting Harmful Memes with Multimodal Large Language Models. In Proceedings of the 2025 Network and Distributed System Security Symposium, San Diego, CA, USA, 24–28 February 2025; Internet Society: San Diego, CA, USA, 2025.
2. Kiela, D.; Firooz, H.; Mohan, A.; Goswami, V.; Singh, A.; Ringshia, P.; Testuggine, D. The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes. arXiv 2021, arXiv:2005.04790.
3. Zhang, L.; Jin, L.; Sun, X.; Xu, G.; Zhang, Z.; Li, X.; Liu, N.; Liu, Q.; Yan, S. TOT: Topology-Aware Optimal Transport for Multimodal Hate Detection. arXiv 2023, arXiv:2303.09314.
4. Cao, R.; Hee, M.S.; Kuek, A.; Chong, W.-H.; Lee, R.K.-W.; Jiang, J. Pro-Cap: Leveraging a Frozen Vision-Language Model for Hateful Meme Detection. arXiv 2023, arXiv:2308.08088.
5. Gomez, R.; Gibert, J.; Gomez, L.; Karatzas, D. Exploring Hate Speech Detection in Multimodal Publications. arXiv 2019, arXiv:1910.03814.
6. Hee, M.S.; Lee, R.K.-W. Demystifying Hateful Content: Leveraging Large Multimodal Models for Hateful Meme Detection with Explainable Decisions. arXiv 2025, arXiv:2502.11073.
7. Yang, S.; Cui, S.; Hu, C.; Wang, H.; Zhang, T.; Huang, M.; Lu, J.; Qiu, H. Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings. arXiv 2025, arXiv:2505.24341.
8. Scott, K. Memes as Multimodal Metaphors: A Relevance Theory Analysis. Pragmat. Cogn. 2021, 28, 277–298.
9. Lin, H.; Luo, Z.; Gao, W.; Ma, J.; Wang, B.; Yang, R. Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models. arXiv 2024, arXiv:2401.13298.
10. Lu, J.; Xu, B.; Zhang, X.; Wang, H.; Zhu, H.; Zhang, D.; Yang, L.; Lin, H. Towards Comprehensive Detection of Chinese Harmful Memes. arXiv 2024, arXiv:2410.02378.
11. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805.
12. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2017, arXiv:1706.03762.
13. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
14. Ji, J.; Ren, W.; Naseem, U. Identifying Creative Harmful Memes via Prompt Based Approach. In Proceedings of the ACM Web Conference 2023, Austin, TX, USA, 30 April–4 May 2023; ACM: Austin, TX, USA, 2023; pp. 3868–3872.
15. Gu, T.; Feng, M.; Feng, X.; Wang, X. SCARE: A Novel Framework to Enhance Chinese Harmful Memes Detection. IEEE Trans. Affect. Comput. 2025, 16, 933–945.
16. Hee, M.S.; Chong, W.-H.; Lee, R.K.-W. Decoding the Underlying Meaning of Multimodal Hateful Memes. arXiv 2023, arXiv:2305.17678.
17. Hwang, E.; Shwartz, V. MemeCap: A Dataset for Captioning and Interpreting Memes. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; Bouamor, H., Pino, J., Bali, K., Eds.; Association for Computational Linguistics: Singapore, 2023; pp. 1433–1445.
18. Zhu, D.; Chen, J.; Shen, X.; Li, X.; Elhoseiny, M. MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models. arXiv 2023, arXiv:2304.10592.
19. Zhang, R.; Han, J.; Liu, C.; Gao, P.; Zhou, A.; Hu, X.; Yan, S.; Lu, P.; Li, H.; Qiao, Y. LLaMA-Adapter: Efficient Fine-Tuning of Language Models with Zero-Init Attention. arXiv 2023, arXiv:2303.16199.
20. Gao, P.; Han, J.; Zhang, R.; Lin, Z.; Geng, S.; Zhou, A.; Zhang, W.; Lu, P.; He, C.; Yue, X.; et al. LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model. arXiv 2023, arXiv:2304.15010.
21. Liu, H.; Li, C.; Wu, Q.; Lee, Y.J. Visual Instruction Tuning. arXiv 2023, arXiv:2304.08485.
22. Ye, Q.; Xu, H.; Xu, G.; Ye, J.; Yan, M.; Zhou, Y.; Wang, J.; Hu, A.; Shi, P.; Shi, Y.; et al. mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality. arXiv 2023, arXiv:2304.14178.
23. Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.; Le, Q.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2022, arXiv:2201.11903.
24. Pan, F.; Luu, A.T.; Wu, X. Detecting Harmful Memes with Decoupled Understanding and Guided CoT Reasoning. arXiv 2025, arXiv:2506.08477.
25. Meguellati, E.; Zeghina, A.; Sadiq, S.; Demartini, G. LLM-Based Semantic Augmentation for Harmful Content Detection. arXiv 2025, arXiv:2504.15548.
26. Bui, M.D.; von der Wense, K.; Lauscher, A. Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models. arXiv 2024, arXiv:2411.03888.
27. Ranjan, R.; Ayinala, L.; Vatsa, M.; Singh, R. Multimodal Zero-Shot Framework for Deepfake Hate Speech Detection in Low-Resource Languages. arXiv 2025, arXiv:2506.08372.
28. Rana, A.; Jha, S. Emotion Based Hate Speech Detection Using Multimodal Learning. arXiv 2022, arXiv:2202.06218.
29. Pramanick, S.; Dimitrov, D.; Mukherjee, R.; Sharma, S.; Akhtar, M.S.; Nakov, P.; Chakraborty, T. Detecting Harmful Memes and Their Targets. In Proceedings of the Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online Event, 1–6 August 2021; Zong, C., Xia, F., Li, W., Navigli, R., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 2783–2796.
30. Kumari, G.; Bandyopadhyay, D.; Ekbal, A.; NarayanaMurthy, V.B. CM-Off-Meme: Code-Mixed Hindi-English Offensive Meme Detection with Multi-Task Learning by Leveraging Contextual Knowledge. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20–25 May 2024; Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S., Xue, N., Eds.; ELRA and ICCL: Torino, Italy, 2024; pp. 3380–3393.
31. Modi, T.; Shah, E.; Shah, S.; Kanakia, J.; Tiwari, M. Meme Classification and Offensive Content Detection Using Multimodal Approach. In Proceedings of the 2024 OITS International Conference on Information Technology (OCIT), Vijayawada, India, 12 December 2024; pp. 635–640.
32. Prasad, N.; Saha, S.; Bhattacharyya, P. Multimodal Hate Speech Detection from Videos and Texts. Available online: https://easychair.org/publications/preprint/km3Rv/open (accessed on 25 July 2025).
33. Lin, H.; Luo, Z.; Ma, J.; Chen, L. Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 6–10 December 2023; Bouamor, H., Pino, J., Bali, K., Eds.; Association for Computational Linguistics: Singapore, 2023; pp. 9114–9128.
34. Pandiani, D.S.M.; Sang, E.T.K.; Ceolin, D. Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities. arXiv 2024, arXiv:2406.07353.
35. Liu, X.; Zhu, Y.; Lan, Y.; Yang, C.; Qiao, Y. Safety of Multimodal Large Language Models on Images and Texts. EasyChair Preprint 2023, 10743.
36. Sivaananth, S.V.; Sivagireeswaran, S.; Ravindran, S.; Nabi, F.G. Two-Stage Classification of Offensive Meme Content and Analysis. In Proceedings of the 2024 IEEE 8th International Conference on Information and Communication Technology (CICT), Prayagraj, India, 6–8 December 2024; pp. 1–6.
37. Rizwan, N.; Bhaskar, P.; Das, M.; Majhi, S.S.; Saha, P.; Mukherjee, A. Zero Shot VLMs for Hate Meme Detection: Are We There Yet? arXiv 2024, arXiv:2402.12198.
38. Briskilal, J.; Karthik, M.J.; Praneeth, S. Detection of Offensive Text in Memes Using Deep Learning Techniques. AIP Conf. Proc. 2024, 3075, 020232.
39. Kmainasi, M.B.; Hasnat, A.; Hasan, M.A.; Shahroor, A.E.; Alam, F. MemeIntel: Explainable Detection of Propagandistic and Hateful Memes. arXiv 2025, arXiv:2502.16612.
40. Tabassum, I.; Nunavath, V. A Hybrid Deep Learning Approach for Multi-Class Cyberbullying Classification Using Multi-Modal Social Media Data. Appl. Sci. 2024, 14, 12007.
41. Ke, W.; Chan, K.-H. A Multilayer CARU Framework to Obtain Probability Distribution for Paragraph-Based Sentiment Analysis. Appl. Sci. 2021, 11, 11344.
Prompt Template:

You are a multimodal content security expert with a deep understanding of Chinese culture and the online ecosystem. Please analyze the given meme and provide a structured analysis based on the following four aspects:

1. Cultural Background Analysis: Does this image or text involve specific cultural elements of China (such as historical events, characters, internet memes, proverbs, social phenomena, etc.)? Please specify their origin and meaning.
2. Metaphor and Subtext: How are the image and text content combined, and what are they implying, suggesting, or insinuating? What is the underlying subtext or metaphor?
3. Harm Assessment: Based on the above analysis, please determine whether the meme is harmful or harmless.
4. Target of Attack: If the meme is harmful, please indicate which specific individual, group, or organization is being targeted, attacked, satirized, or smeared.

Summarize the above generated content into 5 sentences, with no more than 50 words in each sentence.
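The explanations shown in the case study (Section 5.1) follow the slot structure this template elicits. A small sketch of how such output could be parsed back into its slots; the parser is ours, assuming the bracketed slot headers that appear in the case-study table:

```python
import re

SLOTS = [
    "Cultural Background Analysis",
    "Metaphor and Subtext",
    "Harmfulness Determination",
    "Target of Attack",
    "Conclusion",
]

def parse_structured_explanation(text):
    """Split "[Slot] sentence [Slot] sentence ..." into a slot -> sentence map."""
    parsed = {}
    for slot in SLOTS:
        m = re.search(rf"\[{re.escape(slot)}\]\s*(.*?)\s*(?=\[|$)", text, re.S)
        parsed[slot] = m.group(1) if m else None
    return parsed
```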
Category | Model | P | R | F1 | F1harm |
---|---|---|---|---|---|
Text-only | DeepSeek-V3 | 74.63 | 75.36 | 74.99 | 58.75 |
Text-only | RoBERTa | 75.52 | 77.54 | 76.36 | 66.48 |
Text-only | GPT4 | 74.52 | 65.59 | 68.01 | 51.78 |
Image-only | Image-Region | 65.96 | 66.06 | 66.01 | 52.95 |
Image-only | ResNet | 66.61 | 66.92 | 66.76 | 53.76 |
Image-only | ViT | 68.97 | 68.61 | 68.78 | 57.24 |
Multimodal | GPT4 | 74.67 | 68.64 | 70.11 | 55.77 |
Multimodal | CLIP+MKE | 79.76 | 80.79 | 80.23 | 72.35 |
Multimodal | VisualBERT COCO | 72.21 | 69.36 | 70.76 | 57.21 |
Multimodal | Hate-CLIPper | 73.56 | 68.52 | 70.95 | 60.36 |
Multimodal | MOMENTA | 74.05 | 69.88 | 71.90 | 62.45 |
Multimodal | PromptHate | 75.83 | 72.36 | 74.05 | 63.15 |
Multimodal | Qwen2.5-VL (zero-shot) | 70.25 | 73.36 | 71.77 | 58.76 |
Multimodal | Debate-based model | 79.93 | 81.16 | 79.91 | 70.53 |
Multimodal | FG-E2HMD | 83.53 | 83.26 | 83.39 | 74.43 |
Harmful Meme | (a) You Skr Ugly Thing | (b) Scientists are Working to Cure Them | (c) Girls Wash Clothes: Underwear, Dark Clothes, Light Clothes. Boys Wash Clothes. |
---|---|---|---|
Ground Truth | General Offensive, Harmful Text | Targeted Harmful, Text–Image Fusion | Targeted Harmful, Text–Image Fusion |
Text description | The sentence translates to “You ugly thing”, with “skr” being an internet slang term, overall expressing insult and dissatisfaction towards someone. | Scientists are working to cure diseases that affect women. | This is a reminder about laundry classification, suggesting that girls and boys should wash their clothes separately, and underwear, dark, and light clothes should also be washed separately. |
Image description | An exaggeratedly expressive and emotionally agitated man, with his fingers pointing forward, seemed to be shouting. | Two girls in wedding dresses are kissing each other. | Two comic strips, one showing a woman tidying up clothes for washing, the other showing a man casually throwing clothes into the wash, with a cat sitting below. |
Explanation from FG-E2HMD | [Cultural Background Analysis] The picture text combines the "skr" meme in Chinese internet culture, carrying a mocking and sarcastic meaning. [Metaphor and Subtext] The expression paired with the text implies a disparaging attitude, denigrating someone's appearance or behavior. [Harmfulness Determination] The meme is highly offensive and potentially harmful through its abusive language. [Target of Attack] Its main target is individuals referred to as "ugly things", the specifics of which are unclear. [Conclusion] Based on a comprehensive assessment, this meme is considered harmful content and should be disseminated with caution. | [Cultural Background Analysis] The image combines traditional wedding elements, and the text mentions "scientists are working hard to cure them," which implies a humorous interpretation of the phenomenon of same-sex weddings. [Metaphor and Subtext] The stereotype and bias towards sexual minorities are metaphorically represented through the kissing scene and the words "scientists are working hard to cure them". [Harmfulness Determination] The content is clearly insulting and offensive and belongs to harmful content. [Target of Attack] The main attack target is the sexual minority (LGBTQ+) group, reinforcing discrimination against them. [Conclusion] This meme is harmful content. It exploits stereotypes of sexual minorities, easily inciting insults and harm against the LGBTQ+ community, which is not conducive to an inclusive social environment. | [Cultural Background Analysis] The image draws on gender stereotypes in Chinese internet culture, emphasizing the meticulousness of girls and the roughness of boys. [Metaphor and Subtext] The comparison of washing clothes is a metaphor for the traditional social perception of behavioral differences between men and women. [Harmfulness Determination] The content is inoffensive and is meant primarily to entertain the audience and evoke resonance. [Target of Attack] No attacks or insults were made against specific groups. [Conclusion] It is a harmless product of online culture and is suitable for casual sharing. |
Prediction | GPT4: ✓ Qwen2.5-VL(zero-shot): ✓ Debate-based Model: ✓ FG-E2HMD: ✓ | GPT4: × Qwen2.5-VL(zero-shot): × Debate-based Model: × FG-E2HMD: ✓ | GPT4: × Qwen2.5-VL(zero-shot): × Debate-based Model: × FG-E2HMD: × |