Research on Machine Learning, Data Mining, Natural Language Processes, and Optimization Methods

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E1: Mathematics and Computer Science".

Deadline for manuscript submissions: 31 October 2026 | Viewed by 5249

Special Issue Editor

Dr. Anabel Fraga
Guest Editor
Department of Computer Science and Engineering, Universidad Carlos III de Madrid, 28911 Madrid, Spain
Interests: semantic interoperability; systems and software engineering; knowledge engineering

Special Issue Information

Dear Colleagues,

Artificial intelligence is a broad and active topic in applied mathematics. Interest extends beyond algorithms to the methods and methodologies needed to apply it ethically in today's world.

This Special Issue welcomes papers presenting new results and methods in machine learning, data science, natural language processing, and semantic interoperability, as well as their applications. Review articles will also be considered.

Dr. Anabel Fraga
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data mining
  • optimization
  • machine learning
  • natural language processing
  • semantic interoperability
  • patterns in data science

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

32 pages, 16166 KB  
Article
A Multimodal Ensemble-Based Framework for Detecting Fake News Using Visual and Textual Features
by Muhammad Abdullah, Hongying Zan, Arifa Javed, Muhammad Sohail, Orken Mamyrbayev, Zhanibek Turysbek, Hassan Eshkiki and Fabio Caraffini
Mathematics 2026, 14(2), 360; https://doi.org/10.3390/math14020360 - 21 Jan 2026
Viewed by 1970
Abstract
Detecting fake news is essential in natural language processing to verify news authenticity and prevent misinformation-driven social, political, and economic disruptions targeting specific groups. A major challenge in multimodal fake news detection is effectively integrating textual and visual modalities, as semantic gaps and contextual variations between images and text complicate alignment, interpretation, and the detection of subtle or blatant inconsistencies. To enhance accuracy in fake news detection, this article introduces an ensemble-based framework that integrates textual and visual data using ViLBERT’s two-stream architecture, incorporates VADER sentiment analysis to detect emotional language, and uses Image–Text Contextual Similarity to identify mismatches between visual and textual elements. These features are processed through the Bi-GRU classifier, Transformer-XL, DistilBERT, and XLNet, combined via a stacked ensemble method with soft voting, culminating in a T5 metaclassifier that predicts the outcome for robustness. Results on the Fakeddit and Weibo benchmarking datasets show that our method outperforms state-of-the-art models, achieving up to 96% and 94% accuracy in fake news detection, respectively. This study highlights the necessity for advanced multimodal fake news detection systems to address the increasing complexity of misinformation and offers a promising solution.

33 pages, 465 KB  
Article
A Multi-Stage NLP Framework for Knowledge Discovery from Crop Disease Research Literature
by Jantima Polpinij, Manasawee Kaenampornpan, Christopher S. G. Khoo, Wei-Ning Cheng and Bancha Luaphol
Mathematics 2026, 14(2), 299; https://doi.org/10.3390/math14020299 - 14 Jan 2026
Viewed by 769
Abstract
Extracting and organizing knowledge from the agricultural crop disease research literature are challenging tasks because of the heterogeneous terminologies, complicated symptom descriptions, and unstructured nature of scientific documents. In this study, we developed a multi-stage natural language processing (NLP) pipeline to automate knowledge extraction, organization, and integration from the agricultural research literature into a domain-consistent crop disease knowledge graph. The model combines transformer-based sentence embeddings with variational deep clustering to extract topics, which are further refined via facet-aware relevance scoring for sentence selection to be included in the summary. Lexicon-guided named entity recognition helps in the precise identification and normalization of terms for crops, diseases, symptoms, etc. Relation extraction based on a combination of lexical, semantic, and contextual features leads to the meaningful generation of triplets for the knowledge graph. The experimental results show that the method yielded consistently good results at each stage of the knowledge extraction process. Among the combinations of embedding and deep clustering methods, SciBERT + VaDE achieved the best clustering results. The extraction of representative sentences for disease symptoms, control/treatment, and prevention obtained high F1-scores of around 0.8. The resulting knowledge graph has high node coverage and high relation completeness, as well as high precision and recall in triplet generation. The multi-stage NLP pipeline effectively converts unstructured agricultural research texts into a coherent and semantically rich knowledge graph, providing a basis for further research in crop disease analysis, knowledge retrieval, and data-driven decision support in agricultural informatics.

22 pages, 551 KB  
Article
A Readability-Driven Curriculum Learning Method for Data-Efficient Small Language Model Pretraining
by Suyun Kim, Jungwon Park and Juae Kim
Mathematics 2025, 13(20), 3300; https://doi.org/10.3390/math13203300 - 16 Oct 2025
Viewed by 1585
Abstract
Large language models demand substantial computational and data resources, motivating approaches that improve the training efficiency of small language models. While curriculum learning methods based on linguistic difficulty measures have been explored as a potential solution, prior approaches that rely on complex linguistic indices are often computationally expensive, difficult to interpret, or fail to yield consistent improvements. Moreover, existing methods rarely incorporate the cognitive and linguistic efficiency observed in human language acquisition. To address these gaps, we propose a readability-driven curriculum learning method based on the Flesch Reading Ease (FRE) score, which provides a simple, interpretable, and cognitively motivated measure of text difficulty. Across two dataset configurations and multiple curriculum granularities, our method yields consistent improvements over baseline models without curriculum learning, achieving substantial gains on BLiMP and MNLI. Reading behavior evaluations also reveal human-like sensitivity to textual difficulty. These findings demonstrate that a lightweight, interpretable curriculum design can enhance small language models under strict data constraints, offering a practical path toward more efficient training.
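
To make the Flesch Reading Ease idea in the abstract above concrete, here is a minimal sketch that scores texts with the standard FRE formula (206.835 − 1.015 × words per sentence − 84.6 × syllables per word) and orders them from easiest to hardest. It is an illustration under stated assumptions, not the authors' pipeline: the syllable counter is a crude vowel-group heuristic, the sentence splitter is a simple regular expression, and the names `count_syllables`, `flesch_reading_ease`, and `order_easy_to_hard` are hypothetical.

```python
import re
from typing import Iterable, List


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count vowel groups, discount a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Standard FRE formula; higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def order_easy_to_hard(texts: Iterable[str]) -> List[str]:
    """Sort texts by descending FRE, i.e. easiest first, as a curriculum might."""
    return sorted(texts, key=flesch_reading_ease, reverse=True)


easy = "The cat sat."
hard = "Computational heterogeneity complicates interpretation."
curriculum = order_easy_to_hard([hard, easy])
```

A curriculum built this way would present `easy` before `hard`. How the scored texts are grouped into stages (the "curriculum granularities" in the abstract) is not specified here and is left out of the sketch.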