
634 Results Found

  • Article
  • Open Access
23 Citations
5,844 Views
33 Pages

23 January 2022

With the rapid proliferation of social networking sites (SNS), automatic topic extraction from various text messages posted on SNS is becoming an important source of information for understanding current social trends or needs. Latent Dirichlet Allo...

  • Article
  • Open Access
2 Citations
4,568 Views
22 Pages

The fast growth of data in the academic field has contributed to making recommendation systems for scientific papers more popular. Content-based filtering (CBF), a pivotal technique in recommender systems (RS), holds particular significance in the re...

  • Article
  • Open Access
24 Citations
6,936 Views
11 Pages

3 February 2020

In this paper, we study the feasibility of performing fuzzy information retrieval by word embedding. We propose a fuzzy information retrieval approach to capture the relationships between words and query language, which combines some techniques of de...

  • Article
  • Open Access
14 Citations
7,008 Views
23 Pages

20 March 2023

Over the past few years, word embeddings and bidirectional encoder representations from transformers (BERT) models have brought better solutions to learning text representations for natural language processing (NLP) and other tasks. Many NLP applicat...

  • Article
  • Open Access
7 Citations
4,757 Views
10 Pages

Learning Subword Embedding to Improve Uyghur Named-Entity Recognition

  • Alimu Saimaiti,
  • Lulu Wang and
  • Tuergen Yibulayin

15 April 2019

Uyghur is a morphologically rich and typical agglutinating language, and morphological segmentation affects the performance of Uyghur named-entity recognition (NER). Common Uyghur NER systems use the word sequence as input and rely heavily on feature...

  • Article
  • Open Access
15 Citations
5,055 Views
27 Pages

Automatic Correction of Real-Word Errors in Spanish Clinical Texts

  • Daniel Bravo-Candel,
  • Jésica López-Hernández,
  • José Antonio García-Díaz,
  • Fernando Molina-Molina and
  • Francisco García-Sánchez

21 April 2021

Real-word errors are characterized by being actual terms in the dictionary. By providing context, real-word errors are detected. Traditional methods to detect and correct such errors are mostly based on counting the frequency of short word sequences...

  • Article
  • Open Access
21 Citations
4,331 Views
15 Pages

31 July 2021

In this paper, we propose a novel information criteria-based approach to select the dimensionality of the word2vec Skip-gram (SG). From the perspective of the probability theory, SG is considered as an implicit probability distribution estimation und...

  • Article
  • Open Access
855 Views
28 Pages

17 November 2025

This study is intended to evaluate and contrast the performance of varying combinations of embedding algorithms and weighting systems in measuring perception-based text similarity using the Cosine Similarity approach. Within a structured experiment d...

  • Review
  • Open Access
2 Citations
1,880 Views
22 Pages

Biological Sequence Representation Methods and Recent Advances: A Review

  • Hongwei Zhang,
  • Yan Shi,
  • Yapeng Wang,
  • Xu Yang,
  • Kefeng Li,
  • Sio-Kei Im and
  • Yu Han

27 August 2025

Biological-sequence representation methods are pivotal for advancing machine learning in computational biology, transforming nucleotide and protein sequences into formats that enhance predictive modeling and downstream task performance. This review c...

  • Article
  • Open Access
19 Citations
7,187 Views
20 Pages

Increasingly, the web produces massive volumes of texts, alone or associated with images, videos, photographs, together with some metadata, indispensable for their finding and retrieval. Keywords/keyphrases that characterize the semantic content of d...

  • Article
  • Open Access
17 Citations
7,467 Views
18 Pages

20 September 2020

Exchange rate forecasting has been an important topic for investors, researchers, and analysts. In this study, financial sentiment analysis (FSA) and time series analysis (TSA) are proposed to form a predicting model for US Dollar/Turkish Lira exchan...

  • Article
  • Open Access
7 Citations
5,931 Views
18 Pages

Leveraging Large Language Models for Sensor Data Retrieval

  • Alberto Berenguer,
  • Adriana Morejón,
  • David Tomás and
  • Jose-Norberto Mazón

15 March 2024

The growing significance of sensor data in the development of information technology services finds obstacles due to disparate data presentations and non-adherence to FAIR principles. This paper introduces a novel approach for sensor data gathering a...

  • Article
  • Open Access
3 Citations
5,373 Views
49 Pages

Interpretable Topic Extraction and Word Embedding Learning Using Non-Negative Tensor DEDICOM

  • Lars Hillebrand,
  • David Biesner,
  • Christian Bauckhage and
  • Rafet Sifa

Unsupervised topic extraction is a vital step in automatically extracting concise contentual information from large text corpora. Existing topic extraction methods lack the capability of linking relations between these topics which would further help...

  • Article
  • Open Access
1 Citation
1,540 Views
21 Pages

Old Wine in New Wineskins: Applying Computational Methods in New Testament Hermeneutics

  • Christian Houth Vrangbæk,
  • Eva Elisabeth Houth Vrangbæk and
  • Jacob Mortensen

31 December 2024

New Testament studies has over the past years seen an increase in the use of digital methods, but some of the more advanced methods still lack proper integration. This article explores some of the advantages and disadvantages in employing computation...

  • Article
  • Open Access
4 Citations
3,531 Views
19 Pages

3 November 2024

Sentiment analysis utilizes Natural Language Processing (NLP) techniques to extract opinions from text, which is critical for businesses looking to refine strategies and better understand customer feedback. Understanding people’s sentiments abo...

  • Article
  • Open Access
4 Citations
3,121 Views
18 Pages

This paper presents a mathematical analysis of semantic convergence in transformer-based language models, drawing inspiration from the concept of fractal self-similarity. We introduce and prove a novel theorem characterizing the gradient of embedding...

  • Article
  • Open Access
23 Citations
4,999 Views
16 Pages

An Enhanced Neural Word Embedding Model for Transfer Learning

  • Md. Kowsher,
  • Md. Shohanur Islam Sobuj,
  • Md. Fahim Shahriar,
  • Nusrat Jahan Prottasha,
  • Mohammad Shamsul Arefin,
  • Pranab Kumar Dhar and
  • Takeshi Koshiba

10 March 2022

Due to the expansion of data generation, more and more natural language processing (NLP) tasks need to be solved. For this, word representation plays a vital role. Computation-based word embedding in various high languages is very useful. Howe...

  • Article
  • Open Access
6 Citations
4,240 Views
31 Pages

20 April 2024

In the evolving field of machine learning, deploying fair and transparent models remains a formidable challenge. This study builds on earlier research, demonstrating that neural architectures exhibit inherent biases by analyzing a broad spectrum of t...

  • Article
  • Open Access
4 Citations
3,835 Views
31 Pages

Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models. We present in detail a method that is especially designed with the requirements of domain experts in mind. Like simila...

  • Article
  • Open Access
3 Citations
1,790 Views
14 Pages

3 November 2023

The use of different techniques and tools is a common practice to cover all stages in the development life-cycle of systems generating a significant number of work products. These artefacts are frequently encoded using diverse formats, and often requ...

  • Article
  • Open Access
1 Citation
3,541 Views
20 Pages

DAWE: A Double Attention-Based Word Embedding Model with Sememe Structure Information

  • Shengwen Li,
  • Renyao Chen,
  • Bo Wan,
  • Junfang Gong,
  • Lin Yang and
  • Hong Yao

21 August 2020

Word embedding is an important reference for natural language processing tasks, which can generate distribution presentations of words based on many text data. Recent evidence demonstrates that introducing sememe knowledge is a promising strategy to...

  • Article
  • Open Access
4,022 Views
12 Pages

29 December 2019

To overcome the data sparseness in word embedding trained in low-resource languages, we propose a punctuation and parallel corpus based word embedding model. In particular, we generate the global word-pair co-occurrence matrix with the punctuation-ba...

  • Article
  • Open Access
3 Citations
4,431 Views
21 Pages

Towards Context-Aware Opinion Summarization for Monitoring Social Impact of News

  • Alejandro Ramón-Hernández,
  • Alfredo Simón-Cuevas,
  • María Matilde García Lorenzo,
  • Leticia Arco and
  • Jesús Serrano-Guerrero

18 November 2020

Opinion mining and summarization of the increasing user-generated content on different digital platforms (e.g., news platforms) are playing significant roles in the success of government programs and initiatives in digital governance, from extracting...

  • Article
  • Open Access
5 Citations
2,472 Views
13 Pages

11 December 2022

The purpose of cross-domain sentiment classification (CDSC) is to fully utilize the rich labeled data in the source domain to help the target domain perform sentiment classification even when labeled data are insufficient. Most of the existing method...

  • Article
  • Open Access
1 Citation
1,855 Views
35 Pages

Set-Word Embeddings and Semantic Indices: A New Contextual Model for Empirical Language Analysis

  • Pedro Fernández de Córdoba,
  • Carlos A. Reyes Pérez,
  • Claudia Sánchez Arnau and
  • Enrique A. Sánchez Pérez

20 January 2025

We present a new word embedding technique in a (non-linear) metric space based on the shared membership of terms in a corpus of textual documents, where the metric is naturally defined by the Boolean algebra of all subsets of the corpus and a measure...

  • Article
  • Open Access
12 Citations
3,503 Views
10 Pages

Word2vec Word Embedding-Based Artificial Intelligence Model in the Triage of Patients with Suspected Diagnosis of Major Ischemic Stroke: A Feasibility Study

  • Antonio Desai,
  • Aurora Zumbo,
  • Mauro Giordano,
  • Pierandrea Morandini,
  • Maria Elena Laino,
  • Elena Azzolini,
  • Andrea Fabbri,
  • Simona Marcheselli,
  • Alice Giotta Lucifero and
  • Antonio Voza
  • + 1 author

Background: The possible benefits of using semantic language models in the early diagnosis of major ischemic stroke (MIS) based on artificial intelligence (AI) are still underestimated. The present study strives to assay the feasibility of the word2v...

  • Article
  • Open Access
35 Citations
6,884 Views
18 Pages

Today, increasing numbers of people are interacting online and a lot of textual comments are being produced due to the explosion of online communication. However, a paramount inconvenience within online environments is that comments that are shared w...

  • Article
  • Open Access
74 Citations
13,436 Views
19 Pages

A Text Abstraction Summary Model Based on BERT Word Embedding and Reinforcement Learning

  • Qicai Wang,
  • Peiyu Liu,
  • Zhenfang Zhu,
  • Hongxia Yin,
  • Qiuyue Zhang and
  • Lindong Zhang

4 November 2019

As a core task of natural language processing and information retrieval, automatic text summarization is widely applied in many fields. There are two existing methods for text summarization task at present: abstractive and extractive. On this basis w...

  • Article
  • Open Access
38 Citations
6,605 Views
21 Pages

18 September 2020

Detecting cybersecurity intelligence (CSI) on social media such as Twitter is crucial because it allows security experts to respond to cyber threats in advance. In this paper, we devise a new text classification model based on deep learning to classify...

  • Article
  • Open Access
41 Citations
5,225 Views
25 Pages

Beyond Word-Based Model Embeddings: Contextualized Representations for Enhanced Social Media Spam Detection

  • Sawsan Alshattnawi,
  • Amani Shatnawi,
  • Anas M.R. AlSobeh and
  • Aws A. Magableh

7 March 2024

As social media platforms continue their exponential growth, so do the threats targeting their security. Detecting disguised spam messages poses an immense challenge owing to the constant evolution of tactics. This research investigates advanced arti...

  • Article
  • Open Access
117 Citations
13,863 Views
17 Pages

27 November 2021

Sentiment analysis (SA) detects people’s opinions from text engaging natural language processing (NLP) techniques. Recent research has shown that deep learning models, i.e., Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), an...

  • Article
  • Open Access
2 Citations
4,208 Views
15 Pages

10 June 2022

Text vectorization is the basic work of natural language processing tasks. High-quality vector representation with rich feature information can guarantee the quality of entity recognition and other downstream tasks in the field of traditional Chinese...

  • Article
  • Open Access
3 Citations
2,182 Views
38 Pages

A Domain Generation Algorithm (DGA) employs botnets to generate domain names through a communication link between the C&C server and the bots. A DGA can generate pseudo-random AGDs (algorithmically generated domains) regularly, a handy method for...

  • Article
  • Open Access
11 Citations
4,294 Views
17 Pages

20 June 2021

Successful cyber-attacks are caused by the exploitation of some vulnerabilities in the software and/or hardware that exist in systems deployed in premises or the cloud. Although hundreds of vulnerabilities are discovered every year, only a small frac...

  • Article
  • Open Access
31 Citations
8,115 Views
25 Pages

21 July 2023

In the increasingly complex domain of Korean voice phishing attacks, advanced detection techniques are paramount. Traditional methods have achieved some degree of success. However, they often fail to detect sophisticated voice phishing attacks, highl...

  • Article
  • Open Access
35 Citations
4,099 Views
24 Pages

Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes

  • Damien Bouchabou,
  • Sao Mai Nguyen,
  • Christophe Lohr,
  • Benoit LeDuc and
  • Ioannis Kanellos

14 October 2021

Long Short Term Memory (LSTM)-based structures have demonstrated their efficiency for daily living recognition activities in smart homes by capturing the order of sensor activations and their temporal dependencies. Nevertheless, they still fail in de...

  • Article
  • Open Access
3 Citations
3,337 Views
17 Pages

Entropy-Based Approach for the Detection of Changes in Arabic Newspapers’ Content

  • Olga Bernikova,
  • Oleg Granichin,
  • Dan Lemberg,
  • Oleg Redkin and
  • Zeev Volkovich

14 April 2020

A new method for the recognition of meaningful changes in social state based on transformations of the linguistic content in Arabic newspapers is suggested. The detected alterations of the linguistic material in Arabic newspapers play an indicator ro...

  • Article
  • Open Access
2 Citations
4,533 Views
16 Pages

3 November 2020

This article presents a novel approach inspired by the modern exploration of short texts’ patterning to creations ascribed to the outstanding Islamic jurist, theologian, and mystical thinker Abu Hamid Al Ghazali. We treat the task with the g...

  • Article
  • Open Access
1,913 Views
20 Pages

Effective Context-Aware File Path Embeddings for Anomaly Detection

  • Ra-Kyung Lee,
  • Hyun-Min Song and
  • Taek-Young Youn

23 May 2025

In digital forensics, especially Windows forensics, identifying anomalous file paths is crucial when dealing with large-scale data. Traditional static embedding methods, which aggregate token-level representations, discard hierarchical and sequential...

  • Article
  • Open Access
3 Citations
7,090 Views
16 Pages

Archetype-Based Modeling and Search of Social Media

  • Brent D. Davis,
  • Kamran Sedig and
  • Daniel J. Lizotte

Existing keyword-based search techniques suffer from limitations owing to unknown, mismatched, and obscure vocabulary. These challenges are particularly prevalent in social media, where slang, jargon, and memetics are abundant. We develop a new techn...

  • Article
  • Open Access
11 Citations
4,082 Views
15 Pages

English–Welsh Cross-Lingual Embeddings

  • Luis Espinosa-Anke,
  • Geraint Palmer,
  • Padraig Corcoran,
  • Maxim Filimonov,
  • Irena Spasić and
  • Dawn Knight

16 July 2021

Cross-lingual embeddings are vector space representations where word translations tend to be co-located. These representations enable learning transfer across languages, thus bridging the gap between data-rich languages such as English and others. In...

  • Article
  • Open Access
5 Citations
3,469 Views
22 Pages

22 September 2021

Hashtags are considered important in various real-world applications, including tweet mining, query expansion, and sentiment analysis. Hence, recommending hashtags from tagged tweets has been considered significant by the research community. However,...

  • Article
  • Open Access
61 Citations
5,044 Views
17 Pages

28 August 2021

Spreading rumors on social media is considered a cybercrime that affects people, societies, and governments. For instance, some criminals create rumors and send them over the internet, then other people help spread them. Spreading rumors can...

  • Article
  • Open Access
5 Citations
3,501 Views
12 Pages

Semantic Similarity of Common Verbal Expressions in Older Adults through a Pre-Trained Model

  • Marcos Orellana,
  • Patricio Santiago García,
  • Guillermo Daniel Ramon,
  • Jorge Luis Zambrano-Martinez,
  • Andrés Patiño-León,
  • María Verónica Serrano and
  • Priscila Cedillo

Health problems in older adults lead to situations where communication with peers, family and caregivers becomes challenging for seniors; therefore, it is necessary to use alternative methods to facilitate communication. In this context, Augmentative...

  • Review
  • Open Access
6 Citations
7,436 Views
28 Pages

Metaphors are an integral and important part of human communication and greatly impact the way our thinking is formed and how we understand the world. The theory of the conceptual metaphor has shifted the focus of research from words to thinking, and...

  • Article
  • Open Access
1,041 Views
17 Pages

An Order-Sensitive Hierarchical Neural Model for Early Lung Cancer Detection Using Dutch Primary Care Notes and Structured Data

  • Iacopo Vagliano,
  • Miguel Rios,
  • Mohanad Abukmeil,
  • Martijn C. Schut,
  • Torec T. Luik,
  • Kristel M. van Asselt,
  • Henk C. P. M. van Weert and
  • Ameen Abu-Hanna

29 March 2025

Background: Improving prediction models to timely detect lung cancer is paramount. Our aim is to develop and validate prediction models for early detection of lung cancer in primary care, based on free-text consultation notes, that exploit the order...

  • Article
  • Open Access
1 Citation
2,962 Views
25 Pages

Detecting Fake News in Urdu Language Using Machine Learning, Deep Learning, and Large Language Model-Based Approaches

  • Muhammad Shoaib Farooq,
  • Syed Muhammad Asadullah Gilani,
  • Muhammad Faraz Manzoor and
  • Momina Shaheen

10 July 2025

Fake news is false or misleading information that looks like real news and spreads through traditional and social media. It has a big impact on our social lives, especially in politics. In Pakistan, where Urdu is the main language, finding fake news...

  • Article
  • Open Access
8 Citations
2,890 Views
18 Pages

1 December 2021

Due to the accelerated growth of symmetrical sentiment data across different platforms, experimenting with different sentiment analysis (SA) techniques allows for better decision-making and strategic planning for different sectors. Specifically, the...

  • Article
  • Open Access
2 Citations
2,730 Views
19 Pages

12 April 2022

In multilingual textual archives, the availability of textual annotation, that is keywords either manually or automatically associated with texts, is something worth exploiting to improve user experience and successful navigation, search and visualiz...

  • Article
  • Open Access
15 Citations
3,940 Views
19 Pages

Chinese Named Entity Recognition Based on Knowledge Based Question Answering System

  • Didi Yin,
  • Siyuan Cheng,
  • Boxu Pan,
  • Yuanyuan Qiao,
  • Wei Zhao and
  • Dongyu Wang

26 May 2022

The KBQA (Knowledge-Based Question Answering) system is an essential part of the smart customer service system. KBQA is a type of QA (Question Answering) system based on KB (Knowledge Base). It aims to automatically answer natural language questions...
