292 Results Found

  • Article
  • Open Access
416 Views
24 Pages

Research on Technical Condition of Concrete Bridges Based on FastText+CNN

  • Shiwen Li,
  • Zhihai Deng,
  • Junguang Wang,
  • Xiaoguang Wu and
  • Qingyuan Feng

21 November 2025

Addressing the challenges of scarce measured data for Class 3–4 bridges and strong subjectivity in manual assessments in bridge technical-condition evaluation, this study innovatively proposes a FastText+CNN evaluation model that integrates sem...

  • Article
  • Open Access
4 Citations
2,590 Views
20 Pages

Integrated Model Text Classification Based on Multineural Networks

  • Wenjin Hu,
  • Jiawei Xiong,
  • Ning Wang,
  • Feng Liu,
  • Yao Kong and
  • Chaozhong Yang

Based on the original deep network architecture, this paper replaces the deep integrated network by integrating shallow FastText, a bidirectional gated recurrent unit (GRU) network and the convolutional neural networks (CNNs). In FastText, word embed...

  • Article
  • Open Access
7 Citations
3,968 Views
32 Pages

Digital Authorship Attribution in Russian-Language Fanfiction and Classical Literature

  • Anastasia Fedotova,
  • Aleksandr Romanov,
  • Anna Kurtukova and
  • Alexander Shelupanov

26 December 2022

This article is the third paper in a series aimed at the establishment of the authorship of Russian-language texts. This paper considers methods for determining the authorship of classical Russian literary texts, as well as fanfiction texts. The proc...

  • Article
  • Open Access
15 Citations
6,361 Views
24 Pages

22 December 2021

Authorship attribution is one of the important fields of natural language processing (NLP). Its popularity is due to the relevance of implementing solutions for information security, as well as copyright protection, various linguistic studies, in par...

  • Article
  • Open Access
2,355 Views
26 Pages

6 September 2024

In several applications of text classification, training document labels are provided by human evaluators, and therefore, gathering sufficient data for model creation is time consuming and costly. The labeling time and effort may be reduced by active...

  • Article
  • Open Access
37 Citations
6,941 Views
21 Pages

Arabic Sentiment Analysis Based on Word Embeddings and Deep Learning

  • Nasrin Elhassan,
  • Giuseppe Varone,
  • Rami Ahmed,
  • Mandar Gogate,
  • Kia Dashtipour,
  • Hani Almoamari,
  • Mohammed A. El-Affendi,
  • Bassam Naji Al-Tamimi,
  • Faisal Albalwy and
  • Amir Hussain

Social media networks have grown exponentially over the last two decades, providing the opportunity for users of the internet to communicate and exchange ideas on a variety of topics. The outcome is that opinion mining plays a crucial role in analyzi...

  • Article
  • Open Access
3 Citations
2,808 Views
22 Pages

28 November 2024

The expanding Arabic user base presents a unique opportunity for researchers to tap into vast online Arabic resources. However, the lack of reliable Arabic word embedding models and the limited availability of Arabic corpora pose significant challen...

  • Article
  • Open Access
25 Citations
5,497 Views
10 Pages

29 October 2021

In this article, we present the results of our experiments on sentiment and emotion recognition for English and Polish texts, aiming to work in the context of a therapeutic chatbot. We created a dedicated dataset by adding samples of neutral texts to...

  • Feature Paper
  • Article
  • Open Access
5 Citations
2,949 Views
23 Pages

8 December 2024

Traditional software effort estimation methods, such as term frequency–inverse document frequency (TF-IDF), are widely used due to their simplicity and interpretability. However, they struggle with limited datasets, fail to capture intricate se...

  • Article
  • Open Access
21 Citations
9,959 Views
16 Pages

Intent detection is one of the main tasks of a dialogue system. In this paper, we present our intent detection system that is based on fastText word embeddings and a neural network classifier. We find an improvement in fastText sentence vectorization...

  • Article
  • Open Access
51 Citations
8,308 Views
16 Pages

Disaster Image Classification by Fusing Multimodal Social Media Data

  • Zhiqiang Zou,
  • Hongyu Gan,
  • Qunying Huang,
  • Tianhui Cai and
  • Kai Cao

Social media datasets have been widely used in disaster assessment and management. When a disaster occurs, many users post messages in a variety of formats, e.g., image and text, on social media platforms. Useful information could be mined from these...

  • Article
  • Open Access
20 Citations
7,736 Views
17 Pages

Exploring Language Markers of Mental Health in Psychiatric Stories

  • Marco Spruit,
  • Stephanie Verkleij,
  • Kees de Schepper and
  • Floortje Scheepers

19 February 2022

Diagnosing mental disorders is complex due to the genetic, environmental and psychological contributors and the individual risk factors. Language markers for mental disorders can help to diagnose a person. Research thus far on language markers and th...

  • Article
  • Open Access
3 Citations
3,044 Views
18 Pages

13 June 2023

Currently, sentiment analysis is a research hotspot in many fields such as computer science and statistical science. Topic discovery of the literature in the field of text sentiment analysis aims to provide scholars with a quick and effective underst...

  • Article
  • Open Access
15 Citations
3,753 Views
21 Pages

Intent Detection Problem Solving via Automatic DNN Hyperparameter Optimization

  • Jurgita Kapočiūtė-Dzikienė,
  • Kaspars Balodis and
  • Raivis Skadiņš

22 October 2020

Accurate intent detection-based chatbots are usually trained on larger datasets that are not available for some languages. Seeking the most accurate models, three English benchmark datasets that were human-translated into four morphologically complex...

  • Article
  • Open Access
4 Citations
2,532 Views
22 Pages

Stemming vulnerabilities out of a smart contract prior to its deployment is essential to ensure the security of decentralized applications. As such, numerous tools and machine-learning-based methods have been proposed to help detect vulnerabilities i...

  • Article
  • Open Access
10 Citations
3,467 Views
24 Pages

Topic Mining and Future Trend Exploration in Digital Economy Research

  • Changlu Zhang,
  • Qiong Yang,
  • Jian Zhang,
  • Liming Gou and
  • Haojie Fan

1 August 2023

This work proposes a new literature topic clustering analysis framework, based on which the topics of digital-economy-related studies are condensed. First, we calculated the word vector of keywords using the FastText model, and then the keywords were...

  • Article
  • Open Access
6 Citations
5,006 Views
32 Pages

24 March 2023

Modern approaches to computing consumer price indices include the use of various data sources, such as web-scraped data or scanner data, which are very large in volume and need special processing techniques. In this paper, we address one of the main...

  • Article
  • Open Access
17 Citations
6,714 Views
23 Pages

AntiBP3: A Method for Predicting Antibacterial Peptides against Gram-Positive/Negative/Variable Bacteria

  • Nisha Bajiya,
  • Shubham Choudhury,
  • Anjali Dhall and
  • Gajendra P. S. Raghava

Most existing methods for predicting antibacterial peptides (ABPs) are designed to target either gram-positive or gram-negative bacteria. In this study, we describe a method that allows us to predict ABPs against gram-positive...

  • Article
  • Open Access
55 Citations
13,360 Views
22 Pages

25 March 2020

Accurate generative chatbots are usually trained on large datasets of question–answer pairs. Despite such datasets not existing for some languages, it does not reduce the need for companies to have chatbot technology in their websites. However,...

  • Communication
  • Open Access
7 Citations
7,952 Views
16 Pages

Due to instant availability of data on social media platforms like Twitter, and advances in machine learning and data management technology, real-time crisis informatics has emerged as a prolific research area in the last decade. Although several ben...

  • Article
  • Open Access
5 Citations
1,810 Views
24 Pages

3 October 2024

TextNetTopics is a novel topic modeling-based topic selection approach that finds highly ranked discriminative topics for training text classification models, where a topic is a set of semantically related words. However, it suffers from several limi...

  • Article
  • Open Access
23 Citations
5,539 Views
13 Pages

24 March 2020

The importance of cybersecurity has recently been increasing. A malware coder writes malware into normal executable files. A computer is more likely to be infected by malware when users have easy access to various executables. Malware is considered a...

  • Article
  • Open Access
11 Citations
5,696 Views
12 Pages

Determining the Age of the Author of the Text Based on Deep Neural Network Models

  • Aleksandr Sergeevich Romanov,
  • Anna Vladimirovna Kurtukova,
  • Artem Alexandrovich Sobolev,
  • Alexander Alexandrovich Shelupanov and
  • Anastasia Mikhailovna Fedotova

21 December 2020

This paper is devoted to solving the problem of determining the age of the author of the text based on models of deep neural networks. The article presents an analysis of methods for determining the age of the author of a text and approaches to deter...

  • Article
  • Open Access
926 Views
15 Pages

Large language models (LLMs) are increasingly applied to specialized domains like medical education, necessitating tailored approaches to evaluate structured responses such as SBAR (Situation, Background, Assessment, Recommendation). This study devel...

  • Article
  • Open Access
29 Citations
12,273 Views
16 Pages

18 May 2022

In this article, we address the problem of detecting anomalies in system log files. Computer systems generate huge numbers of events, which are noted in event log files. While most of them report normal actions, an unusual entry may inform about a fa...

  • Article
  • Open Access
1 Citation
2,922 Views
13 Pages

22 November 2023

In this study, we investigate knowledge transfer between two distinct sentence embedding models: a computationally demanding, highly performant model and a lightweight model derived from word vector averaging. Our objective is to augment the represen...

  • Article
  • Open Access
17 Citations
4,354 Views
14 Pages

Evaluating Polarity Trend Amidst the Coronavirus Crisis in Peoples’ Attitudes toward the Vaccination Drive

  • Rakhi Batra,
  • Ali Shariq Imran,
  • Zenun Kastrati,
  • Abdul Ghafoor,
  • Sher Muhammad Daudpota and
  • Sarang Shaikh

11 May 2021

It has been more than a year since the coronavirus (COVID-19) engulfed the whole world, disturbing the daily routine, bringing down the economies, and killing two million people across the globe at the time of writing. The pandemic brought the world...

  • Article
  • Open Access
58 Citations
5,677 Views
16 Pages

A Computational Framework Based on Ensemble Deep Neural Networks for Essential Genes Identification

  • Nguyen Quoc Khanh Le,
  • Duyen Thi Do,
  • Truong Nguyen Khanh Hung,
  • Luu Ho Thanh Lam,
  • Tuan-Tu Huynh and
  • Ngan Thi Kim Nguyen

28 November 2020

Essential genes contain key information of genomes that could be the key to a comprehensive understanding of life and evolution. Because of their importance, studies of essential genes have been considered a crucial problem in computational biology....

  • Article
  • Open Access
31 Citations
8,137 Views
25 Pages

21 July 2023

In the increasingly complex domain of Korean voice phishing attacks, advanced detection techniques are paramount. Traditional methods have achieved some degree of success. However, they often fail to detect sophisticated voice phishing attacks, highl...

  • Article
  • Open Access
12 Citations
4,338 Views
10 Pages

8 February 2022

The ‘intention’ classification of a user question is an important element of a task-engine driven chatbot. The essence of a user question’s intention understanding is the text classification. The transfer learning, such as BERT (Bid...

  • Article
  • Open Access
3 Citations
2,811 Views
23 Pages

26 October 2024

The detection of essays written by AI compared to those authored by students is increasingly becoming a significant issue in educational settings. This research examines various numerical text representation techniques to improve the classification o...

  • Article
  • Open Access
23 Citations
5,004 Views
16 Pages

An Enhanced Neural Word Embedding Model for Transfer Learning

  • Md. Kowsher,
  • Md. Shohanur Islam Sobuj,
  • Md. Fahim Shahriar,
  • Nusrat Jahan Prottasha,
  • Mohammad Shamsul Arefin,
  • Pranab Kumar Dhar and
  • Takeshi Koshiba

10 March 2022

Due to the expansion of data generation, more and more natural language processing (NLP) tasks need to be solved. For this, word representation plays a vital role. Computation-based word embedding in various high languages is very useful. Howe...

  • Article
  • Open Access
14 Citations
4,654 Views
26 Pages

24 August 2021

Electric vehicle (EV) charging infrastructure is present all over the United States, but charging prices vary greatly, both in amount and in the methods by which they are assessed. For this paper, we interpret and analyze charging price information f...

  • Article
  • Open Access
24 Citations
5,854 Views
16 Pages

Citation Context Analysis Using Combined Feature Embedding and Deep Convolutional Neural Network Model

  • Musarat Karim,
  • Malik Muhammad Saad Missen,
  • Muhammad Umer,
  • Saima Sadiq,
  • Abdullah Mohamed and
  • Imran Ashraf

21 March 2022

Citation creates a link between citing and the cited author, and the frequency of citation has been regarded as the basic element to measure the impact of research and knowledge-based achievements. Citation frequency has been widely used to calculate...

  • Article
  • Open Access
29 Citations
6,652 Views
20 Pages

In March 2020, the World Health Organisation declared that COVID-19 was a new pandemic. This deadly virus spread and affected many countries in the world. During the outbreak, social media platforms such as Twitter contributed valuable and massive am...

  • Article
  • Open Access
1,933 Views
20 Pages

Effective Context-Aware File Path Embeddings for Anomaly Detection

  • Ra-Kyung Lee,
  • Hyun-Min Song and
  • Taek-Young Youn

23 May 2025

In digital forensics, especially Windows forensics, identifying anomalous file paths is crucial when dealing with large-scale data. Traditional static embedding methods, which aggregate token-level representations, discard hierarchical and sequential...

  • Article
  • Open Access
4 Citations
4,173 Views
15 Pages

2 August 2022

The induction of the semantics of unstructured text corpora is a crucial task for modern natural language processing and artificial intelligence applications. The Named Entity Disambiguation task comprises the extraction of Named Entities and their l...

  • Article
  • Open Access
36 Citations
5,355 Views
18 Pages

Arabic Language Opinion Mining Based on Long Short-Term Memory (LSTM)

  • Arief Setyanto,
  • Arif Laksito,
  • Fawaz Alarfaj,
  • Mohammed Alreshoodi,
  • Kusrini,
  • Irwan Oyong,
  • Mardhiya Hayaty,
  • Abdullah Alomair,
  • Naif Almusallam and
  • Lilis Kurniasari

20 April 2022

Arabic is one of the official languages recognized by the United Nations (UN) and is widely used in the Middle East and parts of Asia, Africa, and other countries. Social media activity currently dominates the textual communication on the Internet a...

  • Article
  • Open Access
14 Citations
7,029 Views
23 Pages

20 March 2023

Over the past few years, word embeddings and bidirectional encoder representations from transformers (BERT) models have brought better solutions to learning text representations for natural language processing (NLP) and other tasks. Many NLP applicat...

  • Article
  • Open Access
1,263 Views
15 Pages

26 January 2025

Current research widely acknowledges that the subcellular localization of mRNA is crucial for understanding its biological functions. However, current methods for mRNA subcellular localization based on k-mer frequency features may overlook the sequen...

  • Article
  • Open Access
65 Citations
10,638 Views
16 Pages

Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches

  • Jurgita Kapočiūtė-Dzikienė,
  • Robertas Damaševičius and
  • Marcin Woźniak

We describe the sentiment analysis experiments that were performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial—NBM and Support Vector Machine—SVM) and deep learning (Long Sho...

  • Article
  • Open Access
11 Citations
4,085 Views
15 Pages

English–Welsh Cross-Lingual Embeddings

  • Luis Espinosa-Anke,
  • Geraint Palmer,
  • Padraig Corcoran,
  • Maxim Filimonov,
  • Irena Spasić and
  • Dawn Knight

16 July 2021

Cross-lingual embeddings are vector space representations where word translations tend to be co-located. These representations enable learning transfer across languages, thus bridging the gap between data-rich languages such as English and others. In...

  • Article
  • Open Access
2 Citations
3,305 Views
14 Pages

4 November 2023

Short message services (SMS), microblogging tools, instant message apps, and commercial websites produce numerous short text messages every day. These short text messages are usually guaranteed to reach mass audience with low cost. Spammers take adva...

  • Article
  • Open Access
188 Citations
24,169 Views
19 Pages

Transfer Learning for Sentiment Analysis Using BERT Based Supervised Fine-Tuning

  • Nusrat Jahan Prottasha,
  • Abdullah As Sami,
  • Md Kowsher,
  • Saydul Akbar Murad,
  • Anupam Kumar Bairagi,
  • Mehedi Masud and
  • Mohammed Baz

30 May 2022

The growth of the Internet has expanded the amount of data expressed by users across multiple platforms. The availability of these different worldviews and individuals’ emotions empowers sentiment analysis. However, sentiment analysis becomes e...

  • Article
  • Open Access
52 Citations
13,149 Views
17 Pages

30 December 2020

Due to the COVID-19 pandemic, the sales of fast-food businesses have dropped sharply. Customer satisfaction has always been one of the key factors for the sustainable development of enterprises. However, in the fast-food restaurant business, gaining...

  • Article
  • Open Access
1 Citation
1,472 Views
16 Pages

Text Alignment in the Service of Text Reuse Detection

  • Hadar Miller,
  • Tsvi Kuflik and
  • Moshe Lavee

20 March 2025

This study introduces a novel approach to text alignment tailored for ancient languages, with a focus on Hebrew and Aramaic, aimed at enhancing text reuse detection. Unlike previous methods, our approach integrates multiple NLP components into a spec...

  • Proceeding Paper
  • Open Access
1,571 Views
14 Pages

17 February 2025

The classification of sparse text, common in short or specialized content, is challenging for natural language processing. These challenges stem from high-dimensional data and scarce relevant features because sparse text can result from noisy, short,...

  • Article
  • Open Access
15 Citations
5,786 Views
26 Pages

Geo-Spatial Mapping of Hate Speech Prediction in Roman Urdu

  • Samia Aziz,
  • Muhammad Shahzad Sarfraz,
  • Muhammad Usman,
  • Muhammad Umar Aftab and
  • Hafiz Tayyab Rauf

14 February 2023

Social media has transformed into a crucial channel for political expression. Twitter, especially, is a vital platform used to exchange political hate in Pakistan. Political hate speech affects the public image of politicians, targets their supporter...

  • Article
  • Open Access
7 Citations
4,311 Views
19 Pages

Approaches for the Clustering of Geographic Metadata and the Automatic Detection of Quasi-Spatial Dataset Series

  • Javier Lacasta,
  • Francisco Javier Lopez-Pellicer,
  • Javier Zarazaga-Soria,
  • Rubén Béjar and
  • Javier Nogueras-Iso

The discrete representation of resources in geospatial catalogues affects their information retrieval performance. The performance could be improved by using automatically generated clusters of related resources, which we name quasi-spatial dataset s...
