13 Results Found

  • Article
  • Open Access
1,465 Views
22 Pages

25 March 2025

While script identification is the first step in many natural language processing and text mining tasks, at present, there is no open-source script identification algorithm for text. For this reason, we analyze the Unicode encoding of each type of sc...

  • Article
  • Open Access
4 Citations
3,769 Views
17 Pages

Robotic Writing of Arbitrary Unicode Characters Using Paintbrushes

  • David Silvan Zingrebe,
  • Jörg Marvin Gülzow and
  • Oliver Deussen

Human handwriting is an everyday task performed regularly by most people. In the domain of robotic painting, multiple calligraphy machines exist which were built to replicate some aspects of human artistic writing; however, most projects are limited...

  • Article
  • Open Access

9 February 2026

Script identification is the first step in most multilingual text-processing systems. To improve the time efficiency of script identification algorithms, whether there is content written in a certain script in the text is first determined; if so, the...

  • Article
  • Open Access
1 Citation
4,170 Views
24 Pages

Convolutional Neural Network Based Ensemble Approach for Homoglyph Recognition

  • Md. Taksir Hasan Majumder,
  • Md. Mahabur Rahman,
  • Anindya Iqbal and
  • M. Sohel Rahman

Homoglyphs are pairs of visual representations of Unicode characters that look similar to the human eye. Identifying homoglyphs is extremely useful for building a strong defence mechanism against many phishing and spoofing attacks, ID imitation, prof...

  • Article
  • Open Access
4 Citations
5,231 Views
20 Pages

Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity

  • Sanjana Gunna,
  • Rohit Saluja and
  • Cheerakkuzhi Veluthemana Jawahar

Reading Indian scene texts is complex due to the use of regional vocabulary, multiple fonts/scripts, and text size. This work investigates the significant differences in Indian and Latin Scene Text Recognition (STR) systems. Recent STR works rely on...

  • Article
  • Open Access
1,248 Views
36 Pages

A Survey of Printable Encodings

  • Marco Botta,
  • Davide Cavagnino,
  • Alessandro Druetto,
  • Maurizio Lucenteforte and
  • Annunziata Marra

12 August 2025

The representation of binary data in a compact, printable, efficient, and often human-readable format is essential in numerous computing applications, mainly driven by the limitations of systems and communication protocols not designed to handle arbi...

  • Article
  • Open Access
2,324 Views
13 Pages

14 October 2021

Deep learning models have been widely used in natural language processing tasks, yet researchers have recently proposed several methods to fool the state-of-the-art neural network models. Among these methods, word importance ranking is an essential p...

  • Article
  • Open Access
5 Citations
4,442 Views
18 Pages

A Syllable-Based Technique for Uyghur Text Compression

  • Wayit Abliz,
  • Hao Wu,
  • Maihemuti Maimaiti,
  • Jiamila Wushouer,
  • Kahaerjiang Abiderexiti,
  • Tuergen Yibulayin and
  • Aishan Wumaier

23 March 2020

To improve utilization of text storage resources and efficiency of data transmission, we proposed two syllable-based Uyghur text compression coding schemes. First, according to the statistics of syllable coverage of the corpus text, we constructed a...

  • Article
  • Open Access
3 Citations
3,131 Views
16 Pages

Hiding the Source Code of Stored Database Programs

  • Vitalii Yesin,
  • Mikolaj Karpinski,
  • Maryna Yesina,
  • Vladyslav Vilihura and
  • Kornel Warwas

9 December 2020

The objective of the article is to reveal an approach to hiding the code of stored programs stored in the database. The essence of this approach is the complex use of the method of random permutation of code symbols related to a specific stored progr...

  • Article
  • Open Access
1 Citation
1,794 Views
11 Pages

11 November 2024

Script identification is easier to implement than language identification, and its identification rate is very high. The fewer languages are identified when using a language identification algorithm, the higher the identification rate is. However, no...

  • Article
  • Open Access
3 Citations
4,043 Views
18 Pages

Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

  • Anna Scius-Bertrand,
  • Michael Jungo,
  • Beat Wolf,
  • Andreas Fischer and
  • Marc Bui

26 May 2021

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages a...

  • Article
  • Open Access
15 Citations
10,577 Views
18 Pages

Homoglyph Attack Detection Model Using Machine Learning and Hash Function

  • Abdullah M. Almuhaideb,
  • Nida Aslam,
  • Almaha Alabdullatif,
  • Sarah Altamimi,
  • Shooq Alothman,
  • Amnah Alhussain,
  • Waad Aldosari,
  • Shikah J. Alsunaidi and
  • Khalid A. Alissa

Phishing is still a major security threat in cyberspace. In phishing, attackers steal critical information from victims by presenting a spoofing/fake site that appears to be a visual clone of a legitimate site. Several Unicode characters are visually...

  • Article
  • Open Access
1 Citation
2,279 Views
27 Pages

Extracting Geoscientific Dataset Names from the Literature Based on the Hierarchical Temporal Memory Model

  • Kai Wu,
  • Zugang Chen,
  • Xinqian Wu,
  • Guoqing Li,
  • Jing Li,
  • Shaohua Wang,
  • Haodong Wang and
  • Hang Feng

Extracting geoscientific dataset names from the literature is crucial for building a literature–data association network, which can help readers access the data quickly through the Internet. However, the existing named-entity extraction methods...