MDPI - Publisher of Open Access Journals

17 pages, 8996 KiB

Open Access Article

The Impact of Ancient Greek Prompts on Artificial Intelligence Image Generation: A New Educational Paradigm

by Anna Kalargirou, Dimitrios Kotsifakos and Christos Douligeris

AI 2025, 6(4), 81; https://doi.org/10.3390/ai6040081 - 18 Apr 2025

Viewed by 1152

Background/Objectives: This article explores the use of Ancient Greek as a prompt language in DALL·E 3, an Artificial Intelligence software for image generation. The research investigates three dimensions of Artificial Intelligence’s ability: (a) the sense and visualization of the concept of distance, (b) the mixing of representational as well as mythic contents, and (c) the visualization of emotions. More specifically, the research not only investigates AI’s potentialities in processing and representing Ancient Greek texts but also attempts to assess its interpretative boundaries. The key question is whether AI can faithfully represent the underlying conceptual and narrative structures of ancient literature or whether its representations are superficial and constrained by algorithmic procedures. Methods: This is a mixed-methods experimental research design examining whether a specified Artificial Intelligence software can sense, understand, and graphically represent linguistic and conceptual structures in the Ancient Greek language. Results: The study highlights Artificial Intelligence’s potential in classical language education and the digital humanities, with regard to linguistic complexity versus AI’s power of interpretation.
Conclusions: The research is a step toward a more extensive discussion on Artificial Intelligence in historical linguistics, digital pedagogy, and aesthetic representation by Artificial Intelligence environments.

18 pages, 930 KiB

Open Access Case Report

Ontological Representation of the Structure and Vocabulary of Modern Greek on the Protégé Platform

by Nikoletta Samaridi, Evangelos Papakitsos and Nikitas Karanikolas

Computation 2024, 12(12), 249; https://doi.org/10.3390/computation12120249 - 23 Dec 2024

Cited by 1 | Viewed by 830

Abstract

One of the issues in Natural Language Processing (NLP) and Artificial Intelligence (AI) is language representation and modeling, aiming to manage its structure and find solutions to linguistic issues. In pursuit of the most efficient capture of knowledge about the Modern Greek language, and given the scientifically established usability of the ontological structuring of data in the fields of the semantic web and cognitive computing, this paper presents a new ontology of the Modern Greek language at the level of structure and vocabulary, built on the Protégé platform. Through this logical and structured form of knowledge representation, the research processes and exploits the distributed semantics of linguistic information in an easy and useful way.

(This article belongs to the Special Issue Recent Advances on Computational Linguistics and Natural Language Processing)

18 pages, 2293 KiB

Open Access Article

Social Media Topic Classification on Greek Reddit

by Charalampos Mastrokostas, Nikolaos Giarelis and Nikos Karacapilidis

Information 2024, 15(9), 521; https://doi.org/10.3390/info15090521 - 26 Aug 2024

Cited by 4 | Viewed by 1471

Abstract

Text classification (TC) is a subtask of natural language processing (NLP) that categorizes text pieces into predefined classes based on their textual content and thematic aspects. This process typically includes the training of a Machine Learning (ML) model on a labeled dataset, where each text example is associated with a specific class. Recent progress in Deep Learning (DL) has enabled the development of deep neural transformer models, which surpass traditional ML ones. However, the topic classification literature prioritizes high-resource languages, particularly English, while research efforts for low-resource ones, such as Greek, are limited. Taking the above into consideration, this paper presents: (i) the first Greek social media topic classification dataset; (ii) a comparative assessment of a series of traditional ML models trained on this dataset, utilizing an array of text vectorization methods including TF-IDF, classical word and transformer-based Greek embeddings; (iii) a fine-tuned GREEK-BERT-based TC model on the same dataset; (iv) key empirical findings demonstrating that transformer-based embeddings significantly increase the performance of traditional ML models, while our fine-tuned DL model outperforms previous ones. The dataset, the best-performing model, and the experimental code are made public, aiming to augment the reproducibility of this work and advance future research in the field.

(This article belongs to the Section Artificial Intelligence)

26 pages, 3537 KiB

Open Access Article

From Data to Insight: Transforming Online Job Postings into Labor-Market Intelligence

by Giannis Tzimas, Nikos Zotos, Evangelos Mourelatos, Konstantinos C. Giotopoulos and Panagiotis Zervas

Information 2024, 15(8), 496; https://doi.org/10.3390/info15080496 - 20 Aug 2024

Cited by 2 | Viewed by 4257

Abstract

In the continuously changing labor market, understanding the dynamics of online job postings is crucial for economic and workforce development. With the increasing reliance on Online Job Portals, analyzing online job postings has become an essential tool for capturing real-time labor-market trends. This paper presents a comprehensive methodology for processing online job postings to generate labor-market intelligence. The proposed methodology encompasses data source selection, data extraction, cleansing, normalization, and deduplication procedures. The final step involves information extraction based on employer industry, occupation, workplace, skills, and required experience. We address the key challenges that emerge at each step and discuss how they can be resolved. Our methodology is applied to two use cases: the first focuses on the analysis of the Greek labor market in the tourism industry during the COVID-19 pandemic, revealing shifts in job demands, skill requirements, and employment types. In the second use case, a data-driven ontology is employed to extract skills from job postings using machine learning. The findings highlight that the proposed methodology, utilizing NLP and machine-learning techniques instead of LLMs, can be applied to different labor market-analysis use cases and offer valuable insights for businesses, job seekers, and policymakers.

(This article belongs to the Special Issue Second Edition of Predictive Analytics and Data Science)

14 pages, 334 KiB

Open Access Article

Upon Improving the Performance of Localized Healthcare Virtual Assistants

by Nikolaos Malamas, Konstantinos Papangelou and Andreas L. Symeonidis

Healthcare 2022, 10(1), 99; https://doi.org/10.3390/healthcare10010099 - 4 Jan 2022

Cited by 13 | Viewed by 3838

Abstract

Virtual assistants are becoming popular in a variety of domains, responsible for automating repetitive tasks or allowing users to seamlessly access useful information. With the advances in Machine Learning and Natural Language Processing, there has been an increasing interest in applying such assistants in new areas and with new capabilities. In particular, their application in e-healthcare is becoming attractive and is driven by the need to access medically-related knowledge, as well as providing first-level assistance in an efficient manner. In such types of virtual assistants, localization is of utmost importance, since the general population (especially the aging population) is not familiar with the needed “healthcare vocabulary” to communicate facts properly; and state-of-practice proves relatively poor in performance when it comes to specialized virtual assistants for less frequently spoken languages. In this context, we present a Greek ML-based virtual assistant specifically designed to address some commonly occurring tasks in the healthcare domain, such as doctor’s appointments or distress (panic situations) management. We build on top of an existing open-source framework, discuss the necessary modifications needed to address the language-specific characteristics and evaluate various combinations of word embeddings and machine learning models to enhance the assistant’s behaviour. Results show that we are able to build an efficient Greek-speaking virtual assistant to support e-healthcare, while the NLP pipeline proposed can be applied in other (less frequently spoken) languages, without loss of generality.

(This article belongs to the Special Issue Smart Home Care)

17 pages, 669 KiB

Open Access Article

Extracting Semantic Relationships in Greek Literary Texts

by Despina Christou and Grigorios Tsoumakas

Sustainability 2021, 13(16), 9391; https://doi.org/10.3390/su13169391 - 21 Aug 2021

Cited by 8 | Viewed by 3068

Abstract

In the era of Big Data, the digitization of texts and the advancements in Artificial Intelligence (AI) and Natural Language Processing (NLP) are enabling the automatic analysis of literary works, allowing us to delve into the structure of artifacts and to compare, explore, manage and preserve the richness of our written heritage. This paper proposes a deep-learning-based approach to discovering semantic relationships in literary texts (19th century Greek Literature), facilitating the analysis, organization and management of collections through the automation of metadata extraction. Moreover, we provide a new annotated dataset used to train our model. Our proposed model, REDSandT_Lit, recognizes six distinct relationships, extracting the richest set of relations up to now from literary texts. It efficiently captures the semantic characteristics of the investigated time period by fine-tuning the state-of-the-art transformer-based Language Model (LM) for Modern Greek on our corpora. Extensive experiments and comparisons with existing models on our dataset reveal that REDSandT_Lit has superior performance (90% accuracy), manages to capture infrequent relations (100% F in long-tail relations) and can also correct mislabelled sentences. Our results suggest that our approach efficiently handles the peculiarities of literary texts, and it is a promising tool for managing and preserving cultural information in various settings.

(This article belongs to the Special Issue Cultural Heritage Storytelling, Engagement and Management in the Era of Big Data and the Semantic Web)

18 pages, 672 KiB

Open Access Article

Innovatively Fused Deep Learning with Limited Noisy Data for Evaluating Translations from Poor into Rich Morphology

by Despoina Mouratidis, Katia Lida Kermanidis and Vilelmini Sosoni

Appl. Sci. 2021, 11(2), 639; https://doi.org/10.3390/app11020639 - 11 Jan 2021

Cited by 3 | Viewed by 3166

Abstract

Evaluation of machine translation (MT) into morphologically rich languages has not been well studied despite its importance. This paper proposes a classifier, that is, a deep learning (DL) schema for MT evaluation, based on different categories of information (linguistic features, natural language processing (NLP) metrics and embeddings), by using a model for machine learning based on noisy and small datasets. The linguistic features are string based for the language pairs English (EN)–Greek (EL) and EN–Italian (IT). The paper also explores the linguistic differences that affect evaluation accuracy between different kinds of corpora. A comparative study between using a simple embedding layer (mathematically calculated) and pre-trained embeddings is conducted. Moreover, an analysis of the impact of feature selection and dimensionality reduction on classification accuracy has been conducted. Results show that using a neural network (NN) model with different input representations produces results that clearly outperform the state-of-the-art for MT evaluation for EN–EL and EN–IT, by an increase of almost 0.40 points in correlation with human judgments on pairwise MT evaluation. It is observed that the proposed algorithm achieved better results on noisy and small datasets. In addition, for a more integrated analysis of the accuracy results, a qualitative linguistic analysis has been carried out in order to address complex linguistic phenomena.

(This article belongs to the Special Issue Machine Learning Methods with Noisy, Incomplete or Small Datasets)

21 pages, 1348 KiB

Open Access Article

A Novel, Gradient Boosting Framework for Sentiment Analysis in Languages where NLP Resources Are Not Plentiful: A Case Study for Modern Greek

by Vasileios Athanasiou and Manolis Maragoudakis

Algorithms 2017, 10(1), 34; https://doi.org/10.3390/a10010034 - 6 Mar 2017

Cited by 46 | Viewed by 10302

Abstract

Sentiment analysis has played a primary role in text classification. It is an undoubted fact that some years ago, textual information was spreading at manageable rates; however, nowadays, such information has exceeded even the most ambitious expectations and constantly grows within seconds. It is therefore quite complex to cope with the vast amount of textual data, particularly if we also take the incremental production speed into account. Social media, e-commerce, news articles, comments and opinions are broadcast on a daily basis. A rational solution, in order to handle the abundance of data, would be to build automated information processing systems for analyzing and extracting meaningful patterns from text. The present paper focuses on sentiment analysis applied to Greek texts. Thus far, there is no wide availability of natural language processing tools for Modern Greek. Hence, a thorough analysis of Greek, from the lexical to the syntactical level, is difficult to perform. This paper attempts a different approach, based on the proven capabilities of gradient boosting, a well-known technique for dealing with high-dimensional data. The main rationale is that since English has dominated the area of preprocessing tools and there are also quite reliable translation services, we could exploit them to transform Greek tokens into English, thus assuring the precision of the translation, since the translation of large texts is not always reliable and meaningful. The new feature set of English tokens is augmented with the original set of Greek, consequently producing a high-dimensional dataset that poses certain difficulties for any traditional classifier. Accordingly, we apply gradient boosting machines, an ensemble algorithm that can learn with different loss functions, providing the ability to work efficiently with high-dimensional data.
Moreover, for the task at hand, we deal with a class imbalance issue, since the distribution of sentiments in real-world applications often displays issues of inequality. For example, in political forums or electronic discussions about immigration or religion, negative comments overwhelm the positive ones. The class imbalance problem was confronted using a hybrid technique that performs a variation of under-sampling the majority class and over-sampling the minority class, respectively. Experimental results, considering different settings, such as translation of tokens against translation of sentences, consideration of limited Greek text preprocessing and omission of the translation phase, demonstrated that the proposed gradient boosting framework can effectively cope with both high-dimensional and imbalanced datasets and performs significantly better than a plethora of traditional machine learning classification approaches in terms of precision and recall measures.

(This article belongs to the Special Issue Humanistic Data Processing)

Search Results (8)
