Article

Developing IQJournalism: An Intelligent Advisor for Predicting the Perceived Quality in Greek News Articles

by Catherine Sotirakou 1,*, Panagiotis Germanakos 1,2, Anastasia Karampela 1,* and Constantinos Mourlas 1

1 Faculty of Communication and Media Studies, National & Kapodistrian University of Athens, 10679 Athens, Greece
2 Cloud ERP User Experience, SAP Product & Engineering, SAP SE, 69190 Walldorf, Germany
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(13), 2552; https://doi.org/10.3390/electronics14132552
Submission received: 30 April 2025 / Revised: 3 June 2025 / Accepted: 13 June 2025 / Published: 24 June 2025
(This article belongs to the Special Issue Advances in HCI Research)

Abstract

Technological developments and the integration of social media into journalistic practices have transformed the media landscape, changing how information is gathered, produced, and shared. This evolution poses challenges, including the lack of clear guidelines and practical tools for ensuring the quality of digital news content. To address these issues, IQJournalism, an intelligent quality prediction advisor, was developed. This paper outlines the methodology for the development of IQJournalism, a platform that leverages advanced AI technologies to process Greek news articles and provide real-time editing recommendations on various dimensions, including language quality, subjectivity level, emotionality, entertainment, and social media engagement. First, a qualitative study was conducted through semi-structured, in-depth interviews with 20 experts (academic researchers and media professionals) to identify indicators of perceived quality in journalism. These insights were then transformed into measurable features, which served as training data for explainable machine learning-based models for quality categorization and prediction. Finally, the IQJournalism platform was designed following a user-centered iterative process that included prototyping, testing, and redesigning. This innovative approach aims to serve as a valuable tool for improving journalistic quality, contributing to more reliable and engaging online news content. Importantly, the platform is not limited to the journalistic sector, but can also be used to optimize content in various areas, such as marketing, political, and strategic communication, supporting editors seeking to improve the quality and impact of their writing.

1. Introduction

Technological developments and the integration of social media have transformed the media ecosystem, reshaping the processes of collection, production, and dissemination of news. While these changes offer new opportunities, they have also introduced challenges, such as the absence of specific standards for assessing the quality of online news. Journalistic principles such as impartiality and integrity are often compromised, particularly in smaller news organizations, where sales and advertising pressures, political agendas, and powerful media ownership interfere with editorial independence [1,2]. News editors frequently overlook ethical standards, prioritizing market-driven goals or clickbait techniques, such as sensational headlines to increase page views and audience engagement metrics, thus undermining the credibility of the media over time.
Given these challenges, researchers and media professionals are exploring how AI can support high-quality journalism in the digital era. This research presents an innovative approach by integrating expert-based quality indicators, explainable machine learning and iterative design, specifically tailored to the Greek media ecosystem. Unlike other tools that rely on general linguistic or grammatical correction, IQJournalism provides interpretable recommendations aligned with editorial needs.
More specifically, this study introduces IQJournalism, an AI-powered intelligent advisor designed to guide journalists in improving the overall quality of articles and predicting social media success. IQJournalism adopts a mixed-methods research approach [3], combining semi-structured in-depth interviews with big data analysis to identify factors that improve the quality and engagement of journalistic content. Guided by definitions of journalism as a medium for informing and empowering citizens [4,5], the project emphasizes key quality parameters, including subject matter, linguistic attributes, writing style, and audience reception, to support journalists in creating credible and impactful news. Also leveraging Natural Language Processing (NLP) and user-centered design principles, the system provides real-time feedback on various dimensions, including language quality, subjectivity level, emotionality, and entertainment. By fostering readability and engagement, IQJournalism improves the dissemination of news articles, discourages clickbait practices, and offers applications in various fields.
This paper makes several theoretical and practical contributions. Theoretically, it proposes a framework for defining journalistic quality based on expert knowledge and validated through machine learning. It contributes to the fields of journalism studies, AI, and human–computer interaction by demonstrating how explainable AI can be aligned with professional principles in content production. Practically, the study offers a system that helps editors by supporting content improvement in both journalism and other content areas. In particular, IQJournalism is designed to support various editorial roles throughout the content production workflow. Journalists can take advantage of the tool to improve language clarity and assess levels of subjectivity and emotionality. Audience editors and social media specialists can also benefit by understanding how their content influences engagement on digital platforms. Beyond journalism, the platform can assist writers in fields such as marketing and political communication who aim to produce credible and engaging content. The adaptability of the platform to different editorial contexts and its user-friendly interface further underscore its practical utility.
The study addresses a clear research gap: while AI tools to support writing are proliferating, few are based on quality definitions specific to the media field. This research fills this gap by offering a system adapted to the Greek journalistic context, with the potential for broader adaptation to other contexts. The main objectives of this work are to identify journalistic quality indicators defined by experts and transform them into measurable characteristics, to develop and evaluate machine learning models that predict article quality and social media engagement, to design and test a user-centered interface for editorial support, and to demonstrate how AI tools can improve the quality and credibility of online news.
The expected findings include the validation of a theoretically grounded quality framework using high-performance machine learning models, the identification of key predictive features of both quality and engagement, as well as strong usability results from user testing. Overall, these results demonstrate that IQJournalism can provide meaningful feedback to journalists and editors, improving both content quality and audience trust in the digital age.

Background Work

Advances in AI have profoundly transformed journalism, improving multiple stages of the news production and distribution process. As noted by Simon [6], common applications of AI systems in news organizations include supporting journalists in accessing and observing information through tools for discovery, audience analysis, story detection, and idea generation. In selection and filtering, AI tools help with fact-checking, content categorization, automated data collection, transcription, and translation, improving accuracy and efficiency. In editing and processing, AI assists in brainstorming, drafting, editing, content reformatting, and SEO optimization. Finally, in publishing and distribution, AI enhances content personalization and paywall management, thus promoting greater audience engagement.
According to Verma [7], these automations not only improve efficiency but also free journalists to focus on more complex aspects of their work. The integration of AI is seen as a strategic development in journalistic practices allowing for deeper research by enhancing human characteristics such as insight, empathy, and investigative rigor. This shift marks a key development in the media sector, where AI technologies, rather than replacing editorial judgment, ultimately enhance the quality of journalism.
In recent years, intelligent authoring tools and natural language generators have incorporated sophisticated methods, such as the integration of Large Language Models (LLMs), that enable authors to generate content based on given instructions. Significant advancements in LLMs, such as ChatGPT (https://chatgpt.com/ (accessed on 29 April 2025)), Gemini (https://gemini.google.com/app (accessed on 29 April 2025)), and Claude (https://claude.ai/ (accessed on 29 April 2025)), along with their widespread incorporation into everyday products, highlight their potential as powerful authoring assistants [8].
Based on these advancements, we conducted a thorough examination of various online editors and writing assistants to identify the specific needs and challenges journalists encounter when composing, editing, and revising text guided by system recommendations, as well as when organizing text files and folders. More specifically, we studied the functionality of the widely used application “Grammarly” (https://www.grammarly.com/ (accessed on 29 April 2025)). This is an award-winning online writing assistant, often used by English as a foreign language (EFL) speakers, that detects and corrects linguistic errors, including grammatical and spelling mistakes, irregular verb conjugations, inappropriate noun usage, and incorrect word choices, and detects instances of plagiarism [9]. The system operates through an interconnected network powered by AI techniques, such as machine learning, deep learning, and NLP. It is important to note that Fitria’s [9] descriptive qualitative research demonstrated a significant improvement in students’ writing ability, with test performance scores rising from 34 to 77 out of 100 after using “Grammarly.”
Additionally, “Wordcraft” is an LLM-supported writing tool that assists in rewriting, summarizing, and stylistic editing. The features of this application are powered by LaMDA, a neural language model trained on Google data, such as public Web pages, forum discussions, and Wikipedia content. Refinement through conversational data has produced a chatbot-style interface. The user evaluation of the application showed that “Wordcraft” enhanced engagement, proved effective, shortened the writing process, enabled the development of long-form content, promoted the incorporation of AI-generated inputs, and improved overall user satisfaction [10].
“Wordtune” (https://www.wordtune.com/ (accessed on 29 April 2025)) surpasses the basic text editing functions, exploiting AI to interpret the writer’s intent. It provides rewording options in either informal or formal tones, and adjusts the length of the text by either shortening or expanding it. It also has the ability to change sentence structures and replace words with synonyms, while maintaining the original context [11]. Furthermore, “WINGS”, a Chinese data input method designed to enhance the writing experience through real-time suggestions, was presented by Dai et al. [12]. The system provides context-appropriate suggestions, including syntactic and semantic options, derived from text corpus mining using NLP techniques, such as word vector representation and the Latent Dirichlet Allocation (LDA) topic model. The study revealed the effectiveness of “WINGS” in supporting creative writing.
Alongside these developments, Stefnisson and Thue [13] developed “Mimisbrunnur”, an interactive authoring tool that integrates AI, NLP, and a mixed-initiative framework. Specifically, the system comprises three main components: a state editor for defining the foundational facts of the narratives, an editor that controls the potential actions, and a goal editor that enables writers to define the outcomes for the story. An alternative research approach explored the use of conversational agents to support the development of fictional characters [14]. This tool, named “CharacterChat”, assists writers in defining character traits through chatbot prompts and enables editors to refine these features through dialogue. Findings from two user studies demonstrated the effectiveness of “CharacterChat”, especially when inventing new fictional characters. Lastly, Osone and Ochiai [15] introduced “BunCho”, an online writing environment tailored for Japanese novelists, designed to stimulate creativity by exploiting GPT-2. User feedback reflected high satisfaction, with 69% of writers reporting enjoyment while using the platform to enrich their storytelling. Additional analysis also revealed that 69% of summaries created with the tool showed enhanced creativity.
This paper examines the development process of the IQJournalism system following a combination of qualitative and quantitative methods. For each phase of the system’s development, research questions and hypotheses were formulated to guide the process and confirm the outcomes, based on previous findings from journalism studies, AI-assisted writing and user-centered design. The paper begins by presenting key insights from 20 semi-structured interviews with journalism experts, identifying crucial indicators of journalistic quality. Previous research [1,16] has shown that accurate and informative headlines have a positive impact on readers’ perception of the credibility and quality of content, leading to H1: Experts will argue that the decreased accuracy of the headline of a story undermines the overall quality and attractiveness of the journalistic product. Research findings [16,17] indicating that emotionality undermines news quality formed the following hypothesis: H2: Experts of our research are likely to assert that the intense use of emotionally charged discourse reduces the overall quality of the journalistic product. Furthermore, the inclusion of relevant audiovisual material, such as images and videos, has been associated with increased credibility in journalism [18], supporting H3: Experts of this research are expected to contend that the use of audiovisual material enhances the overall quality of the journalistic product.
Following this, the paper delves into the stages of knowledge extraction, including data collection, preprocessing, transformation, data mining, and interpretation/evaluation, to extract meaningful patterns and features for model training. The extracted features served as input variables for training the machine learning models, with a focus on supervised learning approaches applied to classify article quality. Evaluation metrics of model performance are presented, along with a detailed description of prediction accuracy and the final selection of the optimal model. From a computational perspective, studies support the use of linguistic and contextual features in predicting both article quality and social media engagement. This guided H4: Lower levels of subjectivity and entertainment will predict higher quality, whereas higher levels of emotional content and poor language use will predict lower quality.
The final section presents the user-centered methodology, analyzing the iterative design process of prototyping, testing, and refining the system. Drawing on the literature of user-centered design [19,20,21], the study evaluates the usability and user experience of the platform through a series of targeted hypotheses: H5: The interaction with the prototype results in a very positive user experience among the participants; H6: The perceived usability score of all participants is higher than the standard average SUS score of 68; H7: Participants’ NPS score is above 30, indicating a strong tendency to recommend the IQJournalism system; H8: The distribution of participants’ scores regarding their overall experience and satisfaction with the prototype shows a central tendency toward the highest values on the Likert scales, indicating strong positive attitudes; and H9: Most participants completed the prototype’s tasks more quickly and efficiently (i.e., SWA < 1).

2. Materials and Methods

Despite the fact that there is a growing focus on developing solutions for educational use, writing enhancement, and even science fiction character creation [9,10,11,12,13,14,15], tools that specifically support the production of quality journalistic content remain limited. The IQJournalism system was developed to fill this gap, aiming to highlight the qualitative characteristics of news stories and their capacity to increase engagement on social media. In more detail, the objectives of this approach were as follows:
a. To detect and evaluate the impact of specific text features, such as language quality parameters, headline accuracy, emotional discourse, and audiovisual material, on the overall quality and attractiveness of news pieces.
b. To build and evaluate a machine learning model capable of predicting the quality and social media engagement of online news content, focusing on the dimensions of language quality, subjectivity, emotionality, and entertainment.
c. To implement an iterative design process that includes prototyping, testing, and redesigning to effectively address specific user needs and requirements.
Bearing in mind the aforementioned objectives, the study was carried out through three distinct methodological phases (see Table 1). The first phase (A) involved a qualitative approach, using semi-structured, in-depth interviews with experts to identify key characteristics that influence journalistic quality and engagement. For the purpose of this qualitative study, the following research questions were formulated: What are the main reliability characteristics of a journalistic product/report? What are the main language quality parameters/characteristics of a journalistic text? Which factors determine the impartiality of a journalist?
The second phase (B) followed a quantitative and computational approach, where insights from the qualitative study were translated into measurable features and incorporated into machine learning models created to predict the quality and social media engagement of online news. In this phase, the following research questions were established: How can AI contribute to understanding quality standards in digital journalism? Can the model dimensions, namely Language Quality, Subjectivity, Emotionality, and Entertainment predict quality in online news? How do the individual quality criteria specifically affect the accuracy of quality prediction? Is it possible for a machine learning model to reliably predict the engagement of news content on social media?
The third phase (C) focused on iterative design, where prototyping, testing, and redesigning processes were employed to improve the system, ensuring alignment with the requirements and expectations of users [22]. Based on this inventive framework, the IQJournalism system aims to become an essential tool customized to editorial needs, assisting editors in crafting quality, engaging content.

2.1. Phase A: Qualitative Research

Our initial step toward achieving our main objective was to conduct, for the first time in Greece, semi-structured in-depth interviews [23] (p. 183), [24], [25] (p. 756), [26] (p. 156), [27]. The interviews were conducted between 13 May 2022 and 13 July 2022 and the total number of interviews was 20: 16 journalists with significant experience in the media field, focusing on structuring journalistic discourse within a framework of reliability, quality, and impartiality, and 4 Greek academic researchers specializing in communication and journalism. The interviewees were selected through purposive sampling to ensure a diverse representation of professional backgrounds and media roles. Interviews were conducted via videoconference, and consent was obtained from all participants after they were informed of the study’s purpose and confidentiality protocols. To minimize potential bias, the interview protocol included open-ended, neutrally worded questions. Thematic analysis was employed to analyze the data [28,29], [30] (p. 40) and was cross-checked by multiple researchers. Data saturation occurred at the 18th interview, and 2 further interviews were conducted for final confirmation.
The research hypotheses (Hs) proposed for Phase A of the study were as follows:
H1.
Experts will argue that the decreased accuracy of the headline of a story undermines the overall quality and attractiveness of the journalistic product.
H2.
Experts of our research are likely to assert that the intense use of emotionally charged discourse reduces the overall quality of the journalistic product.
H3.
Experts of this research are expected to contend that the use of audiovisual material enhances the overall quality of the journalistic product.
Thematic analysis of interviews with journalism experts highlighted key aspects of quality in journalism. The experts emphasized that credibility relies on core journalistic principles, including effective questioning techniques and the use of reliable, diverse sources. They also highlighted the importance of investigative journalism, which faces challenges in the online media landscape [31]. Quality also involves correct language use, effective source management, and informative lead paragraphs. Experts largely disapprove of the use of capital letters in news articles, finding them distracting, but support bullet points for clarity.
Moreover, according to the experts, the ideal article length varies based on news type and coverage depth, with some preferring concise texts to retain reader attention. Regarding impartiality, although considered a challenge due to inherent biases, experts agree that journalists should present all perspectives, even when expressing personal opinions. Based on experts’ responses, headlines should accurately reflect the article’s content, supporting the hypothesis that accuracy outweighs emotional appeal [16,17]. Experts also affirmed the value of audiovisual material in digital journalism, provided it is relevant to the text, aligning with the third hypothesis (H3) on its integral role in quality news content.
The insights provided by the experts on structuring journalistic discourse within a framework of qualitative and engaging journalistic writing were substantial and significant. However, a deeper analysis of the findings from this qualitative research method would be beyond the scope of this paper [32].
The preliminary qualitative research findings, which emerged from the experts’ interviews, described several key issues: credibility, diversity of opinions and sources, language quality, text characteristics such as punctuation and article length, impartiality, importance of the headline, emotionality, and the role of accompanying audiovisual material. These main themes formed the basis for developing perceived quality indicators. In other words, experts’ insights on crafting high-quality, engaging journalistic articles were used as input data to train the machine learning model in Phase B of the study.

2.2. Phase B: Computational Model Development and Training

As mentioned in the previous research phase, the findings of the qualitative research, which were derived from the interviews with the experts, identified key topics and characteristics that formed the foundation for the development of the quality indicators. These indicators were then used as input features for training the machine learning model in phase B of the study. This phase, therefore, examines the various steps involved in the computational approach, with the aim of building machine learning models able to predict the quality and engagement of online social media news.

2.2.1. Data Collection, Preprocessing, and Feature Extraction

Initially, in the second phase of the study, a text analysis was performed on a dataset of over 10 million Greek news articles published on news websites from 2021 to 2023. Specifically, the content was sourced from 7,359 Greek news websites, with articles selected from the following categories: Economy, Politics, International, Sports, Technology, Culture, Society, News, Health, Tourism, and Lifestyle. After tagging a representative sample of publications and analyzing all sites, approximately 2.5 million articles were selected. Before using the corpus, the dataset was preprocessed in Python 3 (https://www.python.org/ (accessed on 29 April 2025)) by removing stopwords, symbols, nonstandard words, NaN values, and HTML code from the texts. After the data cleaning stage, 902,133 unique texts of articles from high-quality news websites and 607,704 unique texts from tabloid websites remained.
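For illustration, the following minimal Python sketch shows the kind of cleaning step described above, assuming the articles are held in a pandas DataFrame with a "text" column; the stopword list and regular expressions are simplified placeholders, not the exact resources used in the study.

```python
import re
import pandas as pd

# Placeholder stopword list; the study's actual Greek stopword resource is not reproduced here.
GREEK_STOPWORDS = {"και", "να", "το", "η", "ο", "της", "του", "με", "για", "που"}

def clean_text(raw: str) -> str:
    """Strip HTML code, symbols, and stopwords from a raw article text."""
    text = re.sub(r"<[^>]+>", " ", raw)      # remove HTML tags
    text = re.sub(r"[^\w\s]", " ", text)     # remove symbols and punctuation
    tokens = [t for t in text.lower().split() if t not in GREEK_STOPWORDS]
    return " ".join(tokens)

df = pd.DataFrame({"text": ["<p>Παράδειγμα άρθρου με σύμβολα!</p>", None]})
df = df.dropna(subset=["text"])              # drop NaN values
df["clean_text"] = df["text"].apply(clean_text)
print(df["clean_text"].tolist())
```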
To generate advanced features aligned with the theoretical framework, a range of text analysis methods, NLP libraries, and Python packages were applied. Techniques such as tokenization, stemming, lemmatization, and part-of-speech tagging were employed along with multiple lexicons which required the original form of words. In certain cases, the raw text was used instead of the preprocessed text to identify adjectives or evaluate the level of readability.
For sentiment analysis, we chose the dictionary method and used translated versions of emotion and subjectivity lexicons. Specifically, to assess subjectivity, we used the Multi-perspective Question Answering (MPQA) subjectivity lexicon by Wilson, Wiebe, and Hoffmann [33], freely available for research purposes. To measure emotionality, we utilized three established dictionaries: the NRC Word-Emotion Association Lexicon (EmoLex), with 14,182 words associated with 8 basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) at 2 levels (1 = associated, 0 = not associated) and 2 polarities (positive and negative); the National Research Council Canada Valence, Arousal and Dominance Lexicon (NRC-VAD) [34], containing 20,000 words labeled with scores for Valence, Arousal, and Dominance; and the National Research Council Canada Affect Intensity Lexicon (NRC-AIL) [35], with 5,815 terms rated by intensity for 4 basic emotions (anger, fear, joy, and sadness) [36,37].
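The dictionary method described above can be sketched as follows; the toy word lists below merely stand in for the translated MPQA and NRC lexicons, which are distributed separately and loaded from their own files.

```python
from collections import Counter

# Toy stand-ins for the translated subjectivity and emotion lexicons.
SUBJECTIVE_WORDS = {"τραγικός", "υπέροχος", "σκανδαλώδης"}
EMOLEX = {"φόβος": {"fear"}, "χαρά": {"joy"}, "οργή": {"anger"}}

def lexicon_scores(tokens):
    """Return the share of subjective tokens and the per-emotion shares of a tokenized text."""
    n = max(len(tokens), 1)
    subjectivity = sum(t in SUBJECTIVE_WORDS for t in tokens) / n
    emotion_counts = Counter(e for t in tokens for e in EMOLEX.get(t, ()))
    emotions = {e: c / n for e, c in emotion_counts.items()}
    return subjectivity, emotions

print(lexicon_scores(["μια", "τραγικός", "είδηση", "φόβος"]))
```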
The dependent categorical variable was determined by annotators: articles receiving at least two positive responses to the quality question were assigned a score of 1, while the rest were assigned a score of 0. Independent variables for the model were selected and operationalized based on the theoretical framework dimensions, which, according to the literature, define journalistic quality (refer to Table 2 for measurement details). To examine general attributes across a large corpus, we employed a computer-assisted text analysis methodology, using concept measurement.
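As a sketch of how the dependent variable can be derived from the annotations (assuming, hypothetically, three annotators answering the quality question per article):

```python
import pandas as pd

# Hypothetical annotation table: one yes (1) / no (0) quality judgement per annotator and article.
annotations = pd.DataFrame({
    "annotator_1": [1, 0, 1],
    "annotator_2": [1, 0, 0],
    "annotator_3": [0, 1, 1],
})
# Articles with at least two positive responses are assigned 1, the rest 0.
quality_label = (annotations.sum(axis=1) >= 2).astype(int)
print(quality_label.tolist())  # [1, 0, 1]
```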
The primary objective of the research was to explore, verify, and evaluate the theoretical framework of quality in journalistic discourse, derived from the previous phases of the project and from the study of the literature, rather than simply constructing an efficient classifier, which could be achieved by simpler methods.

2.2.2. Machine Learning Model Development

To develop the machine learning models, we implemented multiple classification algorithms using the Python library scikit-learn (https://scikit-learn.org/stable/ (accessed on 29 April 2025)). Specifically, we employed seven distinct models, i.e., Naive Bayes, K-Nearest Neighbors, Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, and the XGBoost classifier, which leverages gradient boosting to refine predictions by sequentially adding new trees to the previous model. The selection of these models aimed to cover a diverse set of classification algorithms, ranging from simple methods to more complex ones. This variety ensures a balance between model performance, interpretability, and computational efficiency. Such diversity allowed us to evaluate their suitability for predicting news quality and engagement. For model training and evaluation, the dataset was divided, with 80% of the articles used for training and the remaining 20% for testing. To evaluate the different classification methods, the weighted average F-measure (F1) was applied, which is the harmonic mean of precision and recall.
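A minimal scikit-learn sketch of this comparison is shown below; it uses a synthetic feature matrix in place of the engineered article features and default hyperparameters, so it only illustrates the training and evaluation procedure, not the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Synthetic stand-in for the engineered quality features and 0/1 quality labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Naive Bayes": GaussianNB(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    score = f1_score(y_test, model.predict(X_test), average="weighted")  # weighted F1
    print(f"{name}: weighted F1 = {score:.3f}")
```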
The process of text mining refers to the application of methods to extract and analyze patterns and insights from unstructured textual content. Initially, three techniques were used for the baseline models: Bag of Words (bigrams and trigrams), TF-IDF (Term Frequency-Inverse Document Frequency), and the BERT language model. More specifically, the Bag of Words model was employed to transform each text into a numerical representation, where the frequency of each n-gram was encoded into a vector. These numerical vectors were then used as input features for the machine learning algorithms.
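For example, the n-gram baselines can be reproduced with scikit-learn's vectorizers roughly as follows (toy texts and labels; the BERT baseline requires a separate transformer pipeline and is omitted here):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["παράδειγμα άρθρου ένα", "παράδειγμα άρθρου δύο", "άλλο κείμενο εδώ", "ακόμα ένα κείμενο"]
labels = [1, 0, 1, 0]  # toy quality labels

# Bag of Words restricted to bigrams and trigrams, as in the baseline models.
bow = CountVectorizer(ngram_range=(2, 3))
# TF-IDF weighting as an alternative text representation.
tfidf = TfidfVectorizer()

for name, vectorizer in [("BoW (bigrams/trigrams)", bow), ("TF-IDF", tfidf)]:
    clf = make_pipeline(vectorizer, LogisticRegression(max_iter=1000))
    clf.fit(texts, labels)
    print(name, clf.predict(["νέο παράδειγμα άρθρου"]))
```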
To investigate our basic hypothesis that the specific features of a news item are related to quality, a corpus of newspapers was used as input for machine learning models. These models were trained on the proposed features to predict the quality of the article. After the initial modeling, a series of analyses followed to interpret the machine learning models, in order to make the hidden decision-making mechanisms apparent and to assess the importance of each dimension in more detail. In the model explanation stage (Explainable AI), the following Python tools were used: SHAP (https://shap.readthedocs.io/en/latest/generated/shap.Explanation.html (accessed on 29 April 2025)), LIME (https://c3.ai/glossary/data-science/lime-local-interpretable-model-agnostic-explanations/#:~:text=LIME%2C%20the%20acronym%20for%20local,to%20explain%20each%20indivi (accessed on 29 April 2025)), TreeInterpreter (https://github.com/andosa/treeinterpreter (accessed on 29 April 2025)), Eli5 (https://eli5.readthedocs.io/en/latest/overview.html (accessed on 29 April 2025)), and DTreeViz (https://github.com/parrt/dtreeviz (accessed on 29 April 2025)).
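As an illustration of the explainability step, the snippet below applies SHAP's TreeExplainer to a tree-based model trained on stand-in data; the other listed libraries (LIME, TreeInterpreter, Eli5, DTreeViz) follow similar usage patterns.

```python
import shap
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Stand-in data; in the study the inputs are the engineered quality features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = XGBClassifier(eval_metric="logloss").fit(X, y)

# TreeExplainer computes per-feature contributions (SHAP values) for each prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X, show=False)  # global overview of feature influence
```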
The dataset of online news stories was also used as input for the machine learning algorithms in order to test our hypothesis that specific inherent features of a news item can influence its engagement on social media. To create the dependent variable for successful and unsuccessful articles on Facebook, we separated the data into two groups: the low group consisted of articles at or below the 5th percentile of engagement (coded 0), and the high group consisted of articles at or above the 95th percentile (coded 1). The rationale was that the hidden patterns behind engaging news content are best revealed by contrasting the most popular articles with those that attract virtually no engagement. Furthermore, we created four models, each one predicting a different engagement metric, namely Likes, Shares, Comments, and the number of Total Interactions, which includes reactions such as Love, Care, Haha, and Wow. The independent variables (see Table 3) represented various engagement and quality indicators drawn from the literature, selected for their potential to reflect audience engagement with news content on Facebook. From the 7,359 websites, after tagging a representative sample of posts and analyzing all websites in relation to user engagement with Facebook and Twitter posts, over 5 million articles were selected.
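The percentile-based labeling of engagement can be sketched as follows, using a hypothetical table of Facebook metrics:

```python
import pandas as pd

# Hypothetical engagement counts; "likes" stands in for any of the Facebook metrics.
posts = pd.DataFrame({"likes": [0, 1, 2, 3, 5, 15, 40, 300, 1200, 15000]})

low_cut = posts["likes"].quantile(0.05)    # 5th percentile -> "unsuccessful" articles
high_cut = posts["likes"].quantile(0.95)   # 95th percentile -> "successful" articles

low = posts[posts["likes"] <= low_cut].assign(label=0)
high = posts[posts["likes"] >= high_cut].assign(label=1)
dataset = pd.concat([low, high])           # only the two extremes are kept for training
print(dataset)
```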
The engagement analysis relied on tree-based models, as previous studies [53,54] have demonstrated that such classifiers offer in-depth interpretations of the model predictions. Accordingly, a binary classification task was conducted using three tree-based models: Decision Tree and Random Forest from the scikit-learn Python library, and the XGBoost classifier. The dataset was split into 80% for training and the remaining 20% for testing. The F-measure (F1) was selected as the primary evaluation metric. Finally, several explainability techniques were also employed to extract rules associated with audience engagement.
At this stage of the research, the following research hypothesis was proposed:
H4.
Lower levels of subjectivity and entertainment will predict higher quality, whereas higher levels of emotional content and poor language use will predict lower quality.
The accuracy of the quality classification models is presented in Table 4; the best-performing models are the XGBoost classifier and Random Forest. The preferred classifier was able to correctly classify 85% of the news articles into high- or low-quality categories based on the identified attributes. The other models achieved F1 scores ranging from 80% to 85%. To find the most important features, we used the ELI5 Python library for “Inspecting Black-Box Estimators” to compute the permutation importance (Figure 1). Therefore, the accuracy of the models indicates that the theoretical quality framework is effective for Greek articles.
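The study computed permutation importance with the ELI5 library; the sketch below shows the equivalent computation with scikit-learn's permutation_importance on stand-in data, i.e., how much the model's score drops when each feature is shuffled.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = [f"feature_{i}" for i in range(10)]  # placeholders for readability, anger intensity, etc.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature on the held-out set and measure the resulting drop in score.
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"{feature_names[i]}: {result.importances_mean[i]:.4f}")
```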
Specifically, as shown in Figure 1, readability is the primary predictor for the model. The concept of readability refers to the overall sum of all the features of a text and their interactions that affect whether a reader will read it successfully. The degree to which a text is successfully read lies in the extent to which readers understand it, retain the information taken in from it, read it at a satisfactory speed, and find it interesting [55].
Additionally, in the ranking, it appears that emotions play a particularly important role in prediction, with Anger intensity, Joy intensity, Sadness intensity, and Trust ranking high. The number of adjectives and the presence of capitals in the body of the article are also important. Finally, the number of celebrities appearing in the article seems to play a large role in prediction.
To improve the interpretability of the predictions, we analyzed the graphical representation of a Decision Tree from the XGBoost classifier to propose rules related to journalistic quality. However, extracting all rules from a Decision Tree can become increasingly challenging as the tree expands, leading to greater complexity and reduced interpretability. Furthermore, since constraints along a single path are conjunctive and different paths may give conflicting information, additional post-processing is often required to improve the original rule set. To address this, we incorporated measures of confidence and opted for the “most important” rules. Therefore, we selected the leaf nodes with a high probability of belonging to class 0 (low quality) or class 1 (high quality) and containing a large number of samples (20% of the total training samples) at the tree construction phase, while the misclassification error during the test phase for those specific leaf nodes remains very low (<0.1). Using Python’s DTreeViz library [56], we visualized the algorithm’s decision-making process at the leaf nodes, and based on the above preconditions for node selection, this approach supports confident generalization of the extracted rules.
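The leaf-selection logic described above (well-populated, high-purity leaves) can be reproduced directly from a fitted scikit-learn decision tree, as in the following sketch on synthetic data; DTreeViz is only needed for the visual rendering.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the quality features; feature names are placeholders.
X, y = make_classification(n_samples=2000, n_features=6, class_sep=2.0, random_state=0)
feature_names = [f"feature_{i}" for i in range(6)]
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
t = tree.tree_

def leaf_paths(node=0, conds=()):
    """Yield (conditions, node_id) for every leaf of the fitted tree."""
    if t.children_left[node] == -1:                       # leaf node
        yield conds, node
        return
    name, thr = feature_names[t.feature[node]], t.threshold[node]
    yield from leaf_paths(t.children_left[node], conds + (f"{name} <= {thr:.2f}",))
    yield from leaf_paths(t.children_right[node], conds + (f"{name} > {thr:.2f}",))

min_samples = 0.20 * t.n_node_samples[0]                  # keep only well-populated leaves
for conds, node in leaf_paths():
    class_weights = t.value[node][0]
    purity = class_weights.max() / class_weights.sum()
    if t.n_node_samples[node] >= min_samples and purity >= 0.9:
        print(" AND ".join(conds), "->", f"class {class_weights.argmax()} (purity {purity:.2f})")
```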
The results identify six key rules for news discrimination based on nine important subdimensions of the framework. According to these rules (see Table 5), a high-quality article tends to be difficult to read, suitable for people with at least a third-grade education or age 17 or older, and has words of fewer than six characters on average. In addition, such articles occasionally convey positive emotions, with the number of adjectives varying according to other characteristics of the text, such as length, references to famous people, or references to crimes, accidents, or conflicts. For journalistic articles that are more readable—understandable by people without a high school education or under 17 years old—the prediction model considers them to be of high quality if less than 21% of the words in the articles express joy. In addition, in shorter texts (less than 243 words) adjectives should make up more than 15% of the total words, while longer texts should contain words that convey emotions such as anger or confidence.
For the engagement prediction classification models, we conducted three experiments, each targeting a different variable: Likes, Shares, and Comments. Using three classifiers, the XGBoost model demonstrated the highest accuracy, achieving an F1-score of 91% for predicting Likes (see Table 6). The Information Quality dimension emerged as the most important, with the number of words in the headline consistently ranking as the strongest predictor across all three engagement metrics. Other key attributes, such as readability, article length, and diversity, also ranked high in the permutation importance table. Furthermore, all four dimensions of the framework contributed meaningfully to the model’s performance. Emotionality indicators like anticipation, dominance, arousal, and valence proved influential; Subjectivity within the article body and the Facebook headline played a role in engagement prediction; and finally, from the Entertainment dimension, the presence of famous people was important. We observe that the significance of some features changes depending on the target variable, which is to be expected according to the literature [57]. For example, the number of celebrities mentioned in the article is the strongest predictor for Comments and Likes but has no significant effect on Shares. Additionally, the emotion of dominance is crucial for Likes, fear is only relevant for Likes, and positivity is relevant for Comments.
An examination of the five most significant decision paths revealed a set of rules which, if applied by professional journalists prior to publishing, could enhance engagement levels on Facebook. In general, the basic rule is this: a story with a high likelihood of engagement tends to have an average word length of up to five characters, is easy to understand, includes positive language, and is made up of less than 11% adjectives. In addition, the story should express trust and include references to crimes, conflicts, or accidents. If the article refers to famous people, evoking emotions such as fear or excitement is likely to improve engagement.

2.3. Phase C: User-Centered Design

The development of the intelligent advisor IQJournalism followed a design thinking approach, initially focusing on a deep understanding of the user’s specific requirements. This user-centered methodology then incorporated an iterative design process involving prototyping, testing, and redesigning [58]. The design and development steps followed in creating the IQJournalism system are depicted in Figure 2 below.
The process began with conducting user research, in our case with journalists, in order to identify the needs and challenges they face during the writing process. These needs mainly concern the composition, editing, and revision of the text, based on system-generated suggestions, along with organizing documents and managing files. Given that journalists often rely on similar writing tools, our research also included an analysis of popular online writing assistants to understand common functionalities and user expectations. Most of the authoring tools and AI assistants we evaluated shared common features, such as a minimalist design to support focused and uninterrupted writing, AI feedback displayed to the right of the text, and color-coded suggestions.
The above research resulted in the prototype development of the intelligent text editor. More specifically, during the ideation stage, we developed various design alternatives for the tool’s Homepage and User Dashboard using Drawio (https://www.drawio.com/ (accessed on 29 April 2025)). As shown in Figure 3, the text editing interface is centered, with a main menu on the left sidebar offering essential functions like saving and printing. The right sidebar provides text quality improvement suggestions in terms of “Language quality”, “Subjectivity”, “Emotionality”, and “Entertainment”, displayed through color progress bars for visual assessment. Users can access more details by clicking the “See details” hypertext beneath each category. In the foreground, the “Text Preferences” pop-up window appears, where users initially set article parameters such as the “Genre”, “Medium”, “Length of the text”, and multimedia inclusion of “Photographs” and “Videos”. This window is accessible anytime via the “Edit Preferences” button.
This preliminary concept was tested with a focus group, gathering valuable user insights that helped refine the design prototype for greater usability [59]. The focus group method facilitated group interaction, allowing us to adapt questions and gather important observations on the design of the system. This session included 10 Media Studies postgraduate students (8 female, 2 male, aged 23–40), all of whom had prior experience with professional text editing tools. After reviewing the designs, participants expressed preferences and suggested improvements for features such as a login button, navigation menu, color palette, IQ mode button, and performance score display.
The next step involved refining the initial designs based on focus group feedback and creating an interactive functional prototype using Figma (https://www.figma.com/ (accessed on 29 April 2025)). This prototype allowed users to engage with the system’s computational layer, which assesses an article’s perceived quality. The prototype included several key interfaces: a Homepage featuring a “Start Writing” prompt, top menu, and user sign-up options; a User Dashboard for file management and document creation; and a Document Page, where users can adjust preferences, review system feedback through the IQ Mode button, and make necessary changes. The following diagram (Figure 4) depicts a visual representation of the system’s entities. It provides a clear overview of the components, along with the interactions and actions available to the user.
Indicatively, Figure 5 illustrates the main interface of the IQJournalism platform, which includes two sidebars that host tools and information. These sidebars can be displayed or hidden by pressing the IQJournalism button. The left sidebar (Menu) allows users to perform a variety of actions, such as creating new files, uploading or downloading text, printing documents, editing their profile, and accessing informational and help material related to the application. The right sidebar (IQJournalism) displays detailed analysis results provided by the platform. These results include assessments of the text’s language quality, subjectivity, emotionality, and entertainment value. At the top-right corner of the screen, the platform presents an overall performance score for the article, along with a button labeled “Edit Preferences” for defining or updating the article’s attributes.
To evaluate the IQJournalism prototype’s user experience, we conducted a moderated desirability (light) usability study over two weeks in May 2023 at the Department of Communication and Media Studies. For the purpose of the usability study, five research hypotheses were formulated:
H5.
The interaction with the prototype results in a very positive user experience among the participants [19].
H6.
The perceived usability score of all participants is higher than the standard average SUS score of 68 [20].
H7.
Participants’ NPS score is above 30, indicating a strong tendency to recommend the IQJournalism system [21].
H8.
The distribution of participants’ scores regarding their overall experience and satisfaction with the prototype shows a central tendency toward the highest values on the Likert scales, indicating strong positive attitudes.
H9.
Most participants completed the prototype’s tasks more quickly and efficiently (i.e., SWA < 1).
The study involved 20 postgraduate students (18 female, 2 male) aged 20–35, experienced in journalism and familiar with online editing tools. Each participant, guided by one or two moderators, performed situation-specific tasks designed to provide both implicit and explicit feedback. Participants completed three tasks: uploading a text file to assess its predicted performance, modifying article preferences to observe changes in performance scores, and improving the language quality score before downloading the file. Task performance was recorded, including timing and assistance levels. To capture usability and desirability, we used several assessment approaches: the User Experience Questionnaire (UEQ), System Usability Scale (SUS), Product Reaction Cards, perceived satisfaction items, Net Promoter Score (NPS), and open-ended questions. These tools allowed us to gather insights into user satisfaction and acceptance, combining qualitative feedback and quantitative measures like NPS, which indicated participants’ likelihood of recommending IQJournalism.
Participants also completed a Google Forms survey that documented demographic information, experience with web-based editors, and expertise in editing and authoring tasks. This valuable feedback highlighted the prototype’s strengths in visual design, ease of use, and usefulness, alongside areas for further refinement. The final stage in the user-centered iterative process involved refining IQJournalism’s design and functionality driven by user insights and testing. Iterative improvements enhanced the tool’s usability, performance, and overall user experience, leading to a fully functional version. During this stage, a fully operational software system was built and incorporated into the text editor, integrating machine learning models and additional functionalities described in Phase B of the methodology.

3. Results

The evaluation of IQJournalism focused on user experience, usability, satisfaction, and user performance when interacting with the prototype.

3.1. User Experience

The UEQ measures Pragmatic Quality, which captures task-related aspects such as efficiency and ease of use, and Hedonic Quality, which captures the system’s appeal, enjoyment, and originality. Participants rated IQJournalism highly on both scales, with Cronbach’s alpha reliability scores above 0.7 (α = 0.72 for the Pragmatic and α = 0.80 for the Hedonic quality, respectively). According to the UEQ benchmarking framework [19], the results were as follows: Pragmatic quality scored 1.81 (“Excellent”), Hedonic quality scored 1.12 (“Above Average”), and the overall experience scored 1.47 (“Good”). Additionally, using Product Reaction Cards, positive attributes such as “Easy to use” and “Friendly” were selected by 80% and 70% of users, respectively; 55% of users described the system as “Clean” and 50% as “Efficient”, while few selected negative terms such as “Sterile” (5%) or “Inconsistent” (5%). All these scores indicated strong user acceptance, confirming H5.

3.2. SUS and NPS

The mean SUS score of 82.5 (SD = 7.8) indicated high perceived usability, well above the above-average threshold of 68 [60]. These results showed that participants’ interactions with the prototype tasks positively influenced their perceptions of usability, thus supporting the research hypothesis related to the usability of the system (H6). The NPS score of 40 (Mdn = 8.5, IQR = 7.75) showed a high likelihood of recommendation and consequently confirmed the seventh hypothesis (H7). Based on the scale categories outlined by Reichheld [61], among the 20 participants, 5 (25%) rated the system a 10 and another 5 (25%) rated it a 9 (categorized as Promoters); conversely, 2 (10%) participants fell into the category of Detractors (scoring 5 and 6). Among the users categorized as Passives, 8 participants (40%) provided scores of 7 and 8. These responses showed a generally positive attitude towards IQJournalism, indicating a potential for favorable word-of-mouth among peers and future users of the proposed system [62]. Given the established strong positive correlation between SUS and NPS (r = 0.61) [62], the high perceived usability score seems to align with the observed NPS. However, a higher NPS score closer to 70 [63] might have been expected. This difference can be attributed to the limitations of the prototype used in the evaluation. Compared to a fully developed version, the first version of the interactive prototype may have lacked certain refinements, such as faster response times and clearer task guidance, which could have influenced users’ perceptions during the evaluation.
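For reference, SUS and NPS follow standard formulas; the sketch below computes both, and the NPS example reuses the rating distribution reported above (five 10s, five 9s, eight Passives, two Detractors), which yields the reported score of 40. The SUS responses shown are purely illustrative.

```python
import numpy as np

def sus_score(responses):
    """Standard SUS scoring: odd items contribute (r - 1), even items (5 - r); the sum is scaled by 2.5."""
    r = np.asarray(responses, dtype=float)   # 10 items, each rated 1-5
    return (np.sum(r[0::2] - 1) + np.sum(5 - r[1::2])) * 2.5

def nps(ratings):
    """Net Promoter Score: % Promoters (9-10) minus % Detractors (0-6)."""
    r = np.asarray(ratings)
    return np.mean(r >= 9) * 100 - np.mean(r <= 6) * 100

print(sus_score([5, 2, 4, 1, 5, 2, 5, 1, 4, 2]))           # one illustrative questionnaire
ratings = [10] * 5 + [9] * 5 + [8] * 4 + [7] * 4 + [6, 5]  # distribution reported in the text
print(nps(ratings))                                        # 40.0
```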

3.3. Satisfaction

Regarding users’ perceived satisfaction and performance, the findings revealed a generally positive consensus. Participants’ responses were concentrated toward the higher end of the scale (i.e., 5, 6, and 7), indicating strong positive attitudes. In response to the question “How easy is it to deal with IQJournalism for doing your job?”, the participants provided positive feedback (M = 6, SD = 1.02), with 20% selecting 5 on the scale, 60% selecting 6 and 35% selecting 7. For the statement “Using the data from the various system’s views (e.g., articles’ preferences, predicted performance), I can self-reflect and get a good understanding of my performance”, 85% of the respondents expressed general agreement (M = 5.3, SD = 1.08). Additionally, 90% of the participants provided a positive response to the question “How would you rate your overall satisfaction with the IQJournalism prototype?” (M = 5.65, SD = 1.03). As regards the negatively worded item “When I interact with the various views of IQJournalism (e.g., data entry, dashboard, preferences) for accomplishing my tasks, I usually feel uncomfortable and emotionally loaded (i.e., stressed-out/overwhelmed)”, 85% of participants disagreed, as anticipated (M = 2.05, SD = 1.35). The combination of the above findings supports the acceptance of hypothesis H8. Participants found the system to be engaging and easy to use, demonstrating an awareness of their performance while interacting with the various features of the prototype. Moreover, users reported not feeling stressed or overwhelmed during their interactions with the system and expressed overall satisfaction with the experience.

3.4. Task Performance

During the study, users were asked to perform three primary tasks, capturing SWA, completion time, and open-ended feedback. To test H9, which states that most participants would perform faster and more efficiently (i.e., SWA < 1) when interacting with the prototype, participants were split into two groups: Group A included those who completed all three tasks with minimal assistance (i.e., SWA < 1) and Group B included participants who required more support during the tasks (i.e., SWA ≥ 1). Overall, a positive correlation was observed between assists and task completion time. Specifically, strong positive correlations were noted for tasks 1 and 2, r = 0.71 and r = 0.65, respectively, and a weak positive correlation for task 3, r = 0.15. This indicates that Group A (the majority of participants) completed the tasks more quickly than those in Group B. Task-specific results are presented in Figure 6, including the Coefficient of Variation (CV) for each group to better illustrate the relative variability in performance.
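The correlation and variability statistics reported here can be computed as in the following sketch, using hypothetical per-participant logs rather than the study's raw data:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-participant logs for one task: number of assists and completion time (seconds).
assists = np.array([0, 0, 0, 1, 1, 0, 2, 1, 0, 0])
times = np.array([62, 70, 55, 140, 150, 80, 190, 120, 66, 75])

r, p = pearsonr(assists, times)                 # relation between assistance and completion time
cv = times.std(ddof=1) / times.mean() * 100     # Coefficient of Variation (%) as relative variability
print(f"r = {r:.2f} (p = {p:.3f}), CV = {cv:.1f}%")
```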
For task 1, study participants were asked to upload a text file on IQJournalism and review its predicted performance. Prior to this, they read the article to gain an overview of its genre and content. Group A users (65%, CV = 34.8%) completed the task with an average time of M = 01:10, SD = 00:24, and had a mean SWA score M = 0, SD = 0, indicating no significant assistance from the moderators. In contrast, for Group B users (35%, CV = 39.6%), the mean time of completion was M = 02:30, SD = 00:59 and the average SWA score was M = 1, SD = 0, indicating they needed more support. Therefore, users who did not receive assistance completed this task significantly faster. Based on the corresponding qualitative feedback, 11 out of 20 participants described the interaction as easy and user-friendly, noting that the system’s output was clear and understandable. However, 4 participants reported difficulty locating the IQ Mode button quickly.
For Task 2, users were asked to edit the article’s preferences and observe the resulting changes in the system’s output. Group A users (65%, CV = 59.3%) completed this task within an average time of M = 00:44, SD = 00:26, and average SWA score of M = 0, SD = 0. On the contrary, for Group B (35%, CV = 87.9%), the average completion time was M = 01:29, SD = 01:18 and the average SWA score was M = 1, SD = 0.4. Also, users with low SWA scores were able to perform faster during the execution of task 2. In their open-ended responses, users stated that the task was easy to complete and that the changes in performance scores were clear. However, 6 participants (N = 20) mentioned that they had difficulty in locating the “Edit Preferences” button.
For task 3, study participants were asked to edit the article in order to improve the language quality score, download the article, and sign out. In detail, Group A (55%, CV = 61.8%) completed task 3 within an average time of M = 02:07, SD = 01:19. The average SWA score for Group A was M = 0.4, SD = 0.2. For Group B (45%, CV = 24.9%), the average completion time was M = 02:08, SD = 00:32 and the average SWA score was M = 1.1, SD = 0.17. In contrast to tasks 1 and 2, task 3 showed a weak correlation between completion time and SWA (r = 0.15), and users took longer to complete the task, irrespective of the level of support received. According to the qualitative feedback, participants faced some challenges: many users expected more guidance by the system, failed to notice the suggestions for improving the language quality score, and had difficulty in locating the download button. Despite these challenges, users in Group A completed the task in about the same time as those in Group B, even without significant assistance. After implementing iterative adjustments and improvements based on the findings from the usability study, the fully functional version of the system will be developed. The IQJournalism system integrates a comprehensive software solution, incorporating machine learning algorithms alongside complementary features into the text editor.

4. Theoretical and Practical Implications

IQJournalism provides a robust framework and a proof-of-concept tool that can effectively support newsrooms in navigating the challenges of digital journalism. By leveraging AI, it offers practical assistance for producing high-quality, engaging content and contributes to the evolving landscape of media technology.
This study contributes to the fields of journalism studies, AI, and human–computer interaction by demonstrating a novel approach to predicting and enhancing journalistic quality and audience engagement in the digital age. By integrating qualitative insights from journalism experts with quantitative methods including machine learning and user-centered design, IQJournalism offers significant theoretical and practical implications. More specifically, the methodology, which began with semi-structured interviews with 20 experts (16 journalists and 4 academic researchers) in Greece, provides a theoretically grounded framework for defining perceived quality in online news content within the Greek context. Key indicators identified by experts, such as credibility (relying on source diversity and investigative journalism), correct language use, effective source management, informative lead paragraphs, appropriate article length, and the importance of headline accuracy over emotional appeal, were validated. The experts also affirmed the value of relevant audiovisual material.
By transforming these expert insights into measurable features, the study empirically tested and validated this theoretical framework using machine learning models. The high F1 scores achieved by models like XGBoost and Random Forest (up to 85% for quality classification) indicate that the identified features are effective predictors of perceived journalistic quality. This confirms that machine learning, guided by expert knowledge, can indeed contribute to understanding and operationalizing quality standards in digital journalism. The analysis also highlighted the distinct contributions of various linguistic and contextual factors to quality prediction, such as readability, emotional intensity (anger, joy, trust), the number of adjectives, the presence of celebrities, and the use of capitals.
Furthermore, the study provides theoretical insights into the factors driving social media engagement with news content. The machine learning models, particularly XGBoost, successfully predicted engagement metrics like Likes, Shares, and Comments with high accuracy (e.g., 91% F1 score for Likes). The identification of features like title length, emotionality (anticipation, dominance, arousal, valence, positivity, fear), subjectivity levels (in the article body and Facebook headline), readability, article length, diversity, and the presence of celebrities as significant predictors contribute to the understanding of how content attributes influence audience interaction on platforms like Facebook. The finding that the importance of features varies depending on the specific engagement metric (Likes, Shares, Comments) aligns with existing literature and provides nuanced insights into audience behavior.
The application of a user-centered, iterative design process (prototyping, testing, redesigning) in developing the IQJournalism platform also offers theoretical contributions to the field of human–computer interaction within the context of AI tools for creative and professional tasks. Evaluating the prototype using metrics like UEQ, SUS, and NPS and task performance analysis provided empirical evidence supporting the effectiveness and user acceptance of this design approach for developing intelligent authoring tools.
Additionally, the developed IQJournalism platform serves as a practical intelligent advisor for journalists and editors. By providing real-time editing recommendations based on dimensions such as language quality, subjectivity, emotionality, entertainment, and social media engagement, the tool empowers media professionals to enhance the overall quality and impact of their writing. It can support various roles within a newsroom, including journalists refining their text, audience editors, and social media specialists optimizing content for digital platforms. This support for readability and engagement can help discourage clickbait practices and contribute to more reliable and engaging online news content.
The system’s design, featuring a user-friendly interface with sidebars displaying analysis results through color progress bars and overall performance scores, provides clear visual feedback. The ability to define and adjust article preferences allows users to tailor the AI analysis to different genres, mediums, and publishing contexts (e.g., adapting content for Facebook or Twitter). Beyond traditional journalism, the platform has practical applications for optimizing content quality and impact in other fields, like marketing, political, and strategic communication. This highlights the broader utility of AI-powered writing assistance across various content creation industries.
To maximize the benefit of this methodology, the findings from the user evaluation during Phase C suggest key areas for practical improvement and future development. Users expressed a strong need for more detailed feedback and clear identification of specific text passages needing improvement. Incorporating features for tracking changes and offering auto-generation capabilities, such as keyword and synonym suggestions, and recommendations for enriching articles with media (photos, videos) are practical steps to enhance the tool’s utility. The request for always-on spelling, grammar, and syntax checks, as well as a text reading time indicator, points to incorporating standard editing features alongside the AI quality analysis. Finally, exploring integration options, such as developing the tool as an add-on for widely used platforms like Word or WordPress, could significantly increase its accessibility and adoption by professionals.

5. Limitations and Future Research

IQJournalism, while offering a novel approach to enhancing journalistic quality through AI, has certain limitations that also highlight key areas for future research and development.
One of the most important limitations is that the machine learning models were developed and trained exclusively based on a dataset of Greek news articles. Consequently, the system’s linguistic adaptability is restricted to the Greek language and context. The intelligent advisor’s recommendations are tailored to the specific nuances of Greek journalistic practices and are not directly applicable to other languages. This limitation points to a clear and necessary direction for future work: to extend the capabilities of the system to other languages. This would involve creating new datasets, adapting NLP techniques, and potentially refining the quality indicators based on linguistic and cultural differences in journalism across various regions.
Regarding the methodology and samples, the user-centered design phase involved evaluating an interactive prototype rather than a fully functional system. While this approach was valuable for gathering initial user feedback and refining the design, the evaluation results, particularly the NPS score, suggested that the prototype may have lacked certain improvements present in a potential final version, such as faster response times or clearer task guidance.
This indicates that a future study evaluating the fully functional system with a broader range of participants, potentially including seasoned professional journalists beyond postgraduate students, would be valuable to fully assess its usability and impact in a real-world newsroom setting. The current usability study used postgraduate students with journalism/editing experience, which provided useful insights but may not capture the full spectrum of needs and workflows of experienced professionals.
Furthermore, user feedback during the prototype evaluation highlighted several areas for practical improvement and future feature development. To maximize the benefit of the IQJournalism methodology and platform, future research and development should focus on: (a) providing more detailed feedback and clearer identification of specific text passages requiring improvement; (b) implementing auto-generation features, such as keyword and synonym suggestions; (c) offering recommendations for enriching articles with media (photos, videos, etc.); (d) incorporating always-on spelling, grammar, and syntax checks; (e) adding a text reading time indicator; and (f) exploring integration options, such as developing the tool as an add-on for widely used platforms like Word or WordPress.
These suggested features, grounded in user needs and the evaluation of the prototype, represent practical steps for enhancing the tool’s utility and facilitating its adoption in newsrooms and other content creation environments. Implementing these features in the fully operational system constitutes a significant portion of the future research and development agenda, aimed at creating a truly comprehensive intelligent assistant for journalists.

6. Conclusions

IQJournalism showcases the potential of integrating journalism principles with machine learning techniques and user-centered design to enhance the quality and engagement of journalistic content. The qualitative phase (Phase A) validated the fundamental hypothesis that journalistic quality is multifaceted. Experts agreed on the importance of correct language use, the inclusion of essential information (what, where, who, when, and why), the presence of sources within the article, and the representation of diverse points of view or information (pluralism), reflecting findings from previous studies on media trust [16,17,18]. The confirmation of hypotheses H1, H2, and H3 related to the accuracy of the headline [1,16], the moderation of emotions [16,17], and the importance of audiovisual material [18] suggests that traditional values continue to guide the perceptions of quality by experts.
In Phase B, the development and testing of machine learning models confirmed that expert-defined features were strong predictors of both perceived quality and social media engagement. The performance of models such as XGBoost and Random Forest (F1 score up to 85%) not only validates the effectiveness of the proposed framework, but also demonstrates the possibility of predicting quality using linguistic and contextual factors. Addressing the research question of whether machine learning models could accurately predict social media engagement, the prediction experiments focused on specific metrics, namely Likes, Shares, and Comments, with XGBoost achieving an F1 score of 91% for Likes. Title length, emotionality, and references to celebrities emerged as significant predictors. For instance, positivity enhanced Comments, while fear and dominance were linked to Likes. These findings indicate the importance of tailoring content to audience preferences while maintaining journalistic integrity.
The final phase of the methodology (Phase C) focused on iterative design and usability testing, which produced strong user satisfaction metrics. Overall, the usability evaluation of the IQJournalism platform verified its effectiveness and user satisfaction, aligning with the proposed hypotheses (H5 [19], H6 [20], H7 [21], H8, and H9). Importantly, qualitative user feedback revealed both enthusiasm and reservations. While the interface and visual scoring elements of the tool were appreciated, participants requested deeper feedback, better change tracking, and broader editing capabilities, suggesting user trust depends not only on technical accuracy but also on how clearly the tool demonstrates its usefulness and how well it integrates into workflows. It is worth noting that users’ requests for features such as real-time grammar checks, keyword suggestions, and integration with platforms such as Word or WordPress reveal user expectations shaped by mainstream writing tools.
The progress of the interactive prototype was driven by iterative adjustments based on the feedback from the usability study, ensuring improvements in usability and user experience. The user-centered design process proved essential to the refinement of the platform, allowing for the incorporation of analysis features. This iterative approach transformed the prototype into a fully functional tool that effectively meets its objectives. Specifically, the final implementation of IQJournalism combines machine learning insights with user-friendly design, offering journalists a tool to enhance content quality and audience engagement. The ability to customize features for different publishing contexts allows for flexibility and adaptability, making the platform a useful service for newsrooms facing the challenges of digital journalism. The project not only contributes to the state of journalism technology but also provides a framework for future research and development in the media sector.

Author Contributions

Writing—review & editing, C.S., P.G., A.K. and C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH–CREATE–INNOVATE (project code: T2EDK-04616).

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the nature and scope of the project, which does not involve activities that typically fall under ethical review, such as human or animal experimentation, personal data processing beyond standard practices, or interventions requiring ethical oversight.

Informed Consent Statement

Informed consent for participation was obtained from all subjects involved in the study.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors express their sincere appreciation to all those who contributed to the various stages of this research, with special thanks to Katerina Mandenaki, Antonis Armenakis, Stamatis Poulakidakos, and Spiros Moschonas for their expertise and invaluable support during Phase A: Qualitative Research, as well as to Theodoros Paraskevas and Irene Konstanta for their significant contributions to Phase C: User-Centered Design.

Conflicts of Interest

Author Panagiotis Germanakos was employed by the company SAP SE. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI       Artificial Intelligence
BERT     Bidirectional Encoder Representations from Transformers
CV       Coefficient of Variation
EFL      English as a Foreign Language
HTML     Hypertext Markup Language
IQR      Interquartile Range
LDA      Latent Dirichlet Allocation
LIME     Local Interpretable Model-agnostic Explanations
LLM      Large Language Model
M        Mean
Mdn      Median
MPQA     Multi-perspective Question Answering Subjectivity Lexicon
NaN      Not a Number
NLP      Natural Language Processing
NPS      Net Promoter Score
NRC      National Research Council
SD       Standard Deviation
SEO      Search Engine Optimization
SHAP     SHapley Additive exPlanations
SUS      System Usability Scale
SVM      Support Vector Machine
SWA      Success with Assistance
TF-IDF   Term Frequency-Inverse Document Frequency
UEQ      User Experience Questionnaire
VAD      Valence, Arousal, Dominance
XGBoost  eXtreme Gradient Boosting

References

  1. Hendrickx, J.; Truyens, P.; Donders, K.; Picone, I. The media for democracy monitor 2021. How leading news media survive digital transformation 2: Flanders (Belgium): News diversity put under pressure. In The Media for Democracy Monitor 2021; Nordicom: Göteborg, Sweden, 2021; pp. 8–41. [Google Scholar]
  2. Kalogeropoulos, A.; Rori, L.; Dimitrakopoulou, D. ‘Social Media Help Me Distinguish between Truth and Lies’: News Consumption in the Polarised and Low-trust Media Landscape of Greece. South Eur. Soc. Politics 2021, 26, 109–132. [Google Scholar] [CrossRef]
  3. Bryman, A. Social Research Methods; Oxford University Press: Oxford, UK, 2016. [Google Scholar]
  4. Lacy, S.; Rosenstiel, T. Defining and Measuring Quality Journalism; Rutgers School of Communication and Information: New Brunswick, NJ, USA, 2015. [Google Scholar]
  5. Rosenstiel, T. The Elements of Journalism: What Newspeople Should Know and the Public Should Expect; Three Rivers Press: Toronto, ON, Canada, 2014. [Google Scholar]
  6. Simon, F. Artificial Intelligence in the News: How AI Retools, Rationalizes, and Reshapes Journalism and the Public Arena; Report; Columbia University: New York, NY, USA, 2024. [Google Scholar]
  7. Verma, D. Impact of artificial intelligence on journalism: A comprehensive review of AI in journalism. J. Commun. Manag. 2024, 3, 150–156. [Google Scholar] [CrossRef]
  8. Bhat, A.; Shrivastava, D.; Guo, J.L. Approach intelligent writing assistants usability with seven stages of action. arXiv 2023, arXiv:2304.02822. [Google Scholar]
  9. Fitria, T.N. Grammarly as AI-powered English writing assistant: Students’ alternative for writing English. Metathesis J. Engl. Lang. Lit. Teach. 2021, 5, 65–78. [Google Scholar] [CrossRef]
  10. Yuan, A.; Coenen, A.; Reif, E.; Ippolito, D. Wordcraft: Story writing with large language models. In Proceedings of the 27th International Conference on Intelligent User Interfaces, Helsinki, Finland, 22–25 March 2022; pp. 841–852. [Google Scholar]
  11. Zhao, X. Leveraging artificial intelligence (AI) technology for English writing: Introducing wordtune as a digital writing assistant for EFL writers. RELC J. 2023, 54, 890–894. [Google Scholar] [CrossRef]
  12. Dai, X.; Liu, Y.; Wang, X.; Liu, B. Wings: Writing with intelligent guidance and suggestions. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, MD, USA, 23–24 June 2014; pp. 25–30. [Google Scholar]
  13. Stefnisson, I.; Thue, D. Mimisbrunnur: AI-assisted authoring for interactive storytelling. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Edmonton, AB, Canada, 13–17 November 2018; Volume 14, pp. 236–242. [Google Scholar]
  14. Schmitt, O.; Buschek, D. CharacterChat: Supporting the creation of fictional characters through conversation and progressive manifestation with a chatbot. In Proceedings of the 13th Conference on Creativity and Cognition, Virtual Event/Venice, Italy, 22–23 June 2021; pp. 1–10. [Google Scholar]
  15. Osone, H.; Lu, J.L.; Ochiai, Y. BunCho: AI-supported story co-creation via unsupervised multitask learning to increase writers’ creativity in Japanese. In Proceedings of the Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 8–13 May 2021; pp. 1–10. [Google Scholar]
  16. Ahmed, M.; Shazali, M. The interpretation of implicature: A comparative study between implicature in linguistics and journalism. J. Lang. Teach. Res. 2010, 1, 35–43. [Google Scholar]
  17. Wahl-Jorgensen, K. The strategic ritual of emotionality: A case study of Pulitzer Prize-winning articles. Journalism 2013, 14, 129–145. [Google Scholar] [CrossRef]
  18. Harcup, T.; O’Neill, D. What is news? News values revisited (again). J. Stud. 2017, 18, 1470–1488. [Google Scholar]
  19. Hinderks, A.; Schrepp, M.; Thomaschewski, J. A benchmark for the short version of the user experience questionnaire. In Proceedings of the 14th International Conference on Web Information Systems and Technologies–APMDWE, Seville, Spain, 18–20 September 2018; SciTePress: Setúbal, Portugal, 2018; pp. 373–377. [Google Scholar]
  20. Lewis, J.R.; Sauro, J. Item benchmarks for the system usability scale. J. Usability Stud. 2018, 13, 158–167. [Google Scholar]
  21. Schneider, D.; Berent, M.; Thomas, R.; Krosnick, J. Measuring customer satisfaction and loyalty: Improving the ‘Net-Promoter’ score. Poster presented at the Annual Meeting of the American Association for Public Opinion Research, New Orleans, LA, USA, 15–18 May 2008; Volume 19. [Google Scholar]
  22. Paraskevas, T.; Konstanta, I.; Germanakos, P.; Sotirakou, C.; Karampela, A.; Mourlas, C.; Gekas, C. Measuring the Desirability of an Intelligent Advisor for Predicting the Perceived Quality of News Articles. In Proceedings of the 2nd International Conference of the ACM Greek SIGCHI Chapter, Athens, Greece, 27–28 September 2023; pp. 1–8. [Google Scholar]
  23. Phellas, C.N.; Bloch, A.; Seale, C. Structured methods: Interviews, questionnaires and observation. Res. Soc. Cult. 2011, 3, 23–32. [Google Scholar]
  24. DiCicco-Bloom, B.; Crabtree, B.F. The qualitative research interview. Med Educ. 2006, 40, 314–321. [Google Scholar] [CrossRef]
  25. Turner, D. Qualitative Interview Design: A practical guide for novice investigators. Qual. Rep. 2010, 15, 754–760. [Google Scholar] [CrossRef]
  26. Alsaawi, A. A critical review of qualitative interviews. Eur. J. Bus. Soc. Sci. 2014, 3, 149–156. [Google Scholar] [CrossRef]
  27. Carter, M. General Guidelines for Conducting Interviews. 2009. Available online: http://managementhelp.org/evaluatn/intrview.htm (accessed on 21 August 2015).
  28. Braun, V.; Clarke, V. Using thematic analysis in psychology. Qual. Res. Psychol. 2006, 3, 77–101. [Google Scholar] [CrossRef]
  29. Fereday, J.; Muir-Cochrane, E. Demonstrating rigor using thematic analysis: A hybrid approach of inductive and deductive coding and theme development. Int. J. Qual. Methods 2006, 5, 80–92. [Google Scholar] [CrossRef]
  30. Alhojailan, M.I. Thematic analysis: A critical review of its process and evaluation. In Proceedings of the WEI International European Academic Conference, Zagreb, Croatia, 14–17 October 2012; Citeseer: University Park, PA, USA, 2012. [Google Scholar]
  31. Molyneux, L.; Holton, A. Branding (health) journalism: Perceptions, practices, and emerging norms. Digit. J. 2015, 3, 225–242. [Google Scholar] [CrossRef]
  32. Sotirakou, C.; Mandenaki, K.; Poulakidakos, S.; Armenakis, A.; Moschonas, S.; Karampela, A.; Mourlas, C. Deliverable 2.2: Analysis of journalistic texts using traditional quality assessment techniques. Preprints 2023. [Google Scholar] [CrossRef]
  33. Wilson, T.; Wiebe, J.; Hoffmann, P. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, BC, Canada, 6–8 October 2005; pp. 347–354. [Google Scholar]
  34. Mohammad, S. Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; pp. 174–184. [Google Scholar]
  35. Mohammad, S.M. Word affect intensities. arXiv 2017, arXiv:1704.08798. [Google Scholar]
  36. Plutchik, R. A general psychoevolutionary theory of emotion. In Theories of Emotion; Elsevier: Amsterdam, The Netherlands, 1980; pp. 3–33. [Google Scholar]
  37. Ekman, P. An argument for basic emotions. Cogn. Emot. 1992, 6, 169–200. [Google Scholar] [CrossRef]
  38. Steensen, S. The intimization of journalism. In Handbook of Digital Journalism Studies; Sage: Thousand Oaks, CA, USA, 2016; pp. 113–127. [Google Scholar]
  39. Lasorsa, D.L.; Lewis, S.C.; Holton, A.E. Normalizing Twitter: Journalism practice in an emerging communication space. J. Stud. 2012, 13, 19–36. [Google Scholar] [CrossRef]
  40. Carpenter, S. Source diversity in US online citizen journalism and online newspaper articles. In Proceedings of the International Symposium on Online Journalism, Austin, TX, USA, 5–6 April 2008; Volume 4, pp. 3–28. [Google Scholar]
  41. Flesch, R. A new readability yardstick. J. Appl. Psychol. 1948, 32, 221–233. [Google Scholar] [CrossRef] [PubMed]
  42. Quandt, T. (No) News on the World Wide Web?: A Comparative Content Analysis of Online News in Europe and the United States. Journal. Stud. 2008, 9, 717–738. [Google Scholar] [CrossRef]
  43. Harcup, T. Journalism: Principles and Practice; Sage: Thousand Oaks, CA, USA, 2021. [Google Scholar]
  44. Bogart, L. Press and Public: Who Reads What, When, Where, and Why in American Newspapers; Routledge: London, UK, 1989. [Google Scholar]
  45. Horne, B.; Adali, S. This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. In Proceedings of the International AAAI Conference on Web and Social Media, Montréal, QC, Canada, 15–18 May 2017; Volume 11, pp. 759–766. [Google Scholar]
  46. Cohen, J. Defining identification: A theoretical look at the identification of audiences with media characters. In Advances in Foundational Mass Communication Theories; Routledge: London, UK, 2018; pp. 253–272. [Google Scholar]
  47. Chakraborty, A.; Paranjape, B.; Kakarla, S.; Ganguly, N. Stop clickbait: Detecting and preventing clickbaits in online news media. In Proceedings of the 2016 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), San Francisco, CA, USA, 18–21 August 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 9–16. [Google Scholar]
  48. Sparks, C. Introduction: Tabloidization and the media. Javn. Public 1998, 5, 5–10. [Google Scholar] [CrossRef]
  49. Peters, C. Emotion aside or emotional side? Crafting an ‘experience of involvement’ in the news. Journalism 2011, 12, 297–316. [Google Scholar] [CrossRef]
  50. Pantti, M. The value of emotion: An examination of television journalists’ notions on emotionality. Eur. J. Commun. 2010, 25, 168–181. [Google Scholar] [CrossRef]
  51. Harrington, S. Popular news in the 21st century: Time for a new critical approach? Journalism 2008, 9, 266–284. [Google Scholar] [CrossRef]
  52. Shoemaker, P.J.; Cohen, A.A. News Around the World: Content, Practitioners, and the Public; Routledge: London, UK, 2012. [Google Scholar]
  53. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.I. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67. [Google Scholar] [CrossRef] [PubMed]
  54. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  55. Tzimokas, D.; Mattheoudaki, M. Readability indicators: Issues of application and reliability. In Major Trends in Theoretical and Applied Linguistics Volume 3; Versita: Basingstoke, UK, 2014; pp. 367–384. [Google Scholar]
  56. Parr, T.; Howard, J. The Mechanics of Machine Learning. 2018. Available online: https://mlbook.explained.ai/ (accessed on 30 April 2025).
  57. Park, C.S.; Kaye, B.K. Applying news values theory to liking, commenting and sharing mainstream news articles on Facebook. Journalism 2021, 24, 633–653. [Google Scholar] [CrossRef]
  58. Preece, J.; Sharp, H.; Rogers, Y. Interaction Design: Beyond Human–Computer Interaction; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
  59. Abras, C.; Maloney-Krichmar, D.; Preece, J. User-centered design. In Encyclopedia of Human-Computer Interaction; Bainbridge, W., Ed.; Sage Publication: Thousand Oaks, CA, USA, 2004; Volume 37, pp. 445–456. [Google Scholar]
  60. Sauro, J. Measuring Usability with the System Usability Scale (SUS). 2011. Available online: https://measuringu.com/sus/ (accessed on 30 April 2025).
  61. Reichheld, F.F. The one number you need to grow. Harv. Bus. Rev. 2003, 81, 46–55. [Google Scholar]
  62. Sauro, J. Does Better Usability Increase Customer Loyalty? 2010. Available online: https://measuringu.com/usability-loyalty/ (accessed on 30 April 2025).
  63. Sauro, J. The challenges and opportunities of measuring the user experience. J. Usability Stud. 2016, 12, 1–7. [Google Scholar]
Figure 1. The importance of XGBoost features using the ELI5 library.
Figure 2. User-centered design process.
Figure 3. IQJournalism interactive prototype.
Figure 4. Architecture overview of the system.
Figure 5. The main screen of the application with the sidebars enabled.
Figure 6. Task mean completion time per group.
Table 1. Methodological phases.

Phase A: Qualitative Research
Purpose: To identify key characteristics and indicators of perceived quality in journalism and factors that influence engagement. This phase aimed to understand how journalistic discourse is structured within a framework of reliability, quality, and impartiality.
Activities: 20 semi-structured, in-depth interviews with experts in Greece were conducted and analyzed using thematic analysis.
Outcomes: Expert interviews revealed key themes of journalistic quality, including credibility, source diversity, language quality, text characteristics, impartiality, headline importance, emotionality, and the role of audiovisual material, which guided the hypotheses (H1, H2, and H3) and served as the basis for developing perceived quality indicators.
Link to the next phase: The insights and identified quality indicators from the experts’ interviews served as input data to train the machine learning model in Phase B.

Phase B: Computational Model Development and Training
Purpose: To translate expert insights into a measurable framework and develop AI models to predict article quality and social media engagement, validating the theoretical model and identifying key predictive features.
Activities: Building on the insights from Phase A, this phase involved a quantitative and computational approach:
- Data Collection and Preprocessing;
- Feature Extraction;
- Model Development for Quality Prediction;
- Model Development for Social Media Engagement Prediction.
Outcomes: Trained machine learning models for quality classification, with XGBoost and Random Forest performing best (F1 score: 0.85), highlighting key predictors such as readability and emotionality. Trained machine learning models for engagement prediction, with XGBoost performing best for predicting Likes (F1 score: 0.91). These outcomes supported research hypothesis H4 and validated the effectiveness of the proposed theoretical framework.
Link to the next phase: The predictive models from Phase B were integrated into the IQJournalism platform in Phase C, forming a core component of the fully functional system.

Phase C: User-Centered Design
Purpose: To develop a user-friendly intelligent advisor that meets the specific needs and expectations of journalists and editors.
Activities: This phase employed a user-centered iterative design process:
- User Research;
- Prototyping;
- Testing;
- Redesigning;
- Implementation.
Outcomes: Initial and interactive prototypes were developed and positively evaluated, with high scores in user experience (UEQ), usability (SUS), and recommendation likelihood (NPS), supporting hypotheses H5, H6, H7, and H8. Task performance analysis confirmed efficient user interaction, validating H9. User feedback highlighted strengths and improvement areas, leading to a fully operational IQJournalism system integrating AI models into a text editor interface.
Link to the overall project goal: This phase represents the realization of the IQJournalism platform as a practical tool, integrating the theoretical insights (Phase A) and the predictive capabilities (Phase B) into a user-friendly application.
Table 2. Creation of the features.

Impartiality
Subjectivity: A subjectivity lexicon, containing a list of subjectivity clues, is used to assess weak and strong subjectivity within texts. The subjectivity score is calculated by dividing the number of detected clues by the total word count [33].
Self-disclosure: This metric is calculated as the ratio of self-reflective pronouns to the overall number of words in the text [38,39].
Diversity of sources: Source diversity is calculated by using part-of-speech tags to separate entities and fragments and then counting the number of unique sources presented in a news story [40].

Language Quality
Readability of text: Flesch Reading Ease Score, ranging from 0 to 100; the higher the score, the easier the text is to read [41].
Article length: The total number of words excluding stopwords [42].
Adjectives: Total count of adjectives divided by the number of words in the text [43].
Typographical errors: The Autocorrect 1 Python library corrects mistakes in the text; the original and corrected sequences are then compared to calculate the difference.
Numbers: Total count of numbers [43].
Images: The ratio of illustrations to text [44].
Headline words: The total number of words in the title [45].

Entertainment
Four lists were created: (a) sensual words 2, (b) animals 3, (c) crime 4,5, and (d) celebrities. For the latter, the names of famous Greek celebrities were included. All the counts were used as separate features [43,46].

Emotionality
Emotional headlines: The Textblob 6 sentiment analysis library works effectively with short texts and is capable of identifying clickbait headlines by detecting highly negative or positive language, with sentiment scores ranging from −1 to 1 [47].
Emotions: The NRC Affect Intensity Lexicon contains 6,000 words tagged with an intensity label for each emotion using crowdsourcing [5,17,48,49,50,51]. The lexicon measures intensity scores for anger, fear, sadness, and joy, based on theories of emotion [36]. In addition, the older version of the NRC EmoLex was used for the emotions of trust, surprise, anticipation, and disgust [35].
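As an illustration of how measurements such as those in Table 2 can be turned into numeric features, the sketch below computes a few of them with spaCy’s Greek pipeline (el_core_news_sm). The subjectivity-clue excerpt and pronoun list are hypothetical placeholders; the actual IQJournalism pipeline may use different lexicons and preprocessing.

```python
# Requires: pip install spacy && python -m spacy download el_core_news_sm
import spacy

nlp = spacy.load("el_core_news_sm")                 # Greek tokenizer and POS tagger
SUBJECTIVITY_CLUES = {"πιστεύω", "δυστυχώς", "προφανώς"}   # hypothetical excerpt of an MPQA-style list
SELF_PRONOUNS = {"εγώ", "μου", "εμένα"}                    # illustrative first-person forms

def extract_features(text: str) -> dict:
    doc = nlp(text)
    tokens = [t for t in doc if t.is_alpha]
    n = max(len(tokens), 1)
    return {
        # Subjectivity: detected clues divided by the total word count
        "subjectivity": sum(t.lower_ in SUBJECTIVITY_CLUES for t in tokens) / n,
        # Self-disclosure: self-reflective pronouns over all words
        "self_disclosure": sum(t.lower_ in SELF_PRONOUNS for t in tokens) / n,
        # Adjectives: adjective count divided by the number of words
        "adjective_ratio": sum(t.pos_ == "ADJ" for t in tokens) / n,
        # Article length: words excluding stopwords
        "article_length": sum(not t.is_stop for t in tokens),
        # Numbers: total count of numeric tokens
        "numbers": sum(t.like_num for t in doc),
    }
```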
Table 3. Creation of the features.

Impartiality
Subjectivity: With the subjectivity lexicon, which comprises a set of subjectivity clues, three features were created for the title, the Facebook headline, and the body of the article [33].

Language Quality
Diversity of sources: The total number of unique sources referenced within a news item [40].
Readability of text: Flesch Reading Ease Score [41], ranging from 0 to 100; the higher the score, the easier the text is to read. For the calculation, the py-readability-metrics package was used.
Article length: The total number of words excluding stopwords [42].

Entertainment
Sensual words: A set of sensual words was used, and the total number in each text was counted.
Animals: A list of animal names was created to count how many different animals appear in the story.
Crime: Common words used to describe crime were counted [52].
Celebrities: The list includes the most influential people in Greece, including actors, TV presenters, singers, politicians, and other famous figures from Greek show business.

Emotionality
Emotional headlines: The Textblob package was used to detect highly negative or positive article titles.
Emotions: The NRC EmoLex and Emotion Intensity Lexicon were used to capture the emotions of anger, fear, sadness, and joy. The NRC VAD Lexicon 1 was used to capture the dimensions of Valence, Arousal, and Dominance.

1 http://saifmohammad.com/WebPages/nrc-vad.html (accessed on 29 April 2025).
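The emotionality measurements in Table 3 can be approximated as in the following sketch, which assumes a local copy of the NRC Emotion Intensity Lexicon (the file name below is a placeholder, and the tab-separated word/emotion/score format is simplified) and uses TextBlob for headline polarity as described above; the NRC VAD lexicon would be handled analogously.

```python
from collections import defaultdict
from textblob import TextBlob

def load_intensity_lexicon(path: str) -> dict:
    """Load a word -> {emotion: score} mapping from a tab-separated lexicon file."""
    lexicon = defaultdict(dict)
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            word, emotion, score = line.rstrip("\n").split("\t")
            lexicon[word][emotion] = float(score)
    return lexicon

# Placeholder path; the lexicon itself is distributed via saifmohammad.com.
intensity = load_intensity_lexicon("NRC-Emotion-Intensity-Lexicon.txt")

def emotional_headline(title: str) -> float:
    # TextBlob polarity in [-1, 1]; strongly negative or positive titles are
    # treated as potential clickbait, as described in the table above.
    return TextBlob(title).sentiment.polarity

def emotion_intensity(tokens: list[str], emotion: str) -> float:
    # Average intensity of a given emotion (e.g., "anger", "joy") over the text.
    scores = [intensity.get(t, {}).get(emotion, 0.0) for t in tokens]
    return sum(scores) / max(len(tokens), 1)
```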
Table 4. Accuracy scores of the classification algorithms.

Model                     Accuracy (F1 Score)
Logistic Regression       0.81
Naive Bayes               0.81
Support Vector Machine    0.82
K-Nearest Neighbors       0.80
Decision Tree             0.81
Random Forest             0.85
XGBoost                   0.85
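A hedged sketch of the comparison summarized in Table 4 is given below. Synthetic data stands in for the Phase B feature matrix and quality labels so the snippet is runnable; the hyperparameters, scaling, five-fold cross-validation, and F1 averaging are illustrative rather than the exact configuration used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Synthetic stand-in: in the study, X would hold the Table 2/3 features and
# y the expert-derived quality labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Naive Bayes": GaussianNB(),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=0),
}

for name, model in models.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    print(f"{name:<24} F1 = {f1:.2f}")
```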
Table 5. Rules extracted from the final large leaves of the Decision Tree, applicable to high-quality articles. Each condition is followed by its plain-language explanation in parentheses.

Rule 1
Readability > 12.98 (difficult text)
Adjectives > 16.61 (more than 16% of the words are adjectives)
Positive > 3.38 (positive words in the text)
Mean word length < 6.05 (average word length below 6 characters)

Rule 2
Readability > 12.89 (difficult text)
Adjectives > 16.19 (more than 16% of the words are adjectives)
Positive ≤ 3.38 (not many positive words in the text)
Length ≤ 335 (length of at most 335 words)
Mean word length < 6.12 (average word length below 6 characters)

Rule 3
Readability > 14.55 (very difficult text)
Adjectives ≤ 16.19 (less than 16% of the words are adjectives)
No Celebs > 0 (presence of celebrities)

Rule 4
Readability > 17.53 (very difficult text)
Adjectives ≤ 16.19 (less than 16% of the words are adjectives)
No Celebs = 0 (no celebrities)
Crime > 0 (report of a crime, accident, or dispute)

Rule 5
Readability ≤ 12.89 (not a very difficult text)
Joy intensity < 0.21 (less than 21% of words express joy)
Length ≤ 243 (length of at most 243 words)
Adjectives > 14.78 (more than 15% of the words are adjectives)
Positive > 1.19 (positive words in the text)

Rule 6
Readability ≤ 12.89 (not a very difficult text)
Joy intensity < 0.21 (less than 21% of words express joy)
Length ≤ 243 (length of at most 243 words)
Anger > 1.18 (rate of anger)
Trust > 2.44 (rate of trust)
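Rules of this kind can be read directly out of a fitted Decision Tree, for example with scikit-learn’s export_text, as in the sketch below. Synthetic data and feature names stand in for the study’s feature matrix, and the tree hyperparameters are illustrative only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the feature matrix; in the study, columns such as
# readability, adjective ratio, and length would come from Tables 2 and 3.
rng = np.random.default_rng(0)
feature_names = ["readability", "adjectives", "positive", "length", "joy_intensity"]
X = rng.random((1000, len(feature_names)))
y = (X[:, 0] > 0.5).astype(int)          # toy "high quality" label

tree = DecisionTreeClassifier(max_depth=5, min_samples_leaf=50, random_state=0)
tree.fit(X, y)

# export_text prints every decision path; paths ending in leaves dominated by
# the high-quality class correspond to rules like those in Table 5.
print(export_text(tree, feature_names=feature_names))
```

Setting min_samples_leaf keeps the leaves large, which is what makes the extracted paths readable as general rules rather than descriptions of individual articles.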
Table 6. Importance of the top ten features.

Comments (F1 score: 0.864; samples: 7161)
Weight             Feature
0.0951 ± 0.0104    Title words
0.0262 ± 0.0135    Readability
0.0255 ± 0.0087    No Celebs
0.0161 ± 0.0058    Length
0.0124 ± 0.0054    Diversity
0.0085 ± 0.0054    Subjectivity head.
0.0069 ± 0.0091    Anticipation
0.0056 ± 0.0032    Dominance
0.0029 ± 0.0030    Arousal
0.0018 ± 0.0019    Positivity

Shares (F1 score: 0.847; samples: 7199)
Weight             Feature
0.1011 ± 0.0155    Title words
0.0549 ± 0.0066    Readability
0.0372 ± 0.0100    Length
0.0147 ± 0.0055    Anticipation
0.0146 ± 0.0088    Diversity
0.0134 ± 0.0074    Arousal
0.0132 ± 0.0073    Subj. text
0.0105 ± 0.0035    Valence
0.0066 ± 0.0035    Subjectivity head.
0.0065 ± 0.0048    Dominance

Likes (F1 score: 0.913; samples: 7249)
Weight             Feature
0.0934 ± 0.0120    Title words
0.0888 ± 0.0138    Length
0.0302 ± 0.0092    Readability
0.0234 ± 0.0062    Dominance
0.0199 ± 0.0047    Subjectivity head.
0.0194 ± 0.0114    No Celebs
0.0157 ± 0.0032    Anticipation
0.0092 ± 0.0019    Arousal
0.0077 ± 0.0027    Diversity
0.0051 ± 0.0042    Fear
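Weight ± spread values of this kind are the typical output of permutation-based feature importance in the ELI5 library referenced in Figure 1. The sketch below illustrates that approach under stated assumptions (the paper does not specify the exact ELI5 routine used): synthetic pandas data stands in for the Table 3 features, and the engagement counts are assumed to have been binned into classes, one column per metric.

```python
import numpy as np
import pandas as pd
from eli5.sklearn import PermutationImportance
from xgboost import XGBClassifier

# Synthetic stand-in: in the study, the columns would be the Table 3 features
# and each target a binned engagement count (Comments, Shares, Likes).
rng = np.random.default_rng(0)
features = ["Title words", "Readability", "Length", "Diversity", "Dominance"]
X = pd.DataFrame(rng.random((2000, len(features))), columns=features)
y = pd.DataFrame({t: (rng.random(2000) > 0.5).astype(int)
                  for t in ("Comments", "Shares", "Likes")})
X_train, X_test = X.iloc[:1500], X.iloc[1500:]
y_train, y_test = y.iloc[:1500], y.iloc[1500:]

for target in ("Comments", "Shares", "Likes"):
    model = XGBClassifier(eval_metric="logloss", random_state=0)
    model.fit(X_train, y_train[target])

    # Permutation importance on held-out data gives a mean weight and a spread
    # per feature, analogous to the "weight ± std" entries in Table 6.
    perm = PermutationImportance(model, n_iter=10, random_state=0)
    perm.fit(X_test, y_test[target])

    print(f"\n{target}")
    ranked = sorted(zip(X.columns, perm.feature_importances_, perm.feature_importances_std_),
                    key=lambda row: row[1], reverse=True)
    for name, mean, std in ranked:
        print(f"  {mean:.4f} ± {std:.4f}  {name}")
```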
