MDPI - Publisher of Open Access Journals

22 pages, 493 KiB

Open Access Article

Improving Performance of Automatic Keyword Extraction (AKE) Methods Using PoS Tagging and Enhanced Semantic-Awareness

by Enes Altuncu, Jason R. C. Nurse, Yang Xu, Jie Guo and Shujun Li

Information 2025, 16(7), 601; https://doi.org/10.3390/info16070601 - 13 Jul 2025

Viewed by 306

Automatic keyword extraction (AKE) has gained more importance with the increasing amount of digital textual data that modern computing systems process. It has various applications in information retrieval (IR) and natural language processing (NLP), including text summarisation, topic analysis and document indexing. This paper proposes a simple but effective post-processing-based universal approach to improving the performance of any AKE method, via an enhanced level of semantic-awareness supported by PoS tagging. To demonstrate the performance of the proposed approach, we considered word types retrieved from a PoS tagging step and two representative sources of semantic information: specialised terms defined in one or more context-dependent thesauri, and named entities in Wikipedia. The above three steps can simply be added to the end of any AKE method as part of a post-processor, which re-evaluates all candidate keywords following some context-specific and semantic-aware criteria. For five state-of-the-art (SOTA) AKE methods, our experimental results with 17 selected datasets showed that the proposed approach improved their performance both consistently (up to 100% in terms of improved cases) and significantly (between 10.2% and 53.8%, with an average of 25.8%, in terms of F1-score and across all five methods), especially when all three enhancement steps are used. Our results have profound implications, given that the proposed approach can be easily applied to any AKE method with the standard output (candidate keywords and scores) and can be readily extended further. Full article
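
As a rough illustration of the kind of post-processor this abstract describes, the Python sketch below re-evaluates (candidate, score) pairs produced by an arbitrary AKE method. It is not the authors' implementation: the function name `post_process`, the noun-headed PoS filter, the `boost` factor, and the toy thesaurus are all assumptions made for the example, and it assumes that a higher score means a better keyword.

```python
from typing import Dict, List, Set, Tuple

# Assumption: only candidates whose last token is noun-like are kept.
# The tag names follow the Universal Dependencies UPOS convention.
ALLOWED_HEAD_TAGS = {"NOUN", "PROPN"}


def post_process(
    candidates: List[Tuple[str, float]],
    pos_tags: Dict[str, List[str]],
    thesaurus: Set[str],
    boost: float = 1.5,
) -> List[Tuple[str, float]]:
    """Filter candidates by PoS and boost those found in the thesaurus.

    All parameters are hypothetical: `pos_tags` maps each candidate phrase
    to the tag sequence of its tokens, and `thesaurus` holds lower-cased
    domain terms.
    """
    rescored = []
    for phrase, score in candidates:
        tags = pos_tags.get(phrase, [])
        # Drop candidates with no tags or with a non-noun final token.
        if not tags or tags[-1] not in ALLOWED_HEAD_TAGS:
            continue
        # Reward candidates that match a known specialised term.
        if phrase.lower() in thesaurus:
            score *= boost
        rescored.append((phrase, score))
    # Assumption: higher score means a better keyword.
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```

Because the sketch only needs candidate phrases and scores, it could in principle sit behind any extractor with that output shape; methods that rank by ascending score would need the sort and the boost inverted.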

(This article belongs to the Special Issue Information Extraction and Language Discourse Processing)

29 pages, 2368 KiB

Open Access Article

Chinese “Dialects” and European “Languages”: A Comparison of Lexico-Phonetic and Syntactic Distances

by Chaoju Tang, Vincent J. van Heuven, Wilbert Heeringa and Charlotte Gooskens

Languages 2025, 10(6), 127; https://doi.org/10.3390/languages10060127 - 29 May 2025

Viewed by 3002

Abstract

In this article, we tested some specific claims made in the literature on relative distances among European languages and among Chinese dialects, suggesting that some language varieties within the Sinitic family traditionally called dialects are, in fact, more linguistically distant from one another than some European varieties that are traditionally called languages. More generally, we examined whether distances among varieties within and across European language families were larger than those within and across Sinitic language varieties. To this end, we computed lexico-phonetic as well as syntactic distance measures for comparable language materials in six Germanic, five Romance and six Slavic languages, as well as for six Mandarin and nine non-Mandarin (‘southern’) Chinese varieties. Lexico-phonetic distances were expressed as the length-normalized MPI-weighted Levenshtein distances computed on the 100 most frequently used nouns in the 32 language varieties. Syntactic distance was implemented as the (complement of) the Pearson correlation coefficient found for the PoS trigram frequencies established for a parallel corpus of the same four texts translated into each of the 32 languages. The lexico-phonetic distances proved to be relatively large and of approximately equal magnitude in the Germanic, Slavic and non-Mandarin Chinese language varieties. However, the lexico-phonetic distances among the Romance and Mandarin languages were considerably smaller, but of similar magnitude. Cantonese (Guangzhou dialect) was lexico-phonetically as distant from Standard Mandarin (Beijing dialect) as European language pairs such as Portuguese–Italian, Portuguese–Romanian and Dutch–German. 
Syntactically, however, the differences among the Sinitic varieties were about ten times smaller than the differences among the European languages, both within and across the families—which provides some justification for the Chinese tradition of calling the Sinitic varieties dialects of the same language. Full article
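
The two distance measures named in this abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: the edit distance here uses unit costs rather than the weighting the authors describe, the normalization divides by the longer sequence length (one common choice), and the function names are hypothetical.

```python
from math import sqrt
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Unweighted edit distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            # Deletion, insertion, substitution (or match).
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def pearson_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Complement of Pearson's r for two equally long frequency vectors.

    Assumes both vectors have non-zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    return 1.0 - cov / (sx * sy)
```

In this sketch the inputs to `pearson_distance` would be the frequencies of the same PoS trigrams in two language varieties, listed in the same order.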

(This article belongs to the Special Issue Dialectal Dynamics)

31 pages, 5323 KiB

Open Access Article

Learning the Style via Mixed SN-Grams: An Evaluation in Authorship Attribution

by Juan Pablo Francisco Posadas-Durán, Germán Ríos-Toledo, Erick Velázquez-Lozada, J. A. de Jesús Osuna-Coutiño, Madaín Pérez-Patricio and Fernando Pech May

AI 2025, 6(5), 104; https://doi.org/10.3390/ai6050104 - 20 May 2025

Viewed by 1003

Abstract

This study addresses the problem of authorship attribution with a novel method for modeling writing style using dependency tree subtree parsing. This method exploits the syntactic information of sentences using mixed syntactic n-grams (mixed sn-grams). The method comprises an algorithm to generate mixed sn-grams by integrating words, POS tags, and dependency relation tags. The mixed sn-grams are used as style markers to feed Machine Learning methods such as a SVM. A comparative analysis was performed to evaluate the performance of the proposed mixed sn-grams method against homogeneous sn-grams with the PAN-CLEF 2012 and CCAT50 datasets. Experiments with PAN 2012 showed the potential of mixed sn-grams to model a writing style by outperforming homogeneous sn-grams. On the other hand, experiments with CCAT50 showed that training with mixed sn-grams improves accuracy over homogeneous sn-grams, with the POS-Word category showing the best result. The study’s results suggest that mixed sn-grams constitute effective stylistic markers for building a reliable writing style model, which machine learning algorithms can learn. Full article

25 pages, 512 KiB

Open Access Systematic Review

Artificial Intelligence Applied to the Analysis of Biblical Scriptures: A Systematic Review

by Bruno Cesar Lima, Nizam Omar, Israel Avansi and Leandro Nunes de Castro

Analytics 2025, 4(2), 13; https://doi.org/10.3390/analytics4020013 - 11 Apr 2025

Viewed by 2369

Abstract

The Holy Bible is the most read book in the world, originally written in Aramaic, Hebrew, and Greek over a time span in the order of centuries by many people, and formed by a combination of various literary styles, such as stories, prophecies, poetry, instructions, and others. As such, the Bible is a complex text to be analyzed by humans and machines. This paper provides a systematic survey of the application of Artificial Intelligence (AI) and some of its subareas to the analysis of the Biblical scriptures. Emphasis is given to what types of tasks are being solved, what are the main AI algorithms used, and their limitations. The findings deliver a general perspective on how this field is being developed, along with its limitations and gaps. This research follows a procedure based on three steps: planning (defining the review protocol), conducting (performing the survey), and reporting (formatting the report). The results obtained show there are seven main tasks solved by AI in the Bible analysis: machine translation, authorship identification, part of speech tagging (PoS tagging), semantic annotation, clustering, categorization, and Biblical interpretation. Also, the classes of AI techniques with better performance when applied to Biblical text research are machine learning, neural networks, and deep learning. The main challenges in the field involve the nature and style of the language used in the Bible, among others. Full article

14 pages, 1010 KiB

Open Access Article

Games with a Purpose for Part-of-Speech Tagging and the Impact of the Applied Game Design Elements on Player Enjoyment and Games with a Purpose Preference

by Rosa Lilia Segundo Díaz, Gustavo Rovelo Ruiz, Miriam Bouzouita, Véronique Hoste and Karin Coninx

Appl. Sci. 2025, 15(7), 3561; https://doi.org/10.3390/app15073561 - 25 Mar 2025

Viewed by 350

Abstract

Linguistic tasks such as Part-of-Speech (PoS) tagging can be tedious, but are crucial for the development of Natural Language Processing (NLP) tools. Games With A Purpose (GWAPs) aim to reduce the monotony of the task for native speakers and non-experts who contribute to crowdsourcing projects. This study focuses on revising and correcting PoS tags in the Corpus Oral y Sonoro del Español Rural (COSER), the largest collection of oral data in the Spanish-speaking world, to create a parsed corpus of European Spanish dialects. It also examines how game design elements (GDEs) affect players’ enjoyment. Three games—Agentes, Tesoros, and Anotatlón—were developed, incorporating different GDEs, such as rewards and challenges. The results show two levels of enjoyment: at the concept level with Anotatlón, and at the level of individual GDEs with Tesoros. This suggests that certain GDEs influence player enjoyment and, consequently, their preference for certain games. However, the study also shows the complexity of evaluating triggers for player enjoyment in games with more than one implemented GDE. Full article

(This article belongs to the Special Issue Innovative Horizons: Exploring the Convergence of Gamification and Virtual Reality)

15 pages, 1562 KiB

Open Access Article

A Rewired NADPH-Dependent Redox Shuttle for Testing Peroxisomal Compartmentalization of Synthetic Metabolic Pathways in Komagataella phaffii

by Albert Fina, Sílvia Àvila-Cabré, Enrique Vázquez-Pereira, Joan Albiol and Pau Ferrer

Microorganisms 2025, 13(1), 46; https://doi.org/10.3390/microorganisms13010046 - 30 Dec 2024

Cited by 1 | Viewed by 1027

Abstract

The introduction of heterologous pathways into microbial cell compartments offers several potential advantages, including increasing enzyme concentrations and reducing competition with native pathways, making this approach attractive for producing complex metabolites like fatty acids and fatty alcohols. However, measuring subcellular concentrations of these metabolites remains technically challenging. Here, we explored 3-hydroxypropionic acid (3-HP), readily quantifiable and sharing the same precursors (acetyl-CoA, NADPH, and ATP) with the above-mentioned products, as a reporter metabolite for peroxisomal engineering in the yeast Komagataella phaffii. To this end, the malonyl-CoA reductase pathway for 3-HP production was targeted into the peroxisome of K. phaffii using the PTS1-tagging system, and further tested with different carbon sources. Thereafter, we used compartmentalized 3-HP production as a reporter system to showcase the impact of different strategies aimed at enhancing the peroxisomal NADPH pool. Co-overexpression of genes encoding an NADPH-dependent redox shuttle from Saccharomyces cerevisiae (IDP2/IDP3) significantly increased 3-HP yields across all substrates, whereas peroxisomal targeting of the S. cerevisiae NADH kinase Pos5 failed to improve 3-HP production. This study highlights the potential of using peroxisomal 3-HP production as a biosensor for evaluating peroxisomal acetyl-CoA and NADPH availability by simply quantifying 3-HP, demonstrating its potential for peroxisome-based metabolic engineering in yeast. Full article

(This article belongs to the Section Microbial Biotechnology)

21 pages, 1053 KiB

Open Access Article

Assessment of the Stability and Nutritional Quality of Hemp Oil and Pumpkin Seed Oil Blends

by Marta Siol, Natalia Chołuj, Diana Mańko-Jurkowska and Joanna Bryś

Foods 2024, 13(23), 3813; https://doi.org/10.3390/foods13233813 - 26 Nov 2024

Cited by 4 | Viewed by 1622

Abstract

This study characterized the quality of hemp oil (HO) and pumpkin seed oil (PO) and their blends before and after 2 and 4 months of storage at refrigerated and room temperature, without access to light and oxygen. The analyses included determining the acid value, peroxide value, fatty acid (FA) composition, and FA distribution in triacylglycerol (TAG) molecules. Pressure differential scanning calorimetry (PDSC) was used to assess the oxidative stability of oils and their blends. This study also evaluated the nutritional potential of hemp oil and pumpkin seed oil blends, as atherogenicity, thrombogenicity, and health-promoting indices and hypocholesterolaemic/hypercholesterolaemic ratio were calculated. The tested samples differed in properties depending on the storage time and temperature. The optimal choice was a blend of 50% hemp oil (HO) and 50% pumpkin oil (PO). This mixture demonstrated the desired fatty acid composition, satisfactory acid and peroxide values, and a relatively good oxidation induction time during storage. Despite the unfavorable distribution of FAs in TAG molecules, it was characterized by a balanced ratio of n-3 to n-6 acids. It was also concluded that research on HO and PO mixtures should be continued due to the potential synergistic effect of their bioactive substances. Full article

(This article belongs to the Special Issue Food Lipids: Chemistry, Nutrition and Biotechnology—2nd Edition)

34 pages, 4479 KiB

Open Access Article

Development of a Children’s Educational Dictionary for a Low-Resource Language Using AI Tools

by Diana Rakhimova, Aidana Karibayeva, Vladislav Karyukin, Assem Turarbek, Zhansaya Duisenbekkyzy and Rashid Aliyev

Computers 2024, 13(10), 253; https://doi.org/10.3390/computers13100253 - 2 Oct 2024

Cited by 5 | Viewed by 2664

Abstract

Today, various interactive tools or partially available artificial intelligence applications are actively used in educational processes to solve multiple problems for resource-rich languages, such as English, Spanish, French, etc. Unfortunately, the situation is different and more complex for low-resource languages, like Kazakh, Uzbek, Mongolian, and others, due to the lack of qualitative and accessible resources, morphological complexity, and the semantics of agglutinative languages. This article presents research on early childhood learning resources for the low-resource Kazakh language. Generally, a dictionary for children differs from classical educational dictionaries. The difference between dictionaries for children and adults lies in their purpose and methods of presenting information. A themed dictionary will make learning and remembering new words easier for children because they will be presented in a specific context. This article discusses developing an approach to creating a thematic children’s dictionary of the low-resource Kazakh language using artificial intelligence. The proposed approach is based on several important stages: the initial formation of a list of English words with the use of ChatGPT; identification of their semantic weights; generation of phrases and sentences with the use of the list of semantically related words; translation of obtained phrases and sentences from English to Kazakh, dividing them into bigrams and trigrams; and processing with Kazakh language POS pattern tag templates to adapt them for children. When the dictionary was formed, the semantic proximity of words and phrases to the given theme and age restrictions for children were taken into account. The formed dictionary phrases were evaluated using the cosine similarity, Euclidean similarity, and Manhattan distance metrics. 
Moreover, the dictionary was extended with video and audio data by implementing models like DALL-E 3, Midjourney, and Stable Diffusion to illustrate the dictionary data and TTS (Text to Speech) technology for the Kazakh language for voice synthesis. The developed thematic dictionary approach was tested, and a SUS (System Usability Scale) assessment of the application was conducted. The experimental results demonstrate the proposed approach’s high efficiency and its potential for wide use in educational purposes. Full article
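
The three metrics this abstract uses to evaluate dictionary phrases have standard definitions, sketched below in plain Python. How the phrases are turned into vectors is not stated in the abstract, so the inputs here are assumed to be equal-length numeric embeddings, and the function names are illustrative only.

```python
from math import sqrt
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line (L2) distance between two vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

Cosine similarity grows as vectors become more alike, whereas the two distances shrink, so the three values are not directly comparable without rescaling.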

(This article belongs to the Special Issue Smart Learning Environments)

19 pages, 2296 KiB

Open Access Article

A Hybrid Approach to Ontology Construction for the Badini Kurdish Language

by Media Azzat, Karwan Jacksi and Ismael Ali

Information 2024, 15(9), 578; https://doi.org/10.3390/info15090578 - 19 Sep 2024

Viewed by 2445

Abstract

Semantic ontologies have been widely utilized as crucial tools within natural language processing, underpinning applications such as knowledge extraction, question answering, machine translation, text comprehension, information retrieval, and text summarization. While the Kurdish language, a low-resource language, has been the subject of some ontological research in other dialects, a semantic web ontology for the Badini dialect remains conspicuously absent. This paper addresses this gap by presenting a methodology for constructing and utilizing a semantic web ontology for the Badini dialect of the Kurdish language. A Badini annotated corpus (UOZBDN) was created and manually annotated with part-of-speech (POS) tags. Subsequently, an HMM-based POS tagger model was developed using the UOZBDN corpus and applied to annotate additional text for ontology extraction. Ontology extraction was performed by employing predefined rules to identify nouns and verbs from the model-annotated corpus and subsequently forming semantic predicates. Robust methodologies were adopted for ontology development, resulting in a high degree of precision. The POS tagging model attained an accuracy of 95.04% when applied to the UOZBDN corpus. Furthermore, a manual evaluation conducted by Badini Kurdish language experts yielded a 97.42% accuracy rate for the extracted ontology. Full article
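
HMM-based POS taggers such as the one mentioned in this abstract are commonly decoded with the Viterbi algorithm. The Python sketch below shows that decoding step on a toy model. It is a generic illustration, not the authors' tagger: the probability tables, the `unk` fallback for unseen words, and the English example words are all assumptions, and a non-empty input sentence is assumed.

```python
from typing import Dict, List


def viterbi(
    words: List[str],
    tags: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
    unk: float = 1e-6,
) -> List[str]:
    """Return the most probable tag sequence for a non-empty word list.

    Probabilities are multiplied directly for readability; a practical
    tagger would usually work in log space to avoid underflow.
    """
    # best[i][t]: probability of the best path that ends in tag t at word i.
    best = [{t: start_p[t] * emit_p[t].get(words[0], unk) for t in tags}]
    back: List[Dict[str, str]] = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the predecessor tag that maximizes the path probability.
            prev = max(tags, key=lambda s: best[i - 1][s] * trans_p[s][t])
            best[i][t] = (
                best[i - 1][prev] * trans_p[prev][t] * emit_p[t].get(words[i], unk)
            )
            back[i][t] = prev
    # Trace back from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]
```

In a trained tagger, the start, transition, and emission tables would be estimated from counts in a manually annotated corpus.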

(This article belongs to the Special Issue Knowledge Representation and Ontology-Based Data Management)

20 pages, 626 KiB

Open Access Article

Natural Language Processing in Knowledge-Based Support for Operator Assistance

by Fatemeh Besharati Moghaddam, Angel J. Lopez, Stijn De Vuyst and Sidharta Gautama

Appl. Sci. 2024, 14(7), 2766; https://doi.org/10.3390/app14072766 - 26 Mar 2024

Cited by 2 | Viewed by 1713

Abstract

Manufacturing industry faces increasing complexity in the performance of assembly tasks due to escalating demand for complex products with a greater number of variations. Operators require robust assistance systems to enhance productivity, efficiency, and safety. However, existing support services often fall short when operators encounter unstructured open questions and incomplete sentences due to primarily relying on procedural digital work instructions. This draws attention to the need for practical application of natural language processing (NLP) techniques. This study addresses these challenges by introducing a domain-specific dataset tailored to assembly tasks, capturing unique language patterns and linguistic characteristics. We explore strategies to process declarative and imperative sentences, including incomplete ones, effectively. Thorough evaluation of three pre-trained NLP libraries—NLTK, SPACY, and Stanford—is performed to assess their effectiveness in handling assembly-related concepts and ability to address the domain’s distinctive challenges. Our findings demonstrate the efficient performance of these open-source NLP libraries in accurately handling assembly-related concepts. By providing valuable insights, our research contributes to developing intelligent operator assistance systems, bridging the gap between NLP techniques and the assembly domain within manufacturing industry. Full article

9 pages, 1925 KiB

Open Access Proceeding Paper

A New Approach for Carrying Out Sentiment Analysis of Social Media Comments Using Natural Language Processing

by Mritunjay Ranjan, Sanjay Tiwari, Arif Md Sattar and Nisha S. Tatkar

Eng. Proc. 2023, 59(1), 181; https://doi.org/10.3390/engproc2023059181 - 17 Jan 2024

Cited by 5 | Viewed by 6459

Abstract

Business and science are using sentiment analysis to extract and assess subjective information from the web, social media, and other sources using NLP, computational linguistics, text analysis, image processing, audio processing, and video processing. It models polarity, attitudes, and urgency from positive, negative, or neutral inputs. Unstructured data make emotion assessment difficult. Unstructured consumer data allow businesses to market, engage, and connect with consumers on social media. Text data are instantly assessed for user sentiment. Opinion mining identifies a text’s positive, negative, or neutral opinions, attitudes, views, emotions, and sentiments. Text analytics uses machine learning to evaluate “unstructured” natural language text data. These data can help firms make money and decisions. Sentiment analysis shows how individuals feel about things, services, organizations, people, events, themes, and qualities. Reviews, forums, blogs, social media, and other articles use it. DD (data-driven) methods find complicated semantic representations of texts without feature engineering. Data-driven sentiment analysis is three-tiered: document-level sentiment analysis determines polarity and sentiment, aspect-based sentiment analysis assesses document segments for emotion and polarity, and data-driven (DD) sentiment analysis recognizes word polarity and writes positive and negative neutral sentiments. Our innovative method captures sentiments from text comments. The syntactic layer encompasses various processes such as sentence-level normalisation, identification of ambiguities at paragraph boundaries, part-of-speech (POS) tagging, text chunking, and lemmatization. Pragmatics include personality recognition, sarcasm detection, metaphor comprehension, aspect extraction, and polarity detection; semantics include word sense disambiguation, concept extraction, named entity recognition, anaphora resolution, and subjectivity detection. Full article

(This article belongs to the Proceedings of Eng. Proc., 2023, RAiSE-2023)

13 pages, 4697 KiB

Open Access Article

Does Part of Speech Have an Influence on Cyberbullying Detection?

by Jingxiu Huang, Ruofei Ding, Yunxiang Zheng, Xiaomin Wu, Shumin Chen and Xiunan Jin

Analytics 2024, 3(1), 1-13; https://doi.org/10.3390/analytics3010001 - 21 Dec 2023

Cited by 3 | Viewed by 1782

Abstract

With the development of the Internet, the issue of cyberbullying on social media has gained significant attention. Cyberbullying is often expressed in text. Methods of identifying such text via machine learning have been growing, most of which rely on the extraction of part-of-speech (POS) tags to improve their performance. However, the current study only arbitrarily used part-of-speech labels that it considered reasonable, without investigating whether the chosen part-of-speech labels can better enhance the effectiveness of the cyberbullying detection task. In other words, the effectiveness of different part-of-speech labels in the automatic cyberbullying detection task was not proven. This study aimed to investigate the part of speech in statements related to cyberbullying and explore how three classification models (random forest, naïve Bayes, and support vector machine) are sensitive to parts of speech in detecting cyberbullying. We also examined which part-of-speech combinations are most appropriate for the models mentioned above. The results of our experiments showed that the predictive performance of different models differs when using different part-of-speech tags as inputs. Random forest showed the best predictive performance, and naive Bayes and support vector machine followed, respectively. Meanwhile, across the different models, the sensitivity to different part-of-speech tags was consistent, with greater sensitivity shown towards nouns, verbs, and measure words, and lower sensitivity shown towards adjectives and pronouns. We also found that the combination of different parts of speech as inputs had an influence on the predictive performance of the models. This study will help researchers to determine which combination of part-of-speech categories is appropriate to improve the accuracy of cyberbullying detection. Full article

14 pages, 3296 KiB

Open Access Article

Circularized Nanodiscs for Multivalent Mosaic Display of SARS-CoV-2 Spike Protein Antigens

by Moustafa T. Mabrouk, Asmaa A. Zidan, Nihal Aly, Mostafa T. Mohammed, Fadi Ghantous, Michael S. Seaman, Jonathan F. Lovell and Mahmoud L. Nasr

Vaccines 2023, 11(11), 1655; https://doi.org/10.3390/vaccines11111655 - 28 Oct 2023

Cited by 4 | Viewed by 3143

Abstract

The emergence of vaccine-evading SARS-CoV-2 variants urges the need for vaccines that elicit broadly neutralizing antibodies (bnAbs). Here, we assess covalently circularized nanodiscs decorated with recombinant SARS-CoV-2 spike glycoproteins from several variants for eliciting bnAbs with vaccination. Cobalt porphyrin–phospholipid (CoPoP) was incorporated into the nanodisc to allow for anchoring and functional orientation of spike trimers on the nanodisc surface through their His-tag. Monophosphoryl-lipid (MPLA) and QS-21 were incorporated as immunostimulatory adjuvants to enhance vaccine responses. Following optimization of nanodisc assembly, spike proteins were effectively displayed on the surface of the nanodiscs and maintained their conformational capacity for binding with human angiotensin-converting enzyme 2 (hACE2) as verified using electron microscopy and slot blot assay, respectively. Six different formulations were prepared where they contained mono antigens; four from the year 2020 (WT, Beta, Lambda, and Delta) and two from the year 2021 (Omicron BA.1 and BA.2). Additionally, we prepared a mosaic nanodisc displaying the four spike proteins from year 2020. Intramuscular vaccination of CD-1 female mice with the mosaic nanodisc induced antibody responses that not only neutralized matched pseudo-typed viruses, but also neutralized mismatched pseudo-typed viruses corresponding to later variants from year 2021 (Omicron BA.1 and BA.2). Interestingly, sera from mosaic-immunized mice did not effectively inhibit Omicron spike binding to human ACE-2, suggesting that some of the elicited antibodies were directed towards conserved neutralizing epitopes outside the receptor binding domain. Our results show that mosaic nanodisc vaccine displaying spike proteins from 2020 can elicit broadly neutralizing antibodies that can neutralize mismatched viruses from a following year, thus decreasing immune evasion of new emerging variants and enhancing healthcare preparedness. 
Full article

(This article belongs to the Special Issue Advances in the Use of Nanoparticles for Vaccine Platform Development)

19 pages, 3742 KiB

Open Access Article

Analysis of Backchannel Inviting Cues in Dyadic Speech Communication

by Stanislav Ondáš, Eva Kiktová, Matúš Pleva and Jozef Juhár

Electronics 2023, 12(17), 3705; https://doi.org/10.3390/electronics12173705 - 1 Sep 2023

Cited by 1 | Viewed by 2382

Abstract

The paper aims to study speaker and listener behavior in dyadic speech communication. A multimodal (speech and video) corpus of dyadic face-to-face conversations on various topics was created. The corpus was manually labeled on several layers (text transcription, backchannel modality and function, POS tags, prosody, and gaze). The statistical analysis was done on the proposed corpus. We focused on backchannel inviting cues on the speaker side and backchannels on the listener side and their patterns. We aimed to study interlocutor backchannel behavior and backchannel-related signals. The results of the analysis show similar patterns in the case of backchannel inviting cues between Slovak and English data and highlight the importance of gaze direction in a face-to-face speech communication scenario. The described corpus and results of the analysis are one of the first steps leading towards natural artificial intelligence-driven human–computer speech conversation. Full article

(This article belongs to the Special Issue Human Computer Interaction in Intelligent System)

17 pages, 892 KiB

Open Access Article

Part-of-Speech Tags Guide Low-Resource Machine Translation

by Zaokere Kadeer, Nian Yi and Aishan Wumaier

Electronics 2023, 12(16), 3401; https://doi.org/10.3390/electronics12163401 - 10 Aug 2023

Cited by 3 | Viewed by 1963

Abstract

Neural machine translation models are guided by loss function to select source sentence features and generate results close to human annotation. When the data resources are abundant, neural machine translation models can focus on the features used to produce high-quality translations. These features include POS or other grammatical features. However, models cannot focus precisely on these features when data resources are limited. The reason is that the lack of samples makes the model overfit before considering these features. Previous works have enriched the features by integrating source POS or multitask methods. However, these methods only utilize the source POS or produce translations by introducing the generated target POS. We propose introducing POS information based on multitask methods and reconstructors. We obtain the POS tags by the additional encoder and decoder and compute the corresponding loss function. These loss functions are used with the loss function of machine translation to optimize the parameters of the entire model, which makes the model pay attention to POS features. The POS features focused on by models will guide the translation process and alleviate the problem that models cannot focus on the POS features in the case of low resources. Experiments on multiple translation tasks show that the method improves 0.4∼1 BLEU compared with the baseline model on different translation tasks. Full article
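
The abstract says that the POS loss functions are used together with the translation loss to optimize the whole model. One common way to combine such terms is a weighted sum, sketched below in plain Python. The weighting scheme, the `pos_weight` value, and the function names are assumptions for illustration; the abstract does not specify how the authors combine the losses.

```python
from math import log
from typing import Sequence


def cross_entropy(probs: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Mean negative log-likelihood of the target indices.

    `probs[i]` is the predicted distribution at position i, and
    `targets[i]` is the index of the correct token or POS tag.
    """
    return -sum(log(p[t]) for p, t in zip(probs, targets)) / len(targets)


def joint_loss(
    mt_loss: float,
    pos_losses: Sequence[float],
    pos_weight: float = 0.5,  # hypothetical weight, not taken from the paper
) -> float:
    """Translation loss plus a weighted sum of auxiliary POS losses."""
    return mt_loss + pos_weight * sum(pos_losses)
```

Under this formulation, gradients from the POS terms flow into the shared parameters, which is the sense in which an auxiliary task can steer a model toward POS features when training data are scarce.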

(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)

Search Results (68)