Natural Language Processing and Event Extraction for Big Data

A special issue of Big Data and Cognitive Computing (ISSN 2504-2289).

Deadline for manuscript submissions: closed (29 February 2024)

Special Issue Editors


Guest Editor
“Enzo Ferrari” Engineering Department, University of Modena and Reggio Emilia, 41121 Modena, Italy
Interests: NLP; semantic data; word embedding; knowledge graph; question answering; big data analysis; digital twins; smart city; IoT; road traffic; air quality; sensor data; anomaly detection; sensor calibration
Center for Language Research, University of Aizu, Aizu-wakamatsu 965 8580, Japan
Interests: natural language processing; machine learning; similarity detection; linguistics

Special Issue Information

Dear Colleagues,

The amount of textual data shared on the Web is overwhelming, and specific techniques are required to manage it and extract knowledge from it. Event Extraction (EE) is a sub-task of Information Retrieval (IR) that aims to extract events automatically from text, to understand what is happening around the world, and to identify where and when each event happened and who was involved.

EE has received considerable attention and has seen great progress in recent years. Several approaches have been developed, falling into two types: Sentence-level Event Extraction and Document-level Event Extraction. Natural Language Processing (NLP) techniques play a key role in this challenge, as they allow structured information to be extracted from free-form text. However, the existing literature is not mature enough to identify a definitive approach to EE: each study addresses only part of the problem, and the proposed methodologies are very often language-specific. The field therefore still has considerable room for improvement.

The scope of this Special Issue is to collect recent advances in NLP in the field of EE, focusing on techniques that are able to process text published on the Web, e.g., social media and online newspapers, and identify event descriptions (participants, location, and time).

The areas of interest of this Special Issue include (but are not limited to):

  • Sentence-level Event Extraction;
  • Document-level Event Extraction;
  • Question answering;
  • Named entity recognition and linking;
  • Semantic technologies;
  • Text categorization;
  • Natural Language Understanding;
  • Temporal annotation;
  • Knowledge graph;
  • Graph analysis;
  • Event deduplication;
  • Text similarity.

Dr. Federica Rollo
Dr. John Blake
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • text analysis
  • event detection
  • event extraction
  • event analysis

Published Papers (1 paper)

Research

16 pages, 1335 KiB  
Article
Data Sorting Influence on Short Text Manual Labeling Quality for Hierarchical Classification
by Olga Narushynska, Vasyl Teslyuk, Anastasiya Doroshenko and Maksym Arzubov
Big Data Cogn. Comput. 2024, 8(4), 41; https://doi.org/10.3390/bdcc8040041 - 07 Apr 2024
Abstract
The precise categorization of brief texts holds significant importance in various applications within the ever-changing realm of artificial intelligence (AI) and natural language processing (NLP). Short texts are everywhere in the digital world, from social media updates to customer reviews and feedback. Nevertheless, short texts’ limited length and context pose unique challenges for accurate classification. This research article delves into the influence of data sorting methods on the quality of manual labeling in hierarchical classification, with a particular focus on short texts. The study is set against the backdrop of the increasing reliance on manual labeling in AI and NLP, highlighting its significance in the accuracy of hierarchical text classification. Methodologically, the study integrates AI, notably zero-shot learning, with human annotation processes to examine the efficacy of various data-sorting strategies. The results demonstrate how different sorting approaches impact the accuracy and consistency of manual labeling, a critical aspect of creating high-quality datasets for NLP applications. The study’s findings reveal a significant time efficiency improvement in terms of labeling, where ordered manual labeling required 760 min per 1000 samples, compared to 800 min for traditional manual labeling, illustrating the practical benefits of optimized data sorting strategies. Comparatively, ordered manual labeling achieved the highest mean accuracy rates across all hierarchical levels, with figures reaching up to 99% for segments, 95% for families, 92% for classes, and 90% for bricks, underscoring the efficiency of structured data sorting. It offers valuable insights and practical guidelines for improving labeling quality in hierarchical classification tasks, thereby advancing the precision of text analysis in AI-driven research. 
This abstract encapsulates the article’s background, methods, results, and conclusions, providing a comprehensive yet succinct study overview.
(This article belongs to the Special Issue Natural Language Processing and Event Extraction for Big Data)