Open Access Article

Semantic Network Analysis Pipeline—Interactive Text Mining Framework for Exploration of Semantic Flows in Large Corpus of Text

1 Computer Science, University of Portland, Portland, OR 90203, USA
2 Resource Data, Inc., Anchorage, AK 99503, USA
3 Computer Science, University of Alaska Anchorage, Anchorage, AK 99508, USA
* Author to whom correspondence should be addressed.
Current address: 5000 N Willamette Blvd., Portland, OR 97203, USA.
Appl. Sci. 2019, 9(24), 5302; https://doi.org/10.3390/app9245302
Received: 5 June 2019 / Revised: 19 November 2019 / Accepted: 19 November 2019 / Published: 5 December 2019
(This article belongs to the Section Computing and Artificial Intelligence)
Historical topic modeling and semantic concept exploration in a large corpus of unstructured text remain a hard, open problem. Despite advances in natural language processing tools, statistical linguistic models, graph theory, and visualization, there is no framework that combines these piecewise tools under one roof. We designed and built the Semantic Network Analysis Pipeline (SNAP), available as an open-source web service, which implements the workflow a data scientist needs to explore historical semantic concepts in a text corpus. We define a graph-theoretic notion of a semantic concept as a flow of closely related tokens through the corpus of text. The modular workflow pipeline processes text with natural language processing tools, narrows the content statistically, creates semantic networks by lexical token chaining, performs social network analysis of the token networks, and creates a 3D visualization of the semantic concept flows through the corpus for interactive concept exploration. Finally, we illustrate the framework’s utility by extracting information from three text corpora: Herman Melville’s novel Moby Dick, the transcript of the 2015–2016 United States (U.S.) Senate Hearings on Environment and Public Works, and the Australian Broadcasting Corporation’s short news articles on rural and science topics.
Keywords: semantic concept; text mining; computational linguistics; language processing; natural language processing; interactive visualization
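
The pipeline stages named in the abstract (token processing, token chaining into a network, and network analysis per corpus segment) can be illustrated with a minimal sketch. This is not SNAP's implementation: the stopword list, the sliding-window co-occurrence rule used as a stand-in for lexical token chaining, the weighted-degree ranking used as a stand-in for social network analysis, and every function name below are assumptions made for illustration only.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption, not SNAP's).
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "is", "was", "it", "that"}
)


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_token_network(tokens, window=2):
    """Count undirected edges between distinct tokens at most `window` positions apart.

    A simple co-occurrence rule standing in for lexical token chaining.
    """
    edges = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                edges[tuple(sorted((left, right)))] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights incident to each token (a basic centrality measure)."""
    degree = defaultdict(int)
    for (u, v), weight in edges.items():
        degree[u] += weight
        degree[v] += weight
    return dict(degree)


def concept_flow(segments, window=2, top_k=3):
    """Return the top_k most central tokens for each corpus segment, in order.

    Reading the result segment by segment shows which tokens persist,
    appear, or fade as the corpus progresses.
    """
    flow = []
    for segment in segments:
        degree = weighted_degree(build_token_network(tokenize(segment), window))
        # Sort by descending weight, then alphabetically for a stable order.
        ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
        flow.append([token for token, _ in ranked[:top_k]])
    return flow


if __name__ == "__main__":
    chapters = [
        "The whale surfaced and the ship turned toward the whale.",
        "The captain ordered the ship to follow the whale.",
    ]
    for index, top_tokens in enumerate(concept_flow(chapters), start=1):
        print(index, top_tokens)
```

Splitting the corpus into ordered segments (chapters, hearings, or publication dates) is what lets a per-segment ranking be read as a flow over time; a real system would replace each stage with proper NLP tooling and richer network measures.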
MDPI and ACS Style

Cenek, M.; Bulkow, R.; Pak, E.; Oyster, L.; Ching, B.; Mulagada, A. Semantic Network Analysis Pipeline—Interactive Text Mining Framework for Exploration of Semantic Flows in Large Corpus of Text. Appl. Sci. 2019, 9, 5302.
