Big Data Technologies: Explorations and Analytics

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (30 June 2022) | Viewed by 21791

Special Issue Editors


Guest Editor
Department of Computer Science and Digital Technologies, University of East London (UEL), London E16 2RD, UK
Interests: big data technologies; computational intelligence

Guest Editor
Computer Science and Digital Technologies, University of East London (UEL), London E16 2RD, UK
Interests: big data analytics; artificial intelligence and machine learning; optimization theory

Guest Editor
Computer Architecture Department (DAC), Universitat Politècnica de Catalunya (UPC) Barcelona Tech, Barcelona, Spain
Interests: security and wireless networks

Special Issue Information

Dear Colleagues,

Big data technologies have recently emerged as an important research area and are increasingly considered a replacement for traditional data-processing applications, whose techniques cannot cope with massive, complex, and new forms of data. In the context of big data, traditional methodologies and platforms suffer from slow responsiveness and a lack of scalability, fault tolerance, performance, and accuracy. Hence, exploring new architectures, frameworks, security developments, and advanced programming models for big data is the only way to handle these massive and new forms of data properly. This special issue will present the most recent achievements and developments in big data technologies, exploration, and analytics. Both theoretical studies and state-of-the-art practical applications are welcome for submission. All submitted papers will be peer-reviewed and selected based on both their quality and their relevance to this special issue. The list of possible topics includes, but is not limited to:

  • Architecture, Framework, and Standard Design for Big Data Applications;
  • Open Source Developments for Big Data;
  • Big Data Quality Evaluation and Assurance Methodologies;
  • Innovative Big Data Applications and Services for Industries;
  • Big Data Governance, Ethics, Security, Privacy, and Trust;
  • Big Data Analytics for Smart Cities and Internet of Things;
  • Artificial Intelligence for Big Data;
  • Machine Learning for Big Data;
  • Computational Intelligence Applications for Big Data;
  • Optimization Theories for Big Data;
  • Big Data Analytics in Batch, Near-Real-Time, and Real-Time Modes;
  • Big Data Visualization.

Dr. Amin Karami
Dr. Fahimeh Jafari
Prof. Manel Guerrero-Zapata
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts are available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big data
  • large scale data
  • streaming data
  • unstructured data
  • big data security
  • multimedia big data
  • big data analytics
  • big data visualization
  • visual analytics

Published Papers (5 papers)


Research

31 pages, 1653 KiB  
Article
Large-Scale Music Genre Analysis and Classification Using Machine Learning with Apache Spark
by Mousumi Chaudhury, Amin Karami and Mustansar Ali Ghazanfar
Electronics 2022, 11(16), 2567; https://doi.org/10.3390/electronics11162567 - 17 Aug 2022
Cited by 2 | Viewed by 5961
Abstract
The trend for listening to music online has greatly increased over the past decade due to the number of online musical tracks. The large music databases of music libraries that are provided by online music content distribution vendors make music streaming and downloading services more accessible to the end-user. It is essential to classify similar types of songs with an appropriate tag or index (genre) to present similar songs in a convenient way to the end-user. As the trend of online music listening continues to increase, developing multiple machine learning models to classify music genres has become a main area of research. In this research paper, a popular music dataset GTZAN which contains ten music genres is analysed to study various types of music features and audio signals. Multiple scalable machine learning algorithms supported by Apache Spark, including naïve Bayes, decision tree, logistic regression, and random forest, are investigated for the classification of music genres. The performance of these classifiers is compared, and the random forest performs as the best classifier for the classification of music genres. Apache Spark is used in this paper to reduce the computation time for machine learning predictions with no computational cost, as it focuses on parallel computation. The present work also demonstrates that the perfect combination of Apache Spark and machine learning algorithms reduces the scalability problem of the computation of machine learning predictions. Moreover, different hyperparameters of the random forest classifier are optimized to increase the performance efficiency of the classifier in the domain of music genre classification. The experimental outcome shows that the developed random forest classifier can establish a high level of performance accuracy, especially for the mislabelled, distorted GTZAN dataset. This classifier has outperformed other machine learning classifiers supported by Apache Spark in the present work. 
The random forest classifier achieves 90% accuracy for music genre classification compared to other work in the same domain.
(This article belongs to the Special Issue Big Data Technologies: Explorations and Analytics)

27 pages, 3287 KiB  
Article
Big-Data Platform for Performance Monitoring of Telecom-Service-Provider Networks
by Milan Simakovic, Zoran Cica and Dejan Drajic
Electronics 2022, 11(14), 2224; https://doi.org/10.3390/electronics11142224 - 16 Jul 2022
Cited by 1 | Viewed by 1767
Abstract
Large telecom-service-provider networks are typically based on complex communications infrastructures comprising millions of network devices. The performance monitoring of such networks is a very demanding and challenging task. A large amount of data is collected and processed during performance monitoring to obtain information that gives insights into the current network performance. Using the obtained information, providers can efficiently detect, locate, and troubleshoot weak spots in the network and improve the overall network performance. Furthermore, the extracted information can be used for planning future network expansions and for supporting business-strategy decisions. However, traditional methods for processing and storing data are not applicable because of the enormous amount of collected data. Thus, big-data technologies must be used. In this paper, a big-data platform for the performance monitoring of telecom-service-provider networks is proposed. The proposed platform is capable of collecting, storing, and processing data from millions of devices. Typical challenges and problems in the development and deployment process of the platform, as well as the solutions to overcome them, are presented. The proposed platform is adapted to a hybrid fiber-coaxial (HFC) network and currently operates in a real HFC network comprising millions of users and devices.
(This article belongs to the Special Issue Big Data Technologies: Explorations and Analytics)

16 pages, 4803 KiB  
Article
IMapC: Inner MAPping Combiner to Enhance the Performance of MapReduce in Hadoop
by C. Kavitha, S. R. Srividhya, Wen-Cheng Lai and Vinodhini Mani
Electronics 2022, 11(10), 1599; https://doi.org/10.3390/electronics11101599 - 17 May 2022
Cited by 3 | Viewed by 1679
Abstract
Hadoop is a framework for storing and processing huge amounts of data. With HDFS, large data sets can be managed on commodity hardware. MapReduce is a programming model for processing vast amounts of data in parallel. Mapping and reducing can be performed by using the MapReduce programming framework. A very large amount of data is transferred from Mapper to Reducer without any filtering or recursion, which consumes excessive bandwidth. In this paper, we introduce an algorithm called Inner MAPping Combiner (IMapC) for the map phase. This algorithm, which runs in the Mapper, combines the values of recurring keys. To test the efficiency of the algorithm, different approaches were compared. According to the tests, MapReduce programs that are implemented with the Default Combiner (DC) of IMapC will be 70% more efficient than those that are implemented without one. To make computations significantly faster, this work can be combined with MapReduce.
(This article belongs to the Special Issue Big Data Technologies: Explorations and Analytics)
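
The general idea mentioned in this abstract, combining the values of recurring keys inside the mapper before anything is sent to the reducer, can be illustrated with a minimal plain-Python sketch. This is not the authors' IMapC algorithm and it does not use Hadoop; the word-count task and the names `plain_mapper`, `combining_mapper`, and `reducer` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def plain_mapper(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit one (word, 1) pair per token, with no combining."""
    for line in lines:
        for word in line.split():
            yield word, 1


def combining_mapper(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Sum the values of recurring keys inside the mapper, then emit.

    Illustrative assumption: the per-split totals fit in memory.
    """
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        for word in line.split():
            totals[word] += 1
    yield from totals.items()


def reducer(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum all values per key, as a reducer would after the shuffle."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    split = ["big data big spark", "spark big data"]
    plain = list(plain_mapper(split))
    combined = list(combining_mapper(split))
    # Fewer pairs leave the mapper, but the reduced result is the same.
    print(len(plain), len(combined))
    print(reducer(plain) == reducer(combined))
```

In this toy input, both mappers lead to the same reduced counts, while the combining mapper emits one pair per distinct key instead of one pair per token, which is the source of the reduced mapper-to-reducer traffic.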

28 pages, 2455 KiB  
Article
JUpdate: A JSON Update Language
by Zouhaier Brahmia, Safa Brahmia, Fabio Grandi and Rafik Bouaziz
Electronics 2022, 11(4), 508; https://doi.org/10.3390/electronics11040508 - 09 Feb 2022
Cited by 8 | Viewed by 1953
Abstract
Although JSON documents are being used in several emerging applications (e.g., Big Data applications, IoT, mobile computing, smart cities, and online social networks), there is no consensual or standard language for updating JSON documents (i.e., creating, deleting or changing such documents, where changing means inserting, deleting, replacing, copying, moving, etc., portions of data in such documents). To fill this gap, we propose in this paper an SQL-like language, named JUpdate, for updating JSON documents. JUpdate is based on a set of six primitive update operations, which is proven complete and minimal, and it provides a set of fourteen user-friendly high-level operations with a well-founded semantics defined on the basis of the primitive update operations.
(This article belongs to the Special Issue Big Data Technologies: Explorations and Analytics)

12 pages, 1114 KiB  
Article
Cricket Match Analytics Using the Big Data Approach
by Mazhar Javed Awan, Syed Arbaz Haider Gilani, Hamza Ramzan, Haitham Nobanee, Awais Yasin, Azlan Mohd Zain and Rabia Javed
Electronics 2021, 10(19), 2350; https://doi.org/10.3390/electronics10192350 - 26 Sep 2021
Cited by 23 | Viewed by 8864
Abstract
Cricket is one of the most liked, played, encouraged, and exciting sports today, and it calls for further advances in machine learning and artificial intelligence (AI) to attain greater accuracy. As the number of matches grows over time, the data related to cricket matches and individual players are increasing rapidly. Moreover, the need to use big data analytics and the opportunities to utilize this big data effectively are also increasing, for example in the selection of players for a team, in predicting the winner of a match, and in many other predictions based on machine learning models or big data techniques. We applied the machine learning linear regression model to predict team scores both without a big data framework and with the big data framework Spark ML. The experimental results are measured through accuracy, the root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE), which are respectively 95%, 30.2, 1350.34, and 28.2 after applying linear regression in Spark ML. Furthermore, our approach can be applied to other sports.
(This article belongs to the Special Issue Big Data Technologies: Explorations and Analytics)