Special Issue "Emerging Approaches and Advances in Big Data"

A special issue of Symmetry (ISSN 2073-8994).

Deadline for manuscript submissions: closed (30 June 2018)

Special Issue Editors

Guest Editor
Prof. Dr. Ka Lok Man

Department of Computer Science and Software Engineering, Xi’an Jiaotong-Liverpool University, Suzhou Dushu Lake Higher Education Town, Suzhou Industrial Park, Jiangsu Province, China
Guest Editor
Dr. Kevin Lee

School of Science and Technology, Nottingham Trent University, Clifton Campus, Nottingham, NG11 8NS, UK

Special Issue Information

Dear Colleagues,

The growth of big data presents challenges, as well as opportunities, for industry and academia. Accumulated data can be extracted, processed, analyzed, and reported in time to deliver better insights, complex patterns, and valuable predictions for the design and analysis of various systems and platforms, including complex business models, highly scalable systems, reconfigurable hardware and software systems, and wireless sensor and actuator networks. The main building blocks of big data analytics include:

  • big data thinking

  • computational tools

  • data modelling

  • analytical algorithms

  • data governance

Big data thinking is an exciting area that involves not only an organization's data-related culture, but also the initiation of big data projects, team formation, and best practices. Computational platforms and tools offer adaptive mechanisms that enable the understanding of data in complex and changing environments. Algorithms and analysis methods are the foundations of many solutions to real problems. Data and information governance and social responsibility directly affect data usage and the social acceptance of business solutions.

This Special Issue on “Emerging Approaches and Advances in Big Data” will focus on emerging approaches and recent advances in architectures, design techniques, modeling, and prototyping solutions for the design of complex business models, highly scalable systems, reconfigurable hardware and software systems, and computing networks in the era of big data.

Prof. Dr. Ka Lok Man
Dr. Kevin Lee
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Symmetry is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big data architecture, modelling and toolkits

  • big data for business model and intelligence

  • big data challenges for small, medium and large enterprises

  • big data analytics and innovations

  • big data systems/analytics on emerging hardware/software architectures and computing networks

Published Papers (16 papers)


Research


Open Access Feature Paper Article: A Robust Distributed Big Data Clustering-based on Adaptive Density Partitioning using Apache Spark
Symmetry 2018, 10(8), 342; https://doi.org/10.3390/sym10080342
Received: 12 July 2018 / Revised: 28 July 2018 / Accepted: 13 August 2018 / Published: 15 August 2018
Abstract
Unsupervised machine learning and knowledge discovery from large-scale datasets have recently attracted considerable research interest. The present paper proposes a distributed big data clustering approach based on adaptive density estimation. The proposed method is developed on the Apache Spark framework and tested on some prevalent datasets. In the first step of this algorithm, the input data are divided into partitions using a Bayesian type of Locality Sensitive Hashing (LSH). Partitioning makes the processing fully parallel and much simpler by avoiding unneeded calculations. Each step of the proposed algorithm is completely independent of the others, and no serial bottleneck exists anywhere in the clustering procedure. Locality preservation also filters out outliers and enhances the robustness of the proposed approach. Density is defined on the basis of an Ordered Weighted Averaging (OWA) distance, which makes the clusters more homogeneous. According to the density of each node, the local density peaks are detected adaptively. By merging the local peaks, the final cluster centers are obtained, and the remaining data points are assigned to the cluster with the nearest center. The proposed method has been implemented and compared with similar recently published approaches. The cluster validity indexes achieved by the proposed method show its superiority in precision and noise robustness compared with recent work. Comparison with similar approaches also shows the advantages of the proposed method in scalability, high performance, and low computation cost. The proposed method is a general clustering approach, and gene expression clustering has been used as a sample application. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
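The OWA distance named in the abstract has a standard form: per-dimension differences are ranked before being weighted. A minimal sketch follows; the weight vector and example points are hypothetical, not taken from the paper.

```python
def owa_distance(x, y, weights):
    """Ordered Weighted Averaging (OWA) distance: per-dimension absolute
    differences are sorted in descending order before being weighted, so
    weights act on ranked magnitudes rather than on fixed dimensions."""
    diffs = sorted((abs(a - b) for a, b in zip(x, y)), reverse=True)
    assert len(diffs) == len(weights)
    return sum(w * d for w, d in zip(weights, diffs))

# Hypothetical weights that emphasise the largest component differences;
# the paper's actual weight choice may differ.
w = [0.5, 0.3, 0.2]
print(owa_distance([1.0, 2.0, 3.0], [2.0, 2.0, 5.0], w))  # → 1.3
```

With weights concentrated on the largest ranked differences, points that differ sharply in even one dimension are pushed apart, which is one way such a distance can make clusters more homogeneous.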

Open Access Article: A Quick Gbest Guided Artificial Bee Colony Algorithm for Stock Market Prices Prediction
Symmetry 2018, 10(7), 292; https://doi.org/10.3390/sym10070292
Received: 21 June 2018 / Revised: 6 July 2018 / Accepted: 10 July 2018 / Published: 20 July 2018
Abstract
The objective of this work is to present a Quick Gbest Guided artificial bee colony (ABC) learning algorithm to train a feedforward neural network (QGGABC-FFNN) model for the prediction of trends in stock markets. Predicting stock market trends is a significant global financial issue: scientists, finance administrations, companies, and national leaderships strive to develop strong financial positions. Several technical, industrial, fundamental, scientific, and statistical tools have been proposed and used with varying results. Still, predicting an exact or near-to-exact trend of stock market values remains an open problem. In this respect, the present manuscript proposes an ABC-based algorithm to minimize the error between predicted and actual values using a hybrid technique based on neural networks and artificial intelligence. The presented approach has been verified and tested in predicting the trend of Saudi Stock Market (SSM) values. The proposed QGGABC-FFNN, based on a bio-inspired learning algorithm, achieves a high degree of accuracy and could serve as an investment advisor for investors and traders in the SSM. The approach relies mainly on SSM historical data covering a large span of time. The simulation findings show that the proposed QGGABC-FFNN outperformed other typical computational algorithms in predicting SSM values. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: A Local Approximation Approach for Processing Time-Evolving Graphs
Symmetry 2018, 10(7), 247; https://doi.org/10.3390/sym10070247
Received: 21 April 2018 / Revised: 8 June 2018 / Accepted: 12 June 2018 / Published: 1 July 2018
Abstract
To efficiently process time-evolving graphs where new vertices and edges are inserted over time, an incremental computing model, which processes the newly-constructed graph based on the results of the computation on the outdated graph, is widely adopted in distributed time-evolving graph computing systems. In this paper, we first experimentally study how the results of the graph computation on the local graph structure can approximate the results of the graph computation on the complete graph structure in distributed environments. Then, we develop an optimization approach to reduce the response time in bulk synchronous parallel (BSP)-based incremental computing systems by processing time-evolving graphs on the local graph structure instead of on the complete graph structure. We have evaluated our optimization approach using the graph algorithms single-source shortest path (SSSP) and PageRank on the Amazon Elastic Compute Cloud (EC2), a central part of Amazon.com’s cloud-computing platform, with different scales of graph datasets. The experimental results demonstrate that the local approximation approach can reduce the response time for the SSSP algorithm by 22% and reduce the response time for the PageRank algorithm by 7% on average compared to the existing incremental computing framework of GraphTau. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: Research on Electronic Voltage Transformer for Big Data Background
Symmetry 2018, 10(7), 234; https://doi.org/10.3390/sym10070234
Received: 27 May 2018 / Revised: 13 June 2018 / Accepted: 17 June 2018 / Published: 21 June 2018
Abstract
A new type of electronic voltage transformer is proposed in this study for the big data background. Using the conventional inverted SF_6 transformer insulation structure, a coaxial capacitor sensor was constructed by designing a middle coaxial electrode between the high-voltage electrode and the ground electrode. The voltage signal could be measured by detecting the capacitance current i of the SF_6 coaxial capacitor. To improve the accuracy of the integrator, a high-precision digital integrator based on the Romberg algorithm is proposed. This not only guarantees computational accuracy but also reduces computation time; in addition, the sampling points can be reused. By adopting the double shielding effect of the high-voltage shell and the grounded metal shield, the ability and stability of the coaxial capacitor divider to resist interference from stray electric fields could be effectively improved. Factors that affect the value of the coaxial capacitor, such as position, temperature, and pressure, were studied. Tests were carried out to verify the performance. The results showed that the voltage transformer based on the SF_6 coaxial capacitor satisfies the requirements of the 0.2 accuracy class. This study can promote the use of new high-performance products for data transmission in the era of big data and specific test analyses. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
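The Romberg algorithm mentioned in the abstract is standard numerical integration: trapezoid estimates are refined by halving the step and reusing earlier samples, then combined by Richardson extrapolation. A minimal software sketch of the idea (not the paper's hardware integrator) is:

```python
import math

def romberg(f, a, b, levels=6):
    """Romberg integration: build trapezoid estimates R[k][0] with 2**k
    sub-intervals, then apply Richardson extrapolation along each row.
    Earlier function samples are reused at each refinement, which is the
    sample-reuse property the abstract highlights."""
    R = [[0.0] * levels for _ in range(levels)]
    h = b - a
    R[0][0] = 0.5 * h * (f(a) + f(b))
    for k in range(1, levels):
        h /= 2.0
        # Only the new midpoints are evaluated; old samples are reused.
        new = sum(f(a + (2 * i - 1) * h) for i in range(1, 2 ** (k - 1) + 1))
        R[k][0] = 0.5 * R[k - 1][0] + h * new
        for j in range(1, k + 1):
            R[k][j] = R[k][j - 1] + (R[k][j - 1] - R[k - 1][j - 1]) / (4 ** j - 1)
    return R[levels - 1][levels - 1]

print(romberg(math.sin, 0.0, math.pi))  # ≈ 2.0
```

The extrapolation step is what buys accuracy without extra samples, which matches the abstract's claim of guaranteeing computational accuracy while reducing consumption time.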

Open Access Article: Adaptive Incremental Genetic Algorithm for Task Scheduling in Cloud Environments
Symmetry 2018, 10(5), 168; https://doi.org/10.3390/sym10050168
Received: 24 April 2018 / Revised: 11 May 2018 / Accepted: 15 May 2018 / Published: 17 May 2018
Abstract
Cloud computing is a new commercial model that enables customers to acquire large amounts of virtual resources on demand. Resources, including hardware and software, can be delivered as services and measured by specific usage of storage, processing, bandwidth, etc. In Cloud computing, task scheduling is the process of mapping cloud tasks to Virtual Machines (VMs). When binding tasks to VMs, the scheduling strategy has an important influence on the efficiency of the datacenter and the related energy consumption. Although many traditional scheduling algorithms have been applied on various platforms, they may not work efficiently because of the large number of user requests, the variety of computation resources, and the complexity of the Cloud environment. In this paper, we tackle the task scheduling problem, which aims to minimize makespan, with a Genetic Algorithm (GA). We propose an incremental GA with adaptive probabilities of crossover and mutation. The mutation and crossover rates change across generations and also vary between individuals. Large numbers of tasks are randomly generated to simulate various scales of the task scheduling problem in the Cloud environment. Based on the instance types of Amazon EC2, we implemented virtual machines with different computing capacities on CloudSim. We compared the performance of the adaptive incremental GA with that of the Standard GA, Min-Min, Max-Min, Simulated Annealing, and the Artificial Bee Colony Algorithm in finding the optimal scheme. Experimental results show that the proposed algorithm can achieve feasible solutions with acceptable makespan in less computation time. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
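Crossover/mutation rates that change across generations and vary between individuals can be sketched in the spirit of classic adaptive GAs; the formulas and parameter ranges below are illustrative assumptions, not the paper's exact scheme.

```python
def adaptive_rates(gen, max_gen, fit, f_avg, f_best,
                   pc_range=(0.5, 0.9), pm_range=(0.01, 0.1)):
    """Crossover (pc) and mutation (pm) probabilities that decay over
    generations and, for above-average individuals, shrink further as
    fitness approaches the population best. A common adaptive-GA scheme;
    assumes higher fitness is better."""
    pc_min, pc_max = pc_range
    pm_min, pm_max = pm_range
    # Generation-dependent decay: explore early, exploit late.
    decay = 1.0 - gen / max_gen
    pc = pc_min + (pc_max - pc_min) * decay
    pm = pm_min + (pm_max - pm_min) * decay
    # Individual-dependent scaling: protect near-best solutions
    # from disruptive crossover and mutation.
    if fit > f_avg and f_best > f_avg:
        scale = (f_best - fit) / (f_best - f_avg)
        pc *= scale
        pm *= scale
    return pc, pm

# Early generation, below-average individual: full exploratory rates.
print(adaptive_rates(gen=0, max_gen=100, fit=0.2, f_avg=0.5, f_best=1.0))
```

The per-individual term is in the style of Srinivas and Patnaik's adaptive GA; the paper's incremental variant may combine the two dependencies differently.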

Open Access Article: Carbon Oxides Gases for Occupancy Counting and Emergency Control in Fog Environment
Symmetry 2018, 10(3), 66; https://doi.org/10.3390/sym10030066
Received: 10 February 2018 / Revised: 10 March 2018 / Accepted: 12 March 2018 / Published: 15 March 2018
Abstract
Information on human occupancy plays a crucial role in building management: fewer people means less demand for heat and electricity supply, and vice versa. Moreover, when there is a fire in a building, knowing how many people are in each room helps in planning a more efficient rescue strategy. However, most buildings currently lack adequate devices for counting the number of occupants, and the most popular embedded fire alarm systems trigger a warning only when a fire breaks out with plenty of smoke. In view of this constraint, in this paper we propose a warning system based on carbon oxide gases to detect potential fire breakouts and to estimate the number of people in the proximity. To validate the efficiency of the devised system, we simulate its application in a Fog Computing environment. Furthermore, we also extend iFogSim with data analytics capacity. Based on this framework, the energy consumption, latency, and network usage of the designed system obtained from iFogSim are compared with those obtained from a Cloud environment. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: Detecting Ghost Targets Using Multilayer Perceptron in Multiple-Target Tracking
Symmetry 2018, 10(1), 16; https://doi.org/10.3390/sym10010016
Received: 15 November 2017 / Revised: 30 December 2017 / Accepted: 2 January 2018 / Published: 4 January 2018
Abstract
This paper presents a method for removing ghost targets, which are not real objects, from the output of a multiple-object-tracking algorithm. The method uses an artificial neural network (a multilayer perceptron) and introduces its structure together with learning, verification, and evaluation methods. The implemented system was tested at an intersection in a city center. In a 28-min measurement, the multilayer perceptron for ghost target classification correctly detected 88% of the ghost targets, while 6.7% of ghost targets were mistaken for actual targets. This method is expected to contribute to the advancement of intelligent transportation systems once the weaknesses revealed during the evaluation of the system are complemented and refined. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: A Novel String Grammar Unsupervised Possibilistic C-Medians Algorithm for Sign Language Translation Systems
Symmetry 2017, 9(12), 321; https://doi.org/10.3390/sym9120321
Received: 30 November 2017 / Revised: 13 December 2017 / Accepted: 14 December 2017 / Published: 19 December 2017
Cited by 1
Abstract
Sign language is a basic method for solving communication problems between deaf and hearing people. To communicate, deaf and hearing people normally use hand gestures, which combine hand positioning, hand shapes, and hand movements. Thai Sign Language is the communication method for Thai hearing-impaired people. Our objective is to improve the dynamic Thai Sign Language translation method with a video captioning technique that does not require prior hand region detection and segmentation, through using the Scale Invariant Feature Transform (SIFT) method and the String Grammar Unsupervised Possibilistic C-Medians (sgUPCMed) algorithm. This work is the first to propose the sgUPCMed algorithm to cope with the unsupervised generation of multiple prototypes in the possibilistic sense for string data. In our experiments, the Thai Sign Language data set (10 isolated sign language words) was collected from 25 subjects. On the blind test data sets within a constrained environment, the best average result for signer-dependent cases was 89–91%, and the success rate for signer semi-independent cases was 81–85% on average. For the blind test data sets of signer-independent cases, the best average classification rate was 77–80%. Without a constrained environment, the average result was around 62–80% for the signer-independent experiments. To show that the proposed algorithm can be implemented for other sign languages, the American Sign Language (RWTH-BOSTON-50) data set, which consists of 31 isolated American Sign Language words, was also used in the experiment. The system achieves 88.56% and 91.35% on the validation set alone, and on both the training and validation sets, respectively. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: System Framework for Cardiovascular Disease Prediction Based on Big Data Technology
Symmetry 2017, 9(12), 293; https://doi.org/10.3390/sym9120293
Received: 25 October 2017 / Revised: 24 November 2017 / Accepted: 24 November 2017 / Published: 27 November 2017
Cited by 2
Abstract
Amid growing concern over climate change, the environment, and health care, the interconnection between cardiovascular diseases, rapid industrialization, and a variety of environmental factors has been the focus of recent research. It is necessary to research risk factor extraction techniques that consider individual external factors and predict diseases and conditions. Therefore, we designed a framework to collect and store data from various domains on the causes of cardiovascular disease, and constructed an integrated big data database. A variety of open source databases were integrated and migrated onto distributed storage devices. The integrated database was composed of clinical data on cardiovascular diseases, national health and nutrition examination surveys, statistical geographic information, population and housing censuses, meteorological administration data, and Health Insurance Review and Assessment Service data. The framework was composed of data, speed, analysis, and service layers, all stored on distributed storage devices. Finally, we proposed a framework for a cardiovascular disease prediction system based on the lambda architecture to solve the problems associated with real-time analyses of big data. This system can be used to help predict and diagnose illnesses such as cardiovascular diseases. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: Self-Adaptive Pre-Processing Methodology for Big Data Stream Mining in Internet of Things Environmental Sensor Monitoring
Symmetry 2017, 9(10), 244; https://doi.org/10.3390/sym9100244
Received: 30 September 2017 / Revised: 11 October 2017 / Accepted: 15 October 2017 / Published: 21 October 2017
Cited by 3
Abstract
Over the years, advanced IT technologies have facilitated new ways of generating and gathering data rapidly, continuously, and in large volumes, giving rise to a new research and application branch, namely data stream mining (DSM). Among the many scenarios of DSM, the Internet of Things (IoT) plays a significant role as a typically tough and challenging computational case of big data. In this paper, we describe a self-adaptive approach to the pre-processing step of data stream classification. The proposed algorithm allows different divisions, with both variable numbers and variable lengths of sub-windows, under a whole sliding window on an input stream, and clustering-based particle swarm optimization (CPSO) is adopted as the main metaheuristic search method to guarantee that the stream segmentations are effective and self-adaptive. To create a richer search space, statistical feature extraction (SFX) is applied after the variable partitioning of the entire sliding window. We validate and compare our algorithm against other temporal methods on several IoT environmental sensor monitoring datasets. The experiments yield encouraging outcomes, supporting the finding that heuristically picking appropriate variant sub-window segmentations with an incorporated clustering technique allows the proposed method to perform better than the others. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
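The partition-then-extract step can be sketched briefly. Here the sub-window boundaries are fixed by hand, whereas the paper selects them with CPSO, and the particular feature set (min, max, mean, standard deviation) is an assumption.

```python
import statistics

def extract_subwindow_features(window, boundaries):
    """Statistical feature extraction (SFX) over a variable partition of a
    sliding window: each sub-window contributes min, max, mean, and
    population standard deviation to the feature vector."""
    feats = []
    edges = [0] + list(boundaries) + [len(window)]
    for lo, hi in zip(edges, edges[1:]):
        seg = window[lo:hi]
        feats += [min(seg), max(seg),
                  statistics.mean(seg), statistics.pstdev(seg)]
    return feats

# A toy 8-sample window split into three variable-length sub-windows.
window = [3, 1, 4, 1, 5, 9, 2, 6]
print(extract_subwindow_features(window, boundaries=[3, 5]))
```

Varying the number and placement of boundaries changes the feature vector, which is exactly the search space a metaheuristic such as CPSO can explore.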

Open Access Article: Toward Bulk Synchronous Parallel-Based Machine Learning Techniques for Anomaly Detection in High-Speed Big Data Networks
Symmetry 2017, 9(9), 197; https://doi.org/10.3390/sym9090197
Received: 28 August 2017 / Revised: 15 September 2017 / Accepted: 15 September 2017 / Published: 19 September 2017
Cited by 4
Abstract
Anomaly detection systems, also known as intrusion detection systems (IDSs), continuously monitor network traffic with the aim of identifying malicious actions. Extensive research has been conducted to build efficient IDSs, emphasizing two essential characteristics: the first is concerned with finding optimal feature selection, while the second deals with employing robust classification schemes. However, the advent of big data concepts in the anomaly detection domain and the appearance of sophisticated network attacks in the modern era require some fundamental methodological revisions in how IDSs are developed. Therefore, we first identify two further significant characteristics in addition to those mentioned above: the need to employ specialized big data processing frameworks, and the use of appropriate datasets for validating a system’s performance, both of which are largely overlooked in existing studies. Afterwards, we set out to develop an anomaly detection system that comprehensively follows these four identified characteristics, i.e., the proposed system (i) performs feature ranking and selection using information gain and automated branch-and-bound algorithms, respectively; (ii) employs logistic regression and extreme gradient boosting techniques for classification; (iii) introduces bulk synchronous parallel processing to cater for the computational requirements of high-speed big data networks; and (iv) uses the real-time contemporary dataset of the Information Security Centre of Excellence at the University of New Brunswick for performance evaluation. We present experimental results that verify the efficacy of the proposed system. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
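Information gain, used for feature ranking in step (i), has a standard formulation: the entropy of the class labels minus the weighted entropy after splitting on a feature's values. A minimal generic sketch, not tied to the paper's dataset:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Information gain of a discrete feature: H(labels) minus the
    weighted entropy of the label groups induced by the feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# A feature that splits the classes perfectly has gain equal to H(labels).
print(information_gain(["a", "a", "b", "b"], [0, 0, 1, 1]))  # → 1.0
```

Ranking features by this score and keeping the top ones is the usual filter-style selection the abstract refers to; the paper then refines the subset with branch-and-bound.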

Open Access Article: A Robust Method for Finding the Automated Best Matched Genes Based on Grouping Similar Fragments of Large-Scale References for Genome Assembly
Symmetry 2017, 9(9), 192; https://doi.org/10.3390/sym9090192
Received: 9 August 2017 / Revised: 8 September 2017 / Accepted: 11 September 2017 / Published: 13 September 2017
Cited by 1
Abstract
Big data research on genomic sequence analysis has accelerated considerably with the development of next-generation sequencing. Currently, research on genomic sequencing is conducted using various methods, ranging from the assembly of reads consisting of fragments to the annotation of genetic information using a database that contains known genome information. With this development, most tools for analyzing the genetic information of new organelles require different input formats, such as FASTA, GenBank (GB), and tab-separated files. These various data formats must be modified to satisfy the requirements of the gene annotation system after genome assembly. In addition, the currently available tools for the analysis of organelles are usually developed only for specific organisms; thus, the need for gene prediction tools that are useful for any organism has increased. The proposed method, termed the genome_search_plotter, is designed for the easy analysis of genome information from related references without any file format modification. Anyone interested in intracellular organelles such as the nucleus, chloroplast, and mitochondria can analyze the genetic information using the assembled contig of an unknown genome and a reference model, without any modification of the data from the assembled contig. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)

Open Access Article: An Efficient and Energy-Aware Cloud Consolidation Algorithm for Multimedia Big Data Applications
Symmetry 2017, 9(9), 184; https://doi.org/10.3390/sym9090184
Received: 14 August 2017 / Revised: 30 August 2017 / Accepted: 1 September 2017 / Published: 6 September 2017
Cited by 4
Abstract
It is well known that cloud computing has many potential advantages over traditional distributed systems. Many enterprises can build their own private cloud with open source infrastructure as a service (IaaS) frameworks. Since enterprise applications and data are migrating to private cloud, the performance of cloud computing environments is of utmost importance for both cloud providers and users. To improve the performance, previous studies on cloud consolidation have been focused on live migration of virtual machines based on resource utilization. However, the approaches are not suitable for multimedia big data applications. In this paper, we reveal the performance bottleneck of multimedia big data applications in cloud computing environments and propose a cloud consolidation algorithm that considers application types. We show that our consolidation algorithm outperforms previous approaches. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
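The abstract above argues for consolidation that considers application type rather than resource utilization alone. The paper's actual algorithm is not given in the abstract; the toy greedy placer below only illustrates the general idea, capping the number of I/O-heavy multimedia VMs per host so they do not contend for the same bottleneck. All names and thresholds are invented for illustration.

```python
# Illustrative sketch of application-type-aware consolidation: a greedy
# first-fit placer that respects a CPU capacity limit per host and refuses to
# co-locate more than max_io_per_host multimedia (I/O-bound) VMs.

def place_vms(vms, host_capacity=1.0, max_io_per_host=1):
    """vms: list of (name, cpu_demand, app_type); returns list of per-host VM name lists."""
    hosts = []  # each host: {"cpu": used, "io": multimedia count, "vms": [...]}
    for name, cpu, app_type in sorted(vms, key=lambda v: -v[1]):
        target = None
        for h in hosts:
            io_ok = app_type != "multimedia" or h["io"] < max_io_per_host
            if h["cpu"] + cpu <= host_capacity and io_ok:
                target = h
                break
        if target is None:
            target = {"cpu": 0.0, "io": 0, "vms": []}
            hosts.append(target)
        target["cpu"] += cpu
        target["io"] += app_type == "multimedia"
        target["vms"].append(name)
    return [h["vms"] for h in hosts]

vms = [("web", 0.3, "cpu"), ("video1", 0.3, "multimedia"),
       ("video2", 0.3, "multimedia"), ("batch", 0.2, "cpu")]
print(place_vms(vms))   # two hosts: the two multimedia VMs are kept apart
```

A utilization-only consolidator would pack all four VMs onto one host here; the type constraint is what forces the second host.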
Open Access Article: Using Knowledge Transfer and Rough Set to Predict the Severity of Android Test Reports via Text Mining
Symmetry 2017, 9(8), 161; https://doi.org/10.3390/sym9080161
Received: 5 June 2017 / Revised: 10 August 2017 / Accepted: 16 August 2017 / Published: 17 August 2017
Cited by 2
Abstract
Crowdsourcing is an appealing and economical solution to software application testing because of its ability to reach a large international audience. At the same time, crowdsourced testing can generate a large number of bug reports, so the inspection of test reports becomes an enormous but essential software maintenance task. Automatic prediction of the severity of crowdsourced test reports is therefore important because of their high numbers and large proportion of noise. Most existing approaches to this problem utilize supervised machine learning techniques, which often require users to manually label a large amount of training data. However, Android test reports are not labeled with their severity level, and manual labeling is time-consuming and labor-intensive. To address these problems, we propose a Knowledge Transfer Classification (KTC) approach based on text mining and machine learning methods to predict the severity of test reports. Our approach obtains training data from bug repositories and uses knowledge transfer to predict the severity of Android test reports. In addition, our approach uses an Importance Degree Reduction (IDR) strategy based on rough set theory to extract characteristic keywords and obtain more accurate reduction results. The results of several experiments indicate that our approach is beneficial for predicting the severity of Android test reports. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
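The abstract above extracts characteristic keywords that separate severity classes. The paper's IDR strategy is not specified in the abstract; as a much simpler stand-in, the sketch below scores each keyword by how unevenly it appears across severity classes and keeps only the most discriminative ones. A real rough-set reduction would operate on a decision table rather than raw frequencies; all names and the scoring rule here are illustrative.

```python
# Illustrative keyword selection: rank words by the spread between their
# class-conditional document frequencies, a crude proxy for "characteristic".

from collections import Counter

def discriminative_keywords(reports, top_k=2):
    """reports: list of (token_list, severity_label); returns top_k keywords."""
    class_docs = Counter(sev for _, sev in reports)   # documents per class
    word_in_class = {}
    for tokens, sev in reports:
        for w in set(tokens):                         # document frequency, not term frequency
            word_in_class.setdefault(w, Counter())[sev] += 1
    scores = {}
    for w, per_class in word_in_class.items():
        rates = [per_class[c] / class_docs[c] for c in class_docs]
        scores[w] = max(rates) - min(rates)           # spread across classes
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [w for w, _ in ranked[:top_k]]

reports = [
    (["crash", "screen"], "severe"),
    (["crash", "freeze"], "severe"),
    (["typo", "screen"], "minor"),
    (["typo", "color"], "minor"),
]
print(discriminative_keywords(reports))   # 'crash' and 'typo' separate the classes
```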
Open Access Article: A Case Study on Iteratively Assessing and Enhancing Wearable User Interface Prototypes
Symmetry 2017, 9(7), 114; https://doi.org/10.3390/sym9070114
Received: 18 May 2017 / Revised: 4 July 2017 / Accepted: 6 July 2017 / Published: 10 July 2017
Cited by 2
Abstract
Wearable devices are being explored and investigated as a promising computing platform, as well as a source of personal big data, for the post-smartphone era. To deal with a series of rapidly developed wearable prototypes, a well-structured strategy is required to assess the prototypes at various development stages. In this paper, we first design and develop variants of advanced wearable user interface prototypes, including joystick-embedded, potentiometer-embedded, motion-gesture, and contactless infrared user interfaces, for rapidly assessing the hands-on user experience of potential futuristic user interfaces. To achieve this goal systematically, we propose a conceptual test framework and present a case study of using it in an iterative cycle to prototype, test, analyze, and refine the wearable user interface prototypes. We improve the usability of the prototypes by integrating initial user feedback into the leading phase of the test framework, and in the following phase we track signs of improvement through the overall results of usability assessments, task workload assessments, and user experience evaluations. This comprehensive and in-depth case study demonstrates that the iterative approach employed by the test framework was effective in assessing and enhancing the prototypes, as well as in gaining insights into potential applications and establishing practical guidelines for effective and usable wearable user interface development. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
Other


Open Access Feature Paper, Project Report: A Study on Big Data Thinking of the Internet of Things-Based Smart-Connected Car in Conjunction with Controller Area Network Bus and 4G-Long Term Evolution
Symmetry 2017, 9(8), 152; https://doi.org/10.3390/sym9080152
Received: 19 May 2017 / Revised: 2 August 2017 / Accepted: 2 August 2017 / Published: 9 August 2017
Cited by 2
Abstract
A smart connected car in conjunction with the Internet of Things (IoT) is an emerging topic. The fundamental concept of the smart connected car is connectivity, which takes three forms: Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I), and Vehicle-to-Everything (V2X). To support V2V and V2I connectivity, we developed modules in accordance with international standards for On-Board Diagnostics II (OBDII) and 4G Long Term Evolution (4G-LTE) to obtain and transmit vehicle information, along with software to visually check the information provided by our modules. Information related to a user's driving, transmitted to a cloud-based Distributed File System (DFS), was then analyzed for the purpose of big data analysis to provide users with information on their driving habits. Since this work is an ongoing research project, we focus on proposing a system architecture and design in terms of big data analysis. Our contributions through this work are as follows: (1) develop modules based on the Controller Area Network (CAN) bus, OBDII, and 4G-LTE; (2) develop software to check vehicle information on a PC; (3) implement a database of vehicle diagnostic codes; (4) propose a system architecture and design for big data analysis. Full article
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
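The modules described in the project report above collect vehicle information over OBDII and the CAN bus. The project's own code is not shown in the abstract, but the scaling of mode-01 OBD-II responses into physical values is standardized (SAE J1979), so the small decoder below uses the well-known formulas for a few common PIDs; the function name and error handling are illustrative.

```python
# Decode a few standard mode-01 OBD-II PID response payloads (SAE J1979
# scaling formulas), as a sketch of the kind of data such modules transmit.

def decode_obd2(pid: int, data: bytes) -> float:
    """Return the physical value for a mode-01 OBD-II response payload."""
    if pid == 0x0C:                      # engine RPM = (256*A + B) / 4
        return (256 * data[0] + data[1]) / 4.0
    if pid == 0x0D:                      # vehicle speed = A km/h
        return float(data[0])
    if pid == 0x05:                      # coolant temperature = A - 40 deg C
        return data[0] - 40.0
    raise ValueError(f"PID 0x{pid:02X} not handled in this sketch")

print(decode_obd2(0x0C, bytes([0x1A, 0xF8])))   # 1726.0 rpm
print(decode_obd2(0x0D, bytes([0x50])))         # 80.0 km/h
```

Values decoded this way would be the raw material for the driving-habit analysis the report describes.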