
Table of Contents

Big Data Cogn. Comput., Volume 2, Issue 3 (September 2018)

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
Cover Story: With the increasing computational power of smartphones and smaller devices with high connectivity [...]
Displaying articles 1-15
Open Access Article Recreating the Relationship between Subjective Wellbeing and Personality Using Machine Learning: An Investigation into Facebook Online Behaviours
Big Data Cogn. Comput. 2018, 2(3), 29; https://doi.org/10.3390/bdcc2030029
Received: 29 June 2018 / Revised: 25 August 2018 / Accepted: 3 September 2018 / Published: 5 September 2018
Viewed by 552 | PDF Full-text (2027 KB) | HTML Full-text | XML Full-text
Abstract
The twenty-first century has delivered technological advances that allow researchers to utilise social media to predict personal traits and psychological constructs. This article aims to further our understanding of the relationship between subjective wellbeing (SWB) and the Five Factor Model (FFM) of personality by attempting to replicate the relationship using machine learning prediction models. Data from the myPersonality Project were used, with observed SWB scores derived from the Satisfaction With Life Scale (SWLS) and FFM personality profiles generated from responses to the 100-item IPIP proxy of the NEO-PI-R. After data cleaning, FFM personality traits and SWB scores were predicted by reducing Facebook Likes into 50 dimensions using singular value decomposition (SVD) and then running the data through six multiple regressions (fitting the model via least squares and splitting the data via k-fold validation) with the Likes dimensions as predictors and each of the FFM traits and the SWB score as response variables. Standard multiple regression analyses were conducted for the observed and machine-learning-predicted variables to compare the relationships in the context of previous literature. The results revealed that in the observed model, high SWB was predicted by high extraversion, conscientiousness, and agreeableness, and low openness to experience and neuroticism, as per previous research. For the machine learning model, high SWB was predicted by high extraversion, openness to experience, conscientiousness, and agreeableness, and low neuroticism. The relationships between SWB and extraversion, neuroticism, and conscientiousness were successfully replicated in the machine learning model.
Openness to experience changed direction in its relationship with SWB from the observed to the machine-learning-derived variables because the variable could not be accurately recreated, and agreeableness was multicollinear with SWB in the machine learning model because identical digital behaviours were inadvertently used to replicate each construct. Implications of the results and directions for future research are discussed. Full article
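As a rough illustration of the pipeline this abstract describes (SVD-reduced Likes dimensions fed into least-squares regression under k-fold validation), the following sketch uses an entirely synthetic user-by-Likes matrix; the sizes, fold count, and variable names are illustrative assumptions, not the authors' materials:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_pages, k = 400, 1000, 50
likes = (rng.random((n_users, n_pages)) < 0.05).astype(float)  # synthetic binary Likes matrix
swb = rng.normal(size=n_users)                                 # synthetic SWB scores

# Reduce Likes to k dimensions via SVD: project users onto top singular directions
u, s, vt = np.linalg.svd(likes, full_matrices=False)
dims = u[:, :k] * s[:k]                                        # n_users x 50

# 10-fold least-squares regression: fit on nine folds, predict the held-out fold
folds = np.array_split(rng.permutation(n_users), 10)
pred = np.empty(n_users)
for hold in folds:
    train = np.setdiff1d(np.arange(n_users), hold)
    x_tr = np.column_stack([np.ones(len(train)), dims[train]])
    beta, *_ = np.linalg.lstsq(x_tr, swb[train], rcond=None)
    pred[hold] = np.column_stack([np.ones(len(hold)), dims[hold]]) @ beta

print(pred.shape)  # one machine-learning-derived score per user
```

The resulting `pred` vector plays the role of the machine-learning-predicted variable that the study then compares against the observed scores.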

Open Access Article Analysis of Nonlinear Bypass Route Computation for Wired and Wireless Network Cooperation Recovery System
Big Data Cogn. Comput. 2018, 2(3), 28; https://doi.org/10.3390/bdcc2030028
Received: 23 July 2018 / Revised: 24 August 2018 / Accepted: 28 August 2018 / Published: 3 September 2018
Viewed by 363 | PDF Full-text (829 KB) | HTML Full-text | XML Full-text
Abstract
It is a significant issue for network carriers to immediately restore telecommunication services when a disaster occurs. A wired and wireless network cooperation (NeCo) system was proposed to address this problem. The goal of the NeCo system is quick and high-throughput recovery of telecommunication services in the disaster area using single-hop wireless links backhauled by wired networks. It establishes wireless bypass routes between widely deployed leaf nodes to relay packets to and from dead nodes whose normal wired communication channels are disrupted. In a previous study, the optimal routes for wireless links were calculated to maximize the expected physical-layer throughput by solving a binary integer programming problem. However, that routing method did not consider the throughput reduction caused by sharing of wireless resources among dead nodes. Therefore, this paper proposes a nonlinear bypass route computation method for the NeCo system that considers wireless resource sharing among dead nodes. A Monte Carlo-based approach is applied, since the nonlinear programming problem is difficult to solve exactly. The performance of the proposed routing method was evaluated with computer simulations, which confirmed that bandwidth division loss can be avoided with the proposed method. Full article
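The bandwidth-division effect and the Monte Carlo search over routes can be sketched as follows; the link rates, node counts, and sample budget are hypothetical toy values, and the real method optimises expected physical-layer throughput rather than this simplified objective:

```python
import random

# Hypothetical link rates: rate[d][l] is the wireless rate (Mbps) between
# dead node d and leaf node l; 0 means the pair is out of radio range.
rate = [[54, 24, 0], [36, 48, 12], [0, 24, 36], [18, 0, 54]]
n_dead, n_leaf = len(rate), len(rate[0])

def total_throughput(assign):
    # Bandwidth division: dead nodes relayed by the same leaf share its air time.
    share = [assign.count(l) for l in range(n_leaf)]
    return sum(rate[d][l] / share[l] for d, l in enumerate(assign) if rate[d][l] > 0)

random.seed(0)
best, best_tp = None, -1.0
for _ in range(5000):  # Monte Carlo sampling instead of solving the NLP exactly
    cand = [random.randrange(n_leaf) for _ in range(n_dead)]
    tp = total_throughput(cand)
    if tp > best_tp:
        best, best_tp = cand, tp
print(best, round(best_tp, 1))
```

Because the objective divides each leaf's rate by the number of dead nodes it serves, it is nonlinear in the assignment, which is exactly why sampling is attractive here.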

Open Access Article Productivity Benchmarking Using Analytic Network Process (ANP) and Data Envelopment Analysis (DEA)
Big Data Cogn. Comput. 2018, 2(3), 27; https://doi.org/10.3390/bdcc2030027
Received: 29 July 2018 / Revised: 15 August 2018 / Accepted: 22 August 2018 / Published: 3 September 2018
Viewed by 456 | PDF Full-text (857 KB) | HTML Full-text | XML Full-text
Abstract
Measuring productivity is a systematic process for both inter- and intra-organizational comparisons. Productivity measurement can be used to control and facilitate decision-making in manufacturing as well as service organizations. This study's objective was to develop a decision support framework integrating the analytic network process (ANP) and data envelopment analysis (DEA) to tackle productivity measurement and benchmarking problems in a manufacturing environment. The ANP was used to capture the interdependency between the criteria, taking ambiguity and vagueness into consideration. The nonparametric DEA approach was utilized to determine the input-oriented, constant-returns-to-scale (CRS) efficiency of different value-adding production units and to benchmark them. The proposed framework was implemented to benchmark the productivity of an apparel manufacturing company. By applying the model, industrial managers can gain benefits by identifying the contributing factors that play an important role in increasing the productivity of manufacturing organizations. Full article
(This article belongs to the Special Issue Big-Data Driven Multi-Criteria Decision-Making)
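For readers unfamiliar with DEA, a minimal sketch of the input-oriented CRS (CCR) efficiency score in multiplier form, solved as a linear program, is shown below; the two-input, one-output data are invented for illustration and are unrelated to the paper's apparel case study:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 2 inputs, 1 output for 4 production units (DMUs).
X = np.array([[4.0, 3.0], [7.0, 3.0], [8.0, 1.0], [4.0, 2.0]])  # inputs
Y = np.array([[1.0], [1.0], [1.0], [1.0]])                      # outputs

def ccr_efficiency(j):
    """Input-oriented CRS (CCR) efficiency of DMU j, multiplier form."""
    n_in, n_out = X.shape[1], Y.shape[1]
    # Variables: output weights u, then input weights v; maximise u . y_j.
    c = np.concatenate([-Y[j], np.zeros(n_in)])
    # Normalisation constraint: v . x_j = 1
    A_eq = np.concatenate([np.zeros(n_out), X[j]])[None, :]
    # For every DMU k: u . y_k - v . x_k <= 0
    A_ub = np.hstack([Y, -X])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(X)),
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, None))
    return -res.fun

scores = [round(ccr_efficiency(j), 3) for j in range(len(X))]
print(scores)
```

Units on the efficient frontier score 1; dominated units score below 1 and can be benchmarked against their efficient peers.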

Open Access Article Edge Machine Learning: Enabling Smart Internet of Things Applications
Big Data Cogn. Comput. 2018, 2(3), 26; https://doi.org/10.3390/bdcc2030026
Received: 11 July 2018 / Revised: 14 August 2018 / Accepted: 17 August 2018 / Published: 3 September 2018
Viewed by 1077 | PDF Full-text (2944 KB) | HTML Full-text | XML Full-text
Abstract
Machine learning has traditionally been performed solely on servers and high-performance machines. However, advances in chip technology have given us miniature computers that fit in our pockets, and mobile processors have vastly increased in capability, narrowing the gap between the simple processors embedded in such devices and their more complex cousins in personal computers. Thus, with the current advancement of these devices in terms of processing power, energy storage, and memory capacity, the opportunity has arisen to extract great value from on-device machine learning for Internet of Things (IoT) devices. Implementing machine learning inference on edge devices has huge potential and is still in its early stages; however, it is already more powerful than most realise. In this paper, a step forward has been taken to understand the feasibility of running machine learning algorithms, both training and inference, on a Raspberry Pi running an embedded version of the Android operating system designed for IoT device development. Three different algorithms, Random Forest, Support Vector Machine (SVM), and Multi-Layer Perceptron, were tested using ten diverse data sets on the Raspberry Pi to profile their performance in terms of speed (training and inference), accuracy, and power consumption. As a result of the conducted tests, the SVM algorithm proved to be slightly faster in inference and more efficient in power consumption, but the Random Forest algorithm exhibited the highest accuracy. In addition to the performance results, we discuss usability scenarios and the idea of implementing more complex and taxing algorithms, such as deep learning, on these small devices in more detail. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2018)
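A simplified version of the benchmarking protocol (timing training and inference while recording accuracy for the three algorithm families) might look like the following; the data set, hyperparameters, and library choice are assumptions rather than the authors' setup, and power consumption is not measured here:

```python
import time
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"RandomForest": RandomForestClassifier(random_state=0),
          "SVM": SVC(),
          "MLP": MLPClassifier(max_iter=500, random_state=0)}

results = {}
for name, model in models.items():
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)                       # training time
    train_s = time.perf_counter() - t0
    t0 = time.perf_counter()
    acc = model.score(X_te, y_te)               # inference time + accuracy
    infer_s = time.perf_counter() - t0
    results[name] = (round(train_s, 3), round(infer_s, 3), round(acc, 3))
    print(name, results[name])
```

Run on a Raspberry Pi, the same loop would expose the speed gaps the paper profiles; on a desktop the absolute timings are of course much smaller.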

Open Access Article Exploiting Inter- and Intra-Base Crossing with Multi-Mappings: Application to Environmental Data
Big Data Cogn. Comput. 2018, 2(3), 25; https://doi.org/10.3390/bdcc2030025
Received: 25 May 2018 / Revised: 9 August 2018 / Accepted: 10 August 2018 / Published: 19 August 2018
Viewed by 504 | PDF Full-text (2400 KB) | HTML Full-text | XML Full-text
Abstract
Environmental data are currently gaining more and more interest as they are required to understand global changes. In this context, sensor data are collected and stored in dedicated databases. Frameworks have been developed for this purpose and rely on standards, for instance the Sensor Observation Service (SOS) provided by the Open Geospatial Consortium (OGC), where all measurements are bound to a so-called Feature of Interest (FoI). These databases are used to validate and test scientific hypotheses, often formulated as correlations and causality between variables, for instance the study of the correlations between environmental factors and chlorophyll levels in the global ocean. However, the hypotheses to be tested are often difficult to formulate, as the number of variables that the user can navigate through can be huge. Moreover, it is often the case that the data are stored in a manner that prevents scientists from crossing them in order to retrieve relevant correlations. Indeed, the FoI can be a spatial location (e.g., a city), but can also be any other object (e.g., an animal species). The same data can thus be represented in several manners, depending on the point of view: the FoI varies from one representation to another, while the data remain unchanged. In this article, we propose a novel methodology whose crucial step is to define multiple mappings from the data sources to these models, which can then be crossed, offering possibilities that would otherwise be hidden from the end-user if only the initial, single data model were used. These possibilities are provided through a catalog embedding the multiple points of view and allowing the user to navigate through them via innovative OLAP-like operations.
It should be noted that the main contribution of this work lies in the use of multiple points of view, as many other works have been proposed for manipulating, aggregating, visualizing, and navigating through geospatial information. Our proposal has been tested on data from an existing environmental observatory in Lebanon. It allows scientists to realize how biased the representations of their data can be and how crucial it is to consider multiple points of view when studying the links between phenomena. Full article
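The multi-mapping idea can be illustrated in miniature: the same observations are grouped under two different Features of Interest, giving two crossable views of one data set. All record fields and values below are hypothetical and unrelated to the Lebanese observatory:

```python
# Illustrative sketch (hypothetical field names): the same measurements
# mapped under two different Features of Interest (FoI).
measurements = [
    {"station": "S1", "species": "trout", "var": "chlorophyll", "value": 2.1},
    {"station": "S1", "species": "carp",  "var": "chlorophyll", "value": 1.4},
    {"station": "S2", "species": "trout", "var": "chlorophyll", "value": 3.0},
]

def remap(rows, foi):
    """Group the same observations under a chosen FoI (point of view)."""
    view = {}
    for r in rows:
        view.setdefault(r[foi], []).append(r["value"])
    return view

by_station = remap(measurements, "station")   # spatial point of view
by_species = remap(measurements, "species")   # biological point of view
print(by_station, by_species)
```

A catalog in the paper's sense would store several such mappings side by side, letting the user pivot between points of view without changing the underlying data.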

Open Access Article Data Science Approach for Simulating Educational Data: Towards the Development of Teaching Outcome Model (TOM)
Big Data Cogn. Comput. 2018, 2(3), 24; https://doi.org/10.3390/bdcc2030024
Received: 19 June 2018 / Revised: 2 August 2018 / Accepted: 3 August 2018 / Published: 10 August 2018
Viewed by 645 | PDF Full-text (958 KB) | HTML Full-text | XML Full-text
Abstract
The increasing availability of educational data provides the educational researcher with numerous opportunities to use analytics to extract useful knowledge to enhance teaching and learning. While learning analytics focuses on the collection and analysis of data about students and their learning contexts, teaching analytics focuses on the analysis of the design of the teaching environment and the quality of learning activities provided to students. In this article, we propose a data science approach that incorporates the analysis and delivery of a data-driven solution to explore the role of teaching analytics, without compromising privacy, by creating pseudocode that simulates data to help develop test cases of teaching activities. The outcome of this approach is intended to inform the development of a teaching outcome model (TOM) that can be used to inspire and inspect the quality of teaching. The simulation approach reported in this research was implemented with Splunk, a Big Data platform designed to collect and analyse high volumes of machine-generated data and render results on a dashboard in real time. We present the results as a series of visual dashboards illustrating patterns, trends, and results in teaching performance. Our research aims to contribute to the development of an educational data science approach that supports a culture of data-informed decision making in higher education. Full article
(This article belongs to the Special Issue Big Data and Data Science in Educational Research)

Open Access Article LPaaS as Micro-Intelligence: Enhancing IoT with Symbolic Reasoning
Big Data Cogn. Comput. 2018, 2(3), 23; https://doi.org/10.3390/bdcc2030023
Received: 15 July 2018 / Revised: 27 July 2018 / Accepted: 31 July 2018 / Published: 3 August 2018
Viewed by 733 | PDF Full-text (1955 KB) | HTML Full-text | XML Full-text
Abstract
In the era of Big Data and IoT, successful systems have to be designed to discover, store, process, learn, analyse, and predict from a massive amount of data—in short, they have to behave intelligently. Despite the success of non-symbolic techniques such as deep learning, symbolic approaches to machine intelligence still have a role to play in order to achieve key properties such as observability, explainability, and accountability. In this paper we focus on logic programming (LP), and advocate its role as a provider of symbolic reasoning capabilities in IoT scenarios, suitably complementing non-symbolic ones. In particular, we show how its re-interpretation in terms of LPaaS (Logic Programming as a Service) can work as an enabling technology for distributed situated intelligence. A possible example of hybrid reasoning—where symbolic and non-symbolic techniques fruitfully combine to produce intelligent behaviour—is presented, demonstrating how LPaaS could work in a smart energy grid scenario. Full article

Open Access Article The Rise of Big Data Science: A Survey of Techniques, Methods and Approaches in the Field of Natural Language Processing and Network Theory
Big Data Cogn. Comput. 2018, 2(3), 22; https://doi.org/10.3390/bdcc2030022
Received: 30 May 2018 / Revised: 29 July 2018 / Accepted: 31 July 2018 / Published: 2 August 2018
Viewed by 486 | PDF Full-text (446 KB) | HTML Full-text | XML Full-text
Abstract
The continuous creation of data has posed new research challenges due to its complexity, diversity, and volume. Consequently, Big Data has increasingly become a fully recognised scientific field. This article provides an overview of the current research efforts in Big Data science, with particular emphasis on its applications as well as its theoretical foundations. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2018)

Open Access Review EMG Pattern Recognition in the Era of Big Data and Deep Learning
Big Data Cogn. Comput. 2018, 2(3), 21; https://doi.org/10.3390/bdcc2030021
Received: 3 July 2018 / Revised: 20 July 2018 / Accepted: 20 July 2018 / Published: 1 August 2018
Viewed by 741 | PDF Full-text (376 KB) | HTML Full-text | XML Full-text
Abstract
The increasing amount of data in electromyographic (EMG) signal research has greatly increased the importance of developing advanced data analysis and machine learning techniques which are better able to handle “big data”. Consequently, more advanced applications of EMG pattern recognition have been developed. This paper begins with a brief introduction to the main factors that expand EMG data resources into the era of big data, followed by the recent progress of existing shared EMG data sets. Next, we provide a review of recent research and development in EMG pattern recognition methods that can be applied to big data analytics. These modern EMG signal analysis methods can be divided into two main categories: (1) methods based on feature engineering involving a promising big data exploration tool called topological data analysis; and (2) methods based on feature learning with a special emphasis on “deep learning”. Finally, directions for future research in EMG pattern recognition are outlined and discussed. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2018)
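As a concrete example of the feature-engineering category this review surveys, the following sketch computes three classic time-domain EMG features per analysis window; the signal is synthetic noise and the window length is an arbitrary assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
emg = rng.normal(0, 1, 2000)          # synthetic single-channel EMG signal

def window_features(signal, win=200):
    """Classic time-domain features per window: RMS, MAV, zero crossings."""
    feats = []
    for start in range(0, len(signal) - win + 1, win):
        w = signal[start:start + win]
        rms = np.sqrt(np.mean(w ** 2))          # root mean square amplitude
        mav = np.mean(np.abs(w))                # mean absolute value
        zc = int(np.sum(np.diff(np.sign(w)) != 0))  # zero-crossing count
        feats.append((rms, mav, zc))
    return np.array(feats)

F = window_features(emg)
print(F.shape)  # windows x features
```

Feature matrices like `F` are what a downstream classifier (or, in the feature-learning alternative, a deep network operating on the raw windows) consumes.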
Open Access Article A User Study of a Prototype of a Spatial Augmented Reality System for Education and Interaction with Geographic Data
Big Data Cogn. Comput. 2018, 2(3), 20; https://doi.org/10.3390/bdcc2030020
Received: 10 July 2018 / Revised: 23 July 2018 / Accepted: 31 July 2018 / Published: 1 August 2018
Viewed by 421 | PDF Full-text (7379 KB) | HTML Full-text | XML Full-text
Abstract
Recent technological advancements in many areas have changed the way individuals interact with the world. Some daily tasks require visualization skills, especially in a map-reading context. Augmented Reality systems could substantially improve geovisualization, since they enhance a real scene with virtual information. However, relatively little research has assessed the effective contribution of such systems during map reading. This research therefore aims to provide a first look into the usability of an Augmented Reality system prototype for interaction with geoinformation. For this purpose, we designed an activity with volunteers in order to assess the system prototype's usability. We interviewed 14 users (three experts and 11 non-experts), where experts were subjects with the following characteristics: a professor; with a PhD degree in Cartography, GIS, Geography, or Environmental Sciences/Water Resources; and with experience treating spatial information related to water resources. The activity aimed to detect where the system really helps the user to interpret a hydrographic map and how the users were helped by the Augmented Reality system prototype. We conclude that the Augmented Reality system was helpful to users during map reading, and that it supported the construction of spatial knowledge within the proposed scenario. Full article

Open Access Article Traffic Sign Recognition Based on Synthesised Training Data
Big Data Cogn. Comput. 2018, 2(3), 19; https://doi.org/10.3390/bdcc2030019
Received: 29 May 2018 / Revised: 18 July 2018 / Accepted: 24 July 2018 / Published: 27 July 2018
Viewed by 528 | PDF Full-text (2651 KB) | HTML Full-text | XML Full-text
Abstract
To deal with the richness of visual appearance variation found in real-world data, we propose to synthesise training data capturing these differences for traffic sign recognition. The use of synthetic training data, created from road traffic sign templates, makes it possible to overcome a limitation of existing traffic sign recognition databases, which cover only the specific sets of road signs found in particular countries or regions. This approach is used to generate a database of synthesised images depicting traffic signs under different view-light conditions and rotations, in order to simulate the complexity of real-world scenarios. With our synthesised data and a robust end-to-end Convolutional Neural Network (CNN), we propose a data-driven traffic sign recognition system that achieves not only high recognition accuracy, but also high computational efficiency in both the training and recognition processes. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2018)
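The synthesis idea (rendering template signs under varied view-light conditions and rotations) can be sketched with simple geometric and photometric perturbations; the template, perturbation ranges, and noise level below are illustrative assumptions, not the paper's generation pipeline:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
template = np.zeros((32, 32), dtype=float)   # stand-in for a sign template
template[8:24, 8:24] = 1.0

def synthesise(template, n):
    """Generate n training variants: random rotation, brightness, and noise."""
    out = []
    for _ in range(n):
        img = rotate(template, angle=rng.uniform(-15, 15),
                     reshape=False, order=1)          # viewpoint rotation
        img = img * rng.uniform(0.6, 1.4)             # lighting variation
        img = img + rng.normal(0, 0.05, img.shape)    # sensor noise
        out.append(np.clip(img, 0.0, 1.0))
    return np.stack(out)

batch = synthesise(template, 8)
print(batch.shape)
```

Batches generated this way from each template, labelled by the template's class, would then form the training set for the CNN.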

Open Access Review Digitalisation and Big Data Mining in Banking
Big Data Cogn. Comput. 2018, 2(3), 18; https://doi.org/10.3390/bdcc2030018
Received: 27 June 2018 / Revised: 12 July 2018 / Accepted: 17 July 2018 / Published: 20 July 2018
Cited by 3 | Viewed by 840 | PDF Full-text (429 KB) | HTML Full-text | XML Full-text
Abstract
Banking, as a data-intensive subject, has been progressing continuously under the promoting influences of the era of big data. Exploring advanced big data analytic tools such as Data Mining (DM) techniques is key for the banking sector, which aims to reveal valuable information in overwhelming volumes of data and to achieve better strategic management and customer satisfaction. In order to provide sound direction for future research and development, a comprehensive and up-to-date review of the current research status of DM in banking is extremely beneficial. Since existing reviews only cover applications until 2013, this paper aims to fill this research gap by presenting the significant progress and most recent DM implementations in banking after 2013. By collecting and analyzing the trends in research focus, data resources, technological aids, and data analytical tools, this paper contributes valuable insights regarding the future development of both DM and the banking sector, along with a comprehensive one-stop reference table. Moreover, we identify the key obstacles and present a summary for all interested parties that are facing the challenges of big data. Full article

Open Access Article Per-Flow Throughput Fairness in Ring Aggregation Network with Multiple Edge Routers
Big Data Cogn. Comput. 2018, 2(3), 17; https://doi.org/10.3390/bdcc2030017
Received: 11 June 2018 / Revised: 12 July 2018 / Accepted: 17 July 2018 / Published: 18 July 2018
Cited by 2 | Viewed by 425 | PDF Full-text (933 KB) | HTML Full-text | XML Full-text
Abstract
Ring aggregation networks are often employed by network carriers because of their efficiency and high fault tolerance. A fairness scheme is required in ring aggregation to achieve per-flow throughput fairness and bufferbloat avoidance, because frames are forwarded along multiple ring nodes. N Rate N + 1 Color Marking (NRN + 1CM) was proposed to achieve fairness in ring aggregation networks consisting of Layer-2 Switches (SWs). With NRN + 1CM, frames are selectively discarded based on their color and the frame-dropping threshold. To avoid the accumulation of queuing delay, frames are discarded at upstream nodes in advance through the notification process for the frame-dropping threshold. However, previous work assumed that NRN + 1CM is employed in a logical daisy-chain topology linked to one Edge Router (ER), and the currently available threshold notification process of NRN + 1CM cannot be employed for ring networks with multiple ERs. Therefore, this paper proposes a method for applying NRN + 1CM to a ring aggregation network with multiple ERs. With the proposed algorithm, an SW dynamically selects the dropping threshold to send in order to avoid excess frame discarding. The performance of the proposed scheme was confirmed through computer simulations. Full article
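A heavily simplified sketch of the color-based selective discarding underlying NRN + 1CM follows; the marking rule, rates, and threshold values are hypothetical stand-ins for the scheme's actual parameters, shown only to convey the mechanism of color marking plus threshold-based dropping:

```python
# Hypothetical sketch: a frame's color reflects its flow's sending rate
# relative to a fair rate; a ring node forwards a frame only if its color
# does not exceed the node's current dropping threshold.
def mark_color(flow_rate, fair_rate, n_colors=4):
    """Higher-rate flows get higher (more droppable) colors."""
    color = int(flow_rate / fair_rate)
    return min(color, n_colors)

def forward(frames, threshold):
    """Discard frames whose color exceeds the dropping threshold."""
    return [f for f in frames if f["color"] <= threshold]

frames = [{"flow": i, "color": mark_color(rate, fair_rate=10)}
          for i, rate in enumerate([5, 12, 25, 40])]
print([f["color"] for f in frames])       # colors per flow
print(len(forward(frames, threshold=1)))  # frames that survive the node
```

Notifying the threshold upstream, as the abstract describes, lets earlier nodes perform this discard before queues build up; the paper's contribution is choosing which threshold to send when several ERs are present.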

Open Access Article From Big Data to Deep Learning: A Leap Towards Strong AI or ‘Intelligentia Obscura’?
Big Data Cogn. Comput. 2018, 2(3), 16; https://doi.org/10.3390/bdcc2030016
Received: 1 June 2018 / Revised: 9 July 2018 / Accepted: 16 July 2018 / Published: 17 July 2018
Viewed by 766 | PDF Full-text (970 KB) | HTML Full-text | XML Full-text
Abstract
Astonishing progress is being made in the field of artificial intelligence (AI), and particularly in machine learning (ML). Novel deep learning approaches even promise to boost the idea of AI equipped with capabilities of self-improvement. But what are the wider societal implications of this development, and to what extent are classical AI concepts still relevant? This paper discusses these issues, including an overview of basic concepts and notions of AI in relation to big data. Particular focus lies on the roles, societal consequences, and risks of machine and deep learning. The paper argues that the growing relevance of AI in society bears serious risks of deep automation bias, reinforced by insufficient machine learning quality, lacking algorithmic accountability, and mutual risks of misinterpretation, up to incrementally aggravating conflicts in decision-making between humans and machines. Reducing these risks and avoiding the emergence of an intelligentia obscura requires overcoming ideological myths of AI and revitalising a culture of responsible, ethical technology development and usage. This includes the need for a broader discussion about the risks of increasing automation and useful governance approaches to stimulate AI development with respect to individual and societal well-being. Full article

Open Access Article Adaptive Provisioning of Heterogeneous Cloud Resources for Big Data Processing
Big Data Cogn. Comput. 2018, 2(3), 15; https://doi.org/10.3390/bdcc2030015
Received: 31 May 2018 / Revised: 5 July 2018 / Accepted: 9 July 2018 / Published: 12 July 2018
Cited by 1 | Viewed by 447 | PDF Full-text (640 KB) | HTML Full-text | XML Full-text
Abstract
Efficient utilization of resources plays an important role in the performance of large-scale task processing. In cases where heterogeneous types of resources are used within the same application, it is hard to achieve good utilization of all of the different resource types. By taking advantage of recent developments in cloud infrastructure that enable the use of dynamic clusters of resources, and by dynamically altering the size of the available resources for each resource type, the overall utilization of resources can, however, be improved. Starting from this premise, this paper discusses a solution that provides a generic algorithm to estimate the desired ratios of instances processing tasks, as well as the ratios of the resources used by these instances, without the need for trial runs or a priori knowledge of the execution steps. These ratios are then used as part of an adaptive system that is able to reconfigure itself to maximize utilization. To verify the solution, a reference framework which adaptively manages clusters of functionally different VMs to host a calculation scenario was implemented. Experiments were conducted on a compute-heavy use case in which the probability of underground pipeline failures is determined based on the settlement of soils. These experiments show that the solution is capable of eliminating large amounts of under-utilization, resulting in increased throughput and lower lead times. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2018)
