The modern digitized world has led to the emergence of a new paradigm on global information networks and infrastructures known as Cyberspace and the studies of Cybernetics, which bring seamless integration of physical, social and mental spaces. Cyberspace is becoming an integral part of our daily life from learning and entertainment to business and cultural activities. As expected, this whole concept of Cybernetics brings new challenges that need to be tackled. The 2017 IEEE Cyber Science and Technology Congress (CyberSciTech 2017) provided a forum for researchers to report their research findings and exchange ideas. The congress took place in Orlando, Florida, USA during 6–10 November 2017. Not counting poster papers, the congress accepted over fifty papers that are divided into nine sessions (four of them are special sessions on hot research areas). The congress was co-located with three other IEEE conferences, the 15th IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC 2017), the 15th IEEE International Conference on Pervasive Intelligence and Computing (PICom 2017), and the 3rd IEEE International Conference on Big Data Intelligence and Computing (DataCom 2017). In this report, we provide an overview of the research contributions of the papers in CyberSciTech 2017. We group them roughly based on the sessions organized by the congress.
2. Cyber Human Science and Computing
This session included six papers. Four of them describe work pertinent to medicine [1], such as overall survival rate prediction and the differentiation of skin diseases. A common theme is the use of machine learning algorithms to improve the state of the practice in medicine. One paper reported the implementation and evaluation of an architecture for modeling human behaviors [5]. The remaining paper presented work on identifying image tags from Instagram hashtags [6], which could help develop training datasets automatically.
Liu et al. [1] reported their work on developing a computer-based method for automatic analysis of respiratory sounds captured with a stethoscope. In their study, they captured data from 60 patients on three types of respiratory sounds, namely, wheezes, crackles, and normal sounds. They proposed a deep convolutional neural network (CNN) model to automatically learn the features of the various respiratory sounds for classification. Their primary research contribution lies in the design of the CNN model, which consists of six convolutional layers, three max pooling layers, and three fully connected layers.
Guo et al. [2] proposed a novel discriminator for distinguishing two common skin diseases, seborrheic keratosis (SK) and flat warts (FW), using deep convolutional neural networks applied to confocal laser scanning microscope images. Clinically, it is critical to differentiate the two skin diseases because they require very different treatment plans. Unfortunately, it is quite difficult to do so reliably. The authors showed that the accuracy of the proposed method rivals that of dermatologists.
Lu et al. proposed a novel method to predict the overall survival rate of gastric cancer patients after D2 gastrectomy. Overall survival rate estimation is an important topic in oncology. Existing statistical prediction models do not function well with high-dimensional datasets. The authors proposed a multi-modal hypergraph learning framework to overcome the challenges caused by the high dimensionality of the dataset and were therefore able to improve the accuracy of prediction. Using a dataset of 939 patients obtained from West China Hospital of Sichuan University, the authors demonstrated the superiority of their method compared with random forest and support vector machine.
Jaimes and Steele proposed to use crowdsensing to collect massive real-world health and physiological data. The authors made two very interesting observations regarding crowdsensing: (1) the data collected should be specific to the health conditions; and (2) a person would be more interested in learning about his or her health conditions and is therefore more willing to participate in crowdsensing. A key requirement for enabling crowdsensing is to incentivize participation from a large number of people. The authors introduced a taxonomy of incentive mechanisms for health crowdsensing, and provided a novel incentive mechanism for crowdsensing based on a game theoretical multi-window framework.
Guo and Ma introduced a model describing a wide variety of human behaviors in terms of personality as well as contextual information such as location and time. They refer to the digital model of a person as an ontic persona. They described an implementation architecture for producing such ontic personae based on collected personal data. The authors evaluated their model and implementation using two case studies, one focused on the detection of scenarios, and the other on the classification of different personality characteristics such as openness, conscientiousness, extraversion, agreeableness, and neuroticism.
Giannoulakis, Tsapatsoulis, and Ntalianis reported their work on using the Hyperlink-Induced Topic Search (HITS) algorithm [7] to identify image tags from Instagram hashtags. They created a bipartite graph with two types of nodes. The first type of node refers to the annotators, and the second type refers to the tags that the annotators have selected. They showed that the HITS algorithm can be used to accurately select honest annotators for the images.
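To make the role of HITS on such a bipartite graph more concrete, the following is a minimal sketch, not the authors' implementation. It assumes annotators act as hubs and tags as authorities; the function name `hits_bipartite`, the toy data, and the iteration count are all hypothetical.

```python
import math

def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}

def hits_bipartite(selections, iterations=50):
    """Run HITS on an annotator -> tag bipartite graph.

    `selections` maps each annotator to the set of tags they selected.
    Assumption: annotators are treated as hubs and tags as authorities.
    """
    tags = sorted({t for chosen in selections.values() for t in chosen})
    hub = {a: 1.0 for a in selections}
    auth = {t: 1.0 for t in tags}
    for _ in range(iterations):
        # A tag scores highly when highly scored annotators selected it.
        auth = _normalize({
            t: sum(hub[a] for a, chosen in selections.items() if t in chosen)
            for t in tags
        })
        # An annotator scores highly when they selected highly scored tags.
        hub = _normalize({
            a: sum(auth[t] for t in chosen)
            for a, chosen in selections.items()
        })
    return hub, auth

# Hypothetical toy data: ann3 picks a tag nobody else agrees with.
selections = {
    "ann1": {"dog", "park"},
    "ann2": {"dog", "park", "grass"},
    "ann3": {"pizza"},
}
hub, auth = hits_bipartite(selections)
```

In this toy example the annotator whose choice nobody else shares ends up with a hub score close to zero, which is the kind of signal that could be used to filter unreliable annotators.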
3. Cyber Physical Computing and Systems
Because twelve papers were accepted on this topic, their presentations were divided into two sessions during the congress. The six papers in the first session are all related to transportation, with five of them presenting research that is fundamental to intelligent transportation systems [8], and the remaining one on the clustering of trajectory datasets [13].
Mahjoub et al. reported their work on using machine learning to predict vehicle trajectories. More accurate vehicle trajectory prediction could improve situational awareness, which is important for forward collision warning and cooperative adaptive cruise control. The authors proposed a two-layer neural network-based system that predicts the values of vehicle parameters, including velocity, acceleration, and yaw rate, in the first layer and, based on the output of the first layer, predicts the longitudinal and lateral trajectory points in the second.
Jamialahmadi and Fallah presented their research on the development of a simulation framework to perform vehicle safety analysis based on human behavior in emergency situations. They specifically studied the modeling of driver braking behavior upon receiving a warning from the collision warning system.
Lei et al. reported their work on time-of-arrival estimation for buses based on Global Positioning System (GPS) data. To overcome the low precision of bus-mounted GPS systems and the lack of real-time traffic information, they proposed a novel GPS calibration method and a hybrid dynamic prediction model that takes into account traffic flow evaluation results and GPS position calibration for better time-of-arrival estimation.
Khandani et al. presented their work on tracking vehicle information via wireless networks to better understand traffic flow and to detect hazardous situations. Using several public datasets, they studied the impact of different data sampling strategies, including periodic beaconing and error-dependent sampling, combined with constant-speed and constant-acceleration estimation.
Sun et al. proposed a genetic algorithm for more efficient vehicle routing in the context of multimodal transportation logistics. They introduced adaptive crossover and mutation probabilities to enhance the global search performance.
Rayatidamavandi et al. presented their work on the clustering of large trajectory datasets. This is typically the first step in extracting knowledge from trajectory data. They experimented with two types of hash functions, namely, locality-sensitive and distance-based hash functions, to cluster trajectory data. They showed that the locality-sensitive hashes lead to higher accuracy, but not higher bucket balance.
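As a rough illustration of how a locality-sensitive hash can group trajectories into buckets, the sketch below uses random-hyperplane hashing, a common LSH family that is swapped in here for illustration; the hash functions used in the paper are not described in this report. It assumes every trajectory has already been resampled to the same number of (x, y) points, and all function names and data are hypothetical.

```python
import random
from collections import defaultdict

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_key(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )

def bucket_trajectories(trajectories, n_bits=4, seed=0):
    """Group trajectory indices by their LSH key.

    Assumption: all trajectories contain the same number of (x, y) points.
    """
    dim = 2 * len(trajectories[0])
    planes = make_hyperplanes(dim, n_bits, seed)
    buckets = defaultdict(list)
    for idx, traj in enumerate(trajectories):
        flat = [coord for point in traj for coord in point]
        buckets[lsh_key(flat, planes)].append(idx)
    return dict(buckets)

# Hypothetical toy trajectories.
trajectories = [
    [(1.0, 2.0), (2.0, 3.5), (3.0, 5.0)],
    [(2.0, 4.0), (4.0, 7.0), (6.0, 10.0)],        # same direction, scaled by 2
    [(-1.0, -2.0), (-2.0, -3.5), (-3.0, -5.0)],   # opposite direction
]
buckets = bucket_trajectories(trajectories)
```

Random-hyperplane hashing captures angular similarity only, so the first two trajectories share a bucket while the third does not. Nothing in this scheme forces buckets to be of similar size, which is consistent with the bucket-balance trade-off noted above.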
The six papers in the second session cover a variety of topics. We provide an overview of five of them here. Two papers are related to smart health, with one on the coordination of health-related data from multiple sources [14], and the other on anomaly detection based on physiological data [15]. The remaining three papers are about high-performance computing [16], improving mobile user experience [17], and clustering for big data [18].
Xu et al. studied the issue of coordinating data collection from multiple wearable devices for health monitoring purposes. They proposed two statistical methods that take into account the time discrepancy between different data sources and its distribution. They verified the accuracy of their models with a set of experiments, and demonstrated the usability of the proposed models with a case study using the adaptive frequency strategy. They showed that, compared with the static frequency method, their models help improve the completeness of the data collected and minimize the redundancy in the data.
Wang et al. proposed an anomaly detection algorithm for physiological data collected via wearable devices. They validated their algorithm with real patient datasets and demonstrated that it has good anomaly detection accuracy with acceptable alarm precision and recall ratios.
Xu et al. proposed a novel cache management method designed for solid state drives (SSDs) to improve the performance and lifetime of SSDs. Unlike SRAM and DRAM, SSDs suffer from low cache performance due to their internal garbage collection activities. Compared with traditional magnetic hard drives, SSDs have a more limited lifetime because of their limited number of program/erase (P/E) cycles. The proposed method is based on reuse-distance-aware cache management, which strikes a balance between the cache hit ratio and the internal garbage collection overhead.
Xu et al. described a min-max based scheme for caching at edge nodes for mobile cyber physical systems. Caching contents at edge nodes improves users’ quality of experience because caching reduces end-to-end latency. Their solution addresses the trustworthiness of edge nodes based on users’ feedback.
Aseeri et al. reported their work on using bisecting K-means to cluster datasets that are larger than memory capacity. They chose to follow an iteration limiting strategy and applied their method to the large challenge-response datasets of physical unclonable functions.
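For readers unfamiliar with bisecting K-means, the following is a minimal in-memory sketch with a cap on the number of refinement iterations per split. It does not reproduce the authors' method and does not address the larger-than-memory aspect; the function names, the choice to always split the largest cluster, and the default `max_iter=5` are assumptions made for illustration.

```python
import random

def _dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def _mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def two_means(points, max_iter, rng):
    """Split points in two with at most max_iter Lloyd iterations."""
    centers = rng.sample(points, 2)
    groups = (list(points), [])
    for _ in range(max_iter):
        groups = ([], [])
        for p in points:
            nearer_first = _dist2(p, centers[0]) <= _dist2(p, centers[1])
            groups[0 if nearer_first else 1].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. duplicate starting centers
        new_centers = [_mean(g) for g in groups]
        if new_centers == centers:
            break  # converged before hitting the iteration limit
        centers = new_centers
    return groups

def bisecting_kmeans(points, k, max_iter=5, seed=0):
    """Repeatedly bisect the largest cluster until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len)
        target = clusters.pop()  # assumption: split the largest cluster
        if len(target) < 2:
            clusters.append(target)
            break
        left, right = two_means(target, max_iter, rng)
        if not left or not right:
            clusters.append(target)
            break
        clusters.extend([left, right])
    return clusters

# Hypothetical one-dimensional data with two well-separated groups.
points = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
clusters = bisecting_kmeans(points, k=2)
```

Limiting the iterations per split bounds the number of passes over the data, which matters most when each pass is expensive, as it would be for a dataset that has to be streamed from disk.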
4. Cyber Science and Fundamentals
This session has seven papers on a variety of topics related to cyber science. Li et al. [19] surveyed the publications on cyberspace between 1989 and 2016 and showed that the field has matured. Two papers focused on using neural networks for knowledge discovery [20]. Two other papers investigated recommendation models [22]. One paper reviewed the current approaches to data-intensive computing [24]. The remaining paper presented an interesting visual analysis solution for massive open online courses (MOOCs) [25]. We also include one paper from the short paper session because it is also related to knowledge discovery [26].
Li et al. analyzed the literature on the topic of cyberspace based on data obtained from Web of Science for the period between 1989 and 2016. They observed that 1999 had the largest number of publications and that, by 2016, the field had matured.
Mungai et al. proposed a semantic neuron network based associative memory model to assign semantics to lower-level text-mining results. They applied chunking mechanisms to matrices in order to merge and decompose correlated matrices. The goal of their study is to make their associative memory model self-organizable and self-evolvable. In [26], Izhar and Apduhan introduced an ontology-based framework to identify relationships between organizational data and organizational goals. They implemented their framework in a higher education institution in Australia.
Furukawa and Zhao proposed to derive rules from the layers closer to the output layer of a deep multilayer perceptron neural network. Their hypothesis is that such layers could learn more abstract and therefore simpler rules. They validated their hypothesis using several public datasets. Their work is important for knowledge discovery using machine learning.
Gao et al. presented their work aiming to improve the quality of recommendations made to online users based on their behaviors. They defined the preference on two item sets of different types. The first item set has the same-type feedback relationship, and the second item set is a mixed-type set. Based on this assumption, they introduced a novel algorithm called Bayesian Personalized Ranking over Mixed-Type Item-sets to make better recommendations, and validated their approach with data collected from a healthcare website and a mobile e-commerce application.
Li et al. studied the issue of diversity when making recommendations with a novel ranking model. The model takes into account factorized category features of items to diversify the recommendations. They also devised a scoring formula based on both the relevancy and the diversity of the recommendation results.
Rao and Wang provided a review of the state-of-the-art approaches to enhancing the performance of data-intensive computing by considering the semantics of the computing. They gave an overview of current programming models and technologies for data-intensive computing. They identified four types of performance defects as part of their survey, and outlined future research challenges and opportunities.
Li et al. presented a visual analysis solution to help instructors of MOOCs better understand the progress made by students and gain insight into how well an individual student is learning. Based on relevant data collected in the courses, their solution could present several visual views for the instructors, such as an activeness calendar view and a progress distribution view.
5. Cyber Communications and Security
This session consists of six papers, all related to computer and network security in different application areas. We also include three papers from the short paper session here because they are also related to security.
Frank et al. reported their experience in the design and implementation of a cyber security testbed for teaching students. This testbed was meant to showcase the authors' vision of how an educational testbed ought to be built. They believe such testbeds should help a wide range of audiences learn cyber security skills.
Sun et al. proposed a novel malware detection method for the Android platform based on the extreme learning machine. The method detects malware based on sensitive permissions and sensitive application programming interface calls. They validated their approach with a testing tool named WaffleDetector. In [29], Wang and Chen analyzed the details of the HTTP spectral hijacking attack and proposed a mechanism to detect such attacks.
Xu et al. presented a reliability study on the distributed advanced metering infrastructure (AMI) for the smart grid. They identified the reliability issues in AMI and outlined two solutions. They proposed a mathematical model that ensures good overall network reliability, and validated their approach via simulation.
Fang et al. proposed an interference management mechanism for physical layer security in a two-tier heterogeneous network (HetNet). The study considers an eavesdropping attack where there are several users of the HetNet and one eavesdropper. The objective of the mechanism is to maximize the secrecy rate of users under the eavesdropping attack.
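The secrecy rate being maximized can be illustrated with its standard definition from the physical layer security literature: the user's achievable rate minus the eavesdropper's, floored at zero. The sketch below uses that textbook definition rather than the specific formulation in the paper; the function name and the example values are hypothetical, and the signal-to-interference-plus-noise ratios (SINRs) are given in linear scale.

```python
import math

def secrecy_rate(sinr_user, sinr_eavesdropper):
    """Secrecy rate in bits/s/Hz for linear-scale SINR values.

    Textbook definition: max(0, log2(1 + SINR_user) - log2(1 + SINR_eve)).
    """
    user_rate = math.log2(1.0 + sinr_user)
    eavesdropper_rate = math.log2(1.0 + sinr_eavesdropper)
    return max(0.0, user_rate - eavesdropper_rate)

# Hypothetical values: the legitimate link is stronger than the wiretap link.
rate_good = secrecy_rate(15.0, 3.0)
# Hypothetical values: the eavesdropper has the better channel.
rate_bad = secrecy_rate(1.0, 7.0)
```

Under this definition, interference that degrades the eavesdropper's SINR more than the user's raises the secrecy rate, which is why interference can be managed as a resource rather than only suppressed.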
Wang et al. proposed a novel content caching framework for mobile edge networks. The framework relies on data collected from users’ mobile devices, which are used to infer user preferences on communicational and computational tasks. The framework consists of an ambient data collection scheme and an application prediction scheme. The latter is critical for predicting the application that a user is likely to launch and the content that will be needed for that launch, so that the content can be prefetched.
Ye et al. addressed the security concerns of a green roof monitoring system, which consists of numerous Internet of Things (IoT) devices. They experimented with several authentication and credential update schemes to ensure proper access control to the system. In [34], Sa et al. proposed the use of a randomly switching controller to counter active system identification attacks. Such attacks are often launched to gain insight into the target system models prior to attacks that might compromise the system.
Unlike most other research, which focuses on the detection of security attacks and on defense mechanisms, Maimon et al. [35] presented a study on the attackers themselves. They argued that it is time to revise the traditional model of cyber criminals, which captures the attacker’s “skills, knowledge, resources, access to the target organization and motivation to offend” (or SKRAM for short). They proposed a revised SKRAM model that considers the attacker’s online circumstances.
6. Cyber-Enabled Smart Environment and Healthcare
This special session has seven papers on various topics related to smart environment and healthcare. We also included two papers from the short paper session because they are also related to healthcare.
Ploof et al. described an inexpensive system that can be used to carry out balance assessment. It uses a set of high-speed cameras and a Wii balance board with custom software to determine the center of mass and the center of pressure. They validated their system experimentally with reasonably good results.
Liu and Zhao proposed a system designed to serve as a virtual life coach for children with autism. They argued that the content of the treatment and care program should be designed to cater to the intense interests of each child with autism so that he or she is more responsive to the program. They provided a detailed description of the major components of the system.
Wu and Zhao reported their customer validation experiences with the technology they developed as part of an Ohio i-Corps program. The technology was initially designed to track non-compliant activities, such as back bending, that might increase the risk of lower back injuries among nursing assistants [39]. Through numerous interviews with potential customers during the customer validation study, they realized that the technology can be evolved into a digital platform that tracks the quality of services provided by nursing assistants and can therefore be used to build an environment of care for nursing homes.
Qiu et al. reviewed smart wearable devices, such as Fitbit and smart watches, and their fitness applications. They focused on the underlying technologies that power these devices, including both hardware such as sensors and software platforms. They also outlined future research directions.
Christian et al. developed an adaptive energy-efficient data transmission scheme to minimize power consumption when using wearable devices to collect health and fitness information. The key idea is to consider the physical activities that the user is engaged in while the data are collected. Recognizing the physical activities puts the data in the right context for analysis. When the user is engaged in a certain activity and the physiological signal levels (such as heart rate) are normal in the context of that activity, the transmission can be omitted, which reduces the transmission rate.
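The context-dependent decision described above can be sketched as a simple rule. This is only an illustration of the idea, not the authors' scheme: the activity labels, the heart-rate ranges, and the function name `should_transmit` are hypothetical placeholders, and the ranges are not clinical thresholds.

```python
# Hypothetical per-activity "normal" heart-rate ranges in beats per minute.
# Placeholder values for illustration only, not clinical thresholds.
NORMAL_HEART_RATE = {
    "resting": (50, 100),
    "walking": (80, 130),
    "running": (110, 180),
}

def should_transmit(activity, heart_rate):
    """Return True when a reading needs to be sent.

    A reading is skipped only if the activity is recognized and the heart
    rate falls inside the normal range for that activity.
    """
    bounds = NORMAL_HEART_RATE.get(activity)
    if bounds is None:
        return True  # unknown activity: no context available, so send
    low, high = bounds
    return not (low <= heart_rate <= high)

# Hypothetical stream of (activity, heart rate) readings.
readings = [("resting", 72), ("running", 150), ("resting", 150), ("cycling", 120)]
sent = [reading for reading in readings if should_transmit(*reading)]
```

Here a heart rate of 150 is skipped while running but transmitted while resting, so the same value is treated differently depending on the recognized activity.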
Bucioli et al. proposed a set of algorithms to generate 3D heart models for display in holography and mixed reality environments. Their algorithms were tested with Microsoft HoloLens, which provides a holographic mixed reality display. In [45], Goode and Steele presented a cyber-physical system based on wearable devices for collecting physiological and activity data. The system was designed for individuals with diabetes to better quantify the effects of their exercise and activity and to understand the positive impact of exercise on diabetes management.
Khan et al. introduced a framework to help develop user-centered smart environments, which consist of various smart devices and interface modalities. At the core of the framework are a digital agent that processes data and provides an adaptive user interface, and a device interface that allows access to the context data of smart environments.
Kepuska and Bohouta reported the design of a speech recognition system that works either in wake-up-word mode or in general automatic speech recognition mode. This line of research is important because voice recognition is essential for any smart environment that aims to dynamically interact with one or more users.
7. Other Special Sessions and Workshop
CyberSciTech 2017 had several other special sessions and one workshop, with a combined eleven papers. We provide an overview of nine of these eleven papers. They discuss a wide range of subjects within the scope of cyber science and technology. We intentionally omitted poster session papers because of the work-in-progress nature of those papers.
Hui and Xing proposed a quality-of-experience-based pricing mechanism for vehicle-to-grid networks. The problem was formulated as an optimization task with two classes of constraints. In [49], Feng et al. analyzed the primary structure and security mechanism of the Simple Network Management Protocol (SNMP), and outlined security threats against the protocol. The objective of the study is to improve the security of the IEEE P21451-1-5 protocol, which is the standard protocol for IoT. In [50], Pei et al. presented a performance comparison study of 5G base station communication under several different configurations. In [51], Ye et al. also investigated performance issues in 5G wireless networks. Their work showed that a joint-base-station cooperative transmission and ON-OFF mechanism adapts better to environmental changes. In [52], Zhao et al. studied networking strategies for 5G communication and compared their approaches with traditional cellular networks.
Deng and Wang provided a review of key technical issues for secure and efficient data transmission in energy distribution networks. In [54], Xu et al. presented an analysis of the flexibility of cross-organizational process modeling using
calculus. In [55], Liu et al. examined the explicit and implicit structural features of the cross-organizational business process, constructed a hierarchical mapping model, and defined a set of rules based on
calculus. In [56], Liu et al. presented a study on common bike flows. The research objective is to understand where to place bike stations for bike sharing programs. They proposed a clustering algorithm to analyze the bike flows. The derived clusters then indicate where the stations should be placed.