A Holistic Overview of Anticipatory Learning for the Internet of Moving Things: Research Challenges and Opportunities

The proliferation of Internet of Things (IoT) systems has received much attention from the research community, and it has brought many innovations to smart cities, particularly through the Internet of Moving Things (IoMT). The dynamic geographic distribution of IoMT devices enables the devices to sense themselves and their surroundings on multiple spatio-temporal scales, interact with each other across a vast geographical area, and perform automated analytical tasks anywhere and at any time. Currently, most of the geospatial applications of IoMT systems are developed for anomaly detection and control monitoring. However, it is expected that, in the near future, optimization and prediction tasks will have a larger impact on the way citizens interact with smart cities. This paper examines the state of the art of IoMT systems and discusses their crucial role in supporting anticipatory learning. The maximum potential of IoMT systems in future smart cities can be fully exploited in terms of proactive decision making and decision delivery via an anticipatory action/feedback loop. We also examine the challenges and opportunities of anticipatory learning for IoMT systems in contrast to GIS. The holistic overview provided in this paper highlights the guidelines and directions for future research on this emerging topic.


Introduction
The Internet of Things (IoT) has received significant attention from the research community since its first introduction by Kevin Ashton in 1999 [1][2][3]. The basic concept of IoT is that every physical thing in a smart city is connected and can function as a sensor embedded in a tiny computer; these devices are geographically distributed over a vast area of the city. An IoT device is always connected through a communication network, ranging from short-range networks (e.g., Bluetooth, Zigbee, near-field communication (NFC)), to medium-range networks (e.g., Wi-Fi, DigiMesh), to long-range networks (e.g., LoRaWAN, cellular, WiMAX). Today, IoT devices are usually expected to collect sensor data, communicate with each other, and make decisions without human intervention [4][5][6][7]. Some examples of IoT devices include smart traffic lights, smart parking meters, smart home meters, smartphones, and wearable devices [8][9][10][11][12][13].
The IoT market in smart cities has not really taken off yet due to a number of technical, political, and financial barriers; however, previous survey papers have already shown different points of view regarding the role of IoT in smart cities. These are mainly related to IoT architecture concerns such as elements, facilities, protocols, and standards for IoT [14][15][16][17][18][19], as well as the development of new IoT applications such as smart factories [20], smart homes [21], and smart hospitals [22].
The Internet of Moving Things (IoMT) takes this a step further, and can be defined as "the extension of the concept of the IoT to moving things, which is essentially any IoT device that moves". Instead of having a fixed location in a smart city, an IoMT device can be anything people wear or carry around, such as clothes, smartphones, and wearables; or things used for transportation, such as cars, trucks, trains, bikes, and planes. When these IoMT devices are connected to each other, not only can they sense themselves (e.g., speed, acceleration, and direction) and their surrounding environment (e.g., temperature, noise, and air pollution), they can also exploit the resources made available by edge, fog, and cloud computing.
Therefore, IoMT devices generate unbounded data streams from a vast number of indoor and outdoor locations, and these streams require a low-latency database for storing and exploring data in space. Time is an important dimension because the different time windows used to handle IoMT data streams have an impact on preprocessing, analytical, and visualization tasks. Some examples include landmark windows [23], sliding windows [24], damped windows [25], and tilted windows [26]. Different time windows have been proposed to cope with transporting data streams where the data rate could overwhelm the processing power of the computation resources at the edge, fog, and cloud. In contrast, the space dimension has been overlooked until now, despite the fact that the data streams are being generated by IoMT devices moving over large geographical areas, with a fine spatial granularity. There is now a growing interest and demand for developing IoT-GIS platforms that can handle data streams generated by IoMT devices. This paper is one step in this direction, mainly because IoMT is paving the way for anticipatory learning.
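To make the difference between the window types concrete, the following is a minimal Python sketch of three of them (landmark, sliding, and damped) computing a running mean over a stream of (timestamp, value) readings. The function and class names, the window width, and the decay rate `lam` are illustrative assumptions and are not taken from the cited works [23][24][25]; tilted windows are omitted for brevity.

```python
from collections import deque
import math


def landmark_mean(stream, landmark_t):
    """Landmark window: mean of every reading from a fixed start time onward."""
    vals = [v for t, v in stream if t >= landmark_t]
    return sum(vals) / len(vals) if vals else None


class SlidingWindowMean:
    """Sliding window: mean of readings with timestamps in (t - width, t]."""

    def __init__(self, width):
        self.width = width
        self.buf = deque()
        self.total = 0.0

    def update(self, t, v):
        self.buf.append((t, v))
        self.total += v
        # Evict readings that have fallen out of the window.
        while self.buf[0][0] <= t - self.width:
            _, old = self.buf.popleft()
            self.total -= old
        return self.total / len(self.buf)


class DampedMean:
    """Damped window: every reading is kept, but its weight decays
    exponentially with age (weight = exp(-lam * age))."""

    def __init__(self, lam):
        self.lam = lam  # assumed decay rate per time unit
        self.weighted_sum = 0.0
        self.weight = 0.0
        self.last_t = None

    def update(self, t, v):
        if self.last_t is not None:
            decay = math.exp(-self.lam * (t - self.last_t))
            self.weighted_sum *= decay
            self.weight *= decay
        self.weighted_sum += v
        self.weight += 1.0
        self.last_t = t
        return self.weighted_sum / self.weight
```

The sliding and damped variants need only constant or window-bounded memory, which is one reason they may suit edge nodes better than a landmark window that grows without bound.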
As indicated in [27], anticipatory learning is an often misused term. Rosen defined it as "a system whose current state is determined by a (predicted) future state", while Nadin has defined it as "a system whose current state is determined not only by a past state, but also by possible future states" [28][29][30][31]. Nevertheless, both authors agree that prediction and anticipation are not interchangeable concepts. The consensus is that an anticipatory system makes a decision to impact the future in order to benefit a user; meanwhile, a predictive system uses a predictive model that can foresee the future state of the system itself.
In this paper, anticipatory learning for IoMT is defined as "a system where the current state is determined by the past and future behavior of IoMT devices that is represented by the dynamic geographical distribution of IoMT devices over time". This is critical for building context intelligence for anticipatory learning models, mainly because IoMT devices are equipped with different sensors, which generate data streams of spatio-temporal information used to infer contextual intelligence on what is happening, where and why it is happening, and what should be done about it. In other words, contextual intelligence requires that anticipatory learning models have: (1) a context sensing strategy of relevant past events detected or monitored by IoMT devices; (2) spatio-temporal awareness of present contextual variables being continuously used for gathered IoMT data; and (3) user-driven awareness of the preferred future so the system can exert influence and help a user to make appropriate decisions.
Current edge-fog-cloud computing is the technology which allows us to run machine learning algorithms and build anticipatory learning models [32,33]. In contrast, our current GIS technology has been primarily developed for supporting predictive systems. Recent attempts at designing IoMT-GIS have shown the main limitations of GIS in processing IoMT data streams [34,35]. Adding the functionalities of an anticipatory learning model to GIS will only create more barriers to using GIS for running streaming machine learning for building anticipatory learning models.
Since a fairly systematic overview of IoT systems has been recently published elsewhere [36], our paper focuses on IoMT systems. Our purpose is not only to give a holistic overview of IoMT research that is relevant to each stage of an anticipatory learning model but also to provide some guidelines and future research directions for building anticipatory learning models for IoMT systems.
The rest of the paper is organized as follows. Section 2 introduces the main concepts of IoMT systems and compares the data collection strategies currently being used in research projects. Section 3 describes the main steps involved in building anticipatory models for IoMT systems. Section 4 describes the research being carried out on context sensing at the edge of a network, while Section 5 introduces context intelligence using fog computing. Section 6 delineates the prediction and intelligent actions for anticipatory learning. Section 7 gives a holistic overview of the challenges and opportunities for building anticipatory learning for IoMT systems. Finally, conclusions and future research are given in Section 8.

Internet of Moving Things
In general, IoMT devices are equipped with many types of sensors, from accelerometers and gyroscopes to proximity, light, and ambient sensors, as well as microphones and cameras. They also have computing capability and communicate through a wide range of interfaces, such as Wi-Fi, Bluetooth, or NFC. The ability to sense themselves and their surrounding environments is key to generating "small data streams" over space and time in such a way that they share many characteristics of big data, including the five V's: variety, velocity, volume, veracity, and value [37][38][39][40][41].
The nature of IoMT data streams is multimodal, diverse, heterogeneous, and voluminous; they are often supplied at high speed and with a degree of uncertainty. In general, these data streams also have distinctive characteristics that make the traditional storage, management, and processing of current GIS obsolete [42]. These characteristics can be described as follows:
• Data in motion: The IoMT devices have the ability to sense themselves using context variables such as velocity, acceleration, and direction at a specific location and time. However, they can also sense their surrounding environments using context variables such as temperature, noise, and air pollution, and depending on the type of sensor deployed inside an IoMT device, these variables might have a variety of spatial ranges (e.g., from 1 and 10 m to 100 m and 1 km) as well as time granularities (e.g., from milliseconds and seconds to hours and days). Overall, context sensing data are constantly moving from the IoMT devices to edge and fog nodes, and up to the cloud, depending on the processing power and storage resources available;
• Data in many forms: Depending on the context intelligence envisaged for an anticipatory learning model, each IoMT device can perform different sensing functions for collecting time-series and event-triggered data. This leads to different data types including structured, semistructured, unstructured, and mixed data streams;
• Data at rest: It is indisputable that IoMT devices produce a large amount of data streams that are always tied to a location over time. This poses a challenge to capturing, processing, and managing the data within an appropriate spatio-temporal scale that needs to be known a priori when developing anticipatory learning models;
• Data in suspicion: The uncertainty refers to the biases, noise, and abnormalities in the data streams for reasons such as data inconsistency and incompleteness, latency, ambiguity, deception, and approximation;
• Data of many values: The potential context hidden deep in the IoMT data streams is significant and has not yet been fully exploited. Processing, computing, analyzing, and making decisions based on this context could help us support decision-making actions. Anticipatory computing is considered in this paper as a key approach to exploiting that potential.
Table 1 compares some selected research projects where the data from IoMT devices were collected using several different sensors, such as GPS, radio-frequency identification (RFID) tags, and cameras. They have been categorized into four common types: structured, unstructured, semistructured, and mixed. Structured data are information that complies with a formal schema and data model, whereas unstructured data do not follow any predefined data model. Semistructured data do not reside in a data model, but do have some organizational structure that makes them easier to analyze (e.g., CSV, XML, and JSON files). Mixed data are a combination of several of these data types. It is argued that a large part of the IoMT data produced today is either semistructured or unstructured [38]. Our literature review of selected projects confirms this hypothesis, and it also reveals the following main issues in GIS:

• Uniqueness: The IoMT data streams are a unique type of spatio-temporal data because they represent an immense cloud of location points over time in such a way that current spatial representations (e.g., trajectories, time geography, and layers) cannot handle the volume of these data points and their assigned semistructured and unstructured data;
• Propagation: We consider propagation as a discrete-time process starting from one data point to another data point that is able to accumulate context information and is governed by the progress speed between the two or more data points. Spatio-temporal progress matrices have been used in the past, but they cannot handle semistructured and unstructured data streams. More research work is needed in this domain;
• Multiprocessing: It is easy to see from Table 1 that accumulated data streams can arrive and require processing at various speeds, from batch to near real-time or real-time processing. Most of the research projects have used batch processing to analyze their data. The development of streaming GIS is needed for analyzing the data streams as they arrive.
Table 1. Selected research projects that collected data from IoMT devices (project; data source [reference]; data type; processing).
Traffic monitoring; traffic lights [48]; structured; batch
Clustering of IoT devices; UAVs [49]
CityPulse framework; bus [50]
IoT-based smart parking; ultrasonic [51]; real-time
Analyzing people's activities; RFID tags [52]
Traffic congestion prediction; GPS [58]
Complex event processing; RFID, GPS [59]
Mode transportation prediction; crowdsourced data [60,61]
Mobility prediction; smart card [62]
Mining the semantics of origin-destination flows; GPS, mobile phone [63]
Optimizing the mobility models and communication performance; GPS [64]
CarStream; services driving data including vehicle status, driver activity, and passenger-trip information [65]
Traffic monitoring and alert notification; geo-location and speed data [66]
Transportation network optimization; GIS and the Internet of multimedia [67]
Emissions and traffic-related impacts; crowdsourced data [68]
Multi Access Physical Monitoring System; wearable smart-log data [69]
Wearable

Anticipatory Learning Model
"Anticipation pertains to change, that is, to a sense of the future" [30]. From an IoMT perspective, we need to be able to acquire data streams that can be used to sense a comprehensive context in space and time, and infer anticipatory actions based on predictions of the future state of this context. To that end, Figure 1 illustrates four main steps in building anticipatory learning models which are: (1) context sensing; (2) context intelligence; (3) context prediction; and (4) anticipatory action/feedback loop, as previously proposed in [73,74]. Most state of the art research is currently limited to the first three steps. Pejovic and Musolesi [27] stated that the main barrier to further proliferation of anticipatory computing is the inability of IoMT devices (and IoT in general) to seamlessly interact with humans and generate feedback, which is vital to guiding an anticipatory learning process. The literature review presented in this paper also reveals another barrier to the proliferation of anticipatory learning models, which is the lack of approaches to represent a priori spatio-temporal knowledge of a particular context. This is crucial for avoiding an Internet of "Useless" MobileThings in guiding anticipatory learning processes in the near future.

Context Sensing at the Edge of a Network
For an anticipatory learning model, sensing plays an important role in delivering the data used to generate context intelligence. Context may be divided into various categories (location, identity, activity, time) [75] and may have numerous aspects, such as geographical, physical, social, and temporal aspects [76]. Contextual sensing aims to provide an interface between IoMT devices (things) in the physical world and a person or a group of people.
In vehicular context sensing, IoMT devices in a vehicle can detect important aspects of driver behavior and the surrounding environment over time. On-board sensors in the vehicle, as well as sensors built into mobile devices carried by the driver, can also be used to gather IoMT data streams. Furthermore, IoMT data streams from different cars can provide increased spatial coverage to better understand the context, and can also help to reduce ambiguity. Context sensing can provide information on drivers changing lanes, stop signs, obstructions, and potholes. These features can be further used to infer a context that will be used within an anticipatory learning model to improve driver safety and engine efficiency.
In order to achieve this, data preprocessing is necessary to extract features from IoMT data streams and use those features to provide context intelligence. The availability of edge computing power allows us to run many preprocessing techniques close to an IoMT device, rather than having all IoMT data streams sent to a data center [77][78][79][80][81]. The correct choice of preprocessing techniques will be vital in the later steps of building an anticipatory learning model. A brief description of each preprocessing step is presented as follows:
• Dealing with missing data: For large accumulated data streams, deleting observations with missing values is usually not considered a problem, but for a continuous data stream, it may affect the later steps in anticipatory learning. Therefore, missing values could be replaced based on predictive models [82,83];
• Filtering: IoMT devices usually produce noisy data streams. In order to minimize the impact on succeeding steps, a clear set of automated tasks is needed to define, detect, and correct errors. Some new approaches can be found in [84,85];
• Summarization and aggregation: For some applications, the summary form of accumulated data streams might be enough for statistical analysis [86,87]; other applications may require data aggregation to reduce the bandwidth consumption as well as the data latency [88];
[90]. Another technique, linear discriminant analysis (LDA), is used to find a linear combination of features that characterizes or separates two or more classes [91,92]. Recently, pattern reduction (PR) was presented in [93] for reducing the number of patterns.
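The preprocessing steps listed above can be sketched as a small chain that an edge node might run before forwarding data. The sketch below is only an illustration under stated assumptions: last-observation-carried-forward stands in for the predictive imputation models of [82,83], a running median stands in for the filtering approaches of [84,85], and fixed-size block means stand in for aggregation; all function names and parameters are hypothetical.

```python
from statistics import median


def fill_missing(values):
    """Replace None with the last seen value (leading gaps take the first
    observed value). A simple stand-in for model-based imputation."""
    first = next((v for v in values if v is not None), None)
    out, last = [], first
    for v in values:
        last = last if v is None else v
        out.append(last)
    return out


def median_filter(values, k=3):
    """Suppress isolated spikes with a running median of (up to) k samples."""
    half = k // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def aggregate(values, size):
    """Reduce bandwidth by sending one mean per block of `size` samples."""
    blocks = [values[i:i + size] for i in range(0, len(values), size)]
    return [sum(b) / len(b) for b in blocks]


def preprocess_at_edge(raw, block_size=3):
    """Chain the three steps: impute, filter, then aggregate."""
    return aggregate(median_filter(fill_missing(raw)), block_size)
```

For a raw temperature stream such as `[20.0, None, 95.0, 21.0, None, 22.0]`, the chain fills the two gaps, removes the 95.0 spike, and forwards two aggregated values instead of six raw ones.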
It is of paramount importance that IoMT data streams are preprocessed before passing to the next step (i.e., context intelligence). Should we, therefore, stream all of our IoMT data to the cloud (data centers)? Our answer to this question is no. The closer to the data source the preprocessing is performed, the more advantages the IoMT system has. With the huge volume of IoMT data streams produced by a variety of sensors, it is highly possible to flood and overwhelm the networks and data centers (i.e., the cloud). In addition, some preprocessing tasks can be implemented using a specific set of IoMT devices which can help to improve the interactions between devices and improve the efficiency of the whole system.

Context Intelligence at the Fog Layer of a Network
Context intelligence requires inductive reasoning to infer higher-level concepts from preprocessed IoMT data streams. With academic references from as early as the 1980s, this is not a new theory; however, IoMT systems have revealed that context intelligence requires anticipatory learning models which understand the limitations of our algorithms in generating new knowledge, and are able to adapt this knowledge to an environment different from the one in which the learning model was trained. Contextual intelligence requires moving far beyond an analysis of economic, urban, rural, and many other spaces. It is common to rely on simple explanations for complex high-level concepts (i.e., complex phenomena such as human behavior). The most difficult task in this step is adjusting our persistent mental models and learning to differentiate between universal beliefs and their specific patterns and standards.
Our vision of context intelligence is to distribute streaming analytics into a hierarchical order, starting with descriptive analytics, which can be processed on edge nodes themselves (i.e., gateways), and perform more complex diagnostic analytics on fog nodes. Bonomi et al. [77,78] previously proposed a hierarchical distributed architecture based on fog computing to process IoT data with low latency, location awareness, and mobility support. We extended this distributed architecture with the following elements:
• Scalability: By distributing automated analytical tasks, context intelligence depends on the scalability of IoMT devices. Many context models will require simple machine learning algorithms such as the linear Spanish inquisition protocol (L-SIP), which has been applied to reduce data transmission; filtered state classification (ClassAct), a human posture/activity classifier based on a decision tree; and time-discounted histogram encoding (Bare Necessities), which is used for summarizing the relative time spent in given contexts [94];
• Mobility and geographic distribution: These are indispensable requirements for context intelligence; however, an anticipatory learning system also requires a rich scenario of communication and interaction between all available computational resources. To achieve this, a priori data pipelines must be designed that will support an analytics everywhere framework [95][96][97];
• Heterogeneity and interoperability: Obviously, terminal devices in the IoMT system can collect data with different timestamps, formats, and locations. Additionally, the edge network computing devices which deploy the IoT gateways could seamlessly support the interoperability between terminal devices. For example, an array of devices including an armband sensor, a Bluetooth headset, a smartphone, an external antenna for a GPS receiver, and a light laptop with a transceiver [98] were combined to collect human activity data, which were then processed to predict the context around them.
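As an illustration of the kind of lightweight summary mentioned under scalability, the sketch below keeps a histogram of time spent per context in which older observations are discounted exponentially. It is a generic reading of "time-discounted histogram", not the published Bare Necessities algorithm of [94]; the class name, the `half_life` parameter, and the choice to add each new duration undiscounted are assumptions made for brevity.

```python
class TimeDiscountedHistogram:
    """Relative time spent per context, with older time counting less."""

    def __init__(self, half_life):
        # Weight of past time halves every `half_life` time units (assumed).
        self.decay_per_unit = 0.5 ** (1.0 / half_life)
        self.weights = {}

    def observe(self, context, duration):
        """Record that `duration` time units were just spent in `context`."""
        factor = self.decay_per_unit ** duration
        for key in self.weights:
            self.weights[key] *= factor  # age everything seen so far
        # Simplification: the new interval is added without discounting.
        self.weights[context] = self.weights.get(context, 0.0) + duration

    def shares(self):
        """Normalized histogram: fraction of discounted time per context."""
        total = sum(self.weights.values())
        if not total:
            return {}
        return {k: w / total for k, w in self.weights.items()}
```

Because only one number per context is stored, such a summary could be maintained on a wearable or gateway and transmitted occasionally instead of the raw stream.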

Context Prediction and Anticipatory Actions
Context prediction and anticipatory action are the two important steps for anticipatory learning models. Anticipatory action refers to the act (behavior), including actual decision making; internal preparatory mechanisms; or learning that is dependent on predictions, expectations, aims, or beliefs about future states. According to [31], anticipation focuses on the impact of a prediction or expectation of current behavior. Stated in another way, anticipatory actions are not only about predicting the future or expecting a future event but also about changing behavior (or behavioral biases and predispositions) according to this prediction or expectation. For anticipatory learning models to assist citizens in changing their behavior, context prediction and intelligence-driven actions must play a major role.
Previous research has described different prediction models used to predict the behavior of people or IoMT devices. Tsai et al. [99] give a brief review of data mining techniques for IoT systems. Figure 2 illustrates the state-of-the-art research for context prediction using different analytical algorithms and a variety of data sources, while Table 2 below summarizes the approaches used for building a prediction model based on supervised and unsupervised prediction techniques [100][101][102]. Supervised techniques rely on labeled data and training to find a model that can afterwards be applied to a new dataset. Unsupervised techniques, in contrast, use unlabeled data and attempt to predict common patterns.
Table 2. State-of-the-art projects using approaches in Figure 2.
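The distinction between the two families can be illustrated with a small, dependency-free Python sketch. The supervised part trains a nearest-centroid classifier on labeled feature vectors (e.g., a speed reading labeled with a transport mode); the unsupervised part runs a plain k-means on the same kind of vectors without labels. Both algorithms and all names here are chosen for illustration and are not claimed to be the methods used in the projects of Table 2.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_centroids(samples):
    """Supervised: samples are (features, label) pairs.
    Returns a dict mapping each label to its mean feature vector."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x the label of the closest learned centroid."""
    return min(centroids, key=lambda y: sq_dist(centroids[y], x))


def kmeans(points, centers, iters=10):
    """Unsupervised: group unlabeled points around `centers` (initial guesses)."""
    groups = [[] for _ in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(centers[c], p))
            groups[j].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups
```

With labeled speeds, `predict` returns a mode name for a new reading; with the same speeds unlabeled, `kmeans` only returns groups, and interpreting what each group means is left to the analyst.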

Research Challenges and Opportunities
While the principles of anticipatory learning modeling have been studied for several decades [28,130], IoMT is still in its infancy. Although researchers have recently attempted to integrate an anticipatory process into artificial learning systems [131][132][133][134][135], few research applications apply the theory of anticipatory computing to building context intelligence in IoMT devices [136,137]. We argue that the proliferation of IoMT devices has created a unique opportunity to explore anticipatory learning models using the vast amount of IoMT data streams. This section discusses the research challenges in applying anticipatory computing to IoMT systems.

Research Challenges
Anticipatory learning for IoMT systems is reliant on multidisciplinary research fields such as the Internet of Things, big data analytics, geospatial data science, cloud computing, edge computing, machine learning, and data mining. Inherent challenges to this are discussed below.
• Privacy: One of the main concerns about deploying IoMT devices around a smart city is how to generate anticipatory actions from IoMT data streams without violating user privacy. Some examples of sensitive information gathered by IoMT devices include locations, activities, and emotions. For example, anticipatory computing can be misused to predict the future locations or activities of an individual. Preserving privacy becomes even more complex when the privacy policies of multiple users are inconsistent. For instance, one user may only want to donate one type of data (e.g., Bluetooth data), while another donates two types (e.g., Bluetooth and Wi-Fi usage data). When these data are combined and co-location patterns are found, the information of the first user can be unintentionally exposed;
• Security: The diversity of IoMT devices that we expect in smart cities poses a significant challenge to ensuring the security of the entire anticipatory learning process, especially regarding wearable devices, body sensor networks, or carried items (such as smartphones). IoMT devices may pose a threat to users due to their susceptibility to hacking. Although there is currently some attention on the issue of security for IoMT systems [138][139][140], there is no common standard, protocol, or security framework for IoMT devices. Therefore, addressing security issues for IoMT is now an urgent concern in our research work;
• Connection: One of the key factors in making IoMT devices work effectively is the communication networks they use. Mobility poses a challenge in terms of always maintaining a stable connection among IoMT devices in a smart city. In the future, new networking technology is expected to be used to keep IoMT devices collecting data seamlessly, regardless of their location, over short and long periods of time [141][142][143][144][145];
• Turbulence: In contrast to fixed-location IoT devices, the mobility of IoMT devices usually creates chaotic and unstable interactions between them. For example, IoT devices deployed at a fixed location always know which neighbors they are communicating with. In contrast, IoMT devices do not know a priori about their close neighbors. The first law of geography needs to be further explored in terms of the potential impact of geographical proximity on the interoperability, power usage, automation of analytical tasks, data pipelines, and communication protocols of IoMT devices;
• Management: Selecting the right type of IoMT device to support a specific anticipatory task is not an easy choice. Deploying too many IoMT devices may cause problems such as power drain, noise, and data latency, to mention a few. Alternatively, if fewer devices, edge nodes, and fog nodes are deployed over a large geographical area, there may be gaps in data collection. Another challenge is how to efficiently manage the energy usage patterns of IoMT devices as they move;
• Information loss: Processing data streams at the edge of a network brings potential information loss, a risk that must be balanced between the efficiency of the system and the value of the contextual information lost. It also raises an important question about the possible geographical divide, where regions of a smart city will determine which data streams should be processed at the edge nodes, and which data streams should be processed in a cloud computing environment. Determining which types of data streams and mobility behavior of IoMT devices should be used for data processing, and where, remains an interesting research challenge;
• Streaming geospatial analytics: Analyzing the spatial relationships among the locations of the measured contextual variables using a sequence of accumulated data streams demands new methods that do not rely on density and proximity, but on the connectivity of a massive cloud of data points. The research challenge is threefold: (1) how to develop new spatial interpolation processes for determining which data points from the current data streams should be used to estimate values at other unknown points; (2) how to select the type of time windows that should be used for streaming geospatial analytics; and (3) how to perform geospatial summarization, where the connectivity of the IoMT devices is used to summarize accumulated data streams over space and time;
• Analytics everywhere frameworks: From our literature review, there are over 400 architectures that were developed to handle the incoming IoT data streams using different strategies such as streaming, microbatch, and batch processing. These strategies have been designed to work towards an asynchronous approach for static IoT devices. For developing anticipatory learning models using IoMT systems, we identified the need for analytics everywhere frameworks that are capable of breaking down the processing and analytical capabilities into a network of streaming tasks and distributing them into different compute nodes in an edge-fog-cloud continuum. The research challenge is to develop location-aware analytical capabilities to support streaming descriptive, diagnostic, and predictive analytics.
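To ground the first two parts of the streaming geospatial analytics challenge, the following Python sketch shows a conventional baseline: inverse distance weighting (IDW) restricted to readings that fall inside a sliding time window. It is exactly the kind of proximity-based method that the discussion above suggests will need to be complemented by connectivity-based approaches, so it should be read as a point of comparison rather than a proposed solution. The tuple layout `(t, x, y, value)`, the planar coordinates, and the `power` parameter are assumptions.

```python
import math


def idw_estimate(readings, query_xy, now, window, power=2.0):
    """Estimate a value at `query_xy` from recent stream readings.

    readings: iterable of (t, x, y, value) tuples from moving devices.
    Only readings with now - window < t <= now are used (sliding window).
    Returns None when no reading falls inside the window.
    """
    num = 0.0
    den = 0.0
    for t, x, y, value in readings:
        if not (now - window < t <= now):
            continue  # too old (or from the future): ignore
        d = math.hypot(x - query_xy[0], y - query_xy[1])
        if d == 0.0:
            return value  # a device is exactly at the query location
        w = 1.0 / d ** power  # closer readings get larger weights
        num += w * value
        den += w
    return num / den if den else None
```

Changing `window` shows how strongly the choice of time window affects the estimate: a wider window admits more points but may mix readings taken under different conditions.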

Opportunities
Along with the above-mentioned challenges, there are always some opportunities. We illustrate some of these in terms of anticipatory computing for IoMT systems.

• Locations offer many opportunities for geospatial research: The context sensing ability of an IoMT system usually produces data streams that bring the opportunity for developing new location-aware applications. The mobility of these devices can also be examined using different spatial and temporal scales. New location prediction and mobility prediction models are needed to support anticipatory learning models, especially in the case of smart cities;
• Real-time anticipatory actions: Having a learning engine close to an IoMT device, and combining it with the knowledge and insight computed in a cloud environment, makes it possible to anticipate the needs of citizens in real time. As delineated in [146], "if this real-time analytics is fed into some kind of a predictive model and the results are used to take the user current decisions, then we have what is defined as anticipatory computing. If the output of the predictive model is directly fed into an automated decision-making process, it ensures a desired outcome. This is prescriptive analytics. This roadmap essentially is shaping the future.";
• Integration with opportunistic computing: There is a concern for how users carrying IoMT devices could interact with each other opportunistically [147]. IoMT could be an enabler by providing more interaction between users through moving devices. Some typical applications might include human-centric sensing and data sharing;
• Combination of different research fields to mimic human anticipatory actions: Recently, some digital assistants, such as Apple Siri, Google Now, and Microsoft Cortana [148], have become able to help people do things such as sending a text, playing a song, or adding a reminder. None of these tasks requires anticipatory actions. Researchers are looking for a tool that can give instantaneous delivery, understand the surrounding context, and analyze a huge amount of streaming data [149]. To achieve this, anticipatory computing needs to combine many fields of research such as geography, deep learning, humanoid robots, artificial general intelligence, and big data analytics.

Conclusions
This paper discusses anticipatory computing, which refers to systems that are focused on anticipating what is most relevant to users and acting accordingly, rather than only reacting to user commands. Anticipatory actions rely on different predictive models by combining processing levels such as cloud, edge, and fog nodes deployed around a smart city. It is important to point out that anticipatory computing and IoMT systems are continuously changing. In addition, the proliferation of IoMT devices offers many related research challenges and opportunities as discussed in this paper.
The promising trend toward IoMT (and IoT in general) has already attracted researchers from different industries, academic fields, research groups, and government departments, who are laying the foundation for smart cities. We have identified a gap in this foundation: anticipatory actions, which are expected to have a strong impact on the way smart cities will operate in the future. We hope that the path laid out in this paper will give useful guidelines for further research on this emerging topic.
Author Contributions: These authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.
Acknowledgments: The authors would like to thank Alica Farnham for proofreading this paper. The authors also appreciate the insightful comments and suggestions provided by three anonymous reviewers and the guest editors on the previous version of this manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: