Towards Massive Data and Sparse Data in Adaptive Micro Open Educational Resource Recommendation : A Study on Semantic Knowledge Base Construction and Cold Start Problem

Micro learning through open educational resources (OERs) is becoming increasingly popular. However, current OER platforms still provide inadequate support for adaptive micro learning. To address this, our smart system, Micro Learning as a Service (MLaaS), aims to deliver personalized OERs through micro learning to satisfy learners' real-time needs. In this paper, we focus on constructing a knowledge base to support the decision-making process of MLaaS. MLaaS is built using a top-down approach. A conceptual graph-based ontology construction is first developed. An educational data mining and learning analytics strategy is then proposed for the data level. The learning resource adaptation still requires learners' historical information. To compensate for the absence of this information initially (the 'cold start' problem), we set up a predictive ontology-based mechanism. As the first resource is delivered at the beginning of a learner's learning journey, the micro OER recommendation is also optimized using a tailored heuristic.


Introduction
In the information age, the development and dissemination of learning resources are booming at a much higher speed and across a wider range than in their traditional forms. People have shown increasing interest in accessing online learning resources and getting involved in online learning activities, especially via mobile devices [1]. Many leading universities have opened up access to their courses. Indeed, access to open education resources (OERs) is exponentially increasing. This boom in OERs is gaining popularity across the entire higher and adult education sector, and has also attracted many researchers' attention from educational, social, and computational perspectives [2]. According to the latest statistics, millions of people have attended the virtual classrooms of online open learning to access OERs, which are produced and updated on a daily basis. This leads to an emerging concomitant trend: that of open learning [3].
Open learning is recognized as a novel and effective learning method that could lead to a revolution in traditional learning, distance learning and electronic learning (e-learning), which have been widely used in the first decades of the 21st century. Nevertheless, current OER delivery still faces challenges and its sustained success remains in doubt. Recent studies suggest that massive open online courses (MOOCs), a common open learning environment, are currently suffering from low completion rates [4]. Most learners who enroll in MOOC courses end up dropping out. Educational professionals have focused much of their effort on exploring open learning, OER and MOOC formats, but the concomitant pedagogical innovations for mobile learning are yet to receive sufficient attention [5]. Indeed, there are many opportunities to improve open learning and OER delivery.
Our previous studies [6,7] demonstrated that micro learning is becoming a mainstream online learning mode; but coupled with mobile platforms, the knowledge attainment process is often fragmented. Using mobile devices, learners are easily affected by their mood or environmental distractions [8]. In this paper, we present our research on profiling micro learning processes through OERs and building a knowledge base to support the decision-making process of micro OER adaptation. This knowledge base is built using a top-down approach. The construction of augmented ontologies oriented to micro open learning is illustrated first, followed by a data processing strategy. Because little learner information is available at the commencement of micro open learning, we introduce an ontological approach to address the cold start problem.

Nature of OERs and Open Learning Delivery in Mobile Environments
Open learning is different from on-campus and e/m-learning modes. OERs are "digital learning resources offered online freely and openly to teachers, educators, students, and independent learners in order to be used, shared, combined, adapted, and expanded in teaching, learning and research" [3]. Open learning is the combination of informal learning and formal learning. Learners enjoy high flexibility in online open learning because there is no strict time constraint for joining and quitting.
Learners engaged in open learning are from different age groups and cultural backgrounds, with a wide range of geographic distribution.
Generally, OERs can be differentiated into MOOCs and open courseware (OCW). In contrast to MOOCs, OCW only offers course materials rather than entire courses. In other words, OERs can be structured (i.e., MOOC content), unstructured (i.e., OCW), or even both. OER providers and instructors have tried to promote their courses and affiliated educational products at full stretch. They have leveraged mobile learning (m-learning) so that learners can easily participate in learning activities regardless of restrictions in time and location.
Another aspect is that mobile learning activities in open learning normally consist of two sections: online learning and offline learning [9]. Since mobile learners can freely download materials onto their mobile devices to view offline, they do not often stay on open learning platforms and attend virtual classrooms [10]. In fact, accessing OERs online is only a part of learning; more tasks associated with learning require offline activity [11], such as data collection, data analysis, and report writing for an assignment. Logically, mobile open learning is conducted through online systems that include guided and instructional materials, transaction details and deliverable resources [12]. Hence, while learners are able to accomplish many open learning tasks offline, they need to go back online for some necessary procedures, such as data entry and work submission.

Micro Learning
Micro learning refers to short-term learning activities in small learning units [13]. Its learning process can cover time spans from a few seconds (e.g., in mobile learning) up to 15 min [14]. With mobile devices, learners normally accomplish learning objectives in a short time period. According to a prior study [14], micro learning can be defined by the assumption that a short time span is needed to complete a relevant learning task. Hence, micro learning is booming with the wide use of mobile devices, and is becoming a major learning tool in the mobile environment. Micro learning shares some characteristics with mobile learning, as the resources of both are individually referable, self-contained, reusable and re-mixable [15].
Micro learning resources are available on-demand to facilitate just-in-time learning [16]. These small learning bytes can be learned on the go and require less effort. They can aid quick assimilation, thus reducing the dependency on a fixed time slot or the need to take a large chunk of time out of learners' working day [13]. As micro learning evolves, micro-content delivery with a sequence of micro interactions enables users to learn without information overload [16]. Compared to traditional learning modes, the overall effort required to progress through an entire concept will proceed in a discontinuous, or even intermittent, way rather than a consecutive way [16].

System Framework and Previous Work
In our previous studies [6,7], we discussed the popularity of adopting micro learning in accessing OERs, especially through mobile devices. The necessity of improving existing mechanisms of micro learning support was also stated. Having studied the present status of research and development of open learning and OERs, we are motivated to carry out research to provide learners with adaptive OERs by means of micro learning with regard to their individual needs. In other words, we are dedicated to tailoring OERs into chunks of relatively short time length, and allocating them to learners at the right time. This approach is realized as a Software as a Service offering, Micro Learning as a Service (MLaaS). In optimal conditions, through use of MLaaS, learners can easily complete the learning process by using their fragmented pieces of time. For example, a learner may spend 15 min using a mobile device to learn a piece of a MOOC course on his or her way home from work by train. In this case, an ideal course module delivered to him or her should be limited to that time length (e.g., 15 min), to ensure a micro but complete learning experience.
The framework of MLaaS is shown in Figure 1. As a data-rich system, MLaaS will be able to exploit detailed learner activity data not only for recommending what the next micro learning activity for a particular student should be, but also for predicting how that student will perform on that future learning content.
In our pilot work [17], we proposed peer-to-cloud and peer-to-peer models for resource sharing and storage in service-oriented contexts. Such models can have higher upload and download speeds than a traditional cloud model, user model or peer-to-server-peer model, and can be more robust to the failures of peers or servers in the cloud environment [17,18]. Hence, we adopt this design and apply its concept as the topology of the new system for micro open learning.
The P2P sub-network of the proposed system conforms with the nature of open learning, where varieties of P2P learning occur frequently and randomly. This P2P tier guarantees that P2P learning can be organized instantly, and that first-hand resources can be shared and exchanged, regardless of access to the cloud.
From the top-down view, MLaaS borrows the cloud service to maximize the capability of hosting. The cloud part of the system consists of four domains: data tracking, data collection, data processing and data storage.
The functions of modules in MLaaS's cloud-end are outlined in prior works [5,19]. A noticeable feature of the system is that there are three file transmission channels:

• A channel between learners and an instructor-created OER pool in the cloud storage part (i.e., Channel A in Figure 1).
• A channel between learners and a learner-generated OER pool in the cloud storage part (i.e., Channel B in Figure 1).
• A channel among all learners engaged in open learning (i.e., Channel C in Figure 1).
Once a learner indicates his or her desire to carry out micro learning and sends such a request from a mobile device, OERs will be transmitted through one of the three channels.
Where the OERs actually come from in the cloud resource pools (i.e., from which exact cloud nodes the OERs are retrieved and invoked) will be defined and externally supported by third-party service-selection and resource-allocation services from mainstream service providers. This problem has been well studied; typical solutions can be found in the work reported in [20].
We have reported on the architecture and technical details of MLaaS in prior studies [19,21]. A comprehensive description is beyond the scope of this paper. It is worth noting that MLaaS only produces micro OERs, rather than OERs. That is to say, normal OERs available online are collected by MLaaS and clustered in the OER pools, as shown in Figure 1. For this reason, although MLaaS has its own data collection mechanism, it shares some demographic and educational data with the platforms or providers from which the OERs originate. This helps in learner profiling, which will be introduced in Section 5.1, even if a new learner's registration in MLaaS is informal and provides insufficient demographic and educational data.
The system framework has been briefly introduced here as background, and we now move on to the focus of this paper: ontology construction, the data processing strategy and the cold-start problem.

Research Problem Identification and Design
Given that all decisions of micro OER adaptation are made by the Adaptive Engine, it acts as the core of the system. It consumes the results from all other services and transmits its output straight to the user interfaces. For this reason, MLaaS is conceived to meet the standard of a data-rich system, and a knowledge base serves as its think tank. Basically, the knowledge base is constructed using a top-down approach, by making use of semantic means, from the pattern level to the data level. In other words, several ontologies are drawn at first, followed by 'data processing' work. We attempt to combine the pattern and rule discovery processes of micro learning with a survey of the education literature to produce features that could affect the learning experience and outcomes in a mobile environment [21]. In detail, 'data processing' work involves all operations on data, from the very beginning to the end, such as entity extractions, relationship extractions, resolution disambiguation and so on.
Given that we have the overall system framework in place, we adopt a conceptual graph-based approach for dealing with the ontology construction [22,23]. These graphs profile the features that play a significant role in an ongoing micro learning process, and also depict how the features affect and interrelate with each other. According to our design, the profiling procedure is carried out from two sides, the learner side and the OER side.
As the profiling proceeds, some new problems appear which contradict our original design intentions. One of the most important problems is that the system, MLaaS, knows little about the learners, because either 'OER' or 'learner' is new to this emerging educational setting. This creates serious difficulties for beginning the 'data processing' process. Profile construction is impossible with insufficient information about the learner at the commencement of open learning. Therefore, the learner profile cannot be fully filled in with valid data.
In this paper, the research focuses on the knowledge base for micro OER recommendation and delivery. Naturally, based on the volume of retrievable data, this problem can be approached from two sides.

• If a learner is well-known by the MLaaS, an educational data mining and learning analytics (EDM/LA) approach will be applied to his or her historical data to understand his or her learning patterns and preferences. Thereby, a well-grounded recommendation can be made based on his or her personalized settings and particular surroundings.
• If a learner is relatively poorly known by the MLaaS (i.e., this is a new learner to the OER environment), this will be treated as a cold-start problem and tackled by filling in the gaps with predicted data, so that a recommendation will be made based on demographic information.
Freshly generated information, along with the cold-start recommendation, will populate the first version of a learner's profile.
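The two-sided treatment above can be illustrated with a minimal Python sketch. It is not the MLaaS implementation: the `Learner` class, the `choose_strategy` function and the `MIN_HISTORY_EVENTS` threshold are hypothetical names introduced here, and how MLaaS actually decides that a learner is "well-known" is not specified in this paper.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: how many logged events make a learner "well-known".
MIN_HISTORY_EVENTS = 5


@dataclass
class Learner:
    learner_id: str
    demographics: dict = field(default_factory=dict)  # registration-time data
    history: list = field(default_factory=list)       # logged learning events


def choose_strategy(learner: Learner, min_events: int = MIN_HISTORY_EVENTS) -> str:
    """Route a learner to the EDM/LA path or to the cold-start path."""
    if len(learner.history) >= min_events:
        return "edm_la"      # enough historical data to mine learning patterns
    return "cold_start"      # gaps are filled with data predicted from demographics


veteran = Learner("l1", history=["event"] * 8)
newcomer = Learner("l2", demographics={"age_group": "25-34"})
print(choose_strategy(veteran), choose_strategy(newcomer))
```

A single count threshold is only one possible criterion; any measure of how completely the learner profile is populated could take its place.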

EDM/LA for Micro Learning
Student learning data collected by open learning systems are explored to develop predictive models by applying educational data mining methods that classify data or identify relationships. These models play a key role in building adaptive learning systems, in which adaptations or interventions based on the model's predictions can be used to change what students experience next, or even to recommend academic services to support their learning.
Analyzing these newly logged events requires new techniques for working with unstructured text and image data, data from multiple sources, and vast amounts of data ("big data"). Big data does not have a fixed size; any number assigned to define it would change as computing technology advances to handle more data. For example, Manyika et al. define big data as "Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze" [24].
At this cutting edge, EDM and LA are widely used in research. They are used to build models in several areas that can influence online learning systems. As its name implies, EDM is a state-of-the-art method that applies data mining techniques to educational data. It is concerned with developing methods for exploring the unique types of data found in educational settings. Using these methods, students and educational settings can be better understood [25]. To enable smart and adaptive micro learning for MOOCs, EDM and LA are key concepts that we employ as the basis of the dynamic learner model construction.
Generally, OCW data is locked away in independent data silos hosted by different OCW/OER providers. This makes it much less useful than it could be. It is difficult to develop tools for consuming data from multiple silos. Searching OCW/OER across multiple silos means invoking the user interface of each one, and receiving the results in separate groups. The presence of data silos makes accessing data and interoperability between repositories harder in several ways.
Browsing OERs is also problematic in that each silo has its own organizational structure. Some silos have no way to link to a particular item, and so hinder the free flow of information. The presence of OCW silos impedes the interoperability, discovery, synthesis, and flow of knowledge. As a result, it is difficult for teachers, students and self-learners to look for resources, and sometimes they make decisions based on incomplete information. Linked data have the potential to create bridges between OCW data silos.

Cold Start Problem in Micro OER Delivery
In the computer science literature, widely used adaptive recommendation methods generally fall into two main categories, i.e., memory-based and model-based algorithms [26]. Although they underpin many successful recommender systems (for example, the Amazon online store), it is usually difficult to provide reliable recommendations when the initial data on ratings or preferences are insufficient. This leads to the occurrence of the cold start problem. Commonly, the cold start problem is triggered by three factors: new community, new item and new user.
The cold start problem becomes more severe in open learning, especially in micro learning through OERs [14]. Both open learning and OER are relatively new products, which have been emerging in very recent years. Meanwhile, the followers of this novel trend, no matter whether they are new education pursuers or regular learners migrating from other online learning modes, are forming a completely new community. On the other hand, the learning demands and expectations of learners engaged in open learning are much more practical than those of conventional university students. In other words, they are mostly self-regulated, so that they have a great deal of flexibility in deciding when to join or quit an online course, and they can switch among courses frequently and at will [27]. Consequently, for OER providers, it is difficult to establish a model and update it accordingly for any individual learner because they do not have any historical data to hand.
In micro open learning, or micro learning over OERs, it is very normal to find that learners take part in and deviate from the learning scenarios frequently, as well as turning the learning activities on and off at will. That is to say, the overall situations of micro learning vary completely from individual to individual. Moreover, it is very common for freshmen to join open learning, or for existing learners to unfold a brand-new course learning profile at any time. All in all, then, (1) there are a large number of new learners in open learning; (2) new learners usually initiate access to new learning resources; and (3) learners who went through learning resources in the same branch of a discipline will form new communities.
If treated carelessly, the cold start problem may lead to the loss of learners who were previously engaged in open learning, but decide to stop using the OER delivery system or become unwilling to adopt the learning mode [28]. This is mainly due to the lack of accuracy in the recommendations received in that first stage, in which the learners have not yet cast a significant number of votes or ratings to feed the recommender system. Basically, the sparsity of data affects user satisfaction, which can in turn affect user acceptance of the new open learning mode.

Contribution
Extended from our published works [19,29], which presented the general framework and educational background of micro open learning, this research will provide innovative deliverables by the following means:

• Top-down processing of semantic knowledge base building.
• Conceptual graph-based ontology construction for the pattern level.
• Data source documentation and data processing strategies for the data level.
• Complete ontology-based mechanism for tackling the cold start problem.

Conceptual Graph-Based Ontology Construction
Generally, a workable knowledge base has a two-tier structure: a pattern level at the top and a data level at the bottom [30]. For the pattern level, the ontologies are constructed based on conceptual graphs, as we briefly described in Section 3.1.1. By this means, the ontologies represent the formal dimensions of the data processing workflow, and can drive data processing with a priori knowledge, thereby reducing the search space [23,31].
By accomplishing a comprehensive survey of literature in the fields of pedagogy, psychology, e-learning and mobile learning, we sorted out features that might play key roles in the micro open learning experience and achievement. These conceptual graphs also represent how features were affected by and interrelated with each other in the ongoing micro open learning process. This will be introduced in the subsequent Sections 4.1.1 and 4.1.2, from the OER (item) side and learner (user) side, respectively.

Augmented Micro OER Ontology
From the item-based perspective, we extend the view from normal e/m-learning into the micro learning environment. For this reason, the general ontology of OER is augmented to adapt to the needs of micro learning.
In an augmented micro OER ontology, the annotation of a micro OER is self-describing, with metadata covering its educational parameters, such as typology (video, audio, text, etc.), type of interaction (expositive, active, mixed, two-way) and didactic model (e.g., inductive, deductive, learning by doing), as well as non-functional attributes, such as QoS and semantic density [32]. Each node in an augmented OER ontology indicates a micro OER chunk. A chunk is the smallest unit in the micro learning setting (normally a finely cut piece of an OER from its provider), with a noticeably shorter time length (preferably less than 15 min) than the original. It can be a mini concept or knowledge point, smaller than what teachers used to deliver; a cut of a course video or lecture notes; or course settings delivered along with a concept, such as an assessment, task or reading material [21].
No chunk is totally independent, and each of them is part of a relational web rather than merely a conceptual object [33]. This ontology is used to explicitly classify the OERs for recommendation among a pedagogically defined set of distinctive main concepts, fed as the raw material into the reasoning process of MLaaS [29,33].
A conceptual graph of the augmented OER ontology is shown in Figure 2.
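As an illustration of the chunk annotation described above, the following Python sketch models one node of the augmented micro OER ontology as a plain record. The class name `MicroOERChunk`, its field names and the constant `MAX_MICRO_MINUTES` are assumptions made for this example; the actual ontology schema is the one depicted in Figure 2 and may differ.

```python
from dataclasses import dataclass

# Assumption: the "preferably less than 15 min" guideline is used as a hard limit.
MAX_MICRO_MINUTES = 15.0


@dataclass(frozen=True)
class MicroOERChunk:
    chunk_id: str
    source_oer: str        # the OER this chunk was cut from
    typology: str          # e.g., "video", "audio", "text"
    interaction: str       # e.g., "expositive", "active", "mixed", "two-way"
    didactic_model: str    # e.g., "inductive", "deductive", "learning by doing"
    duration_min: float    # time length of the chunk in minutes

    def is_micro(self) -> bool:
        """Check the chunk against the assumed time-length limit."""
        return self.duration_min < MAX_MICRO_MINUTES


chunk = MicroOERChunk("c1", "oer-42", "video", "expositive", "deductive", 8.5)
print(chunk.is_micro())
```

Non-functional attributes such as QoS and semantic density are omitted here for brevity; they could be added as further fields.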

Augmented Micro Learning Learner Profile Taxonomy
From a user-based perspective, the main ontology, on which all learner profiles are based, is named the Benchmark ontology, where the element Learner is put at the center of the graph [29]. Acting as an instance of a preset domain ontology, a specific learner profile oriented to micro learning is a set of nodes from the Benchmark ontology matched with a node in the augmented micro OER ontology. It contains plenty of annotations in terms of the learner's learning behaviors and context. A conceptual graph of the Benchmark ontology is shown in Figure 3.

EDM and LA Strategy
Reporting learners' data visually and statistically to demonstrate their unique learning story, and also their learning constraints (such as time availability), is crucial. This plays a significant role in assessing learners' study status, estimating learners' study progress, and carrying out strategic decision-making. This process is responsible for the benchmark setting for routine data extraction from the open learning platform.
For the bottom level (i.e., the data level) of the knowledge base, the technical operation of semantic learner profiles and knowledge base construction for micro open learning is based on data that populates the graphs from two sources: explicit data collection (e.g., through mandatory requests); and implicit data tracking (e.g., automatic extraction) [33].
In addition, rather than developing the domain ontology for OERs by ourselves, a general structure of courseware ontology is built jointly by making use of existing ontologies, which were extracted from major OER providers, such as universities involved in major open courseware alliances (e.g., participating institutions in edX (https://www.edx.org/schools-partners)), or from the Linked Open Data Cloud community (http://lod-cloud.net/) [34].
The investigation of 'big' open learning data is organized at the OER side. Among the massive OERs, three main types of relations are foreseen:

• ConsistsOf is an inclusion relation. This relation can generally be found between two OERs or between one OER and one micro OER. Two items with this relation are located at different levels of the hierarchy of the augmented micro OER ontology.
• RequiredSequence is a strong order relation between two items (OER or micro OER), where the former item must necessarily be learnt before the latter one, due to course settings and educational considerations.
• RecommendedSequence is a weak order relation between two items (OER or micro OER), where the former item is suggested to be learnt before the latter one, according to the instructor's guidance, but this is not mandatory.
• It is certainly possible for two items (OER or micro OER) to have no relation at all.
• Both relations regarding sequence can be inherited by entities' descendants. For example, if there is a RecommendedSequence($R_1$, $R_2$) indicating that an OER $R_1$ is preferably learnt prior to $R_2$, then, for $MR_1 \in R_1$ and $MR_2 \in R_2$, there is a RecommendedSequence($MR_1$, $MR_2$).
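The inheritance rule for sequence relations can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the MLaaS reasoning process: ConsistsOf is assumed to be stored as a mapping from an item to its direct children, a sequence relation as a set of (former, latter) pairs, and the function names `descendants` and `inherit_sequence` are hypothetical.

```python
def descendants(item, consists_of):
    """Return every item reachable from `item` through ConsistsOf, excluding `item`."""
    found = []
    stack = list(consists_of.get(item, []))
    while stack:
        child = stack.pop()
        if child not in found:
            found.append(child)
            stack.extend(consists_of.get(child, []))
    return found


def inherit_sequence(pairs, consists_of):
    """Extend a sequence relation so that descendants inherit their ancestors' order."""
    inherited = set(pairs)
    for former, latter in pairs:
        for former_part in descendants(former, consists_of):
            for latter_part in descendants(latter, consists_of):
                inherited.add((former_part, latter_part))
    return inherited


# R1 is recommended before R2; each OER consists of micro OER chunks.
consists_of = {"R1": ["MR1a", "MR1b"], "R2": ["MR2a"]}
recommended = {("R1", "R2")}
print(sorted(inherit_sequence(recommended, consists_of)))
```

The same function applies unchanged to RequiredSequence, since both relations are inherited in the same way; only the strength of the resulting ordering constraint differs.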
The purpose of the EDM/LA is to amend, enrich and validate the aforementioned ontologies built manually and extracted semi-automatically, and to verify and weigh the importance of discovered relations. Our combination of EDM/LA is realized on the basis of two components [35]: on-campus mobile learning data (i.e., structured data), and 'big' open learning data (i.e., unstructured data). In particular, we are carrying out the experimental EDM/LA by conducting a substantial analysis of real data on the learning behaviors of students from a public university in Australia. The data are collected from the main learning management system (LMS) and data warehouse of the university. This analysis aims to identify the regular patterns of students getting involved in blended learning (i.e., on-campus learning and e/m learning), for example, whether and how often they adopt micro learning modes to accomplish learning tasks; to explore the major factors that affect their learning habits; and, most importantly, to understand the rules by which the features listed in the personalized learner model affect and interrelate with each other and act upon learning outcomes. At this stage, we are discovering potential trends which cannot be directly seen in the data we have gathered. We can then apply such findings to open learning scenes and infer what lies behind them. The detailed data sets are listed in Table 1:

| Data Type | Purpose |
|---|---|
| Learners' exact time of logon/out for each time | To know how long they stay online each time |
| The … | To know their learning habits (how they prefer learning resources to be passed on) |
| Course requirement/milestones set in LMS (by instructor) | To know the suggested learning schedule |
| Their detailed learning activities (what they do when staying online and how long they spend on each specific learning activity, type of resource they access for each specific time) | To know their learning habits, learning engagements, learning speed and so on |
| Their interactions with LMS and learner-generated content (from forum and thread, etc.) | To know their preferences, interests and to measure their engagement |
| Frequencies of their participation in interactive learning activities (e.g., forum, thread) | To know their engagement |
| Extent of completeness for each learning activity | To know whether they finished an entire step of learning or drop off halfway |
| The learning paths they have gone through (the sequence of their access of learning resources over LMS) | To further establish optimal learning paths |
| Their learning achievement (grades and final marks if possible) | To know how their learning behaviors affect their learning outcomes |
| Groups or teams they have participated in | To know their collaborative learning performance and similarities/changes of learning time frame among learners |

The study is subsequently extended and applied to a larger scale, by analyzing 'big' data from real open learning activities. Data mining means with different aims are shown in the first column of Table 2.
To a large extent, establishing the data level involves integrating heterogeneous OCW repositories, refining and blending available OERs into the micro learning context, and publishing their metadata as linked data. In recent years, educators and researchers have made great efforts to publish and popularize OERs following the linked data concept; a workflow developed with this extended aim is generally divided into six phases:

1. Identify and select heterogeneous data sources to determine the scope of the content.

6. Consume and display linked data.

| | OERs in affiliated social networks | To filter out information that may be useless or harmful, or that may waste learners' time |
| Social Network Analysis | Other content in affiliated social networks | To screen well-recognized information in order to recommend it to learners as learning augmentation besides the OERs (text mining technique employed) |

Representation of Learner Profile
Adopting ontologies as the basis of the learner profile is crucial in addressing the cold start problem in micro OER delivery. It allows the initial learner behavior to be matched with the a priori knowledge defined in the ontologies and the relationships among them.
The learner profile is managed by MLaaS in two parts: a static part and a dynamic part. The static part can be represented by a vector, which contains demographic and educational information. By matching the two augmented ontologies for item and user, respectively, the dynamic part of a learner node is denoted as a pair, $L_j = \{MR_u, ML_j\}$, $L_j \in L$. Herein, the element $MR_u$ denotes the $u$th micro OER, as described in Section 4.1.2, which is a particular version of the micro OER ontology, and the three-dimensional element $ML_j = \{P_{u,j}, TA_j, D_j\}$ is exclusive to the $j$th learner during the micro learning process. The element $P_{u,j}$ indicates the learner's preferences, $TA_j$ indicates the $j$th learner's instant time availability, and $D_j$ denotes the level of distraction given the learning environment and surroundings.
Whenever MLaaS gathers any information from the learner's learning process over OERs, the learner profile is updated with regard to $ML_j$.
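The two-part profile described above can be sketched as a small data structure. This is only an illustrative reading of the text: the class names, field names and the `observe` method are assumptions made for this sketch and do not come from MLaaS itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class DynamicPart:
    """ML_j = {P_{u,j}, TA_j, D_j} for one learner (field names are illustrative)."""
    preferences: Dict[str, float] = field(default_factory=dict)  # P_{u,j}, keyed by micro OER id
    time_availability: Optional[Tuple[int, int]] = None          # TA_j as an inclusive span
    distraction: Optional[float] = None                          # D_j


@dataclass
class LearnerProfile:
    learner_id: str
    static: Dict[str, str]                                       # demographic and educational attributes
    dynamic: DynamicPart = field(default_factory=DynamicPart)

    def observe(self,
                micro_oer_id: Optional[str] = None,
                preference: Optional[float] = None,
                time_availability: Optional[Tuple[int, int]] = None,
                distraction: Optional[float] = None) -> None:
        """Update ML_j with whatever was gathered from the latest learning activity."""
        if micro_oer_id is not None and preference is not None:
            self.dynamic.preferences[micro_oer_id] = preference
        if time_availability is not None:
            self.dynamic.time_availability = time_availability
        if distraction is not None:
            self.dynamic.distraction = distraction


# Hypothetical learner and micro OER identifiers, for illustration only.
profile = LearnerProfile("learner-1", {"occupation": "student", "education": "bachelor"})
profile.observe(micro_oer_id="MR_3", preference=4.0, time_availability=(5, 10))
```

Keeping the static attributes separate from the dynamic element mirrors the split in the text: only the dynamic element changes as the learner interacts with OERs.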

Preference Propagation
Under the cold start condition for the first micro OER delivery, a learner is asked to quickly indicate a preference for a specific micro OER. A spreading activation approach is then applied to carry this preference over to the parent node (i.e., $R_v$, the $v$th OER from which $MR_u$ is derived) and to update the learner profile. It propagates the learner's preferences upwards in the hierarchy of the micro OER ontology based on activation values. In other words, the preference obtained from a micro OER is applied to its ancestor and spread at its superclass (i.e., OER) level. An example of the spreading activation is shown in Figure 4, which gives a partial view of the augmented micro OER ontology in the 'information technologies' area. Specifically, it describes an 'e-business' OER from an Australian provider, OpenLearning (https://www.openlearning.com/). At the bottom level of the ontology, the oval nodes typically conform to the standard of micro OER. The red integers in the rectangular nodes are a learner's preference values for the target OERs. Algorithm 1 is proposed to execute the process of preference propagation.

Algorithm 1 Preference Propagation
Input: Dynamic part of learner profile $L_j = \{MR_u, ML_j\}$, $L_j \in L$; a trial micro OER $MR_u$
Output: Updated dynamic part of learner profile with the updated $P_{u,j}$ value in the three-dimensional set $ML$; $P(R_v)$ and $Activation(R_v)$, the preference value and activation value for the OER $R_v$
// Step 1: Spreading Activation
begin
  Initialize PriorityQueue; // PriorityQueue is the set of OERs within the same discipline to which $R_v$ belongs
  Set Activation of all micro OERs to 0
  for each
  end for
end

The normalization factor prevents the propagated preferences from escalating continuously until they exceed a reasonable range, which would complicate subsequent data processing. The confidence degree for the propagated preference of an OER is recorded as $CD(P_{v,j})$.
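As a rough illustration of upward spreading activation, the sketch below passes one preference value from a micro OER to its ancestors. The update rule is an assumption for this example: each level multiplies the incoming value by a hypothetical `decay` weight and divides it by a hypothetical `normalization` factor. It is not the rule used in Algorithm 1.

```python
from typing import Dict, Optional


def propagate_up(parent_of: Dict[str, Optional[str]],
                 leaf: str,
                 preference: float,
                 decay: float = 0.5,
                 normalization: float = 1.0) -> Dict[str, float]:
    """Spread a preference for one micro OER to every ancestor in the hierarchy.

    Returns the activation received by each ancestor. The decay and
    normalization parameters are assumptions of this sketch.
    """
    activation: Dict[str, float] = {}
    value = preference
    node = parent_of.get(leaf)
    while node is not None:
        # Attenuate and normalize the value before it reaches the next level up.
        value = value * decay / normalization
        activation[node] = activation.get(node, 0.0) + value
        node = parent_of.get(node)
    return activation


# Hypothetical hierarchy: micro OER -> OER -> discipline.
hierarchy = {"MR_1": "R_ebusiness", "R_ebusiness": "information_technologies"}
activations = propagate_up(hierarchy, "MR_1", preference=4.0)
```

With a normalization factor above 1, the values reaching higher levels shrink faster, which is one way of keeping propagated preferences within a bounded range.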

Instant Time Availability
The system is able to obtain explicit information, in real time and through mobile devices, on how long the learner can (or would like to) spend on a micro OER. As a mandatory request, a learner is required to input his or her instant time availability at the beginning of every micro learning activity. According to the system settings, the instant time availability, $TA_j$, is represented by an integer from 1 to 15. However, a learner who is unsure how long he or she can spend on the micro OER in one sitting is free to enter a time span instead, expressed as consecutive integers within the same range.

Demographic Classification
In our prior work [21], we discussed the key issues that might cause distraction in micro open learning, which generally come from two sides: the social side and the environmental side.
In addition, MLaaS investigates existing learners' degrees of distraction as a reference, and senses every learner's location information through built-in functions in mobile devices. Based on the given taxonomy and augmented ontology, we carry out a demographic classification that aims to cluster learners into cohorts, in order to match them with micro-pieces of OERs.
This classification mechanism is implemented because learners who have similar static information (such as employment and/or education background and occupation) and a similar learning environment/location are more likely to face similar levels of distraction. For the same reason, their overall time availabilities are more likely to fall in the same range. Herein, MLaaS associates a learner with a pre-clustered learner group by applying the stereotyping technique to fulfill the requirements of demographic classification.
For a newly joined learner, $L_j$, an ensemble method of a binary classifier and a one-against-all model is utilized to obtain multi-class classifications [36,37] in order to predict the learner's category, $C_j$. The system is trained with an existing set of learners, $L$. Typical binary classification techniques, e.g., the C4.5 decision tree [38,39] or the Naive Bayes classifier [40], can be employed as the base algorithm (i.e., training algorithm $F$ in Algorithm 2) in order to produce a suitable classifier, $CF_k$. A new learner $L_j$ is classified with the label $k$ whose $CF_k$ produces the highest value of $\hat{y}$. Hence, the demographic classification is realized according to learners' static and location information. Once new learners join the open learning scenario, MLaaS responds immediately by classifying them into clusters.
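The one-against-all scheme can be sketched as follows: one binary scorer is trained per category, and a new learner receives the label whose scorer returns the highest value. The base learner here is a small categorical Naive Bayes written only for this example. It stands in for the training algorithm $F$ and is not the implementation behind Algorithm 2; the attribute values and category labels are made up.

```python
from collections import Counter
from typing import Callable, Dict, Sequence, Tuple

Features = Tuple[str, ...]
Scorer = Callable[[Features], float]


def train_binary(samples: Sequence[Features], labels: Sequence[bool]) -> Scorer:
    """Stand-in base algorithm F: Naive Bayes over categorical attributes, add-one smoothing."""
    positives = [s for s, y in zip(samples, labels) if y]
    negatives = [s for s, y in zip(samples, labels) if not y]

    def likelihood(group: Sequence[Features], x: Features) -> float:
        score = (len(group) + 1) / (len(samples) + 2)            # smoothed class prior
        for idx, value in enumerate(x):
            counts = Counter(s[idx] for s in group)
            n_values = len({s[idx] for s in samples})
            score *= (counts[value] + 1) / (len(group) + n_values)
        return score

    def scorer(x: Features) -> float:
        p, n = likelihood(positives, x), likelihood(negatives, x)
        return p / (p + n)                                       # plays the role of y-hat

    return scorer


def train_one_vs_all(samples: Sequence[Features], categories: Sequence[str]) -> Dict[str, Scorer]:
    """Train one classifier CF_k per category k, each separating k from all the rest."""
    return {k: train_binary(samples, [c == k for c in categories])
            for k in sorted(set(categories))}


def classify(classifiers: Dict[str, Scorer], x: Features) -> str:
    """Return the label k whose CF_k gives the highest score for x."""
    return max(classifiers, key=lambda k: classifiers[k](x))


# Hypothetical static/location attributes: (occupation, usual learning location).
existing = [("student", "campus"), ("student", "campus"),
            ("employee", "office"), ("employee", "commute"), ("employee", "office")]
cohorts = ["A", "A", "B", "B", "B"]
models = train_one_vs_all(existing, cohorts)
new_learner_cohort = classify(models, ("student", "campus"))
```

Any binary learner that outputs a comparable score could replace `train_binary` without changing the surrounding one-against-all logic.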

Similarity Measure between Two Learners
MLaaS is responsible for finding similar existing learners in the discovered demographic categories, so as to recommend to new learners the micro OERs that have been recognized as suitable to learn in a given time span, situation and environment.
Learners' learning location information is sensed from the location service embedded in their mobile devices. Thus, the similarity between two learners, $L_i$ and $L_j$, is evaluated using Equation (1), where $S_l$ is the similarity value of the $l$th attribute in the static part of the learner profile and $W_l$ is its corresponding weight, $SLo_{i,j}$ denotes the similarity of their locations, and $W_{i,j}$ denotes the weight of the location factor.
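One plausible way to combine the quantities named above is a normalized weighted sum, sketched below. Both the combination rule and the exact-match definition of the per-attribute similarity $S_l$ are assumptions of this sketch; Equation (1) itself is not reproduced here.

```python
from typing import Dict


def learner_similarity(static_i: Dict[str, str],
                       static_j: Dict[str, str],
                       weights: Dict[str, float],
                       location_similarity: float,
                       location_weight: float) -> float:
    """Weighted similarity of two learners from static attributes and location.

    Assumes S_l is 1.0 for an exact attribute match and 0.0 otherwise, and
    that the weighted terms are normalized by the total weight.
    """
    total = location_weight * location_similarity
    weight_sum = location_weight
    for attribute, weight in weights.items():
        s_l = 1.0 if static_i.get(attribute) == static_j.get(attribute) else 0.0
        total += weight * s_l
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Hypothetical attributes and weights.
similarity = learner_similarity(
    {"occupation": "student", "education": "bachelor"},
    {"occupation": "student", "education": "master"},
    weights={"occupation": 2.0, "education": 1.0},
    location_similarity=0.5,
    location_weight=1.0,
)
```

Normalizing by the total weight keeps the result in the range 0 to 1 whenever the location similarity is in that range, which makes values comparable across learners with different attribute sets.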

Distraction Prediction
Thus, following Equation (2), the distraction value of a learner is estimated from the distraction levels indicated by the members of the same cluster, where $d_{j,Lo_a}$ is the self-identified degree of distraction that the learner $L_j$ felt in the location $Lo_a$, acquired by mandatory request. This follows the expectation that learners who have similar general situations (i.e., social factors) and surroundings (i.e., environmental factors) have a high probability of having similar degrees of distraction. The confidence degree for the predicted distraction is denoted as $CD(D_{i,Lo_a})$.
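A minimal sketch of such a prediction is a similarity-weighted average of the self-reported distraction values of neighbouring learners at the same location. The weighting scheme is an assumption for illustration and is not taken from Equation (2).

```python
from typing import List, Optional, Tuple


def predict_distraction(neighbours: List[Tuple[float, float]]) -> Optional[float]:
    """Estimate a new learner's distraction at a location.

    Each neighbour is (similarity to the new learner, self-reported d_{j,Lo_a}).
    Returns None when no neighbour carries any weight.
    """
    total_weight = sum(similarity for similarity, _ in neighbours)
    if total_weight == 0:
        return None
    return sum(similarity * d for similarity, d in neighbours) / total_weight


# Two hypothetical neighbours in the same cluster and location.
predicted = predict_distraction([(0.5, 2.0), (0.5, 4.0)])
```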

Downwards Propagation
In Section 5.1, we obtained only the preference of a learner for an 'entire' OER rather than for a micro OER. The preference values are therefore propagated downwards again through the ontology hierarchy, so that each micro OER node receives an estimated preference value from its ancestor. This propagation process is executed with a decay factor. For each micro OER, the final preference value, $P_{u,j}$, can be calculated using Equation (3), where $R$ is the set of all the nodes at a higher level of the hierarchy than $MR_u$, $R_v$ is a direct ancestor of node $MR_u$, and $Q(u,v)$ depends on the number of levels between $MR_u$ and $R_v$.
The confidence degree for the descendant node, with regard to $P_{u,j}$, is calculated as the average of the confidence values of its ancestors, decreased by a decay factor, $\mu$.
Having settled the values of the three attributes in the set $ML$ (preferences, instant time availability and degree of distraction), MLaaS has constructed a complete learner profile from the sparse initial information.
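The downward step can be sketched as below. The exact forms are assumptions for illustration: each ancestor's preference is attenuated by `decay` raised to the number of levels separating it from the micro OER, and the averaged ancestor confidence is multiplied by `mu`. Equation (3) and the confidence formula may differ in detail.

```python
from typing import List, Tuple


def propagate_down(ancestors: List[Tuple[float, float, int]],
                   decay: float = 0.8,
                   mu: float = 0.9) -> Tuple[float, float]:
    """Estimate (P_{u,j}, CD(P_{u,j})) for a micro OER from its ancestors.

    Each ancestor is (preference P_v, confidence CD_v, levels between MR_u and R_v).
    """
    if not ancestors:
        raise ValueError("a micro OER needs at least one ancestor")
    # Farther ancestors contribute less: decay ** levels shrinks with distance.
    preference = sum(p * decay ** levels for p, _, levels in ancestors)
    # Confidence: the average ancestor confidence, reduced by the decay factor mu.
    confidence = mu * sum(cd for _, cd, _ in ancestors) / len(ancestors)
    return preference, confidence


# Hypothetical ancestors: the parent OER (1 level up) and its discipline (2 levels up).
estimated_preference, estimated_confidence = propagate_down(
    [(4.0, 0.8, 1), (2.0, 0.6, 2)], decay=0.5, mu=0.5)
```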

Micro OER Sorting Rules
For each micro OER, once MLaaS has acquired its final preference value and confidence degree, the nodes that do not meet the minimum required confidence degree are rejected by the system.
A list of recommended micro OERs is then generated, in which those of higher interest to the learner are placed at the top. For two micro OERs, $MR_u$ and $MR_w$, their order is determined by heuristic rules defined in accordance with the three kinds of relations discussed in Section 4.1.1. These rules are executed sequentially, in order of priority.

1. If there is a RequiredSequence relation between these two micro OERs, the prerequisite one is placed above (refer to Section 4.1.1).

2. If the preference $P_{u,j}$ is higher than $P_{w,j}$, then $MR_u$ is placed above $MR_w$.

3. If, in absolute terms, the confidence degree $CD(P_{u,j})$ is high and $CD(P_{w,j})$ is low, then $MR_u$ is placed above $MR_w$.

4. If there is a RecommendedSequence relation between these two micro OERs, the one which is suggested to be accessed first is placed above (refer to Section 4.1.1).

5. The micro OER which is more related to the learner's education background, or falls in relevant disciplines or inter-disciplines, is placed with priority if the disciplinary difference between these two candidate micro OERs is obvious.

6. Otherwise, if none of the above rules applies, the recommended micro OER list is randomly ordered.

Herein, the first rule is deemed to be a hard rule which should be strictly obeyed, and the remaining rules are soft rules that can be violated on a case-by-case basis in consideration of educational factors.
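The six rules can be read as a prioritized pairwise comparison. The sketch below is one possible encoding: the thresholds `cd_high`, `cd_low` and `relevance_gap`, the `relevance` score and the `Candidate` fields are assumptions introduced for illustration, since the text does not quantify 'high', 'low' or 'obvious'.

```python
import random
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Callable, List, Optional, Set, Tuple

Relation = Set[Tuple[str, str]]   # (first, second): first should come before second


@dataclass(frozen=True)
class Candidate:
    oer_id: str
    preference: float   # P_{u,j}
    confidence: float   # CD(P_{u,j})
    relevance: float    # assumed score for closeness to the learner's background


def make_comparator(required: Relation, recommended: Relation,
                    cd_high: float = 0.7, cd_low: float = 0.3,
                    relevance_gap: float = 0.5) -> Callable[[Candidate, Candidate], int]:
    def compare(a: Candidate, b: Candidate) -> int:
        # Rule 1 (hard): a prerequisite always comes first.
        if (a.oer_id, b.oer_id) in required:
            return -1
        if (b.oer_id, a.oer_id) in required:
            return 1
        # Rule 2: higher preference first.
        if a.preference != b.preference:
            return -1 if a.preference > b.preference else 1
        # Rule 3: clearly high confidence beats clearly low confidence.
        if a.confidence >= cd_high and b.confidence <= cd_low:
            return -1
        if b.confidence >= cd_high and a.confidence <= cd_low:
            return 1
        # Rule 4: follow a RecommendedSequence relation if one exists.
        if (a.oer_id, b.oer_id) in recommended:
            return -1
        if (b.oer_id, a.oer_id) in recommended:
            return 1
        # Rule 5: prefer the more relevant item when the difference is obvious.
        if abs(a.relevance - b.relevance) >= relevance_gap:
            return -1 if a.relevance > b.relevance else 1
        return 0  # Rule 6: no rule applies, so keep the (shuffled) order.
    return compare


def rank(candidates: List[Candidate], required: Relation, recommended: Relation,
         min_confidence: float = 0.0, seed: Optional[int] = None) -> List[Candidate]:
    kept = [c for c in candidates if c.confidence >= min_confidence]
    random.Random(seed).shuffle(kept)          # ties end up in random order
    return sorted(kept, key=cmp_to_key(make_comparator(required, recommended)))


# Hypothetical candidates; "c" falls below the minimum confidence and is rejected.
items = [Candidate("a", 3.0, 0.8, 0.5), Candidate("b", 5.0, 0.8, 0.5),
         Candidate("c", 4.0, 0.1, 0.5)]
with_prerequisite = rank(items, required={("a", "b")}, recommended=set(), min_confidence=0.2)
by_preference = rank(items, required=set(), recommended=set(), min_confidence=0.2)
```

Because the rules are applied pairwise, the resulting comparison is not guaranteed to be transitive for every combination of relations, so a plain sort is only an approximation; this is one reason a search over whole learning paths can be preferable to sorting alone.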

Recommendation Results Optimization
MLaaS consumes the values $P$ and $D$, in conjunction with $TA$, and compares them with the attributes and requirements annotated in the metadata of the augmented OER ontology.
The next step is to integrate the outcomes from Section 5.4. A fitness function converts these selected multidimensional arrays into one variable. Hence, the problem is transformed into a multi-objective optimization problem.
To initiate the constrained multi-objective optimization, candidate learning path solutions (chromosomes) are randomly generated, each of which is a learning path consisting of a series of micro OERs. For a chromosome, its violation degree is obtained by examining the relation between each pair of adjacent (prior/posterior) micro OERs against the first five rules listed in Section 5.4.2, and then summing up the results. For such a pair in a chromosome $k$, the violation degree, $VD(MR_t, MR_{t+1})$, is calculated as the weighted sum of its violations of rules 2 to 5, where $MR_t$ is the $t$th micro OER in $k$ and $MR_{t+1}$ is the $(t+1)$th. The higher the violation degree, the more seriously the candidate learning path violates the rules. The violation degree of a candidate learning path, $k$, denoted $VD_k$, is calculated using Equation (6).

Thereafter, let the variable $RA_u$ denote the degree of required attention of a given micro learning resource, $MR_u$, whose real-time suitability for micro learning, $RT_{u,j}$, is calculated by comparing it with the predicted distraction of the learner $L_j$, using Equation (7):

$$RT_{u,j} = \left\{ (RA_u)^2 + \left[ CD(D_{j,Lo_a}) \cdot D_{j,Lo_a} \right]^2 \right\}^{1/2} \quad (7)$$

Hence, for the candidate learning path $k$, $RT_{k,j}$ denotes the sum of the real-time suitability of the micro OERs it contains. Similarly, $P_{k,j}$ sums up all of the learner $L_j$'s predicted preferences for the micro OERs that $k$ contains. The fitness is then given by Equation (8):

$$\eta = \min\left( \alpha VD_k + \beta RT_{k,j} + \gamma / P^1_{k,j} + \delta / P_{k,j} \right) \quad (8)$$

where $\alpha$, $\beta$, $\gamma$ and $\delta$ serve as the weights of the variables, with the suggested ordering $\alpha > \beta > \gamma > \delta$, and $P^1_{k,j}$ denotes $L_j$'s preference value for the first micro OER in the candidate learning path $k$.
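The quantities entering the fitness can be computed as in the sketch below, which follows Equation (7) and the expression for $\eta$ for a single candidate path. The dictionary keys, the `pair_violation` callback and the default weights are assumptions for illustration; the weights only respect the suggested ordering $\alpha > \beta > \gamma > \delta$.

```python
import math
from typing import Callable, Dict, List

MicroOER = Dict[str, float]   # assumed keys: "required_attention" (RA_u), "preference" (P_{u,j})


def real_time_suitability(required_attention: float,
                          distraction: float,
                          distraction_confidence: float) -> float:
    """Equation (7): RT_{u,j} = sqrt(RA_u^2 + (CD(D) * D)^2)."""
    return math.sqrt(required_attention ** 2 + (distraction_confidence * distraction) ** 2)


def path_fitness(path: List[MicroOER],
                 pair_violation: Callable[[MicroOER, MicroOER], float],
                 distraction: float,
                 distraction_confidence: float,
                 alpha: float = 0.4, beta: float = 0.3,
                 gamma: float = 0.2, delta: float = 0.1) -> float:
    """Value to be minimized for one candidate learning path k (preferences must be positive)."""
    # VD_k: summed violation degree over adjacent micro OER pairs.
    vd_k = sum(pair_violation(a, b) for a, b in zip(path, path[1:]))
    # RT_{k,j}: summed real-time suitability of the micro OERs in the path.
    rt_k = sum(real_time_suitability(m["required_attention"], distraction, distraction_confidence)
               for m in path)
    p_k = sum(m["preference"] for m in path)   # P_{k,j}
    p_first = path[0]["preference"]            # P^1_{k,j}
    return alpha * vd_k + beta * rt_k + gamma / p_first + delta / p_k


# Hypothetical two-step path with a constant pairwise violation of 1.0.
example_path = [{"required_attention": 3.0, "preference": 4.0},
                {"required_attention": 0.0, "preference": 1.0}]
eta = path_fitness(example_path, lambda a, b: 1.0, distraction=4.0,
                   distraction_confidence=1.0, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0)
```

The minimization in Equation (8) is taken over candidate paths, so a search procedure would call `path_fitness` once per chromosome and keep the smallest value.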
Algorithm 4 indicates the typical steps for making the first recommendation. By this means, the heuristic Algorithm 4 infers a suitable micro OER as the first learning resource recommended in the learner's new open learning experience through MLaaS.
Once this solution to the well-known cold start problem in micro learning has been launched, learners' subsequent behaviors are continuously acquired by MLaaS to feed the reasoning engine.

Algorithm 4 Micro OER Recommendation in a Cold Start Condition
Input: $P_{u,j}$ (the learner $L_j$'s predicted preference for the micro OER $MR_u$), $D_{j,Lo_a}$ (predicted distraction level), $CD(P_{u,j})$ and $CD(D_{j,Lo_a})$ (their confidence degrees), $RA_u$ (the degree of required attention of $MR_u$), $TA_j$ (the instant time availability), rules (1st-6th)
Output: the tag of a micro OER which acts as the first delivery
begin
  Randomly generate candidate learning paths as chromosomes
  for each chromosome k do
    Select the micro OERs it contains
    for each $MR_u$ in chromosome k do
      Calculate its $P_{u,j}$ and $CD(P_{u,j})$
      Import $D_{j,Lo_a}$, $CD(D_{j,Lo_a})$ and $RA_u$
      Calculate its $RT_{u,j}$
    end for
    Calculate k's $VD_k$
    Use Equation (8) to evaluate its fitness η
  end for
  while iteration times < max iteration time do
    apply heuristic approach to generate new candidate solutions
    for each new chromosome k' do
      check the time length of the first micro OER in k', $TL^1_k$
      if $TL^1_k$ is in the range of $TA_j$ then
        keep k'
      otherwise
        reject k'
      end if
      evaluate the fitness of k', η, using Equation (8)
    end for
    replace chromosomes with higher η values
  end while
  output the selected chromosome k'' with minimum η and satisfied $TL^1_k$
  select the first micro OER in k'' as the first delivery
end
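The loop in Algorithm 4 can be sketched as a small population-based search. The variation operator, the population size and the iteration count are assumptions chosen for illustration, and the fitness is passed in as a function so that any evaluation of $\eta$ can be plugged in. The catalogue entries and `toy_fitness` below are hypothetical.

```python
import random
from typing import Callable, Dict, Optional, Tuple

Path = Tuple[str, ...]


def first_delivery(micro_oers: Dict[str, dict],
                   fitness: Callable[[Path], float],
                   time_span: Tuple[int, int],
                   path_length: int = 2,
                   population: int = 20,
                   iterations: int = 50,
                   seed: int = 0) -> Optional[str]:
    """Return the id of the first micro OER of the best feasible path, or None."""
    rng = random.Random(seed)
    ids = sorted(micro_oers)
    low, high = time_span

    def feasible(path: Path) -> bool:
        # TL^1_k: the first micro OER must fit the learner's time availability TA_j.
        return low <= micro_oers[path[0]]["length"] <= high

    def mutate(path: Path) -> Path:
        # Illustrative operator: place a random micro OER at a random position,
        # swapping the two positions if it is already part of the path.
        new = list(path)
        i = rng.randrange(len(new))
        replacement = rng.choice(ids)
        if replacement in new:
            j = new.index(replacement)
            new[i], new[j] = new[j], new[i]
        else:
            new[i] = replacement
        return tuple(new)

    pool = [tuple(rng.sample(ids, path_length)) for _ in range(population)]
    for _ in range(iterations):
        children = [c for c in (mutate(p) for p in pool) if feasible(c)]
        merged = list(dict.fromkeys(pool + children))       # de-duplicate, keep order
        pool = sorted(merged, key=fitness)[:population]     # keep the lowest eta values
    candidates = [p for p in pool if feasible(p)]
    if not candidates:
        return None
    return min(candidates, key=fitness)[0]


# Hypothetical catalogue: lengths in the same unit as TA_j, preferences P_{u,j}.
catalogue = {
    "MR_a": {"length": 20, "preference": 5.0},
    "MR_b": {"length": 5, "preference": 4.0},
    "MR_c": {"length": 8, "preference": 1.0},
    "MR_d": {"length": 3, "preference": 2.0},
}


def toy_fitness(path: Path) -> float:
    # Stand-in for eta: lower is better when preferences are higher.
    return sum(1.0 / catalogue[m]["preference"] for m in path)


chosen = first_delivery(catalogue, toy_fitness, time_span=(1, 15))
```

Only the first micro OER of the selected path is delivered, which is why the time-availability check is applied to the first position alone.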

Conclusions
In this paper, we introduced a study aimed at dealing with adaptive micro OER delivery. We proposed a tailored system, MLaaS, for this purpose. Aside from its technical details and working principles, we primarily focused on the construction of the knowledge base. It was built using a top-down approach, with ontologies established at the pattern level first. On this basis, a data processing strategy was then developed at the lower level. This supported the decision-making process of the micro OER recommendation system. However, because both the system and the user are new, learners' information deficit in MLaaS delayed the commencement of adaptive micro open learning and of MLaaS operation. A detailed approach was therefore provided to deal with this so-called 'cold start problem', based on predicting learners' features from the sparse initial information.
Our future work will extend the EDM/LA work, prototype this ontological approach to the cold start problem, and further develop its corresponding component in MLaaS. The approach will be further evaluated by measuring the prediction accuracy. We will also engage real learners to compare the quality of recommendations. Apart from the 'new user' cold start problem discussed in this paper, we will look for solutions to deal with 'new items', i.e., micro OERs, by using a queue-jumping method to insert them into established learning paths.

Figure 1. Topology of the Architecture of MLaaS.

Figure 3. Concept Graph of Benchmark Ontology for a Learner Profile in Micro Open Learning.
| Data Type | Purpose |
| --- | --- |
| IP address or gateway information of their internet connection | To know their exact learning location and surroundings |
| Mobile device information, mobile operator information and mobile OSs | To know their general situation |
| Their personal enrollment information (full time or part time, nationality) | To know their learning time availability, organization and language skills |
| Their residential information (session address and permanent address) | To understand their distance to campus and the potential modes of transportation they adopt |
| Subjects they have chosen (current) | To know their academic background and field |
| Subjects they have chosen (historical) | To know their academic background and field |
| Historical grades | To know their academic background and infer level of pre-knowledge |
| Course materials they have accessed (material type, topic, length, requirement associated with them) | |

Figure 4. Partial View of the Augmented Micro OER Ontology and Spreading Activation for a Learner's Preference on OER.

Table 1. EDM/LA Data Sources from University Warehouse.

Table 2. EDM/LA Scheme for Open Learning Data.
Given that $L_j$ is categorized into $C_k$, the learner's neighborhood, $NB_j$, is then calculated by Algorithm 3. This aims to match a new learner's category with an existing learner's category.