An Intelligent Task Scheduling Mechanism for Autonomous Vehicles via Deep Learning

Abstract: With the rapid development of the Internet of Things (IoT) and artificial intelligence, autonomous vehicles have received much attention in recent years. Safe driving is one of the essential concerns of self-driving cars. Providing safer driving requires an efficient inference system for real-time task management and autonomous control. Because battery life and computing power are limited, reducing execution time and resource consumption is a challenging problem. This paper addresses these challenges and develops an intelligent task management system for IoT-based autonomous vehicles. For each task, a supervised resource predictor is invoked to select the optimal hardware cluster. Tasks are executed by the earliest hyper period first (EHF) scheduler to achieve optimal task error rate and schedule length. A single-layer feedforward neural network (SLFN) and a lightweight learning approach are designed to distribute each task to the appropriate processor based on its emergency and CPU utilization. We developed this intelligent task management module in Python and experimentally tested it on multicore SoCs (Odroid Xu4 and NVIDIA Jetson embedded platforms). Connected Autonomous Vehicles (CAV) and Internet of Medical Things (IoMT) benchmarks are used for training and testing. The proposed modules are validated by comparing task miss rate, resource utilization, and energy consumption against state-of-the-art heuristics. The SLFN-EHF task scheduler achieved an average accuracy of 98%, with an average reduction of 20–27% in execution time and 32–45% in task miss rate compared with conventional methods.


Introduction
IoT is becoming ubiquitous, connecting millions of devices such as sensors, actuators, gateways, and hubs over the Internet [1]. Many embedded applications are connected to the physical world via the Internet, e.g., autonomous vehicles, avionics, home automation systems, health monitoring systems, and smart cities. This leads to new developments that combine embedded systems with IoT, called real-time embedded IoT systems [2]. Of these applications, self-driving cars have received much attention in the recent past due to connectivity and artificial intelligence. Self-driving cars are the primary example of a real-time embedded IoT system, in which multiple hardware sensors and actuators are connected through the IoT to process each function. The self-driving car market is projected to grow by more than £7 trillion per year by 2050 [3].
However, the transportation industry faces many challenges in making self-driving cars a form of public transportation that creates a safe environment. Many autonomous vehicle manufacturers, such as Uber, Waymo, and Google, are struggling to bring smart cars to market due to the high cost of the computational systems and the lag of the software task management system [4]. According to a recent survey, are scheduled based on the two heuristics called direct execution strategy and task replacement model, allocating the tasks on mobile edge servers for execution. Hussein et al. developed a dynamic voltage frequency scaling-based task scheduling algorithm called Energy Saving-Dynamic Voltage Frequency Scaling (ES-DVFS) for autonomous systems. The proposed algorithm models the driving tasks as aperiodic tasks with soft deadlines [15]. Xiaoqiang et al. reshaped Petri nets with an Ant Colony Optimization (ACO) swarm-based optimizer to enhance the performance of IoT networks. The job scheduling problem is analyzed with a schedule timed transition Petri net (TTPN) and a search space optimization algorithm to improve the makespan (overall schedule length) [16]. The IoT tasks are modeled as directed acyclic graphs (DAGs) with precedence dependency constraints. A genetic algorithm is used to optimize the search space during the allocation and execution of real-time IoT tasks on virtual machines. Makespan is minimized, and the performance of the infrastructure as a service (IaaS) model is improved, using the Genetic Algorithm (GA) scheduling modules of Xiaojin Ma et al. [17]. Bini et al. proposed a branch-and-bound search-based task scheduler with fixed priorities to maximize system performance without violating deadlines [18]. Likewise, Hyun-Jun Cha et al. developed a deadline monotonic task scheduler that is optimal with respect to deadline and period constraints in automotive systems [19].

Research Gap on Existing Models
These methods are less efficient on multiprocessor systems-on-chip and are not suitable for high-complexity tasks such as object detection, path planning, localization, and lane detection. Recent hardware technology combines multiple CPUs, GPUs, AI, and IoT processors on the same chip to handle multiple tasks simultaneously. The existing software kernel handles these heterogeneous resources inefficiently, so an intelligent task management system is required to improve resource utilization with high performance and low energy consumption. Moreover, traffic prediction problems have recently received much attention in autonomous vehicle systems. The proposed intelligent task management system resolves the hardware/software co-design challenges and provides optimal solutions in terms of resource utilization, task miss rate, and execution time.

Challenges on Conventional Task Scheduling Policy for Self-Driving Cars
The conventional algorithms are best suited to uni-processor computing hardware and are not readily adaptable to multiprocessor systems. Moreover, recent autonomous vehicles are designed with heterogeneous multiprocessor systems-on-chip (HMPSoCs) to provide efficient performance at runtime.

• Task miss ratio is a significant constraint that must be minimized in autonomous vehicles to reduce accident rates, yet it is not considered in any previous scheduling method.

• Hardware resources in autonomous vehicles must be utilized efficiently to maintain the lifetime of devices such as sensors, actuators, and processing units.

• Time complexity (overall execution time) is another essential metric to optimize, because autonomous vehicles are battery-powered devices.

• The trade-off between task miss ratio and overall execution time in task scheduling is an NP-hard problem.

These are hardware/software co-design challenges that need to be solved in autonomous vehicles to provide safe driving and high performance.

Significant Contributions
The proposed intelligent task management system is developed to address the challenges mentioned above and is evaluated on two different HMPSoC hardware configurations. A supervised resource predictor with a scheduling algorithm is designed and validated with standard autonomous vehicle benchmarks.
Resource Predictor using Supervised Learning Model: Smart vehicles comprise mixed-critical workloads, in which sensing, image processing, and heavy computation are combined to complete a single task. The challenge that arises in real time is that the identification of task criticality and the selection of the best resources for execution are still lagging, which degrades performance at runtime. In the proposed framework, we address this issue by developing an intelligent learning neural network that predicts the best hardware cluster for each autonomous vehicle workload on the HMPSoC platform. A Single Layer Feed-forward Network (SLFN) [20] and a lightweight deep neural network (LW-DNN) are low-complexity predictors that are designed and compared with traditional learning algorithms such as SVM, RF, and decision tree.
Task Scheduling Mechanism: Another challenge arises in ordering the tasks for execution. Many existing scheduling approaches have high time complexity. To resolve this, we developed a low-complexity "earliest hyper period first (EHF)" algorithm and executed the tasks on the base cluster. Consequently, tasks are executed based on their emergency and earliest hyper period to achieve a better task miss rate (TMR) and performance.
Implementation Module: Heterogeneous multicore SoC platforms ("Odroid Xu4" and "NVIDIA Jetson") are used to implement the scheduling algorithm and provide real-time solutions for autonomous vehicle services. CAV and IoMT benchmarks are used to evaluate the proposed task management system. The hardware module is connected to a cloud server for regular monitoring and storage of the obtained results for future use. The SLFN-EHF task scheduler achieved an average prediction accuracy of 98%, with an average reduction of 20-27% in execution time and 32-45% in task miss rate compared with conventional methods.

Outline
The remaining sections are organized as follows. Section 2 details relevant work on scheduling algorithms for IoT-based AV applications. Section 3 discusses the preliminary system and application modules together with the proposed architecture. Section 4 presents the SLFN mathematical model and the internal working structure of the resource predictor. Section 5 explains the EHF algorithm with its workflow diagram. Section 6 describes the experimental setup and the validated results with performance charts. Section 7 concludes the paper and details the limitations of the proposed approach.

Related Works
Autonomous vehicle research has emerged over the past decade and has recently become ubiquitous due to the availability of IoT, artificial intelligence, and high-performance computational platforms. These developments have boosted research on hardware architectures and intelligent task scheduling mechanisms that satisfy timing constraints and safety requirements with low power consumption. Chien-Ying et al. (2018) studied real-time IoT systems with respect to architecture, security, and processing issues. Autonomous vehicle driving workloads are modeled as real-time periodic and aperiodic jobs with various timing constraints, and different case studies of Real-Time (RT)-IoT are discussed in detail. RT scheduling for IoT devices is explained along with its open problems and challenges [21]. Nasri et al. (2015) developed a model to capture the harmonic coupling between periods. After the harmonic subinterval is calculated within the range of possible periods for a given set of tasks, a harmonic period can easily be calculated and assigned, which further reduces the hyper period of the resulting assignment [22]. Liu et al. (2017) developed an autonomous driving system architecture that can run tasks on a heterogeneous Advanced RISC Machine (ARM) mobile system-on-chip [23]. They partition autonomous vehicle workloads into three categories: sensing, perception, and decision-making.
Smruti et al. (2019) focused on IoT networks and assessed the various layers, in which the sensors and actuators of the application layer are connected to the hardware through efficient scheduler modules. Global and local preferable algorithms were designed for IoT devices. The proposed technique uses multicore processors with a DVFS power optimizer to adjust the voltage and frequency of the computational elements at runtime [24]. A resource-aware scheduler for RT-IoT devices was framed by Shabir et al. (2019) [25]. RT tasks are modeled in a periodic format in terms of execution time, period, and deadline constraints. The proposed module first identifies the hyper period of each periodic task set, which is then executed by conventional schedulers such as Earliest Deadline First (EDF) and Rate Monotonic (RM). Resources such as processor energy consumption are optimized by minimizing the hyper period of each periodic task set at runtime. Sehrish Malik et al. (2019) developed an emergency-first algorithm for real-time embedded workloads with hard and soft deadline constraints. Real-time IoT tasks are modeled with four diverse tags: emergency, regular, and most emergency periodic and aperiodic. Multicore processors execute each task based on the tag attached to it [26]. Sensor tasks are modeled as periodic with a soft deadline, and actuator tasks are modeled as most emergency tasks that are prioritized first. The proposed model also includes an Artificial Neural Network (ANN)-based prediction module with urgency and failure measure metrics that helps in allocation and execution. Farzad Samie et al. (2019) published a survey on machine learning models for embedded IoT systems. The significant role of machine learning and deep learning algorithms in various categories of IoT applications is discussed extensively [27].
Jie Tang et al. (2020) proposed a low-cost real-time autonomous vehicle (Dragonfly Pod) with three modules, including LoPECS (Low-Power Edge Computing System), a CNN for real-time object detection, and a speech recognition module, on a heterogeneous multicore platform at an affordable price of $10,000 [28]. Recently, many autonomous vehicles have been connected to mobile edge computing servers, and mobile devices are used to monitor and control the services in real time. Xu et al. (2020) [29] developed a scheme based on small adjustments of the resulting task periods. The adjusted periods may not be entirely harmonic; however, they are closely related, which facilitates calculating the entire schedule up to the execution time for the final planning of the periodic tasks.

Autonomous Vehicle Service Module
The primary services of autonomous vehicles are provided by five modules: (i) Sensor and Actuator, (ii) Perception, (iii) Localization and Mapping, (iv) Path Planning, and (v) Control. Each module consists of monitoring (periodic), controller (event-driven), and complex (large computational) tasks. Figure 1 illustrates the basic structure of the proposed task management system.

Application Layer Structure
In this work, the proposed framework is designed for all three categories of task sets. We assume that these tasks are represented as tree structures with nodes and edges [30]. A periodic (sensor) task is represented as a single tree with its precedence tasks as child threads. Likewise, aperiodic and complex tasks are modeled as trees. Figure 2 (left) shows a sample periodic task graph and Figure 2 (right) shows an aperiodic task graph; the complex graph includes both periodic and aperiodic tasks, which are modeled in the same way.
Figure 2. (left) Periodic workload: the "gettemp" sensor processing task with its subtasks. (right) Aperiodic workload: the "fire alarm" task with its subtasks.
Periodic Task Graph: Each periodic task set includes four subtasks (sensing, processing, monitoring, and uploading) that run at a regular time instance 't'.

Definition 1. Each node in the periodic tree is denoted as $Per_i = \{AT_i, ET_i, D_i, t_i, Token_i\}$ and needs to be executed at the regular time instance $t_i$ (i.e., the period). Here, $Per_i$ denotes the periodic task graph, $AT_i$ is the arrival time of the periodic task, $ET_i$ is its execution time, $D_i$ is its deadline, $t_i$ is its period, and $Token_i$ is the priority bit associated with the periodic task: $Token_i = 0$ marks a regular task and $Token_i = 1$ an emergency task.
Aperiodic Task Graph: Aperiodic tasks are event-based tasks that are highly prioritized in IoT-based autonomous vehicle applications. By definition, aperiodic tasks do not recur at a steady rate. Here, the time instance represents the instant at which the triggering event occurs.

Definition 2. Each node in the aperiodic tree is denoted as $APer_i = \{e_i, AT_i, ET_i, D_i, t_i, E_i\}$ and needs to be executed at $t_i$ when triggered. Here, $APer_i$ denotes the aperiodic task graph, $e_i$ is the event triggered for a particular task, $AT_i$ is the arrival time of the aperiodic task, $ET_i$ is its execution time, $D_i$ is its deadline, $t_i$ is the event trigger instance in nanoseconds, and $E_i$ is the priority bit associated with the aperiodic task. $E_i = 1$ always, because an aperiodic task is a control task that must be executed first in real-time systems to avoid a catastrophic situation.
Complex Task Graph: Complex tasks are Localization, Path planning, obstacle detection, etc.

Definition 3. A large computational complex task $WT_i$ is denoted as $Compl_i = \{AT_i, ET_i, D_i, t_i, E_i\}$. It should be executed repeatedly by CPUs or GPUs to control the entire system and ensure safe driving.
For example, object tracking and localization are emergency tasks that need to be monitored and controlled continuously.
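The three definitions above share almost the same tuple, so they can be sketched with a single record type. The following is a minimal illustration, not the authors' implementation: the class name `TaskNode`, the helper names, and all numeric values are hypothetical, and only the field meanings (AT, ET, D, t, and the priority bit) come from Definitions 1 to 3.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskNode:
    """One node of a task tree; fields mirror the tuples in Definitions 1-3."""
    name: str
    arrival_time: float     # AT_i
    execution_time: float   # ET_i
    deadline: float         # D_i
    time_instance: float    # t_i: period, or event-trigger instant if aperiodic
    emergency: int = 0      # Token_i / E_i: 0 = regular, 1 = emergency
    kind: str = "periodic"  # "periodic", "aperiodic", or "complex"
    children: List["TaskNode"] = field(default_factory=list)


def make_aperiodic(name, arrival_time, execution_time, deadline, trigger_time):
    """Aperiodic nodes always carry E = 1, as stated in Definition 2."""
    return TaskNode(name, arrival_time, execution_time, deadline,
                    trigger_time, emergency=1, kind="aperiodic")


def count_nodes(root):
    """Number of nodes in the tree rooted at `root`."""
    return 1 + sum(count_nodes(child) for child in root.children)


# Hypothetical periodic sensor task with the four subtasks named in the text.
gettemp = TaskNode("gettemp", 0.0, 4.0, 10.0, 10.0)
for sub in ("sensing", "processing", "monitoring", "uploading"):
    gettemp.children.append(TaskNode(sub, 0.0, 1.0, 10.0, 10.0))

# Hypothetical event-driven task; its priority bit is forced to 1.
fire_alarm = make_aperiodic("fire_alarm", 2.5, 0.5, 1.0, 2.5)
```

Keeping one record type for all three task kinds is only a convenience for the sketch; the paper itself does not state how the task trees are stored.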

Constraints Involved in the Application Layer
Arrival time: The time at which a task arrives at the processor for execution is denoted by 'AT'; it varies dynamically for each task.
Deadline: The worst-case completion time of each task is denoted by 'D' and is pre-defined before execution.
Response time: The duration between the release time and the finish time is the total response time of task 'i' on processor 'j'.
Execution time: The worst-case processing time of task 'i' during processor execution is denoted by 'ET'.
RT task-to-processor mapping: Allocating each node (i.e., task) to a particular processor for execution is a mapping problem. Designing an optimal mapping for an RT-IoT system is critical because memory, processors, and network availability are limited. In this paper, a dynamic core mapping algorithm based on the SLFN predictor is developed for real-time task allocation on multicore processors.
RT task scheduling: A feasible or optimal order of task execution is predefined in the RT-IoT system. In this paper, the earliest hyper period first (EHF) scheduling sequence is designed and experimentally validated with real-time periodic and aperiodic task sets.

Hardware Layer Multicore Processor System-On-Chip
In this paper, we assume that CPUs are represented as little cores $\{L_1, L_2, \ldots, L_k\}$ and GPUs as big cores $\{B_1, B_2, \ldots, B_g\}$, each with its respective operating frequencies and voltage levels. Two heterogeneous multicore SoCs are targeted as hardware clusters for implementing the proposed intelligent task management framework: hardware configuration 1 (Odroid Xu4 SoC: "4L1b, 3L1b, 0L1b") and hardware configuration 2 (Nvidia Jetson SoC: "8L0b, 0L384b, and 8L384b"). The targeted heterogeneous multicore architecture comprises 16 CPU cores and 384 GPU cores. The number of big and little cores in each cluster differs based on the core combination. The clusters used for dataset preparation are $CL = \{4L1b, 3L1b, 0L1b, 8L0b, 0L384b, 8L384b\}$, where "L" denotes a CPU core and "b" represents a GPU core. The operating voltages of the entire system are denoted as $Vl = \{Vl_1, Vl_2, \ldots, Vl_V\}$, $v \in k, g$, and the frequencies as $fl = \{fl_1, fl_2, \ldots, fl_f\}$, $f \in k, g$.

Intelligent Task Management for IoT Based AV
In the proposed framework, tasks are allocated to an optimal hardware cluster predicted by the learning networks and are then scheduled by the EHF algorithm. The SLFN and LW-DNN supervised learning predictors select the best hardware cluster based on the task and execution constraints listed in Table 1. An overview of the proposed model is illustrated in Figure 3.

Formation of RT-IoT with embedded systems is discussed along with its current challenges in hardware/software, task scheduling, and security. Multicore platforms and IoT networks are studied. IoT tasks with mixed criticalities.
[22] Fixed-priority and rate-monotonic scheduling algorithms are designed with optimized period bounds. Hardware is not the focus of this work. IoT tasks with mixed criticalities (control applications are modeled as periodic task sets). Resource utilization is improved with harmonic period selections.
HMPSoCs are utilized for evaluation. Autonomous vehicle workloads are considered. Resources are utilized up to 100%. Execution time is improved based on the appropriate selection of processors.
[24] Energy consumption is targeted in this work.

Resource Prediction Model
Researchers recently developed deep learning-based optimization algorithms to enhance application mapping performance on multicore architectures [31]. Traditional scheduling techniques are limited in that they target only a single metric, such as energy, power, or time constraints, and they do not observe the task miss ratio, which is the most significant metric in RT-IoT systems. In this paper, we developed two supervised learning algorithms, LW-DNN and SLFN. The DNN is adopted from previous work and is used for comparison.

Feature Extraction
We executed IoT-based autonomous vehicle tasks on two hardware configuration clusters, hardware configuration 1 (Odroid Xu4 SoC: "4L1b, 3L1b, 0L1b") and hardware configuration 2 (Nvidia Jetson SoC: "8L0b, 0L384b, and 8L384b"), with a mixed category of CPUs and GPUs. Table 2 describes the significant features used for selecting the appropriate cores for execution.
IoT and autonomous vehicle benchmark programs are repeatedly executed on each cluster, and their execution characteristics are examined. Workload characteristics are extracted in terms of execution features and optimized in this phase.
As mentioned above, each workload is executed on every hardware cluster and the PC values are observed, giving 10 K data samples in total. The primary purpose of the feature preprocessing algorithm is to minimize the time complexity and training complexity. We developed a feature optimization algorithm similar to that of the previous work [31]. Table 1 shows the optimized features used for core selection. These significant features are the inputs that the resource predictors use to select the optimal core at runtime. The application and execution features are recognized automatically and fed forward into the modeling process as input vector values.

Modeling of SLFN Predictor
A single-hidden-layer feedforward neural network prediction model is developed to predict the best hardware cluster for each task execution in order to optimize resource utilization and power consumption. Machine learning and deep learning classifiers and predictors have become popular in IoT, computer vision, medical devices, and autonomous vehicle applications. In this work, we adopted a single-layer feedforward network (G.-B. Huang, 2006), trained as an extreme learning machine (ELM), to predict the multicore cluster among "4L1b, 3L1b, 0L1b, 8L0b, 0L384b, and 8L384b". The best hardware cluster for each task is selected based on the task emergency value and the core utilization value. Based on this prediction, each ready task arriving from the queues is allocated to and executed on the respective cores. In this network, the 'J' neurons in the hidden layer operate with a differentiable activation function (e.g., ReLU or sigmoid), whereas the output layer is linear. The hidden layer in an ELM does not need to be tuned: its weights (including the bias weights) are assigned randomly and can even be generated in advance. This does not mean that the hidden neurons are unimportant; rather, they require no tuning. Figure 4 depicts the basic structure of the SLFN resource predictor, in which the input features are denoted as 'X', the hidden-layer nodes as 'H', and the output scores as 't1 to tm'.
For an SLFN resource predictor, the system output before the training set data is processed is given by Equation (9), in which 'G' is the general sigmoid function with node parameters (a, b). The hidden-layer output vector µ is used to minimize the training error and connects the hidden layer to the output, as given in Equations (10) and (11), where ∂ is the hidden-layer (unknown) output matrix, 'T' denotes the target matrix of the training set, and H+ denotes the Moore-Penrose inverse of H. This solution can also be written in an equivalent form, from which the output function is obtained; in that form, 'c' is the approximate constant and 'OP' is the output matrix. ELM uses the kernel function to yield good accuracy. The significant advantages of the ELM are minimal training error and better approximation. ELM relies on automatically assigned weights and biases and on non-zero activation functions; a detailed description of the ELM equations can be found in the literature [32]. The pseudocode for the ELM is shown in Algorithm 1.

Algorithm 1: SLFN Resource Predictor
Input: Trained dataset with a random number of weights and bias values
Output: Predicted score in terms of optimal core and frequency parameter
1: The network is initialized with input neurons 'X'.

The parameters of ready tasks, such as arrival time, Worst Case Execution Time (WCET), deadline, and period, together with the execution parameters, are entered as input features, indexed from 1 to N, to the developed SLFN network. The proposed SLFN resource predictor is a supervised learning network that is trained statically with the execution characteristics of several sensor and actuator workloads along with hardware configuration labels. Figure 5 illustrates the working procedure of the SLFN resource predictor for a sample task set. Initially, the task features are fed forward into the SLFN model to obtain the best processor. The sample task graph is executed on hardware configuration 1.
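The ELM-style training described in this section (random, untuned hidden weights and output weights solved with the Moore-Penrose inverse) can be sketched as follows. This is the standard ELM recipe under stated assumptions, not the authors' code: the function names, the 30 hidden nodes, the sigmoid activation, and the toy feature data are all hypothetical, and only three of the six cluster labels are used to keep the example small.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_elm(X, T, n_hidden, rng):
    """Fit an ELM. X: (samples, features); T: one-hot targets (samples, classes)."""
    A = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never tuned
    b = rng.normal(size=n_hidden)                # random hidden biases, never tuned
    H = sigmoid(X @ A + b)                       # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                 # output weights: beta = H^+ T
    return A, b, beta


def predict_elm(X, A, b, beta):
    """Return the index of the highest-scoring cluster for each row of X."""
    return np.argmax(sigmoid(X @ A + b) @ beta, axis=1)


# Toy data (hypothetical): three well-separated groups of 4-feature task
# vectors, one group per cluster label.
CLUSTERS = ["4L1b", "3L1b", "0L1b"]
rng = np.random.default_rng(0)
centers = 4.0 * np.eye(3, 4)
y = np.repeat(np.arange(3), 50)
X = centers[y] + 0.5 * rng.normal(size=(150, 4))
T = np.eye(3)[y]

A, b, beta = train_elm(X, T, n_hidden=30, rng=rng)
accuracy = np.mean(predict_elm(X, A, b, beta) == y)
print("training accuracy:", accuracy)
print("first prediction:", CLUSTERS[predict_elm(X[:1], A, b, beta)[0]])
```

Because the only fitted quantity is `beta`, training reduces to one pseudo-inverse, which is the source of the low training cost that motivates using an SLFN on embedded hardware.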

Task Scheduler Method Earliest Hyperperiod First Scheduling Sequence
In this work, mixed-category workloads are allocated to optimal cores for execution. Liu and Layland developed optimal uni-processor algorithms such as RM and EDF, which compute the hyper period of ready tasks with respect to the total processor utilization (i.e., total execution cycles). Likewise, Sherik et al. acknowledged that individual task periods play a vital role in resource consumption complexity. To address this issue, the authors developed an adaptive hyper period generator to optimize each task period. Motivated by this, our algorithm first identifies the optimized hyper period for each task; then, in order of earliest hyper period, each task is executed on its allocated processor. The optimal core executes aperiodic tasks: if the optimal core is running a high-priority periodic or complex task, that task enters the suspended state and the aperiodic task is executed whenever it arrives. This pre-defined condition is assumed in order to avoid failure of the RT-IoT system. Each task graph includes 'N' task nodes drawn from the periodic, aperiodic, or complex sets. A task node includes essential parameters such as the WCET of task ω_1, the task inter-arrival instance (denoted by the period), and the hyper period (i.e., the completion time of each task set); core utilization is calculated based on Equation (8).
Cluster utilization is given by Equation (9). These two factors determine whether a given task set ω is schedulable on a given CPU with 'M' cores.
The hyper-period $H\omega_i$ of a set of n tasks ω can be found as $H\omega_i = \mathrm{LCM}(t_i)$ for $1 \le i \le n$, where LCM is the least common multiple of the tasks' periods and $t_i$ is the period of an individual task. Task nodes with the earliest hyper period are executed first from the allocated queue. As mentioned, Liu and Layland fixed the schedulability criterion as TU ≤ 1. A problem arises when a task set passes the schedulability test but the embedded IoT devices do not have sufficient resources to execute it, because the tasks require more clock cycles than the devices can support. In this work, we assume that if $H\omega_i \le Threshold_M$, the entire task set is schedulable, where the threshold is the total load capacity of core 'M' at runtime.
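The quantities used in this test can be sketched in a few lines. The hyper-period follows the LCM definition in the text; the utilization formulas are the conventional ones (WCET divided by period, summed over tasks, and shared across the cores of a cluster) and are an assumption here, since the paper's Equations (8) and (9) are not reproduced. Function names, the integer millisecond periods, and the threshold values are hypothetical.

```python
from functools import reduce
from math import gcd


def hyper_period(periods):
    """Least common multiple of integer task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def total_utilization(tasks):
    """Sum of WCET / period over (wcet, period) pairs (assumed form of TU)."""
    return sum(wcet / period for wcet, period in tasks)


def cluster_utilization(tasks, n_cores):
    """Assumed form: total utilization shared evenly by n_cores."""
    return total_utilization(tasks) / n_cores


def is_schedulable(tasks, n_cores, threshold):
    """Utilization test plus the hyper-period threshold described in the text."""
    periods = [period for _, period in tasks]
    return (cluster_utilization(tasks, n_cores) <= 1.0
            and hyper_period(periods) <= threshold)


# Hypothetical task set: (WCET, period) in milliseconds.
tasks = [(1, 4), (2, 6), (3, 12)]
print(hyper_period([p for _, p in tasks]))
print(total_utilization(tasks))
```

For this task set the hyper-period is 12 and the total utilization is 5/6, so the set passes on one core whenever the cluster threshold is at least 12 and fails the threshold check otherwise, even though TU ≤ 1 holds.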

Proposed SLFN_EHF Scheduling Algorithm
Three modules form the entire RT-IoT system, which is elaborated below (Algorithm 2).

Algorithm 2: Task Scheduling Mechanism
Input: Periodic, Aperiodic, and Complex tasks
7: Randomly generated bias and weights for SLFN at runtime.
8: Activation function = relu, optimizer = adam
9: Trained with workload characteristics obtained from execution on the Nvidia SoC
10: Ready tasks are fed forward to the SLFN resource predictor to predict the optimal hardware processor.
11: The optimal core is predicted in the final layer using the softmax function.
12: Tasks are allocated to the optimal core (i.e., its queue) given by the SLFN network.
13: Pre-assumed condition: aperiodic tasks alone are executed immediately on the current processors to avoid system failure.
14: Periodic and complex tasks on each core are scheduled and executed based on the hyper period method.
15: Hyperperiod $H_p = \mathrm{LCM}(t_i)$, $1 \le i \le n$
16: Tasks are sorted in non-decreasing order of $H_p$.
17: if $H\omega_i < Threshold_M$ then
18:   The task set is schedulable and is executed on the predicted processor
19: else
20:   The task set is moved to the waiting queue and is executed on the base cluster
21: end
22: Task miss rate = missed tasks / total tasks on each queue
23: Execution time is calculated based on the CPU processing time for the entire process.
24: The results of every iteration are updated to the cloud server automatically at a regular time interval.
25: End
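Steps 13 to 21 of Algorithm 2 can be sketched as a small dispatch routine. This is an illustrative reading rather than the authors' implementation: the `TaskSet` type, the function names, the threshold values, and the choice of "0L1b" as the base cluster are hypothetical, the predicted cluster stands in for the SLFN output, and the "current processor" of an aperiodic task is simplified to its predicted cluster. The emergency condition discussed in the prose is not modeled; only the threshold check is.

```python
from dataclasses import dataclass
from functools import reduce
from math import gcd
from typing import Dict, List, Tuple


@dataclass
class TaskSet:
    name: str
    periods: List[int]        # integer periods of the tasks in this set
    predicted_cluster: str    # stand-in for the SLFN prediction
    aperiodic: bool = False


def hyper_period(periods):
    """Least common multiple of the periods (step 15)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def ehf_dispatch(task_sets: List[TaskSet], thresholds: Dict[str, int],
                 base_cluster: str) -> List[Tuple[str, str]]:
    """Return (task set name, cluster) pairs in execution order."""
    # Step 13: aperiodic task sets are dispatched immediately.
    plan = [(ts.name, ts.predicted_cluster) for ts in task_sets if ts.aperiodic]
    # Step 16: remaining sets in non-decreasing order of hyper-period.
    rest = sorted((ts for ts in task_sets if not ts.aperiodic),
                  key=lambda ts: hyper_period(ts.periods))
    waiting = []
    for ts in rest:
        # Steps 17-20: run on the predicted cluster if under its threshold,
        # otherwise defer to the waiting queue served by the base cluster.
        if hyper_period(ts.periods) < thresholds[ts.predicted_cluster]:
            plan.append((ts.name, ts.predicted_cluster))
        else:
            waiting.append((ts.name, base_cluster))
    return plan + waiting


task_sets = [
    TaskSet("path_planning", [20, 50], "8L384b"),
    TaskSet("gettemp", [4, 6], "4L1b"),
    TaskSet("fire_alarm", [1], "4L1b", aperiodic=True),
    TaskSet("lane_detection", [30, 45], "8L384b"),
]
thresholds = {"4L1b": 50, "8L384b": 95}   # hypothetical load capacities
plan = ehf_dispatch(task_sets, thresholds, base_cluster="0L1b")
print(plan)
```

In this example the hyper-periods are 12, 90, and 100, so "path_planning" exceeds the threshold of its predicted cluster and is the only set deferred to the base cluster.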
Tasks are allocated to the predicted processor, which is optimal for each task at runtime, based on two conditions verified by the task manager: the hyper-period metric must be less than the threshold of the active cluster, and the task's emergency must be very high. If these conditions are satisfied, the task graph is allocated to the optimal cluster selected by the proposed SLFN resource predictor. Otherwise, the task graph is placed on the waiting list and executed on the base cluster. An optimal cluster is activated only if tasks are allocated to it; otherwise, it is deactivated (idle state). Both hardware configurations are activated initially, and high-processing tasks such as obstacle detection, lane detection, and path planning are executed on hardware configuration 2. Figure 6 illustrates the working procedure of the SLFN_EHF task management technique.

This section presents three sub-modules, namely the simulation environment, the implementation setup, and the real-time test description, used to develop and test the proposed mechanism. Results and discussion follow, and the corresponding performance graphs are shown below.

Simulation Environment
The Python IDE (version 3.8) with machine learning libraries is used to develop the learning predictors and is installed on the two embedded kits and PCs. Task Graphs For Free (TGFF) [33] is used to generate the synthetic real-time task graphs and their constraints. The "psutil" and "perf" tools are used to evaluate the performance metrics after each execution.

Implementation Setup
In this work, we used two embedded multicore boards, the "Odroid Xu4" and the "Nvidia Jetson Xavier" [34], to evaluate and validate the proposed models. Jetson modules include an on-board power monitor, the INA3221, which monitors the voltage and current of certain rails. Table 3 shows the details of the hardware configuration boards.

Realtime Benchmark Programs
The IoMT [35] and autonomous vehicle applications comprise taskset-1: the Object Tracking and Path Planning programs, which are developed in Python. These are standard autonomous vehicle applications from the CAV benchmark [36]. The IoMT database includes nine programs named taskset-2: Activity, Imhist, Aes, Iradon, Squ_wab, Hrv, Lzw, and Adpet. These programs are repeatedly executed on both embedded kits to observe their characteristics in terms of the features listed in Section 3. Various sensor- and actuator-based programs, taskset-3: "getpressure, gethumidity, getmotion, gettemp, LED", are used for data collection. A synthetic database is also generated using TGFF and the pseudorandom values listed in Table 4. In total, more than 10k data samples are used for training and testing. Tables 5 and 6 depict the parameters used for the synthetic and benchmark workloads along with their task descriptions.

Results and Discussion
The proposed algorithms are validated using several scheduling metrics: task miss rate, execution time (makespan), cluster utilization, and prediction accuracy. These metrics are estimated as follows.
The accuracy rate of core prediction, φ, is µ, the predicted number of cores, divided by ρ, the actual number of active cores (φ = µ/ρ).
The task miss rate (TMR) is calculated by dividing the number of missed task instances by the total number of completed task instances of the particular task graph (ω).
The makespan is the total time taken to complete the entire set of tasks (M) on the entire set of allocated cores (N). These metrics are estimated for each task graph, and the values are normalized.

Tables 7 and 8 show that the proposed SLFN network achieved better results than the other machine learning algorithms, where all strategies were tuned for the problem at hand. LW-DNN and SLFN achieved nearly 96-98% accuracy for both synthetic and real-time benchmarks. SLFN achieved better performance for sensor and actuator workloads in real time. These results are obtained through the intelligent selection of processors and the best order of execution. A few workloads, such as object detection and edge video analysis, utilized only hardware configuration 2, and the resources are optimized through this appropriate selection and utilization. Synthetic workloads are executed on the appropriate processors selected by the proposed resource predictor. The accuracy reaches nearly 96.3 to 98% for synthetic workloads, which make heavy use of hardware configuration 1. Figures 7 and 8 illustrate the resource prediction accuracy obtained for synthetic task sets and benchmark workloads at runtime. The proposed SLFN_EHF task scheduler achieved nearly 97.5% accuracy on average for benchmark workloads.

Figure 9 illustrates the task miss rate observed for synthetic workloads executed on hardware configuration 1. The TMR is reduced by 15-22% on average compared to traditional schedulers, due to the selection of the best processors for ready tasks at runtime. Figures 11 and 12 show the overall execution time (makespan) observed for both synthetic task sets and benchmark workloads. The proposed task schedulers reduced the makespan by nearly 27-32% compared with traditional algorithms. The makespan of taskset-3 is nearly equal for all methods because of its low execution cost compared to the other task sets.
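The metrics used in this evaluation (prediction accuracy, TMR, makespan, and the relative reduction against a baseline scheduler) can be computed as in the following minimal sketch. The function names and the example numbers are hypothetical. The makespan function additionally assumes that tasks assigned to the same core run back to back without idle time, so the makespan is the largest per-core load; the paper's own schedule length equation may differ.

```python
from typing import Dict, Hashable, Sequence


def prediction_accuracy(predicted_cores: int, actual_cores: int) -> float:
    """phi = mu / rho: predicted cores divided by actual active cores."""
    return predicted_cores / actual_cores


def task_miss_rate(missed_instances: int, total_instances: int) -> float:
    """TMR: missed task instances over total task instances of one graph."""
    return missed_instances / total_instances


def makespan(exec_times: Sequence[float],
             core_of_task: Sequence[Hashable]) -> float:
    """Completion time of the last core, assuming no idle gaps on a core."""
    load: Dict[Hashable, float] = {}
    for t, core in zip(exec_times, core_of_task):
        load[core] = load.get(core, 0.0) + t
    return max(load.values())


def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(prediction_accuracy(7, 8))
    print(task_miss_rate(3, 20))
    print(makespan([4.0, 2.0, 3.0, 1.0], [0, 0, 1, 1]))
    print(relative_reduction(10.0, 7.5))
```

Note that φ as defined can exceed 1 when more cores are predicted than are actually active, so a reported accuracy percentage would need a convention for that case.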
The overall schedule length has been measured as per Equation (12). Figures 13 and 14 show the normalized task miss rate and execution time obtained with the proposed predictor-based task scheduler. Both synthetic and benchmark workloads are executed on hardware configuration 2. A few synthetic task sets complete within nearly microseconds on hardware configuration 2 because of its high speed and large number of GPUs. The SLFN core predictor with the EHF task scheduler outperformed the traditional CFS [37] and Fairness [38] algorithms. On each cluster, the proposed algorithm achieved an average reduction of 20-27% in makespan and 32-45% in task miss rate compared with the CFS and Fairness schedulers.

Conclusions
Autonomous vehicle and IoT applications have become ubiquitous in recent years. Sensors and actuators are the core components of these applications. The software challenges of task distribution, appropriate allocation, and on-time execution without delay remain NP-hard problems in real-time systems. This paper developed an intelligent kernel that comprises the SLFN core predictor and the EHF task scheduler algorithm for IoT-based autonomous vehicles. Its essential purpose is to minimize the task miss rate and the overall schedule length through an intelligent mapping and execution kernel. The proposed framework improves scheduling performance by selecting the optimal hardware configuration, in terms of core type and frequency-voltage level, for each workload at runtime and executing the workload on it.
Additionally, two different hardware setups have been used for experimental validation of the proposed system's performance. The proposed SLFN core predictor is compared with the LW-DNN core predictor from previous work and achieves better accuracy, nearly 96-98%, together with an average reduction of 27% in execution time and 45% in task miss rate. A limitation of the proposed method is that it targets only IoT-based autonomous vehicle workloads and embedded workloads for training and testing; it can be extended with other multimedia benchmarks in future work.