A Context-Aware IoT and Deep-Learning-Based Smart Classroom for Controlling Demand and Supply of Power Load

Abstract: With the demand for clean energy increasing, this paper presents novel research on providing sustainable, clean energy for a university campus. The Internet of Things (IoT) is now a leading factor in saving energy. With added deep learning for action recognition, IoT sensors implemented in real-time appliances monitor and control the excess usage of energy in buildings. This gives an extra edge in digitizing energy usage and, ultimately, reducing the power load on the electric grid. Here, we present a novel context-aware architecture for energy saving in classrooms, combining IoT sensors and video action recognition. Using this method, we can save a significant amount of energy in buildings.


Introduction
With rising demand for clean energy for environmental sustainability, the Korea Electric Power Corporation funded a regional power grid project to build a Smart Energy Campus. This was a three-year project with an energy-efficiency concept to conduct research on energy management. In this paper, we combined the Internet of Things (IoT) and context-aware sensors, such as temperature, humidity, and luminance sensors, along with deep learning for human action recognition, to minimize energy consumption [1]. We were able to combine sensor data sets with video and image data sets to predict the temperature, humidity, and luminance in a smart classroom during class hours. Demand for clean energy has increased the social demand for renewable energy, while the cost of the relevant technologies is coming down; therefore, the use of renewable energy for energy production has increased, which will have an impact on the current electrical system. Consumers can benefit from renewable energy by limiting their usage of the main power system. The use of renewable energy is likely to lessen the emissions associated with buildings and helps to decrease the energy demand from the main electrical grid. The Internet of Things (IoT) provides networked connections among the physical appliances used in daily human life, as well as buildings and machines. This has made IoT important in energy management schemes for clean energy [2]. Today, there is growing interest in IoT for smart grids; traditional power grids need to be upgraded into smart grids. An IoT-aided smart grid system addresses existing energy wastage by providing connectivity, automation, and tracking [3]. An IoT-based framework integrated at the user end for monitoring energy usage, and the improvements a smart grid gains after adopting IoT technology, are addressed for buildings and power grids in [4].
IoT devices build a communication channel between energy-consuming and energy-producing devices over the Internet, which in turn enables the adjustment of energy consumption and production, as well as optimization to reduce energy bills. The impact and challenges of IoT in transforming electric power and energy systems are addressed in [5]. Similarly, the major integration challenges faced by utilities with high levels of distributed energy resources (DERs) and microgrids, and the current approaches for managing these issues, are analyzed in [6]. The application of an IoT-enabled human-in-the-loop energy management system is addressed in [7], with a proper framework for the transition toward smart buildings. In distributed energy systems, progress is required to meet consumer demand with proper demand management. Some of these programs can be addressed with smart operation and sustainable edge computing so that energy can be fully utilized [8,9]. Since the evolution of the Internet, we have seen a rapid increase in demand for IoT-related sensors, and many studies have been conducted on the Internet of Things and sensors for energy management. Recently, the concept of context-awareness sensors and their utility has been garnering attention in technological fields as high-speed Internet reaches the market. The main objective of this research paper is the combination of context-awareness sensors and artificial intelligence for reducing energy consumption. This paper gives a clear idea of how context-awareness sensors and video data are combined in a context-aware architecture. Most home appliances are adjusted to minimize energy consumption through the installation of smart air conditioners and dehumidifiers [10]. The basic idea is to create a database that helps in monitoring and control by connecting all systems to the Internet to save energy [11]. User activities have a significant impact on the amount of energy consumed in a building or classroom.
Lai et al.'s work showed how energy is related to appliances and human activities, and how much energy we can save if we connect appliances according to user needs [12].
We propose a novel concept for saving energy by combining sensors' real-time data and user activity with machine learning. This framework provides a proactive service in the classroom to reduce energy usage. Three different environmental sensors related to energy usage in buildings are installed to measure temperature, humidity, and luminance, and a video camera records students' activities. Few studies have combined real-time sensors and video data for saving energy. Our hardware comprised a Raspberry Pi computer and context-awareness sensors for temperature, luminance, and humidity, and the software architecture was built on Python (TensorFlow) and MySQL. We present a novel context-aware architecture that combines deep learning C3D and LSTM models to save energy in a classroom environment. This paper provides a solid idea of the utility of video and sensor data and a framework for a new action recognition concept. Our main contributions are as follows.
(a) We present a deep learning and sensor architecture for a context-awareness environment with: • Temperature, humidity, and luminance sensors monitoring the classroom environment; • Video sensors recording the students' activities in the classroom for recognition; • A hardware and software architecture for dataset collection at minimal cost and time.
(b) We present a multimodal approach with sensors of different characteristics and frequencies: • With the help of the Raspberry Pi computer, we were able to combine all the different sensors for data collection; • Hence, it is an optimal solution for easy sensor data collection.
(c) We propose a combined neural network approach in the field of context awareness, fusing different sensors' real-time data and motion video data for human action recognition for energy management. This is an essential step in combining IoT and neural networks. (d) We successfully trained and executed the neural network for transfer learning on the lab-action data sets. (e) Finally, we were able to reduce the load at peak hours with proper energy management.
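As a concrete illustration of the multimodal collection in (b), the sketch below shows how readings from sensors with different characteristics can be fused into one timestamped record on the Raspberry Pi. The reader callables are hypothetical stand-ins for the actual GPIO/I2C driver calls, which this paper does not specify.

```python
import time
from dataclasses import dataclass

@dataclass
class SensorReading:
    """One fused record of the three context-aware sensors."""
    timestamp: float
    temperature: float  # degrees Celsius
    humidity: float     # percent relative humidity
    luminance: float    # lux

def collect_reading(read_temp, read_humidity, read_luminance):
    """Poll all three sensors and fuse them into one timestamped record."""
    return SensorReading(
        timestamp=time.time(),
        temperature=read_temp(),
        humidity=read_humidity(),
        luminance=read_luminance(),
    )

# Example with stubbed callables standing in for the real sensor drivers:
reading = collect_reading(lambda: 22.5, lambda: 35.0, lambda: 300.0)
```

Because each sensor is wrapped behind a plain callable, sensors with different polling frequencies can be sampled independently and only fused at record time.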

Related Works
Here we briefly survey work on smart sensors for energy management. Several intelligent models have been proposed in IoT over the past decade; however, in the context-awareness field, research until now has been confined to IoT alone [10,13]. Context-awareness systems have so far focused on location-based [14] and object- and ontology-oriented approaches [15]. Byun and Park investigated context awareness for energy saving [16]; they mainly focused on self-adaptive intelligent gateways and sensors [10]. Meanwhile, Oscello et al. [17,18] proposed a framework to identify energy wastage in context-awareness environments. They mainly explained how energy is wasted without user benefit, but none of this work relates action recognition to video data. Many works involve sensors in different sectors, such as marketing, wearable sensors, home sensors, smart cities, context-aware environments, and smart enterprises.

Energy Control/Management
Significant work has been done in energy control and management with light control [19], providing ideas for controlling home lights that help in reducing costs and conserving energy by turning lights off when not in use. Similarly, smart homes with wireless networks [20] add advantages in living conditions. Smart air-conditioner control [21] presents algorithms for real-time contextual information from IoT devices. Addressing the significant impact of IoT on smart grids using the European Telecommunications Standards Institute (ETSI) reference model and its impact on demand-side management, [22] introduces a wide research area that can be implemented in smart grids with IoT. That chapter aims to identify uses of IoT and IoT-enabled technology in secure smart grid design. It also shows the importance of IoT in decreasing the operational and maintenance costs of establishing communication between objects and humans. Finally, it describes IoT applications in five different domains: transmission, distribution, operation, generation, and customer.

Activity Detection
For activity detection, [23] introduces a device-oriented IoT energy management system and describes how user activity can be determined using a device-block selection mechanism and how the relation between them is constructed. Convolutional neural networks have made huge progress since the rapid development of GPU and CPU clusters for training on big data, and they have produced strong breakthroughs in visual recognition [24,25]. ConvNets have been applied to problems in human pose estimation in both images and videos [26,27]. More interestingly, these deep networks are used for image feature learning [28]; similarly, Zhou et al.'s work performed well on transfer learning tasks. Deep learning has been applied to video feature learning in an unsupervised setting [29]. Action recognition performed better as feature extraction became more precise [30,31]. Some classification can be automated or extracted through the learning process, as in [32-35]. Karpathy et al. [36] used spatiotemporal data for feature extraction in activity detection. Ji et al. [37] proposed a 3D Convolutional Neural Network (CNN)-based human detector and head tracker to segment human subjects in videos. Sun et al. [38] proposed a factorization of 3D CNNs and exploited multiple ways to decompose convolutional kernels. The Convolutional 3-Dimensional (C3D) method took this idea a step further with the help of GPU memory [39]. Activity recognition with spatiotemporal methods has improved with different pattern and skeletal body detection. Deep convolutional neural networks have shown higher accuracy on action recognition performed on still images [40-43]. A detailed study with a Kinect-based action recognition algorithm differentiating feature extraction methods was presented in [44].

Objective of Energy Saving System
The impact of the Internet of Things (IoT) on energy management has been very successful in recent years. We develop the concept of combining context-aware sensors and video sensors for action recognition to reduce the power load of buildings. This paper presents a novel solution combining sensors and deep learning for energy saving. Many research articles focus only on IoT and sensors; here, we introduce deep learning into the existing system architecture. We categorize our architecture into two sections. One section is the video camera section, where we collect video of students for action recognition. The collected video is subdivided into 11 short video clips covering four different student actions: entering, standing, sitting, and going out of the classroom. The video clips are then converted into images with specified frames, as discussed in the section on video-to-image data preparation. After our system recognizes the student activity, it combines the result with the predicted sensor data sets of temperature, humidity, and luminance and decides whether to control the appliances according to the context for energy management, which includes controlling the air conditioning, heating, and lighting in the classroom. This process takes data readings from the context-aware sensors every 10 minutes. Figure 1 explains the process of data collection and recognition with IoT to control excess energy usage. We did two different experiments to predict the outputs for the video and sensor data sets: we applied the Convolutional 3-Dimensional (C3D) model for action recognition and Long Short-Term Memory (LSTM) to predict the sensor output.
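The 10-minute cycle described above can be sketched as a skeleton loop. This is an illustrative outline only: the callables are placeholders for the real sensor readers, the recognition model, and the appliance controller, none of which the paper specifies at this level.

```python
import time

def monitoring_cycle(read_sensors, recognize_action, decide, apply_controls,
                     period_s=600, cycles=1, sleep=time.sleep):
    """One or more passes of the sense -> recognize -> decide -> act loop,
    repeated every `period_s` seconds (600 s = the paper's 10-minute cadence)."""
    for _ in range(cycles):
        readings = read_sensors()                 # temperature, humidity, luminance
        action = recognize_action()               # one of the four student actions
        apply_controls(decide(action, readings))  # switch AC / heating / lights
        sleep(period_s)

# Demo with stubbed components; a real deployment would plug in the
# sensor drivers, the C3D recognizer, and the appliance relays here.
log = []
monitoring_cycle(
    read_sensors=lambda: {"temperature": 26.0},
    recognize_action=lambda: "sitting",
    decide=lambda action, r: {"ac": action == "sitting" and r["temperature"] > 25},
    apply_controls=log.append,
    sleep=lambda s: None,                         # skip real waiting in this demo
)
```

Injecting `sleep` as a parameter keeps the loop testable without waiting out the real 10-minute period.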

Software and Hardware Setup for Data Processing
For data collection we set up a context-aware sensor laboratory. Sensor data were collected through a Raspberry Pi board, and students' movements were recorded with a camcorder. We used low-power context-aware sensors for collecting temperature, humidity, and luminance data, and MySQL was used to manage the database. Because video-to-image data require large data sets for proper efficiency, we combined our data with the C3D training data sets to make our data sets large enough for training and testing the action recognition experiment. We used a Passive Infrared (PIR) sensor covering 360°, the maximum area, to detect motion. Similarly, temperature, humidity, and luminance sensors collected the respective data, which vary as students enter the classroom. Figure 2 shows how the sensors were placed for data collection. The sensors covered all the available space inside the room so that the error margin was lowered. PIR sensors were placed to capture all motion, and the readings were collected on the server in the MySQL database.
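The logging path from the Raspberry Pi into the database can be sketched as below. SQLite stands in here for the MySQL server used in the actual setup (the SQL is near-identical), and the table and column names are illustrative assumptions, not the paper's schema.

```python
import sqlite3
import time

# In-memory SQLite as a stand-in for the lab's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE sensor_log (
    ts REAL, temperature REAL, humidity REAL, luminance REAL, pir_motion INTEGER)""")

def log_reading(conn, temp, humid, lum, motion):
    """Insert one timestamped sensor record, as the Pi would over Wi-Fi."""
    conn.execute("INSERT INTO sensor_log VALUES (?, ?, ?, ?, ?)",
                 (time.time(), temp, humid, lum, int(motion)))
    conn.commit()

log_reading(conn, 22.5, 35.0, 300.0, True)
rows = conn.execute("SELECT COUNT(*) FROM sensor_log").fetchone()
```

Parameterized inserts (the `?` placeholders) keep the logging path safe and let the same statement be reused for every reading.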

Transfer Learning with Lab-Action Data Sets
Transfer learning is a common method of training a small domain data set together with a larger domain data set. In practice, the larger data set usually comes from a domain where feature extraction can be leveraged effectively. Figure 3 shows how we combined UCF-101, a big-data domain, with our lab-action data domain via transfer learning. Transfer learning with the UCF-101 data sets poses a great challenge, since UCF-101 consists of single-person actions while the lab-action data sets contain multi-person actions. Transfer learning helped in building the network with knowledge-sharing concepts, which enabled us to train our data sets with learned features. In this paper, we implemented transfer learning to train our lab-action data sets with the big domain data set of UCF-101 images. Applying transfer learning with C3D, we first captured the feature vector, and then redefined our own convolution function with an extra fully connected layer and conv-128 layers, retraining only on the lab-action feature vectors. We used conv-128 layers to increase efficiency, since the image data sets we added for transfer learning did not need large filter layers. With fully connected layers at the end, we could carry out the classification. This made it easy to transfer the knowledge into our network for the new task. Finally, we obtained an output accuracy of 60% for the students' multi-action recognition. The results are satisfactory given the size of the data sets we trained on, and in future work we will do the same with larger data sets and multiple actions.
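The essential structure of this transfer-learning step, a frozen pretrained feature extractor with only a small new head retrained on the target data, can be sketched in NumPy. This is a toy stand-in, not the actual C3D network: a fixed random projection plays the role of the frozen convolutional layers, synthetic clusters play the role of the four lab-action classes, and only the final fully connected layer is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "feature extractor": a fixed random projection standing in for the
# pretrained C3D convolutional layers (illustrative only).
W_frozen = rng.standard_normal((2048, 128))

def extract_features(clips):
    """Map flattened inputs (n, 2048) to frozen ReLU features (n, 128)."""
    return np.maximum(clips @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy lab-action data: four Gaussian clusters, one per student action
# (entering, standing, sitting, going out), 50 samples each.
means = rng.standard_normal((4, 2048))
X = rng.standard_normal((200, 2048)) + np.repeat(np.eye(4), 50, axis=0) @ means
y = np.repeat(np.arange(4), 50)

# Extract features once (the extractor stays frozen), then standardize.
feats = extract_features(X)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)

# Train only the new fully connected head, mirroring the idea of retraining
# just the added conv-128/FC layers on the lab-action feature vectors.
n_classes = 4
W_head = np.zeros((128, n_classes))
onehot = np.eye(n_classes)[y]
for _ in range(300):
    probs = softmax(feats @ W_head)
    W_head -= 0.1 * feats.T @ (probs - onehot) / len(y)

accuracy = (softmax(feats @ W_head).argmax(axis=1) == y).mean()
```

Because the gradient only ever touches `W_head`, the pretrained features are reused unchanged, which is exactly what makes the small lab-action data set trainable at all.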

Context-Aware Network
The complete architecture in Figure 4 shows how our proposed context-aware network with transfer learning works. Two different architectures work in parallel to collect data and to predict the sensor and action recognition outputs. We performed transfer learning for our human-action image data sets. The first section is for action recognition, showing how data were collected and used for feature extraction. With C3D as the feature extractor, it identifies the four different actions of students inside the classroom, namely entering, sitting, standing, and going out of class. The CNN is arguably the most widely used approach in human action recognition; it consists of multiple hidden layers, pooling layers for down-sampling, and fully connected layers. The second part estimates the future sensor readings of a classroom using the Long Short-Term Memory (LSTM) network. It includes forget, input, cell, and output gates as well as a specific internal structure of the individual hidden units, all of which facilitate effective error back-propagation even for very complex (that is, deep) model architectures and over prolonged periods of time, e.g., hundreds of time steps. For instance, a sequence of sensor readings such as (T_i, H_i, L_i), (T_{i-1}, H_{i-1}, L_{i-1}), ..., (T_{i-n}, H_{i-n}, L_{i-n}) is used to predict the sensor readings at time i + m, (T_{i+m}, H_{i+m}, L_{i+m}), where T_i, H_i, and L_i denote the values of temperature, humidity, and luminance at time i, respectively. The recognized student activities and the predicted sensor reading (T_{i+m}, H_{i+m}, L_{i+m}) are compared against ground-truth data to calculate the rise in temperature and humidity inside a classroom. The output of action recognition is (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), or (0, 0, 0, 1), where the positions denote students "coming in", "standing", "sitting", and "going out", respectively.
In addition, the predicted temperature values at time i + m lie between 17 and 28 degrees, the humidity value ranges from 30 to 40, and the luminance value ranges from 25 to 550. After passing through the normalization layer, the recognized action values and predicted sensor values are scaled between 0 and 1, with the final output being a decision to turn the air conditioner, dehumidifier, or lights on or off.
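The normalization and fusion step above can be sketched as follows: min-max normalization over the stated sensor ranges, combined with the one-hot action vector to produce on/off controls. The 0.5 decision thresholds are our illustrative assumption; the paper does not state the exact cut-offs.

```python
# Min-max ranges as stated in the text: temperature 17-28, humidity 30-40,
# luminance 25-550.
RANGES = {"temperature": (17.0, 28.0), "humidity": (30.0, 40.0), "luminance": (25.0, 550.0)}

def normalize(name, value):
    """Scale a predicted sensor value to [0, 1] over its stated range."""
    lo, hi = RANGES[name]
    return (value - lo) / (hi - lo)

def fuse(action_onehot, temp, humid, lum):
    """Combine the one-hot action (coming_in, standing, sitting, going_out)
    with normalized sensor predictions into appliance on/off decisions."""
    occupied = action_onehot != (0, 0, 0, 1)  # any action except "going out"
    return {
        "air_conditioner": occupied and normalize("temperature", temp) > 0.5,
        "dehumidifier": occupied and normalize("humidity", humid) > 0.5,
        "lights": occupied and normalize("luminance", lum) < 0.5,
    }

# Students sitting, warm room, dry air, dim light:
controls = fuse((0, 0, 1, 0), temp=26.0, humid=33.0, lum=120.0)
```

Gating every decision on occupancy means an empty classroom switches everything off regardless of the predicted sensor values, which is where the energy saving comes from.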

Experiment Results Analysis
In this experiment, we focused on the context-aware system using raw sensors, such as passive infrared (PIR), humidity, and temperature sensors, and a light-dependent resistor (LDR) sensor to monitor outside luminance. A Raspberry Pi computer was used as the mediator to collect and record data from the raw sensors. The collected data were then transferred via Wi-Fi to the database server, where they were recorded in MySQL. The light source in the room was tested with and without natural light. To ensure a real-time scenario, we performed an intuitive test in our laboratory of the sunlight falling on the window at different times of the day. The intensity of the natural light falling on the window was computed at different times, starting from class hours. Similarly, for action recognition, we made video data of student activities with multiple actions, as stated in Table 1. Here, we discuss both the data preparation and the simulation results in detail.

Data Preparation
We use C3D to train on the UCF-101 data set, which consists of 13,320 videos of 101 human actions [24,42]. We added our four actions to UCF-101, creating new categories for the daily classroom activities of students. With 44 video action clips of four different actions, the lab-action data sets give diversity in terms of multi-person actions. Our data sets in Figure 5, for action recognition in realistic action videos collected in a smart lab, provide various angles and diverse video data, which makes them a good data set in terms of context awareness and action recognition. Table 1 shows specimens of the video action data sets for classroom action recognition.
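Before training, each video must be cut into fixed-length frame clips, since C3D consumes 16-frame inputs. The sketch below shows the index bookkeeping for that step; the non-overlapping stride is our assumption, as the paper does not state the sampling stride it used.

```python
CLIP_LEN = 16  # standard C3D clip length in frames

def clip_indices(n_frames, clip_len=CLIP_LEN, stride=CLIP_LEN):
    """Return (start, end) frame index pairs for every full-length clip
    that fits in a video of n_frames; trailing frames are dropped."""
    return [(s, s + clip_len) for s in range(0, n_frames - clip_len + 1, stride)]

clips = clip_indices(75)  # a 75-frame video yields 4 full 16-frame clips
```

An overlapping stride (e.g. `stride=8`) would yield more training clips from the same 44 videos, at the cost of correlated samples.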

Result Discussion
As Table 2 shows, our model's accuracy is slightly higher for multi-person action recognition than for models combined with single-action data sets. We attribute the overall modest accuracy to the lack of sufficient action data sets for multi-person actions. Since the proposed system focuses on student actions in a classroom, we are satisfied with the results, and they are a good starting point for further work in which we will add sufficient data sets. To address demand-side management, the sensors are programmed with limit values set as the lower and higher limits, which trigger the activation and deactivation of the smart appliances. For example, for a temperature sensor with limit values of 18 °C and 25 °C programmed as the lower and upper limits, if the temperature is below the lower limit the air conditioner will turn off, and if it is above the upper limit it will turn on. The same applies to the other appliances programmed with smart sensors. With human action recognition, the proposed system helps to save 25-30% of the energy usage in the building. To support our experiment, we present the prediction simulation results of the different sensors with the Long Short-Term Memory (LSTM) approach. Table 3 presents the performance comparison between existing methods and the proposed architecture: the proposed architecture outperforms the traditional method in energy-saving percentage. It gives an extra edge in energy saving with a human action recognition network. The high performance rate makes it more useful in buildings where energy is always wasted when there is no proper inspection; hence, the proposed method will reduce the human dependency for controlling energy wastage in buildings. Similarly, the simulation plots in Figure 6 show the predicted and raw data from the sensors taken during active class hours.
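The two-limit rule for the temperature sensor can be sketched directly from the values above. The behaviour between the two limits is our assumption (keep the previous state, i.e. simple hysteresis); the paper only specifies what happens beyond each limit.

```python
# Lower and upper limits as programmed on the temperature sensor.
LOW_C, HIGH_C = 18.0, 25.0

def ac_state(temp_c, previous_on=False):
    """Demand-side limit check for the air conditioner."""
    if temp_c > HIGH_C:
        return True        # above the upper limit: turn the AC on
    if temp_c < LOW_C:
        return False       # below the lower limit: turn it off
    return previous_on     # inside the band: leave the appliance as-is (assumed)

states = [ac_state(t) for t in (16.0, 26.0, 20.0)]
```

Keeping the previous state inside the band avoids rapid on/off cycling as the temperature hovers near a single threshold, which is the usual reason two limits are programmed rather than one.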

Conclusions
We presented a novel idea for implementing a self-controlled smart energy-saving concept by exploiting deep learning, transfer learning, and context-awareness techniques with IoT. We reviewed the growing impact of IoT on smart loads and its advantages for smart grids. Through IoT home sensors, we showed the importance of IoT in digitizing the electric power system to better integrate distributed energy resources. Recognition of human activities in a classroom was trained with transfer learning.
Sensor readings and video data were fused in the proposed context-aware architecture, producing control signals such as turning on and off the air conditioning, dehumidifier, and heating to save energy in the smart classroom environment.
In this paper, the proposed system is programmed to manage the demand side, which will have a significant impact on energy saving and on managing load shedding. The accuracy results in this experiment are satisfactory, given that they were obtained on four actions with smaller data sets compared to the UCF-101 big data sets; these results could gradually be improved with larger image/video data sets. Energy usage can also be reduced by low-power-consuming sensors. The proposed architecture helps to save 25-30% of the energy in a building; hence, owing to the human action recognition network, it will reduce the human labor used to control energy wastage in buildings. For future work, we will focus on estimating the overall energy savings and cost-effectiveness with low-cost home sensors and on reducing electric loads across broader areas in communities.

Conflicts of Interest:
The authors declare no conflict of interest.