1. Introduction
Information technology, especially the Internet of Things (IoT) and artificial intelligence (AI), has become increasingly popular in smart building applications, such as occupancy estimation for energy-efficient building operations [1,2] and demand-oriented air conditioning [3]. For example, with the help of distributed household IoT devices, AI algorithms have been widely used to model the energy consumption characteristics of smart buildings and find optimal parameter thresholds and control settings. AI algorithms have also been studied to intelligently interpret the visual content of surveillance cameras and identify the number of residents and their locations in smart buildings, so that the operation of air conditioners can be adjusted to provide “just-right” heating or cooling services. In addition, smart buildings can adjust the indoor thermal environment, such as temperature, humidity, or airflow, to improve the comfort of building occupants [4]. Large companies, such as IBM and Intel, are also committed to developing AI algorithms for building performance optimization.
In the United States, Sudden Infant Death Syndrome (SIDS) is one of the leading causes of sudden and unexpected death in babies under one year old. Many studies point out that letting babies sleep on their stomachs can easily lead to SIDS [5,6]. Therefore, the American Academy of Pediatrics recommends that babies sleep on their backs, because this keeps the airway open. It is reported that babies have the lowest risk when sleeping on their backs, followed by sleeping on their sides; sleeping on the stomach carries the highest risk, because it compresses a baby’s chin, narrows the airway, and restricts breathing. However, in practice, it is difficult to keep babies sleeping on their backs, because they easily roll over onto their stomachs.
In order to monitor sleeping status, a series of sensor-based wearable or touchable monitors have been developed [7,8,9,10,11,12]. For example, IoT-based smart posture detection systems have been developed in [10,11,12], where a pressure-sensing mattress is used to collect body pressure data that are processed for posture recognition. Sleep experiments were conducted on an infant in [10], and the reported classification accuracy for baby sleep posture reached 88%. In [11], pressure sensors are placed in a sensing cushion, which is used to collect sitting pressure data from ten children. Although [11] does not detect infant sleep posture, the average classification accuracy for children’s sitting posture is 95%. In [12], the authors proposed to recognize sleep positions from body pressure images and achieved a high recognition accuracy. When various sensors are placed on babies, these baby monitors can track the breathing, body temperature, heart rate, etc., of sleeping babies. Then, if a baby monitor observes abnormal activity, such as stopped breathing or a slowed heart rate, it sends a warning alert to parents. Although this sounds attractive, these wearable or touchable baby monitors suffer from two limitations. First, these monitoring systems place various electrodes or sensors on a crib mattress or on the waist and feet of an infant’s body. In order to collect reliable data, these pads and sensors need to fit or touch a baby’s body well throughout sleep. In fact, it is inconvenient for babies to always wear these pads or sensors correctly while they sleep. Second, these baby monitors often send out false alarms, which can increase the anxiety of many parents [13,14]. As a result, parents are likely to suffer from increased stress or even depression, which affects their sleep quality and mood.
In order to overcome the shortcomings of wearable sensor-based baby monitors, researchers began to investigate contactless camera-based monitoring, which detects sleeping postures through cameras and AI algorithms. The researchers in [15] predict that future research on sleep health will be data-driven and that AI algorithms will play a critical role. Instead of using wearable body sensors or electrodes for signal collection, AI algorithms analyze the output of cameras and classify sleeping postures. Compared with sensor-based baby monitors, this approach is more convenient for users and more cost-effective. In [16], infrared cameras and depth sensors were used to collect data, and a convolutional neural network (CNN) then classified sleeping postures with an accuracy of 94%. However, this idea was only verified on a small dataset containing 1880 samples, and the approach has not been validated in baby sleep scenarios. Furthermore, due to the use of depth sensors and infrared cameras, its hardware cost is high. Later, in order to reduce the system cost, the researchers in [17] used 4250 daytime baby sleep images from ordinary cameras to explore eight different CNN architectures. The highest classification accuracy of 87.8% was achieved by a CNN consisting of four convolutional layers and two dense layers. To further increase the classification accuracy, the researcher in [18] explored three CNN architectures. Inspired by GoogLeNets and ResNets, the researcher proposed adding skip connections to standard CNN architectures. The skip connections effectively simplify the network, and applying average pooling to each feature map at the end keeps the parameter count fairly low. The baby sleep image dataset is the same as in [17]. In addition, in order to fit his CNN architectures into portable electronics, the researcher in [18] proposed reducing the number of feature maps. On this basis, a ResNet network with 16 convolutional layers and 3 dense layers achieves a classification accuracy of 89%. Recently, the researcher in [19] proposed using DenseNet-121 for baby sleep posture classification. DenseNet, also known as a dense convolutional network, is a type of convolutional neural network in which each layer is connected to all subsequent layers. Since each layer in a DenseNet receives collective knowledge from all preceding layers, the information flow among different layers is enhanced. Therefore, this type of network is thinner and more compact [20]. Compared with other AI algorithms, fewer parameters and higher accuracy can potentially be achieved through dense connections. As a result, DenseNet-121 tends to have fewer parameters and a smaller memory footprint. Unfortunately, the researcher in [19] only demonstrated that his AI algorithm works well on baby doll pictures and did not evaluate the classification accuracy on real infant sleep images. Moreover, in [21], a CNN architecture (i.e., Inception-v3 [22]) with transfer learning is used for sleep posture classification. As a widely used image recognition model, Inception-v3 improves computational efficiency while keeping the number of parameters low. Although the classification of adult sleep scenes shows an accuracy of around 90% on a dataset with only 1200 non-baby sleep images, the effectiveness of this Inception-v3 architecture has not been tested on real infant sleep datasets.
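To make the two connection styles discussed above concrete, the following is a minimal NumPy sketch (not code from the cited works) contrasting a ResNet-style skip connection, which adds the input back to a layer’s output, with a DenseNet-style dense connection, which concatenates the input with the output so later layers see all earlier features:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w):
    # ResNet-style skip connection: the input is ADDED back to the
    # layer's output, so information can bypass the transformation.
    return relu(w @ x) + x

def dense_block(x, w):
    # DenseNet-style dense connection: the input is CONCATENATED with
    # the layer's output, so later layers see all earlier features.
    return np.concatenate([x, relu(w @ x)])

x = np.ones(4)
w = np.zeros((4, 4))  # a degenerate layer that maps every input to zero

# Even with useless weights, both connection styles preserve the input:
# the residual block passes it through, the dense block carries it forward.
print(residual_block(x, w))     # [1. 1. 1. 1.]
print(dense_block(x, w).shape)  # (8,)
```

The toy zero-weight layer illustrates why both designs ease training: even when a layer contributes nothing, earlier features still flow through to later layers.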
Table 1 summarizes the accuracy and disadvantages of these existing AI algorithms. From the above discussion and Table 1, it is clear that these existing contactless camera-based baby sleep monitoring studies have not fully met the requirements of AI for edge computing in smart buildings [23,24,25]. To date, two major bottlenecks are preventing AI from detecting potential infant sleep hazards in smart buildings. First, current baby sleep posture datasets are neither large nor diverse. Generally speaking, the performance of AI algorithms improves with more training samples [26], and a high-diversity dataset maximizes the information it contains [27]. However, the researchers in [16] use 1880 data samples, the researchers in [17,18] use 4250 daytime baby sleep images, the researchers in [21] use 1200 data samples, and the researcher in [19] uses baby doll pictures to approximate real baby sleep images. Moreover, although babies sleep mostly at night, existing datasets do not contain night-vision sleep images. Therefore, it is necessary to generate a large and diverse baby sleep posture dataset to train and evaluate AI algorithms. Second, as stated in [18,19], memory constraint is a major challenge for deploying deep learning AI algorithms on edge computing systems. AI algorithms must not only fit in the program memory of edge computing systems (such as micro-controllers or a Raspberry Pi), but also leave enough free memory for the operating system and CPU kernels to run smoothly. For example, the Raspberry Pi 3 A+ has only 512 MB of memory, so it can run only lightweight programs and scripts. Therefore, if an AI algorithm requires several hundred megabytes of memory, it may not be able to run on edge computing systems. To deal with this challenge, AI algorithms must be optimized to a small memory footprint for real-time operation on edge systems.
In order to resolve the aforementioned two research bottlenecks, in this work we investigate and propose an optimized AI algorithm for infant sleep posture classification. Regarding contributions to the body of knowledge, this work makes the following two contributions: (1) we have generated a large and diverse dataset for training and evaluating AI algorithms; this dataset contains 10,240 day and night-vision baby sleep images. (2) We propose a new AI algorithm and use the post-training weight quantization technique to minimize memory usage. In this way, the data type of the weight parameters in our AI algorithm is converted from 32-bit floating points to 8-bit integers, so the quantized weights are easy to store and run on many edge computing devices (e.g., the 8-bit ATmega328P micro-controller). To evaluate the proposed AI algorithm, we implemented it in a Python program and ran it on the TensorFlow and Keras platforms. The experimental results show that, with a very small memory footprint of 6.4 MB, a classification accuracy of about 90% is obtained. Compared with the state-of-the-art AI algorithms in the literature, the proposed idea achieves comparable detection accuracy, while the memory footprint decreases from at least 58.2 MB to 6.4 MB, a reduction of at least 9 times. Therefore, our proposed memory-efficient AI algorithm has great potential to be deployed and run on edge devices, such as micro-controllers and the Raspberry Pi, which have small memories, limited power budgets, and constrained computing resources.
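The post-training weight quantization idea described above can be sketched as follows. This is a minimal NumPy illustration of affine int8 quantization under the usual scale/zero-point scheme, not our actual TensorFlow implementation:

```python
import numpy as np

def quantize_int8(w):
    # Affine post-training quantization: map float32 weights onto the
    # signed 8-bit range [-128, 127] using a scale and a zero point.
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0
    zero_point = float(np.round(-128.0 - w_min / scale))
    q = np.clip(np.round(w / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    # Recover a float32 approximation of the original weights.
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal((256, 256)).astype(np.float32)  # stand-in weights

q, scale, zp = quantize_int8(w)
w_hat = dequantize_int8(q, scale, zp)

# Storage shrinks 4x (4-byte floats -> 1-byte integers), while the
# per-weight reconstruction error stays within one quantization step.
print(w.nbytes // q.nbytes)                     # 4
print(float(np.abs(w - w_hat).max()) <= scale)  # True
```

The 4x reduction applies to weight storage alone; frameworks such as TensorFlow Lite additionally quantize per-layer and store the scale and zero point as metadata, which is why the overall footprint reduction can differ from exactly 4x.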