Discovery of Resident Behavior Patterns Using Machine Learning Techniques and IoT Paradigm

Abstract: In recent years, technological paradigms such as Internet of Things (IoT) and machine learning have become very important due to the benefit that their application represents in various areas of knowledge. It is interesting to note that implementing these two technologies promotes more and better automatic control systems that adjust to each user’s particular preferences in the home automation area. This work presents Smart Home Control, an intelligent platform that offers fully customized automatic control schemes for a home’s domotic devices by obtaining residents’ behavior patterns and applying machine learning to the records of state changes of each device connected to the platform. The platform uses the C4.5 machine learning algorithm and the Weka API to identify the behavior patterns necessary to build home devices’ configuration rules. In addition, an experimental case study that validates the platform’s effectiveness is presented, where behavior patterns of smart home residents were identified according to the usage history of the IoT devices. The discovery of behavior patterns is essential to improve the automatic configuration schemes of personalization according to the residents’ history of device use.


Introduction
Nowadays, machine learning techniques have gained importance in various study areas due to their large number of pattern discovery applications. Under the changing conditions of the environment, individuals tend to develop behavioral patterns to better adapt to their environment. The analysis of the observed actions and their influences in the environment may lead to the automatic recognition of those meaningful behavioral patterns. Machine learning methods are appropriate when observable actions are available though there are no precise specifications for the system's desired behavior. There are currently home automation systems, such as Google Home or TESY Cloud, that allow remote monitoring and control of home devices, while others have automatic configuration schemes, such as Google Home Routines. However, most of them do not consider user behavior analysis to discover patterns of use to establish fully personalized automatic configurations adjusted to each home's unique context. In those systems, automation occurs rigidly without considering changes or transformations in users' behavior as time passes. The automatic configuration is essential since, by counting on the users' behavior patterns, it will be possible to establish the appropriate arrangements adjusted to the inhabitants' preferences or needs at home. The configuration will lead to an increase in

Related Work
In recent years, IoT has been established as a technology that allows access to information from devices of daily use in a timely and efficient manner. In turn, the number of devices that now have an Internet connection has grown exorbitantly. Under this context, it is possible to monitor each IoT device's status, usage and energy consumption in a building, which provides excellent opportunities for improvement and strengthening of security, comfort and energy-saving schemes. The increment in the number and variety of IoT devices has revealed a lack of IoT platforms to manage and analyze such data. For this reason, an approach to limit the energy consumption that IoT devices cause when carrying out high data transmissions was presented by Azar et al. [8] to improve privacy and reduce network traffic while saving time. The approach proposed using edge computing, which allows downloading the workload from the cloud in a location closer to the data source to be processed. To bring computing closer to where data are produced, Valerio et al. [9] performed data analysis on mobile nodes passing through IoT devices and explored the fog computing paradigm using a distributed machine learning framework (Hypothesis Transfer Learning). Saeid et al. [10] presented a taxonomy of machine learning algorithms that explains how different techniques are applied to data to extract higher-level information. They also evaluated machine learning methods that address the problems derived from the management of IoT data by considering smart cities as the case of primary use. Filho et al. [11] proposed a solution that combines computational intelligence and fog computing called STORm (Smart Solution for Decision Making in a Residential Environment). The STORm system, to improve the decision-making process, was able to detect and control the information generated by sensors installed in the residential scenario. Kasnesis et al. 
[12], within a collective intelligent environment and based on semantic Web technologies, proposed a platform that allows dynamic injections of automation rules. Fensel et al. [13] presented an IoT semantic platform over typical home appliance data called OpenFridge. Frontoni et al. [14] developed a framework to allow rapid development of complex systems by integrating new device classes into existing systems and controlling and centralizing information. Additionally, Silva et al. [15], to identify the appropriate solution for designing an IoT system, proposed a model and a framework. The results presented show that considering implementation time, cost, energy consumption, among other attributes, the methodology helped in the designing of an IoT system.
The reduction of energy consumption is a topic of considerable interest in the scientific community. Wen et al. [16] presented the ECIB algorithm that considers energy and cost variables to program IoT workflows with intensive batch processing in clouds. The main objective of ECIB is to improve energy efficiency and therefore reduce costs of operation. Another approach was proposed by Terroso-Saenz et al. [17] after introducing and testing the IoT Energy Platform (IoTEP), which aimed to provide a holistic solution for IoT energy data management. Pawar et al. [18], regarding power management, designed a smart system with the objective of replacing, in a controlled manner based on consumer preference, a complete power outage in a region with a partial load outage. On the other hand, Zekić-Sušac et al. [19] used Rpart regression trees, random forest and variable reduction procedures to create predictive models of the specific energy consumption of public sector buildings in Croatia. The predictive models solve the problem of incorporating machine learning and the Big Data platform in the same intelligent system to manage the public sector's energy efficiency. Nabizadeh et al. [20] proposed an IoT-based Smartphone Energy Assistant (iSEA) framework that drives energy-conscious behaviors in commercial buildings. iSEA uses a deep-learning approach to identify individual occupants' energy use through their smartphone tracking and offers personalized feedback to impact their usage. García-Martín et al. [21] presented a review of the different approaches to estimating energy consumption on the one hand and machine learning applications on the other hand. Based on the data collected from an energy monitoring platform at a university in south China, Li et al. [22] proposed energy consumption patterns using data mining approaches. 
The coupling relationships between these components were revealed using multiple machine learning methods to propose precise energy-saving strategies for air conditioners' random use. Using data from temperature and humidity sensors, Raza et al. [23] developed machine learning models of heat flow of the environment and coming from the operation of heating, ventilation and air conditioning (HVAC) equipment. From these data, they obtained a low-cost and non-intrusive methodology for determining energy waste based on consumer behavior (CBB-EW) in HVAC operation and control. Chacon-Troya et al. [24] explained the design of an intelligent residence application for the control and monitoring of electricity quotas. To estimate the costs of the residence devices, they combined a Web application native technology. Aiming to reduce energy consumption, Saba et al. [25] contributed to the modeling and simulation of multi-agent systems for hybrid renewable energy system residences. In [26], Buono et al. presented The Non-Intrusive Load Monitoring (NILM) System as a mobile application that shows historical and real-time information about energy consumption and sends message alerts whenever an energy overload is about to occur.
To strengthen an energy alert system that reduces costs to users, Weixian et al. [27] presented a home management self-learning system that classifies and learns from smart home data through machine learning. In [28], Elkhorchani and Grayaa described a system that reduces not only CO2 emissions but also energy consumption rates. The home energy management system proposed by them employs an architecture that includes wireless communication, renewable energy principles and a power outage algorithm. Matsui [29] proposed a system that suggests ways to reduce electricity consumption through a module that also provides related information to be displayed according to the user's comfort preferences. Al-turjman et al. [30] reviewed and categorized existing energy monitoring approaches in the literature, investigating the impact and effectiveness of these monitoring systems under stress. Baker et al. [31] developed a multi-cloud IoT service composition algorithm called E2C2. The E2C2 algorithm seeks and integrates the fewest possible IoT services needed to meet the user's requirements, creating an energy-conscious composition system. Aiming to promote energy saving, previous works on nature-inspired metaheuristic algorithms have made substantial contributions in the field of energy demand prediction [32–35].
Reinforcement of the security measures of IoT applications has also been an important research topic. Based on a deep migration learning model, Li et al. [36] proposed an intrusion detection algorithm (IDS) that helps identify network anomalies and takes the necessary countermeasures to ensure the safe and reliable operation of IoT applications. Alli et al. [37] proposed a secure computation offloading scheme in the Fog-Cloud-IoT environment (SecOFF-FCIoT). Using machine learning strategies, they achieved efficient and secure offloading in the Fog-IoT environment. Lanford and Perez [38], using video streaming, implemented a security system able to monitor the environment of a smart house. Malina et al. [39] presented a security framework that improves Internet of Things security and privacy services for the Message Queuing Telemetry Transport (MQTT) protocol.
In contrast, Mozaffari et al. [40] took into account a more human aspect of IoT devices. They considered the three stages of falling people (prediction, prevention and detection) to develop a diagnostic system for falls in smart buildings. At the same time, Ud Din et al. [41] studied different machine learning-based mechanisms on IoT used in healthcare, smart grids and vehicular communications to explain the role that machine learning techniques are expected to play in IoT networks development. In turn, Nižetić et al. [42] conducted a review of scientific contributions presented at the 4th International Conference on Smart and Sustainable Technologies held in Croatia, in 2019 (SpliTech 2019) to demonstrate the pros and cons of IoT technologies regarding an environmental point of view to promote the smart utilization of limited global resources.
As shown in this section, works related to this project's context were identified in the literature. However, of these works, only those in [11,12] are focused on improving the automatic configuration schemes. None considers the analysis or discovery of the residents' behavior patterns to improve the automatic control of domotic devices. Most papers address issues related to energy saving [8,9,13,16–35], security [36–39], data analysis [10], home system design [14,15] and healthcare [40–42]. After reviewing these papers, a lack of smart configuration schemes that balance energy saving and comfort by analyzing user interaction with home automation devices was identified. Taking this into account, Smart Home Control is a solution capable of analyzing residents' historical interaction with their devices through a mobile application. From this analysis, behavior patterns are recognized to create automatic control rules that automate the home configurations and reduce energy consumption if necessary. Smart Home Control is a tool that improves and extends the functionality of a domotic system. The following section describes the architecture and functionality of Smart Home Control.

Materials and Methods
The need for a platform that analyzes the residents' behavior through machine learning arises from the lack of highly automated and personalized home appliance configuration schemes. Machine learning makes it possible to discover residents' behavior patterns, improving the effectiveness of automated decision-making so that devices are controlled as if the users were operating them manually. The "Smart Home Control" mobile application and its architecture were developed to meet this need.

Smart Home Control's Architecture
The design of the Smart Home Control architecture includes a domotic system that contains the necessary technology to control and monitor IoT devices. The domotic system consists of devices with "IoT ready" technology installed in smart homes or buildings. In addition, as part of the home automation system, the harvesting of information from the surrounding environment regarding temperature and natural light detection has been considered. The Smart Home Control platform also includes a communication infrastructure that allows data to be obtained from sensors and sent to actuators or other IoT devices using web services.
The Smart Home Control architecture consists of six layers whose integration enables collecting and analyzing data from the IoT devices, providing control and communication for the home automation system. The layers that constitute the Smart Home Control are: (1) Presentation; (2) Security; (3) Control; (4) Communication; (5) Data; and (6) Devices. Figure 1 presents the Smart Home Control architecture, showing its layered design, the modules that comprise the layers and how they relate to each other.  The functions for each element of the Smart Home Control architecture are described below.

• Presentation layer represents the interface that enables a user to interact with the Smart Home Control platform.
o Mobile Application is an Android-based interface through which a user gets access to the system. From this interface, the user can request to show real-time or historical data, select between automatic or manual control of the IoT devices and define their configurations.
• Security layer represents the set of technological elements whose activity is necessary to ensure secure access to the platform's functions.
o Authentication Module is responsible for validating the user's access data against the information from the User Data module, through the implementation of the OAuth 2.0 authorization framework that enables a third-party application to obtain limited access to an HTTP service.
• Control layer represents the functionality of the platform. It contains the necessary methods to access the functions of each device connected to the Smart Home Control platform.
o User control module: A user has access to this module from the mobile application. From this module, the user can turn on, turn off, or change their devices' configuration settings. This module does not use prediction models.
o Automatic control module: This module is in charge of processing the data from the devices' readings. The module analyzes the information from the devices of the house to discover behavior patterns and the conditions associated with recorded actions or state changes. This analysis is carried out using the C4.5 machine learning algorithm, which can limit overfitting (overtraining) of the model. C4.5 can also handle both discrete and continuous incomplete data [43]. It is essential to mention that the behavior patterns detected allow generating decision trees that are translated into automatic configuration rules applicable to each device.
• Communication layer is the software infrastructure needed to establish communication between the various modules and elements of the Smart Home Control platform.
o REST API contains methods that allow communication between the control layer, the device layer and the data layer. This layer collects data from sensor devices and transports the respective commands to control them.
• Data layer represents the information that Smart Home Control is focused on and the data of interest for the platform's various modules.
o Device data corresponds to the stored data of each device connected to the platform. These data are mainly used to identify each device and its current status.
o Device history is the historical information that includes each change in status registered by a specific device. This information is crucial because the record of all the devices' meaningful changes conveys the data necessary to analyze and discover the house inhabitants' behavior patterns. This cumulative information is used to train the automatic control module of the system.
o User data corresponds to the stored data of the Smart Home Control users. These data are stored to personalize the user profile.
o House data represents the stored data of the house where the Smart Home Control platform is installed. It includes information about the rooms to which home automation control devices are associated.
o Configuration rules correspond to the rules built by the automatic control module for each household device. The registered rules for a device may change as the automatic control module recognizes new device usage patterns.
• Device layer includes the communication technology necessary to control and monitor home automation devices. The intelligent system is implemented on this domotic system.
o Sensors: These are devices that detect and record changes in the environmental conditions (light, temperature, presence of people, etc.) for the home automation system.
o IoT Devices: These are devices connected to the domotic system through the use of IoT technology.
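To make the data layer more concrete, the following is a minimal sketch of how one device-history entry could be represented and turned into a training row. It is only an illustration: the class name `StateChange`, its fields and the helper `to_training_row` are hypothetical and are not taken from the Smart Home Control implementation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StateChange:
    """One hypothetical device-history entry (all field names are assumptions)."""
    device_id: str
    room: str
    timestamp: datetime
    state: str            # class label: "On" or "Off"
    temperature_c: float  # reading of the room's temperature sensor
    natural_light: bool   # reading of the natural light sensor
    presence: bool        # reading of the human presence sensor


def to_training_row(change):
    """Flatten a state change into the attributes a classifier would see."""
    return {
        "hour": change.timestamp.hour,
        "temperature_c": change.temperature_c,
        "natural_light": change.natural_light,
        "presence": change.presence,
        "state": change.state,
    }
```

Keeping the sensor readings next to each state change is one possible design; it lets the automatic control module relate a device's "On"/"Off" label to the conditions in the room at that moment.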

Smart Home Control Workflow
A workflow for the Smart Home Control carries out the process of communicating with domotic devices, obtaining their data and analyzing them to build the automatic control rules of home automation devices. Figure 2 shows the Smart Home Control workflow, including the key concepts of the system's operation. The workflow is described as follows:


• The environment of a house should always provide access to domotic devices to obtain information of interest. It is advisable to consider an approach that includes IoT technology devices to facilitate communication and access to data corresponding to the history of use of the devices.
• The domotic devices, possibly having different communication interfaces, are accessed for reading and control purposes through web services. Each web service conforms to each device's specific communication interface, returning the information in the appropriate format so that the platform correctly stores the data of interest for later analysis.
• The data obtained from the devices are stored and subsequently analyzed by taking advantage of the Weka API that already has an implementation of the C4.5 algorithm, which aims to classify the collected data and thus allow the discovery of resident use patterns.
• After the usage patterns have been discovered, it is then possible to build decision trees that serve as a base for constructing custom comfort setting rules for each specific home's history. Consequently, each house where Smart Home Control is implemented will have appropriate rules fully adapted to the residents' conditions, allowing the automatic control to be dynamic by following an "always training-always evolving" approach.
• Finally, these new configuration rules will allow the platform to send the proper instructions to the devices through web services, thus improving the house's comfort conditions in terms of device control.
As can be seen, the central design idea of Smart Home Control is to follow an approach where the system remains in permanent learning. This design allows discovering the residents' usage behavior patterns, building control rules for home automation and readjusting these rules as usage patterns change.
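The permanent-learning cycle described above can be sketched roughly as follows. This is an illustrative outline only: the function names are hypothetical, and the learning step is a deliberately trivial stand-in (most frequent state per presence value), not the C4.5 algorithm that the platform uses.

```python
from collections import Counter, defaultdict


def learn_rules(history):
    """Stand-in learner, NOT C4.5: for each presence value, keep the most
    frequently recorded device state."""
    counts = defaultdict(Counter)
    for presence, state in history:
        counts[presence][state] += 1
    return {presence: c.most_common(1)[0][0] for presence, c in counts.items()}


def control_cycle(history, new_readings, current_presence):
    """One pass of the collect -> learn -> act loop."""
    history = history + new_readings       # 1. store the new device readings
    rules = learn_rules(history)           # 2. rebuild the rules from the full history
    command = rules.get(current_presence)  # 3. pick a command (None if never observed)
    return history, rules, command
```

Because the rules are rebuilt from the accumulated history on every pass, a change in the residents' habits eventually changes the command that is issued, which is the readjustment the design aims for.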

Automatic Control Module: Pattern Discovery
The Smart Home Control Automatic Control module uses the Weka API 3.8 implementation of the machine learning algorithm C4.5 to analyze the information from IoT devices and discover the behavior patterns that allow generating the appropriate automatic control rules. The rules are applied to control devices with little or no human intervention, such as turning on the air conditioner when the thermometer reaches a temperature considered hot. Machine learning algorithms are used in many areas ranging from health services to financial frameworks. For this project, the C4.5 algorithm was selected for its performance and efficiency in building classification models for prediction problems. Previous works have evaluated the performance and effectiveness of various algorithms. Among them, the authors of [44] verified that experiments with C4.5 led to better levels of interpretability and precision. It also yielded a shorter response time in the categorization processes compared to other algorithms such as Random Tree or fuzzy C-means. In [45], the authors confirmed the efficacy of C4.5: in their tests, the algorithm made predictions with better performance metrics than other algorithms such as CART or even Random Forest. Data related to actions performed using Smart Home Control must be collected, stored and analyzed daily to discover usage patterns and generate rules. The C4.5 predictive algorithm uses the data to perform configuration automatically once the usage history of each device is available. Sensor readings, which can be numerical or nominal, are examined for behavioral patterns expressed in the form of decision trees that will be used to classify new readings [29] and control devices autonomously. 
Notice that the user's continuous use of Smart Home Control will allow obtaining better and more usage information, enabling the automatic configuration module to be trained with greater precision and producing more precise automatic configurations.
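As a rough illustration of how C4.5 evaluates a split on a numerical sensor reading, the sketch below computes the gain ratio of a binary threshold split. It is a simplified, standalone version written for this explanation; it is not Weka's code (Weka provides C4.5 as the J48 classifier), and the function names are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting a numeric attribute into <= threshold and > threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    info_gain = entropy(labels) - (weights[0] * entropy(left) + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Candidate threshold with the highest gain ratio, or None if all values are equal."""
    candidates = sorted(set(values))[:-1]
    if not candidates:
        return None
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))
```

For instance, given hypothetical temperature readings paired with the "On"/"Off" state of an air conditioner, `best_threshold` returns the temperature that best separates the two states. The full algorithm also handles nominal attributes, missing values and pruning, which this sketch omits.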
As mentioned above, residents can control their devices using the mobile application (created for the Android platform) of Smart Home Control. Constant interaction with the application allows Smart Home Control to obtain data that are used for the analysis and prediction of automatic settings for each device. Figure 3 presents some of the application manual configuration interfaces for a room and a single device. Figure 4 shows the interfaces from which it is possible to set the whole house in intelligent control mode.
This automatic operation mode works according to the behavior patterns detected in the house. An example is when the lights or the air conditioner in the room should be turned on or off by considering temperature and human presence, among other settings.
Through the use of C4.5, the automatic configuration module analyzes the historical use of home automation devices to detect usage patterns that serve to identify the conditions of use of each device. The detection of these conditions allows the construction of classification rules for new values, which is useful in predicting future domotic configurations.
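The step from a decision tree to classification rules can be sketched as follows. The tree below is invented for illustration (its attributes and the 27.0 threshold are assumptions, not values learned in the study), and the tuple-based representation is just one simple way to encode a tree.

```python
# A node is either a leaf label ("On"/"Off") or a tuple
# (attribute, threshold, subtree_if_less_or_equal, subtree_if_greater).
# Hypothetical tree for an air conditioner; presence is encoded as 0 or 1.
TREE = ("presence", 0.5,
        "Off",
        ("temperature_c", 27.0, "Off", "On"))


def tree_to_rules(node, conditions=()):
    """List every root-to-leaf path as a (conditions, outcome) rule."""
    if isinstance(node, str):
        return [(list(conditions), node)]
    attr, thr, low, high = node
    return (tree_to_rules(low, conditions + (f"{attr} <= {thr}",))
            + tree_to_rules(high, conditions + (f"{attr} > {thr}",)))


def classify(node, reading):
    """Follow the tree for a new reading and return the predicted state."""
    while not isinstance(node, str):
        attr, thr, low, high = node
        node = low if reading[attr] <= thr else high
    return node
```

Each rule reads as "if all conditions hold, set the device to the outcome", so a new sensor reading can be classified either by walking the tree or by checking the rules in turn.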

Experimental Case Study
This section presents a case study results to validate that it is possible to discover patterns of home automation devices through Smart Home Control. This case study contemplates the data collected from the devices of the home automation system of a group of 10 houses located in Orizaba, Veracruz, Mexico. These houses were monitored during a period of 10 months (from March to December 2019). The training models of the system were defined from the information collected from the devices of each house. The construction of these training models is represented by a series of nested conditions in the decision tree format for each home automation device connected to Smart Home Control. Each device's decision tree is translated as a rule that allows its respective automatic on and off, which considers the house's general state. Figure 5 shows the house's distribution studied, which has a living room, dining room, kitchen, two bedrooms and a bathroom. In Figure 6, the scenario of the experimental case study is visually represented. It should be noted that the information of the house is accessed in real-time through sensors and devices of daily use with IoT technology-enabled. Smart Home Control uses web services to access devices and sensors to obtain input data that can be used to train the Automatic Control module. Once trained, the module is capable of controlling the devices as if the user did. Through a mobile application, the resident interacts with Smart Home Control with a user-friendly interface where notifications and information of interest from the devices connected to the system are displayed. The user is also able to control their devices manually through their mobile application. In Figure 6, the scenario of the experimental case study is visually represented. It should be noted that the information of the house is accessed in real-time through sensors and devices of daily use with IoT technology-enabled. 
Three residents inhabited the house, and domotic devices were installed in each room. Additionally, sensors were installed to record environmental conditions as possible variables to consider when an action is taken on a specific device.
For this reason, the on and off records of each device were evaluated together with the measurements recorded by the sensors, e.g., human presence, natural light and temperature. The information registered by the sensors in each room is used to determine factors that could affect residents' behavior patterns. These factors vary among houses when variables such as geographic location and socioeconomic stratum are considered. For example, the case study was conducted in the central area of the state of Veracruz, which has a hot-humid climate. This climate encourages the frequent use of air conditioning devices, unlike other regions such as Mexico's central area, where cooler temperatures favor less use of these devices. Table 1 presents the devices, sensors and rooms taken into account for home automation registration and automatic control. Data collection and analysis were carried out from March to December 2019, covering all household device configuration records, including the readings of the sensors installed in each room. Table 2 presents part of the state changes of the devices and sensors installed in the dining room as an example of the information used as training data.
• Once the dataset was obtained, an analysis of each device's usage behavior was performed, taking into account the rest of the devices and sensors in the home.
• Each device was classified with possible "On" and "Off" values.
Then, by using C4.5, decision trees were obtained whose conditions are based on the behavior observed during the training period. The rules resulting from the construction of the decision trees are dynamic and vary according to the particular behavior observed during the training.
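Since the platform relies on the Weka API, one plausible way to hand such records to C4.5 is Weka's ARFF text format. The sketch below assumes this and is not the authors' export code: the helper `to_arff`, the attribute names, the sensor values and their units are hypothetical. It places the analyzed device last, following the common convention for the class attribute.

```python
def to_arff(relation, records, class_attribute):
    """Render a list of dict records as ARFF text with the class attribute last."""
    names = [n for n in records[0] if n != class_attribute] + [class_attribute]
    lines = [f"@relation {relation}", ""]
    for name in names:
        values = [r[name] for r in records]
        if all(isinstance(v, (int, float)) for v in values):
            kind = "numeric"  # sensor readings
        else:
            kind = "{" + ",".join(sorted(set(values))) + "}"  # nominal states
        lines.append(f"@attribute '{name}' {kind}")
    lines += ["", "@data"]
    for r in records:
        lines.append(",".join(str(r[name]) for name in names))
    return "\n".join(lines)


# Hypothetical dining room records (names, values and units are illustrative).
records = [
    {"Presence@Diningroom": "Yes", "NaturalLight@Diningroom": 120,
     "Temperature@Diningroom": 27.5, "Lights@Diningroom": "On"},
    {"Presence@Diningroom": "No", "NaturalLight@Diningroom": 640,
     "Temperature@Diningroom": 30.0, "Lights@Diningroom": "Off"},
]

arff = to_arff("dining_room", records, "Lights@Diningroom")
```

Repeating the export with a different `class_attribute` yields one training set per device, which matches the idea of treating each device in turn as the classification target.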
The resulting control rules are therefore fitted to the behavior observed in each house. The control rules allow automating the smart home to improve the comfort conditions of the users. Table 3 shows the attributes considered for the C4.5 analysis and pattern discovery. As mentioned, to analyze each device's usage pattern in the system, it is necessary to consider the states of the rest of the devices and sensors of the house as variables. Each device analyzed is considered a class attribute, which is the classification target of C4.5. In addition, the numerical and nominal information collected from home automation devices can be analyzed to select, for each class attribute, the key attributes that allow the construction of more precise decision trees, by means of a correlation-based feature subset evaluator and the best-first search method. Table 4 shows the selection of attributes considered for each class attribute of the experimental case study.
Table 4. Selected attributes for each class attribute.

Class Attribute: Lights @Bedroom2
Selected Attributes: Lights @Livingroom; NaturalLight @Diningroom; TV @Bedroom1; TV @Bedroom2; Lamps @Bedroom2; Temperature @Livingroom; NaturalLight @Bedroom1; Lamps @Bedroom1; Presence @Bedroom2

A set of metrics was selected to evaluate the performance of C4.5. Table 5 presents the precision, recall, F-Measure, MCC, ROC Area and PRC Area metrics resulting from the device class analysis using C4.5. Figure 7 shows an example of the classification trees that resulted from the data analysis of each of the house devices. The design of the trees was based on those presented by Pintelas et al. [46], who developed a semi-supervised grey-box methodology to explain and understand how predictive models work, achieving a balance between the black-box and white-box paradigms. The tree shows the usage patterns discovered by C4.5 for turning the living room lights on and off. Because the data recorded by the rest of the devices and sensors are taken into account, the resulting tree reveals a set of conditions applicable to the whole house. Figure 8 shows a second example of a classification tree, obtained with C4.5, that sets the rule for turning the living room air conditioner on and off.
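For reference, four of the metrics reported in Table 5 can be computed from the confusion matrix of a binary classifier, treating one class (for example "On") as positive. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts are hypothetical and do not come from the case study. ROC Area and PRC Area are omitted because they require ranked prediction scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F-Measure and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Matthews correlation coefficient: 1 is perfect, 0 is chance level.
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical counts for the "On" class of one device (illustrative only).
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```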
Once the classification models have been obtained with C4.5, Smart Home Control can automatically turn on or off the devices previously described in this document. As the decision trees are built from data recorded during the residents' spontaneous use of home devices, the system can perform the actions as closely as possible to the way the inhabitants themselves would, which represents an improvement in comfort in a smart home.
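The translation of a decision tree into control rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's code: the tuple encoding of the tree, the functions `decide` and `to_rules`, the attribute names and the threshold value are all hypothetical.

```python
# A tree is either a leaf ("On" or "Off") or a tuple:
# (attribute, operator, value, subtree_if_true, subtree_if_false)
# where operator is "<=" for numeric tests or "==" for nominal tests.
tree = (
    "Presence@Livingroom", "==", "Yes",
    ("NaturalLight@Livingroom", "<=", 200, "On", "Off"),
    "Off",
)


def decide(node, state):
    """Walk the tree with the current house state and return the device action."""
    while not isinstance(node, str):
        attribute, op, value, if_true, if_false = node
        holds = state[attribute] <= value if op == "<=" else state[attribute] == value
        node = if_true if holds else if_false
    return node


def to_rules(node, conditions=()):
    """Flatten the tree into (condition, action) pairs, one per leaf."""
    if isinstance(node, str):
        return [(" AND ".join(conditions) or "TRUE", node)]
    attribute, op, value, if_true, if_false = node
    negated = ">" if op == "<=" else "!="
    return (
        to_rules(if_true, conditions + (f"{attribute} {op} {value}",))
        + to_rules(if_false, conditions + (f"{attribute} {negated} {value}",))
    )


rules = to_rules(tree)
action = decide(tree, {"Presence@Livingroom": "Yes", "NaturalLight@Livingroom": 150})
```

Each path from the root to a leaf yields one rule, so the number of rules equals the number of leaves, and the rules remain readable by the resident.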

Results and Discussion
As the case study has shown, monitoring, collecting and analyzing the usage data coming from the devices connected to a home automation system confirmed that the combination of machine learning and IoT allows identifying usage patterns. The usage patterns encoded as rules are of utmost importance, as their application leads to an improved comfort experience in smart homes and buildings. It should be noted that, although Smart Home Control has an intelligent control module with the ability to make autonomous decisions based on the usage history, the user retains the final authority over the control of home automation devices. Because every new state change of a home device is recorded, the user keeps adding to the history of stored data. This information serves to train Smart Home Control continuously and adaptively, allowing the decision trees to be readjusted as user behavior changes.
In addition to the case study's observation period described above, a second observation was carried out for another ten months, from January to October 2020. The purpose of this second observation was to verify whether the Automatic Control module of the Smart Home Control system was capable of adjusting the automatic configuration rules for each device according to the changes in the residents' new usage patterns. It is important to mention that, although the Automatic Control was kept on during the second observation period, the ability of residents to control their devices manually was not limited. At the end of the second period, the following observations were established:
As verified, the automatic control module can readjust the home automation control rules according to the change in the residents' patterns of use. The readjustment is possible thanks to the continuous recording and analysis of usage data from the devices connected to Smart Home Control, which allows the system to follow an "always on training" approach. This feature is of great importance if the automatic control module is required to make timely decisions considering its available information.
Throughout the development of the case study, it was possible to confirm that:
• The discovery of behavior patterns of the residents of a home is relevant to improving home automation conditions.
• The construction of automatic configuration rules must be dynamic and must evolve as residents' behavior patterns change.
• The analysis of the residents' behavior patterns accounts for continued training of the system. The training frees users from getting involved in making control decisions of their devices. However, no matter how close the system behaves to the residents, there will always be factors or conditions that cause changes in the home automation device usage.
• The automation of homes based on residents' behavior patterns brings benefits and improvements in home automation control schemes. Some of the schemes that can be improved under this approach are comfort, energy-saving, safety and healthcare schemes.
However, despite the benefits of developing automatic control of home automation devices based on the history of each house, there is a great challenge that must be taken into account:
• The human being is unpredictable, and behavior patterns can change drastically according to personal considerations, mood, mental health and medication.
Keeping in mind the volatility with which human beings change their behavior, it would be interesting to include technologies related to the reading and interpretation of body language to improve automatic configuration schemes.
Finally, it is worth mentioning that, despite the satisfactory results obtained with the C4.5 algorithm, an extended analysis was performed. Taking advantage of the information collected from the experimental case study, the analysis aimed to determine the feasibility of using other decision trees (e.g., CART) or ensemble algorithms (e.g., Adaboost and Random Forest). The goal was to improve, in the short term, the classification precision of the attributes used to construct automatic control rules. Table 6 shows the comparison of the precision metrics resulting from the analysis of attributes through the use of C4.5, CART, Adaboost and Random Forest. As can be seen, the use of Random Forest represents an improvement in the classification of attributes of the experimental case study. Table 7 shows the percentage of cases in which each algorithm obtained a better score on the attributes of the case study. However, despite its better scores in performance metrics, implementing Random Forest in Smart Home Control poses challenging problems, which stem from the difficulty of interpreting the results obtained by the algorithm. These results contrast with those obtained by C4.5, whose iterative pruning process directly yields decision trees that are easily translated into control rules.

Conclusions
Humans have always been looking for more and better ways to facilitate daily life activities. It is evident that this desire also motivated the search to improve comfort conditions at home. Technologies such as machine learning and IoT make it possible to strengthen home automation systems with automatic control modules for daily use devices. It is important to mention that technological advances have allowed exponential growth in the number of IoT devices, which increases the ability to obtain information about the use that residents of a house give to their devices. Furthermore, it is possible to analyze the data coming from these IoT devices by using machine learning algorithms to identify residents' patterns of behavior to create automatic configuration schemes that adjust to the house residents' particular preferences.
This paper proposes Smart Home Control, a platform that analyzes the historical records of the use of home automation devices to detect smart home residents' behavior patterns through IoT and machine learning, which improves the comfort schemes of domotic systems. Smart Home Control uses the C4.5 algorithm to classify the data from the sensors and IoT devices and thus build, for each device, a decision tree for automatic configuration whose control adjusts to the user's preferences. Additionally, an experimental case study was executed to validate the platform's effectiveness. However, it should be noted that the results and effectiveness of Smart Home Control are affected by external factors such as social changes, natural disasters or even the mood of the inhabitants of the home. It is also important to emphasize that, after the implementation and results obtained using C4.5, it was possible to determine that the resulting models have a high level of interpretability. However, it was possible to identify a lack of explainability in the generated models as a limitation of this work. New evaluations are planned in the short term to establish strategies to deal with the black box problem generated by the low level of explainability.
Smart Home Control has high scalability potential. Therefore, future work will consider incorporating more automatic configuration profiles related to security, energy-saving and accident detection, among others. With the increase in configuration profiles available to users, it will be possible to select the rules best suited to their needs, based on their current context. To validate the effectiveness of the new Smart Home Control profiles, it is planned to observe at least ten houses that make use of the profiles during monitoring periods of 8-10 months. The Smart Home Control mobile application was developed exclusively for Android, so it is desirable to make it available for other platforms. Regarding the platform's effectiveness, it is important to record user satisfaction in terms of usability and improvement of their perception of comfort at home by applying user-centered evaluations based on the User-Centered Evaluation Framework for Computer Recommendation Systems. Finally, the implementation of more robust and complex classification algorithms, such as Random Forest, is considered for a possible improvement in the performance of the prediction process of the automatic control of home automation devices. In addition, depending on the amount of data collected over time, evolving the data analysis paradigm of Smart Home Control toward a Deep Learning and Big Data approach is feasible.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy concerns of the users involved in the study.