Applied Sciences
  • Article
  • Open Access

31 October 2022

Identification of Daily Living Recurrent Behavioral Patterns Using Genetic Algorithms for Elderly Care

1 Computer Science Department, Faculty of Automation and Computer Science, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
2 Department of Informatics, Technical University of Munich, Boltzmannstr. 3, 85748 Garching bei München, Germany
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Bio-Inspired Computing and Its Applications

Abstract

A person’s routine is a sequence of activities of daily living performed recurrently. Sticking to daily routines is a valuable tool for supporting the care of persons with dementia, and of older adults in general, who live in their own homes, and it is also useful for caregivers. As state-of-the-art tools based on self-reporting are subjective and rely on a person’s memory, new tools are needed for objectively detecting such routines from monitored data coming from wearables or smart home sensors. In this paper, we propose a solution for detecting the daily routines of a person by extracting the sequences of recurrent activities and their duration from the monitored data. A genetic algorithm is defined to extract activity patterns featuring small differences that relate to the day-to-day contextual variations that occur in a person’s daily routine. The quality of the solutions is evaluated with a probabilistic-based fitness function, while a tournament-based strategy is employed for the dynamic selection of mutation and crossover operators applied for generating the offspring. The time variability of activities of daily living is addressed using the dispersion of the duration values of each activity around its average value. The results show an accuracy above 80% in detecting the routines, while the optimal values of population size and number of generations for fitness function evolution and convergence are determined using multiple linear regression analysis.

1. Introduction

Sticking to daily routines is a valuable tool for supporting the care of older adults or persons living with dementia in their own homes, and it is useful for both the caregivers and the patients themselves. The daily routine is a sequence of recurrent activities performed by a person every day [1]. Routine means organization and discipline and can bring several benefits to in-person care, such as improved mental health through reduced anxiety and stress levels, and improved physical health and productivity [2,3]. In the case of dementia, it helps prevent faster cognitive decline by allowing subtle changes or deviations to be detected in time [4]. Having a rather chaotic lifestyle can negatively impact health in the long term [5]. For example, sleep deprivation can increase the probability of developing cardiovascular or nervous system diseases and diabetes over time, while irregular or late meals can affect the emotional or mental state, increase the likelihood of developing digestive system diseases, and decrease the defense capacity of the immune system.
In this context, it is useful to provide solutions that detect recurrent patterns of activities and objectively infer the daily routine of a person from Internet of Things (IoT)-monitored data. The large-scale adoption of wearable IoT devices eases data collection, but inferring personalized daily routines is not an easy task, as it is affected by many variables (e.g., time frame, weekday or weekend, chronic condition, etc.) and differentiation factors [6]. Nevertheless, monitoring a person’s daily routine using IoT wearable devices is a promising and insufficiently explored research field [7], with applicability in personalizing health and care services. Discovering a person’s daily life routine from sensor data, together with deviations from it, can help in assessing the health status of the person, enabling healthcare personnel to intervene proactively and avoid the person’s institutionalization [8]. For example, sleeping more than usual can be a symptom of depression, while frequent toilet visits can be associated with a urinary tract infection.
Most current research is focused on identifying the activities of daily living (ADL) to observe abnormal data [9,10,11,12]. Only a few studies look at the length and sequence of such activities to detect the person’s recurrent behavior patterns that are part of their daily routine, as well as relevant deviations from it. The use of IoT sensors can generate large volumes of data that require efficient algorithms capable of processing this data and the associated search space to identify recurrent behavioral patterns [13]. The collected data could be incomplete and inaccurate, and in this case, algorithms capable of handling data quality issues are required [13]. Metaheuristic algorithms could be a viable solution for such problems since they provide a near-optimal solution for problems with incomplete and inaccurate data, or when the computing power is limited and the execution time must be low [14]. They can provide approximate solutions with lower computational overhead than state-of-the-art solutions such as neural networks or exhaustive search algorithms, and better solutions than deterministic and rule-based algorithms [15].
In this paper, we propose the use of a genetic metaheuristic for the detection of recurrent activity sequences that form a person’s daily routine. An individual is encoded as a sequence of activities performed by a person in a day while its quality is evaluated using a fitness function that considers four types of probabilities, previously introduced by us in [16]: the probability of transition among activities, the probability that an activity is the first or last in a recurrent pattern of activities, and the pattern length probability. A method based on the average daily activity time variability is defined to enrich the inferred activity patterns with time-related information. For the selection process of the parent chromosomes, we have used a tournament-based strategy, and the population evolution is ensured by applying crossover and mutation operators. To avoid premature convergence a roulette wheel selection strategy is used for the dynamic selection of the operators to be used in generating the next generation of offspring.
The genetic heuristic achieves good results for constraint-based optimization problems that can be translated into searching for the best solutions in the search space, as in the case of the routine detection problem. The length of the daily activities has a certain degree of flexibility and is bounded by upper and lower values, while the transitions among activities can happen with different probabilities. Moreover, these constraints are rather personalized, as they depend on the wishes and needs of each person. Consequently, the space and datasets for searching the routine are large and hard to explore. The genetic heuristic is more suitable than deterministic algorithms, as it uses the history encoded in the chromosomes of the previous population to guide the search. Therefore, our genetic-based solution can identify more than one routine for a person, while most state-of-the-art deterministic approaches [16,17,18] can identify only one rigid routine. This solution is more realistic since a person can have several variations of recurrent daily activity patterns that differ slightly depending on contextual factors. For the duration of activities, we use an interval determined by the dispersion of the duration values of that activity around the average value, which is a more flexible approach than deterministic ones [16], which use only one value, namely the average time. Finally, the tournament-based strategy combined with roulette wheel selection for generating new populations of chromosomes guides the search process with a good balance between exploitation and exploration of the search space for routine detection. As reported in other works in the literature [19], this balance is important to identify regions of the search space that are close to good-quality daily routines and to dismiss the parts already explored or containing poor-quality solutions.
The novel contributions of this paper are the following:
  • A heuristic approach based on a genetic algorithm for identifying the recurrent activity patterns of a person and constructing the daily routine. The chromosomes are encoded using the person’s ADL and a probability-based fitness function is used to evaluate the quality of the population. Thus, it allows the extraction of several patterns of activities that differ slightly from each other capturing the day-to-day contextual variations that occur in a person’s daily routine.
  • A tournament-based strategy for dynamic selection of operators applied for generating the offspring, while the time variability of activities of daily living is addressed using the dispersion of the duration values of that activity around the average value.
  • A study of the impact of the population size and number of generations on the fitness function evolution and convergence, using multiple linear regression analysis.
The paper is structured as follows: Section 2 reviews the state of the art and presents the progress beyond it, Section 3 presents the genetic solution for detecting the recurrent activity patterns that form a person’s daily routine, Section 4 presents the experimental results, Section 5 discusses the impact of the adjustable parameters on the performance of the genetic algorithm, and Section 6 presents conclusions and possible further developments.

3. Materials and Methods

Genetic algorithms (GA) are stochastic search methods that mimic natural biological evolution [35]. They apply the survival of the fittest principle and operate on a population of individuals to produce better individuals in the next generation. The main steps of applying GA to solve a specific problem are to define a suitable representation of an individual and generate the initial population, define a fitness function for evaluating the individuals and select the best ones, and define an appropriate interpretation for crossover and mutation operators that will be used to generate new offspring and update the population (see Figure 1).
Figure 1. Genetic solution for routine detection.
In the next paragraphs, these steps are followed to describe how GA is used to detect the recurrent activity sequences carried out by humans to construct a person’s daily routine.
An individual (also named chromosome) is encoded as a sequence of ADL representing a potential frequent pattern, part of the daily routine of a person:
$$\mathit{indiv} = \{\, a_i : \exists\, T_{a_{i-1} a_i} \wedge a_i, a_{i-1} \in \mathit{HADL}(\mathit{person}),\ i > 0 \,\} \tag{1}$$
where $\mathit{HADL}(\mathit{person})$ represents the historical monitored data (i.e., monitored days) of a person, $a_i$ and $a_{i-1}$ are activities performed by the person, and $T_{a_{i-1} a_i}$ is a transition between two activities (i.e., $a_i$ follows $a_{i-1}$).
The generation of the first population of individuals has a great impact on the individuals’ evolution and eventually on the algorithm convergence. In our case, we have opted for the random generation of the initial population considering the historically monitored activities of the person on a daily basis:
$$\mathit{initialPopulation} = \{\, \mathit{RHADL} \mid \mathit{RHADL} \subset \mathit{HADL}(\mathit{person}) \,\} \tag{2}$$
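The random generation of the initial population can be sketched as follows. This is only an illustration: it assumes that the historical data is held as a list of days, each day being a list of activity labels, and that days are drawn with replacement. The names `initial_population` and `hadl`, as well as the sample days, are hypothetical and do not come from the dataset.

```python
import random

def initial_population(hadl, pop_size, rng=None):
    """Build the first generation by copying randomly chosen monitored days."""
    rng = rng or random.Random()
    # Assumption: days are drawn with replacement, so pop_size may exceed len(hadl).
    return [list(rng.choice(hadl)) for _ in range(pop_size)]

# Made-up monitored days, purely for illustration.
hadl = [
    ["sleep", "breakfast", "walking", "lunch", "spare time/TV", "dinner"],
    ["sleep", "personal hygiene", "breakfast", "reading", "lunch", "dinner"],
    ["sleep", "breakfast", "outside", "lunch", "spare time/TV"],
]
population = initial_population(hadl, pop_size=5, rng=random.Random(42))
```

Every individual produced this way is an activity sequence that was actually observed, so the search starts from realistic candidates.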
The approach is suitable because the daily routine we want to extract must contain some of the activity patterns already observed in the historical monitored data of a person ($\mathit{HADL}(\mathit{person})$). To assess the quality of an individual and to select the best ones from the population, we have defined a probability-based fitness function:
$$\mathit{Fitness}(\mathit{indiv}_i) = \left( PS(a_1) \cdot \left( \omega_1 \cdot PE(a_n) + \omega_2 \cdot PL(l_{\mathit{indiv}_i}) \right) \cdot \prod_{j=1}^{n} P(a_j \mid a_{j+1}) \right)^{\frac{1}{n+1}} \tag{3}$$
In (3), $l_{\mathit{indiv}_i}$ is the length of the individual $i$, $n$ is the number of activities in the sequence, $PS(a_1)$ is the start probability of the first activity in the sequence, $PE(a_n)$ is the end probability of the last activity in the sequence, $PL(l_{\mathit{indiv}_i})$ is the length probability computed for the individual $i$, and $P(a_j \mid a_{j+1})$ is the transition probability from activity $a_j$ to activity $a_{j+1}$. $\omega_1$ and $\omega_2$ are the weights associated with the end and length probabilities such that relation (4) is true. In our experiments, $\omega_1$ is set to 0.4 and $\omega_2$ to 0.6.
$$\omega_1 + \omega_2 = 1 \tag{4}$$
The four types of probabilities were defined by us in [16], where a Markov model-based solution was introduced for routine detection. In this paper, we have adapted and re-used their calculation method for the fitness function definition.
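A minimal sketch of how a fitness of this kind could be evaluated is given below. It assumes that the four probabilities are already available as dictionaries and that the fitness multiplies the start probability, the weighted end/length term, and the transition probabilities of consecutive activities before taking the root of order n + 1. The function name, the dictionary layout, and the numeric values are illustrative assumptions, and the product here runs over the n - 1 consecutive pairs of the sequence.

```python
def fitness(indiv, p_start, p_end, p_len, p_trans, w1=0.4, w2=0.6):
    """Probability-based fitness of one activity sequence (illustrative)."""
    n = len(indiv)
    if n == 0:
        return 0.0
    # Start probability of the first activity.
    score = p_start.get(indiv[0], 0.0)
    # Weighted combination of end probability and length probability.
    score *= w1 * p_end.get(indiv[-1], 0.0) + w2 * p_len.get(n, 0.0)
    # Transition probabilities between consecutive activities (assumption:
    # unseen transitions count as probability 0).
    for prev, nxt in zip(indiv, indiv[1:]):
        score *= p_trans.get((prev, nxt), 0.0)
    return score ** (1.0 / (n + 1))

# Made-up probabilities for a three-activity sequence.
value = fitness(
    ["sleep", "breakfast", "tv"],
    p_start={"sleep": 0.8},
    p_end={"tv": 0.5},
    p_len={3: 0.25},
    p_trans={("sleep", "breakfast"): 0.9, ("breakfast", "tv"): 0.5},
)
```

Taking the root keeps the fitness of long and short sequences on a comparable scale, since a plain product shrinks with every additional transition.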
The start and end probabilities of an activity $a_i$ (i.e., how likely it is to be the first or the last activity in the daily routine) are computed by counting the number of days in which $a_i$ appears at the beginning or at the end of the activity sequence, respectively, and dividing it by the number of days of monitored data available for the person:
$$PS(a_i) = \frac{\left|\{\, \mathit{day} : \mathit{day} \in \mathit{HADL}(\mathit{person}) \wedge \mathit{day} \text{ starts with } a_i \,\}\right|}{\left|\{\, \mathit{day} : \mathit{day} \in \mathit{HADL}(\mathit{person}) \,\}\right|} \tag{5}$$
$$PE(a_i) = \frac{\left|\{\, \mathit{day} : \mathit{day} \in \mathit{HADL}(\mathit{person}) \wedge \mathit{day} \text{ ends with } a_i \,\}\right|}{\left|\{\, \mathit{day} : \mathit{day} \in \mathit{HADL}(\mathit{person}) \,\}\right|} \tag{6}$$
The transition probability $P(a_i \mid a_j)$ is calculated by dividing the number of transitions from $a_i$ to $a_j$ by the number of all transitions between $a_i$ and all the other available activities different from $a_j$ [16]:
$$P(a_i \mid a_j) = \frac{\sum_{k=1}^{n} \left|\{\, T_{a_i a_j} : T_{a_i a_j} \in \mathit{day}_k \,\}\right|}{\sum_{k=1}^{n} \left|\{\, T_{a_i a_m} : T_{a_i a_m} \in \mathit{day}_k \,\}\right|} \tag{7}$$
where $n = |\mathit{HADL}(\mathit{person})|$ and $m \neq j$.
The length probability of a sequence of activities is defined as the probability of having a routine of a certain length and is calculated by counting the number of days that have the same length, divided by the total number of days present in the population of individuals:
$$PL(\mathit{sequence}) = \frac{\left|\{\, \mathit{day} : \mathit{day} \in \mathit{RHADL}(\mathit{person}) \wedge \mathit{length}(\mathit{sequence}) = \mathit{length}(\mathit{day}) \,\}\right|}{|\mathit{HADL}(\mathit{person})|} \tag{8}$$
where $\mathit{RHADL}(\mathit{person}) \subset \mathit{HADL}(\mathit{person})$.
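The four probabilities can be estimated with simple counting, as in the sketch below. It is an approximation under stated assumptions rather than the exact procedure of [16]: days are non-empty lists of activity labels, the transition probability is normalised by all transitions leaving an activity (a common Markov-style estimate that may differ from the denominator described above), and the length probability is computed over all monitored days. The function name and the sample data are hypothetical.

```python
from collections import Counter

def estimate_probabilities(hadl):
    """Estimate start, end, length and transition probabilities by counting."""
    n_days = len(hadl)
    starts = Counter(day[0] for day in hadl)       # first activity of each day
    ends = Counter(day[-1] for day in hadl)        # last activity of each day
    lengths = Counter(len(day) for day in hadl)    # number of activities per day
    pair_counts, out_counts = Counter(), Counter()
    for day in hadl:
        for prev, nxt in zip(day, day[1:]):
            pair_counts[(prev, nxt)] += 1
            out_counts[prev] += 1
    p_start = {a: c / n_days for a, c in starts.items()}
    p_end = {a: c / n_days for a, c in ends.items()}
    p_len = {length: c / n_days for length, c in lengths.items()}
    # Assumption: normalise by every transition that leaves activity `a`.
    p_trans = {(a, b): c / out_counts[a] for (a, b), c in pair_counts.items()}
    return p_start, p_end, p_len, p_trans

# Made-up days, purely for illustration.
days = [
    ["sleep", "breakfast", "tv"],
    ["sleep", "breakfast", "lunch"],
    ["breakfast", "tv"],
]
p_start, p_end, p_len, p_trans = estimate_probabilities(days)
```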
To determine the individuals in the current population that will be chosen to transmit their genetic material to the next generation, we have used a tournament-based approach. In the tournament selection, several individuals are randomly selected from the population. The best ones among them as ranked using the fitness function become parents of the next generation of individuals (also called offspring) who are created using some operators.
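Tournament selection can be sketched in a few lines. The tournament size `k` and the number of parents returned are illustrative parameters that are not specified in the text, and `fitness_fn` stands for any function that scores an individual.

```python
import random

def tournament_select(population, fitness_fn, k=3, n_parents=2, rng=None):
    """Pick k random individuals and return the n_parents fittest of them."""
    rng = rng or random.Random()
    contestants = rng.sample(population, k)          # k must not exceed len(population)
    contestants.sort(key=fitness_fn, reverse=True)   # best first
    return contestants[:n_parents]

# Toy example: longer sequences count as fitter.
toy_population = [["sleep"], ["sleep", "breakfast"], ["sleep", "breakfast", "tv"]]
parents = tournament_select(toy_population, fitness_fn=len, k=3, rng=random.Random(0))
```

A small `k` keeps selection pressure low and preserves diversity, whereas a `k` close to the population size makes the selection almost deterministic.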
To avoid premature convergence of the genetic algorithm, which occurs when the population reaches a suboptimal state in which the operators can no longer produce offspring with fitness values better than those of their parents, we dynamically apply several crossover and mutation operators, as suggested in [36]. The crossover operator applied in each generation is selected dynamically based on the rules presented below, which compute the progress rate of applying a specific crossover operator, $op_c$. If the offspring $D$ dominates the parents $P_1$ and $P_2$, then the progress rate of the crossover operator, $\mathit{progress}_{op_c}$, is set to 1. When the offspring $D$ does not dominate the parents, the progress rate of the crossover operator is set to 0.5. If the offspring $D$ dominates either the parent $P_1$ or the parent $P_2$ or no dominance relation exists with the other one, then:
$$\mathit{progress}_{op_c} = \max\left(1 - k_1 \cdot \frac{\mathit{currentGen}}{\mathit{maxGen}},\ 0.5\right) \tag{9}$$
Finally, if the offspring $D$ dominates at least one parent, then:
$$\mathit{progress}_{op_c} = \max\left(1 - k_2 \cdot \frac{\mathit{currentGen}}{\mathit{maxGen}},\ 0\right) \tag{10}$$
In relations (9) and (10), $\mathit{currentGen}$ is the current generation, $\mathit{maxGen}$ is the maximum number of generations, and $k_1$ and $k_2$ are two parameters that set the velocity of progress and are fine-tuned experimentally. In our experiments, $k_1$ and $k_2$ were set to 0.3.
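The generation-dependent part of the progress rate can be illustrated numerically. The sketch assumes that relations (9) and (10) have the form max(1 - k * currentGen / maxGen, floor), with the floor equal to 0.5 or 0; the function name `decayed_progress` is hypothetical.

```python
def decayed_progress(current_gen, max_gen, k, floor):
    """Progress rate that decays linearly with the generation index."""
    return max(1.0 - k * current_gen / max_gen, floor)

# Halfway through a 100-generation run with k = 0.3.
mid_run = decayed_progress(current_gen=50, max_gen=100, k=0.3, floor=0.5)
# Last generation with k = 0.3: the decayed term is still above both floors.
last_gen = decayed_progress(current_gen=100, max_gen=100, k=0.3, floor=0.0)
# A larger, purely hypothetical k makes the floor take over.
floored = decayed_progress(current_gen=100, max_gen=100, k=0.8, floor=0.5)
```

Under this reading, a value of 0.3 for k keeps the decayed term at or above 0.7 for the whole run, so the floors only matter for velocities larger than 0.5.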
The scheme for dynamic selection of crossover and mutation operators makes it possible to balance the exploration and exploitation of the search space, avoiding local optima and premature convergence. The crossover represents a search within a region close to a potential solution, while the mutation leads to a solution outside that region. At the same time, the dynamic selection scheme allows a variety of operators to be considered to guide the search and generate the next population. As there is a variety of operators in the literature, the main criteria used in choosing the operators were the encoding type used in the genetic algorithm for routine detection and the results reported in the literature [37].
We have considered three types of crossover operators, namely one-point crossover, two-point crossover, and maximal preservation crossover. The crossover operators require two parents to generate the offspring, one being the donor and the other the receiver.
In one-point crossover, a crossover point is randomly generated based on the length of the shortest parent:
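$$\mathit{Point}_{\mathit{crossover}} = \mathit{Random}\left(\mathit{Min}\left(\mathit{Length}(\mathit{indiv}_{\mathit{donor}}),\ \mathit{Length}(\mathit{indiv}_{\mathit{receiver}})\right)\right) \tag{11}$$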
Both parents are then split at the location defined by the crossover point. The first offspring will get the first part of the first parent and the second part of the second parent, while the second offspring will get the first part of the second parent and the second part of the first parent (see Figure 2).
Figure 2. One-point crossover on two individuals representing daily activities sequence of a person.
In two-point crossover, two different crossover points are generated that should have a smaller value than the length of the shortest parent. The two points are used to split both parents into three parts. The two offspring chromosomes will have the first and last subsequence of the same parent, but the middle sequence will be from the other parent (see Figure 3).
Figure 3. Two-point crossover on two individuals representing daily activities sequence of a person.
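The one-point and two-point crossover operators can be sketched as below. The code assumes that individuals are Python lists of activity labels, that cut points fall strictly inside the shortest parent, and that the shortest parent has at least two genes (one-point) or three genes (two-point). Function names and sample sequences are illustrative.

```python
import random

def one_point_crossover(p1, p2, rng=None):
    """Swap the tails of two parents at one random cut point."""
    rng = rng or random.Random()
    point = rng.randint(1, min(len(p1), len(p2)) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def two_point_crossover(p1, p2, rng=None):
    """Swap the middle segments of two parents between two random cut points."""
    rng = rng or random.Random()
    shortest = min(len(p1), len(p2))
    a, b = sorted(rng.sample(range(1, shortest), 2))  # two distinct cut points
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

day_a = ["sleep", "breakfast", "walking", "lunch", "tv"]
day_b = ["sleep", "hygiene", "reading", "lunch", "dinner", "tv"]
c1, c2 = one_point_crossover(day_a, day_b, rng=random.Random(3))
d1, d2 = two_point_crossover(day_a, day_b, rng=random.Random(3))
```

Both operators only redistribute genes between the two children, so no activity is created or lost across the pair.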
The maximal preservation crossover [23,24] is a method that produces only one offspring. Figure 4 and Figure 5 show how we apply the maximal preservation crossover for the case in which the donor and receiver parents have the same genes and for the case in which they do not. The crossover point is randomly generated within the donor’s length. The donor sequence is split, and the first part is copied into the offspring, while the second part is reordered based on the receiver parent. In the case in which the donor and receiver parents have the same genes (see Figure 4), the genes of the second part of the donor will be copied into the offspring in the order in which they appear in the receiver parent.
Figure 4. Maximal preservation crossover operator usage when the two parents have the same genes (blue: genes from the donor; red: genes from the receiver; orange: genes from the donor reordered as in the receiver).
In the case in which the donor and receiver parents do not have the same genes (see Figure 5), the remaining genes from the receiver that are found in the donor will be arranged in the order in which they appear in the receiver parent, while the genes that are present only in the receiver (e.g., sport) are also kept, and then they will be copied into the offspring. In this case, if a gene from the donor is not present in the receiver, it will not be present in the offspring.
Figure 5. Maximal preservation crossover operator when the two parents do not have the same genes (blue: genes from the donor; red: genes from the receiver; orange: genes from the donor reordered as in the receiver, together with genes from the receiver that are not in the donor).
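One possible reading of the maximal preservation crossover described above is sketched below. It is an interpretation, not the authors’ implementation: the head of the donor is copied as is, and the tail is rebuilt by walking through the receiver and keeping the genes that either belong to the donor’s second part or do not occur in the donor at all. The function name, the `cut` parameter, and the sample sequences are illustrative.

```python
import random
from collections import Counter

def maximal_preservation_crossover(donor, receiver, cut=None, rng=None):
    """Produce one offspring: donor head plus a tail ordered as in the receiver."""
    rng = rng or random.Random()
    if cut is None:
        cut = rng.randint(1, len(donor) - 1)   # cut point inside the donor
    head, tail = donor[:cut], donor[cut:]
    remaining = Counter(tail)                  # genes of the donor's second part
    donor_genes = set(donor)
    reordered = []
    for gene in receiver:                      # follow the receiver's order
        if remaining[gene] > 0:
            reordered.append(gene)             # donor tail gene, receiver order
            remaining[gene] -= 1
        elif gene not in donor_genes:
            reordered.append(gene)             # gene present only in the receiver
    return head + reordered

donor = ["sleep", "breakfast", "work", "lunch", "tv"]
same_genes = maximal_preservation_crossover(
    donor, ["sleep", "lunch", "work", "breakfast", "tv"], cut=2)
different_genes = maximal_preservation_crossover(
    donor, ["sleep", "sport", "lunch", "tv"], cut=2)
```

In the second call, "work" is dropped because it does not occur in the receiver, while "sport" is kept because it occurs only in the receiver.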
In the case of mutation, we have defined a parameter that specifies for each chromosome if it will be mutated or not. A random number is generated and compared with this parameter to decide. Two types of mutation operators are considered, namely, mutation with random selection and mutation with weighted selection.
In the case of mutation with random selection, we have generated two numbers, one in the range $[1, \mathit{Length}(\mathit{chromosome})]$ and another one in the range $[1, \mathit{Size}(\mathit{ActivityPool})]$. The first number corresponds to the chromosome gene that will be replaced, and the second number corresponds to the activity from the activity pool that will replace the gene. Figure 6 presents an example of applying a mutation operator with random selection. The two arrows point to the randomly selected activities. As shown, the gene inside the chromosome is replaced by a gene from the activity pool.
Figure 6. Example of applying mutation with random selection.
The mutation with weighted selection is similar, but the selection of the new activity to replace the gene is controlled. Once we have selected a gene to be replaced (see Figure 7), we look at the transition probabilities of the gene before it. As shown in the example in Figure 7, the activity (i.e., gene) selected to be replaced is “Sleep”, which is preceded by the activity “Work”. To identify the activity that will replace the “Sleep” activity, we use the values of the transition probabilities of the “Work” activity. We represent the distribution of the “Work” activity transition probabilities in a pie chart where the width of the sectors is proportional to the transition probabilities from the “Work” activity to the other activities. Based on this pie chart, the activity that will replace the “Sleep” activity is selected similarly to the roulette wheel selection method. In our example, the activities that are more likely to follow the “Work” activity (that is, those for which the transition probabilities have higher values) have a higher chance of replacing the “Sleep” activity. The improvement this method brings over the previous one is that mutated chromosomes are more likely to have a higher fitness than the original chromosomes.
Figure 7. Example of applying mutation with weighted selection.
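Both mutation operators can be sketched as follows. The code assumes that transition probabilities are stored in a dictionary keyed by (previous, next) activity pairs, that the weighted variant only mutates genes that have a predecessor, and that it falls back to a uniform choice when no transition from the predecessor is known. All names and values are illustrative.

```python
import random

def mutate_random(chromosome, activity_pool, rng=None):
    """Replace one randomly chosen gene with a random activity from the pool."""
    rng = rng or random.Random()
    mutated = list(chromosome)
    mutated[rng.randrange(len(mutated))] = rng.choice(activity_pool)
    return mutated

def mutate_weighted(chromosome, p_trans, activity_pool, rng=None):
    """Replace one gene using the transition probabilities of its predecessor."""
    rng = rng or random.Random()
    mutated = list(chromosome)
    pos = rng.randrange(1, len(mutated))       # position 0 has no predecessor
    prev = mutated[pos - 1]
    weights = [p_trans.get((prev, a), 0.0) for a in activity_pool]
    if sum(weights) == 0:
        # Assumption: uniform fallback when the predecessor has no known transition.
        mutated[pos] = rng.choice(activity_pool)
    else:
        # Roulette wheel: sectors proportional to the transition probabilities.
        mutated[pos] = rng.choices(activity_pool, weights=weights, k=1)[0]
    return mutated

pool = ["tv", "sleep", "walking"]
weighted = mutate_weighted(["work", "sleep"], {("work", "tv"): 1.0}, pool,
                           rng=random.Random(0))
randomly = mutate_random(["work", "sleep", "tv"], pool, rng=random.Random(0))
```

In the weighted call, "tv" is the only activity with a non-zero transition probability from "work", so it always replaces "sleep".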
The progress rate specific to the application of each mutation operator $op_m$ is calculated based on the following rules. If the offspring resulting from applying the mutation operator $op_m$ to an individual dominates that individual, then the progress rate of the mutation operator, $\mathit{progress}_{op_m}$, is set to 1. When the individual dominates the offspring resulting from applying the mutation operator $op_m$ to the individual, or the offspring is not valid, the progress rate of the mutation operator is set to 0.5. If no dominance relation exists between the individual and the offspring resulting from applying the mutation operator $op_m$ to the individual, then:
$$\mathit{progress}_{op_m} = \max\left(1 - k_3 \cdot \frac{\mathit{currentGen}}{\mathit{maxGen}},\ 0.5\right) \tag{12}$$
where $\mathit{currentGen}$ is the current generation, $\mathit{maxGen}$ is the maximum number of generations, and $k_3$ is the velocity of the progress, which is set to 0.2 in our experiments.
The average progress rates of applying each crossover and mutation operator are computed from the first generation up to the current generation based on the rules described. In the first generation, each operator is assigned the same selection probability, namely 1/n, where n represents the number of operators considered in the selection process. The average progress rate of a specific operator is defined as the ratio between the sum of its progress values from the first generation up to the current generation and the number of times the operator has been applied. The result is used to compute the crossover or mutation operator selection probabilities.
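The bookkeeping behind the dynamic operator selection can be sketched as below. The text does not state how average progress rates are turned into selection probabilities, so the sketch assumes that they are simply normalised to sum to 1; the function names, operator names, and numbers are hypothetical.

```python
import random

def selection_probabilities(total_progress, times_applied):
    """Turn accumulated progress per operator into selection probabilities."""
    names = list(total_progress)
    uniform = {name: 1.0 / len(names) for name in names}
    if not any(times_applied.values()):
        return uniform                      # first generation: 1/n for each operator
    average = {
        name: total_progress[name] / times_applied[name] if times_applied[name] else 0.0
        for name in names
    }
    total = sum(average.values())
    if total == 0:
        return uniform
    # Assumption: probabilities are proportional to the average progress rate.
    return {name: average[name] / total for name in names}

def roulette_pick(probabilities, rng=None):
    """Roulette wheel selection of one operator name."""
    rng = rng or random.Random()
    names = list(probabilities)
    return rng.choices(names, weights=[probabilities[n] for n in names], k=1)[0]

probs = selection_probabilities(
    {"one_point": 3.0, "two_point": 1.0, "max_preservation": 0.0},
    {"one_point": 4, "two_point": 2, "max_preservation": 1},
)
chosen = roulette_pick(probs, rng=random.Random(0))
```

Because the choice is probabilistic, operators with a lower average progress rate still get applied occasionally, which helps to preserve diversity.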
Algorithm 1 shows the pseudocode for routine detection using the genetic algorithm with dynamic operators.
Algorithm 1: GA for Routine Detection Using Dynamic Operators.
Inputs: dataset (the historical dataset), maxGen (the number of generations)
Output: routine (activity sequence representing the routine of a person)
Comments: crossovers is the set of crossover operators; mutations is the set of mutation operators; best_routines holds the best routines identified during each generation; Opc is a crossover operator; Opm is a mutation operator
Begin
  Generate initialPopulation
  while maxGen is not exceeded do
    Select individual1 and individual2 using the tournament selection
    foreach Opc in crossovers do
      offspring = Opc(individual1, individual2)
      Compute the average progress and update the selection probability of Opc
    end foreach
    Use the roulette wheel selection to choose the most appropriate Opc
    offspring = Opc(individual1, individual2)
    Include the resulting offspring in the population
    foreach individual in the population do
      foreach Opm in mutations do
        offspring = Opm(individual)
        Compute the average progress and update the selection probability of Opm
      end foreach
      Use the roulette wheel selection to choose the most appropriate Opm
      offspring = Opm(individual)
      Include the resulting offspring in the new population
    end foreach
    Keep several of the best individuals in the new generation
    Identify and store the current best individual in best_routines
  end while
  return the best routine out of best_routines
End
The algorithm takes as inputs the dataset collected from sensors containing the activities performed by a person on each of the monitored days and the maximum number of generations and returns a person’s daily living routine. The algorithm will test all the crossover and mutation operators on every generation, and we will always advance with the generation of the operator with the highest progress rate.
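The overall loop can be sketched in a simplified form. This is not a line-by-line transcription of Algorithm 1: the roulette wheel over progress rates is replaced here by a greedy pick of the fittest offspring among all operators, survivors are chosen by simple elitism, and the toy fitness, operators, and activity pool are made up for the example.

```python
import random

def run_ga(initial_population, fitness_fn, crossovers, mutations,
           max_gen=30, tournament_size=3, rng=None):
    """Simplified generational loop with tournament selection and elitism."""
    rng = rng or random.Random()
    size = len(initial_population)          # assumed to be at least 2
    population = [list(ind) for ind in initial_population]
    best_routines = []
    for _ in range(max_gen):
        # Tournament selection: the two fittest of a random subset become parents.
        contestants = rng.sample(population, min(tournament_size, size))
        contestants.sort(key=fitness_fn, reverse=True)
        parent1, parent2 = contestants[0], contestants[1]
        # Simplification: try every crossover operator and keep the fittest child.
        children = [op(parent1, parent2, rng) for op in crossovers]
        offspring = [max(children, key=fitness_fn)]
        # Simplification: try every mutation operator on each individual.
        for individual in population:
            mutants = [op(individual, rng) for op in mutations]
            offspring.append(max(mutants, key=fitness_fn))
        # Elitism: keep the fittest `size` individuals among parents and offspring.
        population = sorted(population + offspring, key=fitness_fn, reverse=True)[:size]
        best_routines.append(population[0])
    return max(best_routines, key=fitness_fn)

# Toy problem: recover a fixed four-activity sequence.
target = ["sleep", "breakfast", "lunch", "dinner"]
pool = ["sleep", "breakfast", "lunch", "dinner", "reading"]

def toy_fitness(ind):
    return sum(a == b for a, b in zip(ind, target)) / len(target)

def toy_crossover(p1, p2, rng):
    point = rng.randint(1, min(len(p1), len(p2)) - 1)
    return p1[:point] + p2[point:]

def toy_mutation(ind, rng):
    mutated = list(ind)
    mutated[rng.randrange(len(mutated))] = rng.choice(pool)
    return mutated

rng = random.Random(1)
start = [[rng.choice(pool) for _ in range(4)] for _ in range(6)]
best = run_ga(start, toy_fitness, [toy_crossover], [toy_mutation], rng=rng)
```

Because parents compete with their offspring for survival, the best fitness in the population never decreases from one generation to the next in this sketch.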
To incorporate time-related information (e.g., duration of the activity) into the detected patterns of activities, we proposed an approach that measures the dispersion of the duration values of an activity around the average value for each period of the day, namely for morning (between 6:01 a.m. and 12:00 p.m.), for afternoon (between 12:01 p.m. and 6:00 p.m.), for evening (between 6:01 p.m. and 12:00 a.m.), and for night (between 12:01 a.m. and 6:00 a.m.). To compute the lower and upper bounds of the interval corresponding to the duration of the activity $a_i$, the standard deviation is determined as:
$$\mathit{std}(a_i) = \sqrt{\frac{\sum_{j=1}^{n} \left( \mathit{duration}(a_i)_j - \mathit{mean}(a_i) \right)^2}{n-1}} \tag{13}$$
where $n$ is the number of days from the dataset in which the activity $a_i$ is performed, $\mathit{duration}(a_i)_j$ is the duration of the activity $a_i$ in day $j$, and $\mathit{mean}(a_i)$ is the average value of the durations of the activity $a_i$ in the $n$ days. Based on the standard deviation, the lower and upper bounds of the duration interval for the activity $a_i$ are computed as follows:
$$\mathit{duration}(a_i) = \left[\, \mathit{mean}(a_i) - 3 \cdot \mathit{std}(a_i),\ \mathit{mean}(a_i) + 3 \cdot \mathit{std}(a_i) \,\right] \tag{14}$$
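The duration interval can be computed with the Python standard library, as in the sketch below. The durations are made-up values in minutes for a single activity in a single period of the day, and the grouping of the data by period is not shown.

```python
import statistics

def duration_interval(durations):
    """Return (lower, upper) bounds as mean -/+ 3 sample standard deviations."""
    mean = statistics.mean(durations)
    std = statistics.stdev(durations)   # sample standard deviation, divides by n - 1
    return mean - 3 * std, mean + 3 * std

# Hypothetical breakfast durations (minutes) observed on five mornings.
lower, upper = duration_interval([20, 25, 30, 25, 30])
```

For these values the mean is 26 min and the sample standard deviation is about 4.18 min, which gives an interval of roughly 13.5 to 38.5 min.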

4. Evaluation Results

In our experiments, we have used the dataset from [16]. It contains three months of data on the ADLs performed daily by 10 older adults (ids 1 to 10) in their own homes. The data were acquired using a monitoring infrastructure based on wearable sensors and Beacons technology. Each ADL features a start and end time and one of the following activity labels: sleeping, eating (i.e., breakfast, lunch, snack, dinner), personal hygiene, reading, spare time/TV, walking, and outside. Not all activities are registered for each person: some participants do not perform specific activities such as going outside, and walking is highly dependent on their health state. The monitored older adults, who volunteered to participate, are aged between 70 and 85 years. All of them suffer from cardiovascular diseases. They live alone but have the support of their families. The ten apartments in which they live have similar floor plans and are equipped with the same number and types of devices.
Unlike classification-based solutions, where the number of persons in the dataset may influence the quality of the results, our genetic solution is not affected by it. For each person, a different search space is created based on the daily activities’ length intervals and the activity transition probabilities. These are personal and driven by individual conditions, wishes, and needs. They are encoded into the chromosomes and used in the reproduction phases to guide and balance the exploration and exploitation of the search space. No search information is transferred or reused across the search spaces of different persons.
To analyze the characteristics of the dataset used in experiments, we conducted an exploratory analysis using statistical graphs. Figure 8 shows the distribution of the average durations of each ADL for each person in the dataset, while Figure 9 illustrates the frequency of appearance of each ADL.
Figure 8. The average duration of each ADL per person. Different colors are used for different persons: blue for id = 1, orange for id = 2, gray for id = 3, yellow for id = 4, lilac for id = 5, green for id = 6, red for id = 7, pink for id = 8, fuchsia for id = 9, brown for id = 10.
Figure 9. The frequency of occurrence of each ADL per person. Different colors are used for different persons: blue for id = 1, orange for id = 2, gray for id = 3, yellow for id = 4, lilac for id = 5, green for id = 6, red for id = 7, pink for id = 8, fuchsia for id = 9, brown for id = 10.
The sleep activity has the longest average duration for all the persons in the dataset and the frequency of occurrence of breakfast, lunch, and dinner activities has a high degree of similarity for all the persons in the dataset. Higher differences in the frequency of occurrence are observed in the case of sleep, personal hygiene, or spare time/TV and reading activities. Moreover, there are persons with missing activities such as snacks, walking, reading, or outside. This is due either to the fact that the sensor did not record that activity, or because the person did not do that activity.
The duration of activities, their frequency of appearance as well as the potential transitions among the activities influence the dimension of the search space:
$$\sum_{ADL_i,\, ADL_j \in ADLs} C^{2}_{ADL_i ADL_j} \cdot \sum_{ADL \in ADLs} \left( \mathit{freq}_{ADL} \cdot C^{\mathit{duration}}_{ADL} \right) \tag{15}$$
where $ADLs$ represents the set of activity labels from the dataset, $C^{2}_{ADL_i ADL_j}$ the combinations of transitions between two activities in the dataset, $\mathit{freq}_{ADL}$ is the frequency of appearance of an activity, and $C^{\mathit{duration}}_{ADL}$ all combinations of activities with their durations.
Figure 10 also shows the distribution of sleep activity duration for a person. As can be seen, there is a variety of durations for each activity, which makes the search space for daily routines large and difficult to process in a reasonable time with deterministic algorithms, thus requiring the use of heuristics.
Figure 10. Distribution of sleep activity duration over the days.
To assess the performance of our genetic algorithm for behavioral patterns and routines discovery, we have computed the accuracy of the generated solutions for all persons monitored in our dataset. Table 1 illustrates the computed accuracies for each person which vary between 80% and 86% with an average accuracy of 82%.
Table 1. Accuracy of routine detected by our genetic algorithm.
Figure 11 illustrates the best-ranked routines generated by the genetic algorithm for the person with id 6 showing the frequent activities sequences and their duration.
Figure 11. The best-ranked routines considering the fitness value for the person with id 6.
The duration interval corresponding to each activity in each period of the day (i.e., morning, afternoon, evening, and night) for the first routine is presented in Table 2.
Table 2. Duration intervals for the activities that are part of the first routine returned by the genetic algorithm for the patient with id 6.
By analyzing the best routine returned by the genetic algorithm, we can observe that the patient sleeps between 11 p.m. and 7 a.m., with an average sleep duration of 8 h; has breakfast between 8 and 10 a.m., with an average duration of 26 min; and goes outside twice per day, once in the morning, with an average duration of 62 min, and once in part of the afternoon and evening, with an average duration of 47 min.
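As an illustration of how such a duration interval could be derived, the following minimal Python sketch takes the interval as the average duration plus or minus one sample standard deviation. The helper name duration_interval, the one-standard-deviation width, and the listed breakfast durations are assumptions made for this example (the values are invented so that their mean is 26 min); they are not taken from the dataset or from the authors' implementation.

```python
from statistics import mean, stdev


def duration_interval(durations_min):
    """Return (low, high) as mean -/+ one sample standard deviation, in minutes."""
    avg = mean(durations_min)
    # With a single observation there is no dispersion to measure.
    spread = stdev(durations_min) if len(durations_min) > 1 else 0.0
    # A duration cannot be negative, so the lower bound is clamped at zero.
    return max(0.0, avg - spread), avg + spread


# Hypothetical breakfast durations (minutes) over five monitored days.
breakfast = [22, 30, 26, 24, 28]
low, high = duration_interval(breakfast)
print(f"breakfast duration interval: {low:.1f} to {high:.1f} min")
```

A wider or narrower interval can be obtained by scaling the standard deviation, which trades tolerance to day-to-day variation against the specificity of the detected routine.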
To compare our results with previous solutions from the reviewed literature, we tried to identify similar approaches that address routine detection using only bio-inspired heuristics. The closest we could find was the one proposed by Quaid et al. [27]; thus, we compared our results with the ones they reported. They proposed a reweighted genetic heuristic and classification algorithm for human behavior recognition from accelerometer signals. Table 3 presents the accuracy reported by the authors compared with the accuracy of our algorithm, which achieves slightly better results.
Table 3. Routine detection comparative results.

5. Discussion

To identify the impact of the control parameters on the performance of the genetic algorithm and to fine-tune their values, we have performed a sensitivity analysis. We have determined how the target variables reflecting the performance of our proposed solution (i.e., fitness value and execution time) are affected by changes in input parameters such as the population size (popSize) and the number of generations (noGenerations). The objective was to fine-tune them so that the genetic algorithm determines the person's daily routine with the greatest accuracy. The input parameters are varied in the range [20, 120] for the number of generations and [20, 70] for the population size.
Figure 12 and Figure 13 show the evolution of the fitness value and execution time when the proposed genetic algorithm is used to detect the routine of a person.
Figure 12. The average fitness variation when the number of generations is changed and the population size is kept constant.
Figure 13. The execution time variation when the number of generations is changed and the population size is kept constant.
First, we maintained the population size at a constant value and varied the number of generations between 20 and 120. Then, we repeated this experiment for a population size varying between 20 and 70. For each considered configuration, 30 runs were made to compute the average values for fitness and execution time. As shown in Figure 12, increasing the population size improves the fitness value but also increases the execution time (see Figure 13). The trend is similar when the population size is kept constant and the number of generations is increased (i.e., both the fitness value and the execution time increase). The fitness value eventually stabilizes and remains unchanged for many generations, whereas the execution time grows proportionally to the number of generations.
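The experimental protocol described above can be sketched as a simple parameter sweep. In the Python sketch below, run_ga is a toy stand-in rather than the genetic algorithm of this paper, and the step sizes used inside the ranges [20, 70] and [20, 120] are assumptions, since only the range limits and the 30 runs per configuration are stated.

```python
import random
import time
from statistics import mean


def run_ga(pop_size, no_generations, rng):
    """Toy stand-in for the genetic algorithm: returns a fitness in [0, 1]."""
    best = 0.0
    for _ in range(no_generations):
        # Each generation keeps the best of pop_size random candidates.
        best = max(best, max(rng.random() for _ in range(pop_size)))
    return best


def sweep(pop_sizes, generation_counts, runs=30, seed=0):
    """Average fitness and execution time for every (popSize, noGenerations) pair."""
    rng = random.Random(seed)
    results = {}
    for pop_size in pop_sizes:
        for no_generations in generation_counts:
            fitness, elapsed = [], []
            for _ in range(runs):
                start = time.perf_counter()
                fitness.append(run_ga(pop_size, no_generations, rng))
                elapsed.append(time.perf_counter() - start)
            results[(pop_size, no_generations)] = (mean(fitness), mean(elapsed))
    return results


# Assumed grid: three values per parameter inside the stated ranges.
results = sweep(pop_sizes=range(20, 71, 25), generation_counts=range(20, 121, 50))
for (pop_size, no_generations), (avg_fitness, avg_time) in sorted(results.items()):
    print(pop_size, no_generations, round(avg_fitness, 4), f"{avg_time:.5f} s")
```

Averaging over repeated runs matters here because a genetic algorithm is stochastic, so a single run per configuration would not give a stable estimate of either target variable.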
As we aim to identify a routine that reflects as faithfully as possible the frequent patterns of activities that the person performs during the day, we focus on solutions with a fitness value as high as possible, even if this involves a higher execution time. Table 4 presents the configurations that have provided the best results in terms of fitness values for routine detection. The best average fitness value is obtained for a population size of 70 and 110 generations, with a minimal impact on the execution time, which is kept below 1.5 s.
Table 4. The fitness function value sensitivity to the variation of the input parameters.
We have performed a multiple regression analysis to study how the size of the population and the number of generations affect the value of the fitness in the GA approach. We investigated various regression coefficients, the ANOVA table, and the regression analysis table. The correlation coefficient, multiple R, takes values in the range [0, 1] and indicates the strength of the linear relation between the independent and dependent variables. In the case of our algorithm, the multiple R value is 0.92, showing a strong positive relation between the fitness value (dependent variable) and the population size and the number of generations (independent variables). The coefficient of determination, R square, measures how much of the variation in the fitness value can be explained by the variation in population size and number of generations. In our case, the R square value of 0.85 indicates that 85% of the variation in fitness value can be explained by the change in the population size and number of generations. The standard error measures the precision of our regression model: the lower the value, the more precise the predictions. In our case, we obtained a standard error of 0.024; thus, the regression model produces precise predictions. Finally, the number of observations represents the number of parameter configurations considered, which in our case are the combinations of the number of generations and the size of the population.
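The statistics discussed above can be reproduced for any set of configurations with an ordinary least squares fit. The following Python sketch, which assumes NumPy is available, computes multiple R, R square, and the standard error. The data are synthetic: the grid of configurations, the intercept of 0.5, and the noise level are invented for the example, so the printed numbers are not the values reported in this paper.

```python
import numpy as np


def regression_summary(X, y):
    """Ordinary least squares with an intercept; returns the usual summary values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_square = 1.0 - ss_res / ss_tot
    return {
        "coefficients": coef,  # [intercept, popSize, noGenerations]
        "multiple_r": r_square ** 0.5,
        "r_square": r_square,
        "standard_error": (ss_res / (n - k - 1)) ** 0.5,
        "observations": n,
    }


# Synthetic observations: one row per (popSize, noGenerations) configuration.
rng = np.random.default_rng(0)
configs = [(p, g) for p in range(20, 71, 10) for g in range(20, 121, 20)]
fitness = [0.5 + 0.003 * p + 0.0008 * g + rng.normal(0.0, 0.02) for p, g in configs]

summary = regression_summary(configs, fitness)
for name, value in summary.items():
    print(name, value)
```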
The analysis of variance (ANOVA) table (see Table 5) provides information about the variability of our regression model, reflected by several components. At the same time, Figure 14 shows the spread and distribution of the results obtained on how the control variables, population size and number of generations, influence the value of the fitness function. The residuals show the difference between the observed values and the values predicted by the model (i.e., positive when the observed value is greater and negative when it is smaller). The degrees of freedom, df, are associated with the sources of variance, while the sum of squares, SS, provides information about data dispersion and how well the data fit the regression model. In our case, the residual SS is lower than the total SS, indicating that the model fits the data well. The mean squares parameter, MS, provides an estimate of the variance and is calculated as the ratio between the sum of squares and the degrees of freedom. It is used to determine the F value, which provides information about the significance of the model in relation to the null hypothesis. Significance F shows whether our model with the two independent variables (that is, the population size and the number of generations) can be used to explain the variability of the fitness value. Since in our case the value of Significance F is lower than 0.05, the model is statistically significant.
Table 5. ANOVA table.
Figure 14. Boxplot for data distribution of ANOVA analysis.
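To make the relation between the ANOVA components concrete, the following Python sketch (assuming NumPy) derives df, SS, MS, and F for a regression with two independent variables. The input data are synthetic and invented for the example, so the output does not reproduce Table 5.

```python
import numpy as np


def anova_table(X, y):
    """Regression ANOVA components for an ordinary least squares fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_regression = float(((fitted - y.mean()) ** 2).sum())
    ss_residual = float(((y - fitted) ** 2).sum())
    ss_total = float(((y - y.mean()) ** 2).sum())
    df_regression, df_residual = k, n - k - 1
    # Mean squares are sums of squares divided by their degrees of freedom.
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "df": (df_regression, df_residual, n - 1),
        "SS": (ss_regression, ss_residual, ss_total),
        "MS": (ms_regression, ms_residual),
        "F": ms_regression / ms_residual,
    }


# Synthetic observations (invented intercept, slopes, and noise level).
rng = np.random.default_rng(1)
configs = [(p, g) for p in range(20, 71, 10) for g in range(20, 121, 20)]
fitness = [0.5 + 0.003 * p + 0.0008 * g + rng.normal(0.0, 0.02) for p, g in configs]

table = anova_table(configs, fitness)
print(table)
```

Significance F is the upper-tail probability of the F distribution at the computed F value with (df regression, df residual) degrees of freedom; if SciPy is installed, it can be obtained with scipy.stats.f.sf(F, df_regression, df_residual).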
The regression analysis from Table 6 provides more in-depth information on the influence of the size of the population and the number of generations. The coefficients are the least squares estimates for the independent variables (i.e., the size of the population and the number of generations), the standard errors are the standard errors of these least squares estimates, and the p-value is the value used for hypothesis testing.
Table 6. Regression analysis table.
The coefficient values reflect the mathematical relation between an independent variable (either the population size or the number of generations) and the dependent variable, while the p-value indicates whether the relation between an independent variable and the dependent variable is statistically significant (i.e., whether there is a correlation between them). In the case of our algorithm, the coefficient for the population size is 0.003, while the coefficient for the number of generations is 0.0008. These values mean that, with the other variable held constant, an increase of one unit in the population size increases the value of the fitness function on average by 0.003, and an increase of one unit in the number of generations increases it on average by 0.0008. Since these values are small, we conclude that the size of the population and the number of generations have quite a small influence on the variability of the value of the fitness. However, of these two variables, the size of the population has a greater influence on fitness than the number of generations. Both the number of generations and the population size are statistically significant and influence the fitness variability, since the p-values are lower than 0.05.
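As a worked example of this interpretation, the short Python sketch below applies the two reported coefficients to the full parameter ranges used in the sensitivity analysis. It assumes the fitted linear relation holds across the whole range, which is a simplification given that the fitness value was observed to stabilize; the function name is hypothetical.

```python
# Coefficients reported for the regression model.
COEF_POP_SIZE = 0.003
COEF_NO_GENERATIONS = 0.0008


def predicted_fitness_change(delta_pop_size, delta_generations):
    """Average change in fitness predicted by the linear model for given increments."""
    return COEF_POP_SIZE * delta_pop_size + COEF_NO_GENERATIONS * delta_generations


# Moving across the tested ranges: population 20 -> 70, generations 20 -> 120.
pop_effect = predicted_fitness_change(70 - 20, 0)    # about 0.15
gen_effect = predicted_fitness_change(0, 120 - 20)   # about 0.08
print(round(pop_effect, 3), round(gen_effect, 3))
```

Under this assumption, the population size accounts for the larger predicted change even though its range is narrower, which is consistent with the observation that it has the greater influence of the two.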

6. Conclusions

In this paper, we proposed a solution for identifying the frequent behavioral patterns that are part of the daily routines of a person by considering, as relevant features, the sequence in which the activities are performed as well as the time interval and the duration corresponding to each activity in the sequence. Genetic algorithms are used to identify the sequence of activities that occur on the vast majority of monitored days, and a method based on the standard deviation is used to calculate the time interval and duration corresponding to each activity in the sequence.
To avoid the premature convergence of the genetic algorithm and to maintain a better balance between exploration and exploitation, a strategy based on the dynamic application of crossover and mutation operators has been used. The crossover and mutation operators applied in the dynamic selection strategy were chosen by taking into consideration the encoding of an individual that we used, as well as the advantages of these operators as reported in the specialized literature.
The approach has been tested on a dataset of ten patients aged between 75 and 80 who suffer from cardiovascular diseases. For each person, a different search space was created based on the duration intervals of the daily activities and the transition probabilities of the activities, on which the genetic algorithm was applied to extract the daily routine corresponding to that person. To determine how the performance of the genetic algorithm is influenced by the variations of the control parameters, we have performed a sensitivity analysis. We have also compared our approach with other state-of-the-art approaches to assess its performance in terms of the accuracy of the results.
The obtained results demonstrate that our approach can provide good results even when working with smaller amounts of data, unlike existing classifier-based approaches that require large amounts of annotated data to achieve good results.
In future work, we plan to implement a distributed version that runs several genetic algorithm instances in parallel, aggregates the best solution provided by each instance, and outputs the best solution out of this set. This may improve the accuracy of detecting the routine of the person. Moreover, to better capture the behavior of a person in a certain context, we intend to encode in the individual representation contextual information as well as additional information regarding the location in which the activities are performed or the frequency of the activities. Finally, we will look into newer and promising metaheuristics such as Whale or Harris hawks optimization algorithms that may improve the results of daily routine detection.
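The planned distributed version can be outlined as follows. This Python sketch is only an assumption about how such a scheme might look: run_ga_instance is a toy stand-in for one genetic algorithm run, the seeds are arbitrary, and the thread pool is used for brevity.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_ga_instance(seed, pop_size=70, no_generations=110):
    """Toy stand-in for one GA run: returns (fitness, solution)."""
    rng = random.Random(seed)
    best = (0.0, None)
    for _ in range(no_generations):
        for _ in range(pop_size):
            candidate = [rng.random() for _ in range(4)]
            fitness = sum(candidate) / len(candidate)
            if fitness > best[0]:
                best = (fitness, candidate)
    return best


def best_of_parallel_runs(seeds):
    """Run one instance per seed, collect each instance's best, return the overall best."""
    with ThreadPoolExecutor() as pool:
        per_instance_best = list(pool.map(run_ga_instance, seeds))
    return max(per_instance_best, key=lambda pair: pair[0])


best_fitness, best_solution = best_of_parallel_runs(seeds=[1, 2, 3, 4])
print(round(best_fitness, 4), best_solution)
```

In CPython, threads do not run CPU-bound code in parallel, so a real deployment would more likely use concurrent.futures.ProcessPoolExecutor or separate machines; the aggregation step stays the same.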

Author Contributions

Conceptualization, V.R.C. and C.B.P.; methodology, V.R.C. and T.C.; software, D.D.; validation, C.B.P., D.D. and V.R.C.; formal analysis, C.B.P.; investigation, I.A. and D.D.; writing—original draft preparation, V.R.C., T.C. and C.B.P.; writing—review and editing, I.A. and I.S.; visualization, I.A. and T.C.; Supervision, I.S.; funding acquisition, T.C. and I.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by UEFISCDI Romania and the European Union AAL Programme with co-funding from the Horizon 2020 research and innovation programme, grant number AAL264/2021 engAGE, AAL159/2020 H2HCare, and AAL162/2020 ReMember-Me.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article.

Acknowledgments

This work was supported by three grants of the Romanian National Authority for Scientific Research and Innovation, CCCDI–UEFISCDI and of the AAL Programme with co-funding from the European Union’s Horizon 2020 research and innovation programme projects number AAL264/2021 engAGE, AAL159/2020 H2HCare, and AAL162/2020 ReMember-Me within PNCDI III.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Avni-Babad, D. Routine and feelings of safety, confidence, and well-being. Br. J. Psychol. 2011, 102, 223–244.
  2. Arlinghaus, K.R.; Johnston, C.A. The Importance of Creating Habits and Routine. Am. J. Lifestyle Med. 2018, 13, 142–144.
  3. Pilz, L.K.; Couto Pereira, N.S.; Francisco, A.P.; Carissimi, A.; Constantino, D.B.; Caus, L.B.; Abreu, A.C.O.; Amando, G.R.; Bonatto, F.S.; Carvalho, P.V.V.; et al. Effective recommendations towards healthy routines to preserve mental health during the COVID-19 pandemic. Braz. J. Psychiatry 2022, 44, 136–146.
  4. Schneider, F.; Horowitz, A.; Lesch, K.P.; Dandekar, T. Delaying memory decline: Different options and emerging solutions. Transl. Psychiatry 2020, 10, 13.
  5. Farhud, D.D. Impact of Lifestyle on Health. Iran. J. Public Health 2015, 44, 1442–1444.
  6. Gao, S.; Tan, A.H.; Setchi, R. Learning ADL Daily Routines with Spatiotemporal Neural Networks. IEEE Trans. Knowl. Data Eng. 2021, 33, 143–153.
  7. Maučec, M.; Donaj, G. Discovering Daily Activity Patterns from Sensor Data Sequences and Activity Sequences. Sensors 2021, 21, 6920.
  8. Anghel, I.; Cioara, T.; Moldovan, D.; Antal, C.; Pop, C.D.; Salomie, I.; Pop, C.B.; Chifu, V.R. Smart Environments and Social Robots for Age-Friendly Integrated Care Services. Int. J. Environ. Res. Public Health 2020, 17, 3801.
  9. Zekri, D.; Delot, T.; Thilliez, M.; LeComte, S.; Desertot, M. A Framework for Detecting and Analyzing Behavior Changes of Elderly People over Time Using Learning Techniques. Sensors 2020, 20, 7112.
  10. Zerkouk, M.; Chikhaoui, B. Spatio-Temporal Abnormal Behavior Prediction in Elderly Persons Using Deep Learning Models. Sensors 2020, 20, 2359.
  11. Arifoglu, D.; Bouchachia, A. Detection of Abnormal Behaviour for Dementia Sufferers using Convolutional Neural Networks. Artif. Intell. Med. 2019, 94, 88–95.
  12. Hela, S.; Amel, B.; Badran, R. Early anomaly detection in smart home: A causal association rule-based approach. Artif. Intell. Med. 2018, 91, 57–71.
  13. Mohan, P.; Lee, B.; Chaspari, T.; Ahn, C.R. Assessment of Daily Routine Uniformity in a Smart Home Environment Using Hierarchical Clustering. IEEE J. Biomed. Health Inform. 2021, 25, 3197–3208.
  14. Malik, H.; Iqbal, A.; Joshi, P.; Agrawal, S.; Bakhsh, F.I. Metaheuristic and Evolutionary Computation: Algorithms and Applications. In Studies in Computational Intelligence, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2021; pp. 1–849.
  15. Tsai, C.-W.; Chiang, M.-C.; Ksentini, A.; Chen, M. Metaheuristic Algorithms for Healthcare: Open Issues and Challenges. Comput. Electr. Eng. 2016, 53, 421–434.
  16. Chifu, V.R.; Pop, C.B.; Demjen, D.; Socaci, R.; Todea, D.; Antal, M.; Cioara, T.; Anghel, I.; Antal, C. Identifying and Monitoring the Daily Routine of Seniors Living at Home. Sensors 2022, 22, 992.
  17. Wang, L.; Zhou, Y.; Li, R.; Ding, L. A fusion of a deep neural network and a hidden Markov model to recognize the multiclass abnormal behavior of elderly people. Knowl.-Based Syst. 2022, 252, 109351.
  18. Shahid, Z.K.; Saguna, S.; Åhlund, C. Detecting Anomalies in Daily Activity Routines of Older Persons in Single Resident Smart Homes: Proof-of-Concept Study. JMIR Aging 2022, 5, e28260.
  19. Sun, J.; Zhang, H.; Zhang, Q.; Chen, H. Balancing exploration and exploitation in multiobjective evolutionary optimization. Inf. Sci. 2019, 497, 199–200.
  20. Martín, A.J.; Gordo, I.M.; García Domínguez, J.J.; Torres-Sospedra, J.; Plaza, S.L.; Gómez, D.G. Affinity Propagation Clustering for Older Adults Daily Routine Estimation. In Proceedings of the 2021 International Conference on Indoor Positioning and Indoor Navigation (IPIN), Lloret De Mar, Spain, 29 November–2 December 2021; pp. 1–7.
  21. Friedrich, B.; Sawabe, T.; Hein, A. Unsupervised statistical concept drift detection for behaviour abnormality detection. Appl. Intell. 2022.
  22. Meng, L.; Miao, C.; Leung, C. Towards online and personalized daily activity recognition, habit modeling, and anomaly detection for the solitary elderly through unobtrusive sensing. Multimed. Tools Appl. 2017, 76, 10779–10799.
  23. Arab, M.; Akbarian, H.; Gheibi, M.; Akrami, M.; Fathollahi-Fard, A.M.; Hajiaghaei-Keshteli, M.; Tian, G. A soft-sensor for sustainable operation of coagulation and flocculation units. Eng. Appl. Artif. Intell. 2022, 115, 105315.
  24. Seiter, J.; Derungs, A.; Schuster-Amft, C.; Amft, O.; Tröster, G. Daily life activity routine discovery in hemiparetic rehabilitation patients using topic models. Methods Inf. Med. 2015, 54, 248–255.
  25. Hu, R.; Michel, B.; Russo, D.; Mora, N.; Matrella, G.; Ciampolini, P.; Cocchi, F.; Montanari, E.; Nunziata, S.; Brunschwiler, T. An Unsupervised Behavioral Modeling and Alerting System Based on Passive Sensing for Elderly Care. Future Internet 2021, 13, 6.
  26. Zekri, D.; Delot, T.; Desertot, M.; Lecomte, S.; Thilliez, M. Using Learning Techniques to Observe Elderly’s Behavior Changes over Time in Smart Home. In Proceedings of the ICOST 2020, Hammamet, Tunisia, 24–26 June 2020; pp. 129–141.
  27. Quaid, M.A.K.; Jalal, A. Wearable sensors based human behavioral pattern recognition using statistical features and reweighted genetic algorithm. Multimed. Tools Appl. 2020, 79, 6061–6083.
  28. Chifu, V.R.; Pop, C.B.; Rancea, A.M.; Morar, A.; Cioara, T.; Antal, M.; Anghel, I. Deep Learning, Mining, and Collaborative Clustering to Identify Flexible Daily Activities Patterns. Sensors 2022, 22, 4803.
  29. Ren, Y.; Zhang, C.; Zhao, F.; Xiao, H.; Tian, G. An asynchronous parallel disassembly planning based on genetic algorithm. Eur. J. Oper. Res. 2018, 269, 647–660.
  30. Fathollahi-Fard, A.M.; Dulebenets, M.A.; Hajiaghaei-Keshteli, M.; Tavakkoli-Moghaddam, R.; Safaeian, M.; Mirzahosseinian, H. Two hybrid meta-heuristic algorithms for a dual-channel closed-loop supply chain network design problem in the tire industry under uncertainty. Adv. Eng. Inform. 2021, 50, 101418.
  31. Fathollahi-Fard, A.M.; Ahmadi, A.; Karimi, B. Sustainable and Robust Home Healthcare Logistics: A Response to the COVID-19 Pandemic. Symmetry 2022, 14, 193.
  32. Lau, Y.-Y.; Dulebenets, M.A.; Yip, H.-T.; Tang, Y.-M. Healthcare Supply Chain Management under COVID-19 Settings: The Existing Practices in Hong Kong and the United States. Healthcare 2022, 10, 1549.
  33. Soussa, M.R.B.; de Senna, V.; da Silva, V.L.; Soares, C.L. Modeling elderly behavioral patterns in single-person households. Multimed. Tools Appl. 2021, 80, 22097–22120.
  34. de Moura, I.R.; Teles, A.S.; Endler, M.; Coutinho, L.R.; da Silva e Silva, F.J. Recognizing Context-Aware Human Sociability Patterns Using Pervasive Monitoring for Supporting Mental Health Professionals. Sensors 2021, 21, 86.
  35. Holland, J.H. Adaptation in Natural and Artificial Systems; MIT Press: Cambridge, MA, USA, 1992; pp. 1–228.
  36. Nicoara, E.S. Mechanisms to Avoid the Premature Convergence of Genetic Algorithms. Ser. Math. Inform. Phys. 2009, 61, 87–96.
  37. Katoch, S.; Chauhan, S.S.; Kumar, V. A review on genetic algorithm: Past, present, and future. Multimed. Tools Appl. 2021, 80, 8091–8126.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
