Assisted Living System with Adaptive Sensor’s Contribution

Multimodal sensing and data processing have become a common approach in modern assisted living systems. This is widely justified by the complementary properties of sensors based on different sensing paradigms. However, all previous proposals assume data fusion to be made based on fixed criteria. We proved that particular sensors show different performance depending on the subject’s activity and consequently present the concept of an adaptive sensor’s contribution. In the proposed prototype architecture, the sensor information is first unified and then modulated to prefer the most reliable sensors. We also take into consideration the dynamics of the subject’s behavior and propose two algorithms for the adaptation of sensors’ contribution, and discuss their advantages and limitations based on case studies.


Introduction
Nowadays, in developed countries, significant progress in the process of population aging is observed: the percentage of elderly people in the population is higher than the percentage of young people. It is expected that in these countries the current 20% proportion of people aged 60 years and above will increase by 32% by the year 2050. Over the 50 years between 1950 and 2000, the median age increased from 29.0 years to 37.3 years, and it is estimated to reach 45.5 years by the year 2050 [1].
These figures force the governments of developed countries to carry out adequate actions. They mainly consist of the monitoring of health parameters and physical activity for the purposes of prevention against all types of diseases and life risks such as falls and frailty due to the absence of systematic physical exercise, selected on the individual level. Taking care of people who need special treatment (older, with disabilities, during recovery after injuries, accidents, or serious illnesses) is not limited to satisfying their physiological or material needs, but first of all involves physical, psychological, and social stimulation [2]. As early as in ancient times, not without reason, Aristotle said that "movement is life, life is movement". Thus, all attempts and efforts towards achieving practical support for such people by encouraging their psychomotor autonomy are of great importance.
To face the above needs, projects of technical solutions proposed worldwide aim at the non-invasive, convenient, and secure monitoring of supervised human vital signs [1,3]. Such monitoring is expected to reduce the costs of expensive medical equipment or specialized medical and rehabilitation staff and to assist non-professional individuals in taking continuous care of ill people.
Every approach to an assisted living system raises three issues:
• Adequacy of the applied sensor set;
• Intrusion of measurement devices in the subject's environment and behavior;
• Violation of the subject's privacy and vulnerability of the collected data.
Sensors 2020, 20, 5278 4 of 28 and knee joints) were used. The results obtained from the proposed system and the reference systems (Vicon and dynamometer platform) were similar. Mizuno et al. [30] introduced a multimodal system for the recognition of ADL activities of monitored persons. The system integrates piezoresistive pressure sensors, a motion detector placed in a watch, a sound sensor in glasses, an ultrasonic sensor (closed in a pen) measuring a distance from a ceiling, and a position sensor (Bluetooth and GPS). The proposed system enables the detection of walking, running, standing, eating, talking, and office work.
In [31], the physical activity of persons during rehabilitation after a stroke was monitored. For this purpose, integrated three-axis accelerometric and one-axis gyroscopic sensors (positioned bilaterally on the subjects' ankles and wrists) were used. Two accelerometers and a pressure sensor were attached to the cane used by the examined person. Measurement data were recorded during level walking, walking carrying an object, walking on an uneven surface, walking up a ramp, walking down a ramp, walking up a flight of stairs, walking down a flight of stairs, walking over an object, pivoting, and opening a door. Each motor activity was identified by a neural network. For all activities at an average specificity of 95%, the sensitivity ranged from 75.1% to 97.4%. Then, the use of the cane was studied in the context of a particular type of activity. The studies were performed based on the measurement data from sensors located on the cane.
An extensive review of methods used in ambient assisted living systems was provided in [32]. The variety of methods used for sensing particular behavior patterns (e.g., fall detectors) raises the question of their substitution or complementary use. This issue was studied in our group [33], and several other authors provided comparative results for the efficiency and accuracy of different sensor types in specific everyday living events. These findings paved the way to a concept of multimodal sensing, where sensors of different types are used in the following scenarios:
• Simultaneous: information from both sensors is gathered concurrently and fused to yield features of higher sensitivity and specificity;
• Complementary: the information source is switched by selecting the best sensor according to changes in recording conditions (e.g., indoor/outdoor).
While the simultaneous scenario has been applied in numerous proposals, the complementary scenario is also worth studying in pursuit of continuous surveillance of a mobile human. Consequently, surveillance of physiological parameters may be employed in the healthy population as an essential part of prevention programs and, on the other hand, ill or disabled people will not be confined to their beds or premises without the chance of physical exercise or a social life.
An alternative concept was proposed in [34]. Five sensors: pulse, chest accelerometer, limb accelerometers, camera, and microphone were used in pairs for the detection of seven elementary poses, which in turn contributed to the representation of actual behavior. In that previous work, we used graph representation with node values standing for pose contribution and edge flow representing the activity in time. This approach used complementary premise-fixed and wearable sensors, simple yet reliable algorithms for recognition of elementary poses, and a concise representation of any behavior, even unknown at the setup stage.

Data Fusion Techniques
One of the most cited works is that by Boonma and Suzuki [35], which presents the basics of a biologically-inspired architecture for Sensor Networks (BiSNET) with implemented key biological mechanisms such as energy exchange, pheromone emission, replication, and migration. The authors evaluate BiSNET for oil spill detection in the coastal environment. The network is based on agents without a centralized service to coordinate them; thus, it is lightweight, scalable, and self-healing. This means the sensor nodes autonomously adapt their states and data transmission according to dynamic changes of conditions, retain their power efficiency despite the increase of network size (up to 600 nodes), and collectively detect and eliminate false-positive data.
Cohen and Edan [36] propose a sensor fusion framework that adaptively selects the most reliable sensor set and the most suitable algorithm. To this end, the algorithm implements measures that continuously quantify sensor performance. The concept has been simulated in software with a grid-map paradigm, logical sensors, and performance measures to allow the random setup of sensors producing multiple data types. The performance was measured as the difference between each particular setup and the final fused map, which has to be known beforehand. The sensor re-configuration procedure is applied once a low-performing sensor is detected.
The system presented by Marti et al. [37] is built with several sensors and a centralized automatic reasoning module that integrates partial descriptions with contextual information of the system and combines available sensor data to produce a fused output that best satisfies the goals following a given ontology. The system is robust to temporary sensor unavailability and variable reliability of sensor information, and supports redefining its goals on the fly. The proposal has been implemented and tested in ground vehicle navigation.
A comprehensive review of the state-of-the-art techniques on multi-sensor fusion in the area of BSN can be found in [38]. The paper particularly focuses on physical activity recognition and widely discusses the pros and cons of data fusion at the levels of data (suitable for a homogeneous sensor set), features, and decisions (allowing for the combination of data from heterogeneous sensors). Moreover, centralized, distributed, and hybrid approaches to collective decision making are studied. Although a vast literature review is presented, only one example of context-adaptive fusion was provided, in the work by Cook et al. [39].
Koping, Shirahama, and Grzegorzek [40] address the need for a general data fusion framework for a specific smartphone-based multi-sensor body area network. Since the framework is dedicated to a general-purpose surveillance system, it supports a heterogeneous sensor set, and the data fusion is performed on the feature vector level through code-based learning. Specific signals are first processed at the sensors with adequate feature extracting algorithms. This approach is also used in the proposed solution; however, we do not follow the static data fusion paradigm.
Very recently, Lin et al. [41] proposed a smart sensor data fusion system intended to support stable, safe, and efficient medical patient-robot interaction. The medical services provided by autonomous robots require real-time monitoring of the state of both users. To this end, various sensor, communication, robot, and data processing technologies have been applied. The proposed hybrid body sensor network architecture is based on multi-sensor fusion employing an interpretable neural network. However, the data integration process seems to be fixed for the given patient.
Bazo et al. [42] propose the combination of radiofrequency-based positioning and computer vision-based human pose estimation as a tool for behavioral analysis and activity recognition. The two subsystems have complementary properties, i.e., the radiofrequency localizer resolves the occlusions that may occur in the computer vision detector, and the computer vision subsystem increases the accuracy of positions measured with the radiofrequency localizer. This model falls into the larger category of bimodal position and activity sensing systems also developed by other authors for the analysis of shoppers [43,44], pedestrians [45,46], or just human pose recognition [47]. Both subsystems are independent and separately process the data produced by the RF and RGBD sensors. The sensor fusion module uses the tag and skeleton and iteratively seeks its stable state, expressed by maximizing data persistence. The priority of visual or radiofrequency data is used solely to avoid ghosting.
He et al. [48] give a critical review of state-of-the-art solutions for scalable fault-tolerant information fusion in a distributed wireless sensor network. The authors indicate the most challenging areas in sensor applications: different sensing modalities, which enrich the robustness but demand more than a simple fusion of homogeneous data, and a wide range of uncertainties in sensing and communication (misdetection, false alarms, unavailability, or delays). The paper also highlights several interesting areas of future improvements such as mutual calibration and verification of data consistency.
Proper instrumentation and interpretation software enable detecting particular events and classifying human behavior in several categories of risk. Extending this scope leads to a continuous predict-and-verify scenario, where the detection of unexpected behavior provides signs of possible health setbacks [49]. In that previous research, the information of the currently identified pose was not utilized to improve the sensing performance of the current state nor prepare the sensing system for the most probable subject pose. A novel concept stemming from our previous studies is presented in this paper. It combines behavior prediction and sensor reconfigurability schemes into a behavior tracking system that continuously adapts the sensor contribution to the present and most probable future activity of the supervised subject.

Concept of Adaptive Sensing
The concept of continuous adaptation of sensors' contribution in a multimodal system originates from rules of information propagation in living neural systems. Let us briefly recall two different types of chemical synapses: ionotropic, with a quick and short synaptic response, specialized in fast sensory or executory, excitatory or inhibitory pulse messaging, and metabotropic, with a delayed and long-lasting response, being primarily responsible for the modulation of pulse conduction. Thanks to these two complementary types of synaptic junctions, all mammals select the dominating and auxiliary senses that they actually use to perceive the surroundings.
Mimicking the above-mentioned natural rule of neural modulation in a technical multisensor assisted living environment requires solving two issues:

• Determining competence areas and performance hierarchy in a given sensor set;
• Specifying data stream modulation rules, allowing to adapt each sensor's contribution to a final decision.
Initially, we assume each sensor to have an exclusive sector of competence area, where no other sensor is applicable, and its complementary sector, where it competes with one or more other sensors. Although the accuracy and reliability are most naturally selected as competence criteria, a variety of other parameters are applicable in a real surveillance system: availability, intrusiveness, energy consumption, etc. Moreover, the cooperation of two sensors in a common competence sector yields valuable information about the coherence of their data streams, which may be useful in other scenarios to assess the quality of measurements relying only on the auxiliary sensor (e.g., when the principal sensor data are unavailable).
In the following sections, we develop this concept by examining the sensor set and sensor-specific preprocessing software (Section 4) in an experimental detection of human motor activities (Section 5). The discussion of the experiment outcome is followed by a proposal of two data stream adaptation algorithms (Section 6) and the presentation of a use case (Section 7). The discussion and future remarks (Section 8) conclude the paper.

Components of the Sensor Set
All experiments were carried out indoors in a large room (approx. 150 m²) by means of four different motion signal measurement devices (Figure 1): a wireless (WLAN) EMG biopotential amplifier ME6000 (Mega Electronics) with MegaWin software (B), a wireless foot pressure measurement system ParoLogg with Parologg software (C), an ACC Revitus module with dedicated software (D), and a digital video camera Sony HDR-FX7E (E) [50]. Table 1 lists the sampling frequency of each of the sensors used.
Eight-channel electromyographic signals were surface recorded from the muscles of both lower limbs: quadriceps-vastus lateralis (1), biceps femoris (2), tibialis anterior (3), gastrocnemius-medial head (4). Time-series foot pressure signals were obtained from the 64 pressure sensors built into the insoles (each foot insole has 32 independent sensors). A three-dimensional accelerometric signal was recorded with the Revitus module located on the subject's sternum, while for video measurements a digital camera was set up on the left side of the examined person (720 × 576 pixels).

Preprocessing of the Measurement Data
The successive steps of processing the measurement data from each of the sensors B ÷ E were presented and described in detail in [50]. The scheme in Figure 2 illustrates the main parts of the proposed signals processing.
The sensors were used individually and in sets of two to four sensors. The classification of motor activities was based on feature vectors recorded by one to four sensors simultaneously. The feature vectors for each setup are presented in Table 2. In the case of multiple sensors, we simply combined the feature vectors of each sensor.

Materials
In the experiment, 20 volunteers performed 12 selected physical activities (1a ÷ 6b, Figure 3) with about 30 repetitions (19 ÷ 46) for each one:
• Reaching (4a) and return from reaching (4b) the upper limb upwards in the sagittal plane (standing);
• Bending (5a) and straightening the trunk (5b) from bend forward from a stand pose in the sagittal plane;
• A single step with the right (6a) and the left (6b) lower limb (stance phase).
Figure 3. Selected (first, middle, and end, respectively) video frames presenting the investigated movements (activities 1a ÷ 6b). Transitions between the specific successive activities are named 1a1b ÷ 5a5b.

Feature Classification Methods
Supervised classification of the selected motor activities was performed with the use of the k-NN (k-Nearest Neighbors) method and the Manhattan metric. The sizes of the learning and test sets were in the ratio of 1:3. With the final presentation of results in mind, several variables were introduced [50]:
• Correctness of recognition for all volunteers: Rs_a;
• Calculation error of Rs_a: Us_a, a measure of the dispersion of results arising from inter-subject differences (weighted standard deviation due to different numbers of activity repetitions for each volunteer);
• Percentage of correct recognitions for all activities and all volunteers: Rs_ALL;
• Percent recognition for all activities: Rs_V;
• Calculation error of Rs_V: Us_V, a measure of the dispersion of results arising from differences between different activities (weighted standard deviation due to different numbers of repetitions of each activity for each volunteer);
• Calculation error of Rs_V: Us_ALL, a measure of the dispersion of the results due to recognitions of the individual activities.
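The classification setup described in this section (per-sensor feature vectors simply combined, k-NN with the Manhattan metric) can be illustrated with a minimal sketch. The function names, the value of k, and all feature values below are hypothetical and chosen only for illustration; they are not taken from the experiment.

```python
from collections import Counter


def manhattan(u, v):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def combine_features(*per_sensor_vectors):
    """Concatenate the feature vectors of several sensors into one vector."""
    combined = []
    for vec in per_sensor_vectors:
        combined.extend(vec)
    return combined


def knn_classify(train, query, k=3):
    """Majority vote among the k training samples nearest to the query.

    train is a list of (feature_vector, activity_label) pairs;
    k=3 is an assumption, the source does not state the value used.
    """
    nearest = sorted(train, key=lambda item: manhattan(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical learning set: two sensors, 2 + 1 features each.
train = [
    (combine_features([0.1, 0.2], [1.0]), "5a"),
    (combine_features([0.2, 0.1], [0.9]), "5a"),
    (combine_features([0.9, 0.8], [0.1]), "6a"),
    (combine_features([0.8, 0.9], [0.2]), "6a"),
]
query = combine_features([0.15, 0.15], [0.95])
print(knn_classify(train, query, k=3))
```

Adding or removing a sensor only changes the arguments passed to combine_features, which is why the same classifier can be evaluated for every sensor set.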

Sensor Set Performance Results
Based on the data presented in Tables 3 and 4 and Figures 4 and 5, we concluded that measurements carried out simultaneously with two, three, or four sensors lead to a significant improvement of recognition reliability.
The experiment results prove that the overall activity recognition performance (right columns of Tables 3 and 4) can be improved by adapting the sensor set and the features used to the particular action and to the particular subject. This statement is the background of the adaptation algorithms presented in Section 6.

Reliability-Driven Sensor Data Fusion

General Assumptions and System Design
The general architecture of a multisensory environment for assisted living consists of sensors, dedicated feature extraction methods, and modality selectors. The proposed innovation replaces the selector with a modulator using weight coefficients W_k (Figure 6) to prefer the most pertinent features while discriminating against the others. As the sensors use specific signals (muscular, pressure, acceleration, and video), one consequence of replacing the feature selector with a modulator is the need for a uniform representation of all features. To this end, the feature calculation step unifies the information update rate and normalizes the feature values. The output of each sensor is given as a probability-ordered list of activities {A_i, p_i} (see Figures 6 and 7).
Three coefficients are proposed to modulate the influence of each sensor on the final decision about the detected activity. These are listed and briefly explained below.
H_k is an activity-independent coefficient characterizing the cost of each sensor, including hardware, installation, and maintenance, as well as human factors such as acceptance of each particular sensor set (cameras at home, accelerometer belt or bands, electrodes, etc.). We consider all these factors to be constant in time; thus, these values need to be evaluated once per subject. In order to adapt the sensors' choice efficiently, extreme values of H_k should be avoided.
R_k(A) is an activity-dependent reliability factor. As demonstrated in Section 5, sensors show different performance in the detection of basic daily human activities; accordingly, in the system paradigm, R_k is the primary factor adapting the contribution from sensors to the current activity of the monitored subject.
L(n) is a penalty factor that discriminates the influence of sensors depending on their position n in the reliability ranking for determining the activity A by sensor k. The actual penalty factor is calculated based on a coefficient p: a low value of p equalizes the ranking list, which makes the system work mostly with multiple sensors while avoiding the worst, whereas a high value of p prefers the winner as the unique working sensor.
The contribution W_k of each sensor k is thus determined from these three coefficients and normalized over the whole set of sensor weighting coefficients.
According to the currently detected subject's action, the system automatically adapts the feature set (Table 2) to optimally detect the present action. The optimization criteria may be freely selected from the variables presented in Section 4.4 and used jointly with other attributes (including non-technical ones such as acceptance, usage cost, etc.). To keep the presentation simple, we use the correctness of recognition (given in Table 3). In a real system, besides the subject action, the selection of sensors also takes into account constant factors like costs and availability or acceptance of a sensor by individual subjects.
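The role of the three coefficients can be illustrated with a short sketch. The exact expressions for L(n) and W_k are not given in the text above, so the forms used here are assumptions made only for illustration: a power-law penalty L(n) = n^(-p), a product W_k = H_k · R_k(A) · L(n), and normalization to a unit sum. All numeric values are hypothetical.

```python
def penalty(rank, p):
    """Assumed penalty L(n) = n**(-p); rank 1 is the most reliable sensor."""
    return rank ** (-p)


def sensor_weights(H, R, p):
    """Normalized contribution of each sensor for one activity.

    H: dict sensor -> activity-independent coefficient H_k
    R: dict sensor -> reliability R_k(A) for the current activity A
    p: penalty coefficient (0 removes the ranking penalty,
       large values favor the top-ranked sensor)
    The product form H_k * R_k * L(n) is an assumption, not the source formula.
    """
    ranking = sorted(R, key=R.get, reverse=True)  # best sensor first
    raw = {}
    for n, k in enumerate(ranking, start=1):
        raw[k] = H[k] * R[k] * penalty(n, p)
    total = sum(raw.values())
    return {k: w / total for k, w in raw.items()}


# Hypothetical coefficients for sensors B, C, D and a single activity.
H = {"B": 1.0, "C": 1.0, "D": 1.0}
R = {"B": 0.9, "C": 0.6, "D": 0.8}
print(sensor_weights(H, R, p=0.0))
print(sensor_weights(H, R, p=4.0))
```

Under these assumptions, p = 0 leaves the weights proportional to H_k · R_k(A), while increasing p concentrates almost all of the weight on the top-ranked sensor, which mirrors the qualitative behavior described for p.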
Instead of applying recognition correctness generalized for all volunteers, an individual table, equivalent to Table 3 may be built for each supervised subject. The personalization of the multisensor environment improves the individual performance (compare columns in Table 4) but requires a set of exercises performed under the supervision of a human assistant who annotates the activities and checks the recognition correctness (or other optimization criteria).
Based on the selected optimization criterion (in our example: generalized correctness of recognition, Table 3), a hierarchy of feature vectors is built for each detected activity. Taking the action "bending" (5a) as an example, we obtain a sensor set hierarchy in which BD yields better results than BCD. It is noteworthy that the use of more sensors therefore does not necessarily lead to better results, and adding a sensor (C in this case) may degrade the recognition correctness.
The modulation of the sensor's contribution presented above is confirmative. Firstly, the detection is roughly made with a possibly suboptimal sensor set and then confirmed with an adapted set. The modification closes the information loop and, like all kinds of feedback, raises the stability issue if the action detected with adapted features does not match the one initially detected. The other drawback of confirmative detection is related to a possibly erroneous first detection leading to an even less optimal sensor set and confirming the erroneous decision.
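Building such a hierarchy amounts to sorting the sensor sets by the chosen criterion and grouping sets with equal scores into one tier. The sketch below assumes a hypothetical helper build_hierarchy and uses made-up correctness values; they are not the values of Table 3.

```python
from itertools import groupby


def build_hierarchy(correctness):
    """Group sensor sets into tiers of equal correctness, best tier first.

    correctness: dict sensor_set_name -> recognition correctness (in %)
    Returns a list of tiers, each tier being a list of sensor set names.
    """
    ordered = sorted(correctness.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        [name for name, _ in group]
        for _, group in groupby(ordered, key=lambda kv: kv[1])
    ]


# Hypothetical correctness values for one activity.
example = {"BD": 98.0, "BCD": 96.5, "BE": 96.5, "CD": 90.0}
print(build_hierarchy(example))
```

With these illustrative numbers the two-sensor set BD ranks above the three-sensor set BCD, the kind of outcome noted above for the "bending" action.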

Stability Condition for Modulated Sensor Set
The stability issue in a sensor set with modulated contribution can be solved by limiting the weight modulation range. Let f be a function A = f(S_k; W_k) assigning a unique subject's action A to specific sensor outputs S_k modulated by W_k. This means that all probability values p_i of given activities A_i from sensor k are multiplied by W_k. Let m be a function W_k = m(A) modulating the contributions from sensors S_k to maximize the reliability of the recognition of A. The modulator is therefore stable if the modulation does not influence the current recognition result. Since we cannot expect the recognition result to be a linear function of the modulation depth, we propose an iterative trial-and-error algorithm for finding the modulation limits. To find the value of W_k between the original W_k1 and the desired target W_k2, the algorithm repeatedly bisects an interval and then selects, for further processing, the subinterval whose ends yield different actions.
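The bisection search for the modulation limit can be sketched as follows. The interface is an assumption: recognize stands for f with fixed sensor outputs, weights are interpolated linearly between the original and the target vector, and toy_recognize with its two-sensor probability lists is a hypothetical stand-in for the real recognizer.

```python
def modulation_limit(recognize, w_start, w_target, tol=1e-3):
    """Largest fraction t in [0, 1] of the way from w_start to w_target
    for which the recognized action is still the one obtained at w_start.

    recognize: callable taking a list of weights and returning an action.
    """
    def blend(t):
        return [a + t * (b - a) for a, b in zip(w_start, w_target)]

    reference = recognize(blend(0.0))
    if recognize(blend(1.0)) == reference:
        return 1.0  # full modulation does not change the result
    lo, hi = 0.0, 1.0  # action at lo equals reference, action at hi differs
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if recognize(blend(mid)) == reference:
            lo = mid  # the change of action lies in the upper half
        else:
            hi = mid  # the change of action lies in the lower half
    return lo


def toy_recognize(weights):
    """Hypothetical f: weighted sum of per-sensor activity probabilities."""
    votes = [{"5a": 0.7, "5b": 0.3}, {"5a": 0.2, "5b": 0.8}]
    scores = {
        a: sum(w * v[a] for w, v in zip(weights, votes)) for a in ("5a", "5b")
    }
    return max(scores, key=scores.get)


print(modulation_limit(toy_recognize, [1.0, 0.0], [0.0, 1.0]))
```

In this toy case the recognized action flips when the weights are blended about 40% of the way towards the target, so the search returns a value just below 0.4; a target that never changes the result returns 1.0.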
All necessary steps of the modulation algorithm are performed within the subject state sampling interval. New data gathered from the sensors are processed with optimized sensors' contribution and confirm the detected subject's action.
The stability issue can also be avoided by applying a sensor set consistency rule. This rule uses the past sensor set as a reference and requires the new set to be as similar as possible. Continuing the example given in Section 6.1, if "bending" has been detected with BE sensors and "straightening the trunk" (5b) occurs thereafter, the sensor set hierarchy is the following: (BD, BDE); (BE, DE); (CE, BCD, BCE, CDE, BCDE); CD; BC.
Maintaining the BE configuration is preferred over changing to DE, despite their equal performance, for stability reasons.
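The consistency rule can be sketched as a tie-break among equally performing sensor sets. The similarity measure is not specified in the text; the Jaccard index over sensor letters used below is an assumption, as is the helper name pick_consistent.

```python
def pick_consistent(tier, previous):
    """Among equally performing sensor sets, pick the one most similar
    to the previously used set (Jaccard similarity, an assumed measure).

    tier: list of sensor set names such as "BE" or "DE"
    previous: name of the sensor set used so far
    """
    prev = set(previous)

    def similarity(candidate):
        cand = set(candidate)
        return len(cand & prev) / len(cand | prev)

    return max(tier, key=similarity)


# Tier (BE, DE) from the hierarchy for action 5b, previous set BE.
print(pick_consistent(["BE", "DE"], "BE"))
```

With the previous set BE, the rule keeps BE rather than switching to the equally performing DE.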

Predictive Modulation of Sensors' Contribution
One may question the purpose of optimization if it only confirms the result of a recognition already made. Fortunately, in most assisted living environments, the prevention of dangerous events is stressed as a primary goal, and their architecture usually includes an artificial intelligence-based system for learning the subject's habits and detecting unusual behavior as a potential sign of danger. Such systems gather the information on individual habits in the form of a database learned and updated from real past behavior records. Such a database provides activity statistics, but, more interestingly, for each given activity the most probable next activity can be determined. We propose to use the information from the individual's habits database to predict the subject's upcoming action and adjust the sensor's contribution accordingly (Figure 7). The modulation is still made according to the stability requirements (see Section 6.2), but the sensor's contribution now adapts to the subject's most probable next action.
Introducing the habits database in the feedback path has two benefits:
• Prediction of the upcoming action takes into account multimodal time series instead of single points, which stabilizes the prediction in case of a singular recognition error;
• Focusing on optimal recognition for the current action makes the system conservative (i.e., expecting a stable status), whereas optimizing for the future action makes it progressive (i.e., awaiting changes of the status).
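A habits database of the kind described above can be approximated, in its simplest form, by a table of transition counts between successive activities. The sketch below is a deliberate simplification: it conditions only on the current activity rather than on a longer time series, and the activity history is hypothetical.

```python
from collections import Counter, defaultdict


def learn_transitions(history):
    """Count how often each activity is followed by each other activity."""
    table = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        table[current][following] += 1
    return table


def predict_next(table, current):
    """Most frequent successor of the current activity, or None if unseen."""
    if current not in table:
        return None
    return table[current].most_common(1)[0][0]


# Hypothetical record of past behavior (activity labels as in the text).
history = ["2b", "3a", "3b", "2a", "2b", "3a", "3b", "2a", "2b", "3a", "5a"]
habits = learn_transitions(history)
print(predict_next(habits, "3a"))
```

The predicted activity would then be used to look up the sensor set hierarchy for the upcoming action instead of the current one.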

A Compound Action
The proposed sensor's contribution modulation technique was analyzed in a previously proposed multisensor environment for assisted living [33]. We also used previously recorded data from 20 volunteers (8 women and 12 men, aged between 22 and 61 years), acting according to predefined realistic scenarios. Table 5 presents an example compound action of searching for a book on a wall-mounted shelf, consisting of elementary poses (defined in Section 4): squatting (1a, 1b), reaching forward (3a, 3b), reaching upward (4a, 4b), and bending (5a, 5b).
Table 5. From sensors (in %) to the compound action recognition, "searching on the shelf".
Multiple repetitions of patterns in the habits learning phase and the opposed direction of elementary poses labeled with a and b facilitate correct prediction of subsequent poses and the respective adaptation of sensors' contribution. In the studied case, no abrupt corrections in the sensor set were necessary; consequently, changes of weighting coefficients were linear and not restricted by stability limits. The smoothing influence of prediction on sensors' modulation is also revealed in Table 5. Nevertheless, studying the correct operation of the system for unexpected activities and possible errors in stabilizing the algorithm requires recordings of human performance according to purposely designed misbehavior.

Change of Environment
In this scenario, we assume that a walking subject (alternating activities 6a and 6b) goes outdoors and sensor E (the video system) no longer provides reliable data. According to the sensor set hierarchies (Table 3), the most reliable sets common to both activities are BCE, BCD, and BDE, and after the elimination of sensor E data, the recognition relying on sensor D (accelerometer) data has equivalent correctness. However, if the subject changes the activity, the equivalence of data from E and D is no longer guaranteed (see 1a in Table 3 as an example). For this reason, sensor B starts to be taken into consideration and the system prefers using BD.
In the case of the opposite event (i.e., the subject goes indoors), switching back to the video-based sensor is not justified by a possible improvement of recognition correctness, but the video sensor will be more comfortable for the subject than the first-choice accelerometer, since there is one sensor less to wear. If the subject decides to take off the accelerometer belt, the persistent consistency of information from B and E will cause a fast return to sensors BE instead of DE.

Cooperation of Sensors
The cases presented in Sections 7.1 and 7.2 assume the presence or absence of a sensor and do not exploit the full potential offered by modulation of contributions from multiple sensors to the activity recognition. Here we assume that (1) all sensors are available, but each is assigned a quantitative cost, and (2) the subject performs a compound action. The modulation is then expected to continuously calculate and maximize the correctness-to-cost ratio. To this end, the data on recognition correctness given in Table 3 are considered to be discrete samples in a continuous space of possible actions. The system is then expected to detect the actual behavior as composed of simultaneously occurring elementary poses (see [34]), and the pose contributions are taken into account to select the best sensor set.
Adopting the data from the experiment (Table 3), we assume the subject is simultaneously getting up from a chair (2b) to a standing position and reaching the upper limb forward in the sagittal plane (3a). In the first part of the action, 2b dominates; in the middle part, the contribution of 3a takes over, and it dominates in the terminal part (e.g., reaching a book on the shelf). To show the modulation process, we assume that only sensors B, C, and D are present (no video sensor) and that only two of them are available at a time. According to the hierarchies derived from Table 3, the CD sensor pair is the least favorable; we therefore use sensor B (EMG signal) and, in the course of the action, modulate the contribution from C (pressure) and D (acceleration). Sensor C is then successively replaced by D, and BC becomes BD as the action initially resembling 2b becomes more and more similar to 3a.
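The selection by correctness-to-cost ratio for a compound action can be sketched as below. Two assumptions are made for illustration only: the expected correctness of a sensor set is taken as the contribution-weighted average over the elementary poses, and all correctness and cost values are hypothetical rather than taken from Table 3.

```python
def best_sensor_set(pose_mix, correctness, cost):
    """Sensor set with the highest expected correctness-to-cost ratio.

    pose_mix: dict pose -> contribution of that pose (summing to 1)
    correctness: dict pose -> dict sensor_set -> correctness (in %)
    cost: dict sensor_set -> quantitative cost (same unit for all sets)
    """
    def ratio(sensor_set):
        expected = sum(
            share * correctness[pose][sensor_set]
            for pose, share in pose_mix.items()
        )
        return expected / cost[sensor_set]

    return max(cost, key=ratio)


# Hypothetical values for the pairs available from sensors B, C, and D.
correctness = {
    "2b": {"BC": 99.0, "BD": 97.0, "CD": 95.0},
    "3a": {"BC": 96.0, "BD": 99.0, "CD": 94.0},
}
cost = {"BC": 1.0, "BD": 1.0, "CD": 1.0}

# Pose 2b dominates at the start of the action, 3a at the end.
print(best_sensor_set({"2b": 0.9, "3a": 0.1}, correctness, cost))
print(best_sensor_set({"2b": 0.1, "3a": 0.9}, correctness, cost))
```

With these illustrative numbers the preferred pair moves from BC to BD as the contribution of 3a grows, which corresponds to the gradual replacement of sensor C by D described above.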

Unexpected Change of Action
The last presented case assumes that the subject stands up from the chair (2b), reaches forward (3a) and, instead of returning from reaching (3b), which was the most probable action, bends (5a) in search of the book on a lower shelf; then, instead of returning from bending (5b), he or she directly sits back on the chair (2a). Therefore:

• In action 5a performed instead, the sensor priority set is: (BCE, CDE, BDE, BCDE); (BE, CE, DE, BCD); BD; BC; CD...
The process of sensor selection may in this case be presented as a tree, as in Figure 8.
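A priority list of this kind can be walked from the most to the least preferred sensor set until a set is found whose sensors are all available. The sketch below uses the priority list for action 5a as given in the text; the function name and the tie-breaking by listed order within a tier are assumptions made for illustration only.

```python
# Priority tiers for action 5a as listed in the text; sets inside one tier are
# treated as equally preferred. The trailing "..." of the source is not expanded.
PRIORITY_5A = [
    ["BCE", "CDE", "BDE", "BCDE"],
    ["BE", "CE", "DE", "BCD"],
    ["BD"],
    ["BC"],
    ["CD"],
]


def select_sensor_set(priority, available):
    """Return the first sensor set whose sensors are all available, else None."""
    for tier in priority:
        for sensor_set in tier:
            # Assumption: ties within a tier are broken by the listed order.
            if set(sensor_set) <= set(available):
                return sensor_set
    return None


print(select_sensor_set(PRIORITY_5A, "BCDE"))  # all sensors present
print(select_sensor_set(PRIORITY_5A, "BCD"))   # sensor E missing
print(select_sensor_set(PRIORITY_5A, "CD"))    # only C and D left
```

Each missing sensor moves the selection further down the list, which corresponds to descending one branch of a selection tree.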

Discussion
The results showed that the selected motor activities of everyday life can be recognized with high reliability by using different kinds of individual sensors as well as their 2-, 3-, or 4-element sets. Although some activities are recognized less reliably by some sensors, in such cases it is possible to use the data from other sensors (see discussion and conclusions in [50]) or sensor sets for which the outcome is more reliable. As can be observed from Table 3 and Figure 4, recognition with sensor sets very often reaches higher values (94.1-100%) than recognition with individual sensors, for any type of activity. The same observation can be made from Table 4a, Table 4b, and Figure 5, which very often show better results for sensor sets (88.3-100%) than for individual sensors, for any volunteer. Opposite cases sometimes occur, but only when an individual sensor with lower recognition for some activity or some volunteer is added to a sensor set. In such a situation, this sensor decreases the recognition of the sensor set, which then falls below that of the other individual sensor (with higher recognition).
To sum up, the individual sensors have complementary scopes of competence, and exchanging them according to the current situation yields better results than using a rigidly defined sensor set.
Studying sensors' performance in the recognition of six elementary daily living activities, we confirmed that particular sensors show their optimal recognition accuracy for different movements (Table 3). Consequently, due to the complementary competencies of sensors, combining information from multiple different sensors is expected to give more reliable recognition. Unfortunately, in compound actions, true recognition falls into the border area or actually moves from the area of competence of one sensor to that of another. This remark was the foundation of the presented concept, design, and prototype of an assisted living system with an adaptive sensor contribution.
Based on the comparison of the accuracy of activity recognition by four different assisted living sensors, we built activity-specific sensor priority lists and proposed a multimodal surveillance system with adaptive sensor's contribution. We used the setup as a model of a sensorized environment in which multiple sensors of possibly different paradigms and performance cooperate in the surveillance of a human. We assumed that sensors not only differ in reliability depending on the subject's action but also give consistent or contradictory results. We proved this assumption in experiments showing that adding sensors may decrease the correctness of recognition (Table 3).
Since the sensor data differ in form and refresh rate, sensor-specific data processing was applied first to provide data in a uniform format before fusion. The sensor-independent format was a list of activities ordered by descending detection probability. Activity data matching and fusion are performed at the list level, which also allows for continuous adaptation of the sensors' contribution to the final result of the network. This proposal has been inspired by a neuromodulatory mechanism, which, although far more complicated, also leads to modulation of the information flow from the senses to the brain.
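As an illustration of fusion at the list level, the sketch below combines per-sensor activity lists by a weighted sum of probabilities, with the weight playing the role of the modulated sensor contribution. The weighted-sum rule, the function name, and all numeric values are assumptions made for this example; the fusion rule actually used in the system is not restated here.

```python
from collections import defaultdict


def fuse_ranked_lists(sensor_outputs, contribution):
    """Fuse per-sensor activity lists into one list ordered by descending score.

    sensor_outputs: {sensor: [(activity, probability), ...]}
    contribution:   {sensor: weight in [0, 1]} (the modulated contribution)
    """
    scores = defaultdict(float)
    for sensor, ranked in sensor_outputs.items():
        weight = contribution.get(sensor, 0.0)  # sensors without a weight are muted
        for activity, probability in ranked:
            scores[activity] += weight * probability
    total = sum(scores.values())
    if total == 0:
        return []
    # Normalize so that fused scores sum to one, then sort by descending score.
    fused = [(activity, score / total) for activity, score in scores.items()]
    return sorted(fused, key=lambda item: item[1], reverse=True)


# Hypothetical outputs of two sensors that disagree on the most probable activity.
outputs = {
    "B": [("2b", 0.7), ("3a", 0.3)],
    "D": [("3a", 0.6), ("2b", 0.4)],
}
print(fuse_ranked_lists(outputs, {"B": 1.0, "D": 0.2})[0][0])
print(fuse_ranked_lists(outputs, {"B": 0.2, "D": 1.0})[0][0])
```

Changing only the contribution weights changes which activity ends up at the top of the fused list, while the sensor-specific processing stays untouched.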
Biomimetic modulation of the sensors' contribution in a multisensory assisted living environment brings out the advantages of individual sensors according to the subject's behavior. Being aware of the limitations present in any human behavior model, we took selected daily living activities as samples in a continuous space of possible behaviors and tried to represent the actual behavior with a measure of similarity to these primitives [34]. In this paper, we showed that sensors, due to the specificity of their working principle, are somewhat 'specialized' in the recognition of particular poses or activities. Consequently, if a compound activity is represented by a set of elementary poses of varying contributions (see Section 7.3), the surveillance system, besides other limitations (see Section 7.2), should optimize the flow of sensor data seamlessly.
Regarding the related works, the main novelty of this paper is the ongoing adaptation of the sensor set depending on the subject's behavior. Since the range of activities is virtually unlimited and the prediction of the most probable future action is uncertain, certain optimization rules had to be proposed and were implemented as:
• Sensor cost: to balance the sensor usage;
• Penalty factor: to balance between multimodal and single mode-switching system;
• Stability check: to maintain the decision on the detected activity while modifying the sensors' contribution.
Since human activity is a dynamic process, the contribution of the sensors needs to be considered as time-varying. To this end, in the design of the multimodal assisted living system with adaptive sensor's contribution, we proposed to consider conservative and predictive adaptation. The conservative adaptation assumes that the sensor contribution is adapted after the activity recognition; if the adapted system then issues a different result, a stability issue arises, which can be solved in several ways (e.g., see Section 6.2). The predictive adaptation requires a database of the subject's habits, which has to be created and trained but already contains a personalized factor. Moreover, the prediction of behavior is never 100% accurate, which needs to be taken into consideration in the design of adaptation rules.
We used four different sensors with quite good performance in the given experimental setup. However, one should also consider more difficult or unstable conditions (e.g., lighting) and simplified sensors (e.g., when energy consumption is taken into consideration). The maximum error the system makes in activity recognition is expected to equal the error of the second sensor.
Conservative adaptation in the two-sensor mode (p > 1) may give erroneous recognition, which (according to Table 3) may affect 5.9% of cases (activity 4b, sensors C and E). The stability check in conservative adaptation prevents the system from changing the recognition decision based on an inappropriate change of sensors. The proposed new sensor set is applied in a subsequent sensing step, and if the previous activity is maintained and the new settings are appropriate, a more accurate recognition will be issued.
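One possible reading of the stability check can be sketched as follows: the recognition decision made with the current sensor set is always kept, and a newly proposed set is adopted for the next sensing step only if it confirms the same activity. This is a simplified illustration under that assumption, not the rule of Section 6.2; the function names, the lookup-table recognizer, and the preferred sets are hypothetical.

```python
def conservative_step(recognize, current_set, propose_set, readings):
    """One sensing step of conservative adaptation with a stability check.

    recognize(sensor_set, readings) -> activity label
    propose_set(activity)           -> sensor set preferred for that activity
    Returns (activity, sensor_set_for_next_step).
    """
    activity = recognize(current_set, readings)
    candidate = propose_set(activity)
    if candidate == current_set:
        return activity, current_set
    # Stability check: the candidate set must confirm the same activity;
    # otherwise the current set is kept and the decision is not changed.
    if recognize(candidate, readings) == activity:
        return activity, candidate
    return activity, current_set


# Toy stand-ins (hypothetical): a lookup table plays the role of the recognizer.
TOY_RESULTS = {
    ("BC", "t1"): "2b", ("BD", "t1"): "2b",
    ("BC", "t2"): "3a", ("BD", "t2"): "5a",
}
PREFERRED = {"2b": "BD", "3a": "BD"}


def toy_recognize(sensor_set, readings):
    return TOY_RESULTS[(sensor_set, readings)]


def toy_propose(activity):
    return PREFERRED[activity]


print(conservative_step(toy_recognize, "BC", toy_propose, "t1"))  # set change confirmed
print(conservative_step(toy_recognize, "BC", toy_propose, "t2"))  # set change rejected
```

In the second call the proposed set BD would report a different activity, so the system keeps both the decision (3a) and the sensor set (BC).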
In predictive adaptation, unexpected behavior may affect the sensor set adaptation, making the newly proposed set inappropriate. In this case, one should again consider that a less accurate sensor may be proposed and that the overall reliability will decrease. Unlike in the conservative case, the subject's history (represented in Figure 7 as the "habits" database) helps to avoid the adaptation mismatch. However, it is worth noting that we used only a single-step prediction (i.e., the next most probable activity was taken as the basis for sensor adaptation), and future studies are necessary to potentially extend the prediction range to a tree of n future activities.
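Single-step prediction from a habits database can be sketched with simple transition counts between consecutive activities. The class and function names, the training sequence, and the preferred sensor sets below are hypothetical; the structure of the actual "habits" database in Figure 7 is not specified here.

```python
from collections import Counter, defaultdict


class HabitsDatabase:
    """Counts observed transitions between consecutive activities."""

    def __init__(self):
        self._transitions = defaultdict(Counter)

    def train(self, activity_sequence):
        """Record every pair of consecutive activities in the sequence."""
        for current, following in zip(activity_sequence, activity_sequence[1:]):
            self._transitions[current][following] += 1

    def predict_next(self, current):
        """Return the most frequent follower of `current`, or None if unseen."""
        followers = self._transitions.get(current)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


def predictive_sensor_set(habits, current_activity, preferred_sets, fallback):
    """Pick the sensor set for the predicted next activity (single-step prediction)."""
    predicted = habits.predict_next(current_activity)
    # The fallback set is used when no prediction or no preference is available.
    return preferred_sets.get(predicted, fallback)


# Hypothetical history: reaching (3a) is usually followed by returning (3b).
habits = HabitsDatabase()
habits.train(["2b", "3a", "3b", "2a", "2b", "3a", "5a", "2a", "2b", "3a", "3b"])
print(habits.predict_next("3a"))
print(predictive_sensor_set(habits, "3a", {"3b": "BD"}, fallback="BCDE"))
```

When the subject bends (5a) instead of returning (3b), the set prepared for 3b is the mismatch discussed above; a fallback set is one simple way to limit its effect, and extending `predict_next` to several steps would correspond to the tree of n future activities.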
The studies presented here were performed with data recorded from specific sensors (including custom sensor-specific software, Figure 2) in the given test environment described by Smoleń [50]. With different sensors, particular findings (such as Table 3) may differ significantly, but the general rule of building sensor set hierarchies is universal and worth following up by other scientists developing multimodal human activity sensing systems. Therefore, we found it reasonable to present the system operation in four case studies rather than to give a quantitative evaluation of setup-specific activity detection efficiency.

Conclusions
Based on the analysis of the performance of four different assisted living sensors in six elementary reversible types of human activity, we proposed analysis rules with adaptive sensor contribution. We applied them to the design of an auto-optimizing multimodal surveillance system and studied its behavior in true-to-life assisted-living scenarios, including compound activities. We pointed out the possible advantages of the complementary competences of the sensors and confirmed the benefits resulting from their adaptive contribution.
Building such a prototype system, combining wearable and infrastructural sensors, is the aim of our next project. Also, the question of initial personalization of recognition and data flow rules needs to be considered again in the context of a working prototype.