Robotic Assistance in Medication Intake: A Complete Pipeline

During the last few decades, great research effort has been devoted to healthcare robots, aiming to develop companions that extend the independent living of elderly people. To deploy such robots into the market, certain applications must be addressed with repeatability and robustness. One such application is assistance with medication-related activities, a common need for the majority of elderly people, referred to from here on as medication adherence. This paper presents a novel and complete pipeline for assistance provision in the monitoring and serving of medication, using a mobile manipulator endowed with action, perception and cognition skills. The challenges tackled in this work include, among others, enabling the robot to locate a medication box placed in challenging spots by applying vision-based strategies, thus allowing robust grasping. The grasping is performed with strategies that allow environmental contact, accommodated by the manipulator's admittance controller, which offers compliant behavior during interaction with the environment. Robot navigation is applied for medication delivery which, combined with active vision methods, enables the automatic selection of parking positions, allowing efficient interaction and monitoring of the medication intake activity. The robot skills are orchestrated by a partially observable Markov decision process mechanism coupled with a task planner. This enables guidance through the assistance scenario and offers repeatability as well as graceful degradation of the system upon failure, thus avoiding uncomfortable situations during human–robot interaction. Experiments have been conducted on the full pipeline, including the robot's deployment in 12 real house environments with real participants, and led to very promising results with valuable findings for similar future applications.


Scope
The societal challenge inherent in the growing elderly population is a worldwide phenomenon, aggravated by the fact that such groups usually suffer from chronic diseases that gradually deteriorate their health status, decreasing their physical and mental capabilities. Recently, significant steps have been made in the context of service robotics for assisted living environments to support older people's independence [1]. According to a literature review by Bedaf et al. (2015), up to 2015 more than 100 assistive robots had been developed to support a wide range of activities of elderly people, among which mobility, self-care, social, and other general-purpose activities can be identified. In this work, the medication assistance scenario is realised through a Partially Observable Markov Decision Process (POMDP) model that can handle the uncertainty of the perception input and promptly decide on the next best robot action. The decision maker interacts directly with a robot task planner capable of orchestrating the robot skills, along with the communication modalities that have been developed, covering the multiple aspects of the medication assistance scenario and thus offering a fully autonomous solution.
To the best of our knowledge, this is the first working solution that addresses the complete pipeline for the medication assistance task involving autonomy, perception, cognition, human action monitoring and task planning. It is important to state herein that the ultimate goal of this paper is to provide solutions to a series of perception, cognition and action problems that will enable, in the near future, the deployment of assistive robots in real home environments. The main novelty and contributions of this work are summarized as follows:
• The development of an integrated vision system able to perform object and human detection and tracking, suitable for the monitoring of medication activities;
• The integration of novel object grasping strategies with environmental contact, enabling object manipulation from challenging spots, accompanied by situation awareness mechanisms;
• The development of safe manipulation and navigation strategies suitable for robotic agents targeting operation in domestic environments with non-expert robot users;
• The identification of the robot skills required for assistance in the medication adherence activity, their development in a modular manner, and their organization under a task planner framework that covers all the corner cases that can be identified within the examined assistive scenario;
• The integration and assessment of all the developed skills in multiple realistic scenarios with various users.

State-of-the-Art Robotic Applications in Medication Adherence Activities
To accurately classify a robot as a personal service or a professional service robot, a precise overview of the task that the robot agent is designated to perform should be provided [7]. In the domain of personal service robots, which is of particular relevance to this work, the majority of robots target domestic applications and can be classified as mobile servant, people carrier, physical assistant, personal care and healthcare assistant robots [8,9]. A representative taxonomy in the domain of healthcare robots differentiates among those that provide physical assistance, those that provide companionship, and those that monitor health and safety. Remarkable assistive robots for the elderly have been developed so far under the scope of research projects, such as the HOBBIT project [1], which combined research from robotics, gerontology, and human-robot interaction in order to develop a healthcare robot capable of preventing falls, detecting emergencies, fetching objects and offering reminders to the users. The ACCOMPANY project was built upon the Care-O-Bot® 3 integrated in a smart-home environment, aiming at empathic and social human-robot interaction, robot learning and memory visualization, as well as the monitoring of persons at home [2]. Although these robots were mainly designed to provide healthcare assistance, the challenging task of assisting in medication tasks through robotic manipulation was scarcely studied, and the provided solutions focused solely on the provision of reminders and notifications. Nevertheless, considerable research has already been conducted in this domain, covering feasibility studies of contemporary robots providing assistance during medication tasks. The need for contemporary agents to assist in medication activities is a global preference for healthcare robots [10].
The authors of [10] conducted a study in a retirement home which substantiates the claim that residents may expect robots to assist them in medication adherence activities. This is further reinforced by the study conducted in [11], which indicates that people in a retirement village that utilizes healthcare robots rate highly the provision of reminders in medication adherence activities. Moreover, apart from the users' preferences, it is important to assess the feasibility of robots assisting in medication sessions, as well as to define the level and type of intervention. A touch-screen and voice-based interface integrated on a robot, introduced in [12], proved to be an effective platform able to proactively support users with their healthcare delivery, providing assistance to elderly individuals with mild cognitive and physical impairments through human-robot interaction, people monitoring and probabilistic action planning for robot motion next to the user. Specifically, in that work, a robotic agent acted as a prompting mechanism, through the establishment of interactive communication, to stimulate the elderly to complete their medication adherence task. These findings are further emphasized by the feasibility study in [13], where, several years later, the same experimental topology was applied to a sample of ten older users. This experiment exhibited remarkable results: the concept of reminder provision through a robot was well received by all users, who successfully completed the session, and most subjects found it easy to use and appropriately designed, and felt confident using it.
However, a recent study [14] reveals that contemporary healthcare robots are mostly tailored to assist in medication adherence activities by providing reminders (regarding the medication adherence schedule) or monitoring the medication sessions, while only a few approaches have also addressed the medication fetching task.
Although the authors in [15] stated that current robot technology does not allow reliable, fully autonomous operation of service robots with manipulation capabilities in the heterogeneous environments of private homes, the early work presented in [16] succeeded in realizing an autonomous service robot capable of serving drinks to users. Bohren et al. equipped a PR2 robot with advanced perception capabilities, planning components including navigation, arm motion planning with goal and path constraints, and grasping modules, as well as a pioneering new task-level executive system to control the overall behavior of the system. A few years later, the same robot was used in the healthcare domain to assist in medication adherence activities [15]; however, the overall system was commanded and controlled remotely by the user through handheld devices. Such works paved the way for the introduction of personal assistive robots into the healthcare domain and can be regarded as predecessors of the work presented herein, which aims to introduce a complete pipeline that addresses autonomous assistance provision in healthcare activities through active communication, robotic manipulation and monitoring of the medication adherence activity, all implemented as modular skills orchestrated by probabilistic decision-making integrated with a modular task planner.

Paper Layout
This paper documents the considerations, analyzes the adopted solutions, and finally assesses the complete framework developed with a real mobile robot manipulator aiming to proactively help in the daily domestic activities of older persons, particularly focusing on the support of the medication task. More specifically, this paper presents:
• the user requirements during medication adherence activities;
• the hardware architecture and the physical implementation of the robotic manipulator;
• the adopted software components that address the user requirements;
• the safety features incorporated within the developed software;
• the experimental results that demonstrate operation in various environments with real users.
In order to effectively address and demonstrate the completion of the above-mentioned objectives, a robotic manipulator has been utilized, namely, the RAMCIP robot. RAMCIP was researched and developed under the scope of the European project "Robotic Assistant for MCI Patients at Home", which envisioned and realized a novel service robot able to proactively assist older persons at the early stages of dementia in a wide range of their daily activities. Assistance in medication adherence comprised the most challenging use case that the RAMCIP robot, focusing on the elderly with Mild Cognitive Impairment (MCI), had to accomplish (see Figure 1), considering its demand for specific perception, action, communication, and cognition capabilities. Therefore, through the detailed description of this scenario, a profound understanding of the robot's capabilities will be revealed. However, it should be stressed herein that the developed software solutions are hardware-independent, meaning that the same algorithms can be adopted by any other robot setup that meets the physical requirements of the scenario. The rest of this paper is organized as follows: After this introductory section, Section 2 presents the requirements posed by the healthcare domain and outlines the basic hardware setup of the utilized robot. Section 3 describes the perception skills developed as part of the robot's vision system. Details regarding the implemented strategies for robotic action capabilities are provided in Section 4, while Section 5 describes the cognitive abilities of the robot. Section 6 exhibits the overall assessment of the developed methods, and discussions regarding findings and limitations are provided in Section 7.

System Architecture
The architecture of our system was designed and developed based on the findings of surveys involving medical staff, patients, and caregivers [17]. The objective of these surveys was to identify and consolidate the functional requirements, the human-robot interaction mechanisms, the design of the robotic assistant and user acceptance aspects [18], parameters that should be determined during the design of a healthcare robot. To this end, both the hardware setup and the adopted software architecture were chosen to methodically contribute to the robot's mission.

Medication Adherence: Use Case Requirements
Given the outcome of the above-mentioned surveys, assistance provision in medication adherence activities comprised a high-priority use case for the users and the caregivers. In accordance with this, the robot should:
• be aware of the medication schedule of the user;
• provide reminders to the user through the communication modalities before a medication session;
• be able to locate, detect and fetch the pill box, especially when the latter is placed in high places difficult for the user to reach;
• monitor and assess the progress of the medication adherence activity;
• be able to place the pill box back in its storage position;
• establish communication with an external person in cases where the medication process has not been completed successfully;
• complete the assistance provision for the medication adherence scenario in a coherent and structured manner, with sufficient repeatability.

Hardware Specifications and Setup
It is evident that the use case requirements could be met by any mobile manipulator equipped with specifically designed software components. However, it is essential to briefly describe herein the physical implementation of the RAMCIP robot in order to present the minimum required hardware specifications, while exhibiting how each hardware component contributed to the completion of the medication assistance scenario. To this end, the basic architectural elements of the utilized robot (see Figure 2) are described as follows: Figure 2. The basic architectural elements of the RAMCIP robot. The robotic platform with the elevation mechanism, the turntable, the arm manipulator and the head have been developed by ACCREA Engineering©. The utilized hand is the smart grasping system of the SHADOW Robot Company, Ltd.©

Mobile Platform:
The mobile platform follows a two-degrees-of-freedom (DoF) differential-drive kinematic model. It provides the locomotion functionality and hosts the entire computational system of the robot. This component is responsible for performing the navigation tasks required for approaching the human and fetching the medication box.
Elevation Mechanism: The elevation mechanism allows the robot to reach both higher (around 1.75 m) and lower (e.g., the floor or a low table) locations with the same robotic arm. This component is essential when the medication box is stored in a high place, where the robot has to extend in order to reach it, or when it has to lower itself to reach a low table.
Arm manipulator: The arm manipulator relies on an 8-DoF kinematic model offering an increased operational workspace that allows the robot to grasp the medication container in various environment topologies. The operational workspace of the 6-DoF arm is extended by the fact that the manipulator is mounted on a rotary actuated turntable combined with the elevation mechanism. To increase safety and endow the robot with compliant manipulation capabilities, a force-torque sensor has been mounted between the arm wrist and the hand.
Hand: The robot is equipped with a three-fingered robotic hand with nine degrees of freedom, suitable for grasping different objects with various grasping strategies. It is also equipped with force sensors at its fingertips that offer situation awareness when transferring the medication box to the user.
Head: The robot has a 2-DoF head equipped with a display for facial expressions that enables robot-user interaction and augments it with affective cues. The actuation of the robot head also moves the RGB-D camera mounted on top of it. On the one hand, the motorized head augments the robot's perception by enabling active vision; on the other hand, it allows natural robot reactions when interacting with the user, where perception and manipulation should be coordinated. A projector for augmented reality (AR) communication is also part of the communication modalities.
Perception System: The robot perception system consists of a motorized RGB-D sensor mounted on the robot's head and two laser scanners mounted on the platform. The RGB-D sensor is utilized for mapping, environment monitoring and human tracking. The laser scanners are utilized mainly for robot localization and navigation and, in cases of camera occlusion, provide rough estimations of the user's location in the environment.

HR-Communication Modalities:
The front of the robot body includes interaction components such as a microphone, a tablet PC and speakers, allowing communication with the user. This module is utilized for the provision of notifications and for the dialogues during a medication adherence session.

Hierarchical Semantic Map
This component is responsible for modeling and storing the environment state in a manner that is interpretable by the robot. A 3D metric map of the explored environment is first constructed. Then, high-level information, such as the state of the objects, the objects' attributes and their in-between relationships, is encoded in a semantic hierarchy, namely the Hierarchical Semantic Map. This structure allows the robot to operate in the 3D environment, i.e., navigate in a 2D environment and manipulate objects, while simultaneously addressing high-level commands such as "bring me the pill box". In addition, this structure allows the robot to perform its tasks by keeping track of the positions of objects within the environment, avoiding time-consuming wandering and searching in a known environment.
In this work, we assume that the robot operates in a known, previously mapped environment. The mapping method adopted herein is an RGBD-SLAM approach capable of operating in large-scale environments and producing 3D metric maps. It employs the on-board RGB-D sensor of the robot and performs incremental visual odometry during the robot's travel by using feature tracking and egomotion estimation with RANSAC constraints, followed by a specific outlier filtering step for better optimization [19]. The robot's estimated poses and the corresponding matched features are treated as a graph, whose nodes correspond to the estimated poses and whose edges correspond to the transformations between nodes. This graph is optimized with certain criteria using the g2o optimization library [20]. By applying the transformation of each edge in the graph to the respective acquired point clouds, a dense 3D map of the explored environment is obtained. The 3D map is constructed by teleoperating the robot during its installation in the environment. The usage of the 3D map is twofold: on the one hand, the 3D map is projected top-down and transformed into a weighted global 2D cost-map to be used for robot navigation, exploiting the differential-drive kinematics of the robot (see Section 4.1.2). On the other hand, the 3D information of the map is exploited for the localization of the "pill box" within the environment; since it is registered with the semantic information (i.e., the XML schema), each time the "pill box" has to be fetched or localized, the 3D information is retrieved in the form of supporting surfaces (considered constant during the process) so as to efficiently detect and distinguish the "pill box" object from the rest of the scene.
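To illustrate the pose-graph idea described above, the following sketch composes the optimized edge transformations into global poses and expresses local point clouds in the map frame. For brevity it uses 2D (x, y, θ) poses rather than the full 3D/SE(3) representation handled by g2o; function names are illustrative, not from the actual system.

```python
import math

def compose(pose, edge):
    """Compose a 2D pose (x, y, theta) with a relative edge transform."""
    x, y, th = pose
    dx, dy, dth = edge
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

def chain_poses(edges, origin=(0.0, 0.0, 0.0)):
    """Accumulate the graph's edge transforms into global node poses."""
    poses = [origin]
    for e in edges:
        poses.append(compose(poses[-1], e))
    return poses

def transform_cloud(pose, cloud):
    """Express a local 2D point cloud in the global map frame."""
    x, y, th = pose
    return [(x + px * math.cos(th) - py * math.sin(th),
             y + px * math.sin(th) + py * math.cos(th)) for px, py in cloud]
```

Applying `transform_cloud` with each optimized node pose to the cloud captured at that node yields the dense map described in the text.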
The hierarchical semantic map associates the 3D metric map with semantic information through an XML schema that encodes semantics of the environment architecture (see Figure 3). The root of the hierarchy in the semantic model is the "house", which comprises a series of "rooms". For each room, there is a descriptive list of the contained large-scale objects. Some general semantic information is stored regardless of the category that a large-scale object belongs to, such as the label of the object (shelf, table, etc.), a list of small-scale objects (e.g., "pill box", "cup") that are affiliated with it, the type of these relationships, its pose with respect to the global map, its affordances (graspable, supporting surface, etc.) and a list of robot parking positions associated with this specific object, from where the robot can stand in order to observe it, interact with it, or interact with the objects lying on it (e.g., parking positions of a table, allowing the robot to grasp objects that lie on it). A detailed 3D model of each small object of interest is also stored in the hierarchical semantic map, which is later utilized by the object detection methods. In the medication scenario examined herein, the type of information stored in the hierarchical semantic map is typically the following: the "pill box" of a specific model (shape/color) is stored on a "high shelf" located in a room (e.g., "kitchen", "living room"), which is defined during the installation phase of the robot in the home environment. The shelf is described by a supporting surface constituted of points P = {p_1, ..., p_n}, where each point retains coordinates p_i = {x_i, y_i, z_i}, expressed with respect to a global coordinate system in the map. The robot's initial parking pose in front of the shelf is r = {x_r, y_r, θ_r}, where x_r and y_r are the robot's location in the map and θ_r is its orientation, assuming motion on planar surfaces.
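The house/room/large-object/small-object hierarchy can be sketched as follows. The XML fragment and tag names below are hypothetical (the actual RAMCIP schema is not reproduced here); the sketch only illustrates how a query such as "where is the pill box and where should I park?" can be resolved against the hierarchy.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of the hierarchical semantic map (illustrative tags).
SEMANTIC_MAP = """
<house>
  <room label="kitchen">
    <large_object label="high_shelf" affordances="supporting_surface">
      <pose x="2.1" y="0.4" theta="1.57"/>
      <parking_pose x="1.6" y="0.9" theta="-1.57"/>
      <small_object label="pill_box" relation="on_top_of"/>
    </large_object>
  </room>
</house>
"""

def locate_small_object(xml_text, label):
    """Return (room, large-object label, parking pose) hosting a small object."""
    root = ET.fromstring(xml_text)
    for room in root.iter("room"):
        for large in room.iter("large_object"):
            for small in large.iter("small_object"):
                if small.get("label") == label:
                    park = large.find("parking_pose")
                    pose = (float(park.get("x")), float(park.get("y")),
                            float(park.get("theta")))
                    return room.get("label"), large.get("label"), pose
    return None
```

Updating the parent ID when the robot moves the pill box, as described below, amounts to re-attaching the `small_object` node under a different `large_object` node.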
All the above-mentioned semantics constitute essential information required to precisely locate the "pill box" within the house environment. In situations where the robot fetches the "pill box" from the high shelf to the "table", the parent ID of the small object "pill box", i.e., the corresponding large object, is updated accordingly in the XML model. The supporting surface defined over a large object, e.g., the "table", is passed to the hierarchical semantic map during the robot installation phase in the environment by associating the respective points in the initially constructed 3D map.

Object Detection and Monitoring
The objective of this component is the detection and pose estimation of the small objects involved in the medication adherence activity, both for their grasping by the robot and for their state monitoring during the observation of the medication adherence activity. The objects involved in the studied activity are the "cup" and the "pill box". Their initial detection and pose estimation is performed with the method developed in [21] within the scope of the RAMCIP project. Naturally, this method could be replaced by any other method capable of performing object detection and 6D pose estimation; however, we briefly and conceptually describe it herein for the sake of completeness. Specifically, during the training phase, the method extracts 2.5D depth-invariant patches, which cover the same area of an object regardless of their distance from the camera, from rendered synthetic color and depth images captured from multi-view textured 3D models of the object. The extracted patches are projected to the 3D camera coordinate system and passed to an autoencoder-decoder network for feature extraction. The features, along with the pose of the respective patch and the corresponding labels, are utilized to train a Hough Forest, which, like a regular random forest, allows object classification. During the inference phase, a hypothesis verification step is further utilized to select the subset of the total detections made by the classifier that best explains the examined scene. Following a scene segmentation and specific rule-based criteria during the re-projection of each model with a specific pose, deductions about the object ID and its pose are made. This method is utilized for the pose estimation of the "pill box" during grasping, as well as for the initial detection of the "pill box" and the "cup" in the scene, which seeds the object monitoring component.
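The hypothesis verification step can be hard to picture in the abstract; the following is a deliberately crude 2D stand-in (essentially greedy non-maximum suppression over scored detections), not the scene-level re-projection criteria of [21], included only to convey the idea of keeping the hypothesis subset that best explains the scene.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def verify_hypotheses(detections, overlap=0.3):
    """Greedily keep the highest-scoring, mutually non-overlapping hypotheses.

    detections: list of (label, score, box) tuples; two hypotheses claiming
    largely the same region cannot both explain the scene, so the weaker
    one is discarded."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(det[2], k[2]) <= overlap for k in kept):
            kept.append(det)
    return kept
```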
For the assessment of the medication adherence activity, and assuming that the robot has successfully reached an appropriate monitoring position, the object monitoring component is initialized. Firstly, a workspace is defined with 3D boundaries wrapped above the supporting surfaces of the large objects that are inside the field of view of the robot's camera. The object detection algorithm described above is activated to retrieve the exact position and orientation of the instances of the object categories considered relevant to the ongoing activity (in the case of medication adherence, these are the "pill box" and the "cup"). The retrieved poses are used to project the 3D object models into 3D space and hence to obtain a geometrical estimation of the volumes that the detected objects occupy within the workspace. By extracting an Octree representation of the whole workspace, the volumes that correspond to these objects are tracked. For as long as such a volume remains occupied, the corresponding object is labeled as present in the scene. When the occupancy of the specified volume falls below a certain threshold, signifying that the corresponding object has been removed by the user, the object's label shifts to not-present. Other small objects possibly existing in the workspace are treated as irrelevant and are neither identified nor labeled; however, they are considered part of the rest of the scene. The same Octree representation is responsible for detecting any changes in the rest of the monitored scene. If a change that corresponds to a new cluster entering the scene is reported, the object detection module is re-executed on that cluster only, to determine whether any of the objects labeled as not-present were placed back in the workspace of the large object by the user.
To avoid occlusion effects and resolve ambiguities in the tracked estimated volumes, the human's hands are masked using the skeleton tracker information and excluded from the procedure. A graphical representation of the monitoring module, along with the possible states of the monitored objects, is depicted in Figure 4. For objects identified as not-present in the workspace, it is assumed that the user is interacting with them, while, for objects identified as present in the workspace, it is assumed that there is currently no interaction of the user with the object. The main assumption made in this work is that the same "pill box" and "cup" objects were employed during the entire evaluation procedure; this is reasonable since the main objective of this work is to evaluate the performance of the overall system. Otherwise, modeling of the user's medication box and cup would have to be performed in each new environment; thus, we adopted the same objects among different setups for simplicity.
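The present/not-present volume tracking described above can be sketched as follows. A set of quantized voxel indices stands in for the Octree leaves, and the occupancy-ratio threshold is an assumed parameter; the class and function names are illustrative, not those of the actual monitoring module.

```python
def voxelize(points, size=0.05):
    """Quantize 3D points into a set of occupied voxel indices
    (a coarse analogue of the Octree leaves)."""
    return {(int(x // size), int(y // size), int(z // size)) for x, y, z in points}

class ObjectMonitor:
    """Track whether a detected object's volume remains occupied.

    A simplified stand-in for the Octree-based monitoring: the reference
    voxels come from projecting the object's 3D model at its detected pose."""
    def __init__(self, reference_points, threshold=0.5):
        self.reference = voxelize(reference_points)
        self.threshold = threshold
        self.state = "present"

    def update(self, scene_points):
        occupied = voxelize(scene_points) & self.reference
        ratio = len(occupied) / len(self.reference)
        self.state = "present" if ratio >= self.threshold else "not-present"
        return self.state
```

When the monitored ratio drops below the threshold, the object is assumed to have been picked up by the user; a subsequent rise (after re-detection on the new cluster) flips the label back to present.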

Human Understanding in the Scene
This component is responsible for apprehending the human's presence in the environment, as well as for understanding the medication adherence activity. The human presence is modeled by exploiting human detection and tracking solutions based on multiple modalities. Activity understanding is a custom solution that combines geometrical information of the human skeleton limbs with information regarding the state of the small objects (e.g., the "pill box") involved in the medication adherence activity, as analytically described in our previous work [22].
In particular, for human detection in the scene, two modalities have been fused. The first one is a human detection and tracking framework suitable for operation with low-cost RGB-D sensors in real-life situations, addressing limitations such as body part occlusions, partial body views, sensor noise and interaction with objects, extensively described in [23]. The second one is a laser-based human tracker mainly inspired by the work presented in [24]. The two methods are fused by a rule-based method, an analytic description of which can be found in our previous work [25]. Based on this, the robot is constantly aware of the human's pose P_h = (x_h, y_h, z_h, θ_h) in the environment, expressed in a common human-robot coordinate system O_map. The human pose P_h is later exploited by the robot's global path planner to select a parking position around the human and initiate the human-robot interaction.
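A minimal sketch of such a rule-based fusion is given below. It mirrors only the spirit of the rules in [25] (prefer the richer RGB-D estimate, fall back to the laser tracker under occlusion); the exact rules and any confidence handling of the real system are not reproduced.

```python
def fuse_human_pose(camera_pose, laser_pose):
    """Rule-based fusion of the two human trackers.

    camera_pose: (x, y, z, theta) from the RGB-D tracker, or None if occluded.
    laser_pose:  (x, y, theta) from the laser tracker, or None if lost.
    Both are expressed in the shared O_map frame."""
    if camera_pose is not None:
        return camera_pose            # full-body estimate, preferred
    if laser_pose is not None:
        x, y, theta = laser_pose      # lasers give only a rough 2D position
        return (x, y, 0.0, theta)
    return None                       # human currently untracked
```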
Understanding of the medication adherence activity relies on the detection of a sequence of human actions together with the concurrent detection of the objects manipulated by the human. For human action recognition, the geometric approach described in [26] has been exploited, which is based on the geometric topology of the skeleton joints. The combination of such a geometrically inferred action with a manipulated object indicates the execution of an action with semantic context; e.g., the action hand-to-mouth with the object cup being not-present in the monitored area designates the action drinking. The human actions and the objects manipulated along with them are graphically illustrated in Figure 5. The sequence of the actions is not binding; rather, to confidently confirm the observation, the robot expects to identify all the action/object triplets in order to infer a correct medication adherence activity.

During assistance in a medication adherence scenario, the robot navigates from one place to another in order to perform perception- or action-based tasks, e.g., reaching a parking position for human monitoring, or reaching a parking position for grasping the "pill box". To achieve this, the robot should be able to autonomously navigate from one place to another within the environment. Thus, a global path planner (GPP) is essential for autonomous navigation, since it is responsible for navigating the robot platform from its current to a target location in order to execute a task. The GPP exploits a priori collected data, which in our case is the static map created by the hierarchical semantic mapping framework, expresses the acquired information in the configuration space, i.e., as a cost-map, and then searches for the optimal path according to the respective search algorithm.
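The order-independent triplet matching used by the activity understanding described earlier in this section can be sketched as follows. The specific triplet set is a hypothetical example; the actual triplets used by the module may differ.

```python
# Hypothetical (action, object, object-state) triplets that jointly
# confirm a medication intake; illustrative, not the system's exact set.
REQUIRED_TRIPLETS = {
    ("reach", "pill box", "not-present"),        # user picked up the pill box
    ("hand-to-mouth", "pill box", "not-present"),
    ("hand-to-mouth", "cup", "not-present"),     # drinking
}

def medication_adherence_completed(observed):
    """The action order is not binding: the activity is confirmed only once
    every required (action, object, state) triplet has been observed."""
    return REQUIRED_TRIPLETS.issubset(set(observed))
```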
The RAMCIP navigation system exploits the D* Lite global path planner, as presented in [27], which is a fast path planning and re-planning algorithm suitable for goal-directed robot navigation in partially known or unknown terrain. D* Lite constitutes an extension of Lifelong Planning A* (LPA*) [28], and it is capable of re-planning a new shortest route from the robot's current position to the given goal when new obstacles appear.
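For readers unfamiliar with this family of planners, the sketch below finds a shortest grid path with plain A*. Note that this is only a stand-in: D* Lite reaches the same shortest path but incrementally repairs its previous search when new obstacles appear, instead of re-planning from scratch as done here.

```python
import heapq

def astar(grid, start, goal):
    """Shortest path on a 4-connected occupancy grid (1 = obstacle).

    Returns the list of cells from start to goal, or None if unreachable.
    Plain A* with a Manhattan heuristic; an illustrative stand-in for the
    incremental D* Lite planner used by the real system."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start, None)]
    came_from, cost = {}, {start: 0}
    while frontier:
        _, g, node, parent = heapq.heappop(frontier)
        if node in came_from:        # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:             # reconstruct the path by backtracking
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                if g + 1 < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = g + 1
                    heapq.heappush(frontier,
                                   (g + 1 + h((nr, nc)), g + 1, (nr, nc), node))
    return None
```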
The selection of the most appropriate parking position is a custom solution developed in the scope of this work that allows optimal parking of the robot based on the criteria imposed by the task that it is designated to perform. Depending on the context of the task, the robot navigation goal is elaborated to incorporate active vision benefits. The RAMCIP robot is equipped with a single RGB-D camera and laser scanners with an approximately 360° combined field of view which, depending on the task, enable it to sufficiently track the human actions, the small objects to be detected in the scene, and the manipulator's workspace in order to reach and grasp the small objects, i.e., the "pill box". In general, the method relies on the selection of a (X_T, Y_T, θ_T) pose in the map and the generation of a circular area around it, on whose perimeter the robot is able to park. The radius of this area is regulated by the context of the ongoing task T. Given (X_T, Y_T, θ_T), the robot searches along this perimeter for an optimal observation range, identifying a suitable parking pose that satisfies the requirements of frontally facing the interaction object and of fitting the robot's footprint among the static obstacles of the global metric map. The clearance of the robot's footprint with respect to the static metric map is checked with a spatial decomposition technique, i.e., a kd-tree, by searching with a neighborhood radius equal to the robot's footprint radius, thus discarding those poses whose footprint intersects the static obstacles [29]. The selection of the most appropriate parking pose is performed by applying a Euclidean distance minimization criterion between the robot's current pose and the candidate ones.
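The steps above can be sketched as follows. Candidate poses are sampled on the task-dependent circle, oriented to face the target; for clarity a linear scan of the obstacle points replaces the kd-tree neighbourhood query (the kd-tree only speeds the query up, it does not change the result), and the footprint radius and sample count are assumed values.

```python
import math

def candidate_parking_poses(target, radius, obstacles, footprint=0.35, samples=36):
    """Sample poses on a circle of the given radius around a target (x, y),
    keeping those whose circular footprint clears all static obstacle points.
    Each kept pose (x, y, theta) is oriented to face the target frontally."""
    tx, ty = target
    poses = []
    for k in range(samples):
        a = 2 * math.pi * k / samples
        x, y = tx + radius * math.cos(a), ty + radius * math.sin(a)
        if all(math.hypot(x - ox, y - oy) > footprint for ox, oy in obstacles):
            poses.append((x, y, math.atan2(ty - y, tx - x)))
    return poses

def select_parking_pose(robot_xy, poses):
    """Euclidean distance minimization: pick the candidate closest to the
    robot's current position."""
    rx, ry = robot_xy
    return min(poses, key=lambda p: math.hypot(p[0] - rx, p[1] - ry))
```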
Parking strategy for Object Grasping: This is a two-step reaching strategy, in accordance with which the robot firstly selects a parking position with respect to the supporting surface of the small object and navigates towards it and, then, upon detecting the small object from a distance, infers a parking position convenient for the object's grasp. Specifically, from its current position, the robot recalls from the hierarchical semantic map the semantics of the "pill box", based on which the large object ID, i.e., "kitchen table", is retrieved along with its center of mass (X_kt, Y_kt, Z_kt, θ_kt), which declares its location in the house. A circular area is defined around (X_kt, Y_kt), the radius of which is regulated by the object detection algorithm (see Section 3.2), which is limited to operate up to 1.5 m. The algorithm infers the candidate parking poses, and the most suitable one is selected based on its shortest distance from the current robot location. Then, the object detection module for "pill box" identification and pose estimation is activated and, upon detection of the small object's pose (X_so, Y_so, Z_so, θ_so), a circular area with a radius of up to 1 m, which roughly defines the robot manipulator's workspace while avoiding singularities, is defined and candidate parking positions are again inferred. The robot selects the one closest to its current location and navigates to it in order to perform object detection and grasping again. Parking strategy for Human Activity Monitoring: This is a one-step strategy, and it is used by the robot to approach the human for activity recognition or communication.
Based on the human's pose (x_h, y_h, z_h, θ_h) in the environment (see Section 3.3), expressed in a common human-robot coordinate system O_map, the robot infers a circular area around the human whose radius is imposed by the "personal space" criterion, inspired by the proxemics theory introduced in [30], i.e., a radius greater than 1.2 m, as suggested in [31]. The robot then searches outside this perimeter for a suitable parking pose that satisfies the requirements of frontally facing the human and fitting the robot's footprint among the static obstacles of the global metric map.
Parking strategy for Object Release: This is again a two-step strategy, in accordance with which the robot firstly parks with respect to the supporting surface upon which a small object should be released and, then, upon detecting the center of mass of the largest clear sub-surface (x_s, y_s, z_s, θ_s) from a distance, it infers a parking position convenient for the object release. The suitable clear sub-surface is estimated by exploiting the depth information of the camera to compute the largest convex hull, on the plane extracted from the respective large object, that does not enclose any objects. In such a way, collisions with other objects during the release of a small object are avoided.

Local Planning
The global plan is generated on-demand each time the robot is commanded to navigate from one place to another under inherent limitations of on-board environment perception and mapping systems. Therefore, an additional reactive local path planner is necessary for safe navigation, which accounts for dynamic environmental constraints such as moving obstacles as well as uncertainty in the sensing inputs. Naturally, this introduces a requirement of real-time capability of the local planner. We use the dynamic window approach (DWA) as it is suitable for real-time execution, and it allows a configurable structure and parameter tuning, based on the initial work described in [32].
As input, the DWA local planner takes the set of waypoints computed in the global path, odometry information containing pose and velocity, the local map representation and the robot representation in a 2D plane. To simplify the problem setup while enhancing the safety considerations, the robot is approximated by a convex polygon obtained by projecting its 3D shape onto the floor surface, accounting for the current arm pose at each control update (Figure 6). The local planning computations are performed on the proximal area around the robot, a grid of approximately 3 × 3 m generated by merging the sensor data from an RGB-D camera and laser range scanners. Finally, optimization is performed in the velocity space of this area (containing both translational and rotational components) by maximizing a value function, while taking into account robot velocity and acceleration limits. The result is a locally optimal velocity vector. The entire procedure is given in detail in pseudo-code in Algorithm 1. First, the local planner parameters are parsed, containing information such as the dynamic constraints of the robot, the desired velocity space discretization and the minimum distance to the current local goal, which is chosen from the global path. Second, the velocity space is discretized into a dynamic window, accounting for maximum velocities and accelerations within one discretization step, and a set of admissible velocities V_admissible is obtained. The set V_admissible is then pruned by looping through velocity vectors v_i and removing ones that are deemed dangerous or lead to a collision. This is achieved by projecting the robot polygon along a motion defined by v_i, checking for intersections with the obstacle set O_local and solving for the shortest time t_i until a collision. The result is a set of safe velocities V_safe, where each v_i has a corresponding time-to-collision t_i assigned to it.
The core pruning step of Algorithm 1 is summarized as follows:

V_safe ← ∅ ▷ set of safe velocity vectors
for v_i ∈ V_admissible do
  if not (FLAG_coll and FLAG_danger) then
    V_safe ← V_safe ∪ {(v_i, t_i)}
  end if
end for
if V_safe == ∅ then
  STOPANDREPLAN()
end if

Selecting the optimal velocity vector within V_safe requires designing a value function that accounts for several factors. We define the following heuristic utility functions as its components, heading, speed and distance, evaluated for each safe v_i. Their effects are the following:
• heading: rewards goal-directed motions;
• speed: rewards high linear velocities to enforce fast goal-directed motions whenever possible;
• distance: rewards long predicted times t_i until collision.
The effect of each utility function is weighted by w_i and the components are summed into a value function:

G(v_i) = w_heading · heading(v_i) + w_speed · speed(v_i) + w_distance · distance(v_i)

The combination of heading and speed generates an attractor behavior towards the local goal. In the presence of obstacles, the distance utility shifts the optimum of the total value function towards a deviating motion around the obstacle. Figure 7 illustrates such a scenario: the robot is placed in front of an obstacle in the x-direction of the robot coordinate system (see Figure 6), and it is shown how the value drops in the direction of the obstacle. Nevertheless, in situations when the robot detects dynamic obstacles, a frequently updated global path re-planning is performed to ensure smooth robot motion. The global plan update frequency is set to 5 Hz, constituting a decent compromise between unnecessary computational burden and smooth change in robot heading, considering the robot's maximum velocity.
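The dynamic-window pruning and scoring described above can be sketched compactly. This is a minimal, illustrative DWA for a point robot with a unicycle model, not the planner deployed on RAMCIP; the parameter names, arc simulation and utility weights are assumptions:

```python
import math

def norm_angle(a):
    """Wrap an angle to (-pi, pi]."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi

def simulate(pose, v, w, dt, steps):
    """Forward-simulate a constant (v, w) command with a unicycle model."""
    x, y, th = pose
    traj = []
    for _ in range(steps):
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        th += w * dt
        traj.append((x, y, th))
    return traj

def dwa_select_velocity(pose, vel, goal, obstacles, p):
    """Build the dynamic window around the current velocity, drop (v, w) pairs
    whose simulated arc comes closer to an obstacle than the robot radius, and
    maximize the weighted sum of heading, speed and clearance utilities."""
    v0, w0 = vel
    best, best_score = None, -math.inf
    v = max(0.0, v0 - p["acc_v"] * p["dt"])
    while v <= min(p["v_max"], v0 + p["acc_v"] * p["dt"]) + 1e-9:
        w = max(-p["w_max"], w0 - p["acc_w"] * p["dt"])
        while w <= min(p["w_max"], w0 + p["acc_w"] * p["dt"]) + 1e-9:
            traj = simulate(pose, v, w, p["dt"], p["steps"])
            clearance = min((math.hypot(px - ox, py - oy)
                             for (px, py, _) in traj for (ox, oy) in obstacles),
                            default=p["clear_max"])
            if clearance > p["robot_radius"]:  # keep safe velocities only
                ex, ey, eth = traj[-1]
                heading = math.pi - abs(norm_angle(
                    math.atan2(goal[1] - ey, goal[0] - ex) - eth))
                score = (p["w_heading"] * heading + p["w_speed"] * v
                         + p["w_clear"] * min(clearance, p["clear_max"]))
                if score > best_score:
                    best, best_score = (v, w), score
            w += p["res_w"]
        v += p["res_v"]
    return best  # None signals stop-and-replan
```

With the goal straight ahead and no obstacles, the highest admissible forward velocity with zero rotation wins; an obstacle blocking every arc in the window yields `None`, mirroring the stop-and-replan branch.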

Manipulation and Admittance Control
In the medication adherence scenario, the robot has to perform dexterous manipulation in order to reach for, grasp and release the "pill box". These manipulation actions (as described in Section 4.3) necessitate environmental contact that demands compliant behavior. In addition, this compliant behavior is required for the safety of the surroundings and the human. In more detail, grasping a "pill box" from a table or shelf requires the robot to be compliant in case the robot hand/finger collides with the table surface, which would potentially cause a damaging level of interaction force. The compliant control of the RAMCIP robot is achieved by means of an admittance controller with a 6 DoF force-torque sensor at the wrist, which generates the behavior of a spring-mass-damper system with the following dynamics:

M_d ẍ + D_d ẋ + F_K(x, x_d) = F_ext − F_d

with the desired virtual mass M_d, damping D_d, stiffness function F_K(x, x_d), and external force F_ext. The set-point of the controller is given by the desired pose, x_d, and the desired force, F_d. To bound the excessive interaction force that a large deviation from the desired pose would otherwise produce, the spring component is saturated at a maximum force F_max.
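For intuition, a one-degree-of-freedom version of such a saturated admittance law can be integrated numerically. The function below is only a sketch of the stated dynamics, not the robot's controller; all names and gains are assumptions:

```python
def admittance_step(x, x_dot, x_des, f_ext, f_des, M_d, D_d, K, f_max, dt):
    """One Euler step of a 1-DoF admittance law
    M_d*x_ddot + D_d*x_dot + sat(K*(x - x_des)) = f_ext - f_des,
    where the spring term is saturated at f_max so that a large pose deviation
    cannot command an excessive interaction force."""
    f_spring = max(-f_max, min(f_max, K * (x - x_des)))
    x_ddot = (f_ext - f_des - D_d * x_dot - f_spring) / M_d
    x_dot += x_ddot * dt
    x += x_dot * dt
    return x, x_dot
```

With no external force, the virtual system settles at the desired pose; once the deviation exceeds f_max/K, the commanded force stays clipped at f_max regardless of how far the arm is pushed.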
The RAMCIP robot has a redundant manipulator consisting of an elevation mechanism, a turntable link and a 6 DoF arm. The joints differ in terms of range (i.e., joint position limits) and type (linear or rotational). Namely, the RAMCIP manipulator has 8 DoF in total, which is redundant for Cartesian space motion. Therefore, the control problem can be split into two separate components: end-effector motion and null-space motion. Specifically, the mapping of joint velocities q̇ to task velocities ẋ is unique, while the mapping of task velocities to joint velocities is not. The joint velocity can thus be decomposed into an operational space velocity and a null-space velocity by

q̇ = J#_W(q) ẋ + N q̇_0

with J#_W(q) being a pseudoinverse of the Jacobian matrix and N the projection matrix to the null-space. The pseudoinverse is given by

J#_W(q) = W⁻¹ Jᵀ(q) (J(q) W⁻¹ Jᵀ(q))⁻¹

with W being a symmetric positive definite weighting matrix. Null-space motion is used as a reactive strategy to fulfill low priority tasks. Initial trials with the RAMCIP kinematic structure have shown that avoiding joint limits by pure reactive null-space motion is not sufficient to hold the position constraints while executing the motion with acceptable performance. In addition to the null-space motion, the null-space projection can be modified by adapting the weighting matrix. Changing the weighting matrix does not influence the Cartesian motion of the end-effector, but only the linear combination of joint velocities. To implicitly avoid limits, we propose online null-space projection shaping, which adapts the weighting matrix online. To avoid joint limits, the weighting matrix is designed to favor joint motions which point away from the limit more than motions towards it.
The joint limit avoidance weighting matrix is given by

W(q, q̇) = diag(w_1(q_1, q̇_1), ..., w_N(q_N, q̇_N)) (6)

with

w_i(q_n,i, q̇_n,i) = 1 − tanh(q³_n,i) (atan(q̇_n,i)/π + 0.5)

where q_n,i denotes the normalized position of joint i with respect to its limit. The weight decreases for joint motion near the limit and towards the limit, and increases for motion away from the limit. By using the online null-space projection approach, joint limit avoidance is implicitly amplified, increasing the overall performance of the RAMCIP system.
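The weighted pseudoinverse at the core of this scheme is easy to reproduce. The sketch below is illustrative (toy Jacobian, no robot model) and shows the mechanism the joint-limit weights exploit: increasing a joint's weight shifts task-space motion to the other joints without changing the end-effector velocity:

```python
import numpy as np

def weighted_pinv(J, W):
    """Weighted Jacobian pseudoinverse J#_W = W^-1 J^T (J W^-1 J^T)^-1.
    The joint velocity q_dot = J#_W @ x_dot minimizes q_dot^T W q_dot, so
    joints with a larger weight contribute less to the task-space motion."""
    W_inv = np.linalg.inv(W)
    return W_inv @ J.T @ np.linalg.inv(J @ W_inv @ J.T)
```

For a toy two-joint arm with J = [1 1] and ẋ = 1, a uniform W splits the motion equally between the joints, whereas W = diag(1, 10) moves most of the velocity to the first joint while J q̇ = ẋ still holds exactly.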

Grasping
In the average domestic environment, objects are usually placed on planar support surfaces, which pose an environmental constraint for current state-of-the-art grasp planners. Such planners attempt to plan grasps that avoid collision with the environment and often fail to find a solution for flat or small objects [33], like the "pill box". In this work, we use two grasp strategies developed within the scope of RAMCIP [34,35]. These strategies exploit environmental contact for grasping the "pill box": grasping from a table and grasping from a high shelf. The first strategy is general for flat objects that can be reached from the top, while the second is typical for flat objects that are placed on a high surface and which the robot cannot reach from the top.
Both proposed strategies consist of three consecutive Grasp State targets GS = [x_G, q], where x_G ∈ SE(3) is the pose of the palm's frame {G} (Figure 8a) w.r.t. the target object frame {O} (Figure 9) and q ∈ R^n are the joint positions of the hand with n joints: the Initial Grasp State (IGS) is a hand configuration that, in general, does not involve external contacts; the Pregrasp State (PGS) involves finger contact with the environment and/or the object; while in the Final Grasp State (FGS) the object is securely grasped and no longer supported by the surface. The IGS is planned online given the current scene, while the PGS and FGS are dynamically achieved during execution by the action of the grasp controller. Multiple desired IGSs are generated to provide different possibilities for the grasp, so that a feasible one can be selected subject to further constraints, such as obstacles surrounding the target object or kinematic constraints of the arm. The assumption is that a module exists which can check whether an IGS is feasible with respect to the obstacles and the robot/hand kinematics. The strategies require as input the pose and the bounding box of the "pill box", which are extracted by the object detection and identification module, retrieving its model from the hierarchical semantic map. With respect to the support surface, the strategies require the surface normal ŝ_n and the surface edge closest to the object, which defines ŝ_l, a vector perpendicular to this edge, as shown in Figure 9. Furthermore, the strategies require compliant fingers and arms, a requirement fulfilled by the RAMCIP robot, which has inherently compliant fingers, while the compliance of the arm is realised actively, as described in the previous section.

Grasping the "Pill Box" from a Table
The concept of this strategy is shown in Figure 10. After the end-effector has reached the selected IGS, it approaches the support surface (table) alongŝ n under force control in order to land compliantly and ensure the proper relative orientation of the hand with the table, achieving the PGS. Subsequently, the fingers close while maintaining contact with the surface until they establish contact with the object and exert a predefined or learned grasping force, reaching the FGS by lifting the object. By involving direct contact with the table, robustness to estimation errors regarding both the object pose and the support surface normal is achieved. Figure 10. The concept of the grasp strategy for grasping the "pill box" from a table.

The set of produced IGSs places the palm frame {G} w.r.t. {O} with position p_OG = λ ŝ_O,n and orientation R_OG = [−û × ŝ_O,n, û, −ŝ_O,n], where λ ∈ R is a small constant representing the distance of the palm from the support surface, ŝ_O,n is the surface normal w.r.t. {O} and û is one of the normal vectors of the bounding box's faces which are not parallel to the support surface. The configuration q of the fingers is the same for all IGSs and is produced using physical human interactive guidance, as shown in Figure 8a.
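The IGS construction above can be sketched directly. The helper below is illustrative only (the default λ and the parallelism threshold are assumptions); it enumerates the bounding-box face normals not parallel to the surface and builds the corresponding palm orientations:

```python
import numpy as np

def table_grasp_igs(s_n, box_face_normals, lam=0.05):
    """Enumerate candidate IGS palm poses for the table grasp: position
    lam * s_n above the object, and orientation [-u x s_n, u, -s_n] for every
    bounding-box face normal u that is not parallel to the support normal s_n."""
    s_n = np.asarray(s_n, dtype=float)
    s_n = s_n / np.linalg.norm(s_n)
    poses = []
    for u in box_face_normals:
        u = np.asarray(u, dtype=float)
        u = u / np.linalg.norm(u)
        if abs(np.dot(u, s_n)) > 0.99:  # skip faces parallel to the surface
            continue
        R = np.column_stack([-np.cross(u, s_n), u, -s_n])
        poses.append((lam * s_n, R))
    return poses
```

For an axis-aligned box on a horizontal table (ŝ_n along z), the four lateral face normals yield four candidate IGSs, each with a proper rotation matrix orienting the palm towards the surface.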
In order to reach the PGS, we command the arm with a wrench F_cmd_G, which is the sum of two components: one force opposite to the surface normal, which results in a downward arm motion for establishing contact, and one external wrench induced by the contact forces applied by the fingertips to the environment, required for the proper translation and rotation of the hand, so that full contact with the three fingers is established. The contact forces are measured by the Optoforce sensors with which the RAMCIP hand is equipped at its fingertips. In the equilibrium, the two wrenches are equal and the motion stops. The PGS is reached when the velocity of the hand is zero, which means that the fingers have successfully landed on the surface. In particular, consider the frame {M} placed on the centroid of the fingertips, p_M = (1/n) Σ_{j=1,...,n} p_N_j, with the same orientation as {G}, and {N_j} the frame placed on the j-th fingertip (Figure 8a). The total commanded wrench, applied and expressed in {G}, is the following:

F_cmd_G = −a [ŝ_G,n; 0] + K Σ_{j=1}^{n} [(I_3 − ŝ_M,n ŝᵀ_M,n) f_N_j ; p̂_MN_j f_N_j]

where a is the magnitude of F_d of Equation (2), ŝ_G,n and ŝ_M,n are ŝ_n expressed in {G} and {M}, respectively, K is a diagonal matrix of gains, I_3 the identity matrix, p̂_MN_j the skew-symmetric matrix of p_MN_j and f_N_j ∈ R³ the external contact force measured on fingertip j. After landing, the fingers start closing, using the joint position controllers, in order to reach the FGS. While the fingers close, they exert forces on the wrist of the arm. Due to its compliance, the arm moves upwards while the fingers maintain contact with the support surface.

Grasping the "Pill Box" from a High Shelf
The concept of this strategy is shown in Figure 11. Due to the size of the "pill box", this strategy uses two of the three fingers of the RAMCIP hand, as shown by the IGS finger configuration in Figure 8b. After the hand reaches the IGS, it lands on the object with one finger, which we call the dominant finger (Figure 11), reaching the PGS. Then, the opposable finger, which we call the residual finger, closes until an external force, measured by the residual finger's Optoforce sensor, is encountered upon establishing contact on the opposite side of the shelf or on the cupboard door, if any. Then, the arm starts moving away from the surface along the direction of ŝ_l, invoking the object's sliding on the surface. The predefined perpendicular force exerted on the object by the dominant finger should be such that it ensures adequate friction between the fingertip and the "pill box", sustaining the tangential forces which drive the sliding. Finally, when the surface constraint vanishes for the residual finger, a fast finger closing establishes an opposable grasp of the "pill box". The IGSs of this strategy are calculated as a desired pose of the dominant fingertip on top of the object, where ε ∈ R is an offset and d is the distance of the object from the closest surface edge. The above implies that the algorithm produces IGSs only if the distance of the fingertip from the palm is larger than the distance of the "pill box" from the shelf's edge. In the opposite case, the object is not reachable and the palm would collide with the edge of the surface; a case which is not considered by the strategy. The orientation of the {G} frame is calculated from the desired orientation of the dominant fingertip R_N. This rotation matrix is rotated about the x-axis by multiple angles θ ∈ [−π/2, π/2] rad, ensuring the generation of multiple IGSs for selection.
In this strategy, the PGS is reached using the same algorithm as in the previous strategy, applied only to the dominant finger; consequently, the wrench component produced by the contact of the fingertip with the "pill box" is zero, as {M} ≡ {N}.
For reaching the FGS, the arm moves in Cartesian space along the direction of ŝ_l, while the residual proximal joint velocity is governed by the control law q̇_r = K(f_ref − |f_N_r|), where q̇_r is the joint velocity, K is a diagonal positive gain matrix, f_ref ∈ R is the desired grasping force and |f_N_r| is the norm of the force measured on the tip of this finger. This simple controller commands a velocity proportional to the error of the measured force norm, which means that the finger will close until establishing contact with the surface and eventually with the object with force magnitude f_ref, achieving an opposable grasp with the dominant finger. Notice that q̇_r is high enough to produce a fast finger motion for snapping the residual finger onto the object during the withdrawal of the arm from the shelf.
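This force law is a simple proportional controller and can be sketched in a few lines. The linear-spring contact model used in the usage example is purely illustrative; gains, stiffness and step size are assumptions:

```python
def residual_finger_step(q_r, f_meas, f_ref, K, dt):
    """One step of the residual-finger law q_dot_r = K * (f_ref - |f_meas|):
    the joint keeps closing at a rate proportional to the force error until
    the measured fingertip force norm reaches the desired grasping force."""
    return q_r + K * (f_ref - abs(f_meas)) * dt
```

Against an illustrative contact modeled as a linear spring of stiffness k_env beyond a contact angle q_contact, the joint settles at q_contact + f_ref/k_env, i.e., pressing with exactly the desired force.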

Slippage Detection and Reaction
During lifting of the grasped object, its removal from the supporting surface and its transfer, a slippage detection method developed within the RAMCIP project has been utilized [36]. The reaction strategy relies on gradually increasing the grasping force upon the detection of slippage. In the exploited method, a novel feature vector was introduced for slippage detection, combining time- and frequency-domain content of the measured force magnitude. The proposed scheme exploits the 3D force measurements available at the RAMCIP hand fingertips, from which the contact force magnitude is acquired. Although the combined time- and frequency-domain features were trained on only one surface, they demonstrated generalization ability for both translational and rotational slippage.

Robot Decision-Making and Task Planning
The orchestration and realization of the above described skills for the completion of the medication serving and monitoring mission is performed in a hierarchical manner. On top of this hierarchy, a partially observable Markov decision process (POMDP) system has been implemented to decide regarding the optimal next robot action, based on the current state of the robot-human-environment ecosystem, and in the lower level, a task planner interprets this decision and transforms it into real robotic actions by initializing the respective skills through the Robot Operating System (ROS).

POMDP Decision-Making
For the problem formulation, we relied on the generic POMDP design theory [37], tailored, however, to the explicit medication assistive scenario, where the problem domain comprises the environment, the human and the robot. Towards this direction, the discrete POMDP is designed as a tuple P = {S, A, Ω, R, T, b_0}, where S = {s_1, s_2, ..., s_n} denotes the States space that determines the condition of the environment, the human and the robot at each time t. A = {a_1, a_2, ..., a_n} denotes the Actions space that encloses all the actions that the robot is able to perform so as to interact with the human and the environment. Ω = {ω_1, ω_2, ..., ω_n} denotes the Observations space that comprises the robot perception input from the human and the environment, yet under the assumption that an observation ω only partially describes the state of the previous entities. R(A, S) comprises a Reward function that determines the restrictions imposed by penalizing or endorsing specific robotic actions (A) during the interaction with the human and the environment (S). Finally, T denotes the state transition function and b_0 the initial belief over the states.
After its definition, the POMDP model is solved using existing solvers [38]. The outcome is an action selection policy π that maximizes the sum of the expected future reward up to a specific time. This policy comprises a mapping from the current state belief probability to the action space A. Given the computed policy, the robot can select an optimal action by computing its belief state based on the following update rule:

b'(s') = η O(ω | s', a) Σ_{s∈S} T(s' | s, a) b(s) (10)

where b' is the updated belief, b is the belief at the previous time step, η is a normalizing constant and (a, ω) is the latest combination of robot action and observation. It is apparent that the computation of optimal policies is characterized by an exponential computational growth. A single step of value iteration to compute the next selected action is on the order of |C_t| = O(|A| |C_{t−1}|^{|Ω|}), where |C_{t−1}| corresponds to the number of components required to represent the policy at iteration t − 1, while the overall computational burden per step is estimated as O(|S|² |A| |C_{t−1}|^{|Ω|}).
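For a discrete model, the belief update rule is straightforward to implement. The sketch below uses plain dictionaries for T and O and is illustrative rather than the solver machinery used in this work:

```python
def belief_update(b, a, omega, T, O, states):
    """POMDP belief update b'(s') = eta * O(omega|s',a) * sum_s T(s'|s,a) b(s),
    with eta chosen so that b' is again a probability distribution.
    T maps (s_next, s, a) and O maps (omega, s_next, a) to probabilities."""
    b_new = {s2: O[(omega, s2, a)] * sum(T[(s2, s, a)] * b[s] for s in states)
             for s2 in states}
    eta = sum(b_new.values())
    return {s: v / eta for s, v in b_new.items()} if eta > 0 else dict(b)
```

For example, with two alert levels H and L, an "assist" action that likely moves the system from H to L, and a "done" observation that is more probable in L, a belief fully concentrated on H shifts almost entirely to L after one update.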
In robot assistance during medication adherence, the number of states and actions grows drastically by considering an abundance of environment, human and robot states, and many robotic actions that need to be determined. Thus, an abstraction of the state and action space given the awareness of the robot for the user has been applied, following our previous work [39]. Specifically, since the state space is partially observable, it can only be conceptually grouped by defining scalable blocks of states that correspond to distinct levels of robot alert (LoRA), S = {S H , S M , S L }. Herein, the state space is conceptually partitioned in three levels of robot alerts, namely High, Medium and Low. The states that may belong to the S H correspond to phases in the assistive task for which the human requires drastic assistance from the robot. The S M defines the group of states within the task, in which the robot has already been engaged in an assistive task and the levels of awareness about the human have been moderated. Finally, the S L outlines these states where the assistive scenario has been resolved, the required intervention is diminished and the robot is complacent about the status of the human.
The conceptual partitioning of the state space indirectly defines groups of robotic actions, the context of which is related to the type of robot intervention required for the scenario propagation, given the current robot awareness about the human. Thus, the action space is respectively partitioned as A = {A_Act, A_Com, A_Perc}, where A_Act corresponds to a highly interventional set of robotic actions, necessitated when the environment and the human are in S_H; A_Com reflects a more discreet set of robotic actions, applied when the status of the domain is assessed to be in S_M; and A_Perc consists of rather passive robotic actions, in essence applied when the LoRA about the human is diminished, i.e., S_L. The designed POMDP model aims to propagate the system to the S_L set of states by selecting the corresponding set of actions, a feature regulated herein by carefully assigning the values of the reward function. A positive reward is passed to the model when the selected action transits the system from a higher to a lower LoRA state, while a negative reward is passed when the selected action tends to bring the system to a higher LoRA; a uniform reward is applied when the system passes from medium to medium LoRA states. Through this methodology, the POMDP model is designed in a human-centric manner, where the partially observable set of states corresponds to the status of the human and the environment, while the set of actions is solely robot-related, thus resulting in a prompting system that draws decisions about robot interventions so as to reduce the awareness level required of the robot about the human and thus resolve the assistance scenario.
To better outline the nominal states required for the medication adherence scenario denouement with robotic assistance, we selected to present the policy π subset that can lead the robot from high to low LoRA as an explicit finite-state controller [40]. Such a policy graph is graphically illustrated in Figure 12, which explicitly represents "do action then continue with the given policy". In this graph, the nodes correspond to vectors in value function and are associated with actions A and the edges correspond to transitions based on observations Ω. In Figure 12, the initial state S{L − 1} is the one outlined with a green double circle in the finite-state controller. The state S{L − 1} became the current one after the system obtained the observation Ω{Monitor_ω2} stemming from action A{Monitor}. The latter indicates that, while the robot performed the Monitor action, the scenario about the medication was triggered based on the human's medication schedule and the observation Ω{Monitor_ω2} passed to the policy graph leading the system eventually to state S{L − 1}. The rest of the semantics for the actions and the expected observations are analytically appended in Table 1, which interprets the entire finite-state controller required for the denouement of the medication adherence scenario.

Figure 12. The POMDP policy graph of the robot decision maker as a finite-state controller. The graph comprises a subset of all the identified states that can arise during the medication assistance scenario; however, it exhibits the majority of the Actions A, which correspond to the manipulation and perception skills described in the previous sections. States grouped with the same colour belong to the same LoRA convention in the definition of the POMDP.

Task Planner
By carefully observing Figure 12, an insight regarding the flow of robotic actions required for the scenario is obtained. Specifically, the A_Act set involves all the robotic actions required to fulfil the robot's engagement in resolving a specific robotic task, e.g., navigation, manipulation, grasping, release, etc. The A_Com set of actions is less invasive than the A_Act set and comprises the bidirectional communication planning required for communicating with the user, supporting modalities such as dialogues, user interface displays, gestures and even notifications with augmented reality. A detailed description of the communication planner of the RAMCIP robot is thoroughly discussed in [41]. It is worth mentioning that the communication planner has been deployed on a handheld device that comprises the human-machine interaction interface of the robot, and its communication with the task planner is performed through ROS bridge messages, thus passing the respective observations to the decision maker. The A_Perc set of actions corresponds to the monitoring components of the robot, which trigger functionalities suitable for assessing the current status of the human and the environment, such as human detection and tracking, action interpretation, and object detection and recognition. This set of actions is passive, since the robot merely monitors the human and the environment, while the observations acquired from these actions are expected to alter the state of the domain.
To this end, the task planner that has been developed is responsible for interacting with the decision-making component and executing the inferred actions. In more detail, the task planner continuously "listens" to the inferred actions A_i from the POMDP policy graph and plans the respective task T_i as a sequence of robotic skills, Skill = {skill_1, skill_2, ..., skill_n}, required for the execution of the selected robotic action. Upon completion of the respective task, the task planner passes the observation ω_i to the belief propagation policy in the decision maker and the system propagates to the next state. Particularly, for the communication between the decision maker and the task planner, depending on the current state, the POMDP sends action messages to the task planner, and the latter returns the outcome of the executed task in the form of observations required by the POMDP policy graph, thus updating the belief state in Equation (10) in order to select the next best action. Each task T_i is organized into a dedicated set of perception and action skills required for its completion. Each skill_i is realized as a ROS action, the initialization and termination of which is triggered through the standard ROS architecture. Each one of the ROS functionalities implements a specific robot Skill_i (Figure 13). The outcome of each skill execution is coded and returned as observation ω_i to the policy graph of the POMDP. Typically, low-level observation outputs, such as object pose detection, the human's pose in the environment, robot localization, etc., are handled internally by each ROS component and translated into higher-level pass/fail observations to propagate the decision-making policy. Fail-safe mechanisms are foreseen within the proposed cognitive architecture, where each ROS node is accompanied by specific diagnostics and state reporting.
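The action-to-skill expansion and the compression of low-level results into a single pass/fail observation can be sketched as follows. The skill names and the observation encoding are illustrative assumptions, not the RAMCIP task planner:

```python
def run_task(action, skill_map, execute_skill):
    """Sketch of the task-planner loop: expand a decision-maker action into its
    skill sequence, run the skills in order, and fold the low-level outcomes
    into one high-level pass/fail observation for the POMDP policy graph."""
    for skill in skill_map[action]:
        if not execute_skill(skill):
            return action + "_fail"  # failure observation, triggers graceful degradation
    return action + "_pass"
```

In the real system each skill would be a ROS action whose result callback reports success or failure; here `execute_skill` stands in for that interface.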
However, the hardware components are not accompanied by respective diagnostics functionalities; thus, in the case of a hardware component failure, the system cannot transition from monitoring into any other state. This is mainly due to the fact that the RAMCIP robot constitutes a prototype; such issues will be resolved in future releases, once the hardware is stabilized.
Figure 13. The POMDP and task planner integration schema. At the top, the causal relationships between POMDP states, actions, rewards, and observations are illustrated in accordance with the policy graph interpretation of [42]. At the bottom, the task planner functionality is illustrated, where each action is assigned to a specific task, which is further decomposed into ROS functionalities, i.e., actions, services, nodes, each implementing a specific robot Skill_i. The outcome of each skill execution is coded and returned as observation ω_i to the policy graph of the POMDP.

Experimental Evaluation and Discussion
The framework and the subordinate methods described in this paper for assistance provision in the medication adherence scenario have been evaluated with the RAMCIP robot and 12 real participants diagnosed with MCI. The experiments took place in their own home environments, all located in Barcelona, Spain (Figure 14). Each participant had the opportunity to interact with the robot for at least seven days. During these days, the robot was tested in several scenarios concerning assisted living; however, the results reported herein focus on the medication scenario, which is the most important and complex one and includes all the developed robot skills. The overall findings of the experimentation are reported below, along with a discussion of the failure situations.
It is worth mentioning that, before the actual interaction, one day was spent on the robot's installation in each participant's house. The installation consisted of four mandatory steps: construction of the metric map, its augmentation with the hierarchical semantic properties, acquisition of detailed 3D models for all small objects of interest, and collection of user-specific information (e.g., medication schedule, facial features). In addition, during this phase, the user had the opportunity to become familiar with the robot through a brief introduction to its functionalities and communication capabilities.
During the experimentation phase, in each house, it was ensured that the robot's deployment would not be invasive to the user's environment. Thus, the furniture topology was kept intact, apart from minor modifications in the area to accommodate the robot's charging station and the parking position for monitoring. In order to evaluate the performance of the RAMCIP robot in the medication adherence scenario, two different methods were employed. The first concerned the documentation of each execution with an external camera, and the second concerned a logging system integrated into the decision-making module and the task planner. The latter recorded the executed policy graph, which tracked the robot's operation and registered the incidents reported by the ROS diagnostics. In this way, it was ensured that both high-level data concerning the robot's decisions and selected actions (stemming from the POMDP) and low-level information, including failures of the implemented skills, were monitored. The data extracted from this mechanism were compared against the ground truth (the external camera footage). In order to determine whether the scenario had been executed successfully, the robot, after each repetition, had to be able to assess with certainty whether the user had indeed taken the medication or not.

Figure 14. Selected cases of execution of the medication assistance scenario in 12 different real house environments. The instances show the RAMCIP robot during "pill box" grasping from a high shelf (first row), activity monitoring (second row), "pill box" grasping from a table (third row), and "pill box" release to its storage position (last row).
Overall, in accordance with the participant's time schedule, the scenario was initiated proactively. During the seven days of engagement, this scenario was executed at least once per day, even if the participant's medication schedule involved more than one session. In case the scenario was omitted for one day, due to either technical difficulties or the participant's engagement in other activities, it was compensated for on the next day by performing the scenario more than once. The efficiency was evaluated based on the rate of overall correct repetitions and is outlined in Table 2. The medication adherence scenario was executed 84 times in total, out of which 68 were performed correctly, resulting in an overall accuracy of 80.95%. A repetition was marked as correct only when the robot had returned to the monitoring state, disengaged from the scenario, and was aware of whether the participant had taken the medication or not. To extract this outcome, the sub-graph of states with the associated actions, as inferred from the POMDP, was compared with the external camera recordings.
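The success criterion above can be expressed as a small tally over the logged repetitions. This is an illustrative sketch only: the record field names are hypothetical, not those of the RAMCIP logging system, but the rule matches the text, namely that a repetition counts as correct when the trace ends in the monitoring state and the robot's intake verdict agrees with the external-camera ground truth.

```python
# Hypothetical tally of the evaluation metric; field names are illustrative.

def is_correct(rep):
    """A repetition is correct when the robot returned to monitoring and
    its medication-intake verdict matches the ground-truth annotation."""
    return (rep["final_state"] == "monitoring"
            and rep["verdict"] == rep["ground_truth"])

def success_rate(log):
    """Return (number of correct repetitions, accuracy in percent)."""
    correct = sum(1 for rep in log if is_correct(rep))
    return correct, 100.0 * correct / len(log)

# With 68 correct repetitions out of 84, this yields the reported 80.95%.
demo = ([{"final_state": "monitoring", "verdict": True, "ground_truth": True}] * 68
        + [{"final_state": "aborted", "verdict": False, "ground_truth": True}] * 16)
correct, rate = success_rate(demo)  # 68 correct, rate ≈ 80.95
```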
The remaining 16 erroneous executions concerned situations in which the robot was uncertain about whether the user had taken the medication, due to errors stemming from faulty operation of the perception, action, and communication skills, as well as other types of unexpected situations, all summarized in Table 3. The first source of erroneous executions was localization error: in four instances, the robot parked with the wrong orientation. Such errors directly influence the transformation chain, especially for the detection of objects like the "pill box", which requires cm-level accuracy. The overall localization error of the RAMCIP robot platform is ±5 cm laterally and ±2° rotationally, which is considerably low when compared with the requirement to position the RGB-D camera so as to bring the "pill box" centred in the camera's FoV. However, there were certain situations in which the aggregated localization error led to robot parking poses that severely affected the object detection algorithm and, thus, the grasping of the "pill box". The robot was unable to recover from these situations using its internal mitigation strategies, and the scenario was terminated. Human activity recognition is also a very challenging task in uncontrolled environments and, as expected, affected the outcome of the evaluation. There were three situations where the robot did not correctly recognize the outcome of the activity tracking. This phenomenon was observed in situations where skeletal occlusions were excessive or illumination reflections severely corrupted the data from the depth sensor. As a result, the joints of the participant were either not observable at all, or the detected skeleton topology was an outlier, hence the skeletal joints did not comply with any of the posed geometric criteria.
In those situations, the robot either mistakenly concluded that the user had taken the medication and terminated the assistive scenario, or prompted the user to take the medication even though s/he had already done so.
Arm manipulation and grasp planning also posed significant challenges during deployment in the diverse environments of the participants' houses. In more detail, four (4) planning errors occurred during the experiments, two concerning the arm trajectory generation and two concerning the grasp planning. Considering the arm trajectory generation, the problematic situations were spotted in very tight environments where the workspace was severely limited by surrounding objects/furniture. As a result, the planning procedure timed out, since the identification of an inverse kinematics solution took an excessive amount of time, and the planner abandoned the procedure. Regarding the grasp planning, there were two failure incidents which actually stemmed from the object pose detection algorithm. In particular, although the object class had been correctly detected, the pose estimation module inferred an erroneous pose. As a result, the inferred grasping poses were very challenging, and the resulting arm planning solutions brought the end-effector to poses indicating collisions with the environment. In such situations, the grasp was aborted and the scenario was terminated. In another situation, a weak grasp occurred, again due to erroneous object pose estimation, and the "pill box" slipped from the robot's hand. In that case, the robot detected the slippage using the Optoforce sensors mounted on its fingertips; however, a recovery strategy for grasping such flat objects from the floor was not foreseen in the scenario and, thus, the latter was terminated.
Another source of failures during the evaluation of the medication adherence scenario involved communication issues. Specifically, the communication between the participant and the robot failed three times, leading to the conclusion of the medication adherence scenario while leaving the robot with the belief that the participant had taken the medication. Although the communication system was equipped with noise cancellation mechanisms, due to excessive surrounding noise the robot erroneously interpreted the user's response and closed the interaction scenario prematurely.
Apart from the above-mentioned situations, there were two more occasions where the robot failed to complete the assistive scenario. Both were related to erroneous observations passed to the POMDP, mainly due to unexpected human responses during the interaction scenario; e.g., the participant engaged with the medication intake scenario, then initiated a different task due to their pathological condition and did not return to the medication session. Such conditions completely confused the activity recognition module, and the operation flow propagated erroneously. Specifically, the observations passed from the task planner to the POMDP drove the system to a completely irrelevant state that was not expected in the current interaction session. However, such situations were very limited and stemmed from the fact that complete modeling of the human–environment–robot ecosystem is not possible in such challenging operation scenarios in uncontrolled environments, thus leaving some ambiguous corner cases that the decision-making mechanism is not able to resolve.
Finally, it should be stated that the entire experimental procedure was approved by the ethics committee of the RAMCIP project. In the context of these experimental trials, the requested information regarding the consent forms was collected by the ethics committee representatives and the doctors involved in the project, who, among other things, informed the participants that the collected data would be treated in an anonymized manner. Moreover, the consent forms were completed and signed by all participants.

Conclusions
Summarizing, the paper at hand contributes significantly to the current state of the art in personal assistive robots. It focuses on the verified use case of assistance provision in medication adherence activities and, contrary to existing works (e.g., other personal robots) that address the issue of medication adherence only partially, it presents a complete pipeline of software operating on hardware that addresses the problem fully. The presented framework enables a robot to provide assisted living services focused on medication serving and monitoring. Contrary to existing works, the proposed one delivers a holistic solution, where the robot is not involved solely as a fetching machine but is endowed with capabilities that enable it to answer, with certain accuracy, the question: "Has the user received the medication?". To achieve this, custom solutions tailored to the specific problem have been adopted (where possible) or developed, and integrated under a decision-making mechanism realized together with a task planner that coordinates the developed set of skills. The user requirements have been closely studied, the hardware architecture has been documented, and the developed software skills have been presented. The integrated system has been evaluated with 12 real participants with MCI in their own home environments. The overall system achieved more than 80% accuracy, indicating that the robotic system can be sufficiently used for assistance provision in real medication adherence scenarios.
Finally, it is important to state that the ultimate goal of this paper was to provide solutions to a series of perception, cognition, and action (i.e., navigation, manipulation, grasping) problems that will enable, in the near future, the deployment of assistive robots in real home environments. However, important research should be conducted towards the certification of such robots, in order to ensure that they comply with any standards required for safe human–robot coexistence and interaction.

Conflicts of Interest:
The authors declare no conflict of interest.