Design for an Art Therapy Robot: An Explorative Review of the Theoretical Foundations for Engaging in Emotional and Creative Painting with a Robot

Social robots are being designed to help support people’s well-being in domestic and public environments. To address increasing incidences of psychological and emotional difficulties such as loneliness, and a shortage of human healthcare workers, we believe that robots will also play a useful role in engaging with people in therapy, on an emotional and creative level, e.g., in music, drama, playing, and art therapy. Here, we focus on the latter case, on an autonomous robot capable of painting with a person. A challenge is that the theoretical foundations are highly complex; we are only just beginning ourselves to understand emotions and creativity in human science, which have been described as highly important challenges in artificial intelligence. To gain insight, we review some of the literature on robots used for therapy and art, potential strategies for interacting, and mechanisms for expressing emotions and creativity. In doing so, we also suggest the usefulness of the responsive art approach as a starting point for art therapy robots, describe a perceived gap between our understanding of emotions in human science and what is currently typically being addressed in engineering studies, and identify some potential ethical pitfalls and solutions for avoiding them. Based on our arguments, we propose a design for an art therapy robot, also discussing a simplified prototype implementation, toward informing future work in the area.


Introduction
The current article presents a basic design for an art therapy robot. Specifically, we derive some guidelines from the literature in exploring how to visually express emotions in a creative context of painting with a robot, as a step toward realizing a therapy robot which could positively support people's well-being by engaging with people on an emotional and creative level; additionally, we provide an implementation example from some of our work in recent years, illustrated in Figure 1.

Terms
"Art therapy" is a therapeutic process involving art-making: a patient expresses emotions through creating art, which also serves as a bridge between the patient and a therapist [1]. "Therapy" here encompasses notions of care, healing, and providing attention [2] and is intended to mitigate problems and facilitate positive health and well-being, where "well-being" here is used synonymously with "happiness", "quality of life", and "life-satisfaction" [3]. "Art", which has been described as washing "away from the soul the dust of everyday life" [4], can include painting, drawing, photography, collage, and sculpture, and be abstract or symbolic, where we use the term "symbol" here in its general sense as some meaningful stimulus pattern, thereby comprising icons and indices. "Emotion" in humans refers to a complex psycho-physical phenomenon at the heart of art therapy [5,6], involving subjective feelings, somatic symptoms such as elevated heart rate, affect displays such as smiling, and cognitive appraisals [7]; we suggest that some interesting properties of emotions to consider for art therapy include co-occurrence, referents, timing, and polysemy (that emotions often co-occur, that emotions usually refer to some referent which can be someone or something, that emotions play out over time, and that emotion displays can express different emotions), as described in Appendix A. "Creativity" is another core aspect of art therapy, which is "a way of doing things" [8], characterized by some relative novelty, as described in Appendix B. Additionally, for "robot" here, we primarily consider (semi-)autonomous machines, comprising computers, sensors and actuators, with some degree of human-like intelligence and capabilities that can be used for therapy.
Figure 1. The basic concept of the current work is that we imagine robots can play a useful role in engaging with people emotionally and creatively in art therapy. The photo shows a robot painting with a person, which bases its artwork on the person's emotions inferred using a Brain-Machine Interface. Used here with permission.

Motivation
There is a large and rising need to help people mentally: for example, loneliness, which has been tied also to dementia [9], has been described as a "rising epidemic" that can be a higher mortality risk than moderate daily smoking or obesity when prolonged, and is estimated to cost hundreds of billions of dollars yearly in the US alone [10]. One useful approach for mentally helping patients is art therapy, which has demonstrated some good effects with a wide range of groups, as follows. Some positive subjective effects of art therapy include improved self-awareness, self-image, relaxation, and social interactions; objectively measured outcomes include improved clinical outcomes, better vital signs, reduced cortisol, better sleep, shorter stays, faster discharges, and higher pain tolerance [11]. Furthermore, effects can persist; for example, "a subtle, more pronounced and durable positive effect across time" of art therapy was noted compared to an alternative of recreational activity group work in dementia patients [12]. It has even been suggested that art therapy potentially has some "preventive, diagnostic, therapeutic and rehabilitative benefits that other forms of therapy cannot provide" as art offers another rich way for people to explore themselves other than through the spoken word or playing [1]. (We note that this by no means implies that art therapy is better than other forms of therapy, which can also have unique and powerful benefits; for instance, in robotics, some studies using Paro and KiliRo have reported success in reducing stress and psychologically helping elderly people and children with autism [13][14][15].) Art therapy has been used with both adults and children for various conditions, including dementia, autism, depression, trauma (post-traumatic stress disorder, sexual abuse, and traumatic brain injury), AD/HD, schizophrenia, bipolar disorder, borderline personality disorder, AIDS, asthma, burns, cancer, chemical dependency, hemodialysis, sickle cell disease, and tuberculosis [6,16,17,18,19].
To provide art therapy, we believe that a painting robot could potentially be useful: for example, helping people who cannot currently receive care due to a shortage of human resources [20] as robots can be manufactured as needed and programmed knowledge can be shared; saving some time for human therapists, such as for traveling to, attending, and documenting sessions; leveraging abilities not normally available to humans, such as inferring emotions from brainwaves; being available at possibly any time as robots do not need to sleep; and facilitating self-exploration without requiring people to express vulnerable thoughts to another human. In practice, some benefits of therapy robots have already been described in other contexts, such as decreasing loneliness in persons with dementia with pet-like robots, Paro and AIBO [21,22]. Furthermore, in comparison with other technologies such as virtual agents, robots have been reported to elicit stronger perceived emotions, presence, motivation, and engagement [23][24][25][26], and could perform more behaviors, such as seeking out a person to interact, working in 3D, and conducting other useful tasks in a home such as cleaning and fetching items. Additionally, painting is a typical form of art which engrosses multiple senses (not only sight and touch but, to a lesser degree, also smell and sound); allows for high creativity as objects do not have to be seen to be painted; does not require knowledge of computers; and leaves a tangible, persisting artifact which can be a reminder of achievement. Furthermore, affective computing and artificial creativity relate to the strategic area of artificial intelligence, where multi-billion dollar plans are being proposed and it has been claimed that "whoever becomes the leader in this sphere will become the ruler of the world" [27].
Thus, the current work seeks to present a design for a therapy robot, which could positively support people's well-being by engaging with people on an emotional and creative level. A challenge is that the theoretical foundations are highly complex: robot art therapy requires gathering knowledge from various fields in human science and engineering; and, as noted, we are only just beginning ourselves to understand emotions and creativity in human science, which have been described as important topics, final frontiers and ultimate challenges in artificial intelligence [28][29][30].
To gain insight, our approach is to explore the theoretical foundations by reviewing some of the extensive literature, identifying issues we feel to be important, proposing prescriptions, and clarifying possibilities, in Section 2. (We note that, in Section 2, the current article does not seek to provide a definitive review of a single, existing, mature field, but rather seeks to explore what valuable information can be drawn together from some online sources in multiple fields to design a new interaction, involving robot art therapy. In terms of the PICO paradigm used in medical studies, we include references to work conducted with various participant groups, as we believe that various groups could benefit from robot art therapy, and that interaction by a robot with various groups will be important to avoid stigmatization. Interventions of interest relate both to human art therapy and to general robot interaction strategies, whereas outcomes of interest relate to facilitation of well-being and positive emotions. Likewise, we do not place restrictions on year of publication; however, only publications in the English language are considered.) Based on our arguments, we propose a design for an art therapy robot, also discussing a simplified prototype implementation in Section 3 (some code has been made available on GitHub (https://github.com/martincooney), and some videos on YouTube (https://youtu.be/9d96MrSCjx8, and other footage of our robot painting at https://www.youtube.com/channel/UCVDPmL7NkC5Mn_A1wU0RPKg)). Section 4 provides some extra discussion, toward informing future work in the area. Additionally, we note that a portion of the current article extends and directly builds upon some of our recent work on identifying some general potential dangers in affective computing [31], and implementing emotional art-making behaviors [32].

Theoretical Foundations
To propose a design for an art therapy robot, focusing on emotions and creativity, we believe the first important step is to consider the theoretical foundations. Here, we review some related work, identifying issues we feel to be important, and proposing prescriptions. Related work concerns robots used for therapy and art, strategies for interacting in art therapy, and mechanisms for expressing emotions and creativity.

Therapy Robots
Numerous kinds of therapy exist (e.g., physical, play, dance, music, drama, writing, gardening, and art), some of which are already being conducted with various robots.
In particular, physical therapy robots have a long history. Rudimentary prosthetics and sensors might have been used since the times of Ancient Egypt [33]; more recently, exoskeletons were designed as tools to help people walk in the 1800s, and to carry heavy loads since the 1960s [34]. Such robots have also begun to be used more recently in other roles, such as "socially assistive" robots working as exercise coaches, to get elderly people to move their arms [35]. Another robot was built to act as an "opponent"; elderly people chase the robot, whose behavior adapts based on perceived skill levels [36]. Furthermore, a robot was constructed to be an exercise partner; elderly people walk with the robot between fixed stations by clicking a button to control the robot to move [37].
In addition, many playful robots have been built which are intended to function like pets, toys, tools, or play partners. Playful pet robots, such as the seal-like robot Paro and dog-like robot AIBO, which were introduced in the 1990s, and more recently the teddy bear-like robot CUDDler, have been used in place of animals in "Robot Assisted Therapy" (RAT) [21,22,38]. Robots have also been built as toys for children to program, such as PETS, which can tell stories [39]. Playful parrot robots, RoboParrot and KiliRo, have been used with autistic children, as tools for screening and to achieve enjoyable learning [13,40]. Furthermore, some robots have been built as play partners, such as Maggie, which can play various games including Tic-tac-toe, Hangman, Twenty Questions, Peekaboo, and "Animal Quiz" [41], or Probo, Zeno, Rovio, and Nao, which have also been used to play games with autistic children, involving identifying emotions [42,43] and imitation [44,45].
In addition, some work has explored designs for robots which could be used for dance or music therapy. For example, Kosuge et al. created a robot which can dance with people as a dance partner [46], and Kozima et al. built Keepon to investigate the usefulness of rhythm synchronicity, which could be used to enable therapeutic dance interactions [47]. In addition, an interaction with a music therapist robot was proposed, in which people guess from hearing a few notes which song a robot is playing [48]; robots which can interact musically with humans have also been built [49], which could also be used for therapy.
Additionally, some robots can engage in more than one kind of therapy. For example, Martín et al. conducted tests with dementia patients using a Nao Robot designed for four types of sessions: physiotherapy, play with storytelling, music, and language (asking about numbers and days of week to stimulate thinking and recall) [50]. In addition, various other general healthcare robots have been built, which can carry, monitor, and fetch items for humans [51][52][53]. Moreover, recently a need to develop more autonomous therapy robots has been described by researchers in the DREAM project, using the term "Robot enhanced therapy" (RET); their goal is to use robots as a tool to help children with autism to improve social skills comprising turn-taking, joint attention, and imitation, in conjunction with Applied Behavioral Analysis (ABA), a learning theory based on behavioral repetition and cognitive association [54]. Art therapy could also potentially be structured to improve such social skills. Another interesting proposal has been made that robots could conduct "alloparenting", acting as surrogate guardians for human children, which we believe also relates to therapy, as parents are also required to support the emotional and mental well-being of children; this work highlights the importance of joint attention and empathy, as well as the promotion of autonomy through principles of beneficence, non-maleficence, equal provision of resources, and respect for a person's dignity [55]. In summary, therapy robotics appears to be an active area of research in which various work is being done, with both children and elderly people, and also persons with physical disabilities, dementia, and autism [56,57].
Some gaps seem to exist for therapy regarding drama, writing, gardening, and art therapy. For example, a robot could read lines from a play with a human, write a diary together with a human (e.g., filling in things which a patient has forgotten to mention, or asking for further details), garden (e.g., planting or harvesting with a human), or make art with a human. We focus here on the latter possibility; to our knowledge, previous work has not proposed a design for an art therapy robot.

Robot Art
Various machines have been built to create visual art, under names such as computer art, generative art, and electronic art: typically computers are used to generate images, which can be translated into physical art by printers and robot embodiments. Some first artistic images on computers were reportedly created for fun by IBM employees in the 1950s [58]. In the 1970s, an artist named Cohen started producing paintings generated by a computer program, AARON, which were executed by plotters and printers [59]. In the 1990s, a drawing robot called ISAC was also designed to mimic an artist's movements [60]. In the last decades, various detailed prescriptions were made, such as how a robot can grasp a paint brush in its fingers, and paint, detecting the brush tip [61], and numerous robots have been designed as tools to draw some existing scene in an aesthetically appealing way. Recently, every year various such robots take part in the international robot art competition (robotart.org), illustrated in Figure 2, and there is also a "DeepArt" competition for style transfer artwork (at the Annual Conference on Neural Information Processing Systems (https://deepart.io/nips/submissions/)). Such systems usually seek to create art that is aesthetically pleasing by taking a model or an image, possibly altered via style transfer, as input, and using reinforcement learning to iteratively refine artwork, or mimicking an artist's movements directly. However, for robot art therapy, we expect that there might not be a photo to compare to, work might be only partially completed, and robots should move not only based on an artist's motions. Some interactive art systems have also been created which are not controlled by an artist's movements. For example, one humanoid robot asked people if they would like to be drawn, and then drew detected faces step-by-step, from rough contours to more detailed lines [62]; this interaction resembles our vision for how a robot can interact with a human not as a tool but as an interaction partner, while co-creating art. In addition, recently, in the field of human-computer interaction, some interactive systems have been developed. Lee et al. described an interface for assisted drawing using a tablet, where the system used shadows to provide suggestions on how to complete sketches [63]; likewise, Ha et al. made available online a tool called "sketch-rnn" which can also generate such suggestions, using a Sequence-to-Sequence Variational Autoencoder [64]. We note that, in practice, suggestions are often drawn from where a person has just finished drawing, which could interfere in a physical system: a robot should not push aside a person's hand to start painting in the same spot.
In addition, we suggest that, to go further, a robot could recognize which "rules" a person is following when painting (which might relate to speed or placement within a composition), predict intentions and next moves, track its performance, and plan how to behave next. Regarding the first problem of aligning a robot's work with a human's, previous work has demonstrated classification of different styles of painting such as baroque or impressionist, which could possibly be one parameter to consider [65]. Furthermore, Crick et al. described a camera-equipped robot which could learn the rules of a game (i.e., the intentions of other robots) via a dynamics-based causal model, and then participate using its motions to resolve ambiguities and further improve its understanding [66]. For prediction, Mutlu et al. reported benefits of having a robot predict what a human will do next, using a person's gaze, in a scenario about a robot handing objects to a human [67]. Montebelli et al. also reported possibilities for recognizing intentions through motion patterns, although highlighting individualistic differences [68]. Furthermore, work has been done on looking toward a human for feedback (e.g., Breazeal's Leonardo [69]), and use of smiling for reinforcement [70], although in these works the referent has been clear, and smiling by itself can result from various reasons. Some inspiration can be drawn from such work, on how an art robot could extend a human's painting, and engage in harmonious co-creation, but the focus has largely not been on therapy, emotions, and well-being.

Interaction Strategies for Art Therapy
As described in the previous section, various robots have been proposed for physical exercise and play, and art robots have been built to create artwork which is optimally pleasing to the eye, but the literature does not clearly indicate how a robot can interact to support a human seeking to engage in art therapy. Even turning for inspiration to human science to examine how art therapy is typically conducted, we encounter the challenge that there is no one "accepted" way to conduct art therapy; rather, numerous approaches exist [71], some of which are briefly described in Appendix C. Below, we first clarify a proposal for a basic scenario, based on our own ideas. Then, we describe some ideas from human science, from some explicit guidelines, to some hints implied by underlying mechanisms in art therapy, before considering some general guidelines.

Humanistic Art Therapy: Robot as Partner
In the current article, we argue that some approaches might not be appropriate for robots, due to the risk of harm if a robot presents a mistaken diagnosis. First, mistakes could be common in first robot prototypes; art therapy requires a high degree of understanding of humans, which would be currently difficult to expect from a robot given the emerging nature of this field. Second, there seems to be a potentially dangerous power imbalance in the case of a "normal" therapist looking over the shoulder of an "abnormal" patient, judging, in that decisions could be made based on some diagnosis which could adversely affect a person. Some similar arguments have been made in human science: for example, it has been proposed that some approaches could be oppressive if there is some disagreement in values between therapist and patient [72,73]. This line of thought was also echoed in the area of human-robot interaction, for therapy in general, by Ziemke and colleagues, who questioned the assumption that the therapist is an all-knowing expert who can deduce the truth [54], and by Tapus et al. who also advocated that therapy robots should be hands-off [74]. Furthermore, Kahn et al. stated that the important target is how to design a scenario in which people will interact with robots as partners in a joint creative enterprise [75]. Thus, we envision, from a humanistic perspective, that the first fundamental scenario to target for art therapy robots should be one in which the person is at the center of the interaction, and the robot does not diagnose and judge, but rather tries to play a supporting role.
Following this basic conceptualization that a robot can act as a partner in art rather than a judge, we propose a basic scenario, following the Five W strategy (why, who, what, where, when) to address some salient questions [76]. Since "why" has already been discussed, we turn first to the question of who should paint. Art therapy can be conducted with various numbers of people and robots. A benefit of interacting with a single person, rather than a group, is that we expect the person will be better able to perceive full attention from the robot, due to, e.g., the "Socratic bottleneck" where in a group only one person can speak at once [77]. (If we assume the art therapy robot uses a familiar interface for communicating, humanoid or possibly animal-like, then, similar to humans or animals, it can show attention through cues such as gaze, body pose, speech, location, and motions (e.g., art-making), but, because humans and animals tend to only have one head, one body, and a few arms, and take turns speaking, such a robot cannot directly look at and listen to multiple people at once. It would have to look from one to another, listening to one person's response then another's, similar to a human, providing an impression of giving less than its full attention to each person. In other words, as an example, any time one person is speaking is a time when another person will appear to not be receiving full attention. This effect is known in pedagogy, and is a reason for sometimes breaking students apart into small groups, so that they can more actively interact [77].)
Group therapy can also result in a sharing of strongly emotional experiences which might otherwise not be possible, and improve relationships [1]; however, this is a more complex case, introducing new dynamics, such as the relationships between people and how choices will be regulated based on who is present. It could also be possible to allocate multiple robots per person; for example, one robot could steady the arm of a person with Parkinson's disease, while another robot could paint beside them, providing company. However, we believe this case is also more complex. Therefore, we suggest that the more basic dyadic case is useful for initial explorations.
For such a dyadic case, we propose that three basic cases can be described, in which the robot's involvement in painting is 0% (only the person paints), somewhere in between 0% and 100%, such as 50% (both the person and robot paint), or 100% (only the robot paints). To determine which a person prefers, the robot can ask at the beginning of the interaction. In the first case, the person might wish to achieve independently without relying on others and feel ownership over the art. In the second case, the person might value the enjoyment from social interaction or expect a nicer result if co-creating. In the third case, the person might be incapable of physically participating, or merely prefer to passively observe. In the first and second cases, the robot can seek to infer a person's emotions from the art they make, while, in the third case, the robot can seek to recognize how the person is feeling directly. In all cases, the robot can attempt to make some basic conversation: in the first and second cases, the robot can also ask the person about what they are painting, whereas, in the third case, the robot will simply comment on what it is painting. From the perspective of simplicity, the first case is arguably the simplest; however, the second and third cases we feel are most in line with our vision of the robot as a partner, and fundamental from the perspective of exploring how a robot can interact in an emotional and creative manner.
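As an illustration only, the three dyadic cases above can be written down as a small decision rule. The following Python sketch is not taken from the prototype described in this article; the names `PaintingMode`, `SessionPlan`, `mode_from_share`, and `plan_session`, as well as the string labels, are hypothetical and chosen purely to mirror the text (robot share of 0%, in between, or 100%; emotions inferred from the person's art or recognized directly; conversation about the person's or the robot's painting).

```python
from dataclasses import dataclass
from enum import Enum


class PaintingMode(Enum):
    """The three dyadic cases, keyed by the robot's share of the painting."""
    PERSON_ONLY = "person_only"   # robot involvement 0%
    CO_CREATION = "co_creation"   # strictly between 0% and 100%
    ROBOT_ONLY = "robot_only"     # robot involvement 100%


@dataclass(frozen=True)
class SessionPlan:
    mode: PaintingMode
    emotion_source: str      # hypothetical label: where emotional cues come from
    conversation_topic: str  # hypothetical label: what the robot talks about


def mode_from_share(robot_share: float) -> PaintingMode:
    """Map the robot's share of the painting (0.0 to 1.0) to one of the three cases."""
    if not 0.0 <= robot_share <= 1.0:
        raise ValueError("robot_share must lie between 0.0 and 1.0")
    if robot_share == 0.0:
        return PaintingMode.PERSON_ONLY
    if robot_share == 1.0:
        return PaintingMode.ROBOT_ONLY
    return PaintingMode.CO_CREATION


def plan_session(robot_share: float) -> SessionPlan:
    """Choose emotion source and conversation topic for the preferred case."""
    mode = mode_from_share(robot_share)
    if mode is PaintingMode.ROBOT_ONLY:
        # Only the robot paints: recognize feelings directly, comment on own work.
        return SessionPlan(mode, "direct_recognition", "robot_painting")
    # The person paints (alone or with the robot): infer emotions from their art
    # and ask about what they are painting.
    return SessionPlan(mode, "person_artwork", "person_painting")


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        print(share, plan_session(share))
```

In such a sketch, the value of `robot_share` would come from asking the person at the beginning of the interaction, as suggested above; how that preference is elicited is left open here.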
Regarding what a person should paint, we suggest giving the person themselves the dignity to freely select what to paint. Where a more formal procedure is desired, objective tests could be adapted, such as the Diagnostic Drawing Series (DDS) which assesses colors, lines, and composition (e.g., placement and integration), in asking a person to make a picture, make a picture of a tree, and make a picture of how they feel using lines, shapes, and colors [78]. However, we note that this test would not be possible to directly apply for the painting scenario considered here, as it is designed specifically for pastels; also, requirements on time, number of artworks to produce, and subject matter, might not be desired in various cases: for example, if the artwork generated is also intended to be aesthetic or displayed somewhere, or if a person requires more time, or does not have the physical strength to produce many drawings. We suggest that a robot can in general seek to track such objective information over time to roughly gauge a person's state and the effectiveness of sessions, even with freely chosen art. In addition, for individuals who cannot decide what to paint, a robot could suggest some topic; for example, self-portraiture is a tool used by art therapists as a way to promote self-reflection and self-acceptance [79].
Another question which should be addressed is where the robot should paint. If a robot draws on the same substrate or canvas, the interaction could be felt to be more social and intimate; that is, there could be a stronger effect of social facilitation [80]. Furthermore, the robot could help the human to paint better, especially for individuals with restricted mobility, e.g., adding details which might require high dexterity, knowledge, or technical ability. Conversely, allowing a person to complete their own painting could result in an increased perception of accomplishment, and ownership, resulting from high perceived involvement, personalization, expression of territoriality, and control [81]. (Feeling ownership can also enhance memory through the so-called self-referential effect [82], which could be useful for dementia patients.) Such a case might also be simpler for initial investigations, as a robot does not need to track where a person is painting and avoid colliding with them. We argue that both cases can have benefits, so an art therapy robot should be capable of engaging in a range of art-making behaviors, from facilitation to completing a painting by itself. We also propose that the robot first ask the human how they would like to paint, and, if the same substrate is used, that the robot should allow the human as much as possible to play an important role in making the artwork.
In addition, a decision should be made on when art therapy should take place. Art therapy is tailored for the individual both in terms of number of sessions and structure of sessions, as follows [6]. Single sessions are possible, although therapy tends to last over several weeks to a year. Sessions can be approximately an hour each. A common structure for each session, in line with models such as the Creative Axis Model, is to have some warm-up activity, a main activity, and reflection at the end. Similarly, initial sessions can focus on goal-setting, identifying problems, and accustomization; middle sessions can involve working on central themes, then more challenging and complex themes, distancing from problems, investigating solutions and answering questions; and the final sessions can focus on review, next steps, and closure. In addition, art therapists can make responsive art before, during, or after sessions. We believe the most practical starting point for research in art therapy is focusing on a single session first. If the robot does not paint during the same session, timing and memory become additional factors to consider, which might be especially important for persons with dementia; therefore, we also suggest that initial work focuses on the basic case in which the robot makes art at the same time as the person, in close social proximity.
Within this basic scenario, summarized in Table 1, we next consider some guidelines about how to conduct the desired form of humanistic art therapy. Phillips recommended: (1) investigating the meaning of the art with the person; (2) accepting (and at times encouraging) the communication of strong emotions, including negative ones; (3) praising creativity and skill, even for negative depictions; and (4) suggesting alternatives for disturbing negative content to express feelings [83]. Additionally, the importance of keeping track of the timing and progression in the artworks over time was mentioned, as well as the importance of understanding popular culture, clinical, and social context for seeking to find meaning in a person's art, although the latter could be difficult for a robot. These guidelines can be followed both via verbal interaction and via the therapist's own artwork, in a process which has been described as responsive art, or visual feedback. Phillips noted that often visual feedback was more successful than a verbal response in investigating meaning, in line with our idea that a robot can paint with a person as a companion. Such prescriptions are in line with our idea of the robot acting as a companion, rather than a judge, but they do not clearly state how a robot can decide what to draw. For example, if a person is expressing a negative emotion such as sadness, what should a robot do? The literature presents some evidence supporting two potentially opposing premises: that a robot could try to match a human's negative emotions, and that it could seek to distract with a positive emotional display. In human-robot interaction, Goetz et al. compared a robot which is always positive to a robot which matched its mood to the context, finding the latter was more liked [84]; Tapus and colleagues also found that people preferred a robot to match its behaviors to a user's personality [85]. A benefit of such an approach could be that the person does not feel that they always have to be positive and suppress their emotions, which has been reported to be an ineffective emotion regulation strategy with some negative consequences [86]. Furthermore, a tendency to like people with similar attitudes has been described [87], implying that such convergence of emotion displays could engender liking. (We note that this concept of displaying similar emotions does not imply any suggestion that a robot's appearance should or should not resemble a human's, which is debated: According to Mori's Uncanny Valley hypothesis, a near-human appearance with slight imperfections could potentially trigger negative impressions; conversely, it has been argued that negative impressions can be avoided using an attractive design as not all imperfections elicit the same responses [88], and also that humans can quickly become accustomed and extend their expectations for appearance [89].) In addition, humor could also be perceived in causing a robot to behave in a sad manner, like seeing a pet's concern over its owner; negative emotions can also be valuable [90], and Phillips's second guideline above regarding accepting others' emotions could also be interpreted as suggesting matching.
A caveat is that the studies above were not conducted in the context of art therapy, and it was not determined if people felt better as a result. In art therapy, Drake and Winner found that distraction had a better effect on a person's mood than venting (i.e., expressing positive, rather than negative emotions, which might help to avoid negative rumination) [91]. We believe various positive effects could ensue from a robot's positive emotional behavior: happiness could arise through emotional contagion; a person could feel safe if a robot never displays negativity; the robot could seem to have its own goals and not be merely reactive; and Phillips's third and fourth guidelines could be interpreted as suggesting distraction, by praising and suggesting alternatives.
However, Drake and Winner's study was not conducted with robots. Moreover, if a robot always acts in the same positive manner, ignoring the person's emotions, it could appear boring [92] or insincere [93]. Positive behaviors such as laughter can be irritating when perceived to stem from schadenfreude [94]. In addition, if a robot adopts a purely positive stance about everything, and does not acknowledge any negative aspects, a person might feel the need to express the negative side themselves, in line with the concepts of the devil's advocate and reverse psychology, which might result in undesired negative rumination. Furthermore, the robot could be interpreted as implying that, if a person is not positive like it, they must have a problem, which could produce negative feelings.
We argue that the results of these studies do not necessarily contradict one another: for example, a robot in an interaction can do both, sometimes matching, thereby showing empathy and gaining trust, and sometimes distracting, thereby helping the user to feel better. Alternatively, the robot could do both simultaneously, expressing a mixture of negative and positive emotions: for example, drawing a sad scene with a positive rainbow. Furthermore, there is not necessarily a conflict or disruption of contingency in responding positively to a negative emotional display. A study conducted in the context of analyzing conversations on Twitter suggested that the form in which positive emotions are shown is important, finding that sympathy, greetings, and recommendations in particular exerted a strong positive influence on others' emotions [95]. Conversely, worrying, teasing, and complaining often caused others to feel negative emotions. In terms of basic Ekman emotion categories, worrying and complaining can be associated with fear and anger, respectively (high arousal, low valence), whereas sympathy is displayed through sadness (low arousal, low valence) toward a person's unfortunate situation, or affection (high valence) toward a person. Greetings and recommendations are typically happy (mid to high valence). Teasing is typically insincere, and can be insincere anger or insincere joy. Additionally, the study found that people usually expressed the same emotion or a positive emotion, in line with our idea that both matching and distraction are useful behaviors to employ. We note that there are also theories in human science, such as communication accommodation theory (CAT), which could offer additional insight into when to match or distract (converge or diverge); however, this theory, in line with social identity theory (SIT), focuses on explaining human motivations for social approval, communication efficiency, and social identity, rather than on how a robot could engender positive affective changes [96].
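The interplay between matching and distracting described above can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions, not an implementation from the literature: the `Emotion` class, the `choose_response` function, the probability `p_match`, and all numeric values are hypothetical. It encodes two ideas from the discussion: matching a negative emotion is shown as low-arousal sadness (sympathy) rather than high-arousal fear or anger, and distraction is shown as a mid-to-high valence display.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Emotion:
    valence: float  # assumed scale: -1 (negative) to 1 (positive)
    arousal: float  # assumed scale: 0 (calm) to 1 (excited)


def choose_response(perceived: Emotion, p_match: float = 0.5) -> Tuple[str, Emotion]:
    """Pick a strategy and an emotion to display (hypothetical policy)."""
    if perceived.valence >= 0:
        # Positive displays are simply mirrored.
        return "match", perceived
    if random.random() < p_match:
        # Match as sympathy: keep the negative valence but cap arousal,
        # avoiding high-arousal displays such as worry or complaint.
        return "match", Emotion(perceived.valence, min(perceived.arousal, 0.3))
    # Distract with a mildly positive display (values are assumptions).
    return "distract", Emotion(valence=0.5, arousal=0.4)
```

How often to match, here a fixed `p_match`, is left open by the literature reviewed above; in practice it would likely depend on the person, the relationship, and the course of the session.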
We believe that a key concept underlying matching is empathy, which describes the capability to perceive, understand, and share another person's emotions [74,97].From this perspective, we believe that matching is not merely mimicry in showing similar emotions, but rather there is also a cognitive aspect involving perspective taking, which is required to deal with complex emotions involving mixed emotions and referents.More specifically, when matching, emotions to display can be influenced by the type and degree of emotions perceived; the robot's personality; and the perceived importance for the robot to act based on the closeness of the relationship between robot and person, as well as the appraised context (e.g., for humans a bystander effect is observed in which helping behaviors are inhibited when many people are present) [98].Based on such theory, a rich computational model of empathy has been built, which we believe will enable matching behavior in art therapy robots [99].
Some further insight can also be obtained by considering not just explicit guidelines for what a human therapist should do, but also the underlying processes embedded in art-making which allow people to feel good effects.As noted in Section 1, in art therapy, some positive effects-such as improved self-awareness, self-image, relaxation, and social interactions-are facilitated via processes of self-exploration, self-fulfillment, catharsis, and perceiving belonging [11].Self-awareness is enhanced by projecting and exploring emotions and experiences, which might be easier to express with symbols than words (described as "refraction", "dramatic distancing", or being "once-removed"), and by allowing inference of a person's current state and progression over time; enhanced self-awareness can facilitate healing, via reappraisal.Self-image is enhanced by fulfillment: via opportunities to achieve and to actively take an empowered role in improving one's situation, promotion of cognitive ability (creative thinking), and positive distraction allowing a person to escape from negative rumination to a state of "remembered wellness" and regain an identity not defined by their problems.Relaxation is promoted by catharsis: the release of stress and tensions, engagement in a repetitive physical activity in which the person can freely choose when to start and end, and the subjective nature of aesthetics allowing for many different kinds of "good" result.Social interactions can be improved by feeling a sense of belonging, by being able to share with others, and by being included in a form of expression which is accessible to people of all ages and cultures.Thus, we propose that an art therapy robot can seek to promote self-exploration, self-fulfillment, catharsis, and perceiving belonging.
To promote self-exploration, the robot should sometimes ask a person about their art; in doing so, the person should not be directly interrogated, but rather their emotions can be explored indirectly through the art.The robot can also compile data on a person which can be given if approved to a care giver such as a doctor to make inferences on a person's state and progression.To promote self-fulfillment, the robot could leave the most important parts of the painting for the person to paint, providing opportunities to feel independence, control, purpose, and growth, while possibly scaffolding and adjusting the challenge to the person's skill level.In addition, the robot can refer to the person as an artist or as creative or skillful, seek to positively distract the person sometimes, and engage the person in creative thinking, asking questions such as "What do you think this looks like?" or "What would you do next time you wanted to paint something like this?" Furthermore, self-acceptance could be facilitated by including positive personalized content; for example, a robot could paint fish for a person with dementia who used to enjoy fishing.To promote catharsis, the robot should not put time pressure on a person, but rather ensure the atmosphere is relaxing; the robot should take over as little as possible the physical act of painting for a person, and should recognize when a person wants to start and end.To promote social interactions, the robot should suggest opportunities, such as showing the painting to others, or displaying a painting somewhere; the robot could furthermore suggest including others in paintings, mention others who have painted similar paintings, or seek to include interesting content in paintings which could lead to conversation.

General Interactions
We have considered the human science literature specific to art therapy to gain some initial ideas.Here, we enrich those ideas by considering general requirements to achieve good interactions with a robot, based on the idea that, although humans can do much automatically without conscious thought, robots need to be explicitly programmed.In other words, positive user experience (UX) is important to realize successful interactions with a robot, but does not automatically result when a system is built; rather, conscious design is required [100].Here, we discuss properties such as behavior modalities, and how to structure behavior to facilitate general well-being.
A fundamental question is which modalities a robot should use for input and output.The usefulness of speech for art therapy can be expected, as psychotherapy is sometimes referred to as "talk therapy"; verbal and vocal channels allow complex information to be conveyed [101], in a highly salient fashion [102], without requiring a person to look away from art-making and possibly lose concentration [103].Visual output also is useful, in that numerous streams of information can be shown continuously and simultaneously, additionally to people with difficulty hearing, and a human does not have to wait to hear complete messages from a robot, which could be difficult for users with limited attention.Moreover, tactile interfaces are fundamental, both for operating tools and machines, and for affectionate interactions.Since human art therapists typically utilize multiple modalities, we propose that robot art therapists as well will ultimately benefit from the ability to engage in multimodal interaction.
We also consider the overall problem of facilitating a person's perceived well-being.In general, a robot can seek to facilitate hedonic and eudaimonic aspects of well-being, i.e., short-term positive feelings, and aspects which should contribute to eventual good feelings over the long run.Short-term feelings can be easier to measure when assessing the usefulness of some intervention, but their relation to long-term happiness is not always clear-especially considering the phenomenon of "hedonic adaptation", where people after a fortunate or unfortunate event tend to return to the same "set point" or degree of happiness [104].This mechanism could be beneficial for a person's ability to survive (e.g., helping people to avoid becoming oblivious to danger, and recover, respectively), emerging from neurochemical desensitization, but reduces the certainty of achieving positive long-term effects when focusing only on short term design properties.Likewise, the impact of focusing only on eudaimonic factors can also be unclear; a person struggling hard to improve themselves might experience much suffering every day, without certainty that their goal will be ever reached.Therefore we believe it is useful to design for both aspects, also in trying to facilitate good feelings toward the artist, the art, and the robot.
In our previous work, we have proposed some designs for facilitating hedonic well-being in an interaction with a robot, based on guidelines that a robot's behavior should be rewarding, helpful, or inspiring (possibly combining function and playfulness, reactivity and proactivity), clear in regard to its intentions, and carefully executed [105,106].In addition, some general criteria have been proposed to be important for a person's eudaimonic well-being [107]-comprising self-acceptance, positive relations with others, autonomy, environmental mastery, purpose in life, and personal growth-but how to apply them for a robot to support a person's well-being in a creative application is unclear.We note that these criteria have been embedded in a form of therapy called "well-being therapy" which is intended to help with affective disorders, but this involves a very different scenario from the one we consider, in which a therapist assesses notes on perceived experiences to detect impairments and leads a process of cognitive restructuring [108].Instead, we argue here that these dimensions of eudaimonic well-being can be related to the qualities positively affected by art-therapy previously noted.Self-exploration allows for personal growth and perceiving purpose in life.Enhancing self-image relates to self-acceptance; as well, being autonomous and controlling one's environment leads to having a positive self-image.Furthermore, enhancing social interactions entails positive relations with others.
Aside from theoretical prescriptions, the measurement of bodily signals in neuroscience is also contributing new understanding. Korb has summarized some positive effects from feeling gratitude, labeling negative emotions, decision-making, touch, bright light, and exercise [109]. Feeling gratitude resulted in increased levels of dopamine and serotonin, associated with well-being. We note additionally that not only being thankful for positive experiences, but also being forgiving of negative experiences, has been linked with well-being [3]. Labeling negative emotions resulted in higher ventrolateral prefrontal cortex activation and reduced amygdala activity, which relates to reduced perceptions of worry and fear. Making decisions, involving actively selecting and planning actions, especially without striving for perfection, engaged the prefrontal cortex, helped overcome striatum activity related to negative habits, calmed the limbic system, increased dopamine, and led to a stress-reducing feeling of control. Touch resulted in increased levels of oxytocin, serotonin, and dopamine (neurotransmitters associated with well-being), as well as reduced perceptions of pain and of social exclusion, which was reported in an fMRI study to affect the brain similarly to physical pain. We also note that warmth is a typical component of human touch, where physical warmth has been linked with perceived psychological warmth [110]. Bright light in the day and exercise also led to numerous positive effects such as boosted serotonin levels.
Based on this, we draw the following conclusions: If a person does not know what to paint, the robot can suggest painting something for which they feel grateful; for negative displays, a robot can ask them to consider forgiveness.A robot should ask a person to describe depicted emotions, although this does not mean a robot should not also make inferences itself based on the person's art and/or direct signals.As suggested previously, the robot should allow the person to make various decisions.If possible, it would be an advantage for the robot to also be capable of engaging in simplified touch interactions (e.g., recognizing a hug or pat, and reacting appropriately); some previous studies in robotics have proposed such methods [111][112][113][114]. Likewise, an advanced system could act to select a painting environment which provides some sunlight and adequate warmth, and if possible should let the person move to get physical exercise (e.g., to fetch paints).
In designing interactions for well-being, another challenge will be how to assess the success of art therapy interactions.For this, a number of instruments have been described, from the Bradburn Affect Balance Scale, Fordyce Happiness Scale, and Satisfaction with Life Scale (SWLS) for well-being, as well as the Friedman Affect Scale and Positive and Negative Affect Schedule-X [3].For working specifically with elderly with dementia and art therapy, the following instruments have been proposed: Cornell Scale for Depression in Dementia (CSDD), The Multi Observational Scale for the Elderly (MOSES), The Mini-Mental State Exam (MMSE), The Rivermead Behavioural Memory Test (RBMT), Tests of Everyday Attention (TEA), Benton Fluency Task, and Bond-Lader Mood Scale [12].Thus, many tools exist which can be applied to evaluate robot art therapy interactions.
In summary, we have examined both explicit and implicit guidelines for human art therapists, as well as general guidelines for achieving good interactions, to prescribe interactive guidelines for an art therapist robot.

Ethical Pitfalls
In addition to considering what a robot therapist should do, we believe it is also important to pay careful attention to what should be avoided.In our previous work, we considered the ethics of a general system that can visualize emotions, e.g., using a computer screen [31], discussing some possibilities for how to conceal a person's emotions or use existing regulation such as the General Data Protection Regulation (GDPR) by regarding a person's emotional state as personal information.From this previous work, we imagine that pitfalls could generally involve physical harm, psychological harm, mistakes, and deception, but this level of detail is insufficient to incorporate into a design.To go further, we need to also consider the specific scenario of painting with a robot in art therapy.To do so, we follow the general pattern of our previous work in first identifying some potential pitfalls, before suggesting some avoidance strategies.

Identifying Pitfalls
Considering that potential problems could involve physical harm, psychological harm, mistakes, and deception, we imagine that physical harm could be caused by the robot, the act of art-making, the person, or others.A robot could hurt a person with its motions, or splash paint or water around, damaging physical property such as the environment or a person's clothes.Art-making could negatively affect a person: for example, making a person with respiratory problems inhale chemicals in paint or varnish, or making a person stand or sit for too long, possibly without drinking liquids.We also consider the possibility that a human could harm the robot or environment: assaults on robots, usually by younger persons, have been reported in the HRI literature [115]; in addition, although incidences are typically rare, assaults on caregivers by persons with dementia, retardation, and (paranoid) schizophrenia have also been described [116].Lastly, physical altercations or mistreatment-potentially subtle, such as leaving the window to a patient's room open on a cold day-could ensue if peers or caregivers learn as a result of art therapy that a person has negative feelings to them, such as violent obsessions, and is vulnerable.
We also believe that psychological harm could result from scaring a person, damaging relationships, or making them feel bad about their emotions, creativity, or painting skills. Even if a robot is safe, if this knowledge is not adequately conveyed, a person might be afraid of being harmed by or breaking the robot, because people of various cultures can feel anxiety about robots (it is not just limited to the West, where Terminator-like scenarios are commonly considered) [117], people have been killed by robots in the past (https://www.news.com.au/technology/factory-worker-killed-by-rogue-robotsays-widowed-husband-in-lawsuit/news-story/13242f7372f9c4614bcc2b90162bd749), and robots can be expensive (e.g., https://qz.com/1194939/the-us-government-just-gave-someone-a-120-millionrobotic-arm-to-use-for-a-year/). Psychological harm could also result if interpersonal ties are damaged, leading to discomfort or decreased trust (e.g., a family member or friend might wonder if this is really the person they thought they knew), or from the stigma of having it revealed that the person had engaged in therapy with the robot [118]. In addition, it is unclear if people will feel the same way about a robot therapist as about a human one, or if they will feel as if they have been abandoned, left alone with a machine [119]. Additionally, Phillips mentioned that patients can initially seek to test art therapists, possibly with disturbing art, and "write off" therapists who are found to be unable to be open, which could result in the therapy failing and in perceived isolation [83]. Similarly, therapy could fail because some patients are confused and uncertain, remember failures in previous art classes, and can be reluctant to open up emotionally and wary of revealing vulnerable innermost emotions [1].
Mistakes could involve indiscriminately revealing emotions, recognizing or depicting the wrong emotions, or making mistaken judgments about a person.Inappropriately showing a person's emotions could reveal a mental illness or feelings to others, which the person might not wish to have known.This information could be revealed by accident, or sought intentionally; for example, a care center might seek to check for "problematic" residents.Trouble could arise if the person's feelings are somehow judged as undesirable, threatening and unjust, and if a person is perceived as weak and an easy target, i.e., crime can result if there is a convergence of a criminogenic strain and circumstances conducive to criminal coping [120].As noted above, this could result in physical violence or psychological pain.However, we note that such an outcome is not the only one possible; seeing a person's emotions depicted in a painting could also promote understanding and empathy, leading to potentially better treatment [121], or simply not be of interest if emotion visualization becomes common and desensitization takes place [122].
Moreover, if a person feels that the robot could convey their emotions to the wrong person, they might feel concerned, and try to hide their emotions; however, hiding emotions is a difficult task, as emotions can be leaked through facial microexpressions and body language [123], and, because the outcomes could be serious, more worry could be perceived.In addition, although a physical painting might have some advantages over digital media, in that a computer image can be uploaded to the internet where many people could see it, and it might be difficult to remove the image if it exists on various computers, physical objects are harder to store as they require more space, and leaving such a permanent record could act as a reminder of stigma, informing many years later that a problem existed.
Mistaking the emotions that a human is feeling, verbally or in art, could also lead to damaged relationships, from which physical or psychological harm can result, as previously described.Wrong emotions could be indicated intentionally, via the work of a hacker, but we imagine that most cases would likely be unintentional: we expect that mistakes will occur in recognizing emotions, which can be difficult even for people, mostly due to the complexity of the phenomenon.Errors could occur when inferring basic emotions, referents, mixed emotions, or progressions.As an example, emotional signals can be ambiguous.For example, nodding can be positive, indicating agreement, greeting, or thanks; neutral expressing confirmation; or negative conveying irony or emphatic insistence [124].The robot could infer that a person who has watched a horror movie is afraid of a caregiver.The person could feel happy and afraid at the same time, but be inaccurately described by an algorithm as afraid.The person could also have been sad all day but be assessed as happy at that moment.Errors in algorithms can also act as self-fulfilling prophecies [125].For example, a person could become angry if an algorithm recognizes them as angry when they are not.
Another possible problem could result not from the robot, but on the observer's side, who might not interpret a painting or its description properly.Such mistakes could result in physical or psychological harm if the misinformation causes a decision to be made.For example, if a person's state is mistakenly inferred to be problematic, unneeded medicine could be prescribed.Conversely, if an extreme state is seen as normal, caregivers could be prevented from providing timely help.For example, negative consequences could ensue if a robot ignores genuine threats, either conveyed verbally or in a person's art; this is complicated because so-called "sublimation", a Freudian term describing tension release through a productive act, can appear negative, similar to threat communications (e.g., as in disturbing art) [83].Another potential mistake is "contaminating" the personal meaning of an artwork by telling the person what a painting means, as the meaning of a painting for its creator is more important than a therapist's interpretation [1].In addition, the robot should be careful not to mistakenly exclude a person, by ignoring them or their decisions.
Finally, problems could result from deception, in promoting false relationships in place of real ones, in manipulating emotions, and in creating unrealistic expectations.A person might feel camaraderie toward an art therapy robot which is their partner in painting and receive the impression that the robot understands them; such people might be used to interacting with computers and robots, and it might feel safer to bond with a robot than a human as there is less possibility for rejection and less judgement [126,127].A danger is that a person might then turn to a robot as a replacement for other humans, even though the relationship will likely not be genuine in the same sense as with humans [128].Moreover, a robot aware of a person's emotions could be programmed to use its knowledge for some extra purpose, e.g., to influence a person to select some treatment over another one, or to gain money, power, or political support.An added concern is that robots can be perceived to be less accountable for dishonest behavior than humans [129].In addition, misleading people with regard to a robot's capabilities or the expected outcomes of the therapy with the robot could result in disappointment and lack of trust, not only in the robot, but potentially also in art therapy itself or any human caregivers involved.

Proposed Solutions to Ethical Pitfalls
In general, to deal with potential problems with physical harm, psychological harm, mistakes, and deception, we propose the detailed solutions below, which are also incorporated as part of our design in Section 3.1.Some proposals, such as regarding safe embodiment and basic recognition capabilities, target robots specifically, as it is understood that human art therapists will behave in a safe and context-aware manner.Conversely, some proposals, comprising patient confidentiality and only acting in a person's best interest, are based on our idea that art therapy robots can also follow codes of conduct for human art therapists (such as the Code of Professional Practice by the Art Therapy Credentials Board (ATCB)).Additionally, both basic and more advanced proposals are presented, as follows.
To prevent physical harm, we propose that a therapist robot should have some capability to avoid and detect problems, and alert and aid when trouble occurs.To avoid hurting a person, the robot itself should be built in a safe manner, and be fully covered, with no exposed sharp, hard, or electrical parts; with compliant joints and quickly cancelable motions; and with a stable base and hardware emergency stops.Moreover the robot's intentions should be easily inferred.This can be tackled by using meaningful motions which are not too fast, and clearly indicating any changes in the robot's motion patterns, potentially verbally.An advanced robot could also detect problems via typical human communication modalities such as sound or vision (e.g., screaming for "help" or frantic hand-waving); moreover, gaze and/or projectors can be used to show where a robot will move next [130].To avoid damaging property, the environment in which the robot operates should be kept safe and uncluttered; e.g., newspapers can be placed on the floor below a painting to catch paint.A more advanced robot could alert humans if it is being damaged, and seek to avoid damage; in addition, a mobile robot should be able to sense some of its environment (e.g., to avoid falling on a person), and a physics model could be used to predict what kind of damage could occur when planning motions.
To avoid problems coming from the act of art-making, a robot should have some basic ability to disclose potential dangers.Moreover, a contraindication for art therapy could be if it takes the place of another, needed treatment, such as medication; thus, the decision to engage in such interactions should always take place with the knowledge of a person's caregivers.A more advanced design can also have some basic ability to check the suitability of the environment-e.g., temperature, exposure to sunlight or noise, adequacy of ventilation, and presence of furniture if a person becomes tired or faint-and keep track of a person's state, such as how long they have been standing, when they last drank liquids, potentially what medical conditions they have, and how they respond when asked about how they are feeling.Respiratory problems could be taken into account when choosing paints and varnishes.To avoid physical harm from others, a robot should engage in continuous interaction, sometimes asking the person if they feel all right, and alert if there is no positive response.An advanced robot could seek to detect altercations or mistreatment, and even potentially seek to prevent or defuse the situation; in general, we also propose that an advanced robot should also seek to monitor basic signals such as a person's pulse, breathing, temperature, and pose; and to also help in emergencies by alerting medical personnel and potentially even perform basic first aid procedures.
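As a rough illustration of the state tracking proposed above, the following Python sketch maps a person's tracked state to simple actions. It is a minimal example under assumptions: the `PersonState` fields, the `safety_prompts` function, the action names, and the thresholds (20 min standing, 60 min without a drink) are hypothetical placeholders that would need to be set together with caregivers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PersonState:
    minutes_standing: float     # time the person has been standing
    minutes_since_drink: float  # time since the person last drank liquids
    responded_ok: bool          # positive reply when last asked how they feel


def safety_prompts(state: PersonState,
                   max_standing: float = 20.0,
                   max_without_drink: float = 60.0) -> List[str]:
    """Return the actions the robot should take (thresholds are assumptions)."""
    actions = []
    if not state.responded_ok:
        # No positive response: alert a human rather than act alone.
        actions.append("alert_caregiver")
    if state.minutes_standing > max_standing:
        actions.append("suggest_sitting")
    if state.minutes_since_drink > max_without_drink:
        actions.append("suggest_drink")
    return actions
```

A more advanced design could replace the fixed thresholds with values derived from a person's known medical conditions, as suggested above.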
To avoid psychological harm to a person, the robot should clearly explain that it is safe and robust, disclosing its safety features, and should check that a person is willing to paint with it.In addition, the robot should indicate sincere liking of a person through positive behavior, and avoiding negative behavior directed to a person in regard to their emotional states, creativity, painting skills, etc.To avoid perceived stigma, art therapy robots should look like regular robots used for other tasks such as housework or teaching, and should not be associated with illness; furthermore, such robots can be used by various demographics (children in schools, adults at work, and elderly at care centers) to avoid being associated with any one particular disorder.To avoid being written off, a robot should react positively to a person's art, without showing shock, fear, or disgust [83].Likewise, to deal with initial negative feelings, the robot should seek to create an atmosphere that is positive and non-judgemental.
To avoid communicating private information to the wrong person, an art therapy robot should practice patient confidentiality. Paintings, physical or in digital form, should be stored safely, in locked safes or computers not connected to the Internet. Processing can be on-board, potentially communicating only basic information if needed, with encryption and authentication. Cloud services might not be optimal if there is a risk of data being lost, corrupted, or stolen. Sharding (breaking apart files and spreading them over multiple computers) can be used instead of working with whole files or data blocks. For encryption, popular current methods incorporating public key cryptography such as RSA (Rivest-Shamir-Adleman) and PGP (Pretty Good Privacy) are expected to be breakable by quantum computers able to run Shor's algorithm, which can find the two prime factors of a semiprime in polynomial time; some post-quantum alternatives include lattice-based and code-based methods such as NTRU and McEliece (https://www.zdnet.com/article/ibm-warns-of-instant-breaking-of-encryption-by-quantum-computers-move-your-data-today/ and https://www.technologyreview.com/s/420287/1978-cryptosystem-resists-quantum-attack/). In addition, information that is made available to a robot can also be limited, especially if the robot is part of a team of caregivers; for example, the robot probably should not need to know which medications a person is taking. Moreover, a robot could generate dummy paintings implying that a person has felt only positive emotions. For higher security, private rooms can be used, without cameras and with closed windows and curtains (e.g., preventing drones from spying); art-making could also be conducted in virtual or augmented reality.
To avoid problems recognizing emotions, the robot should never act in isolation, but always ask for feedback and confirmation from the human. Additionally, more advanced mechanisms for recognizing and expressing emotional properties (e.g., mixed emotions, referents, timing, and polysemy) in visual art will facilitate accurate communication; for example, responsible use of emotions by a robot might include ensuring that displayed emotions are clearly tied to a referent. To avoid mistaken judgements by a robot, we also suggest, in line with our idea that art therapy robots should not be used to judge people, but rather act as partners, that robot behavior should be logged, along with reasons for any behavior, for transparency. In addition, the robot could alert a human caregiver when a person behaves in an extreme manner. For this, it could be possible to only act if it is known from the outset that there is a risk, or if the robot is highly certain of an inference, as in statistical significance testing, where a null hypothesis is rejected only if the observed data would be highly unlikely under it (e.g., with a probability less than one in twenty, two hundred, or one thousand). It could also be possible to log a person's verbal description of their artwork for a human therapist to review later. To avoid contaminating the meaning of a painting, the robot should ask about its meaning, rather than telling the person what it means.
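The alerting rule described above can be illustrated with a small sketch. It assumes a hypothetical setting in which the robot counts how many of its recent observations were flagged as extreme and compares this count against an assumed baseline rate under the null hypothesis; the function names, the baseline rate, and the log format are illustrative assumptions rather than part of any existing system.

```python
import math
from datetime import datetime, timezone


def binomial_tail(k, n, p0):
    """One-sided p-value: probability of k or more flagged events out of n
    if each event occurs independently with baseline probability p0."""
    return sum(math.comb(n, i) * p0**i * (1 - p0) ** (n - i)
               for i in range(k, n + 1))


def decide_alert(k, n, p0, alpha=0.05, known_risk=False):
    """Return a log record stating whether to alert a caregiver, and why.

    alpha is the significance threshold (e.g., 1/20, 1/200, or 1/1000);
    known_risk models the case where a risk is known from the outset.
    """
    p_value = binomial_tail(k, n, p0)
    alert = known_risk or p_value < alpha
    if known_risk:
        reason = "risk known from the outset"
    elif alert:
        reason = f"p-value {p_value:.4f} is below threshold {alpha}"
    else:
        reason = f"p-value {p_value:.4f} is not below threshold {alpha}"
    # The record is kept for transparency, so that a human can review it later.
    return {
        "time": datetime.now(timezone.utc).isoformat(),
        "alert": alert,
        "p_value": p_value,
        "reason": reason,
    }


# Hypothetical numbers: 6 of 10 observations flagged, assumed baseline rate 10%.
record = decide_alert(k=6, n=10, p0=0.1, alpha=0.001)
print(record["alert"], record["reason"])
```

A stricter threshold reduces false alarms at the cost of missing some genuine cases, which is one reason the final judgement should remain with a human caregiver.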
To avoid potential deception, human therapists should be prioritized when available, and robots only used either to help humans to work more efficiently or when no human is available. It should be made clear that current robots are not capable of forming relationships in the same way as humans. Advanced robots can also seek to promote social ties. With regard to the potential for manipulation, robots should again practice confidentiality, and there should be no ulterior financial incentive to manipulate a person's emotions from the perspective of the caregivers, or partners offering the robot; the only goal of using the robot should be to better help more people to achieve a better state of well-being, by improving quality and quantity of care, and reducing the workload on human therapists. Furthermore, robots' capabilities and expectations for therapy sessions should be made clear to avoid a person feeling disappointed.
By following such prescriptions for what a robot should not do, we believe better interactions will result.

Emotions
Emotions, as noted previously, form a vital portion of the fabric of art therapy and art-making; therefore, we must examine the literature to determine how a robot can engage appropriately on an emotional level. We note that, by emotion, we do not mean that our focus is on whether artwork is aesthetically pleasing; likewise, we are not concerned with the question of whether robot emotions will ever be exactly the same as human emotions. Our goal is not to impress people with the beauty of a robot's art or create an identical replication of a human, but to allow robots to engage in emotional interactions with people to support their well-being. Below, we consider modeling, expressing (abstractly or symbolically), and recognizing emotions.
Modeling of emotions has been complicated by difficulty in defining the term and a vast number of proposed models. In engineering, work on emotions is being conducted under various labels such as affective computing or sentiment analysis, with discrete, continuous, or hybrid models, each with advantages and disadvantages. The discrete model does not encapsulate intuitive interrelationships between emotions, such as that "contentment" is closer to "happiness" than "anger", and the number of categories used varies arbitrarily. Likewise, dimensional models can be difficult to imagine (it can be easier for lay persons to imagine "angry" than "low valence, high arousal") and can have trouble capturing some emotions (e.g., "surprise" via dimensions such as valence and arousal). Some hybrid models have also been proposed such as Plutchik's wheel [131], which seeks to place similar categories next to one another in a continuous graph, but this approach suffers from the problems of discrete models; without dimensions such as valence and arousal, the meaning of distance becomes unclear (e.g., what does it mean to place "disgust" opposite to "trust"?). Current approaches also do not take into account the complex properties we described in Section 1 in regard to mixed emotions, referents, timing, and polysemy. In the current work, we suggest the usefulness of using both discrete and continuous models simultaneously: discrete models for working with participants in experiments, continuous models for robot programs, and a map between them. Moreover, we believe in the value of considering advanced properties of emotions for interactions in which emotions are crucial, such as therapy.
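The suggested map between discrete and continuous models can be sketched as a lookup in a valence-arousal plane. The coordinates below are hypothetical placeholders chosen only for illustration (they are not taken from any study), and the function names are likewise assumptions.

```python
import math

# Hypothetical (valence, arousal) coordinates in [-1, 1]; placeholders only.
EMOTION_COORDS = {
    "happy": (0.8, 0.5),
    "content": (0.7, -0.4),
    "angry": (-0.7, 0.7),
    "sad": (-0.7, -0.5),
}


def to_continuous(label):
    """Discrete label (e.g., from a participant) -> point used by the robot program."""
    return EMOTION_COORDS[label]


def to_discrete(valence, arousal):
    """Continuous estimate -> nearest discrete label, by Euclidean distance."""
    return min(EMOTION_COORDS,
               key=lambda name: math.dist(EMOTION_COORDS[name], (valence, arousal)))


print(to_discrete(0.6, -0.3))  # nearest label for a mildly positive, calm state
# With these placeholder coordinates, "content" lies closer to "happy" than to "angry".
print(math.dist(to_continuous("content"), to_continuous("happy"))
      < math.dist(to_continuous("content"), to_continuous("angry")))
```

Keeping the map explicit makes the interrelationships between categories inspectable, although a single point per label cannot represent mixed emotions, referents, timing, or polysemy.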
We turn to how emotions can be expressed. Various hints can be found in the literature for how to convey emotions through basic abstract properties of a painting, such as color, lines, and composition. From a dimensional perspective, valence can be conveyed by color hue [132] and brightness [133]; shape/line curvature [134] and symmetry [135,136]; and harmoniousness of the composition (key points aligned along lines dividing the composition into thirds, as in the rule of thirds [137], and alignment with the canvas [138]). Arousal can be conveyed by color (warm vs. cold) [132], and combinations (complementary or discordant, vs. analogous) [139]; as well as line orientation (horizontal vs. diagonal) [140]. We incorporate these ideas into our design and prototype described in the next section.
In addition to abstract painting, recognizable content can be used to convey emotions.A highly useful study by Machajdik et al., in addition to describing machine learning features which can be used to classify emotions in images, proposed detecting the presence or absence of people via faces and skin color, and described a need for recognizing other semantic information [141].We agree that, despite many possible confounders, there will be some typical shared emotional meanings which are perceived in symbols.For example, a heart or gravestone could be painted to express positive or negative emotions, respectively.Some work has been done on automatically recognizing such symbols; for example, local self-similarity descriptors have been used to detect hearts [142], based on the idea that humans can detect symbols expressed in various ways even if there is no simple similarity in typical photometrics like color, edges, or intensity.Furthermore, classifiers trained on photos have been used to detect objects in paintings [143].In addition, AutoDraw, based on data acquired through an online quiz called "Quick, Draw!" [144], recognizes symbols users sketch, from a set of categories [145].What is unclear from the perspective of expressing emotions is which symbols will be able to convey which emotions, in artwork and paintings, as is addressed further in Appendix D.
In addition to expressing emotions, a robot can also seek to recognize a person's emotions.Some readers might wonder if an art therapy robot truly needs to be able to recognize emotions, or if the robot could not merely imitate the colors and lines that the person is using, or rely entirely on asking the person what they are feeling.We believe that emotion recognition is useful: by recognizing the emotions behind a painting we expect a robot can provide a more creative and more interesting performance, introducing new colors, lines, and content-especially when a person's painting skill is low.Furthermore, a person's response might not be accurate, as sometimes people do not even know how they are feeling.Additionally, in the case where a person cannot or does not wish to make art themselves, the robot might have to rely on signals other than in the painting to infer their emotions.Various sensors, from cameras and microphones to electrodermal activity sensors, and brain machine interfaces, can be used to infer emotions, from signals such as facial expressions and speech [146][147][148].
In particular, we believe that in art therapy, analysis of a person's spoken words, including those about the artwork, will be highly important. Conversational devices such as Google's Assistant, Apple's Siri, and Amazon's Echo are already engaging in conversations with people in homes; Google Duplex, based on a recurrent neural network (RNN) using TensorFlow, has demonstrated some excellent performance in conversing with humans, e.g., in calling a restaurant to make a reservation [149]. To extract important information about emotions from verbal content, some challenges include anaphora resolution, word sense disambiguation, aspect extraction, and named entity recognition (pertaining to referents); subjectivity analysis (which would seem to be useful for therapy, in recognizing which utterances are revealing, perhaps for summarizing sessions); as well as detection of insincere text incorporating sarcasm and humor [150]. There are also many examples of robots designed to engage in speech-based interaction: for example, Kismet, capable of detecting affective vocal displays tailored to the robot's child-like appearance [151], and Nico, which can learn new words by generating definition trees comparing parsed utterances with sensor-detected entities [152].
A demerit of typical modalities such as facial expressions and speech is that they are relatively easy to control, e.g., by faking a smile or a cheerful speech response, and might not work with some groups, e.g., people with autism, or those affected by strokes or paralysis. Various sensors could be used to avoid such problems. For example, Rani et al. proposed a robot which can detect a person's stress via wearable sensors [153]. Here, we suggest that brain-machine interfaces (BMIs) and thermal cameras could also be useful; we feel they possess some advantages, such as recognition power in the case of BMIs and remote sensing in the case of thermal cameras, but they have not yet received as much attention as other sensors. BMIs have been used to detect positive or negative valence via higher activations in the left or right frontal lobes, respectively [154]. Facial temperature has previously been used to infer arousal [155], and also to detect Ekman's six basic emotions [156].
Thus, currently, there exist various models, hints for expression, and techniques for recognition, with a few noteworthy gaps, such as in regard to the emotional meaning of symbols, as well as methods for working with mixed emotions, referents, progressions, and polysemy.

Creativity
In addition to emotions, art-making, also within the context of art therapy, is an inherently creative process. A robot should not paint randomly, or always in the same fashion, disregarding the human's emotions, which could feel boring, not humanlike, and hard to relate to. Rather, an art therapy robot should convey appropriate emotions, in a creative manner which stimulates and engages.
Here, we describe some recent work on artificial creativity, where great strides are being made. Computers are composing songs [157], authoring short stories [158], generating news articles (e.g., https://automatedinsights.com/wordsmith or https://narrativescience.com/Platform), writing movie scripts [159], and creating computer games [160]. Some artworks could be described as passing a Turing test, in the sense that they are sometimes believed to be created by humans or even rated higher than actual human-made creations: for example, poems [161] and, particularly relevant to the current proposal, visual art on a computer screen [162].
A central question is how a robot should engage in a creative process. Some inspiration can be found in computational theories that are beginning to elucidate how creativity is exercised in humans. For example, some recent work based on the concept of the "adjacent possible" [163] has succeeded in predicting some laws of creativity such as Heaps' law and Zipf's law (respectively, that the rate at which new artifacts are created is sublinear, and that rank and frequency are inversely related within creative spaces) within a simplified context (a Polya urn model) [164]. Basically, such work involves having some model of what is possible and some way to reach adjacent terrain. We believe this basic formulation is reflected in general in current approaches for artificial creativity, which tend to comprise two components, some kind of prior model or data, along with some mechanism of introducing noise.
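The urn formulation mentioned above can be simulated in a few lines. The sketch below follows our understanding of an urn model with triggering: each drawn color is reinforced with `rho` extra copies, and the first time a color is drawn, `nu + 1` entirely new colors are added, opening up an "adjacent possible". The parameter values, the seed, and the function name are assumptions for illustration.

```python
import random


def urn_with_triggering(steps, rho=4, nu=2, seed=0):
    """Simulate draws from an urn; return the count of distinct colors seen so far."""
    rng = random.Random(seed)
    urn = list(range(nu + 1))        # initial colors, none seen yet
    next_color = nu + 1
    seen = set()
    distinct_counts = []
    for _ in range(steps):
        color = rng.choice(urn)      # the drawn ball stays in the urn
        urn.extend([color] * rho)    # reinforcement: what was used becomes more likely
        if color not in seen:        # novelty: expand the space of possibilities
            seen.add(color)
            urn.extend(range(next_color, next_color + nu + 1))
            next_color += nu + 1
        distinct_counts.append(len(seen))
    return distinct_counts


counts = urn_with_triggering(2000)
# Compare the number of novelties after 500, 1000, and 2000 draws;
# sublinear growth (Heaps' law) shows up as diminishing increments.
print(counts[499], counts[999], counts[1999])
```

The two ingredients correspond to the two components noted above: the urn contents act as a prior model of what is possible, and random drawing introduces noise.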
One powerful "black box" mechanism for creation involves using a generator-discriminator combination. For example, in the work mentioned above by Elgammal and colleagues, a "Creative Adversarial Network" was built, in which feedback from the discriminator was modified to rate art both in terms of goodness and style; confusing the discriminator with style is used to assess creativity. The goal was to generate art which, in accordance with guidelines by Berlyne and Martindale, would create a suitably high arousal potential, being somewhat creative, but still adhere to typical notions of what is aesthetic. A drawback to such approaches is difficulty in explaining why decisions were made. As in the third level of DARPA's categorization of artificial intelligence (where the three levels refer to hand-crafted rules, statistical inference, and explanatory AI), there is an increasing trend toward transparency and ability to explain decisions, which is especially important if a robot is supposed to engage in a relationship of trust with a human [165]. For example, in art therapy, if a robot generates responsive art which could be interpreted badly, it should be able to clarify its intentions, to ensure that a human's feelings are not hurt. One possible approach to realize such a transparent creative system could apply other statistical approaches which generate human-readable rules, such as decision trees or boosting, although obtaining stable classifiers which do not vary with small changes in training data can be challenging in practice. Another possibility could be to use the discriminator not just on generated artworks, but to identify models which are more creative. For example, creativity has been assessed in one work in terms of the number of different solutions which can be generated to solve a problem [166]; thus, perhaps progressively more creative models could be trained.
Another fundamental approach is to use some outside data to "seed" artwork with creativity. For example, Colton et al. described an AI which generated poetry from online news articles; its creative decisions were explained in a report it automatically created about each poem [28]. We believe such an approach to be highly useful. A demerit from the perspective of art therapy could be that a robot's artwork will likely not be generated immediately, but rather people will watch the artwork being generated. Thus, it could be beneficial if people could understand what a robot was doing without waiting until the robot has finished. Moreover, people might not bother to read descriptions about a robot's painting. Alternatively, if read out loud, people might not wish to wait through a long report. We propose that it would be useful for a system to be able to explain creative decisions in a more natural fashion, such as verbally in a conversation. In doing so, the robot could proactively issue statements and questions to seek to assess a person's interest and knowledge levels (knowledge tracing), arouse curiosity, and summarize or expand on topics as desired, adjusting the detail of its words to take into account a person's knowledge and interest level; in this way the robot could also immediately halt its explanations when appropriate.
We believe that some possible topics of interest here include how to find a good balance between questions about a person's artwork and comments about its own, how a robot can detect a person's engagement or boredom as feedback, how it can quickly and accurately infer interest and knowledge levels, possibly from a cold start where no data are available (i.e., zero acquaintance), and, although we concern ourselves primarily with a dyadic scenario in the current article, how to effectively explain to groups when interest and knowledge levels of individuals differ. Additionally, we have assumed that it is important for the robot to explain why it paints the way it does, but other questions are also possible. Fox et al. listed the following questions which explainable AI systems should be able to answer: Why did the robot not do something else? Why was a robot's behavior good in terms of some particular criteria? Why can't the robot carry out some particular task? Why is some replanning/reconsideration required or not required [167]?
In summary, a vast literature exists, spanning robots in therapy and art, interactive strategies, and mechanisms for emotion and creativity, from which a large number of prescriptions can be made.

Requirements and Capabilities
In the previous section, various prescriptions were made, which required merging and structuring to form a design. Thus, we revisit requirements, goals and problems, as well as proposed solutions, identified in the previous section, grouping related concepts. For example, relaxation and catharsis were included in the category for hedonic well-being, as they refer to temporary good feelings. This yielded seven requirement categories and six capability categories. The requirement categories, explained below, are as follows: R1 Co-explore, R2 Enhance self-image, R3 Improve social interactions, R4 Please, R5 Engage Emotionally, R6 Engage Creatively, and R7 Avoid Pitfalls.

•
R1 Co-explore.The robot should investigate the meaning of the person's art with the person, showing attention: expressing inferences about the person or their artwork in painting and verbally, and asking questions to confirm and to encourage the person to reflect.The robot can also track the state of the person's artworks over time.

•
R2 Enhance self-image.The robot should be positive, accepting and encouraging the communication of emotions.If sharing a substrate, the robot can also leave the most important parts of the painting for the person to paint, providing opportunities to feel independence, control, purpose, and growth, while possibly scaffolding and adjusting the challenge to the person's skill level.The robot can seek to include positive personalized content tailored to the person in its painting.

•
R3 Improve social interactions. The robot can suggest including others in paintings, mention others who have painted similar paintings, or seek to include interesting content in paintings which could lead to conversation.

•
R4 Please. To promote a general perception of hedonic well-being, the robot should help the person to feel good about engaging in art therapy with the robot, by behaving in an enjoyable and likeable way. To ease communication, the robot can offer a familiar interface, such as humanoid behavior, and be capable of multimodal interaction. To be liked by the person, the robot can show empathy and match a person's emotions, although it should not show negative emotions toward a person or their art; it should also be positive, showing sincere liking, as praise can be given for a person's creativity and skill, even for negative depictions. Emotion expression can be made large and meaningful, and can express sincerity by being clear, e.g., by ensuring that referents are clearly conveyed. Creativity can show interesting variation within a stable core. Suggestions, delivered proactively, can invite interaction and clearly convey interactive affordances and a robot's intentions. The robot can also infer when the person wants to end the interaction.

•
R5 Engage Emotionally. The robot should seek to infer emotions embedded in a person's artwork, and can also seek to infer a person's emotions directly. Basic emotions can be conveyed abstractly based on heuristics, or via symbols such as a person's face. The robot can also seek to infer and convey complex phenomena such as mixed emotions, referents, and progressions.

•
R6 Engage Creatively. The robot should be able to make basic creative choices and discuss these with a person through conversation.

•
R7 Avoid Pitfalls. The robot should avoid pitfalls including physical harm, psychological harm, mistakes, and deception.
In addition, six categories of capabilities are defined. The relationship between our proposed requirements and related solutions is summarized in Table 2.

Simplified Implementation Example
It could be challenging to implement the design with only general guidelines. To ensure the design can be implemented, and help others to do so, we describe an example of a simplified implementation. For this, a medium-fidelity prototyping strategy was adopted, which balances insight into how a system will be perceived against flexibility and ease of development [168]. Below, we describe the interface, inference, painting, and speech.

Safe, Familiar Interface
A robot was required; various robots were available at our lab, including Turtlebot, Thymio, ARDrone, Nao, and Baxter on Ridgeback (hereafter Baxter). For art therapy, we desired a platform which would be highly safe and have a familiar interface (with a humanoid appearance that would make it easy to recognize as a therapist) and rich multimodal interactive capabilities, also facilitating development. Therefore, we chose the Baxter robot shown in Figure 3, which is a safe platform intended to operate near humans; all actuators are equipped with springs, and operate at low speeds (the base's maximum speed is 1.1 m/s). Moreover, Baxter is a humanoid, with a display showing a face capable of showing various expressions (for which OpenCV was used), speakers to issue speech utterances, long seven degree-of-freedom arms able to reach various points on various sizes of canvases without getting in a human's way (a reach of 1.2 m), and an omnidirectional base. To sense, the robot also has a camera and 13 sonar sensors around its head, infrared range sensors and cameras in its hands, force sensors and accelerometers in its arms, a laser and IMU in its base, and a microphone. In addition, as the robot is adult size, it could be possible for people to easily imitate, and possibly feel a connection to, the robot. We note that one disadvantage of using Baxter in studies with some user groups, such as dementia patients, is that the robot could be perceived as threatening, as it is large and heavy (approximately 100 cm × 80 cm × 180 cm, with a weight of approximately 210 kg), and its bent arms with the elbows upwards could be perceived as resembling an insect such as a praying mantis, or an arachnid; in this case another robot could possibly be used.

Inference
Inference comprised detecting emotions with a BMI and recognizing keywords spoken by the person. Russell's dimensional model was used to model a person's emotions. To recognize a person's emotion, we used a typical 14-channel wireless EEG headset with a sampling rate of 128 samples per second (SPS), the Emotiv Epoc+. Electroencephalography (EEG) signals were obtained from four channels (AF3, F3, F4 and AF4) on the frontal lobe, and filtered into three bands using the Remez exchange algorithm, which iteratively finds a filter minimizing the maximum error in the desired frequency ranges [169]: Theta (4-8 Hz), Alpha (8-12 Hz) and Beta (12-30 Hz). Mean log-transformed brain wave power values were computed for each of the four channels and three bands by extracting the spectral densities via Welch's method [170]: this involved averaging periodograms obtained using the Discrete Fourier Transform in Equation (1) on overlapping segments of the signal to reduce variance. These values were used as features by Support Vector Machines (SVMs) in a one-vs.-one arrangement with a Radial Basis Function (RBF) kernel to infer degrees of arousal and valence.
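To make the feature extraction step concrete, the following sketch computes log band powers for one channel using only the Python standard library. It is a simplified illustration, not the prototype's code: it assumes a Hann window with 50% overlap, takes the logarithm of the mean power in each band, and selects frequency bins directly instead of first applying band-pass filters designed with the Remez exchange algorithm. All function names are hypothetical.

```python
import cmath
import math


def periodogram(segment):
    """Power of each DFT bin, X_k = sum_n x_n * exp(-i*2*pi*k*n/N), for k = 0..N/2."""
    n_samples = len(segment)
    powers = []
    for k in range(n_samples // 2 + 1):
        x_k = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n, x in enumerate(segment))
        powers.append(abs(x_k) ** 2 / n_samples)
    return powers


def welch_psd(signal, seg_len=128, overlap=64):
    """Welch's method: average periodograms of overlapping, windowed segments."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len) for n in range(seg_len)]
    spectra = []
    for start in range(0, len(signal) - seg_len + 1, seg_len - overlap):
        seg = signal[start:start + seg_len]
        mean = sum(seg) / seg_len  # remove the DC offset of the segment
        spectra.append(periodogram([(x - mean) * w for x, w in zip(seg, window)]))
    if not spectra:
        raise ValueError("signal is shorter than one segment")
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def log_band_powers(signal, fs=128, seg_len=128):
    """Log of the mean power in the theta, alpha, and beta bands for one channel."""
    psd = welch_psd(signal, seg_len=seg_len, overlap=seg_len // 2)
    hz_per_bin = fs / seg_len
    bands = {"theta": (4, 8), "alpha": (8, 12), "beta": (12, 30)}
    features = {}
    for name, (low, high) in bands.items():
        values = [p for k, p in enumerate(psd) if low <= k * hz_per_bin < high]
        # A small constant avoids taking the logarithm of zero.
        features[name] = math.log(sum(values) / len(values) + 1e-12)
    return features


# Synthetic check: a 10 Hz sine sampled at 128 Hz should be strongest in the alpha band.
test_signal = [math.sin(2 * math.pi * 10 * n / 128) for n in range(512)]
print(log_band_powers(test_signal))
```

Repeating this for four channels and three bands would give a twelve-dimensional feature vector per time window, which could then be passed to a classifier.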
RBF SVMs can be used to classify using Equation (2), where sgn is the sign function returning 1 or −1, $y_i \in \{1, -1\}$ are supervised labels, $a_i$ are learned weights, $x_i$ are training vectors, $x_j$ is a test vector (e.g., data signals representing a person's current emotion), exp(x) is the exponential function calculating the value of e to the power of x (the portion of Equation (2) within the exponential function is the kernel), b is a bias term, and γ is a parameter greater than zero which determines the reach of each support vector's influence: a high γ restricts the influence of each support vector to its immediate neighborhood, tending to result in a classifier with low bias and high variance. Classification values were then sent wirelessly to a desktop program which used a simplified model to associate emotions with visual design elements.

$$X_k = \sum_{n=0}^{N-1} x_n e^{-i 2 \pi k n / N} \qquad (1)$$

$$f(x_j) = \operatorname{sgn}\left( \sum_{i} a_i y_i \exp\left( -\gamma \lVert x_i - x_j \rVert^2 \right) + b \right) \qquad (2)$$

In Equation (1), $x_n$ are the N samples of a signal segment and $X_k$ is the k-th frequency component.

To hear keywords spoken by a person, CMU Pocketsphinx (https://cmusphinx.github.io/) was used for English (we have also experimented with a Swedish version using Voxcommando (https://voxcommando.com/)). The robot recognizes keywords indicating that the person is done painting, conveying the emotions that a person's painting shows, and asking the robot about its painting.
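Returning to the classifier, the RBF decision rule referred to as Equation (2) can be written out directly. The sketch below is a minimal binary example with hand-picked, hypothetical support vectors, weights, and bias; in practice these values would be learned from training data, and several binary classifiers would be combined in a one-vs.-one arrangement.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """exp(-gamma * squared Euclidean distance between x_i and x_j)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def svm_decision(x_j, support_vectors, labels, weights, bias, gamma):
    """Sign of the weighted sum of kernel values plus bias: returns 1 or -1."""
    score = sum(a * y * rbf_kernel(x_i, x_j, gamma)
                for a, y, x_i in zip(weights, labels, support_vectors)) + bias
    return 1 if score >= 0 else -1


# Hypothetical toy model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [1, -1]
weights = [1.0, 1.0]

print(svm_decision([0.2, 0.1], support_vectors, labels, weights, bias=0.0, gamma=0.5))
print(svm_decision([1.9, 2.2], support_vectors, labels, weights, bias=0.0, gamma=0.5))

# A larger gamma makes the kernel value fall off faster with distance,
# so each support vector affects only points very close to it.
print(rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=0.1),
      rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=10.0))
```

The last line illustrates why γ needs tuning: with a very large value the classifier can fit the training data closely but generalize poorly.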

Painting
We focused on the basic simplified case in which the robot and human paint on separate substrates. To find heuristics capable of expressing basic emotions, we obtained insight from two professional artists. They helped us by pointing out useful sources in the literature, drawing sketches as examples of how to convey emotions through painting as in Figure 4, suggesting we consider the medium (we had been using canvas, but watercolor paper facilitated aesthetically interesting blending and was found to be useful especially for conveying relaxed emotions), and highlighting the importance of introducing variations into each painting the robot should produce. Together with them, we derived some initial heuristics, shown in Figure 5. A highly simplified strategy was adopted to transform recognized emotions into a plan for a painting. Recognized emotions were transmitted as a pair of real numbers. The real numbers were binned into one of six categories for arousal, and the same for valence, where each bin corresponded to a model for a painting. To add some simplified creative aspect and avoid producing the same painting multiple times, some randomness was added, which affected angles; number, size, and position of shapes; and colors, within the allowed range of each model. Based on the computed composition, commands were sent to the robot's internal computer to direct it to paint using a paint brush on its left arm and sponge on its right arm. Along the way, we also noted some unintended effects based on comments from colleagues: one blue background the robot generated was perceived as aroused, an orange/yellow background was perceived as relaxed, and black was felt to express positive valence. We believe these effects were caused by, respectively, high contrast when the background was not completely filled in, letting the white of the paper come through; long horizontal strokes used to fill in the backgrounds and low contrast; and a personal color preference. Not filling in the backgrounds perfectly, while generating some confusion, created some complexity which could be aesthetically pleasing. Using the horizontal strokes for each background also was a shortcut used for practicality. Thus, we recommend designers be aware of the potential conflicts between clarity of communicating emotions, aesthetic complexity, and shortcuts used for practicality. Some examples of abstract paintings expressing basic emotions are shown in Figure 6. We also investigated how more complex emotions could be conveyed. The easiest way to convey such information could be through a written description, such as meta-information associated with an image, but a person might not read such a description. Referents could be conveyed by painting a human who shows the referent somehow via gaze, pointing or holding some object; or by a thought bubble or arrow. Mixed emotions could be conveyed by painting a human and using different body parts such as the eyes, mouth and hands as channels (e.g., like the female subject in the Mona Lisa, with her slightly sad eyes and slightly smiling mouth); another mechanism could be to somehow connect emotion displays (e.g., by placing them side by side, such as the Sock and Buskin masks of comedy and tragedy, the Janus head looking into the past and future, or the yin-yang symbol expressing a balance between positive and negative aspects). Progressions could be shown by using the tendency of people to read from left to right, or right to left, in a culture (e.g., comic book style). A polysemy model could be used to select clear symbols. Some examples of symbolic paintings expressing complex emotions are shown in Figure 7.
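The binning and randomization strategy described in this subsection can be sketched as follows. The sketch assumes that valence and arousal arrive as numbers in [-1, 1], and the specific ranges for line angle, brightness, and number of shapes are hypothetical stand-ins loosely inspired by the cues discussed earlier (e.g., horizontal vs. diagonal lines for arousal, brightness for valence); they are not the heuristics of Figure 5.

```python
import random

N_BINS = 6


def to_bin(value, low=-1.0, high=1.0, n_bins=N_BINS):
    """Map a real number to one of n_bins equal-width categories (0 .. n_bins - 1)."""
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * n_bins)
    return min(index, n_bins - 1)  # the upper edge falls into the last bin


def plan_painting(valence, arousal, rng):
    """Choose painting parameters at random within ranges set by the two bins."""
    v_bin, a_bin = to_bin(valence), to_bin(arousal)
    max_angle = 9 * a_bin                # hypothetical: 0 deg (horizontal) to 45 deg (diagonal)
    brightness_low = 0.2 + 0.1 * v_bin   # hypothetical: brighter for higher valence
    return {
        "bins": (v_bin, a_bin),
        "line_angle_deg": rng.uniform(0, max_angle),
        "brightness": rng.uniform(brightness_low, brightness_low + 0.3),
        "n_shapes": rng.randint(1 + a_bin, 3 + a_bin),
    }


rng = random.Random(1)
# Two plans for the same recognized emotion differ in detail but stay within the same model.
print(plan_painting(0.9, -0.9, rng))
print(plan_painting(0.9, -0.9, rng))
```

Seeding the random generator makes a plan reproducible, which could also help when the robot is later asked to explain why a painting looks the way it does.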
(Additionally, we entered some paintings generated by this prototype into the international robot art competition. The paintings were judged partly by a panel of judges and partly by the public, and received a sixth place award; a comment from the judges was "If this body of work was exhibited at a gallery and I was told that the artist aimed to capture emotion through colour, composition, and textures - I would buy it (says one of our professional judges). The bold brush strokes, cool or warm templates to match the emotional quality expressed, it all made sense - but felt alive. Loved them".)

Speech
For speech playback, soundplay_node (Festival) was used. The robot starts the interaction with a greeting, followed by an explanation and suggestions, then loops while waiting for a person to say they have finished painting. A timer is used for the robot to occasionally ask the person about what they are painting or make a comment about its own painting. During the time when it is not speaking, the robot also responds to questions about what it has painted, explaining creative decisions, and if the robot hears a request for "help", it simulates sending an emergency call to a caregiver. At the end, the robot praises the human and says goodbye. If the robot has received permission, it takes a photo of the scene with the finished paintings.
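The interaction flow described above can be summarized as a small event loop. The sketch below is only an outline under stated assumptions: the event names (`timer`, `help`, `ask_about_robot_painting`, `finished`) and action labels are hypothetical, and actual speech recognition and playback are replaced by plain strings.

```python
def handle_event(event, photo_permitted=False):
    """Map one recognized keyword or timer event to a list of robot actions."""
    if event == "help":
        return ["simulate_emergency_call"]
    if event == "ask_about_robot_painting":
        return ["explain_creative_decisions"]
    if event == "timer":
        return ["ask_about_person_painting_or_comment_on_own"]
    if event == "finished":
        actions = ["praise", "say_goodbye"]
        if photo_permitted:
            actions.append("take_photo")
        return actions
    return []  # unrecognized input is ignored


def run_session(events, photo_permitted=False):
    """Greeting, explanation, and suggestions first; then react until 'finished'."""
    log = ["greet", "explain", "suggest"]
    for event in events:
        log.extend(handle_event(event, photo_permitted))
        if event == "finished":
            break
    return log


print(run_session(["timer", "help", "finished", "timer"], photo_permitted=True))
```

Returning the list of actions, rather than executing them directly, also yields a log of the robot's behavior that can be reviewed afterwards.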
Below, we present some examples of the robot's utterances:
Although highly simplified, we feel the prototype illustrates some of the promise of robot art therapy.
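The interaction flow described in this subsection can be sketched as a small loop. The following is only an illustrative simulation under stated assumptions, not the prototype's code: the function name `run_session`, the action labels, the keyword matching, and the tick-based timer (`prompt_every`) are all hypothetical, and speech recognition and playback are replaced by plain strings.

```python
from typing import Iterable, List, Optional


def run_session(heard: Iterable[Optional[str]],
                prompt_every: int = 3,
                photo_permission: bool = False) -> List[str]:
    """Simulate one session and return the robot's actions in order.

    `heard` yields one recognized phrase (or None for silence) per timer tick.
    All action labels are hypothetical placeholders for spoken utterances.
    """
    # Opening: greeting, then explanation and suggestions.
    actions = ["GREETING", "EXPLANATION", "SUGGESTIONS"]
    idle_ticks = 0
    for phrase in heard:
        text = (phrase or "").lower()
        if "help" in text:
            # A request for help triggers a simulated emergency call.
            actions.append("EMERGENCY_CALL_SIMULATED")
            idle_ticks = 0
        elif "finished" in text:
            # The person says they have finished painting: leave the loop.
            break
        elif text.endswith("?"):
            # Assumed rule: any question is answered by explaining a creative decision.
            actions.append("EXPLAIN_CREATIVE_DECISION")
            idle_ticks = 0
        else:
            # Silence or unrecognized speech: the timer keeps running.
            idle_ticks += 1
            if idle_ticks >= prompt_every:
                actions.append("ASK_OR_COMMENT")
                idle_ticks = 0
    # Closing: praise, goodbye, and a photo only if permission was given.
    actions += ["PRAISE", "GOODBYE"]
    if photo_permission:
        actions.append("TAKE_PHOTO")
    return actions
```

In a real system the strings would be replaced by calls to a speech recognizer and a text-to-speech node, and the tick counter by a wall-clock timer; the sketch only makes the ordering of the steps explicit.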

Discussion
In conclusion, we have explored the theoretical design space for an art therapy robot, suggesting the importance of taking into account the rich literature in human science and engineering on various relevant topics from robots used for therapy and art, to potential strategies for interacting, and mechanisms for expressing emotions and creativity.

• Art therapy robots. We have motivated why art therapy robots would be useful, and described an apparent gap in the literature in regard to drama, writing, and gardening therapy for robots.

• Therapeutic interactions. We have suggested the usefulness of a humanistic, "responsive art" approach as a starting point for an interaction strategy for art therapy robots, comprising concepts such as "matching" and "distracting".

• Emotions. We have compiled, with the help of the artists, a list of heuristics for autonomously generating abstract emotional art based on simplified properties of color, lines, and composition. We have reported on some symbols which appear to strongly convey emotions, proposing the importance of one symbol in particular, a painted human face with various expressions, as a familiar and powerful symbol. Furthermore, we have highlighted a perceived gap between our understanding of emotions in human science and what is currently typically being addressed in engineering studies, in terms of mixed emotions, referents, timing, and polysemy, and suggested how such emotional characteristics can be considered or conveyed in art therapy.

• Creativity. We have discussed some issues in artificial creativity, and proposed that an art therapy robot should be able to discuss creative choices with a person through conversation.

• Ethics. We have identified some potential ethical pitfalls for an art therapy robot and proposed solutions for avoiding them.
Based on our arguments, we have proposed a design for an art therapy robot, also discussing an example of a simplified prototype implementation.
We believe that this will help to guide some next steps in this area, although we note several important limitations. Robot art therapy relates to a huge number of research areas, which cannot be comprehensively surveyed in a single article; rather, our intention was to discuss some central studies and present an exploratory overview, to gain insight into the theoretical foundations for robot art therapy. As such, there are risks of bias, and areas in which the body of evidence is sparse. For example, robots typically use large datasets to learn, but descriptions of how human art therapists respond with visual feedback are currently few. Furthermore, our suggestions are limited by lack of certainty regarding concepts such as emotions, creativity, and well-being, which are still being studied in human science. Moreover, we do not suggest that our design will allow robots to replace human art therapists, which we believe is not desirable and would also be currently impossible from the perspectives of conversational ability and understanding of humans; rather, the goal in the current article is simply to act as a foundation toward setting up some first simplified interactions expected to be beneficial for humans, while ensuring that expectations are realistic, and permitting some suspension of disbelief, although we expect that problems regarding overly high expectations of a robot's abilities will fade over time as people become accustomed to robots. In addition, we believe that, even if a human therapist were required to analyze data from sessions, make inferences about progression or regression of the patient, and plan how to structure next sessions, partner robots could still be useful, e.g., in potentially providing some care to people who cannot currently receive care due to lack of resources, facilitating self-exploration in an engaging, non-judgemental way as a partner, being available at possibly any time as robots do not need to sleep, and saving some time for human therapists, such as time spent traveling to, attending, and documenting sessions.
Future work will advance both our theoretical and practical knowledge. Below, we give a few examples.

• Art therapy robots. Theoretical work will extend our design to work with other forms of therapy such as music therapy.

• Therapeutic interactions. Interaction strategies will be refined, e.g., by clarifying how art should be used to provide therapeutic feedback, possibly through preparing datasets of patient art and responsive art from human therapists.

• Emotions. We will get a better understanding of how to convey complex emotions, also using symbols, and we will tackle some questions which emerged in our work such as: Will abstracted symbols be perceived more as conveying emotions, whereas realistic symbols will be perceived more as semantic or referential?

• Creativity. Various questions can be tackled, such as: How could an art therapy robot personalize creative artwork generated for a person?

• Ethics. Concerns for other forms of art or therapy should be addressed, as well as legal questions. For example, if an art therapy robot errs, who is at fault: the robot, the makers, or the entity offering its services? Moreover, who will own the copyright for the robot's generated artwork if it is generated in a therapy session for a human?
Practical work in general will involve conducting interactions with various users, and assessing various design capabilities and assumptions. By exploring how robots can contribute to our well-being in emotional and creative interactions, our intention is that the resulting knowledge will also positively affect acceptance of robots in human environments, provide some insight into which facets of human activity are unique or also performable by robots, and, in the sense that robots can provide insight into human nature [171], possibly even move a small step toward gaining a better understanding of the complex human phenomena of emotions, creativity, and well-being.

Appendix A. Emotions
Emotions are key to art and art therapy [6], so much so that the purpose of art has been defined as being "that feelings . . . can be transmitted with all their force and meaning to other persons or to other cultures" [5]. "Emotion" in humans refers to a complex psycho-physical phenomenon, involving subjective experiences referred to as feelings, somatic symptoms, expression via affect displays such as smiling, and cognitive appraisals [7]. Affective space can be partitioned for simplicity into some discrete categories such as "happy" or "sad", capturing the way people typically describe emotions [172,173], or along a few continuous dimensions such as valence (positivity) and arousal, characterizing interrelationships between emotions [174,175]. Here, the term emotion is used interchangeably with "affect", but we differentiate emotions from "moods", attitudes, and "sentiments", which are considered to be more long-term [176], although what exactly constitutes short- or long-term is not clear, and we believe that these phenomena are highly intertwined; furthermore, the latter two concepts focus on the dimension of valence rather than on other dimensions such as arousal or dominance. In the context of art-making, we note that an artist's emotions are not necessarily the same as the emotions which are intended to be perceived in the art, or the emotions that an observer actually feels. For example, a person might have fun painting an angry painting, which might be considered boring by an observer who has seen many similar paintings. Nonetheless, we consider that in art therapy a goal is typically to explore one's feelings, so there will likely be some correlation between a person's felt emotions and emotions perceived in the artwork. We additionally highlight some other properties of emotions which we consider to be interesting from the perspective of robot art therapy: co-occurrence, referents, timing, and polysemy.

• Co-occurrence. Humans can feel multiple, sometimes opposite, emotions simultaneously [177,178]. For example, conflicting emotions might include enjoying a scary movie, a "dumb" joke, or cacophonous music; feeling happy and sad at experiencing a small win or loss in gambling; feeling hopeful at new prospects but sad to lose contact with friends, when moving; feeling happy but sad when a child leaves home; or feeling happy for students who excel and sad for those who do not, after a test. Some examples of complementary emotions are feeling relaxed and happy, or sad and angry. Humans appear to be adept at recognizing such emotions; it has even been suggested that blended emotions are displayed more often in the face than single emotions, and that people are better able to process this mixed information [179].

• Referents. Another property we note is that emotions are typically directed toward some context (here, a "referent"), and are not random phenomena disconnected from the world; this can be seen in some typical patterns of thought occurring during the experiencing of emotions described by appraisal theory [180]. For instance, appraisal of an event can involve assessing its relevance, how difficult it is to handle, the causes, and norms for how people typically react [181]. Referents are also generally important for phenomena related to emotions such as "sentiments", attitudes, preferences, and opinions, which are typically directed toward something or someone. We believe that identifying the referent of an emotion is vital for human interactions; for example, a robot might need to know if a person is angry at the robot or at someone else. However, referents can be quite complex. A person cut off while driving might feel angry toward another driver, their car, or even everyone in the vicinity, or the situation in general. Moreover, in some cases, the referent might be obvious, but in others, even humans can have difficulty identifying why others feel the way they do, for example when someone is angry with them. Furthermore, it has been suggested that sometimes emotions are not directed at anything in particular, in the case of "core affect", in which a person might feel excited, depressed, or relaxed [182].

• Timing. Additionally, emotions change over time. In general, there is an internal homeostatic tendency for strong emotions to gradually fade over time, and emotions can be more easily regulated as they start than when they are in progress [183]. The role of timing in "emotion episodes" has also been examined in affective events theory, in which it is claimed that the patterns with which emotions fluctuate over time are highly predictable [184]. In addition, some typical progressions of emotions have been described. For example, a grief response can proceed from denial, to anger, bargaining, depression, and finally acceptance [185]; in terms of basic emotions, this could be described as a progression from surprise to anger, sadness, and finally a neutral state. In addition, progressions of emotions from fear to anger are predicted in General Strain Theory [120,186]. We believe that humans have some intuitive understanding of such processes, which could be beneficial for a robot art therapist; for example, if it is known that scared people can become angry, predictions can be made about how a scared person might feel in the future.

• Polysemy. Additionally, emotional signals can be ambiguous. For example, nodding can be positive, indicating agreement, greeting, or thanks; neutral, expressing confirmation; or negative, conveying irony or emphatic insistence [124]. Robots should be aware of such nuances in order to communicate well with humans.

In the current article, we seek to take into account such considerations, in exploring the complex phenomenon of the visual communication of emotions.
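The idea under "Timing" above, that known progressions allow predictions about how a person might feel next, can be illustrated with a small lookup. This is a minimal sketch under stated assumptions: the dictionary `PROGRESSIONS`, its keys, and the function `likely_next` are hypothetical names, and the table contains only the two progressions mentioned in the text (the basic-emotion reading of the grief response, and fear to anger), not a validated model of emotion dynamics.

```python
from typing import Dict, List

# Assumed toy table: only the two progressions mentioned in the text.
PROGRESSIONS: Dict[str, List[str]] = {
    # Grief response expressed in basic emotions.
    "grief": ["surprise", "anger", "sadness", "neutral"],
    # Fear-to-anger progression (General Strain Theory).
    "strain": ["fear", "anger"],
}


def likely_next(emotion: str) -> List[str]:
    """Return the emotions that directly follow `emotion` in any known progression."""
    followers = set()
    for sequence in PROGRESSIONS.values():
        # Pair each state with its successor and collect matching successors.
        for current, following in zip(sequence, sequence[1:]):
            if current == emotion:
                followers.add(following)
    return sorted(followers)
```

For example, `likely_next("fear")` yields `["anger"]`, matching the observation that scared people can become angry. The progressions are deliberately kept separate rather than chained, since the text does not claim that one theory's sequence continues into another's.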

Appendix B. Creativity
Another core characteristic associated with art-making is creativity. Creativity has been defined in various ways: for example, by Cope as "the ability to associate two things which heretofore have not been considered particularly . . . associate-able" [187] or as "imagination . . . skill, and the ability to assess" [188]. A typical opinion reported by Arai and Lanier, among others, is that robots cannot be creative (https://english.kyodonews.net/news/2018/07/528f2cc41122-feature-dont-fear-robotsfear-robotic-humans-says-japans-ai-auntie.html), but can only recycle "data from people . . . (where) the problem . . . is that the people are made anonymous" [8]. This opinion is reflected in a statement by Jack Ma from Alibaba: "We have to teach (children) something unique, so a machine can never catch up with us . . . we should teach our kids . . . painting (and) art" (https://www.youtube.com/watch?v=rHt-5-RyrJk).
We agree that robots typically use data from people, but we believe this is not limited to computers, as people also draw insight from observing others; in addition, various machines described in Section 2 have already been developed to engage in basic art-making and painting. In the current article, we adopt a useful formulation from Winiger that creativity can be interpreted, not as something one has or does not have, but rather as "a way of doing things" [8]. Moreover, we hold that creativity is not limited to some form such as poetry but appears in various different forms, in line with Horace's "Quod libet audendi", marked by some core component of novelty or incongruity; other factors such as usefulness are predictive of creativity only when the novelty is very high [189]. A creative artwork does not need to be novel for all humans, but can be creative for an individual or robot, as in Boden's concept of P- versus H-creativity [190] or Simonton's little-c versus Big-C creativity [191].

Appendix C. Basic Forms of Art Therapy
Although a comprehensive description is outside of the scope of the current article, we note a few examples of variants comprised within the umbrella term of "art therapy": cognitive, behavioral, psychodynamic, and humanistic [71]. Cognitive approaches focus on mental processes as the key to therapy, whereas behavioral approaches focus on measurable outcomes. Psychodynamic approaches, such as Freudian or Jungian approaches, focus on unconscious dynamics and the phenomenon of transference. Furthermore, humanistic approaches, such as the person-centered approach, focus on a model of wellness rather than illness, and unconditional positive emotion and empathy toward the person making art.
Furthermore, for any particular kind of approach, numerous possibilities exist for how to seek to intervene to improve a person's well-being. For example, Cognitive Behavior Therapy (CBT), combining ideas from cognitive and behavioral psychology, aims to confront and replace dysfunctional thought patterns and amend critical behaviors relevant to specific problems; high success rates have resulted in CBT being suggested as a first step in the treatment of a majority of psychological problems [192]. Following such a strategy, a robot could use the A-B-C model prescribed in the first form of CBT, "Rational emotive behavior therapy", to identify adversities (A), irrational beliefs (B; e.g., that a person must always be successful, loved, or comfortable), and consequences (C) [193]. Conversely, for difficult cases involving relapses or vague symptoms, schema therapy could be used to identify underlying rigid and maladaptive thought patterns, as well as emotional triggers (e.g., reflecting hypersensitivity to abandonment) [194]. Following the concept of guided imagery, the robot and human could jointly paint positive images, or the robot could paint positive images that the person could focus on, such as images depicting the overcoming of a problem [195]. Thought stopping could involve the robot detecting negative emotions projected into art and alerting the painter, although the efficacy of such techniques in isolation is debated [196]. Moreover, projective tests such as the Draw-A-Man Test, House-Tree-Person, and Kinetic Family Drawing could be prescribed, although here too the reliability of such tests has been strongly questioned, due to the ambiguity which exists with the interpretation of symbols without feedback regarding the context, as well as confounds from artistic skills and culture [197].

Appendix D. The Emotional Meaning of Symbols
Some datasets of artwork exist, such as Painting-91, which comprises 4266 paintings from 91 different artists [65], but do not contain any information about emotions. However, some inferences can be drawn from image datasets assembled to investigate emotions [141,198,199]. In particular, the emotional meaning of some themes in images has been explored using the International Affective Picture System (IAPS), which contains roughly 1000 images rated by large numbers of people for valence and arousal [200]. Erotica, sports, adventure, and food were perceived as conveying high valence and arousal. Images of nature and babies were perceived to convey high valence and low arousal. Grieving scenes, illness, and accidents, including a starving child and an injured face, were perceived to convey low valence and medium arousal. Furthermore, threatening content including snakes, guns, and violent death was perceived to convey low valence and high arousal. At the risk of over-simplification, these quadrants typically contain happy, relaxed, sad, and angry emotions, respectively. Note: Content perceived with low valence and arousal was rare, in line with Tellegen's conjecture that low valence images are necessarily arousing. Strong positive correlations were also noted between how emotions were perceived by children and adults, as well as by women and men. We believe invaluable hints can be drawn from such results, but it is currently not possible to directly apply them to infer emotional meaning from symbols for robot art therapy. First, because the images typically contain multiple symbols expressing emotion, it is not straightforward to infer from results for themes or pictures how individual symbols are perceived. For example, in one picture, which content is responsible for the valence and arousal levels attributed to the image: is it the expression or bent pose of a person, a caged dog beside him, or the natural landscape behind? Second, content can express meaning outside of basic sentiment; for example, it could be perceived as inappropriate if a robot sought to express happiness by painting an erotic scene for a person with dementia. Notwithstanding these concerns, we highlight one brief statement in this work, which was that over half of the dataset images depict people, because it was claimed that the images which evoke the most emotion are those involving humans. Indeed, the images depicting sports and erotica, which are attributed joyful levels of valence and arousal, often contain people who appear joyful. Thus, we suggest that a robot could draw human faces with various facial expressions, as a basic symbol to express various emotions.
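The mapping from valence and arousal to the four quadrants described above can be written as a short function. This is a simplified sketch, not a method from the cited studies: the function names `rescale` and `quadrant`, the zero-centered thresholds, and the assumed 1 to 9 rating scale are illustrative assumptions. Note also that the text reports sad content at medium rather than low arousal, so a hard threshold at the midpoint is a coarse approximation.

```python
def rescale(rating: float, low: float = 1.0, high: float = 9.0) -> float:
    """Map a rating on an assumed [low, high] scale to the range [-1, 1]."""
    midpoint = (low + high) / 2.0
    return (rating - midpoint) / (high - midpoint)


def quadrant(valence: float, arousal: float) -> str:
    """Label a point in valence-arousal space (both in [-1, 1]) with a quadrant emotion.

    Thresholds at zero are an assumption made for illustration.
    """
    if valence >= 0:
        # Positive valence: high arousal is happy, low arousal is relaxed.
        return "happy" if arousal >= 0 else "relaxed"
    # Negative valence: high arousal is angry, lower arousal is sad.
    return "angry" if arousal >= 0 else "sad"
```

For instance, a picture rated 7 for valence and 3 for arousal on the assumed scale would be labeled with `quadrant(rescale(7), rescale(3))`, which gives `"relaxed"`. Such a label describes a whole image and, as noted above, does not say which symbol within the image carries the emotion.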
Another set of typical symbols which people use, from which inference could be possible, comprises emojis and emoticons. One study has started to investigate the emotional meaning of these common symbols [201]. One drawback for use with robot art therapy, however, is that many of these symbols are used alongside text in a semantic rather than emotional way: for example, a picture of an airplane can refer to a trip, rather than joy or anger. This might explain some unintuitive results in the above study, such as symbols for a bomb, crying face, and worried face being labeled as slightly positive. Moreover, similar symbols were not grouped; there are many heart emojis, for example, which we expect would have similar emotional meanings. Furthermore, only valence/sentiment was measured, and not arousal or discrete emotions. Thus, we believe a gap exists in understanding the typical emotional meanings of symbols in paintings.

Figure 2. Examples of paintings from the robot art competition: (a) Cezanne's Houses at L'Estaque by CloudPainter; (b) Red Flowers (Floral no. 1) by Joanne Hastie; (c) Man by PIX 18 at Columbia University; (d) Full Bloom of Sakura by CMIT ReART at Kasetsart University; (e) Scribbles by CARP at Worcester Polytechnic Institute; (f) WWF by JACKbDU at New York University Shanghai; (g) Perlin Noise Field by Late Night Projects; and (h) Homage To Jackson Pollock by e-David at University of Konstanz.

Figure 4. iPad sketches by an artist, illustrating some possibilities for expressing emotions visually. All images © 2017 Dan Koon. Used here with permission.

Figure 5. Heuristics for conveying basic emotions through abstract painting.

Figure 6. Examples of (a) abstract paintings by our robot, compared with (b) paintings by an artist. In each, the top left painting represents the angry quadrant, the top right represents the joyful quadrant, the lower right represents the relaxed quadrant, and the lower left represents the sad quadrant.

Figure 7. Examples of symbolic paintings by our robot. (a) Expressing basic relaxation. (b) Mixed emotions: seeking to express anger (face on left), fear (face on right), sadness (via blue color), and overall negative valence (descending mood lines). (c) A progression from miserable on the left to joyful on the right, using highly abstracted faces.

Table 1. Proposal for a basic starting scenario for art therapy robot.
1 Emotions from painting or C1.2 emotions directly from a person. C1.3 Speech, when person wants to end the interaction, asks a question, or answers. C1.4 Problems.

Table 2. Design for art therapy robot: relating requirements to capabilities.