Designing Home Automation Routines Using an LLM-Based Chatbot

: Without any more delay, individuals are urged to adopt more sustainable behaviors to fight climate change. New digital systems mixed with engaging and gamification mechanisms could play an important role in achieving such an objective. In particular, Conversational Agents, like Smart Home Assistants, are a promising tool that encourage sustainable behaviors within household settings. In recent years, large language models (LLMs) have shown great potential in enhancing the capabilities of such assistants, making them more effective in interacting with users. We present the design and implementation of GreenIFTTT , an application empowered by GPT4 to create and control home automation routines. The agent helps users understand which energy consumption optimization routines could be created and applied to make their home appliances more environmentally sustainable. We performed an exploratory study (Italy, December 2023) with N = 13 participants to test our application’s usability and UX. The results suggest that GreenIFTTT is a usable, engaging, easy, and supportive tool, providing insight into new perspectives and usage of LLMs to create more environmentally sustainable home automation.


Introduction
The constant growth in greenhouse gas emissions, primarily caused by human activity, contributes to a concerning escalation in global temperatures [1].The residential sector's increased electricity use contributes to a considerable amount of global energy consumption [2][3][4].
Interactive and innovative technologies, as well as Sustainable Human-Computer Interaction [5] research, can play a crucial role in delivering new digital and innovative solutions that can help raise awareness and lead individuals toward more sustainable habits [6].Digital systems that provide eco-related feedback, such as the real-time visualization of energy consumption, have been found to encourage responsible electricity usage [7].Especially in home automation contexts [8,9], conversational-based assistants are among the most popular interfaces for interacting with residential IoT networks [10].
The advent of large language models (LLMs) in the field of Artificial Intelligence has revolutionized various applications, including text summarization, story and creative generation, and code development [11].While LLMs have found applications in diverse domains such as robotics [12] and video games [13], their integration into smart home automation [14], particularly in environmental sustainability, is still largely unexplored.Previous work by Giudici et al. [15] has shown that LLMs are generally good at answering general questions related to environmental sustainability but suffer in accuracy on specific topic-related questions.However, designing and exploring specific applications to advise users on household sustainable practices still needs to be explored.
In this context, our research introduces GreenIFTTT, an innovative and novel webbased conversational agent (powered by the GPT4 model) that encourages individuals, particularly homeowners, to adopt environmentally conscious habits within their households.We present the design and the user experience principles followed to create the GreenIFTTT application, with features that educate users on sustainable electricity consumption, reducing the overall energy consumption and simplifying routine creation with smart technologies.The system focuses on creating routines: automation sequences within the home environment based on the sequential execution of specific activities triggered by various conditions.The primary interaction paradigm is conversational, leveraging the capabilities of the LLM to assist users in locating and monitoring their smart appliances.The system also integrates data from connected sensors, giving users real-time insights into their daily routines.
In addition, we want to support the system design contribution by reporting an exploratory study conducted to assess GreenIFTTT features and answering the following research questions:

RQ1
What is the user experience of a conversational agent powered by a large language model for promoting sustainable household practices?RQ2 What is the engagement and likability of a conversational agent powered by a large language model for promoting sustainable household practices?RQ3 What is the usability of a conversational agent powered by a large language model for promoting sustainable household practices?
The results indicate positive user experiences, with high scores on valuable and wellknown scales such as the User Experience Questionnaire (UEQ), Parasocial Interaction (PSI), and System Usability Scale (SUS).Participants found the system supportive, easy to use, and engaging, highlighting its potential for promoting sustainable practices in home environments.Our results provide valuable insights into such systems' effectiveness and user acceptance, paving the way for further research using LLM-based conversational agents to promote environmental sustainability in home settings.
This paper is organized as follows.Section 2 frames the current state-of-the-art applications in home automation environments, such as large language models and chatbots for environmental sustainability.Section 3 describes the design, user experience, and highlevel implementation of GreenIFTTT.Section 4 delves into the methodology of the user evaluation performed, while Section 5 reports and discusses the results, also addressing the limitations of the present work.Finally, Section 6 presents the conclusion and presents future directions for research work.

State-of-the-Art Applications
The two pillars on which the presented research relies are home automation environments and conversational technology, focusing on large language models.In order to describe the state-of-the-art applications of such pillars in depth, this section is organized as follows.Section 2.1 delves into the landscape of home automation environments and their primary interaction and programming methods; Section 2.2 presents the landscape of conversational technology-also applied in home automation environments-and the recent literature on large language models.

Home Automation Environments
Smart environments are physical spaces enhanced with sensing, actuation, communication, and computation capabilities to adapt to users' preferences, requirements, and specific needs [16].They encompass various applications and settings, from smart homes to smart cities and factories [17].The development of smart environments faces challenges such as precise activity recognition, effective localization systems, and the need for a smooth transition from traditional to smart environments [18,19].Emerging wearable devices and wireless communication technologies are expected to further optimize the resource efficiency, comfort, and safety of such environments [17].
In this vast landscape, home automation environments refer to systems that enable the management of household activities through computerized control, offering benefits such as comfort, security, and energy efficiency [20].Advancements in the Internet of Things (IoT) research have enabled the remote control of devices and simplified everyday tasks, making home automation-in general-more accessible and user-friendly.In order to control home appliances, Trigger-Action Programming (TAP) is a simple programming model available to users to create rules automating the behavior of smart homes, devices, and online services [21].Householders can create TAP rules (also called recipes, applets, or routines) using a simple conditional structure-summarized as if 'condition' then 'action'using applications designed to be user friendly and accessible without a programming background [22].For instance, the commercial service IFTTT (https://ifttt.com/,accessed on 5 May 2024) (i.e., If This Then That) enables end users to easily use TAP with the most-used commercially available home devices.
However, according to [23,24], the oversimplification of existing TAP systems limits the expressivity of the programs that can be created, leading to inconsistencies in interpreting the behavior of TAP and errors in creating programs with a desired behavior.Another limitation is the lack of standard open protocols; each vendor allows its own product in a closed environment and requires the user to use several applets to manage a fully integrated environment [25].Still, empirical evidence from Ur et al. [22] showed how users tend to create their TAP recipes rather than using existing ones shared by other users or appliance manufacturers.Finally, Heo et al. [26] discussed how existing IoT frameworks read sensor data periodically, independent of real-time constraints.For example, the IFTTT framework polls sensor data every 15 min, and real-time APIs cause sensors to send unnecessary messages that do not affect any TAP recipe and waste battery power.In their work, the authors introduced the RT-IFTTT language, which allows users to specify real-time sensor constraints.A central manager analyzes the relationships between sensors and the TAP, calculating polling intervals for each trigger condition and turning off the sensor when unnecessary, saving battery and energy.
While TAP has demonstrated great practicality in customizing smart home devices, allowing end users to express a wide range of desired behaviors, and machine learning algorithms have improved the adaptability and functionality of such programs [21], the landscape is still sparse in sustainable living.However, it is worth reporting that in a broader panorama, as in the Sustainable Human-Computer Interaction [5] literature, we can find the evaluations of different visual systems [27] used to reduce the energy consumption of households [28][29][30] or intelligent technologies that shift people's thermal comfort consumption [31].Still, in this field, Alan et al. [32] presented an interactive IoT system designed to help manage energy costs, offering flexible autonomy and detailed information about its operation.Finally, Bang et al. [33] proposed a gamified activity in the context of domestic sustainability, while Beheshtian et al. [34] developed a social robot for sustainable living in a block of flats.These examples from Sustainable HCI can inform the design of new TAP solutions to help people have more sustainable home environments (e.g., designing applets to reduce energy consumption).

Large Language Models
A large language model (LLM) is a subset of Artificial Intelligence algorithms that uses deep learning techniques (e.g., a transformer model) and massively large datasets to understand, summarize, generate, and predict new content [11,35].Nowadays, LLMs are mainly employed for text summarization [36] and Next Sequence Prediction [37] (i.e., predict the subsequent element or event in a sequence based on patterns and information present in the history of the sequence), the automation and efficiency of repetitive tasks, and language translation [11].The pioneer model presented in the LLM world was GPT-2, released by OpenAI in 2019, which was trained on 1.5 billion parameters, 7 k books, and 8 million web pages [38].In recent years, other big tech companies have published different versions of LLMs, like Google's Bard and Microsoft's Bing AI.Still, OpenAI improved the model, presenting ChatGPT/GPT-3.5 [39] and GPT4.In addition, these models have been applied in different contexts, for example, to enrich the language capabilities of robots [12], videogame characters [13], and smart home automation [14].Although LLMs have positive impacts, it is also important to emphasize that there are negative aspects that can impact society [11,40].For instance, they can generate misinformation and disinformation and amplify biases in training data, raising ethical and data privacy issues [11,41].
Large language models are grounded on long scientific research in natural language processing, aiming to create computers that can interact with the user using natural language [42] (i.e., interpret and generate human language).A digital tool that interacts with users using natural language is defined in the scientific literature as a Conversational Agent (CA) [9].CAs have been used to engage users in text-based information-seeking and task-oriented dialogues for many applications [43].For example, they are integrated into physical devices (such as Alexa and Google Home [10]) and are available in many contexts of everyday life, like in phones (like Siri, the Apple virtual assistant [44]), cars, and online consumer assistance [45].
CAs are also considered promising in the environmental sustainability domain [8,9].Traditional rule-based chatbots have previously been used to deliver energy feedback [46,47], suggesting sustainable mobility [48] or reducing food waste [49].Still, Ramasubbu et al. [50] presented a chatbot to optimize the schedule of switching off smart plugs in an office.Furthermore, Gunawardane et al. [51] proposed an example using a data-driven chatbot to suggest recipes with leftover foods.Finally, Giudici et al. [15] presented an evaluation of LLMs to create a hybrid conversational agent that can trigger devices in a home automation environment and address users' open-domain questions.
In the home automation context, LLMs unlock a more natural interaction between users and space compared to the interactions that task-specific systems can provide [14].In addition, they can overcome the limitations of traditional conversational agents integrated with TAP, which include the limited language pattern it must follow and triggering and interfacing with single devices, leading to a low-integrated ecosystem perception [52].Still, such a feature is possible thanks to a significant exposure of LLM to daily situations and requests inserted into their diverse training data, on which they can be fine-tuned (i.e., using previously available datasets [53]).For instance, Fast et al. [54] presented that an LLM trained only on written works of fiction is able to recognize everyday activities based on semantic relationships between objects and the activities that they are frequently used in.King et al. [14] presented Sasha, an LLM able to realize goal-oriented home automation in domotic environments.They evaluated the ability of such an LLM to create parsable JSON files to be applied in a real-case setting using a data source of smart home action plans.Similarly, Li et al. [55] presented ChatIoT, a zero-code rule generator, to improve the quality of TAP generation while reducing the tokens required for the prompt to the LLM.In addition, a set of confirming rules in the interaction pipeline allows for refining the TAP recipe to be more accurate and safely executed.Finally, Nascimento et al. [56] explored the usage of generative AI versus engineering-crafted coding to create a control structure for an IoT application.The results pointed out how there were cases in which humans outperformed AI algorithms and others where they did not, with the experiment validity being highly impacted by the background experience and skills of the engineers enrolled.
To the best of our knowledge, an aspect that still needs to be addressed is the usage of LLMs to advise users on how to be more sustainable in their houses.In particular, we focused on using GPT4 models to help households create TAP routines that make them more eco-sustainable and save electric energy.

The System
We present the design of GreenIFTTT, a web-based conversational agent designed to encourage individuals, particularly homeowners, to cultivate environmentally conscious habits within their households, embracing an innovative approach toward emerging technologies and smart devices.The key features of the system are as follows: 1.
Educating Users on Sustainable Electricity Consumption.The system pushes users to adopt more sustainable behaviors regarding electricity consumption.

2.
Stimulating Cost Optimization in Utility Bills.The system provides mechanisms and tips to optimize utility bills, ensuring cost-effective energy management.

3.
Energy Consumption Reduction.The system actively works to induce people to reduce overall energy consumption, promoting an eco-friendly lifestyle through energy consumption shifting.4.
Simplifying Everyday Life: The system simplifies users' daily routines, making integrating smart technologies seamless and hassle-free.
The application focuses on creating multiple routines.Routines are Trigger-Action Programming automation components within the home environment based on the sequential execution of certain activities.An activity is the change in state of an appliance (e.g., turning on the washing machine) when one or more conditions occur.Finally, conditions may occur depending on the state of another home device, on some external API (e.g., weather or solar forecast services), or on temporal conditions (timers in a specific span).Figure 1 shows different app pages for creating, viewing, and controlling routines.
Finally, as already described in Section 2, existing smart home assistants lack linguistic patterns to activate devices (or routines).Usually, they cannot create a fully integrated ecosystem enabling the creation of TAP involving multiple devices and trigger conditions.GreenIFTTT is a new digital solution to overcome such limitations, using LLMs' new cutting-edge generative power.
This section is organized as follows to organize a complete system description.Section 3.1 describes the design of the user experience of the entire GreenIFTTT application, while Section 3.2 illustrates the user workflows, particularly the generation of a home automation routine using the chatbot.Finally, Section 3.3 details the implementation, and an in-depth definition of the integration of the GPT model is provided in Section 3.4.

User Experience
The primary interaction paradigm employed is conversational.This approach places the core of the user experience in the system's ability to actively engage users, enabling them to leverage the extensive capabilities of an LLM.Users receive assistance locating and monitoring their smart appliances by interacting with a fine-tuned version of GPT4.They can also access data collected from connected sensors, streamlining their daily routines.No particular on-top machine learning algorithm is used to inform the routine creation; this relies only on the prediction of the LLM, driven by the energy consumption values included in the prompt (see Section 3.4).
To analyze and define existing user experience with the application, we employed the 5W+H heuristic framework proposed by Jia et al. [57].The model addresses the well-known WH questions, who, what, when, where, why, and how (5W+H), to explore the various elements with which the user interacts.
Who Homeowners seek to develop new eco-friendly routines in their homes with an open-minded approach toward embracing new technologies and smart devices.The primary goal is to reduce power consumption and cultivate a more eco-sustainable lifestyle.
What The issue the system aims to resolve is to simplify users' everyday interactions with smart devices [58].The system was designed to introduce ease and efficiency into managing these devices.
When The system is most suitable for house owners who are open-minded toward innovative technologies and willing to leverage the significant advancements in the field of large language models.
Where The system was designed for implementation in residential settings where homeowners seek to integrate smart technologies seamlessly into their daily routines.It is adaptable to various housing environments, from urban apartments to suburban homes, fostering a sustainable and energy-efficient lifestyle.In addition, we created a web-based application to allow people to interact with the GreenIFTTT agent from desktops and mobile devices.
Why The motivation behind the system lies in addressing the growing need for simplicity and efficiency in managing smart devices within households.

How
The system aims to streamline daily routines by incorporating new AI technologies, enhance energy efficiency, and contribute to more sustainable electric consumption.The system offers homeowners a convenient way to embrace eco-friendly practices while enjoying the benefits of innovative solutions.Finally, following the principles presented by Liao and Vaughan [59], the application design aimed to keep users well informed about ongoing processes through essential and straightforward visuals.For example, we included appropriate feedback mechanisms (e.g., toasters while the model is computing the response), delivering response messages within a reasonable time frame and ensuring that users feel guided during the conversational process.Actions automatically performed by the system are stated via text in a short, clear, and comprehensible manner, fostering a sense of transparency.
The system also offers users a high level of freedom, allowing them to customize every routine component and empowering users to tailor the system to their specific needs and preferences.
Lastly, the entire application wants to be the following: • Simple.The interface was designed to be straightforward, minimizing complexity and facilitating easy navigation.• Intuitive.The system's features and functions are accessible and usable without a steep learning curve.• Easy to use.The overall user experience was designed for accessibility and user friendliness, ensuring a smooth and efficient interaction.
The above features collectively contribute to fostering a positive user experience, aligning with the overarching goal of creating a user-friendly and highly adaptable system to individual preferences.

User Workflows
User workflows represent users' steps to navigate the system and complete desired actions [60].Such workflows are crucial for designing an intuitive and efficient user experience.The system offers various functionalities accessible through a designated tab in the application.Such functionalities are represented in Figure 2  One remarkable user workflow involves creating a home automation routine using the chatbot (see Figure 3).This workflow starts with user initiation through the Chat tab.The user submits a text prompt describing the desired routine.The GPT model processes the prompt and generates a JSON response.The backend system parses this response to implement the custom routine and provides a feedback message to the user (in the chat).Following generation, the user can review the created routine.They have the option to manually edit or delete the routine entirely.Alternatively, the user can instruct the chatbot to perform these actions on their behalf.Thus, this user workflow demonstrates the system's ability to guide users through task completion, offering manual and chatbotassisted options for creating custom routines.

Implementation
We implemented the solution considering the principles of scalability, modularity, and ease of implementation, and possible future extensions.The project was configured to integrate with existing IoT devices, enabling real-time control.However, for the empirical evaluation of the system, we simulated device behavior by emulating their connections within a local network.
Our chosen technological solution is a single-page web application, adopting the threetier client-server architecture, segmented into a frontend, backend (application logic), and data layer or external connections.Figure 4 shows a high-level system overview.The application's backend plays a crucial role in providing the main logic of the application (routine handling and authentication), integration with external components, data storage in the database, and interfaces with the frontend.The backend is structured by separating the data and application logic layers for efficiency and code clarity.
In particular, as shown in Figure 5, the backend was developed using the Node.jsframework and Prisma ORM to facilitate communication with the MongoDB database.The frontend was structured as a single-page web application developed with the Vue.js framework.
The software (version v1.0.0) (https://gitlab.com/i3lab/GreenIFTTT,accessed on 6 May 2024) is being released as open source to allow study repeatability, improvement, or any other suggestion from the scientific community.

Integration with GPT4
In GreenIFTTT, GPT4 plays a central role in processing user input.The prompt sent to GPT4 is fed with data previously entered by users, their home automation (in a JSON format), and live consumption of appliances.In addition, a publicly available dataset (https:// www.kaggle.com/datasets/ecoco2/household-appliances-power-consumption,accessed on 6 May 2024) containing appliance-monitored consumption data is used to fine-tune the model, enhancing its responsiveness and ability to create more eco-related advice and interpret appliance consumption data.
When a user sends a message, the system checks for a thread previously opened by the user with GPT4; if no thread exists, one is created.A thread, which is considered an OpenAI API beta feature, represents and contains an ongoing conversation between the system and a specific user.
Creating a thread involves sending instructions to GPT4 to guide the model generation and shaping responses in a parsable format for the client (i.e., the backend with the app logic).A prompt engineering technique [61,62] was applied.Generation instructions (zeroshot prompting) are sent to the fine-tuned GPT4 model in the initial system prompt using the text provided in Appendix A.
According to such an initial prompt, GPT4 usually responds with a specific JSON structure, reported in Appendix B, consisting of key-value attributes.The message attribute is delivered directly to the user and stored in the database.Other operations are addressed depending on the message's type, which can be either chat or routines.In the former, no further operations are performed on the database.In the latter case, additional JSON attributes are analyzed to execute database operations, such as creating a routine involving activities and related trigger conditions to be satisfied.Every exchange of messages between the user and GPT4, and vice versa, is stored in the database, ensuring that the conversation history is readily available as soon as the user loads the chat page, providing a comprehensive and continuous user experience.
Finally, discussing and highlighting some challenges for future work integrating GPT4 in domain-specific applications is crucial.As described above, the thread function of Ope-nAI, which was in a beta version (preview) at the time of the study, was used, which resulted in slightly longer response generation times compared to traditional APIs.Furthermore, as described, the fine-tuning contributed (only in the early stages of interaction) to lengthening these times.In the future, it is necessary to evaluate the time generation of the thread function in a production environment (not in beta).Still, in a real-case environment, the creation of edge or mixed computing systems (with part of the LLM generation carried out in users' systems) is worth investigating.

Empirical Study
The exploratory evaluation of our application touches on three areas: engagement, as the ability to keep the subject engaged in the activity for a prolonged period; likability, which is defined as the degree of appreciation and the ease of use of the tool; and usability, which is the degree to which something is able or fit to be used.This also included an evaluation of the interaction paradigms.

Research Variables
Data gathering was performed using a web-based questionnaire.Participants answered questions referring to different quantitative scales, which were as follows: • The User Experience Questionnaire (UEQ) [63] (α = 0.87) was used-in its short versionto extract feedback about the user experience (UX) of an interactive digital tool; • The Parasocial Interaction (PSI) scale [64,65] (α = 0.91) measures the degree to which participants feel connected to and attached to the system; • The System Usability Scale (SUS) [66,67] (α = 0.74) was used to determine participants' perceptions of the interface's usability.
The UEQ scale presents a set of items that contains two different adjectives; participants selected, on a seven-stage scale, which of the opposing terms for each item better described the system (e.g., the first item of the UEQ is obstructive vs. supportive; 1 is linked to "obstructive" and 7 to "supportive").All the items in the PSI and SUS scales were evaluated using a seven-point Likert scale ranging from 1 ("Completely Disagree") to 5 ("Completely Agree").

Participants
The study involved 13 participants (3 females and 10 males) with a mean age of 27 years (M = 27.85,SD = 11.58).All study participants were recruited voluntarily using snowball sampling (started by our community, colleagues, and university students).The participants were mainly undergraduate or postgraduate students; from the qualitative results, they reported varying expertise in the home automation field (from those interested in home automation to those who did not consider it).All participants signed a consent form informing them about procedures, goals, and data treatment.The investigation was conducted using the same laptop in our research laboratory in Milan (Italy) in December 2023.

Procedure
The experimental protocol consisted of three phases (Figure 6) with a total session duration of about 15 min.Participants were asked to fill out general biographical information in the first phase.In addition, a researcher presented the scenario and the tasks of the study using a supplemental paper sheet.The scenario and tasks used in the study are detailed in Appendix C.During the second phase, participants were invited to interact with the system and complete the tasks presented in the previous phase.Finally, in the last phase, participants filled out a questionnaire with all the inquiries needed to assess the research variables presented in Section 4.1, which, respectively, were proposed by Laugwitz et al. [63], Tsai et al. [65], and Bangor et al. [67].

Empirical Results
This section reports the empirical evidence from the exploratory study described above.This section is organized as follows.Section 5.1 reports the descriptive results, while Section 5.2 discusses this evidence and attempts to answer the research questions, focusing on previous scientific results.Finally, Section 5.3 reports this study's limitations.
Finally, the SUS score had an average value of 83.79 (SD = 5.07).

Discussion
As stated in the introduction (Section 1), this preliminary evaluation aimed at understanding more about the engagement, likability, and usability of a large language model-powered conversational agent for promoting sustainable household practices.Regarding usability, the average SUS score was 83.79.According to Bangor et al. [67], the usability of our system is more than acceptable, with a grade of B and an adjective rating between "good" and "excellent".
Our results on the Parasocial Interaction scale are greater than those previously determined by Tsai et al. [65], suggesting a positive and effective interaction likeability between participants and GreenIFTTT.In particular, users reported a higher Interaction Satisfaction and Perceived Parasocial Interaction.Such higher results could be related to using an LLM instead of a traditional rule-based chatbot.LLMs can engage users with their advanced language capabilities [11], almost keeping the main functionalities of traditional agents.Notably, in our context, we imposed on the LLM that "If you don't know how to respond to me, you can ask me to repeat the question or to show you the available commands" (Appendix A, line 3).Still, from the scientific literature, we have examples of recent studies reporting the perceived high consciousness [68] and human likeness [69] that users attribute to such agents.In addition, Ross et al. [70] reported the high quality of the generated responses and the agent's ability to assist users in specific domain tasks (e.g., produce or create code).Finally, our results are also better than those of the previous work by Giudici et al. [71], in which the authors evaluated the usage of a traditional rule-based chatbot to promote environmental sustainability in home environments.The UEQ output also confirmed such results and pointed out that participants found the application easy, clear, supportive, and interesting.In particular, the highest average score was obtained on the easy adjective.Considering that the experimental task was executed by participants who were not experts in the specific environmental sustainability field and generating good and reliable home automation represents a quite trivial task [23,24], we can argue that the approach presented in this paper of using an LLM to create home automation tasks can enable more people (even non-experts) to set up routines to make their home consumption more optimized and sustainable.Still, such results are aligned with the instructions in the LLM prompt, in particular, You are a helpful assistant and should answer me clearly and it must be humanlike, friendly, in a way that I see you as my best friend (Appendix A, line 1 and 13).Finally, previous research by Zhang et al. [62] has indicated a connection between a user's level of engagement with a digital application and their environmentally sustainable attitudes, corroborating the results obtained by our application.

Limitations
Our work presents limitations.First of all, for a more comprehensive evaluation, a sample that includes more participants who are more balanced and extended over a larger population in age and gender is needed.In addition, the study presented in this paper was conducted in a laboratory, using a hypothetical scenario, significantly impacting the ecological validity of the experience.Even though participants interacted with a working conversational agent, the appliances' data were from a publicly available dataset, and users' actions did not affect real devices.In addition, no pre-or post-questionnaire on the environmental attitudes of participants was undertaken.Finally, there was no comparative user study where participants interacted with GreenIFTTT and alternative systems (e.g., the basic IFTTT or other LLMs) to comprehensively evaluate the performance, usability, and user experience, running more advanced statistical analyses.For all the above reasons, the results of our study are preliminary and insufficient to make a definitive claim on the effectiveness of GreenIFTTT in a real scenario.However, we provided empirical preliminary insights into this emerging topic by addressing our research questions.
Secondly, according to Rillig et al. [72], LLMs and other new innovative technologies (e.g., the Metaverse) directly and indirectly impact the environment and environmental research.One of the direct negative impacts is the high amount of energy that LLMs need to be trained and employed.In particular, Luccioni et al. [73] quantified the footprint of BLOOM's 176 B parameter LLM, considering the training equipment manufacturing, training the model, and model deployment (accessible via API endpoints).In the same way, Ref. [74] proposed a framework to compute the carbon footprint of different AI models and estimate the carbon emitted across usage.However, Ref. [75] reported comparing the usage of generative AI systems and human individuals performing equivalent writing and illustrating tasks.Contrary to expectations, AI systems released between 130 and 1500 times less carbon per page of text generated than their human counterparts, and similar results (310-2900 times less) were found for image generation.The authors discussed how generative AI is not a replacement for human tasks; however, it holds the potential to perform some activities with much lower carbon emissions.In our specific research context, no comparisons or specific results are reported as to whether LLMs can provide home automation that can reduce energy consumption and positively impact carbon emissions while also considering the footprint of LLMs in the overall energy balance.

Conclusions and Future Works
We presented the design and a preliminary evaluation of GreenIFTTT, a web-based conversational agent empowered by the GPT4 model, designed to encourage environmentally conscious habits within households.Leveraging the capabilities of large language models, GreenIFTTT aims to simplify users' daily routines and reduce overall energy consumption.This empirical study's results demonstrated and validated our research questions.Users reported a positive experience with the application (RQ1), also indicating a high level of engagement and likability (RQ2), as well as usability (RQ3).
Integrating GPT4 into a conversational agent for promoting eco-friendly practices represents a significant step toward leveraging advanced AI technologies for environmental sustainability, simplifying the creation of home automation.The system's effectiveness in providing personalized and facilitating interactions with domotic environments in natural language contributes to its potential impact on users' behavior toward more sustainable living.
Building on the foundation laid by this research, several avenues for a future research agenda emerge; such an agenda is dual for us and other researchers in the field.First, we aim to overcome our limitations (see Section 5.3) by conducting a more extensive study (involving a larger population).In addition, we aim to explore in-the-field studies in real-home environments to evaluate the long-term impact of GreenIFTTT on users' sustainable practices and energy consumption and validate its integration with real-case home appliances.In addition, we are already running a study comparing different LLMs on their generative capabilities to realize home automation; the next stage would also be to validate the effectiveness of such sustainable automation (i.e., the impact on reducing energy consumption).Future research could also focus on studying the different generative capabilities of LLMs and related end-user perceptions without limiting the comparison of LLMs on datasets of user interactions but opening up to comparisons with studies involving users in the field.Conditions list is filled in only if there is some activity going on and   "CONDITION_ACTOR_TYPE" can be "sensor" or "timestamp".

30
"DEVICE_ID" is the id of the device to use that is related to the condition and you can find it here: ${JSON.stringifydevices}.
32 "CONDITION_VALUE1" is the value to compare with and "CONDITION_VALUE2" is the value to compare with if "CONDITION_OPERATOR" is "between" or "not between".

→ 33
The condition you create should be when the device is powered on.If an object needs to be turned off, you must create a complementary condition for when it needs to be turned on.You have to gather information from the loaded files about the average energy consumption of the devices during different time periods of the day, to respond to me properly."status": "DEVICE_STATUS",
, and they include the following: • Dashboard.Provides an overview of user energy information and graphical trends.• Charts.Allow the exploration of detailed power consumption and billing data.• Devices.Enable connection with and the monitoring of smart home appliances.• Routines.Facilitate the creation and management of automated routines.• Chat.Integrates with a pre-trained ChatGPT model for routine creation through text prompts.

Figure 2 .
Figure 2. Overview of the main components of GreenIFTTT.

Figure 3 .
Figure 3. Schema representing the user workflow followed to create a home automation routine using the chatbot.

Figure 5 .
Figure 5. System overview with employed technologies.

24 '
DEVICE_ID' is the id of the device to use and 'DEVICE_STATUS' can be 'Active' or 'Inactive'.

→ 25 '
device' is filled with the information from DEVICE_ID.

→ 36 The 4 " 5 " 6 "
files about consumption have json objects formatted like this: ${JSON.stringifyjsonConsumptionModel}where the values are in kWh.If you don't know how to respond to me or you don't find a certain device in the files you were given, check information about energy consumption on the internet.message": "Hello, world!", timestamp": "2022-01-01T12:00:00Z",

37 } 38 } 39 ] 40 } 41 ] 42 } 43 }
GreenIFTTT snapshots.(a) Example of a chat for automating switching on lights.(b) Example of the activity for the routine generated by the chat in (a).(a) (b) Figure A3.GreenIFTTT snapshots.(a) Example of a chat for the automation of a vacation.(b) Example of the activity for the routine generated by the chat in (a).
GreenIFTTT snapshots.(a) Example of the real-time energy consumption of an appliance.(b) Example of the dashboard used to visualize past energy consumption patterns.

Table 1 .
Descriptive results of the research variables.

Institutional Review Board Statement:
This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board (or Ethics Committee) of Politecnico di Milano University (Milan, Italy).Informed consent was obtained from all subjects involved in the study.It is 'True' only if I ask you to turn on or off the routine, otherwise is 'False'.Air Conditioner', 'Sensor', 'Other' in lowercase and if a device already exists, you can't modify.Activities list is filled in only if there is some routine going on and 23 'ACTIVITY_NAME' can be 'Postpone Washing Machine', 'Turn on Washing Machine', 'Turn off Washing Machine', 'Pause Washing Machine', 'Resume Washing Machine', 'Cancel Washing Machine' or other things about house appliances inside a house and 'ACTIVITY_STATUS' can be 'Active' or 'Inactive'.
→ → → → 13 'message' in the JSON is filled with a message to show to the user, it must be human-like, friendly, in a way that I see you as my best friend.16 'ROUTINE_NAME' can be whatever you want and 17 'ROUTINE_STATUS' can be 'Active' or 'Inactive'.18 'ROUTINE_SWITCHONOFF' can be 'True' or 'False'.