Machine Learning Methods for Inferring Interaction Design Patterns from Textual Requirements

Abstract: Ambient intelligence is one of the most exciting fields of application for pervasive, wireless, and embedded computing. However, the design and implementation of real-world systems must be conducted using software engineering approaches. Some types of environments (hospitals, older adults' homes, emergency scenarios, etc.) are particularly critical, especially in terms of expressing requirements, verifying and validating them, and ensuring functional correctness. To provide adequate ambient intelligence solutions, it is necessary to place special emphasis on obtaining, specifying, and documenting software requirements. To address this issue, our paper presents a model that integrates both requirements and design patterns. This is done through a natural language processing application in conjunction with other artificial intelligence algorithms. This work aims to support designers when analyzing text requirements and making design decisions. Our results were evaluated according to the cross-validated accuracy of predicting design patterns. The results obtained indicate that this approach could lead to good recommendations of design patterns, as it demonstrated an acceptable classification performance over the balanced dataset of requirements instances.


Introduction
Ambient intelligence (AmI) is the integration of ubiquitous computing, ubiquitous communications, and the application of user interfaces (UIs). The objective of AmI is to design and implement new systems that provide intelligent, personalized, and connected services [1]. One of the main components of an ambient intelligence platform is the way it relates to humans, and vice versa. To achieve this relationship, AmI can aim for this interaction to be performed in a non-intrusive way, minimizing explicit interaction. Thus, in situations in which explicit interactions are required, there is a need to design a set of physical interactions that include environment and communication, among other characteristics. For the set of interactions to be designed, we need to apply interaction design techniques.
Interaction in AmI has followed the user-centered design approach. The aim of this approach is not only to design products or user experiences, but also to understand how the product relates to the motivations of the final user, its use, its context, and user needs. However, some types of environments (hospitals, older adults' homes, emergency scenarios, etc.) are particularly critical, especially with regard to expressing software requirements, verifying and validating them, and ensuring functional correctness. This is due to the different behaviors and activities that are executed in these types of scenarios. To meet this challenge, and to understand and document the needs of the various scenarios, AmI designers have to collaborate with experts from other disciplines.
To provide adequate AmI solutions, it is necessary to place special emphasis on obtaining, specifying, and documenting software requirements. These requirements are based on normative, social, and technical aspects and must be transferred into functional requirements that can be used for system development. According to Coronato et al. [2,3], the specification of requirements allows designers to detect and eliminate faults from the beginning of the design process.
Within the software design process, some problems are recurrently found, for which it is valid to reuse solutions [4]. Much of the current related work on patterns stems from the work of Alexander et al. [5] covering the design and layout of buildings, towns, and communities. In software engineering and human-computer interaction, patterns do not just describe recurring situations and their solutions, but they describe a method for presenting the situation and solutions in a structured way [6].
Recurring situations are similar across many AmI application development projects, for which using design patterns based on requirements seems to be a suitable solution. However, the literature about integrating both requirements and design patterns is scarce. Thus, their relation in practical contexts has not been sufficiently highlighted. This is due to questions that arise regarding the application of design patterns, such as: How do you select the required design patterns? How do you evaluate the process of using design patterns? The creativity and skills involved in designing UIs can be subjective and error-prone.
In this paper, a field-tested approach that combines both concepts is proposed. We present a model (IDPatternM) for inferring patterns of interaction design from the text processing of the requirements. This will allow designers to save time when analyzing text requirements and support design decisions. The proposal consists of the design of a recommendation system based on the IDPatternM model through the application of artificial intelligence (AI) algorithms.
In Section 2, related work is discussed to lay the foundations of the proposed model (IDPatternM). In Section 3, the process of searching for suitable design patterns for specific requirements is explained in detail, which is our main contribution. Section 4 presents the methodology that was followed in the development of the interaction patterns recommendation module. Section 5 describes the evaluation of the prediction of suitable design patterns for specific requirements. Finally, Section 6 concludes with a summary and a list of future work.

Literature Review
In recent years, some work has been published on the integration of requirements and design patterns. We briefly present a selection of these works.
Given the heterogeneity of current interactive systems, approaches such as model-based user interface development (MBUID) help reduce the gap between requirements and implementation. The above is achieved through the definition of models related to the UI that are captured and updated throughout the development life cycle. One of these models is the task model, which is useful when designing and developing interactive systems as it describes the logical activities that must be carried out to achieve the objectives of users [7]. However, the designer must enter specific metadata to relate to the information contained in the model-based system.
Several model-based tools have been developed to present and test a possible solution [8-11]. First, PD-MBUI [9] (UI based on patterns and UI based on models) proposes to integrate the patterns in the design process. The starting point of this approach is a clear description of the users' requirements, which includes answers to questions such as: Who are the users? What are their tasks? In what environment will the system be used? The answers to these questions are essential to building models that reflect the users and their real needs in any scenario. However, they approach the problem by defining a structure for the design patterns and generalizing it through the generation of an ontology, from which information is inferred based on an input defined by the designer.
Other approaches such as [8,10,11] consider the identification of patterns based on an ontology with the same structure of the design pattern and its design patterns catalog. Despite this, some approaches specifically address the natural language processing (NLP) of requirements. One of these is that of Navarro-Almanza et al. [12], who propose a model using deep learning techniques to represent text data in low-dimensional vector space and classify requirements in 12 different categories, which is lower than the NLP approaches to requirement classification. Moreover, design patterns are not addressed in this proposal.
There are other attempts at the automatic classification of software requirements using the PROMISE corpus, such as [13-15]. In general, these approaches rely on NLP and information-extraction techniques to define features to which machine learning techniques are then applied. Some approaches are based on word frequencies and probabilistic methods, such as [16,17]. However, the requirements processing is not related to design patterns or to supporting designers of interactive systems.
Gaffar et al. [18] propose a "disseminator pattern" that can be a person or a software agent and that interacts with the authors of the design patterns to support them in a semi-structured generation of the designs, in such a way that they become software tools that help the designers find, filter, and combine the patterns in a new design. The work of Gaffar et al. is based on an underlying XML database that, in turn, feeds back to it. Furthermore, it requires a good structuring of the data of the patterns and queries that represent most user needs when consulting design patterns. However, the requirements are not considered.
On the other hand, to support design decision making, which is subjective and error-prone, the approach presented in this paper consists of the construction of a recommendation system based on a requirements classification model to support designers in the process of designing interactive systems, through the inference of the related patterns in a set of requirements data. Instead of relying on manually crafted rules, text classification with machine learning learns to make classifications based on past observations. While we show prediction accuracy in the context of UI design, it is possible to extend our framework through a greater variety of requirements and design patterns. Therefore, it is possible to apply it in other application scenarios.
The works identified and described in the previous paragraphs are important but isolated efforts. To be able to specify the selection of patterns of interaction design, it is necessary to analyze what is needed and how to design an appropriate interactive system, for which the design patterns derived from requirements are useful to bridge gaps in the early phase of system development, where recurring requirements call for similar solutions. In the following section, a proposal for a model of interaction design pattern recommendation (IDPatternM) is described.

IDPatternM: A Model of Interaction Pattern Inference Based on Software Requirements
IDPatternM is a model for the identification of design patterns based on requirements and is defined in this work (see Figure 1). The IDPatternM model arises from a case study [19] in which a series of metadata are considered in the design decision making by interactive system designers. As a point of entry to the model, a requirement is needed in text, which is entered into the text editor module (see Figure 1a). Here, a verification is carried out regarding the integrity of the metadata (e.g., the user, action, objects) and the grammatical structure for text processing.
Considering the advantages of NLP and formal representations, the proposal includes text verification based on a semantic parsing task [20]. For instance, given the requirement "A user shall be able to enter information on an incident, including location, description, and time period", we found that "user", "information on an incident", "location, description and time period", and "to enter" were specified as properties. The structure of these properties is similar to sentences that are grammatically adequate. Once the requirement structure has been validated, this is classified into a type of requirement: functional or non-functional. The requirement classification model module (see Figure 1b) consists of a trained semi-supervised learning algorithm with a set of requirements data associated with a type of requirement (see Figure 1c), namely, the software requirements dataset. The functional requirements are the declarations of the services provided by the system, the way in which the system must react to particular inputs. An interactive system also involves non-functional requirements that describe quality criteria. However, this work focuses on functional requirements only, which specify what a system can do through interaction objects. The requirement classification model module generates, as an output, a design pattern prediction. In this case, the example prediction is the "structured format" pattern, which is established as the following problem: "The user needs to enter data quickly in the system, but the format of the data must adhere to a predefined structure". Furthermore, the model will be fed back with the prediction.
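As a rough illustration of the structure check described above, the following minimal sketch extracts the actor, action, object, and details from a requirement. It is only an approximation under stated assumptions: it uses a regular expression rather than the semantic parsing task [20] used in the model, and the sentence template, the function name `parse_requirement`, and the property names are hypothetical.

```python
import re

# Hypothetical template (an assumption, not the model's actual grammar):
# "<A|An|The> <actor> shall be able to <action> <object>[, including <details>]"
REQUIREMENT_PATTERN = re.compile(
    r"^(?:A|An|The)\s+(?P<actor>[\w\s]+?)\s+shall\s+be\s+able\s+to\s+"
    r"(?P<action>\w+)\s+(?P<object>[^,]+?)"
    r"(?:,\s*including\s+(?P<details>.+?))?\.?$",
    re.IGNORECASE,
)


def parse_requirement(text):
    """Return the properties found in a requirement, or None if it does not fit the template."""
    match = REQUIREMENT_PATTERN.match(text.strip())
    if match is None:
        return None
    # Optional groups that did not participate in the match are reported as None.
    return {name: (value.strip() if value else None)
            for name, value in match.groupdict().items()}


requirement = ("A user shall be able to enter information on an incident, "
               "including location, description, and time period")
properties = parse_requirement(requirement)
```

A requirement that does not fit the template yields `None`, which a text editor module could use to ask the designer to rephrase it.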
The interaction design pattern recommender model (see Figure 1d) was generated with a design-level requirements dataset (see Figure 1e) on which AI algorithms were used for text classification. The following section describes each module of the IDPatternM model.

Methods
For the inference of interaction design patterns from the requirements text, a requirements dataset was established, for which texts were collected from various sources such as the PROMISE corpus dataset [21], which is the most popular dataset of software requirements. These requirements describe the actors, objects, and resources of the system over which they act, as well as the operations carried out by a user (transitive or non-transitive). Subsequently, these requirements were classified based on the design patterns of the Toxboe collection [22] (see Table 1).

Table 1. Design patterns and the problems they address.

Design pattern | Problem
Table Filter | The user needs to categorically filter the data shown in the tables by columns.
Dashboard | The user wants to digest data from multiple sources at a glance.
Sort By Column | The user needs to be able to sort the data in a table according to the values of a column.
Morphing Controls | The user wants to only be presented with controls available in the current mode.
Search Filters | The user needs to conduct a search using contextual filters that narrow the search results.
Structured Format | The user needs to quickly enter data into the system, but the format of the data must adhere to a predefined structure.
Notifications | The user wants to be informed about important updates and messages.
Forgiving Format | The user needs to quickly enter data into the system, which then in turn interprets the user's input.
Input Prompt | The user needs to enter data into the system.

Experiments
A model of textual requirements and design pattern classification (IDPatternM) was trained with the multinomial naive Bayes algorithm. The multinomial distribution normally requires integer counts. However, fractional counts such as TF-IDF (term frequency-inverse document frequency) weights are also possible for the processing of requirements. This method [23,24] computes the product of term frequency (TF) and inverse document frequency (IDF). This weighting is commonly used in text extraction and information retrieval to evaluate the importance of a linguistic term (usually a unigram or bigram) in a dataset. The importance (weight) of a term increases with its frequency in the text, but it is offset by the frequency of the term across the domain of interest (e.g., the weight of frequent words like "the" or "for" will be reduced).
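The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the plain formulation weight = TF × log(N / DF) with whitespace tokenization; libraries usually apply smoothed variants, and the three example requirements are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dictionary per document using TF x log(N / DF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical requirements, not taken from the dataset used in this work.
requirements = [
    "the user shall filter the table by column",
    "the user shall sort the table by column",
    "the user shall receive notifications",
]
vectors = tf_idf(requirements)
```

In this toy corpus, "the" occurs in every requirement and therefore receives a weight of zero, whereas "filter", which occurs in a single requirement, receives a higher weight than "table", which occurs in two.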
To train a supervised classifier, the first step is to transform the requirement text into a vector of numbers. Subsequently, the vector representations were explored as TF-IDF weighted vectors. Once these representations of text vectors were generated, it was possible to train supervised classifiers with requirements and predict the design pattern in which they will be classified. After all of the previous data transformation, all features and labels (design pattern collection) were used to train the classifiers. There are several algorithms to perform training for this type of problem. The modality used was 75-25 (that is, 75% of the data from each participant was used for training and 25% for validation).
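The training step can be illustrated with a small, self-contained multinomial naive Bayes classifier. This is a sketch under stated assumptions rather than the implementation used in this work: it uses raw term counts with Laplace smoothing (TF-IDF weights could be substituted for the counts), the class and function names are hypothetical, and the labeled requirements are invented for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def train_test_split(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of the items and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


class MultinomialNaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.term_counts[label].update(text.lower().split())
        self.vocabulary = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_label in self.class_counts.items():
            counts = self.term_counts[label]
            denominator = sum(counts.values()) + self.alpha * len(self.vocabulary)
            score = math.log(n_label / n_docs)  # log prior
            for term in text.lower().split():
                if term in self.vocabulary:  # terms never seen in training are ignored
                    score += math.log((counts[term] + self.alpha) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical labeled requirements, not the dataset used in this work.
DATA = [
    ("user shall filter table rows by category", "Table Filter"),
    ("user shall filter the table by column value", "Table Filter"),
    ("user shall sort the table by column", "Sort By Column"),
    ("user shall sort results by date column", "Sort By Column"),
    ("user shall receive notifications about updates", "Notifications"),
    ("system shall notify user about new messages", "Notifications"),
    ("user shall enter a date in a fixed format", "Structured Format"),
    ("user shall enter phone number in a predefined format", "Structured Format"),
]

train, validation = train_test_split(DATA)  # 75-25 split
model = MultinomialNaiveBayes().fit([t for t, _ in train], [l for _, l in train])
correct = sum(model.predict(text) == label for text, label in validation)
validation_accuracy = correct / len(validation)
```

With so few examples the validation accuracy is not meaningful; the sketch only shows how the split, the training step, and the prediction step fit together.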
Once the requirements text classification model was trained, it was compared with other learning models using cross-validation, based on accuracy and the source of any potential problems. A comparative evaluation was carried out for the following four classification models: logistic regression, multinomial naive Bayes, linear support vector machine (LinearSVM), and random forests.
These four classification models were validated using k-fold cross-validation, a basic approach that divides the training set into k smaller sets, each of which serves once as the validation set while the remaining sets are used for training. In this case, k was set to 5. For each experiment, accuracy, precision, and recall are reported as classification performance measures.
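The k-fold procedure with k = 5 can be sketched as follows. This is a minimal illustration with hypothetical names: the folds are taken in order without shuffling or stratification, and a majority-class baseline stands in for the four classifiers compared in this work.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


def majority_class_fit(train_x, train_y):
    """Stand-in model (an assumption): always predicts the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def cross_val_accuracy(fit, xs, ys, k=5):
    """Train on k-1 folds, validate on the held-out fold, and return one accuracy per fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        hits = sum(predict(xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Hypothetical labels: eight requirements of class "A" followed by two of class "B".
xs = [f"requirement {i}" for i in range(10)]
ys = ["A"] * 8 + ["B"] * 2
scores = cross_val_accuracy(majority_class_fit, xs, ys, k=5)
mean_accuracy = sum(scores) / len(scores)
```

Because the folds here are taken in order, the last fold contains only class "B" and the baseline fails on it entirely, which illustrates why shuffled or stratified folds are usually preferred when classes are unevenly distributed.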

Results
From the TF-IDF score, the terms that are most correlated with each of the design patterns were found using the chi-square test for feature selection. This test is used for categorical features in a dataset. The chi-square statistic between each requirement and the design pattern was calculated, and the unigrams and bigrams with the best chi-square scores were subsequently displayed (see Table 2). Likewise, the test determines whether the association between two categorical variables in the sample reflects their real association in the population. These unigrams and bigrams demonstrate the congruence between the problem definition of each design pattern and the text of each functional requirement. For this, it is necessary to have a reliable dataset, that is, one with classifications of functional requirements validated by expert interactive system designers.
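The term selection step can be illustrated with a chi-square score computed from a 2 × 2 contingency table (term present or absent versus requirement inside or outside the pattern). This is a sketch under assumptions: it works on term presence rather than on TF-IDF weights, the function names are hypothetical, and the four labeled requirements are invented for illustration.

```python
def ngrams(text):
    """Return the set of unigrams and bigrams in a whitespace-tokenized text."""
    tokens = text.lower().split()
    bigrams = {" ".join(pair) for pair in zip(tokens, tokens[1:])}
    return set(tokens) | bigrams


def chi_square(term, target_label, documents, labels):
    """Chi-square statistic of the 2x2 table (term presence x class membership)."""
    a = b = c = d = 0
    for text, label in zip(documents, labels):
        has_term = term in ngrams(text)
        in_class = label == target_label
        if has_term and in_class:
            a += 1  # term present, target class
        elif has_term:
            b += 1  # term present, other class
        elif in_class:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denominator == 0 else n * (a * d - b * c) ** 2 / denominator


def top_terms(target_label, documents, labels, n=4):
    """Return the n unigrams/bigrams with the highest chi-square score for a class."""
    candidates = set().union(*(ngrams(text) for text in documents))
    ranked = sorted(
        candidates,
        key=lambda t: (-chi_square(t, target_label, documents, labels), t),
    )
    return ranked[:n]


# Hypothetical labeled requirements, not the dataset used in this work.
documents = [
    "user shall sort table by column",
    "user shall sort results by column",
    "user shall filter table by category",
    "user shall receive notifications",
]
labels = ["Sort By Column", "Sort By Column", "Table Filter", "Notifications"]
best = top_terms("Sort By Column", documents, labels)
```

Terms that occur in every requirement, such as "user", score zero because their presence carries no information about the pattern, whereas terms that occur only in requirements of one pattern score highest.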
After the data transformation process mentioned above, the classifiers were trained based on the requirements and type of design pattern. These were evaluated using the logistic regression, multinomial naive Bayes, linear support vector machine (LinearSVM), and random forests algorithms. As a result, the LinearSVM and multinomial naive Bayes algorithms performed better than the other two classifiers. LinearSVM presents a slight advantage, with a median accuracy of around 38% (see Table 3). From this result, the confusion matrix for the LinearSVM model was generated (see Table 4), in which the discrepancies between the predicted and real labels can be observed. The vast majority of predictions fall on the diagonal (predicted equal to real). However, a series of erroneous classifications are shown, which correspond to requirements that affect more than one design pattern.

Table 4. Confusion matrix of the LinearSVM algorithm in the requirements classification (predicted vs. actual classes: TF, DB, SByC, MC, SFilters, SFormat, Notif, FFormat, IPrompt).

Finally, a classification report with the main classification metrics by class (type of design pattern) was obtained (see Table 5). The classification metrics provide deeper insight into the behavior of the classifier than the overall accuracy, which can mask weaknesses in individual classes of a multi-class problem. The classification report shows that 59% of the data was predicted accurately. The "dashboard" (100%) and "notification" (80%) design patterns were predicted best. However, the number of samples for each one was low.
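To make the relation between the confusion matrix and the per-class metrics concrete, the following minimal sketch derives precision and recall for each class from pairs of actual and predicted labels. The function name and the label sequences are hypothetical and do not reproduce the values of Tables 4 and 5.

```python
from collections import Counter


def per_class_report(actual, predicted):
    """Return precision, recall, and support for every class."""
    # Confusion counts: (actual label, predicted label) -> number of requirements.
    confusion = Counter(zip(actual, predicted))
    report = {}
    for label in sorted(set(actual) | set(predicted)):
        true_positive = confusion[(label, label)]
        predicted_as = sum(n for (a, p), n in confusion.items() if p == label)
        support = sum(n for (a, p), n in confusion.items() if a == label)
        report[label] = {
            "precision": true_positive / predicted_as if predicted_as else 0.0,
            "recall": true_positive / support if support else 0.0,
            "support": support,
        }
    return report


# Hypothetical labels using the class abbreviations of Table 4.
actual = ["DB", "DB", "Notif", "Notif", "IPrompt", "IPrompt", "IPrompt", "IPrompt"]
predicted = ["DB", "DB", "Notif", "IPrompt", "IPrompt", "IPrompt", "IPrompt", "Notif"]

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
report = per_class_report(actual, predicted)
```

In this toy example the overall accuracy is 0.75, yet the per-class view shows that "DB" is classified perfectly while "Notif" reaches only 0.5 precision and recall, which is the kind of weakness that an overall figure can hide.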

Conclusions
This research work describes a model for the recommendation of design patterns based on requirements (IDPatternM) for developing interactive systems in AmI. Because some situations require explicit interactions, it is necessary to design a set of physical interactions that include the environment and communication, among other characteristics. The use of design patterns supports compliance with the principles of usability and user experience, thus promoting compliance with standards and the use of best practices. However, it is a challenge to find the right pattern when the requirement is ambiguous or the context of use is not specified. To address these problems, this work proposes IDPatternM, which first helps formulate grammatically correct requirements so that they can be processed and related to design patterns. The requirements then have to be translated and classified according to their functionality in order to associate them with the corresponding pattern, according to a knowledge base.
For the evaluation of the IDPatternM module, a study was carried out with four learning models, in which their accuracy and the source of any potential problems were examined. Although the results of the experiments determined that the linear support vector machine method is the most appropriate, a challenge in this work is to determine the knowledge base because, for the moment, it is formed based on the literature and not on the opinions of experts in the field.
The scope of the application developed is a collection of patterns that cover the design of UIs. Even so, the IDPatternM model is more ambitious in the sense of having a knowledge base and considering different collections of patterns to cover the requirements of any type of system. As future work, we propose to evaluate the recommendation module with expert designers who work with design patterns in their daily practice.