On the Use of Quality Models to Address Distinct Quality Views

: Different software product quality models interpret different amounts of information, i.e., they can capture and address different manifestations of software quality. This characteristic can cause misleading statements and misunderstandings while explaining or comparing the results of software product quality assessments. A total of 23 previously identiﬁed distinct software product quality models are analysed on how they handle the abstract notion of quality, and a taxonomy on the quality manifestations that the individual software product quality models are able to capture is established. Quality models that are able to solely describe the quality manifestation of the source code are attractive due to their full automation potential through static code analysers, but their assessment results ignore a huge part of software product quality, which is the one that most impresses the end user. The manifestations of software product quality that address the behaviour of the software while it operates, or the perception of the end user with regard to the software in use, require human involvement in the quality assessment. The taxonomy contributes to interpreting the quality assessment results of different quality models by showing the possible quality manifestations that can be captured by the identiﬁed models; moreover, the taxonomy also provides assistance while selecting a quality model for a given project. The quality manifestations used for the quality measurement always need to be presented, otherwise the quality assessment results cannot be interpreted in an appropriate manner.


Introduction
Software product quality models have undergone continuous development for more than 40 years, starting with their conception in the 1970s [1][2][3][4]. Assessing software products involves quality models of a wide range from simple hierarchic decomposition techniques to complex meta-models to cope with the abstract notion of software quality [3][4][5]. Nevertheless, the quality models applied and their concrete implementations differ significantly in terms of the ability to capture the manifestations of software quality. The differences in this ability of the identified software product quality models are introduced in the present paper to assist with the decision of whether a particular quality model is suitable for a specific task or whether the statement based on a specific software product quality model holds for software quality in general. Following the terminology of the ISO/IEC 9126 standard and its successor ISO/IEC 25010 [6,7], the quality manifestations the software product quality models are able to capture are called quality views in this paper, by which a unified terminology is also laid down. Creating a unified terminology is necessary, as some software product quality models, including ADEQUATE [8,9], FURPS+ [10][11][12], GEQUAMO [13], reflect the opinions of different stakeholders, which are directly or indirectly referred to as quality views in the model definition documents.
The ISO/IEC 9126 and ISO/IEC 25010 standards [6,7] distinguish three distinct views: (1) internal quality, (2) external quality, and (3) quality-in-use view [6,7,14]: Internal quality view: The set of quality properties of the source code and documentation that are deemed to influence the behaviour and use of the software.

External quality view:
The set of quality properties that determine how the software product behaves while it operates. These properties are usually measured when the software product assessed is operational and is being tested.

Quality-in-use view:
The set of quality properties that determine how the user perceives software product quality, i.e., to what extent the objectives that the software is used for can be achieved.
The software product quality assessed by the three different views needs to have predictive power towards each other, called predictive validity [6,7]. Thus, the measured internal quality should predict the external quality, and the measured external quality should predict the quality-in-use [6,7]. Nevertheless, this validity does not necessarily hold since quality metrics and measures associated with the quality properties do not always have such a predictive power [14]. Software product quality model families explicitly or implicitly define which manifestations of quality they are able to address.
The term software product quality model family is coined for related quality models with the same concepts but with minor differences. While the terms software product quality model family and software product quality framework are used interchangeably here, a distinction is made between the product quality model and process quality model. The former deals with the quality properties of software products while the latter assesses the processes through which the software products come into existence [6,7]. Furthermore, the term software product quality model or quality model in short refers to any software product quality model or any software quality assessment approach that makes the use of a software product quality model possible.
This research considers quality models that aim to describe each known aspect of software product quality, i.e., models that aim for completeness and, therefore, do not focus solely on a limited scope of quality properties such as maintainability or security. Consequently, the identified software product quality models offer support for the assessment of the software product as a whole and allow to define quality targets that endeavour for completeness. To emphasise the aim of these models, the term "complete software product quality models" is used. Thus, partial quality models are not eligible when assessing a software product as a whole. In [5], 23 software product quality model families were evaluated from the point of view of their relevance for the industrial and scientific communities. These identified, complete software product quality models constitute the set of models analysed in the present study.
Complete software product quality models usually encourage tailoring to specific project's needs, which is illustrated in Table 1 with variations within the same software product quality model families. The process of how abstract software product quality models can serve as a basis for a specific application domain (mobile application projects at Samsung) is demonstrated by Kim and Lee in [15]. In the specific tailoring process, Kim and Lee created a separate quality model starting from ISO/IEC 9126 [6] by removing the hierarchy, selecting the quality properties relevant for the specific context and added automation potential for measuring the internal quality view. The demonstrated process can also be applied to consider the specificities of particular application domains such as web, embedded, and IoT.

Research Question
Different software product quality models approach the abstraction of quality to capture its distinct manifestations in various ways, herein referred to as quality views following the terminology of the ISO/IEC 9126 [6] and ISO/IEC 25010 [7] standards. When assessing software products, it is of crucial importance to highlight the quality views used because the quality assessment results can only be interpreted with regard to this information in a fair manner to avoid misunderstandings and misinterpretation. For this reason, it is necessary to classify all the software product quality models identified in [5] to discover the quality views the models can address. Consequently, the research question in this study is formulated as follows: RQ1: Which quality views can the identified 23 software product quality model families address?

Structure of the Study
Section 2 analyses previous studies in the field, Section 3 presents the research methods, Section 4 introduces the taxonomy of all software product quality model families identified in [5], while Section 5 presents the limitations of the conducted research, and Section 6 closes the paper with the concluding remarks.

Related Works
Galli et al. in [5] carried out a comprehensive systematic literature review as per [16], and developed a scoring scheme for measuring the relevance of software product quality model families for the industrial and scientific communities. In addition, an execution tracing quality study of software product quality models was reported in [3]. Other studies on software product quality models have been carried out by Hegeman in [17] (up to 2011) and by Ferenc et al. in [18] (up to 2014), respectively. However, these two studies are not systematic as per [16], do not measure the quality model families' relevance for the industrial and scientific communities, and include partial quality models that deal with a specific part of software product quality.
Software product quality models defined, tailored or referenced since 2000 are elicited in [5], which was complemented with an analysis of the developments and trends that emerged during this period of time. Criteria regarding whether a publication establishes a new software product quality model family or solely adjusts a quality model already in use while keeping the existing quality model's concepts are also introduced in detail in [5]. In addition, the relevance of the quality model families for the industrial and scientific communities was measured (1) on the extracted, structured and synthesised qualitative information from the software product quality model definitions, and (2) on the following quantitative indicators: (i) relevance score; (ii) quality score average; (iii) time range; and (iv) 12-month average Google Relative Search Index. The identified 23 software product quality model families reveal huge differences in terms of application and research.
To make the present study self-contained, a summary of the main contributions of the study [5] are given below: (1) Search queries were developed and recorded for automatic document search; (2) Searches were carried out in six computer science digital libraries, including IEEE and ACM; (3) The automatic document search was complemented with manual and reference searches; (4) All the documents returned by the searches were pre-analysed, the duplicates were removed, the non-primary sources were removed, the complete software product quality models were included in the investigation, but process quality models and partial quality models were excluded with the reason for exclusion recorded; (5) Documents were assigned to the software product quality models they define, tailor or reference, which lead to the creation of related document clusters, i.e., to software product quality model families. Publications and model definitions stem from the academic and industrial domain, including EMISQ [19], SQUALE [20], FURPS [10,11], SQAE and ISO9126 combination [21], Quality Model of Kim and Lee [15] from companies such as Air France, Siemens, Qualixo, IBM, MITRE, and Samsung; (6) Each document was analysed in-depth to extract the terminology and concepts of each software product quality model; (7) Each document was assigned a score value for clarity of the publication and a score value for the actuality of the publication based on defined scoring criteria; the clarity score discriminates whether a quality model was published in a detailed mature state in one publication or in several publications by means of smaller increments; (8) The quality score of a document is defined as the product of its clarity score and actuality score; (9) For each cluster of documents, the following indicators are defined: (i) The relevance score as the sum of the quality scores of its documents; (ii) The quality score average; (iii) The publication time range; and (iv) The 12-month average of Google Trends called the Google Relative Search Index [22].
The first two indicators are related to research intensity, while the third one is related to the duration of research on a software product quality model family, none of which are normalised. The 12-month average Google Relative Search Index (GRSI) illustrates how far a software product quality model family is spread with everyday use cases. GRSI is computed from the normalised time series data of Google Trends. Cases when the GRSI indicator was not possible to apply as certain software product quality model families manifest homonyms, i.e., different things with the same name, were indicated with an "n.a." value in the ranking Table 1. Notice that all models except for FURPS [10] and GQM [60] have score values in agreement with the 12-month average GRSI values, which indicates that FURPS and GQM have more widespread application than shown by their relevance score. The ISO/IEC 9126 and ISO/IEC 25010 standards have very similar software product quality model concepts [6,7]. The 12-month average GRSI indicator shows more activity for the ISO/IEC 9126 standard [6] than for its successor ISO/IEC 25010 [7]; moreover, the last identified publication on the ISO/IEC 9126 standard appeared in 2017, years after the publication of the successor standard ISO/IEC 25010 [7]. The SQALE model [44] was ranked as third according to the relevance score, which mirrors the 12-month average GRSI if the two outliers, FURPS [10] and GQM [60], were excluded. In addition, SonarQube [65], a popular tool implementation of the abstract SQALE model [44], would suppress the 12-month average GRSI indicator for all listed quality model families to nearly zero [5,22], which means that the majority of the activities in the domain seem to be associated with such widespread implementation of the SQALE model [44]. Table 1 results are taken into account while filling in the columns "Research Interest" and "Widespread Use Cases" in the present study taxonomy in Table 2.

Methods
The study applies qualitative research comprising the analysis of documents and tailored quality models to specific project needs. The documents forming the input of the present research were identified by a systematic literature review [16] in the scope of [5]. As the internal/external quality views and quality-in-use view are not explicitly defined in all the model descriptions, the quality property specifications and model concepts were investigated to obtain an insight as to whether a given quality model is able to address external or quality-in-use manifestations.
Quality models able to provide guidelines to the investigation of the source code or documentation, without exhibiting any quality property assessing the software's runtime behaviour or the perception of the end user, are designated as quality models with internal quality view only (denoted "I" in Table 2). Quality models that contain quality properties to assess the software's runtime behaviour are designated with the external quality view (denoted "E" in Table 2). In addition, if a quality model also shows up quality properties to reflect the end user's assessments in any form, then the quality-in-use view is also assigned (denoted "U" in Table 2). If a given quality model does not provide quality property specifications, but it contains concepts that allow to define or deal with existing quality property specifications and how to extend the model, then these concepts were examined to decide which quality views the quality model is able handle. Thus, quality models lacking explicit information about the quality views were classified based on the authors' interpretation of the published model definitions with regard to concepts, and quality properties. Cases where alternative interpretations to the authors' one may exist are indicated with brackets in Table 2 and with additional explanations in the next section. Quality models not providing enough information to perform a reasonable classification are classed "Undefined" in Table 2. There are also quality assessment approaches that allow the definition of a new quality model or the use of a different quality model from the one predefined. The quality views applied in such cases depend on the quality models implemented (denoted "D" in Table 2). The flow chart of the taxonomy process is depicted in Figure 1.
Section 4 lays out the classification of the identified software product quality models and provides reasoning in each case where the model description does not explicitly define the use of the internal, external and quality-in-use views.

Results
This section presents the taxonomy of the software product quality model families from the point of view of quality manifestations they can capture. An important aspect of the quality models is their ability to measure the properties of software products; therefore, software product quality models are involved in the study; however, some quality models also exhibit properties to assess the process by which the software products come into existence. Such models are also marked in Table 2 Table 2: the Quamoco quality model [46] encompasses an entity "utility" that describes the relative satisfaction of stakeholders with regard to a specific "factor" in the model's terminology; ADEQUATE [8,9] determines the key personnel and their views. FURPS+ [10][11][12] includes systematic approaches to collect architectural requirements involving the stakeholders to consider their opinions; GEQUAMO [13] encompasses quality views for users, sponsors and developers; McCall et al.'s model [2] contains usability and testability factors; Boehm et al.'s model [1] reflects quality characteristics for usability and human engineering, which could make it possible to consider the end user's view of quality. SQUALE, COQUALMO and STAC models are classified as quality models with internal and external quality views in Table 2: SQUALE [20] possesses different metrics and practices related to distinct test levels; COQUALMO [54,55] comprises a defect introduction and removal sub-model, which assumes the software to be operational; the STAC model [63] defines metrics related to testing.
Ulan et al.'s model [56], SQALE [17,[38][39][40][41][42][43][44], EMISQ [19,49,50], Kim and Lee's model [15], Dromey's model [59], SQAE [64] are classified in Table 2 as quality models with internal quality view only: Ulan et al.'s model permits the use of any goal-question-metric approach; its novelty lies in the complex mathematical computations used in a clustering approach, although the model published demonstrated internal quality properties only without direct guidelines for extension; SQALE also contains the testability quality property that is divided into two further quality properties, Unit Testing Testability and Integration-level Testability, that are based on the results of code analysers.
GQM [60] and SQUID [14] are rather general quality assessment approaches with no predefined quality models. These approaches assume the creation of a software product quality model or the use of an existing one, named content model in SQUID terminology, which is suitable to achieve the goals set in the given context. Consequently, the quality views they consider depend on the defined metrics and goals. In the published case study with the model definition [14], SQUID also encompasses metrics that can only be collected when the software is operational and metrics on usability that allow the expression of end users' quality perception. For these reasons, they are classified in Table 2 as quality assessment approaches with quality views that depend on the quality model used.
In addition to the quality views applied, further information is presented in the columns: "Predefined Quality Properties or Metrics Available" to describe whether the specified model defines quality properties with associated computation methods; "Research Interest" to indicate whether there are more publications related to the quality model and one in the last five years; "Widespread Use Cases" to provide the judgement based on the specified model's 12-month average GRSI indicator from Table 1; "Also Process Related Properties" to show whether the specified model is a hybrid model with quality properties that describe process quality.
From Table 2, it is clear that the identified software product quality model families apply different quality views to capture the abstract notion of quality. The capabilities of the listed software product quality models deviate significantly; many of them are unable to assess the external quality view and the quality-in-use view. The provided taxonomy makes it possible to interpret the quality assessment results. Based on quality assessments with internal quality view only, it is not possible to make valid statements regarding software product quality as a whole. This is because such models lack the capability to capture how the software behaves while it is used and how the end user perceives its quality. Quality assessment results always need to be interpreted with the quality model's capabilities, including quality views, which is especially important for project negotiations. In addition, Table 2 also contains relevant information to support the selection of a quality model for a specific project. In such cases, not only the model's quality views play an important role but the following criteria can also influence the decision: (1) the amount of associated research studies on the model, i.e., the research interest, (2) the model's public spread, and (3) whether the model has process related quality properties.

Limitations
The research conducted is qualitative in nature and rests on data extracted from the definition documents of the different quality models and from their variations, which depict how to tailor the identified models to specific needs. The analysis of the listed software product quality model families was done carefully, although the high amount of information implicates a risk of ignoring minor details. On the other hand, some quality models investigated did not provide enough information to allow their classification with regard to Table 2 taxonomy. These models were correspondingly marked in Table 2 and explained in Section 4. Table 2 taxonomy lists the software product quality model families according to their relevance for the industrial and scientific communities, as reported in 2020 in [5]. This means that publications appeared afterwards and new applications of the listed quality models might have an influence on this order but not on the provided taxonomy of quality models with internal, external and quality-in-use manifestations.

Conclusions
Classification of software product quality models based on the amount of information they can interpret, i.e., the quality manifestations they can address, is an important step towards preventing misunderstandings and misleading statements regarding software product quality assessments. At present, the most comprehensive and easy-to-understand quality interpretations are defined by the ISO/IEC 9126 and ISO/IEC 25010 [6,7] standards. Both standards classify quality manifestations with three distinct views: internal, external and quality-in-use view. This was extended to all 23 software product quality models identified in 2020 in [5] and listed in Table 2 even if the definition documents use different terminologies or do not mention such views explicitly. Consequently, the current study places the existing publications related to software product quality modelling into a new context. Moreover, the research conducted stresses the requirement to present not only the quality assessment results but also the quality manifestations, i.e., the quality views, measured. The latter is a must for fair quality measurements and assessments.
The quality assessment results cannot be interpreted on their own without the knowledge of whether internal, external or quality-in-use quality manifestations were measured. Indicating and emphasising this characteristic of the software product quality assessment is a major contribution of the paper. The scope of software product quality that the individual quality models are able to assess has a direct impact on software development and maintenance costs. In conclusion, as explained in Section 4, a "good quality" statement made on the basis of a software product quality model that captures the above three views has a very different meaning to a "good quality" statement made on the basis of another software product quality model, which can only capture one individual view of the three defined ones. Consequently, it is never fair to make a statement with regard to the whole software product quality when only the internal quality view is assessed.
The software product quality models with full-automation potential can solely assess the internal view of quality, which can be captured by static code analysers. The software product quality models capable of handling all internal, external and quality-in-use views offer partial automation features so that the results of the measurement of the external and quality-in-use views could be considered. This is not an impediment but a broader range of quality manifestations to consider for approximating the reality better while assessing quality. Nevertheless, if such models are integrated into the development pipeline with the expectation of full automation, then their capabilities are restricted to the internal quality view only.

Conflicts of Interest:
The authors declare no conflict of interest.