Lightweight Software Architecture Evaluation for Industry: A Comprehensive Review

Processes for evaluating software architecture (SA) help to investigate problems and potential risks in an SA. Many studies have proposed a plethora of systematic SA evaluation methods, yet industrial practitioners currently refrain from applying them because they are heavyweight. Nowadays, heterogeneous software architectures are organized on top of new infrastructure: hardware and associated software allow different systems, such as embedded, sensor-based, modern AI, and cloud-based systems, to cooperate efficiently, which brings more complexity to SA evaluation. Alternatively, lightweight architectural evaluation methods have been proposed to address practitioners' concerns, but practitioners still do not adopt these methods. This study employs a systematic literature review, together with a text analysis of SA definitions, to propose a comparison framework for SA evaluation. It identifies lightweight features and factors that can improve the adoption of architectural evaluation methods among industrial practitioners. The features are determined from practitioners' concerns by analyzing stakeholders' definitions of architecture and reviewing architectural evaluation methods. The lightweight factors are acquired by studying the five most commonly used lightweight methods and the Architecture Tradeoff Analysis Method (ATAM), the most well-known heavyweight method. Subsequently, the research addresses these features and factors.


Introduction
It is essential to detect software architecture (SA) problems before software development, but analyzing an SA is not easy because of software heterogeneity and the arrangement of software components [1,2]. A problematic SA can lead to project failure; therefore, big, upfront designs are sketched based on the experience of software analysts to explore SA problems in the early phase [3]. Modern SAs are built on various Flask web servers and local Raspberry Pi servers, connected via cloud technology to a central platform. They receive user requests and drive a protocol through a REST API to command microcontrollers and detect sensor faults. Internet of Things (IoT) devices with resource limitations are programmed in high-level languages, such as Python, C++, and even Java, which differs significantly from the traditional monolithic SA. It is very difficult to analyze the intercommunication of all those elements [4,5].
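As an illustration of the kind of logic such an architecture centralizes, the sketch below models, with purely hypothetical sensor names and limits, the request-handling routine that a Flask REST route on such a platform could wrap: validate a sensor reading and choose a command for a microcontroller.

```python
# Stdlib-only sketch (hypothetical names and limits) of the request-handling
# logic that a Flask REST route in such an architecture could wrap.

SENSOR_LIMITS = {"temperature": (-40.0, 85.0)}  # assumed valid range

def handle_reading(sensor: str, value: float) -> dict:
    """Return a REST-style response body for one sensor reading."""
    limits = SENSOR_LIMITS.get(sensor)
    if limits is None:
        return {"status": "error", "reason": "unknown sensor"}
    low, high = limits
    if not (low <= value <= high):
        # An out-of-range reading is treated as a sensor fault.
        return {"status": "fault", "command": "shutdown"}
    return {"status": "ok", "command": "continue"}
```

Even a routine this small crosses the web server, the cloud platform, and the device layer, which is exactly the intercommunication that is hard to analyze at the architectural level.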
A successful software project delivers the agreed-upon functionalities within the triangle of a specific time, a budget, and acceptable quality [6]. The SA plays a vital role in this triangle since it initializes system design with models and analysis to ensure that the design meets the system's functional and non-functional requirements. It extends and sustains the system by integrating it with other systems. Changes to existing requirements always happen and may change the SA, bringing massive code rework and impacting the schedule and budget [7].
UML notation is defined by the international standard ISO/IEC 19505:2012 and has been accepted as an effective industrial standard for SA [28]. Although UML notation meets users' needs and is flexible enough to follow their expectations, this flexibility is embedded in a semantic informality that can be understood differently [29,30]. Some researchers [27][28][29][30][31][32][33] imply that UML per se is not enough and that a formal approach is needed to complement it. Moreover, Rodriguez et al. [31] and Medvidovic et al. [32] demonstrated the superiority of Petri nets over architectural description languages (ADLs) and formal methods languages. As a result, Petri nets can bridge this gap. Additionally, Jensen et al. [33] and Emadi et al. [34] stated that the timed hierarchical colored Petri net is a variant of Petri nets that can be utilized to simulate complex data values and SAs [35].
Architectural patterns are an ideal complement to architectural decisions. An architectural pattern, interchangeably called an architectural style, is specified as a set of principles together with a coarse-grained, abstract framework for systems [36,37]. It is a standard solution, reused and partitioned, for chronic issues in the SA area. An architectural style thoroughly regulates the vocabulary of components and connectors, that is, how they can be combined, along with a set of constraints. It may impose some topological restraints on architectural descriptions [38].
Additionally, it may have some execution semantics, which can be part of the style definition [39,40]. Given the rising concern with distributed and heterogeneous software mentioned in the previous section, the commonly used architectural styles are client/server, component-based architecture, domain-driven design, layered architecture, message bus, N-tier/3-tier, object-oriented, SOA, and pipe and filter. In practice, the standard SA of a system is mainly made up of a pattern of different architectural styles [41].
Despite the popularity of domain-driven design, layered architecture, message bus, N-tier/3-tier, and object-oriented modeling, it is challenging to scale them up. Moreover, there is no tool to analyze or measure their non-functional properties. Their interconnection mechanisms are basic (method invocation); thus, complex interconnections are hard to express in these styles. In a nutshell, there is no clear image of the system architecture before component creation [42]. Component-based SA relieves this modeling trouble and supports developing heterogeneous systems from reusable off-the-shelf components.
In comparison with other SA styles, a component or connector has a coarser granularity. ISO/IEC/IEEE 42010:2011 defines this architectural type as an effective standard [43]. Along with component-based architecture, SOA and pipe and filter facilitate the handling of non-functional properties. They have an excellent capability for making complex products, regardless of the platform or technology used in these products. However, in comparison, performance is still a problem in both styles [44][45][46][47].

Software Architecture Evaluation
This section examines the existing literature on SA evaluation over the last three decades to answer research questions about SA evaluation [48]. This systematic review results in factors that can be used to propose an evaluation framework.
After more than 30 years of SA evaluations, many research questions remain open about the categorization of SA evaluations. This section determines the criteria for classifying SA evaluations in order to identify factors for SA evaluation. In this study, for the systematic review of the literature, the terms SA "Review", "Evaluation", "Analysis", "Assessment", and "Validation" are used interchangeably as search keywords. Figure 1 and Table 1 indicate that 27 credible SA evaluation studies have been selected for review from IEEE, Springer, ACM, Elsevier, and Google Scholar. They cover the standard methods and techniques of SA evaluation.

Reference Study Focus
Breivold et al. [49] The search identified 58 studies that were cataloged as primary studies for this review after using a multi-step selection process. The studies are classified into the following five main categories: techniques supporting quality considerations during SA design, architectural quality evaluation, economic valuation, architectural knowledge management, and modeling techniques.
Barcelos et al. [50] A total of 11 evaluation methods based on measuring techniques are used, mainly focusing on simulation and metrics to analyze the architecture.
Suman et al. [51] This paper presents a comparative analysis of eight scenario-based SA evaluation methods using a taxonomy.
Shanmugapriya et al. [52] It compares 14 scenario-based evaluation methods and five of the latest SA evaluation methods.
Roy et al. [53] The taxonomy is used to distinguish architectural evaluations based on the artifacts on which the methods are applied and two phases of the software life cycle.
Mattsson et al. [54] The paper compares 11 various evaluation methods from technical, quality attributes, and usage views.
Hansen et al. [55] The research reports three studies of architectural prototyping in practice, ethnographic research, and a focus group on architectural prototyping. It involves architects from four companies and a survey study of 20 practicing software architects and developers.
Gorton et al. [56] This paper compares four well-known scenario-based SA evaluation methods. It uses an evaluation framework that considers each method for context, stakeholders, structure, and reliability.
Weiss et al. [57] It conducted a survey based on architectural experience, which was organized into six categories. The architecture reviews found more than 1000 issues between the years 1989 and 2000.
Babar et al. [58] It discusses the agility in SA evaluation methods.
Suryanarayana et al. [59] It discusses refactoring-based adaptation for architecture evaluation methods.
Lindvall et al. [60]; Santos et al. [61] These reviewed papers lack some of the knowledge needed in this paper and require more investigation.
Oliveira et al. [62] It reviews the agile SA evaluation.
Martensson et al. [63] It reviews scenario-based SA evaluation based on industrial cases.
Although SA evaluation is an important activity at any stage of the software life cycle, it is not widely practiced in industry [64]. Gorton et al. [65] conducted large-scale research to identify the industrial practices of architecture evaluations and to categorize SA evaluations based on evaluation techniques. Table 2 indicates the investigations into architectural evaluations from the industrial aspect. The techniques listed in the first column of Table 2 are ordered by frequency of use in industry; these techniques are used in the evaluation methods. The most frequent methods of each technique are listed below, along with the quality attributes (QAs) they are applied to. These methods are elicited from systematic literature reviews and the latest related academic papers and books.
Despite the encouraging number of basic studies found (76 approaches), only 27 SA evaluation approaches remain, regardless of their targeted QAs. This table compares the techniques of the approaches; the target quality attributes will be discussed later, in Section 5. There are several reasons for such a massive decrease in the number of studies. First, there were some identical entries for the same article when we searched numerous databases. Second, a large percentage of the studies evaluated one or several QAs in a subtle, ad hoc way; those studies are omitted because they did not document a repeatable evaluation process or method. Third, some studies considered both software and hardware evaluations, so they were not suitable for the current research, which emphasizes SA evaluation approaches. Below, the techniques and methods are discussed.

•
The first technique focuses on experience, which plays a vital role in SA design and evaluation [66]. This is the technique most practiced by industry [56]. Empirically Based Architecture Evaluation (EBAE) is performed late in development, while Attribute-Based Architectural Styles (ABAS) can run at design time and be integrated with ATAM [49,60,67]. Decision-Centric Architecture Reviews (DCAR) analyze a set of architectural decisions to identify whether the decisions taken are valid. DCAR is more suitable for agile projects due to its lightweight nature [62].

•
The second-most popular technique is prototyping, which collects early feedback from the stakeholders and enables architecture analysis under close-to-real conditions. It may answer questions that cannot be resolved by other approaches [68].

•
The third technique is scenario-based evaluation. SAAM is the earliest method using scenarios and multiple SA candidates. Later on, ATAM complemented SAAM with trade-off analysis between QAs, where ATAM uses qualitative and quantitative techniques. The Architecture-Level Modifiability Analysis (ALMA) and Performance Assessment of Software Architecture (PASA) combine scenarios and quantitative methods to boost the results [69,70].

•
The fourth technique is checklists, consisting of detailed questions that assess the various requirements of an architecture. Software Architecture Review (SAR) uses checklists according to the stakeholders' criteria and the system's characteristics. The Framework for Evaluation of Reference Architectures (FERA) exploits the opinions of experts in SA and reference architectures. A precise understanding of the requirements is needed to create the checklist [71,72].

•
The fifth technique is simulation-based methods, which are very tool-dependent. The Architecture Recovery, Change, and Decay Evaluator/Reference Architecture Representation Environment (ARCADE/RARE) simulates and evaluates architecture by automatic simulation and interpretation of the SA [73]. An architecture description is created using a subset of the toolset called Software Engineering Process Activities (SEPA), and descriptions of usage scenarios are input to the ARCADE tool [74]. Many tools and toolkits transform an architecture into layered queuing networks (LQN) [75]. This requires special knowledge about the components' interactions and behavioral information, execution times, and resource requirements [76]. The Formal Systematic Software Architecture Specification and Analysis Methodology (SAM) follows formal methods and supports an executable SA specification using time Petri nets and temporal logic [77]. It facilitates scalable SA specification through hierarchical architectural decomposition.
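To make the Petri-net idea behind methods such as SAM concrete, the toy sketch below (an illustrative net, not taken from any cited method) shows the basic execution rule that such simulations rest on: a transition fires only when all of its input places hold tokens.

```python
# Minimal sketch of Petri-net execution: places hold tokens, and a
# transition fires only when every input place is marked. The net below
# (a request moving from 'queued' through 'processing' to 'done') is
# purely illustrative.

def enabled(marking, transition):
    """A transition is enabled when every input place holds a token."""
    return all(marking.get(p, 0) > 0 for p in transition["inputs"])

def fire(marking, transition):
    """Fire an enabled transition: consume input tokens, produce outputs."""
    m = dict(marking)
    for p in transition["inputs"]:
        m[p] -= 1
    for p in transition["outputs"]:
        m[p] = m.get(p, 0) + 1
    return m

t_start = {"inputs": ["queued"], "outputs": ["processing"]}
t_finish = {"inputs": ["processing"], "outputs": ["done"]}

m0 = {"queued": 1}
m1 = fire(m0, t_start) if enabled(m0, t_start) else m0
m2 = fire(m1, t_finish) if enabled(m1, t_finish) else m1
```

Time and color extensions add clocks and typed tokens to this rule, which is what lets the timed hierarchical colored variant model complex data values.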

•
The sixth category is metrics-based techniques, which need to be mixed with other techniques because they are not intrinsically powerful enough [78]. Here are some examples of metrics-based methods. The Software Architecture Evaluation Model (SAEM) is based on the Goal/Question/Metric (GQM) paradigm to organize the metrics. Metrics of Software Architecture Changes based on Structural Metrics (SACMM) measures the distance between SAs by graph kernel functions [79]. Lindvall et al. [60] introduced late, metrics-based approaches to compare the actual SA with the planned architecture.
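In the spirit of that planned-versus-actual comparison, the sketch below (with hypothetical module names) diffs the dependency sets of a planned architecture and one recovered from code, yielding simple violation and drift metrics.

```python
# Illustrative late, metrics-based check: dependencies recovered from code
# are diffed against the planned module dependencies. Module names and
# edges are hypothetical.

planned = {("ui", "service"), ("service", "data")}
actual = {("ui", "service"), ("service", "data"), ("ui", "data")}

violations = actual - planned   # dependencies the plan never allowed
absences = planned - actual     # planned dependencies never realized
drift = len(violations | absences) / max(len(planned | actual), 1)
```

Here the recovered edge ("ui", "data") is a violation, and the drift ratio (1/3) gives a crude distance between the two architectures.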

•
The seventh technique, mathematical-model-based methods, is highlighted in research, but industry pays little attention to it. Software Performance Engineering (SPE) and path- and state-based methods are used to evaluate reliability and performance. These modeling methods exploit mathematical equations that yield architectural statistics, such as the mean execution time of a component, and can be mixed with simulation [80,81].
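A minimal example of such a mathematical model, with hypothetical numbers: the mean execution time of a request is the sum of per-component times weighted by the expected number of visits to each component along the usage paths.

```python
# Illustrative mathematical model of mean execution time. Component times
# (ms per call) and the expected visits per request are hypothetical.

components = {"parse": 2.0, "query": 10.0, "render": 3.0}
visits = {"parse": 1.0, "query": 0.7, "render": 1.0}

# Mean request time = sum over components of (time per call * expected visits).
mean_time = sum(components[c] * visits[c] for c in components)
```

Path- and state-based reliability models follow the same shape, replacing execution times with failure probabilities weighted by path probabilities.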
Concisely, the above discussion shows that scenario-based evaluation techniques are well investigated and broadly reviewed in research papers. They are complemented by other techniques, such as simulation and mathematical techniques, to perform effectively. Simulation focuses on the main components of a planned or implemented architecture to simulate its context. Mathematical techniques use static evaluation of architectural designs; some of these modeling techniques originate from high-performance computing and real-time systems. Experience-based techniques differ from the other techniques: they are less explicit and are based on subjective factors, namely intuition and experience. This technique relies on reviewer perception, objective argumentation, and logical reasoning. For example, an expert might recognize an availability problem and then convince others through scenarios that depict the situation. Table 2. SA evaluation categorization.

Methods | Quality Attribute | Remarks
Experience-based | - | Experts familiar with the software system's requirements and domain.
EBAE (Empirically Based Architecture Evaluation) | Maintainability | The most common technique applied to review architecture in industry [5], based on expert knowledge and documents.
ABAS (Attribute-Based Architectural Styles) | Specific QAs | -
DCAR (Decision-Centric Architecture Reviews) | All | -
Prototyping-based | - | Incrementally prototyping before developing a product to get to know the problem better.
Exploratory, Experimental, and Evolutionary | Performance and modifiability | These techniques have been applied to review architecture in industry; an evolutionary prototype can possibly be developed into a final product.
Scenario-based | - | A specific quality attribute is evaluated by creating a scenario profile that provides a concrete description of the quality requirement.
SAAM (Software Architecture Analysis Method) | All | A scenario is a short description of a stakeholder's interaction with a system. Scenario-based methods are widely used and well known [49].
TARA (Tiny Architectural Review Approach) | Functional and non-functional | Checks conformance with the design and violations, such as design pattern and inter-module coupling violations.
Math-model-based | - | Operational quality requirements, such as performance and reliability, are evaluated by mathematical proofs and methods.
Path and state-based methods | Reliability | Mixed with scenario- and simulation-based techniques to obtain more accurate results.

Categorizing Software Architecture Evaluation
There is no distinctive categorization in technique-based classifications, because hybrid architecture evaluation methods that use multiple architecture evaluation techniques belong to several categories [82]. Discussion of the selected methods reveals various classifications and compares them, showing some commonalities regarding the activities and artifacts of the assessment procedure. However, since it is not apparent which methods resemble the proposed solution, these methods will be analyzed to determine their common aims and an objective mechanism. To address this problem, we have identified a set of criteria that can provide a foundation for comparing and assessing SA evaluation methods.
This study proposes a comparison framework to present and compare the analysis methods and to elaborate on these fundamental criteria. It combines three software evaluation comparison frameworks [83,84]. Table 3 introduces this comparison framework, which contains the following main components of SA evaluation methods: context, stakeholder, contents, time, and reliability. For each component, the related elements are identified. The existing taxonomies of these elements are summarized in the taxonomic comparison, and the taxonomies are broken down in more detail in the complementary table.
Niemela et al. [85] introduced a framework to compare SA evaluation methods using some essential criteria. These criteria are listed below as C1 to C7 and can be answered for each method. Beyond these criteria, the tools and techniques, the SA description, and the outcomes of the methods are also explored. Moreover, C4, C5, and C6 are answered during the review of each SA evaluation "process". These criteria are used to compare the existing solutions to identify factors that increase the evaluation framework's use.

Identifying Factors for Lightweight Evaluation Method
Architecture evaluations are usually performed manually, based on informal/semiformal architecture documentation and the reviewer's knowledge [86]. Comprehensive SA evaluation methods, ATAM in particular, require a massive amount of cost and effort. This problem has given rise to lightweight SA evaluation methods. Heavyweight reviews are long-running and documentation-based, such as technical reviews and inspections, while lightweight reviews are short-running processes with little architecture documentation. The criteria for lightweight methods have not yet been defined; they are identified based on each publication's claim. In this research, ATAM is studied as the best sample of the heavyweight methods, while Lightweight ATAM, ARID, PBAR, TARA, and DCAR are studied as lightweight methods to identify factors for lightweight methods [87,88].

Architecture Tradeoff Analysis Method
ATAM is the most mature, sophisticated, and well-known method and has been devised by various researchers. ATAM-based methods are flexibly used for evaluation purposes such as the following: seeking SA improvement opportunities, risk analysis, and SA comparison, but generally finding out whether the candidate SA supports the business goals adequately. ATAM-based methods engage various stakeholders, prioritize requirements with techniques such as cumulative voting and a utility tree, and then identify trade-offs to resolve conflicts. C1 (Main Goal): Identifying SA patterns and tactics that suit the business drivers. C3 (Covered QAs): All QAs, or any property that can affect the business goals. C4, C5, and C6 (Process): Table 4 explains the ATAM process. C2 (Evaluation Techniques): Based on scenarios and experience. C7 (Validation): It has been extensively validated. Outcomes: A list of risks, non-risks, risk themes, sensitivity points, and trade-off points. SA description: SA styles and tactics; ATAM-based methods also define a precise template for documenting the quality scenarios. Tools and techniques: Brainstorming and voting. Discussion: Although it exploits a scenario-based paradigm, it engages various stakeholders for up to six weeks and is costly. The quality scenarios and ATAM templates represent the requirements in detail, but they are often confusing. Cumulative voting can induce excessive rivalry among stakeholders. Moreover, its reliance on SA documentation and its ignorance of project management paradigms make ATAM impossible to run in agile projects. The decision-makers prioritize their decisions based on the quality attribute goals.
Output: the first version of the prioritized quality scenarios and the quality attribute utility tree.
6. Analyze the architectural approaches | Evaluation team and software architects | They link the SA to the primary quality attribute goals to develop an initial analysis, resulting in non-risks, risks, and sensitivity/trade-off points. | Output: the first version of the non-risks, risks, risk themes, trade-off points, and sensitivity points.
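The cumulative voting that ATAM-based methods use to prioritize quality scenarios can be sketched as follows (stakeholder ballots and scenario names are hypothetical; each stakeholder distributes a fixed budget of 100 points, and scenarios are ranked by total points).

```python
# Sketch of cumulative (100-point) voting over quality scenarios.
# Ballots and scenario names are hypothetical.
from collections import Counter

votes = [
    {"low latency": 60, "easy maintenance": 30, "high availability": 10},
    {"low latency": 20, "easy maintenance": 50, "high availability": 30},
    {"low latency": 40, "easy maintenance": 10, "high availability": 50},
]

totals = Counter()
for ballot in votes:
    assert sum(ballot.values()) == 100  # each stakeholder spends the budget
    totals.update(ballot)

ranking = [scenario for scenario, _ in totals.most_common()]
```

The fixed budget forces trade-offs inside each ballot, which is also why the text above notes that cumulative voting can induce rivalry among stakeholders.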

Lightweight ATAM
The cost of ATAM-based methods motivated Lightweight ATAM, which requires less than six hours to run. The technique is used by a development team that is already familiar with ATAM, the SA, and its goals.
C4, C5, and C6 (Process): The evaluation process was created by eliminating or constraining the scope of ATAM's activities, as shown in Table 4. It assumes the participants are familiar with ATAM, while brainstorming and prioritizing are omitted because of their cost. Step 9 should be completed in 30 minutes. C7 (Validation): No validation; the method's features are generally the same as ATAM's.
Discussion: The method reduces the stakeholders' engagement and the evaluation process steps, but the architecture evaluation still needs more formality. It relies on the stakeholders' familiarity and tacit knowledge, which is assumed to have been achieved through a full ATAM implementation. It is evident that constraining the scope and depth of the evaluation requires lower effort.

ARID
Step 1: Appointment of reviewers.
Step 7: Brainstorming and prioritizing the scenarios.
Step 8: Conducting the SA evaluation.
Step 9: Presenting the results. C7 (Validation): One pilot experience in industry. C2 (Evaluation Techniques): Based on scenarios and expertise. SA description: There is no specific form of SA designs or documents. Tools and techniques: Brainstorming and voting. Outcomes: A list of the given SA's issues. Discussion: It is a simple method for seeking flaws and weaknesses in the QAs of a given SA. ARID does not explicitly state the QAs and SA styles during the analysis. The analysis focuses on a set of properties represented by a group of quality scenarios. It has nine steps, which are not compatible with the lightweight concept. It emphasizes an informal expert review with no particular form of SA style. As a result, it is difficult to repeat.

PBAR
C1 (Main Goal): Detecting quality attribute issues. C3 (Covered QAs): Potential risks influencing QAs. C4, C5, and C6 (Process): 1. Elicitation of essential quality requirements from user stories with the assistance of developers. 2. Establishing the SA's structure through a discussion with developers. 3. Nominating architectural styles. 4. Analyzing the nominated styles' effects on the qualities. 5. Recognizing and discussing the final results.
C2 (Evaluation Techniques): Based on scenarios and experience. C7 (Validation): Nine small-size student projects for industrial use. SA description: There is no specific form of SA designs or documents, but SA styles are included during the evaluation. Tools and techniques: Informal requirement elicitation during the development team meeting. Outcomes: The QA issues, which are mismatches between QAs and SA styles. Discussion: PBAR meets all the criteria of the lightweight methods. It reduces the process to five steps that occur in one face-to-face meeting with the development team. It omits requirement prioritization to streamline the method. PBAR requires a negligible amount of time to run in comparison with the traditional methods. It focuses on the production step in agile projects, which makes it more operational than the conventional methods for companies that use agile and lean software development methodologies, but also confines its comprehensive use. The evaluation uses SA styles and tries to find mismatches between the SA styles and QAs of candidate SAs. However, it ignores formalizing the assessment technique and merely relies on tacit knowledge of SA styles and their impacts on QAs. Moreover, the influence of styles on QAs is not conclusive in most cases, since other factors should be taken into account.

TARA
SA description: There is no specific form of SA designs or documents, but the evaluator should understand the functional/deployment structures and the system context. Tools and techniques: The method involves automated code analysis techniques (module dependencies, size measures, code metrics, and test coverage); for implemented software, it exploits information on software execution (e.g., event logs). Outcome: A list of crucial requirements with the relevant SA. Discussion: TARA is a lightweight, permissive method that does not exclude requirements specification documents.
It allows an evaluator to consult with the stakeholders to prioritize the requirements. TARA suits implemented software since it uses code analysis techniques with operational data. Evaluation methods mainly rely on explicit scenarios and the architect's knowledge, but TARA relies on the reviewer's judgment associated with the SA analysis evidence. Consequently, it only works well for implemented software in the maintenance phase, when it is hard to correct the flaws.

DCAR
3. Management presentation: The management/customer representative gives a brief presentation to elicit the potential decision forces (the list of architectural decisions was produced in the first step).
4. Architecture presentation: The lead architect presents potential decision forces and potential design decisions to all participants in a very brief and interactive session to revise the list of architectural choices.
5. Forces and decision completion: The decision forces and design decisions are verified so that all stakeholders use the same terminology.
6. Decision prioritization: The decisions are prioritized based on the participants' votes.
7. Decision documentation: The most important decisions are documented in terms of the applied architectural solutions, the addressed problem, the alternative solutions, and the forces that must be considered to evaluate the decision.
8. Decision evaluation: Through discussion among all stakeholders, the potential risks and issues are selected, and the decisions are revised based on decision approval.
9. Retrospective and reporting: The review team scrutinizes all the artifacts and produces the final report.
C2 (Evaluation Techniques): Experience-based and expert reasoning. C7 (Validation): It has been verified in five large industrial projects. SA description: SA design, informal requirements, and business drivers. Outcomes: Issues and risks. Tools and techniques: Templates, a wiki, and UML tools. Discussion: DCAR originated from SA evaluation experiences in industry. It is a lightweight method that allows users to analyze and record the rationale behind architectural decisions systematically. In comparison, scenario-based methods test SAs against scenarios to find flaws and issues in a specific QA. For the sake of being lightweight, the brainstorming and prioritizing steps are omitted. The reviewers should know the SA and rely on the standard UML tools to make the evaluation understandable for stakeholders.
Although it provides comprehensive templates for assessment, it considers several factors that originate from managerial views. This consideration leads to nine steps, which are not compatible with a lightweight process.

Factors for Lightweight Evaluation Method
The five lightweight methods plus ATAM are compared in Table 5 based on the following most common aspects: the goal of the evaluation method, SA description, evaluation time, the method's validation, and tool support. These aspects are categorized based on the comparison framework reflected in Table 3, and the approaches are mapped to them. Table 5. Lightweight methods comparison.

Aspect                              Category                          Approaches
The goal of the evaluation method   Assessment against requirements   Lightweight ATAM
                                    Architectural flaws detection     PBAR, TARA, ARID

SA has been evaluated at various points in the software life cycle; evaluation can happen at both the early and late stages of development. Early methods evaluate SA candidates before implementation, while late methods assess the system's implemented versions against the planned or previous versions. Early methods are based on SA descriptions and other sources of information; they lead to a better understanding of SA and the identification of architectural problems. Late processes, in contrast, utilize data obtained from the actual software implementation, so the existing architecture can be reconstructed and compared with the early evaluated SA. Early methods are mostly scenario-based, mathematical-model-based, and simulation-based, while late ones are mostly metrics-based and tool-based. Early methods emphasize designing and modeling, while late ones try to catch code violations and module inconsistencies. Sometimes, early methods can also evaluate the implemented software [89,90]. While late and early evaluation are not contradictory, they mostly cannot be attended to simultaneously due to the overhead they impose on the approach. As the time-of-evaluation part of Table 5 indicates, no method covers all the stages.
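As a toy illustration of the metrics-based, late-evaluation style described above, the following sketch counts a source file's imports of other in-house modules as a crude coupling metric. All module names are hypothetical, and real tool-based methods are far more elaborate:

```python
import ast

def import_coupling(source, internal_modules):
    """Count imports of internal modules in a Python source file.

    A crude efferent-coupling metric: how many distinct internal
    modules this file depends on. High values may flag modules
    that drifted from the planned architecture.
    """
    tree = ast.parse(source)
    deps = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            deps.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.add(node.module.split(".")[0])
    return sorted(deps & internal_modules)

# Hypothetical source under review; 'json' is external and ignored.
code = "import billing\nfrom inventory.db import Session\nimport json\n"
print(import_coupling(code, {"billing", "inventory", "auth"}))
# -> ['billing', 'inventory']
```
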

Need of Agility
Although SA evaluation is beneficial, it is not broadly applied in industry nowadays. Even agile development approaches do not encourage using architecture evaluation methods, since they usually take a considerable amount of time and resources [91,92]. Except for PBAR and DCAR, the reviewed methods are not well suited to agile projects.

Ad Hoc Analysis
Ad hoc analysis ties architecture analysis to architecture design and implementation activities, employing experience, expertise, and argumentation [93]. Informal experience-based architecture analysis is prevalent, as this method works regardless of architecture documentation. Several SA studies report that this analysis is carried out manually [94,95].

Targeted Quality Attributes: Performance and Security in Software Architecture Evaluation
Based on "software architecture definition differences" in this research, performance and security are selected as the targeted QAs. These QAs will be discussed in the following subsections. The QAs belong to the methods represented in Table 2.

Performance

Mostly, performance is estimated based on an approximate model of the runtime view. These methods need appropriate descriptions of the dynamic behavior of SA to show the characteristics of the components and the frequency and nature of inter-component communication. Mathematical formalisms such as Petri nets, as well as simulation, support this estimation [96]. Figure 2 [48] shows that most SA performance analysis methods convert SA specifications into suitable models. Subsequently, timing data is added to the models to estimate performance attributes and provide the following feedback:
i. Predicting the system's performance in the early stages of the software life cycle.
ii. Testing performance goals.
iii. Comparing the performance of architectural designs.
iv. Finding bottlenecks and possible timing problems.
Some of the essential methods are discussed in the following. CF is a mathematical-model-based method that integrates performance analysis into the software development cycle. It presents software execution behavior through a graph of arcs and nodes annotated with timing information. This model is based on the Queuing Network Model (QNM) performance model, and simulation evaluates the model to estimate performance attributes. The approach has been enriched by using Kruchten's 4 + 1 views and by using use-case scenarios, depicted by message sequence charts, as the dynamic behavior of SA. Later approaches combine UML diagram information to create a performance model of SA more formally. However, these methods do not consider the concurrent/non-deterministic behaviors of the components during QNM modeling. To address these problems, the labeled transition system (LTS) graph and ADLs were added to the approaches [97,98]. The emerging problem was the computational complexity of the possible state-space explosion of the architecture description's finite-state model. This problem persuades experience-based analysis methods such as ABAS not to use an analysis tool to evaluate performance [99].
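The QNM-style estimation described above can be illustrated with the textbook M/M/1 single-queue formulas. The arrival and service rates below are invented numbers for a hypothetical component, not values from any of the cited methods:

```python
def mm1_metrics(arrival_rate, service_rate):
    """Textbook M/M/1 queue formulas: utilization, mean number in
    system, and mean response time for a single service center."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    rho = arrival_rate / service_rate      # utilization
    n = rho / (1 - rho)                    # mean number of requests in system
    r = 1 / (service_rate - arrival_rate)  # mean response time (seconds)
    return rho, n, r

# Hypothetical component: 40 requests/s arriving, capacity 50 requests/s.
rho, n, r = mm1_metrics(40.0, 50.0)
print(f"utilization={rho:.2f}, in system={n:.1f}, response time={r * 1000:.0f} ms")
# A component with utilization near 1 is exactly the kind of
# bottleneck (feedback item iv) that such models expose early.
```
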
PASA boosts SPE by adding performance anti-patterns and architectural styles. It tries to adapt the concept of ATAM and SAAM into SPE. PASA formally states the scenarios with a descriptive architecture language such as the UML sequence diagram [100].
Nevertheless, none of these approaches has yet been integrated into a complete environment for performance analysis, specification, and feedback to the designer. The unsolved problem is fully automating the derivation of performance models from the software specification and assimilating the supporting tools into a comprehensive environment [101]. Moreover, although model quality has not yet been studied deeply, high-quality models are essential, since verification and performance analysis strongly rely on them [102].

Security
Security is a complex technical topic that can only be treated superficially at the architectural level. Although scenario-based methods are typically used for SA security analysis, security differs from other quality attributes: the security requirements are not enough to construct a "security scenario" by themselves [103]. At the same time, it is necessary to understand the precise security requirements of an application and devise mechanisms to support SA security. In the implementation layer of SA, there are many techniques, such as Windows operational security and the Java Authentication and Authorization Service (JAAS), that can be applied without any significant problems [104]. These techniques mitigate the principal threats: authorization violation, system penetration, integrity compromise, confidentiality disclosure, repudiation, and denial of service.
The most important problem is that distributed SA has multiple layers of abstraction. Since each service abstracts the lower layer's business functionality, the underlying application's user identity context must also be abstracted. Combined with the individual backends, heterogeneous security concepts create a long path from the first request for a business procedure to the backend systems. Security therefore also comprises monitoring, logging, and tracing all security-related data flows [105]. Security architectural flaws can be omissions, commissions, and realization flaws [106].

• Omission flaws are born in the aftermath of decisions that were never made (e.g., ignoring a security requirement or potential threats). Experience-based, prototype-based, or even scenario-based methods can help the architect detect this type of flaw. Still, they mainly concern the requirement elicitation step, which is outside the scope of this research.
• Commission flaws refer to design decisions that were made and could lead to undesirable consequences. An example of such a flaw is "using weak cryptography for passwords" to achieve better performance while maintaining data confidentiality. DCAR is devised to support such a problem.
• Realization flaws are correct design decisions (i.e., they satisfy the software's security requirements) whose implementation suffers from coding mistakes. This can lead to many consequences, such as crashes or bypassed mechanisms. TARA and SA evaluation methods can mitigate these problems.
In industry, commission and omission flaws happen due to inexperienced decisions, while realization flaws are mostly ignored due to the cost of the detection methods. As a result, this research highlights the realization flaws.
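As a toy illustration of the code-analysis techniques (in the spirit of TARA) that can catch realization flaws cheaply, the following sketch flags calls to weak hash primitives: the design may correctly call for password hashing, but the implementation picks a broken algorithm. The pattern list and the snippet are hypothetical:

```python
import re

WEAK_HASHES = ("md5", "sha1")

def find_weak_hash_calls(source):
    """Flag lines that appear to hash data with MD5/SHA-1 via hashlib.

    A typical realization flaw: the design decision (hash passwords)
    is correct, but the implementation chose a broken primitive.
    Returns (line number, algorithm) pairs.
    """
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for algo in WEAK_HASHES:
            if re.search(rf"hashlib\.{algo}\s*\(", line):
                findings.append((lineno, algo))
    return findings

# Hypothetical snippet under review; sha256 is not flagged.
snippet = (
    "import hashlib\n"
    "digest = hashlib.md5(password.encode()).hexdigest()\n"
    "ok = hashlib.sha256(token).hexdigest()\n"
)
print(find_weak_hash_calls(snippet))
# -> [(2, 'md5')]
```
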
An SA model has properties such as performance or security. Typically, these properties are emergent, and it is more feasible to reason about emergent properties in simpler models than in complex ones. It is therefore necessary to simplify the model in order to isolate the problem and build confidence in what is known about the emergent properties [107].

Identified Features and Factors
The study was devised to identify the features and lightweight factors needed to build an evaluation framework. To this end, the study sought the practitioners' needs and the tendencies that researchers have paid attention to. Figure 3 shows the relationships and the basis for identifying the features and factors, and Table 6 lists the features and lightweight characteristics acquired from the study.

Figure 3 shows how this research applies text analysis and data mining to the comprehensive online definitions of SA and to the pool of papers published in the last three decades. In the next step, a comparison framework was defined, and six approaches were inspected deeply to find features and factors. First, two categories of SA definition were elicited from the online repository, and the keywords in practitioners' reports were found to differ from the researchers' keywords. The top practitioner keywords were "time, cost, distributed, and complexity", which means the practitioners needed a lightweight solution; "distributed" refers to the scope of the evaluation. Secondly, based on the 811 studies published from 1999 to 2016 on SA topics, "security and performance analysis, heterogeneity and distribution, and agility" were the most popular research topics. Similar to the practitioners' concerns, heterogeneity and distribution refer to the scope. Security and performance were selected as the targeted quality attributes (TQA) for evaluation, and agility matched one of the identified lightweight factors.
Component-based architecture, SOA, and pipe-and-filter styles were selected as the proper SA styles for the identified scope. As mentioned in Sections 3, 4 and 5.1, the Petri net formalism with its visual presentation, the hierarchical colored Petri net, was chosen to present SA due to its advantages over other SA presentation methods.
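To make the Petri net choice concrete, the following minimal sketch fires a transition in an ordinary place/transition net. This is a plain net, not a full hierarchical colored Petri net, and the place and transition names are hypothetical:

```python
def fire(marking, transition):
    """Fire a Petri net transition if it is enabled.

    marking: dict mapping place -> token count
    transition: (inputs, outputs), each a dict mapping place -> arc weight
    Returns the new marking, or None if the transition is not enabled.
    """
    inputs, outputs = transition
    if any(marking.get(place, 0) < weight for place, weight in inputs.items()):
        return None  # not enough tokens: transition disabled
    new = dict(marking)
    for place, weight in inputs.items():
        new[place] -= weight          # consume input tokens
    for place, weight in outputs.items():
        new[place] = new.get(place, 0) + weight  # produce output tokens
    return new

# Hypothetical fragment of an SA model: a request token moves from
# 'queued' to 'processing' when a worker token is available.
t_start = ({"queued": 1, "worker": 1}, {"processing": 1})
m0 = {"queued": 2, "worker": 1, "processing": 0}
print(fire(m0, t_start))
# -> {'queued': 1, 'worker': 0, 'processing': 1}
```

Repeatedly firing enabled transitions in this way simulates the dynamic behavior of the components, which is what performance estimation over a Petri net model relies on.
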
For the sake of the research concerns, 76 articles on SA evaluation topics were selected out of the 811 articles. Next, 27 SA evaluation methods were chosen for the review. These methods were reviewed based on two aspects: the technique and the popularity in industry. The comparison framework was defined to compare the evaluation methods. Then five lightweight methods were selected and compared with ATAM, and consequently three lightweight factors were identified. Moreover, (TQA1) performance and (TQA2) security were reviewed throughout the 27 evaluation methods. As mentioned throughout this study, the SA evaluation approaches are devised to help software architects make proper decisions. The architects intervene in many steps to strengthen the evaluation process, so SA evaluation tends to be more manual than automated. Architects' skills may impact the design and decision-making parts, which are outside the scope of this research.
Based on the identified features, factors, and comparison framework, the overall profile of a lightweight evaluation framework is described as follows. This study answered research question two by distinguishing the differences between practitioners' and researchers' perspectives on SA via the comparative text analysis of SA definitions and the systematic literature review of existing methods. Then, for the first research question, this exploratory research identified the features and characteristics that enable lightweight SA evaluations in industry. An evaluation framework can boost its adoption by detecting flaws and issues in SA's performance and security. An informal description of requirements, UML diagrams, and source code form the input of the framework. The framework works within the specific scope of distributed software with the mentioned SA styles. The procedure for stakeholders is the minimal process elicited from reviewing the lightweight solutions: a face-to-face meeting between the architect and internal/external reviewers who know SA. The tools and techniques should be investigated to ease the integration of features and factors.

Achievement and Results
In the first stage, the existing literature was reviewed to survey past research on SA evaluation methods and to identify the proposed framework's main features. In the second stage, the factors affecting lightweightness were identified. These factors improve the usability of the SA evaluation framework in industry.
In the first stage, the SA evaluation framework's features were identified based on the text analysis of researchers' and practitioners' SA definitions and all published studies for the last three decades on SA's topics. This analysis concluded that a lightweight SA evaluation solution was needed to uncover distributed and heterogeneous software's security and performance problems. Consequently, the security and performance analysis of SA were reviewed, and the proper SA presentation and styles for the distributed and heterogeneous software were identified.
In the second stage, the lightweight factors were identified from the weaknesses of the current state of the art in lightweight SA analysis methods. Indeed, the study tried to bridge the gap behind the low adoption of systematic SA evaluations in industry. First, it should be clear why the industry refrains from the SA evaluation methods proposed by academics. Accordingly, this study followed the two strands of academic and practitioner concerns: practitioners need an SA evaluation framework with specific industrial features to solve their current problems, while academics focus primarily on scientific issues and possible future issues. This mindset led us to analyze the online web repository of SA definitions. As a result, the main extracted feature in demand on both sides was a lightweight framework that can evaluate heterogeneous software systems from a performance and security perspective.
Moreover, the study conducted a systematic literature review of SA evaluation methods. As a result, the SA evaluation comparison framework was proposed as a basis for comparing SA evaluation methods. The literature was then narrowed down to the lightweight SA evaluation methods, and a total of six SA evaluation methods were studied deeply to identify the factors influencing the SA evaluation method.

Conclusions
Although SA evaluation methods are beneficial, they are not broadly applied in industry [108,109]. The selected SA evaluation methods were reviewed comprehensively. This research focuses on SA design and its evaluation and introduces a comparison framework for existing methods. The comparison between ATAM, as the pinnacle of heavyweight methods, and the five popular lightweight methods identified three main factors for lightweightness. A total of five different steps were taken to address this problem. First, the differences between academic and practitioner definitions of SA show that the industry needs a lightweight SA evaluation method. Second, it was noticed that SA research mainly focuses on "performance and security analysis". Finally, the main features and factors were identified. As a result, the literature review explored and categorized SA evaluation methods to understand the factors that hinder a lightweight SA evaluation method's success. The research suggests further investigation to find the proper tools and techniques to ease the integration of features and factors and boost the solution's usage in industry.