Extended Covering Arrays for Sequence Coverage



Introduction
With the development of computer technology and the increase in custom requirements, software systems are becoming more powerful and complex. The emergence of unexpected faults in such systems is inevitable. Once a system encounters a fault, it is likely to fail, that is, to operate with unexpected behavior. Testing is a necessary and significant means of system quality assurance during the product development life cycle [1]. A report released by NIST (National Institute of Standards and Technology) in 2002 stated that software system bugs cost the U.S. economy 59.5 billion dollars annually [2]. However, more than one third of this cost could be saved if better testing were performed [2]. A test method that can find more faults with fewer test cases is urgently needed.
Combinatorial testing (CT) [3,4,5,6] is an efficient testing method through which an optimal or near-optimal test suite with fewer test cases can be designed or generated. CT has proven to be an effective technique for detecting faults caused by interactions among configurations or factors in a given input space [7]. Empirical studies of system bugs suggest that CT is equivalent to exhaustive testing in a certain sense [8,9]. Although CT has been widely studied and used, there are still some situations and requirements to which combinatorial testing does not apply well, such as a system under test (SUT) whose test cases need to be performed contiguously. For thorough testing, the testing requirements of such an SUT are not only to cover all the interactions among factors, but also to cover all the value sequences of every factor. When CT is used to design a test suite for this SUT, only the interactions among factor values can be effectively covered to a certain level according to the t-way combinatorial criterion; there is no effective criterion for the value sequences of every factor [10]. For example, the "text effect" application of "Microsoft Word" has seven options for users to modify some highlighted text. These options are "subscript", "superscript", "strikethrough", "double strikethrough", "all caps", "small caps", and "shadow" [9]. The font-processing function within the application modifies the highlighted text on the screen according to the settings consisting of these options. When using CT, a test suite with several test cases can be generated to cover interactions among every t options. When the test suite is executed, the font-processing function modifies the text according to the test cases in sequence. However, the sequences of settings of every factor across contiguous test cases cannot be guaranteed to be tested at a certain level, such as the sequence from text with subscript to text without subscript for the "subscript" option with binary settings.
To address the sequence coverage requirement described above, a t-way sequence coverage criterion was proposed [11] based on the t-way combinatorial coverage criterion. The t-way sequence coverage criterion was first introduced for event sequence testing. For an SUT with n input events, each event can be input only once during a test. Each test case in the test suite covers $\binom{n}{t}$ subsequences of length t (0 < t ≤ n), and the t events covered in a subsequence do not have to be neighboring. A t-way sequence coverage test suite covers all $\binom{n}{t}\,t!$ subsequences that consist of t different events. The size of a t-way sequence coverage test suite is considerably smaller than that of the exhaustive test suite. Thus, a t-way sequence coverage test suite can replace the exhaustive test suite, which cannot be executed in practice. Another similar form of t-way sequence coverage is t-wise sequence coverage, which was presented by Kruse [12]. t-wise sequence coverage applies to SUTs with n inputs, each of which can appear more than once in a test. Additionally, the t inputs covered in a subsequence must be contiguous. The two types of sequence coverage criteria generally meet the sequence coverage requirement. Then, for SUTs whose test cases need to be performed contiguously, test suites can cover both the interactions among factor values and the value sequences of every factor by combining t-way combinatorial coverage and t-wise sequence coverage. The simplest way to combine them is to generate test suites separately and then integrate them into one large test suite. Although some coverage redundancies exist in the integrated test suite, it enables testing of SUTs whose test cases are performed contiguously.
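The counting argument for t-way sequence coverage can be checked with a small sketch. It is only an illustration under stated assumptions: the helper names `covered_t_sequences` and `sequence_coverage` and the four-event example are hypothetical and do not come from the cited work.

```python
from itertools import combinations, permutations
from math import comb


def covered_t_sequences(test, t):
    """Length-t subsequences of one test; the events need not be adjacent."""
    return set(combinations(test, t))


def sequence_coverage(suite, events, t):
    """Fraction of all orderings of t distinct events covered by the suite."""
    target = set(permutations(events, t))  # C(n, t) * t! orderings
    covered = set()
    for test in suite:
        covered |= covered_t_sequences(test, t)
    return len(covered & target) / len(target)


# Hypothetical SUT with four events, each used exactly once per test.
events = ["a", "b", "c", "d"]
suite = [("a", "b", "c", "d"), ("d", "c", "b", "a")]

# One test covers C(4, 3) = 4 subsequences of length 3.
assert len(covered_t_sequences(suite[0], 3)) == comb(4, 3)
print(sequence_coverage(suite, events, 2))  # both orders of every pair are covered
print(sequence_coverage(suite, events, 3))  # only 8 of the 24 orderings are covered
```

A test and its reverse already reach full 2-way sequence coverage, while higher strengths need more tests, which is why the suite size stays far below the n! tests of exhaustive testing.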
In practice, SUTs always involve constraints or dependencies in or among test cases [1], such as interaction {b_1, c_1} must not appear in a test case or value c_2 must not be input after c_1. If a test does not meet the constraints, the test is invalid. To automatically generate a valid test suite, the constraints should first be formally specified. Then, the formal specification of constraints can be used to direct the test suite generation. Although some formal specification methods have been used, such as linear temporal logic (LTL) and computation tree logic (CTL), the constraints among test cases or test steps have not been effectively specified, such as when the third input is c_1, the fifth input must be c_2. Because of the existence of constraints, when a t-way combinatorial coverage test suite is used alone, it may potentially violate the constraints among test cases. When a t-wise sequence coverage test suite is used alone, it may potentially violate the constraints in a test case. Hence, simply combining these two test suites into a large test suite is infeasible.
In this paper, we present work on extending covering arrays for sequence coverage, and we introduce extended covering arrays that can achieve both t-way combinatorial coverage and t-wise sequence coverage. Then, we propose a formal specification method for specifying constraints based on clocked computation tree logic (CCTL), which is an extension of CTL. The main contribution of this paper is to propose extended covering arrays with combinatorial coverage and sequence coverage for SUTs whose test cases are performed contiguously. This research has practical application value and will improve the efficiency of software testing. To evaluate the constructed test suites, a method for verifying the validity of constraints among test cases is presented corresponding to the specification method, and kernel functions are also introduced to measure the coverage of constructed test suites.
As Particle Swarm Optimization (PSO) is competitive in uniform and variable strength covering array generation [13], we propose the Particle swarm optimization based Extended covering array Generator (PEG) for constructing extended covering arrays in this paper. The performance of PEG is evaluated using several sets of benchmark experiments for some common constraints, and the feasibility and usefulness of PEG are validated.
The remainder of this paper is organized as follows. Section 2 reviews the theoretical background and methods for sequence coverage testing. Some relevant definitions are given in Section 3. Section 4 introduces extended covering arrays. Section 5 outlines the design and implementation of PEG, including its corresponding algorithms. Section 6 presents evaluation methods for verifying constraints' validity and measuring coverage. The results generated by PEG for 12 benchmark systems under test are presented in Section 7. Finally, Section 8 concludes the paper with a brief summary and a discussion of future research.

Related Work
Here, we review previous work on meeting sequence coverage requirements using combinatorial testing.
For SUTs with n input events, where each event occurs exactly once in a test, the CT-based testing is sequence-based t-way testing. Kuhn is perhaps the first person to apply sequence-based t-way testing. He proposed a "quick and dirty" (QnD) algorithm, which is based on a greedy algorithm and likely has room for improvement [14], and he then presented Sequence Covering Arrays (SCAs) according to covering arrays [15]. Subsequently, he proposed a modified greedy algorithm that can handle the constraints between event pairs [16], and he used SCAs to test labeled transition systems [17]. He reported several algorithms for generating SCAs with the proposed t-way sequence coverage criterion. These algorithms represent the first effort to systematically explore possible strategies for solving the problem of t-way test sequence generation in a general context. Zamli discussed sequence-based fixed and variable strength testing as an extension of existing t-way strategies and noted that there is clearly room for improvement, particularly for t-way sequence coverage testing [11]. Then, a sequence-based t-way interaction testing strategy using the bees algorithm was presented by Zamli [18]. The proposed algorithm shows a promising result when compared to QnD in an experiment. A method for t-way event-driven test suite generation based on simulated annealing, called the t-way Event-Driven Input Sequence Test Case (EDISTC-SA) generator, was presented by Rahman [19]. Farchi defined a test as an ordered tuple of input parameter values and introduced the ordered constraints and the ordered interaction coverage criteria [10]. Then, an efficient algorithm for generating test suites with minimum sizes that satisfy the ordered interaction coverage criteria was proposed and evaluated on several real-life systems. SCAs are of practical value in testing. As exhaustive testing always consists of an enormous number of tests, SCAs can reduce the cost of testing by decreasing the number of tests.
An innovative approach that combines model-based testing and combinatorial testing to design executable and feasible test sequences was described by Nguyen [20]. The approach starts from a finite state model, and based on the model, it generates executable paths that represent sequences of events to be executed against the SUT. Then, these paths are transformed to the equivalence classes of a classification tree [21]. The first-level child classifications of the root node of the tree represent the events, and the classes of a classification are the optional values of the corresponding event. Finally, the executable test cases corresponding to an executable path are generated from the classification tree using t-way testing. The classification tree method [21], which is a model-based black-box test design technique [22], was proposed based on equivalence partitioning, and it is always used for systematic test design and description of test cases. Mature products that use the classification tree method to design test sequences are TESTONA [23] and TESSY [24,25]. Using the classification tree method, the input domain of an SUT is regarded under various aspects assessed as equivalent by the tester. For every aspect, disjoint and complete classifications are partitioned. The subsequent partition of every aspect through classifications is a graphical representation in the form of a tree [21]. Classes, which are disjoint abstractions of individual input levels for test purposes [26], derived from these classifications may be further classified, even recursively [21]. The tree is the head of the combination table corresponding to a test suite, and test cases are the body of the combination table. Test cases are constituted by combining classes from different classifications and correspond to test steps of the testing task. During the test run, test cases are generally executed in sequential order. The testing design of the classification tree method has been widely used for embedded systems [12], embedded automotive systems [12,22,26,27,28] and web applications [29] in terms of functional requirements. In addition, "Modbat" and Microsoft's "Spec Explorer" are also model-based test case generation tools. "Modbat" can generate test cases that cover state transitions [30]. Microsoft's "Spec Explorer" can generate automated test cases by running traversal techniques to achieve a form of transition coverage and enable testers to find violations of the requirements with a minimum of manual effort [31].
Generally, there are constraints among test steps in real systems [12,32,33]. If a test suite contains test steps that violate the constraints, the test suite is invalid for testing. Thus, it is very important to specify constraints and generate valid test suites based on the specifications [12]. Schooljan described the linear temporal logic (LTL)-based formal specification method for dependency rules [34]. In his work, temporal logic expressions are used to validate each test step of a test sequence. The specification of dependency rules cannot describe the constraints within test steps subjected to LTL. A similar work was conducted by Fraser, in which constraints were presented by computation tree logic (CTL) [33]. Using LTL and CTL, the constraints among test cases were specified, whereas the test cases involving constraints are restricted to neighboring test cases. Krupp and Müller proposed an innovative application of clocked CTL (CCTL) logic to describe the constraints in real-time systems [35]. The corresponding proposed model checker can verify the validity of test sequences by combining I/O interval specifications and CCTL expressions. Some coverage requirements similar to the t-way testing criteria were proposed by Kruse [1,12], such as state coverage, transition coverage and state pair coverage. The state coverage is similar to the 1-wise sequence coverage, which needs all the states to be covered at least once, and constraints among test cases involving the order of states in every factor need to be avoided. The transition coverage is 2-wise sequence coverage, which needs all the transitions of states of every factor to be covered at least once. The state pair coverage is similar to the 2-way combinatorial coverage, which requires all the interactions between two factors to be covered at least once, and the constraints among test cases involving the order of states in every factor need to be avoided. Then, three algorithms to generate state coverage and transition coverage test cases were proposed [10].

Background
Before we introduce extended covering arrays, we present the existing covering arrays. When using CT, the first step is to develop the input space model of the SUT [36]. The term "input" is used here in a general sense; any factor that can have an influence on the behavior of the SUT and that can be kept under control is considered to be an "input" [36]. The input space model implicitly defines the SUT's valid input space [37]. Given that an SUT has k input factors $P_1, P_2, \dots, P_k$, and factor $P_i$ ($1 \le i \le k$) has $v_i$ values or levels, the input space model of the SUT can be represented as $M = \langle P, V \rangle$, where P is the set of factors and V is the set of their values. CT approaches systematically extract and produce a set of configurations that will be run in the testing phase from the input space model [36]. A set of configurations is called a test suite, and a test suite in which each valid combination among factor values corresponding to t different factors appears at least once is called a t-way covering array.
Definition 1. Given a set $I = \{(P_{i_1}, a_{i_1}), (P_{i_2}, a_{i_2}), \dots, (P_{i_t}, a_{i_t})\}$ with $i_j \in [1, k]$ ($j = 1, 2, \dots, t$), if $P_{i_j}$ belongs to the set of factors P with $|P| = k$ and $a_{i_j}$ belongs to $[0, v_{i_j} - 1]$, the set I is defined as a t-way interaction to be covered [38].
We use the set $H_t = \{I \mid I = \{(P_{i_1}, a_{i_1}), (P_{i_2}, a_{i_2}), \dots, (P_{i_t}, a_{i_t})\}\}$ to denote all the t-way interactions to be covered [38,39].
Given a test case $T = \{t_1, t_2, \dots, t_k\}$ and an interaction $I = \{(P_{i_1}, a_{i_1}), (P_{i_2}, a_{i_2}), \dots, (P_{i_t}, a_{i_t})\}$, if $T[i_j] = t_{i_j} = a_{i_j}$ holds for every j, then we say that T covers I, denoted as $T \supseteq I$. We use $H_{T,t} = \{I \mid I = \{(P_{i_1}, a_{i_1}), (P_{i_2}, a_{i_2}), \dots, (P_{i_t}, a_{i_t})\} \wedge T \supseteq I\}$ to denote all the interactions covered by T.
Definition 2. CA(n; t, k, v) is a covering array, an $n \times k$ array [38,40], such that every column i only has elements from the set $[0, v_i - 1]$ and every possible t-way interaction $I = \{(P_{i_1}, a_{i_1}), (P_{i_2}, a_{i_2}), \dots, (P_{i_t}, a_{i_t})\}$ is covered by at least one row.
Definition 3. When the factors have different numbers of values, a mixed-level covering array is denoted as $MCA(n; t, v_1^{y_1} v_2^{y_2} \cdots v_u^{y_u})$, which indicates that there are $y_1$ parameters with $v_1$ values, $y_2$ parameters with $v_2$ values, and so forth. It is clear that $\sum_{1 \le l \le u} y_l = k$. When the sizes of all the value sets are the same, the covering array is denoted as CA(n; t, k, v) or $CA(n; t, v^k)$.
A single set of configurations consists of one value from every factor. However, not all combinations of factor values may be valid, as some constraints related to certain factor values exist. Once a single set of configurations contains these values, the single set is invalid. Such constraints are often caused by logical relationships among factors. For a calendar example, if factor "month" takes February, then factor "date" must take no more than 29. To guarantee that a test suite avoids all the invalid combinations, the constraints must be modeled and specified.
Definition 4. CCA(n; t, k, v, F) is a constraint covering array, where a new variable F, the set of forbidden interactions, is introduced to represent the set of constraints [40]. Each constraint interaction is written as $I = (a_1, a_2, \dots, a_k)$ with $a_i \in [0, v_i - 1] \cup \{x\}$, where x denotes the "do not care" values. The constraints are often called forbidden tuples [41,42] or forbidden edges [43].

When the k factors have different numbers of values, constraint mixed-level covering arrays (CMCAs) are used.
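The definition of a constraint covering array can be turned into a small checker. The sketch below is a minimal illustration rather than the construction used in this paper: the function names are hypothetical, the required interactions are found by brute-force enumeration of all valid rows, and the 7-row array is a hand-made example for three factors with 2, 2 and 3 values and the forbidden tuple (0, 0, x), not a reproduction of any table in the paper.

```python
from itertools import combinations, product


def violates(row, forbidden):
    """True if the row matches a forbidden tuple ('x' means "do not care")."""
    return any(all(f == "x" or f == r for f, r in zip(tup, row)) for tup in forbidden)


def required_interactions(levels, t, forbidden):
    """All t-way interactions that occur in at least one valid row (brute force)."""
    required = set()
    for row in product(*(range(v) for v in levels)):
        if not violates(row, forbidden):
            for cols in combinations(range(len(levels)), t):
                required.add(tuple((c, row[c]) for c in cols))
    return required


def is_constraint_covering_array(array, levels, t, forbidden):
    """Every row is valid and every required t-way interaction is covered."""
    if any(violates(row, forbidden) for row in array):
        return False
    covered = {
        tuple((c, row[c]) for c in cols)
        for row in array
        for cols in combinations(range(len(levels)), t)
    }
    return required_interactions(levels, t, forbidden) <= covered


levels = [2, 2, 3]          # factors A, B, C
forbidden = [(0, 0, "x")]   # A = 0 together with B = 0 is not allowed
array = [(0, 1, 0), (0, 1, 1), (0, 1, 2), (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0)]

print(len(required_interactions(levels, 2, forbidden)))          # 3 + 6 + 6 valid pairs
print(is_constraint_covering_array(array, levels, 2, forbidden))
```

Enumerating all rows is exponential in the number of factors, so this is only suitable for checking small examples by hand.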
For example, an SUT has three factors A, B and

Extended Covering Arrays
In this section, we introduce extended covering arrays.

ECAs
When $CMCA(7; 2, 2^2 3^1, F = (0, 0, x))$ shown in Table 1 is used to test an SUT contiguously, 2-way combinations are covered. If the SUT has a sequence coverage requirement, then the two-value sequences of each factor are not completely covered. For example, in the test sequence {0 → 0 → 0 → 1 → 1 → 1 → 1} of factor A, only three two-value sequences {0 → 0}, {0 → 1}, and {1 → 1} are covered. If we use a 2-wise sequence coverage test suite as shown in Table 2, although all the sequences of two values of each factor are covered, some value combinations are not covered, such as (A = 1, B = 1) and (B = 0, C = 2). If both sequence coverage and combinatorial coverage are required, then the only way to satisfy this requirement with these two suites is to combine them into one test suite. However, combining them may lead to some redundancies and a larger size. To design smaller test suites that meet both sequence coverage and combinatorial coverage, extended covering arrays are defined.
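Definition 5. An extended mixed-level covering array $EMCA(n; t_c, t_s, v_1^{y_1} v_2^{y_2} \cdots v_u^{y_u})$ is an $n \times k$ array, where $t_c$ is the combinatorial coverage strength and $t_s$ is the sequence coverage strength, such that every column i only has elements from the set $[0, v_i - 1]$ and the following two conditions are met: (1) every $t_c$-way interaction is covered by at least one row; and (2) for every factor, every value sequence of length $t_s$ appears in $t_s$ contiguous rows of the corresponding column. When the sizes of all the value sets are the same, we denote the extended covering array as $ECA(n; t_c, t_s, k, v)$ or $ECA(n; t_c, t_s, v^k)$.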
For the example presented above, an EMCA is shown in Table 3. The extended covering array covers all combinatorial pairs and all two-value sequences with a size of only 10.
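The two coverage requirements of an extended covering array can be verified column-wise and row-wise, as the following sketch shows. The function names are hypothetical, and the 10-row array is a hand-constructed example for factors with 2, 2 and 3 values; it is not the array of Table 3 and it assumes no constraints.

```python
from itertools import combinations, product


def covers_combinations(array, levels, tc):
    """Every tc-way value combination appears in at least one row."""
    for cols in combinations(range(len(levels)), tc):
        required = set(product(*(range(levels[c]) for c in cols)))
        seen = {tuple(row[c] for c in cols) for row in array}
        if not required <= seen:
            return False
    return True


def covers_sequences(array, levels, ts):
    """Every length-ts value sequence of each factor appears in contiguous rows."""
    for c, v in enumerate(levels):
        column = [row[c] for row in array]
        seen = {tuple(column[i:i + ts]) for i in range(len(column) - ts + 1)}
        if not set(product(range(v), repeat=ts)) <= seen:
            return False
    return True


levels = [2, 2, 3]  # factors A, B, C
array = [
    (0, 0, 0), (0, 1, 0), (1, 1, 1), (1, 0, 1), (0, 0, 2),
    (1, 1, 2), (1, 1, 0), (0, 0, 2), (0, 0, 1), (1, 1, 0),
]

print(covers_combinations(array, levels, 2))
print(covers_sequences(array, levels, 2))
```

Factor C alone has nine two-value sequences, and n contiguous rows contain only n − 1 of them per column, so ten rows is the smallest size that can satisfy the 2-wise sequence requirement for this example.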

ECCAs
In real systems, in addition to the constraints among factor values, there are constraints that involve sequences of factor values. Such sequences of factor values are also often identified by logically specifying constraints. These two types of constraints must be respected in final test suites; otherwise, the test suites may be invalid. Thus, the specification and model of the constraints are the preconditions for designing and generating test suites. On the one hand, the specification of constraints can help to avoid constraint violations in the process of constructing test suites. On the other hand, the specification of constraints can help to verify whether the designed test suites are valid. We define the extended constraint covering array as $ECCA(n; t_c, t_s, k, v, C)$. As there exist constraints that involve value sequences of each factor, some interactions of factor values may not appear in one test case in practice. See Section 5 for details. This produces a conceptual difference between ECCA and ECA. In other words, an ECCA may violate the first condition in Definition 5. Therefore, we modify the first condition of ECA to fit ECCA. The modified condition is that an ECCA should cover as many t-way interactions as possible. The t-way interactions covered in ECCAs can be measured with the method described in Subsection 6.2. When the k factors have different numbers of values, extended constraint mixed-level covering arrays (ECMCAs) are used. ECCAs have the same expression forms as ECAs.
The set C is the set of constraint statements specified by CCTL. CCTL is a variant of timed CTL based on I/O-interval structures introduced by Ruf [44,45] in the context of the real-time system model checker. The I/O-interval structures are used in state transition systems to express time annotations. A time annotation is a constraint that assigns a [min, max]-time interval to a state transition [44].
[min, max]-time intervals help CCTL to model the time at which state transitions occur more precisely than LTL and CTL [44]. For this reason, [min, max]-time intervals can also support a precise description of the temporal relationship among inputs based on time steps. This is the main benefit of CCTL. Thus, we use CCTL to model the constraint relationship.
In the CCTL syntax [46], AP is an atomic proposition and $m, n \in N^+$ are time bounds with $m \le n$. ¬, ∧, →, ⊕, ∨ and ⊗ are the classical logic operators included in the CCTL syntax. φ is the CCTL formula. X, F, U and G are the temporal operators, where X is the "next" operator, F is the "finally" operator, G is the "always" operator, and U is the "until" operator. A and E are path quantifiers. A "path" is an infinite sequence of states, denoted as ρ. If a CCTL formula φ is true in path ρ, then we write ρ |= φ. Because there are potentially many paths in a system, E means that "at least one path exists that satisfies the temporal operator", and A means that "all paths satisfy the temporal operator". In testing, the generated ECCA is equivalent to a path because it should satisfy the constraints presented by CCTL formulas, denoted as ECCA |= φ. As at least one test suite exists that is used for testing, we use the path quantifier E in this paper to specify constraints. Depending on whether a constraint belongs to a single test case or involves several test cases, the specification of constraints is divided into the following two parts.

The Specification of Constraints in a Test Case
The constraints in a test case indicate the restrictions among factor values. These types of constraints correspond to the traditional covering array. The constraints forbid the appearance of certain value combinations that are composed of different factors. Because generating a test case that avoids all the constraints is a Boolean satisfiability problem [47], a formal specification is needed in the process of automatically generating test cases. Generally, the constraints are represented in conjunctive normal form (CNF) [47].
For a forbidden tuple, the corresponding clause states that at least one of its factor values must not appear in the test case. In addition, there are always two types of internal constraints for an SUT that limit each test case to having one and only one value from every factor. They are the at-least and at-most constraints [47]. The at-least constraints are needed to ensure that there is no less than one value of each factor in a test case, and the at-most constraints are needed to ensure that no more than one value is assigned to a factor in a test case.

• at-most constraints: for each factor $P_i$ with its value set $[0, v_i - 1]$ and a test case $T = (t_1, t_2, \dots, t_k)$, for all $a_{i_m}, a_{i_n} \in [0, v_i - 1]$ with $a_{i_m} \ne a_{i_n}$, the at-most constraints require that $P_i$ is not assigned both $a_{i_m}$ and $a_{i_n}$.
• at-least constraints: there must be at least one value of each factor appearing in a test case.
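The at-least, at-most and forbidden-tuple constraints can be written as CNF clauses in the integer-literal style that SAT solvers commonly accept. The sketch below is one possible encoding under assumptions: the variable numbering in `var` and all function names are hypothetical and are not taken from [47] or from PEG.

```python
from itertools import combinations


def var(levels, i, a):
    """1-based Boolean variable meaning 'factor i takes value a' (assumed numbering)."""
    return sum(levels[:i]) + a + 1


def at_least_clauses(levels):
    """One clause per factor: at least one of its values is selected."""
    return [[var(levels, i, a) for a in range(v)] for i, v in enumerate(levels)]


def at_most_clauses(levels):
    """One clause per pair of distinct values of a factor: not both are selected."""
    return [
        [-var(levels, i, a), -var(levels, i, b)]
        for i, v in enumerate(levels)
        for a, b in combinations(range(v), 2)
    ]


def forbidden_clause(levels, tup):
    """At least one value of the forbidden tuple is absent ('x' positions are skipped)."""
    return [-var(levels, i, a) for i, a in enumerate(tup) if a != "x"]


levels = [2, 2, 3]  # hypothetical SUT with factors A, B, C
print(at_least_clauses(levels))   # [[1, 2], [3, 4], [5, 6, 7]]
print(at_most_clauses(levels))    # [[-1, -2], [-3, -4], [-5, -6], [-5, -7], [-6, -7]]
print(forbidden_clause(levels, (0, 0, "x")))  # [-1, -3]
```

A negative integer denotes a negated literal, so the clause [-1, -3] reads "not (A = 0) or not (B = 0)", which excludes the forbidden tuple (0, 0, x).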

The Specification of Constraints among Test Cases
The constraints involving value sequences of factors are constraints among test cases or test steps. The constraints are generally divided into absolute constraints and relative constraints according to whether they contain the logical operator "→". Absolute constraints do not contain the logical operator "→", whereas relative constraints do. The parts of a relative constraint before and after "→" are composed of absolute constraints expressed by EX, EG, EU and EF. A relative constraint means that when the condition before "→" holds, the condition after "→" must also hold.
Each temporal operator can represent a type of absolute constraint. A formal description of the temporal operators is presented below. Given $ECCA(n; t_c, t_s, k, v, C)$ and $\forall u, w \in N^+$ with $0 < u \le w \le n$: We always omit "ECCA |=" in the specification of constraints. The constraints specified above are the absolute constraints.
Then, we introduce the relative constraints. We classify the relative constraints according to the symbol in front of "→". We define that when the constraint has the form of $EX_{[u]}$ in front of other temporal operators, the (u + 1)th test case is regarded as the first test case for the following representations. Given $ECCA(n; t_c, t_s, k, v, C)$, $\forall x, y, z, u, w \in N^+$ with $x \le y$ and $u \le w$: (1) EX (2) EG At least one l exists with $x \le l \le y$: ECCA[l, :] |= ψ with x > w.

A Case Study
Two real SUTs are analyzed in this subsection. The first real system under test is the "VideoGame", which is described in Ref. [1]. "VideoGame" has two factors: one is "startingGame" and the other is "Pause". Factor "startingGame" has four levels, namely "startingGame", "startup", "controlling" and "gameOver", and factor "Pause" has two levels, namely "running" and "paused". For factor "startingGame", the "startingGame" level can transit to itself and "startup", the "startup" level can transit to itself and "controlling", the "controlling" level can transit to itself and "gameOver", and the "gameOver" level can transit to itself and "startingGame". For factor "Pause", the "running" level can only transit to "paused" and the "paused" level can only transit to "running". The limitations that prevent levels from transiting to others are constraints. Combined with the at-most constraints and at-least constraints, the constraints of "VideoGame" are obtained. Thus, the $ECMCA(9; 2, 2, 4^1 2^1, C)$ of the "VideoGame" can be constructed as shown in Table 4 and its constraint set C is as follows. The ECMCA has covered all the eight possible level combinations and ten level transitions.
The second real system under test is the "text effect" application of "Microsoft Word", which has seven options for users to modify some highlighted text. These options are "subscript", "superscript", "strikethrough", "double strikethrough", "all caps", "small caps", and "shadow" [9]. Each option has two optional values: one is "true" and the other is "false". Three value combination constraints exist, and they are (subscript = true, superscript = true), (strikethrough = true, double strikethrough = true) and (all caps = true, small caps = true). The font-processing function within the application modifies the highlighted text on the screen according to the settings consisting of these options. When tests are executed, the font-processing function modifies the text according to the test cases in sequence. Thus, the $ECMCA(9; 2, 2, 2^7, C)$ of the "text effect" application can be constructed as shown in Table 5 and its constraint set C is as follows. The ECMCA has covered all the 81 possible value combinations and 28 value transitions.
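The coverage targets quoted for the two case studies can be recomputed from the descriptions above. The following sketch is only a sanity check; the dictionary layout and function names are hypothetical, and the pair count for the "text effect" application assumes that the three forbidden pairs are the only excluded 2-way combinations.

```python
from math import comb

# Allowed level transitions of "VideoGame", as listed in the text.
video_game = {
    "startingGame": {
        "startingGame": ["startingGame", "startup"],
        "startup": ["startup", "controlling"],
        "controlling": ["controlling", "gameOver"],
        "gameOver": ["gameOver", "startingGame"],
    },
    "Pause": {
        "running": ["paused"],
        "paused": ["running"],
    },
}


def count_level_combinations(model):
    """Number of value combinations over all factors (no combination constraints)."""
    total = 1
    for levels in model.values():
        total *= len(levels)
    return total


def count_transitions(model):
    """Number of allowed two-value sequences summed over all factors."""
    return sum(len(targets) for levels in model.values() for targets in levels.values())


def count_binary_pairs(num_options, num_forbidden_pairs):
    """2-way combinations of binary options minus the forbidden pairs."""
    return comb(num_options, 2) * 2 * 2 - num_forbidden_pairs


print(count_level_combinations(video_game))  # 8 level combinations
print(count_transitions(video_game))         # 10 level transitions
print(count_binary_pairs(7, 3))              # 81 value combinations for "text effect"
print(7 * 2 ** 2)                            # 28 value transitions for "text effect"
```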

Particle Swarm Optimization
Particle Swarm Optimization (PSO) was originally put forward by Kennedy [48] in 1995 as an optimization technique inspired by the swarm behavior of birds. The swarm of particles always moves towards the optimal position in the process of optimization. The position $X_i^t = (x_{i1}^t, x_{i2}^t, \dots, x_{ik}^t)$ of a particle indicates the solution under optimization. The velocity $V_i^t = (v_{i1}^t, v_{i2}^t, \dots, v_{ik}^t)$ of a particle indicates the tendency of evolution and the degree of variation. The fitness factor value of a particle indicates the degree of optimization. Each particle remembers the coordinates in the solution space where it has found its best solution so far, which is called pBest. In addition to pBest, particles track the overall best value obtained by any particle in the population, called gBest. The process of evolution continues until the iteration limit is reached. Because the parameters take discrete values, we adopt the discrete version of PSO (DPSO), which has been used in covering array generation [13]. The particle $X_i^t$ updates its coordinate $x_{ij}^t$ according to Equations (2) and (3):
$$v_{ij}^{t+1} = \omega v_{ij}^{t} + c_1 r_1 \left(pBest_{ij}^{t} - x_{ij}^{t}\right) + c_2 r_2 \left(gBest_{j}^{t} - x_{ij}^{t}\right) \tag{2}$$
$$x_{ij}^{t+1} = x_{ij}^{t} + v_{ij}^{t+1} \tag{3}$$
where t is the current iteration number, j is the component of the dimension k, i is the particle index, $(c_1, c_2)$ are the acceleration coefficients that adjust the weight between components, ω is the inertia weight in the range (0, 1), and $(r_1, r_2)$ are two random factors in the range (0, 1). According to the equations above, each particle updates its velocity by following its pBest and gBest in order to produce a movement towards a better region in the search space.
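A single particle update in the standard PSO form can be sketched as follows. This is a minimal illustration, not the PEG implementation: the function name, the default coefficient values, the rounding of positions to integers and the random re-sampling of out-of-range components are all assumptions made for the example.

```python
import random


def update_particle(x, v, pbest, gbest, levels, omega=0.9, c1=1.3, c2=1.3, rng=random):
    """Return the new (position, velocity) of one particle.

    x, v, pbest, gbest are lists of length k; levels[j] is the number of values
    of factor j. Coefficient defaults are illustrative assumptions.
    """
    new_x, new_v = [], []
    for j, vj in enumerate(levels):
        r1, r2 = rng.random(), rng.random()
        # Velocity update: inertia + pull towards pBest + pull towards gBest.
        vel = omega * v[j] + c1 * r1 * (pbest[j] - x[j]) + c2 * r2 * (gbest[j] - x[j])
        # Position update, rounded because factor values are discrete (assumed handling).
        pos = round(x[j] + vel)
        if not 0 <= pos < vj:
            pos = rng.randrange(vj)  # assumed repair: re-sample inside the value range
        new_x.append(pos)
        new_v.append(vel)
    return new_x, new_v


levels = [2, 2, 3]
rng = random.Random(0)
position, velocity = update_particle([0, 1, 2], [1, -1, 0], [1, 1, 0], [1, 0, 1], levels, rng=rng)
print(position, velocity)
```

A particle that already sits on both pBest and gBest with zero velocity stays where it is, which matches the intuition that the two attraction terms vanish at the best known position.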

The Interaction and Constraint Maps Generation Algorithm
In PEG, fitness factor values that indicate the degree of optimization are used to choose the best particles. Hence, effective data structures are necessary to compute the fitness factor values of particles conveniently. We propose Combinatorial Interaction Maps (CIMs) and Sequence Interaction Maps (SIMs) as the structures that are used to choose the best particles. Similarly, effective data structures are also necessary to verify the constraint validity of each particle. We propose Combinatorial Constraint Maps (CCMs) and Sequence Constraint Maps (SCMs) as the structures that are used to check constraint validity. The "Map" here is an associative container in C++ that stores elements formed by a combination of a key value and a mapped value, following a specific order. Algorithm 1 shows the corresponding generation algorithm. The algorithm receives $t_c$, $t_s$, $v_1, v_2, \dots, v_k$ and C as inputs, and generates CCM, CIM, SCM and SIM one by one.

Algorithm 1. The interaction and constraint maps generation algorithm.
Here, the at-most and at-least constraints are omitted, as they are only used for constraint validity verification. The keys of the four maps are factors or factor combinations, and the values mapped to the keys are sets of factor values. The values in combinatorial maps represent factor value combinations, whereas the values in sequence maps represent factor value sequences. In the algorithm, the CCM is generated first. Then, all the value combinations of each set of $t_c$ factors are generated in CIM. When CIM is generated, each factor value combination that appears in CCM must be removed from it. For example, if the value combination {A = 0, B = 0} is a constraint in CCM, it must not appear in CIM. The generation process of SCM and SIM is similar to that of CCM and CIM. The SCM is generated first. Then, all the value sequences with length $t_s$ of each factor are generated in SIM. When SIM is generated, each value sequence that appears in SCM must be removed from it.
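The generation order described above can be sketched with Python dictionaries standing in for the C++ maps. This is an illustration under assumptions, not Algorithm 1 itself: the function name, the input format of the constraints, and the restriction that each combinatorial constraint involves exactly t_c factors are all hypothetical simplifications.

```python
from itertools import combinations, product


def build_maps(levels, tc, ts, forbidden_combos, forbidden_seqs):
    """Build CCM, CIM, SCM and SIM.

    forbidden_combos: {factor-index tuple: set of forbidden value tuples}
    forbidden_seqs:   {factor index: set of forbidden value sequences}
    """
    # CCM first, then CIM with the constrained combinations removed.
    ccm = {cols: set(vals) for cols, vals in forbidden_combos.items()}
    cim = {}
    for cols in combinations(range(len(levels)), tc):
        all_values = set(product(*(range(levels[c]) for c in cols)))
        cim[cols] = all_values - ccm.get(cols, set())
    # SCM next, then SIM with the constrained sequences removed.
    scm = {f: set(seqs) for f, seqs in forbidden_seqs.items()}
    sim = {
        f: set(product(range(v), repeat=ts)) - scm.get(f, set())
        for f, v in enumerate(levels)
    }
    return ccm, cim, scm, sim


# Hypothetical SUT: factors A, B, C with 2, 2 and 3 values; {A = 0, B = 0} is
# forbidden, and factor C must not go from value 0 directly to value 2.
ccm, cim, scm, sim = build_maps(
    levels=[2, 2, 3], tc=2, ts=2,
    forbidden_combos={(0, 1): {(0, 0)}},
    forbidden_seqs={2: {(0, 2)}},
)
print({cols: len(vals) for cols, vals in cim.items()})  # {(0, 1): 3, (0, 2): 6, (1, 2): 6}
print({f: len(seqs) for f, seqs in sim.items()})        # {0: 4, 1: 4, 2: 8}
```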

The ECMCA Generation Algorithm
The ECMCA generation algorithm is performed immediately after the generation of CCM, CIM, SCM and SIM. CIM and SIM are essential for computing fitness factor values. CCM and SCM are essential for checking the constraint validity of test cases. The main idea of the algorithm is to make full use of the ability of PSO to seek good solutions. As constraints that are related to sequences are usually very complex, this algorithm supports the three most common kinds of constraints: (1) initial value constraints of factors that are presented by $EX_{[1]}$; (2) combinatorial value constraints among factors; and (3) value transition constraints of any factor. The algorithm is shown as Algorithm 2.
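How CIM and SIM feed into a fitness factor value can be illustrated with a short sketch. It assumes that both maps hold only the combinations and sequences that are still uncovered, and that a candidate's value sequences are formed with the previously generated t_s − 1 test cases; the function name and the example data are hypothetical.

```python
from itertools import combinations, product


def fitness(candidate, previous_rows, cim, sim, ts):
    """Count the uncovered combinations and sequences that the candidate would cover."""
    # Combinations covered by the candidate row itself.
    score = sum(
        1 for cols, remaining in cim.items()
        if tuple(candidate[c] for c in cols) in remaining
    )
    # Sequences formed with the last ts - 1 generated rows, if enough rows exist.
    if len(previous_rows) >= ts - 1:
        tail = previous_rows[len(previous_rows) - (ts - 1):]
        for f, remaining in sim.items():
            seq = tuple(row[f] for row in tail) + (candidate[f],)
            if seq in remaining:
                score += 1
    return score


levels = [2, 2, 3]
# Uncovered 2-way combinations, with the hypothetical constraint {A = 0, B = 0} removed.
cim = {
    cols: set(product(*(range(levels[c]) for c in cols)))
    for cols in combinations(range(3), 2)
}
cim[(0, 1)].discard((0, 0))
# Uncovered two-value sequences of every factor.
sim = {f: set(product(range(v), repeat=2)) for f, v in enumerate(levels)}

print(fitness((0, 1, 0), [], cim, sim, ts=2))           # 3 combinations, no sequence yet
print(fitness((0, 1, 0), [(1, 0, 0)], cim, sim, ts=2))  # 3 combinations + 3 sequences
```

In a full generator, the items covered by the selected test case would then be removed from both maps before the next test case is searched for.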

Algorithm 2
The ECMCA generation algorithm.

end if
25: end while
The algorithm is explained in the following aspects.

(1) Processing of the initial value constraint. When an SUT has an initial value constraint expressed with EX[1], an initial test case in which each factor is assigned its initial value is generated. Then, the factor value combinations and value sequences of factors covered in this test case are removed from CIM and SIM in line 3.

(2) Population initialization. The ECMCA generation algorithm is initialized by generating a random position and a random velocity for each particle in line 6. The position of each particle takes the form of a k-dimensional vector, $X_i^0 = (x_{i1}^0, x_{i2}^0, \ldots, x_{ik}^0)$, where each dimension $x_{ij}^0$ is a random integer from the value set $[0, v_j - 1]$. The velocity of each particle also takes the form of a k-dimensional vector, $V_i^0 = (v_{i1}^0, v_{i2}^0, \ldots, v_{ik}^0)$, where each dimension $v_{ij}^0$ is initialized with a random integer between $-(v_j - 1)$ and $(v_j - 1)$.

(3) Population update. During the iteration of the algorithm, the velocities and positions of the particles are updated in line 8 according to Equations 2 and 3. After each iteration, if a position component falls outside its value range, it is updated with a random value in the range. Each particle updates its pBest with the position at which it has had its largest fitness factor value thus far. The global best solution gBest is updated with the particle that has the larger fitness factor value and satisfies the constraints. We use the SAT solver zChaff to verify the combinatorial constraint validity. The SAT solver is initialized in line 1, and the at-most and at-least constraints are generated as its initialization parameters. The value transition constraints of factors are verified by comparison with SCM.

(4) Fitness factor value. Fitness factor values are used with PSO in a greedy fashion to identify better particles. The fitness factor value of a test case is the sum of the number of factor value combinations of size $t_c$ that it covers in CIM and the number of value sequences of each factor that it covers in SIM when combined with the $t_s - 1$ previously generated test cases.

(5) Population disturbance. To avoid being trapped in a local optimum, after each iteration, one random position in $X_i^t$ is updated with a random value in line 9. We denote the new particle as $X_i^{t*}$. If the fitness factor value of $X_i^{t*}$ is larger than that of $X_i^t$, $X_i^t$ is replaced with $X_i^{t*}$.

(6) Searching strategy. After the iteration, if the fitness factor value of gBest is ≤ 0, a searching strategy is applied in line 23. At this point, a large proportion of the factor value combinations are covered and some value sequences of factors may remain. To find the remaining value sequences of factors as early as possible, we present a searching strategy. In the searching strategy, a new test case is constructed one factor value at a time, so that it can cover as many value sequences of factors as possible or help the subsequent test case to cover them. The searching strategy is divided into two situations. The first situation is to judge whether each value of the last generated test case belongs to a value sequence that is uncovered in SIM.
If such a value sequence exists in SIM, the value must not be the last value in the sequence. Then, the subsequent value is inserted into the new test case: if the value a in the last generated test case belongs to an uncovered value sequence (a, b), the value b is inserted into the new test case. If the first situation does not apply, the second situation is handled. The schematic in Figure 2 illustrates it. The precondition of the second situation is that, for each factor, each value can reach the other values. Given a value $x_{ij}$ in the last generated test case, a traversal hierarchy is constructed based on the values that can be reached. Suppose that $(v_j - 1, v_j - 2)$ is the uncovered value sequence. A path $(x_{ij}, 0, v_j - 1)$ can reach $v_j - 1$, so the value 0 is inserted into the new test case. This guarantees that, if the fitness factor value of gBest is still not greater than 0 after the next iteration, the first situation of the searching strategy applies. Note that when a new value is inserted into the new test case, the test case must remain valid under the constraints; otherwise, a random value that satisfies the constraints is inserted. The searching strategy guarantees that at least one factor is updated in a direction that makes the fitness factor value of gBest greater than 0.
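The two situations of the searching strategy can be sketched for a single factor with $t_s = 2$ as follows. This is an illustrative reading of the description, not PEG's code: the function name `next_value` is hypothetical, the traversal hierarchy is realized as a breadth-first search, and the constraint check on the whole test case is left out.

```python
from collections import deque

def next_value(last, num_values, uncovered, forbidden=frozenset()):
    """Choose the next value of one factor (t_s = 2).

    last      -- value of this factor in the last generated test case
    uncovered -- set of uncovered value sequences (a, b) taken from SIM
    forbidden -- set of forbidden transitions (a, b) taken from SCM
    """
    # Situation 1: `last` starts an uncovered sequence (last, b), so insert b.
    for b in range(num_values):
        if (last, b) in uncovered:
            return b
    # Situation 2: walk the values reachable from `last` level by level and
    # take the first step of a shortest path to a value that starts an
    # uncovered sequence.
    starts = {a for a, _ in uncovered}
    first_step = {last: None}
    queue = deque([last])
    while queue:
        cur = queue.popleft()
        if cur in starts:
            return first_step[cur]
        for nxt in range(num_values):
            if nxt not in first_step and (cur, nxt) not in forbidden:
                first_step[nxt] = nxt if cur == last else first_step[cur]
                queue.append(nxt)
    return None  # nothing left to cover, or no uncovered start is reachable

# The only uncovered sequence is (3, 2); from value 1 only 0 is reachable.
step = next_value(1, 4, {(3, 2)}, forbidden={(1, 1), (1, 2), (1, 3)})
```

In this example `step` is 0, matching the path $(x_{ij}, 0, v_j - 1)$ in the text: the next test case moves to 0, the one after it can move to 3, and the first situation then covers (3, 2).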
(7) End condition. The algorithm has two end conditions, depending on whether SCM is empty. If SCM is empty, the algorithm terminates when all the value sequences of factors and all the factor value combinations are covered (lines 15 to 17). If SCM is not empty, the algorithm terminates as soon as all the value sequences of factors are covered (lines 19 to 21). The reason is that, owing to the constraints on value sequences, some factor value combinations may never appear in a test case. For example, suppose that an SUT has factors A and B, and each factor has the values 0, 1 and 2. The input value sequence of each factor is restricted to cycle from 0 to 2. Thus, only three value combinations exist: (A = 0, B = 0), (A = 1, B = 1) and (A = 2, B = 2). When $t_c = 2$, CIM has 3 × 3 = 9 value combinations and six of them can never be covered. Therefore, to avoid an infinite loop, the end condition is only SIM = ∅.
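The arithmetic of this example can be checked with a few lines of Python. The sketch assumes that both factors start at 0 and advance together through the cycle 0, 1, 2, which is how the example yields exactly three combinations; the helper name `visited_combinations` is hypothetical.

```python
from itertools import product

def visited_combinations(start, num_values, num_tests):
    """Value combinations seen when every factor cycles 0, 1, ..., v-1, 0, ..."""
    state, seen = tuple(start), set()
    for _ in range(num_tests):
        seen.add(state)
        state = tuple((v + 1) % num_values for v in state)
    return seen

reachable = visited_combinations((0, 0), num_values=3, num_tests=12)
all_pairs = set(product(range(3), repeat=2))
never_covered = all_pairs - reachable

print(sorted(reachable))   # [(0, 0), (1, 1), (2, 2)]
print(len(never_covered))  # 6
```

However many test cases are generated, the six remaining combinations stay in CIM, so a loop that waits for CIM to become empty would never terminate.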
Figure 2. A schematic of the second situation of the searching strategy.

Verification of Constraints
An important problem is the verification of constraints for constructed extended covering arrays. Only extended covering arrays that have been verified to be valid can be used. Corresponding to the two types of constraints, the verification is divided into two types: the verification of constraints in a test case and the verification of constraints among test cases.

Verification of Constraints in a Test Case
The constraints in a test case can be presented in conjunctive normal form or in CCTL, such as (¬subscript = true ∨ ¬superscript = true) and EG(¬subscript = true ∨ ¬superscript = true). The constraint verification is essentially the Boolean satisfiability problem, irrespective of which presentation is used. There are two types of tools, one for each presentation. For the conjunctive normal form, the verification tools include zChaff [49], Simple Theorem Prover (STP) [50], and MiniSAT [51]. For the CCTL form, the verification tools include Spin [52] and NuSMV [53]. The constraints on factor value combinations and the at-least and at-most constraints are all inputs of the tools. The tools return a "true" or "false" result for each test case at the end of the verification process.
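For a complete test case, checking a constraint in conjunctive normal form reduces to evaluating each clause, which the following sketch illustrates. It is a direct evaluation written for this example, not a call to zChaff or any other solver; the function name `satisfies_cnf` and the literal encoding `(factor, value, positive)` are hypothetical.

```python
def satisfies_cnf(test_case, clauses):
    """Evaluate a CNF constraint on a fully assigned test case.

    test_case -- dict mapping factor name to its value
    clauses   -- list of clauses; each clause is a list of literals
                 (factor, value, positive), where positive=False negates
                 the comparison factor == value
    """
    def literal_holds(factor, value, positive):
        return (test_case[factor] == value) == positive

    # Every clause needs at least one literal that holds.
    return all(any(literal_holds(*lit) for lit in clause) for clause in clauses)

# (not subscript = true) or (not superscript = true)
not_both = [[("subscript", True, False), ("superscript", True, False)]]

valid = satisfies_cnf({"subscript": True, "superscript": False}, not_both)
invalid = satisfies_cnf({"subscript": True, "superscript": True}, not_both)
```

Here `valid` is `True` and `invalid` is `False`. A SAT solver becomes necessary when only part of a test case is fixed and the remaining factors must still be assignable, which is the case during generation.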

Verification of Constraints among Test Cases
Although no effective tool is available for CCTL, some formulas can be used to direct the verification of constraints among test cases. Different absolute constraints call for different verification measures. Given $m, n \in \mathbb{N}^+$ with $m \le n$, the measures are as follows.
For a constraint of the form EG[m,n] φ: first, verify whether the first test case satisfies φ. Second, regard the next test case as the first test case, as EX does. Then, verify EG[m−1,n−1] φ by repeating the process above.
For a constraint that requires φ to hold at some step between m and n: verify whether there exists a positive integer i with m ≤ i ≤ n such that the ith test case satisfies φ.
To verify the relative constraints, the constraints need to be split into two parts at the logical operator "→". First, verify the condition on the left of "→". If the condition holds, verify the constraint on the right of "→". As soon as the ECMCA violates the constraints at any step of the verification process, the ECMCA is invalid.
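The checks among test cases can be sketched with two small helpers. This is an illustrative reading under stated assumptions, not a CCTL model checker: test cases are dicts, positions are 1-based, and the names `holds_within`, `first_violation`, `is_stop` and `next_not_fast` are hypothetical. The relative check splits a constraint of the form EG(left → EX right) into its two sides, as described above.

```python
def holds_within(tests, phi, m, n):
    """True if the i-th test case satisfies phi for some i with m <= i <= n."""
    return any(phi(t) for t in tests[m - 1:n])

def first_violation(tests, left, right_next):
    """Check EG(left -> EX right_next) over consecutive test cases.

    Returns the 0-based index of the first test case whose successor breaks
    the constraint, or None if the suite is valid.
    """
    for i in range(len(tests) - 1):
        # Verify the left side first; only then check the right side.
        if left(tests[i]) and not right_next(tests[i + 1]):
            return i
    return None

def is_stop(test):
    return test["velocity"] == "stop"

def next_not_fast(test):
    return test["velocity"] != "fast"

suite = [
    {"velocity": "stop"},
    {"velocity": "slow"},
    {"velocity": "stop"},
    {"velocity": "fast"},
]
# EG(velocity = stop -> not EX(velocity = fast))
violation = first_violation(suite, is_stop, next_not_fast)
```

In this suite `violation` is 2, because the third test case has "velocity = stop" and is followed by "velocity = fast". One violation is enough to reject the suite.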

Coverage Measurement
Given a test suite, we expect it to cover all the value combinations among factors and all the value sequences of each factor with as small a size as possible. We can use Formula (4) to calculate the coverage, where "Covered" is the number of targets that have already been covered and "Total" is the total number of targets to be covered in theory:

$$\text{Coverage} = \frac{\text{Covered}}{\text{Total}} \times 100\% \qquad (4)$$

Consider the ECA(n; $t_c$, $t_s$, k, v). It has $\binom{k}{t_c} v^{t_c}$ factor value combinations to be covered and $\binom{v}{t_s}\, t_s!\, k$ value sequences of size $t_s$. Thus, there are $\binom{k}{t_c} v^{t_c} + \binom{v}{t_s}\, t_s!\, k$ targets to be covered. When constraints are considered, the number of targets to be covered is smaller than the number without constraints.
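The target counts and Formula (4) can be evaluated numerically as in the sketch below. The function names are hypothetical, and the counts follow the expressions stated above for an unconstrained ECA(n; $t_c$, $t_s$, k, v).

```python
from math import comb, factorial

def total_targets(k, v, t_c, t_s):
    """Targets of an unconstrained ECA(n; t_c, t_s, k, v)."""
    combos = comb(k, t_c) * v ** t_c           # factor value combinations
    seqs = comb(v, t_s) * factorial(t_s) * k   # value sequences of size t_s
    return combos, seqs, combos + seqs

def coverage(covered, total):
    """Formula (4): percentage of targets covered."""
    return covered / total * 100.0

# Three binary factors with t_c = t_s = 2.
combos, seqs, total = total_targets(k=3, v=2, t_c=2, t_s=2)
```

For three binary factors this gives 12 combinations, 6 sequences and 18 targets in total. As a second check, `coverage(15, 17)` rounds to 88.24.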
Because coverage measurement is essentially pattern analysis, which has been widely used in many domains [54], we define the coverage measurement with combinatorial coverage and sequence coverage.

Measurement of Combinatorial Coverage
Given a test suite TS and an interaction $I = \{(P_{i_1}, a_{i_1}), \ldots, (P_{i_{t_c}}, a_{i_{t_c}})\}$, the vector $(\delta^{t_c}_I(TS))_{I \in H_{t_c}}$ with entries in $\mathbb{N}_0$ indicates the number of $t_c$-way combinations: $\delta^{t_c}_I(TS)$ indicates the number of times the interaction $I$ is covered in TS, and $H_{t_c}$ is the set of all interactions, with $0 \le \delta^{t_c}_I(TS) \le n$. $\Gamma_c \equiv 1$ is a characteristic function used for comparing with $\delta^{t_c}_I$: Another form of $\theta^{t_c}_I(TS)$ is the following: The kernel function based on $\theta^{t_c}_I(TS)$ is as follows: For the kernel function, $0 \le \kappa_{t_c}(TS, TS) \le |H_{t_c}|$. If there are no constraints, then $0 \le \kappa_{t_c}(TS, TS) \le |H_{t_c}| = \binom{k}{t_c} v^{t_c}$. Thus, the formula to calculate combinatorial coverage is as follows:
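As an informal cross-check of the definitions in this subsection, combinatorial coverage can also be computed by a direct set-based count, as sketched below. This is not the kernel formulation; the function name `combinatorial_coverage` and the encoding of forbidden interactions as `(factors, values)` pairs are assumptions made for the example.

```python
from itertools import combinations
from math import prod

def combinatorial_coverage(suite, values, t_c, forbidden=frozenset()):
    """Percentage of valid t_c-way interactions that occur in the suite.

    suite     -- list of test cases, each a tuple with one value per factor
    values    -- values[j] is the number of values of factor j
    forbidden -- set of (factors, value_tuple) interactions excluded from H
    """
    k = len(values)
    forbidden = set(forbidden)
    covered = set()
    for test in suite:
        for factors in combinations(range(k), t_c):
            covered.add((factors, tuple(test[f] for f in factors)))
    covered -= forbidden
    # |H|: all t_c-way interactions, minus the forbidden ones.
    total = sum(prod(values[f] for f in factors)
                for factors in combinations(range(k), t_c)) - len(forbidden)
    return 100.0 * len(covered) / total

# Four tests that cover every pair of values of three binary factors.
full = combinatorial_coverage(
    [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)], [2, 2, 2], t_c=2)
half = combinatorial_coverage([(0, 0), (1, 1)], [2, 2], t_c=2)
```

Here `full` is 100.0 and `half` is 50.0. Passing the CCM entries as `forbidden` shrinks the denominator, in line with the remark that constraints reduce the number of targets.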

Measurement of Sequence Coverage
Given a test suite TS, for each sequence $S_i$, $\delta^{t_s}_{S_i}(TS) \in \mathbb{N}_0$ indicates the number of sequences from column $i$ in TS: The sequence $TS[:, i]$ is viewed as a series of three subsequences $S_1$, $S_i$ and $S_2$. $[0, v_i - 1]$ is the value set of factor $P_i$, and $H^i_{t_s} = [0, v_i - 1]^{t_s}$ is the set of all the sequences of length $t_s$. $\Gamma_s \equiv 1$ is a characteristic function used for comparing with $\delta^{t_s}_{S_i}$: Another form of $\theta^{t_s}_{S_i}(TS)$ is as follows: The kernel function based on $\theta^{t_s}_{S_i}(TS)$ is as follows: For the kernel function, $0 \le \kappa^i_{t_s}(TS, TS) \le |H^i_{t_s}|$. If there are no constraints, then $0 \le \kappa^i_{t_s}(TS, TS) \le |H^i_{t_s}| = \binom{v}{t_s}\, t_s!$. Thus, the formula to calculate sequence coverage is the following:
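Sequence coverage of one factor can likewise be cross-checked with a direct count over the contiguous windows of its column, as in the sketch below. The function name `sequence_coverage` is hypothetical. The denominator is assumed to be $v_i^{t_s}$, following the set definition $[0, v_i - 1]^{t_s}$; using the count $\binom{v}{t_s}\, t_s!$ instead would exclude sequences with repeated values.

```python
def sequence_coverage(suite, factor, num_values, t_s):
    """Percentage of length-t_s value sequences that occur in one column.

    suite      -- list of test cases in execution order
    factor     -- index of the column (factor) to measure
    num_values -- number of values of that factor
    """
    column = [test[factor] for test in suite]
    # Every window of t_s contiguous test cases yields one value sequence.
    covered = {tuple(column[i:i + t_s])
               for i in range(len(column) - t_s + 1)}
    total = num_values ** t_s
    return 100.0 * len(covered) / total

# One binary factor: the column 0, 1, 1, 0, 0 contains all four pairs.
full_seq = sequence_coverage([(0,), (1,), (1,), (0,), (0,)], 0, 2, t_s=2)
half_seq = sequence_coverage([(0,), (1,), (0,)], 0, 2, t_s=2)
```

Here `full_seq` is 100.0 and `half_seq` is 50.0. Unlike combinatorial coverage, the result depends on the order of the test cases, so reordering a suite can change it.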

Experiments
This section describes the experimental results of PEG on a benchmark of SUTs. Then, more complex constraints that PEG cannot support are discussed.

Experimental Results of PEG
PEG was developed on a desktop computer (Dell, Xiamen, China) running Windows 7 with a 2.6 GHz Core 2 Duo CPU and 2 GB of RAM. It is implemented in C++ using Qt Creator 4.8.1 (Digia, Helsinki, Finland).
For the experiments, we use a benchmark with 12 different SUTs. Six of them are from Ref. [12]: the "Keyboard", the "Microwave", the "Autoradio", the "Coffee Machine", the "Elevator" and the "Transmission". We chose them because they have more than one factor and, for each of their factors, each value can reach the others. The "text effect" application of "Microsoft Word" is also chosen as an SUT. Besides these seven real-world SUTs, we add five SUTs to increase the configuration diversity. The details of the SUTs are given in Table 6. The configurations of factors and factor values are listed in the third column. The three kinds of constraints are listed in the fourth to sixth columns. As PEG depends on some degree of randomness, it is non-deterministic. Thus, we performed 30 independent runs per SUT and coverage criterion for a statistical analysis. We use PEG to generate $t_c = t_s = 2$ coverage and $t_c \ge 2$, $t_s = 3$ coverage, respectively. The results are shown in Tables 7 and 8. As SUT2 and SUT7 have only two factors, they cannot have test suites with $t_c > 2$ coverage. Thus, we use asterisks "*" in Table 8 to mark the results that were obtained with $t_c = 2$ coverage. To demonstrate the performance of PEG, the best and average generated sizes and the best and average generation times are presented for each SUT. The average coverage of targets is also reported for factor value combinations with $t_c$ coverage and for value sequences of each factor with $t_s$ coverage.
Generally, the generation time increases as the numbers of factors and factor values grow. However, the generation time of SUT1 appears longer than that of the other SUTs. This is mainly because much time is spent in calls to the SAT solver under combinatorial constraints. Based on the results obtained, PEG can generate satisfactory results with total coverage when SUTs have no constraints related to value sequences of factors. When SUTs have constraints related to value sequences of factors, PEG can cover all the target value sequences while covering as many factor value combinations as possible. Overall, the results show that PEG is feasible and useful for generating ECMCAs.

Discussion
It is well known that a fundamental problem with software testing is that testing under all combinations of inputs and preconditions is not feasible, even for a simple product. SUTs whose test cases must be performed contiguously also face this problem. ECAs attempt to use as few test cases as possible to cover as many factor value combinations and value sequences of factors as possible. The purpose of ECAs is to find more system defects. Compared with manual test suite generation, ECAs can yield more comprehensive test suites. ECAs fill a gap in test case generation methods for SUTs whose test cases must be performed contiguously, and they are significant for ensuring the reliability and quality of such SUTs. ECAs are expected to be applicable to many fields with high reliability requirements, such as aviation, spaceflight and the weapons industry. In these industries, most of the input instructions of components are messages that consist of several related elements and need to be performed contiguously. ECAs are well suited to covering the element value combinations and the value sequences of elements in such messages.
Take an input instruction of a radar as an example. A high-coverage test is needed to ensure that the radar works normally under various working conditions. Table 9 shows the instruction of the radar under test. The instruction needs to be input contiguously to control the working mode of the radar. When the scanning mode is "fixed point", the scanning speed and the sector scan scope need to be assigned invalid values, and the scan center needs to be assigned a degree in the range [0, 360]. When the scanning mode is "sector scan", the scanning speed needs to be assigned a valid speed, the sector scan scope needs to be assigned a valid scan scope, and the scan center needs to be assigned a degree in the range [0, 360]. When the scanning mode is "circular scan", the scanning speed needs to be assigned a valid speed, and the sector scan scope and the scan center need to be assigned invalid values. ECAs can feasibly serve as the test suites, and they can improve test coverage considerably compared with manually designed test suites. Industry has a large number of input messages like this instruction, and high-coverage test suites are needed for them. Therefore, we believe that ECAs have broad application prospects.
However, PEG still needs to be improved in practice to handle more complex constraints. SUTs whose test cases need to be performed contiguously usually have the three kinds of constraints that PEG supports. More complex constraints may exist, although they are rarely seen in real-world systems, such as constraints between factor values from one test step to another [32]. Refs. [12,32] have put forward the requirement of generating test suites under complex constraints and described some complex constraints as follows:
1. If value $c_i$ from factor C is selected in test case $t_n$, then value $c_j$ from factor C must be selected in the succeeding test step $t_{n+1}$.
2. If $C = c_i$ in $t_n$, then $C = c_j$ in a later $t_{n+m}$.
3. If $C = c_i$ in $t_n$, then $C = c_j$ in all $t_{n+1}$ to $t_{n+m}$.
4. If $C = c_i$ in $t_n$, then $C = c_j$ in all $t_{n+m}$ to $t_{n+o}$.
5. If $C = c_i$ or $B = b_k$ in $t_n$, then $D = d$ in a later $t_{n+m}$.
These complex constraints can all be presented in CCTL as follows. As they do not give the configurations of factors and factor values, we could not construct their test suites:
Ref. [33] has presented a simplified real-world system with complex constraints, namely a simplified controller of a car. It has two Boolean inputs that represent the user's decision to accelerate or brake. Upon acceleration, the car starts moving, with either a slow or a fast velocity. Upon braking, the car immediately stops. The velocity is also a factor of the example. Figure 3 depicts the values of the three factors that affect the car controller. As pressing the brake and the accelerator at the same time should be avoided when driving, the value combination of "accelerate = true" and "brake = true" is a constraint for the car controller. This value combination constraint can be denoted as EG(¬accelerate = true ∨ ¬brake = true). For each factor, there are also at-least and at-most constraints. Figure 4 shows the states and the state transitions [33]. A constraint that can be denoted as EGEX[1](accelerate = false ∧ brake = false ∧ velocity = stop) restricts the first state $S_0$ of the car controller. Figure 5 shows the value transitions of each factor. When "velocity = stop" holds in a test case, "velocity = fast" cannot occur in the next test case. The constraint is EG(velocity = stop → ¬EX(velocity = fast)). When "velocity = slow" holds in a test case, "velocity = slow" cannot occur in the next test case. The constraint is EG(velocity = slow → ¬EX(velocity = slow)). Together with seven other complex constraints given in [33], these constraints constitute the constraint set C in ECMCA(n; $t_c$, $t_s$, $2^2 3^1$, C). [...] $\times 100\% = \frac{15}{17} \times 100\% = 88.24\%$, as "velocity = stop" → "velocity = fast" and "velocity = slow" → "velocity = slow" are value sequence constraints. As PEG cannot handle complex constraints, the ECMCA(12; 2, 2, $2^2 3^1$, C) in Table 10 violates complex constraints. The first and second test cases violate the constraint EG(accelerate = false ∧ brake = false ∧ velocity = stop → EX(velocity = stop)): the first test case is (accelerate = false, brake = false, velocity = stop), and the second test case does not satisfy EX(velocity = stop).
In the same way, the second and third test cases violate EG(brake = true → EX(velocity = stop)). The third and fourth test cases violate EG(accelerate = true ∧ brake = false ∧ velocity = fast → EX(velocity = fast)). The fourth and fifth test cases violate EG(accelerate = true ∧ brake = false ∧ velocity = slow → EX(velocity = fast)). The seventh and eighth test cases violate EG(brake = true → EX(velocity = stop)). The ninth and tenth test cases violate EG(accelerate = false ∧ brake = false ∧ velocity = stop → EX(velocity = stop)). The tenth and eleventh test cases violate EG(accelerate = false ∧ brake = false ∧ velocity = slow → EX(velocity = stop)). The eleventh and twelfth test cases violate EG(accelerate = false ∧ brake = false ∧ velocity = fast → EX(velocity = slow)). To illustrate an ECMCA that satisfies the complex constraints, an ECMCA(12; 2, 2, $2^2 3^1$, C) is constructed manually, as shown in Table 11. Manual construction is possible because the configuration is simple. The ECMCA shown in Table 11 satisfies all the constraints and covers all the value combinations and the value sequences of each factor. The analysis above shows that, although PEG can produce satisfactory ECMCAs, it still has limitations in generating ECMCAs for SUTs with complex constraints. Such complex constraints can nevertheless be specified in CCTL. More real-world SUTs with complex constraints need to be collected and analyzed, so that complex constraints can be classified and handled reasonably.

Conclusions
In this paper, we have proposed extended covering arrays with t-way combinatorial coverage and t-wise sequence coverage for SUTs whose test cases need to be performed contiguously. For extended covering arrays, we have introduced a formal specification method based on clocked computation tree logic for specifying constraints. A Particle swarm optimization based Extended covering array Generator (PEG) that can produce feasible and useful ECMCAs with common constraints has also been presented. The performance of PEG has been assessed with benchmark experiments. For generated test suites, a method for verifying constraint validity has been presented that corresponds to the constraint specification method. Moreover, kernel functions that measure the coverage of generated test suites have been given. Compared with manual test suite generation, ECAs can yield more comprehensive test suites, which improves the chance of finding system defects. In short, ECAs fill a gap in test case generation methods for SUTs whose test cases must be performed contiguously, and they are significant for ensuring the reliability and quality of such SUTs. Although some deficiencies exist, we believe that ECAs have broad application prospects as they continue to be improved in practice.
As part of our future work, we will first optimise PEG to cover more of the possible value combinations under value sequence constraints, and then try to find real-world SUTs with complex constraints and extend PEG to support them. Compared with fixed strength combinatorial testing, variable strength combinatorial testing usually takes fuller account of the actual interaction relationships in software. Therefore, ECAs with variable strength are also worthy of study.

Figure 1. The generation of CCM, CIM, SCM and SIM.

Figure 3. The factor values of the simplified car controller presented as a classification tree.

Figure 4. The state transitions of the car controller.

Figure 5. The value transitions of each factor.
we use covering array CA(n; t, k, v) to replace mixed covering array. If the smallest n for CA(n; t, k, v) exists, we also denote it by CAN(t, k, v) or CAN(n; t, $v^k$).

Table 2. A 2-wise sequence coverage test suite. Columns: Superscript, Strikethrough, Double Strikethrough, All Caps, Small Caps, Shadow.
R, P; CCM, CIM, SCM and SIM
2: if there is EX[1] in C then
3: generate a test and update CIM and SIM;
4: end if
5: while TRUE do

and value sequences of factors covered in the test case are removed from CIM and SIM in line 3. As the generated test case is the first test case in ECMCA, it covers value sequences in SIM only if $t_s = 1$. Otherwise, it does not cover value sequences in SIM.

Table 6. General characteristics of systems under test.

Table 7. Results of PEG with $t_c = 2$ and $t_s = 2$.

Table 8. Results of PEG with $t_c \ge 2$ and $t_s = 3$.

Table 9. An instruction of a radar under test.