Parallel Computation of Rough Set Approximations in Information Systems with Missing Decision Data

This paper discusses the use of parallel computation to obtain rough set approximations from large-scale information systems in which data are missing in both condition and decision attributes. To date, many studies have focused on missing condition data, but very few have accounted for missing decision data, especially in growing datasets. One approach for dealing with missing data in condition attributes is twofold rough approximations. This paper extends that approach to deal with missing data in the decision attribute. In addition, computing twofold rough approximations is computationally intensive, so the approach is not suitable when input datasets are large. We therefore propose parallel algorithms to compute twofold rough approximations in large-scale datasets. Our method is based on MapReduce, a distributed programming model for processing large-scale data. We first introduce the original sequential algorithm and then the parallel version. Experiments comparing the two approaches show that the proposed parallel algorithms are suitable for, and perform efficiently on, large-scale datasets that have missing data in condition and decision attributes.

Data analysis in rough set theory (RST) starts from a data table called an information system. Each row of the table induces a decision rule, which specifies a decision (action, result, outcome, etc.) if some conditions are satisfied. The original RST presupposes that all the condition and decision data in the information system are complete. However, incomplete (i.e., missing) data are often seen in real applications for many reasons. Missing data occur not only in the conditions but also in the decision attribute. For example, if an information system contains data on symptoms and diseases of patients in a hospital, some decision data may be missing when patients stop seeking treatment, e.g., for financial reasons. Furthermore, decision data may be blank owing to inadvertent erasure or for reasons of privacy. A common approach is to remove objects with missing decision data. However, in our previous work [22,23], we gave evidence that removing such objects may lead to information loss. For example, the removal may change the original data distribution or break relations between condition attributes, and the knowledge induced after the removal may differ from the knowledge in the original information system where those objects are retained. Clearly, objects with missing decision data should not simply be removed but should be handled in an appropriate way.
Recently, with the emerging era of big data, new RST-based approaches have been studied to deal with large-scale datasets [5,18,24-35]. This study is inspired by the introduction of the MapReduce framework [36] for processing intensive datasets, and by the fact that most algorithms designed for small-scale datasets do not perform well on large-scale datasets. As far as we observed, all of the above MapReduce-based studies target large-scale information systems with no missing data, or with missing data only in condition attributes. To the best of our knowledge, there has been no study on large-scale information systems in which some of the data, both in the conditions and in the decision, are missing. Motivated by this gap, we propose a parallel method to compute rough approximations in such massive information systems with incomplete condition and decision data.
Our proposed method is motivated by the method of twofold rough approximations, which was introduced by Sakai et al. [37]. The method of twofold rough approximations was originally designed for information systems that contain missing data only in condition attributes. In this paper, we extend the method to information systems that contain missing data in both condition and decision attributes. In addition, using sequential algorithms to compute twofold rough approximations in such information systems is time consuming and may even be infeasible. Hence, we propose parallel algorithms to accelerate the computation in such massive information systems with incomplete condition and decision data.
The rest of the paper is organized as follows. Section 2 reviews some related studies in the literature. Section 3 summarizes basic concepts of RST, the MapReduce model, and the usage of MapReduce to accelerate rough set processing. Section 4 introduces the method of twofold rough approximations and how to apply it to information systems with missing condition and decision data. Section 5 introduces the sequential algorithm and the MapReduce-based parallel algorithms to compute twofold rough approximations in large-scale information systems. Evaluation tests are presented at the end of that section. The paper ends with conclusions in Section 6.

Literature Review
On small-scale datasets, problems with missing condition values have been studied extensively and have achieved positive results. Some authors attempted to transform an incomplete information system into a complete information system by removing objects with missing condition values, treating missing values as special values, or replacing them with average values, with the most common values, or with all possible values [42,44]. Others extended the concept of the classical equivalence relation by relaxing the requirements of reflexivity, symmetry, and transitivity. This created new binary relations such as the tolerance relation [50,51], valued tolerance relation [48], maximal consistent block [52], similarity relation [47], difference relation [60], limited tolerance relation [59], characteristic relation [43,46], etc. By considering the weaknesses of the tolerance and similarity relations, Sakai et al. [37] introduced the method of possible worlds and the method of twofold rough approximations. The latter is accurate, since it gives the same rough approximations as the former, while being more computationally efficient.
Missing data can appear not only in the conditions but also in the decision attribute. To deal with the issue of missing decision data, Medhat [62] suggested a method to restore missing decision data by measuring the similarity distance between objects with and without missing decision data. However, this method requires all the condition data to be complete, which is rare in practice. A common approach to deal with objects with missing decision data is to remove them. In our previous studies, however, we proved that removing such objects may lead to information loss [22,23]. Hence, instead of removing such objects, we proposed a parameter-based method to induce knowledge from such information systems. Our proposed method offered a more generalized approach to handle missing decision data without having to remove them, thus minimizing the threat of information loss.
Parallel programming [25-30,63] is another approach to accelerate the process of knowledge acquisition. Most of the methods in this approach are based on MapReduce. MapReduce, introduced by Google (CA, US) [36], is a software framework to support processing large data on clusters of computers. The MapReduce framework has proved its efficiency, since it has been widely used in data mining [64,65], machine learning [66-69], and web indexing [70]. Based on MapReduce, Zhang [25] proposed a parallel method to effectively compute set approximations. He also implemented algorithms on different MapReduce runtime systems such as Hadoop, Phoenix, and Twister to compare the knowledge extracted from these systems [26]. Li et al. [71] were among the earliest authors to use MapReduce for attribute reduction. The proposed approach divided the original datasets into many small data blocks and used reduction algorithms for each block. Then, the dependency of each reduction on testing data was computed in order to select the best reduction. Qian et al. [30] proposed a hierarchical approach for attribute reduction. They designed parallel computations of equivalence classes and attribute significance; as a result, the reduction efficiency was significantly improved. In [29], Qian et al. proposed three parallelization strategies for attribute reduction, namely "Task-Parallelism", "Data-Parallelism", and "Data+Task-Parallelism". El-Alfy et al. [63] introduced a parallel genetic algorithm to approximate the minimum reduct and applied it to intrusion detection in computer networks. Li and Chen [27,28] computed set approximations and reducts in parallel using dominance-based neighborhood rough sets, which consider the orders of numerical and categorical attribute values. Some authors studied rough sets in large-scale datasets with missing condition data [72,73]. Zhang [72] designed a parallel matrix-based method to compute rough approximations in large-scale information systems in which some condition values are missing. Based on the complete tolerance class, Yuan [73] proposed an algorithm to fill the missing condition values of energy big data.

Basic Concepts
In this section, we review basic concepts of rough sets [1,2] and MapReduce technique [36].

Rough Set
An information system (IS) in rough set theory is formally described as $\xi = (U, AT \cup \{d\}, V, f)$, where $U$ is a non-empty set of objects, $AT$ is a non-empty set of condition attributes, $d \notin AT$ denotes a decision attribute, and $f$ is an information function $f : U \times (AT \cup \{d\}) \to V$, with $V = \bigcup V_t$ for $t \in AT \cup \{d\}$. For $x \in U$ and $a \in AT$, $f(x, a)$ and $f(x, d)$ are written $f_a(x)$ and $f_d(x)$, respectively. $V_a$ and $V_d$ denote the domains of $f_a$ and $f_d$, respectively. Any domain may contain the special symbol "$*$" to indicate a missing value, i.e., the value of an object is unknown. Any value different from "$*$" is called regular [52]. We use complete information system (CIS) to denote an IS with no missing values, incomplete information system (IIS) to denote an IS where missing values occur only in the condition attributes, and incomplete decision system (IDS) to denote an IS where missing values occur in both condition and decision attributes.
RST is built on the basis of the equivalence relation. Two objects are equivalent on A ⊆ AT if and only if their corresponding values are equal on every attribute in A; this relation is denoted by EQR(A).
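As a minimal sketch of how the relation EQR(A) partitions a universe, the following Python snippet groups objects by their value tuples on A. The table, the attribute names, and the function `equivalence_classes` are hypothetical and only serve as an illustration for a complete information system.

```python
from collections import defaultdict

def equivalence_classes(table, attrs):
    """Return U/A: objects grouped by identical values on every attribute in attrs."""
    groups = defaultdict(set)
    for obj, row in table.items():
        # Two objects fall into the same class iff their value tuples coincide.
        key = tuple(row[a] for a in attrs)
        groups[key].add(obj)
    return {frozenset(g) for g in groups.values()}

# Hypothetical complete information system (illustrative data only).
table = {
    "x1": {"a1": 1, "a2": "low"},
    "x2": {"a1": 1, "a2": "low"},
    "x3": {"a1": 2, "a2": "low"},
    "x4": {"a1": 2, "a2": "high"},
}

partition_a1 = equivalence_classes(table, ["a1"])
partition_a1_a2 = equivalence_classes(table, ["a1", "a2"])
```

Adding attributes to A can only refine the partition: here `x3` and `x4` are equivalent on `a1` alone but are separated once `a2` is taken into account.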

MapReduce Model
MapReduce is a software framework to implement parallel algorithms [36]. It is designed to handle large-scale datasets in a distributed environment. In the MapReduce model, the computation is divided into two phases: Map and Reduce. These phases are performed in order, i.e., the Map phase is executed first and then the Reduce phase is executed. The output of the Map phase is used as input of the Reduce phase. We implement these phases through Map and Reduce functions, respectively. Each function takes a key/value pair as input and produces key/value pairs as output. The Map and Reduce functions are illustrated as follows:

$$\text{Map}: \langle K_1, V_1 \rangle \rightarrow [\langle K_2, V_2 \rangle]$$
$$\text{Reduce}: \langle K_2, [V_2] \rangle \rightarrow [\langle K_3, V_3 \rangle]$$

where $K_i$, $V_i$, $i \in \{1, 2, 3\}$, are user-defined data types and $[\ldots]$ denotes a set of the elements in the square brackets.
Map takes an input pair ($\langle K_1, V_1 \rangle$) and produces a set of intermediate key/value pairs ($[\langle K_2, V_2 \rangle]$). The MapReduce library groups all intermediate values associated with the same intermediate key $K_2$, shuffles, sorts, and sends them to the Reduce function. Reduce accepts an intermediate key ($K_2$) and a set of values for that key ($[V_2]$). It merges these values together to create a possibly smaller set of values, and finally produces $\langle K_3, V_3 \rangle$ pairs as output.
One of the primary advantages of the MapReduce framework is that it splits tasks so that their execution can be done in parallel. Since these divided tasks are processed in parallel, the entire task can be executed in less time. In addition, MapReduce is easy to use because it hides many system-level details from programmers, such as parallelization, fault tolerance, locality optimization, and load balancing. In practice, to use the MapReduce model, programmers need only design proper < key, value > pairs and implement the Map/Reduce functions.
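The Map, shuffle, and Reduce flow described above can be imitated in a few lines of Python. This is a single-process sketch for illustration only, not Hadoop code: the helper `run_map_reduce`, the two user functions, and the sample records are all hypothetical, and no real parallelism or fault tolerance is involved.

```python
from collections import defaultdict

def run_map_reduce(records, map_fn, reduce_fn):
    """Single-process imitation of the Map -> shuffle/sort -> Reduce flow."""
    intermediate = defaultdict(list)
    # Map phase: each <K1, V1> pair yields zero or more <K2, V2> pairs.
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            intermediate[k2].append(v2)  # shuffle: group values by K2
    output = []
    # Reduce phase: each <K2, [V2]> group yields <K3, V3> pairs.
    for k2 in sorted(intermediate):
        output.extend(reduce_fn(k2, intermediate[k2]))
    return output

def count_map(obj, decision):
    # Emit the decision value as the intermediate key.
    yield decision, 1

def count_reduce(decision, ones):
    # Merge all values that share one intermediate key.
    yield decision, sum(ones)

# Hypothetical <object, decision> records.
records = [("x1", "flu"), ("x2", "cold"), ("x3", "flu")]
counts = run_map_reduce(records, count_map, count_reduce)
print(counts)  # [('cold', 1), ('flu', 2)]
```

The only parts a programmer writes are `count_map` and `count_reduce`; the grouping in between corresponds to what the framework performs on their behalf.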

The Usage of MapReduce to Rough Set Processing
It is possible to use MapReduce to speed up rough set processing. One of the earliest implementations was done by Zhang [25]. Using MapReduce, he ran several steps in parallel, such as computing equivalence classes, decision classes, and associations between equivalence classes and decision classes. He proved that rough set approximations obtained by the parallel method are the same as those obtained by the sequential method, but the former takes less time than the latter.
Let us examine how equivalence classes can be computed in parallel. Let $\xi = (U, AT \cup \{d\}, V, f)$ be a CIS that is divided into $k$ sub-CISs $\xi_i$ over $U_i$, $i = 1, \ldots, k$, and let $A \subseteq AT$. In the Map step, we partition each $\xi_i$ into equivalence classes, and these equivalence classes are then aggregated in the Reduce step. Note that $k$ mappers run in parallel, and each mapper executes the Map function to process its corresponding $\xi_i$. The output of the $k$-th mapper is

$$\text{Map}_k: \{\langle E^{k1}_A, v^{k1} \rangle, \ldots, \langle E^{km}_A, v^{km} \rangle\},$$

where $E^{km}_A$ represents the $m$-th equivalence class w.r.t. $A$ on the sub-CIS $U_k$, and $v^{km}$ represents the set of objects that belong to $E^{km}_A$. The reducers merge the pairs that share the same equivalence class and output the equivalence classes $\{E^{1}_A, \ldots, E^{m}_A\}$ of the whole CIS.
Computing $\{E^{1}_A, \ldots, E^{m}_A\}$ in sequence costs $O(|U| \cdot |A|)$ time, while doing the same in parallel costs just $O(|U| \cdot |A| / k)$ time. Clearly, computational time is reduced using the MapReduce platform. This shows the strength of the MapReduce platform in increasing the performance of rough set computation.

Twofold Rough Approximations for IDS
In this section, we discuss the usage of twofold rough approximations in the case of missing condition and decision data.
Sakai et al. [37] introduced the method of twofold rough approximations to deal with missing data in condition attributes. They pointed out that, when an IS contains incomplete information, we cannot derive unique rough approximations but can only derive lower and upper bounds of the actual rough approximations. They refer to the lower and upper bounds as certain and possible rough approximations, hence the name twofold rough approximations. The rough approximations obtained from this method coincide with those from the method of possible worlds.
The method is based on the idea of considering both aspects (discernibility and indiscernibility) of every missing value. For example, let us assume an IIS with two objects $x$, $y$ whose values on $a \in A$ are $f_a(x)$ and $*$, respectively. Since the missing value may or may not equal $f_a(x)$, $x$ may be indistinguishable or distinguishable from $y$ on $a$. Because we do not know the exact value of $*$, we should consider both cases. Thus, we should take $\{x\}$, $\{y\}$, and $\{x, y\}$ into account, since each of them may be an actual equivalence class w.r.t. $a$. The set $\{\{x\}, \{y\}, \{x, y\}\}$ is called the set of possible equivalence classes.
The same interpretation also applies to a missing value in the decision. Given an object $z$ with $f_d(z) = *$, the above interpretation suggests that $f_d(z)$ may be any value in the decision domain. Since we do not know the exact value of $f_d(z)$, we need to consider all the possibilities. In the following, we describe the usage of twofold rough approximations for IDS.
Let $\xi = (U, AT \cup \{d\}, V, f)$ be an IDS. For each $a \in AT$, we can divide the universe $U$ into two sets, $U_{a=*}$ and $U_{a \neq *}$, containing the objects whose values on attribute $a$ are missing and regular, respectively. Let $U_{a \neq *}/a$ be the partition of $U_{a \neq *}$ by $a$. We define $Cer(U/a)$ and $Pos(U/a)$ as the families of certain and possible equivalence classes w.r.t. $a$, respectively. Formally,

$$Cer(U/a) = U_{a \neq *}/a, \qquad Pos(U/a) = \{e \cup U_{a=*} \mid e \in U_{a \neq *}/a\} \cup \{U_{a=*}\}. \quad (1)$$

Let $X$ be a target set. Unlike the original definition of a target set, here we define $X$ as a family of equivalence classes w.r.t. the decision $d$ included in $U_{d \neq *}$, i.e., $X = U_{d \neq *}/d = \{X_1, X_2, \ldots, X_m\}$. $X$ may be called the family of certain equivalence classes w.r.t. the decision $d$, denoted as $X_{Cer}$. We also define $X_{Pos}$ as the family of possible equivalence classes w.r.t. the decision $d$, each of which is the union of a certain equivalence class and $U_{d=*}$. $U_{d=*}$ itself is also included in the family. Formally,

$$X_{Cer} = X, \qquad X_{Pos} = \{e \cup U_{d=*} \mid e \in X_{Cer}\} \cup \{U_{d=*}\}.$$

$Cer(U/A)$ and $Pos(U/A)$ are defined as the families of certain and possible equivalence classes w.r.t. $A \subseteq AT$, respectively (Equation (3)).

The certain lower and upper approximations of $X$ w.r.t. $A$ are defined in terms of the sets

$$\underline{r}^{Cer}_A(x) = \{e \mid e \subseteq e',\ e' \subseteq x,\ e \in Cer(U/A),\ e' \in Pos(U/A)\}, \qquad \overline{r}^{Cer}_A(x) = \{e \mid e \in Cer(U/A),\ e \cap x \neq \emptyset\}. \quad (5)$$

The possible lower and upper approximations of $X$ w.r.t. $A$ are defined in terms of the corresponding sets for the possible case.

Example 1. Consider the IDS in Table 1, where $a_1, a_2 \in A$ are condition attributes and $d$ is the decision attribute. Let $X = U_{d \neq *}/d = \{\{x_1, x_5, x_7\}, \{x_2\}, \{x_3, x_6\}, \{x_8\}\}$ and $U_{d=*} = \{x_4\}$. Then, $X_{Cer} = X$ and $X_{Pos} = \{\{x_1, x_4, x_5, x_7\}, \{x_2, x_4\}, \{x_3, x_4, x_6\}, \{x_4, x_8\}, \{x_4\}\}$. Results are shown in Tables 2-4.
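The construction of certain and possible classes for a single attribute can be sketched in Python as follows. The sketch assumes that the certain classes are exactly the classes of the regular-valued objects, which is consistent with Example 1 (where the certain family equals X). The function name and the decision labels `d1` to `d4` are hypothetical, since the values of Table 1 are not reproduced here; only the grouping of objects matches Example 1.

```python
MISSING = "*"

def certain_and_possible_classes(values):
    """values: dict mapping object -> value on one attribute ('*' means missing).

    Returns (certain, possible) families as sets of frozensets.
    """
    missing = frozenset(o for o, v in values.items() if v == MISSING)
    regular = {}
    for obj, v in values.items():
        if v != MISSING:
            regular.setdefault(v, set()).add(obj)
    certain = {frozenset(c) for c in regular.values()}
    if not missing:
        # No missing value: both families coincide with the ordinary partition.
        return certain, set(certain)
    # Each possible class is a certain class enlarged by the missing objects;
    # the set of missing objects is itself a possible class.
    possible = {c | missing for c in certain}
    possible.add(missing)
    return certain, possible

# Hypothetical decision labels reproducing the grouping of Example 1.
decision = {"x1": "d1", "x5": "d1", "x7": "d1", "x2": "d2",
            "x3": "d3", "x6": "d3", "x8": "d4", "x4": MISSING}
x_cer, x_pos = certain_and_possible_classes(decision)
```

With these inputs, `x_cer` contains the four regular decision classes and `x_pos` contains each of them extended by `x4`, plus the singleton class of `x4`, as listed in Example 1.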

Computing Rough Approximations in IDS
In this section, we give the sequential algorithm and MapReduce-based parallel algorithm to compute twofold rough approximations in IDS.

Sequential Algorithm
Algorithm 1 describes the sequential algorithm to compute twofold rough approximations. The algorithm consists of four steps. For each $a \in A$, we calculate the partition $U_{a \neq *}/a$, and then the certain and possible equivalence classes on $a$, i.e., $Cer(U/a)$ and $Pos(U/a)$ (Step 1). These $Cer(U/a)$ and $Pos(U/a)$ are then aggregated to form $Cer(U/A)$ and $Pos(U/A)$ (Step 2). Next, we compute $X_{Cer}$ and $X_{Pos}$, where $X$ represents the target set (Step 3). Lastly, we calculate the twofold rough approximations (Step 4).

Let us analyze the computational complexity of Algorithm 1. Let $n$ be the number of objects in $U$ and $m$ be the number of attributes, i.e., $n = |U|$, $m = |A|$. If we assume that, in the worst case, the number of partitions equals the number of objects, the computation of $U/a$ costs $O(n)$; thus, Step 1 costs $O(n \cdot m)$. In Step 2, we compute $Cer(U/A)$ and $Pos(U/A)$, which requires computing the Cartesian product between $Cer(U/a)$ and $Pos(U/a)$. Since $Cer(U/a)$ or $Pos(U/a)$ costs $O(n)$, Step 2 costs $O(n^m)$ in total. We compute $X_{Cer}$ and $X_{Pos}$ in Step 3, whose time complexity is $O(n)$.

Step 4, which computes the rough approximations (including $R^{Pos}$), costs $O(n^3)$ at worst. In total, the overall complexity of Algorithm 1 is dominated by Step 2, which is $O(n^m)$. This overall complexity is prohibitive for large datasets where $n$ and $m$ are large. For a more efficient computation, we introduce an approach to process Step 2 in parallel. In addition, other computation-intensive steps are also performed in parallel in order to reduce the whole computation time.
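To give a feel for these orders of growth, the following worked arithmetic plugs hypothetical sizes into the worst-case bounds stated above. The values $n = 1000$ and $m = 8$ are illustrative assumptions, not figures from the experiments, and constant factors are ignored.

```latex
\begin{align*}
\text{Step 1: } & n \cdot m = 10^{3} \cdot 8 = 8 \times 10^{3}, \\
\text{Step 2: } & n^{m} = (10^{3})^{8} = 10^{24}, \\
\text{Step 4: } & n^{3} = (10^{3})^{3} = 10^{9}.
\end{align*}
```

Even for this modest $n$, the worst-case bound of Step 2 exceeds the others by many orders of magnitude, which is consistent with treating the aggregation step as the main target for parallelization.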

MapReduce Based Algorithms
The sequential approach may take a lot of time to compute rough approximations, especially when the input data are large. Using MapReduce, we speed up the computation by implementing a parallel algorithm for all the above steps. The flow of the parallel algorithm is illustrated in Figure 1. We divide it into four sub-algorithms: computing equivalence classes (EC), computing possible and certain equivalence classes (PEC), computing the aggregation of possible and certain equivalence classes (AP), and computing twofold rough approximations (RA). We examine each algorithm in the following.

Computing Equivalence Classes in Parallel
Example 2. Let us divide the data in Table 1 into two sub-IDSs $\xi_1$, $\xi_2$ (Table 5). Suppose $V_a$
Table 5. Sub-IDS.
Hence, the above proposition implies that equivalence classes on each attribute a ∈ A can be computed in parallel. We design the EC Map and EC Reduce functions as follows.

Algorithm 2 function EC Map
The input of the EC Map is a data split $\xi_i$. For each attribute $a$ and each object $x_i \in U_i$, we output intermediate pairs $\langle key', value' \rangle$, where $key'$ is the tuple $(a, f_a(x_i))$ and $value'$ is the object $x_i$ itself. The MapReduce framework copies objects $x_i$ with the same key from different mapper nodes to the corresponding reducer nodes. The EC Reduce accepts $(a, f_a(x_i))$ as input key and a list of $x_i$ as input values. We aggregate this list of $x_i$ to form $X_a(v)$.
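The EC step described above can be sketched in Python as follows. This is an in-memory illustration under stated assumptions, not the authors' Hadoop implementation: the functions `ec_map`, `ec_reduce`, and `run_ec`, as well as the two data splits, are hypothetical, and the shuffle is imitated with a dictionary.

```python
from collections import defaultdict

def ec_map(split):
    """Emit ((attribute, value), object) for every cell of one data split."""
    for obj, row in split.items():
        for attr, value in row.items():
            yield (attr, value), obj

def ec_reduce(key, objects):
    """Collect all objects that share the same (attribute, value) key."""
    return key, frozenset(objects)

def run_ec(splits):
    shuffled = defaultdict(list)
    for split in splits:  # each split would be handled by its own mapper
        for key, obj in ec_map(split):
            shuffled[key].append(obj)  # shuffle: group objects by key
    return dict(ec_reduce(k, v) for k, v in shuffled.items())

# Two hypothetical sub-IDSs; "*" marks a missing value.
split_1 = {"x1": {"a1": 1, "a2": "*"}, "x2": {"a1": 2, "a2": "s"}}
split_2 = {"x3": {"a1": 1, "a2": "s"}, "x4": {"a1": "*", "a2": "t"}}
ec_output = run_ec([split_1, split_2])
```

Because the key contains the attribute as well as the value, objects from different splits that agree on one attribute meet at the same reducer, while missing values are simply grouped under the key `(attribute, "*")` for later processing.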
The proposition proves that the certain and possible equivalence classes on each attribute can be computed in parallel. The PEC Map and PEC Reduce functions are designed as follows:

This step is executed right after the previous step, so it uses the direct output of EC Reduce as input. An input key of PEC Map has the form $(a, v_a^i)$. We extract the first element of the input key as our intermediate key. The second element of the input key, combined with the input value, forms our intermediate value. The MapReduce platform groups all intermediate values w.r.t. an intermediate key (i.e., an attribute) and sends them to the corresponding reducers.
The PEC Reducer accepts the intermediate pairs and loops over the intermediate values. $c_{missing}$ represents the objects whose values are missing, while $c_{regular}$ is a set of subsets, each of which contains objects that share a regular value. Then, we calculate the certain equivalence classes $c_{cer}$ and the possible equivalence classes $c_{pos}$ according to Formula (1). When $c_{missing}$ is empty, both $c_{cer}$ and $c_{pos}$ are equal to $c_{regular}$. The output is $\langle key', (c_{cer}, c_{pos}) \rangle$, where $key'$ represents an attribute, and $c_{cer}$, $c_{pos}$ represent the certain and possible equivalence classes on the attribute, respectively.
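The PEC step can be sketched along the same lines. The snippet below is a hedged, in-memory illustration: the names `pec_map`, `pec_reduce`, and `run_pec` and the sample input are hypothetical, and the possible classes are formed by taking the union of each regular class with the missing-value objects, following the form of Pos(U/a) given for Equation (1).

```python
from collections import defaultdict

MISSING = "*"

def pec_map(key, objects):
    """Re-key an EC output pair by attribute; keep (value, objects) as the value."""
    attr, value = key
    yield attr, (value, objects)

def pec_reduce(attr, values):
    c_missing = frozenset()
    c_regular = set()
    for value, objects in values:
        if value == MISSING:
            c_missing = c_missing | objects
        else:
            c_regular.add(frozenset(objects))
    c_cer = set(c_regular)
    if not c_missing:
        c_pos = set(c_regular)  # nothing missing: both families coincide
    else:
        c_pos = {c | c_missing for c in c_regular}
        c_pos.add(c_missing)  # the missing objects form a possible class too
    return attr, (c_cer, c_pos)

def run_pec(ec_output):
    shuffled = defaultdict(list)
    for key, objects in ec_output.items():
        for attr, value in pec_map(key, objects):
            shuffled[attr].append(value)  # shuffle: group by attribute
    return dict(pec_reduce(a, v) for a, v in shuffled.items())

# Hypothetical EC output: (attribute, value) -> objects.
ec_output = {
    ("a1", 1): frozenset({"x1", "x3"}),
    ("a1", 2): frozenset({"x2"}),
    ("a1", MISSING): frozenset({"x4"}),
    ("a2", "s"): frozenset({"x2", "x3"}),
    ("a2", "t"): frozenset({"x1", "x4"}),
}
pec_output = run_pec(ec_output)
```

For `a1`, the object `x4` with a missing value is added to every regular class to form the possible classes, whereas `a2` has no missing values and therefore yields identical certain and possible families.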

Aggregating Possible and Certain Equivalence Classes in Parallel (AP)
In this part, we aggregate in parallel the certain and possible equivalence classes on a set of attributes A ⊆ AT, following Equation (3).
Let us divide $A$ into smaller subsets $A_t$, where $\bigcap_t A_t = \emptyset$ and $A = \bigcup_t A_t$. The following propositions hold:
Proof. Similar to the proof of Proposition 3.
The above propositions prove that the aggregation of certain and possible equivalence classes can also be executed in parallel. To avoid memory overhead, we separate the process of aggregating certain equivalence classes from the process of aggregating possible equivalence classes. For aggregating certain equivalence classes, we extract only $c_{cer}$ from the output of PEC Reduce, and vice versa. Algorithms 6 and 7 illustrate the process of aggregating certain equivalence classes.

Input: A list ($L_t$) of inputs; each input has an attribute as key and the certain equivalence classes w.r.t. the attribute as value.
Both algorithms are identical: they receive a list of certain equivalence classes and produce the intersections between these certain equivalence classes. Given a list of certain equivalence classes $L_t$, the intersections of the elements of $L_t$ can be computed sequentially. That is, the intersections between the first and second elements of $L_t$ are computed first; then, the result is used to find intersections with the third element of $L_t$. The process repeats until the last element of $L_t$. We use $c_{pre}$ to hold the intersections of the previous elements, $c_{current}$ to hold the current element, and $c_{result}$ to hold the intersections between $c_{pre}$ and $c_{current}$. In the end, $c_{pre}$ contains the intersections of all elements in $L_t$. In the AP Reducer, we collect all $c_{pre}$ from different nodes and repeat the above process to compute the intersections between them. The final outputs are the intersections of the certain equivalence classes of all attributes.
Since the process of aggregating possible equivalence classes is identical to the process of aggregating certain equivalence classes, the above AP Map and AP Reduce functions can also be used. The difference is the input of AP Map: instead of the certain equivalence classes of each attribute ($c_{cer}$), we use the possible equivalence classes ($c_{pos}$).
To avoid computation overhead, this AP step can be divided into multiple MapReduce jobs. However, in order to get the final intersection, we need to set the number of reducers of the last MapReduce job to 1.
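The pairwise intersection scheme used in the AP step can be sketched as follows. This is an illustrative in-memory version under stated assumptions: the function names and the three per-attribute families are hypothetical, and, as in the pseudocode, every intersection is kept, so the empty set may appear in the result.

```python
from functools import reduce

def intersect_families(c_pre, c_current):
    """All pairwise intersections between two families of classes."""
    return {pre & curr for pre in c_pre for curr in c_current}

def ap_map(families):
    """Fold the families assigned to one mapper, left to right."""
    return reduce(intersect_families, families)

def ap_reduce(partials):
    """Fold the partial results of all mappers in a single reducer."""
    return reduce(intersect_families, partials)

# Hypothetical certain equivalence classes for three attributes.
cer_a1 = {frozenset({"x1", "x3"}), frozenset({"x2"})}
cer_a2 = {frozenset({"x2", "x3"}), frozenset({"x1", "x4"})}
cer_a3 = {frozenset({"x1", "x2"}), frozenset({"x3"})}

# Sequential aggregation over all attributes.
sequential = ap_map([cer_a1, cer_a2, cer_a3])

# Parallel aggregation: two mappers on disjoint attribute subsets, one reducer.
parallel = ap_reduce([ap_map([cer_a1, cer_a2]), ap_map([cer_a3])])
```

Because set intersection is associative and commutative, the order in which attribute subsets are folded does not change the final family, which is why the last job can combine the partial results with a single reducer. Whether the empty class is kept or filtered out afterwards is a design choice left open in this sketch.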

These results coincide with those in Example 1.
Proposition 6. The twofold rough approximations of the target set X produced by the sequential and parallel algorithms are the same.
Proof. From Propositions 1-5, we see that our proposed parallel algorithms generate the same results at each corresponding step of the sequential Algorithm 1. This ensures that our proposed algorithms give the same final rough approximations as the sequential algorithm.
Proposition 6 verifies the correctness of our proposed parallel approach, since it produces the same results as the sequential approach. Our approach generalizes not only to IDS but also to IIS. Since IIS is the special case of IDS in which the decision attribute has no missing data, Algorithm 8 needs only a small change to be adapted for IIS. In the case of IIS, $U_{d=*}$ is empty, so $X_{Cer}$ coincides with $X_{Pos}$. The other algorithms can be used for IIS without modification. Thus, our proposed approach can be used as a validation tool for other current approaches that deal only with missing data in the condition attributes of big datasets.

Evaluation Test
In this part, we evaluate the performance of the sequential and parallel algorithms on datasets of different sizes. The sequential algorithm is run on a computer with an Intel Core i7 7700K (eight cores, each with a clock frequency of 4.2 GHz) and 16 GB of main memory. We run the parallel algorithm on a cluster with five nodes: one master and four slaves. Each slave runs an Intel Core i5 650 (four cores, each with a clock frequency of 3.2 GHz) and has 8 GB of main memory. We conduct the experiments in Ubuntu 17.10, JDK 9.0.1, and Hadoop 2.9.0 environments. We only evaluate the efficiency of the algorithms in terms of execution time, not their accuracy, since our parallel algorithm produces the same results as the sequential algorithm.
We conduct experiments on a commonly used machine learning dataset, KDDCup99, from the UCI Machine Learning repository [74]. The dataset has approximately 5 million records, and each record consists of one decision attribute and 41 condition attributes. Since our parallel algorithms deal with categorical attributes, all 35 numeric attributes are discretized first. Furthermore, we create missing data in both condition and decision attributes at a rate of one percent, and the resulting dataset is named Kdd100. To test the efficiency of the proposed method with different sizes of datasets, we divide the Kdd100 dataset into smaller datasets. Datasets are named based on the percentage, e.g., Kdd10 stands for 10 percent of the original dataset, and so on (Table 6).
Figure 2 shows the comparison of execution time between the sequential and parallel algorithms. We compare the execution time of each step of Algorithm 1 with the corresponding step of the parallel algorithms. When performing Step 2 of the sequential algorithm, we encountered an insufficient memory error if we aggregated more than eight condition attributes. Hence, we decided to aggregate only 8 out of 41 condition attributes in order to be able to measure its execution time. Our data and source code can be found here [75]. From the results of our experiments, we can draw some conclusions:
• Execution time increases when the volume of data increases in both the sequential and parallel algorithms.
• The most intensive step is the RA step, and the least intensive step is the EC, PEC step. The AP step takes less time than the RA step in our experiments because we aggregate very few condition attributes. The more attributes we aggregate, the more time the AP step will take.
• In the AP and RA steps, the parallel algorithms outperform the sequential algorithm. For the dataset Kdd100, the former performs 25 times faster than the latter at the AP step and four times faster at the RA step. This is important since these are the most computationally intensive steps.
• For the EC, PEC step, the parallel algorithm costs more time. This is because we divide this step into two separate MapReduce jobs: EC and PEC. Since each job requires a certain time to start up its mappers and reducers, the time consumed by both jobs becomes larger than that of the sequential algorithm, especially when the input data are small. Notice that the time difference becomes smaller when the input data are larger (63 s in the case of Kdd50, and 32 s in the case of Kdd100). It is intuitive that the parallel algorithm is more efficient if we input larger datasets. In addition, since this step costs the least amount of time, it does not noticeably affect the total execution time of either algorithm.
• As the size of the input data increases, the parallel algorithm outperforms the sequential algorithm. We can verify this through the total execution time. The parallel algorithm is around four times faster than the sequential algorithm on the datasets Kdd50 and Kdd100. This demonstrates the efficiency of our proposed parallel algorithm.
As we can observe, our proposed parallel method is more efficient in terms of computation time than the sequential method. It is worth mentioning that the sequential algorithm was run on a machine with a faster CPU and more memory, while our parallel algorithm was run on a cluster of less powerful machines. The comparison could have been even more favorable to the parallel algorithm if we had run the sequential algorithm on the same configuration used for the parallel algorithm. However, we could not do so due to the insufficient memory error during the execution of the sequential method.
Because of the limited number of machines available, we could not arrange a bigger cluster with more nodes to run more rigorous experiments on larger datasets. We aim to do this in the future. In addition, more optimization settings, such as compression types and data block sizes, should be considered more carefully. Changing these settings may affect the performance of our parallel algorithm.

Conclusions
In this emerging era of big data, large-scale information systems with both missing condition values and missing decision values are commonly seen in practice. Such information systems are difficult to cope with, not only because of the incomplete information they contain, but also because they are large in size. In this paper, we have extended the method of twofold rough approximations to such massive, incomplete information systems. In addition, computing rough approximations by the sequential approach is very slow; thus, MapReduce-based parallel algorithms are proposed to accelerate the computation. Experimental results demonstrate that our proposed parallel methods outperform the sequential method. Since computing rough approximations plays a key role in rule extraction and feature reduction when utilizing rough set-based methods, our future work is to further investigate rule extraction and feature reduction from such massive, incomplete information systems based on rough sets.

Computing Possible and Certain Equivalence Classes in Parallel (PEC)
In this part, we compute possible and certain equivalence classes on each attribute a ∈ A following Equation (1).
Proposition 2. For a ∈ A, Cer(U/a) = Cer(U_i/a) and Pos(U/a) = Pos(U_i/a).
Proof. Cer(U/a) = Cer(U_i/a) follows directly from Proposition 1 and Equation (1). For Pos(U/a), Pos(U/a) = {e ∪ U_{a=*} | e ∈ U_{a≠*}/a} ∪ {U_{a=*}}

PEC Reduce (continued):
        … c_regular
    else
        foreach c ∈ c_regular do
            c_pos.add(c ∪ c_missing)
        end
        c_pos.add(c_missing)
    end
    key' = key
    value' = (c_cer, c_pos)
    output(key', value')
End

Example 4. (Example 3 continued) The output of PEC Map:

AP Map (continued):
        … is the first element of L_t then
            c_pre = c_input
        else
            c_current = c_input
            foreach pre ∈ c_pre do
                foreach curr ∈ c_current do
                    c_result.add(pre ∩ curr)
                end
            end
            c_pre = c_result
            c_result = ∅ //clear c_result
        end
    end
    key' = ∅
    value' = c_pre
    output(key', value')
End

Figure 2. Compare execution time of each step between sequential and parallel algorithms: (a) Step 1 of the sequential algorithm and EC, PEC steps of the parallel algorithm, (b) Step 2 of the sequential algorithm and AP step of the parallel algorithm, (c) Steps 3,4 of the sequential algorithm and RA step of the parallel algorithm, and (d) total steps of the sequential and parallel algorithms.
The equivalence relation is reflexive, symmetric, and transitive. Let $E_A(x) = \{y \in U \mid (x, y) \in EQR(A)\}$ be the set of objects equivalent to $x$ w.r.t. $A$, called the equivalence class of $x$ w.r.t. $A$. An equivalence relation divides $U$ into a partition, given by $U/EQR(A) = \{E_A(x) \mid x \in U\}$. $U/EQR(A)$ is normally denoted by $U/A$ for simplicity. Let $X \subseteq U$ be a set of objects with regular decision values, called a target set. The lower approximation of $X$ w.r.t. $A$ is defined by $\underline{R}_A(X) = \{x \in U \mid E_A(x) \subseteq X\}$.

Table 1. An example of IDS.

Table 3. Calculation of $\underline{r}^{Cer}_A(x)$ and $\overline{r}^{Cer}_A(x)$.

Table 6. A description of datasets.