Abstract
The Eclat algorithm is a typical frequent pattern mining algorithm that uses vertical data. This study proposes an improved Eclat algorithm called ETPAM, based on the tissue-like P system with active membranes. The active membranes are used to run evolution rules, i.e., object rewriting rules, in parallel. Moreover, ETPAM utilizes subsume indices and an early pruning strategy to reduce the number of frequent pattern candidates and subsumes. Through the parallelism of the P system, the time complexity of ETPAM is reduced from $O(t^2)$, for the original Eclat algorithm, to $O(t)$. The experimental results on two databases indicate that ETPAM performs very well in mining frequent patterns, and the experimental results on four databases indicate that ETPAM is computationally very efficient compared with three other existing frequent pattern mining algorithms.
1. Introduction
Membrane computing is a branch of natural computing, and its development has provided many computing frameworks and new bio-molecular computing models [1]. The development of membrane computing started with observing the structure and functions of living cells. Membranes play important roles in the functioning of cells and separate the cells from the outside environment [2,3]. The models extracted from membrane computing are usually called P systems and are divided into three main categories, i.e., cell-like P systems, neural-like P systems, and tissue-like P systems. Most of these P systems have high computing power, are very efficient, and are Turing universal [4]. This study employs a tissue-like P system with active membranes to mine frequent patterns. A P system mainly consists of three parts: a membrane structure, multiple sets of objects, and evolution rules. For the membrane structure, the size and spatial layout are not important, and the focus is on the relationship between membranes [5,6]. Object sets are usually represented by strings of symbols, and the evolution rules of the objects are given in the form of rewriting rules. A P system is a distributed and parallel computing model, and its evolution rules run synchronously, non-deterministically, and in a maximally parallel manner [7], making the system computationally very efficient.
Data mining is the process of discovering knowledge from large amounts of data and has been extensively studied in many fields. Frequent pattern mining is a fundamental field of data mining, and the goal is to find patterns that appear frequently in a database [8,9,10,11]. Many algorithms for mining frequent patterns, such as Apriori, FP-growth, and Eclat, to mention only a few, have been developed. Apriori utilizes an iterative approach called level-wise search, where (h + 1)-itemsets are generated from h-itemsets by taking the join and prune actions [12,13]. Nevertheless, the database must be scanned multiple times, which is inefficient for large-scale databases. FP-growth employs an FP-tree structure and a pattern fragment growth method to mine frequent patterns [14,15], but it is difficult to build a main memory-based FP-tree when the database is large. Unlike Apriori and FP-growth, Eclat mines frequent patterns using vertical data and only needs to read the columns relevant to a query, which avoids reading unnecessary columns. In the Eclat algorithm, a new candidate set is generated from the union of two itemsets. By finding the intersection of the Tidsets (transaction identifier sets) of the two itemsets, the support count of the candidate set is quickly obtained. However, when there are too many candidates, the following problems occur: (i) the operation of finding the intersection of the Tidsets is time consuming; and (ii) the Tidsets are quite large and consume a lot of memory. Many important improvements have been proposed [16,17,18]. However, the computational efficiency of the Eclat algorithm still needs to be improved when the database becomes large.
This study proposes an improved Eclat algorithm called ETPAM based on the tissue-like P system with active membranes. The active membranes are used to generate subsume indices and frequent patterns. ETPAM utilizes the parallelism of the P system to execute rules in parallel. For a database with t items, the algorithm generates t + 1 cells, uses t cells to explore frequent patterns, and uses the other cell, usually cell 0, as the output cell to output all frequent patterns generated. The subsume index is a technique that can greatly reduce the size of the search space [19]. In addition, the Apriori property is introduced into the frequent pattern mining process, and a threshold is used to limit the number of candidates and subsumes to further improve efficiency. The time complexity of ETPAM is reduced from $O(t^2)$ to $O(t)$ as compared with the original Eclat algorithm. Experimental results using two databases indicate that ETPAM performs very well in frequent pattern mining, and those using four databases show that ETPAM is computationally very efficient as compared with three existing frequent pattern mining algorithms.
The rest of this paper is arranged as follows. Section 2 describes the frequent pattern mining problem, the original Eclat algorithm, and the basic tissue-like P system. Section 3 introduces the design of the tissue-like P system for ETPAM, and provides explanations of the rules and the computing process. Section 4 presents an example to show how ETPAM works. In Section 5, two databases are used to evaluate the performance of the tissue-like P system in identifying frequent patterns and four databases are used to verify the efficiency of ETPAM. Conclusions are drawn and further research directions are given in Section 6.
2. Preliminaries
In this section, some basic definitions about frequent pattern mining [10,11], the original Eclat algorithm, and the structure of the tissue-like P system with active membranes are introduced.
2.1. Frequent Pattern Mining
Let $I = \{i_1, i_2, \ldots, i_t\}$ be a set of items and $D = \{T_1, T_2, \ldots, T_m\}$ be a transaction database with m transactions.
- (i) Pattern: A set of items is called a pattern or an itemset.
- (ii) h-pattern: A pattern consisting of h items.
- (iii) Support count: The number of transactions containing a certain pattern P, denoted as support(P).
- (iv) Frequent pattern: A pattern with a support count no less than a given threshold k is called a frequent pattern.
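To make the definitions above concrete, the following minimal Python sketch counts the support of a pattern in a small horizontal database and compares it with a threshold. The toy transactions, the item names, and the helper names `support_count` and `is_frequent` are illustrative assumptions and do not come from the paper.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy database: each transaction is a set of items.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"b", "d"}),
    frozenset({"a", "b", "c", "d"}),
]


def support_count(pattern: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of the pattern."""
    p = frozenset(pattern)
    return sum(1 for t in transactions if p <= t)


def is_frequent(pattern: Iterable[str], transactions: List[FrozenSet[str]], k: int) -> bool:
    """A pattern is frequent if its support count is no less than the threshold k."""
    return support_count(pattern, transactions) >= k
```

For instance, `is_frequent({"a", "c"}, TRANSACTIONS, 3)` checks whether the 2-pattern {a, c} reaches a threshold of 3 in the toy database.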
2.2. The Eclat Algorithm
Eclat mines frequent patterns using the vertical data format [18,20], unlike Apriori and FP-growth, which use horizontal data.
Vertical data: The more commonly used horizontal data is in the format TID: Itemset, where TID represents a unique transaction in a transaction database, and Itemset represents the set of items that belong to the transaction. In contrast, vertical data is in the format item: Tidset, where item represents a unique item in the item set, and Tidset represents the set of transactions that include the corresponding item. An example of a vertical database is shown in Table 1. Vertical data is more efficient than horizontal data for obtaining the support of items because an algorithm only needs to read the columns related to a query and does not need to read other columns. For instance, if the support of a 2-itemset in Table 1 is needed, an algorithm just needs to read and intersect the Tidsets of its two items and find support = Num[{1, 4, 5, 7, 8, 9} ∩ {1, 2, 3, 4, 6, 8, 9}] = Num{1, 4, 8, 9} = 4, instead of scanning the entire database as with horizontal data.
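The conversion from horizontal to vertical data and the intersection-based support computation can be sketched as follows. The function names `to_vertical` and `support` and the item names `x` and `y` are placeholders chosen for illustration; only the two transaction-ID sets are taken from the example in the text.

```python
from collections import defaultdict
from typing import Dict, List, Set


def to_vertical(horizontal: Dict[int, Set[str]]) -> Dict[str, Set[int]]:
    """Turn {transaction ID: itemset} into {item: set of transaction IDs}."""
    vertical: Dict[str, Set[int]] = defaultdict(set)
    for tid, items in horizontal.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def support(items: List[str], vertical: Dict[str, Set[int]]) -> int:
    """Support of a non-empty itemset: size of the intersection of its ID sets."""
    return len(set.intersection(*(vertical[i] for i in items)))


# The two transaction-ID sets quoted in the text; "x" and "y" are placeholder names.
VERTICAL = {"x": {1, 4, 5, 7, 8, 9}, "y": {1, 2, 3, 4, 6, 8, 9}}
```

Calling `support(["x", "y"], VERTICAL)` only touches the two columns involved, which is the advantage of the vertical layout described above.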
Table 1. A transaction database.
The basic Eclat algorithm is described as follows and the procedure using the example database in Table 1 is shown in Figure 1.
Figure 1. The procedure of Eclat for the example database in Table 1.
Input: Database in vertical data format and the threshold k.
Step 1: Take all items as a set and find all subsets of this set, as shown by the labeling process along the red arrow in Figure 1.
Step 2: For each subset, find the intersection of the transaction sets corresponding to the items in the subset, as shown by the labeling process along the blue arrow in Figure 1.
Step 3: Count the number of transactions in each intersection to obtain the support count of each itemset. Itemsets with a support count greater than or equal to the threshold k are frequent itemsets.
Output: Frequent patterns with support not less than the threshold k.
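The three steps above can be sketched directly in Python. This is a minimal, deliberately naive illustration that enumerates every non-empty subset exactly as the steps are worded; the function name `eclat_basic` and the toy input in the usage note are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set


def eclat_basic(vertical: Dict[str, Set[int]], k: int) -> Dict[FrozenSet[str], int]:
    """Enumerate subsets, intersect transaction sets, keep those with support >= k."""
    items = sorted(vertical)
    frequent: Dict[FrozenSet[str], int] = {}
    for size in range(1, len(items) + 1):
        # Step 1: every non-empty subset of the item set.
        for subset in combinations(items, size):
            # Step 2: intersect the transaction sets of the items in the subset.
            tids = set.intersection(*(vertical[i] for i in subset))
            # Step 3: the size of the intersection is the support count.
            if len(tids) >= k:
                frequent[frozenset(subset)] = len(tids)
    return frequent
```

With a hypothetical input such as `{"a": {1, 2, 3, 4}, "b": {1, 2, 4}, "c": {2, 3}}` and k = 2, the function returns the frequent patterns together with their support counts. The number of subsets grows exponentially with the number of items, which is the cost that the improvements in Section 3 aim to reduce.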
2.3. Tissue-Like P Systems
The tissue-like P system is an important expansion of the cell-like P system [21]. In a tissue-like P system, multiple cells are placed in the same environment, both cells and the environment can contain objects, and the cells and the environment communicate through evolution rules. Evolution rules are conducted in a non-deterministic and maximally parallel manner, and usually can produce an exponential growth space within linear operation steps [22]. When no evolution rules can be executed, the operation of the system stops and the final results are stored in a specific cell.
A basic tissue-like P system is a construct of the form:
where:
- (i)
- is a non-empty alphabet that represents a collection of objects in the tissue-like P system.
- (ii)
- syn {1, 2, } * {1, 2, , } represents all channels between cells.
- (iii)
- represents the execution order of the rules in the membranes.
- (iv)
- is the output membrane which stores the final results of the algorithm.
- (v)
- are the cells, each of which is a construct of the form:where is the object set initially in cell , if no object is in cell initially, is empty represented by , and is the set of evolution rules in cell . A rule : means removing the object multiset represented by , generating the object multiset represented by and , and sending the objects in and out to a specific area according to the target command. In the rule, means objects in are sent to the cells connected to the current cell, and v means objects in stay in the current cell. In , is the promoter of the rule. If is in the rule, is the inhibitor of the rule. If the rule has a promoter, the rule can be executed only when all objects in the promoter appear, and if the rule has an inhibitor, the rule cannot be executed when the objects in the inhibitor appear. Active membranes are used to generate subsume indices for frequent 1-patterns, and dissolved when all subsume indices are found.
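The effect of promoters and inhibitors on a rewriting rule can be illustrated with a small multiset simulation. This is only a sketch under stated assumptions: the class `Rule`, the functions `contains`, `applicable`, and `apply_once`, and the object names in the usage note are hypothetical, a rule is applied once rather than in a maximally parallel manner, and the rule is treated as inhibited as soon as any inhibitor object is present.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Rule:
    consume: Counter                                      # multiset removed from the cell
    stay: Counter = field(default_factory=Counter)        # products kept in the cell
    go: Counter = field(default_factory=Counter)          # products sent to connected cells
    promoter: Counter = field(default_factory=Counter)    # must all be present
    inhibitor: Counter = field(default_factory=Counter)   # must be absent (assumption: any)


def contains(cell: Counter, multiset: Counter) -> bool:
    """True if the cell holds at least the given number of each object."""
    return all(cell[obj] >= n for obj, n in multiset.items())


def applicable(cell: Counter, rule: Rule) -> bool:
    """Check the consumed objects, the promoter, and the inhibitor."""
    if not contains(cell, rule.consume):
        return False
    if not contains(cell, rule.promoter):
        return False
    return not any(cell[obj] > 0 for obj in rule.inhibitor)


def apply_once(cell: Counter, rule: Rule) -> Counter:
    """Apply the rule once in place and return the multiset that leaves the cell."""
    if not applicable(cell, rule):
        return Counter()
    cell.subtract(rule.consume)
    cell.update(rule.stay)
    return Counter(rule.go)
```

For example, with `cell = Counter({"a": 2, "p": 1})`, a rule that consumes `a`, keeps `b`, sends `c`, and has promoter `p` can fire, whereas the same rule with an absent promoter `q` cannot.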
3. The ETPAM Algorithm
This section begins with an introduction of two improvements to the Eclat algorithm. The design of the tissue-like P system with active membranes to improve the algorithm is then discussed. The evolution rules and the computing process are explained next.
3.1. Improvements to the Eclat Algorithm
Improvement 1: The subsume index and a quick method to generate it. The subsume index is used to restrict the number of candidates in the process of frequent pattern mining [19,23].
Definitions: Subsume(P) represents the subsume index of pattern P, i.e., the set of items whose transaction sets contain the transaction set of P, and Tidset(P) represents the set of transactions including pattern P.
Property: support(P ∪ S) = support(P) for every S ⊆ Subsume(P). That is, the support of pattern P is the same as the support of the union of P with any pattern that is a subset of Subsume(P).
Eclat mines frequent patterns using the vertical data format, in which Tidset(P), the set of transactions including pattern P, is directly available. Using vertical data, the subsumes of 1-patterns can therefore be generated quickly and effectively.
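A subsume index for single items can be sketched from vertical data with plain set containment. The sketch assumes that an item b belongs to the subsume index of an item a when every transaction containing a also contains b, which is the reading under which the property above holds; the function name `subsume_index`, the restriction to items that meet the threshold k, and the toy data in the usage note are assumptions made for illustration.

```python
from typing import Dict, Set


def subsume_index(vertical: Dict[str, Set[int]], k: int) -> Dict[str, Set[str]]:
    """For each item with support >= k, list the other such items that cover it."""
    frequent = {item: tids for item, tids in vertical.items() if len(tids) >= k}
    return {
        a: {b for b, tb in frequent.items() if b != a and ta <= tb}
        for a, ta in frequent.items()
    }
```

With the hypothetical input `{"a": {1, 2, 3, 4}, "b": {1, 2, 4}, "c": {2, 3}, "d": {3}}` and k = 2, item `a` lands in the subsume index of both `b` and `c`, so the supports of {a, b} and {a, c} equal those of {b} and {c} and need no further intersection.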
Improvement 2: Early pruning of the search space by the threshold. In Step 1 of the Eclat algorithm, all items are taken as a set and all subsets of this set are found. This step generates too many candidate subsets when the number of items is large. Hence, the Apriori property is introduced to prune the search space early. In the process of obtaining the (h + 1)-itemsets through the intersections of the frequent h-itemsets, an h-itemset with a support count less than the threshold is removed, since no superset containing this itemset can be a frequent itemset and candidates containing this itemset do not need to be generated. The process of generating subsume indices is also improved: finding subsume indices only for items with support counts no less than the threshold, instead of for all items in the database, reduces the time and memory used.
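The early pruning idea can be sketched as a level-wise variant of Eclat in which only frequent h-itemsets are joined to form (h + 1)-itemset candidates. This is a sequential illustration, not the P system itself; the function name `eclat_pruned` and the toy input in the usage note are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set


def eclat_pruned(vertical: Dict[str, Set[int]], k: int) -> Dict[FrozenSet[str], int]:
    """Level-wise Eclat: infrequent itemsets are dropped before the next join."""
    # Level 1: keep only items whose support count reaches the threshold.
    level = {frozenset([i]): t for i, t in vertical.items() if len(t) >= k}
    frequent = {p: len(t) for p, t in level.items()}
    while level:
        nxt: Dict[FrozenSet[str], Set[int]] = {}
        for (p, tp), (q, tq) in combinations(level.items(), 2):
            cand = p | q
            # Join only pairs that differ in one item, and skip repeated candidates.
            if len(cand) != len(p) + 1 or cand in nxt:
                continue
            tids = tp & tq  # transaction set of the candidate
            if len(tids) >= k:
                nxt[cand] = tids
        frequent.update({p: len(t) for p, t in nxt.items()})
        level = nxt
    return frequent
```

On the hypothetical input `{"a": {1, 2, 3, 4}, "b": {1, 2, 4}, "c": {2, 3}}` with k = 2, the pair {b, c} is discarded at level 2, so no candidate is built on top of it at level 3 beyond the single join of the two surviving 2-itemsets.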
3.2. Algorithm and Evolution Rules
Assume that the database contains m transactions with t fields. The tissue-like P system with active membranes designed for the ETPAM algorithm, which has t + 1 cells, is shown in Figure 2. Frequent patterns are generated in cells 1 to t. The union of each of these frequent patterns and its corresponding subsume indices is formed in cell 0, where all frequent patterns are finally obtained.
Figure 2. The tissue-like P system for the ETPAM algorithm.
The threshold is represented by in the P system. An object represents a transaction containing item . In this way, the vertical database can be transformed into objects used in the P system. Auxiliary object is used to perform the comparison between the support of an itemset and the threshold. In the comparison, one item in the itemset consumes one , and means copies of object . Object is the promoter of the intersection process, and the corresponding rules can be executed only when object appears. Object is the catalyst to delete the redundant in cell , keeping the uniqueness of the object.
The tissue-like P system with active membranes for ETPAM is defined as follows:
where:
- (i)
- O = {, ,,, , ,,, , , }, for 1 1 ;
- (ii)
- syn = {{0,1}, {0,2}, …, {0,t}; {1,2}, {2,3}, …, {t−1,t}};
- (iii)
- = {};
- (iv)
- = (, ), = (, ) = (, );
- (v)
- = 0.
In = (, ), = {} and
In = (, ), = and
:
for 1 and 1
In = (, ), = and
:
for 1 , 1 and 1
In = (, ), = and
:
for 1 and
In = (, ), = and
:
for 1 and 1
In = (, ), = and = .
In = (, ), = and
:
for 1 , 1 and 1 .
When computation starts, frequent 1-patterns are generated in cell 1, and then sent to cell 2 and cell 0 by executing rules in parallel. At the same time, objects of frequent 1-patterns are also sent to cell 2. Frequent 2-patterns are then generated in cell 2 and sent to cell 3 and cell 0 by executing rules in parallel as well, and objects of frequent 2-patterns are sent to cell 3. This process stops when all frequent patterns are found. In cell 0, patterns that have subsumes are combined with their subsumes to obtain all frequent patterns. When computation ends, the final results are stored in cell 0. Because the ETPAM algorithm executes evolution rules in parallel to generate frequent patterns, its time complexity is reduced from $O(t^2)$ to $O(t)$ as compared with the original Eclat algorithm.
3.3. Computing Process
Generation of Frequent 1-Pattern Itemsets. When computation begins, objects are entered into cell 1 and then copies of the objects are sent to cell 2 by the rule . The searching process of the candidate frequent 1-pattern is taken as an example, and the searching processes of the other candidate frequent 1-patterns are similar to that of the process of . A total of copies of is generated by rule and one consumes one through rule . Finally if any copy of is left in cell 1, is not a frequent 1-pattern because its support count is less than the threshold ; otherwise, if no copy of is left in cell 1, is a frequent 1-pattern and is sent to cell 2 and cell 0 through the rule .
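The threshold comparison described in this paragraph, in which each transaction object consumes one copy of an auxiliary object, can be mimicked with a multiset counter. The object names `aux` and `tx` and the function name `is_frequent_by_consumption` are placeholders, and the loop runs sequentially, whereas the P system would apply the consuming rule to all available pairs in parallel.

```python
from collections import Counter
from typing import Set


def is_frequent_by_consumption(tidset: Set[int], k: int) -> bool:
    """Frequent if no auxiliary copy is left after pairing with transaction objects."""
    # k copies of the auxiliary object and one object per transaction of the item.
    cell = Counter({"aux": k, "tx": len(tidset)})
    # Each application of the consuming rule removes one "tx" and one "aux".
    while cell["aux"] > 0 and cell["tx"] > 0:
        cell["aux"] -= 1
        cell["tx"] -= 1
    # Leftover auxiliary copies mean the support count is below the threshold.
    return cell["aux"] == 0
```

For an item contained in six transactions and a threshold of 3, all auxiliary copies are consumed and the item is reported as frequent; an item contained in two transactions leaves one copy behind and is rejected.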
Generation of Subsume of Frequent 1-Patterns. In cell 2, extra objects are removed first by rule , and frequent 1-pattern acts as an inhibitor so that only objects belonging to frequent 1-patterns are left in cell 2. Assume that frequent 1-patterns are obtained in cell 1, then cells that are the same as cell 2 are generated by rule . Rule cannot be executed without promoter . Rules in cell for 1 are executed and subsumes of for 1 are generated in the corresponding cell for 1 . In cell , objects belonging to are compared with objects belonging to for 1 sequentially, and one consumes one . Finally, if both and remain in the cell, and are not subsume of each other. If just remains, is a subsume of and is generated by rule . If just remains, is a subsume of and is generated by rule . In the searching process for subsumes of , for example, after are compared with , , and remain, so that and are not subsume of each other. After are compared with , remain, so that is a subsume of , and is generated. When all subsumes of in cell are found, computation halts, promoter is generated, and promoter and the subsumes are sent to cell 0 and cell 2.
Generation of Frequent 2-Pattern Itemsets. The execution condition of rule is met when promoter appears in cell 2, so that rule generates objects as frequent 2-pattern candidates. Subsumes of frequent 1-patterns are inhibitors of this process, so that only objects which are not subsumes of each other are generated in cell 2. Duplicates of are removed by rule to keep the uniqueness of the object. Then left in cell 2 are used to find frequent 2-patterns and one copy of is sent to cell 3 by rule . The searching process of candidate frequent 2-pattern {} is taken as an example, and the searching process of other candidate frequent 2-patterns is similar. A total of k copies of are generated by rule and one consumes one . Finally, if any remains, {} is not a frequent 2-pattern; otherwise, if no remains, {} is a frequent 2-pattern and is sent to cell 0 and cell 3 together with subsumes and promoter .
Each of the other cells h, for 3 ≤ h ≤ t, executes evolution rules similar to those in cell 2 and performs similar functions to find frequent h-patterns.
In cell 0, is executed to combine frequent patterns obtained with their subsumes to get all frequent patterns of the database. After the algorithm finishes, all frequent patterns are stored in cell 0 as the final results.
3.4. Algorithm Specification
The typical Eclat algorithm runs sequentially, whereas ETPAM is executed in parallel by utilizing the nature of the tissue-like P system. Pseudocode for ETPAM is presented in Algorithm 1.
| Algorithm 1. ETPAM. |
| Input: Transactional database; representing the threshold k; |
| Method: |
| { |
| Rule : Transfer one copy of objects to cell 2. |
| Rule : Generate for 1 j t to check the candidate frequent 1-patterns . |
| Rule : Check all objects in the cell, and one object consumes one . Continue until all objects have been checked or all k copies of have been consumed. Rule : If all k copies of have been consumed, generate an object to add to as a frequent 1-pattern and transfer cell 2, and cell 0. |
| Rule : Generate a new membrane for each frequent 1-pattern , and transfer the corresponding objects and to cell . Rule : In cell , compare objects belonging to and objects belonging to for 1 j’ in parallel, and one consumes one . Finally, if both and remain in the cell, and are not subsume of each other. If just remains, is a subsume of and then is generated. If just remains, is subsume of and then is generated. Continue this way until all subsumes of have been found. Rule : Generate object and transfer it to cell 2 and cell 0 together with all subsumes of . |
| For (2 h t and ) |
| { |
| Rule : Delete objects not belonging to frequent ( − 1)-patterns. |
| Rule : Scan all objects representing the frequent ( − 1)-patterns to generate the objects representing the candidate frequent h-patterns . |
| Rule : Transfer one copy of objects to cell + 1. |
| Rule : Generate for each to check the candidate frequent h-patterns . |
| } |
| Rule : combine frequent patterns in cell 0 with their subsumes to get all frequent patterns of database. |
| } |
| Output: Frequent patterns mined from the database. |
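The overall flow of Algorithm 1 can be approximated by a sequential Python sketch: find frequent 1-patterns, build their subsume indices, grow candidates level by level while skipping candidates in which one item is a subsume of another, and finally combine every pattern with subsets of its subsumes. This is an interpretation under stated assumptions rather than the authors' implementation: the function name `etpam_sequential` is hypothetical, item b is taken to be a subsume of item a when every transaction containing a also contains b, and nothing here runs in parallel.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set

Pattern = FrozenSet[str]


def etpam_sequential(vertical: Dict[str, Set[int]], k: int) -> Dict[Pattern, int]:
    # Role of cell 1: keep the frequent 1-patterns.
    items = {i: t for i, t in vertical.items() if len(t) >= k}
    # Role of the active membranes: subsume index of every frequent item.
    subsume = {a: {b for b, tb in items.items() if b != a and ta <= tb}
               for a, ta in items.items()}

    def blocked(p: Pattern) -> bool:
        """True if some item of p is a subsume of another item of p."""
        return any(b in subsume[a] for a in p for b in p if a != b)

    # Role of cells 2..t: level-wise growth with early pruning.
    level: Dict[Pattern, Set[int]] = {frozenset([i]): t for i, t in items.items()}
    core: Dict[Pattern, Set[int]] = dict(level)
    while level:
        nxt: Dict[Pattern, Set[int]] = {}
        for (p, tp), (q, tq) in combinations(level.items(), 2):
            cand = p | q
            if len(cand) != len(p) + 1 or cand in nxt or blocked(cand):
                continue
            tids = tp & tq
            if len(tids) >= k:
                nxt[cand] = tids
        core.update(nxt)
        level = nxt

    # Role of cell 0: combine each pattern with every subset of its subsumes.
    result: Dict[Pattern, int] = {}
    for p, tids in core.items():
        extra = sorted(set().union(*(subsume[a] for a in p)) - p)
        for r in range(len(extra) + 1):
            for s in combinations(extra, r):
                result[p | frozenset(s)] = len(tids)
    return result
```

Skipping blocked candidates is safe in this sketch because the support of a pattern does not change when subsumes of its items are added, so those patterns are recovered in the final combination step without any intersection.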
3.5. Time Complexity
In this section, the time complexity of ETPAM in the worst case is evaluated. Obtaining frequent 1-patterns needs 4 steps. Passing objects to cell 2 and cell 0 needs 1 step and generating needs 1 step. Checking all frequent 1-pattern candidates in parallel takes 1 step. Finally sending the frequent 1-patterns obtained in cell 1 to cell 2 and cell 0 needs 1 step.
Obtaining subsumes of frequent 1-patterns needs 5 steps. Generating new membrane for each frequent 1-pattern needs 1 step, and transferring the corresponding objects and to cell needs 1 step. Comparing all belonging to and belonging to for 1 j’ in parallel and generating subsumes of frequent 1-pattern in cell in parallel take 1 step. Executing rules , , and in parallel in each cell and obtaining subsumes for all frequent 1-patterns simultaneously take 1 step. Sending object and subsumes obtained in cell 1 to cell 2 and cell 0 needs 1 step.
Obtaining frequent h-patterns needs 6 steps. Deleting extra objects not belonging to frequent (h − 1)-patterns needs 1 step. Generating candidate frequent h-patterns needs 1 step. Passing objects to cell h + 1 needs 1 step and generating needs 1 step. Checking all frequent h-pattern candidates in parallel takes 1 step. Finally, sending the frequent h-patterns obtained in cell h to cell h + 1 and cell 0 needs 1 step.
Finally, in cell 0, obtaining all frequent patterns by combining all frequent patterns with their subsumes in parallel takes 1 step.
Thus, with 4 steps for the frequent 1-patterns, 5 steps for their subsumes, 6 steps in each of the cells 2 to t, and 1 step in cell 0, the total number of steps of ETPAM in the worst case is 4 + 5 + 6(t − 1) + 1 = 6t + 4, which gives $O(t)$. Table 2 presents the time complexities of some basic frequent pattern mining algorithms. In the table, represents the number of candidate frequent h-patterns and represents the number of frequent (h − 1)-patterns. As shown in Table 2, the time complexity of ETPAM is lower than those of the other algorithms listed.
Table 2. Time complexities of some pattern mining algorithms.
4. An Illustrative Example
To demonstrate clearly how ETPAM works, this section presents an illustrative example of the execution of the algorithm using the database in Table 1. As shown in Table 1, the database contains 9 transactions. The threshold is set to k = 3.
Generation of Frequent 1-Pattern Itemsets. When computation begins, objects { }, {}, {} {}, and { are entered into cell 1 and then one copy is sent to cell 2 by rule . The auxiliary objects for 1 5 are created by rule . The searching process of candidate frequent 1-pattern is used as an example. Objects {} and are in cell 1 meaning that item is included in the first, fourth, fifth, seventh, eighth, and ninth transactions. After rules { } and { } are executed, objects remain in cell 1. Hence, is a frequent 1-pattern, and subrule { (} sends to cell 2 and cell 0. The searching processes of are the same as that of . Finally, are determined to be frequent 1-patterns in cell 1, and are sent to cell 2 and cell 0.
Generation of Subsumes of Frequent 1-Patterns. In cell 2, extra objects are removed by rule , and objects , {}, {}, and {} stay because frequent 1-patterns are inhibitors in rule . Rule is executed to create 4 cells to generate subsumes of frequent 1-patterns. In cell 2′, after are compared with , and remain, so that and are not subsume of each other. After are compared with , and remain, so that and are not subsume of each other either. After are compared with , remain, so that is a subsume of , and, therefore, is generated. To improve efficiency, just subsume indices for items with support larger than the threshold are found. Because is not a frequent 1-pattern, the process of searching subsumes of ends. Object , together with , is generated and sent to cell 2 and cell 0. The searching processes in cells for 1 4 are similar to that in cell 2′. All rules in cells for 1 are executed in parallel, and all frequent 1-patterns’ subsumes are obtained simultaneously. The process and results of the generation of frequent 1-patterns are summarized in Table 3.
Table 3. Generation of frequent 1-patterns.
Generation of Frequent 2-Pattern Itemsets. Because the execution condition of rule is met when promoter appears in cell 2, rule generates . Because is an inhibitor, no objects like for items and are generated. Duplicate are removed by rule , left in cell 2 are used to find frequent 2-patterns, and one copy of is sent to cell 3 by rule . The auxiliary objects and are created by rule . After and are compared with , remains. After and are compared with , remains. After and are compared with , remains. Therefore, and are frequent 2-patterns. Subrules { (} and { (} send together with and the promoter to cell 0 and cell 3. The process and results of the generation of frequent 2-pattern itemsets are summarized in Table 4.
Table 4. Generation of frequent 2-patterns.
Generation of Frequent 3-Pattern Itemsets. In cell 3, extra objects are removed by rule and stay because frequent 2-patterns are inhibitors. Because the execution condition of rule is met when promoter appears in cell 3, rule generates and . The auxiliary objects are created by rule . After and are compared with , remains, so that is not a frequent 3-pattern. The process ends since there are no frequent 3-patterns. The process and results of the generation of frequent 3-pattern itemsets are summarized in Table 5.
Table 5. Exploration process of frequent 3-patterns.
In cell 0, executes to combine with and in turn to generate and . The computation of the P system ends at this point and all frequent patterns are stored in cell 0.
5. Experiments
Five databases, Connect, Mushroom, MovieItem, Retail, and T10I4D100K, from the UCI Machine Learning Repository were used in the experiments. Some of these databases are dense and others are sparse, and all of them are often utilized to test the performance of frequent pattern mining methods. The characteristics of these databases are given in Table 6. All experiments were performed on a personal computer with an Intel Core i3 processor and 4 GB of RAM under the Microsoft Windows 10 64-bit operating system. All the programs are coded in Python 3.
Table 6. Characteristics of the databases used for the experiments.
5.1. Effectiveness of ETPAM in Identifying the Frequent Pattern Itemsets
Two databases, Mushroom and Connect, are used to verify the performance of ETPAM in identifying the frequent patterns. The results are reported in the following.
The Mushroom database includes 8124 transactions. Each transaction has 23 attributes (fields), each attribute represents one characteristic of the mushrooms, such as whether the mushroom is poisonous, and each attribute has 2 to 12 values for a total of 119 possible values. The purpose is to determine which attribute values often appear together, i.e., to find frequent patterns in the database. The data is preprocessed first: each attribute value is treated as a new attribute, and each new attribute has only two values, 1 or 0, representing yes or no. The threshold is set to k = 4062 (8124 × 0.50). The frequent patterns found by ETPAM are presented in Table 7.
Table 7. Frequent patterns identified by ETPAM for the Mushroom database.
The Connect database includes 67,557 transactions. Each transaction has 43 attributes (fields), and each attribute has 3 values for a total of 129 values. The data is preprocessed in a way similar to that used for the Mushroom database, i.e., each attribute value is treated as a new attribute, and each new attribute has only two values, 1 or 0, representing yes or no. The threshold is set to k = 66,205 (67,557 × 0.98, rounded down). The frequent patterns found by ETPAM are presented in Table 8.
Table 8. Frequent patterns identified by ETPAM for the Connect database.
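The preprocessing and the threshold computation used for both databases can be sketched as follows. Each attribute value becomes its own yes/no item, and the absolute threshold is derived from a percentage of the number of transactions. The function names `binarize` and `threshold` and the attribute names in the usage note are hypothetical; rounding down is chosen because it reproduces the two threshold values quoted above.

```python
from typing import Dict, List, Set


def binarize(records: List[Dict[str, str]]) -> List[Set[str]]:
    """Treat every attribute=value pair as its own yes/no item."""
    return [{f"{attr}={val}" for attr, val in rec.items()} for rec in records]


def threshold(num_transactions: int, percent: int) -> int:
    """Absolute threshold k from a percentage, rounded down (integer arithmetic)."""
    return num_transactions * percent // 100
```

For example, `binarize([{"cap": "x", "odor": "n"}])` turns one record into the item set {"cap=x", "odor=n"}, and `threshold(8124, 50)` and `threshold(67557, 98)` give the thresholds used for Mushroom and Connect.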
5.2. Efficiency of the Proposed Algorithm
To verify the efficiency of the two improvements introduced into the original Eclat algorithm, ETPAM with rules executed serially is compared with Apriori [7], FP-growth [24], and the original Eclat algorithm [25]. The total running time is used as the metric to evaluate performance in the experiments. The total running time of each algorithm on each database is plotted against the values of the threshold k in Figure 3, where the vertical axis signifies the total running time in seconds and the horizontal axis represents the different threshold values. As shown in Figure 3, ETPAM with rules executed serially is more efficient than Apriori, FP-growth, and Eclat for all values of the threshold tested. Thus, the experimental results support the efficiency of the improvements proposed. More importantly, the evolution rules of the ETPAM algorithm are actually executed in parallel by utilizing the nature of the tissue-like P system. For example, the process of generating subsumes of the frequent 1-pattern in cell is conducted in parallel, rules , , and are executed in parallel in each cell , and the subsumes of all frequent 1-patterns are obtained simultaneously. When run in parallel, the algorithm is expected to use much less running time and thus to be more efficient.
Figure 3. Running times of the four algorithms on the four databases.
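A measurement of total running time against the threshold, as used for this comparison, could be collected with a small harness like the one below. This is a generic sketch and not the authors' benchmarking code; the function name `time_miner` and the assumption that every miner is called as `miner(vertical, k)` are illustrative.

```python
import time
from typing import Callable, Dict, Iterable, Set


def time_miner(
    miner: Callable[[Dict[str, Set[int]], int], object],
    vertical: Dict[str, Set[int]],
    thresholds: Iterable[int],
) -> Dict[int, float]:
    """Return the wall-clock running time in seconds for each threshold k."""
    timings: Dict[int, float] = {}
    for k in thresholds:
        start = time.perf_counter()
        miner(vertical, k)  # the result is discarded; only the time is recorded
        timings[k] = time.perf_counter() - start
    return timings
```

Running the same harness over each algorithm and database yields one curve per algorithm, with the threshold on the horizontal axis and the running time in seconds on the vertical axis.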
With these improvements, the time complexity of ETPAM is reduced to $O(t)$ from the $O(t^2)$ of the original Eclat algorithm. The tissue-like P system is a distributed and parallel model, and its evolution rules run synchronously, non-deterministically, and in a maximally parallel manner, making the system computationally highly efficient. The tissue-like P system is a natural distributed parallel computing system that can be implemented biologically. The calculation requires only a few cells, which can reduce the computational resource requirements and improve the computational efficiency.
6. Conclusions
Membrane computing, inspired by the structure and functioning of biological cells, was introduced as a branch of natural computing. This paper introduces a tissue-like P system with active membranes to mine frequent patterns, and proposes a novel algorithm, called ETPAM, for mining frequent patterns based on this P system. ETPAM utilizes the parallel mechanism of the tissue-like P system to execute evolution rules synchronously and in a maximally parallel manner. The time complexity is reduced from $O(t^2)$ to $O(t)$ as compared with the original Eclat algorithm. The experimental results using two databases show that ETPAM performed very well in mining frequent patterns. The experimental results on four databases indicate that ETPAM is very efficient in mining frequent patterns as compared with three existing algorithms. In addition, only a few cells are needed to implement the tissue-like P system by biological methods, which can greatly reduce the consumption of computing resources. For further research, other types of P systems, such as spiking neural P systems (SN P systems) [26] and cell-like P systems, can be used to develop more effective and efficient data mining algorithms.
Author Contributions
Conceptualization, L.J., L.X. and X.L.; methodology, L.J.; software, L.J.; validation, L.J. and X.L.; formal analysis, L.J.; investigation, L.J.; resources, L.J., L.X. and X.L.; data curation, L.J.; writing—original draft preparation, L.J.; writing—review and editing, L.J. and X.L.; funding acquisition, X.L. and L.X.
Funding
This research was funded by the National Natural Science Foundation of China (Nos. 61472231, 61502283, 61876101, 61802234, 61806114).
Acknowledgments
This research project is partially supported by the Social Science Foundation of Shandong Province, China (Nos. 16BGLJ06, 11CGLJ22), China Postdoctoral Science Foundation Funded Project (2017M612339, 2018M642695).
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Păun, G.; Rozenberg, G.; Salomaa, A. The Oxford Handbook of Membrane Computing; Oxford University Press: Oxford, UK, 2010.
2. Păun, G. Computing with membranes. J. Comput. Syst. Sci. 2000, 61, 108–143.
3. Păun, G. Membrane Computing; Springer: Berlin/Heidelberg, Germany, 2002.
4. Pan, L.; Păun, G.; Song, B. Flat maximal parallelism in P systems with promoters. Theor. Comput. Sci. 2016, 623, 83–91.
5. Yahya, R.I.; Shamsuddin, S.M.; Yahya, S.I.; Hasan, S.; Alsalibi, B. Automatic 2D image segmentation using tissue-like P system. Int. J. Adv. Soft Comput. Its Appl. 2018, 10, 36–54.
6. Wang, J.; Shi, P.; Peng, H. Membrane computing model for IIR filter design. Inf. Sci. 2016, 329, 164–176.
7. Liu, X.; Zhao, Y.; Sun, M. An improved Apriori algorithm based on an evolution-communication tissue-like P system with promoters and inhibitors. Discret. Dyn. Nat. Soc. 2017, 2017, 1–11.
8. Mai, T.; Vo, B.; Nguyen, L.T.T. A lattice-based approach for mining high utility association rules. Inf. Sci. 2017, 399, 81–97.
9. Kabir, M.M.J.; Xu, S.; Kang, B.H.; Zhao, Z. A new multiple seeds based genetic algorithm for discovering a set of interesting boolean association rules. Expert Syst. Appl. 2017, 74, 55–69.
10. Hoseini, M.S.; Shahraki, M.N.; Neysiani, B.S. A new algorithm for mining frequent patterns in can tree. In Proceedings of the 2015 2nd International Conference on Knowledge-Based Engineering and Innovation (KBEI), Tehran, Iran, 5–6 November 2015.
11. Raval, R. F3 algorithm for association rules. Int. J. Comput. Appl. 2017, 164, 6–11.
12. Wei, Y.; Yang, R.; Liu, P. An improved Apriori algorithm for association rules of mining. In Proceedings of the 2009 IEEE International Symposium on IT in Medicine & Education, Jinan, China, 14–16 August 2009; Volume 1, pp. 942–946.
13. Ezhilvathani, A.; Raja, K. Implementation of parallel Apriori algorithm on Hadoop cluster. Int. J. Comput. Sci. Mob. Comput. 2013, 2, 513–516.
14. Ergen, B. Frequent pattern mining under multiple support thresholds. WSEAS Trans. Comput. Res. 2016, 4.
15. Jia, K.; Liu, H. An improved FP-growth algorithm based on SOM partition. In Proceedings of the Third International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2017, Changsha, China, 22–24 September 2017; pp. 166–178.
16. Suvalka, B.; Khandelwal, S.; Patel, C. Revised ECLAT Algorithm for Frequent Itemset Mining. In Information Systems Design and Intelligent Applications; Springer: New Delhi, India, 2016.
17. Ma, Z.; Yang, J.; Zhang, T.; Liu, F. An improved Eclat algorithm for mining association rules based on increased search strategy. Int. J. Database Theory Appl. 2016, 9, 251–266.
18. Jusoh, J.A.; Man, M. Modifying iEclat Algorithm for Infrequent Patterns Mining. Adv. Sci. Lett. 2018, 24, 1876–1880.
19. Vo, B.; Le, T.; Coenen, F.; Hong, T. Mining frequent itemsets using the n-list and subsume concepts. Int. J. Mach. Learn. Cybern. 2016, 7, 253–265.
20. Yu, X.; Wang, H.; Zhang, X.; Wang, Y. Effective algorithms for vertical mining probabilistic frequent patterns in uncertain mobile environments. Int. J. Ad Hoc Ubiquitous Comput. 2016, 23, 137.
21. Song, B.; Zhang, C.; Pan, L. Tissue-like P systems with evolutional symport/antiport rules. Inf. Sci. 2017, 378, 177–193.
22. Song, B.; Pan, L. The computational power of tissue-like P systems with promoters. Theor. Comput. Sci. 2016, 641, 43–52.
23. Dam, T.L.; Li, K.; Fournier-Viger, P.; Duong, Q. An efficient algorithm for mining top-rank-k frequent patterns. Appl. Intell. 2016, 45, 96–111.
24. Han, J.; Pei, J.; Yin, Y. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 15–18 May 2000; pp. 1–12.
25. Zaki, M.J. Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 2000, 12, 372–390.
26. Pan, L.; Păun, G.; Zhang, G. Spiking neural P systems with communication on request. Int. J. Neural Syst. 2017, 27, 1750042.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).


