Efficient Top-K Identical Frequent Itemsets Mining without Support Threshold Parameter from Transactional Datasets Produced by IoT-Based Smart Shopping Carts

Internet of Things (IoT)-backed smart shopping carts generate an extensive amount of data in shopping markets around the world. This data can be cleaned and utilized for setting business goals and strategies. Artificial intelligence (AI) methods are used to efficiently extract meaningful patterns or insights from such huge amounts of data, or big data. One such technique is Association Rule Mining (ARM), which extracts strategic information from the data. The crucial step in ARM is Frequent Itemsets Mining (FIM), followed by association rule generation. The FIM process starts with the user tuning the support threshold parameter to produce the required number of frequent patterns. In practice, the user resorts to trial and error, rerunning the aforesaid routine until the required number of patterns is obtained. The research community has therefore shifted its focus towards mining the top-K most frequent patterns without a user-tuned support threshold parameter. Top-K most frequent patterns mining is considered a harder task than user-tuned support-threshold-based FIM. One reason top-K most frequent patterns mining techniques are computationally intensive is that they produce a large number of candidate itemsets. These methods also do not use any explicit pruning mechanism apart from the internally auto-maintained support threshold parameter. Therefore, we propose an efficient TKIFIs Miner algorithm that uses a depth-first search strategy for top-K identical frequent patterns mining. The TKIFIs Miner uses specialized one-itemset- and two-itemset-based pruning techniques for topmost patterns mining. Comparative analysis is performed on special benchmark datasets, for example, Retail with 16,469 items, and T40I10D100K and T10I4D100K with 1000 items each.
The evaluation results prove that the TKIFIs Miner outperforms recently available topmost patterns mining methods that do not use the support threshold parameter.


Introduction
The IoT is a broad and much-researched field nowadays, and it can be categorized into two main components: embedded systems using sensors, and processing technologies on Wi-Fi sensor networks. IoT is evolving day by day, yet it suffers from three main challenges: architecture [1], interoperability [2], and security [3][4][5]. Various IoT-enabled heterogeneous devices around us communicate through sensors via the internet to provide smart services, for example, smart health monitoring systems [6], smart parking systems [7], and smart shopping carts [8]. In this paper, we address one of the pertinent issues with smart shopping carts' transactional data collected on webservers: after preprocessing this coarse-grained data, an artificial intelligence method can be applied to extract meaningful patterns from it.

Research Objectives
In this article, we propose a novel topmost frequent itemsets mining algorithm named TKIFIs Miner. The proposed algorithm is not a derivative of the Apriori or FP-Growth techniques; instead, it uses a recursive, best-first-based, depth-first traversal of the search space to discover the topmost frequent itemsets. During this traversal, two novel pruning strategies are used to prune early the portions of the search space not required for topmost patterns mining. Subsequently, the support of the remaining candidate patterns is calculated using an intersection-of-TIDs approach. Finally, the support of each discovered pattern is checked against the least support among the patterns in the top-K list: if it is greater, the pattern is added to the top-K list; otherwise, it is discarded.
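The two core operations described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names, the vertical item-to-TID-set layout, and the dict-of-support-groups top-K list are our assumptions for the example.

```python
# Hypothetical sketch: a pattern's support is the size of the intersection of
# its items' TID (transaction-ID) sets, and the top-K list keeps the patterns
# at the K highest distinct support values (all names are illustrative).

def support(pattern, tid_sets):
    """Support of `pattern` = size of the intersection of its items' TID sets."""
    tids = set.intersection(*(tid_sets[item] for item in pattern))
    return len(tids)

def update_top_k(top_k, pattern, supp, k):
    """Insert `pattern`; keep only the K highest distinct support groups."""
    top_k.setdefault(supp, []).append(pattern)
    if len(top_k) > k:                      # more than K distinct supports:
        del top_k[min(top_k)]               # drop the lowest-support group
    return top_k

# Toy vertical dataset: item -> set of transaction IDs containing it
tid_sets = {"A": {1, 2, 3, 4}, "B": {1, 2, 3}, "C": {2, 3}}

top_k = {}
for pat in [("A",), ("B",), ("A", "B"), ("A", "C")]:
    update_top_k(top_k, pat, support(pat, tid_sets), k=2)

print(sorted(top_k))   # → [3, 4]: the two highest distinct supports survive
```

Note how {A, B} and {B} share support 3 and therefore occupy the same top-K slot, which is exactly the "identical frequent itemsets" notion the paper formalizes below.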

Organization of the Paper
The article is organized as follows: Section 2 presents previous work related to this area of research, followed by a discussion of the proposed algorithm and a presentation of a few preliminaries about the topmost frequent itemsets mining problem in Section 3. Additionally, a set of examples are presented to highlight the topmost frequent itemsets mining using k as parameter in place of support threshold. In Section 4, the experimental evaluation of the proposed algorithm and the comparison of the TKIFIs Miner with the current benchmark methods are presented. Finally, Section 5 discusses the achievements of this work and future directions that can be unveiled through topmost frequent patterns mining studies.

Related Work
Top-K FIM was introduced to avoid user-given threshold parameter tuning. On the other hand, top-K FIM is considered a computationally harder task than support-based FIM. Because its results are self-explanatory, it is applied in various real-world application domains, such as monitoring users' activity from their movement data [33], COVID-19 virus strain classification and identification [34], and the extraction of metadata from dynamic scenarios [35,36].
Salam et al. were the first to suggest considering the natural requirements of support-threshold-parameter-free (STP-free) K-most FIs and to design a novel topmost-FIM technique not derived from the aforementioned classical approaches [30]. This technique mines topmost maximal frequent itemsets and top-K maximal frequent itemsets from the given transactional dataset. According to this approach, a topmost maximal frequent itemset in a dataset is a frequent itemset of the highest support with item length greater than or equal to 3, while a top-K maximal frequent itemset set is a set of K distinct highest-support maximal frequent itemsets. In this approach, the association ratio of all 2-length itemsets is calculated first, followed by the construction of an Association Ratio graph (AR-graph). Subsequently, the AR-graph is used to construct an all-path-source-destination tree (ASD-tree). Finally, the ASD-tree is used to find the top-K maximal frequent itemset set. Additionally, the Top-K Miner introduced by Rehman et al. [29] does not extend any classic STP-based mining technique. It performs only one scan of the dataset to construct frequent itemsets of length 2, and during this scan the top-K list is updated with frequent itemsets of lengths 1 and 2. The topmost frequent itemsets of length 2 from the top-K list whose support is greater than or equal to the Kth support value in the top-K list are selected for topmost-K frequent itemsets mining of lengths greater than or equal to 3. Subsequently, every 2-itemset is iteratively combined with the selected nodes of a Candidate Itemsets Search tree (CIS-tree) to form itemsets of any length greater than 2; their support is calculated and the top-K list adjusted accordingly. The Top-K Miner finds all topmost FIs as per the tuned parameter, but is memory-expensive due to the CIS-tree. Moreover, Iqbal et al.
presented the TKFIM algorithm, which inherits the Apriori mining technique for the discovery of the K-topmost FIs [26]. This algorithm performs excessive candidate generation because it uses common itemset prefixes of the already produced topmost frequent itemsets. The frequencies of the itemsets are generated using the diffset support-finding method [37]. Apart from the automatic support threshold raising strategy, the algorithm does not adopt any other pruning strategy to trim the search space and avoid excessive candidate generation. Table 1 presents the discussed topmost FIM techniques and their details.

Frequent Itemsets Mining
In this section, we discuss the proposed frequent itemsets mining algorithms, which comprise three methods: the TK IFIs mining method, which mines a user-required number of IFIs from the user-given dataset; the Gen-IFIs method, which explores the search space to discover the TK IFIs; and the Candidate-IFIs method, which generates the set of candidate itemsets for newly created IFIs. Some examples are presented alongside the explanations to give the reader more clarity. We begin with some preliminaries and definitions.

Preliminaries and Problem Definitions
The support threshold value provided by the user serves as the border between frequent and infrequent patterns. Our proposed FIM method is not based on a support threshold value supplied by the user. How, then, can our proposed method discriminate between frequent and infrequent patterns to produce the required number of patterns?
There is an important observation related to the selection of a support threshold as far as our proposed method is concerned. The parameter required by the proposed method is K, which determines the number of topmost frequent patterns to be discovered from the search space. The value of the K parameter guides the proposed method in automatically adjusting the border between the required topmost frequent patterns and the infrequent ones. The support count value of the patterns decides this border, which is determined and adjusted automatically based on the K value rather than supplied by the user. Tuning the value of K to obtain the required number of patterns is simple and, unlike a support threshold value, does not depend on the characteristics of the given dataset. Therefore, based on the above discussion, the following definitions are presented as further groundwork:

1.
Identical Itemsets (IIs): One or more itemsets are called IIs if and only if all of them have the same support count, that is, IIs = { p_i | ∀ p_i ∈ X, supp(p_i) = s }, where s ∈ S is the support count of every pattern p_i found in the set of support counts S.

2.
Top-1 Identical Frequent Itemsets (IFI_1): One or more itemsets are called IFI_1 if and only if all of them have the same support count and their support is the highest support in the set of support counts S, that is, IFI_1 = { p_i | ∀ p_i ∈ X, supp(p_i) = s_1 }, where s_1 ∈ S is the highest support count in S.

3.

Top-K Identical Frequent Itemsets (IFI_K): One or more itemsets are called IFI_K if and only if all of them have the same support count and their support is the Kth highest support in the set of support counts S, that is, IFI_K = { p_i | ∀ p_i ∈ X, supp(p_i) = s_K }, where s_K ∈ S is the Kth highest support count in S. Let D = { t_1, t_2, . . . , t_n } be a transactional database, where every t is a transaction. The number of transactions containing an itemset p ⊆ I is the support of p, i.e., supp(p). Let X be the set of all itemsets p found in the transactions of database D; that is, X contains all the itemsets that can be generated from D. Let S be the set of support counts of all the itemsets contained in X. The problem of topmost frequent itemsets mining is to find all IFIs from the highest support to the Kth support, where K is a user-specified number of topmost IFIs, i.e., TK IFIs = { IFI_1, IFI_2, IFI_3, . . . , IFI_K }.
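The definitions above can be illustrated by brute force on a toy dataset. The snippet below is an illustrative example only (the dataset, the `top_k_ifis` helper, and its dict-of-groups return shape are our assumptions, not the paper's method): itemsets sharing one support count form an identical-itemsets group, and the top-K IFIs are the groups at the K highest distinct supports.

```python
# Brute-force illustration of IIs and top-K IFIs on a toy dataset.
from itertools import combinations

transactions = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"B", "C"}]
items = sorted(set().union(*transactions))

# Support of every non-empty itemset (the set X of all patterns in D).
supp = {}
for r in range(1, len(items) + 1):
    for p in combinations(items, r):
        s = sum(1 for t in transactions if set(p) <= t)
        if s > 0:
            supp[p] = s

def top_k_ifis(supp, k):
    """Group itemsets by support count; keep the K highest distinct supports."""
    groups = {}
    for p, s in supp.items():
        groups.setdefault(s, []).append(p)
    return {s: sorted(groups[s]) for s in sorted(groups, reverse=True)[:k]}

print(top_k_ifis(supp, k=2))
# → {3: [('A',), ('B',)], 2: [('A', 'B'), ('C',)]}
```

Here {A} and {B} are the IFI_1 group (support 3), while {A, B} and {C} together form the second group (support 2), matching the definition that IFIs at one rank share a single support count.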

TK IFIs Mining Method
In this section, we introduce the TK IFIs Miner method, which mines the user-required number of IFIs from the user-given dataset D based on one simple parameter K, as shown in step 1 of Algorithm 1. The database is scanned once to copy each TID into the corresponding position of the 1-itemset indexed with item i. This is done by performing a horizontal scan of the given dataset, as shown in steps 3 to 5 of Algorithm 1. The next step copies all 1-itemsets from the highest support to the Kth support into the set of candidate itemsets C and into the TK IFIs list. To perform this task, the topmost highest-support itemset is first discovered and copied into itemset i, followed by copying this itemset into C, as shown in steps 6a and 6b of Algorithm 1. The topmost selected itemset i from step 6a is copied into the TK IFIs list in step 6c, and removed from the list of 1-itemsets in step 6d. In step 7, the 1-itemsets of highest to Kth frequency are used to produce the frequencies of the 2-itemsets. In step 8, the Gen IFIs algorithm is invoked for top-K IFIs mining on the already trimmed search space. For more explanation, Example 1 illustrates the algorithm by applying it to the transactional dataset given in Table 2. Table 2. Transactional dataset.

TID Transactions
Example 1: Consider the transactional dataset given in Table 2. The dataset consists of 10 transactions and 6 distinct items. For the given values K = 4 and K = 5, the dataset is mined for the top-4 and top-5 IFIs, as shown in Table 3. Table 3. Top-4 IFIs, Top-5 IFIs.
(7) foreach i : 1 to |C| − 1 // 2-itemset supports are used for pruning purposes in Gen IFIs
(8) Gen IFIs (IFI = ∅, C, TK IFIs, K, 2_itemsets);
(9) return TK IFIs;

Table 3 represents the topmost patterns mined for K = 4 and K = 5 from the transactional dataset given in Table 2. It is clear from Table 3 that the only input required from the user is the K value to produce the desired number of topmost patterns. In the top-K result sets in Table 3, it can be observed that all the subsets of an FI in the top-K IFIs are placed at the same, or higher, positions due to the Apriori property [1]. For example, all the subsets of the frequent itemset {FBA = 7} in the top-5 IFIs in Table 3 are ranked at top-4 or higher positions.
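The preparatory steps of Algorithm 1 can be sketched as follows. This is a hedged sketch, not the paper's code: the function names, the five-transaction toy dataset, and the dict-based TID-set layout are our assumptions for illustration.

```python
# Sketch of Algorithm 1's preparatory steps: one horizontal scan records, for
# each item, the TIDs of the transactions containing it (steps 3-5); the
# 1-itemsets whose support reaches the Kth highest distinct support then seed
# the candidate list C and the top-K list (step 6).

def scan_dataset(transactions):
    """Steps 3-5: one horizontal scan building item -> TID-set lists."""
    tid_sets = {}
    for tid, t in enumerate(transactions, start=1):
        for item in t:
            tid_sets.setdefault(item, set()).add(tid)
    return tid_sets

def top_k_one_itemsets(tid_sets, k):
    """Step 6: keep 1-itemsets with one of the K highest distinct supports."""
    supports = sorted({len(s) for s in tid_sets.values()}, reverse=True)
    kth = supports[min(k, len(supports)) - 1]
    return {i: s for i, s in tid_sets.items() if len(s) >= kth}

transactions = [["F", "B", "A"], ["F", "B"], ["F", "A"], ["B", "A"], ["F"]]
tid_sets = scan_dataset(transactions)
C = top_k_one_itemsets(tid_sets, k=2)
print({i: len(C[i]) for i in sorted(C)})   # → {'A': 3, 'B': 3, 'F': 4}
```

With K = 2, the two highest distinct supports are 4 and 3, so all three items survive into C; a stricter K = 1 would retain only F.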

TK IFIs Production Method
This section presents the Gen IFIs method, a search space exploration algorithm that discovers the TK IFIs based on the number of topmost-K IFIs required by the user. It is a recursive, best-first-based, depth-first search approach to TK IFIs discovery. Every time Gen IFIs is called recursively, it discovers one or more IFIs. The generated IFIs are then either accommodated in the currently maintained TK IFIs list or dropped due to a lower support count. The Gen IFIs procedure uses four parameters during the TK IFIs discovery process. The first parameter is the current highest-support IFI, IFI_o; when Gen IFIs is first called, it is passed as empty. The second parameter is the candidate itemsets list C_o, which is used for the extension of the current IFI_o; before calling Gen IFIs for the first time, the top-K 1-itemsets are copied into C_o. The third parameter, TK IFIs, maintains the current top-K IFIs produced before and within the Gen IFIs calls. The last parameter is 2-itemsets, which is used for pruning candidate patterns in the Candidate_IFIs procedure in step 5 of Algorithm 2.
The Gen IFIs procedure iteratively extends the current IFI with items i from the candidate itemset list C_o, as shown in step 1 of Algorithm 2. The minimum support threshold is raised automatically as IFIs are added to the TK IFIs list. Therefore, before appending an item i from the candidate list to the current IFI, its support count is checked against the minimum support of the TK IFIs list, as shown in step 2 of Algorithm 2. If the support of item i is less than the minimum support of the IFIs in the TK IFIs list, then the current item i is pruned rather than combined with the current IFI. In step 3, the selected item i is combined with the current IFI_o to form a new IFI, IFI_n. In step 4, all the 1-itemsets in C_o whose support value is greater than or equal to that of the current item i are copied into the possible candidate itemset list P_c.
The Candidate_IFIs procedure is used here to return a new candidate list C_n for the new IFI, IFI_n, as shown in step 5 of Algorithm 2. It uses the new IFI, IFI_n, formed in step 3, and the possible candidate list P_c to generate a new candidate itemsets list, C_n. In step 6, the Gen IFIs algorithm is called recursively for further processing. The search space strategy used is explained below. For FIM, the search space is arranged in the form of a Set Enumeration tree (SE-tree), as presented in Figure 1. Since FIM methods discover patterns or itemsets in a search space, the SE-tree is one of the search space representation methods used here to enumerate all of the possible subsets. The intuitions behind adopting the SE-tree for search space representation are given below.

1.
It suits problems where the search space is a subset of a power set, as it represents the complete search space without redundancy.

2.
The SE tree is a structure that can be traversed recursively.

3.
For the FIM techniques adopting depth-first traversal of the search space, SE-tree can be used to model their search space efficiently.
Example 2: Consider the transactional dataset in Table 2. Generate all 1-itemsets for K = 5 and the corresponding 2-itemsets from it. The given transactional dataset is scanned transaction by transaction using iterative step 3 of Algorithm 1. Every transaction is scanned from left to right for every item, and the transaction ID of every item is recorded. Finally, the K highest-support 1-itemsets are selected iteratively, as shown in steps 6a and 6c of Algorithm 1. The 2-itemsets are then computed from the K highest-support 1-itemsets, as shown in the table below.
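The 2-itemset computation in this example amounts to pairwise TID-set intersections over the selected 1-itemsets. The sketch below illustrates this; the TID sets are a small invented example, not the contents of Table 2, and the helper name is ours.

```python
# Illustrative sketch of step 7: 2-itemset supports are derived from the TID
# sets of the selected top-K 1-itemsets by pairwise intersection; this table
# later drives the TIBCP pruning check in Candidate_IFIs.
from itertools import combinations

# item -> TID set, e.g. as produced by the single dataset scan
tid_sets = {"F": {1, 2, 3, 5}, "B": {1, 2, 4}, "A": {1, 3, 4}}

def two_itemset_supports(tid_sets):
    table = {}
    for x, y in combinations(sorted(tid_sets), 2):
        table[(x, y)] = len(tid_sets[x] & tid_sets[y])
    return table

pairs = two_itemset_supports(tid_sets)
print(pairs)   # → {('A', 'B'): 2, ('A', 'F'): 2, ('B', 'F'): 2}
```

Because the table is built once from the already-trimmed top-K 1-itemsets, looking up a pair's support during pruning is a constant-time operation.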

Candidate Generation Method
The method is used to generate the set of candidate itemset C for the newly created IFI n . Every item from this set will be used subsequently to extend the current IFI n . For the creation of candidate itemset C, individual items from possible candidate list P c are utilized. An item y is iteratively selected from P c in step 1, as shown above in Algorithm 3, for the purpose of addition into candidate list C. Before appending an item y into candidate list C, two types of pruning strategies are applied on it. These strategies are explained in the following.
Itemset Based Candidate Pruning (IBCP): An itemset IFI n ∪ y is not a candidate for IFI if an item in an itemset has support count less than minimum support of IFIs in TK IFIs .

Two-Itemset Based Candidate Pruning (TIBCP):
An itemset IFI n ∪ y is not a candidate for IFI if any 2-itemset in an itemset has support count less than minimum support of IFIs in TK IFIs .
Let X ⊆ IFI_n ∪ y, where X is a 2-itemset and y ∈ X is the 1-itemset used to extend IFI_n. Suppose the support count of y or X is less than the minimum support of the IFIs in TK IFIs, i.e., min_sprt(TK IFIs). This implies that sprt(IFI_n ∪ y) < min_sprt(TK IFIs) according to the Apriori property [12]. Hence, the itemset IFI_n ∪ y can never be termed an IFI.
The IBCP and TIBCP strategies are applied on lines 2 and 4, respectively, of Algorithm 3. If the support of an item y, or of a 2-itemset containing the item y, is less than the minimum support of the IFIs in TK IFIs, the item y is skipped from the frequency test.

Algorithm 3. Candidate_IFIs(IFI_n, P_c, TK IFIs, 2-itemsets)
Input:
IFI_n: new IFI (new node head portion)
P_c: possible candidates
TK IFIs: current list of top-K IFIs discovered so far
2-itemsets: set of 2-itemsets formed from the K highest-support 1-itemsets for pruning purposes
Steps:
(1) foreach item y ∈ P_c
(2)   if (sprt(y) < min_sprt(TK IFIs))
(3)     continue; // IBCP: prune selected item from the list
(4)   if (sprt(2-itemsets[IFI_n[1], y]) < min_sprt(TK IFIs))
(9)     continue; // TIBCP: prune selected item from the list
(10)  if (sprt(IFI_n ∪ y) ≥ min_sprt(TK IFIs))
(11)    Append(TK IFIs, IFI_n ∪ y, sprt(IFI_n ∪ y));
(12)  Append(C, y);

The support count of IFI_n ∪ y is computed in step 10 of Algorithm 3. If the support count of IFI_n ∪ y is equal to or greater than the minimum support of any IFI in the set of IFIs, i.e., TK IFIs, the resulting IFI is added to TK IFIs in step 11. The currently selected item y is added to the set of candidates C in step 12 for further depth-first processing of itemsets. Example 3 illustrates the Candidate_IFIs method by applying it to the transactional dataset given in Table 2.
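The candidate-generation logic with both pruning checks can be rendered in Python as follows. This is a hedged sketch under our own assumptions (list/dict data layout, invented toy TID sets, illustrative names), not the authors' implementation: IBCP rejects an item whose own support is already below the current minimum top-K support, and TIBCP rejects it when a 2-itemset formed from the current IFI and the item falls below that minimum.

```python
# Sketch of Algorithm 3: extend IFI_n with each viable item y from the
# possible candidate list, applying IBCP and TIBCP before any (more costly)
# TID-intersection support computation.

def candidate_ifis(ifi_n, possible, tid_sets, two_supports, min_sprt):
    new_candidates = []   # the new candidate list C for IFI_n
    accepted = []         # extensions whose support reaches the top-K minimum
    for y in possible:
        if len(tid_sets[y]) < min_sprt:                  # IBCP (step 2)
            continue
        if any(two_supports[tuple(sorted((x, y)))] < min_sprt
               for x in ifi_n):                          # TIBCP (step 4)
            continue
        new = ifi_n + [y]
        supp = len(set.intersection(*(tid_sets[i] for i in new)))
        if supp >= min_sprt:                             # steps 10-11
            accepted.append((new, supp))
        new_candidates.append(y)                         # step 12
    return new_candidates, accepted

tid_sets = {"F": {1, 2, 3, 5}, "B": {1, 2, 4}, "A": {1, 3, 4}, "C": {4}}
two = {("B", "F"): 2, ("A", "F"): 2, ("A", "B"): 2, ("C", "F"): 1,
       ("B", "C"): 1, ("A", "C"): 1}
cands, found = candidate_ifis(["F"], ["B", "A", "C"], tid_sets, two, min_sprt=2)
print(cands, found)   # → ['B', 'A'] [(['F', 'B'], 2), (['F', 'A'], 2)]
```

In this toy run, item C is discarded by IBCP before its combined support with F is ever computed, which is exactly the saving the two pruning strategies aim for.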
Example 3: For the given transactional dataset in Table 2, assuming the descending support order < on the top-5 IFIs shown in Table 4, the complete set enumeration tree is given in Figure 1. Assuming a specific order on the items of set I, the set enumeration tree of I presents all 2^n itemsets. All nodes of the tree except the root node shown in Figure 1 are made according to the following two observations.

1.
Let f_p be a node label in the SE-tree. Let C ⊆ I be a candidate set of items which is used to extend any node in the tree apart from the root node, that is, C = { z | z ∈ I ∧ supp(z) ≤ supp(x), ∀x ∈ f_p }.

2.
A node label f_c is an extension of a node label f_p if f_c = f_p ∪ X, where X ⊆ C and X ≠ ∅. An itemset F = f_p ∪ {z} is a single-item extension, or child, of f_p if z ∈ C.
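The two observations above can be demonstrated with a few lines of code. This is an illustrative sketch (the generator name and three-item order are our assumptions): each node is extended only with items ranked after it in the fixed descending-support order, so a depth-first walk visits every non-empty subset exactly once.

```python
# Minimal SE-tree sketch: extending each node only with later-ranked items
# yields an irredundant, complete enumeration of the power set (minus the
# empty root), matching the two observations above.

def se_tree_nodes(items, prefix=()):
    """Yield all SE-tree node labels below `prefix` (items already ordered)."""
    for i, z in enumerate(items):
        node = prefix + (z,)
        yield node
        # candidate set C for `node`: only items ranked after z
        yield from se_tree_nodes(items[i + 1:], node)

ordered = ["F", "B", "A"]          # e.g. a descending support order
nodes = list(se_tree_nodes(ordered))
print(len(nodes))                  # → 7, i.e. 2^3 - 1 non-root nodes
```

The recursion mirrors the Gen IFIs traversal: pruning an item at a node cuts off the entire subtree rooted at that single-item extension.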

Comparative Evaluation
In this section, we begin with Example 4, again considering the transactional dataset given in Table 2.
Example 4: All the methods included in the frequent itemsets mining algorithm and given in Algorithms 2 and 3 are applied to find all the top-5 IFIs from the given dataset, as illustrated in Tables 5-9. Moreover, the SE-tree presented in Figure 2 is used to enumerate all of the possible subsets.

Performance Trends
To analyze the performance trends of the TKIFIs Miner, it was compared against two recent topmost IFIs mining techniques, Top-K Miner [29] and TKFIM [26]. The Top-K Miner uses depth-first traversal and is not an Apriori-inspired method, while the TKFIM method uses a breadth-first traversal strategy and is considered an Apriori-inspired approach. All these techniques need one parameter, i.e., K, to find the topmost IFIs from the given dataset. For the comparative evaluation, we used six benchmark datasets, as shown in Table 10. The first two datasets in Table 10 are freely downloadable from the UCI Machine Learning Repository [https://archive.ics.uci.edu/ml/index.php (accessed on 15 September 2022)] [38]. The third, fourth, and fifth datasets are freely downloadable from a frequent itemsets mining datasets repository [http://fimi.uantwerpen.be/data/ (accessed on 22 June 2021)] [39]. The T40I10D100K and T10I4D100K datasets are synthetic in nature, generated with the IBM Synthetic Data Generator by the IBM Almaden Quest research group. The Chess and Connect datasets are dense in nature; hence, these datasets will generate result patterns of long and short length if the K value is large. The Retail, T40I10D100K, and T10I4D100K datasets are sparse in nature, and the average presence of items in every transaction is low. These datasets will result in patterns of short length even if a large value of the parameter K is supplied.
For the experimental evaluation of the TKIFIs Miner, we selected three frequent itemsets mining methods: FP-Growth [40], Top-K Miner [29], and TKFIM [26]. The FP-Growth method is used as a benchmark for the experimental evaluation of many mining methods. Apart from the FP-Growth method, all the other methods apply K as the parameter to find the topmost IFIs of highest support to the Kth distinct support; the FP-Growth method uses a support threshold value for frequent patterns mining. Additionally, FP-Growth, Top-K Miner, and TKIFIs Miner use a tree data structure and a depth-first strategy for mining patterns, whereas the TKFIM approach is an Apriori-inspired algorithm and uses a breadth-first strategy to explore the search space. In the experimental work, the support of the Kth IFI is applied to find the same patterns with FP-Growth. The mapping between K and the support threshold parameter that finds the same patterns was already established in the work on the Top-K Miner algorithm [29].
On dense datasets such as Chess, Connect, and T40I10D100K, the performance trends of all the methods are almost similar for small values of K from 1 to 20. The Top-K Miner, TKFIM, and TKIFIs Miner arrange the search space in descending support order; therefore, for small values of K these methods compute the IFIs of K-highest support in a small amount of time. On the other hand, a high support threshold value enables the FP-Growth method to prune the search space and find the same result efficiently. For large values of K, the performance of the TKIFIs Miner is much better due to the IBCP and TIBCP pruning strategies. These pruning strategies play a pivotal role, enabling the TKIFIs Miner to prune an entire subtree before even finding the support of the candidate patterns. TKFIM and Top-K Miner do not apply pruning strategies; hence, they perform excessive candidate generation and support computation for patterns which are not even frequent. Therefore, these methods need more time on dense datasets with large values of K than the TKIFIs Miner does. For large values of K, the FP-Growth method uses a low support threshold, creating many conditional pattern trees on dense datasets and a large number of long and short patterns. Hence, it consumes more time than all the top-K methods.
On sparse datasets like Retail and T10I4D100K, the performance of the TKIFIs Miner is almost equal to that of its counterparts for small values of K, as shown in Figures 6 and 7, whereas for large values of K the performance of the TKIFIs Miner is better than that of TKFIM, Top-K Miner, and FP-Growth. As already mentioned, a high value of K increases candidate itemset generation, and the pruning methods start pruning entire subtrees in the TKIFIs Miner, reducing its execution time in comparison to its counterparts.


Conclusions and Future Work
In this article, we presented the efficient TK IFIs Miner algorithm for mining top-K IFIs from transactional data collected through smart shopping carts, without using the support threshold parameter. Tuning the support threshold parameter to mine the required number of FIs is a hard task for users; therefore, the TK IFIs Miner algorithm allows users to control the number of IFIs produced using a single parameter K. In TK IFIs Miner, as part of the automatic adjustment of the support threshold, we used the new IBCP and TIBCP pruning techniques, which had not been used earlier in any topmost frequent patterns mining problem. These pruning methods have proven effective at pruning entire itemset subtrees from the search space, thereby enabling the depth-first search based on the Gen IFIs method to quickly find the itemsets with high support values.
TK IFIs Miner outperformed all the topmost patterns mining approaches and the FP-Growth technique in the experimental evaluation. Its computational time is comparable to that of its counterparts on dense and sparse datasets for small values of K, while on dense datasets it excels by a larger margin in computational time for high values of K. As future work, to present the resulting set in a more compact manner, we will integrate maximal or closed patterns mining into the TK IFIs Miner approach.